Implementing Identity Columns in SQL Server

by Eric Charran

Overview

When working with an identity column in SQL Server, developers can leverage several benefits. These benefits to the data model include a highly efficient method of providing data integrity to the database (via referential integrity and entity integrity when coupled with referential integrity constraints), and the abstraction of the natural key from performing this role. This allows for clear and concise relationship demarcation among data and added ease of authoring TSQL against the data model. Identities scale well and can be derived into other columns which can use the identity value as the basis for an actual business value.

When working with the identity column, it is important to understand concepts such as scope, and how they may apply to developing applications that use the identity column in stored procedures or other programmatic TSQL batches. The following sections describe in detail the native SQL Server functions and their implications.

Understanding Identity Scope and Functions

Often, application developers will utilize identities as primary keys. These generated identities will then be used in stored procedures that will pass them to applications. One situation that illustrates this occurs when an application allows a user to insert rows into a table and must maintain a reference to the row just inserted for future updates. One approach in this situation would be to call a stored procedure from the application that would insert the record from parameters passed to it. Once the record has been inserted, the stored procedure uses the @@IDENTITY variable to obtain the identity just inserted and returns that value as an output to the application. The application can later call a separate procedure, which will provide a means to update the record by using the identity value (now the record’s primary key) to locate it and effect changes.

In this situation it is important to understand some general functions related to identities. The following table lists these functions and describes them.

Identity	Function Description
IDENT_CURRENT	Returns the last identity value generated from any session and any scope for the given table.
@@IDENTITY	Returns the last identity value generated for any table for current session and the current scope.
SCOPE_IDENTITY	Returns the last identity value generated for any table for the current session and the current scope.

Table 1.3: Identity functions.

It is important to note that these functions all have similarities in that they return the last generated identity. However, they vary significantly with regard to scope and session. This means that, depending on which scope is desired, there are appropriate places to use each of these functions.

In the previous example of a stored procedure that returns the identity for the last inserted row, using @@IDENTITY becomes a riskier proposition than using SCOPE_IDENTITY. SCOPE_IDENTITY will return the identity column’s last generated value from within the executing procedure. @@IDENTITY will do the same for any executing code. Both are bound to the session.an> because it is less likely that an undesired identity value (one which has been created outside of the scope of the insert procedure) will be returned.

Similarly, IDENT_CURRENT differs from the previous functions in that it is independent of session and scope and depends on the table.

Identity Constraints

While the identity column has significant database value for developers, there are some limiting factors that architects must keep in mind while developing. An identity column can only occur once in a table. Similar to a GUID, SQL Server does not support more than one identity column per table. For this reason, when duplicate identity functionality is required for a table, a custom function must be written and set as the column’s default value using an identity function as its default.

Seeds and Increments

Additionally, the identity column must be of a general numeric data type. The only supported data types for an identity column are decimal, int, numeric, smallint, bigint, or tinyint. The seed is referred to the number with which the identity will begin. For example, a seed of one signifies to SQL Server that the first record inserted into the table will receive and identity column value of one. Similarly, the increment indicates the number by which SQL Server will increase the value of the identity column for each insert. For instance, an increment of ten means that SQL Server will increase the value of the second inserted record in the table by ten more that he previous identity value.

Occasionally, developers may find the need to truncate or empty a data model and reload the data. Since SQL Server will keep track of the last generated identity values and increment based on the increment specified in the identity column, the new table values will continue to receive the next number in the identity sequence.

You can use a DBCC command to reset the identity column to an initial value. The DBCC CHECKIDENT command can be used to check for the current identity value and optionally resets the seed at the specified level. This is useful for avoiding situations in which the identity value is in the millions without a corresponding count of records, or there are dependent columns that mirror the identity through the use of a user-defined function assigned to a column default. Use the following command syntax when resetting an identity:

DBCC CHECKIDENT (jobs, RESEED, 30)

Table 1.4: DBCC CHECKIDENT syntax.

Data Management

Occasionally, developers may be required to perform acute surgery on the data within tables. There may be a series of events that could lead to the deletion of records that are related to other records (for example, if another developer or DBA inadvertently drops constraints and deletes a parent table’s records without deleting a child record). Whatever the circumstance, developers and DBAs may be forced to attempt to reinsert a record into a table with an identity column.

Normally, this would not be an issue because the identity column will naturally increment. However, in situations such as the one previously mentioned, you may need to recreate a deleted record with its identity value in tact. SQL Server natively does not allow one to insert a value into an identity column. So, to perform this task, you should use the SET_IDENTITY_INSERT syntax to tell SQL Server that an insertion into an identity column will be allowed for the batch that will follow the statement.

Essentially, by turning off the identity column temporarily, developers can insert into the table any previously removed identity values. Alternatively, using this technique, identity values can be duplicated. This can be avoided if the identity column has some sort of index on it that prevents duplicates (i.e., unique index, primary key constraint).

The following outlines the syntax used to insert a value into a table with an identity column.

-- SET IDENTITY_INSERT to ON.

SET IDENTITY_INSERT sales ON

GO

-- Attempt to insert an identity column value of 4

INSERT INTO products (SalesID, SKU) VALUES(4, '01216326').

GO

-- SET IDENTITY_INSERT to OFF.

SET IDENTITY_INSERT sales OFF

Table 1.5: Inserting into an identity column.

Choosing Identity Columns

Like most technical approaches, the identity column is not appropriate for every situation. However, if the goal is a stable and efficient data model constructed using rules of normalization, you should consider using an identity column as the primary candidate to be a primary key. You can use other techniques to derive business value from an identity column, but the surrogate key should never be used as a business value. It should simply be a mechanism to efficiently and effectively provide referential integrity and entity integrity within the data model.

When using an identity, it is also important to understand concepts such as scope, and which SQL Server identity function it is appropriate to use in TSQL code when interacting with applications. Depending on the implementation, the identity value returned can either be the intended one, or it could be derived from an unexpected source external to the scope of the executing code. Use the correct identity function to avoid these circumstances.

Eric Charran is currently working as a Technical Architect and Web Developer, in Horsham, PA. Eric holds Microsoft Certifications in SQL Server and has significant experience in planning, designing, and modeling n-tier applications using SQL Server 2000, as well as architecting and implementing .NET framework based applications, solutions and Web services. Eric’s other professional skills encompass Database Administration, Data Warehousing and Application Architecture, Modeling and Design, as well as Data Warehouse Transformation and Population using DTS. Eric is also skilled in developing and implementing ASP.NET and Windows forms applications, using technologies such as COM+, the .NET framework, VB.NET and other .NET languages.

Contributors : Eric Charran
Last modified 2005-04-12 06:21 AM

How do I handle situations like this?

Posted by CHID at 2005-12-22 07:25 AM

INSERT INTO products (SalesID, SKU) VALUES(4, '01216326').

instead i am bound to insert like this...

Insert into productsduplicate SELECT * FROM products.
it is asking me to specify column list..

I shall be glad if you can let me know if there is any way that i can achieve that without specifying column list

or is that the way i am bound to specify columns also.

i use SQLSERVER2000

With Regards,
chid

Replies to this comment

Best Practices (Posted by mougue at 2006-02-14 09:38 PM)

Manually specifying value for Identity Key

Posted by KenCulp at 2007-04-23 11:16 AM

I am currently teaching ADO 2.0 and other related topics including writing a book. I followed the directions ("SET IDENTITY_INSERT " + tableName + " OFF") for a table, did three inserts, and then turned it back on. Identity values were still used.

The problem is that the inserts were being done by calling Update a DataAdapter that used a created insert command (SqlCommandBuilder). The created inserted command does not include a value for the identity key, leaving it null. When null is passed, it used the identity key anyway as one would hope. Thus: WRITE YOUR OWN INSERT COMMAND.

By the way, does the IDENTITY_INSERT have to be turned off for each row inserted or after a series of inserts. Also, should not the table or database be locked while IDENTITY_INSERT is OFF? If so, what is the easiest way to do that?

DBAzine.com

Sections

Personal tools

Menu

Who Are You?