Locks

by Peter Gulutzan

IBM DB2®, Microsoft SQL Server, and Oracle all use locks to control jobs that are running concurrently. This article describes what they do, how they differ, and where the problems are. I will cover all the features essential for understanding locking with these three DBMSs. Nonessential features are ignored; instead I will concentrate on automatic row-level locks, under default or common conditions.

Physics

A lock is really a temporary record that identifies an item, the transaction using the item, and what the transaction has done or will do to the item (such as reading or writing). The following chart shows how the DBMSs store locks:

DBMS	SIZE OF LOCK RECORD	LOCATION OF LOCK RECORD
IBM	About 250 bytes	Volatile memory
Microsoft	About 100 bytes	Volatile memory
Oracle	About 40 bytes	Database page

The “Location Of Lock Record” column shows an important distinction: DB2 and SQL Server store lock information in a volatile place outside the database (usually RAM or virtual memory), while Oracle keeps an “Interested Transaction List” on the same page in which it stores the row. (Of course this is a simplification; links to details are supplied at the end of this article.)

The DB2/SQL SERVER method of storage is better because:

It is silly to reserve space in database pages, which are by definition persistent, for data about locks, which are by definition transient.
It is easy to change information such as “lock type” as the transaction progresses, if it’s in memory. For example, IBM has noted that Oracle can run out of space in a page to store lock data, and that Oracle stores less information about lock type.

The ORACLE method of storage is better because:

As soon as you access the row’s page, the data about locks is available. DB2 and SQL Server must dynamically maintain a separate storage array, with consequent overhead: more space for pointers, more time to search. Besides, Oracle doesn’t need to distinguish between lock types, because its multi-versioning makes the distinction irrelevant — more on that in the next section.

Lock Types

The three lock types are: Shared (S), Update (U), and Exclusive (X).

FIRST TERMINOLOGY WARNING: IBM sometimes uses the term “Share Lock” but “Shared Lock” is correct.

A transaction typically gets a Shared Lock when it is about to read. If Transaction #1 has a Shared Lock on Row #1, then:

Transaction #2 can read Row #1 (it is not blocked).
Transaction #2 can declare an intent to update — that is, it can get an Update lock on Row #1.
Transaction #2 cannot write Row #1. It is blocked, which means it must wait until Transaction #1 releases the lock.

A transaction typically gets an Update Lock when it is about to read in preparation for a write, as in a data-change statement or as signaled by a query’s FOR UPDATE clause. If Transaction #1 has an Update Lock on Row #1, then:

Transaction #2 can read Row #1.
Transaction #2 cannot get an Update Lock on Row #1.
Transaction #2 cannot write Row #1.

SECOND TERMINOLOGY WARNING: Oracle uses the terms “Shared” and “Exclusive” to describe what the rest of the world calls “Update” locks.

A transaction typically gets an Exclusive lock when it is about to write. If Transaction #1 has an Exclusive Lock on Row #1, then:

Transaction #2 cannot read Row #1.
Transaction #2 cannot get an Update Lock on Row #1.
Transaction #2 cannot write Row #1.

Succinctly: Shared blocks Exclusive, Update blocks Update and Exclusive, Exclusive blocks Shared, Update, and Exclusive.

This chart shows the surprising story of which DBMS supports each lock type:

DBMS	SHARED	UPDATE	EXCLUSIVE
IBM	Yes	Yes	Yes
Microsoft	Yes	Yes	Yes
Oracle	No	Yes	No

In fact, the chart is so surprising that I’d better repeat my earlier statement: this article discusses essential features under default or common conditions. There are always exceptions and ways to override — I’m ignoring them.

Oracle needs only Update Locks to fulfill its famous boast that writers won’t block readers and readers won’t block writers (though writers will block and be blocked by other writers or those with a declared intention to write). Oracle is allowing rows to be written and read at the same time. This is unacceptable interference, so Oracle must add a mechanism that DB2 and SQL Server don’t need. Namely, Oracle stores multiple temporary copies of the same page. Here’s an illustration showing what an Oracle page really looks like:

An Oracle Page

The idea is that each transaction knows its own associated System Change Number (SCN), and stores that number when it changes a page. When Transaction #1 reads the page, it knows (by comparing its SCN to the page’s SCN) that it can’t use the current version, which contains Row Data that Transaction #2 has changed. So it picks up the old version. Thus Transaction #2’s row data-change can’t interfere with Transaction #1 because Transaction #1 will ignore the change and use its own, pre-change, version of the page. This is what Oracle multi-versioning is about. (I’ll use the unofficial word “versioning” when I refer to Oracle’s practice of scanning backward so the data will be as of a point in time.)

The three ways to distinguish Oracle from DB2 / SQL Server are thus: (1) storage, (2) lack of Shared or Exclusive Locks, (3) multi-versioning.

Isolation Levels

All DBAzine.com readers know that transactions can be set to different isolation levels, ranging from weak (lots of concurrency but big chance of anomalies) to strong (lots of blocking but small chance of anomalies). Fewer readers know that DBMSs interpret isolation levels differently. Mainly, this is a matter of, once again, putting Oracle on one side of the teeter-totter and putting DB2 and SQL Server on the other side. But there’s another factor — the “standard definitions” of the isolation levels. I’m going to look at the four isolation levels as the SQL Standard describes them, stating how each DBMS supports them, and warning about portability.

Here’s a chart that shows the four ANSI/ISO SQL isolation levels, in order from weakest to strongest:

LEVEL	ASSOCIATED PHENOMENA
READ UNCOMMITTED	Dirty Read, Non-repeatable Read, Phantom
READ COMMITTED	Non-repeatable Read, Phantom
REPEATABLE READ	Phantom
SERIALIZABLE	None of the above

In the “Associated Phenomena” column are three phenomena (ANSI’s term) that can be undesirable.

A Dirty Read happens when Transaction #1 reads a row that Transaction #2 has written but not committed.
A Non-repeatable Read happens when Transaction #1 reads a row, Transaction #2 changes that row, then Transaction #1 reads the row again, getting a different result the second time.
A Phantom happens when Transaction #1 finds all rows that contain some specific data (e.g., name='Smith'), Transaction #2 inserts a new 'Smith' row, then Transaction #1 looks for the 'Smith' rows again, getting a different number of rows back the second time.

READ UNCOMMITTED

A transaction at the READ UNCOMMITTED isolation level is very weak: it makes no locks and ignores others’ locks. READ UNCOMMITTED transactions should be READ ONLY. DB2 will use READ UNCOMMITTED logic only for a small group of statements (such as SELECT INTO and INSERT ... SELECT). Oracle won’t use it at all.

DBMS	SUPPORTS LEVEL?
IBM	Yes
Microsoft	Yes
Oracle	No

Chart: Support for READ UNCOMMITTED.

The “No” in the Oracle column of the chart shown above is not a violation of SQL Standard requirements. The fact is that Oracle’s multi-versioning makes the Dirty Read Phenomenon impossible anyway, so Oracle can automatically upgrade requests for READ UNCOMMITTED to the next level, READ COMMITTED.

THIRD TERMINOLOGY WARNING: IBM calls this level UNCOMMITTED READ.

READ COMMITTED

READ COMMITTED is always the default isolation level for Oracle and SQL Server. Sometimes it’s the isolation level for DB2 as well.

A transaction at this level will lock, but read locks are typically “short duration” — which means they won’t last until the transaction ends. Versioning is also short duration — which means it won’t go back to the start of the transaction.

A digression: The concept of duration is important — duration distinguishes one isolation level from the next.

DBMS	SHARED LOCKS	UPDATE LOCKS	EXCLUSIVE LOCKS	VERSIONING
IBM	Short	Long	Long	N/A
Microsoft	Short	Long	Long	N/A
Oracle	N/A	Long	N/A	Short

Chart: Duration for READ COMMITTED.

Duration with respect to locks:

A short duration is the time it takes to do a read or write. For example, when DB2 performs a table-scan it can get a Shared lock on Page #1, then read Page #1, then (when it’s ready to read Page #2) release the lock on Page #1.
A long duration lasts until the transaction ends — which should mean until COMMIT or ROLLBACK happens, but Oracle and SQL Server will treat single statements as transactions unless you take special steps.

Actually DB2 and SQL Server will keep a lock on a row that you’re reading via a cursor until you read the next row or close the cursor. So you can FETCH and then UPDATE the fetched result without worrying about interference. This is actually a slightly stronger isolation level than the SQL Standard requires for READ COMMITTED, because it prevents other minor phenomena in addition to Dirty Read.

FOURTH TERMINOLOGY WARNING: The name for this “slightly stronger than READ COMMITTED” isolation level is CURSOR STABILITY, which is the name that IBM prefers.

Duration with respect to versioning:

Oracle makes sure the row is the same as it was as of the last COMMIT, or as of the start of the statement, whichever came first. In other words, versioning duration is short, because the versioning goes back only to statement start, not to transaction start.

In summary: At the READ COMMITTED level, something (either read locks or versioning) is short.

REPEATABLE READ

A transaction at the REPEATABLE READ isolation level will lock, and even Shared Locks are long duration.

DBMS	SHARED LOCKS	UPDATE LOCKS	EXCLUSIVE LOCKS	VERSIONING
IBM	Long	Long	Long	N/A
Microsoft	Long	Long	Long	N/A
Oracle	N/A	N/A	N/A	N/A

Chart: Duration for REPEATABLE READ.

The chart above shows Oracle as “Not Applicable” in all columns because, once again, Oracle automatically upgrades to the next isolation level, SERIALIZABLE.

FIFTH TERMINOLOGY WARNING: IBM prefers to call this the READ STABILITY isolation level.

SIXTH TERMINOLOGY WARNING: When you see the words REPEATABLE READ in a DB2 manual, you should take it that IBM actually means SERIALIZABLE.

SERIALIZABLE

The SQL Standard requires SERIALIZABLE to be the default isolation level, but (as we’ve seen) it rarely is. SERIALIZABLE should be an unimportant isolation level, but there is much confusion about it.

DBMS	SHARED LOCKS	UPDATE LOCKS	EXCLUSIVE LOCKS	VERSIONING
IBM	Long	Long	Long	N/A
Microsoft	Long	Long	Long	N/A
Oracle	N/A	Long	N/A	Long

Chart: Duration for SERIALIZABLE.

This chart doesn’t really tell the whole story, since at this strong isolation level it isn’t enough to lock the item: the DBMS must also lock the path to the item. Now, if a table scan is underway, this is fairly simple: the DBMS can lock every page it reads, which happens to be all the pages. With an index lookup, more sophistication is possible. For example, if you look for name = ‘Smith,’ SQL Server will lock the key positions that just precede and just follow ‘Smith’ in the index — thus blocking any attempt to insert a new ‘Smith’ record or to change a ‘Jones’ record to a ‘Smith’ record.

Oracle operates differently. Let’s look at what this SQL statement implies for Oracle:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

Even at this level there are no Shared Locks, so for real serializability you must use the FOR UPDATE clause in all SELECTs.
Oracle makes sure the row is the same as it was as of the last COMMIT, or as of the start of the transaction, whichever came first. In other words, versioning duration is long — therefore Phantoms are invisible.

There is a curious situation here. The SQL Standard says that the SERIALIZABLE isolation level must be truly serializable — that is, no phenomena of any kind are permissible. But for the SET TRANSACTION ISOLATION LEVEL SERIALIZABLE statement, the Standard says specifically that only the three phenomena (Dirty Read, Non-repeatable Read, Phantom) are not permissible. By implication, other phenomena are permissible. I have concluded that a multi-versioning DBMS conforms to the Standard’s requirements (as Oracle claims to do) even if not all transactions are SERIALIZABLE in the ANSI/ISO SQL sense of the word.

Summary

In general, DB2 and SQL Server do the following:

They do almost no locks for READ UNCOMMITTED.
They do short read locks and long write locks for READ COMMITTED.
They do long read locks and long write locks for REPEATABLE READ.
They do long read locks and long write locks, as well as some path locking, for SERIALIZABLE.

In general, Oracle does the following:

It ignores READ UNCOMMITTED and REPEATABLE READ.
It does Update Locks for READ COMMITTED and versioning “as of” the time the statement started.
It does Update Locks for SERIALIZABLE and versioning “as of” the time the transaction started.

Therefore you should keep this in mind:

With Oracle, it’s necessary to execute a SELECT ... FOR UPDATE statement to ensure you’ll get an Update Lock early and thus prevent anomalies. With DB2 and SQL Server, the FOR UPDATE clause is not strictly necessary, but does help to avoid deadlocks.

In DB2 and SQL Server circles, it’s considered necessary to COMMIT transactions as quickly as possible. In Oracle circles, that advice is heard less frequently.

References

Atul, Adya, Barbara Liskov and Patrick O’Neil. “Generalized Isolation Level Definitions.” This is more recent than the “Critique...” paper cited next.

Berenson, Hal, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O’Neil and
Patrick O’Neil. “A critique of ANSI SQL Isolation Levels.” A dense, scholarly, oft-cited, hard-to-read, necessary paper.

Fekete, Alan, Dimitrios Liarokapis, Elizabeth O’Neil, Patrick O’Neil and Dennis Shasha. “Making Snapshot Isolation Serializable.” Snapshot Isolation is a level made possible by multi-versioning.

IBM Corporation. “A Technical Discussion of Multi Version Read Consistency.”
A criticism of Oracle’s locking, including some disputable statements.

Microsoft Corporation. “BUG: Locking Behavior in Transactions Is Incorrect.”
The bug involves a lock type that I don’t cover (Intent-share Lock), but I thought it’s notable that READ COMMITTED transactions aren’t always easy to implement.

Oracle Corporation. “Oracle7 Server Concepts — Data Concurrency.”
I couldn’t find an online Oracle9i document that covered locking in the same detail, but this discussion is still nearly correct.

Poch, Norbert and Sven Noltemeier. “DB2 UDB Universal Database Version 7 Concurrency and Locking.”
A 35-slide show that hits all the highlights.

Peter Gulutzan is the co-author of one thick book about the SQL Standard (SQL-99 Complete, Really) and one thin book about optimization (SQL Performance Tuning). He has written about DB2, Oracle, and SQL Server, emphasizing portability and DBMS internals, in previous dbazine.com articles. Now he has a new job: he works for the “Number Four” DBMS vendor, MySQL AB.

Contributors : Peter Gulutzan
Last modified 2005-04-12 06:21 AM

locks in database pages

Posted by jamesRoberts at 2006-08-24 09:52 AM

On the comment:
It is silly to reserve space in database pages, which are by definition persistent, for data about locks, which are by definition transient.
-----

I guess that Oracle do it so that they can spread processing over multiple machines (i.e. RAC). Although you could use a distributed memory model (using remote procedure calls etc), it would be more scaleable to put the lock data on the database record.

DBAzine.com

Sections

Personal tools

Menu

Who Are You?