DBAzine.com


Chapter 3. Locking and Concurrency - Part 3

by Tom Kyte

From Expert Oracle, Signature Edition: Programming Techniques and Solutions for Oracle 7.3 through 8.1.7, Berkeley, Apress, April 2005.

Part 1 | Part 2 | Part 3

What is Concurrency Control?

Concurrency controls are the collection of functions that the database provides in order to allow many people to access and modify data simultaneously. The implementation of locking in the database is perhaps the most crucial factor in determining the degree of concurrency that your application can support (basically, how well it will scale). As we discussed previously, there are many different types of locks – from TX transaction locks, which are extremely scalable both in terms of performance and cardinality (whether you have one or one billion of them), to TM and DDL locks (applied in the least restrictive mode whenever possible), to the internal locks that Oracle employs to mediate access to its shared data structures, from the very lightweight and fast latching mechanism to the heavier, but feature-rich, enqueue.

But concurrency control goes beyond locks. There are other things that the database can do to provide controlled, yet highly concurrent, access to data. For example, there is a feature of Oracle called multi-versioning (introduced in Chapter 1). Since Oracle uses multi-versioning to provide read-consistent views of data, we get the rather pleasant side effect that a reader of data will never be blocked by a writer of data – writes do not block reads. This is one of the fundamental differences between Oracle and the rest of the databases out there. A read query in Oracle will never be blocked, it will never deadlock with another session, and it will never get an answer that didn’t exist in the database.

Oracle’s multi-versioning model for read consistency is always applied at the statement level (for each and every query) and can also be applied at the transaction level. What I would like to do in this section is to demonstrate how multi-versioning ties in with the various transaction isolation levels defined in the SQL92 standard.
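Before looking at each level in detail, it may help to see how an isolation level is actually requested in Oracle. A minimal sketch (READ COMMITTED is the default):

```sql
-- For the next transaction only:
set transaction isolation level serializable;

-- For all subsequent transactions in this session:
alter session set isolation_level = serializable;

-- Restore the default:
alter session set isolation_level = read committed;
```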

Transaction Isolation Levels

The ANSI/ISO SQL92 standard defines four levels of transaction isolation, with different possible outcomes for the same transaction scenario. That is, the same work performed in the same fashion with the same inputs, may result in different answers, depending on your isolation level. These isolation levels are defined in terms of three ‘phenomena’ that are either permitted or not at a given isolation level:

      • Dirty read – The meaning of this is as bad as it sounds. You are permitted to read uncommitted, or ‘dirty’, data. This is the effect you would achieve by just opening an OS file that someone else is writing, and reading whatever data happened to be there. Data integrity is compromised, foreign keys violated, unique constraints ignored.
      • Non-REPEATABLE READ – This simply means that if you read a row at time T1, and attempt to re-read that row at time T2, the row may have changed. It may have disappeared, it may have been updated, and so on.
      • Phantom read – This means that if you execute a query at time T1, and re-execute it at time T2, additional rows may have been added to the database, which will affect your results. This differs from the non-repeatable read in that in this case, data you already read has not been changed but rather that more data satisfies your query criteria than before.

The SQL92 isolation levels are defined based on whether or not they allow each of the above phenomena:

Isolation Level    Dirty Read   Non-REPEATABLE READ   Phantom Read
READ UNCOMMITTED   Permitted    Permitted             Permitted
READ COMMITTED     -            Permitted             Permitted
REPEATABLE READ    -            -                     Permitted
SERIALIZABLE       -            -                     -

Oracle explicitly supports the READ COMMITTED and SERIALIZABLE isolation levels, as they are defined in the standard. However, this doesn’t tell the whole story. The SQL92 standard was attempting to set up isolation levels that would permit various degrees of consistency for queries performed in each level. REPEATABLE READ is the isolation level that they claim will guarantee a read consistent result from a query. In their definition, READ COMMITTED does not give you consistent results and READ UNCOMMITTED is the level to use to get non-blocking reads.

In Oracle, READ COMMITTED has all of the attributes required to achieve read-consistent queries. In other databases, READ COMMITTED queries can and will return answers that never existed in the database at any point in time. Moreover, Oracle also supports the spirit of READ UNCOMMITTED. The goal of providing a dirty read is to supply a non-blocking read, whereby queries are not blocked by, and do not block, updates of the same data. However, Oracle does not need dirty reads to achieve this goal, nor does it support them. Dirty reads are an implementation other databases must use in order to provide non-blocking reads.

In addition to the four defined SQL92 isolation levels, Oracle provides another level, read only. A read-only transaction is equivalent to a read-only REPEATABLE READ or SERIALIZABLE transaction in SQL92. It only sees those changes that were committed at the time the transaction began, and inserts, updates, and deletes are not permitted in this mode (other sessions may update data, but not the read-only transaction). Using this mode you can achieve REPEATABLE READ and SERIALIZABLE READ, without phantoms.
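A read-only transaction is requested with SET TRANSACTION; a brief sketch (the table name here is hypothetical):

```sql
set transaction read only;

-- Every query from here on sees the data as it was when the
-- transaction began; commits by other sessions are not visible.
select count(*) from some_table;

-- DML is rejected in this mode (ORA-1456):
-- insert into some_table values ( 1 );

commit;   -- ends the read-only transaction
```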

Let’s now move on to discuss exactly how multi-versioning and read consistency fit into the above isolation schemes, and how other databases that do not support multi-versioning achieve the same results. This is instructive for anyone who has used another database and believes they understand how the isolation levels must work. It is also interesting to see how a standard that was supposed to remove the differences between databases, SQL92, actually allows for them. The standard, while very detailed, can be implemented in very different ways.

READ UNCOMMITTED

The READ UNCOMMITTED isolation level permits dirty reads. Oracle does not make use of dirty reads, nor does it even allow for them. The basic goal of a READ UNCOMMITTED isolation level is to provide a standards-based definition that caters for non-blocking reads. As we have seen, Oracle provides for non-blocking reads by default. You would be hard-pressed to make a SELECT query block in the database (there is the special case of a distributed in-doubt transaction, which we discuss in Chapter 4). Every single query, be it a SELECT, INSERT, UPDATE, or DELETE, executes in a read-consistent fashion.

In Chapter 1, Developing Successful Oracle Applications, Oracle’s method of obtaining read consistency was demonstrated by way of an accounts example. We’re now going to revisit that example to discuss in more detail what happens in Oracle, using multi-versioning, and what would happen in any number of other databases. Again, we are assuming one database row per block.

We will start with the same basic table and query:

     create table accounts
     ( account_number number primary key,
       account_balance number
     );

     select sum(account_balance) from accounts;

Before the query begins, we have the following data:

Row       Account Number   Account Balance
1         123              $500.00
2         456              $240.25
...       ...              ...
342,023   987              $100.00

Now, our select statement starts executing and reads row 1, row 2, and so on. At some point while we are in the middle of the query, a transaction moves $400.00 from account 123 to account 987. This transaction does the two updates, but does not commit. The table now looks like this:

Row       Account Number   Account Balance                Locked
1         123              ($500.00) changed to $100.00   X
2         456              $240.25
...       ...              ...
342,023   987              ($100.00) changed to $500.00   X

So, two of those rows are locked – if anyone tried to update them they would be blocked. So far the behavior we are seeing is more or less consistent across all databases. The difference will be in what happens when the query gets to the locked data.

When the query we are executing gets to the locked block, it will notice that the data on it has changed since the beginning of its execution. In order to provide a consistent (correct) answer, Oracle will at this point recreate the block with the locked data as it existed when the query began. That is, Oracle takes a detour around the lock – it reads around it, reconstructing it from the rollback segment. A consistent and correct answer comes back without waiting for the transaction to commit.

Now, a database that allowed a dirty read would simply return the value it saw in account 987 at the time it read it, in this case $500. The query would count the transferred $400 twice and would present a total that never existed in the accounts table at any point in time. In a multi-user database, a dirty read can be a dangerous feature and, personally, I have never seen the usefulness of it. It not only returns the wrong answer, but it may see data that will never actually exist in the database at any point in time. Say that, rather than transferring, the transaction was actually just depositing $400 in account 987. The dirty read would count the $400 and get the ‘right’ answer, wouldn’t it? Well, suppose the uncommitted transaction was rolled back. We have just counted $400 that was never actually in the database.

The point here is that dirty read is not a feature – rather it is a liability. In Oracle, it is just not needed. You get all of the advantages of a dirty read (no blocking) without any of the incorrect results.

READ COMMITTED

The READ COMMITTED isolation level states that a transaction may only read data that was committed before the transaction began. There are no dirty reads. There may be non-REPEATABLE READs (rereads of the same row return a different answer) and phantom reads (newly inserted rows, not visible earlier in the transaction, become visible to a query). READ COMMITTED is perhaps the most commonly used isolation level in database applications everywhere. It is rare to see a different isolation level used.

READ COMMITTED isolation is not as cut and dried as it sounds. If you look at the matrix above, it looks straightforward. Obviously, given the rules above, a query executed in any database using READ COMMITTED isolation would behave in the same way, would it not? It will not. If you query multiple rows in a single statement then, in almost every other database, READ COMMITTED isolation can be as bad as a dirty read, depending on the implementation.

In Oracle, using multi-versioning and read consistent queries, the answer we get from the accounts query is the same in READ COMMITTED as it was in the READ UNCOMMITTED example. Oracle will reconstruct the modified data as it appeared when the query began, returning the answer that was in the database when the query started.

Let’s now take a look at how our above example might work in READ COMMITTED mode in other databases – you might find the answer surprising. We’ll pick up our example at the point described in the previous table:

      • We are in the middle of the table. We have read and summed the first N rows.
      • The other transaction has moved $400 from account 123 to 987.
      • It has not yet committed, so rows 123 and 987 are locked.

We know what happens in Oracle when it gets to account 987 – it will read around the modified data, find out it should be $100.00 and complete. Let’s see how another database, running in some default READ COMMITTED mode, might arrive at the answer:

Time Query Account transfer transaction
T1 Reads row 1, sum = $500 so far.  
T2 Reads row 2, sum = $740.25 so far.  
T3   Updates row 1, puts an exclusive lock on the block, preventing other updates and reads. Row 1 now has $100.
T4 Reads row N, sum = ....  
T5   Updates row 342023, puts an exclusive lock on this block. Row now has $500.
T6 Reads row 342023, discovers that it has been modified. This session will block and wait for this block to become available. All processing on this query stops.  
T7   Commits transaction.
T8 Reads row 342,023, sees $500 and presents final answer.  

The first thing to notice is that this other database, upon getting to account 987, will block our query. This session must wait on that row until the transaction holding the exclusive lock commits. This is one reason why many people have a bad habit of committing in the middle of their transactions. Updates interfere with reads in most other databases. The really bad news in this scenario is that we are making the end user wait for the wrong answer. You still receive an answer that never existed in the database at any point in time, as with the dirty read, but this time we made them wait for the wrong answer.

The important lesson here is that various databases executing in the same, apparently safe isolation level, can, and will, return very different answers under the exact same circumstances. It is important to understand that, in Oracle, non-blocking reads are not had at the expense of correct answers. You can have your cake and eat it too, sometimes.

REPEATABLE READ

The goal of REPEATABLE READ in SQL92 is to provide an isolation level that gives consistent, correct answers, and prevents lost updates. We’ll take a look at both examples, and see what we have to do in Oracle to achieve this, and what happens in other systems.

Getting a Consistent Answer

If I have a REPEATABLE READ isolation, the results from a given query must be consistent with respect to some point in time. Most databases (not Oracle) achieve REPEATABLE READs via the use of row-level, shared read locks. A shared read lock prevents other sessions from modifying data that you have read. This of course decreases concurrency. Oracle opted for the more concurrent, multi-versioning model to provide read consistent answers.

In Oracle, using multi-versioning, you get an answer that is consistent with respect to the point in time the query began execution. In other databases, using shared read locks, you get an answer that is consistent with respect to the point in time the query completes – that is, when you can get the answer at all (more on this in a moment).

In a system that employs a shared read lock to provide REPEATABLE READs, you would observe rows in a table getting locked as the query processed them. So, using the example from above, as our query reads the accounts table, it would leave shared read locks on each row:

Time Query Account transfer transaction
T1 Reads row 1, sum = $500 so far. Block 1 has a shared read lock on it.  
T2 Reads row 2, sum = $740.25 so far. Block 2 has a shared read lock on it.  
T3   Attempts to update row 1 but is blocked. Transaction is suspended until it can obtain an exclusive lock.
T4 Reads row N, sum = ....  
T5 Reads row 342023, sees $100 and presents final answer.  
T6 Commits transaction.  
T7   Updates row 1, puts an exclusive lock on this block. Row now has $100.
T8   Updates row 342023, puts an exclusive lock on this block. Row now has $500. Commits.

This table shows that we now get the correct answer, but at the cost of physically serializing the two transactions. This is one of the side effects of shared read locks for consistent answers: readers of data will block writers of data. This is in addition to the fact that, in these systems, writers of data will block readers of data.

So, you can see how shared read locks would inhibit concurrency, but they can also cause spurious errors to occur. In this example we start with our original table, but this time with the goal of transferring $50.00 from account 987 to account 123:

Time Query Account transfer transaction
T1 Reads row 1, sum = $500 so far. Block 1 has a shared read lock on it.  
T2 Reads row 2, sum = $740.25 so far. Block 2 has a shared read lock on it.  
T3   Updates row 342023, puts an exclusive lock on block 342023 preventing other updates and shared read locks. This row now has $50.
T4 Reads row N, sum = ....  
T5   Attempts to update row 1 but is blocked. Transaction is suspended until it can obtain an exclusive lock.
T6 Attempts to read row 342023 but cannot as an exclusive lock is already in place.  

We have just reached the classic deadlock condition. Our query holds resources the update needs, and vice versa. Our query has just deadlocked with our update transaction. One of them will be chosen as the victim and will be killed. We just spent a long time and a lot of resources only to fail, and get rolled back at the end. This is the second side effect of shared read locks: readers and writers of data can and frequently will deadlock each other.

As we have seen in Oracle, we have statement level read consistency without reads blocking writes or deadlocks. Oracle never uses shared read locks – ever. Oracle has chosen the harder to implement, but infinitely more concurrent multi-versioning scheme.

Lost Update Prevention

A common use of REPEATABLE READ would be for lost update prevention. If we have REPEATABLE READ enabled, a lost update cannot happen. By definition, a re-read of a row in the same session will return the same exact data.

In databases other than Oracle, a REPEATABLE READ may be implemented using SELECT FOR UPDATE and shared read locks. If two users select the same row for update, both will place a shared read lock on that data. When the first user attempts to update they will be blocked. When the second user attempts to update, a deadlock will occur. This is not ideal but it does prevent the lost update.
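In Oracle, the usual way to prevent a lost update without shared read locks is SELECT ... FOR UPDATE, as demonstrated earlier in the chapter. A sketch against the accounts table from the running example:

```sql
-- Session 1: lock the row before reading the balance to be changed.
select account_balance
  from accounts
 where account_number = 123
   for update nowait;

-- A second session issuing the same statement now fails at once with
-- ORA-00054 (resource busy), instead of silently overwriting later.

update accounts
   set account_balance = account_balance - 50
 where account_number = 123;

commit;
```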

In Oracle, if we want REPEATABLE READ, but do not actually want to physically serialize access to a table with SELECT FOR UPDATE NOWAIT (as demonstrated earlier in the chapter), we actually need to set the isolation level to SERIALIZABLE.

SERIALIZABLE encompasses the lower levels of isolation, so if you can do SERIALIZABLE, you can do REPEATABLE READ.

In Oracle, a SERIALIZABLE transaction is implemented so that the read-consistency we normally get at the statement level is extended to the transaction. That is, the answers to every query we will execute in our transaction is fixed at the point in time our transaction began. In this mode if we:

     select * from T;
     begin dbms_lock.sleep( 60*60*24 ); end;
     select * from T;

The answers returned from T would be the same, even though we just slept for 24 hours (or we might get an ORA-1555, snapshot too old error). The isolation level would assure us that these two queries always return the same results. Oracle does this in the same way it provides a read-consistent query. It uses the rollback segments to reconstruct the data as it existed when our transaction began, instead of just when our statement began. In a SERIALIZABLE transaction, however, if we attempt to update data and discover at that point that the data has changed since our transaction began, we will receive an error about not being able to serialize access. We will cover this in more detail shortly.

It is clear that this is not the optimum approach for our HR application. What would happen in that application is that both users would query the data; both users would update the data on screen. The first user would save their changes, and would succeed. The second user, however, would receive an error when they attempted to save their changes. They just wasted a lot of time for nothing. They must restart the transaction, retrieve the other user’s changes, and do it all over again. It prevented the lost update, but at the price of an annoyed end user. However, if a situation arises where REPEATABLE READ is required, and you do not expect transactions to attempt to update the same rows, then use of the SERIALIZABLE mode is a possible solution.

SERIALIZABLE

This is generally considered the most restrictive level of transaction isolation, but it provides the highest degree of isolation. A SERIALIZABLE transaction operates in an environment that makes it appear as if there are no other users modifying data in the database; the database appears ‘frozen’ at the point in time your transaction began. Your transaction sees the database consistently, at a single point in time. Side effects (changes) made by other transactions are not visible to it, regardless of how long it has been running. SERIALIZABLE does not mean that all transactions executed by the users are the same as if they were executed one right after another in a serial fashion. It does not imply that there is some serial ordering of the transactions that would result in the same outcome. This last point is a frequently misunderstood concept and a small demonstration will clear it up. The following table represents two sessions performing work over time. The database tables A and B start out empty and are created as follows:

     tkyte@TKYTE816> create table a ( x int );
Table created.

tkyte@TKYTE816> create table b ( x int );
Table created.

Now, we have the following series of events:

Time Session 1 Executes Session 2 Executes
0:00 Alter session set isolation_level=serializable;  
0:01   Alter session set isolation_level=serializable;
0:02 Insert into a select count(*) from b;  
0:03   Insert into b select count(*) from a;
0:04 Commit;  
0:05   Commit;

 Now, when this is all said and done – tables A and B will each have a row with the value of zero in it. If there was some ‘serial’ ordering of the transactions we could not possibly have both tables containing the value zero in them. If Session 1 executed before Session 2 – then table B would have a count of 1. If Session 2 executed before Session 1 – then table A would have a count of 1. As executed above, however, both tables will have a count of zero. They just executed as if they were the only transaction in the database at that point in time. No matter how many times Session 1 queried table B, the count will be the count that was committed in the database at time 0:00. Likewise, no matter how many times Session 2 queries table A, it will be the same as it was at time 0:01.

In Oracle, serializability is achieved by extending the read consistency we get at the statement level to the transaction level. Instead of results being consistent with respect to the start of a statement, they are pre-ordained at the time you begin the transaction. Pretty deep thought there – the database already knows the answer to any question you might ask it, before you ask it.

This degree of isolation comes with a price – that price is the error:

     ERROR at line 1:
ORA-08177: can’t serialize access for this transaction

You will get this message whenever you attempt to update a row that has changed since your transaction began. Oracle takes an optimistic approach to serialization; it gambles on the fact that the data your transaction wants to update won’t be updated by any other transaction. This is typically the way it happens, and then the gamble pays off, especially in OLTP-type systems. If no one else updates your data during your transaction, this isolation level, which will generally decrease concurrency in other systems, will provide the same degree of concurrency as it would without SERIALIZABLE transactions. The downside is that you may get the ORA-08177 error if the gamble doesn’t pay off. If you think about it, however, the gamble is worth the risk. If you’re using SERIALIZABLE transactions, you should not be expecting to update the same information as other transactions. If you do, you should use SELECT ... FOR UPDATE as shown above, and this will serialize the access. So, if you:

      • Have a high probability of no one else modifying the same data;
      • Need transaction level read consistency;
      • Will be doing short transactions (in order to help make the first bullet point a reality);

then using an isolation level of SERIALIZABLE will be achievable and effective. Oracle finds this method scalable enough to run all of its TPC-C benchmarks (an industry-standard OLTP benchmark; see http://www.tpc.org for details). In many other implementations, you would find this being achieved with shared read locks and their corresponding deadlocks and blocking. In Oracle, we do not get any blocking, but we will get the ORA-08177 error if other sessions change the data we want to change. However, we will not get it as frequently as you would get deadlocks and blocks in the other systems.
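The ORA-08177 scenario described above can be sketched with two sessions (session boundaries shown as comments; the table is the accounts table from the running example):

```sql
-- Session 1:
alter session set isolation_level = serializable;
select account_balance from accounts where account_number = 123;

-- Session 2, meanwhile, modifies and commits that row:
update accounts set account_balance = 0 where account_number = 123;
commit;

-- Session 1 now attempts to update the row it read:
update accounts
   set account_balance = account_balance + 10
 where account_number = 123;
-- ORA-08177: can't serialize access for this transaction
```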

Read-Only Transactions

Read-only transactions are very similar to SERIALIZABLE transactions, the only difference being that they do not allow modifications, so they are not susceptible to the ORA-08177 error. Read-only transactions are intended to support reporting needs, where the contents of the report need to be consistent with respect to a single point in time. In other systems, you would use REPEATABLE READ, and suffer the associated effects of the shared read lock. In Oracle, you will use the read-only transaction. In this mode, the output you produce in a report that uses 50 SELECT statements to gather the data will be consistent with respect to a single point in time – the time the transaction began. You will be able to do this without locking a single piece of data anywhere.

This is achieved by using the same multi-versioning as used for individual statements. The data is reconstructed as needed from the rollback segments and presented to you as it existed when the report began. Read-only transactions are not trouble-free, however. Whereas we might see an ORA-08177 in a SERIALIZABLE transaction, we might expect to see an ORA-1555 snapshot too old error with read-only transactions. This will happen on a system where other people are actively modifying the information we are reading. Their changes (undo) are recorded in the rollback segments. But rollback segments are used in a circular fashion in much the same manner as redo logs. The longer the report takes to run, the larger the chance that some undo we need to reconstruct our data won’t be there anymore. The rollback segment will have wrapped around, and the portion of it we need has been reused by some other transaction. At this point, you will receive the ORA-1555, and will have to start over again. The only solution to this sticky issue is to have rollback segments that are sized correctly for your system. Time and time again, I see people trying to save a few MBs of disk space by having the smallest possible rollback segments (why ‘waste’ space on something I don’t really need?). The problem is that the rollback segments are a key component of the way the database works, and unless they are sized correctly, you will hit this error. In 12 years of using Oracle 6, 7, and 8, I can say I have never hit an ORA-1555 outside of a testing or development system. If you hit them there, you know you have not sized the rollback segments correctly and you fix it. We will revisit this issue in Chapter 5, Redo and Rollback.
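In the Oracle releases this chapter covers, rollback segments are created and sized by hand; a sketch (the tablespace name and sizes are purely illustrative, not recommendations):

```sql
create rollback segment rbs_big
  tablespace rbs
  storage ( initial 1m next 1m minextents 20 optimal 50m );

alter rollback segment rbs_big online;
```

The OPTIMAL setting is the size the segment shrinks back to after a spike; setting it generously is what keeps long-running reports from hitting ORA-1555.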

Summary

In this section, we covered a lot of material that, at times, makes you scratch your head. While locking is rather straightforward, some of the side effects are not. However, it is vital that you understand these issues. For example, if you were not aware of the table lock Oracle uses to enforce a foreign key relationship when the foreign key is not indexed, then your application would suffer from poor performance. If you did not understand how to review the data dictionary to see who was locking whom, you might never figure that one out. You would just assume that the database ‘hangs’ sometimes. I sometimes wish I had a dollar for every time I was able to solve the ‘insolvable’ hanging issue by simply running the query to detect un-indexed foreign keys, and suggesting that we index the one causing the problem – I would be very rich.
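The query mentioned above, for detecting un-indexed foreign keys, can be sketched from the data dictionary. This is a simplified, hedged version (it only checks that some index leads with the first column of the constraint; a complete version must compare every column of a multi-column key):

```sql
select c.table_name, c.constraint_name
  from user_constraints c
 where c.constraint_type = 'R'          -- 'R' = referential (foreign key)
   and not exists
     ( select null
         from user_cons_columns cc, user_ind_columns i
        where cc.constraint_name = c.constraint_name
          and cc.position        = 1
          and i.table_name       = c.table_name
          and i.column_name      = cc.column_name
          and i.column_position  = 1 );
```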

We took a look at the meaning of the isolation levels set out in the SQL92 standard, and at how other databases implement their meaning, compared to Oracle. We saw that in other implementations, ones that employ read locks to provide consistent data, there is a huge trade-off between concurrency and consistency. In order to get highly concurrent access to data, you would have to decrease your need for consistent answers. In order to get consistent, correct answers, you would need to live with decreased concurrency. We saw how in Oracle that is not the case – all due to multi-versioning. This short table sums up what you might expect in a database that employs read locking versus Oracle’s multi-versioning:

Isolation Level    Implementation      Writes       Reads         Deadlock-    Incorrect       Lost      Lock escalation
                                       block reads  block writes  sensitive    query results   updates   or limits
                                                                  reads
READ UNCOMMITTED   Read locking        No           No            No           Yes             Yes       Yes
READ COMMITTED     (other databases)   Yes          No            No           Yes             Yes       Yes
REPEATABLE READ                        Yes          Yes           Yes          No              No        Yes
SERIALIZABLE                           Yes          Yes           Yes          No              No        Yes
READ COMMITTED     Multi-versioning    No           No            No           No              No*       No
SERIALIZABLE       (Oracle)            No           No            No           No              No        No

* With select for update nowait

Concurrency controls, and how the database implements them, are definitely things you want to have a good grasp of. I’ve been singing the praises of multi-versioning and read consistency, but like everything else in the world, it is a double-edged sword. If you don’t understand that it is there and how it works, you will make errors in application design. Consider the resource scheduler example from Chapter 1. In a database without multi-versioning, and its associated non-blocking reads, the original logic employed by the program may very well have worked. However, this logic would fall apart when implemented in Oracle – it would allow data integrity to be compromised. Unless you know how it works, you will write programs that corrupt data. It is that simple.

--

Thomas Kyte is the Vice President of the Core Technologies Group at Oracle Corporation and has been with them since version 7.0.9 was released in 1993. Kyte, however, has been working with Oracle since version 5.1.5c. At Oracle Corp., Kyte works with the Oracle database, and more specifically, he helps clients who are using the Oracle database and works directly with them specifying and building their systems or rebuilding and tuning them. Prior to working at Oracle, Kyte was a systems integrator who built large scale, heterogeneous databases and applications for military and government clients. Tom Kyte is the same “Ask Tom” whose column appears in Oracle Magazine, where he answers questions about the Oracle database and tools that developers and DBAs struggle with every day.


Contributors : Tom Kyte
Last modified 2005-12-01 02:43 PM