How Much Does an Update Cost?

by Jonathan Lewis

There are several application generators on the market that adopt the basic approach of "keep it simple." Simple code can be easier to generate and easier to maintain, even if it may seem a little less efficient. But if a screen generator always generates code to update every column in a table, even if the user changes just one field on the screen, how much does this cost you at the database?

A Brief History of Screen Generators

Once upon a time, SQL*Forms (as it then was) embedded a single update statement for each block. This statement would update (by rowid) every column in the table that had been referenced in the block. This seemed to be a nice idea, because it kept the code simple and efficient at the client end of the system: there was no need to run a CPU-intensive task to discover which fields had actually changed, and no need to construct dynamically an exact piece of SQL to update just the matching columns at the database.

Some time around Forms 4.5 (I may be wrong about the version, and wait to be corrected), Oracle introduced a flag that you could set to choose between a single 'update everything' statement and a dynamically generated 'update just the minimum set of columns' statement. Which option is the smarter?

What Does It Cost to Update a Column?

Specifically we are interested in what it costs to update a column without changing it. If the database could detect that an incoming update was not actually going to change anything then the marginal cost of the redundant update would be minimal. Unfortunately, the database does not try to check for redundant updates. After all, it's reasonable to assume that updates are supposed to change the data. It would be counter-productive to add a check that was almost always true simply to make a small saving on those 'extremely rare and pointless' occasions when a 'no-change' update arrives.

So what could happen if you update a single column in a table using a single transaction? Clearly, the row has to be locked and the data modified, so an Interested Transaction List (ITL) entry in the block has to be acquired. A transaction table slot has to be taken in the undo segment header to act as a globally visible "reference" for the transaction, and an undo record has to be written into an undo block to describe how to reverse out the changes you have just made to the data block. The changes to all three blocks have to be recorded in the redo log (initially in just the log buffer), and in this simple case this will take just one redo record.

When you then commit the transaction, the transaction table slot is updated with the commit SCN and marked as free, and the address of the undo block you have used may also be written back into the free block pool in the undo segment header block. These changes to the undo segment header block are recorded in the redo log (buffer) and the log writer process (lgwr) is called to flush the log buffer to disc, after which your process is informed that the commit has succeeded. (Oracle may also, and in this case probably would, go back and clean up the changed data block, but would not record these clean-up changes in the redo log).
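If you want to put some numbers against this description, one simple approach (a sketch only, assuming you can query v$mystat and v$statname, and that you have built the t2 table from figure 1 below) is to snapshot your session statistics either side of a small single-column update:

select sn.name, ms.value
from   v$statname sn, v$mystat ms
where  sn.statistic# = ms.statistic#
and    sn.name in ('redo size', 'redo entries', 'undo change vector size');

update t2 set n2 = n2 + 1 where rownum = 1;
commit;

-- re-run the query against v$mystat and look at the deltas in
-- 'redo size', 'redo entries' and 'undo change vector size'.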

Assume that you, the user, updated just one field on screen — the description above covers the amount of work the database has to do to apply your change. But what if the screen generator has updated the whole row — what marginal costs appear?

The cost of taking the ITL entry, locking and updating the row has probably not changed much, and the cost of acquiring the undo segment header block has not really changed.

But the volume of data written to the undo record has probably gone up significantly — instead of an overhead of about 100 bytes plus the old version of one column, we now have the overhead plus the old version of all the columns. But perhaps that won't matter too much, after all Oracle writes in blocks not records — but if our undo records are now four times the size we will, on average, write about four times as many undo blocks as we need to.

Similarly, the redo record will have changed significantly. In the perfect case, the redo record would have been about 200 bytes long for the update (plus a further 200 relating to the segment header block and transaction audit vector). This would probably have been padded to 512 bytes (the typical o/s block boundary) with redo wastage as we issued the commit. But the main redo record consists largely of two change vectors — the vector for the table block, and the vector for the undo record — and both of these have increased in size because the undo's redo record now includes the old version of all the columns and the table's redo record now includes the new version for all the columns. So the rate at which you churn out redo may have gone up by a factor of something between two and four — and in many high-throughput systems the speed of getting the redo to disc is often a critical performance issue.
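As a rough check on that ratio, you could repeat the snapshot technique above for two versions of the same 'no-change' update, one touching a single column and one touching every column. This is only a sketch, again using the t2 table from figure 1 below; with three small number columns the difference will be modest, but it grows with the number and size of the columns:

-- if the t2_bru trigger from figure 1 already exists, drop it first so
-- that its own overhead doesn't distort the redo figures:
-- drop trigger t2_bru;

-- single-column 'no-change' update
update t2 set n2 = n2 where rownum = 1;
commit;

-- whole-row 'no-change' update of the same row
update t2 set id_gp = id_gp, id_p = id_p, n2 = n2 where rownum = 1;
commit;

-- compare the deltas in 'redo size' and 'undo change vector size'
-- reported by v$mystat after each statement.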

But There's More

Extra volume of undo and redo isn't necessarily the most significant issue though — after all, the extra volume generated and buffered isn't usually dramatic given the size of the basic overheads. Furthermore, because Oracle's basic architecture tries to push disk writes into background "asynchronous" processes, the end-user doesn't often see the time-lag due to disk-writes. But there are several considerations that will have a direct impact on the way that the end user sees the performance of the system.

In the event of a column being updated with a 'no-change' update, what do you think Oracle does about:

- Row-based triggers of the type 'update of {column list}':
      - before row
      - after row
      - instead of

- Updates to indexes that include that column:
      - B*tree indexes
      - bitmap indexes
      - function-based indexes

- Referential integrity:
      - when the column is at the child end of a relationship
      - when the column is at the parent end of a relationship

Triggers

Create a simple table with a trigger, and try a test like the one shown in figure 1:

create table t2 (
	id_gp	number(4),
	id_p	number(4),
	n2		number(4)
);

insert into t2 values (1,1,1);
commit;
 
create or replace trigger t2_bru
before update of id_gp on t2
for each row
-- when (
--	   new.id_gp != old.id_gp
--	or new.id_gp is null and old.id_gp is not null
--	or old.id_gp is null and new.id_gp is not null
-- )
begin
	dbms_output.put_line('Updating');
end;
/

set serveroutput on

column rid new_value m_rid

select rowid rid 
from t2 
where rownum = 1;

update t2 set id_gp = id_gp where rowid = '&m_rid';

Figure 1: A simple table, a before-row trigger, and a 'no-change' update.

The trigger fires. The same thing happens if the trigger is an after-row update trigger. It is left as an exercise for the reader to confirm my assumption that the same thing happens with an 'instead of' trigger.

In passing, a before-row trigger actually generates one extra undo record and one extra redo record — even if it takes no action because of a when clause, such as the necessarily complex clause commented out in the example — so if you have the choice, it is probably a little more efficient to use after-row triggers.
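If you want to check that last point, one option (a sketch only; the trigger name t2_aru is just an illustration) is to swap the before-row trigger of figure 1 for an equivalent after-row trigger and watch the 'redo entries' and 'redo size' statistics for the same 'no-change' update:

drop trigger t2_bru;

create or replace trigger t2_aru
after update of id_gp on t2
for each row
begin
	dbms_output.put_line('Updating');
end;
/

update t2 set id_gp = id_gp where rowid = '&m_rid';
commit;

-- compare the changes in 'redo entries' and 'redo size' (v$mystat) with
-- the figures you recorded when the before-row trigger was in place.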

Indexes

You have to be a little fussier with some of the experiments with indexes, as you may want to count logical I/Os to get a full picture of the cost, and this may require you to build a table that is large enough to have a multi-level index. However, with some tests it will be sufficient to examine undo, redo, and locks, and not rely on something as sensitive as logical I/O.

So what happens if you do a 'no-change' update on a column which is part of a simple B-tree index? Nothing. Oracle detects that the indexed value has not changed, and doesn't even traverse the index, let alone lock an entry. The same is true of simple bitmap indexes.
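A quick way to convince yourself of this (a sketch, using the t2 table from figure 1; the index name t2_n2_idx is just for illustration) is to index one of the columns and compare a 'no-change' update with a real change:

create index t2_n2_idx on t2(n2);

-- 'no-change' update of the indexed column
update t2 set n2 = n2 where rowid = '&m_rid';
commit;

-- a real change, for comparison
update t2 set n2 = n2 + 1 where rowid = '&m_rid';
commit;

-- check 'db block gets', 'redo entries' and 'redo size' in v$mystat for
-- each update; the 'no-change' update should show no sign of the index
-- being visited or maintained.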

You might wonder if something nasty happens when you switch to function-based indexes: does the function have to be called 'just in case', and does Oracle track the dependency between the function and the columns it references properly? The answer is that everything works properly: you don't find redundant executions of the function or trips to the index. As a test case, you could start with the table from fig. 1 and run the following SQL.

create or replace function my_fun(
		i1	in	number,
		i2	in	number
) return number deterministic
as
begin
	dbms_output.put_line('Testing function');
	return i1 + i2;
end;
/

create index t2_idx on t2(my_fun(id_gp,id_p));

update t2 set id_gp = id_gp where rowid = '&m_rid';

update t2 set id_gp = id_gp + 1 where rowid = '&m_rid';

You will find that the function is called once (because there is only one row in the table) as the index is created, but it is not called for the 'no-change' update. By the way, the function will be called twice in the second update - once to find the original location in the index, and once to calculate the new location. You may find that the function is called twice more if you issue a rollback - I believe I observed this in the earliest releases with function-based indexes - but it doesn't seem to happen any more.

Referential Integrity

This, perhaps, is where the crunch comes on OLTP systems. You get hit twice - first as a child, and then as a parent.

Take the table from fig. 1, insert another nine identical rows into it to get a total of 10 rows, and create a small parent table t1 for it to reference. Then run the following tests and check the logical I/O etc.:

update t2 set id_gp = id_gp;

> 10 rows updated.

-- create a small parent table (not part of figure 1) to reference
create table t1 (id_gp number(4));
insert into t1 values (1);
commit;

alter table t1 
add constraint t1_pk primary key (id_gp);

alter table t2 
add constraint t2_fk_p1 foreign key (id_gp) references t1;

update t2 set id_gp = id_gp;

> 10 rows updated.

You will find that the number of db block gets (current mode gets) goes up by 10 when the integrity constraint is in place. Why? Because on each update, Oracle checks the foreign key constraint - and it does this by tracking down the parent's primary key index using current mode gets. If you have a large parent table, this could mean three current mode gets every time you update a child column redundantly.
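If you would rather not query v$mystat directly, a convenient alternative (a sketch, assuming you are working in SQL*Plus and have the privileges needed for autotrace) is to run the update before and after adding the constraints with statistics reporting switched on, and compare the 'db block gets' figures:

set autotrace traceonly statistics

update t2 set id_gp = id_gp;
rollback;

set autotrace off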

Switching to the parent end of the trap: you need only try a 'no-change' update on a parent row when there is no index on the child table's foreign key column to discover that you acquire the dreaded mode 4 TM lock on the child table. Foreign key indexes aren't necessary if you never update or delete parent key values; but if you are having problems with random 'hangs' and deadlocks being reported, then perhaps you have the classic problem: you know you don't update the parent keys, but your application generator is doing it behind your back.
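To see the parent-end problem in action you need two sessions. The following is a sketch only, assuming the t1/t2 setup above with no index on t2(id_gp); the exact locking behaviour varies a little across Oracle versions:

-- Session 1: leave an uncommitted change against the child table
insert into t2 values (1,1,1);

-- Session 2: a 'no-change' update of the parent key now needs a mode 4
-- (share) TM lock on the unindexed child table, and has to wait
update t1 set id_gp = id_gp;

-- Session 1 (or a third session): look at the TM lock requests
select sid, type, id1, lmode, request
from   v$lock
where  type = 'TM';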

There's Always a Trade-off

By now, you may have decided that you obviously have to go back and rewrite lots of code. But writing code to produce the perfect SQL statement every time is likely to increase the risk of errors in the code. The trade-off between risk (and time to code, test and debug) and performance is always a valid point of argument, so if the server is nowhere near capacity then it may be perfectly sensible to ignore the issue — at least in the short term.

And there's another, more subtle, trade-off. If your application generates the perfect SQL for every update that the front-end can fire, then the number of different SQL statements could escalate dramatically.

In theory, if your table has N columns, then there are power(2,N) - 1 possible update statements, even if you restrict yourself to single-row updates by rowid; a 10-column table, for example, allows 1,023 different combinations of columns. Unless you increase the size of the shared pool, and tweak a couple of parameters such as session_cached_cursors, you may find that your savings in one area are offset by extra expenses (such as library cache contention) appearing elsewhere.

Conclusion

Allowing front-end tools to take the easy option when updating data, by writing a single SQL statement for all possible updates, can add a significant load to your system. If you are running client/server or N-tier, then you may be better off using extra CPU at the client end of the system to build custom SQL to minimise the cost at the server. The decision is not black and white, though. Make sure that the cost is worth the benefit.

--

Jonathan Lewis is a freelance consultant with more than 17 years' experience in Oracle. He specialises in physical database design and the strategic use of the Oracle database engine, is author of Practical Oracle 8i - Building Efficient Databases published by Addison-Wesley, and is one of the best-known speakers on the UK Oracle circuit. Further details of his published papers, tutorials, and seminars can be found at www.jlcomp.demon.co.uk, which also hosts The Co-operative Oracle Users' FAQ for the Oracle-related Usenet newsgroups.

