Transition Constraints

by Joe Celko

When most newbies think of constraints, they know static column constraints, such as NOT NULL, DEFAULT, and CHECK() clauses. A little bit later, they will learn about simple Declarative Referential Integrity (DRI) constraints. That means simple PRIMARY KEY and REFERENCES clauses with some simple actions

The bad news is that these are not enough for all business rules. A transition constraint says that an entity can be updated only in certain ways. These constraints are often modeled as a state transition diagram. There is an initial state, flow lines that show what are the next legal states, and one or more termination states.

One Beginning, Many (Possible) Endings

As a very simple example, pretend that we have a winery where we get grape juice, and then we produce either wine or vinegar (refer to figure 1). Notice that we have to start with grape juice. It is important to have a single initial state, but you can have many termination states. For example, a credit application might have “accepted” or “rejected” as termination states.

Figure 1.

In this example, we have only one termination state, spoilage. Notice also that wine can change into vinegar, or we can make vinegar directly from the grape juice. But these rules do not allow us to go directly from grape juice to spoilage. (Assume that there is an accounting or tax reason for that for rule.)

Let’s start with a table skeleton and try to be careful about the possible states of our product:

CREATE TABLE Winery
( ..
 fermentation_state VARCHAR(15) 
     DEFAULT 'Grape Juice' 
     NOT NULL
   CHECK (fermentation_state IN ('Grape Juice', 'Wine', 'Vinegar',
'Spoilage')),
..);

We are being good programmers, using a DEFAULT and a CHECK() constraint. But this does not prevent us from turning grape juice directly to spoilage, converting vinegar to wine, and so on.

Triggers

One solution is to add a trigger to the table. The problem with triggers is that, while there is SQL-99 syntax for triggers, every SQL product has a proprietary syntax and often a non-ANSI model. Here is a SQL-99 version that can probably be improved, but it demonstrates the idea:

CREATE TRIGGER CheckFermentationTransitions
AFTER UPDATE ON Winery 
REFERENCING OLD AS O1 NEW AS N1

IF EXISTS 
  (SELECT *
    FROM O1, N1
    WHERE NOT -- not a valid transition 
      ((O1.fermentation_state = 'Grape Juice'
       AND N1.fermentation_state IN ('Wine', 'Vinegar'))
      OR (O1.fermentation_state = 'Vinegar'
        AND N1.fermentation_state = 'Spoilage' 
      OR (O1.fermentation_state = 'Wine'
        AND N1.fermentation_state = 'Spoilage'))))
THEN ROLLBACK;
END IF;

This is a bit messy, but can be mechanically generated. I am assuming that the ways to succeed outnumber violations, so a negation will be easier to maintain and read.

This is often the first approach that a Newbie takes once they get to a certain point in their SQL. They still feel more comfortable with procedural code, so triggers give them comfort.

The problem is that triggers do not pass information to the optimizer, will not port, and run slower than non-procedural code.

CHECK Constraints

You can actually use CHECK() constraints, but you have to store the current and previous states.

CREATE TABLE WineCellarChanges 
( ..
 previous_state VARCHAR(15) NOT NULL,
 current_state VARCHAR(15) DEFAULT 'Grape Juice' NOT NULL,
 CHECK (NOT ((previous_state = 'Grape Juice'
       AND current_state IN ('Wine', 'Vinegar'))
      OR (previous_state = 'Vinegar'
        AND current_state = 'Spoilage')
      OR (previous_state = 'Wine'
        AND current_state = 'Spoilage')))
..);

In effect, the transition table is converted into predicates. This procedure has advantages; it will pass information to the optimizer, will port, and will usually run faster than procedural code.

DRI Transition Constraints

Let’s generalize the CHECK() constraint. A declarative way to enforce Transition Constraints is put the state transitions into a table and then reference the legal transitions. This requires that the target table have both the previous, and the current, state in two columns. Using the winemaking example, we would have something like this:

CREATE TABLE StateChanges
(previous_state VARCHAR(15) NOT NULL,
 current_state VARCHAR(15) NOT NULL, 
PRIMARY KEY (previous_state, current_state));

INSERT INTO StateChanges VALUES ('Grape Juice', 'Grape Juice');
INSERT INTO StateChanges VALUES ('Grape Juice', 'Wine');
INSERT INTO StateChanges VALUES ('Wine', 'Vinegar');
INSERT INTO StateChanges VALUES ('Wine', 'Spoilage');
INSERT INTO StateChanges VALUES ('Vinegar', 'Spoilage');
INSERT INTO StateChanges VALUES ('Spoilage', 'Spoilage');

The target table looks like this.

CREATE TABLE WineCellarChanges 
( ..
 previous_state VARCHAR(15) DEFAULT 'Grape Juice' NOT NULL,
 current_state VARCHAR(15) DEFAULT 'Grape Juice' NOT NULL,
 FOREIGN KEY (previous_state, current_state)
   REFERENCES StateChanges (previous_state, current_state)
   ON UPDATE CASCADE,
 transition_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
..);

If you want to hide this from the users, then you can use an updatable view:

CREATE VIEW Winery (.., fermentation_state, ..)
AS 
SELECT .., current_state, ..
  FROM WineCellarChanges;

The immediate advantages to doing so are that this will pass information to the optimizer, will port, and since the rules are separated from the table declaration, you can maintain them easily.

A not-so-obvious advantage is that the StateChanges table can contain other data and conditions, such as temporal change data. Grape juice becomes wine in (w) days and vinegar in (w+v) days, then they each spoil in (w+s1) and (v+s2) days respectively.

Of course, there are other tricks with extra columns in the StateChanges table and how you join it to the target table. Gee, why don’t you readers come up with something?

Joe Celko was a member of the ANSI X3H2 Database Standards Committee and helped write the SQL-92 standards. He is the author of over 750 magazine columns and seven books, the best known of which is SQL for Smarties, now in its third edition. For more articles, puzzles, resources, and more, go to www.celko.com.

Contributors : Joe Celko
Last modified 2006-08-24 12:33 PM

Surprised about this article

Posted by abravo at 2006-01-28 09:07 AM

Hi Joe,
I don't agree with your proposal at all. Adding a column called previous_state to a table does not sound like a good design practice to me.

State transition in this case could be represented as a hierarchy and the checking of the constraint could be done against that hierarchy, of course in another table.

I would manage this situation as they do with data warehouses to maintain the history using slowly changing dimensions or treating the status changes as a fact.

Thanks

Replies to this comment

In defense of transition tables (Posted by jcelko at 2006-02-03 01:39 PM)

How do you deal with changes?

Posted by tobybest at 2006-08-25 10:09 AM

I have a question for you,
Let's say that I use this approach for transition constraints for a production line. I have all my transition processes before I get the final product.
Now, I add a new transition constraint that fits in the middle. For example. These are the initial transitions:
statea - stateb
stateb - statec
My new transitions are
statea - statea1
statea1 - stateb
stateb - statec
If I do this, how do I deal with the existing data? statea to stateb will no longer be a valid option.
Thanks

DBAzine.com

Sections

Personal tools

Menu

Who Are You?