Skip to content

DBAzine.com

Sections
Personal tools
You are here: Home » Of Interest » Articles of Interest » An Old Class of Errors
Who Are You?
I am a:
Mainframe True Believer
Distributed Fast-tracker

[ Results | Polls ]
Votes : 2416
 

An Old Class of Errors

by Fabian Pascal

In one of my earlier articles, I discussed the industry’s knack for reinventing wheels — often square wheels at that — simply because it ignores the mistakes of the past, including failures to adhere to the fundamentals of the field. Consider, for example, Dan Tow’s article “Wrong-Errors Bugs: A New Class of Bug” on the O’Reilly Website. Tow’s article starts as follows:

Above all else, we count on databases to reflect the truth consistently, or at least to reflect the table data perfectly. The database cannot be blamed when an application (or the end users of an application) place inaccurate data in its tables, but a database must accurately report the data it holds. Therefore, bugs are not all created equal; there are bugs, and there are wrong-rows bugs, bugs that silently misrepresent the data that the tables hold. Even the craziest, most obscure corner case that potentially misrepresents your data should rightly bring a loud chorus: “The emperor has no clothes!” We depend on the database, above all, not to lie.

First, databases are collections of tables, so they certainly reflect their own data perfectly.

Second, does Tow, like so many practitioners, use the term database to mean both database — the collection of data — and DBMS — the software system managing the data — interchangeably? Such confusion misleads. It is indeed true that a database cannot be blamed when an application (or user) corrupts it; but, the DBMS is certainly to blame for poor support of the integrity necessary to prevent such corruption (assuming, of course, that it’s not the authorized users’ failure to declare the constraints to the DBMS, in which case it’s the users’ fault).

An integrity failure is not a bug.

As we explain in “Un-muddling Modeling,” and, “The Logic of Business Rules,” a database is the computerized logical representation of a conceptual (or business) model, consisting of a set of informal business rules. These rules are the user-understood meaning of the data. Because computers comprehend only formal representations, business rules cannot be represented directly in a database. They must be mapped to a formal representation, a logical model, which consists of a set of integrity constraints. These constraints — the database schema — are the logical representation in the database of the business rules and, therefore, are the DBMS-understood meaning of the data. It follows that if the DBMS is unaware of and/or does not enforce the full set of constraints representing the business rules, it has an incomplete understanding of what the data means and, therefore, cannot guarantee (a) its integrity by preventing corruption, (b) the integrity of inferences it makes from it (that is, query results) — this is another way of saying that the DBMS is, at best, incomplete.

Note: The DBMS-“understood” meaning — integrity constraints — is not identical to the user-understood meaning — business rules — but, the loss of some meaning notwithstanding, we gain the ability to mechanize logical inferences from the data.

No DBMS can guarantee truth. If, for example, there is an EMPLOYEES table in the database

     EMPLOYEES {EMP#, ENAME, DEPT#, SALARY}
that contains a row
     {E21,Pianka,A00,25000}
representing a proposition about an employee:
     Employee uniquely identified by employee number E21, has name Pianka, 
works in department A00, earns salary 25,000

there is no way a DBMS can tell whether the row represents a proposition that is true or not in the real world (in fact, it does not even know what an employee, a name, a department, or a salary is!) The only thing it can, and should do, is to guarantee consistency with the integrity constraints (and, therefore, the business rules) in effect. For example, if there is a business rule

     The maximum employee salary is $20,000
that maps to the integrity constraint
     CHECK MAX(employees.salary) = 20000

then, the above row is inconsistent with the integrity constraint and, therefore, it represents a proposition that cannot possibly be true, because it contradicts a business rule (assuming, of course, that the business rule was declared correctly). So the occurrence of this row in the database is not a bug, it is an integrity failure, or corruption. The DBMS cannot even issue an error message if it is unaware of the constraint, and if it is aware, it would enforce the constraint, not issue the error.

Note very carefully, though, that on the other hand, the DBMS cannot do anything about false propositions represented by rows that are consistent with the integrity constraints. For example, if the proposition:

     Employee uniquely identified by employee number E21, has name Pianka,
  works in department A00, earns salary 19,000

but, in fact, the employee actually earns 17,000, the DBMS has no way of knowing that the row

{E21,Pianka,A00,19000}
represents a false proposition, and will allow it in the database.

Otherwise put, by convention, rows consistent with the constraints are deemed by the DBMS representatives of true propositions that must occur in the database, and rows in violation of the constraints as representative of false propositions that should not occur in the database (this convention is called the Closed World Assumption). It is the authorized users’ responsibility to guarantee the accuracy of the business rules specified in the conceptual model, and the integrity constraints declared to the DBMS. And it’s the DBMS’s responsibility to accept and enforce all the integrity constraints in the schema that represent the full set of business rules.

Consequently, what does Tow mean by “wrong-row bugs” and “bugs that silently misrepresent that data that the tables hold”?

      • Databases do not “reflect the truth consistently,” but rather, the DBMS ensures that they represent what, by convention, we assume to be true: propositions that do not violate the business rules.
      • There is no such thing as a “wrong-row bug.” The only reason users encounter such rows in databases or in query results, is because either DBMSs or users fail to adhere to the formal logic on which the relational model is predicated. It’s the theory that guarantees integrity; ignore it at your peril.
--

Special Offer: Author Fabian Pascal is offering DBAzine readers discounted subscriptions to the Practical Database Foundation series of papers. To take advantage of this offer, contact him via the About page, ordering details are on the Publications page at www.dbdebunk.com.

Fabian Pascal has a national and international reputation as an independent technology analyst, consultant, author, and lecturer specializing in data management. He was affiliated with Codd & Date and for 20 years held various analytical and management positions in the private and public sectors, has taught and lectured at the business and academic levels, and advised vendor and user organizations on data management technology, strategy and implementation. Clients include IBM, Census Bureau, CIA, Apple, Borland, Cognos, UCSF, and IRS. He is founder, editor and publisher of Database Debunkings, a Web site dedicated to dispelling persistent fallacies, flaws, myths and misconceptions prevalent in the IT industry. Together with Chris Date he has recently launched the Practical Database Foundations series of papers that also serve as text for seminars. Author of three books, he has published extensively in most trade publications, including DM Review, Database Programming and Design, DBMS, Byte, Infoworld and Computerworld. He is author of the contrarian columns Against the Grain, Setting Matters Straight, and Test Your Foundation Knowledge.


Fabian Pascal
Last modified 2005-05-26 03:25 PM
Transaction Management
Reduce downtime and increase repeat sales by improving end-user experience.
Free White Paper
Database Recovery
Feeling the increased demands on data protection and storage requirements?
Download Free Report!
 
 

Powered by Plone