ODBMS vs. TRDBMS - Reply to Barry

by Fabian Pascal

Before the previous incarnation of this column was terminated (see "Another One Bites the Dust"), the editor there asked one Douglas Barry, Executive Director of the Object Data Management Group (ODMG), consultant, and author of numerous books, "to write a few words on the benefits of the object database model as compared to the relational database model." This was in response to my criticism of object database technology, under the warped journalistic concept of "balance": technically grounded arguments countered by self-serving industry and vendor claims (see "Silly Seeley").

I could have easily dismissed Barry's whole argument by simply stating that object-orientation (OO) does not have a data model analogous to the relational data model and, therefore, is not a data management technology. If this is not true, then can Barry specify (clearly and precisely, please!) the structural, integrity, and manipulative features of the so-called "object model," one that is agreed upon? But dismissing it this way would rob me of the opportunity to show the many other fallacies in Barry's position.

Barry strikes what may seem a reasonable note:

"Frankly, I hesitate to frame the discussion that way. The reason for my hesitation is that I do not necessarily see object database (ODBMS) and relational database (RDBMS) products as competitive. Rather, I see them as complementary choices that, at times, can co-exist quite well in enterprise architecture. Often, this is with multi-tier architectures. Two examples of how they can co-exist:

Web-based catalogs where content is drawn from multiple, existing sources to "publish" the catalog in the middle tier. The ODBMS manages access to the catalog. The RDBMSs or other storage mechanisms manage the existing sources. Often, the middle-tier catalog is something new that has not existed before in the enterprise.

Trading or auction systems that keep only the current items being traded or auctioned in the middle tier. The ODBMS manages the active trading or auction data. The RDBMS serves as the backend database or database of record which receives the necessary historical data from the ODBMS. Often, the middle-tier trading or auction systems are new or are replacing older, non-DBMS technology."

But this begs the question: it implies a need for two kinds of DBMS for different purposes which, in turn, suggests that RDBMSs cannot handle certain aspects of data management, which ODBMSs can. While this is predictable coming from ODMG, it has no basis in reality.

First, please bear in mind that when Barry says RDBMS, he means SQL DBMSs. Any flaws or deficiencies of such products result not from their being relational, but from their not being relational enough, and from poor implementations to boot.

Second, the real issue is not whether ODBMSs can, but whether they should coexist with RDBMSs. But Barry explains neither why the latter would not serve certain functions well, nor why the former do a better job with them. He just declares this to be the case. When serious technical analysis is applied to the subject, however, and the fundamental goals of object-orientation are correctly extended to data management and taken to their logical conclusion (see Foundation for Future Database Systems: The Third Manifesto), it is clear that a true ODBMS (as distinct from the current products that the industry produces and calls by that name) is nothing but a true RDBMS (TRDBMS), as distinct from SQL products. In other words, there is nothing in OO that adds anything of value to a TRDBMS.

Note very carefully that in Barry's examples, ODBMSs seem to be serving as "interfaces" to data managed by RDBMSs. This is not by chance.

"So, when should you consider using an ODBMS? For years, I have been telling my clients that they should be considering the use of an ODBMS only when they have a business need for high performance on complex data. Let me explain this further. If you

are using an object programming language such as Java or C++;

have, for perfectly valid design reasons, chosen to use advanced data structures afforded by the object-programming language; and

need the highest performance possible,

then you should be considering an ODBMS product.

There are design problems that often are best solved using data structures such as lists, vectors, hash tables, queues, stacks, etc. that are available or can be constructed in object programming languages. Sometimes, it is necessary to make these data structures persistent. With an ODBMS product, you can store any conceivable data structure directly with no conversion whatsoever. (If you think about it for a moment, you can see that this is a mighty significant statement.)"

OO is, essentially, not a data management technology, but a programming technology! And here is one implication: the programming language is central in OO and drives the selection of a DBMS. But the application development language is not, should not be, and does not need to be the criterion for choosing a DBMS (see "To A Hammer, Everything Looks Like Nails," parts 1 and 2). First, because there are other, data management criteria that are more important (see Practical Issues in Database Management); and second, because any TRDBMS can and should support any object-oriented programming language (OOPL) for application development (see "OO For Application Development, Not Database Management"). And please, oh, please, no "impedance mismatch" nonsense!

Note, in passing, the shift in (vague) terminology — one of the major problems in OO is fuzziness — from "complex data" to "advanced data structures" (what does 'advanced' mean?). Be that as it may, I dare Barry to explain what exactly in the relational model prohibits, or makes it impossible, for TRDBMSs to represent/manipulate any data structure of arbitrary complexity.
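To make the challenge concrete, here is a minimal sketch of a persistent FIFO queue handled in the database rather than in the programming language. It is an illustration under stated assumptions, not anything taken from Barry or from an actual TRDBMS: it uses Python's sqlite3 module, SQLite is a SQL DBMS and therefore falls short of a true RDBMS, and the table job_queue, its columns, and the functions open_queue, enqueue, and dequeue are hypothetical names invented for the example.

```python
import sqlite3


def open_queue() -> sqlite3.Connection:
    """Create an in-memory database holding one (hypothetical) queue table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job_queue ("
        " seq INTEGER PRIMARY KEY,"  # position of the entry in the queue
        " payload TEXT NOT NULL)"
    )
    return conn


def enqueue(conn: sqlite3.Connection, payload: str) -> None:
    """Add an entry behind every entry currently in the queue."""
    with conn:  # commit on success, roll back on error
        next_seq = conn.execute(
            "SELECT COALESCE(MAX(seq), 0) + 1 FROM job_queue"
        ).fetchone()[0]
        conn.execute("INSERT INTO job_queue VALUES (?, ?)", (next_seq, payload))


def dequeue(conn: sqlite3.Connection):
    """Remove and return the oldest entry, or None if the queue is empty."""
    with conn:
        row = conn.execute(
            "SELECT seq, payload FROM job_queue ORDER BY seq LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM job_queue WHERE seq = ?", (row[0],))
        return row[1]


queue = open_queue()
for job in ("first", "second", "third"):
    enqueue(queue, job)
print(dequeue(queue))  # the oldest entry comes out first
```

The application sees only rows and values; how the DBMS stores them is not its concern.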

But while data structures are a component of logical models, Barry's "complex structures" ("lists, vectors, hash tables, queues, stacks, etc.") are clearly storage structures, a display of the logical-physical confusion all too common in the industry. Since such structures are physical implementation details, not logical model components, the physical data independence mandated by the relational model (but not by OO) means that they are permissible and possible in TRDBMSs (as long as they are not exposed to users in applications). The fact that SQL DBMSs do not implement them has nothing to do with either OO or the relational data model.
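Physical data independence can be illustrated with a small sketch. It is only an approximation of the principle, under stated assumptions: it uses Python's sqlite3 module (a SQL DBMS, not a TRDBMS), and the table part, the index part_color_idx, and the function run_query are hypothetical names invented for the example. The same logical query is run with and without an index, which is a physical storage structure, and the application code does not change.

```python
import sqlite3


def run_query(with_index: bool) -> list[tuple]:
    """Run one fixed logical query; only the physical design varies."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE part ("
        " part_no INTEGER PRIMARY KEY,"
        " color TEXT NOT NULL,"
        " weight REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO part VALUES (?, ?, ?)",
        [(1, "red", 12.0), (2, "green", 17.0), (3, "red", 14.0), (4, "blue", 19.0)],
    )
    if with_index:
        # A storage-level structure: it may change how rows are found,
        # but it is never named in the query below.
        conn.execute("CREATE INDEX part_color_idx ON part (color)")
    rows = conn.execute(
        "SELECT part_no, weight FROM part WHERE color = ? ORDER BY part_no",
        ("red",),
    ).fetchall()
    conn.close()
    return rows


# The result is the same whether or not the index exists.
print(run_query(with_index=False) == run_query(with_index=True))
```

Whatever structure the DBMS uses underneath (a B-tree here, but it could equally be a hash table or a list), the query and its result stay the same; only performance may differ.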

"Performance is also enhanced because there is no conversion or mapping between the object programming data structures and the ODBMS. Mapping between object-programming data structures and an RDBMS reduces performance. The more complex the data structures, the greater the reduction in performance. Being able to store complex object programming data structures directly also saves on development costs because you write less code with an ODBMS as compared to an RDBMS. The reason you write less code is because you have only one data model to develop and maintain. Also, you do not need to write any mapping or conversion code that would be needed if you used a different data model in an RDBMS than the one afforded by the object programming language. This tight coupling between object programming languages and ODBMSs is one reason you are starting to see more and more ODBMSs being used in embedded systems, but I digress ..."

Well, even ignoring the fact that we are not talking about "data structures" here, but storage structures, the argument seems to go as follows:

There are data structures that must persist.
Instead of handling them in the database, where they belong, we are handling them in the programming language, where they do not belong.
So now that we need to map them to the database, relational technology is to blame.

This is what happens when logic is ignored in a field founded on it.

Note: The relational model is logic applied to databases, so how can (predicate) logic have anything to do with performance? It is the physical implementation of the DBMSs that determines performance and, unlike OO, the relational model mandates physical data independence — the insulation of logical models from implementation details. What this means, of course, is that vendors are free to do whatever they darn please at the physical level to maximize performance, as long as they don't expose the details to users in applications. Thus, (all else being constant) if a DBMS performs poorly, that is due to its implementation, not to its being an ODBMS or RDBMS: either can be implemented well or poorly performance-wise.

The argument about less ODBMS code is pure and unadulterated bunk. Reality is the exact opposite, because one of the main relational objectives is to minimize the amount of code, while OO is rooted in programming. Here is an excerpt from a practitioner's (not a theoretician's!) message to Database Debunkings:

"As a practice project, I've been rewriting a portion of a large Smalltalk application … and I've been stunned to see just how much application code disappears when you have a DBMS that supports declarative integrity constraints. In some classes, over 90% of the methods became unnecessary." [emphasis added]

And here's one of our weekly Quotes (which I absolutely love):

"The SLAC and LBNL researchers wrote more than half a million lines of software code to provide the physicists access to their data in a simple and reliable fashion."

— Objectivity Press Release.

That the vendor boasts about this says it all.
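The practitioner's point about declarative integrity constraints can be sketched as follows. This is a hedged illustration, not the Smalltalk application mentioned in the message: it uses Python's sqlite3 module (a SQL DBMS, so only an approximation of what a TRDBMS would offer), and the tables customer and orders, their columns, and the functions open_db and try_insert are hypothetical names invented for the example. The rules are declared once, in the database, and the application contains no validation code of its own.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database whose rules are declared, not coded."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only if asked
    conn.execute(
        "CREATE TABLE customer ("
        " cust_no INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " order_no INTEGER PRIMARY KEY,"
        " cust_no INTEGER NOT NULL REFERENCES customer (cust_no),"
        " quantity INTEGER NOT NULL CHECK (quantity > 0))"
    )
    conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
    conn.commit()
    return conn


def try_insert(conn: sqlite3.Connection, order_no: int, cust_no: int, quantity: int) -> bool:
    """Attempt an insert; the DBMS, not this function, decides validity."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO orders VALUES (?, ?, ?)",
                (order_no, cust_no, quantity),
            )
        return True
    except sqlite3.IntegrityError:
        return False


db = open_db()
print(try_insert(db, 1, 1, 5))   # valid order
print(try_insert(db, 2, 1, 0))   # violates the CHECK constraint
print(try_insert(db, 3, 99, 5))  # refers to a customer that does not exist
```

Without the declared constraints, every application touching these tables would need its own checking methods, which is exactly the code the practitioner saw disappear.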

Fabian Pascal has a national and international reputation as an independent technology analyst, consultant, author and lecturer specializing in data management. He was affiliated with Codd & Date and for 20 years held various analytical and management positions in the private and public sectors, has taught and lectured at the business and academic levels, and advised vendor and user organizations on data management technology, strategy and implementation. Clients include IBM, the Census Bureau, the CIA, Apple, Borland, Cognos, UCSF, and the IRS. He is founder, editor and publisher of Database Debunkings, a web site dedicated to dispelling persistent fallacies, flaws, myths and misconceptions prevalent in the IT industry (Chris Date is a senior contributor). Author of three books, he has published extensively in most trade publications, including DM Review, Database Programming and Design, DBMS, Byte, Infoworld and Computerworld. He is author of the contrarian columns "Against the Grain" and "Setting Matters Straight," and writes for The Journal of Conceptual Modeling. His third book, Practical Issues in Database Management, serves as the text for his seminars.

Contributors : Fabian Pascal
Last modified 2005-04-12 06:21 AM

DBAzine.com
