Oh, Oh not OO Again
Just after I published "OO for Application Development, not Database Management" in my "Against the Grain" column, debunking misconceptions about the object and relational approaches to database management, I came across another article in DMREVIEW, "Object-Oriented or Relational?" by Lou Agosta, revealing similar fallacies and confusion. For every effort to debunk this stuff, more comes out of the woodwork. So here we go again.
"At a recent conference, a Giga client expressed a concern to me. He said, 'Object-oriented programmers have a strong voice in our installation; they are using object-oriented technology as an excuse to question why we need the data group, which I head, to help with integration. They say that objects will do that for them. Is the relational database approach a valid foundation for system integration, and where does the relational approach support or intersect with object-oriented technology (OOT)?"
As I have argued on many occasions, a major problem in the IT industry is the large and increasing number of practitioners -- particularly the younger, Web-oriented generation -- who have a programming/HTML background and no exposure to data fundamentals. They make no distinction between application programs and a DBMS, or between files and databases -- they aren't even aware of such distinctions. Many were not around when databases and DBMSs were invented, which happened precisely because we had learned the hard way that managing data in files with applications is not a cost-effective approach. Since the object approach originates in -- and was intended for -- programming, it is hardly surprising that, in the absence of such a distinction, programmers extend objects to database management, too: to those with a hammer, everything looks like a nail. As expressed in one of the weekly quotes at our Database Debunkings Web site: "The only rules that should reside in a database are referential integrity (and sometimes that isn't really necessary). It is also best to keep rules out of your data access code (hard-coding WHERE values). Business rules should be centralized in Java business objects for better manageability, scalability, and so on. Don't let pushy DBAs tell you otherwise. Rules in a database slow down development as well as data access time."
"First things first. Both data and methods (processes, procedures) are required for a complete application. The data without the method is meaningless; the method without the data is empty. Both are required. So what is the issue? The issue is that we really are dealing with different paradigms in comparing object-oriented programming and relational database technologies. An object includes both methods and data - but what happens when there is a vast supplier or customer database or parts database? Presumably, the data cannot be cached in main memory at all times, and it must be persisted [sic] - that is, stored - for reasons of both performance and data integrity. That is where the relational database with object-extensions comes to the fore."
This paragraph starts well, if with a trivial platitude, and deteriorates from there. It seems to imply that database management issues arise only when, and because, some data sets are too large to be kept in memory. But whether data fits in memory or not is a physical implementation detail that has nothing to do with the data model employed at the logical level. Even if data resides in memory, there are still critical reasons to manage it independently of applications -- these are two separate issues. Indeed, data independence, including integrity independence, is a major objective in switching from application files to databases. Agosta does refer to integrity as a raison d'être for databases, but fails to realize the implications.
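To make integrity independence concrete, here is a minimal sketch in Python using the standard sqlite3 module. The schema, the table and column names, and the two "application" functions are hypothetical, chosen only for illustration, and SQLite is merely a convenient SQL stand-in, not a truly relational DBMS. The database lives entirely in memory, yet the constraints are declared once to the DBMS and hold no matter which application touches the data.

```python
import sqlite3

# Hypothetical schema for illustration; all names here are assumptions.
conn = sqlite3.connect(":memory:")        # the data lives entirely in memory
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE supplier (
        supplier_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE part (
        part_id     INTEGER PRIMARY KEY,
        supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id),
        quantity    INTEGER NOT NULL CHECK (quantity >= 0)
    );
""")
conn.execute("INSERT INTO supplier VALUES (1, 'Acme')")


def app_a_add_part(part_id, supplier_id, quantity):
    """One 'application'; it performs no validation of its own."""
    conn.execute("INSERT INTO part VALUES (?, ?, ?)",
                 (part_id, supplier_id, quantity))


def app_b_set_quantity(part_id, quantity):
    """A second 'application', written independently of the first."""
    conn.execute("UPDATE part SET quantity = ? WHERE part_id = ?",
                 (quantity, part_id))


def rejected(action, *args):
    """Return True if the DBMS refused the change on integrity grounds."""
    try:
        action(*args)
        return False
    except sqlite3.IntegrityError:
        return True


app_a_add_part(10, 1, 5)  # a valid row is accepted
results = {
    "unknown_supplier_rejected": rejected(app_a_add_part, 11, 99, 5),
    "negative_quantity_rejected": rejected(app_b_set_quantity, 10, -3),
}
stored_quantity = conn.execute(
    "SELECT quantity FROM part WHERE part_id = 10").fetchone()[0]
print(results, stored_quantity)
```

Neither function contains a single line of validation code; the rules live in one place, and a third application written tomorrow would be subject to them automatically.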
As to object-extensions to relational databases, I quote from Date and Darwen's book The Third Manifesto: ... "the relational model needs no extension, no correction, no subsumption -- and above all, no perversion! -- in order for it to ... support [those few desirable] features that are commonly regarded as aspects of object-orientation, [because they are] orthogonal to (i.e., independent of) the relational model."
"Isn't it foolish to try to decide between the data and methods (or functions) of processing data? In many instances, both objects and relations are needed. In fact, the object-oriented paradigm can be created out of the relational one by designating relational entities as objects and stored procedures or database functions as methods. While the mapping is not perfect, the result is a near paraphrase of the functionality. As usual in computing, there is latency - difference in speed and type of process between the object-oriented run-time process and the storage paradigm by which the process is preserved (stored)."
Foolish, indeed. But one should be careful not to confuse "decide between" with "keep independent of." It is possible to develop object-oriented applications, but using objects to manage databases is a bad idea. As Date argues, designating relational entities as objects does not add anything of value; it is just relabeling. Speed has nothing to do with the case.
"In the final analysis, a conversation between the object-oriented programming team and the data management team results in exercises such as mapping entity-relationship terms to UML (unified modeling language) to facilitate translation and cooperation: entity is class, relationship is association, structural constraints are multiplicities [sic], specialization is classification, and roles and attributes are the same in both classes."
UML is a highly questionable proposition, due in large part to its reliance on the object paradigm; see "Basic Concepts in UML: A Request for Clarification."
"It seems like a false choice to me to try to decide between objects and data - what would the object methods process if not data? Where is the data persisted and stored when it is not being processed? A relational database entity is an object without methods. When the methods are added by way of stored procedures or other functions, you are half way to object-relational technology. Of course, for certain kinds of real-time embedded systems in airplanes, cell phones or complex processes in power plants or pipelines, the importance of data storage is significantly less."
Nonsense. Agosta understands neither what a data model is, nor the relational data model specifically (see Date's "Models, Models Everywhere, Nor Any Time to Think" and my "Something to Call One's Own"). What is more, his argument is as fuzzy as object concepts are. There are "methods available for relational entities": every domain comes with operators, and the relational operations (restrict, project, join, union, and so on) rely on them for manipulating data in tables. All an application programmer needs to do is invoke these operations in applications, rather than write his or her own (as is necessary with an ODBMS). The use of stored procedures should be avoided for a variety of reasons, an important one being the superiority of declarative integrity support over a procedural approach.
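As a rough illustration of invoking the built-in operations rather than writing one's own, consider the sketch below, again in Python with the standard sqlite3 module. The supplier and part tables, their columns, and the rows helper are hypothetical, and SQL only approximates the relational operations (SQL DBMSs are not truly relational), but the point carries over: the application states what it wants and the DBMS supplies the "methods."

```python
import sqlite3

# Hypothetical data for illustration; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        supplier_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    CREATE TABLE part (
        part_id     INTEGER PRIMARY KEY,
        supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id),
        pname       TEXT NOT NULL
    );
    INSERT INTO supplier VALUES (1, 'Acme', 'London'), (2, 'Bolt', 'Paris');
    INSERT INTO part VALUES (10, 1, 'nut'), (11, 2, 'screw');
""")


def rows(sql):
    """Run a query and return its rows sorted, since tables have no inherent order."""
    return sorted(conn.execute(sql).fetchall())


# Restrict: keep only the rows satisfying a condition. The "=" comparison
# is an operator that comes with the column's domain (here, text).
restrict = rows("SELECT * FROM supplier WHERE city = 'London'")

# Project: keep only some columns; DISTINCT removes duplicate rows.
project = rows("SELECT DISTINCT city FROM supplier")

# Join: combine rows of two tables on matching supplier_id values.
join = rows("""
    SELECT s.name, p.pname
    FROM supplier AS s
    JOIN part AS p ON p.supplier_id = s.supplier_id
""")

# Union: merge the results of two queries with the same columns.
union = rows("""
    SELECT name FROM supplier WHERE city = 'London'
    UNION
    SELECT name FROM supplier WHERE city = 'Paris'
""")

print(restrict, project, join, union, sep="\n")
```

No loops, pointer navigation, or hand-written matching logic appear in the application; each result is obtained by naming an operation and its operands.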
"Because the client then asked for something to read, I recommended Object Solutions: Managing the Object-Oriented Project by Grady Booch. Booch has excellent insight into processes that are data-centric versus those that emphasize other important components such as the user interface, distributed, legacy systems, computation-centric, real time and architecture-centric. In fact, a good way to break out of the impasse that mistakenly opposes OOT to relational technology is to look at some of the other perspectives, such as architecture and distributed systems, where both approaches are needed in differing degrees."
To the extent that an application has a data component, that component should be handled by a DBMS, which supports some data model. A data model consists of data types, organization, integrity, and manipulation.
"I am just guessing, because the information available in the conversation was limited, but the problem or issue could be that the OOT team wants to buy an object-oriented database [sic] such as UniSQL, ObjectStore, Illustra or Omniscience. That is not necessarily bad if there is a specific application that requires one of these niche databases [sic]. However, these databases [sic] will not be suitable for a general commercial business application such as vendor management, parts inventory or customer relations. The head of the data group, with whom I was having the conversation, will have to communicate the trade-offs to the team at large. For example, in 1996 Informix acquired Illustra, and all the other major database vendors (IBM, Oracle, Sybase) began moving toward object-relational products. Then, a consensus on the object-relational data model features emerged in the SQL-3 standard, and all the major database vendors (IBM, Oracle, Sybase, Informix) have released object-relational products that reflect this consensus."
In fact, object DBMSs have failed to gain market share, quite predictably, for various reasons I explain in my book. They are essentially application-specific "DBMS building kits," not general-purpose DBMSs. SQL-3 is an abomination, complete with pointers, throwing us back decades, and I have serious doubts as to its implementation by vendors.
"When all is said and done, the relational paradigm is still the dominant design in database management systems for commercial business applications. That means all the other alternatives (and paradigms) such as object databases, XML databases and in-memory databases will eventually be (and in some cases have already been) assimilated to the relational model. These other databases will still have relevance in special-purpose niche applications where they are needed because of their special capabilities - such as XML in publishing, object-oriented in engineering complex artifacts and in-memory in Web caching - but they are unlikely to break out of their respective niches to attain widespread commercial application in the average business process."
Does Agosta realize that there is no equivalent alternative whatsoever to the relational model, the only approach with a sound theoretical foundation -- logic and mathematics? And that the paradigm is not just one of "design," but for database management as a whole? And that SQL databases and DBMSs -- not relational ones! -- dominate the market? Is he aware that XML is a throwback to the bad old hierarchic DBMSs, which were replaced by SQL DBMSs years ago because they were too complex, inflexible, and not cost-effective? And that XML violates relational principles? How, then, can XML databases be "assimilated" into the relational model? (For a reality check on XML, see my several articles in the "Against the Grain" column.)
Fabian Pascal has a national and international reputation as an independent technology analyst, consultant, author, and lecturer specializing in data management. He was affiliated with Codd & Date and for more than 15 years held various analytical and management positions in the private and public sectors, has taught and lectured at the business and academic levels, and advised vendor and user organizations on database technology, strategy and implementation. Clients include IBM, Census Bureau, CIA, Apple, Borland, Cognos, UCSF, IRS. He is founder and editor of Database Debunkings, a Web site dedicated to dispelling prevailing fallacies and misconceptions in the database industry, where C.J. Date is a senior contributor. He has contributed extensively to most trade publications, including Database Programming and Design, DBMS, DataBased Advisor, Byte, Infoworld, and Computerworld and is author of the contrarian column "Against the Grain." His third book, Practical Issues in Database Management - a Guide for the Thinking Practitioner (Addison Wesley, June 2000), serves as text for a seminar bearing the same name.
Contributors: Fabian Pascal
Last modified 2005-04-12 06:21 AM