DBAzine.com

On Grinding Water: Reply to St. Laurent

by Fabian Pascal

Simon St. Laurent, an associate book editor at O'Reilly & Associates, initiated a blog/thread called Everything Should Be Relational at his employer’s web site, the title purportedly reflecting my position on database management. I responded several times to his so-called "arguments," but got nowhere because, like so many in the industry, he argues without knowledge or ability to ground his positions, to understand the implications thereof, and to reason properly. With this caliber of editors, is there any wonder that database books are what they are?

For the benefit of those who might be misled by him, I am including the exchange, and I debunk his last reply, the only one in which he tried (and failed) to back up his arguments. Italics were added to focus the reader on the core issues.

From: Simon St. Laurent

Fabian Pascal seems to think that the relational model should rule data management, period, and that those of us who disagree just don't understand. I hope I don't sound this cranky when I'm pushing XML for data management. If the relational model is so fantastic, why does it seem that no one has fully implemented it?

From: Anonymous

This weblog entry is now featured as the "Quote of the Week" at http://www.dbdebunk.com/ (quotes of the week are meant to represent outstanding examples of ignorance about relational matters).

It does seem a weak argument, like saying in the time of the Wright Brothers "If man-made flying machines are so good, why hasn't anyone made one?"

From: Fabian Pascal

The relational model is nothing but the application of logic--specifically, predicate logic--to database management. Those who trumpet the existence of "other ways to fly"--what is it that they offer as a replacement for logic? And if they offer none, can they really claim with a straight face that database management should not be based on logic?

I also dare them to produce any evidence that relational proponents say anything about "the world." We are referring only to database management, which is the domain of logic, whether ignorami realize and/or appreciate it, or not.

So it is at themselves that they are really laughing, but then ignorance is bliss, isn’t it?

From: Simon St. Laurent

It might be a weak argument, except that machines for flying through the air have been available for decades now. They might not be precisely the design that certain folks want, but mysteriously, they do get off the ground, move to the desired location, and land — all on a surprisingly regular basis.

If Fabian Pascal's rantings about markup weren't so astounding, I might be depressed about being featured in a place that is "meant to represent outstanding examples of ignorance about relational matters." As it is, I'll just laugh, and re-read the book that first taught me about how to use relational databases, SQL and Relational Basics, by Fabian Pascal. (M&T Books, 1991.)

Good stuff, Wright brothers or no [sic], but not the whole universe.

From: Fabian Pascal

What is astounding is that you find my "rantings about markup tags astounding." What exactly did you find astounding? Or should we expect things [such as markup tags] to be useful [for databases] only because you declare them so?

From: Simon St. Laurent

You know a lot about relational databases, and I'm happy to have learned about them from your writings, but you seem utterly incapable of accepting that there might be information for which relational databases are not an ideal storage mechanism or logic, which exists outside of the relational calculus.

Fire and brimstone ravings about ignorami, grinding water, and the dire need for relational logic do absolutely nothing for me. It's like you're screaming that newspapers need to be like television, or that movies and radio should be the same. There's a whole set of information media that can exist separately from relational systems, and you'd do well to acknowledge them.

Your expertise in relational systems is impressive. Your consistent refusal to see the computing world outside of that perspective is ridiculous.

From: Fabian Pascal

[To repeat:] making statements without supporting them is meaningless.

      1. The relational model has nothing to do with storage — that is precisely why it was invented. You suffer from the very common logical-physical confusion.
      2. What other logic — precisely, please! — is there that is applicable to database management? Can you specify another theoretical foundation on which the field can be founded that is equivalent or better than predicate logic and set theory?
      3. Of course it does not do anything for you and I don't expect it to — that's what ignoramus means. It is that which is ridiculous.
      4. And you are grinding water whether you realize it or not. Until and unless you stop declaring and start proving [your points], that is precisely what you are doing.

From: Simon St. Laurent

You appear not to recognize that there are domains where the relational model does not fit particularly well. The relational model (I will not grant you a monopoly on set theory, which also underlies XML work like RELAX NG) works well with data which has been carefully structured to fit its requirements, but tends to collapse when fed data that comes from less-structured sources.

I have no complaints about the relational model's logic when it is applied to data for which it is appropriate. I find the relational model useless at best for dealing with the kinds of messy information - frequently documents, but sometimes data that cannot be forced to fit in tables without loss - that I find most interesting.

If I may quote from SQL and Relational Basics, by Fabian Pascal (M&T Books 1990):

"Codd devised a relational algebra by adjusting set theory to certain types of sets called relations. In the same way in which arithmetic manipulates numbers to create other numbers, relational algebra manipulates relations to create other relations. At a simplified level, relations resemble unordered tables, namely tables whose order of columns and rows can be shuffled at will without any change in their meaning ... Codd's choice of relations was caused precisely by the ability to represent them as tables.... They can - and do - handle most of the data types we usually deal with." (38)

I think it's fairly clear that I spend most of my time outside of what you described as "most of the data types we usually deal with." If you'd care to design a non-tabular relational model that can keep up with a world that fits poorly in "Disciplined Tables," you're quite welcome to do so.

As for me, I'll just be happy that I have a choice of methodologies, which are useful for different situations, as well as tools for processing, storage, and presentation, which are appropriate for different needs.

I suspect that this doesn't constitute proof for you, but I'm not at all interested in your style of proof. Your approach has done wonderful things for some aspects of my work, but I have no plans to let it hobble the rest. If that's "grinding water," perhaps we have different conceptions of water.

From: Fabian Pascal

In several instances I asked you to prove precisely certain points, which you were declaring without grounding. Instead, you either repeated the same declarations, or made additional ones.

That is precisely what I expected, because you can't ground your claims in a persuasive manner, in large part because you don't know enough to do so. "Works well" is not a substitute to providing an equivalent theory to the relational model. What exactly is the theory or logic behind XML, if any?

I ended the exchange at this point and will now address here St. Laurent’s arguments in his last message, the only one where he attempted to back them up.

"I will not grant you a monopoly on set theory, which also underlies XML work like RELAX NG…"

I have no idea what "monopoly" St. Laurent refers to; I made no such argument. But even though I do not know the particulars of RELAX NG, I know enough about XML to know that its structure is hierarchic, and that the only formal foundation underlying hierarchies is not set theory (and certainly not predicate logic) but graph theory, which, as with XML’s hierarchic predecessor, IMS, had to be discarded because it is too complex. (In fact, even the core XML object, the document, had to be replaced with an abstraction called ‘sequence’ in order to come up with anything workable.)

"… but tends to collapse when fed data that comes from less-structured sources."

All data is structured by definition; otherwise it would not be data, because it would carry no informational value (see Unstructured Thinking). Structure provides meaning; without it there is only random meaninglessness and, therefore, no data management.

Structure is not a matter of degree: either something is structured or it is not. What the industry usually means by "unstructured data" (essentially a contradiction in terms), or by "less-" or "semi-structured" data, is data that is not structured in R-tables, usually some form of text. But text is not less structured than tabled data; it simply has different structures, e.g.:

      • E-mail: from, to, subject, body, signature, and so on;
      • Articles: title, subtitle, sections, paragraphs, sentences, words, characters, and so on.

So such text is data structured by data models other than the relational one. Note very carefully, though, that:

      • The inferences that can be drawn from such data (that is, data manipulation) are not the same as those that can be drawn from relationally structured data;
      • Such data can be structured (via analysis and modeling) in R-tables, and should be if the proper inferences are desired.
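
As a rough illustration of the second point, the sketch below models a few e-mails as a relation, that is, a set of tuples over named attributes, and derives new relations from it by restriction and projection. The attribute names, the sample values, and the helpers `row`, `restrict`, and `project` are hypothetical, chosen only for illustration; this is a toy under those assumptions, not a DBMS and not an implementation of the full relational model.

```python
from typing import Callable, FrozenSet, Tuple

# A tuple is modeled as a frozenset of (attribute, value) pairs, so attribute
# order carries no meaning. A relation is a frozenset of such tuples, so row
# order carries no meaning and duplicate rows cannot occur.
Row = FrozenSet[Tuple[str, str]]
Relation = FrozenSet[Row]


def row(**attrs: str) -> Row:
    """Build one tuple from keyword arguments (hypothetical helper)."""
    return frozenset(attrs.items())


def restrict(rel: Relation, pred: Callable[[dict], bool]) -> Relation:
    """Keep only the tuples for which the predicate holds."""
    return frozenset(r for r in rel if pred(dict(r)))


def project(rel: Relation, *names: str) -> Relation:
    """Keep only the named attributes; duplicates vanish automatically."""
    return frozenset(
        frozenset((k, v) for k, v in r if k in names) for r in rel
    )


# Hypothetical sample data. "from" is a Python keyword, so the attributes
# are called "sender" and "recipient" here.
emails: Relation = frozenset({
    row(sender="ann", recipient="bob", subject="schema", body="draft attached"),
    row(sender="ann", recipient="cy", subject="schema", body="please review"),
    row(sender="bob", recipient="ann", subject="re: schema", body="looks fine"),
})

# "Which subjects has ann written about?" Both operators take a relation and
# return a relation, so they compose freely.
ann_subjects = project(restrict(emails, lambda r: r["sender"] == "ann"), "subject")
```

Because every operator returns a relation, the result of one query can feed the next, which is the kind of inference the surrounding text has in mind when it speaks of data manipulation.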

The relational model is the only general and complete data model with a dual formal foundation, which guarantees correctness (defined as consistency). Moreover, relational structure is the simplest with respect to manipulation and integrity. For informationally equivalent tasks, any other structure adds complexity, but not power (see Paper #1 and Paper #2 in the Database Foundation Series). It is precisely to avoid the unnecessary complications that would accompany "less-structured data" that a relational "collapse" is undesirable.

"I find the relational model useless at best for dealing with the kinds of messy information - frequently documents, but sometimes data that cannot be forced to fit in tables without loss…"

In the absence of knowledge and understanding of fundamentals, the only mess here is St. Laurent’s thinking. Any information can be represented as text, hierarchies, R-tables, or what have you. It is an illusion that tags around text capture for the system as much meaning as tables do, let alone more (see Tags Do Not a Language Make).

The sum total of integrity constraints on a database is the best understanding a DBMS can have of what the data means. The process by which we convey meaning to the DBMS (database design, or schema definition) is nothing but integrity constraint specification. And the fact is that initially XML was just nestable tags: structure, with no manipulation and no integrity (which is a special case of manipulation). As such, it could not be used for database management, and it was not intended for that. This is why the industry has been scrambling to come up with XML schema and query capabilities. But why reinvent all this for XML’s hierarchic structure, a regression to a data model that was discarded decades ago because of the very complexities that XML specifications are now running into again, the document/sequence replacement being a case in point? (See Those Who Forget the Past Are Doomed to Repeat It and The Horror of XML.)
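
The idea that schema definition is constraint specification can be sketched with Python's standard `sqlite3` module. Note the assumptions: SQL is only an approximation of the relational model (a distinction this article insists on), and the `author` and `article` tables, their columns, the sample rows, and the `try_insert` helper are all hypothetical. The sketch only shows that what the DBMS "knows" about the data is what the declared constraints let it reject.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# The schema is nothing but a list of constraints: keys, NOT NULL, CHECK,
# and a reference from article to author.
conn.executescript("""
    CREATE TABLE author (
        author_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL UNIQUE
    );
    CREATE TABLE article (
        article_id INTEGER PRIMARY KEY,
        title      TEXT NOT NULL CHECK (length(title) > 0),
        author_id  INTEGER NOT NULL REFERENCES author(author_id)
    );
""")

conn.execute("INSERT INTO author (author_id, name) VALUES (1, 'A. Writer')")
conn.execute(
    "INSERT INTO article (article_id, title, author_id) "
    "VALUES (1, 'On Structure', 1)"
)


def try_insert(sql: str) -> bool:
    """Return True if the DBMS accepted the statement, False if a declared
    constraint rejected it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# An article with an empty title violates the CHECK constraint.
rejected_empty_title = not try_insert(
    "INSERT INTO article (article_id, title, author_id) VALUES (2, '', 1)"
)
# An article by a nonexistent author violates the foreign key.
rejected_unknown_author = not try_insert(
    "INSERT INTO article (article_id, title, author_id) VALUES (3, 'Orphan', 99)"
)
```

Tags alone carry no such declarations; until a schema language supplies them, the system has nothing against which to reject an article without a title or an author.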

"I think it's fairly clear that I spend most of my time outside of what you described as 'most of the data types we usually deal with.'"

Yes, it is fairly clear that this is what ignorance will lead to. St. Laurent still relies on a twenty-year-old book; in the period since, we have gained a considerably better understanding of the relational model. He has still not specified the types of data that the relational model cannot handle, and he still fails to understand that a bunch of tags is not a data model (a FOUNDATION SERIES paper on this subject is forthcoming at Database Debunkings). To reiterate: any data can be represented hierarchically or relationally. The only question is which is more practical for manipulation and integrity, and what kind of questions we can ask and answer with each. We already established 30 years ago that hierarchies are unnecessarily complex, and that they can be imposed dynamically in truly relational (not SQL!) systems (see Chapter 7 in Practical Issues in Database Management), thus avoiding unnecessary rigidity and complications. Indeed, is St. Laurent at all aware that the relational model was invented as an alternative to hierarchic technology? He is hooked on a regressive technology because, like so many, and despite having read my first book, he does not know or comprehend data fundamentals.
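
A minimal sketch of what "imposing a hierarchy dynamically" might look like follows. It keeps the data flat, as a set of tuples, and derives whichever nesting a given question needs. The sample data and the `nest` helper are hypothetical, and plain Python stands in for a relational system here; it does not reproduce the approach described in the book chapter cited above.

```python
# Flat, relation-like data: one tuple per (article, section, paragraph number).
paragraphs = {
    ("Article A", "Intro", 1),
    ("Article A", "Intro", 2),
    ("Article A", "Body", 1),
    ("Article B", "Intro", 1),
}


def nest(rows, key_positions):
    """Group flat tuples into nested dicts along the given attribute positions.

    The attributes not used as keys are collected, as tuples, in a list at
    the leaves. Rows are visited in sorted order so the leaf lists are sorted.
    """
    tree = {}
    for r in sorted(rows):
        node = tree
        # Walk (and create) one dict level per key except the last.
        for pos in key_positions[:-1]:
            node = node.setdefault(r[pos], {})
        rest = tuple(v for i, v in enumerate(r) if i not in key_positions)
        node.setdefault(r[key_positions[-1]], []).append(rest)
    return tree


# Two different hierarchies derived on demand from the same flat data.
by_article = nest(paragraphs, (0, 1))   # article -> section -> paragraphs
by_section = nest(paragraphs, (1, 0))   # section -> article -> paragraphs
```

Neither nesting is stored; each is a view computed when needed. A document that fixes one nesting in its structure makes the other one awkward to ask for.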

"I'll just be happy that I have a choice of methodologies which are useful for different situations."

If you don’t know and understand fundamentals, I can sell you anything. And vendors do just that. That is why we go from fad to fad, often reinventing already discarded fads and giving them a different label (see Skyscrapers with Shack Foundations and The database curmudgeon speaks again). But I understand how media and publishers don’t see any problem with that—new products and fads every few years are quite profitable.

"… but I'm not at all interested in your style of proof …"

Well, it is hardly surprising that somebody who dismisses logic as a basis for data management, and who throws around statements about "other logics" without specifying them, is not interested in "my style of proof," which is, you guessed it, logic! For St. Laurent and his ilk, everything is just a matter of opinion. And when you are unconstrained by knowledge and logic, you can say anything, no matter how meaningless or inconsistent.

--

Fabian Pascal has a national and international reputation as an independent technology analyst, consultant, author and lecturer specializing in data management. He was affiliated with Codd & Date and for 20 years held various analytical and management positions in the private and public sectors, has taught and lectured at the business and academic levels, and advised vendor and user organizations on data management technology, strategy and implementation. Clients include IBM, Census Bureau, CIA, Apple, Borland, Cognos, UCSF, and IRS. He is founder, editor and publisher of Database Debunkings, a web site dedicated to dispelling persistent fallacies, flaws, myths and misconceptions prevalent in the IT industry. Together with Chris Date he has recently launched the Database Foundation Series of papers. Author of three books, he has published extensively in most trade publications, including DM Review, Database Programming and Design, DBMS, Byte, Infoworld and Computerworld. He is author of the contrarian columns Against the Grain and Setting Matters Straight, and writes for The Journal of Conceptual Modeling. His third book, PRACTICAL ISSUES IN DATABASE MANAGEMENT, serves as the text for his seminars.


Contributors : Fabian Pascal
Last modified 2005-04-12 06:21 AM