Slashing a Slashdot Exchange - Part 2

by Fabian Pascal

Last month I debunked some of the most egregious reactions in a long exchange at Slashdot.org to my previous column, “If You Liked SQL, You’ll LOVE XQuery.” This month I debunk a few more, even though they do not really merit a response, because, unfortunately, they are quite representative of the level of discourse in the industry. I urge the reader to note that all the reactions ignore my arguments, reasoning and evidence, and fail to provide any for their opposing positions. In future columns I will tackle the comments of two people who ought to know better: a participant in the W3C XML specification process and an academic.

negative video: “Fabian Pascal is smart and well-informed, but a zealot. Like all zealots he is willing to sacrifice anything and everything for his vision of technical purity.”

This is quite common: labeling one a “zealot” is an attempt to marginalize arguments and, thus, avoid addressing the issues they raise (see “Lenin, Trotsky, and Freedom from the Tyranny of Knowledge and Reason”). Two points are worth noting:

Smarts and “well-informedness” are inconsistent with zealotry. A smart and well-informed zealot is almost a contradiction in terms, an extremely rare occurrence at best.
More importantly, what does it say about an industry (and, worse, society) if it deems science zealotry, and ignorance progress?

Would “negative video” care to specify what it is that I am “sacrificing”? By “technical purity” does he mean we should compromise on the purity of logic in database management? Isn’t that exactly what we should not sacrifice?

Yintercept: “This quote needs a position in the library of intellectual arrogance as well:

‘Indeed, data/information management requires “some organizing principle”; that is, structure; anything “unstructured” — and many in the industry promote XML for that purpose — is not data, but meaningless random noise that carries no information. ’

A snit crassly dismisses several millennia of literature because it is unstructured.

Quite frankly, meaning and structure are independent of each other. It is possible to find meaning in things with radically different structures. It is true that there is a correlation between structure and the ability to communicate meaning, but a healthy mind can find meanings in things that have not been normalized.

Likewise, you can have meaningless garbage in relational databases. A case in point is the large number of fake web sites that do things like join the FIPS database to product names so that they can have millions of pages that show up in search engines. Likewise, we see academician filling volume after volume of publications with meaningless tripe.”

Another method of avoiding issues that require knowledge and intellect one does not possess: insults. (I’ve been called worse, but it is quite telling that it is usually I, who have never used such language, who am accused of a “harsh style” or of ad-hominem attacks because I expose ignorance for what it is, never those who actually insult.)

I don’t know what “radically different structures” means, but note the ignorance and basic reasoning failure.

The relational structure is the relation, not normalization.
I dare anybody to make sense of and reconcile the following three statements, in a way that invalidates my quoted argument:

“meaning and structure are independent of each other”
“it is possible to find meanings with radically different structures”
“literature is unstructured”

There’s more drivel, but I won’t bother with it.

Chops: “I read this and pretty much gave up getting anything of value out of this article--I hadn’t understood much that went before it, though my distrust of all things XML had led me to believe this guy might know what he's talking about. If you removed NULLs from relational database design, people would reinvent them (poorly) -- probably by using IDs of -1 or 0, or IDs to a special magic ‘null’ row, which I suspect is what he's talking about by ‘it can be handled relationally.’ To suggest that missing or inapplicable values are not part of ‘the real world’ is so wrong it’s... well... wrong. Anyone who’s actually done database work (or programming work, for that matter) knows this.”

Well, at least he admits he does not understand. But that hasn’t stopped anybody from commenting anyway.

Being probably one of the younger generation of practitioners who know nothing of the history of their field, Chops is unaware that exception values of the sort he mentions were actually used decades ago, and that SQL NULLs were an attempt to get rid of them because they were hugely problematic. Unfortunately, NULLs are not the proper solution, because they are extremely problematic themselves. Those Who Don’t Know the Past Are Condemned to Repeat It, so the likes of Chops may well reintroduce exception values, or invent nonsense like “a special magic null row,” whatever that means. Had Chops bothered to study the correct relational solution, however, he wouldn’t have had to “suspect,” or have made such mistakes (the solution is outlined conceptually in Chapter 10 of my Practical Issues in Database Management; a more detailed treatment is forthcoming in Practical Database Foundations paper #8, “The Final NULL in the Coffin: Missing Data”).
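To make the trouble with exception values concrete, here is a minimal sketch in Python. The data, the names and the choice of -1 as the marker for “salary unknown” are all hypothetical, invented for illustration only; they are not taken from the Slashdot exchange.

```python
# Hypothetical exception value: -1 stands for "salary unknown".
UNKNOWN = -1

# Hypothetical data: Cy's salary is not known.
salaries = {"Ann": 50000, "Bob": 70000, "Cy": UNKNOWN}

# A computation that forgets the convention silently treats -1 as a real salary.
naive_average = sum(salaries.values()) / len(salaries)

# Every single computation has to remember to filter the marker out by hand.
known = [s for s in salaries.values() if s != UNKNOWN]
careful_average = sum(known) / len(known)

print(naive_average)    # distorted by the marker value
print(careful_average)  # average over the known salaries only
```

Nothing in the data itself distinguishes the marker from a legitimate value, so correctness depends on every user and every program remembering the convention.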

It is Chops who is wrong, as many programmers tend to be, about database management. He is confusing missing and inapplicable data with NULLs, which are distinct: the latter are SQL’s way to represent the former, and a rather bad one at that. As I explain in my book, inapplicable data is an artifact of poor design and, therefore, a red herring.

There is, in fact, only one kind of missing value: unknown. The relational model is based on the real world's two-valued logic (true/false). The fact that we may not know whether a fact is true or not in the real world does not change that reality. So to guarantee correctness, the logic underlying the relational model requires us to record in databases only facts known to be true. As a third truth value, a NULL violates this requirement; NULLs are essentially a mix-up of data with metadata, substituting a three-valued logic different from that of the real world. That means that SQL DBMSs can produce results that are nonintuitive, prone to misinterpretation, or outright wrong. That practitioners are oblivious to these problems does not mean they don’t exist.
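A minimal sketch of the kind of nonintuitive result meant here, assuming SQLite as accessed through Python's standard sqlite3 module. The emp table and its rows are hypothetical, and other SQL DBMSs may differ in details.

```python
import sqlite3

# Hypothetical table: Cy's salary is recorded as NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("Ann", 50000), ("Bob", 70000), ("Cy", None)],
)

total = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]

# In two-valued logic this condition is true of every salary, so one would
# expect every row back. For Cy it evaluates to "unknown" and the row is
# dropped from the result.
partitioned = conn.execute(
    "SELECT COUNT(*) FROM emp WHERE salary <= 60000 OR salary > 60000"
).fetchone()[0]

# "Whose salary is not among Cy's salaries?" The subquery yields a NULL,
# so the NOT IN condition is "unknown" for every row.
not_in = conn.execute(
    "SELECT name FROM emp "
    "WHERE salary NOT IN (SELECT salary FROM emp WHERE name = 'Cy')"
).fetchall()

print(total, partitioned, not_in)
```

The first query counts three rows while the second, whose condition looks like a tautology, counts only two; the third returns no rows at all, even though Ann and Bob have known salaries.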

Tablizer: “In other words, Dr. Codd was a brilliant theoretician, but a lousy marketer and packager. We just have to agree on and/or find relational operators and syntax that we find more intuitive than those in the original papers. Sometimes I feel that "look-up" would be more intuitive than "Join", for example. Relational as a practice is still young.”

Let me get this straight: Tablizer criticizes the person who, for the first time, put database management on a scientific basis, for not being a marketeer? The marketeers have given us multivalued, object, SQL and XML database management. Are these better than relational database management? If so, why do we need a new “paradigm” every few years?

Wastl: “Basically what I take from this is that the table (e.g. SELECT * FROM foo) is simply a convenient logical representation of a stored relation. That is to say, foo can be implemented by the DBMS as a linked list, a tree, any data structure. True. However, this encoding is usually very inconvenient (consider representing an HTML document or a structured piece of literature in this manner). Besides this, nested structures are at least as logical as flat structures (I continue to call them flat because they *are*). Relational database logic is merely a fragment of first order predicate logic, one that is restricted to - guess what - flat relations, whereas first order predicate logic usually works with *nested* structures (called terms) and relations. XML and other nested data structures fit very well into logic, and in fact we (a research group in Munich and some other places) are working on a logic-based query language [xcerpt.org] that exploits this similarity. I agree with many of the statements that the author of the article makes, in particular regarding XQuery. However, some are so arrogant and *unproven* that it leaves the article in a bad light. Also, while he claims to have a good insight into database theory, I don't think he really has. SQLs big advantages are (1) it is easy to use and (2) it has a very limited expressive power which makes it easy to implement and efficient to evaluate. Other approaches have been considered, e.g. in deductive databases or knowledge base systems. However, those needed languages that were basically Turing complete or at least supported basic recursion (to implement transitive closure) and thus could lead to very inefficient queries.”

How does one respond to such drivel? It is astounding that I am accused of “zealotry”, arrogance and “unproven statements” without any specific evidence, while at the same time all sorts of nonsense is being thrown around, some of it patently false, inconsistent, meaningless, or bordering on the absurd (“I continue to call them flat because they are flat,” “XML and other data structures fit very well into logic,” “SQL is easy to use” and “has limited expressive power that makes it easy to implement and efficient to evaluate”), without an iota of substantiation.

Worse than arrogance is arrogant ignorance (see “Unskilled and Unaware of It”).

Fabian Pascal has a national and international reputation as an independent technology analyst, consultant, author and lecturer specializing in data management. He was affiliated with Codd & Date and for 20 years held various analytical and management positions in the private and public sectors, has taught and lectured at the business and academic levels, and has advised vendor and user organizations on data management technology, strategy and implementation. Clients include IBM, Census Bureau, CIA, Apple, Borland, Cognos, UCS, and IRS. He is founder, editor and publisher of Database Debunkings, a Web site dedicated to dispelling persistent fallacies, flaws, myths and misconceptions prevalent in the IT industry. Together with Chris Date he has recently launched the Database Foundations Series of papers. Author of three books, he has published extensively in most trade publications, including DM Review, Database Programming and Design, DBMS, Byte, Infoworld and Computerworld. He is author of the contrarian columns Against the Grain and Setting Matters Straight, and writes for The Journal of Conceptual Modeling. His third book, Practical Issues in Database Management, serves as text for his seminars.

Special Offer: Author Fabian Pascal is offering DBAzine.com readers subscriptions to the Database Foundations Series of papers at a discount. To receive your discount, just let him know you’re a DBAzine reader before you subscribe! Contact information is available on the “About” page of his site.

Contributors : Fabian Pascal
Last modified 2006-01-04 01:39 PM

DBAzine.com
