Skip to content

DBAzine.com

Sections
Personal tools
You are here: Home » Of Interest » Articles of Interest » Capitalizing on Open Source Databases
Seeking new owner for this high-traffic DBAzine.com site.
Tap into the potential of this DBA community to expand your business! Interested? Contact us today.
Who Are You?
I am a:
Mainframe True Believer
Distributed Fast-tracker

[ Results | Polls ]
Votes : 3558
 

Capitalizing on Open Source Databases

by Howard Fosdick

The software trend of the decade is open source software, free software distributed under a variety of similar licenses. Linux and Apache spring to mind as two widely used open source products critical to their market segments.

But what about open source databases? Are they ready for primetime? How can IT organizations capitalize on them? This brief article can’t tell you everything you need to know, but we can orient you and offer some perspective. We’ll discuss the major OSDBMSs, where they best fit in most IT organizations, their typical uses, and where to get further information. The goal is to give you, the Oracle, DB2, or SQL Server professional, a quick intro to the world of open-source databases.

There are at least a dozen OSDBMSs, but most interest focuses on the three market share leaders:

MySQL“The most popular open source database”

PostgreSQL“The open source database with the most features”

Berkeley DB“The world’s most popular application-specific (embedded) database”

MySQL. MySQL (pronounced “my ess-que-el”) is probably the most popular OSDBMS in IT organizations, in part because it runs on almost any platform. If you’re used to commercial products like DB2, Oracle, or SQL Server, MySQL traditionally lacked many of the features you’d expect in your database. But the product is quickly catching up. Versions 4.0 and 4.1 added a slew of new features, including new SQL statements and functions, subqueries, UTF-8 Unicode, multi-table updates from UPDATE and DELETE statements, and ACID-compliant transactions as a standard engine feature. Release 5.0 is today “in preview” and adds triggers, stored procedures, cursors, online backups, and views.

These lists show that MySQL does not have all the features you expect in Oracle, DB2, or SQL Server, but it is improving rapidly. Read about MySQL’s new features, or view a quick comparison of MySQL 4.1 features versus those of SQL Server 2000. Last year, the company behind MySQL, MySQL AB, inked a partnership deal with SAP AG, giving it rights to the sophisticated SAP database. This should further accelerate MySQL’s technological advance. MySQL AB also offers the SAP DB open source database under the new name MaxDB.

PostgreSQL. If you’re an Oracle, DB2, or SQL Server professional, you’ll feel right at home when you look at PostgreSQL (pronounced “post-gress-que-el”). This product traces its lineage back to the Ingres and Postgres projects at the University of California at Berkeley, and unlike MySQL, it was designed as a full relational engine from Day 1. The product retains an edge in features over MySQL (see a comparison of the two products). But it may not be as fast as the famously speedy MySQL, and while it runs on many platforms, it’s almost exclusively identified with the Unix-Linux-BSD universe.

Berkeley DB. Berkeley DB is a small-footprint database for application-specific needs and embedded systems. It may be the most popular database in the world for this purpose. If you’re not familiar with embedded databases, view them as a set of data manipulation APIs (application programming interfaces). The database does not have daemon processes or threads that remain in the system; instead, it only runs when invoked. Traditionally, we haven’t seen application-specific databases in IT, but this might change with the proliferation of handheld and mobile devices supporting business functions in many companies.

Where OSDBMSs Make Sense

Where do IT organizations use open source databases?

Prototyping and quick development. Freely downloadable, widely documented, and with relatively small footprints, the OSDBMSs are useful for prototyping and quick development. You could rapidly prototype in a language like Perl, for example, using its Database Interface (or DBI) and connecting your program to MySQL on your desktop. The database-independent nature of the Perl DBI makes it easy to later point your program to another local or remote database.

Embedded systems and application-specific development. Here, Berkeley DB is most popular, but MySQL is also widely used for databases that are “dedicated” to a specific application or user. Motorola uses Berkeley DB in its smart phones. Cisco used Berkeley DB to create a host-based system to remotely configure devices via the Internet.

Web sites. The Web is thick with small to medium Web sites that are based on MySQL. MySQL’s exceptional speed and ease of administration and maintenance make it ideal for this purpose. This model is so popular that these Web sites are referred to as LAMP applications (the letters stand for Linux, Apache, MySQL, and the “P” for either PHP, Perl, or Python). LAMP applications offer outstanding reliability and uptime at the cost of moderate scalability.

Mixed-database Web sites. It’s not unusual for large sites to circumvent the scalability limits of open source databases merely by setting up a large flock of Linux/MySQL front-end machines to their corporate database. For example, Sabre Holdings Corp. became famous as one of the first large online transaction processing systems back in the 1960s. Today, their HP Himalaya Non-Stop servers hold the master database, but all fare pricing and routing queries are handled by a front-end farm of 45 Linux servers running MySQL.

Departmental and small company databases. The scalability and features of the OSDBMSs trail their commercial counterparts, but they work well for small to medium projects. What are their limits? The Open Source Developer’s Network (OSDN) switched from open source to DB2 as their number of page views per day topped 500,000 (15/second). As with other OSDBMS limitations, this one is lifting over time.

DBA-less environments. MySQL is generally considered more a “set it and forget it” database than most commercial products. Where DBA expertise cannot be found or at smaller companies with limited staff, this can be key. Part of the appeal of LAMP applications are their low personnel cost. For example, Rhode Island put up its Rules and Regulations Database Web site as a LAMP application for $40,000 — less than many consulting firms would charge just to analyze the problem.

Mixed environments with commercial databases. Many large IT organizations need the performance and scalability of Oracle, DB2, and SQL Server. But they take advantage of the ease of use and lower costs of OSDBMSs whenever possible. Yahoo! Inc. runs most of its business on Oracle, but takes advantage of MySQL to run its popular Yahoo! Finance Web site. This open-source/commercial database mix is increasingly popular. Large companies use commercial databases for more demanding applications like the corporate database, then farm out specific or departmental applications to OSDBMSs.

Environments requiring lowest-cost systems. While total cost of ownership (TCO) for any database is debatable, there’s little doubt that open source means less expensive than commercial. An example: NASA moved their Acquisition Internet Service from Oracle to MySQL primarily due to cost. The application supports a 700-gigabyte database that is as fast as it is inexpensive.

Limitations

So what are the limitations of open source databases? They are (painting with a very broad brush):

      • Fewer features
      • Lesser scalability
      • Fewer database modules or components bundled with the database engine (no bundled OLAP, data mining, reporting services, clustering support, ETL tools, fewer administrative tools, and so on)
      • Fewer third-party tools
      • Fewer applications

Continued OSDBMS development will address some of these shortcomings, while the open-source bandwagon must achieve critical mass to produce more tools and applications. Until then, follow the conventional wisdom of starting with small projects and learning as you go. Incremental gains are always better than jumping in too big and assuming unnecessary risk.

Licensing and Support

The three OSDBMSs use different models for licensing and support. Berkeley DB and MySQL are backed by companies that are responsible for product development and maintenance (Sleepycat Software and MySQL AB, respectively). They have a “dual licensing” model: the software is free and open source. But companies buy licenses when they intend to sell products based on the database, or when they “redistribute” applications they develop to multiple sites, or when they want to keep their application source code proprietary. (Read the license pages for the Sleepycat and MySQL for the details.) Commercial licenses for Berkeley DB and MySQL cost way less than Oracle, DB2, and SQL Server.

PostgreSQL is an open-source community project ¾  there is no company behind it. A global network of Internet-linked volunteers develops and maintains PostgreSQL. The product is distributed under the more liberal BSD license, which basically allows you to do whatever you like with the software as long as you maintain the credits.

Before open source became mainstream, IT managers were concerned about support. Who could they turn to when problems turned up? Now, most IT professionals realize that open source support varies by product, just as it does for commercial products. Moreover, the rise of the Internet as a means of information exchange has led to a new, emerging support model, in which the traditional advantages of commercial support contracts erode in the face of online forums, active mailing lists, and targeted support contracts. Oracle, IBM, and Microsoft offer good support, but it is expensive. Many IT sites are evolving to a new support model combining online community and targeted support contracts.

Where to Learn More

Open-source databases are on a roll. While less widely publicized than products like Linux or Apache, they are used in many IT organizations today and their use is increasing. If you don’t work with them today, you may well run into them sometime in the next several years.

Fortunately, in the open-source world, boatloads of information are freely available online. Go to the vendors’ Web sites for product descriptions, manuals, tools, and the databases themselves:  MySQL at www.mysql.com, PostgreSQL at www.postgresql.org and Berkeley DB at www.sleepycat.com.

An experienced Oracle, DB2, or SQL Server professional can teach him or herself about OSDBMSs through any of the many available books. Browse at Amazon.com or Barnes & Noble and you’ll find books for beginners and experts alike. The table lists my recommendations for a quick start:

MySQL Sams Teach Yourself MySQL in 21 Days, by Anthony Butcher.
  Or, if you’re in a hurry:  Sams Teach Yourself MySQL in 24 Hours, by J. Melonie.
  More advanced: MySQL, 2nd Edition, by Paul DuBois.
   
PostgreSQL Beginning Databases with PostgreSQL, by R. Stones and N. Matthew.
  More advanced:  Practical PostgreSQL, by O’Reilly / Command Prompt.
   
LAMP applications Beginning Databases with PostgreSQL, by R. Stones and N. Matthew.
  More advanced: Practical PostgreSQL, by O’Reilly / Command Prompt.
   
Berkeley DB Berkeley DB, by Sleepycat Software.

Books: Quick-start picks.

Database professionals must continually learn to keep up with change. With open source databases, you’ll find the learning easy and inexpensive.

--

Howard Fosdick is both an Oracle-certificationified DBA and an IBM-certificationified DB2 UDB DBA. He has written many technical articles and is a popular conference speaker, and he has worked hands-on as an independent DBA contractor since 1989. Reach him at hfosdick@compuserve.com.


Contributors : Howard Fosdick
Last modified 2005-11-11 01:12 PM
Transaction Management
Reduce downtime and increase repeat sales by improving end-user experience.
Free White Paper
Database Recovery
Feeling the increased demands on data protection and storage requirements?
Download Free Report!
 
 

Powered by Plone