EDIT: I've just started skimming Codd's famous 1970 paper that started it all, and that Oracle was based on (A Relational Model of Data for Large Shared Data Banks [pdf]), and was amazed to find that it seems to answer this SO question. It talks about the databases on the market at that time ("hierarchical" and "network" - like NoSQL?) and the need for independence from internal representation, and gives a clear explanation of how to apply mathematical "relations" to a database.


Historically, what feature of relational databases gave what benefit that caused businesses to adopt it, making it massively successful?

Today, there are many reasons to use an RDB: it's standard, products are mature, debugged and full-featured, there's a choice of vendors, there's support, there's a trained workforce and so on. But why did it become so popular in the first place?

I've heard "hierarchical databases" were popular before relational databases - they sound like a key-value store, where the value can be another set of key-values. If so, that is similar to the object oriented databases that were publicized a decade or two ago; and also to XML/document databases and NoSQL.

Maybe ACID transactions (atomicity etc)? But that doesn't seem specific to RDB.

Maybe because relational databases enabled you to define a data schema that was purely about the data - independent of a particular programming language, version of an application (evolution), or purpose of the application (this makes "impedance mismatch" inevitable)? But any database with a data schema has this feature.

Maybe because the relational model is mathematically sound? But this doesn't sound like it would convince managers to adopt it - and what would be the business benefit?

Maybe because the mathematical model gives you a way to rearrange the database into different normal forms to give different performance characteristics, which are mathematically guaranteed to not change the meaning of the data? This seems plausible, and my uni textbooks make a big deal of it, but it doesn't sound very compelling to me as a practical business benefit (maybe I'm missing something)?
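To make the "guaranteed not to change the meaning" part concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (orders_flat, customers, orders) are invented for illustration and assume that a customer's city depends only on the customer. The flat table is split in two, and a join rebuilds exactly the original rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Flat design: the customer's city is repeated on every order row.
    CREATE TABLE orders_flat (
        order_id INTEGER PRIMARY KEY, customer TEXT, city TEXT, item TEXT);
    INSERT INTO orders_flat VALUES
        (1, 'alice', 'Leeds', 'disk'),
        (2, 'alice', 'Leeds', 'tape'),
        (3, 'bob',   'York',  'disk');

    -- Decomposed design: city depends only on customer (our assumption),
    -- so it moves into its own table and is stored once per customer.
    CREATE TABLE customers (customer TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT REFERENCES customers(customer),
        item TEXT);
    INSERT INTO customers SELECT DISTINCT customer, city FROM orders_flat;
    INSERT INTO orders SELECT order_id, customer, item FROM orders_flat;
""")

flat = conn.execute(
    "SELECT order_id, customer, city, item FROM orders_flat ORDER BY order_id"
).fetchall()

# Joining the two smaller tables reconstructs the original rows.
rejoined = conn.execute("""
    SELECT o.order_id, o.customer, c.city, o.item
    FROM orders AS o JOIN customers AS c ON c.customer = o.customer
    ORDER BY o.order_id
""").fetchall()

print(flat == rejoined)
```

In the decomposed form, changing alice's city means updating one row instead of every one of her orders, which is the kind of practical benefit the textbooks are getting at.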

To summarise: historically, what made the relational model win so decisively over the hierarchical model? I'm also interested in whether RDBs still have some special quality that actively makes them a better practical choice for businesses (other than the benefits of being a standard mentioned above).

Many thanks if you can shed some light - I've long been curious about this.

+3  A: 

For the same reason that scripting languages are popular.

You can make a query with your favorite text editor and just issue it, without bothering about the actual physical schema.

It's not the fastest model, not the most reliable model — it's just the most productive model. You can write ten times as many queries in an hour.

You may want to read this article in my blog which compares the most popular database models:

Quassnoi
now a community wiki - please feel free to modify/add! thanks for your article, it seems to be right on the money. I'm having trouble connecting your answer here and it, but I haven't read the article in full detail yet.
13ren
+2  A: 

As far as I know, it was normalization theory (Codd's well-known Third Normal Form), which defines a relational data model that is easy and efficient for storing and retrieving data. This was followed by the Structured Query Language (SQL), which allowed it to be used across all relational database systems. Standardization was definitely lacking back then, which also made this appealing to many.

Fadrian Sudaman
Thanks - I was reading in "Crossing the Chasm" that Oracle spearheaded the standardization on SQL, by porting their database to everything in sight (even though SQL was developed by IBM). This standardization helped relational databases enormously, but I believe they were already popular (and had "won" against hierarchical databases) before then. What made the Third Normal Form special compared to the others?
13ren
+1  A: 

The concept of making a logical representation of data abstracted from its physical representation was perhaps the most game-changing aspect of Codd's idea. He was apparently the first person who fully realised the benefits of separating logical and physical concerns, and therefore the first to devise a data model worthy of the name. By describing a model based on relations, without navigational links or pointer structures, he also created something uniquely powerful, flexible and of lasting relevance.
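A rough illustration of "relations rather than navigational links", using Python and its built-in sqlite3 module. The data and names (hierarchy, department_of, works_in) are made up for the example. In the navigational layout, a question that runs against the stored structure needs hand-written traversal code; in the relational layout, both directions are the same kind of query:

```python
import sqlite3

# Navigational (hierarchical) layout: employees are reachable only
# by walking down from their department.
hierarchy = {
    "sales": {"employees": ["ann", "raj"]},
    "ops": {"employees": ["lee"]},
}

def department_of(name):
    """Answer a question "against the grain" by traversing every branch."""
    for dept, node in hierarchy.items():
        if name in node["employees"]:
            return dept
    return None

# Relational layout: one relation of (employee, department) facts,
# with no built-in direction of access.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE works_in (employee TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO works_in VALUES (?, ?)",
    [("ann", "sales"), ("raj", "sales"), ("lee", "ops")],
)

# Department -> employees and employee -> department use the same mechanism.
by_dept = conn.execute(
    "SELECT employee FROM works_in WHERE department = ? ORDER BY employee",
    ("sales",),
).fetchall()
by_emp = conn.execute(
    "SELECT department FROM works_in WHERE employee = ?", ("lee",)
).fetchall()

print(department_of("lee"), by_dept, by_emp)
```

This is only a toy, but it shows why programs written against relations do not need rewriting when a new kind of question comes along.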

To be accurate it must be said that it was the SQL model rather than the relational one which eventually proved more successful commercially. SQL is a long way from a truly relational data model or language even though it would not have come into being without Codd's ideas to inspire it. The relational model's creator was naturally disappointed that SQL rather than relational became the database standard. Four decades later I think we have plenty of cause to regret that Codd's relational model isn't better supported by DBMS software.

dportas
Thanks - "a model based on relations [rather than] pointers" - this seems to be the answer, though I need more context to understand it. I remember the logical/physical distinction being emphasized. "SQL rather than relational became the database standard" - but isn't SQL a (partial) implementation of the relational model? [BTW: I feel you've left a word out when you compare SQL and relational, since they seem to be different kinds of things - but probably I just don't understand the background well enough.]
13ren
+1  A: 

One key was the self-contained products: you no longer had to manually define and maintain your key files (indexes), and you could change the data model with less effort. Combined with the set-based structures, that made it a compelling product set to work with. Add the SQL language on top of that to return data, and it was a win-win situation over the traditional ISAM data structures primarily associated with COBOL.
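As a small sketch of the index point, assuming Python's built-in sqlite3 module and an invented orders table: the index is declared once, the engine maintains it from then on, and the query text does not change at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"cust{i % 50}", float(i)) for i in range(1000)],
)

QUERY = "SELECT order_id, total FROM orders WHERE customer = ? ORDER BY order_id"

# Run the query with no index at all.
before = conn.execute(QUERY, ("cust7",)).fetchall()

# Declare an index; the engine builds it and keeps it up to date itself.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

# The application's query is untouched and returns the same rows.
after = conn.execute(QUERY, ("cust7",)).fetchall()
print(before == after, len(before))

# Ask the engine how it now plans to run the query (wording varies by version).
for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("cust7",)):
    print(row[-1])
```

With ISAM-style key files, the equivalent change would typically have meant touching the programs that read the data as well.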

Mark Schultheiss