views:

305

answers:

5

First of all, I should point out that I'm not asking this question for any practical reason. I'm just asking out of sheer curiosity.

Why is it that SQL distributions are so non-standard? I know there's an ANSI standard for SQL, but it just seems that developers of SQL databases go out of their way to find different ways to do things. First of all I should ask: am I getting the point? Are there really that many meaningful differences in the way SQL databases work or is it just the two databases I've been working with (MS-SQL and Postgresql)?

Next, why are there so many differences? I should say that I'm a beginner of this, so I haven't gotten to any of the bigger issues that I may be missing.

+8  A: 

The ANSI standard specifies only a limited set of commands and data types. Once you go beyond those, the implementors are on their own. And some very important concepts aren't specified at all, such as auto-incrementing columns. SQLite just picks the first non-null integer, MySQL requires AUTO INCREMENT, PostgreSQL uses sequences, etc. It's a mess, and that's only among the OSS databases! Try getting Oracle, Microsoft, and IBM to collectively decide on a tricky bit of functionality.

John Millikin
+5  A: 

It's a form of "Stealth lock-in". Joel goes into great detail here:

Companies end up tying their business functionality to non-standard or weird unsupported functionality in their implementation, this restricts their ability to move away from their vendor to a competitor.

On the other hand, it's pretty short-sighted because anyone with half a brain will tend to abstract away the proprietary pieces, or avoid the lock-in altogether, if it gets too egregious.

1800 INFORMATION
+2  A: 

It's certainly effective lock-in, as 1800 says. But in fairness to the database vendors, the SQL standard is always playing catch-up to current databases' feature sets. Most databases we have today are of pretty ancient lineages. If you trace Microsoft SQL Server back to its roots, I think you'll find Ingres - one of the very first relational databases written in the '70s. And Postgres was originally written by some of the same people in the '80s as a successor to Ingres. Oracle goes way back, and I'm not sure where MySQL came in.

Database non-portability does suck, but it could be a lot worse.

Neall
+4  A: 

John: The standard actually covers lots of subjects, including identity columns, sequences, triggers, routines, upsert, etc. But of course, many of these standards-components may have been brought in place later than the first implementations; and this could be a reason why SQL standards compliance is somewhat low, generally.

Neall: There are actually areas where the SQL standard is ahead of the implementations. For example, it would be nice to have CREATE ASSERTION, but as far as I know, no DBMS implements assertions yet.

Personally, I believe that the closed nature of some ISO standards (like the SQL standard) is part of the problem: When a standard is not readily available online, it's less likely to be known by implementors/planners, and too few customers ask for compliance because they don't know what to ask for.

Troels Arvin
+3  A: 

First, I don't find databases to be as, say, browsers or operating systems in terms of incompatibility. Anyone with a few hours of training can start doing selects, inserts, deletes and updates on any SQL database. Meanwhile, it's difficult to write HTML that renders identically on every browser or write system code for more than one OS. Generally, differences in SQL are related to performance or fairly esoteric features. The major exception seems to be date formats and functions.

Second, database developers generally are motivated to add features that differentiate their product from everyone else. Products like Oracle, MS SQL Server and MySQL are vast ecosystems that rarely cross-pollinate in practice. At my workplace, we use Oracle and MySQL, but we could probably switch over to 100% Oracle in about a day if needed or desired. So I care a lot about the shiny toys Oracle gives us with each release, but I don't even know what version of MySQL we are using. IBM, Microsoft, PostgreSQL and the rest might as well not exist as far as we are concerned. Having the features to get and keep customers and users is far more important than compatibility in the database world. (That's the positive spin on the "lock-in" answer, I suppose.)

Third, there are legitimate reasons for different companies to implement SQL differently. For instance, Oracle has a multi-versioning system that allows very fast ans scalable consistent reads. Other databases lack that feature, but usually are faster inserting rows and rolling back transactions. This is a fundamental difference in these systems. It doesn't make one better than the other (at least in the general case), just different. One should not be surprised if the SQL ontop of a database engine takes advantage of its strengths and attempts to minimize its weaknesses. In fact, it would be irresponsible of the developers to not do this.

Jon Ericson