views:

105

answers:

6

My employer has a database committee and we've been discussing different platforms. Thus far we've had people present on SqlLite, MySQL, and PostreSql. We currently use the Microsoft stack so we're all quite familiar with Microsoft Sql Server.

As a part of this comparison I thought it would be interesting to create a small reference application for each database platform to explore the details involved in working with it.

First: does that idea make sense or is the comparison require going beyond the scope of a trivial sample application?

Second: I would imagine each reference application having a discrete but small set of requirements that fulfill many of the scenarios we run into on a regular basis. Here is what I have so far, what else can be added to the list but still keep the application small enough to be built in a very limited timespan?

  • Connectivity from the application layer

  • Tools for database administration

  • Process of creating a schema (small "s" schema, tables/views/functions other objects)

  • Simple CRUD (Create, Retrieve, Update, Delete)

  • Transaction support

Third: has anyone gone through this process, what are your findings?

A: 

I don't think you need to test the simple CRUD stuff, it's hard to imagine a vendor that doesn't support the basics.

Lance Roberts
A: 

Firstly, you're going beyond the scope of a sample app, in my humble opinion.

Secondly, I'd pick the one most appropriate to the tool or application you wish to develop. For example, are schemas and transactions relevant for a database that stores a single-user app configuration?

Thirdly, I've worked with Access, SQL Server, SQLite, MySQL, PostgreSQL and Oracle, and they all have their place. If you're in the MS space, go with SQL Server (and don't forget Express). There are also ADO.NET ways to talk to the others in my list. It depends on what you want.

Randolph Potter
So when you worked with a specific database, you just used it because someone else had started with it? Or how did you choose, for example, PostgreSQL vs MySQL - what factors make one better for a specific scenario? Coming from the Microsoft universe I have no context for that decision.
David in Dakota
It depends again. Do some research on each, including market penetration. Making a technical decision without playing with the products will limit you to someone else's opinion. For instance, I like MySQL on the web because it's free, but SQL Server is more robust for larger sites. I would imagine that PostgreSQL is also good enough for enterprise-level databases.
Randolph Potter
A: 

Actually learning capabilities of each RDMS is more crucial. Because it depends on the application. If you need spatial data capabilities PostGIS with PostgreSQL is better than MySQL. If you need easy replication, high availability features MySQL seems better. Also there are license issues. A link for comparison here. All has strengths and weaknesses. First get the requirements of your project or projects than compare it with list the features of the RDMSs you pick and decide which one to go.

Gok Demir
A: 

Frankly, I doubt an arbitrarily-defined simple application would be likely to really highlight the differences between database engines. I think you'd be better to read the advertising literature for the various engines to see what they claim as their strong points. Then consider which of these matter to you, and construct tests specifically designed to verify claims that you care about.

For example, here are pros and cons of database engines I've used the most that have mattered to me. I don't claim this is an exhaustive list, but it may give you an idea of things to think about:

MySQL: Note: MySQL has two main engines internally: MyISAM and InnoDB. I have never used the InnoDB. Pros: Fast. Free to cheap depending on how you're using it. Very convenient and easy-to-use commands for managing the schema. Some very useful extensions to the SQL standard, like "insert ... on duplicate". Cons: The MyISAM engine does not support transactions, i.e. there's no rollback. MyISAM engine does not manage foreign keys for you. (InnoDB does not have these drawbacks, but as I say, I've never used it, so I can't comment much further.) Many deviations from SQL standards.

Oracle: Pros: Fast. Generally good conformance to SQL standards. My brother works for Oracle so if you buy there you'll be helping support my family. (Okay, maybe that's not an important pro for you ...) Cons: Difficult to install and manage. Expensive.

Postgres: Pros: Very high conformance to SQL standards. Free. Very good "explain" plans. Cons: Relatively slow. Optimizer is easily confused on complex queries. Some awkwardness in modifying existing tables.

Access: Pros: Easy to install and manage. Very easy to use schema management. Built-in data entry tools and query builder for quick-and-dirty stuff. Cheap. Cons: Slow. Unreliable with multiple users.

Jay
"insert ... on duplicate" is non-standard -- the standard version is MERGE (which PostgreSQL unfortunately lacks). Also, I don't think its entirely fair to call PostgreSQL slow or easily confused and call MySQL fast. MySQL has neither a real optimizer nor very many plans to choose from (no merge join or hash join!). And what is awkward about modifying existing tables in PostgreSQL?
Jeff Davis
Umm, I know "insert ... on duplicate" is non-standard. That's why I referred to it as an "extension to the SQL standard". It is not at all the same as "merge". "merge" reads from a source table and writes to a destination table, i.e. it expects two tables to be involved. "insert on duplicate" can be used when adding records directly to a table.
Jay
It's true that MySQL has fewer methods for joining tables, but in any benchmark I've seen, it's nevertheless faster. I believe Oracle and MySQL generally win the performance benchmarks. If you examine the explain plans produced by Postgres, I think you will see that as the queries get complex, Postgres often puts together very poor plans. There are complex queries that confuse MySQL, but in my (admittedly anecdotal) experience this happens much less often. My point here isn't to run down Postgres -- it definately has its advantages, and I listed several.
Jay
+1  A: 

Does that idea make sense or is the comparison require going beyond the scope of a trivial sample application?

I don't think it's a good idea. Most of the things that will really affect you are long term database management issues, and how the database management system you choose can handle those things.

You could be tempted in the short term with things like "I found out in 3 seconds how to do this with XYZ database management system". Now, I'm not saying support is not important; quite the contrary. But finding an answer in google in 3 seconds means that you got an answer to a simple question. How quickly, if ever, can you find an answer to a challenging problem?

A short list (not exhaustive) of important things to consider are:

  • backup and recovery -- at both logical level and physical level
  • good support for functions (or stored procedures), triggers, various SQL query constructs
  • APIs that allow real extensibility -- these things can get you out of tough situations and allow you to solve problems in creative ways. You'd be surprised what can be accomplished with user-defined types and functions. How do the user-defined types interact with the indexing system?
  • SQL standard support -- doesn't trump everything else, but if support is lacking in a few areas, really consider why it is lacking, what the workarounds are, and what are the costs of those workarounds.
  • A powerful executor that offers a range of fundamental algorithms (e.g. hash join, merge join, etc.) and indexing structures (btree, hash, maybe a full text option, etc.). If it's missing some algorithms or index structures, consider the types of questions that the database will be inefficient at answering. Note: I don't just mean "slow" here; the wrong algorithm can easily be worse by orders of magnitude.
  • Can the type system reasonably represent your business? If the set of types available is incredibly weak, you will have a mess. Representing everything as strings is kind of like assembly programming (untyped), and you will have a mess.

A trivial application won't show you any of those things. Simple things are simple to solve. If you have a "database committee" then your company cares about its data, and you should take the responsibility seriously. You need to make sure that you can develop applications on it easily with the results you and your developers expect; and when you run into problems you need to have access to a powerful system and quality support that can get you through it.

Jeff Davis
A: 

I think that you can investigate Firebird too

This is an extract of Firebird-General on yahoogroups and I find it quite objective

Our natural audience is developers who want to package and sell proprietary applications. Firebird is easier to package and install than Postgres; more capable than SQLite; and doesn't charge a royalty like MySQL.

Hugues Van Landeghem
True. I only have brief experience with Firebird, but I think Access and Firebird are the only db's I've seen that are easily packaged for deployment by an end-user. Asking your average non-IT person to install MySQL or Postgres or Oracle or MS SQL Server would surely be an excellent way to save your help desk people from boredom.
Jay