Why don't databases automatically index tables based on query frequency? Do any tools exist that analyze a database and the queries it receives, and then automatically create indexes, or at least suggest which ones to create?

I'm specifically interested in MySQL, but I'd be curious for other databases as well.

+1  A: 

There are tools out there for this.

For MS SQL, use the SQL Profiler (to record activity against the database), and the Database Engine Tuning Advisor (SQL 2005) or the Index Tuning Wizard (SQL 2000) to analyze the activities and recommend indexes or other improvements.

BradC
A: 

Google App Engine does that (see the index.yaml file).

friol
+4  A: 

There are database optimizers that can be enabled or attached to databases to suggest (and in some cases create) indexes that might help things out.

However, it's not a trivial problem, and when these aids first came out, users sometimes found that they slowed their databases down because of inferior optimizations.

Lastly, there's a LOT of money in the industry for database architects, and they prefer the status quo.

Still, databases are becoming more intelligent. If you use SQL Server Profiler with Microsoft SQL Server, you'll find ways to speed your server up. Other databases have similar profilers, and there are third-party utilities to do this work.

But if you're the one writing the queries, hopefully you know enough about what you're doing to index the right fields. If not then having the right indexes is likely the least of your problems...

Adam Davis
What a silly statement, "database architects prefer the status quo". Yep we're a large cartel that squashes every attempt to make databases self-indexing. Like the simple device you add to your car to get 100mpg that the oil companies are hiding from us.
@Adam Davis: "But if you're the one writing the queries, hopefully you know enough about what you're doing to index the right fields. If not then having the right indexes is likely the least of your problems" - not having the right indexes describes a good proportion of all databases out there...
Mitch Wheat
+1  A: 

SQL Server 2005 also keeps an internal record of suggested indexes, based on usage data. It's not as complete or accurate as the Tuning Advisor, but it is automatic. Look up sys.dm_db_missing_index_groups and the related missing-index DMVs for more information.

Nick
A: 

I agree with what Adam Davis says in his comment. I'll add that if such a mechanism existed to create indexes automatically, the most common reaction to this feature would be, "That's nice... How do I turn it off?"

Bill Karwin
+2  A: 

This is the best question I have seen on Stack Overflow. Unfortunately I don't have an answer. Google's BigTable does automatically index the right columns, but BigTable doesn't allow arbitrary joins, so the problem space is much smaller.

The only answer I can give is this:

One day someone asked, "Why can't the computer just analyze my code and compile & statically type the pieces of code that run most often?"

People are solving this problem today (e.g. Tamarin in FF3.1), and I think "auto-indexing" relational databases is the same class of problem, but it isn't as much of a priority. A decade from now, manually adding indexes to a database will be considered a waste of time. For now, we are stuck with monitoring slow queries and running optimizers.

John C
If there were one right answer, the database would do it already. There's always a trade-off. You could have hundreds of indexes and your queries would run fast, but inserts and updates would drag. Which is better? Just because a query runs frequently doesn't mean it's the most important job to you.
@Mark Brady: spot on: it's always a trade-off.
Mitch Wheat
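To put a rough number on the trade-off described in the comments above, here is a minimal sketch using Python's built-in sqlite3 module (not MySQL). It loads the same rows into a table with zero, one, and three indexes and compares how many database pages end up allocated. The table `t`, its columns, and the helper `pages_used` are made up for illustration, and page count is only a stand-in for the extra work each insert has to do; it is not a timing benchmark.

```python
import sqlite3

def pages_used(index_count, rows=20_000):
    """Load identical rows into a table with `index_count` indexes
    and return how many pages the (in-memory) database occupies."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER)"
    )
    # Hypothetical indexes: one per column, up to index_count of them.
    for col in ["a", "b", "c"][:index_count]:
        conn.execute(f"CREATE INDEX idx_{col} ON t ({col})")
    # Every insert below must also add an entry to each index.
    conn.executemany(
        "INSERT INTO t (a, b, c) VALUES (?, ?, ?)",
        [(i, i * 2, i * 3) for i in range(rows)],
    )
    conn.commit()
    pages = conn.execute("PRAGMA page_count").fetchone()[0]
    conn.close()
    return pages

for n in (0, 1, 3):
    print(f"{n} indexes -> {pages_used(n)} pages")
```

The page count grows with every index added, because each index is a separate structure that has to be written on every insert. Whether that cost is worth it depends on how reads and writes are balanced in the actual workload.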
+1  A: 

Part of the reason may be that indexes don't just give a small speedup. If you don't have a suitable index on a large table, queries can run so slowly that the application is entirely unusable, and if it interacts with other software it may simply not work. So you really need the indexes to be right before you start trying to use the application.
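As a small illustration of that difference, the sketch below uses Python's built-in sqlite3 module (not MySQL, though MySQL's EXPLAIN serves a similar purpose) to show how the planner's strategy for the same lookup changes once an index exists. The `orders` table, its columns, and the index name are assumptions made up for this example.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable detail of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 0.5) for i in range(50_000)],
)

lookup = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index the planner has to read every row in the table.
plan_before = query_plan(conn, lookup, (42,))

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index it can jump straight to the matching rows.
plan_after = query_plan(conn, lookup, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan is a full table scan and the second names the new index.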

Also, rather than building an index in the background, and slowing things down further while it's being built, it is better to have the index defined before you start adding significant amounts of data.

I'm sure we'll get more tools that take sample queries and work out what indexes are necessary. We will probably also eventually get databases that do as you suggest: monitor performance and add the indexes they think are necessary. But I don't think they will be a replacement for starting off with the right indexes.
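A deliberately naive sketch of such a sample-query tool, in Python: it counts which (table, column) pairs appear in WHERE clauses and reports the ones that show up often. The query log, the regular expressions, and the `suggest_indexes` helper are all hypothetical; a real advisor would use the database's own parser and cost model, and would also weigh write cost, joins, and column selectivity.

```python
import re
from collections import Counter

# Hypothetical sample of logged queries; a real tool would read
# something like a slow-query log instead.
QUERY_LOG = [
    "SELECT * FROM orders WHERE customer_id = 7",
    "SELECT * FROM orders WHERE customer_id = 9 AND status = 'open'",
    "SELECT * FROM users WHERE email = 'a@example.com'",
    "SELECT * FROM orders WHERE customer_id = 3",
]

# Very rough patterns: single-table queries with simple comparisons only.
FROM_WHERE = re.compile(r"FROM\s+(\w+)\s+WHERE\s+(.*)", re.IGNORECASE)
COMPARED_COLUMN = re.compile(r"(\w+)\s*(?:=|<|>)")

def suggest_indexes(queries, min_hits=2):
    """Return (table, column, hits) for columns filtered on at least min_hits times."""
    hits = Counter()
    for query in queries:
        match = FROM_WHERE.search(query)
        if not match:
            continue  # no WHERE clause, nothing to learn from this query
        table, where_clause = match.group(1), match.group(2)
        for column in COMPARED_COLUMN.findall(where_clause):
            hits[(table, column)] += 1
    return [(t, c, n) for (t, c), n in hits.most_common() if n >= min_hits]

for table, column, count in suggest_indexes(QUERY_LOG):
    print(f"consider an index on {table}({column}): used in {count} queries")
```

Frequency alone is a weak signal, which is part of why this kind of output is better treated as a suggestion to review than as something to apply automatically.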

Mark Baker
A: 

I think there is a script on an MS SQL blog for suggesting indexes in SQL 2005, but I can't find the exact script right now! From the description, as I recall, it's just the thing. Here's a link to some more info: http://blogs.msdn.com/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx

PS: this only applies to SQL Server 2005 and later.

mcintyre321
A: 

It seems that MySQL doesn't have a user-friendly profiler. Maybe you want to try something like this, a PHP class based on the MySQL profiler.

polyphony
+1  A: 

Yes, some engines DO support automatic indexing. One such example for MySQL is Infobright: their engine does not support "conventional" indexes and instead implicitly indexes everything. It is a column-based storage engine.

The behaviour of such engines tends to be very different from what developers expect. (And yes, you need to be a DEVELOPER to even be thinking about using Infobright; it is not a plug-in replacement for a standard engine.)

MarkR