views:

98

answers:

3

I have a large table with millions of records.

Table `price`
------------
id 
product 
site 
value

The table is brand new, and there are no indexes created.

I then issued a request for new index creation with the following query:

CREATE INDEX ix_price_site_product_value_id ON price (site, product, value, id);

This took long long time, last time I was checking ran for 5000+ seconds, because of the machine.

I am wondering if I issue another index creation, will it use the existing index in the process calculation? If so in what form?

Next to run query 1:

CREATE INDEX ix_price_product_value_id ON price (product, value, id);

Next to run query 2:

CREATE INDEX ix_price_value_id ON price (value, id);
+1  A: 

Having an existing index won't improve the speed of creating new indexes because creating an index requires a full scan.

To speed things up, instead of creating three indexes:

  • (site, product, value, id)
  • (product, value, id)
  • (value, id)

you can reverse the column order and have just one index that can serve all three queries:

  • (id, value, product, site)

This is because when MySQL looks for a suitable index it will also be able to use any prefix of an index.

Mark Byers
Correct but note that using the proposed index will not allow the user to search for products or sites only.
cherouvim
Changing the order of indexes won't solve the problem. Check the post of the problem here: http://explainextended.com/2010/09/30/olap-in-mysql-four-ways-to-filter-on-higher-level-dimensions/
Pentium10
+3  A: 

If you're using MySQL version 5.1, and the InnoDB storage engine, you may want to use the InnoDB Plugin 1.0, which supports a new feature called Fast Index Creation. This allows the storage engine to create indexes without copying the contents of the entire table.

Overview of the InnoDB Plugin:

Starting with version 5.1, MySQL AB has promoted the idea of a “pluggable” storage engine architecture, which permits multiple storage engines to be added to MySQL. Currently, however, most users have accessed only those storage engines that are distributed by MySQL AB, and are linked into the binary (executable) releases.

Since 2001, MySQL AB has distributed the InnoDB transactional storage engine with its releases (both source and binary). Beginning with MySQL version 5.1, it is possible for users to swap out one version of InnoDB and use another.

Source: Introduction to the InnoDB Plugin

Overview of Fast Index Creation:

In MySQL versions up to 5.0, adding or dropping an index on a table with existing data can be very slow if the table has many rows. The CREATE INDEX and DROP INDEX commands work by creating a new, empty table defined with the requested set of indexes. It then copies the existing rows to the new table one-by-one, updating the indexes as it goes. Inserting entries into the indexes in this fashion, where the key values are not sorted, requires random access to the index nodes, and is far from optimal. After all rows from the original table are copied, the old table is dropped and the copy is renamed with the name of the original table.

Beginning with version 5.1, MySQL allows a storage engine to create or drop indexes without copying the contents of the entire table. The standard built-in InnoDB in MySQL version 5.1, however, does not take advantage of this capability. With the InnoDB Plugin, however, users can in most cases add and drop indexes much more efficiently than with prior releases.

...

Changing the clustered index requires copying the data, even with the InnoDB Plugin. However, adding or dropping a secondary index with the InnoDB Plugin is much faster, since it does not involve copying the data.

Source: Overview of Fast Index Creation

Daniel Vassallo
Sounds cool. How do I check if the machine has InnoDB Plugin installed?
Pentium10
Do `SELECT VERSION()` in your MySQL client. Is your MySQL server v5.1+?
Daniel Vassallo
my version is: 5.1.33-community-log
Pentium10
@Pentium10: That's good. Then you may wanto to follow the instructions here: [Obtaining and Installing the InnoDB Plugin](http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-plugin-introduction-obtaining.html)
Daniel Vassallo
@Pentium10: You can also check if it is already installed by the `SHOW PLUGINS;` command. Check out the expected output in point 6 here: http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-plugin-installation-dynamic-posix.html
Daniel Vassallo
You may also want to check this out: http://dev.mysql.com/doc/refman/5.1/en/replacing-builtin-innodb.html
Daniel Vassallo
I've upgraded to 5.5 which has the plugin inbuilt, and now everything works smooth. Thank you.
Pentium10
@Pentium10: I'm glad it's working smoothly. Thanks for leaving feedback.
Daniel Vassallo
+3  A: 

I am wondering if I issue another index creation, will it use the existing index in the process calculation? If so in what form?

No, it won't.

Theoretically, an index on (site, product, value, id) has everything required to build an index on any subset of these fields (including the indices on (product, value, id) and (value, id)).

However, building an index from a secondary index is not supported.

First, MySQL does not support fast full index scan (that is scanning an index in physical order rather than logical), thus making an index access path more expensive than the table read. This is not a problem for InnoDB, since the table itself is always clustered.

Second, the record orders in these indexes are completely different so the records need to be sorted anyway.

However, the main problem with the index creation speed in MySQL is that it generates the order on site (just inserting the records one by one into a B-Tree) instead of using a presorted source. As @Daniel mentioned, fast index creation solves this problem. It is available as a plugin for 5.1 and comes preinstalled in 5.5.

Quassnoi
@Quassnoi I would definitely read more about choosing all the stuff engine etc., and how index creation speed works on different data types. Maybe this should be a new material for a new post on your blog. What do you say?
Pentium10
@Pentium10: a good idea.
Quassnoi