An index is a database structure that can help speed access to individual rows of the database, when searching based on the field(s) in the index.
In your example, the CREATE INDEX statement creates an index named tags_tag on the table tags using the column tag. If you wanted to search for rows from the table based on the tag field, the database might use the index to look up the row more efficiently. Without an index, the database might have to resort to a full scan of the table, which can take much longer (depending on many factors, like size of the table, distribution of values, exact query criteria). Different databases also support different types of indexes, which can be used to search for data in different ways.
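If you want to see the difference for yourself, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in, since the question doesn't say which database you use, and the id column, the sample rows, and the query_plan helper are my own assumptions for illustration. It asks the database for its query plan before and after creating the index:

```python
import sqlite3

# Hypothetical schema: the question only names the table (tags) and column (tag).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT)")
conn.executemany(
    "INSERT INTO tags (tag) VALUES (?)",
    [(f"tag{i}",) for i in range(1000)],
)


def query_plan(connection, sql, params=()):
    """Return SQLite's plan description for a query as a single string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return "; ".join(row[-1] for row in rows)


lookup = "SELECT id FROM tags WHERE tag = ?"

plan_before = query_plan(conn, lookup, ("tag42",))
conn.execute("CREATE INDEX tags_tag ON tags (tag)")
plan_after = query_plan(conn, lookup, ("tag42",))

print(plan_before)  # should describe a scan of the whole table
print(plan_after)   # should mention the tags_tag index
```

The exact wording of the plan differs between SQLite versions, and other databases have their own EXPLAIN syntax and output, so treat this as a way to check what your database does rather than a guarantee.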
Indexes also have a cost: every index on a table slows down writes to that table. If you insert a row, having an index means that in addition to writing the row itself, the database also has to update the index.
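A rough way to observe that extra work, again sketched with SQLite as a stand-in: the load_tags helper below (a hypothetical name, as are the row count and schema) loads the same rows into a table with and without the index, then reports the elapsed time and how many database pages were used. The page count serves as a proxy for how much the database had to write.

```python
import sqlite3
import time


def load_tags(with_index, n_rows=20000):
    """Insert n_rows tags and return (elapsed_seconds, row_count, page_count)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT)")
    if with_index:
        conn.execute("CREATE INDEX tags_tag ON tags (tag)")
    # Shuffle the values a little so they do not arrive in sorted order.
    rows = [(f"tag{(i * 7919) % n_rows}",) for i in range(n_rows)]

    start = time.perf_counter()
    with conn:  # one transaction for the whole batch
        conn.executemany("INSERT INTO tags (tag) VALUES (?)", rows)
    elapsed = time.perf_counter() - start

    row_count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    conn.close()
    return elapsed, row_count, page_count


plain = load_tags(with_index=False)
indexed = load_tags(with_index=True)
print("no index:   %.4fs, %d rows, %d pages" % plain)
print("with index: %.4fs, %d rows, %d pages" % indexed)
```

Timings will vary from run to run and machine to machine, so don't read too much into a single measurement. The indexed table should consistently occupy more pages, though, because every inserted row also adds an entry to the index.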
Deciding which columns to put an index on can be tricky, and as always, benchmarks or real-world queries against real-world data are the most accurate way of measuring performance. In general you will want indexes on columns that you will be searching on. So if you are likely to want to look up a row by tag, then it definitely makes sense to put an index there. But if you have an address book, you are (probably) not going to need to search on street, ZIP/postcode or phone number, so it is not going to be worth the write performance hit.
Your primary key column(s) will almost always have an index automatically generated by the database. And if you want to keep the values of a particular column unique, a UNIQUE INDEX can be created to enforce this.
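As a small illustration of the uniqueness point, here is a sketch that uses SQLite through Python's sqlite3 module as a stand-in database. The index name tags_tag_unique and the sample value are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT)")
# Hypothetical index name; UNIQUE makes the database reject duplicate tags.
conn.execute("CREATE UNIQUE INDEX tags_tag_unique ON tags (tag)")

conn.execute("INSERT INTO tags (tag) VALUES ('sql')")

try:
    # Second insert of the same value violates the unique index.
    conn.execute("INSERT INTO tags (tag) VALUES ('sql')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
print(duplicate_rejected, row_count)  # prints: True 1
```

Other databases report the violation with their own error types and messages, but the behaviour is the same in principle: the second insert fails and only one row remains.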
This SO question asks about rules of thumb for database indexes, which may be useful.