Hai guys,
when to set index for a table (ie) during table creation or on performance tuning?
What are the advantages and disadvantages of indexing?
Hai guys,
when to set index for a table (ie) during table creation or on performance tuning?
What are the advantages and disadvantages of indexing?
In general, you set up the indexes during table creation. In this phase you should already have a good indication of how your data will be queried, and which fields will be used most as query criteria.
During performance tuning, it is generally recommended to have the obvious indexes in place. During this phase, you will be able to understand if there are indexes that are not being used efficiently, or if there are queries where an index can increase performance. There is nothing stopping you for dropping or adding new indexes during performance tuning.
There is a balance that needs to be struck. If you KNOW that a table will be queried and FieldA will be part of the where clause and it is a field which is highly selectable (google cardinality) then it makes a good candidate for pre-emptive tuning.
DONT throw indexes on all kinds of fields because you THINK it makes sense, one must KNOW these things. Premature tuning/optimizations are the root of all evil a wise man once said. In this case indexes can hurt the insert/update performance because not only does the table data need to be updated but the index as well.
Side note - for some large data loads people will often drop indexes, do the load and then re-create the indexes so that the load performs more quickly.
As far as the advantages and disadvantages - well that is a huge topic. I suggest you start here.
I'd add the 'obvious' indexes, ie on fields you know will be queried at table create time, and then add others as necessary as part of a tune. It's probably better to have fewer indexes at first and then get a feel for how the system is performing and being used - the profile is your friend hear.
Advantages - faster access (when the index is being used), and the ability to enforce certain business logic like no duplicates.
Disadvantages - table takes more space, inserting rows is slower (can be much slower), updates that touch the key field(s) are slower
Many (most?) DBMS use indexes to support unique constraints. Always create indexes to enforce unique constraints; they (the constraints) are crucial to the correct operation of your database.
Where you have a choice of how to create the index on multiple columns, put the column that will always be referenced in the queries ahead of other fields - usually. This is best if the leading column is also somewhat selective.
After you have the constraints necessary for uniqueness, consider those needed to enforce referential integrity. They are usually mandated by the DBMS too. Again, you cannot afford to have your database in a state of disintegrity -- it is a logical system, and if it contains fallacies, you can prove anything from it, which is not helpful.
After the uniqueness and referential integrity constraints are dealt with (indexed), then you may or may not benefit from some others. Choose carefully, and add as few extras as possible (zero is a good number). Each index slows up update operations (UPDATE, INSERT, DELETE), and uses storage space. The intention is that it should win its place by speeding up queries. However, don't forget that the optimizer has to think about each index and whether it can be useful in answering the query, so indexes also slow down the optimizer (though you'd probably be hard pressed to measure that effect).
When you do add indexes, add them on selective columns (not 'sex' containing 'M' and 'F', but maybe 'dob' containing dates of birth between 1900 and 2010, or maybe even more distinct values than that. Consider whether extra columns will help answer more queries. Some DBMS (such as DB2) provide for indexes with extra columns that are not part of the uniqueness constraint but which provide columns that are frequently used in the query. These can allow an index-only scan (one which does not need to access the table data because the needed values are all in the index).
There is much more that could be said, but this covers a lot of the territory.