I wonder what your most common SQL optimizations are.
- Indexes. It's the most common optimization.
- Denormalizing tables.
- Removing constraints (only if you know what you are doing).
Lowering transaction isolation levels to get around table locks for user queries. Not all the time, but for GUIs showing general information it works great.
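A minimal sketch of that, assuming SQL Server syntax (the table and columns are hypothetical):

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- A read-only status screen where dirty reads are acceptable:
SELECT OrderId, Status, LastUpdated
FROM OrderSummary;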
Caching DB output. Avoiding putting pressure on the database at all seems to be a prudent optimization.
+1 memcached.
Reducing the amount of data that is returned, by only returning the fields required and only returning the rows required. This is the most common, as you do it for every query that returns data.
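A minimal sketch of that point (table and column names are hypothetical):

SELECT CustomerId, OrderDate           -- only the fields required
FROM Orders
WHERE OrderDate >= '2008-01-01';       -- only the rows required
-- instead of: SELECT * FROM Orders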
Adding indexes. This is not done as frequently, as some tables don't need any index other than the one created for the primary key.
If you're talking common as in really common, then indexes are the first thing that pops into my head.
They are a powerful technique that is often misunderstood and quite often abused.
Then I would place denormalization, which can add quite a bit of performance for many databases.
Query optimization is third, and it helps a lot too. I use MySQL these days, and query logging helps a lot for optimization.
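For the query-logging part, MySQL's slow query log is the usual tool; these are real MySQL system variables:

SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;   -- log any statement taking longer than 1 second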
Memcached is definitely not common, though caching of some sort is a part of many websites at the scripting end (ASP.Net or PHP).
My favorite list of tips (explained in detail here) is as follows:
- Try to restrict the query's result set by using the WHERE clause.
- Try to restrict the query's result set by returning only the particular columns you need from the table, not all of the table's columns.
- Use views and stored procedures instead of heavy-duty queries.
- Whenever possible, try to avoid using SQL Server cursors.
- If you need to return the table's total row count, you can use an alternative to the SELECT COUNT(*) statement (see the sketch after this list).
- Try to use constraints instead of triggers, whenever possible.
- Use table variables instead of temporary tables.
- Try to avoid the HAVING clause, whenever possible.
- Whenever possible, try to avoid using the DISTINCT clause.
- Include a SET NOCOUNT ON statement in your stored procedures to stop the message indicating the number of rows affected by a T-SQL statement.
- Use select statements with the TOP keyword or the SET ROWCOUNT statement if you need to return only the first n rows.
- Use the FAST number_rows table hint if you need to quickly return 'number_rows' rows.
- Try to use UNION ALL statement instead of UNION, whenever possible.
- Do not use optimizer hints in your queries.
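On the COUNT(*) point above, one alternative in SQL Server is to read the row count from the catalog views instead of scanning the table (the table name is a placeholder; the count can lag slightly behind the live table):

SELECT SUM(p.rows) AS row_count
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID('dbo.MyTable')
  AND p.index_id IN (0, 1);   -- 0 = heap, 1 = clustered index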
1) I've yet to find a situation where
SELECT Field1, Field2, (SELECT Count(*) FROM tblLinked WHERE tblLinked.Field3 = tblSource.Field3) AS TheTotal
FROM tblSource
isn't improved by a LEFT JOIN to a derived table.
SELECT Field1, Field2, IsNull(Linked.TheTotal,0) AS TheTotal
FROM tblSource
LEFT JOIN (SELECT Field3, Count(*) AS TheTotal
FROM tblLinked
GROUP BY Field3) AS Linked ON tblSource.Field3 = Linked.Field3
2) Don't sort the results on the server unless the consuming app is unable to do this itself. This applies less often to web apps, but for desktop apps the client PC usually has plenty of power available and can happily do a sort.
3) Use EXISTS instead of checking the count of matching entries (see the sketch below).
4) Don't get obsessed with doing a query in just one SELECT statement. Judicious use of table variables (and sometimes temporary tables) can massively reduce the rows processed.
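A minimal sketch of point 3, reusing tblLinked from the example above (@Value is a hypothetical parameter):

DECLARE @Value int = 42;
IF EXISTS (SELECT 1 FROM tblLinked WHERE Field3 = @Value)
    PRINT 'found';   -- can stop at the first match
-- instead of:
IF (SELECT COUNT(*) FROM tblLinked WHERE Field3 = @Value) > 0
    PRINT 'found';   -- counts every match before answering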
By far and away: making covering indexes.
A covering index includes all the columns the query needs, which avoids the lookups that would otherwise follow an index seek. That in turn stops the optimizer from deciding that a scan would be quicker (which it often can be, considering the cost of lookups).
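A sketch, assuming SQL Server's INCLUDE syntax (table and column names are hypothetical):

CREATE INDEX IX_Orders_Customer_Covering
ON Orders (CustomerId)
INCLUDE (OrderDate, Total);

-- Answered entirely from the index, no lookups needed:
SELECT OrderDate, Total
FROM Orders
WHERE CustomerId = 42;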
But also worth mentioning:
Having an index that will allow a merge join. A merge join can occur when the two tables being joined are ordered by the join conditions. But of course, in saying 'table', we really mean 'index', right...
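A sketch of that (names hypothetical): with matching indexes on both sides, the optimizer has ordered inputs on the join key and can pick a merge join.

CREATE INDEX IX_TableA_JoinKey ON TableA (JoinKey);
CREATE INDEX IX_TableB_JoinKey ON TableB (JoinKey);

SELECT a.JoinKey
FROM TableA AS a
JOIN TableB AS b ON b.JoinKey = a.JoinKey;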
Also - removing scalar functions and using table-valued functions instead, since scalar functions cannot be simplified out by the optimizer.
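A sketch of the table-valued alternative, assuming SQL Server syntax (names hypothetical); an inline TVF can be expanded into the calling query, while a scalar UDF gets called once per row:

CREATE FUNCTION dbo.CustomerOrderTotal (@CustomerId int)
RETURNS TABLE
AS RETURN
(
    SELECT SUM(Total) AS OrderTotal
    FROM Orders
    WHERE CustomerId = @CustomerId
);
GO
SELECT c.Name, t.OrderTotal
FROM Customers AS c
CROSS APPLY dbo.CustomerOrderTotal(c.CustomerId) AS t;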
Also - putting a unique index on a column that you know to be unique, allowing the query optimizer to use this knowledge to make better optimization choices. Also applies to NOT NULL constraints.
Also - using Binary collation when comparing strings that are in a known case, so that the system doesn't have to consider the different case options.
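A sketch of a binary-collation comparison (Latin1_General_BIN is a real SQL Server collation; the names are otherwise hypothetical):

SELECT ProductId
FROM Products
WHERE Code COLLATE Latin1_General_BIN = 'ABC123';   -- byte-wise compare, no case folding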
Of course I could go on all day...
Rob
The two most important things in my experience are fewer joins and fewer queries. Beyond those there's a lot of DB-specific stuff: COUNT(*) is relatively slow on PgSQL, subselects are dog slow on MySQL, etc.
Couple of hints: use

delete from my_table where id >= 1 and id <= 3;

instead of

delete from my_table where id = 1;
delete from my_table where id = 2;
delete from my_table where id = 3;

Also, use IN instead of a chain of ORs.
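A minimal sketch of that, reusing my_table from above:

delete from my_table where id in (1, 2, 3);
-- instead of:
delete from my_table where id = 1 or id = 2 or id = 3;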
The biggest optimizations I used recently were quite simple.
Keep as much of the business logic as close as possible to the SQL server, i.e. keep your business code on the same machine as the SQL server, and have your business logic return as little data as possible to the final client.
Keep your SQL queries as short as possible; like Frost said, use single update statements instead of multiple statements.
Only use transactions when you need them.
Create temp tables for partial joins to speed up the joins (don't forget to index them; see the sketch below).
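A sketch of an indexed temp table for a partial join, assuming SQL Server syntax (names hypothetical):

SELECT CustomerId, SUM(Total) AS Total
INTO #CustomerTotals
FROM Orders
GROUP BY CustomerId;

CREATE INDEX IX_CustomerTotals ON #CustomerTotals (CustomerId);

SELECT c.Name, t.Total
FROM Customers AS c
JOIN #CustomerTotals AS t ON t.CustomerId = c.CustomerId;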
I have read all the answers and I didn't find any hints about LIMIT and OFFSET usage. They are very commonly used in pagination with "prev" and "next" links, but rendering such a display can consume more resources than the entire rest of the site: when offsetting a large number of items, the query can become very slow. So avoid these queries:
- Do not count the total items.
- Show only the first n items (for example, only the top 100).
Google, Twitter, and other sites use such methods. In Google search there is no accurate number of results, only an approximate one. Twitter doesn't allow a user to view all past tweets; it shows only the last n (I can't remember how many).
There is a link about this on the MySQL Performance Blog.
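Where pagination is still needed, one common alternative to a large OFFSET is keyset ("seek") pagination, sketched here in MySQL syntax (names hypothetical):

SELECT id, title
FROM posts
WHERE id < 12345            -- the id of the last row on the previous page
ORDER BY id DESC
LIMIT 20;
-- instead of:
SELECT id, title
FROM posts
ORDER BY id DESC
LIMIT 20 OFFSET 100000;     -- reads and throws away 100000 rows first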
Index foreign keys!
Maybe this isn't a SQL query syntax optimisation, but more a storage optimisation. But I see it recur all the time and it's a pet peeve.
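A minimal sketch (names hypothetical); most databases do not index foreign key columns automatically (MySQL's InnoDB is a notable exception):

CREATE INDEX ix_orders_customer_id ON orders (customer_id);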
The best optimisation I've ever had using SQL was to really understand what needed to be done to the data and REMOVE tons of SQL from the query.
The fastest query is the query that does not have to be run.
REALLY THINK about what you're doing to the data. Are you working row-by-row? (Then use set-based code; see the sketch below.)
Do you really need to join to all those tables?
Can two small (simple) queries do the job better and quicker than a single large query?
If you combine these two queries into a single query can it run faster?
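Picking up the row-by-row point above, a sketch of the set-based version (names hypothetical): one statement in place of a cursor loop.

-- Instead of a cursor fetching and updating one Orders row per iteration:
UPDATE Orders
SET Status = 'archived'
WHERE OrderDate < '2008-01-01';   -- one statement handles the whole set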
Finally, PROFILE your queries (EXPLAIN PLAN or SQL Profiler) and look at the "IO gets". Generally you want to reduce the number of gets to a ratio of something like 10 gets per output row.
Avoid the use of built-in functions like convertdate or stringreplace in your views. If you can't make sure that the data is in a valid format, use stored procedures which run regularly to 'clean up' the data in your relevant tables.
This is nasty, but it saves the views time, i.e. keeps the user happy... ^^
Do not add constraints if they're not required: unique and primary key constraints create an index, and the more indexes you have, the longer data insertion takes.