views: 783 | answers: 17

I wonder what are the most common SQL optimizations that you use.

A: 
  1. Indexes: the most common optimization.
  2. De-normalizing the tables.
  3. Removing constraints (only if you know what you are doing).
Umair Ahmed
Removing constraints is an interesting one. It can help performance of changes, but can hurt performance of queries.
Rob Farley
Constraints are used by the SQL engine to get the most optimized query plan, and removing them may cause performance loss, as Rob pointed out.
Hakan Winther
I think removing constraints helps when a lot of insertion is going on.
Umair Ahmed
De-normalizing tables does not always lead to optimization. For example, if you are saving result sets (we can call it cached data) in some table, that data will not be normalized, but you can achieve better performance for your system.
Pawka
+1  A: 

Lowering transaction isolation levels to get around table locks for user queries. Not all the time, but for GUIs showing general information it works great.
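A minimal sketch of the idea in T-SQL, assuming a read-only display query where slightly stale or uncommitted data is acceptable (table and column names are illustrative):

```sql
-- Read without taking shared locks; the result may include uncommitted rows,
-- so use this only for informational displays, never for business decisions.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT OrderID, Status
FROM dbo.Orders
WHERE Status = 'Pending';
```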

Spence
+4  A: 

Caching db output. Avoiding pressuring the database at all seems to be a prudent optimization.

+1 memcached.

nilamo
+24  A: 

Reducing the amount of data that is returned, by only returning the fields required and only returning the rows required. This is the most common, as you do it for every query that returns data.

Adding indexes. This is not done as frequently, as some tables don't need any index other than the one created for the primary key.

Guffa
+1 for returning only the columns you need
carpenteri
+1  A: 

If you're talking common as in really common, then indexes are the first thing that pops into my head.

They are a powerful technique that is often misunderstood and quite often abused.

Then I would place de-normalization, which can add quite a bit of performance for many databases.

Query optimization is third and it helps a lot too. I use MySQL these days and Query logging helps a lot for optimization.

Memcached is definitely not common, though caching of some sort is a part of many websites at the scripting end (ASP.Net or PHP).

Cyril Gupta
+16  A: 

My favorite list of tips (explained in detail here) is as follows

  1. Try to restrict the query's result set by using the WHERE clause.
  2. Try to restrict the query's result set by returning only the particular columns you need from the table, not all of the table's columns.
  3. Use views and stored procedures instead of heavy-duty queries.
  4. Whenever possible, try to avoid using SQL Server cursors.
  5. If you need to return the table's total row count, you can use an alternative to the SELECT COUNT(*) statement.
  6. Try to use constraints instead of triggers, whenever possible.
  7. Use table variables instead of temporary tables.
  8. Try to avoid the HAVING clause, whenever possible.
  9. Whenever possible, try to avoid using the DISTINCT clause.
  10. Include the SET NOCOUNT ON statement in your stored procedures to stop the message indicating the number of rows affected by a T-SQL statement.
  11. Use SELECT statements with the TOP keyword or the SET ROWCOUNT statement if you need to return only the first n rows.
  12. Use the FAST number_rows table hint if you need to quickly return 'number_rows' rows.
  13. Try to use the UNION ALL statement instead of UNION, whenever possible.
  14. Do not use optimizer hints in your queries.
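One sketch of tip 5 on SQL Server: read the row count from metadata instead of scanning the table. The table name is illustrative, and the count is only as current as the statistics, so treat it as approximate:

```sql
-- Approximate row count from partition metadata, avoiding a full scan.
SELECT SUM(p.rows) AS approx_row_count
FROM sys.partitions AS p
WHERE p.[object_id] = OBJECT_ID('dbo.Orders')
  AND p.index_id IN (0, 1);  -- 0 = heap, 1 = clustered index
```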
RRUZ
Not sure what you mean by views instead of heavy-duty queries. As a view is just a stored sub-query, it shouldn't make any difference except for ease of reading. Also, the HAVING clause is fine if your solution needs it.
Rob Farley
Rob, using a view instead of heavy-duty queries can reduce network traffic, as your client sends only the stored procedure or view name to the server instead of the large heavy-duty query text. It can also be used to facilitate permission management, because you can restrict user access to table columns they should not see.
RRUZ
Ah - so it's more along the lines of "Don't use ad-hoc queries when you could query a view or use a stored proc." Sure, that I agree with...
Rob Farley
Rob, an example might be a 16 KB subquery, which would put 16 KB on the network, while calling a view would occupy only a few bytes of bandwidth.
RRUZ
Yeah, for sure. Using stored procedures from applications is definitely worthwhile.
Rob Farley
RRUZ - Why? You've given a list of unqualified statements with no explanation. If people are to learn "the right way to use SQL" then you must educate them. (I agree with most of your statements, but as a DBA I understand them. Novice developers probably don't.)
Guy
Hi Guy, do you see the part "tips (explained in detail here)"? ;)
RRUZ
+1  A: 

Making sure tables are being joined in the correct order.

Allethrin
Doesn't the database engine do this automatically?! I always thought they reordered queries before execution...
Jim
Yeah, sort of. In the case of Sybase ASE, the optimiser will attempt to pick the optimal join order based on the query's costing. Usually (if statistics are up to date) it will join in the correct order, but I've seen many instances where the join order wasn't coming out correctly. Also, the table order in the query can make a difference if you have a large number of tables, as the DBMS generally considers a limited number of table permutations at a time. I always try to list my tables in the order I want them joined (also handy if you ever need to force the join order).
Allethrin
+1  A: 

1) I've yet to find a situation where

SELECT Field1, Field2, (SELECT Count(*) FROM tblLinked WHERE tblLinked.Field3 = tblSource.Field3) AS TheTotal
FROM tblSource

isn't improved by a LEFT JOIN to a derived table.

SELECT Field1, Field2, IsNull(Linked.TheTotal,0) AS TheTotal
FROM tblSource
LEFT JOIN (SELECT Field3, Count(*) AS TheTotal
    FROM tblLinked
    GROUP BY Field3) AS Linked ON tblSource.Field3 = Linked.Field3

2) Don't sort the results on the server unless the consuming app is unable to do this itself. This applies less often to web apps, but for desktop apps the client PC usually has plenty of power available and can happily do a sort.

3) Use EXISTS instead of checking the Count of matching entries.

4) Don't get obsessed with doing a query in just one SELECT clause. Judicious use of table variables (and sometimes temporary tables) can massively reduce the rows processed.
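A sketch of point 3 in T-SQL, with illustrative names: EXISTS can stop at the first match, while COUNT(*) tallies every matching row before the comparison.

```sql
DECLARE @id INT = 42;

-- Instead of:
--   IF (SELECT COUNT(*) FROM dbo.Orders WHERE CustomerID = @id) > 0 ...
IF EXISTS (SELECT 1 FROM dbo.Orders WHERE CustomerID = @id)
    PRINT 'Customer has orders';
```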

CodeByMoonlight
Re 1)... it should be treated the same way. What DBMS are you using?
Rob Farley
SQL Server. Subselects run much slower than queries using derived tables in my experience.
CodeByMoonlight
+7  A: 

By far and above: Making covering indexes

A covering index includes all the columns that the query will need, thereby avoiding the need to do lookups on the results of an index seek. This also stops the system from deciding that a scan could be quicker (which, considering the cost of lookups, it remarkably often is).

But also worth mentioning:

Having an index that will allow a merge join. A MERGE join is able to occur when joining two tables that are ordered by the join conditions. But of course, in saying 'table', we really mean 'index', right...

Also - removing scalar functions and using table-valued functions instead... as scalar functions are not able to be simplified out.

Also - putting a unique index on a column that you know to be unique, allowing the query optimizer to use this knowledge to make better optimization choices. Also applies to NOT NULL constraints.

Also - using Binary collation when comparing strings that are in a known case, so that the system doesn't have to consider the different case options.

Of course I could go on all day...

Rob

Rob Farley
Worth pointing out that the scalar function thing may be local to SQL Server. But the others are across most database systems.
Rob Farley
Rob I saw your blog, you have excellent articles. ;)
RRUZ
:) Thanks RRUZ. Query-covering indexes are definitely worthwhile, as they can make queries run thousands of times faster.
Rob Farley
+1  A: 

The two most important things in my experience are fewer joins and fewer queries. Other than those there's lots of DB-specific stuff: COUNT(*) is relatively slow on PgSQL, subselects are dog slow on MySQL, etc.

Alex Gaynor
A: 

Couple of hints: Use

delete from table where id>=1 and id<=3;

instead of

delete from table where id=1;
delete from table where id=2;
delete from table where id=3;

Also use 'IN' instead of 'OR' syntax.
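A quick sketch of the IN-vs-OR point, with an illustrative table name: the same rows come back, but the predicate reads as a single set membership test.

```sql
-- Instead of:
--   SELECT * FROM t WHERE id = 1 OR id = 2 OR id = 3;
SELECT * FROM t WHERE id IN (1, 2, 3);
```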

Frost
how about: delete from table where id between 1 and 3 - seems even more concise
marc_s
Yes, maybe, but the idea is: do not run queries in a loop.
Frost
+1  A: 

The biggest optimizations I used recently were quite simple.

Keep as much of the business logic as close as possible to the SQL server, i.e. keep your business code on the same machine as the SQL server. Let your business logic return as little data as possible to the final client.

Keep your SQL queries as 'short as possible'; like Frost said, use single update statements over multiple statements.

Only use transactions when you need them.

Create temp tables for partial joins to speed up the joins (don't forget to index them).

Barfieldmv
+1  A: 

I have read all the answers and I didn't find any hints about LIMIT and OFFSET usage. They are very commonly used in pagination with "prev" and "next" links. But rendering such a display can consume more resources than the entire rest of the site. When offsetting a large number of items, the query can become very slow. So avoid these queries.

  • Do not count the total items.
  • Show only the first n items (for example, only the top 100).

Google, Twitter, and other sites use such methods. In Google search there is no accurate number of results, only an approximate number. Twitter doesn't allow the user to view all past tweets; it shows only the last n (I can't remember how many).

There is a link about this on the MySQL performance blog.
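A sketch of the problem in MySQL syntax, with an illustrative table name: both queries return 10 rows, but the second forces the server to read and discard 100,000 rows first.

```sql
SELECT id, title FROM posts ORDER BY id LIMIT 10 OFFSET 0;       -- cheap
SELECT id, title FROM posts ORDER BY id LIMIT 10 OFFSET 100000;  -- expensive
```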

Pawka
+2  A: 

Index foreign keys!

Maybe this isn't a SQL query syntax optimisation, but more a storage optimisation. But I see it recur all the time, and it's a pet peeve.
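A sketch of the fix, with illustrative names. Note that in SQL Server a foreign key constraint does not create an index by itself, so joins and cascading deletes on the column can end up scanning:

```sql
-- Index the foreign key column so joins from OrderLines to Orders can seek.
CREATE INDEX IX_OrderLines_OrderID ON dbo.OrderLines (OrderID);
```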

Nick Kavadias
+1  A: 

The best optimisation I've ever had using SQL was to really understand what needed to be done to the data, and to REMOVE tons of SQL from the query.

The fastest query is the query that does not have to be run.

REALLY THINK about what you're doing to the data. Are you working row-by-row? (Then use set-based code.)

  • Do you really need to join to all those tables?

  • Can two small (simple) queries do the job better and quicker than a single large query?

  • If you combine these two queries into a single query can it run faster?

Finally, PROFILE your queries (EXPLAIN PLAN or SQL Profiler) and look at the "IO gets". Generally you want to reduce the number of gets to a ratio of something like 10 gets per output row.

Guy
A: 

Avoid the use of built-in functions like date conversion and string replacement in your views. If you can't make sure that the data is in a valid format, use stored procedures that run regularly to 'clean up' the data in the relevant tables.

This is nasty, but it saves the views time, i.e. keeps the user happy... ^^

KB22
A: 

Do not add constraints unless they are required; constraints such as unique and primary key constraints add an index, and the more indexes there are, the longer data insertion takes.

Amareswar