tags:

views:

1028

answers:

5

I am an advocate of ORM-solutions and from time to time I am giving a workshop about Hibernate.

When talking about framework-generated SQL, people usually start talking about how they need to be able to use "hints", and this is supposedly not possible with ORM frameworks.

Usually something like: "We tried Hibernate. It looked promising in the beginning, but when we let it loose on our very very complex production database it broke down because we were not able to apply hints!".

But when asked for a concrete example, the memory of those people is suddenly not so clear any more ...

I usually feel intimidated, because the whole "hints"-topic sounds like voodoo to me... So can anybody enlighten me? What is meant by SQL-hints or DB-Hints?

The only thing I know, that is somehow "hint-like" is SELECT ... FOR UPDATE. But this is supported by the Hibernate-API...

+9  A: 

A SQL statement, especially a complex one, can actually be executed by the DB engine in any number of different ways (which table in the join to read first, which index to use based on many different parameters, etc).

An experienced dba can use hints to encourage the DB engine to choose a particular method when it generates its execution plan. You would only normally need to do this after extensive testing and analysis of the specific queries (because the DB engines are usually pretty darn good at figuring out the optimum execution plan).

Some MSSQL-specific discussion and syntax here:
http://msdn.microsoft.com/en-us/library/ms181714.aspx

Edit: some additional examples at
http://geeks.netindonesia.net/blogs/kasim.wirama/archive/2007/12/31/sql-server-2005-query-hints.aspx

BradC
+3  A: 

All modern RDBMS-es have some sort of query optimizer that calculates best query plan, which is sequence of read/write operations needed to execute SQL query.

Sometimes plans can be suboptimal, so RDBMS designers included "hints" in SQL. Hints are instructions you can embed in your SQL that affect query optimizer, With hints you can instruct query optimizer e.g. which indexes it should use, in what order data should be read from tables, ...

So, with hints you can resolve some bottlenecks that the query optimizer cannot solve by itself.

For example, here is list of Oracle hints.

zendar
what you refer to as the 'query analyser' is usally termed the 'query optimiser' (at least in the SQL server world)...
Mitch Wheat
Yep, you are right, Oracle term is also query optimizer. Edited.
zendar
+3  A: 

While HINTS do as the other answers describe, you should only use them in rare, researched circumstances. 9 times out of 10 a HINT will result in a poor query plan. Unless you really know what you are doing, don't use them.

Mitch Wheat
+5  A: 

Query hints are used to guide the query optimiser when it doesn't produce sensible query plans by default. First, a small background in query optimisers:

Database programming is different from pretty much all other software development because it has a mechanical component. Disk seeks and rotational latency (waiting fora particular sector to arrive under the disk head) are very expensive in comparison to CPU. Different query resolution strategies will result in different amounts of I/O, often radically different amounts. Getting this right or wrong can make a major difference to the performance of the query. For an overview of query optimisation, see This paper.

SQL is declarative - you specify the logic of the query and let the DBMS figure out how to resolve it. A modern cost-based query optimiser (some systems, such as Oracle also have a legacy query optimiser retained for backward compatibility) will run a series of transformations on the query. These maintain semantic equivalence but differ in the order and choice of operations. Based on statistics collected on the tables (sizes, distribution histograms of keys) the optimiser computes an estimate of the amount of work needed for each query plan. It selects the most efficient plan.

Cost-based optimisation is heuristic, and is dependent on accurate statistics. As query complexity goes up the heuristics can produce incorrect plans, which can potentially be wildly inefficient.

Query hints can be used in this situation to force certain strategies in the query plan, such as a type of join. For example, on a query that usually returns very small result sets you may wish to force a nested loops join. You may also wish to force a certain join order of tables.

O/R mappers (or any tool that generates SQL) generates its own query, which will typically not have hinting information. In the case that this query runs inefficiently you have limited options, some of which are:

  • Examine the indexing on the tables. Possibly you can add an index. Some systems (recent versions of Oracle for example) allow you index joins across more than one table.

  • Some database management systems (again, Oracle comes to mind) allow you to manually associate a query plan with a specific query string. Query plans are cached by a hash value of the query. If the queries are paramaterised the base query string is constant and will resolve to the same hash value.

  • As a last resort, you can modify the database schema, but this is only possible if you control the application.

If you control the SQL you can hint queries. In practice it's fairly uncommon to actually need to do this. A more common failure mode on O/R mappers with complex database schemas is they can make it difficult to express complex query predicates or do complex operations over large bodies of data.

I tend to advocate using the O/R mapper for the 98% of work that it's suited for and dropping to stored procedures where they are the appropriate solution. If you really need to hint a query than this might be the appropriate strategy. Unless there is something unusual about your application (for example some sort of DSS) you should only need to escape from the O/R mapper on a minority of situations. You might also find (again, an example would be DSS tools working with the data in aggregate) that an O/R mapper is not really the appropriate strategy for the application.

ConcernedOfTunbridgeWells
A: 

There is no such thing as "optimized SQL code", because SQL code is never executed.

SQL code is translated into an execution plan by the Optimizer. The Optimizer will use the information it has to choose (among other things).

  • the order in which tables are involved
  • the join method for each involved table (nested/merge/hash)
  • how to access a table's data (direct table access/ index with bookmark lookup/direct index access) (scan/seek)
  • should parallelism be used, and when to end parallelism (gather streams)

Query hints allow a programmer to over-ride (in most cases) or suggest politely (in other cases) the optimizer's choices.

Query hints can let you force off parallelism, force all joins to be implemented as nested loop, force one index to be used over another... as a few examples.

Since the optimizer is really good, if one over-rides the optimizer, one is generally asking for a non-optimal plan. Query hints are best served when the optimizer does not have the required information to make a good choice.

One place I use query hints is for table variables. Table variables are assumed to have 0 rows by the Optimizer, and so the Optimizer always joins table variables using nested loop (the best join implementation for small numbers of rows). If I have a large table variable - already ordered in a favorable way for merge join, I can specify a merge join be used by applying a query hint.

David B