views:

255

answers:

2

I have to use Hibernate and retrieve data from Oracle but the problem is, that the number of parameters passed to the query is not always the same.

For the sake of simplicity let's consider the following query:

select COL_1, COL_2, ..., COL_N from TAB_1 where COL_1 in (?, ?, ... ?)

The number of parameters passed to in clause is between 1 and 500. If the number is about 1-50 it works quite fast, but for 200 it takes a few seconds to execute the query (parsing, creating explain plan, executing the query). Indexes are created and used - it was checked.

The query is created dynamicly so I use Hibernate Criteria API. For the first query (with > 100 parameters) it takes 3-5 seconds, but for the next one it works faster (even if the number of parameters varies). I would like to improve the response time for the first query. What can I do in that case assuming that Hibernate is a must?

I though about removig this dynamic query, creating a few static queries as named queries in xml file (in that case those queries will be precompiled at the beginning) For example

1) one query if the number of parameters is less then 50.

In this case if we have 30 parameters than the query would look like:

select COL_1, COL_2, ..., COL_N from TAB_1 where COL_1 in (PAR_1, PAR_2, ..., PAR_30, -1, -1 , ..., -1 ?)

2) the second one if the number is between 50 and 100 etc.

The problem is that it's not so simple using named queries and HQL (in JDBC it would be straighforward). In HQL we passed only a list and we don't specify a number of parameters in that list i.e. In fact there is only one query

'from Person where id in (:person_list)'

myQuery.setParameterList("person_list", myList)

Is there any option to solve that?

By the way, I thought that the explain plan is executed for each new query so for example:

(a) select COL_1, COL_2, ..., COL_N from TAB_1 where COL_1 in (?, ?, ..., ?) <100> - explain plan must be created

(b) select COL_1, COL_2, ..., COL_N from TAB_1 where COL_1 in (?, ?, ..., ?) <100> - explain plan won't be created because it already exists in cache

(c) select COL_1, COL_2, ..., COL_N from TAB_1 where COL_1 in (?, ?, ..., ?) <120> - explain plan should be created (there is no explain plan for a query with 120 parameters) but it takes less time in comparison with (a), almost the same as (b) so probably Oracle can create this plan faster if a similar query was executed before

What is the reason for that?

+1  A: 

There are a couple of things here. First of all, you cannot bind an IN list, at least I am pretty sure you cannot. I suspect Hibernate is using some sort of trick you put your array contents into a static inlist Oracle can use.

Secondly if this query is executed with lots of different parameters, you must you bind variables or the performance of the entire database will suffer.

That said, there is a way to bind an IN list using a 'trick' which Tom Kyte describes on his blog -

http://tkyte.blogspot.com/2006/01/how-can-i.html

The code in there looks like:

ops$tkyte@ORA10GR2> with bound_inlist
2  as
3  (
4  select
5    substr(txt,
6           instr (txt, ',', 1, level  ) + 1,
7           instr (txt, ',', 1, level+1) - instr (txt, ',', 1, level) -1 )
8           as token
9    from (select ','||:txt||',' txt from dual)
10  connect by level <= length(:txt)-length(replace(:txt,',',''))+1
11  )
12  select *
13    from all_users
14   where user_id in (select * from bound_inlist);

USERNAME                          USER_ID CREATED
------------------------------ ---------- ---------
SYSTEM                                  5 30-JUN-05
OPS$TKYTE                             104 20-JAN-06

The part:

12  select *
13    from all_users
14   where user_id in (select * from bound_inlist);

Is basically where your query goes. The bit above is the trick which splits the comma separated string into a list of values. Instead of binding a list into the :txt placeholder, you would need to convert the list to a string and just bind that.

Are you sure the difference in query times isn't due to caching or load variations on the machine? Parsing the query will take a little time, but several seconds is a long time.

Stephen ODonnell
I'm pretty sure that oracle didn't cache the result set. I ran the same query but with different parameters and the list of passed parameters contained different number and it was quite fast. I repeted it a few times.Another way to create a static query would be to use the query:select COL_1, COL_2, ..., COL_N from TAB_1 where COL_1 = ? or COL_1 = ? or ... or COL_1 = ? <- here i can control the number of parameters but it doesn't look good.
Kumar
A: 

I've worked with IN(...) queries that had up to a 1000 of ids in that list; I can guarantee you that it does not take several seconds to parse / prepare / cache a statement.

Hibernate does indeed auto-expand the parameter list you specify using the actual number of elements in the list you pass, so if you really wanted to keep it "fixed" at a certain level all you need to do is to append enough -1s to the end. However, this is most certainly not the problem especially since we're talking about speeding up the first query run - no statements have been prepared / cached yet anyway.

Did you look at the execution plans for your queries? Both via explain plan and autotrace enabled? Do they differ when you have 30 elements and 120 elements in your list? Does your actual query really look like "select ... from table where id in (...)" you've posted or is it more complex? I'm willing to bet that somewhere between 30 and 120 elements Oracle decides (perhaps mistakenly) that it'll be faster not to use an index, which is why you're seeing the time increase.

ChssPly76
Actually my query has one additional join but in general it's not complex. I've checked explain plan in dev environment for:1) in clause containing 2 or 3 parameters and in that case index was used as expected2) in clause containing 10 or more parameters and there was a full scan instead of index. That was surprising but it turned out that the table contained a few hundred of records and full scan was faster comparing to index (we used a hint to compare the cost - and index was slower than full scan which is good).
Kumar
I'm guessing that based on your experience the dynamic query with one join, appropriate indexes and less than 200 ids should last much shorter (assuming that even in future this table may contain about 50 mln of records)? What would be the difference between static and dynamic query? 20%?
Kumar
Is the other table joined on index? Are you returning **lots** of data? Table with couple of hundred records is never the bottleneck. As far as static vs dynamic queries go, the difference depends on your hardware, query, oracle configuration and many other things. It's usually a **lot** less than 20%, though.
ChssPly76