views: 8784

answers: 7

Is there any way to get around the Oracle 10g limitation of 1000 items in a static IN clause? I have a comma-delimited list of many IDs that I want to use in an IN clause. Sometimes this list can exceed 1000 items, at which point Oracle throws an error. The query is similar to this...

select * from table1 where ID in (1,2,3,4,...,1001,1002,...)
+22  A: 

Put the values in a temporary table and then do a select where id in (select id from temptable)
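
A minimal sketch of how that could look, assuming a global temporary table (the temptable name and column are illustrative, not from the original post):

-- one-time DDL: a session-scoped global temporary table to hold the IDs
create global temporary table temptable
( id number )
on commit preserve rows;

-- per query: load the IDs (normally batched from the client code),
-- then use a subquery in place of the long literal list
insert into temptable (id) values (1);
insert into temptable (id) values (2);
-- ... one insert per ID

select *
from   table1
where  id in (select id from temptable);

delete from temptable;  -- clean up for the next run in this session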

Otávio Décio
Personally I'd put the values into the temp table and use a JOIN to query the values. I don't know whether that actually performs better, though.
Neil Barnwell
@Neil Barnwell - I think any decent SQL engine would optimize so that the IN and a JOIN would have pretty much the same performance. Using IN at least for me is clearer on its intent.
Otávio Décio
@ocdecio - my tests with Oracle 10g show different (and clearly worse) explain plans for the IN, compared to the JOIN. Personally I'd use the JOIN, and would recommend others to *test* different approaches to see differences in performance, rather than guess.
jimmyorr
@jimmyorr - thank you for taking the time to check the performance, although the OP didn't seem too preoccupied with that.
Otávio Décio
The IN vs. JOIN difference is generally due to the possibility of NULLs in the IN list.
WW
+7  A: 

You can try the following form:

select * from table1 where ID in (1,2,3,4,...,1000)
union all
select * from table1 where ID in (1001,1002,...)
rics
Make that a UNION ALL (it avoids the duplicate-eliminating sort that a plain UNION does).
David Aldridge
+3  A: 

Use ... from table(...):

create or replace type numbertype
as object
(nr number(20,10) )
/ 

create or replace type number_table
as table of numbertype
/ 

create or replace procedure tableselect
( p_numbers in number_table
, p_ref_result out sys_refcursor)
is
begin
  open p_ref_result for
    select *
    from employees , (select /*+ cardinality(tab 10) */ tab.nr from table(p_numbers) tab) tbnrs 
    where id = tbnrs.nr; 
end; 
/

This is one of the rare cases where you need a hint, otherwise Oracle will not use the index on column id. One of the advantages of this approach is that Oracle doesn't need to hard parse the query again and again. Using a temporary table is slower most of the time.

Edit 1: simplified the procedure (thanks to jimmyorr) and added an example.

create or replace procedure tableselect
( p_numbers in number_table
, p_ref_result out sys_refcursor)
is
begin
  open p_ref_result for
    select /*+ cardinality(tab 10) */ emp.*
    from  employees emp
    ,     table(p_numbers) tab
    where tab.nr = id;
end;
/

Example:

set serveroutput on 

create table employees ( id number(10),name varchar2(100));
insert into employees values (3,'Raymond');
insert into employees values (4,'Hans');
commit;

declare
  l_number number_table := number_table();
  l_sys_refcursor sys_refcursor;
  l_employee employees%rowtype;
begin
  l_number.extend;
  l_number(1) := numbertype(3);
  l_number.extend;
  l_number(2) := numbertype(4);
  tableselect(l_number, l_sys_refcursor);
  loop
    fetch l_sys_refcursor into l_employee;
    exit when l_sys_refcursor%notfound;
    dbms_output.put_line(l_employee.name);
  end loop;
  close l_sys_refcursor;
end;
/

This will output:

Raymond
Hans
tuinstoel
+3  A: 

Where do you get the list of ids from in the first place? Since they are IDs in your database, did they come from some previous query?

When I have seen this in the past, it has been because:

  1. a reference table is missing, and the correct fix would be to add the new table, put an attribute on that table, and join to it
  2. a list of IDs is extracted from the database and then used in a subsequent SQL statement (perhaps later, or on another server, or whatever). In this case, the answer is never to extract the list from the database: either store it in a temporary table or just write one query (see the sketch below).
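
For case 2, "just write one query" means keeping the ID-producing query inline as a subquery instead of pulling the IDs out and feeding them back in. A minimal sketch of that idea - other_table and its filter are purely illustrative, not from the original post:

select t.*
from   table1 t
where  t.id in (select o.id
                from   other_table o
                where  o.some_flag = 'Y');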

I think there may be better ways to rework this code than just getting this SQL statement to work. If you provide more details, you might get some ideas.

WW
Excellent questions! I often use the array technique I already posted, but only when the user has hand-picked multiple rows in a user interface data grid. However, it is unlikely that a user picks more than 1000 rows by hand.
tuinstoel
+3  A: 

I am almost sure you can split values across multiple INs using OR:

select * from table1 where ID in (1,2,3,4,...,1000) or 
ID in (1001,1002,...,2000)
Peter Severin
The maximum number of values in an IN clause is yet another of those limitations that you are never supposed to run into.
erikkallen
One can do that, but it means Oracle sees a different query every time, which means a lot of hard parsing, and that will slow things down.
tuinstoel
+1  A: 

I wound up here looking for a solution as well.

Depending on the high-end number of items you need to query against, and assuming your items are unique, you could split your query into batches of 1000 items, run one query per batch, and combine the results on your end instead (pseudocode here):

      // remove dupes
      items = items.RemoveDuplicates();

      // break the items into batches of at most 1000 items
      batches = new batch list;
      batch = new batch;
      for (int i = 0; i < items.Count; i++)
      {
          if (batch.Count == 1000)
          {
              batches.Add(batch);
              batch = new batch;  // start a fresh batch; clearing the old one would also empty the batch just added
          }
          batch.Add(items[i]);
          if (i == items.Count - 1)
          {
              // add the final batch (it has <= 1000 items)
              batches.Add(batch);
          }
      }

      // now go query the db for each batch
      results = new results;
      foreach (batch in batches)
      {
          results.Add(query(batch));
      }

This can be a good trade-off when you don't typically have over 1000 items, so that exceeding 1000 is your "high end" edge case. For example, with 1500 items, two queries of (1000, 500) wouldn't be so bad. This also assumes that each query isn't particularly expensive in its own right.

This wouldn't be appropriate if your typical number of expected items were much larger - say, in the 100,000 range - requiring 100 queries. In that case, you should probably look more seriously at the global temporary table solution provided above as the most "correct" solution. Furthermore, if your items are not unique, you would need to resolve duplicate results across your batches as well.

Mike Atlas
Heh, my solution is more correct :) There is no need for a temporary table.
tuinstoel
Correct thinking; the code is a bit too boilerplate, though. We use Lists.partition() from google-collections to make this almost a one-liner.
Andreas Petersson
I don't see any advantage. If you don't want to use an Oracle collection or a temp table, then use Peter Severin's solution. It results in fewer database calls and less parsing than this one, and it is also easier on the client because you have only one batch.
tuinstoel
I wish I had a .Partition() method/function =)
Mike Atlas
A: 

Surely this answer:

Put the values in a temporary table and then do a select where id in (select id from temptable)

is doing the same thing? It will still select a possible 1000+ values from the temp table that sits inside the IN clause.

Matthew
Yes, the difference is that the 1000-item limit applies only to static lists of values, not to subqueries, so it won't fail with an Oracle error.
Tony Andrews