tags:

views:

178

answers:

5

I want to write a funcion or procedure that can be used in the IN clause of another procedure. The function or procedure would return ID numbers.

The main procedure would say something like

SELECT * FROM EMPLOYEES WHERE OFFICE_ID IN (GET_OFFICE_IDS); -- GET_OFFICE_IDS requires no parameters

GET_OFFICE_IDS returns a VARCHAR2 with the ID separated by commas. When I run the main procedure, I get a "ORA-01722: invalid number" error which makes sense but I don't know where I need to go from here.

Do I need GET_OFFICE_IDS to create a temp table that the main procedure uses? If so, will there be a performance penalty?

+2  A: 

I'm not up on oracle SQL, but are you not able to simply put another select statement in the IN Clause to return the IDs?

SELECT * FROM EMPLOYEES WHERE OFFICE_ID IN (SELECT ID FROM tbl_X WHERE x=y);

...or were you hoping to do something a bit more complicated?

Tanner
It's more complicated because the actual SELECT statement has about five levels of subqueries and would be used all over my program, which is why I want to place it either in a function or procedure.
can you save a query/view to retrieve the records you want and do something similar - IN(select id from qryRequiredOfficeIDs)? That way you can reuse the query from other places in the program. HTH
Tanner
+3  A: 

The simple brute force approach:

WHERE ','||GET_OFFICE_IDS||',' LIKE '%,'||OFFICE_ID||',%'

It would better to change GET_OFFICE_IDS to return a nested table and use something like:

OFFICE_ID IN (SELECT * FROM TABLE(GET_OFFICE_IDS))
Dave Costa
Perhaps nested table is the way to go. I'll have to read up on them. I don't use them much.
With that SQL you wouldn't be able to do "select *" in the nested query as it would return multiple columns.
Tanner
Does a nested "table" imply disc access (i.e. performance penalty) or is it just a term Oracle came up with for that collection type?
"nested table" is just a name for a type of collection. In this context it would just be in memory. You can also store them in tables, in which case they would be stored on disk.
Dave Costa
+3  A: 

Here is a working example of the nested table solution, using the EMP table:

create type t_ids is table of integer
/

create or replace function get_office_ids return t_ids
is
   l_ids t_ids := t_ids();
   l_idx integer := 0;
begin
   for r in (select empno from emp where deptno=10)
   loop
      l_ids.extend;
      l_idx := l_idx+1;
      l_ids(l_idx) := r.empno;
   end loop;
   return l_ids;
end;
/

select ename from emp where empno in (select * from table(get_office_ids));


ENAME
----------
CLARK
KING
TEST
MILLER
BINNSY
FARMER
Tony Andrews
Do I have to worry about data crossing over when I have concurrent accesses from different users or is that handles automatically?
No, the nested table data is private to each user's session.
Tony Andrews
+1  A: 

You can probably do this with a ref_cursor (ref cursor c := 'select '||.... )

But a pipelined function works very well. use it like this:

create or replace type type_varchar2 as table of varchar2(100);

create or replace function GET_OFFICE_IDS return TYPE_varchar2 PIPELINED
is
  retval VARCHAR2(100);
begin
  -- put some sql here which results in statements as below
 retval := '135';
 PIPE ROW (retval);
 retval := '110';
 PIPE ROW (retval);
end GET_OFFICE_IDS;


select *
from entries
where id in (SELECT COLUMN_VALUE FROM TABLE(GET_OFFICE_IDS));

Normally a pipelined function performs very well. However a subquery with A LOT of entries performs not always very good.

Edwin
+1  A: 

EDIT: I broke the cardinal rule of SO, I didn't answer the OP. Since there already is an accepted answer, I felt it prudent to warn.

As a rule, it's a very bad idea to mix SQL and PL/SQL. There are 2 separate engines for code. There's a SQL engine and a PL/SQL engine. Forcing thousands of switches back and forth will absolutely kill performance.

I understand why programmers want to do this. I get it. It's all encapsulately and warmy and fuzzy but it will sap you badly. Like nature it will seduce you with its sights and its sounds and then it will break your ankle.

Even something as stupid as this.

create or replace function my_Date (p_Date in date)
return varchar
as
begin

    return to_char(p_Date, 'yyyy/mm/dd');

end;

Will Kill your execution time.

Turn turn on autotrace

then run these.

select to_char(created, 'yyyy/mm/dd'), to_char(last_ddl_time, 'yyyy/mm/dd')  from all_objects


select my_date(created), my_Date(last_DDL_TIME) From all_objects

The second one takes twice the time to run. I get the answer to query 1 in 1 second and 2 in 2 seconds.

And this is AN EXTREMELY simple case... all I'm doing is casting values. Imagine if you have to join to it like you want to. That's really the worst case.

Now think of what the optimizer is completely unable to do when you hide things in a function.

When you do an IN, sometimes that far faster to do as a join. If certain conditions are true, the optimizer will do that for you. It will convert the IN to a JOIN. But because you've disguised the select inside a function, it can no longer determine if the conditions are ripe. You've FORCED the optimizer to do something sub-optimally.

One key statistic the optimizer relies on is rowcount. Is it one row or 1 billion. It knows from stats on the tables and the indexes. There are no stats on your function.

You can put them there, it possible to hint the cardinality, I'm not saying you can't, but why? Why would you want to? Seemingly you're using the function because you're a diligent programmer who has been told his whole life to factor redundant code into functions.

Those rules in your head, almost none apply to SQL. The optimizer is not a compiler. It can't Inline your function. Only you can help your optimizer get the best plan.