views:

410

answers:

3

Let's say I have a function call on a select or where clause in Oracle like this:

select a, b, c, dbms_crypto.hash(utl_raw.cast_to_raw('HELLO'),3)
  from my_table

A similar example can be constructed for MS SQLServer.

What's the expected behavior in each case?

Is the HASH function going to be called once for each row in the table, or DBMS will be smart enough to call the function just once, since it's a function with constant parameters and no side-effects?

Thanks a lot.

+4  A: 

For SQL server, it will be evaluated for every single row.

You will be MUCH better off by running the function once and assigning to a variable and using the variable in the query.

Joe
You beat me, I had the same points made but you typed faster.
HLGEM
+7  A: 

The answer for Oracle is it depends. The function will be called for every row selected UNLESS the Function is marked 'Deterministic' in which case it will only be called once.

CREATE OR REPLACE PACKAGE TestCallCount AS
    FUNCTION StringLen(SrcStr VARCHAR) RETURN INTEGER;
    FUNCTION StringLen2(SrcStr VARCHAR) RETURN INTEGER DETERMINISTIC;
    FUNCTION GetCallCount RETURN INTEGER;
    FUNCTION GetCallCount2 RETURN INTEGER;
END TestCallCount;

CREATE OR REPLACE PACKAGE BODY TestCallCount AS
    TotalFunctionCalls INTEGER := 0;
    TotalFunctionCalls2 INTEGER := 0;

    FUNCTION StringLen(SrcStr VARCHAR) RETURN INTEGER AS
    BEGIN
        TotalFunctionCalls := TotalFunctionCalls + 1;
        RETURN Length(SrcStr);
    END;
    FUNCTION GetCallCount RETURN INTEGER AS
    BEGIN
        RETURN TotalFunctionCalls;
    END;

    FUNCTION StringLen2(SrcStr VARCHAR) RETURN INTEGER DETERMINISTIC AS
    BEGIN
        TotalFunctionCalls2 := TotalFunctionCalls2 + 1;
        RETURN Length(SrcStr);
    END;
    FUNCTION GetCallCount2 RETURN INTEGER AS
    BEGIN
        RETURN TotalFunctionCalls2;
    END;

END TestCallCount;




SELECT a,TestCallCount.StringLen('foo') FROM(
    SELECT 0 as a FROM dual
    UNION
    SELECT 1 as a FROM dual
    UNION
    SELECT 2 as a FROM dual
);

SELECT TestCallCount.GetCallCount() AS TotalFunctionCalls FROM dual;

Output:

A                      TESTCALLCOUNT.STRINGLEN('FOO') 
---------------------- ------------------------------ 
0                      3                              
1                      3                              
2                      3                              

3 rows selected


TOTALFUNCTIONCALLS     
---------------------- 
3                      

1 rows selected

So the StringLen() function was called three times in the first case. Now when executing with StringLen2() which is denoted deterministic:

SELECT a,TestCallCount.StringLen2('foo') from(
    select 0 as a from dual
    union
    select 1 as a from dual
    union
    select 2 as a from dual
);

SELECT TestCallCount.GetCallCount2() AS TotalFunctionCalls FROM dual;

Results:

A                      TESTCALLCOUNT.STRINGLEN2('FOO') 
---------------------- ------------------------------- 
0                      3                               
1                      3                               
2                      3                               

3 rows selected

TOTALFUNCTIONCALLS     
---------------------- 
1                      

1 rows selected

So the StringLen2() function was only called once since it was marked deterministic.

For a function not marked deterministic, you can get around this by modifying your query as such:

select a, b, c, hashed
  from my_table
cross join (
  select dbms_crypto.hash(utl_raw.cast_to_raw('HELLO'),3) as hashed from dual
);
Mark Roddy
Thanks for the example, Mark. Your example has a SIDE-EFFECT though (TotalFunctionCalls := TotalFunctionCalls + 1;) when calling StringLen function. So I don't think Oracle would be able to optimize in this case.
Pablo Santa Cruz
Awesome Mark. Great example. What Oracle version did you use to run it?
Pablo Santa Cruz
@Pablo As far as I know you can mark any function as deterministic and it's your responsibility to make sure there aren't any side effects. If dbms_crypto.hash() isn't deterministic you could always declare your own function marked deterministic and call that instead. I ran this on Oracle XE (which is based on 10g).
Mark Roddy
@Mark Thanks a lot Mark. Great example. Wasn't aware of the DETERMINISTIC modifier for Oracle functions.
Pablo Santa Cruz
Oh my god. Where were you three months ago? So many hand-rolled caching things going on...Actually, why didn't our DBAs catch this?
aehiilrs
@aehiilrs: you may want to look into function-based indexes too http://download-east.oracle.com/docs/cd/B13789_01/appdev.101/b10795/adfns_in.htm#1006368
Mark Roddy
+1  A: 

short answer....it depends.

If the function is accessing data ORACLE does not know if it is going to be the same for each row, therefore, it needs to query for each. If, for example, your function is just a formatter that always returns the same value then you can turn on caching (marking it as Deterministic) which may allow for you to only do the function call once.

Something you may want to look into is ORACLE WITH subquery:

The WITH query_name clause lets you assign a name to a subquery block. You can then reference the subquery block multiple places in the query by specifying the query name. Oracle optimizes the query by treating the query name as either an inline view or as a temporary table

I got the quoted text from here, which has plenty of examples.

northpole