I'm trying to load data from Oracle to SQL Server (sorry for not mentioning this earlier).

I have a table (actually a view that combines data from several tables) with at least 1 million records. I designed my package so that the business logic lives in functions, which I call directly in the SELECT query.

Ex:

X1(id varchar2)
x2(id varchar2, d1 date)
x3(id varchar2, d2 date)

Select id, x, y, z, decode (.....), x1(id), x2(id), x3(id) 
FROM Table1

Note: My table has 20 columns and I call 5 different functions on at least 6-7 columns. Some of the functions compare the parameters passed in against an audit table and perform logic based on that.

How can I improve the performance of my query, or is there a better way to do this?

I tried doing it in C# code, but the initial SELECT returns too many records for a DataSet and I get an OutOfMemoryException.

My functions do SELECTs and then perform logic. For example:

FUNCTION x2(c_x2 VARCHAR2, p_eid VARCHAR2) RETURN VARCHAR2
IS
  p_x1    tableP.col1%TYPE;
  ret_var VARCHAR2(10);
BEGIN
  SELECT col1
    INTO p_x1
    FROM tableP
   WHERE eid = p_eid;  -- note: if the parameter is also named "eid",
                       -- "WHERE eid = eid" compares the column to itself

  IF p_x1 IS NULL THEN                    -- "p_x1 = NULL" is never true
    ret_var := 'INITIAL';
  ELSIF (p_x1 = 'L') AND (c_x2 = 'A') THEN
    ret_var := 'RL';

    INSERT INTO Audit
      (old_val, new_val, audit_event, id, pname)
    VALUES
      (p_x1, c_x2, 'RL', p_eid, 'PackageProcName');

  ELSIF (p_x1 = 'A') AND (c_x2 = 'L') THEN
    ret_var := 'GL';

    INSERT INTO Audit
      (old_val, new_val, audit_event, id, pname)
    VALUES
      (p_x1, c_x2, 'GL', p_eid, 'PackageProcName');

  END IF;

  RETURN ret_var;
END;
A: 

Create a sorted index on your table.

Introduction to SQL Server Indexes; other RDBMSs are similar.

Edit since you edited your question:

Using a view is even more sub-optimal, especially when querying single rows from it. I take it your "business functions" are actually something like stored procedures?

As others suggested: in SQL, always go set-based. I assumed you had already done that, hence my tip to start using indexing.

Johannes Rudolph
A: 

A couple of tips:

  • Don't load all records into RAM but process them one by one.
  • Try to run as many functions on the client as possible; databases are really slow at executing user-defined functions.
  • If you need to join two tables, it's sometimes possible to create two connections on the client. Fetch the main data over connection 1 and the audit data over connection 2. Order the data the same way on both connections so you can read matching records from each and perform whatever logic you need on them.
  • If your functions always return the same result for the same input, use a computed column or a materialized view. The database will run the function once and store the result in a table. That makes INSERTs slower but SELECTs quick.
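As a sketch of the computed-column idea (SQL Server syntax; the table, column, and function names here are hypothetical, and the function must be deterministic for the column to be PERSISTED):

```sql
-- Hypothetical names: precompute dbo.fn_x1(id) once per row at write time.
ALTER TABLE dbo.Table1
  ADD x1_result AS dbo.fn_x1(id) PERSISTED;

-- SELECTs now read the stored value instead of re-running the function:
SELECT id, x1_result FROM dbo.Table1;
```
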
Aaron Digulla
I have written the same code in C# using a DataReader, where I get each row, perform the logic in C#, and then insert... this also takes a lot of time. The functions take a parameter and perform some logic on it, like a comparison and then calculating a date or other data. I'm not using any joins in my query.
rs
(-1) Use set-based SQL, not record by record. Try to use native SQL functions, which are fast. Use the DB as a DB: it is built for joins / updates / queries. Don't think that you can rewrite the database with your own programming skills.
Guy
I'm doing the functions in C#, not in SQL. If I try to load everything at one time into memory it throws an OutOfMemory exception, which is why I'm using a DataReader... I know what a DB is built for... I'm trying to get an answer and learn, not rewrite the DB, please...
rs
+4  A: 

i'm getting each row and performing logic in C# and then inserting

If possible INSERT from the SELECT:

INSERT INTO YourNewTable
        (col1, col2, col3)
    SELECT
        col1, col2, col3
        FROM YourOldTable
        WHERE ....

This will run significantly faster than a query where you loop over the result set and issue an INSERT for each row.

EDIT as for the OP question edit:

You should be able to replace the function calls with plain SQL in your query. Mimic the "INITIAL" case using a LEFT JOIN to tableP, and the "RL" or "GL" values can be calculated using CASE.
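A sketch of that rewrite, using the table and column names from the pseudo-code in the question (so treat them as assumptions):

```sql
-- Sketch only: Table1, tableP, and the column names mirror the question.
SELECT t.id, t.x, t.y, t.z,
       CASE
         WHEN p.col1 IS NULL                 THEN 'INITIAL'
         WHEN p.col1 = 'L' AND t.x = 'A'     THEN 'RL'
         WHEN p.col1 = 'A' AND t.x = 'L'     THEN 'GL'
       END AS x2_result
  FROM Table1 t
  LEFT JOIN tableP p
    ON p.eid = t.id;
```

The audit rows can then be written with one set-based INSERT ... SELECT filtered on the same CASE conditions, instead of a per-row INSERT inside the function.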

EDIT based on OP recent comments:

Since you are loading data from Oracle into SQL Server, this is what I would do: most people who could help have moved on and will not read this question again, so open a new question where you say: 1) you need to load data from Oracle (version) to SQL Server (version); 2) currently you load it with one query, process each row in C#, and insert it into SQL Server, and it is slow; and all the other details. There are much better ways of bulk loading data into SQL Server. As for this question, you could accept an answer, answer it yourself explaining that you will ask a new question, or just leave it unaccepted.
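One common pattern for Oracle-to-SQL-Server loads is to stream a data reader straight into SqlBulkCopy, so no full DataSet is ever materialized. A minimal sketch, assuming an ADO.NET Oracle provider; the connection strings, query, and destination table name are placeholders:

```csharp
using System.Data.SqlClient;             // SqlBulkCopy, SqlConnection
using Oracle.ManagedDataAccess.Client;   // OracleConnection (assumed provider)

class OracleToSqlServerLoader
{
    // Streams rows from Oracle straight into SQL Server without building
    // a DataSet, so memory use stays flat regardless of row count.
    public static void Load(string oracleConnStr, string sqlConnStr)
    {
        using (var src = new OracleConnection(oracleConnStr))
        using (var cmd = new OracleCommand("SELECT id, x, y, z FROM Table1", src))
        {
            src.Open();
            using (var reader = cmd.ExecuteReader())
            using (var bulk = new SqlBulkCopy(sqlConnStr))
            {
                bulk.DestinationTableName = "dbo.Table1_Staging"; // hypothetical
                bulk.BatchSize = 10000;       // commit in chunks
                bulk.WriteToServer(reader);   // pulls rows one at a time
            }
        }
    }
}
```
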

KM
INSERT INTO ... SELECT will not work for me because I'm loading an Oracle table into SQL Server. And I cannot do the audit INSERT if I use CASE.
rs
wow! wouldn't _i'm loading oracle table to sql server_ have been something you put in the question? right near the top??
KM
OMFG++
OMG Ponies
+2  A: 

My recommendation is that you do not use functions and then call them within other SELECT statements. This:

SELECT t.id, ...
       x1(t.id) ...
  FROM TABLE t

...is equivalent to:

SELECT t.id, ...
       (SELECT x.column FROM x1 x WHERE x.id = t.id)
  FROM TABLE t

Encapsulation doesn't work in SQL the way it does in C# and similar languages. While the approach makes maintenance easier, performance suffers because the sub-select executes for every row returned.

A better approach would be to update the supporting function to expose the join criteria (i.e. "x.id = t.id", for lack of a real one) in its SELECT:

SELECT x.id
       x.column 
  FROM x1 x

...so you can use it as a JOIN:

SELECT t.id, ...
       x1.column
  FROM TABLE t
  JOIN (SELECT x.id,
               x.column 
          FROM MY_PACKAGE.x) x1 ON x1.id = t.id

I prefer that to having to incorporate the function logic into the queries, for maintenance's sake, but sometimes it can't be helped.

OMG Ponies
+1, there are multiple problems at work here. I thought of this, but just went with the INSERT from SELECT. You explained the function issue better than I would have, too...
KM
can someone edit above comment for me .. i wanted to add code snippet
rs
edit your question and put the code in there, as far as I can tell you can only __BOLD__ and _ITALICS_ in comments, no code formatting
KM
thanks @KM i did that...
rs
A: 

Firstly you need to find where the performance problem actually is. Then you can look at trying to solve it.

  1. What is the performance of the view like? How long does it take the view to execute without any of the function calls? Try running the command:

    create table the_view_table
    as
    select *
    from the_view;

    How well does it perform? Does it take 1 minute or 1 hour?
  2. How well do the functions perform? According to the description you are making approximately 5 million function calls, so they had better be pretty efficient! Also, are the functions defined as DETERMINISTIC? If they are, Oracle has a chance of optimizing away some of the calls.

  3. Is there a way of reducing the number of function calls? The functions are being called once the view has been evaluated and the million rows of data are available. But do all the input values have to come from the highest level of the query? Can the function calls be embedded into the view at a lower level? Consider the following two queries. Which would be quicker?

    select 
      f.dim_id, 
      d.dim_col_1, 
      long_slow_function(d.dim_col_2) as dim_col_2
    from large_fact_table f
    join small_dim_table d on (f.dim_id = d.dim_id);

    select 
      f.dim_id, 
      d.dim_col_1, 
      d.dim_col_2
    from large_fact_table f
    join (
      select 
        dim_id, 
        dim_col_1, 
        long_slow_function(dim_col_2) as dim_col_2
      from small_dim_table) d on (f.dim_id = d.dim_id);

    Ideally the second query should run quicker, as it calls the function fewer times (once per dimension row instead of once per fact row).
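Relating to point 2 above: marking a function DETERMINISTIC is just a keyword in its declaration. A sketch, with a placeholder name and body:

```sql
CREATE OR REPLACE FUNCTION long_slow_function(p_val VARCHAR2)
  RETURN VARCHAR2
  DETERMINISTIC  -- promises the same output for the same input,
                 -- which lets Oracle skip some repeated calls
IS
BEGIN
  RETURN UPPER(p_val);  -- placeholder body for illustration
END;
/
```
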

The performance issue could be in any of these places and until you investigate the issue, it would be difficult to know where to direct your tuning efforts.

MikeyByCrikey
Thank you, I'm going through the functions and query again and will see how I can avoid unnecessary calls..
rs
A: 

Personally I'd create an SSIS import to do this task. Using a bulk insert you can improve speed dramatically, and SSIS can handle the functions part after the bulk insert.

HLGEM