views:

150

answers:

6

When faced with the task of copying a record in a database and modifying just a handful of values, I tend to use a temporary table to avoid writing out all of the unchanged columns. Does anyone know how this would affect performance on a large scale system?

A quick example (which also shows why I prefer the temporary table method):

Let's say I have a table mytbl with 50 columns col1...col50. I want to insert a new record that is an exact duplicate of the row where col5 = 'Some Value' except that col45 will be set to 'Some other value'.

Method 1

CREATE GLOBAL TEMPORARY TABLE tmptbl AS
  SELECT * FROM myschema.mytbl;

INSERT INTO tmptbl
  (SELECT *
   FROM myschema.mytbl
   WHERE mytbl.col5 = 'Some Value');

UPDATE tmptbl
SET col45 = 'Some Other Value';

INSERT INTO myschema.mytbl
  (SELECT * FROM tmptbl);

DROP TABLE tmptbl;

Method 2

INSERT INTO myschema.mytbl (col1,
                            col2,
                            col3,
                            col4,
                            col5,
                            col6,
                            col7,
                            col8,
                            col9,
                            col10,
                            col11,
                            col12,
                            col13,
                            col14,
                            col15,
                            col16,
                            col17,
                            col18,
                            col19,
                            col20,
                            col21,
                            col22,
                            col23,
                            col24,
                            col25,
                            col26,
                            col27,
                            col28,
                            col29,
                            col30,
                            col31,
                            col32,
                            col33,
                            col34,
                            col35,
                            col36,
                            col37,
                            col38,
                            col39,
                            col40,
                            col41,
                            col42,
                            col43,
                            col44,
                            col45,
                            col46,
                            col47,
                            col48,
                            col49,
                            col50)
SELECT col1,
       col2,
       col3,
       col4,
       col5,
       col6,
       col7,
       col8,
       col9,
       col10,
       col11,
       col12,
       col13,
       col14,
       col15,
       col16,
       col17,
       col81,
       col19,
       col20,
       col21,
       col22,
       col23,
       col24,
       col25,
       col26,
       col27,
       col28,
       col29,
       col30,
       col31,
       col32,
       col33,
       col34,
       col35,
       col36,
       col37,
       col38,
       col39,
       col40,
       col41,
       col42,
       col43,
       col44,
       'Some Other Value',
       col46,
       col47,
       col48,
       col49,
       col50
FROM myschema.mytbl
WHERE col5 = 'Some Value';

How much overhead does creating/dropping a temporary table introduce? If, for example, this were part of a daily process on a production sized system, would the additional overhead be noticeable? I realize that depends a lot on the specifics of the system, but a general idea would be fantastic.

+4  A: 

I can't see how this could possibly be faster with a temporary table.

Even with the overhead of temporary table creation (and destruction) aside (two operations), you are performing three separate DB operations instead of one combined one. Also, you are performing two operations that involve individual locking on your non-temporary table, leading to less potential concurrency than a single combined op. Granted, this is somewhat database specific with regard to which actual locks are acquired.

Ultimately, I think method 1 has much more overhead than method 2, without even considering the certainly non-negligible cost of temporary table creation / destruction.

Michael Goldshteyn
It is faster to *write* method 1, and it may be easier to maintain code that looks like method 1, but users (and system purchasers) need the methods that *perform* best.
Philip Kelley
Well, you could wrap this into a stored proc, at least, so that users didn't have to be so verbose. But, I do agree that there is a maintenance issue with method 2, due to deficiencies in the SQL languange in "describing" such an operation. Now, one thing to consider is how normalized a table with that many columns is. Perhaps there is a need to re-evaluate this schema and solve the problem that way.
Michael Goldshteyn
I do realize that Method 1 has more overhead. I acknowledged that in the OP. My issue is whether the overhead is enough to negatively impact the system. i.e. if this were part of a daily process, would it add a significant amount of processing time? I realize that depends a lot on the specifics of the system, but I would like a general idea.
dpatch
Another issue implied commit by the DDL commands may leave you without the ability to rollback a complete unit of work.
Shannon Severance
One could argue that method 2 _could_ be written faster with some dynamic query logic. And, yes, the time for creation and destruction of the temporary table is non-negligible.
Adam Hawkes
+1  A: 

If your procedure will be working with fewer than 250 rows, you may want to consider using a table variable instead of a temp table, as a table variable uses memory instead of physically writing out to tempdb.

XSaint32
+3  A: 

I fear this may be a case of premature optimization. It appears that this approach "solves" a problem which may or may not exist, but certainly adds a fair degree of complexity to the system. My suggestion is

  1. Do the simplest thing that could possibly work (in this case, probably the INSERT-SELECT combination).
  2. Measure the results.
  3. If solution #1 works, fine. You did the simplest thing you could and it solved your problem.
  4. If solution #1 doesn't work (or doesn't work well enough), also fine - move to the next simplest thing that could possibly work.
  5. Iterate until the problem is sufficiently "solved" or the budget for time or money is exhausted.

Share and enjoy.

Bob Jarvis
+1 for measure, measure, measure.
DCookie
+2  A: 

I'm assuming the main goal of your approach is to avoid having to write out all 50 columns twice, which I suppose has the disadvantage of you having to make change if another column is added to the table. Others have commented on the source table structure, temporary table overhead and premature optimisation etc. This wasn't really your question, but on that assumption, I can see a third method that might work:

declare
    type tmptbl_type is table of mytbl%rowtype;
    cursor c is
        select * from mytbl where col5 = 'Some Value';
    tmptbl tmptbl_type;
begin
    open c;
    loop
        fetch c bulk collect into tmptbl limit 1000;
        for i in 1..tmptbl.count loop
            tmptbl(i).col45 := 'Some Other Value';
        end loop;
        forall i in 1..tmptbl.count
            insert into mytbl values tmptbl(i);
        exit when c%notfound;
    end loop;
    close c;
end;
/
Alex Poole
+8  A: 

First of all, this is not how temporary tables work in Oracle. Global temporary tables are permanent structures, it's just the data in them which is temporary. So there is no need to create and drop the table for each operation. That just adds overhead.

Secondly, the situations where temporary tables are justified are actually quite rare. In most scenarios a simple variable will suffice. This syntax - which inserts a single row - is valid from 9iR2 onwards:

declare
    lrec emp%rowtype;
begin
    select *
    into lrec
    from emp
    where empno = 1234;

    lrec.empno = 9999;
    lrec.sal = 5000;

    insert into emp values lrec;
end;

Note that parentheses are not required in this formulation.

APC
+2  A: 

You could do it this way.

INSERT INTO myschema.mytbl (SELECT * FROM myschema.mytbl WHERE mytbl.col5 = 'Some Value');

UPDATE myschema.mytbl SET col45 = 'Some Other Value' WHERE col5 = 'Some Value' AND rownum<=1;

This is even shorter than using a temporary table and has none of the drawbacks of temporary tables. Of course if your actual problem is more complex than what you have shown this may not work.

Leigh Riffel