Hello,

I have a quick question about the performance of a subquery versus joining another table.

INSERT
INTO Original.Person
  (
    PID, Name, Surname, SID
  )
  (
    SELECT ma.PID_new , TBL.Name , ma.Surname, TBL.SID 
    FROM Copy.Person TBL , original.MATabelle MA
    WHERE TBL.PID         = p_PID_old
      AND TBL.PID         = MA.PID_old
  );

This is my SQL. It runs around 1 million times or more.
My question is: which would be faster?

Changing TBL.SID to (SELECT new FROM helptable WHERE old = tbl.sid),
or adding helptable to the FROM clause and doing the join in the WHERE clause?
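
For reference, the two variants being compared might look like the sketch below. The help table's column names SID_old and SID_new are assumptions made for illustration (the question calls them old and new), and the table is assumed to be reachable without a schema prefix.

```sql
-- Variant 1: scalar subquery in the select list
INSERT INTO Original.Person (PID, Name, Surname, SID)
SELECT ma.PID_new,
       tbl.Name,
       ma.Surname,
       (SELECT h.SID_new FROM helptable h WHERE h.SID_old = tbl.SID)  -- assumed columns
FROM   Copy.Person tbl,
       original.MATabelle ma
WHERE  tbl.PID = p_PID_old
  AND  tbl.PID = ma.PID_old;

-- Variant 2: help table added to the FROM clause and joined in the WHERE clause
INSERT INTO Original.Person (PID, Name, Surname, SID)
SELECT ma.PID_new,
       tbl.Name,
       ma.Surname,
       h.SID_new                                                      -- assumed column
FROM   Copy.Person tbl,
       original.MATabelle ma,
       helptable h
WHERE  tbl.PID   = p_PID_old
  AND  tbl.PID   = ma.PID_old
  AND  h.SID_old = tbl.SID;
```

The two are not strictly equivalent: variant 1 yields NULL when no mapping row exists and raises an error if more than one row matches, while variant 2 drops the person when no mapping row exists.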

edit1
This statement runs only once per person, so as many times as there are persons.

My program has two modules: one that populates MATabelle and one that transfers the data. The program merges two databases, and because of this the same key is sometimes used in both.
I am now working on a solution that ensures no duplicate keys exist.

My solution is to create a help table. The owner of the key (SID) generates a new key and writes it into the help table. All other tables that use this key can then read it from the help table.

edit2
Something just occurred to me:
if a table has a key that can be NULL (a foreign key that is not linked),
then this won't work with the join in the FROM clause, will it?
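
On the nullable-key point: with a plain inner join, a person whose SID is NULL or has no row in the help table would drop out of the result, whereas a scalar subquery simply returns NULL for it. An outer join keeps such rows. A minimal sketch, assuming hypothetical help table columns SID_old and SID_new:

```sql
INSERT INTO Original.Person (PID, Name, Surname, SID)
SELECT ma.PID_new,
       tbl.Name,
       ma.Surname,
       h.SID_new                  -- NULL when tbl.SID is NULL or has no mapping
FROM   Copy.Person tbl
       JOIN original.MATabelle ma ON ma.PID_old = tbl.PID
       LEFT JOIN helptable h      ON h.SID_old  = tbl.SID   -- assumed column names
WHERE  tbl.PID = p_PID_old;
```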

greets
Auro

+2  A: 

Joining would be much faster than a subquery

ovais.tariq
Is that because it only needs to read the table once?
Auro
Because with a subquery, a temp table will be created first and then it will be joined.
ovais.tariq
+1  A: 

The main difference between a subquery and a join is that a subquery is faster when data has to be retrieved from a large number of tables, because joining many tables becomes tedious. A join is faster when fewer tables are involved.

Also, this "joins vs subquery" discussion can give you some more info.

Space
+6  A: 

Modern RDBMSs, including Oracle, optimize most joins and subqueries down to the same execution plan.

Therefore, I would go ahead and write your query in the way that is simplest for you and focus on ensuring that you've fully optimized your indexes.
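
One way to check this for a particular query is to look at the plan Oracle produces for each variant. A minimal sketch using EXPLAIN PLAN and DBMS_XPLAN.DISPLAY, shown here for the join form; the help table columns SID_old and SID_new are assumed names, and :p_PID_old is a bind-variable placeholder standing in for the value from the question:

```sql
-- Store the plan for the SELECT part of the statement
EXPLAIN PLAN FOR
SELECT ma.PID_new, tbl.Name, ma.Surname, h.SID_new
FROM   Copy.Person tbl, original.MATabelle ma, helptable h
WHERE  tbl.PID   = :p_PID_old
  AND  tbl.PID   = ma.PID_old
  AND  h.SID_old = tbl.SID;      -- assumed column names

-- Display the plan that was just stored
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Repeating the same two steps with the subquery form shows whether the optimizer treats the two variants differently on your data.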

If you provide your final query and your database schema, we might be able to offer detailed suggestions, including information regarding potential locking issues.

Edit

Here are some general tips that apply to your query:

  • For joins, ensure that you have an index on the columns that you are joining on. Be sure to apply an index to the joined columns in both tables. You might think you only need the index in one direction, but you should index both, since sometimes the database determines that it's better to join in the opposite direction.
  • For WHERE clauses, ensure that you have indexes on the columns mentioned in the WHERE.
  • For inserting many rows, it's best if you can insert them all in a single query.
  • For inserting on a table with a clustered index, it's best if you insert with incremental values for the clustered index so that the new rows are appended to the end of the data. This avoids rebuilding the index and often avoids locks on the existing records, which would slow down SELECT queries against existing rows. Basically, inserts become less painful to other users of the system.
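
Applied to the query in the question, the first two tips might translate into indexes like the ones below. This is only a sketch: the index names are made up, the help table and its SID_old column are assumptions, and some of these columns may already be indexed as primary keys, in which case creating a second index on the same column list fails with an error.

```sql
-- Join and filter columns used in the question's query
CREATE INDEX matabelle_pid_old_ix ON original.MATabelle (PID_old);
CREATE INDEX copy_person_pid_ix   ON Copy.Person (PID);

-- Lookup column of the help table (assumed name)
CREATE INDEX helptable_sid_old_ix ON helptable (SID_old);
```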
Marcus Adams
Well, providing my DB schema isn't possible because it has around 100 tables or more and is hard to understand. I'll see what I can do.
Auro
We don't need your entire schema, just the tables referenced in the query.
Marcus Adams
Well, I'm using the subquery now since it's easier to do. Thanks for your help.
Auro
A: 

Instead of focusing on whether to use a join or a subquery, I would focus on whether it is necessary to execute that particular insert statement 1,000,000 times. Especially since Oracle's optimizer, as Marcus Adams already pointed out, will optimize and rewrite your statements under the covers into an optimal form.

Are you populating MATabelle 1,000,000 times with only a few rows and then issuing that statement each time? If yes, then the answer is to do it in one shot. Can you provide some more information on the process that executes this statement so many times?

EDIT: You indicate that this insert statement is executed for every person. In that case the advice is to populate MATabelle first and then execute once:

INSERT 
INTO Original.Person 
  ( 
    PID, Name, Surname, SID 
  ) 
  ( 
    SELECT ma.PID_new , TBL.Name , ma.Surname, TBL.SID  
    FROM Copy.Person TBL , original.MATabelle MA 
    WHERE TBL.PID         = MA.PID_old 
  );

Regards, Rob.

Rob van Wijk
See the edit in my first post :D
Auro