views:

3667

answers:

13

What are some things I can do to improve query performance of an oracle query without creating indexes?

Here is the query I'm trying to run faster:

SELECT c.ClaimNumber, a.ItemDate, c.DTN, b.FilePath
FROM items a,
itempages b,
keygroupdata c
WHERE a.ItemType IN (112,115,189,241)
AND a.ItemNum = b.ItemNum
AND b.ItemNum = c.ItemNum
ORDER BY a.DateStored DESC

None of these columns are indexed and each of the tables contains millions of records. Needless to say, it takes over 3 and half minutes for the query to execute. This is a third party database in a production environment and I'm not allowed to create any indexes so any performance improvements would have to be made to the query itself.

Thanks!

A: 

Remove the ORDER BY

perform the sort, after you pull the rows back to your application.

EvilTeach
Is it not obvious that database kernel developers spend quite a lot of time optimizing common components such as sort technologies? I find it unlikely that it's worth the effort to do this.
David Aldridge
Why yes it is.If the database is swamped, like I assume it is, then it may be that pulling the data back then sorting it would be of benefit.It would require some experimentation to see if this is a valid approache in his particular case, but he was asking for ideas. That was an idea.
EvilTeach
Well, by all means update your response to say, "If the database is swamped ...". Then we can talk about what "swamped" means, and whether in this case removing the ORDER BY also switches the optimizer to a FIRST_ROWS mode that adversely affects the execution plan etc..
David Aldridge
+1  A: 

Without indexing, that query is only going to get worse as the table size increases. With that said, try removing the order by clause and doing that sort on the client side.

Josh Bush
Is it not obvious that database kernel developers spend quite a lot of time optimizing common components such as sort technologies? I find it unlikely that it's worth the effort to do this.
David Aldridge
I'm sure that the database sorts are optimized, that's not the question. He asked how he could improve the query performance and the most obvious thing to do is remove the sort. I've pulled sorts to the client side with some real improvements on heavily loaded databases.
Josh Bush
+4  A: 

First I'd rewrite the query to be ANSI standard:

SELECT c.ClaimNumber, a.ItemDate, c.DTN, b.FilePath
FROM items a
INNER JOIN itempages b ON b.ItemNum = a.ItemNum
INNER JOIN keygroupdata c ON c.ItemNum = b.ItemNum
WHERE a.ItemType IN (112,115,189,241)
ORDER BY a.DateStored DESC

This makes it easier to read and understand what is going on. It also helps you not make mistakes (i.e. Cross Joining)that might cause real big problems. Then I'd get the Explain plan to see what the DBMS is doing with that query. Is it trying to use some indexes? Is it joining the tables correctly?

Then I'd review the tables that I'm working with to see if there are any indexes that already exist that I could be using to make my query faster. Finally as everyone else has suggested I'd remove the Order By clause and just do that in code.

Rob Booth
Is it not obvious that database kernel developers spend quite a lot of time optimizing common components such as sort technologies? I find it unlikely that it's worth the effort to do this
David Aldridge
@David Aldridge: I once had a query that took 90 seconds to complete on SQL Server 7. It had originally been written on SQL Server 6.5, which did not support ANSI SQL 92 join syntax.As part of my efforts to reduce execution time I did as Rob advises. The query then took 3 seconds to execute. YMMV.
Dave
That's nice to know :) but it's a real stab-in-the-dark, and should make no difference (and this is Oracle of course). If there is a difference it ought to manifest in an execution plan difference.
David Aldridge
I guarantee ORDER BY in the query will be faster than sorting in code.
John Stauffer
+6  A: 

Ask the third party to index its join columns, as they should have done in the first place! Without indexes, Oracle has nothing to go on other than brute force.

Tony Andrews
Unfortunately, that is not an option. My edge case requirement is not going to affect how they operate on a daily basis.
Chris Conway
+4  A: 

You may want to try creating a materialized view on any of those tables. You can then create an index on the materialized view that will help speed the query (which would then be querying the materialized view instead of the raw table).

Of course, if your underlying table is updated your view and indexes will need to be refreshed.

BQ
An On Commit Refresh Materialized view would prevent you from having to refresh manually when the table changes, but I'm not sure this is an option in the question given (although it is not specifically excluded).
Leigh Riffel
True. No idea if the OP can do anything at all in the database, but this gives him the option of keeping his hands off the original schema. OP: If you do the on commit refresh, make sure you're not doing frequent writes to those tables.
BQ
+1  A: 

If the query inputs are constant or predictable (the itemType IN (...)), then an alternative would be to run the query once or twice a day and store the results in a local table, with indices where appropriate.

You can then make the costly query 'offline' and have quicker/better results for an interactive query.

Ken Gentle
+1  A: 

First, look at the execution plan. Does it accurately reflect the number of rows to be retrieved at each stage of the query execution? How selective is the predicate "a.ItemType IN (112,115,189,241)"? Does the execution plan show any use of temporary disk space for joins or sorts?

Actually, maybe you can modify the question to include the execution plan.

Also make sure you do not have hash joins disabled, which is sometimes the case in OLTP-tuned systems, as they are the most efficient way of equijoining bulk data in Oracle. They ought to show up in the execution plan.

David Aldridge
+1  A: 

You can try filtering on item type before you join your tables, as shown here.

If you are running on Oracle prior to 9i, this would sometimes give surprising benefits.

select 
  c.claimnumber,
  a.itemdate, 
  c.dtn,
  b.filepath
from 
  (
  select itemdate
  from items it
  where it.itemtype in(112,115,189,241)
  ) a
  itempages b,
  keygroupdata c
where a.itemnum = b.itemnum
  and b.itemnum = c.itemnum

You can also try adding the hints /+RULE/ or /+ORDERED/ to see what happens... again, particularly with older versions, these would sometimes give surprising results.

SELECT /*+RULE*/
  c.ClaimNumber, a.ItemDate, c.DTN, b.FilePath
FROM
  items a,
  itempages b,
  keygroupdata c
WHERE a.ItemType IN (112,115,189,241)
  AND a.ItemNum = b.ItemNum
  AND b.ItemNum = c.ItemNum
ORDER BY a.DateStored DESC
JosephStyons
A: 

Are statistics gathered on these tables? If not, gathering statistics might change the execution plan, although it wouldn't necessarily be for the better.

Aside from that, look at the execution plan. You may see that it is joining the tables in a non-optimal order (e.g. it could be joining b and c before joining with a which has the filter condition).

You could use hints to try to affect the access paths, join order, or join method.

Update: Responding to the comment led me to this presentation, which might be helpful or at least interesting.

Dave Costa
If there aren't ANY indexes on the tables as the OP states, statistics aren't useful. He's stuck with a full table scan. Otherwise, good point.
BQ
Whoops. OP just said those columns don't have indexes, not that there aren't any. Statistics may actually help. But turning them on may still be out of his options, given the third-party app issue.
BQ
Statistics on the table alone can have an effect. I have a feeling there was a good paper by Mogens Norgaard about this, but I don't recall where I found it.
Dave Costa
+1  A: 

Is this a query that you run often? It seems like it would be in the DB owner's interest to create the indexes that you need to speed this query up. The 3.5 minutes you're spending running the query must have some impact on their production environment!

Also, have they been running update statistics on the tables? That might improve performance, as the join order is computed based on the statistics of the tables.

BTW, what are you allowed to do? Just read? If you can create temporary tables and put indexes on those, I might consider making temporary copies of the table, indexing those, and then do the index-assisted join with the temp copies.

Toybuilder
A: 

Sometimes you can see a benefit by adding extra pathways for the optimizer to choose by adding what seems like redundant elements to the where clause.

For instance, you've got A.ItemNum = B.ItemNum AND B.ItemNum = C.ItemNum. Try adding A.ItemNum = C.ItemNum as well. I'm pretty sure, however, that the optimizer is intelligent enough to figure that out on its own - worth a try though.

ScottCher
@David Aldridge: hoping for a thumbs up or down on this from you 8). I've seen it work for older versions of Oracle but I don't know if its still a viable trick
ScottCher
Oracle does join transitivty - automatically adding in A.ItemNum = C.ItemNum (certain in 10g onwards). It only does this for one extra step beyond what you have specified. It wouldn't help this case but could in some other situations.
WW
@WW. The last I heard this was not true for joins. It was true for equalities. A.id = myID and A.ID = B.ID ... Oracle will add B.ID = myID but not a join. And even if they claim to do this, I've had HUGE performance increases from adding the join... so it might have the ability but doesn't always.
A: 

Depending on the datatype of the ItemType column you may experience faster executing using the following if it's a varchar, Oracle will do implict conversions.

SELECT c.ClaimNumber, a.ItemDate, c.DTN, b.FilePath
FROM items a,
itempages b,
keygroupdata c
WHERE ((a.ItemType IN ('112','115','189','241'))
AND (a.ItemNum = b.ItemNum)
AND (b.ItemNum = c.ItemNum))
ORDER BY a.DateStored DESC
Brian Schmitt
A: 

If you say that there are no indexes, then does this also mean that there are no primary or foreign keys defined? Obviously analysing the tables and gathering statistics are important but if metadata such as defining how the tables are supposed to be joined doesn't exist, then Oracle may well choose a poor execution path.

In that case using a hint such as /*+ ORDERED */ may well be the only option to make the optimizer reliably choose a good execution path. It may also be worth adding foreign keys and primary keys but define them as DISABLE and VALIDATE.

I guess the usefulness of this comment depends on how far the aversion to indexes goes so YMMV.

MikeyByCrikey