views:

63

answers:

2

Let's say I have an Oracle database table of 40MM 9-digit ZIP codes which includes the 4-digit ZIP code for each. I have a list of 800 5-digit ZIP codes and I need to find all the associated 9-digit ZIP codes. There are 40K 5-digit ZIP codes altogether. Assume we have indexed the 5-digit ZIP code field with a B*Tree (conventional) index. With database performance and scalability in mind, is the best method to:

  1. Query for all the records in a single statement? Something like this run once:

    SELECT * FROM ZIPCODE WHERE ZIP5 IN (:1, :2,... :800)

  2. Query once for each of the 5-digit ZIP codes? Something like this run 800 times:

    SELECT * FROM ZIPCODE WHERE ZIP5 = :1

  3. Some batch size in between the two?

What do you think, and why? Some [optional] follow-up thoughts if you're in a thinking mood:

  • What tests can we run to validate your thinking?
  • Does the best method change as the volumes change (e.g., if we have 100 9-digit ZIP codes mapped to 10 5-digit ZIP codes, looking for 3 of those 10)?
  • Is the best thing for the database different than the best thing for the app servers which are absorbing the query results?
+4  A: 

Let the database server decide how to handle it. Even if it internally does a query 800 times, it will still be faster. It has to parse the query only once, and has to send results only once.

Thus, use SELECT * FROM ZIPCODE WHERE ZIP5 IN (:1, :2,... :800)

Sjoerd
+1 with the caveat that an IN list can only go up to 1000 items. If you are already at 800, that is a bit too close for my liking.Stick the list into another table (global temporay table ?) and do a SELECT * FROM ZIPCODE WHERE ZIP5 IN (SELECT ZIP5 FROM ...)
Gary
I like this answer. The Oracle optimizer is smarter than us all. Usually...
Mike M. Lin
+1 for Gary's answer. Selecting from a table is much faster than an 800 in list.
Rene
+2  A: 

Do you have the option to create an external table? I.e.,

CREATE TABLE zip5 (zip5 varchar2(5))
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY <some oracle DIRECTORY object>
  LOCATION (<yourDirectory>:'zip5 filename.txt'
)

Place your zip5 text file in the OS directory specified by your Oracle directory object, then execute:

SELECT * FROM zipcode JOIN zip5 ON (zipcode.zip5 = zip5.zip5);

This is a more generic solution than rebuilding your query each time your ZIP5 list changes.

DCookie