Let's say I have an Oracle database table of 40MM 9-digit ZIP codes which includes the 4-digit ZIP code for each. I have a list of 800 5-digit ZIP codes and I need to find all the associated 9-digit ZIP codes. There are 40K 5-digit ZIP codes altogether. Assume we have indexed the 5-digit ZIP code field with a B*Tree (conventional) index. With database performance and scalability in mind, is the best method to:
Query for all the records in a single statement? Something like this run once:
SELECT * FROM ZIPCODE WHERE ZIP5 IN (:1, :2,... :800)
Query once for each of the 5-digit ZIP codes? Something like this run 800 times:
SELECT * FROM ZIPCODE WHERE ZIP5 = :1
Some batch size in between the two?
What do you think, and why? Some [optional] follow-up thoughts if you're in a thinking mood:
- What tests can we run to validate your thinking?
- Does the best method change as the volumes change (e.g., if we have 100 9-digit ZIP codes mapped to 10 5-digit ZIP codes, looking for 3 of those 10)?
- Is the best thing for the database different than the best thing for the app servers which are absorbing the query results?