views:

50

answers:

5

I have a large list of strings (1500 email addresses to be more specific) and I need to look up a piece of data in a very large DB table for each of the strings (e.g. the primary key, mydataitem). How can I do it efficiently?


For example, this is way too slow (amongst other problems):

$stringArray = ('foo','bar','baz',..., 'for 1000s of items');

foreach($stringArray as $mystring) {
    $res = mysql_query("select mydataitem,blah FROM users WHERE blah = '$mystring'");
    $info=mysql_fetch_assoc($res);
    ...
}

Things I want to avoid:

  • I don't want to loop over the list and do a SELECT for each item. i.e. 1500 queries (as in example above)
  • I don't want to read the whole table into an array in one query and do the lookup in code because it would take too much memory. The DB table has 100k+ rows.
  • I don't want to build a massive query with 1499 ORs because the query would be too big. (for example "select mydataitem FROM users WHERE blah = 'aaa' OR blah = 'bbb' OR ...")

Note: I'm using MySql v5.0.45


Update: Thanks everyone - for some reason I thought IN was just for Integer ID lists - now I know better.

+1  A: 
WHERE blah IN ('aaa','bbb',...)

But it's still not efficient.

If you explain why you need to retrieve so many records, we might perhaps be able to come up with a more efficient logic.

EDIT

Create a temporary table holding these values, and use a join to that temp table in your select query

Mark Baker
The temporary table idea is brilliant. Thanks - I'll probably use it!
Tom
+1  A: 

I think what you want is

SELECT mydataitem FROM users WHERE blah IN ('foo', 'bar', 'baz', ...)
deinst
Your query doesn't link mydataitem to a particular email address.
Scott Saunders
+1  A: 

If you use a prepared statement, you can prepare it outside of your loop and then use it within your loop. That should run more quickly than a new mysql_query call each time.

I don't know what you mean by a query being "too big". Try it and see how slow it is.

Add a key on the email field if you don't already have one.

This doesn't sound like the kind of code that would be running often, so I would guess it's okay if it takes a second. If not, maybe you can explain the goal of this code and we can help you figure out a better way of accomplishing that goal.

Scott Saunders
+2  A: 
mysql_query("select mydataitem,blah FROM users WHERE blah IN ('"
    .implode("','",array_map('mysql_real_escape_string',$stringArray)."')';

Better yet, use mysqli or PDO which can use prepared statements:

$stmt = $PDO->prepare('select mydataitem,blah FROM users WHERE blah IN ('
  .implode(',',array_fill(0,count($stringArray),'?')).')';
$stmt->execute($stringArray);
Wrikken
Your query doesn't link mydataitem to a particular email address.
Scott Saunders
Pfff. SELECT mydataitem,blah then, as `blah` apparantly IS the email address.
Wrikken
+1  A: 

Unless this needs to but lightning fast I would just use a MySQL in clause

turn the array into a string:

$emails = array("[email protected]", "[email protected]", "[email protected]");

$list = "(\"". implode("\", \"", $balls) . "\")";

then just use it in your sql

$sql = "select mydataitem FROM users WHERE blah in {$list}";
Kelly Copley
Your query doesn't link mydataitem to a particular email address.
Scott Saunders