My client ordered another addition to the script, and I can't figure out how to fix the slowdown it causes. The table has about 50,000 rows.

 while ($stats = mysql_fetch_array($get_stats)) {
     if ($stats['ip'] == gethostbyaddr($stats['ip'])) { // new code
         $is_undef = "Yes";                             // causing problems
     } else {
         $is_undef = "No";
     }                                                  // end new code

     echo "<tr><td>" . date("d M Y g:i a ", strtotime($stats['date'])) . "</td><td>" .
         $stats['ip'] . "</td><td>" .
         parse_url_domain($stats['ref_url']) . "</td><td>" .
         $is_undef . "</td></tr>";
 }

This is the query:

 $get_stats = mysql_query("SELECT * FROM visitors WHERE site='$_GET[site]' AND date >= '$start_date' AND date <= '$end_date' ");
+4  A: 

I think that you might have an issue with "gethostbyaddr". Looping over that 50k times is going to be REALLY slow.

Also, not that it is relevant to the question, but you might want to think about SQL injection a little bit. I hope that isn't the actual query you are running. If it is, someone can simply drop your table.

Brandon Hansen
Yes, 50000 gethostbyaddr is a mightily bad idea. 50000 DNS requests?!
EFraim
How can I fix the injection?
Norbert
Injection should be a new question.
madcolor
Norbert - you may want to use a DB abstraction layer (ADOdb or something similar) with query preparation and parameter binding. +1 to Brandon for spotting the SQL injection.
Eimantas
Thanks for that, Eimantas. I'll put this in a new question.
Norbert
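
The query preparation and parameter binding mentioned in the comments above could look roughly like the sketch below. It uses PDO rather than the mysql_* functions from the question, which is a substitution on my part; $pdo is assumed to be an already-connected PDO instance, fetch_visitor_stats() is a made-up name, and the table and column names are copied from the query in the question.

```php
<?php
// Sketch only: $pdo is assumed to be an already-connected PDO instance;
// table and column names are taken from the query in the question.
function fetch_visitor_stats(PDO $pdo, $site, $start_date, $end_date) {
    $sql = "SELECT * FROM visitors
            WHERE site = :site AND date >= :start_date AND date <= :end_date";
    $stmt = $pdo->prepare($sql);
    // Values are bound as parameters, never concatenated into the SQL text.
    $stmt->execute(array(
        ':site'       => $site,
        ':start_date' => $start_date,
        ':end_date'   => $end_date,
    ));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

Because the site value travels as a bound parameter, a value such as a.com' OR '1'='1 is compared as a plain string instead of being parsed as SQL.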
A: 

I guess gethostbyaddr() is slow because it fails to find a host for the IP address. What's the purpose of the following expression anyway?

$stats['ip'] == gethostbyaddr($stats['ip'])

Assume $stats['ip'] is 127.0.0.1 and the corresponding host is localhost. So you'd compare 127.0.0.1 (an IP address) to localhost (a host name).

Philippe Gerber
gethostbyaddr() returns the unmodified IP address on failure to look up the host name, so those being equal would indicate such a failure.
Ty W
What Ty said. I'm trying to get all IPs that don't return a valid host name.
Norbert
Ah, I see ... thx. :-)
Philippe Gerber
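
To make the failure check described in the comments above explicit, here is a minimal sketch. The function name is_unresolved_ip() and the $resolver parameter are hypothetical; the parameter exists only so the check can be tried without real DNS traffic, and it defaults to PHP's own gethostbyaddr().

```php
<?php
// $resolver is a hypothetical hook so the check can be tried without real DNS;
// by default it is PHP's own gethostbyaddr().
function is_unresolved_ip($ip, $resolver = 'gethostbyaddr') {
    $host = call_user_func($resolver, $ip);
    // gethostbyaddr() returns the IP unchanged when the lookup fails
    // (and false on malformed input), so both count as "no host name".
    return $host === false || $host === $ip;
}
```

With a stub resolver that returns 'localhost' for 127.0.0.1 the function returns false; with one that echoes the IP back it returns true.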
A: 

From the comments at http://php.net/gethostbyaddr:

"Just wanted to let everyone know that gethostbyaddr() takes more than 20 seconds to respond if the IP address is not listed in DNS."

I'd say looking these hostnames up in a loop is probably slowing you down.

Also, the variables you're sending to that query need to be properly escaped, or you're really asking for trouble.

Ty W
A: 

DNS requests can take quite a long time, especially across 50k records. If your client demands having the hostname instead of IP address for records, you might want to run some kind of background process to cache the hostnames instead of looking them up every page load.

Also, most ISPs use blocks of IP addresses, so you could start to build tables that track IP ranges and hostmasks for ISPs to cut out the DNS lookups.

JimR
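
One way the caching idea above might be sketched: resolve each distinct IP at most once and keep the result in an array. cached_host_lookup() and its $resolver parameter are hypothetical names. Where the cache is persisted (a table, a file, a background job that fills it) is left open here, since the thread does not say.

```php
<?php
// Hypothetical helper: resolves each distinct IP at most once and keeps the
// result in $cache, which the caller could load from / save to a table or file.
function cached_host_lookup($ip, array &$cache, $resolver = 'gethostbyaddr') {
    if (!array_key_exists($ip, $cache)) {
        // Only a cache miss triggers an actual (slow) reverse DNS lookup.
        $cache[$ip] = call_user_func($resolver, $ip);
    }
    return $cache[$ip];
}
```

Repeat visitors then cost one lookup per distinct address rather than one per row.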
A: 

I think the gethostbyaddr() call is slowing you down.

See http://us3.php.net/manual/en/function.gethostbyaddr.php#88920

Donnie C
+1  A: 

I would suggest doing this check before you add each address to your database table (i.e. once per address, instead of 50,000 times every time the data is viewed).

Hugh Bothwell
Yeah, I ended up suggesting it to the client and he accepted. Thanks.
Norbert
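
A rough sketch of doing the check at insert time, under several assumptions: is_undef is a hypothetical extra column on the visitors table, $pdo is an already-connected PDO instance (the question itself uses the mysql_* functions), and record_visit() and $resolver are made-up names. The lookup then runs once per recorded visit, and the report page only reads the stored flag.

```php
<?php
// Sketch: `is_undef` is a hypothetical extra column on the visitors table;
// $pdo is an assumed PDO connection and $resolver defaults to gethostbyaddr().
function record_visit(PDO $pdo, $site, $ip, $ref_url, $resolver = 'gethostbyaddr') {
    $host = call_user_func($resolver, $ip);
    // An unchanged IP (or false) means the reverse lookup failed.
    $is_undef = ($host === false || $host === $ip) ? 'Yes' : 'No';
    $stmt = $pdo->prepare(
        "INSERT INTO visitors (site, date, ip, ref_url, is_undef)
         VALUES (:site, :date, :ip, :ref_url, :is_undef)"
    );
    $stmt->execute(array(
        ':site'     => $site,
        ':date'     => date('Y-m-d H:i:s'),
        ':ip'       => $ip,
        ':ref_url'  => $ref_url,
        ':is_undef' => $is_undef,
    ));
    return $is_undef;
}
```

The display loop can then echo $stats['is_undef'] directly, with no DNS call while rendering the table.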