ansaurus

Question

Where should I do the rowcount when checking for existence: sql or php?

Answer 1

+2 A:

YMMV, but I suspect that if you are only checking for existence, and don't need to use the retrieved data in any way, the COUNT() query will be faster. How much faster will depend on how much data.

Matt McClellan 2008-12-03 15:49:20

Answer 2

+2 A:

The fastest is probably asking the database if something exists:

SELECT EXISTS ([your query here])

rebra 2008-12-03 15:49:34

This can be MUCH faster becuase you don't care HOW MANY which saves SQL time.

n8wrl 2008-12-03 15:51:28

And it gives the (database server) optimizer a chance to do its thing: No need to count stuff if you just need to know it's there.

rebra 2008-12-03 15:53:26

You are both wrong, this method may be faster for small tables, but once you get past a certain amount of data SELECT COUNT(*) is faster. If you have to blindly choose, pick COUNT(*) because it's fast even on large tables and equally fast on small tables.

TravisO 2008-12-03 15:56:51

and in my case (mysql) I have no choice since EXISTS can only be used in subqueries it seems

tharkun 2008-12-03 16:04:26

rebra 2008-12-03 16:11:55

ok, I'm getting there! thank you all!

tharkun 2008-12-03 16:18:18

@TravisO: I'm not sure why you think COUNT would ever be faster than EXISTS *especially* on large tables. EXISTS can stop as soon as it finds one row that matches the given criteria. COUNT has to keep going.

Tom H. 2008-12-03 16:18:58

Answer 3

A:

I would definitely do it in the PHP to decrease load on the database.

In order to get a count and get the returned rows in SQL you would have to do two queries.. a COUNT and then a SELECT

The PHP way gives you everything you need in one result object.

adam 2008-12-03 15:50:55

I do not think this will reduce the load on the database. The database has to actually get the data, and it must be transferred and transformed to PHP. In contrast, when doing a SELECT EXISTS the database can just consult its index.

rebra 2008-12-03 15:59:10

Answer 4

A:

SELECT COUNT(*) FROM table

is the best choice, this operation is extremely fast both on small tables and large tables. While it's possible that

SELECT id FROM table

is faster on small tables, the difference in speed will be microscopic. But if you have a large table, this operation can be very slow.

Therefore, your best bet is to always choose to COUNT(*) the table (and it's faster to do * than it is to pick a specific column) as overall, it will be the fastest operation.

TravisO 2008-12-03 15:59:16

Now we're talking database implementations. COUNT(*) will not always be fast. In Postgresql for example, a COUNT(*) on a very large table could be quite slow. EXISTS on the other hand was made for this purpose.

rebra 2008-12-03 16:01:33

it's just that as far as I can see in mysql there is no EXISTS

tharkun 2008-12-03 16:02:48

http://dev.mysql.com/doc/refman/5.0/en/exists-and-not-exists-subqueries.html

rebra 2008-12-03 16:03:48

yes, for subqueries but not for the main query, right?

tharkun 2008-12-03 16:05:46

I would try. (I cannot myself)

rebra 2008-12-03 16:09:08

Answer 5

+1 A:

If all you are doing is checking for the existance, then

Select count(*) ...

But if you will retrieve the data if it exists, then just get the data and check it in PHP, otherwise you'll have two calls.

John MacIntyre 2008-12-03 16:03:29

Answer 6

+1 A:

For me is in the database.

Making a count(1) is faster than $result->num_rows because in the $result->num_rows you make 2 operations 1 select and a count if the select has a count is faster to get the result. Except if you also want the information from the db.

Luís Tedim 2008-12-03 16:07:03

Answer 7

+1 A:

SELECT 1 
  FROM (SELECT 1) t 
 WHERE EXISTS( SELECT * FROM foo WHERE id = 42 )

Just tested, works fine on MySQL v5

COUNT(*) is generally less efficient if:

you can have duplicates (because the DBMS will have to exhaustively search all of the records/indexes to give you the exact answer) or
have NULL entries (for the same reason)

If you are COUNT'ing based on a WHERE clause that is guaranteed to produce a single record (or 0) and the DBMS knows this (based upon UNIQUE indexes), then it ought to be just as efficient. But, it is unlikely that you will always have this condition. Also, the DBMS may not always pick up on this depending on the version and DBMS.

Counting in the application (when you don't need the row) is almost always guaranteed to be slower/worse because:

You have to send data to the client, the client has to buffer it and do some work
You may bump out things in the DBMS MRU/LRU data cache that are more important
Your DBMS will (generally) have to do more disk I/O to fetch record data that you will never use
You have more network activity

Of course, if you want to DO something with the row if it exists, then it is definitely faster/best to simply try and fetch the row to begin with!

Matt Rogish 2008-12-03 16:11:14

that sounds like the best option if I wante to retrieve the data in case of existence! thank you!

tharkun 2008-12-03 16:17:20

would you mind adding a small paragraph about the difference of just checking on one hand and checking plus if exists retrieving on the other hand. then I would mark your answer as accepted.

tharkun 2008-12-03 19:39:10

Answer 8

A:

If you want raw speed, benchmark! In addition to the methods others have suggested:

SELECT 1 FROM table_name WHERE ... LIMIT 1

may be faster due to avoiding the subselect. Benchmark it.

derobert 2008-12-03 16:55:29

ansaurus

tags:

views:

answers:

Where should I do the rowcount when checking for existence: sql or php?

related questions