views:

110

answers:

5

Does anyone know if there is a performace differance between the following we parts of a where SQL string in mysql?

   WHERE ...  AND (field = "" OR field IS NULL);

and

  WHERE ...  AND (NOT (field != "" AND field IS NOT NULL));
+7  A: 

There is but it is negligibly small, so small I can't tell which would be faster and which slower. It's more of a matter of which is more readable — in this case I would pick the former.

BoltClock
do you have any references where i could find out why one would be faster slower, i dont recon i will use it but i am interested in why there would be a difference. i was expecting it to be the same because both subclauses are mathematicaly equivalent
pvgoddijn
I think it has something to do with the `NOT` operator, particularly when you use it to negate `field != "" AND field IS NOT NULL`.
BoltClock
+4  A: 

there is no performance difference as such,. its much more about code-readability here,. and of course the first statement is more readable,.

one more thing i would like to add is when thinking about optimizations,. one should not delve into micro-optimizations because that doesnt really add up to increasing performance because the difference is very negligible and u end up wasting time and resources and making your code less readable, less maintainable and less manageable.,

so whenever thinking about optimizations,. profile your code and start out with things like improving ur app design, db design, utilizing caching and so on,. rather then indulging into discussions about whether echo is faster or print (an example)

ovais.tariq
i agree on not doing micro optimizations, but i was curious about this one because we add this constraint (from Java) in many queries throughout the system, so if there would be a small but significant difference it would improve many queries a little bit.
pvgoddijn
hmm., there is no improvement as such,. so i suggest you use the statement 1 :)
ovais.tariq
+3  A: 

There is performance difference. In the way you have written query it depends upon the value of the variable "field".
In the query "WHERE ... AND (field = "" OR field IS NULL);" if field = "" is TRUE then field IS NULL will not be checked whether TRUE or FALSE. But if field = "" is FALSE then field IS NULL will be checked for TRUE or FALSE. Number of comparisons done depend on the value of "field".
In the query WHERE ... AND (NOT (field != "" AND field IS NOT NULL)); whether field != "" is TRUE or FALSE, field IS NOT NULL will be checked for TRUE or FALSE. Hence in either case comparison will be done twice. And finally NOT is checked. Hence overall three comparisons.
I prefer the former one, though the difference is negligible.
Its better not to use "NOT", although there will not be any significant performance difference considering today's compilers and system configurations.

M LOHIT
this was a good answer
Chris
A: 

The two are equivalent, and should be executed with exactly the same plan. If they are not then there is no guarantee that just because a) executes faster than b) today then they always will do.

The equivalence of the two is a direct application of elementary Boolean Algebra. If you have not had the opportunity already it's well worth your time to take a class or read a text covering it. The basics do not constitute a particularly large subject area, and the concepts are not particularly hard (you probably know much already, although probably not formally), but understanding the rules pays off richly once you have them under your thumb.

Cruachan
i understand the boolean equivalency, but i wasnt sure that mysql would recognize the equivalency (especialy because of the mixed use of NOT and '!='). Are you sure the is the case? The answers seem a little mixed about it so far.
pvgoddijn
I'd be very surprised if it didn't - the two should be parsed down to the same thing. However unless you are dealing with a large slow running query then it would be a mistake to concentrate on efficiency here as because a) readability is more important and b) implementation is highly likely to change,
Cruachan
A: 

According to default behavior of NULLs in SQL server (at least MSSQL) is that:

VALUE Logical_Operator NULL == FALSE

,- that is any Boolean operator with NULL results in FALSE (except special operator "IS NULL"). So in your second expression

"AND field IS NOT NULL"

is redundant part, because

(NULL != "") == FALSE

Now back to your question. First expression should be slower, because when we request sql server to select rows with field which is NULL,- server can't use INDEX on that field, and uses plain table scan (look at execution plan). So because of this my main advice is that never use nulls in any query - better mark empty fields with pre-defined symbol or just empty string.

EDIT: I've seen 'jokes' on mssql that construct below on some cases would run faster than your first expression (and possibly the second):

WHERE ... AND COALESCE(field,"")=""

My best guess at explaining this is that sql server don't likes OR cases, because by looking at query execution plan (with OR case) it is noticeable that server splits query into several parts and after executing these parts - merges results of both OR part queries. In any case you can add this third candidate for your performance experiments.

0x69