views:

756

answers:

12

Ok I need to build a query based on some user input to filter the results.

The query basically goes something like this:

SELECT * FROM my_table ORDER BY ordering_fld;

There are four text boxes in which users can choose to filter the data, meaning I'd have to dynamically build a "WHERE" clause into it for the first filter used and then "AND" clauses for each subsequent filter entered.

Because I'm too lazy to do this, I've just made every filter an "AND" clause and put a "WHERE 1" clause in the query by default.

So now I have:

SELECT * FROM my_table WHERE 1 {AND filters} ORDER BY ordering_fld;

So my question is, have I done something that will adversely affect the performance of my query or buggered anything else up in any way I should be remotely worried about?

+1  A: 

TO improve performance, use column indexes on fields listen in "WHERE"

o_O Tync
+2  A: 

Standard SQL Injection Disclaimers here...

One thing you could do, to avoid SQL injection since you know it's only four parameters is use a stored procedure where you pass values for the fields or NULL. I am not sure of mySQL stored proc syntax, but the query would boil down to

SELECT *
  FROM my_table
 WHERE Field1 = ISNULL(@Field1, Field1)
   AND Field2 = ISNULL(@Field2, Field2)
   ...
 ORDRE BY ordering_fld
n8wrl
That tends to prevent the use of indexes on field1, etc, also should use COALESCE rather than ISNULL
KeeperOfTheSoul
The correct format would be @Field IS NULL OR t.col = @Field. It's a wasted index lookup or table scan otherwise.
OMG Ponies
+8  A: 

You can EXPLAIN your query:
http://dev.mysql.com/doc/refman/5.0/en/explain.html

and see if it does anything differently, which I doubt. I would use 1=1, just so it is more clear.

You might want to add LIMIT 1000 or something, when no parameters are used and the table gets large, will you really want to return everything?

KM
`@KM`: according to the tags, this should be `LIMIT 1000` :)
Quassnoi
@Quassnoi, thanks, I was thinking mysql when I first answered, but when I eddited and added the _TOP 1000_ part I was think sql server.
KM
+4  A: 

If there is a good way in your chosen language to avoid building SQL yourself, use that instead. I like Python and Django, and the Django ORM makes it very easy to filter results based on user input.

If you are committed to building the SQL yourself, be sure to sanitize user inputs against SQL injection, and try to encapsulate SQL building in a separate module from your filter logic.

Also, query performance should not be your concern until it becomes a problem, which it probably won't until you have thousands or millions of rows. And when it does come time to optimize, adding a few indexes on columns used for WHERE and JOIN goes a long way.

Christian Oudard
+5  A: 

WHERE 1 is a constant, deterministic expression which will be "optimized out" by any decent DB engine.

harpo
+34  A: 

MySQL will optimize your 1 away.

I just ran this query on my test database:

EXPLAIN EXTENDED
SELECT  *
FROM    t_source
WHERE   1 AND id < 100

and it gave me the following description:

select `test`.`t_source`.`id` AS `id`,`test`.`t_source`.`value` AS `value`,`test`.`t_source`.`val` AS `val`,`test`.`t_source`.`nid` AS `nid` from `test`.`t_source` where (`test`.`t_source`.`id` < 100)

As you can see, no 1 at all.

The documentation on WHERE clause optimization in MySQL mentions this:

  • Constant folding:

    (a<b AND b=c) AND a=5
    -> b>5 AND b=c AND a=5
    
  • Constant condition removal (needed because of constant folding):

    (B>=5 AND B=5) OR (B=6 AND 5=5) OR (B=7 AND 5=6)
    -> B=5 OR B=6
    

Note 5 = 5 and 5 = 6 parts in the example above.

Quassnoi
+1 for trying it out, using EXPLAIN and looking up documentation
knittl
`@knittl`: If every developer used `EXPLAIN` and looked up documentation, the world would be a much nicer place!
Quassnoi
+1 for trying to make the world a nicer place :)
waqasahmed
@quassnoi: would be, but is not ;)
knittl
This is excellent, thankyou.
Evernoob
+2  A: 

We've been doing something similiar not too long ago and there're a few things that we observed:

  • Setting up the indexes on the columns we were (possibly) filtering, improved performance
  • The WHERE 1 part can be left out completely if the filters're not used. (not sure if it applies to your case) Doesn't make a difference, but 'feels' right.
  • SQL injection shouldn't be forgotten

Also, if you 'only' have 4 filters, you could build up a stored procedure and pass in null values and check for them. (just like n8wrl suggested in the meantime)

snomag
I'm not too worried about SQL injections, I'm handling that. It was more I was just curious about the acceptability of what I'd done.
Evernoob
+2  A: 

That will work - some considerations:

About dynamically built SQL in general, some databases (Oracle at least) will cache execution plans for queries, so if you end up running the same query many times it won't have to completely start over from scratch. If you use dynamically built SQL, you are creating a different query each time so to the database it will look like 100 different queries instead of 100 runs of the same query.

You'd probably just need to measure the performance to find out if it works well enough for you.

Do you need all the columns? Explicitly specifying them is probably better than using * anyways because:

  • You can visually see what columns are being returned
  • If you add or remove columns to the table later, they won't change your interface
Shin
Oracle checks the cache based on the content of the submitted query, not the fact a user ran a given function or procedure.
OMG Ponies
+2  A: 

Not bad, i didn't know this snippet to get rid of the 'is it the first filter 3' question.

Tho you should be ashamed of your code ( ^^ ), it doesn't do anything to performance as any DB Engine will optimize it.

Clement Herreman
haha, why should I be ashamed of my code though? if the DB engine optimizes it, there's no harm done on the DB side of things and my business logic is now clearer and more readable for the next guy. So where's the shame?
Evernoob
Because you add "1=1", which is useless code, which is bad... Harmless but bad. But everyone does that ^^
Clement Herreman
+1  A: 

The only reason I've used WHERE 1 = 1 is for dynamic SQL; it's a hack to make appending WHERE clauses easier by using AND .... It is not something I would include in my SQL otherwise - it does nothing to affect the query overall because it always evaluates as being true and does not hit the table(s) involved so there aren't any index lookups or table scans based on it.

I can't speak to how MySQL handles optional criteria, but I know that using the following:

WHERE (@param IS NULL OR t.column = @param)

...is the typical way of handling optional parameters. COALESCE and ISNULL are not ideal because the query is still utilizing indexes (or worse, table scans) based on a sentinel value. The example I provided won't hit the table unless a value has been provided.

That said, my experience with Oracle (9i, 10g) has shown that it doesn't handle [ WHERE (@param IS NULL OR t.column = @param) ] very well. I saw a huge performance gain by converting the SQL to be dynamic, and used CONTEXT variables to determine what to add. My impression of SQL Server 2005 is that these are handled better.

OMG Ponies
+2  A: 

I have usually done something like this:

for(int i=0; i<numConditions; i++) {
  sql += (i == 0 ? "WHERE " : "AND ");
  sql += dbFieldNames[i] + " = " + safeVariableValues[i];
}

Makes the generated query a little cleaner.

jnylen
+2  A: 

One alternative i sometimes use is to build the where clause an an array and then join them together:

my @wherefields;
foreach $c (@conditionfields) {
  push @wherefields, "$c = ?",
}

my $sql = "select * from table";
if(@wherefields) { $sql.=" WHERE " . join (" AND ", @wherefields); }

The above is written in perl, but most languages have some kind of join funciton.

MadCoder