views:

37

answers:

2

I have a SELECT query with lot of IF conditions, which I can do either in the query itself (takes DB machine's CPU) or I can put it in my java code (takes server machine's CPU).

Is there any preferred approach here (to put conditions in DB Vs in mid-tier)?

UPDATE: My query is a join on more than 2 tables, and I am using left join to combine and there are some rows which will have corresponding row in 2nd table and some are not. I need to have some default value for those columns when I don't have corresponding row in 2nd table.

SElECT CASE WHEN t2.col1 is null
    then 'default' else t2.col1
    END
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
+1  A: 

Yes, though I would suggest another approach, one that adds no load to the app server and minimal load to the DBMS. It's a little hard to answer the question since you haven't provided a concrete example but I'll give it a shot.

My preferred solution is to get rid of the if conditions totally if you can. At a bare minimum, you can re-jig your database schema to move the cost of calculation away from the select (which happens a lot) and into the insert/update (which happens less often).

That's the normal case, I have seen databases that write more frequently than read, but they're the exception rather than the rule.

By way of example, let's say you store person information and you want to get a list of people whose first name is more than 5 characters long. Don't ask why, I'm the customer, you have to give me what I want :-)

Rather than a monstrous select statement to (possibly) split apart the name and count the characters in it, do that as an insert/update trigger when the data enters the table - that's the only time when the value can change after all.

Put that calculation in another column (indexed) and use that in your select. The cost of the calculation is amortised over al the selects, which will be blindingly fast.

It will take up more storage space but, if you compare the number of database "how can I make this faster?" questions against the number of "how can I use less space?" questions, you'll find the former greatly outweigh the latter.

And, yes, it does mean you store redundant data but the triggers mitigate the possibility of losing ACID properties. It's okay to bend rules if you know the possible consequences and how best to avoid them.


Based on your update, you should put the workload on to the machine where it causes the least impact. That may be the DBMS, it may be the app server, it may even be on the client side (of the app server) itself since that would distribute the cost across a lot of machines rather than concentrating it at a single point.

You should measure, not guess! Set up realistic performance test systems along with realistic production-quality data, then try the different approaches. That's the only real way to be certain.

paxdiablo
Thanks paxdiablo for the great information, but this won't help me in my scenario(and also I am not allowed to change schema now). Updated question with my query.
Reddy
+2  A: 

If it's really something that the DB cannot do any faster than the app server, and which actually reduces the load on the DB server if moved to the app server, then I'd move it to the app server.

The reason: if you reach the limits of your hardware, it's much easier to have multiple app servers than to have a clustered database.

However, the second condition above should be tested thoroughly: many things will not reduce (or even increase) the DB load if moved away from the DB.

Update: For the kind of thing you need, I doubt whether the first condition is satisfied - have you tested it? A simple CASE is completely insignificant, unless the condition or the branches contain some very expensive calculations.

Michael Borgwardt
Thanks for 'The reason:". Probably then I should put conditions in java code. It is not the exact query I am using but yes, it is the just simple case I am using, there are no expensive calculations, only selecting one of the two columns.
Reddy
Can you give any links regarding this?"it's much easier to have multiple app servers than to have a clustered database"
Reddy
@Reddy: see http://insidehpc.com/2006/07/14/what-is-a-clustered-database/ - basically you either have to change your DB design to allow shared-nothing partitioning, or you risk sudden catastrophic network contention when someone does a big query across partitions.
Michael Borgwardt
thanks Michael, will read further. Thanks.
Reddy