I have a situation in my application. For each x-axis point in my chart, I am plotting 5 y-axis values. To calculate each of these 5 values, I need to make 4 different queries. That is, for each x-axis point I need to fire 20 SQL queries.

Now, I need to plot 40 such points in my chart. It's resulting in very poor performance: it takes close to a minute to get all the data back from the database. Each of the 4 different queries consists of a join between 2 tables. One has only 6 rows, the other close to 10,000. Each of the 4 queries has a different WHERE clause, so they are different queries. For each point on the x-axis, only the values in the WHERE clauses change.

I have tried combining each of the 4 queries into one big string. Basically batch the four selects. These are again batched for each y-axis value. So, for each x-axis point, I am now firing one big command that consists of 20 different select statements.

In theory, I should be seeing a big performance boost, right? Instead of hitting the db 40x5x4 = 800 times, I am now hitting it just 40 times. But instead of taking 60 seconds, it is taking 50-55 seconds... not much of a help.

I am using MySQL 5.1 and version 6.1 of its .NET connector.

What can I do to improve the performance?

Edit:

One of the 4 queries is as follows:

SELECT
  SUM(TIME_TO_SEC(TIMEDIFF(T1.col2, T1.col1))* T2.col1 / (3600 *1000)) AS TotalTime
FROM Table T1
  JOIN Table T2 
  ON T1.col3 = T2.col3
WHERE T1.col4 = 'i'
    AND T1.col1 >= '2009-12-25 00:00:00'
    AND T1.col2 <= '2009-12-26 00:00:00';

The other 3 queries are similar, only the where clause changes slightly. This set of 4 queries is fired 5 times. The first 3 times against the join of table T1 and T2, passing in different values for col4. And the next two times against the join of table T3 and T2 passing in different values for col4. These 5 values are the y-axis values for a particular x-axis point.

The data returned by all these queries has the same format, so we tried doing a UNION ALL on all of them. No substantial difference. One strange thing, however: after indexing the foreign key on table T1 [while it contained over a lakh (100,000) records], the queries were using the index, but they had become slower. At times, the queries would take double the time to return the data.

A: 

By batching the statements, you have ruled out connection and round-trip overhead as the problem.

So the problem must lie in the queries themselves taking some time.

With such small tables, your joins should be extremely fast, unless the foreign key in the table with 10K records is not indexed. If it is not, indexing it should give a significant boost.

Also, it is very likely that your 800 queries can be combined into one. You should gain there too (even if you add some additional joins).

I am thinking of something like

SELECT
  a.*,
  b1.needed_value,
  b2.needed_value,
  b3.needed_value,
  b4.needed_value,
  b5.needed_value
FROM
  table1 a,
  table2 b1,
  table2 b2,
  table2 b3,
  table2 b4,
  table2 b5
WHERE
  a.a_key = b1.b_key AND
  a.a_key = b2.b_key AND
  a.a_key = b3.b_key AND
  a.a_key = b4.b_key AND
  a.a_key = b5.b_key AND
  a.x_id IN (x1, x2, ...., x40)
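
Another way to fold several of the per-point queries into one, offered as a rough sketch rather than a drop-in replacement, is conditional aggregation: one pass over the join produces one sum per col4 value. It assumes the tables are really named T1 and T2 (the question uses placeholder names) and that the queries being merged differ only in the col4 value. The aliases total_i and total_a are made up for illustration.

```sql
-- Sketch: one pass over the join, one output column per col4 value.
-- Assumptions: tables are named T1 and T2; total_i / total_a are illustrative aliases.
SELECT
  SUM(CASE WHEN T1.col4 = 'i'
           THEN TIME_TO_SEC(TIMEDIFF(T1.col2, T1.col1)) * T2.col1 / (3600 * 1000)
      END) AS total_i,
  SUM(CASE WHEN T1.col4 = 'a'
           THEN TIME_TO_SEC(TIMEDIFF(T1.col2, T1.col1)) * T2.col1 / (3600 * 1000)
      END) AS total_a
FROM T1
  JOIN T2 ON T1.col3 = T2.col3
WHERE T1.col1 >= '2009-12-25 00:00:00'
  AND T1.col2 <= '2009-12-26 00:00:00';
```

A CASE without ELSE yields NULL for non-matching rows, and SUM ignores NULLs, so each column only adds up the rows for its own col4 value.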
Benoit Vidis
I have edited my question to include a sample of the query we are executing. I indexed the foreign key on the table that contained a lot of rows. When it had just around 10k rows, the queries took less time. But when the data in that table grew to around 2 lakh rows, the queries took double the time compared to running without the index. Is it possible for indexes to slow down queries?
Amith George
OK, so the join is not the bottleneck anymore. Do you have an index on col4?
Benoit Vidis
Yes, I do have an index on col4. However, col4 will only contain 2 values, 'i' or 'a', so I am not sure indexing it will make a difference. There will be an almost equal number of rows for 'a' and 'i'.
Amith George
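
On the col4 index: a column with only two roughly equally common values filters out about half the rows at best, so an index on it alone is unlikely to help, and going through such an index can cost more than scanning the table. A composite index that also covers the date column used in the range filter is one thing worth trying. The sketch below assumes the large table is named T1, and the index name is made up. EXPLAIN shows whether the optimizer actually picks the index.

```sql
-- Hypothetical index name; assumes the large table is called T1.
CREATE INDEX idx_t1_col4_col1 ON T1 (col4, col1);

-- Check which index the optimizer chooses (see the "key" column of the output).
EXPLAIN
SELECT
  SUM(TIME_TO_SEC(TIMEDIFF(T1.col2, T1.col1)) * T2.col1 / (3600 * 1000)) AS TotalTime
FROM T1
  JOIN T2 ON T1.col3 = T2.col3
WHERE T1.col4 = 'i'
  AND T1.col1 >= '2009-12-25 00:00:00'
  AND T1.col2 <= '2009-12-26 00:00:00';
```

Whether this helps depends on the data distribution, so it is worth comparing timings with and without the index.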
A: 

Without seeing some examples of the queries you are running, it is tough to make good suggestions. My first guess is to try using UNIONs to combine multiple distinct queries with the same result structure into a single query that hits the database just once.

But what I think you are describing is that you have value dependencies so you can't run other queries until you have the values from the previous queries. You may be able to work around this by using "derived tables", which is really just a SELECT that acts as a table.

SELECT yaxis.*,xaxis.xval FROM yaxis JOIN (
SELECT xaxis.xval FROM xaxis WHERE xval IN (4, 23, 32, 75, 78) AND group_id=121
) AS xaxis ON yaxis.xval=xaxis.xval

Basically, put a SELECT in parentheses and assign it a name, and you can treat it just like a database table.
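
Applied to the query in the question, the same idea might look like the sketch below: a derived table holds one row per x-axis point (its start and end timestamps), and a single grouped query returns one sum per point and per col4 value. This is only an illustration under assumptions: the tables are taken to be named T1 and T2, and the alias w and the columns start_ts and end_ts are invented here.

```sql
-- Sketch: one query for all x-axis points instead of one query per point.
-- Assumptions: tables are named T1 and T2; w, start_ts, end_ts are illustrative names.
SELECT
  w.start_ts,
  T1.col4,
  SUM(TIME_TO_SEC(TIMEDIFF(T1.col2, T1.col1)) * T2.col1 / (3600 * 1000)) AS TotalTime
FROM (
    SELECT CAST('2009-12-25 00:00:00' AS DATETIME) AS start_ts,
           CAST('2009-12-26 00:00:00' AS DATETIME) AS end_ts
    UNION ALL
    SELECT CAST('2009-12-26 00:00:00' AS DATETIME),
           CAST('2009-12-27 00:00:00' AS DATETIME)
    -- add one SELECT per remaining x-axis point
  ) AS w
  JOIN T1 ON T1.col1 >= w.start_ts AND T1.col2 <= w.end_ts
  JOIN T2 ON T1.col3 = T2.col3
WHERE T1.col4 IN ('i', 'a')
GROUP BY w.start_ts, T1.col4;
```

Points with no matching rows produce no output row, so the application has to treat a missing point as empty. Whether this runs faster than the separate queries depends on the indexes available for the range join, so it should be measured.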

Brent Baisley
Thanks for the suggestion. We tried the union approach, but it didn't reduce the time much. And the queries are independent; there is no dependency on the previous query's value.
Amith George