i've got this union query:

(SELECT INSTALLER, INSTALLTIME, RESULT, JOBNUMBER, HONAME, ADDRESS, CITY, STATE, ZIP, NOTES, SMNOTES, '' as priority, PAFS, upsell, TERM, MMRUPGRADE, WARRANTY, EFT FROM ACCOUNTS 
WHERE INSTALLDATE = '$date' && FUNDINGSTATUS !='DEAD') 
UNION 
(SELECT technician, servicetime, result, ID, Customername, address, city, state, zip, notes, board, priority, '', '', '', '', '', '' FROM service 
WHERE serviceday = '$date') 
ORDER BY INSTALLER, priority

i'm curious if putting an index on the date field will help speed up both queries? or will the fact that i use FUNDINGSTATUS in the first where clause make that query not utilize the index?

+2  A: 

Most likely it will help, but the only way to be sure is to break open the profiler, and have a look. Starting with version 5.0.37, MySQL has a built-in profiler.

Enable it with

set profiling=1;

To look up the query_id:

show profiles;

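And to see how much time each execution stage of that query took (for the execution plan itself, run EXPLAIN on the query instead):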

show profile for query x;
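
Put together, a profiling session might look like the sketch below. The date literal '2010-01-15' is a made-up stand-in for $date, the column list is shortened for readability, and the query id 1 is only an assumption; use whatever id show profiles reports.

```sql
-- Hypothetical session: '2010-01-15' stands in for $date.
SET profiling = 1;

-- Run the statement you want to measure (first branch of the union here).
SELECT INSTALLER, INSTALLTIME, RESULT
FROM   ACCOUNTS
WHERE  INSTALLDATE = '2010-01-15' AND FUNDINGSTATUS != 'DEAD';

-- Lists recently profiled statements with their Query_ID and Duration.
SHOW PROFILES;

-- Per-stage timings for one statement; replace 1 with the reported Query_ID.
SHOW PROFILE FOR QUERY 1;

SET profiling = 0;
```

Running the same session before and after creating the index, and comparing the Duration values, gives a rough idea of whether the index helped.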
Yannick M.
+2  A: 

Answering your very question:

I'm curious if putting an index on the date field will help speed up both queries?

If the condition on installdate and serviceday is selective (that is, few rows satisfy it), then yes, it will help.

Date fields usually tend to be selective.

or will the fact that i use FUNDINGSTATUS in the first where clause make that query not utilize the index?

Yes, the index will still be used.

The engine will use the index to select only the records with installdate = $date and will then additionally filter on the value of fundingstatus.

For best results, create the following indexes:

ACCOUNTS  (installdate, fundingstatus)
service (serviceday)
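
As a sketch, those two indexes could be created as follows. The index names are invented for this example; only the table and column names come from the question.

```sql
-- Index names below are hypothetical; pick whatever fits your naming scheme.
CREATE INDEX ix_accounts_installdate_fundingstatus
    ON ACCOUNTS (installdate, fundingstatus);

CREATE INDEX ix_service_serviceday
    ON service (serviceday);

-- Confirm that the indexes exist and see their column order.
SHOW INDEX FROM ACCOUNTS;
SHOW INDEX FROM service;
```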

If DEAD is a frequent value for fundingstatus, it may be better to rewrite this query like this:

SELECT  INSTALLER, INSTALLTIME, RESULT, JOBNUMBER, HONAME, ADDRESS, CITY, STATE, ZIP, NOTES, SMNOTES, '' as priority, PAFS, upsell, TERM, MMRUPGRADE, WARRANTY, EFT
FROM    ACCOUNTS 
WHERE   INSTALLDATE = '$date' AND FUNDINGSTATUS < 'DEAD'
UNION ALL
SELECT  INSTALLER, INSTALLTIME, RESULT, JOBNUMBER, HONAME, ADDRESS, CITY, STATE, ZIP, NOTES, SMNOTES, '' as priority, PAFS, upsell, TERM, MMRUPGRADE, WARRANTY, EFT
FROM    ACCOUNTS 
WHERE   INSTALLDATE = '$date' AND FUNDINGSTATUS > 'DEAD'
UNION
SELECT  technician, servicetime, result, ID, Customername, address, city, state, zip, notes, board, priority, '', '', '', '', '', ''
FROM    service 
WHERE   serviceday = '$date'
ORDER BY
        INSTALLER, priority

so that the range access on both fields (installdate, fundingstatus) can be used.
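
One way to check whether a branch really gets that access path is to run EXPLAIN on it. This is only a sketch: the date literal is a made-up stand-in for $date, the column list is shortened, and it assumes a composite index on (installdate, fundingstatus) already exists.

```sql
-- Hypothetical check of one branch; '2010-01-15' stands in for $date.
EXPLAIN
SELECT INSTALLER, INSTALLTIME, RESULT
FROM   ACCOUNTS
WHERE  INSTALLDATE = '2010-01-15' AND FUNDINGSTATUS < 'DEAD';
```

In the output, the key column shows which index the optimizer chose, type shows the access method (range when a range scan is used), key_len hints at how many index columns are actually used, and rows is the optimizer's estimate of how many rows it expects to examine.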

Quassnoi
Although this is correct, it doesn't actually answer the question he asked.
Allain Lalonde
DEAD isn't frequent yet, but it will be. there will also eventually be about 6-7 other funding statuses as well. i'm just curious though, how does the range access help (or more so, how does it work)? i've just been researching a lot of mysql lately and have only really come across the term a couple of times.
mlebrun15
The range access is more expensive **per row** (as a rule of thumb, `10` times as expensive). This means that if your condition filters out `90%` of rows or more, then `range` access will be better.
Quassnoi
in what sense do you mean expensive?
mlebrun15
The range access uses the index, which does not contain the values of the columns in your `SELECT` list, so it needs to look them up in the table. Each such lookup takes about `10` times as long as reading the same row during a sequential scan of the table. If `100%` of records satisfy your condition and you force the range access path, the query will take about `10` times as long to complete as it would were the full index scan used.
Quassnoi
* full table scan used.
Quassnoi
so you're saying if i return only a small portion of rows, then the range access is a good idea, but if i'm returning a heavy amount of rows, then i should steer clear of the range access?
mlebrun15
`@mlebrun15`: Exactly.
Quassnoi
i love stack overflow. thanks for all the help and knowledge!
mlebrun15
A: 

Having an index on a field in the where clause can usually improve performance, whether by a lot or a little.

Your question of whether the index on "date" will be used despite the first query having "FUNDINGSTATUS" in the where clause has 2 answers:

  • If there are NO OTHER indexes on the table, then the date index will most definitely be used. That is because finding records with the specific date by index is a lot less work for the DB than searching the entire table, even if it does need to check the FUNDINGSTATUS after finding said records.

  • If there ARE other indexes on the same table, then the answer is "it depends".

    It mostly depends on what % of the data is not dead vs. what % of the data belongs to a given date.

    The optimizer usually tries to choose the index that will right away pick the smallest # of rows - e.g. if your table has data for 100 days and 1/2 of the rows are dead, then the date index will be chosen because it gives you 1% of the data without scanning the table vs. 50% of the data.
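
To get a feel for those percentages on your own data, a rough selectivity check along these lines may help. It is a sketch that uses only the table and column names from the question; the output aliases are made up.

```sql
-- Rough selectivity check; the aliases are hypothetical.
SELECT COUNT(*)                    AS total_rows,
       COUNT(DISTINCT installdate) AS distinct_dates,
       SUM(fundingstatus = 'DEAD') AS dead_rows   -- MySQL evaluates the comparison to 1 or 0
FROM   ACCOUNTS;
```

Dividing total_rows by distinct_dates gives the average number of rows per date, and dead_rows over total_rows gives the share of dead rows. The smaller of the two fractions points to the more selective condition.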

DVK