ansaurus

Question

Answer 1

A:

First of all, you don't need the DISTINCT in the subquery, since IN eliminates duplicates anyhow. Do you need the function call in the WHERE clause, and do you have an index on the date_created column?

What happens when you change

WHERE STR_TO_DATE(t1.date_created,'%d-%M-%Y') 
BETWEEN '2009-10-01' AND '2009-10-31')

to

WHERE t1.date_created >= '2009-10-01' 
AND t1.date_created < '2009-11-01'

Sometimes an index won't be used if you wrap the indexed column in a function in the WHERE clause.
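
A minimal sketch of that difference, assuming date_created were a real DATE or DATETIME column with an index on it (the index name idx_date_created is hypothetical, and DATE() stands in for any function applied to the column). Running EXPLAIN on both forms should show whether the index appears in the `key` column of the plan:

```sql
-- Assumption: date_created is a DATE/DATETIME column, not a string.
-- idx_date_created is a hypothetical index name.
CREATE INDEX idx_date_created ON roc_test_results (date_created);

-- Column wrapped in a function: the optimizer generally cannot use
-- the index for a range lookup and falls back to scanning rows.
EXPLAIN
SELECT t1.job
FROM roc_test_results AS t1
WHERE DATE(t1.date_created) BETWEEN '2009-10-01' AND '2009-10-31';

-- Bare column compared against constants: the index is eligible
-- for a range scan.
EXPLAIN
SELECT t1.job
FROM roc_test_results AS t1
WHERE t1.date_created >= '2009-10-01'
  AND t1.date_created <  '2009-11-01';
```

The half-open range (>= the first day, < the first day of the next month) also covers any time-of-day values on the last day of October.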

SQLMenace 2010-06-04 15:40:27

Yes, I need the WHERE clause. Basically it pulls all of the RMAs that shipped in that period, and then I gather all historical data related to that list of RMAs. I do have an index, but it's just an auto-incremented index; how would I incorporate that? Sorry, I am self-taught, so I'm not too savvy with the details. I read the manual on indexes but still didn't see how I could incorporate an index that doesn't relate to the data.

Geoff 2010-06-04 15:50:29

I asked whether you needed the function, not the WHERE clause. The WHERE clause I posted should be equivalent to yours but should run much faster.

SQLMenace 2010-06-04 15:55:23

STR_TO_DATE converts a string to a date, implying that `date_created` is a VARCHAR/TEXT/etc. data type, so the function would be necessary.

OMG Ponies 2010-06-04 16:22:58

Oh, sorry, yes, I do need it. I didn't know how to use SET with LOAD DATA INFILE, so the date looks like 01-Oct-09. Unless there is a way to keep using that format without the function?

Geoff 2010-06-04 16:22:59

Also, it's probably worth mentioning that I have applications that use this data, so I can't really change it; that would break the applications that assume it is in the D-Mon-Year format.

Geoff 2010-06-04 16:36:54

Answer 2

A:

My advice is to replace the IN with a JOIN, and then consider adding indexes on some of your columns, such as job, and maybe operation and/or result. You should read up on indexes in the MySQL manual, and also on using EXPLAIN to optimize your queries:

http://dev.mysql.com/doc/refman/5.1/en/indexes.html

http://dev.mysql.com/doc/refman/5.1/en/using-explain.html

Here's an example of converting the IN to a JOIN:

SELECT DISTINCT t2.* 
FROM roc_test_results AS t2
INNER JOIN roc_test_results AS t1 ON t1.job = t2.job
WHERE t1.operation = 'TEST' 
AND t1.result = 'Passed' 
AND STR_TO_DATE(t1.date_created,'%d-%M-%Y') BETWEEN '2009-10-01' AND '2009-10-31';
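
As a rough sketch of the index-plus-EXPLAIN advice, assuming job is not yet indexed (idx_job is a hypothetical index name): create the index, then prefix the join with EXPLAIN and look at the `key` and `rows` columns of the output to see whether the join on job uses it.

```sql
-- Hypothetical index on the join column.
CREATE INDEX idx_job ON roc_test_results (job);

-- EXPLAIN shows the plan instead of running the query; check the
-- `key` and `rows` columns for each of the two table aliases.
EXPLAIN
SELECT DISTINCT t2.*
FROM roc_test_results AS t2
INNER JOIN roc_test_results AS t1 ON t1.job = t2.job
WHERE t1.operation = 'TEST'
  AND t1.result = 'Passed'
  AND STR_TO_DATE(t1.date_created, '%d-%M-%Y')
      BETWEEN '2009-10-01' AND '2009-10-31';
```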

Ike Walker 2010-06-04 16:02:22

STR_TO_DATE converts a string to a date, implying that `date_created` is a VARCHAR/TEXT/etc. data type, so the function would be necessary. An index on the column won't be used because the column is being converted to a different data type.

OMG Ponies 2010-06-04 16:29:01

Forgive me if this sounds like a dumb question, but if the date isn't the leftmost prefix of the index and I'm not including the primary index, how will that help? Also, I'm not sure your second paragraph will return what I'm looking for. I first gather a list of RMAs that passed TEST in the date range; then I need all rows, from any date range, that pertain to those RMAs.

Geoff 2010-06-04 16:34:01

@OMG Ponies: You are correct. I didn't look closely enough at the function. I just assumed it was converting a DATETIME to a DATE. So the index won't help.

Ike Walker 2010-06-04 16:52:05

@Geoff: The index won't help. I didn't look closely enough at your query. See my previous comment. As for the way I rewrote the query, again I didn't look closely enough at the original, so you are correct that it won't give you what you want. I'll rewrite it with a join.

Ike Walker 2010-06-04 16:53:47

Answer 3

+1 A:

The date_created column needs to be changed to a DATETIME data type before it's worth defining an index on it. As long as every query has to convert the column from a string to a date, as yours currently does, an index on the column will be worthless.

You've mentioned that you're using LOAD DATA INFILE, and that the source file contains dates in DD-MON-YY format. MySQL will implicitly convert strings into DATETIME if the YY-MM-DD format is used, so if you can correct this in your source file before using LOAD DATA INFILE, the rest should fall into place.
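
If editing the source file is not practical, the conversion can also happen during the load itself by reading the raw text into a user variable and converting it in a SET clause. This is only a sketch: the file path, the field delimiter, the header line, and the column list are assumptions, and the format string '%d-%b-%y' is chosen to match a value such as 01-Oct-09.

```sql
-- Assumptions: hypothetical file path, comma-delimited fields, one
-- header line, and a four-column layout; date_created is a DATE or
-- DATETIME column in the target table.
LOAD DATA INFILE '/path/to/results.csv'
INTO TABLE roc_test_results
FIELDS TERMINATED BY ','
IGNORE 1 LINES
(job, operation, result, @raw_date)
-- %d = day of month, %b = abbreviated month name, %y = two-digit year
SET date_created = STR_TO_DATE(@raw_date, '%d-%b-%y');
```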

After that, a covering index using:

job
operation
result
date_created

...would be a good idea.
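
A sketch of such an index, assuming date_created has already been converted to a DATE or DATETIME column. The index name is hypothetical, and the column order simply follows the list above; it is worth checking with EXPLAIN whether a different order suits the query better.

```sql
-- Hypothetical index name; column order follows the list above.
CREATE INDEX idx_job_op_result_date
    ON roc_test_results (job, operation, result, date_created);

-- A lookup that reads only these four columns can be answered from
-- the index alone, without touching the table rows.
EXPLAIN
SELECT DISTINCT t1.job
FROM roc_test_results AS t1
WHERE t1.operation = 'TEST'
  AND t1.result = 'Passed'
  AND t1.date_created >= '2009-10-01'
  AND t1.date_created <  '2009-11-01';
```

When the index covers the query, the Extra column of the EXPLAIN output typically reports "Using index".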

OMG Ponies 2010-06-04 18:33:05

Correct. I know now that I can change the format during the LOAD DATA INFILE using @ and SET, but I did not do that when I added the original data. Since I already have applications that depend on its format, I'm kind of stuck. I either have to leave it the way it is and wait forever for the results, with a horribly indexed table, or go through my applications and change all of the queries by removing the STR_TO_DATE, since it won't be needed anymore (I'll probably do the latter). This covering index looks very interesting; thanks for pointing that out.

Geoff 2010-06-04 19:02:50


Query Optimization using WHERE IN
