I have this SQL query which, due to my own lack of knowledge and MySQL's trouble handling nested queries, is really slow to process. The query is:

SELECT    DISTINCT PrintJobs.UserName
FROM      PrintJobs
LEFT JOIN Printers
ON        PrintJobs.PrinterName = Printers.PrinterName
WHERE     Printers.PrinterGroup
IN        (
              SELECT    DISTINCT Printers.PrinterGroup
              FROM      PrintJobs
              LEFT JOIN Printers
              ON        PrintJobs.PrinterName = Printers.PrinterName
              WHERE     PrintJobs.UserName='<username/>'
          );

I would like to avoid splitting this into two queries and inserting the values of the subquery into the main query programmatically.

+3  A: 

This is probably not exactly what you are looking for, but I will contribute my 2 cents. First off, you should show us your schema and exactly what you are trying to accomplish with that query. From the looks of it, you are not using numeric IDs in the tables and are instead joining on varchar fields, which is not a good idea performance-wise. Also, I am not sure why you are doing:

(select PrinterName, UserName
      from PrintJobs) AS Table1

instead of just joining on PrintJobs? Similar stuff for this one:

(select
      PrinterName,
      PrinterGroup
      from Printers) as Table1

Maybe I am just not seeing it right. I would recommend that you simplify the query as much as possible and try it. Also, tell us what exactly you are hoping to accomplish with the query and give us some schema to work with.

Removed the bad query from the answer.

Sabeen Malik
Agree, Table1 looks suspicious. Vote for removing it =)
Yacoder
The idea is to select all users who are members of the same 'Print Groups' as a user <username/>. Table PrintJobs (id, UserName, PrinterName), table Printers (PrinterName, PrinterGroup). The PrintJobs table contains a lot of other fields that are not relevant. I was hoping that it would be better to just join to the subquery. I guess not, so I will remove it.
Matthew
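For reference, the schema described in the comment above could be sketched roughly as follows. Only the table and column names come from the thread; the column types and lengths are assumptions, and the other PrintJobs fields are left out.

```sql
-- Sketch of the two tables as described in the thread.
-- Column types and lengths are assumed, not taken from the real schema.
CREATE TABLE Printers (
    PrinterName  VARCHAR(255),  -- join column; about 350 rows per the thread
    PrinterGroup VARCHAR(255)
);

CREATE TABLE PrintJobs (
    id          INT,
    UserName    VARCHAR(255),
    PrinterName VARCHAR(255)    -- not unique; about 1,000,000 rows per the thread
    -- ...other fields not relevant to this query
);
```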
Ok that looks a lot better and it saves about 1/2 a second on the subquery and the main query individually.
Matthew
Great. I have to rush out, but I suggested a simpler query. Please try it and let me know if it works better.
Sabeen Malik
I gave it a try. The query I currently have above runs in 5 min 50.60 sec, but this one has been running for more than 15 minutes and is still going. Thanks for your help :)
Matthew
OK, I thought about it and it's a bad query anyway, sorry about that. One thing though: is the PrinterName column indexed? If not, can you index it and tell me if you see improvements? How many rows do you have in each table?
Sabeen Malik
I gave up after about 30 minutes so I stopped it, there wasn't an improvement unfortunately. PrintJobs has about 1,000,000 rows and Printers has about 350. The PrinterName column is not indexed in either case. I could probably index it in the Printers table but it is not unique in the PrintJobs table.
Matthew
Perhaps the creation of a table view may work around MySQL's handling of nested queries?
Matthew
OK, that's a huge number, no wonder the query is really slow. So you are saying that when you tried to index it, it took 30 minutes and you stopped it after that? Is it possible for you to change the table structure, or are you stuck with this? When we are dealing with data of this size, it's important that the foreign key and primary key are numeric instead of varchar.
Sabeen Malik
The problem is not really with the nested query but with joining based on varchar data, so even if you take this to a view, it might not show much difference.
Sabeen Malik
BTW, yours is an interesting case, and query performance optimization is something I love doing. Do you want to talk on IM?
Sabeen Malik
Indexing the UserName and PrinterName fields on the PrintJobs table has made a good improvement in speed. Thanks for the suggestion.
Matthew
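The indexing step that helped here might look something like the statements below. This is only a sketch: the index names are made up for illustration, and the third index (on Printers) is the one the asker said he could probably add, not one he confirmed adding. Note that a plain index does not require the column to be unique, so PrinterName repeating across PrintJobs rows is not a problem.

```sql
-- Hypothetical index names; table and column names are from the thread.
-- These are ordinary (non-unique) indexes, so duplicate values are allowed.
CREATE INDEX idx_printjobs_username    ON PrintJobs (UserName);
CREATE INDEX idx_printjobs_printername ON PrintJobs (PrinterName);

-- Optional, not confirmed in the thread: index the join column on the small table too.
CREATE INDEX idx_printers_printername  ON Printers (PrinterName);
```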
+1  A: 

The query you have is pretty messed up. I'm not sure if this will handle everything you need, but simplifying it like this kills all the nested queries and is way faster. You can also use the EXPLAIN command to see how MySQL will execute your query.

SELECT    DISTINCT PrintJobs.UserName
FROM      PrintJobs

LEFT JOIN Printers ON PrintJobs.PrinterName = Printers.PrinterName
AND Printers.Username = '<username/>'
;
Rodrigo
I love it. Except it doesn't appear to work. (I assume you meant PrintJobs.Username =). This returns all distinct users in the PrintJobs table.
Matthew
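For anyone landing here with the same problem, one way to drop the IN (...) subquery without changing the result is to express it purely as joins. This is a sketch under the schema described in the comments, not something tested in the thread; the table aliases (mine, my_printers, group_printers, others) are made up for readability. The original LEFT JOINs behave like inner joins here, because a row with no matching printer has a NULL PrinterGroup and can never satisfy the IN test.

```sql
-- Users who have printed to any printer in the same group(s)
-- as a printer that <username/> has printed to.
SELECT DISTINCT others.UserName
FROM   PrintJobs AS mine
JOIN   Printers  AS my_printers
  ON   my_printers.PrinterName = mine.PrinterName
JOIN   Printers  AS group_printers
  ON   group_printers.PrinterGroup = my_printers.PrinterGroup
JOIN   PrintJobs AS others
  ON   others.PrinterName = group_printers.PrinterName
WHERE  mine.UserName = '<username/>';
```

Prefixing the statement with EXPLAIN shows which indexes, if any, MySQL picks for each of the four table references, which is a quick way to check whether indexes on UserName and PrinterName are actually being used.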