views:

1236

answers:

3

I'm in over my head with a big mysql query (mysql 5.0), and i'm hoping somebody here can help.

Earlier I asked how to get distinct values from a joined query http://stackoverflow.com/questions/508707/mysql-count-only-for-distinct-values-in-joined-query

The response I got worked (using a subquery with join as)

select *
from media m
inner join
     ( select uid
     from users_tbl
     limit 0,30) map
  on map.uid = m.uid
inner join users_tbl u
  on u.uid = m.uid

unfortunately, my query has grown more unruly, and though I have it running, joining into a derived table is taking too long because there is no indexes available to the derived query.

my query now looks like this

SELECT mdate.bid, mdate.fid, mdate.date, mdate.time, mdate.title, mdate.name, 
       mdate.address, mdate.rank, mdate.city, mdate.state, mdate.lat, mdate.`long`,
       ext.link, 
       ext.source, ext.pre, meta, mdate.img
FROM ext
RIGHT OUTER JOIN (
  SELECT media.bid, 
         media.date, media.time, media.title, users.name, users.img, users.rank, media.address, 
         media.city, media.state, media.lat, media.`long`,
         GROUP_CONCAT(tags.tagname SEPARATOR ' | ') AS meta
  FROM media
  JOIN users ON media.bid = users.bid
  LEFT JOIN tags ON users.bid=tags.bid
  WHERE `long` BETWEEN -122.52224684058 AND -121.79760915942
    AND lat BETWEEN 37.07500915942 AND 37.79964684058
    AND date = '2009-02-23'
  GROUP BY media.bid, media.date
  ORDER BY media.date, users.rank DESC
  LIMIT 0, 30
) mdate ON (mdate.bid = ext.bid AND mdate.date = ext.date)

phew!

SO, as you can see, if I understand my problem correctly, i have two derivative tables without indexes (and i don't deny that I may have screwed up the Join statements somehow, but I kept messing with different types, is this ended up giving me the result I wanted).

What's the best way to create a query similar to this which will allow me to take advantage of the indexes? Dare I say, I actually have one more table to add into the mix at a later date.

Currently, my query is taking .8 seconds to complete, but I'm sure if I could take advantage of the indexes, this could be significantly faster.

+1  A: 

Starting fresh:

Question - why are you grouping by both media.bid and media.date? Can a bid have records for more than one date?

Here's a simpler version to try:

SELECT 
    mdate.bid,
    mdate.fid,
    mdate.date,
    mdate.time,
    mdate.title,
    mdate.name, 
    mdate.address,
    mdate.rank,
    mdate.city,
    mdate.state,
    mdate.lat,
    mdate.`long`,
    ext.link, 
    ext.source,
    ext.pre, 
    meta,
    mdate.img,
    (   SELECT GROUP_CONCAT(tags.tagname SEPARATOR ' | ')
        FROM tags 
        WHERE ext.bid = tags.bid
        ORDER BY tags.bid GROUP BY tags.bid
    ) AS meta

FROM 
    ext

LEFT JOIN
    media ON ext.bid = media.bid AND ext.date = media.date

JOIN
    users ON ext.bid = users.bid

WHERE 
    `long` BETWEEN -122.52224684058 AND -121.79760915942
    AND lat BETWEEN 37.07500915942 AND 37.79964684058
    AND ext.date = '2009-02-23'
    AND users.userid IN
    (    
        SELECT userid FROM users ORDER BY rank DESC LIMIT 30
    )

ORDER BY 
    media.date, 
    users.rank DESC
    LIMIT 0, 30
le dorfier
Well, I'm not grouping on anything, I just formatted the query so that it's readable :)
Quassnoi
thanks le dorfier, i've got things working pretty well now, but when I get past my current deadline, i'll try re-writing the query your way and see if I can get that working. It does look simpler.
pedalpete
+1  A: 

First, check for indices on ext(bid, date), users(bid) and tags(bid), you should really have them.

It seems, though, that it's LONG and LAT that cause you most problems. You should try keeping your LONG and LAT as a (coordinate POINT), create a SPATIAL INDEX on this column and query like that:

WHERE MBRContains(@MySquare, coordinate)

If you can't change your schema for some reason, you can try creating additional indices that include date as a first field:

CREATE INDEX ix_date_long ON media (date, `long`)
CREATE INDEX ix_date_lat ON media (date, lat)

These indices will be more efficient for you query, as you use exact search on date combined with a ranged search on axes.

Quassnoi
pedalpete
If you want SPATIAL, you should keep you coordinates as a special type, GEOMETRY (POINT, in your case). And the fact that LAT/LONG index did help tells us you really want SPATIAL :)
Quassnoi
See updated post for reference to GEOMETRY types.
Quassnoi
A: 

You might want to compare your perforamnces against using a temp table for each selection, and joining those tables together.

create table #whatever create table #whatever2

insert into #whatever select... insert into #whatever2 select...

select from #whatever join #whatever 2

....

drop table #whatever drop table #whatever2

If your system has enough memory to hold full tables this might work out much faster. It depends on how big your database is.

Myforwik