tags:

views:

154

answers:

3

Hi,

I'm writing a particularly troublesome query. It boils down to this:

I obtain a table of the structure:

pid | tid | points

after a pretty large query.

For ease of explanation:

  • pid = problem id
  • tid = team id
  • points = points awarded to that team for that problem.

I want to find the team who has scored max points for a particular pid.

My question is twofold:

  1. If this were a simple table that went by the name teampoints, how do I get the tid that has MAX(points) for every pid? I tried SELECT pid, tid, MAX(points) from teampoints group by pid; but understandably, that would not work

  2. I've arrived at this result after a rather large query. If the answer to my first involves selecting data from teampoints again, is there any way to do that without having to calculate the whole table again?

Thanks

PS: I use mysql.


GORY DETAILS: TREAD WITH CAUTION

I have a few tables in my system, their relevant structures being:

users: uid
teams: tid | eid | teamname
teammembers: tid | uid
events: eid
problems: pid | eid
submissions: subid | pid | uid | eid | points | subts

Some notes: - problems belong to events - users belong to teams - submissions belong to problems(pid) and users(uid). the submissions table has a redundant eid field, which can always be determined from the pid.

The use case is:

  • users form teams. users are identified by uid, teams by tid. Team members are stored in teammembers table.
  • users can make submissions, which are stored in submissions table. submissions are awarded points. subts is the unix timestamp of when the submission was made.
  • users can submit multiple times for the same problem. the latest submission (max subts) is counted.

now, in this set up I want to find the teamname that has scored maximum points for any given event (eid).

I hope this makes my situation clear. I wanted to ask only what I needed to know. I furnish these details up an a request in the comments.

EDIT: the query that generated the teampoints table is:

SELECT s.pid, teamlatest.tid, s.points 
  FROM  submissions s, teammembers tm, teams t, 
      (SELECT max(maxts) AS maxts, pid, tid 
         FROM (SELECT latest.maxts, latest.pid, t.tid 
                 FROM submissions s, teams t, teammembers tm,
                     (SELECT max(subts) AS maxts, pid, uid 
                        FROM submissions 
                        WHERE eid=3 AND status='P' 
                        GROUP BY pid, uid
                     ) AS latest
                 WHERE s.uid=latest.uid 
                   AND s.pid=latest.pid 
                   AND s.subts=latest.maxts 
                   AND latest.uid=tm.uid 
                   AND tm.tid=t.tid 
                   AND t.eid=3
              ) AS latestbyteam
         GROUP BY pid, tid) AS teamlatest
  WHERE s.pid=teamlatest.pid 
    AND teamlatest.tid=t.tid 
    AND t.tid=tm.tid 
    AND tm.uid=s.uid 
    AND s.subts=teamlatest.maxts
A: 
select pid, tid, points
from teampoints tp
where not exists (
    select 1
    from teampoints tp1
    where tp.pid = tp1.pid
    and tp.points < tp1.points)

or somesuch ....

Phil Wallach
A: 

You could store the results of your first query in a temporary table and try something like this

SELECT  pid, tid, points
FROM    teampoints tp
        INNER JOIN (
          SELECT    pid, points = MAX(points) 
          FROM      teampoints 
          GROUP BY  pid
        ) tp_max ON tp_max.pid = tp.pid AND tp_max.points = tp.points

Note that you will get doubles when two teams have equal points on a project.

If you would post your query, it would be easier for us trying to optimize it io trying to invent it all over.

Lieven
store into a temporary how?
Here Be Wolves
and yes, i'll post the query that generated `teampoints`.
Here Be Wolves
+3  A: 
  1. One way:

    SELECT pid, tid, points 
      FROM teampoints
      WHERE (pid, points) IN (
          SELECT pid, MAX(points) 
            FROM teampoints GROUP BY pid
      )
    

    Another, using joins:

    SELECT s1.*
        FROM teampoints AS s1
        LEFT JOIN teampoints AS s2
           ON s1.pid = s2.pid
           AND s1.points < s2.points
        WHERE s2.tid IS NULL
    
  2. You can INSERT INTO a temporary table for the complex query:

    CREATE TEMPORARY TABLE scores (
        pid INT, tid INT, points INT,
        KEY pp (pid, points)
    );
    INSERT INTO scores (pid, tid, points) 
        SELECT <a complex query>
    

    then SELECT the top scorers from that.

outis
I see that in yoru queries, you are having to join `teampoints` with itself. In my case, i'm arriving at `teampoints` after a large query. How do I store it into a temp using `SELECT INTO`?
Here Be Wolves
Turns out MySQL doesn't support `SELECT ... INTO`, but it does support `INSERT INTO ... SELECT`
outis
Okay, so that is multiple queries.. which means I'll need to issue more than one mysql_query() function calls (i'm using php). If I created a view called `teampoints`, would it be equivalent (in performance)?
Here Be Wolves
The old mysql driver is terribly outdated. mysqli supports multiple statements in a single query via `mysqli::multi_query`. PDO support for multi-queries is spotty; depending on your PHP version, you might need to use the PDO_MySQLnd driver. Both also support prepared statements, which are a big win for a number of reasons. As for views, I don't believe they're cached, so will perform worse than a temporary table.
outis