tags:

views:

726

answers:

6

I have a MySQL table that holds many entries with repeated IDs (for various reasons) So you might have something like

ID TIME DATA
1  xx   xx
2  xx   xx
3  xx   xx
1  xx   xx
3  xx   xx

What query can I run through PHP to select each ID only once? So I would like my result set to look like

ID TIME DATA
1  xx   xx
2  xx   xx
3  xx   xx
+2  A: 

Take a ganders at SELECT DISTINCT

Aiden Bell
You guys are awesome!DISTINCT works perfectly. Group By is also good, but I think I need distinct in this case.Thanks to everyone who replied!
Didn't see that green coming :) Nice.
Aiden Bell
Heh, I'd be interested in seeing the actual DISTINCT query that solved this problem!:)
Andomar
There isn't much too it Andomar...$sql = "SELECT DISTINCT post_id FROM vote_records WHERE title LIKE '".$query1."' OR title LIKE '".$query2."' OR title LIKE '".$query3."' OR title LIKE '".$query4."' OR title LIKE '".$query5."'";
@Rudebuyrock ... Would a JOIN on the distinct ids not be better?
Aiden Bell
Aaah so you're just looking for the id, not the other columns. That explains it :)
Andomar
But he still has the TIME and DATA columns in the result in his question.
Sinan Ünür
A: 

The SQL Query would look something like this:

SELECT DISTINCT (ID), TIME, DATA FROM TableName
Matthew Jones
That would select both records (1,'2009-05-16','x'),(1,'2009-04-11','y') because they differ in TIME and DATA.
VolkerK
I KNEW i forgot something. Thanks!
Matthew Jones
nope, the () won't fix it ;-)
VolkerK
Which is why I don't program in MySQL; it's too permissive. Oh well.
Matthew Jones
@Matthew Jones: This has nothing to do with MySQL. DISTINCT works the same in standard SQL as it does in MySQL, or any other brand of database. DISTINCT *always* applies to the whole row, not to a single column, even if you use parentheses.
Bill Karwin
A: 

This design is wrong in every way. It breaks the rules for first normal form. Unless those reasons were really compelling, I'd advise you to redesign this schema.

duffymo
You can't infer that. It could be that ID isn't the primary key by itself, in which case the table could still be in normal form.
Bill Karwin
ID is indeed not the primary key...and I haven't written all the fields, just to save typing...but as you can infer from the time field, IDs have to be repeated every time period...and so do the times for every IDI would gladly accept any suggestions to a better design. Any ideas?
I'm inferring it because it's repeated data, not simply keys. Looks like a one-to-many relationship between this table and an (id, time, data) triple is an order to me. Sorry, I'm still not convinced that this is a good design. Haven't heard anything to change it.
duffymo
A: 

Add a real primary key to the table, maybe called RealId.

Then you can:

select *
from YourTable
where realid in (
    select min(realid)
    from YourTable
    group by id
)

Which would give you one row per id.

Andomar
+1  A: 

It sounds like you're looking for GROUP BY.

You can use a query similar to:

SELECT id, time, data GROUP BY ID

You'll definitely have to do some tweaking to get that query to work with the structure you have but that's the basic idea.

Tizag has a great reference for using GROUP BY with MySQL and PHP.

T Pops
+1 This would actually work in MySQL, because MySQL handles GROUP BY in a nonstandard way.
Bill Karwin
Is it actually guaranteed to return time + data from the same row?
Andomar
No, it makes no guarantee about which row it takes the values from. In practice, it takes them from the row that is stored physically first. If that's what the OP wants, then he can use that (assuming MySQL doesn't change behavior in a future version).
Bill Karwin
See http://stackoverflow.com/questions/1023347/mysql-selecting-a-column-not-in-group-by/1023435#1023435
Bill Karwin
But once it picks a row, it will take all columns from that row? It wouldn't take time from row 1 and data from row 4?
Andomar
Andomar, it should take whichever row was physically first in the DB. I believe rows are stored with all the columns together so there's no way for row 1's DATA column to come before row 2's DATA column. I think the asker's question is too vague for a definitive answer but this is as close as I could come.
T Pops
The tricky part is when you have aggregate functions like MAX() in your query. The non-aggregated columns still take their values from the first row in the group, even if that isn't the row with the MAX() value.
Bill Karwin
Anyway, it's moot because the OP's actual problem is totally different from what he asked. >:-[
Bill Karwin
+3  A: 

The suggestion given by @karim79 and @Matthew Jones, that DISTINCT(ID) can fix this problem, is a common misconception about the way DISTINCT works in SQL. It doesn't help, because DISTINCT always applies to the whole row, not a single column. The parentheses are irrelevant, as they would be in the queries SELECT (1) versus SELECT 1.

The answer given by @T Pops is actually helpful in this case, because MySQL handles GROUP BY in a nonstandard way. See my answer to "Selecting a Column not in GROUP BY" for an explanation.

Another solution is to use LEFT OUTER JOIN creatively to query for the first row per ID. For instance, assuming that the TIME column is unique for a given ID, you could do the following:

SELECT t1.*
FROM MyTable t1 LEFT OUTER JOIN MyTable t2
  ON (t1.ID = t2.ID AND t1.TIME > t2.TIME)
WHERE t2.ID IS NULL;

If t1 is pointing to the earliest row in the group, then there can be no matching row in t2 with an earlier date. Therefore, the LEFT OUTER JOIN will find no match, and leave t2.* as NULL.

Bill Karwin
Excellent explanation. I'll come back and vote you up in about eight hours ;-)
Sinan Ünür
@Sinan: Is that a comment that JOIN would have poor performance? You should try using EXPLAIN to view the optimization plan of a query like this. Then compare it to the plan for a GROUP BY or DISTINCT query, which usually have to use a temp table.
Bill Karwin
@Bill I ran out of votes for today, so I could not upvote your answer. It is important to make the point that DISTINCT is not an appropriate "solution" here. And the rest of your explanation is very clear. There was no sarcasm in my response, just genuine regret that I could not upvote you right at that moment.
Sinan Ünür
Thanks! LOL -- I thought you were being snarky, as if you expected the query to take eight hours to finish, and you were waiting for that before casting a vote. :-)
Bill Karwin