tags:

views:

53

answers:

4

I'm trying to select a record with the most effective votes. Each record has an id, the number of upvotes (int) and the number of downvotes (int) in a MySQL database.

I know basic update, select, insert queries but I'm unsure of how to form a query that looks something like:

SELECT * 
  FROM topics 
 WHERE MAX(topic.upvotes - topic.downvotes)

Please excuse my made up SQL. The tutorials on SQL I find on the internet cover very basic stuff. Does anyone recommend a good book on this subject?

A: 

Does this do it for you?

SELECT *
FROM topics
ORDER BY topic.upvotes - topic.downvotes DESC
LIMIT 1;
Rob Farley
But please be careful with this, as you'll have a hard time ordering it by that field. You may want to have a field which stores upvotes-downvotes, so that you can index it.
Rob Farley
This way takes two - three times as much time run, unlike if you select the difference first and order by it.
Nick Gorbikoff
Why is that Nick? On SQL Server it would be no different. What does MySQL do differently?
Rob Farley
Rob not sure, I'm by no means an SQL expert. I just did a benchmark test ( running 2 kinds of queries side by side - mine first - so nothing was chached) and mine came out about at 0.203 or 0.446 ( when you use GROUP_CONCAT) of a second for about 250K records, while yours took about 0.817 of a second. I don't know the internals, but I think is that in my case calculation is once and kept in temp, and then if it's needed again db just uses that temp column for like order by or group by, versus in your case I think it does the calculation every time. Also you are selcting *, not certain fields.
Nick Gorbikoff
Ran out of room again. I may be totally wrong - but those are my guesses. I'm not a DB admin - I just happen to write a lot of reports for mysql and rails, and since rails (ruby 1.8.6) is slower than php or python or .NET, I have to optimize everywhere I can, and running concise sql queries and limiting number of them is usually the best way. :-)
Nick Gorbikoff
Ok. If you use `SELECT Id, upvotes, downvotes FROM ...` how does that perform?
Rob Farley
Or even just `SELECT Id FROM ...` - as I may not be interested in what the score is, I just want the top item.
Rob Farley
+1  A: 
  SELECT (upvotes-downvotes) AS NetVotes, *
  FROM topics
  ORDER BY NetVotes DESC LIMIT 1
Larry Lustig
+1  A: 

You are not using MAX() right.

Here is pretty fast query:

SELECT id, (upvotes - downvotes) AS popular  
FROM topics 
ORDER BY popular DESC LIMIT 1

To run an update:

UPDATE topics, 
    (here you put your select statement from above in parenthesis ) AS popular_query
SET topic.description = "I'm popular"
WHERE topics.id = popular_query.id

I just ran that on a table with 250,000 records (it's very similar - inventory usage - looking for a most popular part) and it took .203 of a second - on my dev machine - this is not even production server ( where it tppl 0.016 of a second)

UPDATE:

Yeah I didn't think about that possibility that you might have more than one top results.

SELECT GROUP_CONCAT(id) AS popular_ids, (upvotes - downvotes) AS popular 
FROM topics 
GROUP BY popular ORDER BY popular DESC  LIMIT 1

popular_ids - will contain popular records as a text field, that can be easily parsed out if you need to.

Nick Gorbikoff
Wow, thanks for the help. I appreciate the rough benchmark. How would I do an update on that very same record?
Frank
If you go with the first approach, and don't care that you might have multiple results you can just do and update with select:see above - I just edited it.However if you are doing second approach, I would recommend parsing out popular ids in your language of choice, and the doing a separate update statement, updating all records.
Nick Gorbikoff
+2  A: 

There may be more than 1 record that match that condition. In order to get them all you could do something like this in mysql:

SELECT * 
  FROM topics 
 WHERE upvotes - downvotes = (select MAX(upvotes - downvotes) from topics)