ansaurus

Question

Is this SQL select code following good practice?

Answer 1

A:

You can use joins to achieve the same thing, and I would expect joins to work a lot more efficiently than embeded selects.

Tom Gullen 2010-06-18 15:37:05

How? i have no column to select

acidzombie24 2010-06-18 15:46:13

Answer 2

+1 A:

I'm guessing that you're trying to ensure that a user only votes once on each post here.

I wouldn't - I don't - use separate tables for up votes and down votes. Add vote type to your votes table and you won't need correlated subqueries.

coolgeek 2010-06-18 15:40:07

Actually i am allowing an 'up' an 'down' vote. Its really agree and disagree and you are allowed to do both. Since i can do both should i stick them in the same table? i was thinking its easier to have in 2 tables then 1 table and do a check for IP addr and writing the extra AND voteType <> (or =) myVoteType.

acidzombie24 2010-06-18 15:49:34

Yikes. Given the improbability of a need arising for a third (fourth, etc) type of vote field, I would still go with one votes table, with two vote fields, one for up, one for down. One votes table record per user per post. Join votes with post on user id, selecting both votes fields and take the appropriate action in your application. Also, I trust that you're versed in the problems that can arise when using IP addresses for identification...

coolgeek 2010-06-18 21:26:03

Answer 3

+1 A:

Here is my opinions:

It seems that table "UpVotes" and "DownVotes" have same structure and can be merged into one table.
The relation between table "Post" and "Up/DownVotes" can be constrained by foreign key.
Although I am not sure about the performance difference, but I think it would be better to use "join" mechanism rather than nesting two select statement in a select statement.

ZEAXIF 2010-06-18 15:41:35

note i'd have two joins because its 2 values and not an or. The votes tables do use post as a FK and unless i have a column i can check (a bool saying up otherwise i know its down) i dont know how to write it as a join. Do i need that to be able to write it as a join?

acidzombie24 2010-06-18 15:45:39

Up/DownVotes keep in one table with a bool field is a good idea. It is not necessary to use join, but you may need it for performance boost when your database gone large.

ZEAXIF 2010-06-18 16:07:38

What do you mean? i would still have to join my list of votes with the post in both cases.

acidzombie24 2010-06-18 16:11:46

Oh, sorry I just mean SQL join statement (select ... join...).

ZEAXIF 2010-06-18 17:09:50

Answer 4

+1 A:

Since you're asking about good practices... the "upvotes" and "downvotes" appearing in your Posts table looks like you're duplicating data in your database. That's a problem, because now you always have to worry whether or not the data is in sync and correct. If you want to know the number of upvotes then count them, don't also store them in the Post table. I'm not positive that is what you're doing, but it's a guess.

Onto your query... You will probably get better performance using a JOINed subquery instead of how you have it. With the scalar subqueries as columns they have to be run once for every row that is returned. That could be a pretty big performance hit if you're returning a bunch of rows. Instead, try:

SELECT
    P.id,
    P.name,
    P.body,
    P.upvotes,
    P.downvotes,
    COALESCE(UV.cnt, 0) AS upvotes2,
    COALESCE(DV.cnt, 0) AS downvotes2
FROM
    dbo.Posts P
LEFT OUTER JOIN (SELECT post_id, COUNT(*) cnt FROM dbo.UpVotes GROUP BY post_id) AS UV ON UV.post_id = P.id
LEFT OUTER JOIN (SELECT post_id, COUNT(*) cnt FROM dbo.DownVotes GROUP BY post_id) AS DV ON DV.post_id = P.id

Compare it to your own query and see if it gives you better performance.

EDIT: A couple of other posters have advocated a single table for up/down votes. They are absolutely correct. That makes the query even easier and also probably much faster:

SELECT
    P.id,
    P.name,
    P.body,
    P.upvotes,
    P.downvotes,
    SUM(CASE WHEN V.vote_type = 'UP' THEN 1 ELSE 0 END) AS upvotes2,
    SUM(CASE WHEN V.vote_type = 'DOWN' THEN 1 ELSE 0 END) AS downvotes2,
FROM
    dbo.Posts P
LEFT OUTER JOIN Votes V ON
    V.post_id = P.id
GROUP BY
    P.id,
    P.name,
    P.body,
    P.upvotes,
    P.downvotes

Tom H. 2010-06-18 15:45:18

LEFT OUTER JOIN doesnt do what mine does. I am checking if i upvoted/downvotes the post by looking at my IP (its an fmylife like site which like a wiki doesnt require login). Isnt the () part in the outer join mean its doing a subquery? also when you say 'run once for every row that is returned' isnt that the same as a join which is applied on every row?

acidzombie24 2010-06-18 16:10:38

oh and the up/down count is maintained by triggers on insert, update and remove. There should never be a case where it is wrong but it isnt important and i figure i can run checks later on.

acidzombie24 2010-06-18 16:36:29

Nope, a JOIN is most definitely not the same - SQL deals with sets and a JOIN is a set operation, so it is much faster. Although there is a subquery, that query is only run once in this case and the results are then kept until SQL is done with them - it's because they're different kinds of subqueries (correlated vs. non-correlated). As for the triggers and the data never getting out of sync, I'll just say, "famous last words" ;)

Tom H. 2010-06-18 19:09:01

The first is a correlated subquery, it fires once for every row in the results set. The second is a derived table, it fires only once for the entire set of data. It is almost always significantly faster. Basically you don't want to do row-by-row processing in a database 99% of the time. Databases are optimized for set-based oprations.

HLGEM 2010-06-18 21:45:04

ansaurus

tags:

views:

answers:

Is this SQL select code following good practice?

related questions