How can you select question_id and tags for a question when one of the tags is known?

I am building a tag search that finds questions by a single tag. The result should show each matching question together with all of its tags, including the tag that was used in the search.

I use the same tables as in this question. They are:

Tables

questions          |     tags
-------------------|-----------------
  question_id      |     tag
  title            |     question_id
  was_sent_at_time |

The query

    SELECT question_id, tag
    FROM tags
    WHERE question_id IN 
    ( 
        SELECT question_id
        FROM questions
        ORDER BY was_sent_at_time
        DESC LIMIT 50
    )
    AND tag = $1;          -- Problem here

The problem with this query is that it returns only the searched tag and not the other tags assigned to each question. How can I get the question_id and all tags of every question that has the given tag?

+1  A: 

You should be able to do this work in the database. You don't want to do it in the application layer.

ID | Questions
---|----------
 1 | How much does a duck weigh?
 2 | What is your gender?
 3 | What is your duck's gender?

Question ID | Tags
------------|-------
 1          | Duck
 2          | Gender
 3          | Duck
 3          | Gender
 

Note that tag names are duplicated because of your schema design.

So to get all questions about ducks (questions 1 and 3), you would run:

    SELECT *
    FROM tags t
    INNER JOIN questions q ON t.question_id = q.question_id
    WHERE t.tag = 'Duck'
    ORDER BY was_sent_at_time DESC
    LIMIT 50

Byron Whitlock
Your duplicate code confused me. It is now clear.
Masi
+1  A: 
    SELECT TOP 50 q.question_id, q.title, t.tag
    FROM tags t
    INNER JOIN questions q
    ON t.question_id = q.question_id
    WHERE q.question_id IN 
    ( 
        SELECT tin.question_id
        FROM tags tin
        WHERE tin.tag = $1
    )
    ORDER BY q.was_sent_at_time DESC

That answers your question (if I understood it correctly), but I think you will get too much duplicate data: you don't really want the question title repeated for each tag. So you should break it out into two result sets:

    SELECT TOP 50 q.question_id, q.title
    FROM questions q
    WHERE q.question_id IN 
    ( 
        SELECT tin.question_id
        FROM tags tin
        WHERE tin.tag = $1
    )
    ORDER BY q.was_sent_at_time DESC

and:

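    SELECT t.question_id, t.tag
    FROM tags t
    WHERE t.question_id IN
    (
        /*
        Same SQL as above, selecting only q.question_id
        (or select from a temp table that the result above was saved to)
        */
        SELECT TOP 50 q.question_id
        FROM questions q
        WHERE q.question_id IN
        (
            SELECT tin.question_id
            FROM tags tin
            WHERE tin.tag = $1
        )
        ORDER BY q.was_sent_at_time DESC
    )
    ORDER BY t.question_id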

Then when you build your page or whatever, you would bring them together.
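
As a rough illustration of "bringing them together", here is a minimal sketch in Python using the standard-library `sqlite3` module. It is only one possible way to do it. The function name `search_by_tag`, the sample rows, and the timestamps are made up for the example, and because SQLite has no `TOP`, the sketch uses `LIMIT` and `?` placeholders instead of `TOP 50` and `$1`.

```python
import sqlite3
from collections import defaultdict

# In-memory database with the thread's two tables; the rows and dates are sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (question_id INTEGER PRIMARY KEY, title TEXT, was_sent_at_time TEXT);
    CREATE TABLE tags (tag TEXT, question_id INTEGER);
    INSERT INTO questions VALUES
        (1, 'How much does a duck weigh?', '2009-08-01'),
        (2, 'What is your gender?',        '2009-08-02'),
        (3, 'What is your duck''s gender?', '2009-08-03');
    INSERT INTO tags VALUES ('Duck', 1), ('Gender', 2), ('Duck', 3), ('Gender', 3);
""")

def search_by_tag(conn, tag, limit=50):
    """Hypothetical helper: return (question_id, title, [tags]) for questions having `tag`."""
    # Result set 1: the newest questions that carry the searched tag.
    questions = conn.execute(
        """
        SELECT q.question_id, q.title
        FROM questions q
        WHERE q.question_id IN (SELECT tin.question_id FROM tags tin WHERE tin.tag = ?)
        ORDER BY q.was_sent_at_time DESC
        LIMIT ?
        """,
        (tag, limit),
    ).fetchall()
    if not questions:
        return []

    # Result set 2: every tag of those questions, not only the searched one.
    ids = [question_id for question_id, _ in questions]
    placeholders = ",".join("?" * len(ids))
    tag_rows = conn.execute(
        f"SELECT t.question_id, t.tag FROM tags t "
        f"WHERE t.question_id IN ({placeholders}) ORDER BY t.question_id, t.tag",
        ids,
    ).fetchall()

    # Bring the two result sets together in the application.
    tags_by_question = defaultdict(list)
    for question_id, tag_name in tag_rows:
        tags_by_question[question_id].append(tag_name)
    return [(qid, title, tags_by_question[qid]) for qid, title in questions]

results = search_by_tag(conn, "Duck")
for question_id, title, tag_list in results:
    print(question_id, title, tag_list)
```

With this sample data, searching for `Duck` returns questions 3 and 1 (newest first), and question 3 is listed with both `Duck` and `Gender`.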

JBrooks
**What is the purpose of the words `TOP 50` in your `SELECT` -query?**
Masi
it is the same as limit 50 in sql server.
Byron Whitlock
I'm pretty sure it's the other way around: limit is pgsql, top is mssql.
Kev
It seems that aliasing the table `tags` as `t` is unnecessary, and likewise aliasing the same table as `tin` inside the subquery. **Is clarity the only reason for the aliases?**
Masi
Yes, `TOP 50` is MSSQL's way of saying `LIMIT 50`.
JBrooks
I always prefix the column name with the table name or alias. This is just easier and clearer if you have a short alias. I do the prefix because sometimes it’s not clear which table the column belongs to. And if someone comes along and adds the same column name to another table used in your query it will break your SQL (Ambiguous column name error).
JBrooks
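
The ambiguous-column error mentioned in the comment above can be reproduced with a small sketch. It uses Python's standard-library `sqlite3` module as a stand-in for whichever database you run. The sample row is invented, and the exact wording of the error message differs between database engines.

```python
import sqlite3

# Both tables contain a question_id column, as in the thread's schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (question_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (tag TEXT, question_id INTEGER);
    INSERT INTO questions VALUES (1, 'How much does a duck weigh?');
    INSERT INTO tags VALUES ('Duck', 1);
""")

join = "FROM tags t INNER JOIN questions q ON t.question_id = q.question_id"

# Without a prefix, the database cannot tell which question_id is meant.
try:
    conn.execute(f"SELECT question_id {join}")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)
print("unqualified column:", error)

# With the alias prefix, the same join is unambiguous.
rows = conn.execute(f"SELECT q.question_id, t.tag {join}").fetchall()
print("qualified column:", rows)
```

The unprefixed query fails as soon as both joined tables have a column of that name, while the prefixed one keeps working.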