views:

51

answers:

2
Table tags:
article_id     tag
1              Language
1              Java
1              c++
2              Language
2              c++
3              c++

and how can I write SQL(s) query(s) to make the data like below:

Table tags_relations:

tag1        tag2     relations_degree
Language    C++     2
Language    Java    1

note: if the tag refers to the same article then the relations_degree + 1

The purpose of this approach is to calculate the relations between tags, can any one help me on this ?

Thanks a lot.

+2  A: 

Something like:

SELECT A.tag AS tag1, B.tag AS tag2, COUNT(*) as relations_degree
FROM tags A
     INNER JOIN tags B ON A.article_id = B.article_id
WHERE A.tag = 'Language' AND a.tag <> b.tag
GROUP BY A.tag, B.tag
ORDER BY B.tag ASC

Based on your example, I assumed you were limiting it to tag1 = 'Language', otherwise there's the case of tag1 = Java, tag2 = c++, relations_degree = 1. To get that:

SELECT A.tag AS tag1, B.tag AS tag2, COUNT(*) as relations_degree
FROM tags A
     INNER JOIN tags B ON A.article_id = B.article_id
WHERE A.tag > b.tag
GROUP BY A.tag, B.tag
ORDER BY B.tag ASC
Dathan
Thanks , it is sort of right. But can you explain if left join will be better or? Also what's the A.tag > B.tag does ? Thanks.
Shuoling Liu
I added the a.tag > b.tag there to make results unique. Otherwise, you'd have rows in the result set that look like "Language c++ 2" and later a "c++ Language 2", which I assume you don't want. And in this case, the inner join will result in you only seeing results with relations_degree >= 1, whereas switching to a left join will show you rows like "Language NULL 1", which you probably don't want.
Dathan
A: 

I think you should have some sort of "Tag type", so that your set looks more like:

article_id   tag   tag_type
1            Java  Language
1            c++   Language
2            c++   Language
3            c++   Other

Then you could easily just use COUNT(*), grouping by tag_type and tag.

Presuming (without trying to make a PRES out of U and ME) that you have a bunch of other tag_types, then I'm not sure how best to assign tag_types to tags, and you will probably find that easier than me. But this is definitely the way to go.

Rob Farley
Thanks for the answer, but the difficult here is to just find out the relations between the tags, so there do not have a tag_type for each tag. 'language','java' are the tags on one article.
Shuoling Liu
Ah, ok... I misunderstood. I thought you were meaning "There's the language Java, and there's the language C++". So if Java and C++ appeared together on the same article, you'd want that to appear as well?
Rob Farley
Yes, I know you c what I mean now.
Shuoling Liu