tags:

views:

119

answers:

2

I have 3 tables: 'cards', 'tags' and 'cardstags', where cards HABTM tags

Question: What query do I execute on the 'tags' table to count the number of associated 'cards' rows?

I'm looking for something like this:

tags.name | count
----------+------
cricket   |  15          (15 cards are tagged as 'cricket')
soccer    |  23
football  |  12
A: 
select tags.name, count(*) from tags join cardstags 
 on tags.id=cardstags.tag_id group by tags.name
zzzeek
on 5.0.75, I get: ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '*) from tags join cardstags
ysth
just fixed, mysql doesn't want the "cardstags.*", try again.
zzzeek
A: 

If you want only those tags that have at least one card:

select tags.name, count(cardstags.tag_id) from tags inner join cardstags on tags.id=cardstags.tag_id group by tags.id;

To include tags with no cards, use a left join instead of an inner join.

ysth
this one is wrong too. you need to group by tags.name for the correct result.
zzzeek
no, presumably if you have different tag ids with duplicate names, you would want separate results. and if you don't have duplicate names, grouping on tags.id is equivalent to grouping on tags.name.
ysth
It is in fact invalid SQL on most databases to have a non-aggreagated column that is not stated in the GROUP BY when aggregate functions are present. MySQL allows it since MySQL has very poor support for SQL standards. Try it on Postgresql some time, or Oracle, etc. It will raise an error.
zzzeek
here's an in-depth article about the situation, written by a guy who is defending MySQL's position: http://www.oreillynet.com/databases/blog/2007/05/debunking_group_by_myths.html . He argues that very recent SQL standards are more flexible in this regard. But the problem of returning non-grouped/non-aggregated columns in the result plainly leads to non-deterministic results.
zzzeek
@zzzeek thanks, I'm very familiar with mysql's handy support for more than the standard allows. I wouldn't call that "very poor support for SQL standards" though. re: "non-deterministic", no idea what you mean there; the results are quite deterministic.
ysth
read the linked story, including the example regarding the dates. The results are not deterministic when non-grouped columns are added to the result as it displays the "first" one that matches, but there's no ORDER BY. Its essentially defaulting to insert ordering which should never be relied on for sorting.
zzzeek
um, yes, the story gives a non-deterministic case, but pretty much all the time you would *want* to select a non GROUP BY column, it *is* deterministic, including this case; there is only one tags.name for each tags.id, so there is no problem with returning the "first" one.
ysth
Well, depending on the data present, the results may be essentially random. Which means, the results are basically random unless you make assumptions about the data. Which is just not a very "SQL-like" way of thinking hence the widespread disapproval of MySQL's methods (outside of the MySQL community, of course). This is the database that argued foreign key constraints were unnecessary for many years - another case of "well we *assume* the application did the right thing" - leading to all sorts of abominable practices like Rails polymorphic associations.
zzzeek
in that comment you seem to be arguing first that having primary keys (and the application using queries that assume their existence) is not 'SQL-like' and then turning around and saying not using foreign key constraints is not 'SQL-like'.
ysth