Hello,

I'm somewhat new to SQL queries, and I'm struggling with this particular problem. Let's say I have a query that returns the following 3 records (kept to one column for simplicity):
Tom
Jack
Tom

And I want to have those results grouped by the name and also include the fraction (ratio) of the occurrence of that name out of the total records returned.

So, the desired result would be (as two columns):
Tom | 2/3
Jack | 1/3

How would I go about it? Determining the numerator is pretty easy (I can just use COUNT() and GROUP BY name), but I'm having trouble translating that into a ratio out of the total rows returned.

Any help is much appreciated!

+1  A: 

Since the denominator is fixed, the "ratio" is directly proportional to the numerator. Unless you really need to show the denominator, it'll be a lot easier to just use something like:

select name, count(*) from your_table_name
group by name
order by count(*) desc

and you'll get the right data in the right order, but the number that's shown will be the count instead of the ratio.

If you really want that denominator, you'd do a count(*) on a non-grouped version of the same select -- but depending on how long the select takes, that could be pretty slow.

Jerry Coffin
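For anyone who does want the denominator shown, here is a minimal sketch of the "count(*) on a non-grouped version of the same select" idea. It assumes a hypothetical one-column table called names and uses Python's built-in sqlite3 module only so that the query can be run as-is; the asker's actual database is not stated, so treat the SQL as illustrative.

```python
import sqlite3

# Hypothetical sample data matching the question: Tom, Jack, Tom.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (name TEXT)")
conn.executemany("INSERT INTO names VALUES (?)", [("Tom",), ("Jack",), ("Tom",)])

# The derived table t is the non-grouped count; it has exactly one row,
# so the CROSS JOIN simply attaches the total to every group.
rows = conn.execute(
    """
    SELECT n.name,
           COUNT(*) AS occurrences,
           t.total
    FROM names AS n
    CROSS JOIN (SELECT COUNT(*) AS total FROM names) AS t
    GROUP BY n.name, t.total
    ORDER BY occurrences DESC
    """
).fetchall()

for name, occurrences, total in rows:
    print(f"{name} | {occurrences}/{total}")
```

Keeping the numerator and denominator as separate integer columns sidesteps any division, and the total is computed once rather than once per group.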
Well, the reason I want the ratio instead of just the pure counts is because I'm using that as a filter on my data. That is, I only want to return "Tom" if it makes up over half of the returned records. Otherwise I view it as just noise. Is there a smarter way of going about this? Perhaps I should be doing this in my application code?
jjiffer
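One possible way to express the "over half of the returned records" filter directly in SQL is a HAVING clause that compares twice the group count with the total. The sketch below assumes a hypothetical names table and a hypothetical helper majority_names, and runs on SQLite through Python's sqlite3 module; it is not necessarily the fastest option for a long-running source query.

```python
import sqlite3

def majority_names(values):
    """Return (name, count) rows for names that make up more than half of values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names (name TEXT)")
    conn.executemany("INSERT INTO names VALUES (?)", [(v,) for v in values])
    # COUNT(*) * 2 > total is the same test as COUNT(*) / total > 0.5,
    # but it stays in integer arithmetic and avoids division entirely.
    result = conn.execute(
        """
        SELECT name, COUNT(*) AS occurrences
        FROM names
        GROUP BY name
        HAVING COUNT(*) * 2 > (SELECT COUNT(*) FROM names)
        """
    ).fetchall()
    conn.close()
    return result

print(majority_names(["Tom", "Jack", "Tom"]))  # Tom holds 2 of 3 rows
print(majority_names(["Tom", "Jack"]))         # no name exceeds half
```

Since at most one name can exceed half of the rows, the query returns either one row or none, which matches the "return nothing if no answer is convincing" requirement.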
@jjiffer: Perhaps an approximate answer would be useful? Only *one* group can possibly constitute more than half the records, so perhaps you could just return the record with the top count? That wouldn't necessarily be more than half the records, but maybe it's close enough for your purposes?
Jerry Coffin
That is a good point, but I also have the option of returning nothing if no answer is "convincing" enough. I'll probably tweak things a bit and compare the computation time and accuracy of using a rougher approximation and see what technique comes out ahead. Thanks for your help!
jjiffer
+3  A: 
SELECT name, COUNT(name)/(SELECT COUNT(1) FROM names) FROM names GROUP BY name;
Andy
Well, the "names" portion is actually its own long query in my case. Do I have to just copy and paste the whole subquery to appear in the general query twice? Or is there a way to do something like "SELECT name, COUNT(name)/(SELECT COUNT(1) FROM (SELECT ...) AS my_subquery) FROM my_subquery GROUP BY name;" ? If there is, I can't seem to get the syntax right.
jjiffer
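One way to avoid pasting the long subquery twice is a common table expression (a WITH clause), which names the subquery once and lets both the grouped select and the total count refer to it. This is only a sketch: the people table, its active column, and the filter are hypothetical stand-ins for the real long query, and not every database or version supports WITH, so check yours first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source table; the WHERE clause below stands in for the long query.
conn.execute("CREATE TABLE people (name TEXT, active INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Tom", 1), ("Jack", 1), ("Tom", 1), ("Ann", 0)],
)

rows = conn.execute(
    """
    WITH my_subquery AS (
        SELECT name FROM people WHERE active = 1
    )
    SELECT name,
           -- the cast forces floating-point division instead of integer division
           CAST(COUNT(*) AS REAL) / (SELECT COUNT(*) FROM my_subquery) AS ratio
    FROM my_subquery
    GROUP BY name
    ORDER BY ratio DESC
    """
).fetchall()

for name, ratio in rows:
    print(name, round(ratio, 3))
```

If WITH is not available, the usual fallbacks are a view or a temporary table holding the subquery's result, both of which can likewise be referenced twice.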
Also, the structure of the query you provided works, but I had to change "COUNT(1)" to "CAST(COUNT(1) AS float)", because otherwise it was doing integer division and returning all zeroes.
jjiffer
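The integer-division effect described above can be reproduced with a small sketch. It assumes a hypothetical names table and runs on SQLite, where the floating-point type is spelled REAL; other databases may spell the cast differently (for example float, as in the comment) or may not truncate at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (name TEXT)")
conn.executemany("INSERT INTO names VALUES (?)", [("Tom",), ("Jack",), ("Tom",)])

rows = conn.execute(
    """
    SELECT name,
           -- integer / integer truncates toward zero in SQLite
           COUNT(name) / (SELECT COUNT(1) FROM names) AS int_ratio,
           -- casting one operand makes the division floating-point
           CAST(COUNT(name) AS REAL) / (SELECT COUNT(1) FROM names) AS real_ratio
    FROM names
    GROUP BY name
    ORDER BY name
    """
).fetchall()

for name, int_ratio, real_ratio in rows:
    print(name, int_ratio, round(real_ratio, 3))
```

Casting either the numerator or the denominator is enough; multiplying one side by 1.0 is a common alternative with the same effect.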