Hi All,

Two days ago I was asked this question in an interview for a Data Analyst position. Could someone please tell me the correct answer?

Say there is a single table with three columns.

  • 1st column: GeneId (primary key)
  • 2nd column: Flag1
  • 3rd column: Flag2

The Flag1 and Flag2 columns can each hold a value of 0 or 1. How do I write a single SQL query that returns the count of GeneIds for each possible combination of Flag1 and Flag2, i.e. (Flag1 = 0, Flag2 = 1), (Flag1 = 1, Flag2 = 0), and the other combinations?

Thanks for your time,

Regards, Sashi

+7  A: 
SELECT Flag1, Flag2, COUNT(GeneId) as NumGenes
FROM genetable
GROUP BY Flag1, Flag2
Amadan
How does this cater for the requirement for flag _combination_ counts? For example - how does this tell you the count of all rows that have Flag1 = 1 _and_ Flag2 = 1?
Oded
`GROUP BY Flag1, Flag2` means split the rowset into sections where `(Flag1, Flag2)` are identical. Thus, the rows `(17, 0, 1)` and `(19, 0, 1)` would get grouped together, since they are both `(0, 1)` in the flags department. Then `COUNT(GeneId)` counts all rows in each group where `GeneId` is not `NULL`. Other possibilities are `COUNT(*)`, which counts every row in the group including those where `GeneId` is `NULL`, and `COUNT(DISTINCT GeneId)`, which counts repeated values only once.
Amadan
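To make the grouping concrete, here is a minimal sketch that runs the answer's query against an in-memory SQLite database from Python. The sample rows (including the GeneIds 17 and 19 from the comment above) are made up for illustration, and the `ORDER BY` is only added so the output order is predictable; neither is part of the original answer.

```python
import sqlite3

# In-memory database with the table described in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE genetable ("
    "GeneId INTEGER PRIMARY KEY, Flag1 INTEGER, Flag2 INTEGER)"
)

# Hypothetical sample data: (GeneId, Flag1, Flag2).
sample_rows = [
    (17, 0, 1), (19, 0, 1),
    (23, 1, 1), (37, 1, 1), (41, 1, 1),
    (29, 0, 0),
    (31, 1, 0),
]
conn.executemany("INSERT INTO genetable VALUES (?, ?, ?)", sample_rows)

# The query from the answer; ORDER BY added only for stable output.
query = """
SELECT Flag1, Flag2, COUNT(GeneId) AS NumGenes
FROM genetable
GROUP BY Flag1, Flag2
ORDER BY Flag1, Flag2
"""
counts = conn.execute(query).fetchall()

# One row per (Flag1, Flag2) combination present in the table.
for flag1, flag2, num_genes in counts:
    print(flag1, flag2, num_genes)
```

With this sample data, rows 17 and 19 land in the same `(0, 1)` group, so that group's count is 2.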
@Oded, I don't understand the issue with that query. It will return 4 rows, one for each combination, provided that each combination exists in the table. A combination with no matching rows is simply absent from the result, so if there were no rows in the table, the query would return no rows.
Chris Diver
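If the missing combinations need to show up with a count of 0, one possible variation (not part of the original answer) is to generate the four combinations first and `LEFT JOIN` the table onto them. The sketch below assumes SQLite run from Python; the CTE name `flags`, its column `v`, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE genetable ("
    "GeneId INTEGER PRIMARY KEY, Flag1 INTEGER, Flag2 INTEGER)"
)
# Hypothetical data in which (0, 0) and (1, 0) never occur.
conn.executemany(
    "INSERT INTO genetable VALUES (?, ?, ?)",
    [(17, 0, 1), (19, 0, 1), (23, 1, 1)],
)

# Plain GROUP BY: only combinations that exist in the table come back.
grouped = conn.execute(
    "SELECT Flag1, Flag2, COUNT(GeneId) FROM genetable "
    "GROUP BY Flag1, Flag2 ORDER BY Flag1, Flag2"
).fetchall()

# Build all four (Flag1, Flag2) pairs, then LEFT JOIN the table onto them.
# COUNT(g.GeneId) skips the NULLs produced for unmatched pairs, giving 0.
query = """
WITH flags(v) AS (SELECT 0 UNION ALL SELECT 1)
SELECT f1.v AS Flag1, f2.v AS Flag2, COUNT(g.GeneId) AS NumGenes
FROM flags AS f1
CROSS JOIN flags AS f2
LEFT JOIN genetable AS g
       ON g.Flag1 = f1.v AND g.Flag2 = f2.v
GROUP BY f1.v, f2.v
ORDER BY f1.v, f2.v
"""
all_combos = conn.execute(query).fetchall()

print(grouped)
print(all_combos)
```

Note that `COUNT(*)` would not work here: it would count the unmatched placeholder row and report 1 instead of 0.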
@Chris Diver - thanks. Too hot here today, brain slightly fried...
Oded
I know that feeling :)
Chris Diver
Thank you, Amadan :)
Sashikiran Challa