tags:

views:

91

answers:

1

I have a table like this:

col1   col2

2001   3001
2002   3001
2003   3001
2004   3002
2002   3003
2001   3003

i want to create it a query that finds out all the combination of items in col1 with a certain item in col1 if they occur to the same item in col2 and the frequency of occurrence. For example, if we want to explore item number "2001" in col1 the query should return:

col3   col4   col5
2001   2002   0.667
2001   2003   0.333

Can someone give me a hint? Thanks ahead.

+1  A: 

Thinking of it this way, you are trying to relate the table to itself through col2, so we'll start off with a join onto itself:

select left.col1, right.col1/*, ...*/ from table left inner join table right on left.col2 = right.col2 where left.col1 = '2001' /* ... */

Now for your aggregate. You want to aggregate all right.col1 to get the count for that column:

select left.col1, right.col1, COUNT(*) from table left inner join table right on left.col2 = right.col2 where left.col1 = '2001' group by right.col2

Which I believe should come out with the raw counts. I believe you'll have to spin over the query and get a total to get the frequency.

As a note, this would be a lot more complicated if col1 weren't constant.

Edited to add: If you are looking for one query, instead of iterating over it in whatever language you are querying from, you would need to do this grouping twice:

SELECT abs_freq.col1, abs_freq.col2, abs_freq.freq / totals.total
FROM (SELECT
       left.col1 AS col1,
       COUNT(*)  AS total
      FROM TABLE LEFT
      INNER JOIN TABLE RIGHT
        ON left.col2 = right.col2
      WHERE left.col1 = '2001'
      GROUP BY left.col1
      ) totals
INNER JOIN (SELECT
              left.col1  AS col1,
              right.col1 AS col2,
              COUNT(*)   AS freq
            FROM TABLE LEFT
            INNER JOIN TABLE RIGHT
              ON left.col2 = right.col2
            WHERE left.col1 = '2001'
            GROUP BY right.col2
            ) abs_freq
   ON abs_freq.col1 = totals.col1
Todd Gardner
Do you mind explain to me in more detail about how to get the frequency?
CKeven
The result returns three columns. The third column in the query above is the number of occurrences in absolute terms, so you would have 2002 -> 2, and 2003 -> 1. To get the frequency, in whatever language you are using, get the total (2+1), then divide the column by the total (2002 -> 2/(2+1) = 0.667, 2003 -> 1/(2+1) = 0.333)
Todd Gardner
I use the following query:select C.tagid1, C.tagid2, C.partial *1.0/sum(C.partial)from (select A.tagid tagid1, B.tagid tagid2, count(*) partialfrom main A inner join main Bon A.itemid = B.itemidwhere A.tagid != B.tagid AND A.tagid=2001group by B.tagid) as C;however, it only return me one row due to the sum function. Do you have any idea how to fix this?
CKeven