I have a generic question that I will try to explain using an example.

Say I have a table with the fields "id", "name", "category", "appearances" and "ratio".

The idea is that I have several items, each of which belongs to a single category and "appears" several times. The ratio field should hold each item's appearances as a fraction of the total number of appearances of all items in its category.

In pseudo-code what I need is the following:

  • For each category
    find the total number of appearances of the items related to it. For example, this can be done with (select category, sum(appearances) from table group by category)

  • For each item
    set the ratio value to the item's appearances divided by the sum found above for its category
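
To make the two steps concrete, here is a small worked example. It is only a sketch: the table name `items`, the column types, and the sample rows are assumptions made up for illustration and are not part of the original question.

```sql
-- Hypothetical table and sample rows, for illustration only.
CREATE TABLE items (
    id          INT PRIMARY KEY,
    name        VARCHAR(50),
    category    VARCHAR(50),
    appearances INT,
    ratio       DOUBLE
);

INSERT INTO items (id, name, category, appearances) VALUES
    (1, 'a', 'fruit', 2),
    (2, 'b', 'fruit', 6),
    (3, 'c', 'tool',  5);

-- Step 1: total appearances per category.
-- With the rows above this yields fruit = 8 and tool = 5.
SELECT category, SUM(appearances) AS appearances_sum
FROM items
GROUP BY category;

-- Step 2 (desired end state): each item's share of its category total.
-- item 1: 2 / 8 = 0.25
-- item 2: 6 / 8 = 0.75
-- item 3: 5 / 5 = 1.0
```

The ratios within each category add up to 1, which is a quick sanity check for whatever update statement ends up filling the column.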

Now I'm trying to achieve this with a single update query, but can't seem to do it. What I thought I should do is:

update Table T
set T.ratio = T.appearances /
(
    -- total appearances of all items in the same category
    select sum(S.appearances)
    from Table S
    where S.category = T.category
)

But MySQL does not accept the alias T in the update column, and I did not find other ways of achieving this.

Any ideas?

+1  A: 

This is how it is done in MSSQL; I think MySQL is the same or similar:

create table T (id int, ratio float, appearances int)
insert T values (1, null, 2)
insert T values (1, null, 3)

update T
set ratio = cast(appearances as float)/ agg.appearancesSum
from T join (
    select id, sum(appearances) as appearancesSum
    from T
    group by id
) as agg on T.id = agg.id

Sorry, this does not work in MySQL. I'm not down-voting it, since I'm sure it works on MSSQL...
Roee Adler
+1  A: 

Use joins right after UPDATE: http://dev.mysql.com/doc/refman/5.4/en/update.html

so UPDATE table1 inner join table2 on .... set table1.foo=value where table2.bla = someothervalue

With this kind of thing, always look at the manual. MySQL has a proper reference manual, so it shouldn't be that hard to get the right syntax ;)

Frans Bouma
Thanks, I will try it as soon as I can. And by the way - I did RTFM and tried everything that made sense before posting the question :)
Roee Adler
+9  A: 

Following the two answers I received (neither of which was complete, so I wrote my own), this is what I eventually did:

UPDATE Table AS target
INNER JOIN
(
    -- one row per category, carrying that category's total appearances
    select category, appearances_sum
    from Table T inner join (
        select category as cat, sum(appearances) as appearances_sum
        from Table
        group by cat
    ) as agg
    on T.category = agg.cat
    group by category
) as source
ON target.category = source.category
SET target.ratio = target.appearances / source.appearances_sum

It runs very quickly. I also tried a correlated subquery, but it was much slower (by orders of magnitude), so I'm sticking with the join.
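
The nested join above can probably be flattened, because the aggregate subquery already returns exactly one row per category. The following is a minimal sketch of that idea, not a tested drop-in replacement: the table name `items` is a hypothetical stand-in for the real table, and the columns are the ones described in the question.

```sql
-- Hypothetical table name `items`; columns as described in the question.
UPDATE items AS target
INNER JOIN (
    -- Derived table: one row per category with its total appearances.
    SELECT category, SUM(appearances) AS appearances_sum
    FROM items
    GROUP BY category
) AS agg
    ON target.category = agg.category
-- In MySQL the `/` operator does not truncate integer operands,
-- so no explicit cast to a floating-point type is needed here.
SET target.ratio = target.appearances / agg.appearances_sum;
```

Joining to a derived table uses the multiple-table form of UPDATE, which is the syntax Frans's answer points to in the manual. If a category's total could be 0, the division yields NULL under MySQL's default behaviour, so that case may need separate handling.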

Roee Adler
Please flag an answer as the answer so this question gets removed from the list of unanswered questions :)
Frans Bouma
@Frans: I had to wait 48 hours before I could do so; Stack Overflow rules :)
Roee Adler