Dear all,

I am trying to multiply a data.frame column by scalar A or by scalar B, depending on the value of a third column (id) of the same data.frame. I seem to have an ordering or sorting problem; so far the result is definitely wrong. Here is what I tried:

mydf$result = subset(mydf,myid==123,multiplyme)*0.6 +
subset(mydf,myid==124,,multiplyme)*0.4

I also tried the %in% syntax but was not successful either. I know I could use MySQL, for example, and connect it to R, but in this case I just want to use (basic) R, or at most plyr. For those of you who prefer code over my rambling, here's how I'd do it in SQL:

SELECT
MIN(CASE WHEN myid=123 THEN multiplyme*0.6 END),
MIN(CASE WHEN myid=124 THEN multiplyme*0.4 END)
FROM mytable
GROUP BY result;

Thanks in advance for any help or R code suggestions! Please note that I have more than 2 ids!

+1  A: 

The command should be:

subset(mydf,myid==123,multiplyme)

or

mydf$multiplyme[mydf$myid==123]

The R equivalent of your SQL command is:

min(mydf$multiplyme[mydf$myid==123]*0.6)+min(mydf$multiplyme[mydf$myid==124]*0.4)

Eduardo Leoni
Thanks, Eduardo, I edited my code. In the original code I used "=="; I just wrote it incorrectly here. So unfortunately that was not the problem.
ran2
Your subset command is still incorrect. Take a look at ?subset.
Eduardo Leoni
Thanks for your patience. It's probably just too hot inside this office and I should head to the lake, but I want to get this fixed first...
ran2
A: 

If you really have only two values of myid, then ifelse is a simple solution:

> mydf<-data.frame(multiplyme=c(1,2,3,4),myid=c(123,124,124,123))
> with(mydf,multiplyme*ifelse(myid==123,0.6,0.4))
[1] 0.6 0.8 1.2 2.4

For a small number of possible values of myid you can use nested calls to ifelse. But merge provides a cleaner option if myid can take many possible values:

> multdf<-data.frame(myid=c(123,124),m=c(0.6,0.4))
> mydf<-merge(mydf,multdf)
> mydf
  myid multiplyme   m
1  123          1 0.6
2  123          4 0.6
3  124          2 0.4
4  124          3 0.4
> with(mydf,multiplyme*m)
[1] 0.6 2.4 0.8 1.2

Note that merge rearranges the rows, so you may want to have variables or row names to identify observations.
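
One way to deal with the reordering is sketched below. It assumes the same example data as above; the helper column rowid and the use of match() are illustrative additions, not part of the original answer. The first variant tags each row before merging and sorts back afterwards; the second avoids merge and looks the multiplier up with match(), which keeps the original row order.

```r
# Example data as in the answer above
mydf   <- data.frame(multiplyme = c(1, 2, 3, 4), myid = c(123, 124, 124, 123))
multdf <- data.frame(myid = c(123, 124), m = c(0.6, 0.4))

# Variant 1: remember the original row position (rowid is a hypothetical
# helper column), merge, then restore the original order.
mydf$rowid <- seq_len(nrow(mydf))
merged <- merge(mydf, multdf, by = "myid", all.x = TRUE)
merged <- merged[order(merged$rowid), ]
merged$result <- merged$multiplyme * merged$m

# Variant 2: look up the multiplier by position with match(); the rows of
# mydf are never reordered. Ids missing from multdf yield NA.
result_match <- mydf$multiplyme * multdf$m[match(mydf$myid, multdf$myid)]
```

With all.x = TRUE, ids that have no entry in multdf are kept and get an NA multiplier rather than being silently dropped.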

Jyotirmoy Bhattacharya
Sorry for not stating this: in fact I have more than 2 ids. Besides, the merge suggestion helps, because my dataset was generated by merge.
ran2
+4  A: 

Assuming you only have 123 or 124 in myid:

mydf$result <- mydf$multiplyme * ifelse(mydf$myid==123,0.6,0.4)

If you have other values in myid, add an extra ifelse and a default case.

EDIT:

Since you have extra values in myid, I'll state the expansion.

mydf$result <- mydf$multiplyme * ifelse(mydf$myid==123,0.6,ifelse(mydf$myid==124,0.4,0))

You can change the 0 at the end to a 1 if, in the default case, you want to keep the value of multiplyme. This can be extended into a chain of ifelse calls if you want to use a different multiplier for many values.

However, as mbq comments below, you can use a switch statement if it begins to get unwieldy:

mydf$result <- mydf$multiplyme * sapply(mydf$myid,function(x) switch(as.character(x),"123"=0.6,"124"=0.4))

This would probably be slower though, as this will loop while ifelse is vectorised.
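
One caveat with the switch version: if an id matches none of the named cases, switch returns NULL, sapply then returns a list, and the multiplication fails. A minimal sketch of a more defensive variant follows, assuming a default multiplier of 0 is acceptable (as in the ifelse expansion above). The id 125 and the helper name lookup are made up for illustration.

```r
# Example data with one id (125) that has no explicit multiplier
mydf <- data.frame(multiplyme = c(1, 2, 3, 4), myid = c(123, 124, 124, 125))

# The trailing unnamed argument is switch's default case.
lookup <- function(x) switch(as.character(x), "123" = 0.6, "124" = 0.4, 0)

# vapply enforces that every call returns exactly one numeric value,
# so the result is always a plain numeric vector.
mydf$result <- mydf$multiplyme * vapply(mydf$myid, lookup, numeric(1))
```

This still loops over the rows, so the speed remark above applies here as well.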

James
+1 for a nice solution; still, for many myid levels `switch` will be handier than a chain of `ifelse`s.
mbq
Oh, I thought `switch` was vectorized... But then I have another idea: a plain dictionary, like `dict<-c("123"=0.6,"124"=0.4)` and then `mydf$multiplyme*dict[as.character(mydf$myid)]`.
mbq
There are dictionaries in R? :) I used them in Python, but not in R. I did not think of that. Even though I had some more trouble and just used SQL in the end, I accept this answer because it answers my question and I feel others (and I) can learn from it.
ran2
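
The "dictionary" mbq mentions is just a named numeric vector. A minimal sketch of that lookup follows, assuming made-up example data and a default multiplier of 0 for ids without an entry. The names of the vector are character strings, so the numeric ids are converted with as.character() before indexing; indexing with the raw numbers would select by position instead of by name.

```r
# Example data; id 125 deliberately has no entry in the lookup vector
mydf <- data.frame(multiplyme = c(1, 2, 3, 4), myid = c(123, 124, 124, 125))

# Named vector used as a lookup table: names are ids, values are multipliers
dict <- c("123" = 0.6, "124" = 0.4)

# Vectorised lookup by name; unknown ids come back as NA
factors <- unname(dict[as.character(mydf$myid)])

# Assumed default for ids without an entry
factors[is.na(factors)] <- 0

mydf$result <- mydf$multiplyme * factors
```

Unlike the switch approach, this lookup is vectorised, and adding another id only means adding one element to dict.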