ansaurus

Question

Answer 1

+2 A:

If I understand correctly what you're trying to achieve, this should do what you want. It is fairly quick and doesn't use much memory.

# toy data
A <- data.frame(
    A=letters[1:10],
    B=letters[11:20],
    CC=1:10
)

# B holds the same rows as A, in shuffled order
ord <- sample(1:10)
B <- data.frame(
    A=letters[1:10][ord],
    B=letters[11:20][ord],
    CC=(1:10)[ord]
)

# combine the two key columns into a single key per row
A.comb <- paste(A$A,A$B,sep="-")
B.comb <- paste(B$A,B$B,sep="-")

# look up each key of A in B and copy over the matching CC
A$DD <- B$CC[match(A.comb,B.comb)]
A

This works only if the A-B combinations in B are unique, because match returns only the first matching position. If they're not unique, you'll have to take care of that first. Without the data it's impossible to know exactly what you're trying to achieve in your complete function, but you should be able to port the logic given here to your own case.

Joris Meys 2010-10-21 10:07:32
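
One possible way to handle the uniqueness caveat in the answer above is to reduce B to one row per A-B combination before matching. The sketch below is only an illustration: the toy values and the name B.uniq are made up, and it assumes that keeping the first row per combination is acceptable, which may not be the rule your data needs.

```r
# Hypothetical toy data: B repeats the combination a-k with different CC values
A <- data.frame(
    A = c("a", "b", "c"),
    B = c("k", "l", "m"),
    CC = 1:3
)
B <- data.frame(
    A = c("a", "a", "b", "c"),
    B = c("k", "k", "l", "m"),
    CC = c(10, 11, 20, 30)
)

# Assumption: the first row of each A-B combination is the one to keep
B.key <- paste(B$A, B$B, sep = "-")
B.uniq <- B[!duplicated(B.key), ]

# Match as in the answer, now against keys that are guaranteed unique
A.comb <- paste(A$A, A$B, sep = "-")
B.comb <- paste(B.uniq$A, B.uniq$B, sep = "-")
A$DD <- B.uniq$CC[match(A.comb, B.comb)]
A
```

Because match already returns the first hit, the explicit deduplication step mainly matters when a different rule is wanted (for example, keeping the largest CC per combination). In that case, sort or aggregate B accordingly before the duplicated step.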

Thanks for the code. I have tried it, but sadly my table B is not the same case as yours: it has duplicate A-B combinations with different C values. You can see that I have a series of conditionals in the middle of the ddply function, which are there to deal with this issue. It also seems that the match function only returns the first matched item. Thanks anyway.

lokheart 2010-10-22 02:12:08

I have used your method with some tweaking: I created a unique table B before matching it with table A, and it works! Thanks!

lokheart 2010-10-22 03:45:18

@lokheart: You could also try something like the approach in this question: http://stackoverflow.com/questions/3990155/r-sort-multiple-columns-by-another-data-frame/3990529#3990529 It's a similar problem, and the solutions there might give you more to work with if you want to tweak it further.

Joris Meys 2010-10-22 08:48:11

Better to use merge or join than pasting together strings to use match.

hadley 2010-10-23 00:16:53
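
As a rough illustration of the merge suggestion in the comment above, here is a minimal base R sketch on toy data shaped like the answer's. The fixed permutation in ord and the name merged are assumptions made for the example, and renaming B's CC column to DD is just one way to keep the two value columns apart.

```r
# Toy data in the same shape as the answer's example
A <- data.frame(A = letters[1:10], B = letters[11:20], CC = 1:10)

# Assumption: a fixed permutation instead of sample(), so the result is reproducible
ord <- c(3, 1, 10, 7, 5, 2, 9, 4, 8, 6)
B <- data.frame(
    A = letters[1:10][ord],
    B = letters[11:20][ord],
    CC = (1:10)[ord]
)

# Rename B's value column so it does not clash with A$CC after merging
names(B)[names(B) == "CC"] <- "DD"

# Left join on both key columns; no pasted string key is needed
merged <- merge(A, B, by = c("A", "B"), all.x = TRUE)
merged
```

Unlike match, merge returns every matching row, so duplicated A-B combinations in B would produce extra rows in the result rather than silently keeping the first one. The comment also mentions join, which presumably refers to plyr's join function. The sketch leaves it out so that it stays in base R.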

plyr in R very slow during merging
