I know this answer must be out there, but I can't figure out how to word the question.

I'd like to calculate the differences between values in my data.frame.

from this:

f <- data.frame(year=c(2004, 2005, 2006, 2007), value=c(8565, 8745, 8985, 8412))

  year value
1 2004  8565
2 2005  8745
3 2006  8985
4 2007  8412

to this:

  year value diff
1 2004  8565   NA
2 2005  8745  180
3 2006  8985  240
4 2007  8412 -573

(i.e., the value of the current year minus the value of the previous year)

But I don't know how to have a result in one row that is created from another row. Any help?

Thanks, Tom

+8  A: 

There are many different ways to do this, but here's one:

f[, "diff"] <- c(NA, diff(f$value))

More generally, if you want to refer to relative rows, you can use lag() (which in base R is meant for time-series objects) or do it directly with indexes:

f[-1,"diff"] <- f[-1, "value"] - f[-nrow(f), "value"]
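The index-based form reads as "every row but the first, minus every row but the last". As a minimal sketch of the same idea, assuming the data frame from the question, the two approaches can be checked against each other. The names `prev` and `diff2` are made up for illustration only:

```r
# Rebuild the example data frame from the question.
f <- data.frame(year = c(2004, 2005, 2006, 2007),
                value = c(8565, 8745, 8985, 8412))

# Approach 1: diff() returns n - 1 differences, so pad the front with NA.
f$diff <- c(NA, diff(f$value))

# Approach 2: subtract a shifted copy of the column.
# head(x, -1) drops the last element, so prev[i] holds value[i - 1].
prev <- c(NA, head(f$value, -1))   # 'prev' is an illustrative name
f$diff2 <- f$value - prev          # 'diff2' is an illustrative column

print(f)
```

Both columns should come out the same, with NA in the first row because 2004 has no previous row to subtract.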
Shane
Perfect! Thank you.
Tom
@Tom: Great! Please mark this accepted when you get a chance so that people know this answered your question.
Shane
A: 

Use the diff function:

f <- cbind(f, diff = c(NA, diff(f[, 2])))
nico
+1  A: 

If the year column isn't sorted, you could use match:

f$diff <- f$value - f$value[match(f$year-1, f$year)]
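As a minimal sketch of how the match() lookup behaves, assume the question's rows arrive in a shuffled order. The row order, the data frame name `g`, and the helper `idx` are assumptions for illustration:

```r
# Same values as in the question, but with the rows shuffled (assumed order).
g <- data.frame(year = c(2006, 2004, 2007, 2005),
                value = c(8985, 8565, 8412, 8745))

# For each row, find the position of the row that holds the previous year.
# match() returns NA when that year is absent (here: 2003 for the 2004 row).
idx <- match(g$year - 1, g$year)

# Indexing with NA yields NA, so rows without a previous year get NA.
g$diff <- g$value - g$value[idx]

print(g)
```

Because the lookup is by year rather than by row position, a missing year in the middle of the data would also give NA for the following year instead of a difference across the gap.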
Marek
@mbq Could you be more specific? For 1,000,000 rows the times are similar (mine 0.8 sec, Shane's 0.3 sec). And when you add sorting, it is much slower (1.5 sec for sorting).
Marek
@Marek You're right, sorry; it's quite a nice solution though. I read it too fast and misunderstood your code. I'll try to remove my comment so it won't confuse people.
mbq