I tried to do something like this:

x <- data.frame(1:20)  
attach(x)  
assign("x2",1:20,pos="x")  

However, x$x2 gives me NULL.
With x2 I get what I want but it is not part of the data.frame.

Adding x2 to x by hand would work in this simple case, but not in the more complex one I need: I want to assign inside a loop, looping over the variable names that are passed to the assign call.

+1  A: 

The Details section of ?assign tells you why your code behaves the way it does.

Why not something simple like:

x["x2"] <- 1:20
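For the loop case described in the question, the same kind of indexing works when the column name is held in a variable. A minimal sketch, assuming hypothetical column names `x2` to `x4` and made-up values (neither is from the original post):

```r
# Start from a data frame with one named column.
x <- data.frame(x1 = 1:20)

# Hypothetical variable names; in practice these come from your own code.
varnames <- paste0("x", 2:4)

for (i in seq_along(varnames)) {
  # [[ ]] accepts a character string, so neither assign() nor attach() is needed.
  x[[varnames[i]]] <- (1:20) * i
}

names(x)
```

Each pass through the loop adds one column to `x` itself, so `x$x2` and friends are available afterwards.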
Joshua Ulrich
Astonishingly, it works. I was not aware that you can define new variables like this.
Henrik
+4  A: 

Try using within:

x <- data.frame(x=1:20)
x <- within(x, {
  x2 <- x^2
  assign('x3', x2 * 2)
  # ... other assignments
})

It is cleaner to use $ and [[ though, which also gets the column ordering right:

x <- data.frame(x=1:20)
x$x2 <- x$x^2
x[['x3']] <- x$x2 * 2
Charles
Thanks, this was exactly what I was looking for. But, I couldn't get it to work with within.
Henrik
+3  A: 

There are lots of ways to assign a variable etc, and which is best will depend on personal taste. However, a couple of points:

You don't want to be attach()ing anything. It will work fine 9 times out of 10 and then bite you in the ass when you don't expect it, because all you are doing is placing a copy of your object on the search path. Modify the original object and the one on the search path doesn't change to match.
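A small sketch of that pitfall, using toy column names `a` and `b` that are only for illustration. It also suggests why the assign() call in the question seemed to do nothing: the new variable lands in the attached copy, not in the data frame.

```r
x <- data.frame(a = 1:3)
attach(x)                           # puts a copy of x's columns on the search path

x$a <- x$a * 10                     # modify the original data frame
attached_a <- get("a", pos = "x")   # the attached copy still holds 1, 2, 3

assign("b", 4:6, pos = "x")         # writes into the attached copy, not into x
b_in_x <- "b" %in% names(x)         # FALSE: x itself has no column b

detach("x", character.only = TRUE)  # clean up the search path again
```

After this runs, `x$a` holds the modified values while `attached_a` holds the old ones, and `x$b` is NULL.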

I personally don't like accessing things with $ in general use. It is ugly and engenders a tendency for users to just delve into objects and rip things out as they wish. Doesn't matter so much for your data, but when I see people doing model$residuals I get worried. There are better ways (in this case resid()). Some users also riddle their model formulas with $.

If you are writing scripts for a data analysis that you might come back to months or years later, anything that can help you understand what your code is doing is an invaluable bonus in my opinion. I find with() and within() useful for the sort of problem you had because they are explicit about what you want to do.

This is clearer:

x <- data.frame(X = rnorm(10))
with(x, mean(X))
x <- within(x, Y <- rpois(10, 3))

than

x <- data.frame(X = rnorm(10))
mean(x$X)
x$Y <- rpois(10, 3)
## or
x["Y"] <- rpois(10, 3)

Even though they do the same thing.
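One way to check that claim is to fix the random seed and compare the two styles side by side. This is only a sketch: the `set.seed()` calls and the names `d1` and `d2` are additions for the comparison, not part of the code above.

```r
# Style 1: with() and within()
set.seed(42)
d1 <- data.frame(X = rnorm(10))
m1 <- with(d1, mean(X))
d1 <- within(d1, Y <- rpois(10, 3))

# Style 2: $ indexing
set.seed(42)
d2 <- data.frame(X = rnorm(10))
m2 <- mean(d2$X)
d2$Y <- rpois(10, 3)

# Compare the means and the new column produced by each style.
same_mean <- identical(m1, m2)
same_Y    <- identical(d1$Y, d2$Y)
```

With the same seed, both `same_mean` and `same_Y` should come out TRUE.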

assign() inside a within() call is just a waste of typing, is it not?

Gavin Simpson
`within` is slower, much slower than `x$Y <- rpois(10, 3)`. `with` is faster than `x$X`, and I use this function a lot.
Marek
@Marek if speed is a consideration then I'd go with the faster version, `$` rather than `within()`, and the function code in my packages does tend to do this. Where readability and my understanding when I come back to the code months later are more important, I use `within()`. That said, I haven't noticed `within()` being slow; it is certainly quicker than I can think/type at the command line / in an Emacs buffer ;-)
Gavin Simpson