What is the problem with initializing a matrix object to NULL and then growing it with cbind() and rbind()? If the number of rows and columns is not known a priori, isn't it necessary to grow from NULL?

Edit: My question was prompted by the need to understand memory-efficient ways of writing R code. The matrix context is just one case; I'm probably looking for suggestions about efficient ways to handle other data objects as well. Apologies for being too abstract/generic, but I did not really have a specific problem in mind.

+1  A: 

It would be helpful if you provided more detail about what you're trying to do.

One "problem" (if there is one?) is that every time you "grow" the matrix, you are actually recreating the entire matrix from scratch, which is very memory-inefficient. There is no such thing as inserting a value into a matrix in R.
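As a minimal sketch of the pattern being discussed (the variable names and the row contents are made up for illustration), growing from NULL looks like this. Each rbind() call returns a new, larger matrix rather than extending the old one:

```r
# Grow a matrix from NULL one row at a time (illustrative only).
m <- NULL
for (i in 1:5) {
  # rbind() builds a new i-by-3 matrix and copies the earlier rows into it
  m <- rbind(m, c(i, i^2, i^3))
}
dim(m)
```

This works, but the total amount of copying grows with the square of the number of rows.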

An alternative approach would be to store each object in your local environment (with the assign() function) and then assemble your matrix at the end once you know how many objects there are (with get()).
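A rough sketch of that idea follows. The environment `store`, the `row_` name prefix, and the stopping condition are all hypothetical. I use a dedicated environment here rather than the calling environment only to keep the stored objects separate from other variables:

```r
# Store each row under its own name, then assemble the matrix once at the end.
store <- new.env()   # hypothetical holding area for the pieces
n <- 0
repeat {
  n <- n + 1
  assign(paste0("row_", n), c(n, n^2), envir = store)
  if (n >= 4) break  # stand-in for a stopping condition not known in advance
}

# Fetch the pieces in order and bind them in a single call
rows <- lapply(seq_len(n), function(i) get(paste0("row_", i), envir = store))
m <- do.call(rbind, rows)
dim(m)
```

Appending the pieces to a plain list and calling `do.call(rbind, ...)` once at the end would probably serve the same purpose without the assign()/get() bookkeeping.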

Shane
Small clarification. I believe you are correct that growing the matrix is inefficient (although actually time-inefficient, not memory-inefficient). But I believe that the R interpreter *does* efficiently do updates of individual cells in a matrix or vector. That is: `a <- rep(1,10); a[[1]] <- 2` does not copy the entire vector in the second assignment, as a purely-functional implementation would. (This is from reading Chambers' book -- someone with knowledge of the R source please correct if wrong!)
Harlan
Right, but doing an update isn't the same thing as starting off the matrix as NULL and then "adding" to it with rbind/cbind (as described in the question). In your example, you started off with a vector of size 10 and then changed the values. Initializing to the correct size with rep and then changing the values *is* very efficient.
Shane
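
The preallocate-then-update pattern from the last comment can be sketched as follows, assuming the final size `n` is known (or can at least be bounded) up front. The fill values are arbitrary:

```r
n <- 1000
# Allocate the full matrix once, filled with NA placeholders
m <- matrix(NA_real_, nrow = n, ncol = 2)
for (i in seq_len(n)) {
  # Assign into an existing row; the matrix dimensions never change
  m[i, ] <- c(i, sqrt(i))
}
dim(m)
```

If only an upper bound is known, one option is to allocate to that bound and drop the unused rows afterwards, e.g. `m[seq_len(used), , drop = FALSE]` where `used` is a counter you maintain yourself.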