df <- data.frame(var1=c('a', 'b', 'c'), var2=c('d', 'e', 'f'), freq=1:3)

What is the simplest way to expand the first two columns of the data.frame above, so that each row appears the number of times specified in the column 'freq'?

In other words, go from this:

>df
  var1 var2 freq
1    a    d    1
2    b    e    2
3    c    f    3

To this:

>df.expanded
  var1 var2
1    a    d
2    b    e
3    b    e
4    c    f
5    c    f
6    c    f
A:

Here's one solution:

df.expanded <- df[rep(row.names(df), df$freq), 1:2]

Result:

    var1 var2
1      a    d
2      b    e
2.1    b    e
3      c    f
3.1    c    f
3.2    c    f
neilfws
Great! I always forget you can use square brackets that way. I keep thinking of indexing just for subsetting or reordering. I had another solution that is far less elegant and no doubt less efficient. I might post anyway so that others can compare.
wkmor1
For a large `data.frame`, it is more efficient to replace `row.names(df)` with `seq.int(1, nrow(df))` or `seq_len(nrow(df))`.
Marek
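
For comparison, here is a minimal sketch of the integer-index variant suggested in the comment above, applied to the `df` from the question. The helper name `idx` and the final row-name reset are additions for illustration, not part of the original answer. The reset is only there because the question's target output shows row names 1 through 6 rather than 2.1, 3.1, and so on.

```r
df <- data.frame(var1 = c('a', 'b', 'c'), var2 = c('d', 'e', 'f'), freq = 1:3)

# Repeat each row position freq times: 1, 2, 2, 3, 3, 3
# (idx is a hypothetical helper name used only in this sketch)
idx <- rep(seq_len(nrow(df)), df$freq)

# Index rows by position and keep only the first two columns
df.expanded <- df[idx, 1:2]

# Optional: reset row names to 1..n, as in the question's target output
rownames(df.expanded) <- NULL
```

Indexing by integer position avoids matching character row names against the data frame, which is presumably where the saving on large inputs comes from.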