How do I add a column in the middle of an R data frame? I want to see if I have a column named "LastName" and then add it as the third column if it does not already exist.

+4  A: 

1) Testing for existence: Use %in% on the colnames, e.g.

> example(data.frame)    # to get 'd'
> "fac" %in% colnames(d)
[1] TRUE
> "bar" %in% colnames(d)
[1] FALSE

2) You essentially have to create a new data.frame from the first half of the old, your new column, and the second half:

> bar <- data.frame(d[1:3,1:2], LastName=c("Flim", "Flom", "Flam"), fac=d[1:3,3])
> bar
  x y LastName fac
1 1 1     Flim   C
2 1 2     Flom   A
3 1 3     Flam   A
>
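The two steps above can be combined into one conditional. This is only a minimal sketch: the three-row `d` built here is a hypothetical stand-in for the one from example(data.frame), whose contents may differ between sessions and R versions.

```r
# Hypothetical stand-in for the 'd' created by example(data.frame)
d <- data.frame(x = 1, y = 1:3, fac = c("C", "A", "A"))

# Step 1: test for existence; step 2: rebuild the data frame around the new column
if (!"LastName" %in% colnames(d)) {
  d <- data.frame(d[1:2], LastName = c("Flim", "Flom", "Flam"), d[3])
}

names(d)  # "x" "y" "LastName" "fac"
```

Indexing with d[1:2] and d[3] (rather than d[, 3]) keeps the pieces as data frames, so the original column names carry over without being retyped.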
Dirk Eddelbuettel
+1  A: 

Or, using cbind:

> example(data.frame)    # to get 'd'
> bar <- cbind(d[1:3,1:2],LastName=c("Flim", "Flom", "Flam"),fac=d[1:3,3])

> bar
  x y LastName fac
1 1 1     Flim   A
2 1 2     Flom   B
3 1 3     Flam   B
Paolo
I'd recommend against using cbind, as its semantics are rather complicated: depending on the input, you might get either a matrix or a data.frame.
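A small illustration of that difference, using made-up inputs (the names `m` and `df` are just for this sketch): when none of the arguments is a data frame, cbind returns a matrix and coerces everything to a common type.

```r
# Only vectors: the result is a character matrix, and x is coerced to character
m <- cbind(x = 1:3, LastName = c("Flim", "Flom", "Flam"))
is.matrix(m)      # TRUE
is.character(m)   # TRUE

# At least one data frame argument: the result is a data frame
df <- cbind(data.frame(x = 1:3), LastName = c("Flim", "Flom", "Flam"))
is.data.frame(df) # TRUE
is.numeric(df$x)  # TRUE, the column keeps its type
```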
hadley
+8  A: 

One approach is to just add the column to the end of the data frame, and then use subsetting to move it into the desired position:

d$LastName <- c("Flim", "Flom", "Flam")  # assumes d has three rows
bar <- d[c("x", "y", "LastName", "fac")]
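If you would rather not type out every column name, the same idea can be sketched with base append() on the vector of names. The three-row `d` and the variable `old_names` are assumptions made for this example only.

```r
# Hypothetical three-row data frame standing in for 'd'
d <- data.frame(x = 1, y = 1:3, fac = c("C", "A", "A"))

if (!"LastName" %in% names(d)) {
  old_names <- names(d)                    # column order before the addition
  d$LastName <- c("Flim", "Flom", "Flam")  # add at the end
  # Put "LastName" after the second name, then reorder by name
  d <- d[append(old_names, "LastName", after = 2)]
}

names(d)  # "x" "y" "LastName" "fac"
```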
hadley
Nice one! Hadn't seen that trick. And you can directly reassign it to d too.
Dirk Eddelbuettel
I wish I could combine this answer with Dirk's above or select them both as the selected answer. This is so obvious I kick myself for not thinking of it!
JD Long
A: 

Of the many silly little helper functions I've written, this gets used every time I load R. It just makes a list of the column names and indices but I use it constantly.

## Creates a data frame listing each column's name (VAR) and position (COL)
namesind <- function(df) {
    data.frame(VAR = names(df), COL = seq_along(df))
}

ni <- namesind

Use ni to see your column numbers. (ni is just an alias for namesind; I never type namesind, but originally thought it was the better name.) Then, if you want to insert your column at, say, position 12, and your data.frame is named bob and has 20 columns, it would be

bob2 <- data.frame(bob[,1:11], newcolumn, bob[,12:20])

though I liked Hadley's add-at-the-end-and-rearrange answer as well.
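To avoid hardcoding 1:11 and 12:20, the position can be computed instead. Here is a rough sketch; `insert_column` is a hypothetical helper, not a base R function, and the 20-column `bob` is made up for the demonstration.

```r
# Hypothetical helper: insert 'values' as column 'name' at position 'pos'
insert_column <- function(df, values, name, pos) {
  stopifnot(!name %in% names(df), pos >= 1, pos <= ncol(df) + 1)
  df[[name]] <- values                 # add at the end first
  n <- ncol(df)                        # the new column is now column n
  idx <- append(seq_len(n - 1), n, after = pos - 1)
  df[idx]                              # reorder so column n lands at 'pos'
}

# Made-up data frame with 20 columns named V1..V20
bob <- as.data.frame(matrix(1:40, nrow = 2, ncol = 20))
bob2 <- insert_column(bob, c("a", "b"), "newcolumn", 12)
names(bob2)[11:13]  # "V11" "newcolumn" "V12"
```

Because the index vector is built with seq_len() and append(), the same call also works for the first position and for the position just past the last column.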

kpierce8
+1  A: 

I always thought something like append() (unfortunate though the name is) should be a generic function:

## redefine append() as a generic function
append.default <- append
append <- `body<-`(args(append), value = quote(UseMethod("append")))
append.data.frame <- function(x, values, after = length(x))
  `row.names<-`(data.frame(append.default(x, values, after)),
                row.names(x))

## apply the function
d <- if (!"LastName" %in% names(d)) {
  append(d, values = list(LastName = c("Flim", "Flom", "Flam")), after = 2)
} else {
  d
}
Stephen
+1  A: 

Dirk Eddelbuettel's answer works, but you don't need to specify row numbers or supply entries for the LastName column. This code should do it for a data frame named df:

if(!("LastName" %in% names(df))){
    df <- cbind(df[1:2],LastName=NA,df[3:length(df)])
}

(This defaults LastName to NA, but you could just as easily use LastName='Smith'.)

Peter McMahan