Hello nice people,

I have a single column in a data frame in R that looks something like this:

blue
green
blue
yellow
black
blue
green

How do I remove all the rows that indicate blue? Please keep in mind that I don't want a NULL value represented in that row: I want the entire row removed.

Thank you :)

A: 
> Data[Data!="blue"]
[1] "green"  "yellow" "black"  "green"

or

> Data[which(Data!="blue",TRUE)]
[1] "green"  "yellow" "black"  "green"

Edit to respond to Joris' comment (this works for 1-column data.frames):

> str(Data)
'data.frame':   7 obs. of  1 variable:
 $ V1: Factor w/ 4 levels "black","blue",..: 2 3 2 4 1 2 3
Joshua Ulrich
That works for a vector, but not for a data frame. Add a comma after "blue" and drop the which() option; I can't think of a reason to use an extra function when there is absolutely no need for it.
Joris Meys
Thanks Joshua. I'm assuming Data is the name of the data frame, correct? I tried this out, but all it does is print out my data frame. Here is what I inputted: datafr[which(datafr!="ScreenSaverEngine",TRUE)] I've replaced Data with the name of my data frame which is 'datafr'
Eric Brotto
Joris and Joshua... I'm still a bit confused. I've added the comma like this: datafr[datafr!="ScreenSaverEngine,"]. datafr is my dataframe and ScreenSaverEngine are the rows I want to get rid of.
Eric Brotto
@Eric : see the answer of csgillespie. You have to put the comma behind the quotation mark, not before.
Joris Meys
@Joris: it works if the data.frame only has one column. I assumed "single column in a data frame" meant the data.frame only had one column.
Joshua Ulrich
@Eric: That comma behind the " will tell R that you're interested in rows. If you have a data.frame and only one element within [], R will think you're interested in columns. [, x] is equivalent to the latter statement.
Roman Luštrik
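As a small sketch of the comma rule described in the comment above: the data frame below is made up for illustration (the column `n` and the values other than "ScreenSaverEngine" are assumptions), and `stringsAsFactors = FALSE` is set explicitly so the result does not depend on the R version.

```r
# Hypothetical two-column data frame; only "ScreenSaverEngine" comes from the thread.
datafr <- data.frame(FOCUS.APP = c("Safari", "ScreenSaverEngine", "Mail"),
                     n = 1:3,
                     stringsAsFactors = FALSE)

# Logical vector with one entry per row: TRUE for the rows to keep.
keep <- datafr$FOCUS.APP != "ScreenSaverEngine"

# With a comma after the index, the index selects rows (all columns kept).
rows_kept <- datafr[keep, ]

# With a single index and no comma, the index selects columns instead.
cols_kept <- datafr[1]

print(rows_kept)
print(cols_kept)
```

With the comma, the two rows that are not "ScreenSaverEngine" remain. Without it, the whole first column comes back and no rows are dropped.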
+2  A: 

What about

> df1 = data.frame(a=c("Red", "Blue", "Red"), b=1:3)
> df1[df1$a!= "Blue",]
    a b
1 Red 1
3 Red 3
csgillespie
Thanks csgillespie. Unfortunately I'm still having trouble. This is what I entered: datafr[datafr$"FOCUS.APP"!= "ScreenSaverEngine",]. datafr is my data frame; FOCUS.APP is the name of the column; ScreenSaverEngine is the value in the rows I want to eliminate.
Eric Brotto
@Eric : You have to remove the "" around FOCUS.APP. Please be a bit more precise when copying code; you seem to make those errors regularly. Did you already go through Owen's R guide and the Introduction to R?
Joris Meys
@Joris: I've tried a test case with "" around FOCUS.APP and it worked.
csgillespie
@csgillespie That's new to me, thx for the pointer.
Joris Meys
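To illustrate the point settled in the comments above, here is a minimal sketch showing that the quoted and unquoted forms of `$` return the same column. The data frame and its values are hypothetical stand-ins for Eric's data, not his actual file.

```r
# Hypothetical stand-in for the data frame discussed above.
datafr <- data.frame(FOCUS.APP = c("Safari", "ScreenSaverEngine", "Mail"),
                     stringsAsFactors = FALSE)

unquoted <- datafr$FOCUS.APP        # the usual form
quoted   <- datafr$"FOCUS.APP"      # a string literal after $ is also accepted
bracket  <- datafr[["FOCUS.APP"]]   # [[ ]] always takes the name as a string

# All three forms should give the same character vector.
print(identical(unquoted, quoted))
print(identical(unquoted, bracket))
```

So the quotation marks are probably not what caused the trouble; the row filter itself, datafr[datafr$FOCUS.APP != "ScreenSaverEngine", ], is the same either way.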
+4  A: 

Also be careful about the difference between a factor variable and character vector.

Factors retain all original levels by default, even after subsetting, unless you re-create the factor from the altered vector or drop the unused levels explicitly (for example with droplevels()).

> DF <- data.frame(v = factor(c("red", "blue", "green", "blue")))
> summary(DF)
     v    
 blue :2  
 green:1  
 red  :1  
> summary(DF[ DF$v != "blue", , drop=FALSE])
     v    
 blue :0  
 green:1  
 red  :1  
> DF <- DF[ DF$v != "blue", , drop=FALSE]; DF$v <- factor(DF$v); summary(DF)
     v    
 green:1  
 red  :1  
> 
Dirk Eddelbuettel
If you don't want that behaviour, you shouldn't be using a factor.
hadley
I have no particular opinion here: I like factors. I am merely providing assistance to what appears to be a new user -- as those are most likely to fall into the trap provided by R's default value of `stringsAsFactors`.
Dirk Eddelbuettel
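A minimal sketch of the same clean-up using droplevels() from base R, as an alternative to re-creating the factor by hand. It reuses the DF example from the answer above; the name `kept` is only illustrative.

```r
# Same example data as in the answer above, with the factor made explicit.
DF <- data.frame(v = factor(c("red", "blue", "green", "blue")))

# Remove the "blue" rows; drop = FALSE keeps the result a data frame.
kept <- DF[DF$v != "blue", , drop = FALSE]
print(levels(kept$v))   # the unused level "blue" is still listed here

# droplevels() removes unused levels from every factor column.
kept <- droplevels(kept)
print(levels(kept$v))   # only "green" and "red" remain
```

If the column is stored as a character vector rather than a factor (for example by passing stringsAsFactors = FALSE to data.frame()), there are no levels to clean up in the first place.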
+3  A: 

If all those square brackets and commas and dollar signs confuse you, then why not try 'subset':

> d=data.frame(a=c("Red", "Blue", "Red"), b=1:3)
> subset(d,a!="Blue")
    a b
1 Red 1
3 Red 3
Spacedman
Does subset refactor the variable to get rid of empty levels, or do the empty levels remain? What would happen if one entered table(a) afterwards?
Farrel
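A quick sketch to check the question above, assuming the column is a factor (made explicit here with factor(), since whether data.frame() converts strings depends on the R version). The name `s` is only illustrative, and the table is taken on `s$a` because a bare `a` would not exist in the workspace.

```r
# Same data as in the answer above, with column a stored as a factor.
d <- data.frame(a = factor(c("Red", "Blue", "Red")), b = 1:3)

s <- subset(d, a != "Blue")

# subset() removes the rows but keeps the factor's levels,
# so "Blue" still appears in the table with a count of 0.
print(table(s$a))

# Dropping unused levels afterwards removes the empty entry.
print(table(droplevels(s)$a))
```

So the empty level remains after subset(), and it has to be dropped in a separate step if it is not wanted.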