ansaurus

Question

Answer 1

+7 A:

Use regular expressions:

> DF <- data.frame(col=c("Blue-#105", "Green-#8845", "Blue-#999"))
> DF
          col
1   Blue-#105
2 Green-#8845
3   Blue-#999
> DF$col <- gsub("-\\#.*", "", DF$col)
> DF
    col
1  Blue
2 Green
3  Blue
>

Here we say that everything from the first -# to the end of each string gets replaced by the empty string, or in other words, removed. The .* is regular expression lingo for "whatever follows": any char (the dot) repeated as many times as it fits (the star). The # is escaped here, although it is not a special character in R's default regular expressions, so the escape is harmless rather than required.
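
As a small sketch of how this pattern behaves on a plain character vector: the vector x below, including the value "Red" with no suffix, is made up for illustration and is not from the question. Values that contain no -# are left untouched.

```r
# Hypothetical sample values; "Red" has no "-#" suffix at all
x <- c("Blue-#105", "Green-#8845", "Red")

# Remove "-#" and everything after it
cleaned <- gsub("-\\#.*", "", x)

# The same pattern without the escape, for comparison
cleaned_unescaped <- gsub("-#.*", "", x)

print(cleaned)
print(identical(cleaned, cleaned_unescaped))
```

Since each string can match at most once here (the .* runs to the end), sub() and gsub() give the same result for this pattern.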

Dirk Eddelbuettel 2010-09-27 15:50:40

Answer 2

+3 A:

Use the sub or gsub function. For your example you could do something like:

newcolors <- sub("^([^-]*)-.*$", "\\1", oldcolors )

This assumes that the colors are in a vector 'oldcolors' and puts the results into 'newcolors'. The pattern starts at the beginning of the string (^), then matches 0 or more characters that are not dashes ([^-]*); the parens around that say to save what is matched. Then it matches a dash followed by any further characters (.*) until the end of the string ($). The matched portion (the entire string) is then replaced by whatever was matched within the parens (the color).
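
A minimal sketch of the capture-group idea, using the sample values from the other answer as 'oldcolors'. The second call, which moves the parens to keep the part after the dash, and the name 'codes' are illustrative additions, not part of the original answer.

```r
# Sample values taken from the example data in this thread
oldcolors <- c("Blue-#105", "Green-#8845", "Blue-#999")

# Keep what is captured before the first dash (the color)
newcolors <- sub("^([^-]*)-.*$", "\\1", oldcolors)

# Moving the parens keeps what follows the first dash instead
codes <- sub("^[^-]*-(.*)$", "\\1", oldcolors)

print(newcolors)
print(codes)
```

Running both calls gives the two halves of the original column, which may be handy if the goal is to split it rather than just trim it.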

Greg Snow 2010-09-27 15:52:00

Hey Greg, I like how concise your answer is, but I am getting an error: unexpected ',' in "newdatafr <- gsub("^([^-]*)-.*$")," newdatafr is equivalent to newcolors in your example.

Eric Brotto 2010-09-27 16:02:51

@Eric : then I think you should copy-paste better. It works fine for me, and the error you provide does not show the same code as Greg posted here.

Joris Meys 2010-09-27 16:15:54

FWIW my `gsub()` call is shorter / more concise than the `sub()` call shown here. Otherwise, they are of course essentially equivalent.

Dirk Eddelbuettel 2010-09-27 18:16:02

Yes, the two regexes are equivalent for the example data given. The difference is that Dirk's focuses on what to throw away and mine focuses on what to keep. Which is better would depend on possible differences in future data.
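
To make that concrete, here is a rough sketch with made-up values (the vector 'future' is hypothetical, not data from the question) where the two patterns disagree:

```r
# Hypothetical future data: an extra dash, and a suffix without "#"
future <- c("Blue-Green-#105", "Blue-105", "Red")

# "Throw away" approach: drop "-#" and everything after it
thrown <- gsub("-\\#.*", "", future)

# "Keep" approach: keep only what precedes the first dash
kept <- sub("^([^-]*)-.*$", "\\1", future)

print(data.frame(future, thrown, kept))
```

The first pattern leaves "Blue-105" alone and keeps "Blue-Green", while the second reduces both to "Blue"; neither changes "Red".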

Greg Snow 2010-09-27 20:26:30

Splitting a column in an R dataframe.
