Hi,

I have a vector of strings. Check out my vector, it's awesome:

> awesome
[1] "a" "b" "c" "d" "d" "e" "f" "f"

I'd like to make a new vector that is the same length as awesome but where, if necessary, the strings have been uniqueified. For example, a valid output of my desired function would be

> awesome.uniqueified
[1] "a" "b" "c" "d.1" "d.2" "e" "f.1" "f.2"

Is there an easy, R-thonic and beautiful way to do this? I should say my vector in real life (it's not called awesome) contains 25000ish microarray probeset identifiers.

I'm always nervous when I embark on writing little generic functions (which I'm sure I could do), as I'm sure some R guru has come across this problem in the past and nailed it with some incredible algorithm that doesn't even have to store more than half an element in the vector. I'm just not sure what they might have called it. Probably not uniqueify.

+10  A: 

Try make.unique() where the very first example of the help page is already spot-on:

> make.unique(c("a", "a", "a"))
[1] "a"   "a.1" "a.2"

The help page lists Thomas Minka as author. Buy him a beer one day :)
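For the vector in the question, note that `make.unique()` leaves the first occurrence untouched and only suffixes the later ones, so it gives `"d" "d.1"` rather than the `"d.1" "d.2"` shown in the question. The question only asks for *a* valid output, so that is probably fine. If every member of a duplicated group really has to be numbered, a minimal sketch along the following lines might do it. The `uniqueify()` helper is hypothetical (it is not part of base R) and is built on `ave()`:

```r
awesome <- c("a", "b", "c", "d", "d", "e", "f", "f")

# Base R: the first occurrence is kept as-is, later duplicates get .1, .2, ...
make.unique(awesome)
# [1] "a"   "b"   "c"   "d"   "d.1" "e"   "f"   "f.1"

# Hypothetical helper (an assumption, not a base R function):
# number every member of a duplicated group, starting at 1.
uniqueify <- function(x, sep = ".") {
  n   <- ave(seq_along(x), x, FUN = length)     # size of each element's group
  idx <- ave(seq_along(x), x, FUN = seq_along)  # running index within the group
  ifelse(n > 1, paste(x, idx, sep = sep), x)    # suffix only duplicated values
}

uniqueify(awesome)
# [1] "a"   "b"   "c"   "d.1" "d.2" "e"   "f.1" "f.2"
```

One caveat with the sketch: unlike `make.unique()`, it does not check whether a generated name such as `"d.1"` already exists elsewhere in the vector, so `anyDuplicated()` on the result is a cheap sanity check for real probeset identifiers.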

Dirk Eddelbuettel
To make the whole answer: `make.unique(strsplit(awesome, '')[[1]])`. You need to separate the word out into a character vector.
John
The posting title, and the original example, show 'vector of strings' so I don't think we need `strsplit()`.
Dirk Eddelbuettel
Truly, this has been my cleanest question/answer experience ever on Stack Overflow. Thanks so much! Will definitely get Dr Minka a beer should I ever get the opportunity. I suppose, following this pattern, I owe him a beer for EP, and didn't he develop the 'gates' notation? Probably should get him a beer for that, too.
Mike Dewar