How do I achieve the equivalent of `awk '{print $1}' /tmp/data | sort | uniq -c` for a particular column in R?

Example: cat /tmp/data

alama 
alama
alama
bbbb
bbbb
ccc
alama
bbbb
bbbb

awk '{print $1}' /tmp/data | sort | uniq -c

  1 
  4 alama
  4 bbbb
  1 ccc

That is, I want a count of every distinct item in the column.


Based on @Joshua's suggestion and my particular needs ...

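s <- data.frame(table(spam[, 1]))        # count each distinct value in column 1
p <- s[s$Freq >= 3, ]                    # keep values that occur at least 3 times
p[order(p$Freq, decreasing = TRUE), ]    # sort by count, most frequent first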
+4  A: 
> set.seed(21)
> Data <- data.frame(V1=sample(letters[1:5],20,TRUE))
> length(unique(Data[,1]))
[1] 5

Based on your updated question -- assuming data is in x:

> table(x)
x
alama  bbbb   ccc 
    4     4     1 
> data.frame(table(x))
      x Freq
1 alama    4
2  bbbb    4
3   ccc    1
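
For a self-contained version of the above, here is a minimal sketch that builds the example column inline and reproduces the `uniq -c` style counts. The vector `x` holds the sample values from the question (the blank line is left out), the threshold of 3 follows the asker's filter, and the names `counts` and `frequent` are only illustrative.

```r
# Sample values from the question (blank line omitted); `x` is an assumed name.
x <- c("alama", "alama", "alama", "bbbb", "bbbb", "ccc", "alama", "bbbb", "bbbb")

# table() counts each distinct value, much like `sort | uniq -c`.
counts <- as.data.frame(table(x), stringsAsFactors = FALSE)

# Keep values seen at least 3 times, most frequent first.
frequent <- counts[counts$Freq >= 3, ]
frequent <- frequent[order(frequent$Freq, decreasing = TRUE), ]
print(frequent)
```

When reading the file directly, something like `read.table("/tmp/data", stringsAsFactors = FALSE)[, 1]` could supply `x`. Note that `read.table` skips blank lines by default, so the empty entry that `uniq -c` reports would not show up in the counts.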
Joshua Ulrich
... or use `nlevels` if it is a factor.
Richie Cotton
@Richie But if the factor has unused levels, the result will be different.
Marek
@Marek: `nlevels(x[, drop=TRUE])` in that case.
Richie Cotton
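
To illustrate the point about unused levels, here is a small sketch with a made-up factor `f` (not from the question). Subsetting a factor keeps its original levels, so `nlevels` can overcount until the unused ones are dropped.

```r
# Hypothetical factor for illustration; level "c" becomes unused after subsetting.
f <- factor(c("a", "b", "b", "c"))
g <- f[f != "c"]

nlevels(g)                  # 3: still counts the unused level "c"
length(unique(g))           # 2: counts only the values actually present
nlevels(g[, drop = TRUE])   # 2: unused levels dropped first
nlevels(droplevels(g))      # 2: same idea, using droplevels()
```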
Edited the question to clarify it.
@Joshua: nice :) .... that is just what I needed. What I ended up with was `s<-data.frame(table(spam[,1])); p<-s[s$Freq>=3,]; p[order(p$Freq,decreasing=TRUE),]`