I have a server access log with a timestamp for each HTTP request, and I'd like to obtain a count of the number of requests per second. Using sed and cut -c, I've managed to cut the file down to just the timestamps, such as:

22-Sep-2008 20:00:21 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:24 +0000
22-Sep-2008 20:00:24 +0000

What I'd love to get is the number of times each unique timestamp appears in the file. For example, given the input above, I'd like output that looks like:

22-Sep-2008 20:00:21 +0000: 1
22-Sep-2008 20:00:22 +0000: 3
22-Sep-2008 20:00:24 +0000: 2

I've used sort -u to filter the list of timestamps down to a list of unique tokens, hoping that I could use grep like

grep -c -f <file containing patterns> <file>

but this just produces a single grand total of matching lines, because grep -c counts all matches across the file rather than per pattern.

I know this can be done in a single line, stringing a few utilities together ... but I can't think of which. Anyone know?

+14  A: 

I think you're looking for

uniq --count

-c, --count prefix lines by the number of occurrences
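
For example, since the timestamps in the question are already in time order, the whole job is one command (timestamps.txt is an assumed filename here; the padding of the count column varies by uniq implementation):

uniq -c timestamps.txt
      1 22-Sep-2008 20:00:21 +0000
      3 22-Sep-2008 20:00:22 +0000
      2 22-Sep-2008 20:00:24 +0000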

Paul
this works perfectly - thanks
matt b
Note that with other data sets you may need to sort(1) before uniq(1), as uniq will only group adjacent duplicates.
Adam Backstrom
Yes, but the OP's already said he's sorted things so I assumed he was on top of that sort of thing ...
Paul
A: 

Maybe use xargs? I can't put it all together in my head on the spot here, but use xargs on your sort -u output so that for each unique second you can grep the original file and count the matching lines.
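
A rough sketch of that idea (timestamps.txt is an assumed filename; grep -c already counts matching lines, so a separate wc -l isn't needed; note this re-scans the whole file once per unique second, so it's slow on large logs):

sort -u timestamps.txt | xargs -I{} sh -c 'echo "{}: $(grep -cF "{}" timestamps.txt)"'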

Clyde
+1  A: 

Using awk with an associative array would be another way to solve something like this; Tom's answer below shows exactly that.

David
+1  A: 

Just in case you want the output in the format you originally specified (with the number of occurrences at the end):

uniq -c logfile | sed 's/^ *\([0-9]*\) \(.*\)/\2: \1/'
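
With the sample timestamps from the question, that produces:

22-Sep-2008 20:00:21 +0000: 1
22-Sep-2008 20:00:22 +0000: 3
22-Sep-2008 20:00:24 +0000: 2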
Remo.D
A: 

Using awk:

awk '{ count[$1 " " $2 " " $3]++ }
     END { for (w in count) print w ": " count[w] }' file.txt
Tom