tags:
views: 34
answers: 1

Here is an example of the type of data that I am trying to manipulate:

1213954013615]: 992
1213954013615]: 993
1213954013615]: 994
1213954013615]: 995
1213954013615]: 995
1213954013615]: 996
1213954013615]: 996
1213954013615]: 996
1213954013615]: 998
1213954247424]: 100
1213954247424]: 1002
1213954247424]: 1007
1213954303390]: 111
1213954303390]: 1110
1213954303390]: 1111
1213954303390]: 1112
1213954303390]: 1114
1213954303390]: 112
1213954303390]: 112
1213954303390]: 112
1213954303390]: 112

What I am hoping to achieve is to generate an average keyed on the epoch number on the left. For example, adding 992, 993, 994, 995, 995, 996, 996, 996 and 998 and dividing by the number of lines that share the epoch time "1213954013615", then doing the same for each unique epoch time.

Here is what I have so far:

awk '{arr[$1] += $2} END {for (i in arr) print "[epoch", i, arr[i]/NR}'

But this, of course, divides each sum by NR, the total number of input lines, rather than by the number of lines for that particular epoch. I need something equivalent to "uniq -c" for this, but can't find an equivalent in awk.

Many thanks.

Answer (score +3):

You almost have it. It's easy to count the number of occurrences of each epoch in a separate array n:

awk '{arr[$1] += $2; ++n[$1]} END {for (i in arr) print "[epoch", i, arr[i]/n[i]}'
schot
Fastest response ever. Works brilliantly. Thanks schot!
Jimjim Beefcutter
@Jimjim I only had to add a few characters to your own solution, glad to be of help.
schot
@Jimjim, you should consider accepting this answer if it works for you.
glenn jackman
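
For reference, the accepted one-liner can be sketched against a small sample in the question's format. The file name `data.txt` is an assumption for illustration, not from the original post:

```shell
# Create a small sample in the question's format (data.txt is an
# assumed file name, not from the original post).
cat > data.txt <<'EOF'
1213954013615]: 992
1213954013615]: 998
1213954247424]: 100
1213954247424]: 1002
EOF

# arr[$1] accumulates each epoch's values; n[$1] counts that epoch's
# lines; END divides sum by count per epoch. sort fixes the order,
# since awk's for-in iteration order is unspecified.
awk '{arr[$1] += $2; ++n[$1]} END {for (i in arr) print "[epoch", i, arr[i]/n[i]}' data.txt | sort
# → [epoch 1213954013615]: 995
# → [epoch 1213954247424]: 551
```

Note that $1 here is the whole "1213954013615]:" token, bracket and colon included, so the printed "[epoch " prefix reconstructs the original log format.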