ansaurus

Question

Answer 1

+2 A:

For your last question, you can use `split` and print both columns at once:

awk '{split($2,tab,":"); id = tab[1]; print id " -> " $3}' filename

That prints:

1 -> 1.6
2 -> 1.1
2 -> 3.4
3 -> -1.3
3 -> 6.0
3 -> 1.1
4 -> -1.0
5 -> 10.9

For the complete result, you can use:

awk -F, '{ split($1,line,"    "); split(line[2],tab,":"); id=tab[1]; if (sums[id]=="") {sums[id] = 0;} sums[id]+=line[3];} END {for (i=1;i<=length(sums);i++) print i " -> "sums[i]}' < test

That prints:

1 -> 1.6
2 -> 4.5
3 -> 5.8
4 -> -1
5 -> 10.9
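
Note that the `END` loop above assumes the ids are exactly 1..N, and it relies on `length(sums)`, which not every awk accepts for arrays. Here is a minimal sketch that avoids both. It assumes whitespace-separated input where the second field looks like `id:something` and the third field is the number to add; the sample rows are made up for illustration, not taken from the question.

```shell
# Hypothetical sample input: the layout is an assumption.
input=$(mktemp)
cat > "$input" <<'EOF'
a 1:x 1.6
b 2:x 1.1
c 2:y 3.4
d 4:x -1.0
EOF

# Sum field 3 per id, where the id is the part of field 2 before the first colon.
# for (id in sums) visits keys in no defined order, so sort the result afterwards.
result=$(awk '{ split($2, tab, ":"); sums[tab[1]] += $3 }
              END { for (id in sums) print id " -> " sums[id] }' "$input" | sort -n)
echo "$result"
rm -f "$input"
```

Because the loop walks the keys that actually exist, gaps in the ids (no 3 in the sample) are not a problem.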

Elenaher 2010-10-14 15:04:51

Thanks. I did not know about the `split` function in `awk`.

wok 2010-10-14 15:05:39

Thanks, your code works (although I had to edit the input, since it had a missing space that the script does not handle).

wok 2010-10-14 15:39:16

Answer 2

+3 A:

This assumes you have the two columns you showed before (e.g. `1 1.1`), with rows for the same key next to each other:

BEGIN {
    last = "";
    sum = 0;
}

{
    if ($1 != last) {
        if (last != "") {
            print last " " sum;
        }
        sum = 0;
        last = $1;
    }
    sum = sum + $2
}

END {
    print last " " sum;
}
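
The script only compares each key with the previous one, so it depends on equal keys being adjacent. If the input is not grouped, one option is to sort it first. Here is a sketch of the same idea as an inline program, assuming two-column input (the sample rows are invented):

```shell
# Invented, unsorted two-column sample: key, then value.
data=$(mktemp)
printf '2 1.1\n1 1.6\n2 3.4\n' > "$data"

# Sort so equal keys are adjacent, then print a total each time the key changes.
grouped=$(sort -n "$data" | awk '
  NR == 1 || $1 != last { if (NR > 1) print last " " sum; sum = 0; last = $1 }
  { sum += $2 }
  END { if (NR > 0) print last " " sum }')
echo "$grouped"
rm -f "$data"
```

The `NR == 1` check replaces the empty-string sentinel, so a first key of `0` is handled too.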

Will Hartung 2010-10-14 15:06:13

This works great using the output of Elenaher's line.

wok 2010-10-14 15:17:25

Your answer is a great fit for my second question. I wish I could upvote it more than once.

wok 2010-10-14 15:53:53

Answer 3

+3 A:

So, assuming that your input looks like this:

unique_col, to_sum
1.3, 1 2 3
1.3, 5 6 7
1.4, 2 3 4

Then this should do the trick:

$ awk -F, '{ if (seen[$1] == "") { split($2, to_sum, " "); seen[$1] = 0; for (x in to_sum) seen[$1] += to_sum[x]; }} END { for (x in seen) { if (x != "") { print x " " seen[x]; }}}' < input
1.3 6
1.4 9

David Wolever 2010-10-14 15:06:51

It works great on your input, but mine is a bit different. Still thanks.

wok 2010-10-14 15:41:57

Ah, sorry — wrote it before you had the example up, so I had to guess =\

David Wolever 2010-10-14 19:17:50

Answer 4

A:

{
  b = $2               # assign column 2 to the variable 'b'
  sub(/:.*/, "", b)    # remove the first colon and everything after it
  results[b] += $3     # accumulate column 3 under that key
}
END { for (result in results) print result " " results[result] }

stew 2010-10-14 15:10:20

I get the following message: `syntax error near unexpected token `/:.*/,'`

wok 2010-10-14 15:16:49
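
That message looks like it comes from the shell rather than from awk, which suggests the program text was pasted straight into bash. One way to run it, as a sketch: save the program to a file and pass it with `awk -f`. The file names `sum.awk` and `data.txt` and the sample rows are hypothetical.

```shell
workdir=$(mktemp -d)

# Save the awk program to a file (sum.awk is a hypothetical name).
cat > "$workdir/sum.awk" <<'EOF'
{
  b = $2               # copy column 2
  sub(/:.*/, "", b)    # drop the first colon and everything after it
  results[b] += $3     # accumulate column 3 under that key
}
END { for (result in results) print result " " results[result] }
EOF

# Made-up sample rows; the real input layout is assumed.
printf 'a 1:x 1.6\nb 2:x 1.1\nc 2:y 3.4\n' > "$workdir/data.txt"

# -f tells awk to read the program from a file instead of the command line.
result=$(awk -f "$workdir/sum.awk" "$workdir/data.txt" | sort -n)
echo "$result"
rm -rf "$workdir"
```

Alternatively, the whole program can be wrapped in single quotes and given to `awk` directly on the command line.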

Answer 5

+4 A:

$ awk -F"[: \t]+" '{a[$2]+=$NF}END{for(i in a ) print i,a[i] }' file
4 -1
5 10.9
1 1.6
2 4.5
3 5.8
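
The keys come out in no particular order because awk does not define the iteration order of `for (i in a)`. If sorted output matters, one option is to pipe the result through `sort -n`. Here is a small sketch with made-up rows (the input layout is an assumption):

```shell
# Made-up rows: field 2 is "id:label" and the last field is the value (assumed layout).
data=$(mktemp)
printf 'a 3:x -1.3\nb 1:x 1.6\nc 3:y 6.0\nd 3:z 1.1\n' > "$data"

# Split on runs of colons, spaces, or tabs, so $2 is the id and $NF the value.
sorted=$(awk -F'[: \t]+' '{ a[$2] += $NF } END { for (i in a) print i, a[i] }' "$data" | sort -n)
echo "$sorted"
rm -f "$data"
```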

ghostdog74 2010-10-14 15:29:56

So short, and still, it works! Thanks!

wok 2010-10-14 15:38:30

+1 Definitely the most elegant, with the `"[: \t]+"` separator!

Elenaher 2010-10-14 15:40:04

I have finally decided to accept this answer, since it is more general and can be adapted to many similar problems by tweaking the separators or the column numbers.

wok 2010-10-14 15:51:45


Awk conditional sum from a CSV file
