My program generates two data files, a.txt and b.txt. Taking a.txt as an example, its content looks like this:

0,0
0,1
1,0
-3,1
1,-2
1,3
......

b.txt has the same format as a.txt.

I want to find the average distance between the corresponding rows of a.txt and b.txt. For example, if b.txt looks like this:

0,0
1,1
1,2
-3,1
1,-2
1,3
......

Then the average distance is calculated like this:

  ( sqrt[square(0-0)+square(0-0)]
    + sqrt[square(0-1)+square(1-1)]
    + sqrt[square(1-1)+square(0-2)]
    + sqrt[square((-3)-(-3))+square(1-1)]
    + sqrt[square(1-1)+square((-2)-(-2))]
    + ...... )
  / total number of rows (e.g. 10,000)

This gives the average distance between the contents of the two files.

Question: how can I write a shell script that carries out the calculation above and outputs the final average distance?

Hint: you can think of this as two sets of coordinates stored in two files.

Any help would be appreciated. Many thanks.

Addition: each file has about 10,000 to 100,000 rows.

+1  A: 

This would be much easier in a scripting language like Perl or Python. However, in a shell script you would probably want to use:

  • cut to split the files up
  • bc to do the calculation
  • whatever loop construct you prefer in your script language

I've left this vague in case it is homework.
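
For readers who want something concrete, here is one way the pieces could fit together. It is only a sketch under a few assumptions: both files have the same number of x,y rows, the function name avg_distance_bc is made up for illustration, and paste plus read's IFS splitting stand in for cut. The loop writes a small bc program and pipes it to a single bc process, which avoids starting bc once per row on files with 100,000 rows.

```shell
#!/bin/sh
# Sketch: average Euclidean distance between corresponding rows of two files.
# Assumes each line is "x,y" and both files have the same number of lines.
# avg_distance_bc is a hypothetical name, not part of any tool.
avg_distance_bc() {
    # paste -d, joins line N of both files into "x1,y1,x2,y2"
    paste -d, "$1" "$2" | {
        # s holds the running sum of distances, n the row count
        echo "scale=6; s = 0; n = 0"
        while IFS=, read -r x1 y1 x2 y2; do
            # Parentheses keep negative values such as -3 well-formed in bc
            echo "s += sqrt((($x1) - ($x2))^2 + (($y1) - ($y2))^2); n += 1"
        done
        # Only this final expression is printed; assignments print nothing
        echo "s / n"
    } | bc
}
```

Called as avg_distance_bc a.txt b.txt, it prints the average with six decimal places. Note that bc omits the leading zero for values below 1.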

Nick Fortescue
Thanks for your kind answer.
MaiTiano
+3  A: 

AWK is really good for doing simple mathematical calculations on files containing rows of delimited data. There is a very good AWK guide here: http://www.vectorsite.net/tsawk.html

A general structure for this program could be:

  • Store the first row
  • For each additional row, calculate the distance between it and the last row and overwrite the stored values
  • Add the distance to a variable containing the distance sum
  • Divide at the end by the number of rows seen (conveniently stored for you by AWK)
  • Output the result
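
The outline above walks a single file and compares each row with the previous one. For the question as asked, where row N of a.txt is compared with row N of b.txt, one possible adaptation is to join the two files line by line with paste and let AWK accumulate the sum. This is a sketch rather than the answerer's own code: the function name avg_distance_awk is made up for illustration, and it assumes both files have the same number of x,y rows.

```shell
#!/bin/sh
# Sketch: average distance between corresponding rows of two "x,y" files.
# avg_distance_awk is a hypothetical name; both files are assumed to have
# the same number of lines.
avg_distance_awk() {
    # paste -d, turns line N of each file into "x1,y1,x2,y2"
    paste -d, "$1" "$2" | awk -F, '
        # Fields 1,2 come from the first file; fields 3,4 from the second
        { sum += sqrt(($1 - $3)^2 + ($2 - $4)^2) }
        # NR is the number of rows read; guard against empty input
        END { if (NR > 0) printf "%.6f\n", sum / NR }
    '
}
```

Called as avg_distance_awk a.txt b.txt on only the first five sample rows shown in the question, it prints 0.600000, since the distances are 0, 1, 2, 0 and 0. Because the input is streamed through a pipe, memory use stays small even with 100,000 rows.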
danben