ansaurus

Question

Answer 1

A:

I haven't fully understood your problem, but suppose you have two files like these:

File 1

100 C 20.2
300 B 33.3

File 2

110 C 20.23
320 B 33.34

and you want to compare the 3rd column of the two files.

# file1 and file2 are assumed to be already-open file objects
list1 = [float(line.split()[2]) for line in file1]  # list of 3rd column values
list2 = [float(line.split()[2]) for line in file2]

# True where the two values differ by less than 2 (in either direction)
result = [abs(x - y) < 2 for x, y in zip(list1, list2)]

OR

result = [x - y for x, y in zip(list1, list2) if abs(x - y) > 2]

Is this what you want?
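A self-contained sketch of the same idea, with the two sample files above held as in-memory strings so it runs as-is. The names `third_column`, `differences` and `exceeds` are made up for illustration, and whitespace-separated columns are assumed (which may not match the real data).

```python
import io

# The sample data from above, as in-memory "files"
FILE1 = "100 C 20.2\n300 B 33.3\n"
FILE2 = "110 C 20.23\n320 B 33.34\n"

def third_column(fileobj):
    """Return the 3rd whitespace-separated column of each non-blank line as floats."""
    return [float(line.split()[2]) for line in fileobj if line.strip()]

list1 = third_column(io.StringIO(FILE1))
list2 = third_column(io.StringIO(FILE2))

# Absolute difference for each pair of rows
differences = [abs(a - b) for a, b in zip(list1, list2)]

# Flag the rows whose values differ by more than 2
exceeds = [d > 2 for d in differences]
print(exceeds)
```

Using `zip` pairs the rows up without indexing, and stops at the shorter list if the files have different lengths.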

TheMachineCharmer 2010-03-05 14:05:17

`2437` is a good prime number!

TheMachineCharmer 2010-03-05 14:16:56

How can it be what he wants? His data columns are FIXED-WIDTH, and the 5th column has some entries that are all blank. Using str.split() on his data will create a mess. His bolded column is about the NINTH column -- I can't see where you get 3 contiguous columns from.

John Machin 2010-03-05 14:37:59

Right, I didn't notice that. Thanks. I should have used slicing. +1 to your good answer :D. Also, I did mention that I hadn't understood the question completely.

TheMachineCharmer 2010-03-05 18:21:32
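To illustrate the point made in the comments above, here is a small sketch of why `str.split()` misbehaves on fixed-width data with a blank field, and how slicing avoids that. The two records and their column layout are invented for this example; they are not the asker's data.

```python
# Made-up fixed-width layout: id at [0:3], code at [4:5], value at [6:12].
full_row  = "100 C   20.2"
blank_row = "300     33.3"   # the code field is blank here

def parse_fixed(row):
    """Slice a record by character position, then strip the padding."""
    return row[0:3].strip(), row[4:5].strip(), float(row[6:12])

# split() collapses the blank field, so the columns shift left
print(full_row.split())    # three items
print(blank_row.split())   # only two items: index 2 no longer exists

# Slicing keeps every field in its place, blank or not
print(parse_fixed(full_row))
print(parse_fixed(blank_row))
```

With `split()`, asking for column index 2 of the second record would raise an `IndexError`; with slicing the blank field simply comes back as an empty string.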

Answer 2

+2 A:

The direct answer to your question is to alter the last condition, if y[35:38] != x[35:38]:, so that the "field" at [35:38] is converted to int (or float...) and a difference can be computed. That gives something like:

   try:
     iy = int(y[35:38])
     ix = int(x[35:38])
   except ValueError:
     # take whatever action is appropriate here, including silently ignoring the record
     print("Unexpected value for record # %s" % x[7:10])
   else:
     # only compare when both fields were converted successfully
     if abs(ix - iy) > 2:
       print(x[7:10])

More indirectly, the snippet in the question prompts the following remarks, which may in turn suggest different approaches to the problem.

First off, if the files are strictly "fixed format", if they are very big, and/or if nothing else is done with the other "field" values found in the file, the current approach is valid and probably very efficient.
Alternatively, the logic may be made more resilient to possible variations in the file structure, etc., by parsing the "fields" of the file rather than addressing them as slices of a long string. Look into the standard library's csv module for possible parser support.
Some tests seem goofy / always true (like comparing a 3-character slice to a 2-character string literal). Aside from being logically wrong, this too points to a more "parsed" solution, where such logical errors are more readily avoided or more obvious.
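As a rough sketch of the "parsed" approach described in these remarks, each line could be turned into named fields once, and the comparison written against those names. The offsets below are the slices used in the question; the field names (`record_id`, `flag`, `next_flag`, `value`), the helper `make_line` and the sample values are all assumptions made up for this example.

```python
from collections import namedtuple

# Field names are guesses; only the offsets come from the question.
Record = namedtuple("Record", "record_id flag next_flag value")

def parse_record(line):
    """Split one fixed-width line into named fields."""
    raw = line[35:38].strip()
    try:
        value = int(raw)
    except ValueError:
        value = None  # blank or non-numeric field
    return Record(line[7:10].strip(), line[11:12], line[12:13], value)

def differs(x, y, threshold=2):
    """True when both values were parsed and differ by more than threshold."""
    if x.value is None or y.value is None:
        return False
    return abs(x.value - y.value) > threshold

def make_line(record_id, flag, value):
    """Build a synthetic 38-character line for the demo (not real data)."""
    chars = [" "] * 38
    chars[7:10] = record_id.rjust(3)
    chars[11] = flag
    chars[35:38] = value.rjust(3)
    return "".join(chars)

x = parse_record(make_line("101", "C", "20"))
y = parse_record(make_line("101", "C", "25"))
if differs(x, y):
    print(x.record_id)
```

Once the fields have names and types, a mistake such as comparing a 3-character slice with a 2-character literal is much harder to make unnoticed.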

mjv 2010-03-05 14:35:29

Answer 3

+2 A:

Nothing to do with your problem, but this:

        if y[11]=="C":
            if y[35:38]!= "EN":
# I don't see any "EN" or "OTE" anywhere in your sample input.
# In any case the above condition will always be true, because
# y[35:38] appears to be a 3-byte string but "EN" is a 2-byte string.
                if y[35:38] != "OTE":
                    if x[11]=="C":
                        if x[12] != "C":
                            if y[35:38] !=x[35:38]:
                                print x [7:10]

is ummmmm ...

You may wish to consider an alternative way of expressing it, e.g.

if (x[11] == "C" == y[11]
        and x[12] != "C"
        and y[35:38] not in ("EN?", "OTE")
        and y[35:38] != x[35:38]):
    print(x[7:10])
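For anyone who wants to try the combined condition, here is a small runnable sketch that wraps it in a function and also shows why the original "EN" test can never fail. The function name `should_report`, the helper `make_line` and the sample lines are invented for the demo; "EN?" is just the placeholder used above, since the real 3-character value is unknown.

```python
def should_report(x, y, excluded=("EN?", "OTE")):
    """The combined condition, as a predicate on two fixed-width lines."""
    return (x[11] == "C" == y[11]
            and x[12] != "C"
            and y[35:38] not in excluded
            and y[35:38] != x[35:38])

def make_line(flag, value):
    """Synthetic 38-character line: flag at [11], value right-aligned in [35:38]."""
    return " " * 11 + flag + " " * 23 + value.rjust(3)

x = make_line("C", "20")
y = make_line("C", "25")
print(should_report(x, y))

# A 3-character slice can never equal a 2-character literal,
# so a test like y[35:38] != "EN" is true for every line.
padded = make_line("C", "EN")[35:38]
print(repr(padded), padded != "EN")

# If the field is a padded two-letter code (an assumption about intent),
# stripping the padding first makes the comparison meaningful.
print(padded.strip() == "EN")
```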

John Machin 2010-03-05 14:39:19

Thanks for the tip :) The code now looks clean and neat :)

forextremejunk 2010-03-05 14:42:46


Calculating difference within lists
