I have a minor problem while checking for elements in a list. I have two files with contents something like this:

file 1:        file2:
 47            358 47
 48            450 49
 49            56 50

I parsed both files into two lists and used the following code to check

for i in file_1:
   for j in file_2:
      j = j.split()
      if i == j[1]:
        x=' '.join(j)
        print >> write_in, x

I am now trying to get a "0" if a value from file_1 is not in file_2. For example, the value "48" is not in file_2, so I need output like the following (with only one space between the two numbers). Both conditions should write to the same output file:

output_file:
  358 47
   0 48
  450 49
   56 50

I tried using the dictionary approach but I didn't quite get what I wanted (actually I don't know how to use a dictionary in Python correctly ;)). Any help would be great.

+1  A: 

You could modify your code quite easily:

for i in file_1:
    x = None
    for j in file_2:
        j = j.split()
        if i == j[1]:
            x = ' '.join(j)
    if x is None:
        x = ' '.join(['0', i])
    print >> write_in, x

Depending on your inputs, the whole task could of course be simplified even further. At the moment, your code has O(n**2) complexity.
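
One possible simplification, as a rough sketch only: build a dictionary from file_2 once and look each file_1 value up in it, so file_2 is not rescanned for every value. This assumes every line of file_2 holds exactly two whitespace-separated fields and that the second field is unique. The helper name `merge_with_default` and its `default` parameter are made up for illustration, and the function returns the lines instead of printing them so that it runs unchanged on Python 2 and 3.

```python
def merge_with_default(file_1, file_2, default='0'):
    # Hypothetical helper: map each key (second field) to its value (first field).
    lookup = {}
    for line in file_2:
        value, key = line.split()
        lookup[key] = value
    # One dictionary lookup per file_1 entry instead of a full scan of file_2.
    return ['%s %s' % (lookup.get(key, default), key) for key in file_1]


lines = merge_with_default(['47', '48', '49', '50'],
                           ['358 47', '450 49', '56 50'])
# lines == ['358 47', '0 48', '450 49', '56 50']
```

Each file is walked once, so the work grows roughly with the combined size of the two lists, assuming average-case constant-time dictionary lookups.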

SilentGhost
Actually, it's O(N+M). You just need a sequential scan of each file.
ΤΖΩΤΖΙΟΥ
+2  A: 
r1=open('file1').read().split()
r2=open('file2').read().split()

d=dict(zip(r2[1::2],r2[::2]))

output='\n'.join(x in d and d[x]+' '+x or '0 '+x for x in r1)

open('output_file','wb').write(output)

Test

>>> file1='47\n48\n49\n50'
>>> file2='358 47\n450 49\n56 50'
>>>
>>> r1=file1.split()
>>> r2=file2.split()
>>>
>>> d=dict(zip(r2[1::2],r2[::2]))
>>> d
{'47': '358', '50': '56', '49': '450'}
>>>
>>> print '\n'.join(x in d and d[x]+' '+x or '0 '+x for x in r1)
358 47
0 48
450 49
56 50
>>>
S.Mark
+1 for being pythonic :)
anijhaw
File objects don't have a split() method. open('file1').read().split() maybe?
Daniel Stutzbach
@Daniel, thanks, I just forgot to type .read()
S.Mark
file1 doesn't have `50` in it
gnibbler
@gnibbler, according to the OP's output file and his description, there is; if 50 were not there, his result should show 0 50.
S.Mark
+1  A: 

Here's a readable solution using a dictionary:

d = {}
for k in file1:
    d[k] = 0
for line in file2:
    v, k = line.split()
    d[k] = v
for k in sorted(d):
    print d[k], k
Daniel Stutzbach
Initialising a dict with a certain value is better done with the `.fromkeys` classmethod.
SilentGhost
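
For illustration, the `.fromkeys` suggestion might look roughly like this when applied to the dictionary solution above. The sample lists stand in for the parsed files and are taken from the question's data; the `result` list is an assumption added here so the sketch does not depend on Python 2's print statement.

```python
file1 = ['47', '48', '49', '50']          # sample data from the question
file2 = ['358 47', '450 49', '56 50']

# Every key from file1 starts out with the default value 0.
d = dict.fromkeys(file1, 0)
for line in file2:
    v, k = line.split()
    d[k] = v

result = ['%s %s' % (d[k], k) for k in sorted(d)]
# result == ['358 47', '0 48', '450 49', '56 50']
```

Note that `sorted(d)` compares the keys as strings, so keys with different numbers of digits would not come out in numeric order; `sorted(d, key=int)` is one way around that if all keys are integers.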
A: 

You can try something like:

l1 = open('file1').read().split()
l2 = [line.split() for line in open('file2')]

for x, y in zip(l1, l2):
    if x not in y:
        print 0, x
    print ' '.join(y)

but if you follow your logic, the output should be

358 47
0 48
450 49
0 49
56 50

and not

358 47
0 48
450 49
56 50
kroger
A: 
def file_process(filename1, filename2):

    # read first file with zeroes as values
    with open(filename1) as fp:
        adict= dict( (line.rstrip(), 0) for line in fp)

    # read second file as "value key"
    with open(filename2) as fp:
        adict.update(
            line.rstrip().partition(" ")[2::-2] # tricky, read notes
            for line in fp)
    for key in sorted(adict):
        yield adict[key], key

fp= open("output_file", "w")
fp.writelines("%s %s\n" % items for items in file_process("file1", "file2"))
fp.close()

str.partition(" ") returns a tuple of (pre-space, space, post-space). By slicing the tuple, starting at item 2 (post-space) and moving by a step of -2, we return a tuple of (post-space, pre-space), which are (key, value) for the dictionary that describes the solution.
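
The slicing trick can be checked step by step. This is only a small demonstration on one sample line from the question, with variable names (`parts`, `pair`) chosen here for illustration:

```python
line = '358 47\n'
parts = line.rstrip().partition(' ')
# parts == ('358', ' ', '47')

# Start at index 2 and step by -2: picks index 2, then index 0.
pair = parts[2::-2]
# pair == ('47', '358'), i.e. (key, value)

adict = {'47': 0, '48': 0}
adict.update([pair])   # update() accepts an iterable of (key, value) pairs
# adict == {'47': '358', '48': 0}
```

One caveat: if a line contains no space, `partition` returns the whole line followed by two empty strings, so the slice would yield an empty key with the line as its value.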

PS Um :) I just noticed that my answer is essentially the same as Daniel Stutzbach's.

ΤΖΩΤΖΙΟΥ