Hi, please have a look at the code below:

import string
from collections import defaultdict



first_complex=open( "residue_a_chain_a_b_backup.txt", "r" )
first_complex_lines=first_complex.readlines()
first_complex_lines=map( string.strip, first_complex_lines )
first_complex.close()

second_complex=open( "residue_a_chain_a_c_backup.txt", "r" )
second_complex_lines=second_complex.readlines()
second_complex_lines=map( string.strip, second_complex_lines )
second_complex.close()
list_1=[]
list_2=[]
for x in first_complex_lines:
    if x[0]!="d":
        list_1.append( x )
for y in second_complex_lines:
    if y[0]!="d":
        list_2.append( y ) 
j=0
list_3=[]      
list_4=[]
for a in list_1:
    pass
    for b in list_2:
        pass
        if a==b:
            list_3.append( a )    

kvmap=defaultdict( int )
for k in list_3:
    kvmap[k]+=1 
print kvmap

Normally I use izip or izip_longest to club two for loops, but this time the files have different lengths and I don't want a None entry. If I use the nested loops above, the run time grows with the product of the two file lengths and becomes unusable. How am I supposed to get the two for loops going?

Cheers, Chavanak

+8  A: 

You want to convert list_2 to a set, and check for membership:

list_1 = ['a', 'big', 'list']
list_2 = ['another', 'big', 'list']

target_set = set(list_2)

for a in list_1:
    if a in target_set:
        print a

Outputs:

big
list

A set gives you O(1) average-case membership tests, so you only have to read all the way through list_2 once (when creating the set). Thereafter, each comparison happens in constant time on average.
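
As a rough sketch of the same idea wrapped in functions (the names common_items_nested and common_items_set are made up for illustration), the quadratic and the set-based versions can be compared side by side. The print call is written so it runs on both Python 2 and Python 3:

```python
def common_items_nested(list_1, list_2):
    """Quadratic version: scans list_2 for every item of list_1."""
    result = []
    for a in list_1:
        for b in list_2:
            if a == b:
                result.append(a)
                break  # stop at the first match so each a is kept once
    return result


def common_items_set(list_1, list_2):
    """Linear version: one pass to build the set, one pass to filter."""
    target_set = set(list_2)
    return [a for a in list_1 if a in target_set]


list_1 = ['a', 'big', 'list']
list_2 = ['another', 'big', 'list']
print(common_items_set(list_1, list_2))
```

Both functions return the same list; only the amount of work differs once the inputs get large.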

jcdyer
+3  A: 

The following code performs the same task as yours with greater conciseness, directness, and speed:

with open('residue_a_chain_a_b_backup.txt', 'r') as f:
  list1 = [line for line in f if line[0] != 'd']
with open('residue_a_chain_a_c_backup.txt', 'r') as f:
  list2 = [line for line in f if line[0] != 'd']
set2 = set(list2)
list3 = [line for line in list1 if line in set2]

The subsequent histogramming of list3 into kvmap is already fine in your code. (In Python 2.5, to use the with statement, you need to start your module with from __future__ import with_statement; in 2.6 there is no need for that "import from the future", though it does no harm if you want to leave it in.)
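
One detail worth checking: the nested loops in the question append a once for every equal line in list_2, so if the second file contains repeated lines, the counts in kvmap come out larger than what the membership filter above produces. If those multiplied counts matter, a sketch along these lines reproduces them without the inner loop. The function names are hypothetical, and Counter needs Python 2.7 or later:

```python
from collections import Counter, defaultdict


def nested_loop_counts(list_1, list_2):
    """Reference: the counts the original nested loops produce."""
    kvmap = defaultdict(int)
    for a in list_1:
        for b in list_2:
            if a == b:
                kvmap[a] += 1
    return dict(kvmap)


def fast_counts(list_1, list_2):
    """Same counts, but with a single pass over each list."""
    counts_2 = Counter(list_2)  # how often each line occurs in list_2
    kvmap = defaultdict(int)
    for a in list_1:
        if a in counts_2:
            kvmap[a] += counts_2[a]
    return dict(kvmap)


list_1 = ['x', 'y', 'x']
list_2 = ['x', 'x', 'z']
print(fast_counts(list_1, list_2))
```

If neither file has repeated lines, the two approaches give identical counts and the simpler membership filter is enough.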

Alex Martelli
+1  A: 

Refining Alex's code very slightly:

with open('residue_a_chain_a_c_backup.txt', 'r') as f:
  set2 = set([line.strip() for line in f if line[0] != 'd'])

with open('residue_a_chain_a_b_backup.txt', 'r') as f:
  list1 = [line.strip() for line in f if line.strip() in set2]
Robert Rossney
Refining it a little more: if you're using a context manager, you're obviously on Python > 2.4, which means you can use a generator expression in your set call and save yourself a list creation: `set2 = set(line.strip() for line in f if line[0] != 'd')`.
jcdyer
For some reason I talked myself out of that, but I can't now see why. Do you need a second set of parentheses? I wonder that about generator comprehensions a lot.
Robert Rossney
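
For what it's worth, a small sketch of the parentheses rule: when a generator expression is the only argument to a call, the call's own parentheses are enough; a second set is needed only when other arguments are passed as well. The sample lines here are made up and stand in for the file contents:

```python
lines = ['dummy header\n', 'ALA 12\n', 'GLY 7\n', 'ALA 12\n']

# Sole argument: no extra parentheses needed.
set2 = set(line.strip() for line in lines if line[0] != 'd')

# With a second argument, the generator expression needs its own parentheses.
ordered = sorted((line.strip() for line in lines if line[0] != 'd'),
                 reverse=True)

print(sorted(set2))
print(ordered)
```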
+1  A: 

Is it the intersection of two sets you want? If so, you can use the set intersection operation:

list_1 = ['a', 'big', 'list']
list_2 = ['another', 'big', 'list']

intersection = set(list_1) & set(list_2)

After running this, intersection is a set containing the common items of list_1 and list_2.
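
A quick sketch of what the intersection keeps and what it drops, assuming duplicates and order might matter for the counting step in the question (the sample data is invented):

```python
list_1 = ['big', 'a', 'big', 'list']
list_2 = ['another', 'big', 'list']

intersection = set(list_1) & set(list_2)
# Equivalent method form; it accepts any iterable, not just a set.
same = set(list_1).intersection(list_2)

# Sets are unordered and hold each item once, so the repeated 'big' is gone.
print(sorted(intersection))

# To keep list_1's order and duplicates, filter against the set instead.
kept = [a for a in list_1 if a in intersection]
print(kept)
```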

Satoru.Logic