I am new to Python and have the following test code featuring a nested loop. It generates some unexpected lists:

import pybel  
import math  
import openbabel  
search = ["CCC","CCCC"]  
matches = []  
#n = 0  
#b = 0  
print search  
for n in search:  
    print "n=",n  
    smarts = pybel.Smarts(n)  
    allmol = [mol for mol in pybel.readfile("sdf", "zincsdf2mols.sdf.txt")]  
    for b in allmol:  
        matches = smarts.findall(b)  
        print matches, "\n" 

Essentially, the list "search" holds a couple of SMARTS strings that I want to match against some molecules: I want to try both strings on every molecule contained in allmol using the pybel software. However, the result I get is:

['CCC', 'CCCC']  
n= CCC  
[(1, 2, 28), (1, 2, 4), (2, 4, 5), (4, 2, 28)]   

[]   

n= CCCC  
[(1, 2, 4, 5), (5, 4, 2, 28)]   

[]   

This is as expected, except for a couple of extra empty lists slotted in, which are messing me up, and I cannot see where they are coming from. They appear after the "\n", so they are not an artefact of smarts.findall(). What am I doing wrong? Thanks for any help.

A: 

allmol has 2 items, so the inner loop runs twice for each search string; the second time, matches is an empty list.

Notice how the newline is printed after each; changing that "\n" to "<-- matches" may clear things up for you:

print matches, "<-- matches"
# or, more commonly:
print "matches:", matches
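
To make the loop count concrete, here is a minimal sketch that needs neither pybel nor the SDF file. The dict-based "molecules" and the `find_all` helper are hypothetical stand-ins for the real `pybel.Smarts(n).findall(b)` call, with the second molecule assumed to contain no matches. It is written so that it runs under both Python 2 and Python 3.

```python
# Hypothetical stand-ins: each "molecule" is a dict mapping a pattern to the
# matches it would produce. These are NOT real pybel Molecule objects.
search = ["CCC", "CCCC"]
allmol = [
    {"CCC": [(1, 2, 28), (1, 2, 4)], "CCCC": [(1, 2, 4, 5)]},  # has matches
    {},                                                        # has no matches
]


def find_all(pattern, mol):
    """Hypothetical stand-in for pybel.Smarts(pattern).findall(mol)."""
    return mol.get(pattern, [])


results = []
for n in search:                       # outer loop: once per search string
    for index, b in enumerate(allmol):  # inner loop: once per molecule
        matches = find_all(n, b)
        results.append((n, index, matches))
        # Labelling each line shows which molecule produced the empty list.
        print("pattern=%s molecule=%d matches=%r" % (n, index, matches))
```

Two search strings times two molecules gives four printed lists, and the empty ones all belong to the second molecule.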
Roger Pate
Hmm, I'm not getting something here. I expected it to take the first string (CCC) from "search" in the outer loop, iterate through all the molecules in allmol in the inner loop to generate a list of all instances of CCC in there, then go back to the outer loop to pick up the second string (CCCC) and do the same, generating only as many lists as this took (two). I still don't see where the extra ones are being created.
Shughes
@Shughes: print allmol. It has a length of 2 and that's why the `for b in allmol` loop iterates twice per n in search.
Roger Pate
A: 

Perhaps it is supposed to end like this:

for b in allmol:  
    matches.append(smarts.findall(b))  
print matches, "\n"

Otherwise, I'm not sure why you'd initialise matches to an empty list.

If that is the case, you can instead write:

matches = [smarts.findall(b) for b in allmol]
print matches
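
If the empty lists are simply unwanted, one option is to build the per-molecule results with a comprehension and drop the empty ones. This is only a sketch: `collect_matches` and `fake_find_all` are hypothetical names, and the dict-based molecules stand in for real pybel objects so the example runs without pybel or the SDF file.

```python
def collect_matches(patterns, molecules, find_all):
    """Return {pattern: [(molecule_index, matches), ...]}, skipping empty results.

    find_all is any callable taking (pattern, molecule); with pybel it could
    wrap pybel.Smarts(pattern).findall(molecule).
    """
    hits = {}
    for pattern in patterns:
        per_mol = [(i, find_all(pattern, mol)) for i, mol in enumerate(molecules)]
        # Keep only the molecules that actually matched.
        hits[pattern] = [(i, found) for i, found in per_mol if found]
    return hits


# Hypothetical stand-in data: the second molecule matches nothing.
fake_molecules = [
    {"CCC": [(1, 2, 4)], "CCCC": [(1, 2, 4, 5)]},
    {},
]


def fake_find_all(pattern, mol):
    """Hypothetical stand-in for pybel.Smarts(pattern).findall(mol)."""
    return mol.get(pattern, [])


hits = collect_matches(["CCC", "CCCC"], fake_molecules, fake_find_all)
print(hits)
```

Keeping the molecule index next to each result records which molecule a match came from, which the bare list of lists does not.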

Another possibility is that the file ends in an empty line:

for b in allmol:  
    if not b.atoms: continue  # b is a pybel Molecule, not a string, so check for an empty record via its atoms
    matches.append(smarts.findall(b))  
    print matches, "\n"
gnibbler