I have a list of data that includes both command strings and the letters of the alphabet, upper- and lowercase, totaling 512+ strings (including sub-lists). I want to parse the input data, but I can't think of any way to do it properly other than starting from the largest possible command size and cutting it down until I find a command that matches the string, then outputting the location of that command. That takes forever, and any other way I can think of will cause overlapping matches. I'm doing this in Python.

Say:

L = ['a', 'b',['aa','bb','cc'], 'c']

For 'bb' the output would be '0201', and for 'c' it would be '03'.

So how should I do this?

+1  A: 

If you must use this data structure:

from collections.abc import MutableSequence

def scanList( command, theList ):
    # Return (i, None) for a top-level match, (i, j) for a match in a sub-list.
    for i, elt in enumerate( theList ):
        if elt == command:
            return ( i, None )
        if isinstance( elt, MutableSequence ):
            for j, elt2 in enumerate( elt ):
                if elt2 == command:
                    return ( i, j )
    return None  # command not found

L = ['a', 'b',['aa','bb','cc'], 'c']
print( scanList( "bb", L ) )
# (2, 1)
print( scanList( "c", L ) )
# (3, None)

BUT

This is a bad data structure. Are you able to get this data in a nicer form?

katrielalex
I don't think so, at least not without a massive overhaul of my program.
calccrypto
+2  A: 

It sounds like you're searching through the list for every substring. How about you build a dict to look up the keys? Of course, you still have to start searching with the longest subkey.

L = ['a', 'b',['aa','bb','cc'], 'c']

def lookups( L ):
    """ returns `item`, `code` tuples """
    for i, item in enumerate(L):
        if isinstance(item, list):
            for j, sub in enumerate(item):
                yield sub, "%02d%02d" % (i,j)
        else:
            yield item, "%02d" % i

You could then look up substrings with:

lookupdict = dict(lookups(L))
print(lookupdict['bb']) # but you have to try 'bb' before trying 'b' ...

But if the key length is not just 1 or 2, it might also make sense to group the items into separate dicts where each key has the same length.
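As a rough sketch of that grouping idea: the code below builds one dict per key length and then, at each position of the input, tries the longest keys first. The names `group_by_length` and `parse` are made up for illustration, and the input string `"abbc"` is just an assumed example; `lookups` is repeated so the snippet runs on its own.

```python
from collections import defaultdict

L = ['a', 'b', ['aa', 'bb', 'cc'], 'c']

def lookups(items):
    """Yield (item, code) pairs for a list with one level of sub-lists."""
    for i, item in enumerate(items):
        if isinstance(item, list):
            for j, sub in enumerate(item):
                yield sub, "%02d%02d" % (i, j)
        else:
            yield item, "%02d" % i

def group_by_length(pairs):
    """Hypothetical helper: build {key length: {key: code}}."""
    by_length = defaultdict(dict)
    for key, code in pairs:
        by_length[len(key)][key] = code
    return dict(by_length)

def parse(text, by_length):
    """Hypothetical helper: greedy longest-match scan over text."""
    lengths = sorted(by_length, reverse=True)  # longest keys first
    codes = []
    pos = 0
    while pos < len(text):
        for n in lengths:
            chunk = text[pos:pos + n]
            if chunk in by_length[n]:
                codes.append(by_length[n][chunk])
                pos += n
                break
        else:
            # No key of any length matches at this position.
            raise ValueError("no command matches at position %d" % pos)
    return codes

by_length = group_by_length(lookups(L))
print(parse("abbc", by_length))
# ['00', '0201', '03']
```

Each position then costs at most one dict lookup per distinct key length, rather than a scan over all 512+ strings. Note that greedy matching never backtracks, so it assumes the longest match at each position is always the intended one.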

THC4k
Now... how do you sort by string length?
calccrypto
`sorted(lookupdict, key=len)`
THC4k
Thanks!!! I'd click the up arrow a few more times, but it doesn't work that way.
calccrypto