Hello,

In Perl, to get a list of all strings from "a" to "azc", the only thing you need is the range operator:

perl -le 'print "a".."azc"'

What I want is a list of strings:

["a", "b", ..., "z", "aa", ..., "az", "ba", ..., "azc"]

I suppose I can use ord and chr and loop over and over. This is simple for "a" to "z", e.g.:

>>> [chr(c) for c in range(ord("a"), ord("z") + 1)]
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

But it is a bit more complex for my case.

Thanks for any help!

+2  A: 

Use the product call in itertools, and ascii_letters from string.

from string import ascii_letters
from itertools import product

if __name__ == '__main__':
    values = []
    for i in xrange(1, 4):
        values += [''.join(x) for x in product(ascii_letters[:26], repeat=i)]

    print values
muckabout
It should be `ascii_lowercase`, and you haven't yet accounted for stopping at 'azc'.
Matthew Flaschen
Hmm, thanks. With this I get a list of strings from a to zzz, so I will add a second loop that copies the items from the first list into a second one and stops when it encounters the "end" string. I'll answer my question with a complete code sample. Thanks a lot!
Alexis Métaireau
A: 
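from itertools import product
from string import ascii_lowercase

def strrange(end):
    # Build every lowercase string of length 1..len(end), then cut after `end`.
    values = []
    for i in range(1, len(end) + 1):
        values += [''.join(x) for x in product(ascii_lowercase, repeat=i)]
    return values[:values.index(end) + 1]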
Alexis Métaireau
Major issues with this: 1) Use of `xrange` instead of `range`. `xrange` no longer has any advantage over `range`, since `range` is a generator and doesn't pre-generate the result list. Thus `xrange` is deprecated, and IIRC, not even in Python 3. 2) Constructing `endvalues` from `values` when you could have just used `list.index()` and a slice operation. 3) This isn't how you mark questions as answered on SO.
Mike DeSimone
@Mike, `xrange` is still required in Python 2.7, which was released less than 2 weeks ago. `range` still returns a list.
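
To make the distinction concrete, here is a small sketch that assumes Python 3 (the code in this thread targets Python 2, where `range` builds a full list and `xrange` is the lazy variant). In Python 3, `range` is a lazy sequence object, which is neither a list nor a generator:

```python
# Python 3 only: range() is a lazy sequence, so a large range costs almost no memory.
big = range(10**9)

is_list = isinstance(big, list)  # no list is ever built
length = len(big)                # computed arithmetically, without iterating
fifth = big[5]                   # indexing works, which a generator would not allow
print(is_list, length, fifth)    # prints: False 1000000000 5
```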
Matthew Flaschen
I've updated this one to use slices and index(). Also removed the wrapping text.
Alexis Métaireau
+2  A: 

A suggestion purely based on iterators:

import string
import itertools

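def string_range(letters=string.ascii_lowercase, start="a", end="z"):
    # All strings over `letters`, shortest first.
    candidates = (x for i in itertools.count(1)
                  for x in itertools.imap("".join, itertools.product(letters, repeat=i)))
    # Skip up to `start`, then stop just before `end` (so `end` itself is excluded).
    return itertools.takewhile(end.__ne__, itertools.dropwhile(start.__ne__, candidates))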

print list(string_range(end="azc"))
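
Note that `takewhile` stops before the `end` value, so "azc" itself is not part of the output. A possible Python 3 sketch that keeps the pure-iterator style and includes `end` is shown here. The argument order and the `chain` trick are assumptions of this sketch, not part of the answer above. It also assumes `start` and `end` both occur in the sequence, with `start` first; otherwise it never terminates.

```python
import itertools
import string

def string_range(end, start="a", letters=string.ascii_lowercase):
    """Lazily iterate from start through end (inclusive), shortest strings first."""
    # Every string over `letters`, ordered by length and then alphabetically.
    candidates = (
        "".join(chars)
        for size in itertools.count(1)
        for chars in itertools.product(letters, repeat=size)
    )
    # Drop everything before `start`, keep values until `end` is seen,
    # then append `end` itself because takewhile discards it.
    from_start = itertools.dropwhile(lambda s: s != start, candidates)
    before_end = itertools.takewhile(lambda s: s != end, from_start)
    return itertools.chain(before_end, [end])

result = list(string_range("azc"))
print(len(result), result[:3], result[-1])  # prints: 1355 ['a', 'b', 'c'] azc
```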
Philipp
+2  A: 

Generator version:

from string import ascii_lowercase
from itertools import product

def letterrange(last):
    for k in range(len(last)):
        for x in product(ascii_lowercase, repeat=k+1):
            result = ''.join(x)
            yield result
            if result == last:
                return
Mike DeSimone
Yeah! Definitely good (I can't vote since I only have 11 reputation, but it sounds right!)
Alexis Métaireau
A: 

Here's a better way to do it, though you need a conversion function:

for i in xrange(int('a', 36), int('azd', 36)):
    if base36encode(i).isalpha():
        print base36encode(i, lower=True)

And here's your function (thank you Wikipedia):

def base36encode(number, alphabet='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ', lower=False):
    '''
    Convert positive integer to a base36 string.
    '''
    if lower:
        alphabet = alphabet.lower()
    if not isinstance(number, (int, long)):
        raise TypeError('number must be an integer')
    if number < 0:
        raise ValueError('number must be positive')

    # Special case for small numbers
    if number < 36:
        return alphabet[number]

    base36 = ''
    while number != 0:
        number, i = divmod(number, 36)
        base36 = alphabet[i] + base36

    return base36

I tacked on the lowercase conversion option, just in case you wanted that.
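
The code above is Python 2 (`xrange`, `long`, and the `print` statement). A rough Python 3 sketch of the same filtering idea follows. The helper name `alpha_range` and the inclusive end are assumptions of this sketch, not part of the original answer.

```python
import string

# Lowercase base-36 digits: 0-9 followed by a-z.
DIGITS = string.digits + string.ascii_lowercase

def base36encode(number):
    """Convert a non-negative integer to a lowercase base-36 string."""
    if not isinstance(number, int):
        raise TypeError("number must be an integer")
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return DIGITS[0]
    encoded = ""
    while number:
        number, digit = divmod(number, 36)
        encoded = DIGITS[digit] + encoded
    return encoded

def alpha_range(start, end):
    """Hypothetical helper: letters-only base-36 strings from start to end, inclusive."""
    numbers = range(int(start, 36), int(end, 36) + 1)
    # Values whose base-36 form contains a digit character are skipped.
    return [s for s in map(base36encode, numbers) if s.isalpha()]

result = alpha_range("a", "azc")
print(len(result), result[0], result[26], result[-1])  # prints: 1355 a aa azc
```

This walks every integer in the interval and throws away those whose encoding contains a digit, so it does more work than generating the letter strings directly.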

Wayne Werner
A: 

I would not define the range to include the end element, since Python's convention is that the end element is excluded. Also, 'ba' should not be in that range. So this is surely not the most concise answer, but it is what I came up with for a Python-style range over letters, rather than what was requested:

from string import ascii_lowercase
def letter_range(start,end):
    if len(end)<len(start) or start>end:
        return []

    results = [start[:-1]+c
               for c in ascii_lowercase
               if c>=start[:-1] and start<=start[:-1]+c<end ]
    length_m1= len(start)-1

    while len(start)<=len(end):
        results.extend([st+c
                        for c in ascii_lowercase
                        for st in results
                        if (len(st) == length_m1) and  (st+c < end)
                        ])
        length_m1= len(start)
        start+='a'

    return results
print letter_range('a','azc')
print "'zc'>'acz' =",'zc'>'acz'
print "letter_range('cd','edfr')",len(letter_range('cd','edfr')), 'strings'

For testing, I print only the length, because IDLE hung when printing the long lists.

Validity test:

>>> print [ ( a,b ) for a,b in zip(sorted(letter_range('a','azc'),key=len),letter_range('a','azc')) if a !=b ]
[]
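
The disagreement about whether 'ba' belongs between 'a' and 'azc' comes down to which ordering is used. A short Python 3 sketch (the sample words are chosen purely for illustration) contrasts plain string comparison with the length-first ordering that the question asks for:

```python
words = ["b", "az", "a", "ba", "azc"]

# Plain string comparison: 'ba' sorts after 'azc'.
lexicographic = sorted(words)

# Length first, then alphabetical: 'ba' sorts before 'azc'.
length_first = sorted(words, key=lambda s: (len(s), s))

print(lexicographic)  # prints: ['a', 'az', 'azc', 'b', 'ba']
print(length_first)   # prints: ['a', 'b', 'az', 'ba', 'azc']
```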
Tony Veijalainen