views:

989

answers:

7

I have a script that loops through a series of four (or less) characters strings. For example:

aaaa
aaab
aaac
aaad

If have been able to implement it with nested for loops like so:

chars = string.digits + string.uppercase + string.lowercase

for a in chars:
    print '%s' % a   
    for b in chars:
        print '%s%s' % (a, b)
        for c in chars:
            print '%s%s%s' % (a, b, c)
            for d in chars:
                print '%s%s%s%s' % (a, b, c, d)

Is this sort of loop nesting a bad thing, and if so, what would be a better way of accomplishing what I am doing?

+1  A: 

There are many algorithms for generating every permutation of a set. What you want here is a related problem, but not directly analagous. Suggested Reading

Sparr
+3  A: 

I don't think it's a bad thing, provided you understand (and document :-) it. I don't doubt there may be a more pythonic way or clever solution (with lambdas or whatnot) but I've always favored readability over cleverness.

Since you have to generate all possibilities of 1-, 2-, 3- and 4-character "words", this method is as good as any. I'm not sure how long it would take since you're effectively generating (very roughly) 14 million lines of output (but probably every solution would have that problem).

Pre-calculating the common prefixes may provide a speed boost but you'd be better off measuring it to check (always check, never assume):

chars = string.digits + string.uppercase + string.lowercase
for a in chars:
    print a
    for b in chars:
        ab = '%s%s' % (a, b)
        print ab
        for c in chars:
            abc = '%s%s' % (ab, c)
            print abc
            for d in chars:
                print '%s%s' % (abc, d)

EDIT: I actually did some benchmarks (with Windows-Python 2.6.1) - this version takes about 2.25 time units compared to the original 2.84 so it's 26% faster. I think that might warrant its use (again, as long as it's documented clearly what it's trying to achieve).

paxdiablo
You can save quite some time by always doing a simple `a+b` instead of `'%s%s' % (a, b)`
sth
+4  A: 

Write for the programmer first - the computer second.
If it's clear and obvious to understand then its correct.

If speed matters AND the compiler doesn't optimise it anyway AND if you measure it AND it is the problem - then think of a faster cleverer way!

Martin Beckett
If we followed that advice everywhere, we'd still be using selectionsort..
John Fouhy
Not every programmer is a theoretician. Most of us have work to get done by a certain date and are not bound by the cpu. In this case, mgb is correct and you should write for the programmer. The OP's algo is not that, but it doesn't mean that mgb is wrong, perhaps not helpful though.
Ed Swangren
+14  A: 
import string
import itertools

chars = string.digits + string.letters
MAX_CHARS = 4
for nletters in range(MAX_CHARS):
    for word in itertools.product(chars, repeat=nletters + 1):
        print (''.join(word))

That'll print all 15018570 words you're looking for. If you want more/less words just change the MAX_CHARS variable. It will still have just two fors for any number of chars, and you don't have to repeat yourself. And is pretty readable. .

nosklo
Syntax Error... you're missing a colon.
Algorias
@Algorias: fixed. Thanks.
nosklo
This should mention that it requires python 2.6, but otherwise it's the solution I was going to post. :)
Nick
"is pretty readable"? - I still think the original is more understandable but then I'm a dinosaur and I've never seen itertools, so I won't -1 you :-).
paxdiablo
Can you explain why it is satisfies "no nested for's" when I see two apparently nested copies of the keyword?
Jonathan Leffler
Actually I thought I'd do a performance check to see how much faster itertools is. The results were surprising - under Python 2.6.1 (Windows), itertools is MUCH slower (2.16 as opposed to 0.45 time.clock() units). Without output and run five times each within the script to remove startup time.
paxdiablo
Retract that last benchmark statement (it was a dodgy benchmark :-). Original question took 2.84 units, optimized (see my answer below) took 2.25 and itertools took 2.27. So itertools seems slightly slower but probably not enough to worry about.
paxdiablo
@Nick: If you need it in python < 2.6 there is an implementation of product() here: http://docs.python.org/library/itertools#itertools.product
nosklo
@Jonathan: Because you can use 20, 30, 40 chars, how many you want, without inserting more fors. I changed the text to better reflect what I mean. It could be written to use a single for too, using itertools.chain.from_iterable().
nosklo
string.ascii_letters != (string.lowercase + string.uppercase). `lowercase`, `uppercase` are locale-dependant.
J.F. Sebastian
@JFSebastian: Fixed.
nosklo
I never knew about itertools.product, or at least had never looked into it that much. This a great answer.
sykora
(string.lowercase + string.uppercase) == string.letters
J.F. Sebastian
@nosklo Thanks for your answer!
Matt in PA
@JFSebastian: Ok, much better :) Fixed thanks.
nosklo
+6  A: 

I'm going to submit my answer as the most readable and least scalable :)

import string
chars = [''] + list(string.lowercase)

strings = (a+b+c+d for a in chars
                   for b in chars
                   for c in chars
                   for d in chars)

for string in strings:
    print string

EDIT: Actually, this is incorrect, as it will produce duplicates of all strings of length<4. Removing the empty string from the chars array would just produce 4-char strings.

Normally I'd delete this answer, but I still kinda like it if you need to generate strings of the same length.

Triptych
I like it... it's like an invisible time bomb.
Tom
It covers all words, note the empty string in the list of "characters".
sth
You solution is much more readable but it produces different output: http://stackoverflow.com/questions/482146/replace-nested-for-loops-or-not#482990
J.F. Sebastian
Yup, it will produce duplicates of every string of length<4. I'll make a note.
Triptych
+1  A: 

@nosklo's and @Triptych's solutions produce different results:

>>> list(map(''.join, itertools.chain.from_iterable(itertools.product("ab", 
...     repeat=r) for r in range(4)))) # @nosklo's
['', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', 'aab', 'aba', 'abb', 'baa', 
 'bab', 'bba', 'bbb']
>>> ab = ['']+list("ab")
>>> list(map(''.join, (a+b+c for a in ab for b in ab for c in ab)))
['', 'a', 'b', 'a', 'aa', 'ab', 'b', 'ba', 'bb', 'a', 'aa', 'ab', 'aa', 
 'aaa', 'aab', 'ab', 'aba', 'abb', 'b', 'ba', 'bb', 'ba', 'baa', 'bab', 
 'bb',  'bba', 'bbb']

Here's modified @Triptych's solution that produce the same output as the @nosklo's one:

>>> ab = "ab"
>>> list(map(''.join, itertools.chain([''], ab, (a+b for a in ab for b in ab),
...     (a+b+c for a in ab for b in ab for c in ab))))
['', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', 'aab', 'aba', 'abb', 'baa', 
 'bab', 'bba', 'bbb']
J.F. Sebastian
+1  A: 

It doesn't exactly answer the question, but this would return the nth combination for the given maximum length and characters in the alphabet to use:

#!/usr/bin/python

def nth_combination(n, maxlen=4, alphabet='abc'):
    """
    >>> print ','.join(nth_combination(n, 1, 'abc') for n in range(3))
    a,b,c
    >>> print ','.join(nth_combination(n, 2, 'abc') for n in range(12))
    a,aa,ab,ac,b,ba,bb,bc,c,ca,cb,cc
    >>> import string ; alphabet = string.ascii_letters + string.digits
    >>> print ','.join(nth_combination(n, 4, alphabet) for n in range(16))
    a,aa,aaa,aaaa,aaab,aaac,aaad,aaae,aaaf,aaag,aaah,aaai,aaaj,aaak,aaal,aaam
    >>> print ','.join(nth_combination(n, 4, alphabet)
    ...                for n in range(0, 14000000, 10**6))
    a,emiL,iyro,mKz2,qWIF,u8Ri,zk0U,Dxav,HJi9,LVrM,P7Ap,UjJ1,YvSE,2H1h
    """
    if maxlen == 1:
        return alphabet[n]
    offset, next_n = divmod(n, 1 + len(alphabet)**(maxlen-1))
    if next_n == 0:
        return alphabet[offset]
    return alphabet[offset] + nth_combination(next_n-1, maxlen-1, alphabet)

if __name__ == '__main__':
    from doctest import testmod
    testmod()

This of course makes sense only if you need random access to the set of combinations instead of always iterating through them all.

If maxlen is high, some speed optimization could be achieved e.g. by getting rid of string concatenation and re-calculating the length of alphabet and maxlen-1 at each level of the recursion. A non-recursive approach might make sense, too.

akaihola