views:

1604

answers:

11

I'm trying to create an algorithm in C# which produces the following output strings:

AAAA
AAAB
AAAC
...and so on...
ZZZX
ZZZY
ZZZZ

What is the best way to accomplish this?

public static IEnumerable<string> GetWords()
{
    //Perform algorithm
    yield return word;
}
+16  A: 

well, if the length is a constant 4, then this would handle it:

public static IEnumerable<String> GetWords()
{
    for (Char c1 = 'A'; c1 <= 'Z'; c1++)
    {
        for (Char c2 = 'A'; c2 <= 'Z'; c2++)
        {
            for (Char c3 = 'A'; c3 <= 'Z'; c3++)
            {
                for (Char c4 = 'A'; c4 <= 'Z'; c4++)
                {
                    yield return "" + c1 + c2 + c3 + c4;
                }
            }
        }
    }
}

if the length is a parameter, this recursive solution would handle it:

public static IEnumerable<String> GetWords(Int32 length)
{
    if (length <= 0)
        yield break;

    for (Char c = 'A'; c <= 'Z'; c++)
    {
        if (length > 1)
        {
            foreach (String restWord in GetWords(length - 1))
                yield return c + restWord;
        }
        else
            yield return "" + c;
    }
}
Lasse V. Karlsen
I almost downvoted this as soon as I saw the boilerplate in the first proposal. The second one seems OK.
Svante
+14  A: 

There's always the obligatory LINQ implementation. Most likely rubbish performance, but since when did performance get in the way of using cool new features?

var letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".ToCharArray();

var sequence = from one in letters
               from two in letters
               from three in letters
               from four in letters
               orderby one, two, three, four
               select new string(new[] { one, two, three, four });

'sequence' will now be an IQueryable that contains AAAA to ZZZZ.

Edit:

Ok, so it was bugging me that it should be possible to make a sequence of configurable length with a configurable alphabet using LINQ. So here it is. Again, completely pointless but it was bugging me.

public void Nonsense()
{
    var letters = new[]{"A","B","C","D","E","F",
                        "G","H","I","J","K","L",
                        "M","N","O","P","Q","R","S",
                        "T","U","V","W","X","Y","Z"};

    foreach (var val in Sequence(letters, 4))
        Console.WriteLine(val);
}

private IQueryable<string> Sequence(string[] alphabet, int size)
{
    // create the first level
    var sequence = alphabet.AsQueryable();

    // add each subsequent level
    for (var i = 1; i < size; i++)
        sequence = AddLevel(sequence, alphabet);

    return from value in sequence
           orderby value
           select value;
}

private IQueryable<string> AddLevel(IQueryable<string> current, string[] characters)
{
    return from one in current
           from character in characters
           select one + character;
}

The call to the Sequence method produces the same AAAA to ZZZZ list as before but now you can change the dictionary used and how long the produced words will be.

Garry Shutler
if that solution is right, it's an amazing solution. thanks for the C# insight. i must buy one of the jon skeet books when he writes about C# 5.0 with metaprogramming :)
Johannes Schaub - litb
+1  A: 

Inspired by Garry Shutler's answer, I decided to recode his answer in T-SQL.

Say "Letters" is a table with only one field, MyChar, a CHAR(1). It has 26 rows, each an alphabet's letter. So we'd have (you can copy-paste this code on SQL Server and run as-is to see it in action):

DECLARE @Letters TABLE (
    MyChar CHAR(1) PRIMARY KEY
)
DECLARE @N INT
SET @N=0
WHILE @N<26 BEGIN
    INSERT @Letters (MyChar) VALUES ( CHAR( @N + 65) )
    SET @N = @N + 1
END
-- SELECT * FROM @Letters ORDER BY 1

SELECT A.MyChar, B.MyChar, C.MyChar, D.MyChar
FROM @Letters A, Letters B, Letters C, Letters D
ORDER BY 1,2,3,4

The advantages are: It's easily extensible into using capital/lowercase, or using non-English Latin characters (think "Ñ" or cedille, eszets and the like) and you'd still be getting an ordered set, only need to add a collation. Plus SQL Server will execute this slightly faster than LINQ on a single core machine, on multicore (or multiprocessors) the execution can be in parallel, getting even more boost.

Unfortunately, it's stuck for the 4 letters specific case. lassevk's recursive solution is more general, trying to do a general solution in T-SQL would necessarily imply dynamic SQL with all its dangers.

Joe Pineda
you cannot beat haskell: print [ a:b:c:d:[] | let q = ['a' .. 'z'], a <- q, b <- q, c <- q, d <- q ]
Johannes Schaub - litb
@Joe Pineda your solution will never generate DCBA because of "ORDER BY 1,2,3,4"
xxxxxxx
Yes it does! I just checked, by running the code on SQL S. 2000: Sequence "DCBA" is row 54107. All possible combinations are indeed produced, I'm expanding the code so it's easier to test.
Joe Pineda
A: 

javascript!

var chars = 4, abc = "ABCDEFGHIJKLMNOPQRSTUVWXYZ", top = 1, fact = [];
for (i = 0; i < chars; i++) { fact.unshift(top); top *= abc.length; }
for (i = 0; i < top; i++)
{
    for (j = 0; j < chars; j++) 
     document.write(abc[Math.floor(i/fact[j]) % abc.length]);
    document.write("<br \>\n");
}
Scott Evernden
that's nice, so you're first calculating the number of possible words in top. and you are viewing the characters as numbers in base abc.length :) I thought of that some time ago,it's a nice idea :) and also better than the recursive approach altough division and modulo might take their toll
xxxxxxx
A: 

Python!

(This is only a hack, dont' take me too seriously :-)

# Convert a number to the base 26 using [A-Z] as the cyphers
def itoa26(n): 
   array = [] 
   while n: 
      lowestDigit = n % 26
      array.append(chr(lowestDigit + ord('A'))) 
      n /= 26 
   array.reverse() 
   return ''.join(array)

def generateSequences(nChars):
   for n in xrange(26**nChars):
      string = itoa26(n)
      yield 'A'*(nChars - len(string)) + string

for string in generateSequences(3):
   print string
Federico Ramponi
+1  A: 

Haskell!

replicateM 4 ['A'..'Z']

Ruby!

('A'*4..'Z'*4).to_a
jleedev
+1  A: 

GNU Bash!

{a..z}{a..z}{a..z}{a..z}
Johannes Schaub - litb
+1  A: 

Just a coment to Garry Shutler, but I want code coloring:

You really don't need to make it IQuaryable, neither the sort, so you can remove the second method. One step forwad is to use Aggregate for the cross product, it end up like this:

IEnumerable<string> letters = new[]{
                "A","B","C","D","E","F",                       
                "G","H","I","J","K","L",
                "M","N","O","P","Q","R","S",           
                "T","U","V","W","X","Y","Z"};

var result = Enumerable.Range(0, 4)
                .Aggregate(letters, (curr, i) => curr.SelectMany(s => letters, (s, c) => s + c));

foreach (var val in result)
     Console.WriteLine(val);

Anders should get a Nobel prize for the Linq thing!

Olmo
A: 

Use something which automatically Googles for every single letter combination, then see if there are more ".sz" or ".af" hits then ".com" hits at the five first results ... ;)

Seriously, what you're looking for might be Tries (data structure) though you still need to populate the thing which probably is far harder...

Thomas Hansen
A: 

This is an recursive version of the same funtions in C#:

using System;
using System.Collections.Generic;
using System.Text;
using System.IO;

namespace ConsoleApplication1Test
{
    class Program
    {
        static char[] my_func( char[] my_chars, int level)
        {
            if (level > 1)
                my_func(my_chars, level - 1);
            my_chars[(my_chars.Length - level)]++;
            if (my_chars[(my_chars.Length - level)] == ('Z' + 1))
            {
                my_chars[(my_chars.Length - level)] = 'A';
                return my_chars;
            }
            else
            {
                Console.Out.WriteLine(my_chars);
                return my_func(my_chars, level);
            }
        }
        static void Main(string[] args)
        {
            char[] text = { 'A', 'A', 'A', 'A' };
            my_func(text,text.Length);
            Console.ReadKey();
        }
    }
}

Prints out from AAAA to ZZZZ

eaanon01
A: 

Simpler Python!

def getWords(length=3):
    if length == 0: raise StopIteration
    for letter in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ':
        if length == 1: yield letter
        else:
            for partialWord in getWords(length-1):
                yield letter+partialWord