



Let's say I have a list of 10 strings (call them "str1", "str2", ... "str10"). I want to generate all pairs from this list: ("str1", "str2"), ("str1", "str3"), and so on, up to ("str9", "str10"). That is easy with two loops. How can I do the same thing with a million strings? Is there any way to put them in a table and run a query?

+4  A: 

Put them in a table, and use this join:

SELECT T1.StringValue, T2.StringValue
FROM StringsTable T1
    INNER JOIN StringsTable T2
        ON T1.StringValue <> T2.StringValue

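Note that <> returns every pair twice, once in each order; use < instead if you want each pair only once. Also, if you run this against a million strings in Query Analyzer or a similar GUI, you're setting yourself up for some hurt: the result set is on the order of a trillion rows.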
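
As a rough illustration of the table-and-query idea, here is a minimal Python sketch using the standard sqlite3 module on four strings. SQLite and the in-memory database are assumptions for the demo, not something the question specifies; the table and column names follow the query above. The sketch joins on < rather than <>, so each unordered pair comes back once.

```python
import sqlite3

# In-memory SQLite database, used here only for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StringsTable (StringValue TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO StringsTable (StringValue) VALUES (?)",
    [(f"str{i}",) for i in range(1, 5)],  # str1 .. str4
)

# Joining on < (instead of <>) keeps each unordered pair exactly once.
cursor = conn.execute(
    """
    SELECT T1.StringValue, T2.StringValue
    FROM StringsTable T1
        INNER JOIN StringsTable T2
            ON T1.StringValue < T2.StringValue
    ORDER BY T1.StringValue, T2.StringValue
    """
)

# Materialising the rows is fine for 4 strings; for a large table,
# iterate over the cursor row by row instead of building a list.
pairs = list(cursor)
for left, right in pairs:
    print(left, right)
```

With a million rows the same query is still valid, but the result has to be consumed row by row (or filtered further in SQL) rather than fetched in full.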

Raj More
if it will ever be returned :)
+1  A: 

In C# (Java would be similar; C++ only a bit different):

 // j starts at i + 1, so each unordered pair is produced exactly once
 for (int i = 0; i < ArrayOfString.Length - 1; ++i)
     for (int j = i + 1; j < ArrayOfString.Length; ++j)
         ListOfPairs.Add(new Pair(ArrayOfString[i], ArrayOfString[j]));
James Curran
@James: The OP wants to know how to put in a table and run a query.
Raj More
@James: If you can run that code on your computer with 1M strings I'd like to buy your computer.
Albin Sunnanbo

If you create all those pairs, you will get almost one trillion of them (1,000,000 × 999,999 if order matters, half that if it doesn't).
Storing them somewhere would take approximately 20 TB, based on 20 bytes per string pair.
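
The arithmetic behind those figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch; the 20 bytes per pair is the estimate used above, and decimal terabytes (10^12 bytes) are assumed.

```python
n = 1_000_000

ordered_pairs = n * (n - 1)           # (a, b) and (b, a) counted separately
unordered_pairs = ordered_pairs // 2  # each pair counted once

bytes_per_pair = 20                   # estimate from the text above
terabytes = ordered_pairs * bytes_per_pair / 10**12  # decimal TB assumed

print(ordered_pairs, unordered_pairs, terabytes)
```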

If you really need all those pairs, consider a generator that produces them on the fly instead of storing them anywhere.

In C# it would look something like this:

private IEnumerable<Tuple<string, string>> GetPairs(IEnumerable<string> strings)
{
    // Lazily yields every ordered pair of distinct strings; nothing is stored.
    foreach (string outer in strings)
        foreach (string inner in strings)
            if (outer != inner)
                yield return Tuple.Create(outer, inner);
}

The call

string[] strings = new string[] { "str1", "str2", "str3" };

foreach (var stringPairs in GetPairs(strings))
    Console.WriteLine("({0},{1})", stringPairs.Item1, stringPairs.Item2);

Generates the expected result, with each pair appearing in both orders (which is what you want if the order of the items in the pair matters).


Expect it to take a while with 1M strings.
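
For comparison, the same on-the-fly idea can be sketched in Python with the standard itertools module. This is an illustrative alternative, not part of the C# answer: permutations yields ordered pairs, as the generator above does for distinct strings, while combinations yields each unordered pair once. Both keep the input in memory but produce the pairs lazily.

```python
from itertools import combinations, islice, permutations

strings = ["str1", "str2", "str3"]

# Ordered pairs: (a, b) and (b, a) are both produced.
ordered = list(permutations(strings, 2))

# Unordered pairs: each pair is produced exactly once.
unordered = list(combinations(strings, 2))

# With a million strings, never build the full list; take pairs as needed.
big = [f"str{i}" for i in range(1, 1_000_001)]
first_three = list(islice(combinations(big, 2), 3))

print(ordered)
print(unordered)
print(first_three)
```

Only the three requested pairs are ever built in the last step; the remaining pairs are generated only if the consumer keeps iterating.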

Albin Sunnanbo