views:

47

answers:

2

I have the following code to generate combinations of string for a small list and would like to adapt this for a large list of over 300 string words.Can anyone suggest how to alter this code or to use a different method.

Public Class combinations



Public Shared Sub main()

    Dim myAnimals As String = "cat dog horse ape hen mouse"

    Dim myAnimalCombinations As String() = BuildCombinations(myAnimals)

    For Each combination As String In myAnimalCombinations
        ''//Look on the Output Tab for the results! 
        Console.WriteLine("(" & combination & ")")
    Next combination

    Console.ReadLine()

End Sub



Public Shared Function BuildCombinations(ByVal inputString As String) As String()

    ''//Separate the sentence into useable words. 
    Dim wordsArray As String() = inputString.Split(" ".ToCharArray)

    ''//A plase to store the results as we build them 
    Dim returnArray() As String = New String() {""}

    ''//The 'combination level' that we're up to 
    Dim wordDistance As Integer = 1

    ''//Go through all the combination levels... 
    For wordDistance = 1 To wordsArray.GetUpperBound(0)

        ''//Go through all the words at this combination level... 
        For wordIndex As Integer = 0 To wordsArray.GetUpperBound(0) - wordDistance

            ''//Get the first word of this combination level 
            Dim combination As New System.Text.StringBuilder(wordsArray(wordIndex))

            ''//And all all the remaining words a this combination level 
            For combinationIndex As Integer = 1 To wordDistance

                combination.Append(" " & wordsArray(wordIndex + combinationIndex))

            Next combinationIndex

            ''//Add this combination to the results 
            returnArray(returnArray.GetUpperBound(0)) = combination.ToString

            ''//Add a new row to the results, ready for the next combination 
            ReDim Preserve returnArray(returnArray.GetUpperBound(0) + 1)

        Next wordIndex

    Next wordDistance

    ''//Get rid of the last, blank row. 
    ReDim Preserve returnArray(returnArray.GetUpperBound(0) - 1)

    ''//Return combinations to the calling method. 
    Return returnArray

End Function

End Class

'

CHANGES//

For wordDistance = 1 To inputList.Count.ToString / 2

        Dim count = inputList.Count.ToString

        'Go through all the words at this combination level... 
        For wordIndex As Integer = 0 To inputList.Count.ToString - wordDistance

            'Get the first word of this combination level 
            combination.Add(inputList.Item(wordIndex))
            'And all all the remaining words a this combination level 
            For combinationIndex As Integer = 1 To wordDistance
                combination.Add(" " & inputList.Item(wordIndex + combinationIndex))
            Next combinationIndex

            'Add this combination to the results 

            If Not wordsList.Contains(combination) Then
                wordsList.Add(combination.ToString)
            End If

            'Add a new row to the results, ready for the next combination 
            'ReDim Preserve returnArray(returnArray.GetUpperBound(0) + 1)

        Next wordIndex

    Next wordDistance
+1  A: 

One obvious thing in your code is the usage of ReDim Preserve. That can be quite a slow operation since I think it copies the whole array into a new array every time the size is changed, and since you're doing that inside loops I assume that could be a significant issue.

The simplest way of fixing that is to stop using those kinds of arrays and instead use List with it's Add method.

ho1
+1  A: 

I want to make sure I understand what you are trying to do first. Your problem seems to be:

  • Given a list of strings,
  • Return every possible combination of n items from the list,
  • where n = 2 to length of list

For example, in a list of 5 strings, you would want all combinations of 2 strings, of 3 strings, of 4 strings, and of 5 strings.

If that is an accurate statement of your problem, there is one glaring issue to point out. The number of items you will be generating is on the order of 2 ^ (length of list). This means that trying to generate all combinations of 300 items will never be fast no matter what. Also, for any but the tiniest of lists, you will need to generate items lazily or you will run out of memory.

If you do not want all combinations of all lengths, you may want to clarify your question to better state your desired goal.

Gideon Engelberth
Thanks for the response. I removed the redim process and used arraylists instead of strings to decrease some time but I do acknowledge your point that large list would take a significant amount of time. I changed code a bit but still follow the algorithm above and it works ok but I still would like to limit the size and number of combinations. For example
vbNewbie
if I had the combination list generated as follows for a list of cat, dog, hen, mouse: {cat, dog, hen, mouse, cat dog, dog hen, hen mouse, cat dog hen, dog hen mouse, cat dog hen mouse}::number of items in each combination = 4length of each combination = eg dog hen = 7 including spaces
vbNewbie