views:

65

answers:

3

I need to run "mode" (which value occurs most frequently) on an array of singles in vb6. Is there a quick way do do this on large arrays?

+2  A: 

Have a look online for a decent implementation of a sort algorithm for VB6 (I can't believe it doesn't have one built in!), sort the array, and then go through it counting the occurrences (which will be straightforward as you've all the same items together in the array) - keep a track of the most frequently occurring item on your way through and you're done. This should be O(n ln(n)) - that is, fast enough - if you've used a decent sort algorithm (quicksort or similar).

Will A
+1 Here's some sort algorithms from StackOverflow http://stackoverflow.com/questions/152319/vba-array-sort-function
MarkJ
+2  A: 

You could use a hash table. Hash all of the elements of your array (which is O(n)). You'll need a back-end data structure to hold the unique values that each hash bin contains and the number of occurances (some sort of associative memory similar to the C++ std::map). As long as you can guarantee that there will be no more than a constant, m, number of collisions (for dissimilar hash input values) in any given bin, this is O(m log m), but since m is constant, this is really O(1). This assumption may not be reasonable, but the key is to get good enough spread for your input values.

To pull out the mode, examine all of the elements in the hash table, which will be values that occur in your original input array and the number of times they occur. Find the value with the largest number of occurances (again O(n)). Total complexity is O(n) if you can find a suitable hash function. Worst case performance will be O(n log n) if the hash function doesn't provide you with good collision performance.

On another note, .Net provides a large runtime library that might make this easier. If it's feasible, you might want to consider using a new version of VB.

andand
Thanks, I have to support a huge legacy vb6 app, so I don't always have the luxury of .net
mikeh
+2  A: 

Included a reference to Microsoft Scripting Runtime, and used a Dictionary object to keep tally of frequency, then looked for index highest frequency and the corresponding key is the mode. Not the quickest/most elegant solution, but I just needed something up fast that worked.

Function fnModeSingle(ByRef pValues() As Single) As Single
    Dim dict As Dictionary
    Set dict = New Dictionary
    dict.CompareMode = BinaryCompare
    Dim i As Long    
    Dim pCurVal As Single
    For i = 0 To uBound(pValues)
    'limit the values that have to be analyzed to desired precision'
        pCurVal = Round(pValues(i), 2) 
            If (pCurVal > 0) Then
                'this will create a dictionary entry if it doesn't exist
                dict.Item(pCurVal) = dict.Item(pCurVal) + 1
            End If
    Next

    'find index of first largest frequency'
    Dim KeyArray, itemArray
    KeyArray = dict.Keys
    itemArray = dict.Items
    pCount = 0
    Dim pModeIdx As Integer
    'find index of mode'
    For i = 0 To UBound(itemArray)
        If (itemArray(i) > pCount) Then
            pCount = itemArray(i)
            pModeIdx = i
        End If
    Next
    'get value corresponding to selected mode index'
    fnModeSingle = KeyArray(pModeIdx)
    Set dict = Nothing

End Function
mikeh
the tolerence is the key here imho, it's not really clear that the mode of a floating point type makes sense unless you include a tolerence to compare to
jk
Thanks, good point, which is why I rounded it to 2 decimal places, which solves my current needs.
mikeh
Note that Round uses Bankers' Rounding, which may or may not be appropriate.Also you could probably do things faster by using a straight array to hold an ordered unique list of Singles (and a separate array to hold corresponding counts). Then use a binary search to lookup the Single value.
Jonathan Swift