views:

570

answers:

7

I am looking for an efficient way to detect the number of unique values in an array.

My current approach:

  1. Quicksort array of integers
  2. Then run a loop to compare elements.

In code:

  yearHolder := '';
  for I := 0 to  High(yearArray) do
  begin
    currYear := yearArray[i];
    if (yearHolder <> currYear) then
    begin
      yearHolder := currYear;
      Inc(uniqueYearNumber);
    end;
  end;
A: 

I load the elements into a HashTable and check before adding each item.

eschneider
A: 

A more efficient algorithm would be to dump everything a hash table (not sure if delphi even has this).

  1. Iterate through the list (in this case, yearArray) and use the values as keys in the hash table.
  2. Retreive the number of keys in the hash table.
Babak Naffas
That's what I said...
eschneider
It is...we must have started writing our posts at the same time. The fact that we agree should say something :-)
Babak Naffas
+4  A: 

In general, you can use this algorithm:

  1. Create a hash table that maps year to count of occurrences.
  2. For each number in your array, put a corresponding entry in a hash table.
  3. When done, get the number of entries in the hash.

However, in your case, your variables are named "year". If this is really a year, this is simpler, because years have a very limited range. Say, the range 0-3000 should be enough. So, instead of a hash table, you can use a simple array of counters. Initialize it with 0s. Then when you see the year 2009, increment the element arr[2009]. At the end, count the number of elements with arr[i] >= 1.

Igor Krivokon
Count > 1, that should work better.
gabr
Depends on what is meant by unique. Count == 1 will count only the non-duplicated items. count >= 1 will cound all the items.
Igor Krivokon
...but his solution is more consistent with >= 1, so you are right; I'm updating my answer. Thanks!
Igor Krivokon
Thanks for the good description. With the example below. It helped. Delphi is limited with the data structures.
Greener
+5  A: 

Here is an example with the THashedStringList:

hl := THashedStringList.Create; // in Inifiles
try
  hl.Sorted := True;
  hl.Duplicates := dupIgnore; // ignores attempts to add duplicates
  for i := 0 to  High(yearArray) do
    hl.Add(yearArray[i]);
  uniqueYearCount := hl.Count;
finally
  hl.Free;
end;
Jeff Cuscutis
Thanks for the code example, it helped. I did not use before HashStrings.
Greener
Does the dupIgnore statement work without declaring h1 to be sorted? (See: http://stackoverflow.com/questions/1064446/why-doesnt-thashedstringlist-ignore-duplicates )
lkessler
My bad, I forgot the dupIgnore doesn't work without Sorted := True;
Jeff Cuscutis
A: 

A minor deviation from the plan could be more worthwhile: never add duplicates to the array in the first place, or add them directly to the proposed hash array.

Up till D2009, there is only THashedStringList (which needs a bunch of costly number -> string conversions and hashes on strings to operate), but if you have D2009 then the Generics.Collections unit has some interesting data structures.

Marco van de Voort
No need for number-to-string conversions. Delphi has TBucketList, in the Contnrs unit, with "specializations" for a few data types, including Integer. (It's not called a hash table, but that doesn't mean that's not what it is.)
Rob Kennedy
A: 

I'd recommend adding the items to a Set and, once completed, reading the size of the resulting Set. Because Sets do not allow duplicates, in Java, DDL, .Net, and many (if not all languages), this is a safe, cheap and reliable method.

Elliot
A: 

In Delphi using DeHL we say: List uniqueWidgets := List.Create( MassiveListOfNonUniqueWidgets.Distinct());

:P

alex