J
NB. random array of floating-point numbers
] y =: 10 (?@$%]) 5
0 0.6 0.2 0.4 0.4 0.8 0.6 0.6 0.8 0
NB. count occurrences
({:,#)/.~ y
0 2
0.6 3
0.2 1
0.4 2
0.8 2
NB. order by occurrences
(\:{:"1)({:,#)/.~ y
0.6 3
0 2
0.4 2
0.8 2
0.2 1
NB. pick the most frequent
{.{.(\:{:"1)({:,#)/.~ y
0.6
I would advise against using a hash, as it assumes exact comparisons, which is never a good assumption on floating-point numbers. You always want to do an epsilon comparison of some sort. What if your array contains some elements 0.2(00000000) and 0.2(00000001), which really should be considered equal, but aren't because they came from different calculations?
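To make the concern concrete, here is a small Python sketch. The values `0.1 + 0.2` and `0.3` are my own illustrative choice, not data from the question. It shows two results that are mathematically equal but end up as separate keys in a hash-based counter.

```python
import math
from collections import Counter

# Two values that "should" be equal but come from different calculations.
a = 0.1 + 0.2
b = 0.3

exactly_equal = (a == b)            # exact comparison, as a hash lookup would do
nearly_equal = math.isclose(a, b)   # tolerant comparison (default relative tolerance)
tally = Counter([a, b])             # hash-based counting

print(exactly_equal)  # False
print(nearly_equal)   # True
print(len(tally))     # 2: the counter sees two distinct keys
```

With a hash, each of these would be reported as occurring once, so neither would win as the mode.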
Conveniently, J always does epsilon-comparison by default. Too conveniently, since it's hidden in the /.~ and I have to write more code in order to demonstrate how to do this in other languages like Python...
epsilon = 0.0001

def almost_equal(a, b):
    # True when a and b differ by at most epsilon
    return -epsilon <= a - b <= epsilon

array = [0.0, 0.6, 0.2, 0.4, 0.4, 0.8, 0.6, 0.6, 0.8, 0.0]

# more efficient would be to keep this in sorted order,
# and use binary search to determine where to insert,
# but this is just a simple demo
counts = []  # list of (value, count) pairs
for a in array:
    for i, (b, c) in enumerate(counts):
        if almost_equal(a, b):
            counts[i] = (b, c + 1)
            break
    else:
        # no break: nothing in counts is close to a, so start a new entry
        counts.append((a, 1))

# sort by frequency, extract key of most frequent
print("Mode is %f" % sorted(counts, key=lambda pair: pair[1])[-1][0])
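The comment in the demo mentions keeping the entries sorted and using binary search. A minimal sketch of that idea follows, using the standard `bisect` module. The function name `mode_with_tolerance` and the parallel `keys`/`counts` lists are my own choices for illustration. Two caveats: it matches a value against the smallest stored key within epsilon rather than the first one seen, so results can differ from the linear scan on borderline data, and inserting into a Python list is still linear time even though the lookup is logarithmic.

```python
import bisect

EPSILON = 0.0001  # same tolerance as the demo; adjust to your data


def mode_with_tolerance(values, epsilon=EPSILON):
    """Return the most frequent value, treating values within epsilon as equal."""
    keys = []    # representative values, kept in ascending order
    counts = []  # counts[i] is how many inputs matched keys[i]
    for v in values:
        # Index of the first stored key that is >= v - epsilon.
        i = bisect.bisect_left(keys, v - epsilon)
        if i < len(keys) and keys[i] <= v + epsilon:
            # That key lies within epsilon of v: count v as another occurrence.
            counts[i] += 1
        else:
            # Everything before i is too small and keys[i] (if any) is too large,
            # so inserting at i keeps the list sorted.
            keys.insert(i, v)
            counts.insert(i, 1)
    # Pick the key with the highest count (first one on ties).
    best = max(range(len(keys)), key=counts.__getitem__)
    return keys[best]


array = [0.0, 0.6, 0.2, 0.4, 0.4, 0.8, 0.6, 0.6, 0.8, 0.0]
print("Mode is %f" % mode_with_tolerance(array))
```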