I was looking for ways to do a binary search in Erlang and found http://ruslanspivak.com/2007/08/15/my-erlang-binary-search/. I was wondering whether the solution in that blog post runs in O(lg n). The recurrence for binary search is T(n) = T(n/2) + c, which gives an execution time of O(lg n).

In C, an array lets you access any element in O(1) time. But in Erlang, if accessing the middle of a list takes cn time, then binary search runs in linear time overall, no better than a linear search.
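To make the two cases concrete, here is a short worked expansion of both recurrences. It assumes n is a power of two and that c is the same constant in each step; neither assumption changes the asymptotic result.

```latex
% Constant-time middle access (C array):
T(n) = T(n/2) + c
     = T(n/4) + 2c
     = \dots
     = T(1) + c \log_2 n
     = O(\lg n)

% Linear-time middle access (walking a list to its midpoint):
T(n) = T(n/2) + cn
     = T(1) + cn \left( 1 + \tfrac{1}{2} + \tfrac{1}{4} + \dots \right)
     \le T(1) + 2cn
     = O(n)
```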

I came across lists:nth/2 for finding the nth item in a list, but I am not sure about its execution time.

Any comments?

A: 

lists:nth/2 is O(n). Use the array module for a constant-access data structure (almost like an array in C).

phadej
This is WRONG. The array module is a very flat tree of tuples with a branching factor of about 10, chosen as a compromise between rewrite time and access time. The access time for a single element is still O(log N). Destructive structures such as ETS tables should allow for constant time access, depending on the data and type of table, but this adds the overhead of copying between the table and any Erlang process. Otherwise, a binary (`<<"some_binary">>`) can allow something that looks like pointer arithmetic, and tuples also have O(1) complexity for access to data.
I GIVE TERRIBLE ADVICE
+4  A: 

There are a few data structures that allow O(1) access in Erlang: ETS tables, tuples and binaries.

That said, none of them is an ideal fit for a binary search. An ETS table supports searching from the start; beyond that, data is copied into your process whenever a result is returned, which is unlikely to be optimal for your use case.

Tuples allow O(1) access with element/2, but modifying them has a certain overhead (which is why the array module uses trees of tuples).

Then you have binaries (<<1,2,3,4,5>>), which allow for something similar to pointer arithmetic, like in the following example:

1> Sorted = <<$a,$b,$c,$d,$e,$f,$g,$h>>.
<<"abcdefgh">>
2> <<_:3/binary, X:1/binary, _/binary>> = Sorted.
<<"abcdefgh">>
3> X.
<<"d">>

However, predicting the performance when building the binary is a bit sketchy, and this kind of pointer arithmetic is harder to do if your values have different types and different sizes when represented in a binary.
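If every value has the same fixed width, the offset arithmetic stays simple. The following is a minimal sketch under that assumption: it packs non-negative integers below 2^32 into 4-byte slots and searches them by computing byte offsets. The module name `bin_bsearch` and the functions `pack/1` and `search/2` are made up for illustration and are not part of any library.

```erlang
-module(bin_bsearch).
-export([pack/1, search/2]).

%% Assumption: all values are integers in 0..2^32-1, so each one
%% fits in exactly 4 bytes. Sort first, then pack into one binary.
pack(Ints) ->
    << <<I:32/unsigned>> || I <- lists:sort(Ints) >>.

%% Returns {ok, Index} (0-based slot number) or not_found.
search(Key, Bin) ->
    search(Key, Bin, 0, byte_size(Bin) div 4 - 1).

search(_Key, _Bin, Low, High) when Low > High ->
    not_found;
search(Key, Bin, Low, High) ->
    Mid = (Low + High) div 2,
    Offset = Mid * 4,
    %% Skip Offset bytes, then read one 32-bit value.
    <<_:Offset/binary, Val:32/unsigned, _/binary>> = Bin,
    if
        Val < Key -> search(Key, Bin, Mid + 1, High);
        Val > Key -> search(Key, Bin, Low, Mid - 1);
        true      -> {ok, Mid}
    end.
```

For example, `bin_bsearch:search(30, bin_bsearch:pack([40,10,30,20]))` should find the value in slot 2. With variable-width values this offset computation no longer works, which is the difficulty described above.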

Your best bet would likely be to use a list of values, sort it, then use list_to_tuple/1 to navigate around it with element/2.
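A minimal sketch of that approach could look like the following. The module name `tuple_bsearch` and its functions are hypothetical names chosen for this example; only `lists:sort/1`, `list_to_tuple/1`, `tuple_size/1` and `element/2` come from Erlang itself.

```erlang
-module(tuple_bsearch).
-export([from_list/1, search/2]).

%% Sort the list once and convert it to a tuple so that
%% element/2 gives O(1) access by index.
from_list(List) ->
    list_to_tuple(lists:sort(List)).

%% Returns {ok, Index} (1-based, as used by element/2) or not_found.
search(Key, Tuple) ->
    search(Key, Tuple, 1, tuple_size(Tuple)).

search(_Key, _Tuple, Low, High) when Low > High ->
    not_found;
search(Key, Tuple, Low, High) ->
    Mid = (Low + High) div 2,
    case element(Mid, Tuple) of
        Val when Val < Key -> search(Key, Tuple, Mid + 1, High);
        Val when Val > Key -> search(Key, Tuple, Low, Mid - 1);
        _                  -> {ok, Mid}
    end.
```

Usage would be along the lines of `T = tuple_bsearch:from_list([5,3,9,1,7])` followed by `tuple_bsearch:search(5, T)`. Each step does a constant amount of work, so the search follows the T(n) = T(n/2) + c recurrence; the one-time sort and conversion are paid up front.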

I would, however, strongly recommend using a tree to do your searching; it would likely be much simpler to use the gb_trees module to build a balanced tree and still get O(log N) search.
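As a rough illustration of the tree route, here is a small sketch built on the standard library's `gb_trees` module. The wrapper module `gb_search` and its functions `build/1` and `member/2` are invented for this example, and storing `true` as the value is just a placeholder choice for when only membership matters.

```erlang
-module(gb_search).
-export([build/1, member/2]).

%% Insert every key into a balanced tree. gb_trees:enter/3 inserts
%% or updates, so duplicate keys in the input are harmless.
build(Keys) ->
    lists:foldl(fun(K, Tree) -> gb_trees:enter(K, true, Tree) end,
                gb_trees:empty(),
                Keys).

%% O(log N) membership test.
member(Key, Tree) ->
    gb_trees:is_defined(Key, Tree).
```

For example, `gb_search:member(7, gb_search:build([5,3,9,1,7]))` should return `true`. If you need a value attached to each key, `gb_trees:lookup/2` returns `{value, V}` or `none` instead of a boolean.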

I GIVE TERRIBLE ADVICE