views:

185

answers:

8

How to search fast in a non-sorted array? I am unable to think of any other search mechanism apart from linear search.

Any pointers will be helpful.

+7  A: 

Yes, without sorting the data a linear search is about as good as you can do.

Jerry Coffin
+3  A: 

Linearly. Or sort your array and search that at log time.

Michael Dorgan
But O(n log n) + O(log n) > O(n). If this is a one time search sorting it is useless. Even faster sorts (if they are applicable) are still only O(n). I agree that if the array is going to be searched multiple times, sorting would be better.
Yacoby
@Yacoby: unless the data in question is "eligible" for a sort that isn't based on comparisons (a few of which are linear). Even these probably don't pay off for a *single* search, but can still break even with far fewer searches.
Jerry Coffin
+4  A: 

Given a truly random non-sorted array, linear search is the best way.

You might be able to do better with some kind of heuristic approach if the data could be assumed to be roughly sorted.

dicroce
+1  A: 

There really isn't any kind of optimization to be done without your array being sorted. All optimizations hinge on having some kind of knowledge about your array being guaranteed. Without that, you can't assume anything and have to go through each item linearly.

Tim Coker
A: 

Once your code is presented with an unsorted list, yes you will need to do a linear search.

There are a couple of things you can do however if it is liable to be searched more than once.

First off, of course, would be sorting the list. It would make the first lookup take quite a while, but subsequent ones would be faster.

Secondly, I'd suggest caching the result. For most usages I've found subsequent lookups have a high likelyhood to be either the same value, or one nearby.

You shouldn't bother with either of these though, unless the list is really large and slow to search. In general, the less code you can get away with writing, the better off everyone is.

T.E.D.
A: 

If you are able, you might consider changing the way the data is stored/created in the first place. Rather than adding items to an array, create a full blown binary tree (or whatever) and work with that. It makes manipulating the data slightly slower but it shouldn't be as slow as re-sorting the data every time and would make your searches much faster.

Mat
+1  A: 

As others have mentionned, a linear search is the way to go. However, depending upon your setup, the following may or may not be relevant to speeding things up.

If some items are searched for more often than others, a small cache might be useful. For example, if you are implementing a lookup() operation for a file system (where the data is ultimately stored on disk somewhere), a cache can help immensely.

Can you sort/sift the data at all so that the most commonly looked for items are near the start of the array?

Is it practical to store your data in two sets? One is sorted, one is unsorted? The sorted, you should be able to search quickly for your item. If not found, go to your unsorted list and search linearly through that. Why? Although you said it is not practical to insert after every insertion, is it practical/efficient to insert from the unsorted list into the sorted list periodically?

Must it all be in one unsorted array? Can you put your data into blocks of sorted data? That is, you might 1000 sorted entries per block. Searching each sorted block might be faster than searching the entire unsorted list.

Just some ideas. They might not be applicable to your situation, but hopefully they are helpful.

Sparky
+1  A: 

There are some clever approaches for optimizing linear search, but in the end none of them really pay of in terms of performance but make the code error-prone and hard to understand. So I'd recommend sticking to a simple linear search (i.e. simple iterating + comparing).

Helper Method
@Helper Method: What are the "clever approaches" that you mentioned?
Lazer
Using a sentinel, or counting down instead of up.
Helper Method
@Helper Method: thanks! Read about them [here](http://en.wikipedia.org/wiki/Linear_search#Using_a_sentinel).
Lazer
@Lazer See my post http://stackoverflow.com/questions/2496013/is-this-linear-search-implementation-actually-useful for further details.
Helper Method