tags:

views:

940

answers:

8

Given an unsorted integer array, and without making any assumptions on the numbers in the array, Is it possible to find two numbers whose difference is minimum in O(n) time. Edit: Difference between two numbers a, b is defined as abs(a-b)

I don't know if this is possible, but still?

A: 

No, not without making assumptions about the numbers/ordering.

It would be possible given a sorted list though.

chills42
+6  A: 

I don't think you can to it in O(n). The best I can come up with off the top of my head is to sort them (which is O(n * log n)) and find the minimum difference of adjacent pairs in the sorted list (which adds another O(n)).

Bill the Lizard
I'm thinking the same but @Michael Todd says you can do it int O(n) so I'm eagerly waiting to see what he posts
non sequitor
@non sequitor: Hopefully he gets back to us on that soon. I'm happy to be proven wrong. :)
Bill the Lizard
I think you're close -- a radixsort would give you O(n) sorting time, then O(n) for the adjacent pair iteration.
fbrereto
@fbrereto: Radix sort isn't O(n), it's O(kn) where k is the number of digits in the values in the array. If we knew that k was small we might use radix sort and gain something. Since we don't know the range of the values, we can't assume that k is any smaller than log n.
Bill the Lizard
There's an O(n) algorithm.. see my post below.
Paul Hankin
Err...I was being flippant because this sounded more like a "homework question I want someone else to do for me" rather than a serious question. I've removed my initial comment.
Michael Todd
@Michael Todd: I was just being flippant too. Your original comment was totally okay.
Bill the Lizard
+12  A: 

Find smallest and largest element in the list. The difference smallest-largest will be minimum.

If you're looking for nonnegative difference, then this is of course at least as hard as checking if the array has two same elements. This is called element uniqueness problem and without any additional assumptions (like limiting size of integers, allowing other operations than comparison) requires >= n log n time. It is the 1-dimensional case of finding the closest pair of points.

sdcvvc
I think that for a set [5, 6, 1, 10] they are looking for 1, not -9.
Kathy Van Stone
@Kathy: Then they should have written "Find two *distinct* integers whose *modulus* of difference is minimum". Caveat specificator :)
Rafał Dowgird
+1 for thinking outside the usual assumptions of *difference*.
Mark Ransom
It can be done in O(n) if we believe Wikipedia: "If we allow randomization to be used together with the floor function, the problem [closest pair of points] can be solved in O(n) time."
J.F. Sebastian
@Sebastian: It requires that you put bounds on the integers.
GMan
A: 

The best I can think of is to counting sort the array (possibly combining equal values) and then do the sorted comparisons -- bin sort is O(n + M) (M being the number of distinct values). This has a heavy memory requirement, however. Some form of bucket or radix sort would be intermediate in time and more efficient in space.

Kathy Van Stone
You'd probably have to use radix sort, no? You're not given the possible range of numbers, so counting sort would be immediately out the window.
Andrew Song
Finding the range is an O(n) operation (min, max).
Kathy Van Stone
Yes, but the range can be arbitrarily large in this case. See Bill the Lizard's comment on his response.
PeterAllenWebb
@Peter, no it can't be. The range cannot be bigger than the max - min of the data type.
Joshua
We are discussing asymptotic analysis. The poster has not asked about 32 bit ints, or 64 bit ints. He could very well be using arbitrary precision integers. I know it seems pedantic, but if this distinction was not drawn, you could claim to sort any list of integers in O(n) time, which we know is not possible.
PeterAllenWebb
Counting sort runs in O(n + k), where k is the size of the range of values, not the number of distinct values.
meriton
A: 

Sort the list with radixsort (which is O(n) for integers), then iterate and keep track of the smallest distance so far.

(I assume your integer is a fixed-bit type. If they can hold arbitrarily large mathematical integers, radixsort will be O(n log n) as well.)

meriton
I don't think you're guaranteed k is constant here. What if k = n?
Andrew Song
Radix sort is only O(n) because it builds limits into the algorithm. If you're sorting normal int values which are guaranteed to be unique, it's actually O(1). If you're sorting arbitrarily large numbers, it isn't O(n) any more.
David Thornley
You're right, integer might refer to a variable-length datatype. I have edited my answer to state that assumption.
meriton
A: 

It seems to be possible to sort unbounded set of integers in O(n*sqrt(log(log(n))) time. After sorting it is of course trivial to find the minimal difference in linear time.

But I can't think of any algorithm to make it faster than this.

liori
A: 

I think the answer is no and the proof is similar to the proof that you can not sort faster than n lg n: you have to compare all of the elements, i.e create a comparison tree, which implies omega(n lg n) algorithm.

EDIT. OK, if you really want to argue, then the question does not say whether it should be a Turing machine or not. With quantum computers, you can do it in linear time :)

tulskiy
Why do you have to compare all the elements? Obviously you need to look at them, but why *compare*?
meriton
OK, let's call it "looking at numbers", anyways, there will be a comparison tree.
tulskiy
You are still assuming every good algorithm for this problem will compare and build comparison trees. Since you do not justify this assumption, your proof is incomplete.
meriton
With integers we can do better, because we have more knowledge about the dataset than just simple ordering relation. For example if we know that our dataset consist of K different types of values, you can use counting-sort and do O(n+K). Also, see my answer.
liori
I wanted to say that any algorithm will have to check every possible pair of numbers, which is omega(n lg n).
tulskiy
Even though you can sort unbounded integers in sub-nlog(n) time? Please really check my answer.
liori
A: 

I think it is possible. The secret is that you don't actually have to sort the list, you just need to create a tally of which numbers exist. This may count as "making an assumption" from an algorithmic perspective, but not from a practical perspective. We know the ints are bounded by a min and a max.

So, create an array of 2 bit elements, 1 pair for each int from INT_MIN to INT_MAX inclusive, set all of them to 00.

Iterate through the entire list of numbers. For each number in the list, if the corresponding 2 bits are 00 set them to 01. If they're 01 set them to 10. Otherwise ignore. This is obviously O(n).

Next, if any of the 2 bits is set to 10, that is your answer. The minimum distance is 0 because the list contains a repeated number. If not, scan through the list and find the minimum distance. Many people have already pointed out there are simple O(n) algorithms for this.

So O(n) + O(n) = O(n).

Edit: responding to comments.

Interesting points. I think you could achieve the same results without making any assumptions by finding the min/max of the list first and using a sparse array ranging from min to max to hold the data. Takes care of the INT_MIN/MAX assumption, the space complexity and the O(m) time complexity of scanning the array.

patros
Who says its being implemented in a system with finite-sized numbers?
Andrew Medico
If your algorithm relies on the ints being bounded by a min and a max, it's actually O(1). Given a maximum on n, the operation will take a bounded amount of time.
David Thornley
The length of your array is not in O(n), but you assume it is initialized with 00-elements. Moreover, how do you *scan through the list* to find the smallest distance?
meriton
But from a "practical perspective", the runtime will be stratospheric when dealing with 32-bit (or worse, 64 bit) integers. The runtime is O(1) but the constant cost is steep indeed :)
hythlodayr
This is O(n) + O(m) = O(n + m) where n = <number of elements> and m = <size of integer>
tster