views: 983

answers: 11

[Description] Given two integer arrays of the same length, design an algorithm that judges whether they are the same. The definition of "same" is that, if the two arrays were in sorted order, the elements at corresponding positions would be equal.

[Example]
<1 2 3 4>  = <3 1 2 4>
<1 2 3 4> != <3 4 1 1>

[Limitation] The algorithm should require constant extra space, and O(n) running time.

+4  A: 

You can try a probabilistic approach - convert each array into a number in some huge base B, modulo some prime P: for example, sum B^a_i over all i, mod some big-ish prime P. If both arrays come out to the same number, try again with as many primes as you want. If the values differ on any attempt, the arrays are not equal. If they pass enough challenges, they are equal, with high probability.

There's a trivial proof for B > N and P greater than the biggest possible value: then distinct multisets yield distinct numbers, so for unequal arrays there must be a challenge that cannot be met. This is actually the deterministic approach, though the complexity analysis might be more difficult, depending on how one measures complexity in terms of the size of the input (as opposed to just the number of elements).
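A minimal sketch of this fingerprinting idea in Python; the specific primes and the random base are illustrative choices, not part of the answer, and elements are assumed non-negative here:

```python
import random

# A few large primes to draw challenges from (arbitrary choices).
PRIMES = [10**9 + 7, 10**9 + 9, 998244353]

def fingerprint(arr, base, prime):
    # Sum of base**a_i mod prime: order-independent, so equal
    # multisets always produce equal fingerprints.
    return sum(pow(base, a, prime) for a in arr) % prime

def probably_same(a, b, trials=10):
    if len(a) != len(b):
        return False
    for _ in range(trials):
        p = random.choice(PRIMES)
        base = random.randrange(2, p)
        if fingerprint(a, base, p) != fingerprint(b, base, p):
            return False  # a failed challenge: definitely different
    return True  # equal, with high probability
```

A single mismatch is conclusive; only the "equal" verdict is probabilistic.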

Larry
Always a chance of collision.
Seva Alekseyev
I'm not convinced that this is 100%, but hey, if you're on an interview, you can do worse, probably.
Larry
example of collision?
Timmy
@Timmy, this will surely sometimes have collisions, because the amount of different *arrays* is much greater than the amount of different *integers*.
Pavel Shved
Why do you say "high probability" instead of calculating it? The ideal probability that can't be overcome is (1 - 1/MAX_INT).
Pavel Shved
*With high probability* is an actual term used in probabilistic algorithms - http://www.algorithmist.com/index.php/With_high_probability
Larry
@Larry, it's hard to use the wiki article you wrote as a trusted source ;-) Anyway, thanks for the info. :-)
Pavel Shved
Haha, I meant that as a shorthand to writing it out, though that is also the first result in Google! ;)
Larry
And it seems like all the definitions are in papers. But generally, the term is used when you can make the probability as small as you want.
Larry
+2  A: 
  1. Insert all elements from the first array into a hashtable
  2. Try to insert all elements from the second array into the same hashtable - for each insert, the element should already be there

Ok, this is not constant extra space, but it's the best I could come up with at the moment :-). Are there any other constraints imposed on the question, such as the biggest integer that may be included in the array?
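A Python sketch of this idea that also stores counts rather than bare membership, so duplicates are handled (the comments below point out that plain membership testing is not enough):

```python
from collections import Counter

def same_multiset(a, b):
    # Pass 1: count occurrences in the first array.
    counts = Counter(a)
    # Pass 2: consume the counts with the second array.
    for x in b:
        if counts[x] == 0:
            return False  # b contains x more times than a does
        counts[x] -= 1
    # The arrays have equal length, so every count is back to zero.
    return True
```

This assumes the two arrays have equal length, as the question states; the table still costs O(n) extra space in the worst case.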

pajton
"Insert all elements from the first array into a hashtable" You need O(n) memory to allocate a hash table, while you are only given O(1).
Pavel Shved
@Pavel: no, you don't need O(n). O(n) would mean that if the arrays were doubled in size (for example) you'd expect the hash table to also (approximately) double in size. That's not the case at all.
Jerry Coffin
@Pavel in worst case I need O(n), I stated that in my post. I am aware this is just partial solution, but maybe can be improved.
pajton
@Jerry, @pajton, you'll need O(n) in *any* case, provided all values in the array are different. When you add an element to a table, you *have to* store the exact value you've added (for future comparisons). It will be either a new item in a list (for bucket lists), or a requirement to have at least n buckets (open addressing). These all are O(n).
Pavel Shved
@pajton, taking no account of the cost of extra space, your solution is still wrong. Take the following as an example: <1 1 3 3>, <1 3 3 3>. Can you say they're the same?
meta
@meta Right, but here is an easy fix: use a hashmap instead of a hashtable and store the count of every element with its hash. Still `O(n)`.
pajton
+12  A: 
KennyTM
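For context, a textbook LSD radix sort for bounded non-negative integers might look like the sketch below; note that this plain version uses O(N) auxiliary space, unlike the in-place variant the comments here discuss:

```python
def radix_sort(arr, word_bits=32, radix_bits=8):
    # LSD radix sort: one stable bucket pass per radix_bits-wide digit.
    # O(N * word_bits / radix_bits) time for bounded non-negative ints.
    mask = (1 << radix_bits) - 1
    for shift in range(0, word_bits, radix_bits):
        buckets = [[] for _ in range(1 << radix_bits)]
        for x in arr:
            buckets[(x >> shift) & mask].append(x)
        arr = [x for bucket in buckets for x in bucket]
    return arr
```

Sorting both arrays this way and comparing them element-wise gives the O(N)-time check, at the cost of the auxiliary space.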
Only if they are bounded integers.
Larry
Given that we're talking about a computer here, they're always bounded. Granted, an integer that occupied (say) 16 gigabytes would have an extremely large upper bound, but it's still a bound.
Jerry Coffin
By that logic, you can say the naive "hashtable" solution is right, by the fact that you just set the hashtable at a constant, but huge, size.
Larry
@Larry: if you look at my answer, I did say exactly that.
Jerry Coffin
So the conclusion of this discussion is that it cannot be done
Christopher Oezbek
@Larry: The hash table solution is correct only if you can layer the table "in-place" with one of the arrays. (Thus use O(1) space). But the radix sort uses O(1) space as long as the integer word size is O(log N). Read the abstract.
KennyTM
Radix sort's running time is not `O(N)` for unbounded integers.
Larry
@Larry: That's true. Therefore the remark in the last line.
KennyTM
By that logic, on bounded integer, create a hash table of size MAX_INT. This is a constant, I'm sure of it! ;P
Larry
@Larry: That's a bit different. The integer range needs to grow super-polynomially with array size.
KennyTM
I didn't say it's efficient, just that it's `O(1)` extra space.
Larry
@Larry: I did not say anything about efficiency.
KennyTM
Then what are you saying?
Larry
@Larry: I mean only if the array is `[1,3,9,27,81,...]` then radix sort will fail.
KennyTM
A: 

For each array, use the counting sort technique to build the count of elements less than or equal to each particular value. Then compare the two auxiliary arrays at every index: if they are equal, the arrays are equal, otherwise they are not. Counting sort requires O(n), and the array comparison at every index is again O(n), so in total it is O(n); the space required is equal to the size of the two auxiliary arrays. Here is a link to counting sort: http://en.wikipedia.org/wiki/Counting_sort.

Nagaraj
The space required is O(n). Does not fly.
Seva Alekseyev
Sorry, I think I jumped early
Nagaraj
+2  A: 

A few answers are basically correct, even though they don't look like it. The hash table approach (for one example) has an upper limit based on the range of the type involved rather than the number of elements in the arrays. At least by most definitions, that makes the upper limit on the space a constant, although the constant may be quite large.

In theory, you could change that from an upper limit to a true constant amount of space. Just for example, if you were working in C or C++, and it was an array of char, you could use something like:

size_t counts[UCHAR_MAX + 1];

Since UCHAR_MAX is a constant, the amount of space used by the array is also a constant.

Edit: I'd note for the record that a bound on the ranges/sizes of items involved is implicit in nearly all descriptions of algorithmic complexity. Just for example, we all "know" that Quicksort is an O(N log N) algorithm. That's only true, however, if we assume that comparing and swapping the items being sorted takes constant time, which can only be true if we bound the range. If the range of items involved is large enough that we can no longer treat a comparison or a swap as taking constant time, then its complexity would become something like O(N log N log R), where R is the range, so log R approximates the number of bits necessary to represent an item.
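A sketch of the bounded-range counting idea in Python; the value range of 256 mirrors the `char` example above and is an assumption for illustration:

```python
RANGE = 256  # UCHAR_MAX + 1: size depends on the type's range, not on n

def same_bounded(a, b):
    # "Constant" extra space: the table size is fixed by RANGE.
    counts = [0] * RANGE
    for x in a:
        counts[x] += 1
    for x in b:
        counts[x] -= 1
    # Every count returns to zero exactly when the multisets match.
    return all(c == 0 for c in counts)
```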

Jerry Coffin
I do not agree with you. By this reasoning, all algorithms operating in O(n) would be constant: if you have an array of size n, and you know n is expressed as an integer, it has fewer than MAX_INT elements, therefore constant.
pajton
It's what makes these interview questions academic, sometimes. `O(N)` at all costs isn't practical!
Larry
@pajton: not really comparable. You're talking about a limit on the size of the array being processed. I'm talking about a limit on the range of the items IN the array, regardless of its size.
Jerry Coffin
@Pajton Consider the answer that KennyTM gave. It links to an O(n) radix sort that only works if the range of the integers is limited. Otherwise sorting cannot work in O(n).
Christopher Oezbek
I see your point. But the author didn't say anything about a bound on the elements in the array, so if you assume their range is bounded, that's the same kind of assumption as bounding the length.
pajton
"a bound on the ranges/sizes of items involved is implicit in nearly all descriptions of algorithmic complexity." - not necessarily, the alternative is to say that your big-O analyses of comparison sorts don't estimate time, but numbers of comparisons, copies, and time excluding them.
Steve Jessop
@Steve: I think we mostly agree here. I said: "nearly all" for a reason -- while it's certainly possible to be more explicit, it's only really common in places like textbooks and standards.
Jerry Coffin
True, and as long as people are commonly seen making absurd statements like, "no it isn't O(N^2), it's O(N)", there's probably not much point worrying too much how precise they're being as to what they're describing in the first place ;-)
Steve Jessop
A: 

Is this a trick question? If the authors assumed integers to be within a given range (2^32 etc.) then "extra constant space" might simply be an array of size 2^32 in which you count the occurrences in both lists.

If the integers are unranged, it cannot be done.

Christopher Oezbek
A: 

You could add each element into a hashmap<Integer, Integer>, with the following rules: Array A is the adder, array B is the remover. When inserting from Array A, if the key does not exist, insert it with a value of 1. If the key exists, increment the value (keep a count). When removing, if the key exists and is greater than 1, reduce it by 1. If the key exists and is 1, remove the element.

Run through array A followed by array B using the rules above. If at any time during the removal phase array B does not find an element, you can immediately return false. If after both the adder and remover are finished the hashmap is empty, the arrays are equivalent.

Edit: The size of the hashtable will be equal to the number of distinct values in the array - does this fit the definition of constant space?
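The adder/remover rules can be sketched in Python like this (the answer suggests Java's hashmap<Integer, Integer>; a plain dict stands in here):

```python
def equivalent(a, b):
    counts = {}
    # Adder phase: array A increments counts.
    for x in a:
        counts[x] = counts.get(x, 0) + 1
    # Remover phase: array B decrements, deleting keys that reach zero.
    for x in b:
        if x not in counts:
            return False  # b has an element (or an extra copy) a lacks
        if counts[x] == 1:
            del counts[x]
        else:
            counts[x] -= 1
    return not counts  # empty map means the arrays are equivalent
```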

kgrad
No, it is not equal to the length of the array, it is equal to the number of distinct values in the array.
Svante
@Svante: whoops, true!
kgrad
A: 

Given that the ints are in the range -n..+n, a simple way to check for equality may be the following (pseudo code):

// a & b are the arrays
accumulator = 0
arraysize = size(a)
for(i=0 ; i < arraysize; ++i) {
  accumulator = accumulator + a[i] - b[i]
  if abs(accumulator) > ((arraysize - i) * n) { return FALSE }
}
return (accumulator == 0)

The accumulator must be able to store integers in the range +- (arraysize * n)

Francesco R.
Your code is not working properly for [1,2,4] [2,3,2]
Victor Hurdugaci
A: 

I imagine the solution will require some sort of transformation that is both associative and commutative and guarantees a unique result for a unique set of inputs. However I'm not sure if that even exists.

Niki Yoshiuchi
I was thinking something along those lines, eg. proving that for subset A,B of Z\{0,1} where |A| = |B|, then if sum(a_i) = sum(b_i) and prod(a_i) = prod(b_i) we must have A = B after some permutation ... the hard part is the proof :P
Mads Ravn
Exactly! I tried using sum() and prod() but it failed on several inputs, for example: [-10, -10, -10, -1, 9] != [6, -3, 5, -10, -10] but their sums and products are equal.
Niki Yoshiuchi
+1  A: 

I claim that unless the range of input is specified, it is IMPOSSIBLE to solve in constant extra space and O(n) running time.

I will be happy to be proven wrong, so that I can learn something new.

ArunSaha
Can you prove the lower bound for this problem?
meta
A: 

How 'bout this - XOR all the numbers in both the arrays. If the result is 0, you got a match.

DevNull
No - that doesn't work: consider the two 2-element arrays `a[2] = { 1, 3 };` and `b[2] = { 2, 0 };`. Using your method, a and b would be considered to be identical.
Paul R
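The collision is easy to reproduce: XOR only tracks per-bit parity, so many different multisets cancel to the same value. A quick check in Python:

```python
from functools import reduce
from operator import xor

a = [1, 3]
b = [2, 0]

# XOR over both arrays is 0, yet the sorted arrays differ.
assert reduce(xor, a + b) == 0
assert sorted(a) != sorted(b)
```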