views:

1225

answers:

11

Imagine two positive integers A and B. I want to combine these two into a single integer C.

There can be no other integers D and E which combine to C. So combining them with the addition operator doesn't work. Eg 30 + 10 = 40 = 40 + 0 = 39 + 1 Neither does concatination work. Eg "31" + "2" = 312 = "3" + "12"

This combination operation should also be deterministic (always yield the same result with the same inputs) and should always yield an integer on either the positive or the negative side of integers.

+11  A: 

If A and B can be expressed with 2 bytes, you can combine them on 4 bytes. Put A on the most significant half and B on the least significant half.

In C language this gives (assuming sizeof(short)=2 and sizeof(int)=4):

int combine(short A, short B)
{
    return A<<16 | B;
}

short getA(int C)
{
    return C>>16;
}

short getB(int C)
{
    return C & 0xFFFF;
}
mouviciel
+4  A: 

Let number a be the first, b the second. Let p be the a+1-th prime number, q be the b+1-th prime number

Then, the result is pq, if a<b, or 2pq if a>b. If a=b, let it be p^2.

ASk
I doubt that you'd want a NP solution.
Doesn't this produce the same result for a=5, b=14 and a=6, b=15?
Lieven
Two products of two different primes can't have the same result (unique prime factor decomposition)a=5, b=14 -> result is 13*47 = 611a=6, b=15 -> result is 17*53 = 901
ASk
+3  A: 

The standard mathematical way for positive integers is to use the uniqueness of prime factorization.

f( x, y ) -> 2^x * 3^y

The downside is that the image tends to span quite a large range of integers so when it comes to expressing the mapping in a computer algorithm you may have issues with choosing an appropriate type for the result.

You could modify this to deal with negative x and y by encoding a flags with powers of 5 and 7 terms.

e.g.

f( x, y ) -> 2^|x| * 3^|y| * 5^(x<0) * 7^(y<0)
Charles Bailey
The math is fine. But, as Boris says, if you want to run this as a computer program, you have to take into account the finiteness of the machine. The algorithm will work correctly for a subset of the integers representable in the relevant machine.
Yuval F
I did state this in my second paragraph. The tags on the question indicate 'algorithm', 'mathematical' and 'deterministic', not any particular language. The input range may not be limited and the environment might have an unbounded integer type 'bigint'.
Charles Bailey
+1  A: 

Perhaps you should read up on hash functions, especially perfect hashes? Wikipedia has some nice high-level descriptions with links to implementations and more detailed articles.

Christoffer
+15  A: 

You're looking for a bijective NxN -> N mapping. These are used for e.g. dovetailing. Have a look at this PDF for an introduction to so-called pairing functions. Wikipedia introduces a specific pairing function, namely the Cantor pairing function:

  • pi(k1, k2) = 1/2(k1 + k2)(k1 + k2 + 1) + k2

Three remarks:

  • As others have made clear, if you plan to implement a pairing function, you may soon find you need arbitrarily large integers (bignums).
  • If you don't want to make a distinction between the pairs (a, b) and (b, a), then sort a and b before applying the pairing function.
  • Actually I lied. You are looking for a bijective ZxZ -> N mapping. Cantor's function only works on non-negative numbers. This is not a problem however, because it's easy to define a bijection f : Z -> N, like so:
    • f(n) = n * 2 if n >= 0
    • f(n) = -n * 2 - 1 if n < 0
Stephan202
+1 I think this is the correct answer for unbounded integers.
Unknown
+5  A: 

Is this even possible?
You are combining two integers. They both have the range -2,147,483,648 to 2,147,483,647 but you will only take the positives. That makes 2147483647^2 = 4,61169E+18 combinations. Since each combination has to be unique AND result in an integer, you'll need some kind of magical integer that can contain this amount of numbers.

Or is my logic flawed?

borisCallens
+1 That's what I think too (although I did the calculation saying the order of A and B don't matter)
lc
Yes your logic is correct by the pigeonhole principle. Unforunately the asker did not specify if the integer is bounded or not.
Unknown
Yes, I had that afterthought too, but I thought the message is in essence the same, so I didn't bother recalcing.
borisCallens
Also I just realized I should pick up my chance calculation (literal translation from Dutch) textbooks again.
borisCallens
@Boris: Kansrekening is "probability theory".
Stephan202
+1 if this logic is flawed, so is mine.
Lieven
@Stephan: thx :)
borisCallens
A: 

What you suggest is impossible. You will always have collisions.

In order to map two objects to another single set, the mapped set must have a minimum size of the number of combinations expected:

Assuming a 32-bit integer, you have 2147483647 positive integers. Choosing two of these where order doesn't matter and with repetition yields 2305843008139952128 combinations. This does not fit nicely in the set of 32-bit integers.

You can, however fit this mapping in 61 bits. Using a 64-bit integer is probably easiest. Set the high word to the smaller integer and the low word to the larger one.

lc
A: 

f(a, b) = s(a+b) + a, where s(n) = n*(n+1)/2

  • This is a function -- its deterministic.
  • It is also injective -- f maps different values for different (a,b) pairs. You can prove this using the fact: s(a+b+1)-s(a+b) = a+b+1 gt a.
  • It returns quite small values -- good if your are going to use it for array indexing, as the array does not have to be big.
  • It is cache-friendly -- if two (a, b) pairs are close to each other, then f maps numbers to them which are close to each other (compared to other methods).
  • Disadvantage: the inverse is slow to compute (needs square root).

I did not understand what You mean by:

should always yield an integer on either the positive or the negative side of integers

How can I write (greater than), (less than) characters in this forum?

Zoli
+1  A: 

Check this: http://en.wikipedia.org/wiki/Pigeonhole_principle. If A, B and C are of same type, it cannot be done. If A and B are 16-bit integers, and C is 32-bit, then you can simply use shifting.

The very nature of hashing algorithms is that they cannot provide a unique hash for each different input.

Groo
+1  A: 

It isn't that tough to construct a mapping:

   1  2  3  4  5  use this mapping if (a,b) != (b,a)
1  0  1  3  6 10
2  2  4  7 11 16
3  5  8 12 17 23
4  9 13 18 24 31
5 14 19 25 32 40

   1  2  3  4  5 use this mapping if (a,b) == (b,a) (mirror)
1  0  1  2  4  6
2  1  3  5  7 10
3  2  5  8 11 14
4  4  8 11 15 19
5  6 10 14 19 24


    0  1 -1  2 -2 use this if you need negative/positive
 0  0  1  2  4  6
 1  1  3  5  7 10
-1  2  5  8 11 14
 2  4  8 11 15 19
-2  6 10 14 19 24

Figuring out how to get the value for an arbitrary a,b is a little more difficult.

Dolphin
A: 

I think you can write less than with & l t; and right than with & g t; if there is not html char parsing, but probably there is, I will try to cheat: 5 < 6 & 6 > 5. ( Of course, avoid the space between the characters, if it works, you will see the symbols, not the code )

Lucas