views:

1351

answers:

13

Scroll down to see latest edit, I left all this text here just so that I don't invalidate the replies this question has received so far!


I have the following brain teaser I'd like to get a solution for, I have tried to solve this but since I'm not mathematically that much above average (that is, I think I'm very close to average) I can't seem wrap my head around this.

The problem: Given number x should be split to a serie of multipliers, where each multiplier <= y, y being a constant like 10 or 16 or whatever. In the serie (technically an array of integers) the last number should be added instead of multiplied to be able to convert the multipliers back to original number.

As an example, lets assume x=29 and y=10. In this case the expected array would be {10,2,9} meaning 10*2+9. However if y=5, it'd be {5,5,4} meaning 5*5+4 or if y=3, it'd be {3,3,3,2} which would then be 3*3*3+2.

I tried to solve this by doing something like this:

  1. while x >= y, store y to multipliers, then x = x - y
  2. when x < y, store x to multipliers

Obviously this didn't work, I also tried to store the "leftover" part separately and add that after everything else but that didn't work either. I believe my main problem is that I try to think this in a way too complex manner while the solution is blatantly obvious and simple.

To reiterate, these are the limits this algorithm should have:

  • has to work with 64bit longs
  • has to return an array of 32bit integers (...well, shorts are OK too)
  • while support for signed numbers (both + and -) would be nice, if it helps the task only unsigned numbers is a must

And while I'm doing this using Java, I'd rather take any possible code examples as pseudocode, I specifically do NOT want readily made answers, I just need a nudge (well, more of a strong kick) so that I can solve this at least partly myself. Thanks in advance.

Edit: Further clarification

To avoid some confusion, I think I should reword this a bit:

  • Every integer in the result array should be less or equal to y, including the last number.
  • Yes, the last number is just a magic number.
  • No, this is isn't modulus since then the second number would be larger than y in most cases.
  • Yes, there is multiple answers to most of the numbers available, however I'm looking for the one with least amount of math ops. As far as my logic goes, that means finding the maximum amount of as big multipliers as possible, for example x=1 000 000,y=100 is 100*100*100 even though 10*10*10*10*10*10 is equally correct answer math-wise.

I need to go through the given answers so far with some thought but if you have anything to add, please do! I do appreciate the interest you've already shown on this, thank you all for that.

Edit 2: More explanations + bounty

Okay, seems like what I was aiming for in here just can't be done the way I thought it could be. I was too ambiguous with my goal and after giving it a bit of a thought I decided to just tell you in its entirety what I'd want to do and see what you can come up with.

My goal originally was to come up with a specific method to pack 1..n large integers (aka longs) together so that their String representation is notably shorter than writing the actual number. Think multiples of ten, 10^6 and 1 000 000 are the same, however the representation's length in characters isn't.

For this I wanted to somehow combine the numbers since it is expected that the numbers are somewhat close to each other. I firsth thought that representing 100, 121, 282 as 100+21+161 could be the way to go but the saving in string length is neglible at best and really doesn't work that well if the numbers aren't very close to each other. Basically I wanted more than ~10%.

So I came up with the idea that what if I'd group the numbers by common property such as a multiplier and divide the rest of the number to individual components which I can then represent as a string. This is where this problem steps in, I thought that for example 1 000 000 and 100 000 can be expressed as 10^(5|6) but due to the context of my aimed usage this was a bit too flaky:

The context is Web. RESTful URL:s to be specific. That's why I mentioned of thinking of using 64 characters (web-safe alphanumberic non-reserved characters and then some) since then I could create seemingly random URLs which could be unpacked to a list of integers expressing a set of id numbers. At this point I thought of creating a base 64-like number system for expressing base 10/2 numbers but since I'm not a math genius I have no idea beyond this point how to do it.

The bounty

Now that I have written the whole story (sorry that it's a long one), I'm opening a bounty to this question. Everything regarding requirements for the preferred algorithm specified earlier is still valid. I also want to say that I'm already grateful for all the answers I've received so far, I enjoy being proven wrong if it's done in such a manner as you people have done.

The conclusion

Well, bounty is now given. I spread a few comments to responses mostly for future reference and myself, you can also check out my SO Uservoice suggestion about spreading bounty which is related to this question if you think we should be able to spread it among multiple answers.


Thank you all for taking time and answering!

A: 

Isn't this modulus?

Let / be integer division (whole numbers) and % be modulo.

int result[3];

result[0] = y;
result[1] = x / y;
result[2] = x % y;
Daniel A. White
Only when there are 2 divisors, doesn't work for more.
Eric Burnett
x/y could very well be greater than y and an elementary number to boot in which case it cannot be divided further.
Mikko Rantanen
A: 

Like in my comment above, I'm not sure I understand exactly the question. But assuming integers (n and a given y), this should work for the cases you stated:

multipliers[0] = n / y;
multipliers[1] = y;
addedNumber = n % y;
jfclavette
A: 

Just set x:=x/n where n is the largest number that is less both than x and y. When you end up with x<=y, this is your last number in the sequence.

zvrba
+6  A: 

Edit after final explanation:

I would think base64 is the best solution, since there are standard functions that deal with it, and variants of this idea don't give much improvement. This was answered with much more detail by others here.

Regarding the original question, although the code works, it is not guaranteed to run in any reasonable time, as was answered as well as commented on this question by LFSR Consulting.

Original Answer:

You mean something like this?

Edit - corrected after a comment.

shortest_output = {}

foreach (int R = 0; R <= X; R++) {
    // iteration over possible remainders
    // check if the rest of X can be decomposed into multipliers
    newX = X - R;
    output = {};

    while (newX > Y) {
       int i;
       for (i = Y; i > 1; i--) {
           if ( newX  % i == 0) { // found a divider
        output.append(i);
        newX  = newX /i;  
        break;
           }
       }

       if (i == 1) { // no dividers <= Y
          break;
       }
    }
    if (newX != 1) {
     // couldn't find dividers with no remainder
     output.clear();
    }
    else {
     output.append(R);
            if (output.length() < shortest_output.length()) {
                 shortest_output = output;
            }
    }
}
Yevgeny Doctor
That won't work with the example in the question. x=29 and y=10 => {10,2,9}. The way he states it, your dividers don't have to be dividers at all, just add a magic constant at the end.
jfclavette
You're right. Well, you can just iterate (brute force) over possible values of the remainder, and than check if the remaining value of X is a multiple (with no remainder) of some multipliers
Yevgeny Doctor
He wants the least number of divisors possible, so you should change the for loop to go from Y to 2 instead of from 2 to Y so that the largest possible is always selected.
Eric Burnett
Done. I just thought of primary numbers multipliers by default.
Yevgeny Doctor
One more thing: to find the true best decomposition you'll have to try all the remainders and see which gives the shortest output list. Eg x=27, y=5 gives 3*3*3+0, but a better solution is 5*5+2.
Eric Burnett
Indeed. Corrected.
Yevgeny Doctor
Good try... but Let X = long.MaxValue; Y = 10; And call me when your algorithm completes! :P Running a simple trial where I take the time at R % 1 billion (factors into long.MaxValue ~9,223,372,036 times) it takes ~8:00... extrapolating to 73,786,976,288 minutes.
Gavin Miller
The original problem asked talked about factoring a number into it's multipliers. That problem is not solvable in reasonable time by ANY algorithm - that is what RSA (Public Key / Private Key) encryption is based on...
Yevgeny Doctor
@Yevgeny - I agree and that is the basis for my answer: http://stackoverflow.com/questions/789085/expressing-an-integer-as-a-series-of-multipliers/804128#804128
Gavin Miller
+3  A: 

Updated after the full story

Base64 is most likely your best option. If you want a custom solution you can try implementing a Base 65+ system. Just remember that just because 10000 can be written as "10^4" doesn't mean that everything can be written as 10^n where n is an integer. Different base systems are the simplest way to write numbers and the higher the base the less digits the number requires. Plus most framework libraries contain algorithms for Base64 encoding. (What language you are using?).

One way to further pack the urls is the one you mentioned but in Base64.

int[] IDs;
IDs.sort() // So IDs[i] is always smaller or equal to IDs[i-1].

string url = Base64Encode(IDs[0]);

for (int i = 1; i < IDs.length; i++) {
  url += "," + Base64Encode(IDs[i-1] - IDs[i]);
}

Note that you require some separator as the initial ID can be arbitrarily large and the difference between two IDs CAN be more than 63 in which case one Base64 digit is not enough.

Updated

Just restating that the problem is unsolvable. For Y = 64 you can't write 87681 in multipliers + remainder where each of these is below 64. In other words, you cannot write any of the numbers 87617..87681 with multipliers that are below 64. Each of these numbers has an elementary term over 64. 87616 can be written in elementary terms below 64 but then you'd need those + 65 and so the remainder will be over 64.

So if this was just a brainteaser, it's unsolvable. Was there some practical purpose for this which could be achieved in some way other than using multiplication and a remainder?

And yes, this really should be a comment but I lost my ability to comment at some point. :p

I believe the solution which comes closest is Yevgeny's. It is also easy to extend Yevgeny's solution to remove the limit for the remainder in which case it would be able to find solution where multipliers are smaller than Y and remainder as small as possible, even if greater than Y.

Old answer:

If you limit that every number in the array must be below the y then there is no solution for this. Given large enough x and small enough y, you'll end up in an impossible situation. As an example with y of 2, x of 12 you'll get 2 * 2 * 2 + 4 as 2 * 2 * 2 * 2 would be 16. Even if you allow negative numbers with abs(n) below y that wouldn't work as you'd need 2 * 2 * 2 * 2 - 4 in the above example.

And I think the problem is NP-Complete even if you limit the problem to inputs which are known to have an answer where the last term is less than y. It sounds quite much like the [Knapsack problem][1]. Of course I could be wrong there.

Edit:

Without more accurate problem description it is hard to solve the problem, but one variant could work in the following way:

  1. set current = x
  2. Break current to its terms
  3. If one of the terms is greater than y the current number cannot be described in terms greater than y. Reduce one from current and repeat from 2.
  4. Current number can be expressed in terms less than y.
  5. Calculate remainder
  6. Combine as many of the terms as possible.

(Yevgeny Doctor has more conscise (and working) implementation of this so to prevent confusion I've skipped the implementation.)

Mikko Rantanen
Maybe this idiom could help: "I want to carry exactly x kilograms worth of boxes where each box is as big as possible (within limit y) and the total amount of boxes is as small as possible."Are we thinking the same thing here? Your solution looks interesting, I'll comment later if it really is what I was looking for.
Esko
Carrying boxes is an additive operation which is easy to solve too. Just pack your stuff to boxes of size Y. My solution focuses mainly on minimizing the remainder. So given X of 30 and Y of 5 it will find the solution of 2*3*5 + 0 as it got smaller remainder than 5*5+5.If cpu effiency is of no importance you could break all the numbers from X to X-Y to their terms, combine the terms for the numbers for which all the terms are less or equal to Y and check the one with fewest terms in the end. Of course you need to prepare for NONE of the numbers X..X-Y having all terms less or equal to Y.
Mikko Rantanen
The concept for your solution seems correct, if overly complicated, however as you noted it is effectively the same as Yevgeny's (current) solution. The implementation has the same bug that his did: it will not find better solutions where the remainder is higher (eg x=27, y=5 gives 3*3*3+0, but a better solution is 5*5+2). Also, your GetTerms(...) method seems to be the actual implementation of the Combine() method, and the real implementation is missing.
Eric Burnett
Also, there is a bug with combine: it will not find the smallest combination. For example, {2,2,2,3,3} with y=9 will combine to {2,6,6} instead of {8,9}.
Eric Burnett
Yes, I admit it being overly complicated. The point was to use it to describe the idea behind the solution. I've now removed the solution as Yevgeny's solution has these bugs fixed and is easier to understand as well.Fixing the solution seemed moot as it does not solve the problem as it is unsolvable unless rephrased in some other way. The "Optimal remainder higher than minimum remainder" is noted in the comment above though.(And yup. I misnamed my Combine method as GetTerms by accident. :<)
Mikko Rantanen
I just want to record that this was -I think- the first or second answer which went more into detail than most and contains a lot of relevant information regarding the subject. Thanks Mikko for writing all this!
Esko
+1  A: 

Update: I didn't get everything, thus I rewrote the whole thing in a more Java-Style fashion. I didn't think of the prime number case that is bigger than the divisor. This is fixed now. I leave the original code in order to get the idea.

Update 2: I now handle the case of the big prime number in another fashion . This way a result is obtained either way.

public final class PrimeNumberException extends Exception {

    private final long primeNumber;

    public PrimeNumberException(long x) {
        primeNumber = x;
    }

    public long getPrimeNumber() {
        return primeNumber;
    }
}

public static Long[] decompose(long x, long y) {
    try {
        final ArrayList<Long> operands = new ArrayList<Long>(1000);
        final long rest = x % y;
        // Extract the rest so the reminder is divisible by y
        final long newX = x - rest;
        // Go into recursion, actually it's a tail recursion
        recDivide(newX, y, operands);            
    } catch (PrimeNumberException e) {
        // return new Long[0];
        // or do whatever you like, for example
        operands.add(e.getPrimeNumber());
    } finally {
        // Add the reminder to the array
        operands.add(rest);
        return operands.toArray(new Long[operands.size()]);
    }
}

// The recursive method
private static void recDivide(long x, long y, ArrayList<Long> operands)
    throws PrimeNumberException {
    while ((x > y) && (y != 1)) {
        if (x % y == 0) {
            final long rest = x / y;
            // Since y is a divisor add it to the list of operands
            operands.add(y);
            if (rest <= y) {
                // the rest is smaller than y, we're finished
                operands.add(rest);
            }
            // go in recursion
            x = rest;
        } else {
            // if the value x isn't divisible by y decrement y so you'll find a 
            // divisor eventually
            if (--y == 1) {
                throw new PrimeNumberException(x);
            }
        }
    }
}


Original: Here some recursive code I came up with. I would have preferred to code it in some functional language but it was required in Java. I didn't bother converting the numbers to integer but that shouldn't be that hard (yes, I'm lazy ;)

public static Long[] decompose(long x, long y) {
    final ArrayList<Long> operands = new ArrayList<Long>();
    final long rest = x % y;
    // Extract the rest so the reminder is divisible by y
    final long newX = x - rest;
    // Go into recursion, actually it's a tail recursion
    recDivide(newX, y, operands);
    // Add the reminder to the array
    operands.add(rest);
    return operands.toArray(new Long[operands.size()]);
}

// The recursive method
private static void recDivide(long newX, long y, ArrayList<Long> operands) {
    long x = newX;
    if (x % y == 0) {
        final long rest = x / y;
        // Since y is a divisor add it to the list of operands
        operands.add(y);
        if (rest <= y) {
            // the rest is smaller than y, we're finished
            operands.add(rest);
        } else {
            // the rest can still be divided, go one level deeper in recursion
            recDivide(rest, y, operands);
        }
    } else {
        // if the value x isn't divisible by y decrement y so you'll find a divisor    
        // eventually
        recDivide(x, y-1, operands);
    }
}
boutta
Number <-> character replacement is laughably easy anyway :) This code looks scaringly short and effective, gonna test it right away.
Esko
OK I'm impressed. This has resemblance to what I tried to do originally (and failed at) but ... wow. I'm simply impressed now.
Esko
...heh, my initial impression came down a bit since this has the problem that I feared for, it can't handle all the numbers which can't be divided to parts correctly. For example decompose(49535, 64) will cause StackOverflowError. Good effort so far though, I think you can do some additional magic to this and make it work! :)
Esko
Fixed it now and got even the case with the big prime numbers.
boutta
Yeah, seems nice. Now the only question that remains in your specific solution is how do I really handle primes...
Esko
I would add them to the array since there is no way to split them any further. Unless you want to change the overall approach.
boutta
Updated accorindlgy.
boutta
Deserves a mention for the solution which exposed yet another issue I hadn't even thought about - primes.
Esko
+1  A: 

The original method you chose (a * b + c * d + e) would be very difficult to find optimal solutions for simply due to the large search space of possibilities. You could factorize the number but it's that "+ e" that complicates things since you need to factorize not just that number but quite a few immediately below it.

Two methods for compression spring immediately to mind, both of which give you a much-better-than-10% saving on space from the numeric representation.

A 64-bit number ranges from (unsigned):

                         0 to
18,446,744,073,709,551,616

or (signed):

-9,223,372,036,854,775,808 to
 9,223,372,036,854,775,807

In both cases, you need to reduce the 20-characters taken (without commas) to something a little smaller.

The first is to simply BCD-ify the number the base64 encode it (actually a slightly modified base64 since "/" would not be kosher in a URL - you should use one of the acceptable characters such as "_").

Converting it to BCD will store two digits (or a sign and a digit) into one byte, giving you an immediate 50% reduction in space (10 bytes). Encoding it base 64 (which turns every 3 bytes into 4 base64 characters) will turn the first 9 bytes into 12 characters and that tenth byte into 2 characters, for a total of 14 characters - that's a 30% saving.

The only better method is to just base64 encode the binary representation. This is better because BCD has a small amount of wastage (each digit only needs about 3.32 bits to store [log210], but BCD uses 4).

Working on the binary representation, we only need to base64 encode the 64-bit number (8 bytes). That needs 8 characters for the first 6 bytes and 3 characters for the final 2 bytes. That's 11 characters of base64 for a saving of 45%.

If you wanted maximum compression, there are 73 characters available for URL encoding:

ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
0123456789$-_.+!*'(),

so technically you could probably encode base-73 which, from rough calculations, would still take up 11 characters, but with more complex code which isn't worth it in my opinion.

Of course, that's the maximum compression due to the maximum values. At the other end of the scale (1-digit) this encoding actually results in more data (expansion rather than compression). You can see the improvements only start for numbers over 999, where 4 digits can be turned into 3 base64 characters:

Range (bytes)  Chars  Base64 chars  Compression ratio
-------------  -----  ------------  -----------------
     < 10 (1)      1       2             -100%
    < 100 (1)      2       2                0%
   < 1000 (2)      3       3                0%
   < 10^4 (2)      4       3               25%
   < 10^5 (3)      5       4               20%
   < 10^6 (3)      6       4               33%
   < 10^7 (3)      7       4               42%
   < 10^8 (4)      8       6               25%
   < 10^9 (4)      9       6               33%
  < 10^10 (5)     10       7               30%
  < 10^11 (5)     11       7               36%
  < 10^12 (5)     12       7               41%
  < 10^13 (6)     13       8               38%
  < 10^14 (6)     14       8               42%
  < 10^15 (7)     15      10               33%
  < 10^16 (7)     16      10               37%
  < 10^17 (8)     17      11               35%
  < 10^18 (8)     18      11               38%
  < 10^19 (8)     19      11               42%
  <  2^64 (8)     20      11               45%
paxdiablo
+15  A: 
Unknown
That's 85 characters including the reserved characters which have a special meaning. One of the most special is '#' which isn't even necessarily sent to the server.
Mikko Rantanen
@ Mikko, the person who asked the question neglected to say that he was using a standard web browser client, so I am stating the ability to go to base85 as a matter of completeness.
Unknown
Argh, how did this become a community wiki? I didn't click the button.
Unknown
High enough amount of revisions triggers the community wiki thing, I think the magic number for that is 5 or so. I wonder if this complicates the bounty... :/
Esko
As far as I know, community wiki answers can still get bounty points. Some bounty questions were CW from the beginning and I haven't seen any reports of problems.
Michael Myers
I didn't know multiple edits would turn this into a community wiki, it wasn't in the FAQ. I asked [email protected], but they said they didn't have the power to change it :(
Unknown
+1  A: 

Are you married to using Java? Python has an entire package dedicated just for this exact purpose. It'll even sanitize the encoding for you to be URL-safe.

Native Python solution

The standard module I'm recommending is base64, which converts arbitrary stings of chars into sanitized base64 format. You can use it in conjunction with the pickle module, which handles conversion from lists of longs (actually arbitrary size) to a compressed string representation.

The following code should work on any vanilla installation of Python:

import base64
import pickle

# get some long list of numbers
a = (854183415,1270335149,228790978,1610119503,1785730631,2084495271,
    1180819741,1200564070,1594464081,1312769708,491733762,243961400,
    655643948,1950847733,492757139,1373886707,336679529,591953597,
    2007045617,1653638786)

# this gets you the url-safe string
str64 = base64.urlsafe_b64encode(pickle.dumps(a,-1))
print str64
>>> gAIoSvfN6TJKrca3S0rCEqMNSk95-F9KRxZwakqn3z58Sh3hYUZKZiePR0pRlwlfSqxGP05KAkNPHUo4jooOSixVFCdK9ZJHdEqT4F4dSvPY41FKaVIRFEq9fkgjSvEVoXdKgoaQYnRxAC4=

# this unwinds it
a64 = pickle.loads(base64.urlsafe_b64decode(str64))
print a64
>>> (854183415, 1270335149, 228790978, 1610119503, 1785730631, 2084495271, 1180819741, 1200564070, 1594464081, 1312769708, 491733762, 243961400, 655643948, 1950847733, 492757139, 1373886707, 336679529, 591953597, 2007045617, 1653638786)

Hope that helps. Using Python is probably the closest you'll get from a 1-line solution.

Tim Lin
I'm married to references. Does this entire package have a name?
Glenn
+5  A: 

It sounds as though you want to compress random data -- this is impossible for information theoretic reasons. (See http://www.faqs.org/faqs/compression-faq/part1/preamble.html question 9.) Use Base64 on the concatenated binary representations of your numbers and be done with it.

Dave
Also see: http://en.wikipedia.org/wiki/Kolmogorov_complexity as I stated in my answer.
grieve
+4  A: 

The problem you're attempting to solve (you're dealing with a subset of the problem, given you're restriction of y) is called Integer Factorization and it cannot be done efficiently given any known algorithm:

In number theory, integer factorization is the breaking down of a composite number into smaller non-trivial divisors, which when multiplied together equal the original integer.

This problem is what makes a number of cryptographic functions possible (namely RSA which uses 128 bit keys - long is half of that.) The wiki page contains some good resources that should move you in the right direction with your problem.

So, your brain teaser is indeed a brain teaser... and if you solve it efficiently we can elevate your math skills to above average!

Gavin Miller
+2  A: 

OP Wrote:

My goal originally was to come up with a specific method to pack 1..n large integers (aka longs) together so that their String representation is notably shorter than writing the actual number. Think multiples of ten, 10^6 and 1 000 000 are the same, however the representation's length in characters isn't.

I have been down that path before, and as fun as it was to learn all the math, to save you time I will just point you to: http://en.wikipedia.org/wiki/Kolmogorov_complexity

In a nutshell some strings can be easily compressed by changing your notation:

10^9 (4 characters) = 1000000000 (10 characters)

Others cannot:

7829203478 = some random number...

This is a great great simplification of the article I linked to above, so I recommend that you read it instead of taking my explanation at face value.

Edit: If you are trying to make RESTful urls for some set of unique data, why wouldn't you use a hash, such as MD5? Then include the hash as part of the URL, then look up the data based on the hash. Or am I missing something obvious?

grieve
The amount of combinations - even ordered ones - would get huge over time, that's why I dismissed directs lookups originally.
Esko
Short answer but contains good base theory regarding the subject of this question. Thank you grieve, now I can actually act smart and refer to Kolmogorov whenever I hit this or similar issue/problem again :)
Esko
Glad I could help. :)
grieve
+1  A: 

Wrt the original algorithm request: Is there a limit on the size of the last number (beyond that it must be stored in a 32b int)? (The original request is all I'm able to tackle lol.)

The one that produces the shortest list is:

bool negative=(n<1)?true:false;
int j=n%y;
if(n==0 || n==1)
{
list.append(n);
return;
}
while((long64)(n-j*y)>MAX_INT && y>1) //R has to be stored in int32
{
y--;
j=n%y;
}
if(y<=1)
fail //Number has no suitable candidate factors. This shouldn't happen

int i=0;
for(;i<j;i++)
{
list.append(y);
}
list.append(n-y*j);
if(negative)
list[0]*=-1;
return;

A little simplistic compared to most answers given so far but it achieves the desired functionality of the original post... It's a little dirty but hopefully useful :)

ParoXoN