tags:

views:

359

answers:

8

I do not know a whole lot about math, so I don't know how to begin to google what I am looking for, so I rely on the intelligence of experts to help me understand what I am after...

I am trying to find the smallest string of equations for a particular large number. For example given the number

"39402006196394479212279040100143613805079739270465446667948293404245721771497210611414266254884915640806627990306816"

The smallest equation is 64^64 (that I know of) . It contains only 5 bytes.

Basically the program would reverse the math, instead of taking an expression and finding an answer, it takes an answer and finds the most simplistic expression. Simplistic is this case means smallest string, not really simple math.

Has this already been created? If so where can I find it? I am looking to take extremely HUGE numbers (10^10000000) and break them down to hopefully expressions that will be like 100 characters in length. Is this even possible? are modern CPUs/GPUs not capable of doing such big calculations?


Edit:

Ok. So finding the smallest equation takes WAY too much time, judging on answers. Is there anyway to bruteforce this and get the smallest found thus far?

For example given a number super super large. Sometimes taking the sqaureroot of number will result in an expression smaller than the number itself.

As far as what expressions it would start off it, well it would naturally try expressions that would the expression the smallest. I am sure there is tons of math things I dont know, but one of the ways to make a number a lot smaller is powers.

+2  A: 

It looks like you're basically wanting to do factoring on an arbitrarily large number. That is such a difficult problem that it actually serves as the cornerstone of modern-day cryptography.

Tristan
No, this is nowhere as difficult as factoring. You could find nearby numbers and factorise them instead.
ShreevatsaR
@ShreevatsaR: So now you've replaced the difficult task of factoring with the difficult task of first determining that factoring isn't the route to take (not sure how you decide that), then factoring all surrounding numbers? How is that easier?
Mark Peters
@Mark Peters: It's *much* easier to find nearby composite numbers — every n^th number is divisible by n. (Also, factoring is *never* the route to take unless there are large powers of small numbers in the number: this can be checked.)
ShreevatsaR
@ShreevatsaR: So? That doesn't guarantee a smaller equation as my challenge clearly shows. Answer my challenge and then I'll consider believing you. Until then you're not providing any evidence or citations for your views so I'll have to just dismiss them.
Mark Peters
@Mark Peters: As I said, factoring can be optimal when there are large powers in the number — as in your example (or indeed as in the 64^64 in the question). But the assertion is being made that this problem is as hard as factoring, and the fact that factoring is optimal in some cases is insufficient to justify the assertion. Surely the burden of proof (of hardness) is on one making the assertion.
ShreevatsaR
+6  A: 

There's a good program to do that here: http://mrob.com/pub/ries/index.html

Charles
this is very cool and might just work... I'm guessing there must be some numbers for which it will spin forever without finding an exact match, even if you limit it to integers.
rmeador
A: 

The existence of an infinite number of primes means that there will always be numbers that cannot be simplified by factoring. What you're asking for is not possible, sorry.

Steven Sudit
"2^6972593-1" is a prime with over 2 million of digits, but I expressed it in 11 characters
Tom Sirgedas
Yes, some primes can be expressed in terms of small offsets from "round" numbers, but the OP gave examples involving factoring, not mixing and matching.
Steven Sudit
However, the OP did not exclude that as an option.
woodchips
+3  A: 

I asked the question "what's the point of doing this", as I don't know if you're looking at this question from a mathemetics point of view, or a large number factoring point of view.

As other answers have considered the factoring point of view, I'll look at the maths angle. In particular, the problem you are describing is a compressibility problem. This is where you have a number, and want to describe it in the smallest algorithm. Highly random numbers have very poor compressibility, as to describe them you either have to write out all of the digits, or describe a deterministic algorithm which is only slightly smaller than the number itself.

There is currently no general mathemetical theorem which can determine if a representation of a number is the smallest possible for that number (although a lower bound can be discovered by understanding shannon's information theory). (I said general theorem, as special cases do exist).

As you said you don't know a whole lot of math, this is perhaps not a useful answer for you...

David_001
+10  A: 

Just to throw another keyword in your Google hopper, see Kolmogorov Complexity. The Kolmogorov complexity of a string is the size of the smallest Turing machine that outputs the string, given an empty input. This is one way to formalize what you seem to be after. However, calculating the Kolmogorov complexity of a given string is known to be an undecidable problem :)

Hope this helps,

TJ

tjgreen
+1: This is essentially the problem trying to be solved I think (with a better link than my answer which considered information theory as a whole)
David_001
It's simpler than that. We can make a list of which strings evaluate to values, check their values, and then go down the list until we find a string with the desired value. Since a written-out number is an acceptable string, we'll always find a value in finite time for a finite number. It's perfectly decidable, just totally impractical.
David Thornley
@David: whether this works on not depends on what kind of "evaluation" you have in mind. If the string is to be interpreted as an arbitrary Turing machine specification, then what you suggest won't work, because evaluation will diverge for some strings, but you can't tell which (the Halting problem). On the other hand, if the string is to be interpreted as e.g. some kind of simple arithmetical expression, for which the Halting problem is decidable, then your idea would work.
tjgreen
Nice thinking, but I don't agree. The Kolmogorov Complexity refers to *turing complete* languages. The question here refers to a very limited arithmetic expression language which isn't turing complete. By the way, the proof of the Kolmogorov incomputability does not apply to this language (it needs conditions and loops).
Eyal Schneider
This is the worst sort of phenomenon on Stack Overflow, where someone posts an unhelpful answer with big impressive words and everyone upvotes without checking it. The undecidability of Kolmogorov complexity is simply not relevant here, with such a limited language… the problem here is in fact trivially decidable.
ShreevatsaR
As @David, @Eyal and @Shreevastsa mention, we can certainly enumerate all strings less than a certain length (such as the length of the number), so this is completely 100% decidable.
BlueRaja - Danny Pflughoeft
+1  A: 

This really appears to be a mathematics problem, and not programming or computer science problem. You should ask this on http://math.stackexchange.com/

Adam Porad
+3  A: 

You're doing a form of lossless compression, and lossless compression doesn't work on random data. Suppose, to the contrary, that you had a way of compressing N-bit numbers into N-1-bit numbers. In that case, you'd have 2^N values to compress into 2^N-1 designations, which is an average of 2 values per designation, so your average designation couldn't be uncompressed. Lossless compression works well on relatively structured data, where data we're likely to get is compressed small, and data we aren't going to get actually grows some.

It's a little more complicated than that, since you're compressing partly by allowing more information per character. (There are a greater number of N-character sequences involving digits and operators than digits alone.) Still, you're not going to get lossless compression that, on the average, is better than just writing the whole numbers in binary.

David Thornley
Note that the original alphabet has 10 symbols (0-10), and the "target" alphabet allows symbols for multiplication, exponentiation, etc. Still, even if you add (say) 5 symbols, to compress all 10^n strings of length n, you'd need at least n log(10/15) symbols in the worst case, so you cannot compress arbitrary numbers beyond a constant factor.
ShreevatsaR
A: 

While your question remains unclear, perhaps integer relation finding is what you are after.

EDIT:

There is some speculation that finding a "short" form is somehow related to the factoring problem. I don't believe that is true unless your definition requires a product as the answer. Consider the following pseudo-algorithm which is just sketch and for which no optimization is attempted.

If "shortest" is a well-defined concept, then in general you get "short" expressions by using small integers to large powers. If N is my integer, then I can find an integer nearby that is 0 mod 4. How close? Within +/- 2. I can find an integer within +/- 4 that is 0 mod 8. And so on. Now that's just the powers of 2. I can perform the same exercise with 3, 5, 7, etc. We can, for example, easily find the nearest integer that is simultaneously the product of powers of 2, 3, 5, 7, 11, 13, and 17, call it N_1. Now compute N-N_1, call it d_1. Maybe d_1 is "short". If so, then N_1 (expressed as power of the prime) + d_1 is the answer. If not, recurse to find a "short" expression for d_1.

We can also pick integers that are maybe farther away than our first choice; even though the difference d_1 is larger, it might have a shorter form.

GregS