views:

2653

answers:

20

I need to find out the prime factors of over 300 billion. I have a function that is adding to the list of them...very slowly! It has been running for about an hour now and i think its got a fair distance to go still. Am i doing it completly wrong or is this expected?!

(Please no solutions for my code to give it away, but some subtle hints will be welcomed =])

Edit: Ok just for all of ur curiosity, im trying to find the largest prime factor of the number 600851475143. I dont mind the answer but no code!

Edit: Egotistical programmers please move on to the next.

RESULT!! Haha ok guys i finally did it after alot of prime number reading...

 {
  List<Int64> ListOfPrimeFactors = new List<Int64>();
  Int64 Number = 600851475143;
  Int64 DividingNumber = 2;

  while (DividingNumber < Number / DividingNumber)
  {
   if (Number % DividingNumber == 0)
   {
    ListOfPrimeFactors.Add(DividingNumber);
    Number = Number/DividingNumber;
   }
   else
    DividingNumber++;
  }
  ListOfPrimeFactors.Add(Number);
  listBox1.DataSource = ListOfPrimeFactors;
 }
+21  A: 

Are you remembering to divide the number that you're factorizing by each factor as you find them?

Say, for example, you find that 2 is a factor. You can add that to your list of factors, but then you divide the number that you're trying to factorise by that value.

Now you're only searching for the factors of 150 billion. Each time around you should start from the factor you just found. So if 2 was a factor, test 2 again. If the next factor you find is 3, there's no point testing from 2 again.

And so on...

Alnitak
No... erm explain please =]
xoxo
Ohh ok i get it! Thanks alot!
xoxo
Look into researching recursion.
EBGreen
Recursion for prime factoring?
Eclipse
If BIGNUM = firstFactorFound*(BIGNUM/firstFactorFound), then you can factor out BIGNUM/firstFactorFound to get the rest of the factors. Also, the factors of this new number are >= firstFactorFound. You can apply this recursively.
Brian
Don't forget that you only need to go as high as squareroot(new number) when factoring the newly found number.
Brian
The process is recursive, but I wouldn't implement it as a recursive function or you'll get... a stackoverflow ;) Use iteration.
Dave Swersky
I wouldn't really worry about the stack to be honest, even without tail recursion, the default stack size in .NET is 1MB so you can have an awful lot of stack frames, which means that if you can have (x = awful lot) stack frames, you can factorize any number up to x! (and obviously even more)
DrJokepu
A: 

It shouldn't take that long, even with a relatively naive brute force. For that specific number, I can factor it in my head in about one second.

You say you don't want solutions(?), but here's your "subtle" hint. The only prime factors of the number are the lowest three primes.

recursive
No there are more then that! The specific number is 300425737571 so go for it...
xoxo
I almost don't want to know the answer to this... but why do you need the factor of JUST this specific number?
EdgarVerona
It's for projecteuler.
recursive
+15  A: 

Factoring big numbers is a hard problem. So hard, in fact, that we rely on it to keep RSA secure. But take a look at the wikipedia page for some pointers to algorithms that can help. But for a number that small, it really shouldn't be taking that long, unless you are re-doing work over and over again that you don't have to somewhere.

For the brute-force solution, remember that you can do some mini-optimizations:

  • Check 2 specially, then only check odd numbers.
  • You only ever need to check up to the square-root of the number (if you find no factors by then, then the number is prime).
  • Once you find a factor, don't use the original number to find the next factor, divide it by the found factor, and search the new smaller number.
  • When you find a factor, divide it through as many times as you can. After that, you never need to check that number, or any smaller numbers again.
  • If you do all the above, each new factor you find will be prime, since any smaller factors have already been removed.
Eclipse
The number in the question is neither large nor hard to factor.
Sam Meldrum
When we say factoring big numbers is hard, we mean numbers above approximately 2^512. RSA and similar probably use primes up around 2^2048. The number in the OP is only about 2^39... much, much smaller
abelenky
+1  A: 

I wouldn't expect it to take very long at all - that's not a particularly large number.

Could you give us an example number which is causing your code difficulties?

Jon Skeet
600851475143. i cut it in half (thats what i thought i remembered from school) to make it 300425737571. I get rid of the numbers divisible by 2, 3, 5 and 7. Then i get rid of any numbers that cannot be divided into the number and find any primes left...
xoxo
"600851475143. i cut it in half (thats what i thought i remembered from school) to make it 300425737571." Sorry???? If you don't know the basic semantics of the problem, you shouldn't even try.
Daniel Daranas
No Daniel, thanks but i think ill give this one a go!
xoxo
You will "give a go" to cutting it in half?
Daniel Daranas
Im not saying all the solutions i try will be right! Hey some of us like being amatures for a little while!
xoxo
Some of us even love it! =) Anyway, you can only continue from the half if 2 is indeed a factor. But i guess you've figured that out by now.
PEZ
600851475143 and 300425737571 each have four distinct prime factors. They share **none** in common. @Daniel Daranas is suggesting that you understand why "cutting [an odd number] in half" doesn't work. For example, you may try the same with 27, which "cuts" to 13 but shares no factors with 13.
A. Rex
@A. Rex thank you for your clarification. Yes, what I was saying is basically that you can "try" any approaches, but it helps if they are mathematically correct. Dividing the number by 7.65 and then taking its logarithm certainly won't help.
Daniel Daranas
@Daniel Daranas: =)
A. Rex
You said "you shouldn't even try". Maybe that works for you, but for me trying is one of the ways of gaining understanding. You're newer advice to try figure out more about the problem before solving it was much more constructive.
PEZ
Ah thanks PEZ, i have had a look at alot of different algorithms for solving this and im sure i am close... (we can only hope)
xoxo
A: 

Semi-prime numbers of that size are used for encryption, so I am curious as to what you exactly want to use them for.

That aside, there currently are not good ways to find the prime factorization of large numbers in a relatively small amount of time.

Matthew Brubaker
300 billion is less than 40 bits. Encryption primes are closer to 2048 bits...
Ant P.
True on the less than 40 bits part, my mistake. However, encryption primes iirc are usually 512 bits each (1024 bits total for the resulting key). You use two, multiply them, and get a semi-prime (a number with only two prime factors).
Matthew Brubaker
+17  A: 

Finding prime factors is difficult using brute force, which sounds like the technique you are using.

Here are a few tips to speed it up somewhat:

  • Start low, not high
  • Don't bother testing each potential factor to see whether it is prime--just test LIKELY prime numbers (odd numbers that end in 1,3,7 or 9)
  • Don't bother testing even numbers (all divisible by 2), or odds that end in 5 (all divisible by 5). Of course, don't actually skip 2 and 5!!
  • When you find a prime factor, make sure to divide it out--don't continue to use your massive original number. See my example below.
  • If you find a factor, make sure to test it AGAIN to see if it is in there multiple times. Your number could be 2x2x3x7x7x7x31 or something like that.
  • Stop when you reach >= sqrt(remaining large number)

Edit: A simple example: You are finding the factors of 275.

  1. Test 275 for divisibility by 2. Does 275/2 = int(275/2)? No. Failed.
  2. Test 275 for divisibility by 3. Failed.
  3. Skip 4!
  4. Test 275 for divisibility by 5. YES! 275/5 = 55. So your NEW test number is now 55.
  5. Test 55 for divisibility by 5. YES! 55/5 = 11. So your NEW test number is now 11.
  6. BUT 5 > sqrt (11), so 11 is prime, and you can stop!

So 275 = 5 * 5 * 11

Make more sense?

BradC
What do you mean 'divide it out'? And im not sure how the square root affects it?
xoxo
If you haven't found a factor by the sqrt of a number, you're not going to find any no matter how far you go. This makes a HUGE difference when checking larger numbers.
Eclipse
Why do you say "at least all odd numbers ending in 1,3,7 or 9"?
PEZ
PEZ: Because all numbers that end in 5 are not prime, because they are divisible by 5 (except for 5 itself). So it is a waste of cycles to try 15 or 35 or 105 as a possible factor.
BradC
You do have to balance that waste with the effort needed to skip every 5th number. You'd probably get more benefit by skipping every 3rd odd number after 3. Note that from the computers point of view, there's nothing special about numbers that end in 5 in decimal.
Eclipse
Right--test-dividing by 5 each time is too wasteful. Just keep a counter that lets you know when to increment by 4 instead of incrementing by 2. Either way, though, it should be able to find factors pretty darn fast.
BradC
Thanks for the info. I have never reflected over that fact.
PEZ
This answer was really good. I was able to quite quickly whip together a factors() method in python that takes 0.002 secs to produce the factors for 600851475143. (I didn't bother to skip those numbers ending with 5)
PEZ
Thanks, PEZ. Yes, factoring numbers of this magnitude shouldn't take to long on a modern machine. Of course, add a few additional digits to the number...
BradC
Give me a number that I can test. =) (that has a chance of taking less than 10 minutes)
PEZ
Well, it seems my program can't handle really big numbers. Like if I dup that number (using 600851475143600851475143) I get a crap result.
PEZ
Yeah, I was doing it in VBA (I know, I know...) and it capped out at 14 digits.
BradC
I can get my program to work if I test for x*x<num instead of calculating the square root. Of course it would take a life time to really test that huge number so I'm not positive it works, but it's doing some work that it didn't do before. =)
PEZ
you're adding a lot of cycles doing x*x<num every loop.Just do sqrt(num) once at the beginning and store it, then just check against it every loop (and recalc when you find a factor)
BradC
well, i didn't do x*x every loop! =) I did like you suggest with sqrt(). but it takes a bit longer with x*x. for the number in the question it takes like 20% longer. then that difference gets really huge with 13 digits.
PEZ
however, I have another bug in my program now. it takes it 8.5 secs with the number 60085147514311, correct result. then it takes it 4.5 secs with 60085147514313, but then it doesn't give me the correct result. I get [3] when I should get [3, 20028382504771].
PEZ
I'm back to using sqrt() by the way (just not stupidly truncating to int like I did at first).
PEZ
Well, make sure that if your loop ends (when p>sqt(n)), then you assume that your final number N is prime, and is reported as another factor.
BradC
how about when the initial num is a prime then? is that a special case or should it be reported as a factor?
PEZ
I decided that it makes most sense to report no factors for a prime number.
PEZ
Now my program works without bugs I'd say. It takes 25 secs (on my MacBook) for it to figure out that there are no factors for 600851475143131 and it takes 19 secs to report [17, 353442044201843L] for 6008514751431331. Yay! And thanks again. This was fun and a very good lesson for me.
PEZ
Sorry, I can't stop playing with this. =) It doesn't take a life time at all for that number, 600851475143600851475143 (24 digits). It takes less than 0.01 secs to report [71, 73, 137, 839, 1471, 6857, 99990001].
PEZ
but i didnt think 11 was a prime number? isnt a prime factor all of the prime numbers that divide into that number? Or should i go back to school?!
xoxo
a prime number is only divisible by itself and 1.
atsjoo
You are wrong and right, Grace. 11 is indeed a prime number.
PEZ
@Grace, to your latter question, yes, I think so. There are two issues here: "programming" and "solving a real world problem". In your case the real world problem is "prime number" and the very first thing you should do is study the theory about them. Then think about algorithms.
Daniel Daranas
Well actually this is for no particular project i am working on, just a bit of logic practice. And of all the training for the company i work for, (funnily enough) prime numbers never cropped up... You cant question my programming abilities as u have no idea what i get paid to write!
xoxo
I was not questioning your programming abilities. I was just saying that the "programming" should come after you know how to solve the "real world problem" - in this case, prime factor decomposition. And a step for solving the problem is understanding the concepts involved.
Daniel Daranas
+2  A: 
$ time factor 300000000000 > /dev/null

real        0m0.027s
user        0m0.000s
sys         0m0.001s

You're doing something wrong if it's taking an hour. You might even have an infinite loop somewhere - make sure you're not using 32-bit ints.

Ant P.
+3  A: 

One last thing nobody has mentioned, perhaps because it seems obvious. Every time you find a factor and divide it out, keep trying the factor until it fails.

64 only has one prime factor, 2. You will find that out pretty trivially if you keep dividing out the 2 until you can't anymore.

A: 

You could use the sieve of Eratosthenes to find the primes and see if your number is divisible by those you find.

Sam Meldrum
Yes thats what im working from lol
xoxo
a sieve is extremely slow compared to just testing every odd number.
zaratustra
A: 

You only need to check it's remainder mod(n) where n is a prime <= sqrt(N) where N is the number you are trying to factor. It really shouldn't take over an hour, even on a really slow computer or a TI-85.

Anthony Potts
*Sighs* no i think its just my lack of prime number knowledge!
xoxo
Before I became a programmer I was a PhD student in Number Theory, and it looks like there are a number of other people who are in to that too. As mentioned in other posts prime numbers are what keeps everything secure. If you get the time and it interests you, check it out.
Anthony Potts
A: 

Here's a nice blog post about The Quadratic Sieve. It may help.

Yuval F
A: 

Your algorithm must be FUBAR. This only takes about 0.1s on my 1.6 GHz netbook in Python. Python isn't known for its blazing speed. It does, however, have arbitrary precision integers...

import math
import operator

def factor(n):
    """Given the number n, to factor yield a it's prime factors.
    factor(1) yields one result: 1. Negative n is not supported."""
    M = math.sqrt(n)  # no factors larger than M
    p = 2             # candidate factor to test
    while p <= M:     # keep looking until pointless
        d, m = divmod(n, p)
        if m == 0:
            yield p   # p is a prime factor
            n = d     # divide n accordingly
            M = math.sqrt(n)  # and adjust M
        else:
            p += 1    # p didn't pan out, try the next candidate
    yield n  # whatever's left in n is a prime factor

def test_factor(n):
    f = factor(n)
    n2 = reduce(operator.mul, f)
    assert n2 == n

def example():
    n = 600851475143
    f = list(factor(n))
    assert reduce(operator.mul, f) == n
    print n, "=", "*".join(str(p) for p in f)

example()

# output:
# 600851475143 = 71*839*1471*6857

(This code seems to work in defiance of the fact that I don't know enough about number theory to fill a thimble.)

bendin
You lucked out with your choice: 71 shows up fairly quickly and that greatly reduces the time spent. Try factoring this one: 12073531234081300153
Jason S
Yes, naturally, even this fairly stupid algorithm would be better if it didn't waste its time trying values of p that are not prime. However, I should point out that 600851475143 is specifically mentioned in the question, so it wasn't *luck*, as such. ;-)
bendin
+1  A: 

Here's some haskell goodness for you guys :)

primeFactors n = factor n primes where factor n (p:ps) | p*p > n = [n] | n mod p /= 0 = factor n ps | otherwise = p : factor (n div p) (p:ps) primes = 2 : filter ((==1) . length . primeFactors) [3,5..]

Took it about .5 seconds to find them, so I'd call that a success.

Alex Fort
He specifically asked for hints, not code.
EvilTeach
Evilteach is too picky: the please no code was edited later
blabla999
He? I am a she! and no i did ask for no code in the original question but its not the language i am using so its no bother!
xoxo
If you don't sieve, it's less code and faster: primeFactors n = factor n (2:[3,5..])
A. Rex
@Alex Fort: the backticks around mod and div didn't show up. Anybody else who's trying to use this code, put \`backticks\` around mod and div.@A. Rex: faster always? or just in the given case? Would the sieve help more as the input number gets bigger?
LarsH
+1  A: 

Here's one site where you can get answers: Factoris - Online factorization service. It can do really big numbers, but it also can factorize algebraic expressions.

zendar
+1  A: 

The fastest algorithms are sieve algorithms, and are based on arcane areas of discrete mathematics (over my head at least), complicated to implement and test.

The simplest algorithm for factoring is probably (as others have said) the Sieve of Eratosthenes. Things to remember about using this to factor a number N:

  • general idea: you're checking an increasing sequence of possible integer factors x to see if they evenly divide your candidate number N (in C/Java/Javascript check whether N % x == 0) in which case N is not prime.
  • you just need to go up to sqrt(N), but don't actually calculate sqrt(N): loop as long as your test factor x passes the test x*x<N
  • if you have the memory to save a bunch of previous primes, use only those as the test factors (and don't save them if prime P fails the test P*P > N_max since you'll never use them again
  • Even if you don't save the previous primes, for possible factors x just check 2 and all the odd numbers. Yes, it will take longer, but not that much longer for reasonable sized numbers. The prime-counting function and its approximations can tell you what fraction of numbers are prime; this fraction decreases very slowly. Even for 264 = approx 1.8x1019, roughly one out of every 43 numbers is prime (= one out of every 21.5 odd numbers is prime). For factors of numbers less than 264, those factors x are less than 232 where about one out of every 20 numbers is prime = one out of every 10 odd numbers is prime. So you'll have to test 10 times as many numbers, but the loop should be a bit faster and you don't have to mess around with storing all those primes.

There are also some older and simpler sieve algorithms that a little bit more complex but still fairly understandable. See Dixon's, Shanks' and Fermat's factoring algorithms. I read an article about one of these once, can't remember which one, but they're all fairly straightforward and use algebraic properties of the differences of squares.

If you're just testing whether a number N is prime, and you don't actually care about the factors themselves, use a probabilistic primality test. Miller-Rabin is the most standard one, I think.

Jason S
When I do x*x<N instead of x<sqrt(N) it takes longer for the number in question (in Python). Maybe sqrt(N) brings the comparison down to integer types that compare faster?
PEZ
if N is fixed maybe the interpreter is "smart" enough to evaluate it only once...
Jason S
In another thread in factorization I ran a test on the Sieve--it absolutely *STUNK*, even when put up against a crude algorithm that didn't consider all the tricks listed here. (Not that I'm sure all the tricks are good--the cost may be higher than what they save!)
Loren Pechtel
Jason, the thing is that I don't compute sqrt(N) very often, only when I find a factor. It's the comparison that takes place often. Maybe it's much (my tests indicate that it's much) more efficient to compare p<sqrt_N than p_times_p<N.
PEZ
A: 

I spent some time on this since it just sucked me in. I won't paste the code here just yet. Instead see this factors.py gist if you're curious.

Mind you, I didn't know anything about factoring (still don't) before reading this question. It's just a Python implementation of BradC's answer above.

On my MacBook it takes 0.002 secs to factor the number mentioned in the question (600851475143).

There must obviously be much, much faster ways of doing this. My program takes 19 secs to compute the factors of 6008514751431331. But the Factoris service just spits out the answer in no-time.

PEZ
@PEZ: What is the problem 3 forum?
Dimitre Novatchev
It's a forum on the Project Euler site (http://projecteuler.net/) that you get access to once you've submitted the correct answer to the problem we're discussing here. Check it out. But I warn you; It's highly addictive!
PEZ
@PEZ: Many thanks for mentioning ProjectEuler. I already solved 28 problems and the only programming language I used has been XSLT.
Dimitre Novatchev
That's amazing. Can't you put your solutions up on gist.github.com or some such. I'd like to see some more XSLT solutions to problems I've solved. And the last problem was launched last sat so if you solve that one you can probably post on the project forum. They're launching a new problem tomorrow.
PEZ
A: 

Just to expand/improve slightly on the "only test odd numbers that don't end in 5" suggestions...

All primes greater than 3 are either one more or one less than a multiple of 6 (6x + 1 or 6x - 1 for integer values of x).

Dave Sherohman
+1  A: 

The key to understanding why the square root is important, consider that each factor of n below the square root of n has a corresponding factor above it. To see this, consider that if x is factor of n, then x/n = m which means that x/m = n, hence m is also a factor.

JesperE
+7  A: 
Dimitre Novatchev
Way neat! Too bad the problem 3 forum is closed so you can't post this there. I don't think I've seen an XSLT solution there.
PEZ
@PEZ: What is the problem 3 forum?
Dimitre Novatchev
PEZ
Though the problem was presented long ago so the forum is "full" and they have closed it for new entries.
PEZ
@PEZ: Thank you for mentioninh ProjectEuler. I already solved 28 problems and the only programming language I used has been XSLT.
Dimitre Novatchev
A: 

The specific number is 300425737571? It trivially factors into 131 * 151 * 673 * 22567. I don't see what all the fuss is...