views:

503

answers:

7

Find largest number with all contiguous triples being unique primes

If the digits of the answer are

pqrstu...xyz

then pqr, qrs, rst ... etc are all unique primes.

As expected, extra points for speed, conciseness, and other good things.

I can run Python/C/C++/Java on my comp and will try to update the answers with the running time of a provided solution on my computer.

This is not a question I need answered - this is a question you might enjoy answering as it is not an obviously easy problem.

ps : For the skeptics, this is not my homework. I enjoyed solving it, and thought people here like solving problems.

This was a pre-interview question (one needed to solve this to get a real interview) similar to the google question - find first 10 digit prime in expansion of e - the answer is the email address.


Solution times on my machine... (~ avg of 5 runs)

  • Skizz : ~0.01s (A86 Assembly)
  • John Kugelman : 1.886s (Python - new longest function)
  • Mark : 0.351s (Java)
  • cfern : 0.11s (F#)
  • VinyleEm : 0.03s (C++)
+13  A: 
John Kugelman
@John - That is an amazing coincidence. I had asked this question to my friend (a Stanford applied Math PhD), and he had given the exact same answer. While asking this question, I was wondering if anyone here will think of that approach. Its weird that it came up in the very first answer. Do you study computational topology by a further coincidence?
Rajan
Just to clarify the matter to new readers - the graph method is not the only approach. I solved it without graphs.
Rajan
The longest path problem in general is NP-complete, but it might be solvable for this particular graph. I guess it's time to program this up and see what happens...
John Kugelman
NP-completeness isn't necessarily a deal-breaker -- after all, this is a very sparse graph. The maximum outdegree is 4 (from node 019 to 191, 193, 197, and 199). All the rest have 3 or fewer outgoing edges.
Jim Lewis
I would love to see how the graph algo compares with the dynamic programming one.
Rajan
@Naga: The suspense is killing me! Thanks for posting the question.
Jim Lewis
@Jim : You are welcome!! :-)
Rajan
OK, that's about as concise as I can get the python code without changing the algorithm completely.
John Kugelman
Here is some code squeezing trick I just cooked up to replace the primes list :-)primes = [2,3,5] \nprimes = primes + [x for x in range(7, 32, 2) if not 0 in [x%y for y in primes]] \nprimes = primes + [x for x in range(33, 1000, 2) if not 0 in [x%y for y in primes]]
Rajan
Rajan
@Nagarajan: Q is a the Goldman Sachs custom version of J, isn't it? (btw, have you ever seen APL?)
Mark E
Maybe - never worked on it or J. I have only played with Python, and C, and a little bit of Java. Whats APL best suited for?
Rajan
The `nexts` dict can only take two last digits. For example, if you need next possible primes for 9419, you need only two last digits (19). This shortens initialization to: `for p in primes: nexts[p/10].add(p)`, and in `for next in nexts[number % 1000] - used`, take `% 100`. Using this approach, in the second solution you don't need `used`, but can delete edges on the fly: `for next in nexts[number % 100]: nexts[next/10].remove(next); best = max(best,longest(number * 10 + next % 10)); nexts[next/10].add(next)`.
sdcvvc
@Rajan: I would imagine your DP approach is basically the same as this solution, except you don't build the graph explicitly. The graph formed by the relationship between one prime and the next is still implicit in any solution to this problem.
MAK
@MAK - Thats true... I saw the interconnection as well. In fact the simple dynamic programming implementation goes through the exact same numbers in its search, and may also search them in the exact same order if the primes and their next matches are evaluated in the sorted order.The fact that both approaches are "exactly" same gives me an odd feeling. I wonder how Andrew wiles and some other mathematicians felt when they found the link between modular forms and elliptic equations whic is much more deeper(the crux of Fermat's Last Theorem's proof - should it be called the "The Last Proof").
Rajan
@Rajan: I don't find it odd that both solutions are the same. They are the same, only expressed in different ways. Not seeing the graph in most problems leads to similar (but often roundabout) ways of approaching what are usually pretty common problems.
MAK
@MAK - I should have said ""Different kind of feeling difficult to convey" instead of "Odd".With hindsight, we know both are basically the same. The fact that they looked different at first glance from afar ("longest path in a graph" vs "an ad hoc approach via looking at all next possibilities"), and then on closer scrutiny, we found that both are 2 sides of the same coin ("approaches are kind of similar"), and then on still closer, we find they are the exact same face of the coin ("exact same numbers analyzed in exact same order").Let me say I found the journey intriguing. :-)
Rajan
@Rajan: I see what you're saying. Its moments like this that makes CS fun :-).
MAK
+6  A: 

I believe the best answer is 9419919379773971911373313179.

Edit: slightly more concise -- appropriate sorting returns the largest number first. Note the reversal shouldn't count against the computation time, if I were less lazy I'd just store the list in that format to begin with...

edit: slightly faster (now <800ms on my thinkpad edge)

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class primefinder {

    public static void getNextPrime(Integer p, Set<Integer> list,
            List<Integer> best) {
        for (Integer pr : links.get(p)) {
            if (list.contains(pr)) {
                continue;
            }
            list.add(pr);
            getNextPrime(pr, list, best);
            list.remove(pr);
        }
        if (list.size() > best.size()) {
            best.clear();
            best.addAll(list);
        }
    }

    public static void main(String[] args) {
        Collections.reverse(primes);

        long start = System.currentTimeMillis();
        links = new LinkedHashMap<Integer,List<Integer>>();
        for(Integer p : primes){
            int last2 = p % 100;
            List<Integer> valid = new ArrayList<Integer>();
            links.put(p, valid);
            for(Integer q : primes){
                if (q / 10 == last2) {
                    valid.add(q);
                }
            }
        }

        List<Integer> best = new ArrayList<Integer>();
        Set<Integer> temp = new LinkedHashSet<Integer>();
        for (Integer p : primes) {
            temp.add(p);
            getNextPrime(p, temp, best);
            temp.clear();
        }
        System.out.println("Time: " + (System.currentTimeMillis() - start));

        // print stuff
        System.out.printf("%03d", best.get(0));
        for (int i = 1; i < best.size(); i++) {
            System.out.print((best.get(i) % 10));
        }
        System.out.print("\n");
    }

    static Map<Integer,List<Integer>> links;
    static List<Integer> primes = Arrays.asList(11, 13, 17, 19, 23, 29, 31, 37,
            41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107,
            109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179,
            181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251,
            257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331,
            337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409,
            419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487,
            491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577,
            587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653,
            659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743,
            751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829,
            839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929,
            937, 941, 947, 953, 967, 971, 977, 983, 991, 997);

}
Mark E
I am still looking through the program to figure out what the issue is, but heads up - its not the right answer. Now we are 2 guys looking at whats wrong with the code, rather than one. :-)
Rajan
@Nagarajan: There's a longer solution?
Mark E
Yes... just a bit longer.
Rajan
This might be the issue - you only compare the list sizes... but not the numbers themselves for the best.
Rajan
@Nagarajan: sure, largest, not longest, one sec.
Mark E
@Nagarajan: it compiles on ideone: http://ideone.com/9hKFp (runs just a fraction too long, though), make sure you grabbed the updated import statements...
Mark E
ideone is an awesome thing... (i assumed it was for java only and started wishing there was something like this for every language) - and then I found that it works for every language!! Thanks a ton Marks.
Rajan
Lets look for another bug - this is still not the largest.
Rajan
@Nagarajan: ok, fixed. so much for brevity.
Mark E
I believe the problem is in this line - if (list.size() > best.size() || (list.size() == best.size() best.addAll(list); }list and best need to be compared whole.
Rajan
@Nagarajan: yeah, I added a full comparison.
Mark E
There's probably a way to avoid that check altogether by reverse sorting the prime list, since it's a DFS it'd find the best answer first that way. Anyway, the python solution is more concise, so I'll quit beating on this.
Mark E
Kudos for the speed of your solution. It beats the pants off of my code!
John Kugelman
@John: thanks, just made it slightly faster. I'm averaging just over 800ms.
Mark E
Damn - this is fast!!
Rajan
+1  A: 

Nice problem!

Apparently my approach is the same as already seen in here except for one handy observation. Since each triple has to be prime, each triple can only end in 1,3,7 or 9. Using this observation, I first generate the tail of the number within a reduced search space of prime triples containing only 1,3,7 or 9.

When I get the longest (and numerically highest) possible 1379-only tail, I look for the two highest digits that can be legally added to the front of the tail.

I coded it in F#, execution time on a 2.33Ghz Core2Duo around 0.39 seconds.

let isprime n = [2..(int (sqrt (float n)))] |> List.forall (fun d -> n%d > 0)

//seeds: all prime triples containing only 1,3,7 or 9
let seeds = 
    [for a in [1;3;7;9] do
        for b in [1;3;7;9] do
            let ab = 10*a+b
            for c in [1;3;7;9] do
                let p = 10*ab+c
                if isprime p then yield p]
    |> Set.ofList                

//find the longest prime chain starting with a 3-digit seed
let longestchain seed =
    let remaining = Set.remove seed seeds

    let rec aux remaining str n =
        seq {
            let cands = remaining |> Set.filter (fun rm -> n%100 = rm/10)
            if cands.IsEmpty then 
                yield str
            else
                for c in cands do 
                    yield! aux (remaining.Remove c) (str + (string (c%10))) c
        }

    aux remaining (string seed) seed
    |> Seq.groupBy (fun s -> s.Length)
    |> Seq.maxBy fst
    |> snd
    |> Seq.max

//a chain is valid if all consecutive 3-tuples inside a string are distinct
let valid (s:string) = 
    (Seq.windowed 3 s 
     |> Seq.distinct 
     |> Seq.length) = (s.Length - 2)    

//All expansions of a string s (xyz....) with a 3-digit prime p (abx)
//such that abx and bxy are prime
let augment (s:string) =
    let x = int (s.Substring(0,1))
    let y = int (s.Substring(1,1))

    {997 .. -2 .. 2}
    |> Seq.filter isprime
    |> Seq.filter (fun p -> p%10 = x)
    |> Seq.filter (fun p -> isprime (10 * (p%100) + y))
    |> Seq.map (fun p -> (string (p/10)) + s)
    |> Seq.filter valid //only valid chains allowed
    |> Seq.head //return highest legal augmentation

// Find the largest number where each consecutive triplet is a prime number
let find() =
    seeds 
    |> Seq.map longestchain //expand each seed to the longest chain possible
    |> Seq.groupBy (fun s -> s.Length) 
    |> Seq.maxBy fst //find the chains of longest length possible
    |> snd 
    |> Seq.map augment //add two valid numbers in front of the each max-chain
    |> Seq.max

let test() = find() = "9419919379773971911373313179" //should return true
cfern
The program has a pretty impressive running time on ideone.com - 0.11s.
Rajan
A: 

Golfscript - 28 chars

9419919379773971911373313179

Maybe the code-golf tag should not be on this question :)

gnibbler
Yes... that was added by the moderators. My intention was never code length minimization.
Rajan
@Nag - Kirk Woll, who added the tag, is not a moderator. If you think it should be removed, I think you can feel free to remove it.
reemrevnivek
Thanks... Now I know some more about stackoverflow.
Rajan
Fast *and* small!
Gabe
+4  A: 

16 Bit Assembler

Source: ~1500

Executable: 313

Time: ~0s (I haven't got any way to time it)

    mov ax,ds
    add ax,1000h
    mov es,ax
    mov ds,ax
    mov dx,640ah
    xor di,di
    xor ax,ax
    mov bx,ax
 l1:stosw
    inc ax
    cmp ax,1000
    jb l1
    mov ax,4
 l2:mov di,ax
    cmp w[di],bx
    je l4
    add di,ax
 l3:mov w[di],bx
    add di,ax
    cmp di,2000
    jb l3
 l4:add ax,2
    cmp ax,2000
    jb l2
    mov si,200
    mov di,2000
    xor cx,cx
    mov bx,100
 l5:mov [di+4],bx
    inc bx
    lodsw
    or ax,ax
    jz l6
    mov [si-2],di
    div dl
    add ah,30h
    mov [di+2],ah
    aam
    or al,al
    jz l6
    or ah,ah
    jz l6
    add ax,3030h
    mov [di],ah
    mov [di+1],al
    mov b [di+3],0
    add di,32
    inc cx
 l6:cmp si,2000
    jb l5
    mov bp,cx
    mov di,2000
 l7:mov ax,[di+4]
    div dh
    mov al,ah
    mul dl
    mov ch,1
 l8:push ax
    add al,ch
    adc ah,0
    mov si,ax
    shl si,1
    mov ax,[si]
    or ax,ax
    jz l9
    mov bl,[di+3]
    mov bh,0
    add bx,bx
    mov [di+bx+6],ax
    mov w [di+bx+8],0
    inc b [di+3]
 l9:pop ax
    add ch,2
    cmp ch,10
    jl l8
    add di,32
    mov ch,0
    loop l7
    mov si,2000
    mov cx,bp
    mov b[100],0
l10:cmp b[si+3],0
    je l11
    mov di,0
    mov ax,5
    stosw
    mov ax,[si]
    stosw
    mov al,[si+2]
    stosb
    or b[si+3],128
    push cx
    call l12
    pop cx
    and b[si+3],127
l11:add si,32
    loop l10
    mov dx,102
    mov ah,9
    int 21h    
    ret
l12:lea di,[si+6]
l14:mov bx,[di]
    or bx,bx
    jz ret
    test b[bx+3],128
    jnz l13
    pusha
    xor di,di
    add di,[di]
    mov al,[bx+2]
    stosb
    inc w[0]
    xor si,si
    mov di,100
    mov cx,[si]
    repz cmpsb
    jle l15
    xor si,si
    mov di,100
    mov cx,[si]
    rep movsb
    mov al,'$'
    stosb
l15:mov si,bx
    or b[si+3],128
    call l12
    and b[si+3],127
    dec w [0]
    popa
l13:add di,2
    jmp l14

Note: Assembled using the A86 assembler, tested with WinXP command prompt.

Skizz
No comments? Also, can A86 be assembled for Linux, where you can use the 'time' utilty to test your program?
reemrevnivek
@reemrevnivek: Real Coders(TM) don't use comments ;-) This won't work with linux as it stands due to the MS-DOS Int21 call. With a bit of a rewrite it might work. Also, A86 is an MS-DOS application :-(
Skizz
Skizz. This is plain awesomeness. I cant say anything more.
Rajan
assembled to pwn! :P
st0le
+1 for using A86 (I paid for my copy in 1992!), using old-school int21 functions that use `$` for string termination (like I learned in 1987), and having labels that all start with `l`
Gabe
If you can solve this program in Assembly, I cant believe that you have no way to time it. :-)
Rajan
+2  A: 

My first post here. Kind of looks like a typical Project Euler question. Same answer as Kugelman. In C++ in 0.101 sec. I guess C++ is much faster.

bool ps[1024];

int seen[1024],maxLen=0;

#define SET(x,a) memset(x,a,sizeof x)

char bf[10035];
string res;

void recurse(int x){
int find = false,prv = 10*(bf[x-2]-'0')+(bf[x-1]-'0');
for(int i=9;i>=1;i-=2)if(ps[prv*10+i] && !seen[prv*10+i]){
    seen[prv*10+i] = true;
    bf[x] = '0'+i;
    find = true;
    recurse(x+1);
    seen[prv*10+i] = false;
}
if(!find){
    if(x > maxLen){
        res = string(bf,bf+x);
        maxLen = x;
    }
    return;
}
return;
}

int main(){
res = "";
SET(ps,true);
ps[0] = ps[1] = false;
for(int i=2;i*i <= 1024;i++)
    if(ps[i])
        for(int j=i*i;j<1024;j+=i)
            ps[j]=false;
for(int i=999;i>100;i-=2)
    if(ps[i]){
        sprintf(bf,"%d",i);
        SET(seen,false);
        recurse(3);
    }
cout << res << endl;
return 0;
}
VinyleEm
+1  A: 

@Rajan - nice question.

Here is a perl solution to the problem. It isn't optimal, but it is easy to understand (hopefully).

#!/usr/bin/perl

use Data::Dumper;

sub test {
        my $x = shift;

        my $s0 = substr($x, 0, 3);
        my $sn = substr($x, 1);

        return ($sn =~ m/$s0/) ? 0 : 1;
}

sub isPrime {
        my $x = shift;

        for (my $p = 3; $p < $x; $p += 2) {
                if (($x % $p) == 0) {
                        return 0;
                }
        }

        return 1;
}

sub genPrimes {
        my %bad = ( 0 => 1,
                    2 => 1,
                    4 => 1,
                    5 => 1,
                    6 => 1,
                    8 => 1 );

        for (my $x = 101; $x < 1000; $x += 2) {
                if (isPrime($x)) {
                        my $d2 = int($x / 100);
                        my $d1 = int($x / 10) % 10;
                        my $d0 = ($x % 10);

                        if ((!exists $bad{$d2}) && (!exists $bad{$d1}) && (!exists $bad{$d0})) {
                                push @primes, $x;
                        }
                }
        }
}

my %prefix = ();

sub genPrefixes {

        for (my $i = 0; $i < scalar @primes; $i++) {
                my $x = $primes[$i];
                my $y = substr($x, 0, 1);
                my $z = substr($x, 1, 2);

                if (!exists $prefix{$z}) {
                        $prefix{$z} = "";
                }

                $prefix{$z} = $prefix{$z} . $y;
        }

        return 0;
}

#
# Step 1: generate the 3 digit primes and the prefixes for the leading 2 digits.
#
genPrimes();
genPrefixes();

#
# Step 2: loop over the current set of primes to see if we can prepend a digit and keep it prime.
#
my $lo_bound = -1;
my $hi_bound = -1;
my $count = 1;
my $best = 0;

while ($count != 0) {
        $count = 0;
        $lo_bound = $hi_bound + 1;
        $hi_bound = (scalar @primes) - 1;
        $best     = $primes[$lo_bound];

        for (my $i = $lo_bound; $i <= $hi_bound; $i++) {
                my $base = $primes[$i];
                my $x = substr($base, 0, 2);
                my $y = $prefix{$x};

                for (my $j = 0; $j < length($y); $j++) {
                        my $digit = substr($y, $j, 1);
                        my $guess = $digit . $base;
                        if (test($guess) != 0) {
                                $count++;
                                push @primes, $guess;
                                if ($guess gt $best) {
                                        $best = $guess;
                                }
                        }
                }
        }
}

#
# Step 3: loop over the current set of primes to see if we can prepend a digit
#         and an even digit and keep it prime.  If we find one, then check to
#         see if it is the best one.
#
for (my $i = $lo_bound; $i <= $hi_bound; $i++) {
        my $base = $primes[$i];

        for (my $j = 9; $j > 0; $j--) {
                for (my $k = 9; $k >= 0; $k--) {
                        if ((($k%2) == 0) || ($k == 5)) {
                                my $guess = $j . $k . $base;
                                my $try1 = substr($guess, 0, 3);
                                my $try2 = substr($guess, 1, 3);
                                if (isPrime($try1) && isPrime($try2) && test($guess)) {
                                        if ((length($guess) > length($best)) || ($guess gt $best)) {
                                                $best = $guess;
                                        }
                                }
                        }
                }
        }
}

#
# Step 4: Print out the answer.
#
printf "%s\n", $best;
No One in Particular