views:

530

answers:

5

I'm trying to solve this programming riddle and although the solution (see code below) works correctly, it is too slow for succesful submission.

  • Any pointers as how to make this run faster (removal of every n-th element from a list)?
  • Or suggestions for a better algorithm to calculate the same; seems I can't think of anything else than brute-force for now...

Basically, the task at hand is:

GIVEN:
L = [2,3,4,5,6,7,8,9,10,11,........]
1. Take the first remaining item in list L (in the general case 'n'). Move it to 
   the 'lucky number list'. Then drop every 'n-th' item from the list.
2. Repeat 1

TASK:
Calculate the n-th number from the 'lucky number list' ( 1 <= n <= 3000)

My original code (it calculated the 3000 first lucky numbers in about a second on my machine - unfortunately too slow):

"""
SPOJ Problem Set (classical) 1798. Assistance Required
URL: http://www.spoj.pl/problems/ASSIST/
"""

sieve = range(3, 33900, 2)
luckynumbers = [2]

while True:
    wanted_n = input()
    if wanted_n == 0:
        break

    while len(luckynumbers) < wanted_n:
        item = sieve[0]
        luckynumbers.append(item)
        items_to_delete = set(sieve[::item])
        sieve = filter(lambda x: x not in items_to_delete, sieve)
    print luckynumbers[wanted_n-1]

EDIT: thanks to the terrific contributions of Mark Dickinson, Steve Jessop and gnibbler, I got at the following, which is quite a whole lot faster than my original code (and succesfully got submitted at http://www.spoj.pl with 0.58 seconds!)...

sieve = range(3, 33810, 2)
luckynumbers = [2]

while len(luckynumbers) < 3000:
    if len(sieve) < sieve[0]:
        luckynumbers.extend(sieve)
        break
    luckynumbers.append(sieve[0])
    del sieve[::sieve[0]]

while True:
    wanted_n = input()
    if wanted_n == 0:
        break
    else:
        print luckynumbers[wanted_n-1]
+1  A: 

You're better off using an array and zeroing out every Nth item using that strategy; after you do this a few times in a row, the updates start getting tricky so you'd want to re-form the array. This should improve the speed by at least a factor of 10. Do you need vastly better than that?

Rex Kerr
Interesting suggestion, thanks. I'll give this a try and report back with with the results!
ChristopheD
+4  A: 

Try using these two lines for the deletion and filtering, instead of what you have; filter(None, ...) runs considerably faster than the filter(lambda ...).

sieve[::item] = [0]*-(-len(sieve)//item)
sieve = filter(None, sieve)

Edit: much better to simply use del sieve[::item]; see gnibbler's solution.

You might also be able to find a better termination condition for the while loop: for example, if the first remaining item in the sieve is i then the first i elements of the sieve will become the next i lucky numbers; so if len(luckynumbers) + sieve[0] >= wanted_n you should already have computed the number you need---you just need to figure out where in sieve it is so that you can extract it.

On my machine, the following version of your inner loop runs around 15 times faster than your original for finding the 3000th lucky number:

while len(luckynumbers) + sieve[0] < wanted_n:
    item = sieve[0]
    luckynumbers.append(item)
    sieve[::item] = [0]*-(-len(sieve)//item)
    sieve = filter(None, sieve)
print (luckynumbers + sieve)[wanted_n-1]
Mark Dickinson
Wow, these two lines run about 10 times as fast as the two in my code above. Very insightful! Let me take a stab at the while loop...
ChristopheD
Or maybe `for x in xrange(0, len(sieve), item): sieve[x] = 0`
Steve Jessop
Since you're asked to output lucky(n) for several values of n, it would make a lot more sense to compute the array once (up to 3000), rather than to start from scratch for each wanted value. It's then probably not worth worrying much about early termination.
Steve Jessop
@Steve Jessop: Mark Dickinson's construct is faster, but thanks for the suggestion!
ChristopheD
Impressive speedups so far, thanks a lot (I'll edit the current version in my question) - it calculates the first 3000 in about 0.10 seconds on my machine. It still fails the time limit on http://www.spoj.pl though...
ChristopheD
Hmm. If that still fails, then either the SPOJ machine is several times slower than all of ours, or else they're asking for a truly huge number of inputs. The latter seems kind of unfair, since surely the question is about lucky number calculation, not about optimising I/O. Personally I would seriously consider benchmarking SPOJ at this point - don't output anything, just execute some loop a large number of times. Binary chop to find out how many times SPOJ can execute it in 1s. Compare with your machine. If you're a factor of 100 times too slow, you're wasting effort just tweaking.
Steve Jessop
@Steve Jessop: That would indeed be an interesting thing to do now; I'll have to call it a day for now unfortunately; but I'll be sure to catch up with this question this weekend!
ChristopheD
@ChristopheD: I think I maligned Mark's early termination idea unfairly. Adding `if len(sieve) < item: luckynumbers.extend(sieve); break` after the first line of the loop gives a 3x speed increase on my machine calculating the first 3000. Sorry, Mark.
Steve Jessop
+2  A: 

An explanation on how to solve this problem can be found here. (The problem I linked to asks for more, but the main step in that problem is the same as the one you're trying to solve.) The site I linked to also contains a sample solution in C++.

The set of numbers can be represented in a binary tree, which supports the following operations:

  • Return the nth element
  • Erase the nth element

These operations can be implemented to run in O(log n) time, where n is the number of nodes in the tree.

To build the tree, you can either make a custom routine that builds the tree from a given array of elements, or implement an insert operation (make sure to keep the tree balanced).

Each node in the tree need the following information:

  • Pointers to the left and right children
  • How many items there are in the left and right subtrees

With such a structure in place, solving the rest of the problem should be fairly straightforward.

I also recommend calculating the answers for all possible input values before reading any input, instead of calculating the answer for each input line.

A Java implementation of the above algorithm gets accepted in 0.68 seconds at the website you linked.

(Sorry for not providing any Python-specific help, but hopefully the algorithm outlined above will be fast enough.)

stubbscroll
@stubbscroll: thanks a lot for this very fine answer! I'll have my go at implementing this weekend (it's late here now) and let you know how it went.
ChristopheD
A: 

Why not just create a new list?

L = [x for (i, x) in enumerate(L) if i % n]
dan04
Hi dan04: I've just rudimentarily tested this but this would be about 200 times slower then the current solution (collected from the answers of gnibbler, Mark Dickinson, Steve Jessop)...
ChristopheD
+6  A: 

This series is called ludic numbers

__delslice__ should be faster than __setslice__+filter

>>> L=[2,3,4,5,6,7,8,9,10,11,12]
>>> lucky=[]
>>> lucky.append(L[0])
>>> del L[::L[0]]
>>> L
[3, 5, 7, 9, 11]
>>> lucky.append(L[0])
>>> del L[::L[0]]
>>> L
[5, 7, 11]

So the loop becomes.

while len(luckynumbers) < 3000:
    item = sieve[0]
    luckynumbers.append(item)
    del sieve[::item] 

Which runs in less than 0.1 second

gnibbler
D'oh! A straightforward `del sieve[::item]` is much better than my convoluted "set to-be-deleted items to zero and then filter". +1.
Mark Dickinson
Can't believe I did not think of that! Surely shows how the simplest solution in Python is often the best/fastest. Combined with Mark Dickinson's early termination the solution I edited in in my original answer worked fine within the time limit (it scored 0.58 seconds with the test set)!
ChristopheD
I'll just check if I can get stubbscroll's or Rex Kerr's suggestions running faster then this (this calculates all 3000 of them in 0.0104 on average on my machine) but otherwise I'll be sure to flag this as accepted answer this weekend!
ChristopheD
@ChristopheD, Now the fun challenge is to figure out how to solve the problem 10 times faster still :)
gnibbler
make that 14.5 times faster...
gnibbler
@gnibbler: lol, very true. The fastest python 2.6.2 solution is in @ 0.04 seconds ;-)
ChristopheD
@gnibblers: funny how his username is also gnibbler ;-)
ChristopheD
@ChristopheD: it's possible that the 0.04 seconds solution has all the lucky numbers hardcoded in the source code.
stubbscroll
Did not have time to check the other proposed solutions so will check this as accepted answer (for now)
ChristopheD