views:

3547

answers:

6

Edit2: I finally got it to pass at 10.54 seconds. I used nearly all of your answers to get there and thus it was hard to choose one as 'correct' but I believe the one I chose sums it up the best. Thanks to you all. Final passing code is below.

Edit: I included some of the suggested updates in the included code.

I'm working on a spoj problem, INTEST. The goal is to specify the number of test cases (n) and a divisor (k), then feed your program n numbers. The program will accept each number on a newline of stdin and after receiving the nth number, will tell you how many were divisible by k.

The only challenge in this problem is getting your code to be FAST because it k can be anything up to 10^7 and the test cases can be as high as 10^9.

I'm trying to write it in python and having trouble speeding it up. Any ideas?

Extensions and third-party modules are not allowed. The code is also run by the SPOJ judge machine, so I do not have the option of changing interpreters.

Thanks, Mike

import sys
import psyco
psyco.full()

def main():
    from sys import stdin, stdout
    first_in = stdin.readline()
    thing = first_in.split()
    n = int(thing[0])
    k = int(thing[1])
    total = 0

    list = stdin.readlines()
    for item in list:
        if int(item) % k == 0:
            total += 1

    stdout.write(str(total) + "\n")

if __name__ == "__main__":
    main()
+5  A: 

Use psyco, it will JIT your code, very effective when there is big loop and calculations.

Edit: Looks like third party modules are not allowed,

So, you may try converting your loop to list comprehensions, it supposed to be run at C level, so it should be faster a little bit.

sum(1 if int(line) % k == 0 else 0 for line in sys.stdin)
S.Mark
Foreign modules not allowed as the code is run on their machine.
totallymike
So i tried psyco, apparently it is supported. Didn't speed me up enough but did add a memory footprint so I'm confident that it's there.
totallymike
How large your stdin data, btw? I think it could only boost the speed when there is several thousands of loops or caculations in the data.
S.Mark
spoj doesn't let you see the test data, but they proclaim that you want to be able to handle 2.5M of input data per second. The whole point of the problem is to see how fast your code can run.
totallymike
+8  A: 

[Edited to reflect new findings and passing code on spoj]

Generally, when using Python for spoj:

  • Don't use "raw_input", use sys.stdin.readlines(). That can make a difference for large input. Also, if possible (and it is, for this problem), read everything at once (sys.stdin. readlines()), instead of reading line by line ("for line in sys.stdin...").
  • Similarly, don't use "print", use sys.stdout.write() - and don't forget "\n". Of course, this is only relevant when printing multiple times.
  • As S.Mark suggested, use psyco. It's available for both python2.5 and python2.6, at spoj (test it, it's there, and easy to spot: solutions using psyco usually have a ~35Mb memory usage offset). It's really simple: just add, after "import sys": import psyco; psyco.full()
  • As Justin suggested, put your code (except psyco incantation) inside a function, and simply call it at the end of your code
  • Sometimes creating a list and checking its length can be faster than creating a list and adding its components.
  • Favour list comprehensions (and generator expressions, when possible) over "for" and "while" as well. For some constructs, map/reduce/filter may also speed up your code.

Using (some of) these guidelines, I've managed to pass INTEST. Still testing alternatives, though.

rbp
I didn't expect print to be a huge load as it's only called the one time. I did include psyco in the source upon the next submit. Didn't speed me up much, but it did multiply my memory footprint by a factor of ten, so at least they support it!
totallymike
Of course, print shouldn't slow his submission down for this particular problem. That's why I mentioned "Generally" for these tips :)They actually didn't include psyco when they started supporting python2.6, IIRC. But there was a huge outcry, so they "fixed" it ;)BTW, python2.5 was faster everytime I tried it against 2.6. So that might also help...
rbp
I appreciate the input. Those are good rules to go by.
totallymike
Take a look at the updated list. With very few lines, you can pass this problem (and then you're free to keep on improving your time :))
rbp
+2  A: 

For other readers, here is the INTEST problem statement. It's intended to be an I/O throughput test.

On my system, I was able to shave 15% off the execution time by replacing the loop with the following:

print sum(1 for line in sys.stdin if int(line) % k == 0)
Daniel Stutzbach
Thanks for the update. I've got the sum line in there and it seemed to speed me up, but not quite enough, unfortunately.
totallymike
+4  A: 

Hey, I got it to be within the time limit. I used the following:

  • Psyco with Python 2.5.
  • a simple loop with a variable to keep count in
  • my code was all in a main() function (except the psyco import) which I called.

The last one is what made the difference. I believe that it has to do with variable visibility, but I'm not completely sure. My time was 10.81 seconds. You might get it to be faster with a list comprehension.

Edit:

Using a list comprehension brought my time down to 8.23 seconds. Bringing the line from sys import stdin, stdout inside of the function shaved off a little too to bring my time down to 8.12 seconds.

Justin Peel
Bloody hell, it sounds counterintuitive, but putting everything inside a function *does* work :P
rbp
Hm, edited the code to what now appears in the OP, and still no pass. Is there a better way to call a main() function?
totallymike
I didn't put in the `if __name__ == "__main__":` part. Also, I'd take out the else part of your list comprehension and put []'s around the expression inside the sum. I found that without the []'s it was much slower.
Justin Peel
Module-level variables need to query and modify a hash table to get or set them. Function-level variables need only index an array.
jemfinch
Jemfinch that's valuable information indeed. My code is taking too long, but I feel we're getting somewhere. Or at least that you're all dragging me there. Thanks a lot to all of you, by the way.
totallymike
@totallymike try just using a simple loop like you had at first (inside of a function of course). It was fast enough for me.
Justin Peel
+2  A: 

Just recently Alex Martinelli said that invoking code inside a function, outperforms code run in the module ( I can't find the post though )

So, why don't you try:

import sys
import psyco

psyco.full1()

def main():

    first_in = raw_input()
    thing = first_in.split()
    n = int(thing[0])
    k = int(thing[1])
    total = 0
    i = 0

    total = sum(1 if int(line) % k == 0 else 0 for line in sys.stdin)

    print total
if __name__ == "__main__":
    main()

IIRC the reason was code inside a function can be optimized.

OscarRyz
Thanks, code is in function now.
totallymike
Put the psyco.full() right under the import psyco. Putting it inside the function makes it work very poorly. When I put it inside the main() function with my code, the code works just as fast as not using psyco at all.
Justin Peel
A: 

Using list comprehensions with psyco is counter productive.

This code:

 count = 0
 for l in sys.stdin:
     count += not int(l)%k

runs twice as fast as

count = sum(not int(l)%k for l in sys.stdin)

when using psyco.