You need the right algorithm to be successful in GCJ. I'd also argue it's much more important how fast you can write a program in a language than how fast that program runs - given the limited coding time allowed.
I used Python for GCJ and never had a case in which the language's speed "failed" me. One can say that Python is about 2x faster than Ruby (per the Computer Language Benchmarks Game), and when I used Psyco (a JIT compiler module) I got about a 5x speed-up - but that's small beer; the choice of language can bring only a linear speed increase. Say 10 times - big whoops.
The problems in GCJ, on the other hand, are designed with combinatorial explosion in mind, and the larger inputs lead to a much larger increase in the time (or memory) needed.
Take for example GCJ 2010 Round 1C, "Making Chess Boards". Assuming a square board for simplicity, the naive implementation has complexity O(n**4). The fast yet complicated judge implementation was described as O(n**2 log(n**2)). My simpler solution - which unfortunately came to me well after the round ended - is O(n**3). The difference between the 3rd and the 4th power might not seem significant, but the large input had a 512x512 board to process, for which the 3 algorithms have to do iterations on the order of:
naive:    68,719,476,736
mine:        134,217,728
judge's:       4,718,592
So on that input my implementation is roughly 30x slower than the judge's solution and ~500x faster than the naive code. On my oldish desktop (1.2 GHz Athlon) my Python code runs the large input in slightly under 4 minutes. I can speculate that the optimal solution would have run in under 10 seconds - but who cares, as long as you fit under the 8-minute limit?
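Those figures are just the three complexity formulas evaluated at n = 512 - here's a quick sanity check (using a base-2 log for the judge's bound, which is my assumption; the bounds themselves come from the discussion above):

```python
import math

n = 512  # side of the large-input board

naive = n ** 4                              # brute force, O(n**4)
mine = n ** 3                               # my solution, O(n**3)
judge = n ** 2 * int(math.log2(n ** 2))     # judge's O(n**2 log(n**2))

print(f"naive:   {naive:,}")    # 68,719,476,736
print(f"mine:    {mine:,}")     # 134,217,728
print(f"judge's: {judge:,}")    # 4,718,592

print(naive // mine)            # 512  -> the "~500x" gap
print(round(mine / judge))      # 28   -> the "roughly 30x" gap
```

Only the leading terms matter here; constant factors hidden by the big-O would shift the absolute numbers but not the story.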
On the other hand, the n**4 algorithm would take ~500 * 4 min ≈ 33 hours to run. That is very much not acceptable, and no optimizing compiler or over-clocked CPU is likely to save us from that morass.
Sure, some optimizations are possible - just adding psyco.full() decreased my runtime 5x, to 46 sec, and running the code on my faster laptop (2 GHz dual core) sped it up another 3x. That's "only" 15 times - and even if we generously call it a 50x speed-up, the "naive" algorithm on the large input would still take ~40 minutes, several times over the limit.
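The back-of-the-envelope budget behind that claim (the 4-minute measurement and the 8-minute window are from above; the 50x figure is the generous hypothetical):

```python
# Would a 50x constant-factor speed-up rescue the naive n**4 algorithm?
n3_runtime_min = 4      # measured: my O(n**3) Python code on the large input
work_ratio = 512        # at n = 512, n**4 does 512x the work of n**3
time_limit_min = 8      # GCJ large-input submission window

naive_min = n3_runtime_min * work_ratio   # ~2048 min, i.e. ~34 hours
sped_up_min = naive_min / 50              # ~41 min even with a 50x boost

print(round(naive_min / 60))              # ~34 (hours)
print(round(sped_up_min / time_limit_min))  # ~5 (times over the limit)
```

Constant-factor speed-ups divide the runtime; a worse exponent multiplies the work by a factor of n, so the division simply can't keep up.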
So if you have a bad algorithm, no optimization, compiler, or hardware will help. On the other hand, if you have the best algorithm in mind, you can use a computer 30 times slower than my nine-year-old PC and still get the results in time with a scripting language of the rank of Python/Ruby. Which, by the way, is the main goal of the GCJ problem writers: having contestants distinguish themselves on the basis of programming skill, not compiler/hardware/network connection.