views:

65

answers:

2

Hi, I am working on a Project Euler question. But it take quite a long time (30 Minutes last time) for my PC to get the answer.

When doing time command on my Linux PC. I get result as this:

real    1m42.417s 
user    0m18.204s
sys  1m24.026s

This is a time based on much smaller dataset than the question asked.

So my question is, is this time result indicate some thing I could do to optimize my program. My guessing is that GC working most of the time or some standard library functions from ruby take too long to excuse. Also please note there are not lots of I/O involved, and the program will print out result in the very end.

Adding profile result according to feedback from AboutRuby:

 %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 66.03     9.70      9.70    10508     0.92     1.26  Array#find_index
 24.10    13.24      3.54  1027032     0.00     0.00  Fixnum#==
  4.90    13.96      0.72     1046     0.69    13.38  Array#each
  1.97    14.25      0.29      525     0.55    29.05  Integer#upto
  0.61    14.34      0.09     9322     0.01     0.01  BasicObject#!=
  0.48    14.41      0.07     7733     0.01     0.01  Fixnum#%
  0.34    14.46      0.05    10508     0.00     0.00  BasicObject#==
  0.27    14.50      0.04    10715     0.00     0.00  Fixnum#/
  0.27    14.54      0.04      523     0.08     1.19  Object#find_div
  0.20    14.57      0.03     8411     0.00     0.00  Fixnum#>=
  0.14    14.59      0.02     8383     0.00     0.00  Fixnum#<=
  0.14    14.61      0.02      526     0.04     0.04  Class#new
  0.14    14.63      0.02      523     0.04     0.10  Enumerable.inject
  0.14    14.65      0.02     3206     0.01     0.01  Array#<<
  0.14    14.67      0.02    11245     0.00     0.00  Fixnum#+
  0.07    14.68      0.01      523     0.02     0.02  Fixnum#>
  0.07    14.69      0.01     8134     0.00     0.00  Fixnum#-
  0.00    14.69      0.00        4     0.00     0.00  IO#set_encoding
  0.00    14.69      0.00      523     0.00     0.00  Float#floor
  0.00    14.69      0.00      523     0.00     0.00  Math.sqrt
  0.00    14.69      0.00      523     0.00     0.00  Fixnum#to_f
  0.00    14.69      0.00        5     0.00     0.00  Module#method_added
  0.00    14.69      0.00      526     0.00     0.00  Array#initialize
  0.00    14.69      0.00        1     0.00   700.00  Object#find_abundants
  0.00    14.69      0.00        2     0.00     0.00  Kernel.require
  0.00    14.69      0.00      523     0.00    26.71  Object#can_sum
  0.00    14.69      0.00        2     0.00     0.00  Kernel.gem_original_require
  0.00    14.69      0.00       73     0.00     0.00  Hash#default
  0.00    14.69      0.00        1     0.00 13990.00  Object#search_for_can
  0.00    14.69      0.00      246     0.00     0.00  Fixnum#to_s
  0.00    14.69      0.00      246     0.00     0.00  Kernel.inspect
  0.00    14.69      0.00        1     0.00     0.00  Array#inspect
  0.00    14.69      0.00        1     0.00     0.00  Kernel.p
  0.00    14.69      0.00        1     0.00 14690.00  #toplevel

==================================================================================== Here is the new top time consuming function calls (3 minutes after speed up):

 %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 30.19    19.52     19.52  4313353     0.00     0.01  Set#include?
 28.38    37.87     18.35    56246     0.33     0.68  Hash#each_key
 15.96    48.19     10.32    28125     0.37     2.97  Integer#upto
  7.55    53.07      4.88   271292     0.02     0.02  Set#add
  6.40    57.21      4.14  4313353     0.00     0.00  Hash#include?
  2.23    58.65      1.44    28123     0.05     0.85  Object#find_div
  1.52    59.63      0.98   271292     0.00     0.00  Hash#[]=
  1.28    60.46      0.83    56246     0.01     0.69  Set#each
  1.25    61.27      0.81    56253     0.01     0.04  Class#new
  1.16    62.02      0.75   237659     0.00     0.00  Fixnum#+
  1.08    62.72      0.70    28125     0.02     0.05  Set#initialize
  0.84    63.26      0.54    28124     0.02     0.18  Enumerable.inject
  0.63    63.67      0.41    28123     0.01     0.02  Math.sqrt
  0.23    63.82      0.15    28125     0.01     0.01  Hash#initialize
  0.22    63.96      0.14    28123     0.00     1.23  Object#can_sum
  0.19    64.08      0.12    28123     0.00     0.00  Float#floor
  0.17    64.19      0.11        2    55.00   115.00  Array#each
  0.15    64.29      0.10        1   100.00   210.00  Array#inspect
  0.14    64.38      0.09    56246     0.00     0.00  Kernel.block_given?
  0.12    64.46      0.08    28123     0.00     0.00  Fixnum#to_f
  0.12    64.54      0.08    28124     0.00     0.00  NilClass#nil?
  0.12    64.62      0.08     6966     0.01     0.02  Kernel.inspect
  0.05    64.65      0.03     6966     0.00     0.00  Fixnum#to_s
  0.00    64.65      0.00        3     0.00     0.00  Array#initialize
  0.00    64.65      0.00        1     0.00     0.00  Kernel.respond_to?

Probably could be used for further speedup, or try improve algorithm instead?

A: 

All of the Project Euler questions (at least the ones I know of) are solvable in a reasonable amount of time in almost any programming language. That it takes 30 minutes for your program to run indicates that you have a bad algorithm.

adamse
I am working on that Sir :-D
pstar
My point is that I will further improve my algorithm when I got the first 5 times speed up from sys time when I see it necessary.
pstar
I realize that in the end which data structure for the data(Array, Set, Hash) doesn't matter that much than a better algorithm. I rewrite my algorithm after read about how other people implement theirs. And my new program actually runs in a few seconds without too much efforts in optimization. So I guess you are right Sir, algorithm is much more important than data structure or other language implementation details in my case.
pstar
May I ask which problem you were working on?
adamse
A: 

Well, a majority of time in sys normally suggests it's waiting for IO. However, if you say there's not much IO, that's probably not the problem. However, it is still spending a lot of time in sys, so that suggest some other system call.

If you're making a really huge amount of objects, it will take time to deallocate all of these objects. Does it print the result, then take time to finish before finally returning to the shell? I had this problem once where a spidering program entered a nasty loop, ended up using almost 4 gigs of memory. It took several minutes for that program to shut down.

Another thing you can do is benchmark the program, and small portions of the program. You can also run the profiler to see what code is being run the most.

AboutRuby
According to the Profiler result, looks like Array#find_index, probably I will try hash?Not sure how to benchmark yet.
pstar
Also it doesn't look like lots of time doing memory allocation to lots of objects to me.
pstar