Hi,

Which of these two forms of Array Initialization is better in Ruby?

Method 1:

DAYS_IN_A_WEEK = (0..6).to_a
HOURS_IN_A_DAY = (0..23).to_a

@data = Array.new(DAYS_IN_A_WEEK.size).map!{ Array.new(HOURS_IN_A_DAY.size) }

DAYS_IN_A_WEEK.each do |day|
  HOURS_IN_A_DAY.each do |hour|
    @data[day][hour] = 'something'
  end
end

Method 2:

DAYS_IN_A_WEEK = (0..6).to_a
HOURS_IN_A_DAY = (0..23).to_a

@data = {}

DAYS_IN_A_WEEK.each do |day|
  HOURS_IN_A_DAY.each do |hour|
    @data[day] ||= {}
    @data[day][hour] = 'something'
  end
end

The difference between the two methods is that the second one does not allocate memory up front. I feel the second one is somewhat inferior in performance because of the numerous Array copies that have to happen.

However, it is not straightforward in Ruby to find out what is happening. So if someone could explain to me which is better, that would be really great!

Thanks

+2  A: 

I wrapped both of the code snippets into separate methods and did some benchmarking. Here are the results:

require 'benchmark'

Benchmark.bm(7) do |x|
  x.report("method1") { 100000.times { method1 } }
  x.report("method2") { 100000.times { method2 } }
end

             user     system      total        real
method1 11.370000   0.010000  11.380000 ( 11.392233)
method2 17.920000   0.010000  17.930000 ( 18.328318)
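
For reference, a self-contained version of this benchmark might look like the sketch below. The bodies of method1 and method2 are assumed from the question (the answer does not show them), local variables stand in for @data, and the iteration count is reduced so it finishes quickly. Timings will vary by machine and Ruby version.

```ruby
require 'benchmark'

DAYS_IN_A_WEEK = (0..6).to_a
HOURS_IN_A_DAY = (0..23).to_a

# Assumed wrapper around the question's Method 1 (nested Arrays).
def method1
  data = Array.new(DAYS_IN_A_WEEK.size).map! { Array.new(HOURS_IN_A_DAY.size) }
  DAYS_IN_A_WEEK.each do |day|
    HOURS_IN_A_DAY.each do |hour|
      data[day][hour] = 'something'
    end
  end
  data
end

# Assumed wrapper around the question's Method 2 (nested Hashes).
def method2
  data = {}
  DAYS_IN_A_WEEK.each do |day|
    HOURS_IN_A_DAY.each do |hour|
      data[day] ||= {}
      data[day][hour] = 'something'
    end
  end
  data
end

# Assumption: far fewer iterations than the 100000 used above.
ITERATIONS = 1_000

Benchmark.bm(7) do |x|
  x.report("method1") { ITERATIONS.times { method1 } }
  x.report("method2") { ITERATIONS.times { method2 } }
end
```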
Eimantas
Hey, thank you very much! I gather that the first one is roughly 1.6 times as fast as the second one!
Bragboy
+3  A: 

Before I answer the question you asked, I'm going to answer the question you should have asked but didn't:

Q: Should I focus on making my code readable first, or should I focus on performance first?

A: Make your code readable and correct first. Then, and only if there is a performance problem, start worrying about performance: measure where the problem is first, and only then change your code.

Now to answer the question you asked, but shouldn't have:

method1.rb:

DAYS_IN_A_WEEK = (0..6).to_a
HOURS_IN_A_DAY = (0..23).to_a

10000.times do

  @data = Array.new(DAYS_IN_A_WEEK.size).map!{ Array.new(HOURS_IN_A_DAY.size) }

  DAYS_IN_A_WEEK.each do |day|
    HOURS_IN_A_DAY.each do |hour|
      @data[day][hour] = 'something'
    end
  end

end

method2.rb:

DAYS_IN_A_WEEK = (0..6).to_a
HOURS_IN_A_DAY = (0..23).to_a

10000.times do

  @data = {}

  DAYS_IN_A_WEEK.each do |day|
    HOURS_IN_A_DAY.each do |hour|
      @data[day] ||= {}
      @data[day][hour] = 'something'
    end
  end

end

Results of brain-dead benchmark:

$ time ruby method1.rb

real    0m1.189s
user    0m1.140s
sys     0m0.000s

$ time ruby method2.rb

real    0m1.879s
user    0m1.780s
sys     0m0.020s

Judging by user time (the important factor), method1.rb looks a lot faster: 1.14s against 1.78s. You, of course, should not trust this benchmark and should write your own that reflects how your code is actually used. That, however, is something you should do only after you have determined which code really is your performance bottleneck. (Hint: 99.44% of computer programmers are 100% wrong when they guess where their bottlenecks are without measuring!)

JUST MY correct OPINION
Got what you were trying to imply.
Bragboy
+3  A: 

What's wrong with just

@data = Array.new(7) { Array.new(24) { 'something' }}

Or, if you are content having the same object everywhere:

@data = Array.new(7) { Array.new(24, 'something') }

It's much faster, not that it would matter. It is also much more readable, which is the most important thing. After all, the purpose of code is communicating intent to the other stakeholders, not communicating with the computer.

             user   system     total       real
method1  8.969000 0.000000  8.969000 ( 9.059570)
method2 16.547000 0.000000 16.547000 (16.799805)
method3  6.468000 0.000000  6.468000 ( 6.616211)
method4  0.969000 0.015000  0.984000 ( 1.021484)

That last line (presumably the second one-liner, which reuses a single string object) also shows another interesting thing: the runtime is dominated by the time needed to create the 7*24*100000 = 16.8 million 'something' strings.
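
The difference between the two one-liners can be checked directly. This is a minimal sketch, not part of the original benchmark: the variable names are illustrative, and String.new is used (an assumption on my part) so the result does not depend on whether frozen string literals are enabled.

```ruby
# Block form: the inner block runs once per cell, so every cell gets its own String.
distinct = Array.new(7) { Array.new(24) { String.new('something') } }

# Default-value form: every cell in a row refers to the single object passed in.
shared = Array.new(7) { Array.new(24, String.new('something')) }

distinct_ids   = distinct.flatten.map(&:object_id).uniq.size
shared_row_ids = shared[0].map(&:object_id).uniq.size
puts distinct_ids    # 168
puts shared_row_ids  # 1

# Mutating one cell in place leaves its neighbours alone in the block form...
distinct[0][0] << '!'
p distinct[0][1]     # "something"

# ...but is visible in every cell of that row in the shared form.
shared[0][0] << '!'
p shared[0][1]       # "something!"

# The block also receives the index, which helps when cells need different values.
by_index = Array.new(7) { |day| Array.new(24) { |hour| day * 24 + hour } }
p by_index[1][2]     # 26
```

So the shared form is only safe when the cells are never mutated in place, or when they will be reassigned anyway.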

And of course there is another important observation: the method1 and method2 that you are comparing do two completely different things! It doesn't even make sense to compare them against each other: method1 creates an Array, method2 creates a Hash.

Your method1 is equivalent to my first example above:

@data = Array.new(7) { Array.new(24) { 'something' }}

While method2 is (very roughly) equivalent to:

@data = Hash.new {|h, k| h[k] = Hash.new {|h, k| h[k] = 'something' }}

Well, except that your method2 initializes the entire Hash eagerly, while my method only executes the initialization code lazily in case an uninitialized key is read.

In other words, after running the above initialization code, the Hash is still empty:

@data # => {}

But whenever you try to access a key, it will magically appear:

@data[5][17] # => 'something'

And it will stay there:

@data # => {5 => {17 => 'something'}}

Since this code doesn't actually initialize the Hash, it is obviously way faster:

             user   system     total       real
method5  0.266000 0.000000  0.266000 ( 0.296875)
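
To make the lazy versus eager distinction concrete, here is a small sketch that puts both side by side. The names lazy and eager are illustrative, and the eager version is just one possible way to build what method2 builds, not the only one.

```ruby
# Lazy: nothing is stored until a missing key is read.
lazy = Hash.new { |h, day| h[day] = Hash.new { |h2, hour| h2[hour] = 'something' } }
p lazy.size                               # 0

lazy[5][17]                               # reading triggers both default blocks
p lazy == { 5 => { 17 => 'something' } }  # true

# key? does not trigger the default block, so nothing new is stored.
p lazy.key?(2)                            # false
p lazy.size                               # 1

# Eager: all 7 * 24 entries exist up front, as in the question's method2.
eager = (0..6).each_with_object({}) do |day, days|
  days[day] = (0..23).each_with_object({}) { |hour, hours| hours[hour] = 'something' }
end
p eager.size                              # 7
p eager[5].size                           # 24
```

The lazy form only pays for the cells that are actually read, which is why its initialization is so cheap in the timing above.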

Jörg W Mittag
Hi Jörg! I was waiting for 'your' reply. The reason I put 'something' there is that I am doing something there; let's just say it's not a constant value every time. And thanks for throwing light on the Hash vs Array concept in Ruby.
Bragboy