I am somewhat new to Ruby, and although I find it to be a very intuitive language, I am having some difficulty understanding how implicit return values behave.

I am working on a small program to grep Tomcat logs and generate pipe-delimited CSV files from the pertinent data. Here is a simplified example that I'm using to generate the lines from a log entry.

class LineMatcher
  class << self
    def match(line, regex)
      output = ""
      line.scan(regex).each do |matched|
        output << matched.join("|") << "\n"
      end
      return output
    end        
  end
end


puts LineMatcher.match("00:00:13,207 06/18 INFO  stateLogger - TerminationRequest[accountId=AccountId@66679198[accountNumber=0951714636005,srNumber=20]",
                       /^(\d{2}:\d{2}:\d{2},\d{3}).*?(\d{2}\/\d{2}).*?\[accountNumber=(\d*?),srNumber=(\d*?)\]/)

When I run this code I get back the following, which is what is expected when explicitly returning the value of output.

00:00:13,207|06/18|0951714636005|20

However, if I change LineMatcher to the following and don't explicitly return output:

    class LineMatcher
      class << self
        def match(line, regex)
          output = ""
          line.scan(regex).each do |matched|
            output << matched.join("|") << "\n"
          end
        end        
      end
    end

Then I get the following result:

00:00:13,207
06/18
0951714636005
20

Obviously, this is not the desired outcome. It feels like I should be able to get rid of the output variable, but it's unclear where the return value is coming from. Also, any other suggestions/improvements for readability are welcome.

+3  A: 

In Ruby, the return value of a method is the value of the last expression it evaluates. You can also use an explicit return.

In your example, the first snippet returns the string output. In the second snippet, however, the call to each is the last statement, so the method returns whatever each returns. That is the array produced by scan (here an array containing one array of captured groups), not output.

irb(main):013:0> s = ""
=> ""
irb(main):014:0> "StackOverflow Meta".scan(/[aeiou]\w/).each do |match|
irb(main):015:1* s << match
irb(main):016:1> end
=> ["ac", "er", "ow", "et"]

Update: However, that still doesn't explain why your output appeared on a single line. I think that was a formatting error; it should print each of the matches on a separate line, because that is how puts prints an array (nested arrays are flattened). A little code can explain it better than I can:

irb(main):003:0> one_to_three = (1..3).to_a
=> [1, 2, 3]
irb(main):004:0> puts one_to_three
1
2
3
=> nil

Personally, I find your method with the explicit return more readable (in this case).
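If you would still rather drop the return keyword, a minimal sketch (reusing the LineMatcher code from the question) is to leave output as the last expression of the method, so that it, rather than the result of each, becomes the return value:

```ruby
class LineMatcher
  class << self
    def match(line, regex)
      output = ""
      line.scan(regex).each do |matched|
        output << matched.join("|") << "\n"
      end
      output # last evaluated expression, so this is what match returns
    end
  end
end

# Hypothetical short input, used only to illustrate the shape of the result
puts LineMatcher.match("a1 b2", /(\w)(\d)/)
```

This behaves the same as the version with an explicit return; which one reads better is a matter of taste.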

Gishu
Sorry, yes, I had a formatting error. Corrected in the question.
csamuel
+3  A: 

Any statement in Ruby returns the value of its last evaluated expression. You need to know what the most commonly used methods return in order to predict exactly how your program will behave.

#each returns the collection it iterated over. As a result, the following code returns the value of line.scan(regex), not output.

line.scan(regex).each do |matched|
  output << matched.join("|") << "\n"
end
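A small sketch makes this visible. The rows array below is made-up sample data standing in for the result of scan:

```ruby
rows = [["a", "1"], ["b", "2"]] # hypothetical stand-in for line.scan(regex)
output = String.new             # mutable empty string

# The block appends to output, but each ignores the block's value
# and hands back the array it was called on.
returned = rows.each do |row|
  output << row.join("|") << "\n"
end

puts returned.equal?(rows) # true: each returned its receiver
puts output                # the appended lines are still in output
```

The block's side effect on output happens either way; it just never becomes the value of the each call.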

If you want to return the result of the block, you can use map, which iterates like each but returns a new array containing the block's result for each element.

class LineMatcher
  class << self
    def match(line, regex)
      line.scan(regex).map do |matched|
        matched.join("|")
      end.join("\n") # remember the final join
    end        
  end
end

There are several useful methods to choose from, depending on your specific case. Here you might want to use inject, unless the number of results returned by scan is high (building an array and joining it once is more efficient than repeatedly appending to a single string).

class LineMatcher
  class << self
    def match(line, regex)
      line.scan(regex).inject("") do |output, matched|
        output << matched.join("|") << "\n"
      end
    end        
  end
end
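
The inject version works because the value of the block becomes the accumulator for the next iteration, and String#<< returns the string it appended to. A minimal sketch, with made-up rows standing in for the result of scan:

```ruby
rows = [["a", "1"], ["b", "2"]] # hypothetical stand-in for line.scan(regex)

# String#<< returns the string it appended to, so the block's value
# is the accumulator, which inject passes to the next iteration.
joined = rows.inject(String.new) do |output, row|
  output << row.join("|") << "\n"
end

puts joined
```

If the block ended with some other expression, inject would pass that value along as the accumulator instead, so the last expression in the block matters here too.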
Simone Carletti
Thanks for the feedback... I think my confusion resulted from the fact that it was hard to tell what actually gets interpreted as the last statement.
csamuel