In this code, I create an array of strings "1" to "10000":

array_of_strings = (1..10000).collect {|i| String(i)}

Does the Ruby Core API provide a way to get an enumerable object that lets me enumerate over the same list, generating the string values on demand, rather than generating an array of the strings?

Here's a further example which hopefully clarifies what I am trying to do:

def find_me_an_awesome_username
  awesome_names = (1..1000000).xform {|i| "hacker_" + String(i) }
  awesome_names.find {|n| not stackoverflow.userexists(n) }
end

Where xform is the method I am looking for. awesome_names is an Enumerable, so xform isn't creating a 1 million element array of strings, but just generating and returning strings of the form "hacker_[N]" on demand.

By the way, here's what it might look like in C#:

var awesomeNames = from i in Range(1, 1000000) select "hacker_" + i;
var name = awesomeNames.First((n) => !stackoverflow.UserExists(n));

(One Solution)

Here is an extension to Enumerator that adds an xform method. It returns another enumerator which iterates over the values of the original enumerator, with a transform applied to each value.

class Enumerator
  def xform(&block)
    Enumerator.new do |yielder|
      self.each do |val|
        yielder.yield block.call(val)
      end
    end
  end
end

# this prints out the even numbers from 2 to 20:
(1..10).each.xform {|i| i*2}.each {|i| puts i}
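
To check that a transformer like this really is lazy, one option is to count how often the block runs. The following is only a sketch: `xform` is written here as a standalone helper rather than an Enumerator extension, and the `taken` array is a made-up stand-in for the `stackoverflow.userexists` check.

```ruby
# Hypothetical standalone helper mirroring the xform idea above.
def xform(source, &block)
  Enumerator.new do |yielder|
    source.each { |val| yielder.yield block.call(val) }
  end
end

calls = 0
names = xform(1..1_000_000) do |i|
  calls += 1          # count how many strings are actually built
  "hacker_#{i}"
end

# Stand-in for the real "does this user exist?" lookup (an assumption).
taken = ["hacker_1", "hacker_2"]
found = names.find { |n| !taken.include?(n) }

puts found   # the first name that is not taken
puts calls   # number of strings generated before find stopped
```

Because find stops at the first match, only the strings up to that match are ever built.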
A: 

Ranges have an each method:

(1..100000).each
ennuikiller
...okay, keep going. :)
mackenir
... okay, now you start searching about Ruby iteration.
Geo
But your code just iterates over the integer range. It doesn't generate a new enumerable of strings. Please try and put yourself in my idiot shoes :).
mackenir
`(1..100000).each { |j| i = j.to_s ; do_something i }` or did you want to do something different?
anshul
@anshul yes, I want to do something a little different from that. I'd like to separate the code that generates the enumerable from the code that works with the enumerable values. TBH it's not the only way to do it, but would make the code cleaner (in my opinion of course). Thanks.
mackenir
@mackenir, you should probably just expose an `each` method that takes a block and delegates to `Range#each` as suggested. Keep enumerable magic internal to the implementation.
molf
@molf I don't think working with Enumerator objects directly is that magical, in fact it's pretty useful.
mackenir
+1  A: 

It sounds like you want an Enumerator object, but not exactly.

That is, an Enumerator object is an object that you can use to call next on demand (rather than each which does the whole loop). (Many people use the language of internal versus external iterators: each is internal, and an Enumerator is external. You drive it.)

Here's how an enumerator might look:

awesome_names = Enumerator.new do |y|
  number = 1
  loop do
    y.yield number
    number += 1
  end
end

puts awesome_names.next
puts awesome_names.next
puts awesome_names.next
puts awesome_names.next

Here's a link to more discussion of how you might use Enumerators lazily in Ruby: http://www.michaelharrison.ws/weblog/?p=163

There's also a section on this in the Pickaxe book (Programming Ruby by Dave Thomas).

Telemachus
Thanks. Hmmm. Find definitely stops enumerating when it finds a matching element. You can confirm this by running find on a very large range with the predicates 'false' and 'true': the latter returns instantly. If both were enumerating everything, they'd both take the same time. Re: enumeration with next, I'm trying to find the 'enumerable transformer' facility in order to write more terse, declarative code, and manually enumerating won't really achieve that. Maybe the answer is to just implement it.
mackenir
With all the CRs removed that's less comprehensible. What I mean is, (1..1000000000000000000).find {|i|true} is quick, and (1..1000000000000000000).find {|i|false} is slow. Meaning find just enumerates until it 'finds'.
mackenir
Useful link - I think I understand it, and it helped answer the question.
mackenir
+1  A: 
class T < Range
  def each
    super { |i| yield String(i) }
  end
end

T.new(1,3).each { |s| p s }

$ ruby rsc.rb
"1"
"2"
"3"

The next thing to do is to return an Enumerator when called without a block...
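
A minimal sketch of that next step, assuming the usual `to_enum` idiom is acceptable here. The class name `StringRange` is made up for illustration.

```ruby
# Hypothetical subclass: yields each integer of the range as a String.
class StringRange < Range
  def each
    # With no block, hand back an Enumerator that will call this method
    # again with a block later, when values are actually requested.
    return to_enum(:each) unless block_given?
    super { |i| yield String(i) }
  end
end

strings = StringRange.new(1, 3)

e = strings.each        # no block given, so this is an Enumerator
puts e.next             # first string, produced on demand
puts e.next             # second string

all = strings.each.to_a # a fresh Enumerator, drained into an array
p all
```

With that in place, the blockless call can be driven externally with next or passed to anything that accepts an enumerator.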

DigitalRoss
+3  A: 

What you are looking for is Enumerator.new { with_a_block }. It's new in Ruby 1.9 (upcoming in 1.8.8), so require 'backports' if you need it.

As per your example, the following will not create an intermediate array and will only construct the needed strings:

require 'backports'

def find_me_an_awesome_username
  awesome_names = Enumerator.new do |y|
    (1..1000000).each {|i| y.yield "hacker_" + String(i) }
  end
  awesome_names.find {|n| not stackoverflow.userexists(n) }
end

You can even replace the 1000000 by 1.0/0 (i.e. Infinity), if you want.
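
As a rough illustration of the unbounded variant, the sketch below uses `Float::INFINITY` (the same value as 1.0/0) and takes only the first few names. The final lines assume Ruby 2.0 or later, where `Enumerable#lazy` is built in; that method did not exist when this answer was written.

```ruby
# Unbounded source: nothing is generated until a consumer asks for it.
awesome_names = Enumerator.new do |y|
  (1..Float::INFINITY).each { |i| y.yield "hacker_#{i}" }
end

first_three = awesome_names.first(3)
p first_three

# Assumption: Ruby 2.0+. The built-in lazy enumerator gives the same effect.
lazy_three = (1..Float::INFINITY).lazy.map { |i| "hacker_#{i}" }.first(3)
p lazy_three
```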

To answer your comment, if you are always mapping your values one to one, you could have something like:

module Enumerable
  def lazy_each
    Enumerator.new do |yielder|
      each do |value|
        yielder.yield(yield value)
      end
    end
  end
end

awesome_names = (1..100000).lazy_each{|i| "hacker_#{i}"}

For more generic cases, where one value of your source enumerable (here your Range) might yield any number of values, you can use something like the filter example that Brian Candler suggests to add in the official doc here.
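
The linked filter example is not reproduced here, so the following is only a sketch of the general idea and not that code. The helper name `lazy_transform` is hypothetical. The block receives the yielder as well as the source value, so it can emit zero, one, or several values per input.

```ruby
# Hypothetical helper: passes the yielder to the block so the block decides
# how many values to emit for each element of the source.
def lazy_transform(source)
  Enumerator.new do |yielder|
    source.each { |value| yield yielder, value }
  end
end

# Drop odd numbers and emit every even number twice.
evens_twice = lazy_transform(1..Float::INFINITY) do |out, i|
  next unless i.even?
  out.yield i
  out.yield i
end

result = evens_twice.first(4)
p result
```

The one-to-one `lazy_each` above is then just the special case where the block calls `out.yield` exactly once per input.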

Marc-André Lafortune
@marc, this looks like it! Do you know how I might turn this pattern into a more concise, reusable method taking an 'enumerable thing' and a 'transformer function'? In this example, the 'transformer function' would be `{|i| "hacker_" + String(i) }`, and the 'enumerable thing' would be `(1..100000)` or whatever.
mackenir
Thanks very much. I updated my question with a possible implementation that I worked out from your answer, and also from reading the link that @Telemachus posted, before I noticed *your* update :) Again, thanks!
mackenir
@mackenir: The Enumerable transform function is `map`. You should just be able to change `each` to `map` and keep the algorithm otherwise unchanged.
Chuck
@Chuck Enumerable#map is a synonym for Enumerable#collect so it returns an array rather than an Enumerator.
mackenir
@mackenir: Actually, I misread. How is the method given *not* what you're asking for? It takes a transformer function and gives an enumerator that yields the result of applying that transformation.
Chuck
@Chuck - by 'the method given' do you mean Enumerable#map? It isn't what I'm after, as it returns an array.
mackenir
@mackenir: By "the method given," I mean Marc-Andre's `lazy_each` in this answer. It takes a block and returns an enumerator that yields the results of calling the block on each item in the collection.
Chuck
Uh... you've lost me @Chuck. Where did I say otherwise?
mackenir