I have a Sinatra app with a long-running process (a web scraper). I'd like the app to flush the crawler's progress to the browser while the crawler is running instead of at the end.

I've considered forking the request and doing something fancy with Ajax, but this is a very basic one-page app that just needs to output a log to the browser as it happens. Any suggestions?

+2  A: 

Unfortunately, you don't have a stream you can simply flush to (that would not work with Rack middleware). Instead, the result returned from a route block can be any object that responds to each. The Rack handler will call each with a block, and in that block it flushes the given part of the body to the client.

Every Rack response body has to respond to each and hand only strings to the given block. Sinatra takes care of this for you if you just return a string.
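
To make that contract concrete, here is a minimal sketch of what a handler roughly does with a response body. The `serve` method and the `StringIO` "socket" are hypothetical stand-ins for illustration, not Rack's or any server's real implementation.

```ruby
require 'stringio'

# Hypothetical stand-in for a Rack handler (not real Rack code): it takes the
# [status, headers, body] triple and writes each chunk as soon as it is yielded.
def serve(response, socket)
  _status, _headers, body = response
  body.each do |chunk|
    raise TypeError, "body must yield Strings" unless chunk.is_a?(String)
    socket.write(chunk)
    socket.flush
  end
  # Rack bodies may optionally respond to close
  body.close if body.respond_to?(:close)
end

# A StringIO plays the role of the client connection in this sketch.
socket = StringIO.new
serve([200, { "content-type" => "text/plain" }, ["this", " takes", " some", " time"]], socket)
```

Anything that responds to each and yields strings could be passed in place of the array here, which is what the examples in this answer rely on.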

A simple streaming example would be:

require 'sinatra'

get '/' do
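  result = ["this", " takes", " some", " time"]
  # Override each on this one array so every chunk is followed by a pause
  class << result
    def each
      super do |str|
        yield str   # hand the chunk to the Rack handler
        sleep 0.3   # simulate slow work before the next chunk
      end
    end
  end
  result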
end
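
As a possible alternative to patching each onto a single array, an Enumerator also responds to each and produces its chunks lazily. The sketch below is plain Ruby with no Sinatra dependency; `slow_body` is a hypothetical helper name, and whether the chunks actually reach the browser incrementally still depends on the server and any middleware in between.

```ruby
# Hypothetical helper: builds a body object that yields one chunk at a time.
def slow_body(parts, delay)
  Enumerator.new do |out|
    parts.each do |str|
      out << str    # hand the chunk to whoever is iterating
      sleep delay   # simulate slow work between chunks
    end
  end
end

body = slow_body(["this", " takes", " some", " time"], 0.01)

# Iterate the way a handler would, collecting the chunks for inspection.
chunks = []
body.each { |str| chunks << str }
```

In a route, the object returned by `slow_body` would be the last expression of the block, just like `result` in the example above.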

Now you could simply place all your crawling in the each method:

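require 'sinatra'
require 'open-uri'  # provides URI.open for fetching URLs

class Crawler
  def initialize(url)
    @url = url
  end

  # Rack calls each and sends every yielded string to the client
  def each
    yield "opening url\n"
    result = URI.open(@url).read  # read the page body into a String
    yield "searching for foo\n"
    if result.include? "foo"
      yield "found it\n"
    else
      yield "not there, sorry\n"
    end
  end
end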

get '/' do
  Crawler.new 'http://mysite'
end
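
For a scraper that visits several pages, the same idea might be extended so that one progress line is yielded per page. This is only a sketch under assumptions: `ProgressCrawler`, its `needle` and `fetcher` parameters, and the example URLs are all hypothetical, and the fetcher is stubbed so the code runs without network access.

```ruby
# Hypothetical multi-page variant of Crawler. `fetcher` is any callable that
# takes a URL and returns the page body as a String.
class ProgressCrawler
  def initialize(urls, needle, fetcher)
    @urls = urls
    @needle = needle
    @fetcher = fetcher
  end

  # Yields a log line before and after each page, then a final "done" line.
  def each
    @urls.each_with_index do |url, i|
      yield "[#{i + 1}/#{@urls.size}] opening #{url}\n"
      page = @fetcher.call(url)
      yield(page.include?(@needle) ? "  found #{@needle}\n" : "  no #{@needle} here\n")
    end
    yield "done\n"
  end
end

# Stub pages standing in for real HTTP responses (made-up data).
pages = { "http://a.example" => "foo bar", "http://b.example" => "baz" }
crawler = ProgressCrawler.new(pages.keys, "foo", ->(url) { pages.fetch(url) })

log = []
crawler.each { |line| log << line }
```

In a real app the fetcher could be something like `->(url) { URI.open(url).read }`, and the crawler object would be returned from the route as in the example above.
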
Konstantin Haase