There's nothing wrong with using a state file. The only catch is that you need to make sure your changes to the state file are fully committed before your program enters a section where it may be interrupted. Typically this is done with an IO#flush call, which pushes Ruby's buffered writes out to the operating system.
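As a minimal sketch of that commit step in isolation, assuming a hypothetical helper named save_position and a throwaway temporary directory (neither comes from the original question), the write could look like the following. The IO#fsync call is an optional extra for cases where the state should also survive a machine crash, not just an interrupted process; it may raise NotImplementedError on platforms that lack fsync.

```ruby
require "tmpdir"

# Hypothetical helper for illustration: writes a byte offset to an
# already-open state file and commits it before returning.
def save_position(state_file, position)
  state_file.rewind          # overwrite the previous value from the start
  state_file.puts(position)
  state_file.flush           # hand Ruby's buffered data to the operating system
  state_file.fsync           # optionally ask the OS to write it through to disk
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "example.position")
  File.open(path, File::RDWR | File::CREAT) do |state_file|
    save_position(state_file, 1024)
    # A second handle can read the value while the state file is still open
    puts File.read(path)
  end
end
```

Without the flush, the offset could still be sitting in Ruby's buffer when the program is interrupted, and the state file would then be stale.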
For example, here's a simple state-tracking class that works on a line-by-line basis:
    class ProgressTracker
      def initialize(filename)
        @filename = filename
        @file = File.open(@filename)

        # The state file sits next to the input, e.g. ".data.log.position"
        @state_filename = File.expand_path(".#{File.basename(@filename)}.position", File.dirname(@filename))

        if File.exist?(@state_filename)
          @state_file = File.open(@state_filename, File::RDWR)
          resume!
        else
          @state_file = File.open(@state_filename, File::RDWR | File::CREAT)
        end
      end

      def each_line
        @file.each_line do |line|
          # Record the offset just past this line before handing it to the block
          mark_position!
          yield(line) if block_given?
        end
      end

      protected

      def mark_position!
        @state_file.rewind
        @state_file.puts(@file.pos)
        # Commit the new offset before any code that might be interrupted runs
        @state_file.flush
      end

      def resume!
        # gets returns nil on an empty state file, where readline would raise EOFError
        if (position = @state_file.gets)
          @file.seek(position.to_i)
        end
      end
    end
You use it with an IO-like block call:
    test = ProgressTracker.new(__FILE__)
    n = 0
    test.each_line do |line|
      n += 1
      puts "%3d %s" % [n, line]
      if n == 10
        raise 'terminate'
      end
    end
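In this case, the program reads its own source and stops after ten lines because of a simulated error. On the second run it picks up where it left off and displays the next ten lines, if there are that many, or simply exits if there's no additional data to retrieve. Because the position is saved before the block runs, a line that was being processed when the interruption occurred is not retried.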
One caveat is that you need to remove the .position file associated with the input data if you want the file to be reprocessed, or if the file has been reset. You also can't edit the file to remove earlier lines, because that throws off the offset tracking. So long as you only append data to the file, or remove the state file whenever you start the file over, everything will be fine.
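One way to soften the reset caveat is to sanity-check the saved offset against the current file size before seeking. The sketch below assumes a hypothetical helper named resume_offset that is not part of the class above. It only catches the case where the file is now shorter than the saved offset; it cannot detect a file that was restarted and has already grown past the old offset, or one whose earlier lines were edited.

```ruby
require "tmpdir"

# Hypothetical helper: returns the offset to resume from, or 0 when there is
# no usable saved position or the data file is now shorter than that position
# (which suggests it was truncated or replaced).
def resume_offset(data_path, state_path)
  return 0 unless File.exist?(state_path)

  # gets returns nil for an empty state file, and nil.to_i is 0
  saved = File.open(state_path) { |f| f.gets }.to_i
  saved <= File.size(data_path) ? saved : 0
end

Dir.mktmpdir do |dir|
  data  = File.join(dir, "input.log")
  state = File.join(dir, ".input.log.position")

  File.write(data, "one\ntwo\nthree\n")   # 14 bytes
  File.write(state, "8\n")
  puts resume_offset(data, state)          # prints 8: offset is still inside the file

  File.write(data, "new\n")               # file restarted, now only 4 bytes
  puts resume_offset(data, state)          # prints 0: falls back to the beginning
end
```

A check like this would run in place of the unconditional seek when resuming; removing the state file by hand remains the reliable way to force a full reprocess.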