How do I require external libraries when running Amazon EMR streaming jobs written in Ruby?

I've defined my mapper, and am getting this output in my logs:

/mnt/var/lib/hadoop/mapred/taskTracker/jobcache/job_201008110139_0001/attempt_201008110139_0001_m_000000_0/work/./mapper_stage1.rb: line 1: require: command not found

My first reaction is that either the streaming jar doesn't realize it's executing a Ruby script (I've got a shebang declaration at the top of the script pointing to /usr/bin/ruby), or there's something funky going on with the way the streaming API handles references to external libraries.
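For reference, here is a minimal sketch of the layout I'm describing, not the actual mapper_stage1.rb: the shebang as the very first line, then the `require`, then the mapping logic. The `json` library, the `map_line` and `run_mapper` names, and the word-count logic are all placeholders assumed for illustration.

```ruby
#!/usr/bin/ruby
# Hypothetical stand-in for mapper_stage1.rb (illustration only).
# The shebang above is the first line of the file, with nothing before it.

require 'json' # placeholder for whichever external library is needed

# Turn one line of input into "word<TAB>1" pairs (placeholder logic).
def map_line(line)
  line.split.map { |word| "#{word.downcase}\t1" }
end

# Read lines from `input` and write the mapped pairs to `output`.
def run_mapper(input, output)
  input.each_line do |line|
    map_line(line).each { |pair| output.puts pair }
  end
end

# The real streaming script would end by calling:
#   run_mapper(STDIN, STDOUT)
```

The mapping logic sits in a method so it can be exercised locally with a `StringIO` before submitting the job; the script itself would finish with the `run_mapper(STDIN, STDOUT)` call shown in the last comment.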

Thanks in advance!

Isaac