I have a large set of text files in an S3 directory. For each text file, I want to run a function (an executable loaded through bootstrapping) and write the result to a text file with the same name in an S3 output directory. So my MapReduce job has no obvious reducer step.
I have tried using NONE as my reducer, but the output directory fills with files like part-00000, part-00001, etc. There are more of these than there are files in my input directory; each part- file holds only a processed fragment of an input file.
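A rough sketch of why the part- file count can exceed the input file count, assuming a map-only job writes one part- file per map task and each input file is cut into fixed-size splits. The function name, the file sizes, and the 64 MiB split size are all hypothetical, and the arithmetic ignores framework details such as slack on the last split:

```python
import math

MIB = 1024 * 1024


def count_part_files(file_sizes, split_size):
    """Estimate the number of part- files from a map-only job.

    Assumes one map task (and so one part- file) per input split,
    and at least one split per file, even for tiny files.
    """
    return sum(max(1, math.ceil(size / split_size)) for size in file_sizes)


# Hypothetical input sizes in bytes: three files of 10, 100 and 200 MiB.
sizes = [10 * MIB, 100 * MIB, 200 * MIB]

# With an assumed 64 MiB split size this gives 1 + 2 + 4 splits.
print(count_part_files(sizes, 64 * MIB))
```

Under these assumptions three input files produce seven part- files, which matches the pattern of each part- file covering only a fragment of one input.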
Any advice is appreciated.