ansaurus

Question

Answer 1

A:

Each reducer uses its own OutputFormat instance to write records, which is why you are getting a set of odd and even files per reducer. This is by design, so that the reducers can write in parallel.

If you want just a single odd and single even file, you'll need to set mapred.reduce.tasks to 1. But performance will suffer, because all the mappers will be feeding into a single reducer.
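To make the relationship between reducer count and file count concrete, here is a small plain-Java model. It does not use the Hadoop API. It assumes integer keys, the formula used by Hadoop's default HashPartitioner, and the old-API `part-NNNNN` leaf naming. The `odd/` and `even/` prefixes and the class name `ReducerOutputModel` are hypothetical, chosen only for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

public class ReducerOutputModel {
    /** Mirrors the default HashPartitioner formula: (hash & Integer.MAX_VALUE) % numReducers. */
    public static int partition(int key, int numReducers) {
        return (Integer.hashCode(key) & Integer.MAX_VALUE) % numReducers;
    }

    /** Returns the distinct output file names that the given keys would produce. */
    public static Set<String> outputFiles(int[] keys, int numReducers) {
        Set<String> files = new TreeSet<>();
        for (int key : keys) {
            // Hypothetical naming: one directory per category.
            String category = (key % 2 == 0) ? "even" : "odd";
            // Each reducer writes its own leaf file, named after its partition number.
            String leaf = String.format("part-%05d", partition(key, numReducers));
            files.add(category + "/" + leaf);
        }
        return files;
    }

    public static void main(String[] args) {
        int[] keys = {1, 2, 3, 4, 5, 6};
        System.out.println(outputFiles(keys, 3)); // three reducers
        System.out.println(outputFiles(keys, 1)); // single reducer
    }
}
```

With three reducers the model yields three odd and three even files. With one reducer it yields exactly one of each, at the cost of funnelling all records through a single task.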

Another option is to change the process that reads these files so that it accepts multiple input files, or to write a separate process that merges the files together.
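A separate merge step could look roughly like the sketch below. It assumes the part files have already been copied to a local directory and that all files of one category share a file-name prefix. The class name `PartFileMerger` and the prefix convention are hypothetical, and the target file should live outside the source directory so that it is not picked up as an input.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PartFileMerger {
    /**
     * Concatenates every regular file in dir whose name starts with prefix
     * into target, in sorted path order. Returns the number of files merged.
     */
    public static int merge(Path dir, String prefix, Path target) throws IOException {
        List<Path> parts = new ArrayList<>();
        // Collect the matching part files first so the order is deterministic.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(
                dir, p -> p.getFileName().toString().startsWith(prefix))) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    parts.add(p);
                }
            }
        }
        Collections.sort(parts);
        // Append each part to the target, byte for byte.
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path p : parts) {
                Files.copy(p, out);
            }
        }
        return parts.size();
    }
}
```

Calling `PartFileMerger.merge(dir, "odd-", target)` once per category would leave one merged file for each. This runs outside MapReduce, so it only fits if a post-processing step is acceptable.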

bajafresh4life 2010-08-16 13:01:38

Instead of changing mapred.reduce.tasks, I overrode the generateFileNameForKeyValue() function, and this worked. Thanks.

echo 2010-08-19 03:50:40

Answer 2

A:

Multiple output files are generated, one set per reducer.

You can use hadoop dfs -getmerge to merge the outputs.

Harsha Hulageri 2010-08-17 07:17:58

Thanks :) but I need to do this with MapReduce only.

echo 2010-08-19 03:51:23

MultipleOutputFormat in hadoop
