I'm learning Apache Hadoop and was looking at the WordCount example, org.apache.hadoop.examples.WordCount. I understand the example, but I can see that the variable LongWritable key is never used in

(...)
public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output,
                Reporter reporter) throws IOException {
  String line = value.toString();
  StringTokenizer itr = new StringTokenizer(line);
  while (itr.hasMoreTokens()) {
    word.set(itr.nextToken());
    output.collect(word, one);
  }
}
(...)

What is the use of this variable? Could someone give me a simple example of where it would be used? Thanks.

+1  A: 

I could be wrong (I have read map/reduce tutorials but haven't used them in real projects yet), but in general I think it is the identifier of the input entry; for example, the tuple (file name, line number). In this particular case it is presumably the line number, which is of no interest for word counts. It could be used if the idea was to, say, aggregate word counts per line rather than per file (or across multiple files, if the key contained that information).

StaxMan
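The per-line aggregation idea above can be sketched in plain Java (no Hadoop on the classpath; the `PerLineWordCount` class, its `map` method, and the sample input are invented for illustration — in a real job the key would arrive as a `LongWritable` and results would go through `output.collect`):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class PerLineWordCount {
    // Simulates a map() call that actually uses the input key (the byte
    // offset of the line) so counts are aggregated per line, not per file.
    static void map(long key, String value, Map<String, Integer> output) {
        StringTokenizer itr = new StringTokenizer(value);
        while (itr.hasMoreTokens()) {
            // Prefix each word with the line's key so a reducer would
            // group counts per (line, word) rather than per word.
            String composite = key + ":" + itr.nextToken();
            output.merge(composite, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        String[] lines = {"hello world", "hello hadoop"};
        Map<String, Integer> output = new LinkedHashMap<>();
        long offset = 0;
        for (String line : lines) {
            map(offset, line, output);
            offset += line.getBytes().length + 1; // +1 for the '\n'
        }
        System.out.println(output);
        // {0:hello=1, 0:world=1, 12:hello=1, 12:hadoop=1}
    }
}
```

Because the key is folded into the output key, "hello" on line 0 and "hello" on line 12 stay separate, which is exactly what per-line (rather than per-file) counting requires.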
+1  A: 

When the InputFormat is TextInputFormat, the Key is the bytes offset from the beginning of the current input file.

Value is simply the line of text at that offset.

If SequenceFileInputFormat were used, the Key would be whatever was stuffed into the Key position of the record, and the same for the Value.

The bottom line is that the Key/Value types depend on the input format (text, sequence file, etc.).

cwensel
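The byte-offset keys described in the answer above can be illustrated without Hadoop at all (the `TextInputFormatKeys` class, its `keyValuePairs` method, and the file contents are invented for this sketch; it mimics what TextInputFormat would hand to the mapper):

```java
import java.util.ArrayList;
import java.util.List;

public class TextInputFormatKeys {
    // For each line, records the pair (byte offset of line start, line text),
    // which is what TextInputFormat presents as (key, value) to a mapper.
    static List<String> keyValuePairs(String fileContents) {
        List<String> pairs = new ArrayList<>();
        long offset = 0;
        for (String line : fileContents.split("\n")) {
            pairs.add(offset + " -> " + line);
            offset += line.getBytes().length + 1; // +1 for the '\n' delimiter
        }
        return pairs;
    }

    public static void main(String[] args) {
        String contents = "first line\nsecond line\nthird";
        for (String pair : keyValuePairs(contents)) {
            System.out.println(pair);
        }
        // 0 -> first line
        // 11 -> second line
        // 23 -> third
    }
}
```

Note that the keys jump by the byte length of each line plus its newline, not by one per line, which is why they are offsets rather than line numbers.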