Hi,

I'm a .NET programmer doing some Hadoop work in Java and I'm kind of lost here. In Hadoop I am trying to set up a Map-Reduce job where the output key of the Map job is of the type Tuple<IntWritable,Text>. When I set the output key using setOutputKeyClass as follows

JobConf conf2 = new JobConf(OutputCounter.class);
conf2.setOutputKeyClass(Tuple<IntWritable,Text>.class);

I get a whole bunch of errors because generics and the ".class" notation don't seem to fly. The following works fine though

JobConf conf2 = new JobConf(OutputCounter.class);
conf2.setOutputKeyClass(IntWritable.class);

Anyone have any pointers on how to set the output key class?

Cheers, Jurgen

+3  A: 

In Java, generic type arguments are erased at compile time, so there is no separate class literal for Tuple<IntWritable,Text>. The best you can do is:

 conf2.setOutputKeyClass(Tuple.class);

If you can, subclass Tuple so that the type arguments are retained at runtime:

 public class IntWritableTextTuple extends Tuple<IntWritable, Text> {}

And then use that as your parameter to setOutputKeyClass.
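
To make the erasure point concrete, here is a minimal sketch rather than real Hadoop code. `ErasureDemo`, the helper `superclassTypeArguments`, and the empty `IntWritable`, `Text`, and `Tuple` classes are hypothetical stand-ins so the example compiles without Hadoop on the classpath. It shows that two differently parameterized `Tuple` instances share one runtime class, while the empty subclass records its type arguments where reflection can read them.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class ErasureDemo {
    // Hypothetical stand-ins for the Hadoop and user types (assumption: not the real classes).
    static class IntWritable {}
    static class Text {}
    static class Tuple<A, B> {}

    // The empty subclass from the answer: its class signature fixes the type arguments.
    static class IntWritableTextTuple extends Tuple<IntWritable, Text> {}

    // Returns the type arguments recorded on the generic superclass of c,
    // or an empty array if the superclass is not parameterized.
    static Type[] superclassTypeArguments(Class<?> c) {
        Type t = c.getGenericSuperclass();
        if (t instanceof ParameterizedType) {
            return ((ParameterizedType) t).getActualTypeArguments();
        }
        return new Type[0];
    }

    public static void main(String[] args) {
        // Erasure: both instances have the same runtime class, Tuple.
        Tuple<IntWritable, Text> a = new Tuple<>();
        Tuple<String, String> b = new Tuple<>();
        System.out.println(a.getClass() == b.getClass());

        // The subclass keeps IntWritable and Text in its class metadata.
        for (Type t : superclassTypeArguments(IntWritableTextTuple.class)) {
            System.out.println(t.getTypeName());
        }
    }
}
```

Whether Hadoop itself makes any use of the retained type arguments is not something this sketch establishes; it only shows that `IntWritableTextTuple.class` is a distinct class literal you can pass where a `Class` is expected.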

Note: I know nothing about Hadoop, so this may not make sense there, but in general this is what you do with Java generics.

Yishai
Yeah, that's the best I could come up with, too.
Michael Myers
I believe extending a generic type with an empty body, as you show in `IntWritableTextTuple`, just to bypass such errors is discouraged.
Hemal Pandya
I think that discouraging a pattern is fine - if you present an alternative. Here there is no alternative, so what else can you do?
Yishai