Hi,

We can provide input files to the mapper with:

FileInputFormat.setInputPaths(conf, inputPath);

Is it possible to pass a reference to an in-memory object instead, for example a DOM tree built by a DOM parser from an XML file, as the input to the mapper function of the Hadoop framework?

What other possibilities are there?

Thanks, L

A: 

No, you can't specify memory (RAM) based information.

The reason is that Hadoop applications are generally distributed over many physically separate machines, so a reference to memory on one machine means nothing on another. The current version of Hadoop "only" supports distributing data through HDFS, which is a file system.

What you can do is run the DOM parser as a preprocessing step inside your mapper and simply specify your XML file as the input. The easiest way to do that is to create your own subclass of FileInputFormat.
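As a rough sketch of that preprocessing step, the helper below builds a DOM tree from the raw bytes of one input record using only the JDK's XML parser. The class name XmlRecordParser and its methods are hypothetical, not part of Hadoop, and the Hadoop classes are left out on purpose so the snippet stands alone. In a real job you would call something like this from your mapper or from the record reader of your own FileInputFormat subclass, on each task's machine.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper (not a Hadoop class): turns the bytes of one XML
// record into a DOM tree on whichever machine runs the map task.
final class XmlRecordParser {
    private XmlRecordParser() {}

    // Parses a complete XML document held in memory and returns its DOM tree.
    // Parser failures are rethrown as IOException, the checked exception that
    // Hadoop's map methods already declare.
    static Document parse(byte[] xmlBytes) throws IOException {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new ByteArrayInputStream(xmlBytes));
        } catch (ParserConfigurationException | SAXException e) {
            throw new IOException("Could not parse XML record", e);
        }
    }

    // Example of work a mapper might do with the tree:
    // count the elements that have the given tag name.
    static int countElements(Document doc, String tagName) {
        return doc.getElementsByTagName(tagName).getLength();
    }
}
```

The point is that the DOM tree is built where the data is read, so only the file, never the in-memory tree, has to travel between machines. This assumes each record handed to the helper is a complete, well-formed XML document; an XML file that HDFS splits across blocks would need an input format that does not split it.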

HTH

Niels Basjes
I have been doing exactly that. I just wanted to make sure that Hadoop takes only files as input.
lam
Thank you very much for the answer.
lam