Hi all,

This is one of my first tries with MapReduce on AWS through its Management Console. I have uploaded to AWS S3 my runnable jar, developed on Hadoop 0.18, which works on my local machine. As described in the documentation, I passed the S3 paths for input and output as arguments to the jar: so far so good. The problem is the third argument, another path (as a string) to a file that I need to load while the job is executing. That file resides on an S3 bucket too, but my jar doesn't seem to recognize the path, and I get a FileNotFoundException when it tries to load it. That is strange, because this path looks exactly like the other two...

Anyone have any idea?

Thank you

Luca

A: 

Hi Luca,

This is a problem with AWS; please check Lesson 2 at http://meghsoft.com/blog/. See if you can use FileSystem.get(uri, conf) to obtain a file system that supports your path.
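A minimal sketch of that idea, assuming a hypothetical s3n:// path and a plain Hadoop Configuration (the bucket and file names are placeholders):

    import java.io.InputStream;
    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class S3ReadSketch {
        public static void main(String[] args) throws Exception {
            String s3Path = "s3n://my-bucket/my-file"; // placeholder path
            Configuration conf = new Configuration();
            // FileSystem.get(URI, conf) resolves the file system from the
            // URI scheme (here S3), instead of falling back to the default
            // (HDFS or local) file system that a bare path string would use.
            FileSystem fs = FileSystem.get(URI.create(s3Path), conf);
            InputStream in = fs.open(new Path(s3Path));
            try {
                // ... read the file here ...
            } finally {
                in.close();
            }
        }
    }

The key point is that FileSystem.get(uri, conf) looks at the scheme of the URI, so an S3 path is handed to the S3 file system rather than to the default one.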

Hope this helps.

Sonal

Sonal, thank you for your suggestion. I think this is the right way. Moreover, I found that an input stream instead of a string path can be enough for my needs. I tried:

    classifierPath = args[2];
    FileSystem inputFS = FileSystem.get(URI.create(classifierPath), conf);
    ObjectInputStream objectClassifierStream =
            new ObjectInputStream(inputFS.open(new Path(classifierPath)));
    loadedClassifier = CRFClassifier.getClassifier(objectClassifierStream);

but I'm still having a problem when I load objectClassifierStream: a "Bad Header" exception. Maybe I made a mistake in my code? Thank you.
zero51
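A possible cause, offered only as an assumption: Stanford's serialized CRF classifiers are usually gzip-compressed (.ser.gz files), and ObjectInputStream rejects the gzip magic bytes with an "invalid stream header" style error. A minimal sketch that decompresses the stream first, reusing the names from the snippet above (this would sit inside a method that declares throws Exception):

    import java.io.ObjectInputStream;
    import java.net.URI;
    import java.util.zip.GZIPInputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import edu.stanford.nlp.ie.crf.CRFClassifier;

    // Assumption: classifierPath points at a gzipped, serialized classifier on S3.
    String classifierPath = args[2];
    Configuration conf = new Configuration();
    FileSystem inputFS = FileSystem.get(URI.create(classifierPath), conf);
    // Unzip the raw S3 stream before handing it to ObjectInputStream.
    ObjectInputStream objectClassifierStream = new ObjectInputStream(
            new GZIPInputStream(inputFS.open(new Path(classifierPath))));
    CRFClassifier loadedClassifier = CRFClassifier.getClassifier(objectClassifierStream);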
A: 

Sonal,

thank you for your suggestion. In the end, the solution was to use the DistributedCache.

By loading the file into the cache before running the job, I can access everything I need inside the Map class: I override the configure method and take the file from the distributed cache, where it has already been placed.
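For reference, a minimal sketch of that pattern against the 0.18-era mapred API; the S3 path, the key/value types, and the classifier-loading step are all placeholders:

    import java.io.IOException;
    import java.net.URI;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    // In the driver, before submitting the job, register the S3 file so the
    // framework copies it to every task node:
    //   DistributedCache.addCacheFile(
    //       URI.create("s3n://my-bucket/classifier.ser.gz"), jobConf);

    public class ClassifierMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {

        private Path classifierFile;

        @Override
        public void configure(JobConf job) {
            try {
                // Cached files show up as local paths on the task node.
                Path[] cached = DistributedCache.getLocalCacheFiles(job);
                if (cached != null && cached.length > 0) {
                    classifierFile = cached[0];
                    // ... load the classifier from classifierFile here ...
                }
            } catch (IOException e) {
                throw new RuntimeException("Could not read distributed cache", e);
            }
        }

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            // ... use the loaded classifier on each record ...
        }
    }

Because configure runs once per task before any map calls, the file is read a single time per task rather than once per record.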

Thank you,

Luca

zero51