Hello,

I learnt Hadoop a few months back and managed to do a very introductory programming project with it. I now want to do a small to medium-sized project, or a series of small programming assignments, with Hadoop. I have seen a lot of ideas around, but nothing that can be finished in about 60-70 hours of work; I need a fairly small-scale project because I want to do it in my spare time alongside other studies. Most project ideas I have seen are large enough to run for 2-3 months. My main objective is to develop good expertise in programming in the Hadoop environment, not to do research or solve a specific problem. I see Hadoop being used a lot with web services, so maybe that would be an interesting track for small projects.

Thank you in advance.

Regards, Aditya

A: 

I recently came across the Cloud9 MapReduce library developed by UMD. Judging by your previous question about NLP, I thought this might be of use to you.

There are some exercises at the bottom of the page which might provide you with a place to start.

For some information on text processing using MapReduce, you could take a look at this draft book, linked to by its author, Jimmy Lin:

Data-Intensive Text Processing with MapReduce

Binary Nerd