Hi,
I tried printing values with System.out.println(), but they don't appear on the console. How do I print values from a Hadoop map/reduce application for debugging purposes?
Thanks, Deepak.
Lines printed with System.out.println and System.err.println don't go to the console of the machine that submitted the job. They go to the task logs (stdout and stderr) that Hadoop writes for each job. You can view them in the web UI by clicking through to the individual tasks.
See this blog post from Cloudera for a lot of advice on testing and debugging Hadoop jobs: http://www.cloudera.com/blog/2009/07/advice-on-qa-testing-your-mapreduce-jobs/
The page @SquareCog points to is a very good source of information on debugging a MapReduce job once you are running it on a cloud.
Before you reach that point, though, you should consider writing unit tests for your mappers and reducers, so you can verify that the basic logic works. If you want to test-drive your map and reduce logic, check out MRUnit, which works in a similar fashion to JUnit.
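To illustrate the idea of verifying the basic logic before running on a cluster, here is a minimal sketch that keeps the map and reduce logic in plain static methods, so it can be checked without Hadoop at all. This is not MRUnit and not anyone's actual job code: the class name WordCountLogic, the methods tokenize and sum, and the word-count task itself are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: names are illustrative, not part of Hadoop or MRUnit.
class WordCountLogic {

    // Map-side logic: split one input line into lower-cased words.
    // A real Mapper.map() could call this and emit (word, 1) per element.
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) { // a blank line yields one empty token
                words.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return words;
    }

    // Reduce-side logic: add up the counts collected for one key.
    static int sum(Iterable<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }
}
```

With the logic isolated like this, an ordinary JUnit test (or a few checks in a main method) can call tokenize and sum directly, and the Mapper and Reducer classes shrink to thin wrappers around them.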
I'd like to add that you can use Counters (see the Hadoop JavaDoc) for debugging as well, e.g. if you want to check how often a certain line in your program is reached.
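As a rough sketch of the counter approach, the mapper below increments a counter each time it hits a particular branch. It assumes the newer org.apache.hadoop.mapreduce API and needs the Hadoop client libraries on the classpath; with the older org.apache.hadoop.mapred API the counter would be incremented through the Reporter instead. The class name DebugCountingMapper, the counter group "Debug", and the counter name "EMPTY_LINES" are made up for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper: class, group, and counter names are illustrative.
public class DebugCountingMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        String text = line.toString().trim();
        if (text.isEmpty()) {
            // Count how often this branch is taken; Hadoop sums the value
            // across all tasks and reports it with the job's counters.
            context.getCounter("Debug", "EMPTY_LINES").increment(1);
            // Written to this task attempt's stderr log, not the console.
            System.err.println("Empty line at byte offset " + offset.get());
            return;
        }
        for (String token : text.split("\\s+")) {
            word.set(token);
            context.write(word, ONE);
        }
    }
}
```

Counters suit "how often did this happen" questions because they stay cheap on large inputs, whereas a println per record can bloat the task logs.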