We are running Hadoop on an Amazon EC2 cluster. We start the master and the slaves, attach the EBS volumes, and then wait for the Hadoop JobTracker, TaskTracker, etc. to start, with a timeout of 3600 seconds. About 50% of the time, the JobTracker does not start before the timeout. The reason is that HDFS has not initialized properly and is still in safe mode, so the JobTracker cannot start. I also noticed a few connectivity issues between nodes on EC2 when I tried pinging the slaves manually.

Has anyone faced a similar issue and knows how to solve it?

A: 

I'm not sure whether this issue is related to Amazon EC2. I ran into this problem very often too, even though I only had a pseudo-distributed installation on my machine.

In these cases I could safely turn safe mode off manually.
Try this command: bin/hadoop dfsadmin -safemode leave

I don't think you can do any harm here. It seems to be a buggy feature of Hadoop. I used 0.18.3; what version are you running?
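
If you would rather not force safe mode off straight away, a startup script could poll the NameNode first and only fall back to a manual leave after a deadline. The following is a rough sketch, not a tested recipe for your cluster. It assumes that `dfsadmin -safemode get` prints a line containing "OFF" once safe mode has ended; the function name `wait_for_safemode_off` and the `HADOOP_CMD` variable are made up for illustration.

```shell
#!/bin/sh
# Poll HDFS until the NameNode reports that safe mode is off.
# HADOOP_CMD is a hypothetical override for the hadoop launcher path;
# it defaults to bin/hadoop as used in the command above.
# Usage: wait_for_safemode_off <timeout_secs> <interval_secs>
# interval_secs must be greater than 0, otherwise the loop never ends.
wait_for_safemode_off() {
  timeout_secs=$1
  interval_secs=$2
  elapsed=0
  while [ "$elapsed" -lt "$timeout_secs" ]; do
    # Assumption: the status line contains "OFF" when safe mode has ended.
    status=$(${HADOOP_CMD:-bin/hadoop} dfsadmin -safemode get 2>/dev/null)
    case "$status" in
      *OFF*) return 0 ;;
    esac
    sleep "$interval_secs"
    elapsed=$((elapsed + interval_secs))
  done
  # Still in safe mode (or the NameNode was unreachable) when time ran out.
  return 1
}
```

It could then be called as `wait_for_safemode_off 600 10 || bin/hadoop dfsadmin -safemode leave` before starting the JobTracker. As far as I know, `dfsadmin -safemode wait` also blocks until safe mode ends, but it has no timeout of its own, which is why the sketch polls instead.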

Peter Wippermann
In our case, I actually pinged the Amazon EC2 instance that HDFS was having trouble connecting to, and the ping failed. So I am concerned that this is an Amazon issue. I am running Hadoop 0.20.1.
Algorist