Is it possible to add new nodes to Hadoop after it has started? I know that you can remove nodes (since the master keeps tabs on each node's state).

+3  A: 

You can add new nodes by just booting up a new one with a proper hadoop-site.xml (one that points back to the master's namenode and jobtracker).
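As a rough illustration of what "points back to the master" means, here is a minimal sketch that renders such a file. It assumes the classic `fs.default.name` and `mapred.job.tracker` properties; the helper name `render_hadoop_site`, the host name, and the ports are hypothetical, so adjust them to your cluster and Hadoop version.

```python
import xml.etree.ElementTree as ET


def render_hadoop_site(namenode_uri, jobtracker_addr):
    """Return hadoop-site.xml text pointing a worker node at the given master."""
    root = ET.Element("configuration")
    for name, value in (
        ("fs.default.name", namenode_uri),        # URI of the namenode
        ("mapred.job.tracker", jobtracker_addr),  # host:port of the jobtracker
    ):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical master host and ports; substitute your own.
    print(render_hadoop_site("hdfs://master.example.com:9000",
                             "master.example.com:9001"))
```

With that file in the new node's conf directory, starting the datanode and tasktracker daemons on it should be enough for the master to pick the node up.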

That said, removing nodes is a really bad idea unless you rebalance your HDFS file blocks first, so that you don't remove every replica of a given block.

If you drop three datanodes, you could lose all the replicas of a given block (with a replication factor of 3, the default), thus corrupting the file the block belongs to. Removing two nodes could leave you with a single replica, and that one could be corrupt (known to happen with dying disks).
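To make the arithmetic concrete, here is a small sketch that checks which blocks lose every replica when a set of nodes goes away. The block-to-node placement is made up for illustration, and the function names are hypothetical; this is not how HDFS tracks blocks internally.

```python
def surviving_replicas(block_locations, removed_nodes):
    """Map each block id to the set of nodes still holding a replica."""
    removed = set(removed_nodes)
    return {block: set(nodes) - removed
            for block, nodes in block_locations.items()}


def lost_blocks(block_locations, removed_nodes):
    """Return the sorted block ids with no replica left after the removal."""
    remaining = surviving_replicas(block_locations, removed_nodes)
    return sorted(block for block, nodes in remaining.items() if not nodes)


if __name__ == "__main__":
    # Made-up placement: three blocks, replication factor 3, five datanodes.
    placement = {
        "blk_1": {"dn1", "dn2", "dn3"},
        "blk_2": {"dn2", "dn3", "dn4"},
        "blk_3": {"dn3", "dn4", "dn5"},
    }
    # Dropping three nodes wipes out every replica of blk_1.
    print(lost_blocks(placement, ["dn1", "dn2", "dn3"]))  # ['blk_1']
    # Dropping two nodes leaves blk_1 with a single replica on dn3.
    print(surviving_replicas(placement, ["dn1", "dn2"])["blk_1"])  # {'dn3'}
```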

cwensel
Well, I want to build a cluster that may have unreliable clients (network connection, etc.).
monksy
You're right, Hadoop isn't made for dynamic scaling; it's meant for fixed clusters.
monksy
+1  A: 

I think as long as you don't use them as datanodes, you wouldn't have an issue... Of course, the data locality aspect of Hadoop is gone at that point.

tobiassp