I am working on a search engine design that is meant to run in the cloud. We have just started and don't know much about Hadoop yet. Can anyone tell me whether HBase, MapReduce and HDFS can all work on a single machine that has Hadoop installed and running on it?
Yes. In my development environment, I run:
- NameNode (HDFS)
- SecondaryNameNode (HDFS)
- DataNode (HDFS)
- JobTracker (MapReduce)
- TaskTracker (MapReduce)
- Master (HBase)
- RegionServer (HBase)
- QuorumPeer (ZooKeeper - needed for HBase)
In addition, I run my applications, and map and reduce tasks launched by the task tracker. Running so many processes on the same machine results in a lot of contention for CPU cores, memory, and disk I/O, so it's definitely not great for high performance, but there is no limitation other than the amount of resources available.
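As a rough way to confirm that all of the daemons listed above are actually up on one box, here is a minimal Python sketch that parses the output of the JDK's `jps` tool. The helper names (`parse_jps`, `missing_daemons`, `jps_output`) are made up for this example, and the process names `jps` prints for HBase and ZooKeeper (`HMaster`, `HRegionServer`, `HQuorumPeer` / `QuorumPeerMain`) are an assumption that may differ between versions.

```python
import subprocess

# Labels follow the list above. The names on the right are what `jps`
# typically prints for each daemon; treat them as assumptions, since they
# can vary with the Hadoop/HBase/ZooKeeper version in use.
EXPECTED = {
    "NameNode (HDFS)": ("NameNode",),
    "SecondaryNameNode (HDFS)": ("SecondaryNameNode",),
    "DataNode (HDFS)": ("DataNode",),
    "JobTracker (MapReduce)": ("JobTracker",),
    "TaskTracker (MapReduce)": ("TaskTracker",),
    "Master (HBase)": ("HMaster",),
    "RegionServer (HBase)": ("HRegionServer",),
    "QuorumPeer (ZooKeeper)": ("HQuorumPeer", "QuorumPeerMain"),
}


def parse_jps(output):
    """Return the set of process names from jps output ("<pid> <name>" per line)."""
    names = set()
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            names.add(parts[1])
    return names


def missing_daemons(output, expected=EXPECTED):
    """Return the sorted labels of expected daemons that are not in the jps output."""
    running = parse_jps(output)
    return sorted(
        label
        for label, accepted in expected.items()
        if not any(name in running for name in accepted)
    )


def jps_output():
    """Run `jps` (must be on PATH) and return its standard output."""
    result = subprocess.run(["jps"], capture_output=True, text=True, check=True)
    return result.stdout
```

With the daemons started, `missing_daemons(jps_output())` should come back as an empty list; anything it returns is a daemon that did not show up.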
Yes, you can. You can even create a virtual machine and run everything inside it on a single "computer" (which is what I have :) ).
The key is simply to install Hadoop in "pseudo-distributed mode", which is described in the Hadoop Quickstart.
If you use the Cloudera distribution, they have even packaged the configs needed for that in an RPM. Look here for more info on that.
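For a feel of what pseudo-distributed mode amounts to, here is a small Python sketch that renders the three site files with every daemon pointed at `localhost`. The property names and ports are the ones I remember from the quickstart for the older JobTracker-era releases, so treat them as assumptions and check them against the docs for your version; `render_site_xml` is just a helper name invented for this example.

```python
from xml.sax.saxutils import escape

# Assumed settings for a JobTracker-era pseudo-distributed setup:
# everything on localhost, and a replication factor of 1 because
# there is only one DataNode to hold each block.
PSEUDO_DISTRIBUTED = {
    "core-site.xml": {"fs.default.name": "hdfs://localhost:9000"},
    "hdfs-site.xml": {"dfs.replication": "1"},
    "mapred-site.xml": {"mapred.job.tracker": "localhost:9001"},
}


def render_site_xml(properties):
    """Render a dict of name -> value as a Hadoop-style <configuration> document."""
    lines = ['<?xml version="1.0"?>', "<configuration>"]
    for name, value in properties.items():
        lines += [
            "  <property>",
            f"    <name>{escape(name)}</name>",
            f"    <value>{escape(value)}</value>",
            "  </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines) + "\n"
```

Calling `render_site_xml(PSEUDO_DISTRIBUTED["hdfs-site.xml"])` gives the text you would put in that file under Hadoop's `conf` directory.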
HTH
Same here, I am running hadoop/hbase/hive on a single computer. If you really want to see distributed computing on a single computer, grab lots of RAM and some hard disk space, and go like this:
- make one or two virtual machines (use VirtualBox)
- install hadoop on each of them; make your real installation (not a virtual one) the master and the rest slaves
- configure hadoop for a real distributed environment
- now when hadoop starts, you should actually have a cluster of multiple computers (one real, the rest virtual)
This should only be treated as an experiment, because unless you have a decent multi-CPU or multi-core system, such a configuration will spend more resources on maintaining itself than it gives back in performance.
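A rough sketch of what changes when you go from one box to the host-plus-VMs layout described above, written in Python. The hostnames are hypothetical, the ports are assumed, and the property names are the JobTracker-era ones as I remember them; the idea is only that the master addresses stop being `localhost` and the `conf/slaves` file lists one worker hostname per line.

```python
def cluster_settings(master_host, worker_hosts):
    """Return the master-facing settings and the conf/slaves content for a small cluster.

    master_host:  hostname of the real machine acting as master (hypothetical)
    worker_hosts: hostnames of the virtual machines acting as slaves (hypothetical)
    """
    if not worker_hosts:
        raise ValueError("need at least one worker host")
    return {
        # Assumed property names and ports; verify against your Hadoop version.
        "fs.default.name": f"hdfs://{master_host}:9000",
        "mapred.job.tracker": f"{master_host}:9001",
        # One worker hostname per line, as the slaves file expects.
        "conf/slaves": "\n".join(worker_hosts) + "\n",
    }


# Example with made-up hostnames: one real master and two VirtualBox guests.
settings = cluster_settings("master", ["vm1", "vm2"])
```

Every machine needs to be able to resolve these hostnames (for example through `/etc/hosts`), otherwise the slaves will not find the master.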
Good luck.
--l4l