What is the closest thing to Hadoop, but in C++?
In particular, I want to do distributed computing using MapReduce.
Thanks!
Berkeley DB is a key/value database, but it does not implement MapReduce. What it does offer is replication. From the programmer reference:
Berkeley DB replication groups consist of some number of independently configured database environments. There is a single master database environment and one or more client database environments.
Source: http://www.oracle.com/technology/documentation/berkeley-db/db/programmer_reference/rep.html
The original MapReduce implementation by Google is in C++, but unfortunately it is not made available to the public. That leaves you with the following options:
I recommend giving the first one a try. It uses stdin/stdout as its interface, and works like a charm.
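To give an idea of what a stdin/stdout interface means for C++ code, here is a minimal word-count sketch. It assumes the convention Hadoop Streaming uses: one record per line, key and value separated by a tab, and reducer input already sorted by key. The function names run_mapper and run_reducer are made up for illustration, and taking the streams as parameters is just a choice that makes the logic easy to try locally.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Mapper: read whitespace-separated words and emit one "word<TAB>1" record per word.
void run_mapper(std::istream& in, std::ostream& out) {
    std::string word;
    while (in >> word) {
        out << word << '\t' << 1 << '\n';
    }
}

// Reducer: assumes the input is sorted by key, so equal keys are adjacent.
// Sums the counts of each run of equal keys and emits "word<TAB>total".
void run_reducer(std::istream& in, std::ostream& out) {
    std::string line;
    std::string current;
    long total = 0;
    bool have_key = false;
    while (std::getline(in, line)) {
        const std::size_t tab = line.find('\t');
        if (tab == std::string::npos) {
            continue;  // skip lines that are not "key<TAB>value"
        }
        const std::string key = line.substr(0, tab);
        const long count = std::stol(line.substr(tab + 1));  // throws on a non-numeric value
        if (have_key && key != current) {
            out << current << '\t' << total << '\n';  // key changed: flush the previous run
            total = 0;
        }
        current = key;
        have_key = true;
        total += count;
    }
    if (have_key) {
        out << current << '\t' << total << '\n';  // flush the last run
    }
}
```

If each function is wrapped in its own main that passes std::cin and std::cout, you can test the pair without any cluster using something like: cat input.txt | ./mapper | sort | ./reducer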
MongoDB (http://www.mongodb.org/display/DOCS/Home):
MongoDB (from "humongous") is a scalable, high-performance, open source, schema-free, document-oriented database. Written in C++, MongoDB features: Replication and fail-over support... MapReduce for complex aggregation...
A list of MapReduce implementations can be found at http://en.wikipedia.org/wiki/MapReduce#Implementations.
Some of these frameworks are implemented in C++. But if you are only interested in writing MapReduce applications in C++, Hadoop also has Pipes. Pipes is a library that lets you write the Mapper and Reducer code in C++.
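For a rough picture of the Mapper/Reducer split, here is a small in-process sketch in plain C++. Note that this is not the Hadoop Pipes API: WordCountMapper, WordCountReducer, Grouped and run_local are hypothetical stand-ins, and the std::map only plays the role of the shuffle step that a real framework performs across machines.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the framework's shuffle: all values emitted per key.
using Grouped = std::map<std::string, std::vector<int>>;

// Map step: split one input line into words and emit (word, 1) for each.
struct WordCountMapper {
    void map(const std::string& line, Grouped& emitted) const {
        std::istringstream words(line);
        std::string word;
        while (words >> word) {
            emitted[word].push_back(1);
        }
    }
};

// Reduce step: collapse all values collected for one key into their sum.
struct WordCountReducer {
    int reduce(const std::vector<int>& values) const {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }
};

// Local driver (an assumption for illustration): map every line,
// let std::map group the values by key, then reduce each group.
std::map<std::string, int> run_local(const std::vector<std::string>& lines) {
    Grouped grouped;
    WordCountMapper mapper;
    for (const std::string& line : lines) {
        mapper.map(line, grouped);
    }

    std::map<std::string, int> result;
    WordCountReducer reducer;
    for (const auto& entry : grouped) {
        result[entry.first] = reducer.reduce(entry.second);
    }
    return result;
}
```

With a real framework you would only write the two classes; the driver, the grouping by key and the distribution of work are what the framework provides.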
CloudStore might be worth a look, and it's written in C++.
CloudStore (KFS, previously Kosmosfs) is Kosmix's C++ implementation of Google File System. It parallels the Hadoop project, which is implemented in Java.
Sector/Sphere - a distributed file system with an integrated MapReduce framework.