views:

71

answers:

3

I'm beginning to learn some Hadoop/MapReduce, coming mostly from a PHP background, with a little bit of Java and Python.

But it seems like most MapReduce implementations out there are in Java, Ruby, C++, or Python.

I've looked, and there are some Hadoop/MapReduce implementations in PHP, but the overwhelming body of the literature seems to be dedicated to those four languages.

Is there a good reason why PHP is a second-class language in cloud computing projects like those that involve Hadoop/MapReduce? This is particularly surprising considering that, outside of the cloud computing world, PHP seems like it's the most commonly supported language, to the detriment of three of the languages above (Java, Ruby, and Python).

If this is arbitrary--if PHP is just as good at handling these operations as, say, Python, what libraries/projects should I look into?

+7  A: 

PHP is designed primarily as a language for displaying output to a browser. Most jobs being run on MapReduce/Hadoop clusters have nothing to do with displaying output.

They instead tend to lean much more heavily towards data processing. PHP is not the most commonly supported language for data processing, by far. Thus, it's logical that the most commonly supported languages for data processing-related technologies don't include PHP.
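
To make "data processing" concrete, here is a minimal sketch of the kind of job a MapReduce cluster runs: a word count, written in plain Python with no Hadoop dependency. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration and are not part of any Hadoop API; on a real cluster the framework does the grouping step and spreads the work across machines.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for what the framework does
    between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each word's list of counts into a single total."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines):
    """Run the three steps in order on an in-memory list of lines."""
    return reduce_phase(shuffle(map_phase(lines)))
```

Nothing in it produces browser output. It only reads records, groups them, and aggregates them, which is the sort of work the languages listed in the question are more commonly used for.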

Amber
This makes sense, but if that were the case, wouldn't implementing in a functional programming language, something like Scala, make more sense for a distributed data processing task?
yc
+1  A: 

The reason is PHP's lack of support for multi-threading and inter-process communication.

Artefacto
A: 

You can take a look at Doctrine MongoDB Object Document Mapper. It supports map/reduce.

Michał Pipa