For Hadoop application development, are PHP frameworks less popular? If so, why? If not, please point me to literature, documentation, or tutorials for a specific framework. (Material for Symfony would be awesome!)

+1  A: 

Basically, Hadoop lets you send your own Java classes into the cluster to be run on the appropriate servers at the appropriate times. All non-JVM language APIs for Hadoop basically involve running pre-existing Java classes, so nothing fancy is possible.

Also, Hadoop is not really an interactive framework. It is geared more toward batch processing huge amounts of data in a somewhat reasonable time. The rule of thumb is that if your sub-job processes less than 15MB in one step, you shouldn't bother with Hadoop. The overhead of setting up the job and shuffling data and results around is just going to kill you.

There is nothing stopping you from creating your own Hadoop API for PHP, similar to how it is done for Python. However, PHP is not really popular with the crowd that does large-scale data processing.
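To give a rough idea of what the Python side can look like, here is a minimal word-count sketch. It assumes the Hadoop Streaming approach, where a pre-existing Java class pipes records to an external script over stdin/stdout as tab-separated key/value lines. The function names and the "map"/"reduce" command-line argument are my own illustration, not part of any Hadoop API.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one tab-separated (word, 1) record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts for each word.

    The input must already be sorted by key, which is how a streaming
    reducer receives its records after the shuffle.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Hypothetical convention: run as "script.py map" or "script.py reduce".
# Without an argument nothing is read, so the functions can be imported.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

A PHP command-line script that reads stdin and writes the same tab-separated lines should be able to fill the same role, which is all a "PHP API" would really need to wrap.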

Vlad
Thanks, Vlad! Still, the reason why PHP is out of favour with the Hadoop community is unclear to me. For instance, there is an open-source Ruby app, trendingtopics.org, demonstrating the use of Hadoop in an application. Nobody has done anything equivalent in PHP so far! It would be nice to know whether there are more significant reasons behind this.
Well, trendingtopics.org is not actually using Ruby for the Hadoop part. Ruby is used just as a front end to the database. The database is populated by a Hadoop job, and that job is written in a combination of Hive and Python. Here is the code: http://github.com/datawrangling/trendingtopics/tree/master/lib/scripts/. It is exactly the application I had in mind when I mentioned the Python bindings.
Vlad