views:

113

answers:

2

I have been reading and hearing about cloud computing and MapReduce techniques lately. I am thinking of playing around with some algorithms to get practical experience in that field and to see what is currently possible.

Here is what I want to do: I would like to use a public cloud platform (e.g. Google App Engine, Google MapReduce, Amazon EC2, Amazon Elastic MapReduce) that comes with built-in MapReduce functionality, or, if it comes without built-in support, use an additional MapReduce Java library (e.g. Hadoop, Hive), and implement/deploy some algorithms.

Does anyone have experience in that field and can suggest a good place to start? Or name some combinations that have worked well in practice?

Thanks in advance!

+2  A: 

Amazon EC2 has some pre-bundled Hadoop AMIs. See Running Hadoop on Amazon EC2 for a tutorial.

In particular, the Cloudera distribution comes to mind; it bundles Pig and Hive as well.

danben
I read the article you referenced, very useful. I guess I will start with Hadoop, maybe use some of my local computers, and later try to deploy it to EC2. Thanks for the answers.
Nils Schmidt
+1  A: 

Apache Hadoop is a major open-source Java distributed computing framework, and it includes a MapReduce subproject that is based on the original Google MapReduce.
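To get a feel for the programming model before setting up a cluster, here is a minimal plain-Java sketch of the map/shuffle/reduce phases using word count as the example. Note this is not the actual Hadoop API (Hadoop's `Mapper`/`Reducer` interfaces and job configuration differ); the class and method names below are purely illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the MapReduce model (word count),
// NOT the Hadoop API -- all names here are hypothetical.
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all intermediate values by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new HashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the counts collected for one word.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) sum += c;
        return sum;
    }

    public static void main(String[] args) {
        String[] input = { "the quick brown fox", "the lazy dog", "the fox" };

        // Run map over every line, collecting all intermediate pairs.
        List<Map.Entry<String, Integer>> intermediate = new ArrayList<>();
        for (String line : input) {
            intermediate.addAll(map(line));
        }

        // Group by word, then reduce each group to a final count.
        Map<String, Integer> result = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(intermediate).entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }

        System.out.println("the=" + result.get("the"));  // the=3
        System.out.println("fox=" + result.get("fox"));  // fox=2
    }
}
```

In Hadoop the shuffle step is handled by the framework between the map and reduce tasks, so you only write the map and reduce functions; the rest of the structure above is what the framework does for you across many machines.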

Kaleb Brasee
Would you then also suggest using Amazon EC2 to run Hadoop on? Is it possible to deploy Hadoop-based applications to Google's App Engine?
Nils Schmidt