Hello,
I'm thinking about using Hadoop to process large text files on my existing Windows 2003 servers (about 10 quad-core machines with 16 GB of RAM each).
My questions are:
Is there a good tutorial on how to configure a Hadoop cluster on Windows?
What are the requirements? Java + Cygwin + sshd? Anything else?
Does HDFS play nicely on Windows?
I'd like to use Hadoop in streaming mode. Any advice, tools, or tricks for developing my own mappers/reducers in C#?
What do you use for submitting and monitoring the jobs?
Thanks