Do you know of any large datasets, free or low cost, to experiment with Hadoop? Any pointers or links are appreciated.
Preference:
At least one GB of data.
Production log data from a web server.
A few that I have found so far:
Also, can we run our own crawler to gather data from sites such as Wikipedia? Any pointers on how to do this are appreciated as well.