I am learning data mining and wondered how Python fares when it comes to data mining. Are there good tools for data mining in Python?

+1  A: 

The question is a bit ambiguous, but IMHO, with Python being a great all-round language, there is no reason why you can't use it for data mining.

You also have the advantage that Python/GTK could be used for displaying progress and/or output.

It also has a great range of libraries.

If you want to distribute your data mining across many servers, each with discrete data, then you could have a look at Parallel Python for distributing workloads.
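To make the idea concrete, here is a rough sketch of the split-the-work-then-merge pattern. Note that it does not use Parallel Python itself: it uses the standard library's `multiprocessing` module on a single machine as a stand-in, and the function names (`count_items`, `merge_counts`, `mine_in_parallel`) and the sample data are made up for illustration. Each partition plays the role of one server's discrete data.

```python
from collections import Counter
from multiprocessing import Pool


def count_items(transactions):
    """Count how many transactions in one partition contain each item."""
    counts = Counter()
    for transaction in transactions:
        # set() so an item repeated inside one transaction counts once
        counts.update(set(transaction))
    return counts


def merge_counts(partial_counts):
    """Combine the per-partition counts into one overall count."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def mine_in_parallel(partitions, workers=2):
    """Run count_items on each partition in its own worker process."""
    with Pool(processes=workers) as pool:
        return merge_counts(pool.map(count_items, partitions))


if __name__ == "__main__":
    # Hypothetical data: each inner list stands in for one server's data.
    partitions = [
        [["bread", "milk"], ["bread", "eggs"]],
        [["milk", "eggs"], ["bread", "milk", "eggs"]],
    ]
    print(mine_in_parallel(partitions).most_common())
```

The point is only that the per-partition step and the merge step are independent, so the same shape should carry over to a real multi-server setup, whatever library does the distributing.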

I say, in place of a DSL, why not?

Other languages such as Erlang might also be of interest (and it is all the rage so you will be down with the kids too :S )

Aiden Bell
+3  A: 

The Orange data-mining framework is written in, and extensible via, Python.

Dirk Eddelbuettel
+1, loving the retro UI on that
Aiden Bell
I checked out Orange. Pretty cool. But no updates so far for Python 2.6 and 3.x.
Goutham
+5  A: 

The book Programming Collective Intelligence might be a great start for you. It uses Python and implements a couple of interesting data mining algorithms, covering lots of ground.
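To give a flavour of the kind of algorithm involved, here is a minimal sketch of a distance-based similarity score between two users' ratings, the sort of building block a simple recommender uses. This is not the book's code: the function name `euclidean_similarity` and the ratings data are invented for illustration.

```python
from math import sqrt


def euclidean_similarity(ratings_a, ratings_b):
    """Return a score in (0, 1]; higher means more similar tastes.

    Only items rated by both users are compared. Returns 0.0 when the
    two users have no rated items in common.
    """
    shared = set(ratings_a) & set(ratings_b)
    if not shared:
        return 0.0
    distance = sqrt(sum((ratings_a[item] - ratings_b[item]) ** 2
                        for item in shared))
    # Map distance 0 -> similarity 1, large distance -> similarity near 0
    return 1.0 / (1.0 + distance)


if __name__ == "__main__":
    # Hypothetical ratings on a 1-5 scale
    alice = {"Film A": 4.0, "Film B": 2.5, "Film C": 5.0}
    bob = {"Film A": 3.5, "Film B": 3.0, "Film D": 1.0}
    print(euclidean_similarity(alice, bob))
```

Ranking every other user by a score like this and borrowing the items the closest ones rated highly is one simple way to get recommendations out of raw ratings.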

nikow
Thanks. Saw this suggested elsewhere too. Will give it a shot.
Goutham
The same author, Toby Segaran, occupation "data magnate", also has a good new book I am just starting called Programming the Semantic Web, which is more about real data mining than some of the theoretical linguistic stuff associated with the term.
bvmou
Interesting, I did not know about the new book, thanks.
nikow
A: 

While Collective Intelligence is excellent and does use Python, an actual TOOL that's ready to go is the Orange Toolkit as mentioned by user Dirk Eddelbuettel.

It gives you a visual environment within which you can build your DM process. It also gives you scriptability of all the components via Python.

ybakos