I have recently become interested in the field(s) of data mining and machine learning. The idea of going through huge datasets and trying to uncover hidden patterns and trends is fascinating. So far I have done the following:

  • Used Weka to load simple data sets and generate decision trees
  • Continuously read books, wikis, blogs and SO on the same
  • Started playing around with SQL Server DM and Python APIs
  • Have an idea of the freely available data sets on the web (freedb, UN, etc.)

What is hindering me is that the minute I try to go beyond classification/association and into priori/apriori algorithms I am stuck, because understanding mathematical equations and logic is not (to put it modestly) one of my strong points.

So my question would be: is there anybody in the data mining field (in the role of product owner or builder) who is not naturally a mathematician? If so, how would you approach understanding the field, given that free tools like Weka and RapidMiner both expect some mathematical/statistical background?

P.S.: Excuse me if I made a mistake in the question, such as mixing up data mining and analytics when they are separate; I am still getting my feet wet. I hope my core question is clear.

+2  A: 

Well, being able to do some analysis of what the data mining models are showing is absolutely vital. However, these days all of the math and statistics are taken care of by the data mining models. You don't need to understand the math behind them (although it helps).

For example, you can look through the SQL Server Analysis Services Data Mining Algorithms and see that even the technical reference covers how to use these implementations, not how to recreate them.

If you can understand the business cases and you can understand what the data mining is telling you, there's really no need to delve into the math behind it.

As for some of the free tools, I've never used them, so I can't speak to them. However, I'm a big fan of SSAS and those data mining models, which don't require an extensive mathematical background.

Eric
+1  A: 

As Eric says, as long as you only intend to use the existing algorithms and APIs and make sense of their results, I don't see a problem with the required math/statistics skill set (although you'll still need some basic prior knowledge).
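To give a rough feel for the level of math involved at that "just use it" stage: the core of an algorithm like Apriori (mentioned in the question) mostly comes down to counting how often combinations of items occur together. Below is a minimal sketch in plain Python. The function name `frequent_itemsets` and the shopping-basket data are made up for illustration, and this is not how Weka, RapidMiner or SSAS implement it.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset whose support
    (fraction of transactions containing it) is >= min_support.

    Hypothetical helper for illustration only; a simplified
    Apriori-style search, not a library API.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1

    while candidates:
        # Count step: keep candidates that occur often enough.
        level = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)

        # Join + prune step: build (k+1)-item candidates from frequent
        # k-item sets, keeping only those whose k-item subsets are all
        # frequent (the "Apriori property").
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

    return result


if __name__ == "__main__":
    # Made-up toy data: each set is one shopping basket.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, support in frequent_itemsets(baskets, 0.5).items():
        print(sorted(itemset), support)
```

The only "math" here is a fraction (support) and a threshold; the real tools add performance tricks and rule generation on top of this idea.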

Now, if you intend to do research, or if you want to improve or modify existing algorithms or (why not) create your own, then math and statistics are a MUST. I just started doing some research in this area, and I'm still trying to fill my skills gap =)

Abel Morelos