Someone has just told my boss what data mining can do for a company, such as recommendations and predictive modelling. Basically, we are a website company. I am going on leave for 6 months, so my boss said I can use that time to learn some data mining (DM) techniques, so that when I come back we can visit small shops and small companies and provide them with predictive data using data mining algorithms.

The shops will only have SQL files or CSV files containing their customer data, and possibly more.

Right now I only know MySQL and have no idea what data mining is, or whether it works the way I am imagining. I mean, if someone has a database of customers and purchases, is it possible to apply a data mining technique to it? Something like:

(raw mysql or sql data) or (csv files) ----data mining--> (some useful result)
  • 1) Is the above picture correct, or am I wrong?
  • 2) Would shops or businesses actually want that, or am I missing something?
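
To make the arrow above concrete, here is a minimal sketch of one possible "useful result": a "customers who bought X also bought Y" list computed from a CSV export. The column names (`customer_id`, `product`), the sample rows, and the function names are all made up for illustration, and simple co-purchase counting is only one basic technique, not a full data mining workflow.

```python
import csv
import io
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical export from a shop: one row per purchased item.
SAMPLE_CSV = """customer_id,product
1,bread
1,butter
1,jam
2,bread
2,butter
3,bread
3,jam
4,butter
4,milk
"""

def load_baskets(csv_file):
    """Group the purchased products by customer."""
    baskets = defaultdict(set)
    for row in csv.DictReader(csv_file):
        baskets[row["customer_id"]].add(row["product"])
    return baskets

def co_purchase_counts(baskets):
    """Count how many customers bought each ordered pair of products."""
    counts = defaultdict(Counter)
    for products in baskets.values():
        for a, b in permutations(sorted(products), 2):
            counts[a][b] += 1
    return counts

def recommend(counts, product, top_n=2):
    """Return the products most often bought together with `product`.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(counts[product].items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]

baskets = load_baskets(io.StringIO(SAMPLE_CSV))
counts = co_purchase_counts(baskets)
print(recommend(counts, "bread"))  # ['butter', 'jam']
```

With a real shop you would replace `io.StringIO(SAMPLE_CSV)` with an opened CSV file, or with rows fetched from the SQL database.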

My PLAN for learning this is in the following order. I am thinking of first getting some SQL Server 2008 certifications, because in my area most companies use Microsoft, so maybe I need to know SQL Server:

1)MCTS: SQL Server 2008, Implementation and Maintenance
2)MCTS: SQL Server 2008, Database Development
3)MCTS: SQL Server 2008, Business Intelligence Development and Maintenance

(or should I go for Oracle and Oracle data warehousing? I want to learn one database properly first)

4)Data Mining with Microsoft SQL Server 2008 (2009)
5)Python for Dummies
6)Programming Collective Intelligence: Building Smart Web 2.0 Applications

Is my order correct, or can I get to the same result in a better way? The reason I am doing the certifications is to get a solid understanding of SQL, and so that if that job does not work out after 6 months, I can move into a new job related to data mining, BI, or at least SQL Server.

Please help me

A: 

Uh, to do data mining effectively, you need to know a lot of math. Your path is like "I want to be a surgeon, so I'll learn how to cut with a scalpel". Yes, knowing some SQL is probably necessary (it depends on how your data is organized), but it is FAR from sufficient.

zvrba
Thanks for the info, but I have very solid math and did my thesis in AI, which involved statistics and probability and neural networks, although I did that in C++. Please add more detail so that I know what else to learn.
Mirage
A: 

Seems like you are doing it all wrong.

The most important thing is to learn data mining, AI and predictive modelling topics: all that hardcore math and CS stuff. Database technology is important, but it is not closely related to the data mining field.

There is one book I would recommend, and I think it is tailored to your needs: Programming Collective Intelligence.

Ngu Soon Hui
Don't you think that part will be covered when I read the Collective Intelligence book? That book is written in Python, so I need to learn that as well. Do you mean that I should go for Python first, then data mining techniques, and study SQL only if required, because I already know quite a bit of SQL? Please can you give me some idea of where to start and where to end?
Mirage
@Mirage, yes. Skip the SQL part, because it is not essential as far as data mining is concerned.
Ngu Soon Hui
When using the SQL Server Data Mining toolset you need some background in statistics, but you don't have to know adaptive / cognitive mathematics, fractals, advanced calculus etc., which you would need if you were writing your own algorithm instead of using the built-in ones. In my 6 years with Microsoft Data Mining I never had to write my own algorithm, as the available algorithms were sufficient to answer the questions I had.
Ashwani Roy
+1  A: 

OK, this is not a simple YES / NO answer. You are doing something right: this way you will get to know the SQL Server Data Mining toolset, and you will understand which algorithm to use where (how Naive Bayes differs from Decision Trees, etc.).

Once you know this stuff, the second thing is getting to know your data and how to build the FLAT tables that will serve as input. This is the most important part, because this is the data you will use to train your models. You don't need to know the internal mathematics behind the ANN algorithm and so on; you just need to know how to use it. There are data mining add-ins for Excel (2007 onwards) which you can use to play around.
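
As a rough illustration of what a FLAT table means here, the sketch below collapses order rows into one row per customer, which is the usual shape for training input. It is written in plain Python rather than with the SQL Server tools, and the column names (`customer_id`, `order_id`, `amount`), the sample data, and the derived features are assumptions made up for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical orders export: several rows per customer.
ORDERS_CSV = """customer_id,order_id,amount
1,100,20.00
1,101,35.50
2,102,12.00
3,103,8.00
3,104,7.00
3,105,15.00
"""

def flatten_orders(csv_file):
    """Collapse order rows into one feature row per customer."""
    totals = defaultdict(lambda: {"n_orders": 0, "total_spent": 0.0})
    for row in csv.DictReader(csv_file):
        stats = totals[row["customer_id"]]
        stats["n_orders"] += 1
        stats["total_spent"] += float(row["amount"])

    flat = []
    for customer_id in sorted(totals):
        stats = totals[customer_id]
        flat.append({
            "customer_id": customer_id,
            "n_orders": stats["n_orders"],
            "total_spent": round(stats["total_spent"], 2),
            "avg_order": round(stats["total_spent"] / stats["n_orders"], 2),
        })
    return flat

flat_table = flatten_orders(io.StringIO(ORDERS_CSV))
for row in flat_table:
    print(row)
```

Each resulting row describes a single customer, so a model can treat the columns as input attributes. In SQL the same shape would typically come from a `GROUP BY customer_id` query.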

There are some data mining videos on http://channel9.msdn.com by Rafal Lukawiecki. They are good for getting an idea of how to begin.

After this it is a matter of practice: the more you play with new data, build new models and analyze the results, the better you are going to become.

Let me know if you need more info on PPTs, samples, etc.

Ashwani Roy
Thanks Ashwani, that was a very good answer. You can send some PPTs and samples to [email protected]. Will making the FLAT tables you mentioned be covered in steps 1, 2, 3 that I listed, or is it not covered there? Those are the books I am thinking of studying.
Mirage
Have a look at this: http://channel9.msdn.com/posts/MSCOMSWE/All-you-need-to-know-about-Microsoft-Business-Intelligence-44/
Ashwani Roy