Hello!

I need to develop a tool for web log data mining.

Given many sequences of URLs, each requested within a particular user session (retrieved from web-application logs), I need to identify usage patterns and groups (clusters) of users of the website.

I am new to data mining and have been searching Google a lot. I found some useful information; for example, searching for "Frequent Pattern Mining in Web Log Data" seems to turn up studies that are almost exactly what I need.

So my questions are:

  1. Are there any Python-based tools that do what I need, or at least something similar?
  2. Can Orange toolkit be of any help?
  3. Can reading the book Programming Collective Intelligence be of any help?
  4. What should I Google for, what should I read, and which relatively simple algorithms would work best?

I am very limited in time (around a week), so any help would be extremely valuable. What I need is a pointer in the right direction and advice on how to accomplish the task in the shortest time.

Thanks in advance!

+1  A: 

1&2: Orange has a frequent pattern mining module. It also supports clustering.

3. I have just checked the contents of the book. There is no chapter on frequent pattern mining. Still, it is generally a good book for beginners in data mining, and you will find it useful for defining your problem precisely.

4. You need to understand the inputs and outputs of clustering and of frequent pattern mining / association rule mining. So Google these algorithms, or find a good data mining textbook to read.
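To make the "input and output" point concrete, here is a minimal sketch in plain Python (standard library only, not Orange). The session data, the function names `frequent_itemsets` and `jaccard`, and the `min_support` and `max_size` parameters are all made up for illustration. The input is one list of URLs per session; the frequent pattern output is the sets of URLs that occur together in at least `min_support` sessions. This is brute-force counting, not an optimized algorithm such as Apriori or FP-growth, so it is only reasonable for small itemset sizes.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: one list of requested URLs per user session.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products", "/checkout"],
    ["/home", "/blog"],
    ["/products", "/cart", "/checkout"],
]


def frequent_itemsets(sessions, min_support, max_size=2):
    """Count URL sets (up to max_size URLs) and keep those that
    appear in at least min_support sessions."""
    counts = Counter()
    for session in sessions:
        # Ignore order and repeats: a URL either occurs in a session or not.
        urls = sorted(set(session))
        for size in range(1, max_size + 1):
            for combo in combinations(urls, size):
                counts[combo] += 1
    return {items: n for items, n in counts.items() if n >= min_support}


def jaccard(a, b):
    """Similarity of two sessions: shared URLs divided by all URLs seen."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Output of frequent pattern mining: itemset -> number of sessions containing it.
patterns = frequent_itemsets(sessions, min_support=2)
for items, n in sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0])):
    print(n, items)

# Input to clustering: a pairwise similarity (or distance) between sessions.
print(jaccard(sessions[0], sessions[1]))  # 0.5
```

A clustering algorithm (hierarchical clustering, for example) would then take the pairwise similarities, or one minus them as distances, and group similar sessions together. Note that this sketch treats a session as an unordered set of URLs; if the order of requests matters to you, you would need sequential pattern mining instead.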

Yin Zhu
Thanks for a nice answer, Yin! I hope to get some more advice, so I am not marking your answer as accepted yet.
Igor Katson