Hi, I’m preparing my graduation project in computer science. I built this website and it’s running perfectly, but my supervisor asked me to apply data mining to it, and I don’t understand what I should do. The website is a social network: each user has a profile and a blog, plus access to some e-books that require registration to download. The website also contains a music server where a registered user can download a song or add it as a favorite on his profile page. It also shows ads (I used the OpenX script). These are most of the services where I could perform data mining. The website is www.sy-stu.com.

I need ideas, and advice on the best way to present this in the interview.

A: 

I don't know exactly what kind of data you are trying to mine, but have you checked out Google Analytics? It is very easy to set up: once you register, all you need to do is include the provided JavaScript in your web pages. Google Analytics will give you plenty of statistics about your site and its visits. Is that what you need? The data produced is very easy to read as well, and I reckon it will be suitable for you to present.

Fadrian Sudaman
@Fadrian Well, I need something more specific; I already use Google Analytics. I have the e-books section, the music center and the blogs, so I'm thinking of something like a study of which category of users update their blogs frequently or download specific categories of books. But how do I apply this and show it off? I do need help with that, I've never done such a thing.
arin
Data mining is a very broad term and can involve some complex algorithms (and I'm no expert at that). It looks to me like all you need is to mine and report on specific user activities. The easiest way to go about that is to create an audit/activity log table and, as users make changes, insert a record, e.g. username, section type, activity type, date/time, etc. You can have a plain table listing, or use the time element to report on frequency, or use a graph/chart to report the distribution. A simple example would be a pie chart showing the percentage of activities by section/type in the last month.
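As a rough illustration of the activity log idea, here is a minimal sketch in Python with SQLite. The table name `activity_log`, its columns and the helper functions are all made up for this example; your site would use its own database and language. The query returns the percentage of activities per section since a given date, which is exactly the data a pie chart would need.

```python
import sqlite3


def create_activity_log(conn):
    # Hypothetical schema: one row per user action on the site.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS activity_log (
            id         INTEGER PRIMARY KEY AUTOINCREMENT,
            username   TEXT NOT NULL,
            section    TEXT NOT NULL,   -- e.g. 'music', 'ebooks', 'blog'
            activity   TEXT NOT NULL,   -- e.g. 'download', 'favorite', 'post'
            created_at TEXT NOT NULL    -- 'YYYY-MM-DD HH:MM:SS' sorts correctly as text
        )
        """
    )


def log_activity(conn, username, section, activity, created_at):
    # Call this wherever the site handles a user action.
    conn.execute(
        "INSERT INTO activity_log (username, section, activity, created_at) "
        "VALUES (?, ?, ?, ?)",
        (username, section, activity, created_at),
    )


def section_shares(conn, since):
    # Percentage of logged activities per section on or after `since`.
    rows = conn.execute(
        "SELECT section, COUNT(*) FROM activity_log "
        "WHERE created_at >= ? GROUP BY section ORDER BY section",
        (since,),
    ).fetchall()
    total = sum(count for _, count in rows)
    if total == 0:
        return {}
    return {section: 100.0 * count / total for section, count in rows}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_activity_log(conn)
    # Made-up sample records, only to show the calls.
    log_activity(conn, "user1", "music", "download", "2024-06-01 10:00:00")
    log_activity(conn, "user2", "ebooks", "download", "2024-06-02 11:30:00")
    print(section_shares(conn, "2024-06-01 00:00:00"))
```

The resulting dictionary can be handed to whatever charting tool you prefer; the important part is that every action gets logged with who, where, what and when.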
Fadrian Sudaman
+2  A: 

You can ask your professor what he intends by using data mining. Data mining algorithms can do various tasks; you first need to define what you want to accomplish, and then find suitable algorithms and check what is technically possible.

Some ideas that came to my mind about using data mining in your project:

  1. You can use data mining to find which songs (e-books, etc.) a user is likely to favorite, based on other people's favorite songs (find similarities; association rules would probably be a good algorithm for this).
  2. You can use some clustering algorithms to group users based on some parameters and suggest that they connect with other people from the same group (if your site has something like this).
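To make the first idea a bit more concrete, here is a minimal sketch in Python. It is a simplified, pairwise take on association rules (not a full Apriori implementation): for every pair of items it computes the confidence of the rule "users who favorited A also favorited B". The function names, the `min_support` threshold and the input layout (a dict mapping each user to a set of favorite items) are assumptions made for this example.

```python
from collections import Counter
from itertools import permutations


def pair_rules(favorites, min_support=2):
    """Return {(a, b): confidence} for pairwise rules a -> b.

    favorites: dict mapping user -> set of favorited items.
    A rule is kept only if at least `min_support` users have both items.
    confidence(a -> b) = users with both a and b / users with a.
    """
    item_count = Counter()
    pair_count = Counter()
    for items in favorites.values():
        item_count.update(items)
        # All ordered pairs (a, b) with a != b from this user's favorites.
        pair_count.update(permutations(sorted(items), 2))
    return {
        (a, b): both / item_count[a]
        for (a, b), both in pair_count.items()
        if both >= min_support
    }


def recommend(favorites, user, rules, top_n=3):
    """Suggest items the user has not favorited yet, best rule first."""
    owned = favorites[user]
    scores = {}
    for (a, b), confidence in rules.items():
        if a in owned and b not in owned:
            scores[b] = max(scores.get(b, 0.0), confidence)
    # Highest confidence first; ties broken by item name for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]
```

You would build `favorites` from your favorites table, compute the rules once in a while, and call `recommend` when rendering a profile page. For a real data set a proper association rule library would scale better, but this is enough to demonstrate the idea.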

Good luck!:)

Lukasz Dziedzia
@Dzida Exactly, but I've never done such a thing. Can you show me examples or point me to external ones?
arin
@arin - I don't mean to be rude, but have you searched the net for examples (e.g. "how to perform data mining")? Having to research a topic shouldn't come as a shock to a student.
Gert G
@arin Data mining is a huge topic. First you need to ask your professor about his exact expectations. I hope he is conscious of how broad data mining is, and he should give you some advice and pointers. Then you can look for tutorials on data mining libraries implemented in the technology you built your website with (there should be some).
Lukasz Dziedzia
+1  A: 

Firstly, ask for clarification from your supervisor. Don't say 'What do you mean?', but ask 'Are you expecting something like this?' because it shows that you've at least thought about it.

If you can't think of anything, or your supervisor is vague, perform some simple data retrieval and analysis, e.g.

  • most active members
  • the most / least popular songs and books
  • number of ads clicked, etc.
  • most popular website features

Just elementary analysis should suffice - you aren't doing a statistics degree. Work out the most songs downloaded in a day or per user, the average number of songs per user, how many users visit each day and how many sign up and never visit.

The purpose is to demonstrate that your website is logging all activity, so that when you are asked 'how many books did the 20 most active users download in June' you will be able to work out the answer.
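As a sketch of how such a question could be answered once the activity is logged, here is a small Python example. It assumes the log has already been loaded as a list of `(user, section, day)` tuples; that layout and the function names are hypothetical, and in practice you would more likely write the equivalent queries against your own database.

```python
from collections import Counter
from datetime import date


def top_users(log, n):
    """The n most active users, counting every logged action."""
    counts = Counter(user for user, _section, _day in log)
    # Most actions first; ties broken by name so the result is stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [user for user, _count in ranked[:n]]


def downloads_by_top_users(log, n, section, year, month):
    """How many items in `section` the n most active users got in a month."""
    top = set(top_users(log, n))
    return sum(
        1
        for user, sec, day in log
        if user in top and sec == section
        and day.year == year and day.month == month
    )


def average_per_user(log, section):
    """Average actions in `section` per user with at least one such action."""
    per_user = Counter(user for user, sec, _day in log if sec == section)
    if not per_user:
        return 0.0
    return sum(per_user.values()) / len(per_user)
```

With these helpers, the question above becomes `downloads_by_top_users(log, 20, "ebooks", year, 6)` for whichever year you are asked about.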

The alternative is a website that just runs and you don't have any knowledge of how your users are behaving and what they are doing, which means you aren't able to focus on things that they find important.

Kirk Broadhurst