I'm trying to incorporate user statistics into a site and decided to go for this in my users table:

  • time when the user registers
  • time when the user verifies
  • visit count
  • time of last visit

What other statistics am I missing? Should I track each login time in a separate table too? Is that considered good auditing or too much?

There are other usage stats that I thought could be helpful too, like page views. Do you recommend implementing my own tracking or using a ready-made solution like Google Analytics? I feel it would add too much external code to the site, so I thought it might be better to implement my own page view tracking. Any thoughts on this?

+1  A: 

This depends on the type of services and data your site offers. Generally, a table of all logins is not needed, as it just adds data that might never be used. I think what you have is OK. Maybe add the signup IP and the last login IP as well, just in case.

As for tracking, I think a third-party tool would be good, but it wouldn't give you per-user detail, because Google Analytics doesn't know which user in your database visited which page. If you REALLY need that, keep it in a separate table and make sure you have archiving in place; otherwise the table will grow huge and inserts will get slower and slower.
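A minimal sketch of the "separate table plus archiving" idea, assuming SQLite and Python's standard `sqlite3` module. The table names `page_views` and `page_views_archive` and the helper functions are hypothetical, made up for illustration; adapt them to whatever database the site actually uses.

```python
import sqlite3

def init_db(conn):
    # Hypothetical schema: one live table for inserts, one archive table.
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS page_views (
            user_id   INTEGER NOT NULL,
            path      TEXT    NOT NULL,
            viewed_at TEXT    NOT NULL  -- ISO 8601 timestamp, sorts as text
        );
        CREATE TABLE IF NOT EXISTS page_views_archive (
            user_id   INTEGER NOT NULL,
            path      TEXT    NOT NULL,
            viewed_at TEXT    NOT NULL
        );
    """)

def record_view(conn, user_id, path, viewed_at):
    # One row per page view by a known user.
    conn.execute(
        "INSERT INTO page_views (user_id, path, viewed_at) VALUES (?, ?, ?)",
        (user_id, path, viewed_at),
    )

def archive_older_than(conn, cutoff):
    """Move rows older than `cutoff` into the archive table.

    Returns the number of rows removed from the live table.
    """
    with conn:  # both statements commit together, or roll back together
        conn.execute(
            "INSERT INTO page_views_archive (user_id, path, viewed_at) "
            "SELECT user_id, path, viewed_at FROM page_views WHERE viewed_at < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM page_views WHERE viewed_at < ?", (cutoff,))
    return cur.rowcount
```

Running `archive_older_than` from a scheduled job keeps the live table small, so inserts stay fast while the old rows remain available for reporting.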

Auditing should always be based on what is actually needed. Too much needless auditing will just slow your application down.

Sabeen Malik
+1 for the Google Analytics point. Makes sense, Google doesn't know which DB user is visiting.
Chris
+2  A: 

Deciding on the extent of usage tracking in your website is ultimately a personal decision depending on the long-term goals of your website/app.

Some things to consider:

  1. Are these statistics going to be used by the web app owner to decide future product direction? (judging popularity of features, identifying high load sections of your site, etc.)
  2. Are these statistics going to be used by the web app owner for marketing reasons? (campaign success ratios, conversion metrics, marketing goals, etc.)
  3. Are these statistics going to be used by your users for a feature(s)? (link traffic for popularity, number of profile views, etc.)

For #1 and #2, it's very important to decide whether you value user-based statistics or page-based statistics or both.

For #3, it's easier to implement on an ongoing basis, usually when you introduce a new feature.

That's the overall strategy. In your case (not knowing the goals stated above), I would think you'd need:

  • table of user logins (each row is one user login) - this helps if you ever need to compile stats on the most popular access times: do users "clump" together, or are the logins spread apart? Additionally, you should archive the data in this table every month and store only your monthly metrics in another table or in a file on disk.
  • table of user failed logins - this table is usually overlooked but extremely important for diagnosing bugs and/or security attacks
  • table of active users - combine this with a cron job that takes snapshots of the table to figure out the trend in active users on the site
  • table of users page views (each row is a user/page pair) - this helps decide which pages/features are more popular and helps decide the future product direction
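The login tables and the monthly roll-up could be sketched roughly like this, assuming SQLite via Python's `sqlite3` module. All table, column, and function names here are hypothetical, and only the login and failed-login tables are shown.

```python
import sqlite3

# Hypothetical schema for raw logins, failed logins, and monthly summaries.
SCHEMA = """
CREATE TABLE user_logins (
    user_id      INTEGER NOT NULL,
    logged_in_at TEXT    NOT NULL   -- 'YYYY-MM-DD HH:MM:SS'
);
CREATE TABLE failed_logins (
    username     TEXT NOT NULL,
    ip           TEXT NOT NULL,
    attempted_at TEXT NOT NULL
);
CREATE TABLE monthly_login_stats (
    month  TEXT    NOT NULL,        -- 'YYYY-MM'
    hour   INTEGER NOT NULL,        -- hour of day, 0-23
    logins INTEGER NOT NULL,
    PRIMARY KEY (month, hour)
);
"""

def rollup_month(conn, month):
    """Summarise one month of logins by hour of day, then drop the raw rows."""
    with conn:  # summary insert and raw delete happen in one transaction
        conn.execute(
            """
            INSERT OR REPLACE INTO monthly_login_stats (month, hour, logins)
            SELECT ?, CAST(strftime('%H', logged_in_at) AS INTEGER), COUNT(*)
            FROM user_logins
            WHERE strftime('%Y-%m', logged_in_at) = ?
            GROUP BY strftime('%H', logged_in_at)
            """,
            (month, month),
        )
        conn.execute(
            "DELETE FROM user_logins WHERE strftime('%Y-%m', logged_in_at) = ?",
            (month,),
        )

def failed_attempts_by_ip(conn, since):
    """Count failed logins per IP since a timestamp, busiest IP first."""
    return conn.execute(
        "SELECT ip, COUNT(*) FROM failed_logins WHERE attempted_at >= ? "
        "GROUP BY ip ORDER BY COUNT(*) DESC, ip",
        (since,),
    ).fetchall()
```

The per-hour summary is what answers the "do users clump together?" question, and a sudden spike from one address in `failed_attempts_by_ip` is the kind of thing that points to a bug or an attack.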

All that being said, don't be afraid to use third-party tools like Google Analytics (especially if your case is #2). There's no sense in reinventing the wheel and implementing your own usage metrics layer: it usually comes at a performance cost, and Google has more bandwidth and higher performance than you will probably have.

There are other tools like Mint (http://haveamint.com) which you can install server-side and customize for your own usage metrics.

Pras
+1 for the excellent answer. You're right, the objective is future directions, identifying the most active users, etc. My concern is that at the early stage, it's hard to know what's worth tracking. I don't want to track everything then realize the data was worthless, just slowing down the site.
Chris
Alternatively, if you're good with Apache log analysis and your webapp has a careful URL structure, you could analyse click/event/page tracking from the logs. You'll have to make sure your Apache log format saves the pertinent info (IP address, User-Agent, page path, etc.). This is usually a good compromise if you don't want to bog your performance down, but it comes at the cost of having to compile these statistics from the Apache logs yourself.
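As a rough sketch of what that log analysis could look like, assuming the logs use Apache's "combined" log format: the regular expression and the `count_page_views` helper below are hypothetical and would need adjusting to your own `LogFormat`.

```python
import re
from collections import Counter

# Matches Apache common/combined log lines; referer and agent are optional.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def count_page_views(lines):
    """Count successful GET requests per path, ignoring query strings."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that do not look like access-log entries
        if m.group("method") == "GET" and m.group("status") == "200":
            path = m.group("path").split("?", 1)[0]
            counts[path] += 1
    return counts
```

For example, `count_page_views(open("access.log"))` (with a hypothetical file name) gives a per-page count that can be sorted with `.most_common()`.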
Pras
Which way would you personally go?
Chris
Personally, I would weigh the amount of statistics required against the performance cost to the user. I'd track the user-related statistics in the DB (probably user logins and failed login attempts). For the feature/page-specific pageviews, I'd use Google Analytics, especially to measure user registration conversion (track how many users were served the "Thank you for signing up" page). Apache logging, though it has the best performance, is really a pain to analyse (even with the tools that are available).
Pras