I'm considering writing my own tool for tracking visitors and sales, because Google Analytics and similar tools just aren't comprehensive enough in the data department. They have nice GUIs, but if you have SQL skills those GUIs are unnecessary.

I'm wondering what the best approach is to do this.

I could simply log the IP, etc. to a text file and then have an async service run in the background to dump it into the DB. Or maybe that's overkill and I can just put it straight into the DB. But one DB write per web request seems like a poor choice where scalability is concerned. Thoughts?
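To make the trade-off concrete, here is a rough sketch of a middle ground between the two options: each request only appends to an in-memory queue, and a background job flushes the queue to the DB in batches. Python and SQLite are used purely for illustration, and `hit_queue`, `record_hit`, `flush_hits` and the `hits` table are all made-up names; the same idea should carry over to other stacks.

```python
import queue
import sqlite3
import time

# Hypothetical in-memory buffer shared by all request handlers.
hit_queue = queue.Queue()

def create_schema(conn):
    """Create the (hypothetical) raw hits table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits (ts REAL, ip TEXT, url TEXT, referer TEXT)"
    )

def record_hit(ip, url, referer):
    """Called once per web request; only touches memory, no DB round trip."""
    hit_queue.put((time.time(), ip, url, referer))

def flush_hits(conn, max_batch=500):
    """Drain up to max_batch queued hits and insert them in one transaction."""
    batch = []
    while len(batch) < max_batch:
        try:
            batch.append(hit_queue.get_nowait())
        except queue.Empty:
            break
    if batch:
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO hits (ts, ip, url, referer) VALUES (?, ?, ?, ?)",
                batch,
            )
    return len(batch)
```

A timer or background thread would call `flush_hits` every few seconds. The obvious cost is that anything still in the queue is lost if the process dies before the next flush.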

As a side note, it is possible to capture the referring URL of any incoming traffic, right? So if they came from a forum post or something, you can track that actual URL, is that right?

It just seems that this is a very standard requirement and I don't want to go reinventing the wheel.

As always, thanks for the insight SOF.

A: 

Have you looked at Log Parser to parse the IIS logs?

Mitch Wheat
That looks cool as a side note. I'll check it out if I decide to go with the text logging route.
Scott
+1  A: 

The answer to this question mentions the open-source GAnalytics alternative Piwik - it's not C# but you might get some ideas looking at the implementation.

For a .NET solution I would recommend reading Matt Berseth's Visit/PageView Analysis Services Cube blog posts (and earlier, an example, and another example, since they aren't easy to find on his site).

I'm not sure if he ever posted the server-side code (although you will find his openurchin.js linked in his html), but you will find most of the concepts explained. You could probably get something working pretty quickly by following his instructions.

I don't think you'd want to write to a text file - locking issues might arise; I'd go for INSERTs into a database table. If the table grows too big you can always 'roll up' the results periodically and purge old records. As for the Referer URL, you can definitely grab that info from the HTTP headers (assuming it has been sent by the client and not stripped off by proxies or strict antivirus software settings).
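As a rough illustration of both points (reading the Referer header and rolling up old rows), here is a minimal sketch in Python with SQLite. The table names `hits` and `daily_hits`, the helper names, and the choice to store the day as an ISO `YYYY-MM-DD` string are all assumptions made for the example, not anything from a particular product.

```python
import sqlite3

def get_referer(headers):
    """Return the Referer header value, or None if the client did not send one."""
    for name, value in headers.items():
        if name.lower() == "referer":  # header names are case-insensitive
            return value or None
    return None

def create_schema(conn):
    """Hypothetical raw table plus a summary table for rolled-up counts."""
    conn.execute("CREATE TABLE IF NOT EXISTS hits (day TEXT, url TEXT, referer TEXT)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_hits (day TEXT, url TEXT, hit_count INTEGER)"
    )

def roll_up_and_purge(conn, cutoff_day):
    """Summarize raw hits older than cutoff_day into daily_hits, then delete them.

    Both statements run in one transaction, so a raw row is never counted twice.
    """
    with conn:
        conn.execute(
            """
            INSERT INTO daily_hits (day, url, hit_count)
            SELECT day, url, COUNT(*) FROM hits
            WHERE day < ?
            GROUP BY day, url
            """,
            (cutoff_day,),
        )
        conn.execute("DELETE FROM hits WHERE day < ?", (cutoff_day,))
```

Note that the roll-up here throws away the per-hit referer; if per-referrer reporting matters, the summary table would need a referer column as well.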

BTW, keep in mind that Google Analytics adds a lot of value to stats - it geocodes IP addresses to show results by location (country/city) and also by ISP/IP owner. Their JavaScript does Flash detection and segments the User-Agent into useful 'browser categories', and also detects other user settings like operating system and screen resolution. That's some non-trivial coding that you will have to do if you want to achieve the same level of reporting - not to mention the data and calculations to get entry & exit page info, returning visits, unique visitors, returning visitors, time spent on site, etc.

There is a Google Analytics API that you might want to check out, too.

CraigD
Thanks Craig, I'll check that out. Ya, I'm still planning on keeping Google Analytics for all the reasons you suggest. I'm just supplementing with my own implementation. The main feature lacking appears to be, for example: I want to log EACH time your IP hits my site, and where the ad you came from was placed. Then, when/if you make a purchase, I want to log that as well. This way, I can see which advertising is most effective on a per VISITOR and per SIGNUP basis. It also helps to know how many times you had to come back in order to buy, etc. Maybe their API will help, I'll check that out.
Scott
You should definitely look at the Ecommerce tracking in GAnalytics if you aren't already. It enables Revenue Analysis by language, network locations and by traffic source keywords PLUS visits-to-purchase, days-to-purchase, etc. You do have to put some additional JavaScript on your order-complete page, but it's worth it for the extra reporting data. Also check out their Campaign Tracking: http://code.google.com/apis/analytics/docs/gaJS/gaJSApiCampaignTracking.html GAnalytics is much more sophisticated than it first appears.
CraigD
A: 

I wouldn't have thought writing to a text file would be more efficient than writing to a database - quite the opposite, in fact. You would have to lock the text file while writing, to avoid concurrency problems, and this would probably have more of an impact than writing to a database (which is designed for exactly that kind of scenario).

I'd also be wary of re-inventing the wheel. I'm not at all clear what you think a bespoke hits logger could do better than Google Analytics, which is extremely comprehensive. Believe me, I've been down that road and written my own, and Analytics made it quite redundant.

Dan Diplo
Thanks Dan, maybe you're right about analytics. See my comment above to Craig. The main thing I want to track is # of visits vs # of sales and how many times you had to return in order to purchase. Additionally, I don't see anywhere in Google Analytics to track the exact referring URL that each visitor came from. So, if I place 10 ads today, I want to see which ads attracted which visitors. Subsequently, I want to see which of those visitors RETURNED and purchased. Unless I'm missing something, I don't think GA has this level of depth.
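To show the kind of report I mean, here is a rough sketch in Python with SQLite. Everything in it is hypothetical: the `visits` and `purchases` tables, the `ad_id` column (which ad or referring URL brought the visit), and `visitor_id` (which could be an IP or a cookie value; IPs alone will misidentify some visitors).

```python
import sqlite3

def create_schema(conn):
    """Hypothetical tables: one row per visit, one row per purchase."""
    conn.execute("CREATE TABLE visits (visitor_id TEXT, ad_id TEXT, ts INTEGER)")
    conn.execute("CREATE TABLE purchases (visitor_id TEXT, ts INTEGER)")

def visitors_and_buyers_per_ad(conn):
    """For each ad: distinct visitors it brought, and how many of them later purchased."""
    return conn.execute(
        """
        SELECT v.ad_id,
               COUNT(DISTINCT v.visitor_id) AS visitors,
               COUNT(DISTINCT p.visitor_id) AS buyers
        FROM visits v
        LEFT JOIN purchases p
               ON p.visitor_id = v.visitor_id AND p.ts >= v.ts
        GROUP BY v.ad_id
        ORDER BY v.ad_id
        """
    ).fetchall()

def visits_before_first_purchase(conn):
    """For each buyer: number of visits recorded up to their first purchase."""
    return conn.execute(
        """
        SELECT fp.visitor_id, COUNT(v.ts) AS visit_count
        FROM (SELECT visitor_id, MIN(ts) AS first_ts
              FROM purchases GROUP BY visitor_id) fp
        LEFT JOIN visits v
               ON v.visitor_id = fp.visitor_id AND v.ts <= fp.first_ts
        GROUP BY fp.visitor_id
        ORDER BY fp.visitor_id
        """
    ).fetchall()
```

The first query answers "which ads attracted which visitors, and which of those bought"; the second answers "how many times did you have to come back before buying".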
Scott