views:

718

answers:

9

I am looking into getting a log file analysis tool for multiple sites.

All the sites currently run Google Analytics; however, I have noticed that the stats it provides can be a bit off (for example, lower numbers than the log files report).

It needn't be expensive although expensive tools will be considered.

It needs to be scriptable so I can generate reports in an automated fashion.

I know this is a bit discussy, but I can't find authoritative info on Google, just reams of marketing puffery. So I would like some recommendations based on real-world usage.

Clarification: When I say Google Analytics gives results that are a bit off, I mean a specific instance where, for one site, Google reported a number of people who had passed through the purchasing process, while the e-commerce software reported a much higher number (throwing the drop-off percentages for the purchasing process right out). So I really need something to use in conjunction with Google Analytics (which IMO is good for general stats only).

Update: I've had a good look round at the recommendations made here and a further hunt on the web. I'm currently evaluating Mach5 Analyzer.

It appears to be a good option and is under £200 for the Gold version. If I get on with it I'll post it as a recommendation here.

Update: Nope, Mach5 Analyzer was no good: not as quick as they said, and the reporting was not detailed enough, although it has some nice features. So the hunt continues. Any more suggestions gratefully received.

Update: Still on the hunt for an analyzer. Sticking with Google at the moment with extra reports being generated by Weblog Expert. There must be a better way!

Update: I've got an in-depth report on web analytics packages from e-consultancy (http://www.e-consultancy.com/publications/web-analytics-buyers-guide-2008/). It's 250 pages in length, so it's going to take a while to digest. I'll post again if I glean anything useful.

Update: Well, I've finished my hunt and, horror of horrors, I'm back where I started: Google Analytics. The large-scale commercial solutions are often comprehensive, thorough and accurate, but expensive. I've found that, despite its failings, there really is no better bang for buck than Google Analytics. Therefore this is what I will use by default unless a really large enterprise client wants some of the more "expensive" features. Thanks for everyone's help and interest.

+1  A: 

I use a combination of Google Analytics and webalizer.

webalizer is a free of charge tool that parses apache logs into static HTML per year, per month and per day reports.
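For anyone curious what "parsing Apache logs into per-month and per-day reports" boils down to, here is a minimal Python sketch. It is not how webalizer itself is implemented; it assumes the logs are in Apache's Common Log Format (combined-format lines with extra trailing fields also match), and the function name and sample lines are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout: host ident user [timestamp] "request" status size
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def hits_per_period(lines, fmt="%Y-%m"):
    """Count requests per period; fmt is a strftime pattern ("%Y-%m-%d" for daily)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        # Apache timestamps look like 10/Oct/2008:13:55:36 -0700
        stamp = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        counts[stamp.strftime(fmt)] += 1
    return counts


# Hypothetical sample data, including one malformed line
SAMPLE_LINES = [
    '192.0.2.1 - - [10/Oct/2008:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [11/Oct/2008:09:12:01 -0700] "GET /cart HTTP/1.0" 200 512',
    '192.0.2.1 - - [02/Nov/2008:18:40:10 -0700] "GET /index.html HTTP/1.0" 304 -',
    'not a log line',
]

if __name__ == "__main__":
    for period, hits in sorted(hits_per_period(SAMPLE_LINES).items()):
        print(period, hits)
```

Passing `fmt="%Y-%m-%d"` gives daily totals instead of monthly ones. Note this counts raw requests (hits), not visitors or page views, which is one reason log-based numbers tend to come out higher than JavaScript-based ones.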

diciu
It should be noted that Google Analytics is not an accurate representation of web site usage. For more info see: http://stackoverflow.com/questions/18080/best-traffic-performance-usage-monitoring-module#105110
Toby Mills
A: 

I've often seen things based on AWStats.

Please note that its security record is a bit bad, so the "static" version is clearly the way to go.

EDIT: Btw, regarding Google Analytics, I've often been irritated at sites where I'm actually waiting for Google Analytics to load (or, in the old days, the CGI image thing with the counter in it). I don't like the idea of being dependent on other "places'" uptime as well as my own if I have a choice.

svrist
A: 

perl :P

http://www.tbray.org/ongoing/When/200x/2007/09/20/Wide-Finder

Aaron Maenpaa
hah I was going to say Awk
George Jempty
+2  A: 

AWStats might be useful if you want to run the analysis yourself.

You could also look at StatCounter for a hosted one...

Marius
+3  A: 

Hi,

We've spent A LONG TIME looking at various different log file analysers. A good few years ago we ran SawMill - but on a moderately busy site it took ages to process the logs (to the point where on one of our machines, we were generating logs faster than we could process them).

We then switched to LiveStats which seemed to be quite a bit better. It was more reliable, processed logs in near real-time and the reports looked better. Then - they got bought by Microsoft and the product vanished.

So - more recently we had another trawl around, and tried various products including the newer SawMill and various others (Virtual servers are a must for trying out these things - so easy to roll-back an installation and try another product!).

All of the products had come on quite a way since the last time we looked at them. However, they were all quite cumbersome to set up, and none of them really produced reports that looked that good.

Another downside is the cost - some of them cost several thousand US dollars.

In the end, we opted for Google Analytics for one main reason. How could I justify paying thousands of dollars for a product I'd have to charge my clients for when they could turn around and say to me, "Well - instead of paying you, I'll just have Google Analytics for free, and get better reports, too!".

As for the numbers in the stats, in my experience a lot of different products end up with different results, as they all use slightly different algorithms to calculate them. I think you should only compare your month-on-month results within the same system, rather than comparing the same month across different systems (if you see what I mean!).
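To illustrate the month-on-month point, here is a small Python sketch. The figures are entirely hypothetical (they are not from any real site), and the function name is made up; the idea is just that two tools can disagree on absolute totals while agreeing on the trend.

```python
def month_on_month_change(monthly_totals):
    """Return (month, percent change vs. the previous month) for one system's totals."""
    changes = []
    for (_, previous), (month, current) in zip(monthly_totals, monthly_totals[1:]):
        if previous == 0:
            continue  # percentage change is undefined against a zero baseline
        changes.append((month, round((current - previous) / previous * 100, 1)))
    return changes


# Hypothetical visit totals from two different systems for the same site
LOG_TOOL = [("2008-09", 12000), ("2008-10", 13200), ("2008-11", 12540)]
JS_TAG_TOOL = [("2008-09", 9000), ("2008-10", 9900), ("2008-11", 9405)]

if __name__ == "__main__":
    print("log tool:   ", month_on_month_change(LOG_TOOL))
    print("JS tag tool:", month_on_month_change(JS_TAG_TOOL))
```

In this made-up example the absolute totals differ by a quarter, yet both systems show the same rise and fall, which is the comparison that stays meaningful.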

Hope that helps (although it's probably not the answer you're looking for!).

Chris Roberts
Accepted your post as answer as this is the exact same conclusion I came to. Should have just listened in the first place!
Tim Saunders
+1  A: 

Part of the reason Google Analytics shows a lower number is JavaScript: without it, Google will not get any data for the user. Personally I use Google, plus a 'Roll-Your-Own' to track the non-JavaScript users (i.e. old browsers, text browsers, page scrapers, RSS, search engines). Check this out: http://unkwndesign.com/blog/?p=10 (notice: this is my blog so this may be a shameless plug in disguise, be warned, but hey at least I'm honest) :)

Unkwntech
+1  A: 

It may be a bit more basic than what you're looking for, but Log Parser, a free utility from Microsoft, is very good. It allows you to run queries against your web logs using SQL-like syntax. It will also create custom HTML reports or chart images.

It all runs from the command line, so can be easily scripted and automated. I'm not sure how it would integrate with Google Analytics though.

See Microsoft Log Parser Web Site for full details.
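To give a flavour of the "SQL over web logs" idea, here is a rough Python sketch using the standard library's sqlite3 module. To be clear, this is not Log Parser and does not use its syntax; the table layout, function name and records are all hypothetical, and it assumes the log lines have already been parsed into (day, url, status) tuples.

```python
import sqlite3

# Hypothetical, already-parsed log records: (day, url, status)
RECORDS = [
    ("2008-10-10", "/index.html", 200),
    ("2008-10-10", "/checkout", 200),
    ("2008-10-11", "/index.html", 200),
    ("2008-10-11", "/missing", 404),
    ("2008-10-11", "/index.html", 304),
]


def top_pages(records, limit=10):
    """Load records into an in-memory table and return the most requested URLs."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE hits (day TEXT, url TEXT, status INTEGER)")
        conn.executemany("INSERT INTO hits VALUES (?, ?, ?)", records)
        # Ignore error responses; break ties alphabetically for stable output
        return conn.execute(
            "SELECT url, COUNT(*) AS n FROM hits "
            "WHERE status < 400 GROUP BY url ORDER BY n DESC, url LIMIT ?",
            (limit,),
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for url, count in top_pages(RECORDS):
        print(url, count)
```

Because it is just a script, it can be run from cron or a scheduled task in the same way a command-line tool would be.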

jules
If I had a nickel for every time someone on SO mentions the MS Log Parser, I'd have a s**t load of nickels :)
Patrick Cuff
A: 

I like WebLog Expert Pro. Most analyzers track the usual stuff, but I had a few extra requirements, like a command-line mode so that I can run it through a scheduler or on demand through a web page. The command line needs to accept parameters like a date/time range and wildcards so it can analyze all files in a folder.

The Pro version can export to Excel, PDF, etc., and it can email you the report, plus some other nice features. It's also fast.

I don't like hosted analyzers like Google Analytics because they require putting JavaScript in every page, which is a big hassle: some of my users do not know how to do it, or they might forget. It also creates extra traffic to Google's servers.

Abdu
+1  A: 

I'm sure the OP has finished (it would be interesting to know what he chose), and since he'd been using Google Analytics and was willing to pay, I'd imagine the features he needed would exclude AWStats. But since I see it was mentioned a few times anyway, I'd like to point out JAWstats, which is a front-end built to work with AWStats and provide a much more professional/usable interface. It doesn't expose all the features of AWStats yet, but as far as I know it's the best UI going for OSS.

I use it for my own sites; it supports multi-site log analysis (data collection via AWStats of course, but with nice site change pop and all the visual candy you get with packages like Mint, etc).

Anyway, I'm a long-time fan of AWStats (specifically its databasebreak feature, so I can cron 'live' reports by day and watch traffic) and now I'm hooked on JAWstats, so I figured I'd mention it in case anyone looking for a great free package comes across the thread. ;)