views:

55

answers:

3

We have a large number (read: 50,000) of relatively small (read under 500K, typically under 50K) log files created using log4net from our client application. A typical log looks like:

Start Painless log
Framework:8.1.7.0
Application:8.1.7.0
2010-05-05 19:26:07,678 [Login ] INFO  Application.App.OnShowLoginMessage(194) - Validating Credentials...
2010-05-05 19:26:08,686 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Checking for Application Updates...
2010-05-05 19:26:08,830 [1     ] INFO  Framework.Globals.InstanceStartup(132) - Application Startup
2010-05-05 19:26:09,293 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Purchase History Data>:True
2010-05-05 19:26:09,293 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Shopping Assistant>:True
2010-05-05 19:26:09,294 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Shopping List>:True
2010-05-05 19:26:09,294 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Teeth>:True
2010-05-05 19:26:09,294 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Scanner>:True
2010-05-05 19:26:09,294 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Value Comparison>:True
2010-05-05 19:26:09,294 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Lotus Notes CRM>:True
2010-05-05 19:26:09,295 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Salesforce.com>:False
2010-05-05 19:26:09,295 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Lotus Notes Mail>:True
2010-05-05 19:26:09,295 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Sales Leads>:True
2010-05-05 19:26:09,295 [1     ] INFO  Framework.PluginManager.LogPluginState(150) - Plugin <Configurator>:True
2010-05-05 19:26:09,297 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Validating Database...
2010-05-05 19:26:10,342 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Validating Database...
2010-05-05 19:26:10,489 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Global Handlers...
2010-05-05 19:26:10,495 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Starting Main Window...
2010-05-05 19:26:10,496 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Initializing Components...
2010-05-05 19:26:11,145 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Restoring Location...
2010-05-05 19:26:11,164 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Plug Ins...
2010-05-05 19:26:11,169 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Panels...Order Manager
2010-05-05 19:26:11,181 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Orders...
2010-05-05 19:26:11,274 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Done Loading 1 Order
2010-05-05 19:26:11,314 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Panels...All Products
2010-05-05 19:26:11,471 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Purchase History Data
2010-05-05 19:26:11,545 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Shopping List
2010-05-05 19:26:11,597 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Teeth
2010-05-05 19:26:11,768 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Scanner
2010-05-05 19:26:11,810 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Value Comparison
2010-05-05 19:26:11,840 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Sales Leads
2010-05-05 19:26:11,922 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...(Done!)
2010-05-05 19:26:11,923 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Adding Handlers...
2010-05-05 19:26:11,925 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Main Window Handlers...
2010-05-05 19:26:11,932 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Loading Main Menu...
2010-05-05 19:26:11,949 [1     ] INFO  Application.App.OnShowLoginMessage(194) - Initialization Complete.
2010-05-05 19:26:13,662 [1     ] INFO  Framework.ProductSearch.Search(342) - User entered term: <>

I'd like to be able to to parse these logs server side (either when they're uploaded or nightly) to extract either exceptions (which always log at ERROR or FATAL) or other specific log messages like:

2010-05-05 20:05:24,951 [1     ] INFO  Framework.ProductSearch.Search(342) - User entered term: <kavo>

to get the 'kavo' term so we can find out what people are really searching for.

I've tried parsing the text by hand (using String.Split() and similar methods) but it really feels like I'm reinventing the wheel.

Is there a nice library to do this kind of log extracting?

A: 

This looks like a typical case of .NET regular expressions

This way you can search for patterns like "User entered term: ". You can extract the group match afterwards

Eric
A: 

can log4net not read it's own format patterns? Once it reads the format back in, can't you then parse just the %message% section?

A better question maybe why you don't just write that search term to a database at the same time you make the log call or have another log file through log4net that just logs the %message% and all you write to that log is the search term? I really don't think you want to be parsing logs if you can avoid it...at least going forward.

Chad
+1  A: 

You could use Microsoft's Log Parser (Download here). The tool also exposes a COM interface that will let you access the results programmatically. This blog post has a small (partial) instructions for doing that with version 2.1.

So now instead of feeling it is reinventing the wheel, you probably feel like it is over-engineered. ;)

Tuzo
+1 for LogParser I've been using it successfully for pretty large files. It is very flexible (SQL-like syntax for parsing/extracting) and fast as well. You might have a better experience if you can settle on a "structured" Logfileformat (e.g. CSV) though.
Christian.K