I have a bunch of log files which are pure text. Here is an example of one...

Overall Failures Log
SW Failures - 03.09.2010 - /logs/swfailures.txt - 23 errors - 24 warnings
HW Failures - 03.09.2010 - /logs/hwfailures.txt - 42 errors - 25 warnings
SW Failures - 03.10.2010 - /logs/swfailures.txt - 32 errors - 27 warnings
HW Failures - 03.10.2010 - /logs/hwfailures.txt - 11 errors - 31 warnings

These files can get quite large and contain a lot of other information. I'd like to produce an HTML file from this log that will add links to key portions and allow the user to open up other log files as a result...

SW Failures - 03.09.2010 - <a href="/logs/swfailures.txt">/logs/swfailures.txt</a> - 23 errors - 24 warnings

This is greatly simplified, as I would like to add many more links and other HTML elements. My question is: what is the best way to do this? If the files are large, should I generate the HTML before serving it to the user, or will JSP do? Should I use Perl or another scripting language to do this? What are your thoughts and experiences?

A: 

pygmentize can handle some formats, although you may need to whip up a custom lexer for most cases.

Ignacio Vazquez-Abrams
Thanks, this is interesting. Unfortunately, I think a bunch of my links will need to be hard-coded, so Python or Perl is probably the better way to go.
prometheus
`pygmentize` is just a frontend to `pygments`, so it's Python under the sheets anyways.
Ignacio Vazquez-Abrams
+1  A: 

I'd use Python regular expressions.

>>> import re
>>> a = re.compile(r'[SH]W Failures - \d\d\.\d\d\.\d\d\d\d - (.*) - \d+ errors - \d+ warnings')
>>> line = 'SW Failures - 03.09.2010 - /logs/swfailures.txt - 23 errors - 24 warnings'
>>> b = a.match(line)
>>> b
<_sre.SRE_Match object at 0x7ff34160>
>>> b.groups()
('/logs/swfailures.txt',)
>>> line.replace(b.group(1), '<a href="%s">%s</a>' % (b.group(1), b.group(1)))
'SW Failures - 03.09.2010 - <a href="/logs/swfailures.txt">/logs/swfailures.txt</a> - 23 errors - 24 warnings'
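Since the files can get large, the same regex idea can be applied one line at a time so the whole log never has to sit in memory. What follows is only a rough sketch: the names `linkify` and `convert` are made up for illustration, wrapping the output in `<pre>` is my assumption about how the page should look, and lines that don't match the failure pattern are simply HTML-escaped and passed through.

```python
import html
import re

# Same pattern idea as above, with the dots in the date escaped and the
# text before and after the path captured so the line can be rebuilt.
ENTRY = re.compile(
    r'([SH]W Failures - \d\d\.\d\d\.\d{4} - )(\S+)( - \d+ errors - \d+ warnings)'
)


def linkify(line):
    """Return one log line as HTML, linking the path on failure entries."""
    text = line.rstrip("\n")
    m = ENTRY.fullmatch(text)
    if m is None:
        # Not a failure entry: just make it safe to embed in HTML.
        return html.escape(text)
    before, path, after = m.groups()
    safe = html.escape(path, quote=True)
    return '%s<a href="%s">%s</a>%s' % (
        html.escape(before), safe, safe, html.escape(after))


def convert(src, dst):
    """Stream lines from src to dst (open file objects), one at a time."""
    dst.write("<pre>\n")
    for line in src:
        dst.write(linkify(line) + "\n")
    dst.write("</pre>\n")
```

Calling `convert(open("overall.log"), open("overall.html", "w"))` (both file names hypothetical) would pre-generate the page once, which sidesteps the question of doing the work on every request.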
Pierre-Antoine LaFayette
+3  A: 

Here is a simple example using Perl's HTML::Template:

#!/usr/bin/perl

use strict; use warnings;
use HTML::Template;

my $tmpl = HTML::Template->new(scalarref => \ <<EOTMPL
<!DOCTYPE HTML>
<html><head><title>HTMLized Log</title>
<style type="text/css">
#log li { font-family: "Courier New" }
.errors { background:yellow; color:red }
.warnings { background:#3cf; color:blue }
</style>
</head><body>
<ol id="log">
<TMPL_LOOP LOG>
<li><span class="type"><TMPL_VAR TYPE></span>
<span class="date"><TMPL_VAR DATE></span>
<a href="<TMPL_VAR FILE>"><TMPL_VAR FILE></a>
<span class="errors"><TMPL_VAR ERRORS></span>
<span class="warnings"><TMPL_VAR WARNINGS></span>
</li>
</TMPL_LOOP>
</ol></body></html>
EOTMPL
);

my @log;
my @fields = qw( TYPE DATE FILE ERRORS WARNINGS );

while ( my $entry = <DATA> ) {
    chomp $entry;
    last unless $entry =~ /\S/;
    my %entry;
    @entry{ @fields } = split / - /, $entry;
    push @log, \%entry;
}

$tmpl->param(LOG => \@log);
print $tmpl->output;

__DATA__
SW Failures - 03.09.2010 - /logs/swfailures.txt - 23 errors - 24 warnings
HW Failures - 03.09.2010 - /logs/hwfailures.txt - 42 errors - 25 warnings
SW Failures - 03.10.2010 - /logs/swfailures.txt - 32 errors - 27 warnings
HW Failures - 03.10.2010 - /logs/hwfailures.txt - 11 errors - 31 warnings
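For comparison, the split-into-named-fields idea can also be sketched in Python with just the standard library. This is not a port of HTML::Template; it uses `string.Template` for a single list item, the function name `render_entry` is made up, and it assumes " - " never appears inside a field. Lines with an unexpected number of fields are escaped and emitted as-is.

```python
import html
from string import Template

# Field names mirror the TYPE/DATE/FILE/ERRORS/WARNINGS columns.
FIELDS = ("type", "date", "file", "errors", "warnings")

ROW = Template(
    '<li><span class="type">$type</span> '
    '<span class="date">$date</span> '
    '<a href="$file">$file</a> '
    '<span class="errors">$errors</span> '
    '<span class="warnings">$warnings</span></li>'
)


def render_entry(entry):
    """Turn one ' - ' separated log entry into an HTML list item."""
    text = entry.rstrip("\n")
    values = text.split(" - ")
    if len(values) != len(FIELDS):
        # Unexpected shape: keep the line, but do not try to link it.
        return "<li>%s</li>" % html.escape(text)
    escaped = {name: html.escape(value, quote=True)
               for name, value in zip(FIELDS, values)}
    return ROW.substitute(escaped)
```

Escaping each field before substitution keeps stray `<` or `&` characters in a log line from breaking the generated markup.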
Sinan Ünür
+2  A: 

I like awk because of its automatic field parsing:

# Default whitespace splitting makes the path field $6.
/failures\.txt/ {
        $6="<a href=\"" $6 "\">" $6 "</a><br>"
}

# Print every line, modified or not.
{
        print
}
lhf
Why the down vote?
lhf
Don't know, I voted you up. Thanks!
prometheus