I would just write the script..
It might have been written countless times before, but I doubt it will have been for the right database, or for your specific log configuration (Not sure about the W3C Extended Log Format, but with many others you can define a custom formatting)
Looking at log format doc, it should be pretty trivial to take each field ad create a column in the DB for it..
Then, to parse the example log from the log-format doc:
#Version: 1.0
#Date: 12-Jan-1996 00:00:00
#Fields: time cs-method cs-uri
00:34:23 GET /foo/bar.html
12:21:16 GET /foo/bar.html
12:45:52 GET /foo/bar.html
12:57:34 GET /foo/bar.html
..the following script will work fine, which only took a few minutes to write:
import re
import sys
mr = re.compile("^(\d\d:\d\d:\d\d) ([A-Z]+) (.+)$")
def insert_into_database(time, rtype, uri):
print "INSERT INTO database (%s, %s, %s)" % (time, rtype, uri)
for line in open("logfile.log").readlines():
m = mr.match(line)
if not m:
sys.stderr.write("Invalid line: %s\n" % line.strip())
else:
insert_into_database(m.group(1), m.group(2), m.group(3))
May not be the most robust/reliable script ever, but it works (well, aside from the insert_into_database function!)