I've run into several different situations where I need a "log cleanup" regex. I've had to re-implement it a couple of times, but the basic variant is this:

The Original

(23:59:59)
Username says:
user inputted text
(00:00:13)
Username
user inputted action
(00:01:42)
Username says:
user inputted text
(00:02:13)
Username says:
user inputted text

I'm looking for a good lookahead/lookbehind regex that converts it to:

(23:59:59) Username says: user inputted text
(00:00:13) Username user inputted action 
(00:01:42) Username says: user inputted text
(00:02:13) Username says: user inputted text

What's your angle of attack, and why?

+1  A: 

Unless a regex is absolutely necessary:

awk '/^\(/ && NR>1 {print ""} {printf "%s ", $0} END {print ""}' file

The logic is to print every line without a trailing newline, starting a new output line whenever "(" is the first character. This can be implemented in any language.

Bash

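#!/bin/bash

sep=""
while IFS= read -r LINE
do
 case "$LINE" in
   "("* ) printf "%s" "$sep" ;;   # newline before every record but the first
 esac
 sep=$'\n'
 printf "%s " "$LINE"           # quoted: keeps spacing, avoids globbing
done<"file"
echo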
ghostdog74
The only potential shortfall here is when a line of the message text itself starts with a parenthesis, which could be common in some formats. I guess I could also implement a check for the {nn:nn:nn} format?
Martindale
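One way to address the concern in the comment above is to start a new record only when a line consists of nothing but a timestamp. This is a minimal Bash sketch under two assumptions: timestamps always look exactly like (HH:MM:SS) with two digits per field, and join_log is a hypothetical helper name, not something from the answers in this thread.

```bash
# Sketch: join each log record onto one line, starting a new record only
# when a line is exactly a (HH:MM:SS) timestamp (assumed format).
# join_log is a hypothetical helper; it reads the log on stdin.
join_log() {
  local ts_re='^\([0-9]{2}:[0-9]{2}:[0-9]{2}\)$'
  local line out=''
  while IFS= read -r line || [[ -n $line ]]; do
    if [[ $line =~ $ts_re ]]; then
      # A timestamp opens a new record: flush the previous one first.
      if [[ -n $out ]]; then printf '%s\n' "$out"; fi
      out=$line
    else
      # Any other line, even one starting with "(", continues the record.
      out+=${out:+ }$line
    fi
  done
  # Flush the last record.
  if [[ -n $out ]]; then printf '%s\n' "$out"; fi
}
```

Usage would be something like `join_log < file`. Unlike the prefix test on "(", a message line such as "(smiles) hello" stays inside its record.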
+2  A: 

In Perl, according to your input sample:

#!/usr/bin/perl
use strict;
use warnings;

my @l;
my $str;
while(<DATA>) {
  chomp;
  $str .= $_." ";
  unless($.%3) {   # every third input line completes a record
    push @l,$str,"\n";
    $str = '';
  }
}
print @l;

__DATA__
(23:59:59)
Username says:
user inputted text
(00:00:13)
Username
user inputted action
(00:01:42)
Username says:
user inputted text
(00:02:13)
Username says:
user inputted text

Output:

(23:59:59) Username says: user inputted text 
(00:00:13) Username user inputted action 
(00:01:42) Username says: user inputted text 
(00:02:13) Username says: user inputted text 
M42