views:

479

answers:

1

Hi,

i have an application that create a log file of the following format:

2009-03-27 15:30:50 Start
2009-03-27 15:30:51 Starting Component 1 Installation
2009-03-27 15:30:52 blah
2009-03-27 15:30:53 blah
2009-03-27 15:30:54 blah
2009-03-27 15:30:55 ~~~ Finished Component 1 Installation ~~~
2009-03-27 15:30:56 Starting Component 2 Installation
2009-03-27 15:30:57 blah
2009-03-27 15:30:58 blah
2009-03-27 15:30:59 blah
2009-03-27 15:30:60 ~~~ Finished Component 2 Installation ~~~
2009-03-27 15:30:61 Starting Component 3 Installation
2009-03-27 15:30:62 blah
2009-03-27 15:30:63 blah
2009-03-27 15:30:64 blah
2009-03-27 15:30:65 ~~~ Finished Component 3 Installation ~~~
2009-03-27 15:30:66 Finished

I guess you get the point of the format.

The thing that i want to achive is to have a powershell script that will return a table of result showing

  1. Component Name
  2. Start Time
  3. End Time
  4. Time Taken

I call it "advance" script because it will probably include: parsing, formatting, creating of new objects, etc

  • I am a newbie in powershell
+3  A: 

The following script might do that for you ... IF you resort to sensible time values ... more than 60 seconds in a minute are weird (and gave me a few exceptions here that caused major headaches over the date/time parsing, until I noticed why the exceptions were thrown ...)

$logfile = $args[0]

$log = get-content $logfile

$Components = @()

switch -regex ($log) {
    "(.*) Starting (.*) Installation" {
        $c = New-Object PSObject
        $st = [DateTime]::ParseExact($Matches[1], "yyyy'-'MM'-'dd HH':'mm':'ss", $null)
        $c | Add-Member -Type NoteProperty -Name Component -Value $Matches[2]
        $c | Add-Member -Type NoteProperty -Name StartTime -Value $st
    }
    "(.*) ~~~ Finished" {
        $et = [DateTime]::ParseExact($Matches[1], "yyyy'-'MM'-'dd HH':'mm':'ss", $null)
        $c | Add-Member -Type NoteProperty -Name EndTime -Value $et
        $c | Add-Member -Type NoteProperty -Name TimeTaken -Value ($c.EndTime - $c.StartTime)
        $Components += $c
    }
}

$Components

I could get the execution time down a bit (by about 25 percent) with the following code:

$logfile = $args[0]

foreach ($l in Get-Content $logfile) {
    if ($l.Length -ge 30) {
        if ($l.Substring(20,8) -eq "Starting") {
            $c = New-Object PSObject
            $st = [DateTime]::ParseExact($l.Substring(0,19), "yyyy'-'MM'-'dd HH':'mm':'ss", $null)
            $c | Add-Member -Type NoteProperty -Name Component -Value $l.Substring(29, $l.Length - 42)
            $c | Add-Member -Type NoteProperty -Name StartTime -Value $st
        } elseif ($l.Substring(24,8) -eq "Finished") {
            $et = [DateTime]::ParseExact($l.Substring(0,19), "yyyy'-'MM'-'dd HH':'mm':'ss", $null)
            $c | Add-Member -Type NoteProperty -Name EndTime -Value $et
            $c | Add-Member -Type NoteProperty -Name TimeTaken -Value ($c.EndTime - $c.StartTime)
            $c
        }
    }
}

but I find it much less readable (let alone maintainable) than the first solution. That's very rigidly linked to the layout of the log file.

Joey
Thanks for the script (and sorry for the mistake on the seconds!).The script is working great from functional point of view but from performance point of view it slow, can you suggest maybe what could make it faster?
Well, currently the script reads the complete log file and then does the processing. I just tried it on a 800 KiB Logfile and the difference between first reading, then parsing and doing it in a pipeline isn't noticeable (in fact, the pipeline is even slower). Currently no idea how to make it faster
Joey
The New-Object and Add-Member calls account for around half of the runtime, here. Parsing with regex is apparently the other half (date/time parsing isn't noticeable). Rewriting without regex won't be very fun or readable, though.
Joey
Thanks for the prompt response.originaly when i tried to do i though of doingit using pipes and than using hashes to add relevant property to the needed object, your solution is easier to understand
Also,.. how did you check the performance of it?Is there an option to have the regex compiled?
I was using timethis (http://bit.ly/esXXA) and commented out certain portions of the code to measure the impact. Runtime over that 800 KiB log file was around 2.1 to 2.3 seconds here, I added a solution that cut it down to 1.6 seconds, but that may not be worth the increased barrier to understanding
Joey
There may be an option to compile regexes but I doubt that will play well with built-in PS functionality like switch -regex, -match and $Matches. Haven't looked at that so far
Joey
OK i have checked with compiling - redundant
You can speed things up a bit by avoiding New-Object/Add-Member and instead using Select-Object to add everything in one go: $c = "" | Select-Object Component,StartTime,EndTime,TimeTaken; $c.Component = $l.Substring(29, $l.Length - 42)
JasonMArcher
The only downside it is that you only get dynamically typed NoteProperties. But in this case that is all you were adding anyway.
JasonMArcher
oh, have to try ... still much to learn :)
Joey
The performance is VERY bad!