I need help with my regex to grab my host information from this logfile:

Tue Aug 24 10:22:14 2010: test1.colo_lvm:check:INFO:    host=test1.dom.colo.name.com
Tue Aug 24 10:22:14 2010: test1.colo_lvm:check:INFO: "/home/bin64"/admin --user="foo-bar" --password="*****" --host="test1.dom.colo.name.com" --port="9999" --socket="/tmp" variables

My regex also matches the 2nd line, capturing the double-quoted hostname plus the other data on that line, which I am not interested in. Only the first line matters: I want test1.dom.colo.name.com and nothing else.

My regex so far is this:

if ($line =~ m/(host=)(.+)/){

Thanks!

A: 

Try this:

$line =~ m/host="?([^"\s]+)/

You don't need parens around host= unless you actually want to capture it as data, and since it is a fixed string that always matches, there is little reason to. Using [^"\s]+ captures a run of characters containing no " and no whitespace, which keeps the match from running past the end of the field.

The "? bit before the capture will allow the value to be quoted (or not) while keeping any quote marks out of the actual matched data, so you don't have to worry about stripping them out in your data processing.
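
As a rough sketch of how this pattern behaves on the two sample lines (the extract_host helper name and the shortened sample strings are made up for illustration, not part of the original code):

```perl
use strict;
use warnings;

# Hypothetical helper: returns the host value, or undef when nothing matches.
sub extract_host {
    my ($line) = @_;
    # "? skips an optional opening quote; [^"\s]+ stops at a quote or whitespace.
    return $line =~ m/host="?([^"\s]+)/ ? $1 : undef;
}

# Shortened versions of the two log lines from the question.
my @lines = (
    'test1.colo_lvm:check:INFO:    host=test1.dom.colo.name.com',
    'test1.colo_lvm:check:INFO: --host="test1.dom.colo.name.com" --port="9999"',
);

for my $line (@lines) {
    my $host = extract_host($line);
    print defined $host ? "$host\n" : "no match\n";
}
```

Both lines yield the bare hostname here, with the quotes from the second line left out of the capture. If the second line should be skipped altogether, the optional quote would have to go.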

Amber
+1  A: 

It'll work better if you exclude spaces and quotes from the match:

host=([^\s"]+)

By excluding quotes this will match the host=... in the first line while ignoring the --host="..." in the second line.

Edit: This simple test script works for me on your sample input. What happens if you run this?

#!/usr/bin/env perl
use strict;
use warnings;

# Print the unquoted host= value from each input line that has one.
while (my $line = <>) {
    if ($line =~ /host=([^\s"]+)/) {
        print "$1\n";
    }
}
John Kugelman
shouldn't that be: `host="?([^\s"]*)`
slebetman
I'm not sure if the asker is trying to match the second line or ignore it entirely and only match the first line. I went with the latter interpretation.
John Kugelman
It's grabbing the hostname from the 2nd line with the `"test1.dom.colo.name.com"` plus port info. I don't care for the 2nd line at all.
jda6one9
Guys, I am trying your suggestions, but getting no match. It's in an if statement, `if ($line =~ m/host=([^\s"]*)/){`. I will edit my question with this.
jda6one9
@John K. Thank you! It's working fine now. Appreciate the help.
jda6one9
A: 

If hostname cannot contain whitespace then I'd do: /(host=)(\S+)/
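
A quick sketch of what this pattern captures, assuming shortened versions of the two sample lines from the question (the host_value helper is a made-up name for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: returns the second capture group, or undef on no match.
sub host_value {
    my ($line) = @_;
    return $line =~ /(host=)(\S+)/ ? $2 : undef;
}

# Unquoted form from the first log line.
print host_value('INFO:    host=test1.dom.colo.name.com'), "\n";
# Quoted form from the second log line: \S+ treats the quotes as ordinary characters.
print host_value('--host="test1.dom.colo.name.com" --port="9999"'), "\n";
```

Since \S only excludes whitespace, the second line still matches and the capture keeps the surrounding double quotes.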

slebetman
Close, but this picked up the 2nd line with the double quotes, which I don't need.
jda6one9
+1  A: 

Here is a regex to do that:

/host="?([^\s"]+)"?/

Your first line does not have quotes around the data; the second line does. Hence the "? construct. Presumably the value cannot contain a space (or a closing quote), so grab everything other than those. Hence ([^\s"]+)

Cheers!

Edit: This works:

use strict; use warnings;
my $i=1;
while (<DATA>) {
    print "match on line $i: $1\n" if /host="?([^\s"]+)"?/;
    $i++;
}

__DATA__
Tue Aug 24 10:22:14 2010: test1.colo_lvm:check:INFO:    host=test1.dom.colo.name.com
Tue Aug 24 10:22:14 2010: test1.colo_lvm:check:INFO: "/home/bin64"/admin --user="foo-bar" --password="*****" --host="test1.dom.colo.name.com" --port="9999" --socket="/tmp" variables

Output:

match on line 1: test1.dom.colo.name.com
match on line 2: test1.dom.colo.name.com
drewk
Tried this too, no luck. It doesn't grab anything. `if ($line =~ m/host="?([^\s"]+)"?/){`
jda6one9
@jda6one9: You posted sample data different than you tried "at home." The regex I posted works perfectly on your posted sample data - both in Perl and in a regex sim. What is different?
drewk
@drewk: This works fine too. Sorry for the confusion.
jda6one9