Because of the more tedious way of adding hosts to be monitored in Nagios (it requires defining a host object, as opposed to the previous program which only required the IP and hostname), I figured it'd be best to automate this, and it'd be a great time to learn Perl, because all I know at the moment is C/C++ and Java.

The file I read from looks like this:

xxx.xxx.xxx.xxx hostname #comments. i.dont. care. about

All I want are the first 2 bunches of characters. These are obviously space delimited, but for the sake of generality, it might as well be anything. To make it more general, why not the first and third, or fourth and tenth? Surely there must be some regex action involved, but I'll leave that tag off for the moment, just in case.

+5  A: 

A simple one-liner is

perl -nae 'print "$F[0] $F[1]\n";'

You can change the delimiter with -F (for example, -F: for colon-delimited input).

David Nehme
+5  A: 

Let's turn this into code golf! Based on David's excellent answer, here's mine:

perl -ane 'print "@F[0,1]\n";'

Edit: A real golf submission would look more like this (shaving off five strokes):

perl -ape '$_="@F[0,1]
"'

but that's less readable for this question's purposes. :-P

Chris Jester-Young
Well played, sir.
Axeman
Thanks! I amended the entry with something even golfier, but probably more indecipherable too. :-P
Chris Jester-Young
+2  A: 

David Nehme said:

perl -nae 'print "$F[0] $F[1]\n";'

which uses the -a switch. I had to look that one up:

-a   turns on autosplit mode when used with a -n or -p.  An implicit split
     command to the @F array is done as the first thing inside the implicit
     while loop produced by the -n or -p.

You learn something every day. -n wraps your program in the following loop, which reads each line into $_:

LINE:
    while (<>) {
        ...             # your program goes here
    }

And finally, -e lets you enter a line of program directly on the command line. You can give more than one -e. Most of this was lifted from the perlrun(1) manpage.
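
To make the switches concrete, here is a rough sketch of what perl -ane 'print "@F[0,1]\n"' amounts to when written out as an ordinary script. It approximates the expansion described in perlrun rather than Perl's literal internals. The first_two helper and the sample lines are made up for illustration; a real run would loop over <> instead of a fixed list.

```perl
use strict;
use warnings;

# Hypothetical helper: does by hand what -a does for each line,
# then builds the string the one-liner would print.
sub first_two {
    my ($line) = @_;
    my @F = split ' ', $line;   # -a: implicit split of the line into @F
    return "@F[0,1]\n";         # an array slice interpolates with spaces
}

# Made-up sample input; -n would instead read each line via while (<>).
my @lines = (
    "192.0.2.10 alpha #comments. i.dont. care. about\n",
    "192.0.2.11 beta\n",
);

print first_two($_) for @lines;
```

Each sample line comes out as just its first two fields, which is what the one-liner prints for the questioner's file.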

trenton
"autosplit" mode is also known as "awk mode", and the use of @F as the name of the array is taken from awk.
rjray
Perl 5.10 adds `-E`, a variant of `-e` that also enables that release's optional features (essentially doing `use 5.010` for you).
Brad Gilbert
+6  A: 

Here's a general solution (if we step away from code-golfing a bit).

#!/usr/bin/perl -n
chop;                     # strip newline (in case next line doesn't strip it)
s/#.*//;                  # strip comments
next unless /\S/;         # don't process line if it has nothing (left)
@fields = (split)[0,1];   # split line, and get wanted fields
print join(' ', @fields), "\n";

Normally split splits by whitespace. If that's not what you want (e.g., parsing /etc/passwd), you can pass a delimiter as a regex:

@fields = (split /:/)[0,2,4..6];

Of course, if you're parsing colon-delimited files, chances are also good that such files don't have comments and you don't have to strip them.
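
For the "first and third, or fourth and tenth" part of the question, the slice idea can be wrapped in a small helper. This is only a sketch: pick_fields, its argument order, and the sample lines are invented here, and it silently drops any requested field that the line doesn't have, which may or may not be what you want.

```perl
use strict;
use warnings;

# Hypothetical helper: return the requested zero-based fields of a line.
# $indices is an array ref; $delim is an optional qr// regex. Without a
# delimiter, the line is split on whitespace like a bare split.
sub pick_fields {
    my ($line, $indices, $delim) = @_;
    chomp $line;
    my @all = defined $delim ? split($delim, $line) : split(' ', $line);
    return grep { defined } @all[@$indices];   # skip fields that don't exist
}

# Whitespace-delimited hosts line: fields 0 and 1.
print join(' ', pick_fields("192.0.2.10 alpha #note\n", [0, 1])), "\n";

# Colon-delimited passwd-style line: fields 0, 2, and 4 through 6.
print join(' ', pick_fields("root:x:0:0:root:/root:/bin/sh\n",
                            [0, 2, 4 .. 6], qr/:/)), "\n";
```

Passing the indices as a list keeps the field choice out of the parsing code, so switching from the first two fields to the fourth and tenth is a one-argument change.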

Chris Jester-Young
You should almost always use chomp instead of chop. chop always removes the last character from a string. chomp removes the current line terminator (normally "\n") from the string, if present. If the line doesn't end with the terminator, chomp does nothing. chop may remove stuff you don't expect.
cjm
The Unix way is that all text files end with a newline. Thus, you never read a line with no newline at the end, unless your file is stuffed. This goes double for files like the ones in /etc. :-)
Chris Jester-Young
Just curious, chop doesn't refer to anything in particular. Are you piping the file into the program in this case?
ray
Perl has a lot of "implicit stuff", to make programs succinct (Python people hate that, hence Python's rule is to be explicit). chop uses $_ by default, as does split, as does pattern matching. [continues]
Chris Jester-Young
[continued] The -n option (see line 1) makes Perl read lines (from stdin if no arguments, otherwise each named file) into $_, and the whole program is really in a while loop. That's why the "next" statement (equivalent to "continue" in C) works.
Chris Jester-Young
I've had plenty of unix files end without a newline. I think you're confusing the issue with editors that add the last newline for you. Either way, why not use the safer one regardless?
brian d foy
+5  A: 

The one-liner is great, if you're not writing more Perl to handle the result.

More generally though, in the context of a larger Perl program, you would either write a custom regular expression, for example:

if($line =~ m/(\S+)\s+(\S+)/) {
     $ip = $1;
     $hostname = $2;
}

... or you would use the split operator.

my @arr = split(/ /, $line);
$ip = $arr[0];
$hostname = $arr[1];

Either way, add logic to check for invalid input.
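
As one possible shape for that check, here is a minimal sketch. The parse_host_line name and the sample lines are made up, and the IP test is only a loose dotted-quad shape check (it assumes IPv4 and does not range-check the octets).

```perl
use strict;
use warnings;

# Hypothetical helper: return (ip, hostname) for a usable line,
# or an empty list for blank, comment-only, or malformed lines.
sub parse_host_line {
    my ($line) = @_;
    $line =~ s/#.*//;                         # drop any trailing comment
    my ($ip, $hostname) = split ' ', $line;   # first two whitespace-separated fields
    return unless defined $hostname;          # fewer than two fields
    # Loose shape check only: four dot-separated groups of 1-3 digits.
    return unless $ip =~ /\A\d{1,3}(?:\.\d{1,3}){3}\z/;
    return ($ip, $hostname);
}

for my $line ("192.0.2.10 alpha #comment\n", "# only a comment\n", "bogus\n") {
    if (my ($ip, $hostname) = parse_host_line($line)) {
        print "ok: $ip $hostname\n";
    } else {
        print "skipped: $line";
    }
}
```

Returning an empty list for bad lines lets the caller test the list assignment directly in the if condition.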

slim
I'd say that it's more idiomatic to do list assignment: e.g., ($ip, $hostname) = ($1, $2) in the first case, or ($ip, $hostname) = (split ' ', $line)[0,1] in the second. (The 0,1 is just in case people want to use other numbers. If not, ($ip, $hostname) = split ' ', $line will work just fine.)
Chris Jester-Young
No, more idiomatic would be "if (my ($ip,$hostname) = $line =~ /(\S+)\s+(\S+)/) {".
ysth
Both of you are right, but I'm not of the opinion that idiomatic == better.
slim
A: 

Since ray asked, I thought I'd rewrite my whole program without using Perl's implicitness (except the use of <ARGV>; that's hard to write out by hand). This will probably make Python people happier (braces notwithstanding :-P):

while (my $line = <ARGV>) {
    chop $line;
    $line =~ s/#.*//;
    next unless $line =~ /\S/;
    @fields = (split ' ', $line)[0,1];
    print join(' ', @fields), "\n";
}

Is there anything I missed? Hopefully not. The ARGV filehandle is special. It causes each named file on the command line to be read, unless none are specified, in which case it reads standard input.

Edit: Oh, I forgot. split ' ' is magical too, unlike split / /. The latter just matches a space. The former matches any amount of any whitespace. This magical behaviour is used by default if no pattern is specified for split. (Some would say, but what about /\s+/? ' ' and /\s+/ are similar, except for how whitespace at the beginning of a line is treated. So ' ' really is magical.)
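
A quick way to see the difference is to split the same line all three ways. The sample line below is made up; it has leading spaces and runs of spaces on purpose.

```perl
use strict;
use warnings;

my $line = "  192.0.2.10   alpha\n";   # leading spaces, runs of spaces

my @awk    = split ' ',   $line;   # leading whitespace skipped, runs collapsed
my @regex  = split /\s+/, $line;   # same, but keeps an empty leading field
my @single = split / /,   $line;   # every single space is a separator

# Number of fields each variant produces.
print scalar(@awk), " ", scalar(@regex), " ", scalar(@single), "\n";   # prints "2 3 6"
```

With / /, the last field is also still "alpha\n", since a newline is not a space; the other two treat it as trailing whitespace and drop it.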

The moral of the story is, Perl is great if you like lots of magical behaviour. If you won't have a bar of it, use Python. :-P

Chris Jester-Young
chomp instead of chop, just in case the last line of the file is missing its newline.
ysth
ysth: The question concerns Unix as far as I can tell, and Unix text files are always meant to end with a newline. This goes double for files supposedly in /etc.
Chris Jester-Young
I used to religiously use chomp, for many years, but have come to the conclusion that it's unnecessary (for many purposes) and that's why chop is there in the first place.
Chris Jester-Young