Hi guys.

How can I check whether a line (the value of $_) is a blank line in Perl? Or is there a better way to check it than using $_?

I want to write something like this:

if ($_ eq '')  # Check whether the current line is blank (no characters at all)
        {
            $x = 0; 
        }

Thank you.


Update: my code, with the solution from the answers, is below.

My test.txt to parse:

constant fixup private GemAlarmFileName = <A "C:\\TMP\\ALARM.LOG">
    vid = 0
    name = ""
    units = ""

constant fixup private GemConfigAlarms = <U1 0>         /*  my Comment  */
    vid = 1
    name = "CONFIGALARMS"
units = ""
min = <U1 0>
max = <U1 2>
default = <U1 0>

My code is below.

This is why I need to reset $x = 0 on a blank line. I am not sure whether this is a normal solution.

sub ConstantParseAndPrint
{

        if (/^$/)   # SOLUTION!
        {
            $x = 0;
        }
        if ($x == 0)
    {
            if(/^\s*(constant)\s*(fixup|\/\*fixup\*\/|)\s*(private|)\s*(\w+)\s+=\s+<([a-zA-Z0-9]+)\s+(["']?)([a-zA-Z0-9.:\\]+)\6>\s*(\/\*\s*(.*?)\s*\*\/|)(\r|\n|\s)/)
            {
                    $name1 = $1; # constant
                    $name2 = $2; # fixup
                    $name3 = $3; # private
                    $name4 = $4; 
                    $name5 = $5; 
                    $name6 = $7; 
                    $name7 = $8;
                    # start print 
                    if ($name7 ne '')
                    {
                    print DEST_XML_FILE "<!-- $name7-->\n";
                    }
                    print DEST_XML_FILE  "  <ECID";
                    print DEST_XML_FILE " logicalName=\"$name4\""; 
                    print DEST_XML_FILE " valueType=\"$name5\""; 
                    print DEST_XML_FILE " value=\"$name6\""; 
                    $x = 1;
            }
    }
    elsif ($x == 1)
    {
        if(/\s*vid\s*=\s*(.*?)(\s|\n|\r)/)
        {
            $nID = $1;
                        print DEST_XML_FILE " vid=\"$nID\"";
            $x = 2;
        }
    }
        elsif ($x == 2)
    {
        if(/\s*name\s*=\s*(.*?)(\s|\n|\r)/)
        {
            $nName = $1;
                        print DEST_XML_FILE " name=$nName";
            $x = 3;
        }
    }
        elsif ($x == 3)
    {
        if(/\s*units\s*=\s*(.*?)(\s|\n|\r)/)
        {
            $nUnits = $1;
                        print DEST_XML_FILE " units=$nUnits";
            $x = 4;

        }
    }
        elsif ($x == 4)
    {       # \s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9]+)>\
        if(/\s*min\s*=\s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9]+)>(\s|\n|\r)/)
        {
                        #$nMinName1 = $1;
                        $nMinName2 = $2; # find the nMin Value
                        #$nMinName3 = $3;
                        #$nMinName4 = $4;
                        print DEST_XML_FILE " min=\"$nMinName2\"";
            $x = 5;
        }
                else
                {
                    print DEST_XML_FILE  "></ECID>\n";
                    $x = 0; # there is no line 4 and line 5
                }
    }
        elsif ($x == 5)
    {
        if(/\s*max\s*=\s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9]+)>(\s|\n|\r)/)
        {
            #$nMaxName1 = $1;
                        $nMaxName2 = $2; # find the nMax Value
                        #$nMaxName3 = $3;
                        #$nMaxName4 = $4;
                        print DEST_XML_FILE " max=\"$nMaxName2\"";
            $x = 6;
        }

    }
        elsif ($x == 6)
    {
        if(/\s*default\s*=\s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9]+)>(\s|\n|\r)/)
        {
            #$nDefault1 = $1;
                        $nDefault2 = $2; # find the default Value
                        #$nDefault3 = $3;
                        #$nDefault4 = $4;
            print DEST_XML_FILE " default=\"$nDefault2\">";
                        print DEST_XML_FILE  "</ECID>\n";
                        $x = 0;  
        }


    }
}

THANK YOU ALL.

+8  A: 
if ($_ =~ /^\s*$/) {
   # blank
}

This checks for zero or more whitespace characters (\s*) anchored between the beginning (^) and the end ($) of the line. That is a check for a blank line (i.e. one that may contain whitespace). If you want to check for an empty line, just remove the \s*.

The check against $_ can be implicit, so you can reduce the above to if (/^\s*$/) for conciseness.
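
A minimal sketch of the difference between the two checks. The helper names is_blank and is_empty are made up for this illustration and are not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
# is_blank: the line holds nothing but whitespace (or nothing at all).
sub is_blank { my ($line) = @_; return $line =~ /^\s*$/ ? 1 : 0 }
# is_empty: the line holds no characters apart from an optional trailing newline.
sub is_empty { my ($line) = @_; return $line =~ /^$/ ? 1 : 0 }

for my $line ( "\n", "   \t\n", "x = 0\n" ) {
    ( my $shown = $line ) =~ s/\n/\\n/;   # make the newline visible when printing
    $shown =~ s/\t/\\t/;                  # same for the tab
    printf "%-10s blank=%d empty=%d\n", "'$shown'", is_blank($line), is_empty($line);
}
```

A line of spaces and tabs counts as blank but not as empty, which is why the \s* version is usually the safer choice when reading hand-edited files.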

Brian Agnew
Hi Brian, i replaced my code with "if ($_ ~= /^\s*$/)" - show syntax error?
Nano HE
My mistake. Reverse the ~= to =~.
Brian Agnew
It works well now. THANK YOU A LOT.
Nano HE
+4  A: 

You can use:

if ($_ =~ /^$/)

or even just

if (/^$/)

since Perl assumes checking against $_
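
Note that $ also matches just before a trailing newline, so /^$/ matches both "" and "\n". A small sketch comparing it with \z, which matches only at the very end of the string; the helper name anchors is an assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a string matches /^$/ and /^\z/.
sub anchors {
    my ($s) = @_;
    my $dollar = $s =~ /^$/  ? 1 : 0;   # $ tolerates one trailing newline
    my $z      = $s =~ /^\z/ ? 1 : 0;   # \z is the absolute end of the string
    return ( $dollar, $z );
}

my $unchomped = "\n";   # an empty line as read from a file, newline still attached
my $chomped   = "";     # the same line after chomp

for my $s ( $unchomped, $chomped ) {
    my ( $dollar, $z ) = anchors($s);
    printf "length %d: dollar=%d z=%d\n", length $s, $dollar, $z;
}
```

If the line has not been chomped, only the $ form treats it as empty; after chomp both forms agree.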

klausbyskov
both styles work well. thank you.
Nano HE
You don't want to use $ there since it allows a trailing newline. Use the \z to mean the absolute end of string. Use the more precise tool when you can.
brian d foy
Brian, thank you for pointing that out.
klausbyskov
A: 

The way you showed - if ( $_ eq '' ) - is perfectly sane. Perhaps you should describe what your problem with it is?

depesz
That only checks for an *empty* line with no chars on it. A *blank* line might contain whitespaces. I would use your version but then again I always use trim().
Nifle
Hi Depesz, I updated my code above. That's why i want to initial the $x=0 when read a new blank line. thank you for your care.
Nano HE
A: 
if(/^\s*$/)
{
    $x = 0;
}
Grimmy
A: 

If you just want to check whether the current value of $_ or $var is a blank (or at least all-whitespace) line, then you can use something like

if (/^\s*$/) { ... }
if ($var =~ /^\s*$/){ ... }

as several others have already mentioned.

However, I find that I most commonly want to ignore blank lines while processing input in a loop. I do that like this:

while (<>) {
    next if /^\s*$/;
    ...
}

If I want to allow the traditional shell-style comments, I usually add

s/\s*#.*$//;

just before the check for a blank line.
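
Put together, the two steps might look like the sketch below. The helper name significant_lines and the sample input are assumptions for illustration. Note that this simple substitution would also cut a # that appears inside a quoted value.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines that still have content once
# shell-style comments are stripped and blank lines are dropped.
sub significant_lines {
    my @kept;
    for my $line (@_) {
        ( my $copy = $line ) =~ s/\s*#.*$//;   # strip the comment and the space before it
        next if $copy =~ /^\s*$/;              # skip lines that are now blank
        chomp $copy;
        push @kept, $copy;
    }
    return @kept;
}

my @input = (
    "# header comment\n",
    "\n",
    "vid = 6   # the id\n",
    "   \n",
    "units = \"s\"\n",
);
print "$_\n" for significant_lines(@input);
```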

Dale Hagglund
Thanks for details.
Nano HE
+1  A: 
while (<>){
    chomp;
    if ($_ eq ""){
        print "blank at $.\n";
    }
}
ghostdog74
This gives a false match for a line with just 0.
jmcnamara
fixed. thank you
ghostdog74
+5  A: 

The answer depends on what you mean by a blank line (whether it contains no characters apart from a newline or whether it contains only whitespace). An idiomatic way to deal with this is to use a negative match against \S which matches in both of these cases:

if ( ! /\S/ ) {
    ...
}

If you are only looking for the former, then your own answer is fine.

You often see this technique used as a filter:

while (<>) {
    next unless /\S/; # Ignore blank lines.
    ...
}
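
The same test also works outside a read loop, for example with grep on lines that are already in memory. A small sketch with made-up sample data:

```perl
use strict;
use warnings;

# Sample data invented for this example: two content lines and three blank ones.
my @lines = ( "constant a\n", "\n", "  \t \n", "    vid = 0\n", "" );

# Keep only lines that contain at least one non-whitespace character.
my @nonblank = grep { /\S/ } @lines;

print scalar(@nonblank), " of ", scalar(@lines), " lines kept\n";
```

This prints "2 of 5 lines kept": the empty string, the bare newline and the whitespace-only line all fail the /\S/ match.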
jmcnamara
Thank you for details. It's very simple style to cover my need.
Nano HE
@jmcnamara: voted up. Your "next unless /\S/" is much better than the "next if /^\s*$/" I am used to!
+3  A: 

Against my better judgment I will try to help you again.

The issue is not how to find a blank line. The issue is not which regex to use. The fundamental issue is understanding how to analyze a problem and turn that analysis into code.

In this case the problem is "How do I parse this format?"

I've written a parser for you. I have also taken the time to write a detailed description of the process I used to write it.

WARNING: The parser is not carefully tested for all cases. It does not have enough error handling built in. For those features, you can request a rate card or write them yourself.

Here's the data sample you provided (I'm not sure which of your several questions I pulled this from):

constant fixup GemEstabCommDelay = <U2 20>
    vid = 6
    name = "ESTABLISHCOMMUNICATIONSTIMEOUT"
    units = "s"
    min = <U2 0>
    max = <U2 1800>
    default = <U2 20>


constant fixup private GemConstantFileName = <A "C:\\TMP\\CONST.LOG">
    vid = 4
    name = ""  units = ""


constant fixup private GemAlarmFileName = <A "C:\\TMP\\ALARM.LOG">
    vid = 0
    name = ""
    units = ""  

Before you can write a parser for a data file, you need to have a description of the structure of the file. If you are using a standard format (say XML) you can read the existing specification. If you are using some home-grown format, you get to write it yourself.

So, based on the sample data, we can see that:

  1. data is broken into blocks.
  2. each block starts with the word constant in column 0.
  3. each block ends with a blank line.
  4. a block consists of a start line, and zero or more additional lines.
  5. The start line consists of the keyword constant followed by one or more whitespace delimited words, an '=' sign and an <> quoted data value.
    • The last keyword appears to be the name of the constant. Call it constant_name
    • The <>-quoted data appears to be a combined type/value specifier.
    • earlier keywords appear to specify additional metadata about the constant. Let's call those options.
  6. The additional lines specify additional key value pairs. Let's call them attributes. Attributes may have a single value or they may have a type/value specifier.
  7. One or more attributes may appear in a single line.

Okay, so now we have a rough spec. What do we do with it?

How is the format structured? Consider the logical units of organization from largest to smallest. These will determine the structure and flow of our code.

  • A FILE is made of BLOCKS.
  • BLOCKS are made of LINES.

So our parser should decompose a file into blocks, and then handle the blocks.
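
As an aside, the FILE to BLOCKS step can also be sketched with Perl's paragraph mode (setting $/ to the empty string), which is a different technique from the hand-written block reader used in the rest of this answer. The helper name split_blocks is hypothetical, and the sketch assumes the separator lines are truly empty, since paragraph mode does not treat whitespace-only lines as separators.

```perl
use strict;
use warnings;

# Hypothetical helper: split text into blocks using paragraph mode.
# Returns a list of array refs, one per block, each holding that block's lines.
sub split_blocks {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    local $/ = "";                 # paragraph mode: a record ends at a run of empty lines
    my @blocks;
    while ( my $block = <$fh> ) {
        $block =~ s/\n+\z//;       # drop the newlines kept at the end of the record
        push @blocks, [ split /\n/, $block ];
    }
    return @blocks;
}

my $text = <<'END';
constant fixup private GemConstantFileName = <A "C:\\TMP\\CONST.LOG">
    vid = 4


constant fixup private GemAlarmFileName = <A "C:\\TMP\\ALARM.LOG">
    vid = 0
    name = ""
END

my @blocks = split_blocks($text);
printf "%d blocks, %d and %d lines\n", scalar @blocks, map { scalar @$_ } @blocks;
```

The explicit loop used later in this answer is more tolerant, because it also accepts separator lines that contain stray whitespace.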

Now we rough out a parser in comments:

# Parse a constant spec file.

# Until file is done:
    # Read in a whole block
    # Parse the block and return key/value pairs for a hash.

    # Store a ref to the hash in a big hash of all blocks, keyed by constant_name.

# Return ref to big hash with all block data

Now we start to fill in some code:

# Parse a constant spec file.
sub parse_constant_spec {
    my $fh = shift;

    my %spec;

    # Until file is done:
        # Read in a whole block
    while( my $block = read_block($fh) ) {

        # Parse the block and return key/value pairs for a hash.
        my %constant = parse_block( $block );

        # Store a ref to the hash in a big hash of all blocks, keyed by constant_name.
        $spec{ $constant{const_name} } = \%constant;

    }

    # Return ref to big hash with all block data
    return \%spec;
}

But it won't work. The parse_block and read_block subs haven't been written yet. At this stage that's OK. The point is to rough in features in small, understandable chunks. Every once in a while, to keep things readable, you need to gloss over the details and drop in a subroutine. Otherwise you wind up with monstrous 1000-line subs that are impossible to debug.

Now we know we need to write a couple of subs to finish up, et voilà:

#!/usr/bin/perl
use strict;
use warnings;

use Data::Dumper;

my $fh = \*DATA;

print Dumper parse_constant_spec( $fh );


# Parse a constant spec file.
# Pass in a handle to process.
# As long as it acts like a file handle, it will work.
sub parse_constant_spec {
    my $fh = shift;

    my %spec;

    # Until file is done:
        # Read in a whole block
    while( my $block = read_block($fh) ) {

        # Parse the block and return key/value pairs for a hash.
        my %constant = parse_block( $block );

        # Store a ref to the hash in a big hash of all blocks, keyed by constant_name.
        $spec{ $constant{const_name} } = \%constant;

    }

    # Return ref to big hash with all block data
    return \%spec;
}

# Read a constant definition block from a file handle.
# Returns nothing (an empty list) when there is no data left in the file.
# Otherwise returns an array ref containing the lines in the block.
sub read_block {
    my $fh = shift;

    my @lines;
    my $block_started = 0;

    while( my $line = <$fh> ) {

        $block_started++ if $line =~ /^constant/;

        if( $block_started ) {

            last if $line =~ /^\s*$/;

            push @lines, $line;
        }
    }

    return \@lines if @lines;

    return;
}


sub parse_block {
    my $block = shift;
    my ($start_line, @attribs) = @$block;

    my %constant;

    # Break down first line:
    # First separate assignment from option list.
    my ($start_head, $start_tail) = split /=/, $start_line;

    # work on option list
    my @options = split /\s+/, $start_head;

    # Recover constant_name from options:
    $constant{const_name} = pop @options;
    $constant{options} = \@options;

    # Now we parse the value/type specifier
    @constant{'type', 'value' } = parse_type_value_specifier( $start_tail );

    # Parse attribute lines.
    # since we've already got multiple per line, get them all at once.
    chomp @attribs;
    my $attribs = join ' ', @attribs;

    #  we have one long line of mixed key = "value" or key = <TYPE VALUE> 

    @attribs = $attribs =~ /\s*(\w+\s+=\s+".*?"|\w+\s+=\s+<.*?>)\s*/g;

    for my $attrib ( @attribs ) {
        warn "$attrib\n";
        # Limit the split to two fields so an '=' inside the value is preserved.
        my ($name, $value) = split /\s*=\s*/, $attrib, 2;

        if( $value =~ /^"/ ) { 
            $value =~ s/^"|"\s*$//g;
        }
        elsif( $value =~ /^</ ) {
           $value = [ parse_type_value_specifier( $value ) ];
        }
        else {
            warn "Bad line";
        }

        $constant{ $name } = $value;
    }

    return %constant;
}

sub parse_type_value_specifier {
    my $tvs = shift;

    my ($type, $value) = $tvs =~ /<(\w+)\s+(.*?)>/;

    return $type, $value;
}

__DATA__
constant fixup GemEstabCommDelay = <U2 20>
    vid = 6
    name = "ESTABLISHCOMMUNICATIONSTIMEOUT"
    units = "s"
    min = <U2 0>
    max = <U2 1800>
    default = <U2 20>


constant fixup private GemConstantFileName = <A "C:\\TMP\\CONST.LOG">
    vid = 4
    name = ""  units = ""


constant fixup private GemAlarmFileName = <A "C:\\TMP\\ALARM.LOG">
    vid = 0
    name = ""
    units = ""  

The above code is far from perfect. IMO, parse_block is too long and ought to be broken into smaller subs. Also, there isn't nearly enough validation and enforcement of well-formed input. Variable names and descriptions could be clearer, but I don't really understand the semantics of your data format. Better names would more closely match the semantics of the data format.

Despite these issues, it does parse your format and produce a big handy data structure that can be stuffed into whatever output format you want.

If you use this format in many places, I recommend putting the parsing code into a module. See perldoc perlmod for more info.

Now, please stop using global variables and ignoring good advice. Please start reading the perldoc, read Learning Perl and Perl Best Practices, use strict, use warnings. While I am throwing reading lists around go read Global Variables are Bad and then wander around the wiki to read and learn. I learned more about writing software by reading c2 than I did in school.

If you have questions about how this code works, why it is laid out as it is, what other choices could have been made, speak up and ask. I am willing to help a willing student.

Your English is good, but it is clear you are not a native speaker. I may have used too many complex sentences. If you need parts of this written in simple sentences, I can try to help. I understand that working in a foreign language is very difficult.

daotoad
@daotoad, You posted a great tutorial for me. I am reading and practising. btw, i enjoy learning programming and english at the same time. THANK YOU A LOT.
Nano HE
Hi daotoad, I have a question before I move on. For the attribute parsing you wrote `@attribs = $attribs =~ /\s*(\w+\s+=\s+".*?"|\w+\s+=\s+<.*?>)\s*/g;` and commented *# we have one long line of mixed key = "value" or key = <TYPE VALUE>*. Actually I must also handle another case, **key = value**, with no double quotes around the value, like vid = 6 above. I tried to modify your code on my laptop, but failed to improve it.
Nano HE
btw, I managed to print the **type, value and attribute items** from inside **sub parse_block**. Is there a good way to print the complex hash **\%spec**, with the result formatted as XML: *<ECID logicalName="GemAlarmFileName" valueType="A" value="C:\\TMP\\ALARM.LOG" vid="0" name="" units=""></ECID>*
Nano HE
@Nano, I haven't tested it, but you should be able to handle the *key = value* case with a clause like `\w+\s+=\s+\w+\s+`, just add it as another alternate inside the capture.
daotoad
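
One possible way to cover all three value forms in a single pattern is sketched below. It is not the regex from the answer. The helper name split_attribs is hypothetical, and the third alternative rests on an assumption: a bare value is a run of characters that are not whitespace, a double quote or an opening angle bracket.

```perl
use strict;
use warnings;

# Hypothetical helper: split one joined attribute line into [key, value] pairs.
# Assumes a bare value is a run of characters other than whitespace, '"' and '<'.
sub split_attribs {
    my ($line) = @_;
    my @pairs;
    # Value alternatives, tried in order: "quoted", <TYPE VALUE>, bare.
    while ( $line =~ /(\w+)\s*=\s*(".*?"|<.*?>|[^\s"<]+)/g ) {
        push @pairs, [ $1, $2 ];
    }
    return @pairs;
}

my $joined = 'vid = 6 name = "ESTABLISHCOMMUNICATIONSTIMEOUT" units = "s" min = <U2 0>';
print "$_->[0] => $_->[1]\n" for split_attribs($joined);
```

Capturing the key and the value separately also removes the need for the later split on '='.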
@Nano, in regard to printing the hash, look at its structure by printing it with Data::Dumper. `use Data::Dumper; print Dumper \%spec;`. Then refer to the data structures cookbook in perldoc. There are many examples of accessing data in big structures. http://perldoc.perl.org/perldsc.html perlreftut also has some good things to say about working with complex data structures. http://perldoc.perl.org/perlreftut.html Once you can access the data, you have many options. You can use print statements with interpolation to assemble the XML. You could also use an XML library or a templating system.
daotoad
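
A rough sketch of the "print statements with interpolation" option, producing the ECID line asked about above. The helper names xml_escape and ecid_xml are hypothetical, and so is the shape of the sample hash: it assumes the keys const_name, type and value from the parser, quotes already stripped from string values, and [type, value] array refs for typed attributes.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in XML attributes.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: render one parsed block as an <ECID> element.
# Array-ref values are assumed to be [type, value] pairs; only the value is printed.
sub ecid_xml {
    my ($c) = @_;
    my @attrs = (
        [ logicalName => $c->{const_name} ],
        [ valueType   => $c->{type} ],
        [ value       => $c->{value} ],
    );
    for my $key (qw(vid name units min max default)) {
        next unless exists $c->{$key};
        my $v = $c->{$key};
        push @attrs, [ $key => ( ref($v) eq 'ARRAY' ? $v->[1] : $v ) ];
    }
    my $attr_text = join ' ',
        map { sprintf '%s="%s"', $_->[0], xml_escape( $_->[1] ) } @attrs;
    return "<ECID $attr_text></ECID>";
}

# Sample block written by hand for this example.
my %constant = (
    const_name => 'GemAlarmFileName',
    type       => 'A',
    value      => 'C:\\\\TMP\\\\ALARM.LOG',   # two literal backslashes per separator
    vid        => 0,
    name       => '',
    units      => '',
);
print ecid_xml( \%constant ), "\n";
```

To print the whole structure, loop over the values of the hash returned by parse_constant_spec and call the helper on each one. For anything beyond simple attribute values, an XML library is the safer route.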