I am just trying to learn some Perl, so I am going through one function to get a grip on the language. Could somebody explain exactly what this function is doing?

#! /usr/bin/perl 
use strict; 

my %hash; 

&Parse('first.txt'); 
&Parse('second.txt'); 

my $outputpath = 'output.txt'; 
unlink ($outputpath); 
open (OUTPUT, ">>$outputpath") || die "Failed to open OUTPUT ($outputpath) - $!"; 
print OUTPUT "$_ \t" . join("\t", @{$hash{$_}}) . "\n" foreach (sort keys %hash); 
close (OUTPUT) || die "Failed to close OUTPUT ($outputpath) - $!"; 

sub Parse { 
    my $inputpath = shift; 
    open (INPUT, "<$inputpath") || die "Failed to open INPUT ($inputpath) - $!"; 
    while (<INPUT>) { 
        chomp; 
        my @row = split(/\t/, $_); 
        my $col1 = $row[0]; 
        shift @row; 
        push(@{$hash{$col1}}, @row); 
    } 
    close (INPUT) || die "Failed to close INPUT ($inputpath) - $!"; 
    return 1; 
}

I am more interested in shift and push and chomp.

+2  A: 

Look up the documentation.

Svante
+4  A: 

Edit: You posted some extra code, I'll comment that as well.

#!/usr/bin/perl 
#The first line (has to be first, hence this comment comes after) allows the linux shell to know 
#this is a perl program, and to call perl to execute it.

#use strict: allow stricter checking of perl syntax. You should always do this.
use strict; 

#declare a file-scoped lexical hash called %hash (visible to the sub below) - not a very good name...
my %hash; 

#call the sub Parse with 'first.txt' as argument
&Parse('first.txt'); 
&Parse('second.txt');  #same thing, different parameter


my $outputpath = 'output.txt'; 

#destroy the file declared above if it exists
unlink ($outputpath); 

# open the file for append (could simply have opened it for output with ">" and not used the unlink above...)
open (OUTPUT, ">>$outputpath") || die "Failed to open OUTPUT ($outputpath) - $!"; 

#print one line to OUTPUT for each key of the hash
#the trailing foreach runs over the keys of %hash in sorted order, putting each key in $_
#each hash entry holds an array reference; @{$hash{$_}} dereferences it
#the join function pastes the elements of that array into one string, separated by tabs
print OUTPUT "$_ \t" . join("\t", @{$hash{$_}}) . "\n" foreach (sort keys %hash); 

#Close the outputfile
close (OUTPUT) || die "Failed to close OUTPUT ($outputpath) - $!"; 
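
The print line packs a loop, a dereference and a join into one statement. As a rough sketch of the same logic unrolled, here it is with a small hand-made hash (the keys and values are made up for illustration) and with the lines collected in an array instead of written to a file:

```perl
use strict;
use warnings;

# Hypothetical sample data: each key maps to a reference to an array.
my %hash = (
    banana => [ 'yellow', 'long' ],
    apple  => [ 'red', 'round', 'sweet' ],
);

my @lines;
foreach my $key (sort keys %hash) {      # keys in alphabetical order
    my @values = @{ $hash{$key} };       # dereference the array reference
    # Same layout as the original: key, space, tab, tab-joined values, newline
    push @lines, "$key \t" . join("\t", @values) . "\n";
}

print @lines;
```

Note that the original format string puts a space before the first tab, so every key in output.txt is followed by a space and then a tab.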

This program was probably written some time ago - Modern Perl looks a bit different and has a few best practices that are not here yet.

Don't use this as an example on how to write Perl. Maybe Ether will rewrite this for you, if you smile nicely :)

#declare a sub
sub Parse { 
    # the parameters given to a sub are stored in @_. 
    #shift without arguments takes the first element from @_
    my $inputpath = shift; 
    #opens the file named in $inputpath for reading, using the bareword filehandle INPUT. 
    #If this fails, quit with an error message
    open (INPUT, "<$inputpath") || die "Failed to open INPUT ($inputpath) - $!"; 

    #loop over the file one line at a time, putting each line in $_
    while (<INPUT>) { 

        #chop = remove the last character, whatever it is. chomp = remove a trailing newline (the value of $/) if there is one. 
        #Without arguments, works on $_ 
        chomp; 

        #split the $_ variable (containing the row) 
        #into an array based on the tab character
        my @row = split(/\t/, $_); 

        # take the first element into col1
        my $col1 = $row[0]; 

        # shift it (remove the first element)
        shift @row; 

        # actually the two statements above can be written as one:
        # my $col1 = shift @row;

        #%hash is the file-scoped hash declared at the top of the program
        #@{$hash{$col1}} is the array stored under the key $col1 (created automatically on first use)
        #push appends the remaining columns of this row to the end of that array
        push(@{$hash{$col1}}, @row); 

        # end the while loop
    } 

    #close the file or die
    close (INPUT) || die "Failed to close INPUT ($inputpath) - $!"; 

    #end the sub, returning a true value
    return 1; 
}
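
For comparison, here is a rough sketch of how the same sub might look in a more current style: three-argument open, a lexical filehandle, and the hash passed in by reference instead of shared through a file-level variable. The name parse_into and the sample contents are assumptions for illustration only; the sample strings stand in for first.txt and second.txt so the sketch runs on its own (open accepts a reference to a string as an in-memory file).

```perl
use strict;
use warnings;

# Hypothetical rewrite of Parse: takes a hash reference and a file name
# (or a reference to a string, which open treats as an in-memory file).
sub parse_into {
    my ($hash_ref, $input) = @_;

    # Three-argument open with a lexical filehandle
    open(my $in, '<', $input)
        or die "Failed to open input - $!";

    while (my $line = <$in>) {
        chomp $line;                              # strip the trailing newline
        my ($key, @rest) = split /\t/, $line;     # first column, then the others
        next unless defined $key;                 # skip empty lines
        push @{ $hash_ref->{$key} }, @rest;       # append to that key's array
    }

    close($in) or die "Failed to close input - $!";
    return;
}

# Made-up sample contents standing in for first.txt and second.txt
my $first  = "a\t1\t2\nb\t3\n";
my $second = "a\t4\n";

my %data;
parse_into(\%data, \$first);
parse_into(\%data, \$second);

printf "%s: %s\n", $_, join(' ', @{ $data{$_} }) for sort keys %data;
```

Rows that share a first column accumulate under one key, so "a" ends up holding the values from both inputs.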
Konerak
This is the best explanation I could hope for. Thanks.
cpp_Beginner
Now I have pasted the complete program. Could you also explain the print statement?
cpp_Beginner
+1  A: 

If you have a sane perl installation the following command line commands will help:

perldoc -f shift
perldoc -f push
perldoc -f chomp

You'll also love:

perldoc perlfunc
perldoc perlvar

Don't miss the perlvar part about $_ or you won't ever get what perl is about.
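
As a small illustration of why $_ matters: many built-ins fall back to $_ when they are given no argument. A minimal sketch with made-up data:

```perl
use strict;
use warnings;

my @raw = ("alpha\n", "beta\n", "gamma\n");

my @clean;
for (@raw) {             # each element is aliased to $_ in turn
    chomp;               # same as chomp($_)
    push @clean, uc;     # uc with no argument uppercases $_
}

# Because $_ is an alias, chomp modified the elements of @raw as well.
print join(',', @clean), "\n";
```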

You'll gradually notice that perl is not primarily object oriented: it supports objects, but the implementation is pretty odd. Perl is more oriented toward getting the work done, and the work is usually the extraction or translation of some kind of data set.

Perl one liners are the most powerful command lines you will ever write:

perl -pe 's/(\d+)/$1*10/ge'

Check out the -p, -e, -n and -i switches in perldoc perlrun
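
Roughly speaking, -p wraps the -e code in a loop that reads a line into $_, runs the code, and prints $_. Here is a sketch of the same idea written out as a script: a substitution that multiplies each run of digits by ten, written with \d+ so that only non-empty runs match. The input string is made up and is read through an in-memory filehandle instead of stdin so the sketch is self-contained.

```perl
use strict;
use warnings;

# Made-up input standing in for standard input
my $input = "3 apples and 12 pears\nno numbers here\n";
open(my $in, '<', \$input) or die "Cannot open in-memory input - $!";

my $output = '';
while (<$in>) {                 # -n and -p supply this loop
    s/(\d+)/$1*10/ge;           # the -e code: /e evaluates the replacement as Perl
    $output .= $_;              # -p prints $_ after each iteration
}
close($in);

print $output;
```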

(That's one of the main reasons Perl 6 was scheduled as a major rewrite, only now it's been in the works since always and is scheduled for release the day after Duke Nukem Forever.)

shift anyway is like python's some_array.pop(0) or javascript's some_array.shift(), etc.

push is like python's some_array.append(junk) or javascript's some_array.push(more_junk), etc.
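
A minimal sketch of the two on a plain array (the values are made up):

```perl
use strict;
use warnings;

my @row = ('id42', 'red', 'round');   # made-up columns

my $first = shift @row;        # removes and returns the first element
# @row is now ('red', 'round')

push @row, 'sweet', 'crisp';   # appends one or more elements at the end
# @row is now ('red', 'round', 'sweet', 'crisp')

print "$first => @row\n";
```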

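chomp is really peculiar and is actually the safer cousin of chop: it removes the trailing end-of-line (whatever $/ is set to, "\n" by default) from a string, typically a line that has just been read from a file or stdin, and does nothing if the string does not end with it. It compensates for a quirk of the little diamond operator <> (check perldoc perlop - "I/O Operators" section): the diamond reads stdin or a command line file argument line by line, but it leaves the \n on each line.

chomp removes it after that. (chop removes the last character, whatever it is. Neither of them cleans up a stray \r left over when a \r\n file is read on a Unix-like system; that needs something like s/\r?\n\z//.)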
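
Here is a short sketch of how the two behave, following perldoc -f chomp and perldoc -f chop. The sample strings are made up.

```perl
use strict;
use warnings;

my $line = "value\n";
my $removed = chomp($line);    # returns the number of characters removed
# $line is "value", $removed is 1

chomp($line);                  # no trailing newline left, so nothing changes

my $word = "value";
chop($word);                   # chop always removes the last character
# $word is "valu"

my $dos = "value\r\n";
chomp($dos);                   # with the default $/ only "\n" goes
# $dos is "value\r"
$dos =~ s/\r\z//;              # one way to drop a leftover carriage return
```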

ZJR
A last word of warning: reading other people's perl *(or even yours, after a couple of weeks)* can be a *royal pain*, so focus on deliberate examples and manuals. Perl documentation is the most extensive I've ever seen.
ZJR
@ZJR - no, it's not reading other people's Perl. It's reading Perl by people who don't know how to do software development, more specifically how to write readable code. Said people would write unreadable code in ANY language.
DVK