views:

191

answers:

4

Hi,

I am reading a file using perl script. This file consists of strings with different characters and I am supposed to identify strings containing the character 'X'. I want to know how should I (1) print this string (containing 'X') and also (2) write this string to another file (3) count the number of 'X' characters in the whole file. The script below prints the whole file again. Any suggestions?

#!/use/bin/perl
use strict;
use warnings;

open (FILE, "/home/user/Desktop/infile.phy") || die "cant open file\n";
my @body = <FILE>;
close (FILE);
my $count= 0;
my $string = '';
foreach $_(@body){
 if ($_ =~ m/[X]/){
  print "$_";
  $count++;
  print $count;
 }
 else {
  print ;
 }
}
exit;
+1  A: 

You are printing $_ in both branches of your if-clause. Get rid of the else branch.

innaM
It might not be obvious to a newcomer that 'print ;' prints '$_', but that is what happens.
Jonathan Leffler
exactly. I could have been a bit more verbose. sorry.
innaM
A: 

Assuming "string" in your question equals "line":

use strict;
use warnings;

@ARGV=qw(/home/user/Desktop/infile.phy);

my $count = 0;
open my $outfile, '>', 'outfile' or die $!;
while (<>) {
  my $cnt = tr/X/X/;
  if ($cnt) {
    print;
    print $outfile $_;
  }
  $count += $cnt;
}

close $outfile or die $!;

print $count;
+4  A: 

Since this is code review, let's go one by one:

#!/use/bin/perl

That shebang line is most likely a typo. It should probably be

#!/usr/bin/perl

or whatever which perl returns on your system.

use strict;
use warnings;

Good.

open (FILE, "/home/user/Desktop/infile.phy") || die "cant open file\n";

No need for package global filehandles when you can use lexical filehandles. The 3-argument form of open is preferable these days. Also, the error message should indicate the file which you could not open:

my $filename = '/home/user/Desktop/infile.phy';
open my $input, '<', $filename
    or die "Cannot open '$filename' for reading: $!";

my @body = <FILE>;

You are slurping the file into an array. That is completely unnecessary in this case.

my $count  = 0;
my $string = '';

Declare and initialize (if necessary) any variables in the smallest possible scope.

my $count;

The variable $string is not used anywhere else in your code.

foreach $_(@body){

This is silly. for uses $_ if no loop variable is specified. It is easier to keep things straight if you instead specify a lexical loop variable.

for my $line ( @body ) {

However, I do not think you should slurp the file.

        if ($_ =~ m/[X]/){

That results in a successful match if the line contains an X. So, it is equivalent to /X/. However, that will not tell you the word that contained the 'X'. For that, you need to decide what a word is and do your matching at the word level.

With all that in mind, consider the following script. I have made a simplifying assumption regarding what I consider to be a word. You should be able to build on this to satisfy all the requirements:

#!/usr/bin/perl

use strict;
use warnings;

my $filename = "$ENV{TEMP}/test.txt";
open my $input, '<', $filename
    or die "Cannot open '$filename' for reading: $!";

my $count;

while ( my $line = <$input> ) {
    my @words = grep { /X/ } split /\b/, $line;
    $count += @words;
    print join(', ', @words), "\n";
}

print "$count\n";

__END__

UPDATE: If you do not care about finding the words within each line that have one or more X characters, the while loop would be simplified:

while ( <$input> ) { 
    $count += (my @matches = /(X)/g );
    print if @matches;
}

by using $_. That, however, is probably inefficient (given that we are saving each matched X character). In this case, tr works best:

my ($count, $n);
$n = tr/X// and $count += $n and print while <$input>;
Sinan Ünür
I agree with all your review remarks. However, point 3 in the question asks "count the number of 'X' characters in the whole file". Your solution, instead, counts the number of words (strings? that depends on the definition of "strings") that contain an 'X' char.
@blixtor: Thank you for catching that. Of course, each word could contain multiple X characters. In fact, I now realize the OP did not care about splitting the lines into words etc.
Sinan Ünür
The 3-arg form of open is sometimes useful and sometimes not. It is easy to always use 2-arg open safely, and sometimes you want its ability to take default IO layers from the open pragma or -C switch. If you are going to criticize 2-arg opens, at least say "because you may unsafely use a variable filename someday", not just "it's the new way to do it".
ysth
Try your simplified while loop version on a file with no X and see why $count should be initialized to 0 :)
ysth
@ysth Arrrrrrgh! It is just proof that the `if` statement is completely unnecessary.
Sinan Ünür
A: 

hey respond as possible Ihave to sumbmit the journal by monday.

Divya
Who should respond to what? This question already has some good answers.
sth