Hi,

I have an issue with a Perl script. It modifies the content of a file, then reopens the file to write it back, and in the process some characters are lost: all words starting with '%' are deleted from the file. That's pretty annoying because the % expressions are variable placeholders for dialog boxes.

Do you have any idea why? The source file is XML with the default encoding.

Here is the code:

undef $/;
open F, $file or die "cannot open file $file\n";
my $content = <F>;                                           
close F;                                                     

$content =~s{status=["'][\w ]*["']\s*}{}gi;

printf $content;

open F, ">$file" or die "cannot reopen $file\n";             
printf F $content;                                           
close F or die "cannot close file $file\n";
+25  A: 

You're using printf there and it thinks its first argument is a format string. See the printf documentation for details. When I run into this sort of problem, I always ensure that I'm using the functions correctly. :)

You probably want just print:

 print F $content;
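
To see the effect without touching a file, here is a minimal sketch using sprintf, which applies the same formatting rules as printf. The sample text in $content is hypothetical and only stands in for the dialog placeholders in your XML.

```perl
use strict;
use warnings;

# Hypothetical dialog text containing printf-style placeholders.
my $content = 'Copied %d files to %s';

my $mangled;
{
    no warnings;    # silence the "Missing argument" warnings for this demo
    # With the text used as the format, each placeholder is filled from
    # an argument list that is empty here.
    $mangled = sprintf($content);
}

# With the text passed as data for a literal %s format, it is left alone.
my $intact = sprintf( '%s', $content );

print "as format: [$mangled]\n";    # prints: as format: [Copied 0 files to ]
print "as data:   [$intact]\n";     # prints: as data:   [Copied %d files to %s]
```

Plain print never interprets its arguments, which is why it is the simpler fix here.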

In your example, you don't need to read in the entire file since your substitution does not cross lines. Instead of trying to read and write to the same filename all at once, use a temporary file:

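open my($in),  "<", $file       or die "cannot open file $file: $!\n";
open my($out), ">", "$file.bak" or die "cannot open file $file.bak: $!\n";

while( <$in> )
    {
    s{status=["'][\w ]*["']\s*}{}gi;
    print $out $_;   # with a lexical filehandle, $_ must be named explicitly
    }

close $in;
close $out or die "cannot close file $file.bak: $!\n";

rename "$file.bak", $file or die "Could not rename file: $!\n";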

This also reduces to this command-line program:

% perl -pi.bak -e 's{status=["\x27][\w ]*["\x27]\s*}{}gi' file
brian d foy
+4  A: 

Er. You're using printf.

printf interprets "%" as something special.

Use print instead.

If you have to use printf, use

printf "%s", $content;

Important Note:

printf stands for "print formatted", just as it does in C.

fprintf is the equivalent in C for file I/O.

Perl is not C.

And even in C, passing your content as the first parameter gets you shot, because it opens a format-string security hole.

Kent Fredric
A: 

Or even

perl -i bak -pe 's{status=["\'][\w ]*["\']\s*}{}gi;' yourfiles

-e says "there's code following for you to run"

-i bak says "rename the old file to whatever.bak"

-p adds a read-print loop around the -e code

Perl one-liners are a powerful tool and can save you a lot of drudgery.

Joe McMahon
no, -i bak says "rename the old file to whateverbak". whatever.bak would be -i .bak
ysth
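
For anyone who would rather keep this inside a script, the -i.bak and -p switches correspond roughly to setting $^I and looping over <>. The following is a minimal sketch, not a drop-in replacement: the sample file name and its contents are hypothetical and are created in a temporary directory only so the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical sample file, created only so the sketch can run on its own.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/dialog.xml";
open my $fh, '>', $file or die "cannot create $file: $!\n";
print {$fh} qq{<msg status="draft" text="Delete %s?"/>\n};
close $fh or die "cannot close $file: $!\n";

{
    # Setting $^I enables in-place editing for the <> loop, like -i.bak;
    # @ARGV lists the files to process.
    local ( $^I, @ARGV ) = ( '.bak', $file );
    while (<>) {
        s{status=["'][\w ]*["']\s*}{}gi;
        print;    # written to the edited file; print leaves % alone
    }
}

# Read the result back to confirm the attribute is gone and %s survived.
open my $check, '<', $file or die "cannot open $file: $!\n";
my $result = do { local $/; <$check> };
close $check;
print $result;
```

The original content is kept next to the edited file under the same name with .bak appended.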
A: 

If you want a solution that is aware of the XML nature of the docs (i.e., only delete status attributes, and not matching text contents) you could also use XML::PYX:

$ pyx doc.xml | perl -ne'print unless /^Astatus/' | pyxw
Yanick