views:

2414

answers:

7

How can I rename all files on a drive with .wma and .wmv extensions to .txt extension using Perl regardless how deep they are in the directory structure?

+9  A: 

See perldoc File::Find. The examples in the documentation are pretty self-explanatory and will get you most of the way there. When you have an attempt, update the question with more information.

If this is a learning exercise, you will learn better by first trying to do yourself.

UPDATE:

Assuming you have had a chance to look into how to do this yourself and taking into account the fact that various solutions have been posted, I am posting how I would have done this. Note that I would choose to ignore files such as ".wmv": My regex requires something to come before the dot.

#!/usr/bin/perl

use strict;
use warnings;

use File::Find;

my ($dir) = @ARGV;

find( \&wanted, $dir );

sub wanted {
    return unless -f;
    return unless /^(.+)\.wm[av]$/i;
    my $new = "$1.txt";
    rename $_ => $new
        or warn "'$_' => '$new' failed: $!\n";
    return;
}

__END__
Sinan Ünür
+2  A: 

And if you are a newbie, one more useful piece of advice: to rename the files, use "move()" method from "File::Copy" module (and always check for whether move() failed)

Also, avoid an un-obvious bug of accidentally renaming a directory whose name ends with .wma/.wmv (since the "wanted" callback is called on both files and directories)

P.S. I definitely concur with File::Find advice above (also, consider looking into File::Find::Rule, as explained in this link). However, as an exercise in learning Perl, writing your own recursive file finder (or better yet, turning it from recursive into breadth-first-search loop) is something you might consider doing if your goal is to learn instead of just write a quick one-off.

DVK
A: 

I had to do something similar recently. This script would require modification, but has all the essentials:

  1. It recurses through files and directories (sub recurse).
  2. It has a function to act on directories (processDir) and a separate one to act on files (processFile).
  3. It handles spaces in file names using an alternate version of the glob function from File::Glob.
  4. It performs no actions, but instead writes an output file (CSV, TAB or perl script) so that the user can review proposed changes before making a big mistake.
  5. It outputs partial results periodically, which is useful if your system goes down part way.
  6. It proceeds in depth first order. This is important, because if you have a script that modifies (renames or moves) a parent directory before processing the subdirectories and files, bad things can happen.
  7. It reads from a skip list file, which allows you to avoid huge directories and mounted volumes that you do not want to visit.
  8. It does not follow symbolic links, which often cause circularities.

A small modification to processFile is most of what you would need to do, plus gutting the features you don't need. (This script was designed to look for files with characters in their names not supported on Windows.)

NOTE: At the end it calls "open", which on the MAC will open the resulting file in its default application. On Windows, use "start". On other Unix systems, there are similar commands.

#!/usr/bin/perl -w

# 06/04/2009. PAC. Fixed bug in processDir. Was using $path instead of $dir when forming newpath.

use strict;
use File::Glob ':glob'; # This glob allows spaces in filenames. The default one does not.

sub recurse(&$);
sub processFile($);
sub stem($);
sub processXMLFile($);
sub readFile($);
sub writeFile($$);
sub writeResults($);
sub openFileInApplication($);

if (scalar @ARGV < 4) {
    print <<HELP_TEXT;

    Purpose: Report on files and directories whose names violate policy by:
                   o containing illegal characters
                   o being too long
                   o beginning or ending with certain characters

    Usage:   perl EnforceFileNamePolicy.pl root-path skip-list format output-file 

        root-path .... Recursively process all files and subdirectories starting with this directory.
        skip-list .... Name of file with directories to skip, one to a line.
        format ....... Output format:
                            tab = tab delimited list of current and proposed file names
                            csv = comma separated list of current and proposed file names
                            perl = perl script to do the renaming
        output-file .. Name of file to hold results.

    Output:  A script or delimited file that will rename the offending files and directories is printed to output-file.
             As directories are processed or problems found, diagnostic messages will be printed to STDOUT.

    Note: Symbolic links are not followed, otherwise infinite recursion would result.
    Note: Directories are processed in depth-first, case-insensitive alphabetical order. 
    Note: If \$CHECKPOINT_FREQUENCY > 0, partial results will be written to intermediate files periodically.
          This is useful if you need to kill the process before it completes and do not want to lose all your work.

HELP_TEXT
  exit;
}


########################################################
#                                                      #
#                 CONFIGURABLE OPTIONS                 #
#                                                      #
########################################################

my $BAD_CHARACTERS_CLASS = "[/\\?<>:*|\"]";
my $BAD_SUFFIX_CLASS = "[. ]\$";
my $BAD_PREFIX_CLASS = "^[ ]";
my $REPLACEMENT_CHAR = "_";
my $MAX_PATH_LENGTH = 256;
my $WARN_PATH_LENGTH = 250;
my $LOG_PATH_DEPTH = 4; # How many directories down we go when logging the current directory being processed.
my $CHECKPOINT_FREQUENCY = 20000; # After an integral multiple of this number of directories are processed, write a partial results file in case we later kill the process.

########################################################
#                                                      #
#                COMMAND LINE ARGUMENTS                #
#                                                      #
########################################################

my $rootDir = $ARGV[0];
my $skiplistFile = $ARGV[1];
my $outputFormat = $ARGV[2];
my $outputFile = $ARGV[3];


########################################################
#                                                      #
#                BEGIN PROCESSING                      #
#                                                      #
########################################################

my %pathChanges = (); # Old name to new name, woth path attached.
my %reasons = ();
my %skip = (); # Directories to skip, as read from the skip file.
my $dirsProcessed = 0;

# Load the skiplist
my $skiplist = readFile($skiplistFile);
foreach my $skipentry (split(/\n/, $skiplist)) {
    $skip{$skipentry} = 1; 
}

# Find all improper path names under directory and store in %pathChanges.
recurse(\&processFile, $rootDir);

# Write the output file.
writeResults(0);
print "DONE!\n";

# Open results in an editor for review.
#WARNING: If your default application for opening perl files is the perl exe itself, this will run the otput perl script!
#         Thus, you may want to comment this out.
#         Better yet: associate a text editor with the perl script.
openFileInApplication($outputFile);

exit;


sub recurse(&$) {
    my($func, $path) = @_;
    if ($path eq '') {
     $path = ".";
    }

    ## append a trailing / if it's not there
    $path .= '/' if($path !~ /\/$/);

    ## loop through the files contained in the directory
    for my $eachFile (sort { lc($a) cmp lc($b)  } glob($path.'*')) {
     # If eachFile has a shorter name and is a prefix of $path, then stop recursing. We must have traversed "..".
     if (length($eachFile) > length($path) || substr($path, 0, length($eachFile)) ne $eachFile) {
      ## if the file is a directory
      my $skipFile = defined $skip{$eachFile};
      if( -d $eachFile && ! -l $eachFile && ! $skipFile) { # Do not process symbolic links like directories! Otherwise, this will never complete - many circularities.
       my $depth = depthFromRoot($eachFile);
       if ($depth <= $LOG_PATH_DEPTH) {
        # Printing every directory as we process it slows the program and does not give the user an intelligible measure of progress.
        # So we only go so deep in printing directory names.
        print "Processing: $eachFile\n";
       }

        ## pass the directory to the routine ( recursion )
       recurse(\&$func, $eachFile);

       # Process the directory AFTER its children to force strict depth-first order.
       processDir($eachFile);
      } else {
       if ($skipFile) {
        print "Skipping: $eachFile\n";
       }

       # Process file.
            &$func($eachFile);
      }   
     }

    }
}


sub processDir($) {
    my ($path) = @_;
    my $newpath = $path; 
    my $dir;
    my $file;
    if ($path eq "/") {
     return; 
    }
    elsif ($path =~ m|^(.*/)([^/]+)$|) {
     ($dir, $file) = ($1, $2);
    }
    else {
     # This path has no slashes, hence must be the root directory.
     $file = $path;
     $dir = '';
    }
    if ($file =~ /$BAD_CHARACTERS_CLASS/) {
     $file =~ s/($BAD_CHARACTERS_CLASS)/$REPLACEMENT_CHAR/g;
     $newpath = $dir . $file;
     rejectDir($path, $newpath, "Illegal character in directory.");
    }
    elsif ($file =~ /$BAD_SUFFIX_CLASS/) {
     $file =~ s/($BAD_SUFFIX_CLASS)/$REPLACEMENT_CHAR/g;
     $newpath = $dir . $file;
     rejectDir($path, $newpath, "Illegal character at end of directory.");
    }
    elsif ($file =~ /$BAD_PREFIX_CLASS/) {
     $file =~ s/($BAD_PREFIX_CLASS)/$REPLACEMENT_CHAR/g;
     $newpath = $dir . $file;
     rejectDir($path, $newpath, "Illegal character at start of directory.");
    }
    elsif (length($path) >= $MAX_PATH_LENGTH) {
     rejectDir($path, $newpath, "Directory name length > $MAX_PATH_LENGTH.");
    }
    elsif (length($path) >= $WARN_PATH_LENGTH) {
     rejectDir($path, $newpath, "Warning: Directory name length > $WARN_PATH_LENGTH.");
    }
    $dirsProcessed++;
    if ($CHECKPOINT_FREQUENCY > 0 && $dirsProcessed % $CHECKPOINT_FREQUENCY == 0) {
     writeResults(1);
    }
}

sub processFile($) {
    my ($path) = @_;
    my $newpath = $path;
    $path =~ m|^(.*/)([^/]+)$|;
    my ($dir, $file) = ($1, $2);
    if (! defined ($file) || $file eq '') {
     $file = $path;
    }
    if ($file =~ /$BAD_CHARACTERS_CLASS/) {
     $file =~ s/($BAD_CHARACTERS_CLASS)/$REPLACEMENT_CHAR/g;
     $newpath = $dir . $file;
     rejectFile($path, $newpath, "Illegal character in filename.");
    }
    elsif ($file =~ /$BAD_SUFFIX_CLASS/) {
     $file =~ s/($BAD_SUFFIX_CLASS)/$REPLACEMENT_CHAR/g;
     $newpath = $dir . $file;
     rejectFile($path, $newpath, "Illegal character at end of filename.");
    }
    elsif ($file =~ /$BAD_PREFIX_CLASS/) {
     $file =~ s/($BAD_PREFIX_CLASS)/$REPLACEMENT_CHAR/g;
     $newpath = $dir . $file;
     rejectFile($path, $newpath, "Illegal character at start of filename.");
    }
    elsif (length($path) >= $MAX_PATH_LENGTH) {
     rejectFile($path, $newpath, "File name length > $MAX_PATH_LENGTH.");
    }
    elsif (length($path) >= $WARN_PATH_LENGTH) {
     rejectFile($path, $newpath, "Warning: File name length > $WARN_PATH_LENGTH.");
    }

}

sub rejectDir($$$) {
    my ($oldName, $newName, $reason) = @_;
    $pathChanges{$oldName} = $newName;
    $reasons{$oldName} = $reason;
    print "Reason: $reason  Dir: $oldName\n";
}

sub rejectFile($$$) {
    my ($oldName, $newName, $reason) = @_;
    $pathChanges{$oldName} = $newName;
    $reasons{$oldName} = $reason;
    print "Reason: $reason  File: $oldName\n";
}


sub readFile($) {
    my ($filename) = @_;
    my $contents;
    if (-e $filename) {
     # This is magic: it opens and reads a file into a scalar in one line of code. 
     # See http://www.perl.com/pub/a/2003/11/21/slurp.html
     $contents = do { local( @ARGV, $/ ) = $filename ; <> } ; 
    }
    else {
     $contents = '';
    }
    return $contents;
}

sub writeFile($$) {
    my( $file_name, $text ) = @_;
    open( my $fh, ">$file_name" ) || die "Can't create $file_name $!" ;
    print $fh $text ;
}   

# writeResults() - Compose results in the appropriate format: perl script, tab delimited, or comma delimited, then write to output file.
sub writeResults($) {
    my ($checkpoint) = @_;
    my $outputText = ''; 
    my $outputFileToUse;
    my $checkpointMessage;
    if ($checkpoint) {
     $checkpointMessage = "$dirsProcessed directories processed so far.";
    }
    else {
     $checkpointMessage = "$dirsProcessed TOTAL directories processed.";
    }
    if ($outputFormat eq 'tab') {
      $outputText .= "Reason\tOld name\tNew name\n";
      $outputText .= "$checkpointMessage\t\t\n";
    }
    elsif ($outputFormat eq 'csv') {
      $outputText .= "Reason,Old name,New name\n";
      $outputText .= "$checkpointMessage,,\n";
    }
    elsif ($outputFormat eq 'perl') {
     $outputText = <<END_PERL;
#/usr/bin/perl

# $checkpointMessage
#
# Rename files and directories with bad names.
# If the reason is that the filename is too long, you must hand edit this script and choose a suitable, shorter new name.

END_PERL
    }

    foreach my $file (sort  { 
     my $shortLength = length($a) > length($b) ? length($b) : length($a); 
     my $prefixA = substr($a, 0, $shortLength);
     my $prefixB = substr($b, 0, $shortLength); 
     if ($prefixA eq $prefixB) {
      return $prefixA eq $a ? 1 : -1; # If one path is a prefix of the other, the longer path must sort first. We must process subdirectories before their parent directories.
     }
     else {
      return $a cmp $b;
     }
    } keys %pathChanges) {
     my $changedName = $pathChanges{$file};
     my $reason = $reasons{$file};
     if ($outputFormat eq 'tab') {
      $outputText .= "$reason\t$file\t$changedName\n";
     }
     elsif ($outputFormat eq 'csv') {
      $outputText .= "$reason,$file,$changedName\n";
     }
     else {
      # Escape the spaces so the mv command works.
      $file =~ s/ /\\ /g;
      $changedName =~ s/ /\\ /g;
      $outputText .= "#$reason\nrename \"$file\", \"$changedName\"\n";  
     }
    }
    $outputFileToUse = $outputFile;
    if ($checkpoint) {
     $outputFileToUse =~ s/(^.*)([.][^.]+$)/$1-$dirsProcessed$2/;
    }

    writeFile($outputFileToUse, $outputText);
}

# Compute how many directories deep the given path is below the root for this script.
sub depthFromRoot($) {
    my ($dir) = @_;
    $dir =~ s/\Q$rootDir\E//;
    my $count = 1;
    for (my $i = 0; $i < length($dir); $i++) {
     if (substr($dir, $i, 1) eq "/") { $count ++; }
    }
    return $count;
}

#openFileInApplication($filename) - Open the file in its default application.
#
# TODO: Must be changed for WINDOWS. Use 'start' instead of 'open'??? 
sub openFileInApplication($) {
    my ($filename) = @_;
    `open $filename`;
}
Paul Chernoch
Don't use prototypes. They don't do what you seem to think they do: http://www.perl.com/language/misc/fmproto.html ... I really do not think the OP was best served by a long script that he cannot understand at this level of his learning. I don't know what your script does but the fact that you do not use File::Find (or any of its derivatives) raises a red flag.
Sinan Ünür
Note that `if ( @ARGV < 4 )` is perfectly fine. You do not need `scalar` because the expression is already evaluated in scalar context.
Sinan Ünür
Also, please delete the subroutine `readFile` and use `read_file` from File::Slurp. I mean, did you actually read Uri's article? You do cite it in your comment but it seems you missed the point.
Sinan Ünür
I lacked the time to trim my script down to serve only the purpose required by the questioner, which would have been more helpful to him. I agree. However, the script performed the task it was written for efficiently and without mishap which is the primary purpose of writing any program, and did not require that the users install extra perl modules on their machines, which is a task easy for me but troublesome for them. As for prototypes, I have never quite got the hang of them, despite returning to the docs again and again.
Paul Chernoch
@Paul Chernoch: "As for prototypes, I have never quite got the hang of them" ... Well, **that** would be a good reason **not** to use them.
Sinan Ünür
For some good discussion of prototypes, see http://stackoverflow.com/questions/297034/why-are-perl-function-prototypes-bad Short version: prototypes are really just compiler hints, they don't really do any type constraint. They exist so that you can write subs that act like built-ins. They are almost entirely unlike prototypes in any other language that has them, ignore the name if you are used to C or some other language.
daotoad
Also, if you want friendly code review for this, I recommend Perlmonks.org. Post the code here with the note that you took heat for it another forum, and would like comments, and you should get a good response there. PM is a great resource.
daotoad
A: 

Look at rename.

find -type f -name '*.wm?' -print0 | xargs -0 rename 's/\.wm[av]$/.txt/'

or

find -type f -name '*.wm?' -exec rename 's/\.wm[av]$/.txt/' {} +

Or make your own script

#!/usr/bin/perl

use strict;
use warnings;

use File::Find;

find( sub {
    return unless -f;
    my $new = $_;
    return unless $new =~ s/\.wm[av]$/.txt/;
    rename $_ => $new
        or warn "rename '$_' => '$new' failed: $!\n";
  }, @ARGV );
Hynek -Pichi- Vychodil
+1  A: 
find . -name '*.wm[va]' -a -type f -exec mv '{}' '{}.txt' \;

Ok, there are two basic problems with the above. First off, it's find, not perl. Second, it's actually just putting the .txt at the end, not quite what you wanted.

The first problem is only a problem if you really must do this in perl. Which probably means you're just learning perl, but that's ok, because it's merely a first step. The second is only a problem if you merely want to get the job done and don't care about the language. I'll solve the second problem first:

find . -name '*.wm[va]' -a -type f | while read f; do mv $f ${f%.*}; done

That just gets the job done, but actually moves us away from a perl solution. That's because, if you get it all done in find, you can convert to perl with find2perl:

find . -name '*.wm[va]' -a -type f -exec mv '{}' '{}.txt' \;

This will print out a perl script, which you can save:

find2perl . -name '*.wm[va]' -a -type f -exec mv '{}' '{}.txt' \; > my.pl

It includes a doexec() function which can be modified to do what you want. First would be to change the second argument to the right name (using File::Basename's basename function: basename($command[2], qw/.wmv .wma/) ), the second would be just to eliminate the calls to system, STDOUT munging, etc., and just call rename. But this at least gives you a start.

Tanktalus
A: 

# include the File::Find module, that can be used to traverse directories use File::Find;

# starting in the current directory, tranverse the directory, calling
# the subroutine "wanted" on each entry (see man File::Find)
find(\&wanted, ".");

sub wanted
{
    if (-f and
        /.wm[av]$/)
    {
        # when this subroutine is called, $_ will contain the name of
        # the directory entry, and the script will have chdir()ed to
        # the containing directory. If we are looking at a file with
        # the wanted extension - then rename it (warning if it fails).
        my $new_name = $_;
        $new_name =~ s/\.wm[av]$/.txt/;
        rename($_, $new_name) or
            warn("rename($_, $new_name) failed - $!");
    }
}
Beano
+3  A: 
#!/usr/bin/perl

use strict;
use warnings;
use File::Find;

my $dir = '/path/to/dir';

File::Find::find(
    sub {
        my $file = $_;
        return if -d $file;
        return if $file !~ /(.*)\.wm[av]$/;
        rename $file, "$1.txt" or die $!;
    }, $dir
);
mehmet el kasid