My problem is that I have a list of file names without the version appended (the version changes every time). The names are stored in a file in a particular sequence; for each one I need to find the latest version in a folder and then install the files in that same sequence. The logic would be:

  1. scan a file with contents
  2. read a line from the file
  3. using this as a key, look in the folder for a matching file name
  4. if found, write the full file name to an output file with some characters appended
  5. if not found, skip it; either way, loop back to step 2 until all the lines in the file have been read

Which language is better suited to such a task: shell script or Perl? Hints in the form of code would be appreciated :-)

+2  A: 

I would read in all your partial filenames then loop through the folder matching the full filenames against the partial ones. The exact implementation would depend on some details. Do the full filenames need to appear in the same order as the partial ones did? Can you derive the partial filename from the full filename?

Update: so, something like this (assuming $infile and $outfile are already-opened filehandles, $indir is an already-opened dirhandle, and partial_filename_from_full is a translation routine that returns undef for things like directories or non-relevant files):

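# Read every partial filename, preserving the order in which they are listed.
chomp( my @partial_filenames = readline( $infile ) );

# Maps each partial filename to the full filename found in the directory.
my %full_filename;

while ( my $filename = readdir( $indir ) ) {
    my $partial_filename = partial_filename_from_full( $filename );
    if ( defined $partial_filename ) {
        $full_filename{ $partial_filename } = $filename;
    }
}

# Write the full filenames in the same order as the partial ones.
for my $partial_filename ( @partial_filenames ) {
    if ( exists $full_filename{ $partial_filename } ) {
        print $outfile $full_filename{ $partial_filename }, "\n";
    } else {
        # error? just skip it? you decide
    }
}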

If there are multiple full filenames per partial filename, instead of assigning:

        $full_filename{ $partial_filename } = $filename;

you would determine if $filename were a better "match" than the previously encountered one.
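
The "better match" rule could look something like the following, sketched here in Python rather than Perl purely for illustration. It assumes that the version is a trailing run of digits and that a larger number means a newer file; neither assumption comes from the question. The helper names split_name_and_version and remember_if_better are hypothetical.

```python
import re


def split_name_and_version(filename):
    """Return (partial_name, version), or None when there is no trailing number.

    Assumption: a full filename is a name followed by a numeric version.
    """
    match = re.fullmatch(r"(.+?)(\d+)", filename)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def remember_if_better(best, filename):
    """Store filename in best only if it beats the version already recorded."""
    parsed = split_name_and_version(filename)
    if parsed is None:
        return  # not a versioned file, so ignore it
    name, version = parsed
    if name not in best or version > best[name][0]:
        best[name] = (version, filename)


best = {}
for entry in ["fileB03", "fileB12", "fileB05", "README"]:
    remember_if_better(best, entry)
print(best["fileB"][1])
```

Storing the parsed version next to the filename avoids re-parsing the previous winner on every comparison.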

ysth
Yes, the full file names need to appear in the same order as the partial file names. Yes, the partial file names (as given by Jonathan) are the initial names without the version of the file appended, so the full file name can be derived from the partial file names.
gagneet
+1  A: 

Your question is not very clear, but I'm guessing you have a directory containing file names such as:

  • fileA01
  • fileA02
  • fileB03
  • fileB05
  • fileB12
  • fileC02
  • fileD09
  • fileE22

The file you scan 'with contents' contains a list of names such as:

  • fileA
  • fileB
  • fileE

And you want code to find the entry in the directory with the highest version number for the corresponding file name:

  • fileA02
  • fileB12
  • fileE22

You will have to decide exactly how versions are compared - I've used 2-digit version numbers, but you haven't stated your constraints.
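
To illustrate why the comparison rule matters, here is a small Python sketch (Python only for illustration). It assumes versions are trailing digits that are not zero-padded, in which case a plain string comparison and a numeric comparison disagree. The function name version_key is hypothetical.

```python
import re


def version_key(filename):
    """Sort key: the trailing digits as an integer (an assumed version format)."""
    match = re.search(r"(\d+)$", filename)
    return int(match.group(1)) if match else -1


candidates = ["fileB3", "fileB5", "fileB12"]
lexical_pick = max(candidates)                   # compares character by character
numeric_pick = max(candidates, key=version_key)  # compares 3, 5 and 12 as numbers
print(lexical_pick, numeric_pick)
```

With fixed-width, zero-padded versions such as 03, 05 and 12, both comparisons would agree, which is one reason to settle the version format first.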

I would probably use Perl for this. First, I'd read the whole 'file with contents' into memory, and then create a monster regex to recognize the file names, possibly with the version number detection included. I'd use opendir, readdir (and closedir) to process the directory. For each directory entry, I'd match it against the regex and check whether the name was the most recent version seen so far of any of the sought files. If so, I'd capture the filename in a hash, indexed by the version-less filename (hence, if fileA01 was read first, then I'd have $filelist{fileA} = "fileA01"; except of course both the hash key and the full filename would be in variables).
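
A rough sketch of that approach, written in Python rather than Perl only for illustration. It assumes the version is a run of digits directly after the name and that the highest number is the most recent; the function name latest_versions and the temporary demo directory are hypothetical.

```python
import os
import re
import tempfile


def latest_versions(wanted, directory):
    """Map each wanted name to its highest-versioned entry in directory."""
    if not wanted:
        return {}
    # One alternation of all sought names, followed by a captured version number.
    alternatives = "|".join(re.escape(name) for name in wanted)
    pattern = re.compile(rf"^({alternatives})(\d+)$")
    latest = {}
    for entry in os.listdir(directory):
        match = pattern.match(entry)
        if match is None:
            continue  # not one of the sought files
        name, version = match.group(1), int(match.group(2))
        if name not in latest or version > latest[name][0]:
            latest[name] = (version, entry)
    return {name: entry for name, (version, entry) in latest.items()}


# Demo with the example names from this answer, in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for entry in ["fileA01", "fileA02", "fileB03", "fileB12", "fileC02", "fileE22"]:
        open(os.path.join(demo_dir, entry), "w").close()
    found = latest_versions(["fileA", "fileB", "fileE"], demo_dir)
    for name in ["fileA", "fileB", "fileE"]:  # keep the order of the list
        print(found.get(name))
```

Iterating over the original list at the end, rather than over the dictionary, keeps the output in the order the names were given.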

Doing it in shell would be harder. Using the most powerful features of Bash, it is probably doable; I'd still use Perl (or Python, or any scripting language).

Jonathan Leffler
Yes, you are right in assuming the above. I will try what you have suggested. Thanks :-)
gagneet
A: 

I would use awk.

awk -f myawk.awk

myawk.awk

BEGIN{
}
{
    myfilename = $0;
    # getline returns 1 on success, 0 at end of file, and -1 if the file cannot be opened
    retval = (getline otherfile < myfilename);
    if (retval < 0)
    {
        # file does not exist (or is unreadable). do the necessary error handling
    }
    else
    {
        # File exists. so do what you want.
        # perhaps you might want to write to a new file with the modified filename
        close(myfilename);  # release the file handle before the next input line
    }
}
END{
}
Sathya
An empty 'BEGIN' is a little odd - why include it? Your code does not appear to be dealing with the partial names vs full names issue.
Jonathan Leffler
True. It has become more of a habit for me to include the skeleton BEGIN-END. In the code above, I was hinting at a possible solution via awk and did not attempt to provide running code.
Sathya