This question comes from a need to ensure that changes I've made to code don't affect the values it outputs to a text file. Ideally, I'd roll a sub that takes two filenames and returns 1 or 0 depending on whether the contents are identical, whitespace and all.

Given that text processing is Perl's forte, it should be quite easy to compare two files and determine whether they are identical or not (code below untested).

use strict;
use warnings;

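sub files_match {

    my ( $fileA, $fileB ) = @_;
    open my $file1, '<', $fileA or die "Cannot open $fileA: $!";
    open my $file2, '<', $fileB or die "Cannot open $fileB: $!";

    while ( my $lineA = <$file1> ) {

        my $lineB = <$file2>;

        # stop at the first difference, or when the second file runs out of lines
        return 0 unless defined $lineB && $lineA eq $lineB;
    }

    # NB: extra trailing lines in $fileB go undetected
    return 1;
}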

The only way I can think of (sans CPAN modules) is to open the two files in question, and read them in line-by-line until a difference is found. If no difference is found, the files must be identical.

But this approach is limited and clumsy. What if the total lines differ in the two files? Should I open and close to determine line count, then re-open to scan the texts? Yuck.

I don't see anything in perlfaq5 relating to this. I want to stay away from modules unless they come with the core Perl 5.6.1 distribution.

+12  A: 

It's in the core.

use File::Compare;

if (compare("file1","file2") == 0) {
  print "They're equal\n";
}
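
A minimal sketch of how this could be wrapped into the kind of sub the question asks for (1 for identical, 0 for different). The name `files_match` is borrowed from the question, and dying on an error is an assumption about the desired behaviour rather than something `File::Compare` does for you; `compare` itself returns 0 for equal files, 1 for different files and -1 on error.

```perl
use strict;
use warnings;
use File::Compare ();

# Hypothetical wrapper: returns 1 if the files are identical, 0 if they
# differ, and dies if File::Compare reports an error (return value -1).
sub files_match {
    my ( $file_a, $file_b ) = @_;

    my $result = File::Compare::compare( $file_a, $file_b );
    die "Cannot compare '$file_a' and '$file_b': $!" if $result < 0;

    return $result == 0 ? 1 : 0;
}
```

Called as `files_match( "old_output.txt", "new_output.txt" )` (placeholder filenames), it can be dropped straight into a regression check.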
Jonas Elfström
Whew! I knew it'd be something simple.
Zaid
`File::Compare` is roughly equivalent to just opening the files and reading them line by line (or, if you use the third argument, block by block). If you are concerned about performance, you may want to check the file size and inode numbers before calling it.
Chas. Owens
Actually, `File::Compare` already checks the file size. Check out the source: http://cpan.uwinnipeg.ca/htdocs/perl/File/Compare.pm.html
Jonas Elfström
+5  A: 

There are a couple of O(1) checks you can do first to see if the files are different.

If the files have different sizes, then they are obviously different. The stat function will return the sizes of the files. It will also return another useful piece of data: the inode number. If the two names refer to the same file (because the same filename was passed in for both arguments or because both names are hard links to the same file), the inode number will be the same (inode numbers are only unique within a filesystem, so strictly the device number should match too). A file is obviously the same as itself.

Barring those two checks, there is no better way to compare two local files for equivalence than to compare their contents directly. Of course, there is no need to do it line by line; you can read in larger blocks if you so desire.

#!/usr/bin/perl

use strict;
use warnings;

use File::Compare ();

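sub compare {
    my ($first, $second) = @_;

    # stat fields: 0 = device, 1 = inode, 7 = size in bytes
    my ($first_dev, $first_inode, $first_size)    = (stat $first)[0, 1, 7];
    my ($second_dev, $second_inode, $second_size) = (stat $second)[0, 1, 7];

    # stat failed for one of the files; let File::Compare report the error (-1)
    return File::Compare::compare(@_)
        unless defined $first_size && defined $second_size;

    # same device and inode, so it is the same file
    return 0 if $first_dev == $second_dev && $first_inode == $second_inode;

    # different sizes, so must be different
    return 1 unless $first_size == $second_size;

    return File::Compare::compare(@_);
}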

print compare(@ARGV) ? "not the " : "", "same\n";
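
For the "larger blocks" remark, here is a rough sketch of a chunked comparison written by hand, assuming regular local files. The name `blocks_match` and the 64 KB default block size are made up for illustration; this is not how `File::Compare` is implemented, just one way to do the same job without reading line by line.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two files in fixed-size binary chunks.
# Returns 1 if identical, 0 if different; dies on open or read errors.
sub blocks_match {
    my ( $file_a, $file_b, $block_size ) = @_;
    $block_size ||= 64 * 1024;    # assumed default, tune as needed

    open my $fh_a, '<', $file_a or die "Cannot open $file_a: $!";
    open my $fh_b, '<', $file_b or die "Cannot open $file_b: $!";
    binmode $fh_a;
    binmode $fh_b;

    while (1) {
        my $read_a = read( $fh_a, my $buf_a, $block_size );
        my $read_b = read( $fh_b, my $buf_b, $block_size );
        die "Read error: $!" unless defined $read_a && defined $read_b;

        # a shorter chunk or differing bytes means the files differ
        return 0 if $read_a != $read_b || $buf_a ne $buf_b;

        # both files reached end-of-file at the same point
        return 1 if $read_a == 0;
    }
}
```

Because the handles are in binary mode, line endings and trailing whitespace are compared byte for byte.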
Chas. Owens
From the source of File::Compare `if (!$text_mode ` `}`
Jonas Elfström