I'm writing some code to manage a custom on-disk file structure and synchronize it to unconnected systems. One of my requirements is to be able to estimate the size of a sync before actually generating the sync content. As a simple solution, I've put together a map keyed by full-path filenames for efficient lookup of already-scanned content.

I run into problems with this when I have multiple files in my file structure referenced from different places in different ways. For example:

C:\DataSource\files\samplefile.txt
C:\DataSource\data\samples\..\..\files\samplefile.txt
C:\DataSource\etc\..\files\samplefile.txt

These three path strings all reference the same file on disk, but their string representations differ. If I drop these into a map verbatim, I'll count the size of samplefile.txt three times, and my estimate will be wrong.
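The double counting can be sketched as follows. This is only an illustration, assuming a `std::map` keyed by the raw path string; the names `EstimateSyncSize` and `SamplePaths` and the 1024-byte file size are hypothetical and not taken from the real code.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: records each path string once in a map and
// sums the recorded sizes to produce the sync estimate.
std::size_t EstimateSyncSize( const std::vector<std::string>& paths,
                              std::size_t fileSize )
{
    std::map<std::string, std::size_t> scanned;
    for( const std::string& p : paths )
    {
        // The key is the verbatim string, so different spellings of
        // the same file become different entries.
        scanned.emplace( p, fileSize );
    }

    std::size_t total = 0;
    for( const auto& entry : scanned )
    {
        total += entry.second;
    }
    return total;
}

// The three spellings of samplefile.txt from the example above.
std::vector<std::string> SamplePaths( )
{
    return {
        "C:\\DataSource\\files\\samplefile.txt",
        "C:\\DataSource\\data\\samples\\..\\..\\files\\samplefile.txt",
        "C:\\DataSource\\etc\\..\\files\\samplefile.txt"
    };
}
```

Calling `EstimateSyncSize( SamplePaths( ), 1024 )` yields three map entries for one file, so the estimate is three times the real size.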

In an attempt to find a way around this, I was hoping boost::filesystem::path provided a function to reduce or simplify a path, but I didn't see anything of the sort. Using the path decomposition table and path iterators, I wrote up the following function (for use in a Windows environment):

#include <string>
#include <boost/filesystem.hpp>

namespace bfs = boost::filesystem;

// Purely lexical reduction: collapses "." and ".." elements without
// touching the file system.
std::string ReducePath( const std::string& Path )
{
    bfs::path input( Path );
    bfs::path result;
    for( bfs::path::iterator it = input.begin( ), endIt = input.end( ); it != endIt; ++it )
    {
        if( (*it) == ".." )
        {
            // Remove the leaf directory.
            result = result.parent_path( );
        }
        else if( (*it) == "." )
        {
            // Just ignore.
        }
        else
        {
            // Append the element to the end of the current result.
            result /= (*it);
        }
    }

    return result.string( );
}

I have two questions.

One, is there a standard function that provides this sort of functionality, or does this already exist in boost or another library somewhere?

Two, I'm not entirely confident that the function I wrote will work in all cases, and I'd like some more eyes on it. It works in my tests. Does anyone see a case where it'll break down?

+1  A: 

While not an exact duplicate, this question will help: http://stackoverflow.com/questions/562701/best-way-to-determine-if-two-path-reference-to-same-file-in-c-c

daveb