I'm writing some code to manage a custom on-disk file structure and synchronize it to unconnected systems. One of my requirements is to be able to estimate the size of a sync before actually generating the sync content. As a simple solution, I've put together a map with full-path filenames as the key for efficient lookup of already-scanned content.
I run into problems with this when I have multiple files in my file structure referenced from different places in different ways. For example:
C:\DataSource\files\samplefile.txt
C:\DataSource\data\samples\..\..\files\samplefile.txt
C:\DataSource\etc\..\files\samplefile.txt
These three path strings all reference the same file on disk, but their string representations differ. If I drop these into a map verbatim, I'll count the size of samplefile.txt three times, and my estimate will be wrong.
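To make the double counting concrete, here is a minimal sketch of what a map keyed on the raw strings does. The function name NaiveEstimate and the 100-byte file size are made up for illustration only.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Illustrative only: NaiveEstimate and the 100-byte size are hypothetical,
// not taken from the real scanner.
std::uint64_t NaiveEstimate()
{
    const std::uint64_t sampleSize = 100; // assumed size of samplefile.txt

    // Keyed on the path string exactly as it was encountered.
    std::map<std::string, std::uint64_t> scanned;
    scanned["C:\\DataSource\\files\\samplefile.txt"] = sampleSize;
    scanned["C:\\DataSource\\data\\samples\\..\\..\\files\\samplefile.txt"] = sampleSize;
    scanned["C:\\DataSource\\etc\\..\\files\\samplefile.txt"] = sampleSize;

    // The three keys compare unequal as strings, so the map holds three entries.
    std::uint64_t total = 0;
    for (const auto& entry : scanned)
        total += entry.second;
    return total;
}
```

With these inputs the estimate comes out at three times the assumed file size, because the map only compares the key strings and has no idea they name the same file.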
In an attempt to find a way around this, I was hoping boost::filesystem::path provided a function to reduce or simplify a path, but I didn't see anything of the sort. Using the path decomposition table and path iterators, I wrote up the following function (for use in a Windows environment):
#include <string>
#include <boost/filesystem.hpp>

namespace bfs = boost::filesystem;

std::string ReducePath( const std::string& Path )
{
    bfs::path input( Path );
    bfs::path result;
    for( bfs::path::iterator it = input.begin( ), endIt = input.end( ); it != endIt; ++it )
    {
        if( (*it) == ".." )
        {
            // Remove the leaf directory.
            result = result.parent_path( );
        }
        else if( (*it) == "." )
        {
            // Just ignore.
        }
        else
        {
            // Append the element to the end of the current result.
            result /= (*it);
        }
    }
    return result.string( );
}
I have two questions.
One, is there a standard function that provides this sort of functionality, or does this already exist in boost or another library somewhere?
Two, I'm not entirely confident that the function I wrote will work in all cases, and I'd like some more eyes on it. It works in my tests. Does anyone see a case where it'll break down?