views:

216

answers:

1

Hello everyone :)

I've got a moderately complex iterator written which wraps the FindXFile apis on Win32. (See previous question) In order to avoid the overhead of constructing an object that essentially duplicates the work of the WIN32_FIND_DATAW structure, I have a proxy object which simply acts as a sort of const reference to the single WIN32_FIND_DATAW which is declared inside the noncopyable innards of the iterator. This is great because

  1. Clients do not pay for construction of irrelevant information they will probably not use (most of the time people are only interested in file names), and
  2. Clients can get at all the information provided by the FindXFile APIs if they need or want this information.

This becomes an issue though because there is only ever a single copy of the object's actual data. Therefore, when the iterator is incrememented, all of the proxies are invalidated (set to whatever the next file pointed to by the iterator is).

I'm concerned if this is a major problem, because I can think of a case where the proxy object would not behave as somebody would expect:

std::vector<MyIterator::value_type> files;
std::copy(MyIterator("Hello"), MyIterator(), std::back_inserter(files));

because the vector contains nothing but a bunch of invalid proxies at that point. Instead, clients need to do something like:

std::vector<std::wstring> filesToSearch;
std::transform(
 DirectoryIterator<FilesOnly>(L"C:\\Windows\\*"),
 DirectoryIterator<FilesOnly>(),
 std::back_inserter(filesToSearch),
 std::mem_fun_ref(&DirectoryIterator<FilesOnly>::value_type::GetFullFileName)
);

Seeing this, I can see why somebody might dislike what the standard library designers did with std::vector<bool>. I'm still wondering though: is this a reasonable trade off in order to achieve (1) and (2) above? If not, is there any way to still achieve (1) and (2) without the proxy?

EDIT: Updated code:

#pragma once
#include <list>
#include <string>
#include <boost/noncopyable.hpp>
#include <boost/make_shared.hpp>
#include <boost/iterator/iterator_facade.hpp>
#include <Windows.h>
#include <Shlwapi.h>
#pragma comment(lib, "shlwapi.lib")
#include "../Exception.hpp"

namespace WindowsAPI { namespace FileSystem {

class Win32FindData {
    WIN32_FIND_DATA internalData;
    std::wstring rootPath;
public:
    Win32FindData(const std::wstring& root, const WIN32_FIND_DATA& data) :
        rootPath(root), internalData(data) {};
    DWORD GetAttributes() const {
        return internalData.dwFileAttributes;
    };
    bool IsDirectory() const {
        return (internalData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0;
    };
    bool IsFile() const {
        return !IsDirectory();
    };
    unsigned __int64 GetSize() const {
        ULARGE_INTEGER intValue;
        intValue.LowPart = internalData.nFileSizeLow;
        intValue.HighPart = internalData.nFileSizeHigh;
        return intValue.QuadPart;
    };
    std::wstring GetFolderPath() const {
        return rootPath;
    };
    std::wstring GetFileName() const {
        return internalData.cFileName;
    };
    std::wstring GetFullFileName() const {
        return rootPath + L"\\" + internalData.cFileName;
    };
    std::wstring GetShortFileName() const {
        return internalData.cAlternateFileName;
    };
    FILETIME GetCreationTime() const {
        return internalData.ftCreationTime;
    };
    FILETIME GetLastAccessTime() const {
        return internalData.ftLastAccessTime;
    };
    FILETIME GetLastWriteTime() const {
        return internalData.ftLastWriteTime;
    };
};

class EnumerationMethod : public boost::noncopyable {
protected:
    WIN32_FIND_DATAW currentData;
    HANDLE hFind;
    std::wstring currentDirectory;
    EnumerationMethod() : hFind(INVALID_HANDLE_VALUE) {};
    void IncrementCurrentDirectory() {
        if (hFind == INVALID_HANDLE_VALUE) return;
        BOOL success =
            FindNextFile(hFind, &currentData);
        if (success)
            return;
        DWORD error = GetLastError();
        if (error == ERROR_NO_MORE_FILES) {
            FindClose(hFind);
            hFind = INVALID_HANDLE_VALUE;
        } else {
            WindowsApiException::Throw(error);
        }
    };
    virtual ~EnumerationMethod() {
        if (hFind != INVALID_HANDLE_VALUE)
            FindClose(hFind);
    };
public:
    bool equal(const EnumerationMethod& other) const {
        if (this == &other)
            return true;
        return hFind == other.hFind;
    };
    Win32FindData dereference() {
        return Win32FindData(currentDirectory, currentData);
    };
};

class NonRecursiveEnumeration : public EnumerationMethod
{
public:
    NonRecursiveEnumeration() {};
    NonRecursiveEnumeration(const std::wstring& pathSpec) {
        std::wstring::const_iterator lastSlash =
            std::find(pathSpec.rbegin(), pathSpec.rend(), L'\\').base();
        if (lastSlash != pathSpec.begin())
            currentDirectory.assign(pathSpec.begin(), lastSlash-1);
        hFind = FindFirstFileW(pathSpec.c_str(), &currentData);
        if (hFind == INVALID_HANDLE_VALUE)
            WindowsApiException::ThrowFromLastError();
        while (hFind != INVALID_HANDLE_VALUE && (!wcscmp(currentData.cFileName, L".") || !wcscmp(currentData.cFileName, L".."))) {
            IncrementCurrentDirectory();
        }
    };
    void increment() {
        IncrementCurrentDirectory();
    };
};

class RecursiveEnumeration : public EnumerationMethod
{
    std::wstring fileSpec;
    std::list<std::wstring> futureDirectories;
    std::list<std::wstring>::iterator directoryInsertLocation;
    void ShiftToNextDirectory() {
        if (futureDirectories.empty()) {
            hFind = INVALID_HANDLE_VALUE;
            return;
        }
        //Get the next directory
        currentDirectory = futureDirectories.front();
        futureDirectories.pop_front();
        directoryInsertLocation = futureDirectories.begin();
        std::wstring pathSpec(currentDirectory);
        if (!pathSpec.empty())
            pathSpec.push_back(L'\\');
        pathSpec.append(fileSpec);

        hFind = FindFirstFileW(pathSpec.c_str(), &currentData);
        if (hFind == INVALID_HANDLE_VALUE)
            WindowsApiException::ThrowFromLastError();
        while (!wcscmp(currentData.cFileName, L".") || !wcscmp(currentData.cFileName, L"..")) {
            IncrementCurrentDirectory();
        }
    };
    void IncrementAndShift() {
        if (hFind != INVALID_HANDLE_VALUE && (currentData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)) {
            directoryInsertLocation = futureDirectories.insert(directoryInsertLocation,
                currentDirectory + L"\\" + currentData.cFileName);
            directoryInsertLocation++;
        }
        IncrementCurrentDirectory();
        if (hFind == INVALID_HANDLE_VALUE)
            ShiftToNextDirectory();
    };
public:
    RecursiveEnumeration() {};
    RecursiveEnumeration(const std::wstring& pathSpec) {
        std::wstring::const_iterator lastSlash =
            std::find(pathSpec.rbegin(), pathSpec.rend(), L'\\').base();
        if (lastSlash != pathSpec.begin()) {
            futureDirectories.push_back(std::wstring(pathSpec.begin(), lastSlash-1));
            fileSpec.assign(lastSlash, pathSpec.end());
        } else {
            futureDirectories.push_back(std::wstring());
            fileSpec = pathSpec;
        }
        ShiftToNextDirectory();
    };
    void increment() {
        do {
            IncrementAndShift();
        } while (!PathMatchSpecW(currentData.cFileName, fileSpec.c_str()));
    };
};

struct AllResults
{
    bool operator()(const Win32FindData&) {
        return true;
    };
}; 

struct FilesOnly
{
    bool operator()(const Win32FindData& arg) {
        return arg.IsFile();
    };
};

template <typename Filter_T = AllResults, typename Recurse_T = NonRecursiveEnumeration>
class DirectoryIterator : 
    public boost::iterator_facade<DirectoryIterator<Filter_T, Recurse_T>, Win32FindData, std::input_iterator_tag, Win32FindData>
{
    friend class boost::iterator_core_access;
    boost::shared_ptr<Recurse_T> impl;
    Filter_T filter;
    void increment() {
        do {
            impl->increment();
        } while (! filter(impl->dereference()));
    };
    bool equal(const DirectoryIterator& other) const {
        return impl->equal(*other.impl);
    };
    Win32FindData dereference() const {
        return impl->dereference();
    };
public:
    DirectoryIterator(Filter_T functor = Filter_T()) :
        impl(boost::make_shared<Recurse_T>()),
        filter(functor) {
    };
    explicit DirectoryIterator(const std::wstring& pathSpec, Filter_T functor = Filter_T()) :
        impl(boost::make_shared<Recurse_T>(pathSpec)),
        filter(functor) {
    };
};

}}
+1  A: 

I don't understand why you don't have your iterator produce an object that contains the WIN32_FIND_DATAW data when it's dereferenced instead of referring to this proxy object that gets modified when the iterator is incremented.

The iterator is marked as in input iterator, let the user decide whether they want a copy of that file information it refers to or not. Dereferencing the iterator should return a copyable object with a similar interface to your FileData class that contains appropriate member data that in the class so can be copied as necessary to maintain the object's identity. This class wouldn't need the template parameter indicating how the find operation should be performed, since objects if that class wouldn't really have anything to do with a find, except that a find operation can produce it - this object just contains information about a file.

Then when the iterator is dereferenced, it can give back an object of that type.

What the iterator does when it's incremented doesn't need to have anything to do with the object it returns when it's dereferenced (even if some of that data looks an awful lot like the object that gets returned and/or is used in the increment operation).

I'd probably rename the FileData<> class you have to something like FileFindDataInternal<>, since it's really of use only as part of the internal operation of the iterator.

That would free up the name FileData to be used for the class that wraps the information the user is interested in, and which should be copyable by the user.

Michael Burr
"why you don't have your iterator produce an object that contains the WIN32_FIND_DATAW data when it's dereferenced" <-- Because then every dereference requires copying the entire 580 byte WIN32_FIND_DATA structure, plus the the root path string. But if you think that's an unacceptable tradeoff, then I'll refactor out the proxy.
Billy ONeal
@Michael Burr: Edited the modifications into my question. Is that what you were referring to?
Billy ONeal