This is part of an open source project called JNotify. I am trying to fix the Win32 implementation and it's really driving me nuts. I have already read everything MSDN has on this, and every web post about this sucky API. I am trying to receive file system notifications on Windows using ReadDirectoryChangesW with a completion port.

The behavior I am seeing is that it normally works, but sometimes the buffer I receive when GetQueuedCompletionStatus returns is corrupted in strange ways: either FILE_NOTIFY_INFORMATION.NextEntryOffset points to the record itself (resulting in an endless loop), or something else goes wrong and I receive a bogus file name length. This only happens after I re-watch the directory, never on the first event (but re-watching is required, otherwise you only get one event for that directory).

The test code that crashes everything is trivial: it just watches many directories and creates 2 files in each one.

Here is some relevant code. I can add all of it if you want (the whole thing is not too big), but it feels too big for a question here.

This bit of code creates the completion port. It only runs once, and then I use this completion port for all directories.

_completionPort = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 1);

This is the WatchData constructor, which actually opens the directory handle and associates it with the completion port.

WatchData::WatchData(const WCHAR* path, int mask, bool watchSubtree, HANDLE completionPort)
    :
    _watchId(++_counter), 
    _mask(mask), 
    _watchSubtree(watchSubtree),
    _byteReturned(0),
    _completionPort(completionPort)
{
    _path = _wcsdup(path); 
    _hDir = CreateFileW(_path,
                         FILE_LIST_DIRECTORY | GENERIC_READ | GENERIC_WRITE,
                         FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                         NULL, //security attributes
                         OPEN_EXISTING,
                         FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED, NULL);
    if(_hDir == INVALID_HANDLE_VALUE )  
    {
        throw GetLastError();
    }

    if (NULL == CreateIoCompletionPort(_hDir, _completionPort, (ULONG_PTR)&_watchId, 0))
    {
        throw GetLastError();
    }
}

This is the code running (inside a WatchData object) when I start watching a directory:

int WatchData::watchDirectory()
{
    printf("(Re)watching %ls\n", _path);
    memset(_buffer, 0, sizeof(_buffer));
    memset(&_overLapped, 0, sizeof(_overLapped));
    if( !ReadDirectoryChangesW( _hDir,
                                _buffer,//<--FILE_NOTIFY_INFORMATION records are put into this buffer
                                sizeof(_buffer),
                                _watchSubtree,
                                _mask,
                                &_byteReturned,
                                &_overLapped,
                                NULL))



    {
        return GetLastError();
    }
    else
    {
        return 0;
    }
}

This is the main loop, which runs in its own thread and handles completion events. Note the "should not happen!" bit: it actually happens a lot.

DWORD WINAPI Win32FSHook::mainLoop( LPVOID lpParam )
{
    debug("mainLoop starts");
    Win32FSHook* _this = (Win32FSHook*)lpParam;

    HANDLE hPort = _this->_completionPort;
    DWORD dwNoOfBytes = 0;
    ULONG_PTR ulKey = 0;
    OVERLAPPED* pov = NULL;
    WCHAR name[1024];

    while (_this->_isRunning)
    {
        pov = NULL;
        BOOL fSuccess = GetQueuedCompletionStatus(
                        hPort,         // Completion port handle
                        &dwNoOfBytes,  // Bytes transferred
                        &ulKey,
                        &pov,          // OVERLAPPED structure
                        INFINITE       // Notification time-out interval
                        );
        if (fSuccess)
        {
            if (dwNoOfBytes == 0)
            {
                // can happen after a watch is removed
                continue;
            }
            int wd = *(int*)ulKey;
            EnterCriticalSection(&_this->_cSection);
            WatchData *watchData = _this->find(wd);
            if (!watchData)
            {
                log("mainLoop : ignoring event for watch id %d, no longer in wid2WatchData map", wd);
                LeaveCriticalSection(&_this->_cSection);
                continue;
            }

            //const char* buffer = watchData->getBuffer();
            char buffer[watchData->getBufferSize()];
            memcpy(buffer, watchData->getBuffer(), watchData->getBufferSize());
            LeaveCriticalSection(&_this->_cSection);
            FILE_NOTIFY_INFORMATION *event;
            DWORD i=0;
            do
            {
                event = (FILE_NOTIFY_INFORMATION*)(buffer+i);
                int action = event->Action;
                DWORD len = event->FileNameLength / sizeof(WCHAR);
                for (DWORD k=0;k<len && k < (sizeof(name)-sizeof(WCHAR))/sizeof(WCHAR);k++)
                {
                    name[k] = event->FileName[k];
                }
                name[len] = 0;

                _this->_callback(watchData->getId(), action, watchData->getPath(), name);

                if (i != 0 && event->NextEntryOffset == i)
                {
                    log("should not happen!");
                    break;
                }

                i = event->NextEntryOffset;
            }
            while (i != 0);

            int res = watchData->watchDirectory();
            if (res != 0)
            {
                log("Error watching dir %s : %d",watchData->getPath(), res);
            }
        }
        else
        {
            log("GetQueuedCompletionStatus returned an error");
        }
    }
    debug("mainLoop exits");
    return 0;
}
+2  A: 

I'm pretty sure NextEntryOffset is relative to the current record, not the first record.

...
char* current = buffer;
do
{
    event = (FILE_NOTIFY_INFORMATION*)current;
    ...
    i = event->NextEntryOffset;
    current += i;
}
while (i != 0);
...
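
To make the relative-offset walk concrete, here is a minimal, portable sketch. It does not include windows.h: NotifyHeader is a hypothetical stand-in that I am assuming mirrors the three DWORD fields at the start of FILE_NOTIFY_INFORMATION, and collectNames and appendRecord are made-up helper names, not part of JNotify or the Win32 API. The walk also checks every record against the number of valid bytes (what dwNoOfBytes would be in the main loop), so a bogus length or offset ends the loop instead of reading garbage.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the fixed part of FILE_NOTIFY_INFORMATION:
// three 32-bit fields, followed by FileNameLength bytes of UTF-16 text.
struct NotifyHeader {
    std::uint32_t NextEntryOffset; // bytes from THIS record to the next; 0 = last
    std::uint32_t Action;
    std::uint32_t FileNameLength;  // length in bytes, not characters
};

// Walks the records in buffer[0, size) and returns the file names.
// Stops early if a record would run past the end of the valid data.
std::vector<std::u16string> collectNames(const char* buffer, std::size_t size)
{
    std::vector<std::u16string> names;
    std::size_t pos = 0; // absolute position of the current record
    while (size - pos >= sizeof(NotifyHeader)) {
        NotifyHeader h;
        std::memcpy(&h, buffer + pos, sizeof h);

        std::size_t avail = size - pos - sizeof h;
        if (h.FileNameLength > avail) break; // bogus name length

        std::u16string name(h.FileNameLength / sizeof(char16_t), u'\0');
        std::memcpy(&name[0], buffer + pos + sizeof h,
                    name.size() * sizeof(char16_t));
        names.push_back(name);

        if (h.NextEntryOffset == 0) break;          // last record
        if (h.NextEntryOffset > size - pos) break;  // offset leaves the buffer
        pos += h.NextEntryOffset;                   // relative, so accumulate
    }
    return names;
}

// Hypothetical helper for building test buffers: appends one record,
// padded to a 4-byte boundary as the real records are.
void appendRecord(std::vector<char>& buf, std::uint32_t action,
                  const std::u16string& name, bool last)
{
    NotifyHeader h;
    h.Action = action;
    h.FileNameLength = static_cast<std::uint32_t>(name.size() * sizeof(char16_t));
    std::size_t recordSize = sizeof h + h.FileNameLength;
    recordSize = (recordSize + 3) & ~std::size_t(3);
    h.NextEntryOffset = last ? 0 : static_cast<std::uint32_t>(recordSize);

    std::size_t start = buf.size();
    buf.resize(start + recordSize, 0);
    std::memcpy(buf.data() + start, &h, sizeof h);
    std::memcpy(buf.data() + start + sizeof h, name.data(), h.FileNameLength);
}
```

With three or more records in the buffer, treating NextEntryOffset as absolute sends the third iteration back to a record that was already processed, which would explain the entry that seems to point at itself. In the real code you would cast to FILE_NOTIFY_INFORMATION* and pass dwNoOfBytes as the size; the accumulation of pos is the part that matters.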
Luke
Great, looks like this was the problem. One small thing, please fix your code: the while line should be changed to while(event->NextEntryOffset != 0);
Omry
Don't blame me, I was just using what was in your original code.
Luke
Not blaming you. The original code is correct under the false assumption that NextEntryOffset is an absolute offset. By fixing the bug you just introduced a new small one.
Omry