views:

125

answers:

3

I've got a unit test I'm writing which seems to have some kind of a pointer problem. Basically, it's testing a class that, once constructed, returns information about a file. If all the files expected are detected, then the test operates correctly. If there are more files expected than detected, then the routine correctly reports error. But if there are more files detected than expected, the executable crashes. This has been difficult to follow because when I try to step through with the debugger, the current code point jumps all over the method -- it doesn't follow line by line like you'd expect.

Any ideas as to what I'm doing incorrectly?

Here's my code:

#include "stdafx.h"
#include "boostUnitTest.h"

#include "../pevFind/fileData.h" //Header file for tested code.
#include "../pevFind/stringUtil.h" //convertUnicode()

struct fileDataLoadingFixture
{
    std::vector<fileData> testSuiteCandidates;
    fileDataLoadingFixture()
    {
     WIN32_FIND_DATA curFileData;
     HANDLE fData = FindFirstFile(L"testFiles\\*", &curFileData);
     if (fData == INVALID_HANDLE_VALUE)
      throw std::runtime_error("Failed to load file list!");
     do {
      if(boost::algorithm::equals(curFileData.cFileName, L".")) continue;
      if(boost::algorithm::equals(curFileData.cFileName, L"..")) continue;
      fileData theFile(curFileData, L".\\testFiles\\");
      testSuiteCandidates.push_back(theFile);
     } while (FindNextFile(fData, &curFileData));
     FindClose(fData);
    };
};

BOOST_FIXTURE_TEST_SUITE( fileDataTests, fileDataLoadingFixture )

BOOST_AUTO_TEST_CASE( testPEData )
{
    std::vector<std::wstring> expectedResults;
    expectedResults.push_back(L"a.cfexe");
    expectedResults.push_back(L"b.cfexe");
    //More files.... 
    expectedResults.push_back(L"c.cfexe");
    std::sort(expectedResults.begin(), expectedResults.end());
    for (std::vector<fileData>::const_iterator it = testSuiteCandidates.begin(); it != testSuiteCandidates.end(); it++)
    {
     if (it->isPE())
     {
      std::wstring theFileString(it->getFileName().substr(it->getFileName().find_last_of(L'\\') + 1 ));
      std::vector<std::wstring>::const_iterator target = std::lower_bound(expectedResults.begin(), expectedResults.end(), theFileString);
      BOOST_REQUIRE_MESSAGE(*target == theFileString, std::string("The file ") + convertUnicode(theFileString) + " was unexpected." );
      if (*target == theFileString)
      {
       expectedResults.erase(target);
      }
     }
    }
    BOOST_CHECK_MESSAGE(expectedResults.size() == 0, "Some expected results were not found." );
}

BOOST_AUTO_TEST_SUITE_END()

Thanks!

Billy3

I solved the problem using the following code instead:

BOOST_AUTO_TEST_CASE( testPEData )
{
    std::vector<std::wstring> expectedResults;
    expectedResults.push_back(L"blah.cfexe");
    //files
    expectedResults.push_back(L"tail.cfexe");
    expectedResults.push_back(L"zip.cfexe");
    std::vector<std::wstring> actualResults;
    for(std::vector<fileData>::const_iterator it = testSuiteCandidates.begin(); it != testSuiteCandidates.end(); it++)
    {
     if (it->isPE()) actualResults.push_back(it->getFileName().substr(it->getFileName().find_last_of(L'\\') + 1 ));
    }

    std::sort(expectedResults.begin(), expectedResults.end());
    std::sort(actualResults.begin(), actualResults.end());

    std::vector<std::wstring> missed;
    std::set_difference(expectedResults.begin(), expectedResults.end(), actualResults.begin(), actualResults.end(), std::back_inserter(missed));

    std::vector<std::wstring> incorrect;
    std::set_difference(actualResults.begin(), actualResults.end(), expectedResults.begin(), expectedResults.end(), std::back_inserter(incorrect));

    for(std::vector<std::wstring>::const_iterator it = missed.begin(); it != missed.end(); it++)
    {
     BOOST_ERROR(std::string("The file ") + convertUnicode(*it) + " was expected but not returned.");
    }

    for(std::vector<std::wstring>::const_iterator it = incorrect.begin(); it != incorrect.end(); it++)
    {
     BOOST_ERROR(std::string("The file ") + convertUnicode(*it) + " was returned but not expected.");
    }

    BOOST_CHECK(true); //Suppress commandline "No assertions" warning

}
+2  A: 

What do you think this will return, in the event that the file was unexpected? It looks like you expect it to be a valid file name.

std::vector<std::wstring>::const_iterator target = std::lower_bound(expectedResults.begin(), expectedResults.end(), theFileString);

It will in fact be an iterator at one past the end of the array - you cannot treat it as a valid pointer:

BOOST_REQUIRE_MESSAGE(*target == theFileString, std::string("The file ") + convertUnicode(theFileString) + " was unexpected." );

Normall, you compare the result with the value of end() (fixed):

BOOST_REQUIRE_MESSAGE(target != expectedResults.end() || *target == theFileString, std::string("The file ") + convertUnicode(theFileString) + " was unexpected." );
if (target != expectedResults.end() && *target == theFileString)
{
  expectedResults.erase(target);
}

See in this example here, lower_bound will return off-then-end values:

int main()
{
  int A[] = { 1, 2, 3, 3, 3, 5, 8 };
  const int N = sizeof(A) / sizeof(int);

  for (int i = 1; i <= 10; ++i) {
    int* p = lower_bound(A, A + N, i);
    cout << "Searching for " << i << ".  ";
    cout << "Result: index = " << p - A << ", ";
    if (p != A + N)
      cout << "A[" << p - A << "] == " << *p << endl;
    else
      cout << "which is off-the-end." << endl;
  }
}

The output is:

Searching for 1.  Result: index = 0, A[0] == 1
Searching for 2.  Result: index = 1, A[1] == 2
Searching for 3.  Result: index = 2, A[2] == 3
Searching for 4.  Result: index = 5, A[5] == 5
Searching for 5.  Result: index = 5, A[5] == 5
Searching for 6.  Result: index = 6, A[6] == 8
Searching for 7.  Result: index = 6, A[6] == 8
Searching for 8.  Result: index = 6, A[6] == 8
Searching for 9.  Result: index = 7, which is off-the-end.
Searching for 10.  Result: index = 7, which is off-the-end.

You cannot safely dereference the off-the-end value, but you can use it for the purpose of comparison, or as an insertion point.

Why are you using lower_bound anyway? For the purposes of search, surely you should use find? If you are using lower_bound, it will return a position where the file name could be inserted, but not necessarily equal to the file name you are looking for. In other words, you not only need to compare with the off-the-end value, but also with the file name, in case it returned something valid.

Here is a version that uses find. As you can see it is simpler than the fixed version above.

std::vector<std::wstring>::const_iterator target = std::find(expectedResults.begin(), expectedResults.end(), theFileString);
BOOST_REQUIRE_MESSAGE(target != expectedResults.end(), std::string("The file ") + convertUnicode(theFileString) + " was unexpected." );
if (target != expectedResults.end())
{
  expectedResults.erase(target);
}
1800 INFORMATION
From SGI's docs on std::lower_bound: Specifically, it returns the first position where value could be inserted without violating the ordering. from http://www.sgi.com/tech/stl/lower_bound.html Indicates that the iterator returned should be valid even if the item is not found. std::find returns end on failure... I don't believe lower_bound does.
Billy ONeal
Oh, and editing the code as you suggest causes the test in question to produce incorrect results.
Billy ONeal
On the page you linked to, there is a specific example which shows it will return an off-the-end value
1800 INFORMATION
@Billy: insert functions in the STL insert *before* the position iterator. Therefore, when the searched element is greater than all elements in the range, lower_bound returns an end iterator, which you must not dereference (although you can for example pass it to vector::insert)
Steve Jessop
I fixed the code suggestion, although, do you think you should be using find instead of lower_bound? It would make the code simpler
1800 INFORMATION
Lower_Bound is an O(lg n) operation, whereas find is O(n). This is why I did it that way.. I guess I'll just do this differently. Thanks for pointing out the bug though. Checkmarking due to pointing out that bug, even if it didn't actually solve the problem.
Billy ONeal
If you are concerned about the runtime, maybe you should use a map? A better data structure can optimise performance too
1800 INFORMATION
See the code I posted above that fixed the problem :) Thanks! Should spend more time in STL manuals :P
Billy ONeal
@1800: I wouldn't expect a map to be any faster for searching than an ordered vector with lower_bound. About log(N) comparisons either way, although the map might be somewhat unbalanced whereas a binary search never is. Am I missing something? Map is faster for insertion and deletion at the middle, of course.
Steve Jessop
A: 

"the current code point jumps all over the method -- it doesn't follow line by line like you'd expect"

This usually means you have more than 1 thread hitting the method at a time. I usually see this when more than 1 object on my aspx page causes a postback at the same time.

JBrooks
Do you see any threads in the above code?
Billy ONeal
I guess you are debugging an optimised build - debugging those tends to be more difficult
1800 INFORMATION
Nope. Debug build. No reason to build a release build for unit tests ;)
Billy ONeal
No I don't see any threads, but then again you wouldn't see any in the example I gave either. There is more than 1 thread when 2 items cause a postback.
JBrooks
A: 

"the current code point jumps all over the method -- it doesn't follow line by line like you'd expect"

Check your compiler options! Have you got debug on and optimise off?

Optimising compilers often completly "rewrite" your code changing the order of instructions, unrolling loops etc. etc. so that the debugger has difficulty in mapping an executable instruction to the original line of code.

James Anderson
It's running in debug mode :(
Billy ONeal