tags:

views:

153

answers:

3

I am in the process of writing an application in which I use the Set class in the C++ STL. I've discovered that the call to set->find() always seems to fail when I query for the last element I inserted. However, if I iterate over the set, I am able to see the element I was originally querying for.

To try to get a grasp on what is going wrong, I've created a sample application that exhibits the same behavior that I am seeing. My test code is posted below.

For the actual application itself, I need to store pointers to objects in the set. Is this what is causing the weird behavior. Or is there an operator I need to overload in the class I am storing the pointer of?

Any help would be appreciated.

#include <stdio.h>
#include <set>

using namespace std;

#define MySet set<FileInfo *,bool(*)(const FileInfo *, const FileInfo*)>

class FileInfo
{
    public:
        FileInfo()
        {
            m_fileName = 0;
        }
        FileInfo( const FileInfo & file )
        {
            setFile( file.getFile() );
        }
        ~FileInfo()
        {
            if( m_fileName )
            {
                delete m_fileName;
                m_fileName = 0;
            }
        }
        void setFile( const char * file )
        {
            if( m_fileName )
            {
                delete m_fileName;
            }
            m_fileName = new char[ strlen( file ) + 1 ];
            strcpy( m_fileName, file );
        }
        const char * getFile() const
        {
            return m_fileName;
        }
    private:
        char * m_fileName;
};

bool fileinfo_comparator( const FileInfo * f1, const FileInfo* f2 )
{
    if( f1 && ! f2 ) return -1;
    if( !f1 && f2 ) return 1;
    if( !f1 && !f2 ) return 0;

    return strcmp( f1->getFile(), f2->getFile() );
}

void find( MySet *s, FileInfo * value )
{
    MySet::iterator iter = s->find( value );
    if( iter != s->end() )
    {
        printf( "Found File[%s] at Item[%p]\n", (*iter)->getFile(), *iter );
    }
    else
    {
        printf( "No Item found for File[%s]\n", value->getFile() );
    }
}

int main()
{
    MySet *theSet = new MySet(fileinfo_comparator);

    FileInfo * profile = new FileInfo();
    FileInfo * shell = new FileInfo();
    FileInfo * mail = new FileInfo();

    profile->setFile( "/export/home/lm/profile" );
    shell->setFile( "/export/home/lm/shell" );
    mail->setFile( "/export/home/lm/mail" );

    theSet->insert( profile );
    theSet->insert( shell );
    theSet->insert( mail );

    find( theSet, profile );

    FileInfo * newProfile = new FileInfo( *profile );

    find( theSet, newProfile );

    FileInfo * newMail = new FileInfo( *mail );

    find( theSet, newMail );

    printf( "\nDisplaying Contents of Set:\n" );
    for( MySet::iterator iter = theSet->begin();
            iter != theSet->end(); ++iter )
    {
        printf( "Item [%p] - File [%s]\n", *iter, (*iter)->getFile() );
    }
}

The Output I get from this is:

Found File[/export/home/lm/profile] at Item[2d458]
Found File[/export/home/lm/profile] at Item[2d458]
No Item found for File[/export/home/lm/mail]

Displaying Contents of Set:
Item [2d478] - File [/export/home/lm/mail]
Item [2d468] - File [/export/home/lm/shell]
Item [2d458] - File [/export/home/lm/profile]

**Edit It's kind of sad that I have to add this. But as I mentioned before, this is a sample application that was pulled from different parts of a larger application to exhibit the failure I was receiving.

It is meant as a unit test for calling set::find on a set populated with heap allocated pointers. If you have a problem with all the new()s, I'm open to suggestions on how to magically populate a set with heap allocated pointers without using them. Otherwise commenting on "too many new() calls" will just make you look silly.

Please focus on the actual problem that was occurring (which is now solved). Thanks.

***Edit

Perhaps I should have put these in my original question. But I was hoping there would be more focus on the problem with the find() (or as it turns out fileinfo_comparator function that acts more like strcmp than less), then a code review of a copy-paste PoC unit test.

Here are some points about the code in the full application itself.

  • FileInfo holds a lot of data along with the filename. It holds SHA1 sums, file size, mod time, system state at last edit, among other things. I have cut out must of it's code for this post. It violates the Rule of 3 in this form (Thanks @Martin York. See comments for wiki link).
  • The use of char* over std::string was originally chosen because of the use of 3rd_party APIs that accept char*. The app has since evolved from then. Changing this is not an option.
  • The data inside FileInfo is polled from a named pipe on the system and is stored in a Singleton for access across many threads. (I would have scope issues if I didn't allocate on heap)
  • I chose to store pointers in the Set because the FileInfo objects are large and constantly being added/removed from the Set. I decided pointers would be better than always copying large structures into the Set.
  • The if statement in my destructor is needless and a left over artifact from debugging of an issue I was tracking down. It should be pulled out because it is unneeded.
+10  A: 

Your comparison function is wrong - it returns bool, not integer as strcmp(3). The return statement should be something like:

return strcmp( f1->getFile(), f2->getFile() ) < 0;

Take a look here.

Also, out of curiosity, why not just use std::set<std::string> instead? STL actually has decent defaults and frees you from a lot of manual memory management.

Nikolai N Fetissov
+1: the fester of `new` is a sign of bad coding in a C++ application.
Matthieu M.
Doh, that's a forehead slapper! Thanks!@Metthieu Please see what a proof of concept application is. When reading code that was pulled out of a larger application to reproduce a problem, it may not make total sense to you. A good coder knows PoC code may look funny but can ignore that to find the true problem.
LukeFu
That's not enough to make fileinfo_comparator() correct. The comparison needs to enforce a strict weak ordering. http://www.sgi.com/tech/stl/StrictWeakOrdering.html This does not happen when either value is NULL. It just so happens it does not fail above because you do not have NULL values in the test set above.
Martin York
+1  A: 

In your constructor:

FileInfo( const FileInfo & file ) 
        { 
            setFile( file.getFile() ); 
        }

m_fileName seems to be not initialized.

Kirill V. Lyadvinsky
Good catch! Thanks!
LukeFu
+2  A: 

It looks to me like your FileInfo doesn't work correctly (at least for use in a std::set). To be stored in a std::set, the comparison function should return a bool indicating that the two parameters are in order (true) or out of order (false).

Given what your FileInfo does (badly designed imitation of std::string), you'd probably be better off without it completely. As far as I can see, you can use std::string in its place without any loss of functionality. You're also using a lot of dynamic allocation for no good reason (and leaking a lot of what you allocate).

#include <set>
#include <iostream>
#include <iterator>
#include <string>

int main() { 
    char *inputs[] = { "/export/home/lm/profile", "/export/home/lm/shell", "/export/home/lm/mail" };
    char *outputs[] = {"Found: ", "Could **not** find: "};

    std::set<std::string> MySet(inputs, inputs+3);

    for (int i=0; i<3; i++)
        std::cout 
            << outputs[MySet.find(inputs[i]) == MySet.end()] 
            << inputs[i] << "\n";

    std::copy(MySet.begin(), MySet.end(), 
        std::ostream_iterator<std::string>(std::cout, "\n"));

    return 0;
}

Edit: even when (or really, especially when) FileInfo is more complex, it shouldn't attempt to re-implement string functionality on its own. It should still use an std::string for the file name, and implement an operator< that works with that:

class FileInfo { 
    std::string filename;
public:
    // ...
    bool operator<(FileInfo const &other) const { 
       return filename < other.filename;
    }
    FileInfo(char const *name) : filename(name) {}
};

std::ostream &operator(std::ostream &os, FileInfo const &fi) { 
    return os << fi.filename;
}

int main() { 
    // std::set<std::string> MySet(inputs, inputs+3);
    std:set<FileInfo> MySet(inputs, inputs+3);

    // ...

    std::copy(MySet.begin(), MySet.end(), 
        std::ostream_iterator<FileInfo>(std::cout, "\n"));
 }
Jerry Coffin
Thanks for the comment. FileInfo in it's true form is actually a lot larger and holds data regarding a files SHA1 sum, size, mod time, state of the system when modified and so on. I trimmed it down because all that extra stuff was unrelated to my issue.
LukeFu