tags:

views:

196

answers:

3

I have a boost multi_index_container declared as below which is indexed by hash_unique id(unsigned long) and hash_non_unique transaction id(long). Insertion and retrieval of elements is fast but when I delete elements, it is much slower. I was expecting it to be constant time as key is hashed.

e.g To erase elements from container

for 10,000 elements, it takes around 2.53927016 seconds

for 15,000 elements, it takes around 7.137100068 seconds

for 20,000 elements, it takes around 21.391720757 seconds

Is it something I am missing or is it expected behavior?

class Session
{
  public:
    Session() {
      //increment unique id
      static unsigned long counter = 0;
      boost::mutex::scoped_lock guard(mx);
      counter++;
      m_nId = counter;
    }

    unsigned long GetId() {
     return m_nId;
    }
    long GetTransactionHandle(){
    return m_nTransactionHandle;
    }
    ....
private:
  unsigned long m_nId;
  long m_nTransactionHandle;
  boost::mutext mx;
  ....
};
typedef multi_index_container<
  Session*,
  indexed_by< 
    hashed_unique< mem_fun<Session,unsigned long,&Session::GetId> >,
    hashed_non_unique< mem_fun<Session,unsigned long,&Session::GetTransactionHandle> >
    >  //end indexed_by
  > SessionContainer;
typedef SessionContainer::nth_index<0>::type SessionById;


int main() {
  ...
   SessionContainer container;
   SessionById *pSessionIdView = &get<0>(container);
   unsigned counter = atoi(argv[1]);
   vector<Session*> vSes(counter);
   //insert
   for(unsigned i = 0; i < counter; i++) {
     Session *pSes = new Session();
     container.insert(pSes); 
     vSes.push_back(pSes);
   }
   timespec ts;
   lock_settime(CLOCK_PROCESS_CPUTIME_ID, &ts);
   //erase
   for(unsigned i = 0; i < counter; i++) {
      pSessionIdView->erase(vSes[i]->getId());
      delete vSes[i];
   }
   lock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
   std::cout << "Total time taken for erase:" << ts.tv_sec << "." << ts.tv_nsec << "\n";
   return (EXIST_SUCCESS);
}

A: 

In your test code, what value for m_nTransactionHandledo the Session objects receive? Could it be it's the same value for all the objects? If so, erasing will take long, as performance of hashed containers is poor when there are many equal elements. Try assigning different m_nTransactionHandle values on creation to see if this speeds your test up.

Joaquín M López Muñoz
Thx for your hint. yes, possibility of m_nTransactionHandle could be same value and that's why I have used it as hash_non_unique. But primary index m_nId is hashed_unique and I am using that to erase from container. I think non-unique/secondary index values shouldn't impact performance while erasing entry via a primary hashed index. Anyway, I will try that out and let you know.
rjoshi
You are absolutely correct. If I make all both the indexes hashed_unique, performance is almost constant. I even tried making 2nd index as ordered_unique and ordered_non_unique and performance is almost same. So I am wondering why 2nd hashed_non_unique looses performance event though I use hash_unique key to delete the object. I think either it is a bug or design flaw.
rjoshi
A: 

When erasing an element, performance is a function of all the indices making up the container (basically, the element must be erased from every index, not only the index you're currently working with). Hashed indices are badly hurt when there are many equivalent elements, this is not the pattern they're designed to work against.

Joaquín M López Muñoz
A: 

I just found that the performance for hashed_non_unique versus hashed_unique for 2nd index is the almost same except slight overhead of checking duplicate.

The bottleneck was with boost::object_pool. I don't know internal implementation but it seem it is a list where it iterate through the list to find objects. See the link for performance results and source code.

http://joshitech.blogspot.com/2010/05/boost-object-pool-destroy-performance.html

rjoshi