Hi all, I'm using Dipperstein's bitarray.cpp class to work on bi-level (black and white) images where the image data is natively stored as one bit per pixel.
I need to iterate through every bit, on the order of 4 to 9 megapixels per image, over hundreds of images, using a for loop, something like:
    for (int i = 0; i < imgLength; i++) {
        if (myBitArray[i] == 1) {
            // ... do stuff ...
        }
    }
Performance is usable, but not amazing. I ran the program through gprof and found that a significant share of the time, and tens of millions of calls, go to std::vector methods such as begin() and the iterator constructors and operators. Here are the top-sampled functions:
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 37.91      0.80     0.80        2     0.40     1.01  findPattern(bit_array_c*, bool*, int, int, int)
 12.32      1.06     0.26 98375762     0.00     0.00  __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > >::__normal_iterator(unsigned char const* const&)
 11.85      1.31     0.25 48183659     0.00     0.00  __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > >::operator+(int const&) const
 11.37      1.55     0.24 49187881     0.00     0.00  std::vector<unsigned char, std::allocator<unsigned char> >::begin() const
  9.24      1.75     0.20 48183659     0.00     0.00  bit_array_c::operator[](unsigned int) const
  8.06      1.92     0.17 48183659     0.00     0.00  std::vector<unsigned char, std::allocator<unsigned char> >::operator[](unsigned int) const
  5.21      2.02     0.11 48183659     0.00     0.00  __gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, std::allocator<unsigned char> > >::operator*() const
  0.95      2.04     0.02                             bit_array_c::operator()(unsigned int)
  0.47      2.06     0.01  6025316     0.00     0.00  __gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > >::__normal_iterator(unsigned char* const&)
  0.47      2.06     0.01  3012657     0.00     0.00  __gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > >::operator*() const
  0.47      2.08     0.01  1004222     0.00     0.00  std::vector<unsigned char, std::allocator<unsigned char> >::end() const
... remainder omitted ...
I'm not really familiar with C++'s STL, but can anyone shed light on why, for instance, std::vector::begin() is being called tens of millions of times (about 49 million in the profile above)? And, of course, is there something I can do to speed it up?
Edit: I just gave up and optimized the search function (the loop) instead.
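For anyone landing here later, a minimal sketch of one way a loop like this can be restructured. It is not the code from the program above. The call counts in the profile are consistent with each bit lookup going through std::vector's operator[] and its iterator helpers, and such tiny functions usually only show up as separate calls when they are not inlined (for example in a build without optimization). The sketch sidesteps them by reading whole bytes through a raw pointer and skipping bytes that are all zero. forEachSetBit is a hypothetical helper, not part of bitarray.cpp, and the MSB-first packing is an assumption about the storage layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT part of bitarray.cpp: calls onSetBit(i) for every
// set bit among the first bitCount bits of a packed buffer.
// Assumption: bit i lives in byte i / 8, most significant bit first.
template <typename Visitor>
void forEachSetBit(const std::vector<unsigned char>& bytes,
                   std::size_t bitCount, Visitor onSetBit)
{
    assert(bitCount <= bytes.size() * 8);        // buffer must hold bitCount bits
    const unsigned char* data = bytes.data();    // raw pointer, fetched once
    const std::size_t byteCount = (bitCount + 7) / 8;

    for (std::size_t b = 0; b < byteCount; ++b) {
        const unsigned char byte = data[b];
        if (byte == 0) {
            continue;                            // skip eight clear pixels at once
        }
        for (unsigned bit = 0; bit < 8; ++bit) {
            const std::size_t i = b * 8 + bit;
            if (i >= bitCount) {
                break;                           // ignore padding bits in the last byte
            }
            if (byte & (0x80u >> bit)) {
                onSetBit(i);                     // stands in for "do stuff"
            }
        }
    }
}
```

The visitor can be any callable taking the bit index, e.g. a lambda that records or processes the pixel. Whether skipping zero bytes helps depends on how sparse the set pixels are, so it is worth re-profiling on real images.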