ansaurus

Question

Searching a C++ Vector<custom_class> for the first/last occurence of a value

Answer 1

+1 A:

A binary search seems like your best option here, as long as your vector remains sorted. It's essentially identical, performance-wise, to performing a lookup in a binary tree-structure.

Charles Salvia 2009-10-30 10:58:17

Answer 2

A:

The vector is ordered in terms of time

The start time or the end time?

What is wrong with a naive O(n) search? Remember you are only searching and not sorting. You could use a sorted container as well (if that doesn't go against the basic design).

dirkgently 2009-10-30 10:59:42

End time, sorry, will edit the question.The list is going to be very large (upwards of 2000 elements) and efficiency is very important (I need to process this approx. once a second, with as little program wait as possible), so I'm trying to minimise the search time as much as possible.

ianhales 2009-10-30 11:04:08

What sort of efficiency: insert/delete/find? If it's Which means a std::set is better suited for this purpose (or even a hash_map) if you want to have quick lookup. I have had signifanct performance gains when moving from `vector` to `set` (where the size of the data set ~10K). Also, see http://www.bigzaphod.org/set.vs.vector/

dirkgently 2009-10-30 11:10:18

See http://www.sgi.com/tech/stl/complexity.html and the SO question: http://stackoverflow.com/questions/181693/what-are-the-complexity-guarantees-of-the-standard-containers

dirkgently 2009-10-30 11:12:57

Answer 3

+4 A:

I'd just use find and find_end, and then do something more complicated only if testing showed it to be too slow.

If you're really concerned about lookup performance, you might consider a different data structure, like a map with timestamp as the key and a vector or list of elements as the value.

Kristopher Johnson 2009-10-30 11:04:01

I'm concerned that the time complexity of find and find_end are quite high. The vector will be large, and I really need to minimise lag.

ianhales 2009-10-30 11:15:44

Answer 4

+5 A:

If the vector is sorted and contains the value, std::lower_bound will give you an iterator to the first element with a given value and std::upper_bound will give you an iterator to one element past the last one containing the value. Compare the value with the returned element to see if it existed in the vector. Both these functions use binary search, so time is O(logN).

To compare on tracklet.end->t, use:

bool compareTracklets(const Tracklet &tr1, const Tracklet &tr2) {
    return (tr1.end->t < tr2.end->t);
}

and pass compareTracklets as the fourth argument to lower_bound or upper_bound

interjay 2009-10-30 11:10:29

That might be possible, but how would I make it check on `tracklet.end->t`? I'm relatively new to C++ :)

ianhales 2009-10-30 11:14:26

You use a comparison function or object. I've edited my answer with an example.

interjay 2009-10-30 11:24:57

Answer 5

+1 A:

dirkgently referred to a sweet optimization comparative. But I would in fact not use a std::vector for this.

Usually, when deciding to use a STL container, I don't really consider the performance aspect, but I do consider its interface regarding the type of operation I wish to use.

std::set<T>::find
std::set<T>::lower_bound
std::set<T>::upper_bound
std::set<T>::equal_range

Really, if you want an ordered sequence, outside of a key/value setup, std::set is just easier to use than any other.

You don't have to worry about inserting at a 'bad' position
You don't have problems of iterators invalidation when adding / removing an element
You have built-in methods for searching

Of course, you also want your Comparison Predicate to really shine (hopes the compiler inlines the operator() implementation), in every case.

But really, if you are not convinced, try a build with a std::vector and manual insertion / searching (using the <algorithm> header) and try another build using std::set.

Compare the size of the implementations (number of lines of code), compare the number of bugs, compare the speed, and then decide.

Most often, the 'optimization' you aim for is actually a pessimization, and in those rares times it's not, it's just so complicated that it's not worth it.

Optimization:

Don't
Expert only: Don't, we mean it

Matthieu M. 2009-10-30 13:02:35

ansaurus

tags:

views:

answers:

Searching a C++ Vector<custom_class> for the first/last occurence of a value

related questions