views:

114

answers:

2

I'm looking for some STL, boost, or similar container to use the same way indexes are used in databases to search for record using a query like this:

select * from table1 where field1 starting with 'X';

or

select * from table1 where field1 like 'X%';

I thought about using std::map, but I cannot because I need to search for fields that "start with" some text, and not those that are "equal to". Beside that, I need it to work on multiple fields (each "record" has 6 fields, for example), so I would need a separate std::map for each one.

I could create a sorted vector or list and use binary search (breaking the set in 2 in each step by reading the element in the middle and seeing if it's more or less than 'X'), but I wonder if there is some ready-made container I could use without reinventing the wheel?

+6  A: 

std::map is fine, or std::set if there's no data other than the string. Pass your prefix string into lower_bound to get the first string which sorts at or after that point. Then iterate forward through the map until you hit the end or find an element which doesn't begin with your prefix.

Steve Jessop
but that will do the exact comparison of key. How to implement the "starts with 'X'" condition?
Naveen
@Naveen It finds the point at which the prefix would be inserted - it's not the same as find.
anon
ok..I got it now, I was thinking more interms of writing a functor. But this makes more sense.
Naveen
In order to find the end of the range, I think a `lower_bound` approach with the next prefix would be better than a linear search (obviously between the begin of the range we have found and the end of the container).
Matthieu M.
Yes, if you want an iterator pair (to use in an algorithm), rather than to manually visit each element.
Steve Jessop
@Matthieu M Yes this is a clear improvement. It would be interesting to get the next prefix automatically.
Vicente Botet Escriba
Well, in fact the simplest thing would probably be to use a predicate only comparing the prefix combined with the `equal_range` algorithm. The next prefix comes by incrementing the last character of the prefix by 1.
Matthieu M.
+9  A: 

Boost.Multi-Index allows you to manage with several index and it implements the lower_bound as for std::set/map. You will need to select the index corresponding to the field and then do as if it was a map or a set.

Next follows a generic function that could be used to get a couple of iterators, the fist to the first item starting with a given prefix, the second the first item starting with the next prefix, i.e. the end of the search

template <typename SortedAssociateveContainer>
std::pair<typename SortedAssociateveContainer::iterator, 
          typename SortedAssociateveContainer::iterator> 
starts_with(
  SortedAssociateveContainer const& coll, 
  typename SortedAssociateveContainer::key_type const& k)
{
  return make_pair(coll.lower_bound(k), 
                   coll.lower_bound(next_prefix(k));
}

where

  • next_prefix gets the next prefix using lexicographic order based on the SortedAssociateveContainer comparator (of course this function needs more arguments to be completely generic, see the question).

The result of starts_with can be used on any range algorithm (see Boost.Range)

Vicente Botet Escriba
+1, although Steve helps with describing how to search for one field, MultiIndex is the easiest way to index on multiple fields.
Matthieu M.
+1. In defence of my answer, I wrote it before the sentence about multiple fields was added.
Steve Jessop
@Steve: you are right. I also upvoted your answer. However, I have to accept this one, and it is much easier to use.
Milan Babuškov
Sure. Vicente, if you want to just copy my details of how to use `lower_bound`, plus Matthieu's comment about using it twice, you should feel free. On SO, improving your answer to include other people's details is not plagiarism :-)
Steve Jessop
As suggested by Steve I will complete my answer with details more details and the double use of lower_bound to get the begin and end iterator.
Vicente Botet Escriba