tags:

views:

2093

answers:

7

what is a good way to select a random element from a map? C++. It is my understanding that maps don't have random access iterators. The key is a long long and the map is sparsely populated.

+9  A: 
map<...> MyMap;
iterator item = MyMap.begin();
std::advance( item, random_0_to_n(MyMap.size()) );
James Curran
I haven't tried this yet but if it works it looks perfect. Will try it when i get home. what #includes does this require?
Deathbob
James Curran
...and make sure your random_0_to_n() is always < n :)
Constantin
+7  A: 

I like James' answer if the map is small or if you don't need a random value very often. If it is large and you do this often enough to make speed important you might be able to keep a separate vector of key values to select a random value from.

map<...> MyMap;
vector<...> MyVecOfKeys; // <-- add keys to this when added to the map.

map<...>::key_type key = MyVecOfKeys[ random_0_to_n(MyVecOfKeys.size()) ];
map<...>::data_type value = MyMap[ key ];

Of course if the map is really huge you might not be able to store a copy of all the keys like this. If you can afford it though you get the advantage of lookups in logarithmic time.

ryan_s
The map isn't very big right now so i'm interested in doing it the easy way, but it could grow to be large. It might then be worth customizing the map class a little bit to allow me to pick a random key in this manner without storing a redundant vector. Is there a way to do that already?
Deathbob
I don't think there is a good way to get at the keys of a map without resorting to traversing the map in linear time, owing to the fact that it is stored as a tree. Are you sure a map is the best data structure for your purposes?
ryan_s
I think we can speed this up a little, see my answer.
Constantin
+1  A: 

Maybe you should consider Boost.MultiIndex, although note that it's a little too heavy-weighted.

pongba
+3  A: 

Maybe draw up a random key, then use lower_bound to find the closest key actually contained.

Assaf Lavie
If the keys are evenly distributed that might be a good idea. If they aren't you'd wind up getting one key more often than the others.
ryan_s
A: 

Here is the case when all map items must be access in random order.

  1. Copy the map to a vector.
  2. Shuffle vector.

In pseudo-code (It closely reflects the following C++ implementation):

import random
import time

# populate map by some stuff for testing
m = dict((i*i, i) for i in range(3))
# copy map to vector
v = m.items()
# seed PRNG   
#   NOTE: this part is present only to reflect C++
r = random.Random(time.clock()) 
# shuffle vector      
random.shuffle(v, r.random)
# print randomized map elements
for e in v:
    print "%s:%s" % e, 
print

In C++:

#include <algorithm>
#include <iostream>
#include <map>
#include <vector>

#include <boost/date_time/posix_time/posix_time_types.hpp>
#include <boost/foreach.hpp>
#include <boost/random.hpp>

int main()
{
  using namespace std;
  using namespace boost;
  using namespace boost::posix_time;

  // populate map by some stuff for testing
  typedef map<long long, int> Map;
  Map m;
  for (int i = 0; i < 3; ++i)
    m[i * i] = i;

  // copy map to vector
#ifndef OPERATE_ON_KEY
  typedef vector<pair<Map::key_type, Map::mapped_type> > Vector;
  Vector v(m.begin(), m.end());
#else
  typedef vector<Map::key_type> Vector;
  Vector v;
  v.reserve(m.size());
  BOOST_FOREACH( Map::value_type p, m )
    v.push_back(p.first);
#endif // OPERATE_ON_KEY

  // make PRNG
  ptime now(microsec_clock::local_time());
  ptime midnight(now.date());
  time_duration td = now - midnight;
  mt19937 gen(td.ticks()); // seed the generator with raw number of ticks
  random_number_generator<mt19937, 
    Vector::iterator::difference_type> rng(gen);

  // shuffle vector
  //   rng(n) must return a uniformly distributed integer in the range [0, n)
  random_shuffle(v.begin(), v.end(), rng);

  // print randomized map elements
  BOOST_FOREACH( Vector::value_type e, v )
#ifndef OPERATE_ON_KEY
    cout << e.first << ":" << e.second << " ";
#else
    cout << e << " ";
#endif // OPERATE_ON_KEY
  cout << endl;
}
J.F. Sebastian
+1  A: 

Continuing ryan_s theme of preconstructed maps and fast random lookup: instead of vector we can use a parallel map of iterators, which should speed up random lookup a bit.

map<K, V> const original;
...

// construct index-keyed lookup map   
map<unsigned, map<K, V>::const_iterator> fast_random_lookup;
map<K, V>::const_iterator it = original.begin(), itEnd = original.end();
for (unsigned i = 0; it != itEnd; ++it, ++i) {
  fast_random_lookup[i] = it;
}

// lookup random value
V v = *fast_random_lookup[random_0_to_n(original.size())];
Constantin
+2  A: 

If your map is static, then instead of a map, use a vector to store your key/value pairs in key order, binary search to look up values in log(n) time, and the vector index to get random pairs in constant time. You can wrap the vector/binary search to look like a map with a random access feature.

ejgottl