views:

244

answers:

8

I have a method, which uses random samples to approximate a calculation. This method is called millions of times, so its very important that the process of choosing the random numbers is efficient.

I'm not sure how fast javas Random().nextInt really are, but my program does not seem to benefit as much as I would like it too.

When choosing the random numbers, I do the following (in semi pseudo-code):

// Repeat this 300000 times
Set set = new Set();
while(set.length != 5)
    set.add(randomNumber(MIN,MAX));

Now, this obviously has a bad worst-case running time, as the random-function in theory can add duplicated numbers for an eternity, thus staying in the while-loop forever. However, the numbers are chosen from {0..45}, so a duplicated value is for the most part unlikely.

When I use the above method, its only 40% faster than my other method, which does not approximate, but yields the correct result. This is ran ~ 1 million times, so I was expecting this new method to be at least 50% faster.

Do you have any suggestions for a faster method? Or maybe you know of a more efficient way of generation a set of random numbers.

To clarify, here is the two methods:

// Run through all combinations (1 million). This takes 5 seconds
 for(int c1 = 0; c1 < deck.length; c1++){
    for(int c2 = c1+1; c2 < deck.length; c2++){
     for(int c3 = c2+1; c3 < deck.length; c3++){
        for(int c4 = c3+1; c4 < deck.length; c4++){
         for(int c5 = c4+1; c5 < deck.length; c5++){
             enumeration(hands, cards, deck, c1, c2, c3, c4, c5);
         }
            } 
      }     
   }
   }

// Approximate (300000 combinations). This takes 3 seconds
Random rand = new Random();
HashSet<Integer> set = new HashSet<Integer>();
int[] numbers = new int[5];
while(enumerations < 300000){
set.clear();
while(set.size() != 5){
    set.add(rand.nextInt(deck.length));
}
Iterator<Integer> i = set.iterator();
int n = 0;
while(i.hasNext()){
    numbers[n] = i.next();
    n++;
}

After some testing and profiling, I found this method to be the most effective:

Random rand = new Random();
int[] numbers = new int[5];
ArrayList<Integer> list = new ArrayList<Integer>();
while(enumerations < 300000){
 while(list.size() != 5) {
     int i = rand.nextInt(deck.length);
        if(!list.contains(i)) list.add(i);
 }
 int index = 0;
 for(int i : list){ numbers[index] = i; index++; }
 enumeration(hands, cards, deck,numbers);
}
+1  A: 

A common technique is to start with a list of all the possible inputs, and randomly select from that, deleting ones as you go. That way there's no risk of selecting a duplicate and having to loop for an unknown amount of time. Of course this method only works with discrete data, but fortunately integers are. Also remember that your list (or other data structure) selection and deletion should be O(1) if possible, since you're focusing on speed.

Tesserex
If the application is what it looks like (a poker odds calculator) then there are 52C5 == 2598960 possible inputs, so fewer than 1/6 of inputs will be used. This is very inefficient use of memory, since the input samples (in a typical poker odds calculator) do not need to be retained in memory after evaluation. This will be even worse in the likely event that the evaluation function is extended to 7-card hands (52C7 == 133784560 combinations)
finnw
yes, you are right - unfortunately that additional information with the actual methods was not in the question when i wrote my answer.
Tesserex
+5  A: 

You can try using an existing Java implementation (or this one) for a Mersenne Twister.

Keep in mind most MT's are not cryptographically secure.

Yuval A
Can you clarify, what you mean by not cryptographically secure?
Frederik Wordenskjold
It means you shouldn't use them for cryptography, because it's still possible to predict the next number given a certain amount of prior information.
Tesserex
+1  A: 

Never guess, always measure.

 long time = System.getCurrentMilliseconds();
 Random().nextInt()
 System.out.println(System.getCurrentMilliseconds() - time);

Also, you should never rely on how infrequent a known bug will happen, just code defensivley so it doesn't. Detect a duplicate, and if it is a duplicate then don't add it, and skip the iteration with a continue statement.

As for fastest methods and random numbers... You can't get random numbers in Java's Math.random(). You can only get pseudo random numbers. How fast you want this to be comes at the sacrifice of how seemingly random you need to them appear. The fastest way to generate a pseudo-random number would involve bit shifting and addition based on the a seed value such as System.getCurrentMilliSeconds() Also, pseudo-random number generation is already pretty fast since it is just raw CPU arithmetic anyway, so you will probably be happy enough once you see how many milliseconds it takes to generate one with Math.random().

Zombies
Never *measure*. Always profile.
Yuval A
@Yuval isn't profiling measuring?
matt b
@Yuval: If you don't measure, you don't know when it's fast enough. Profiling is often invasive. You should measure *and* profile... although you certainly shouldn't measure a *single* call like this.
Jon Skeet
How fast is Math.random() compared with Random().nextInt() ?I'm using the Random-class at the moment.
Frederik Wordenskjold
@Frederik: Math.random() uses Random.nextDouble() internally, so it's slower, if anything.
Michael Borgwardt
+2  A: 

You could use linear congruence as a random generator: http://en.wikipedia.org/wiki/Linear_congruential_generator [yet consider their statistical disadvantages]

You only need a calculation of (x + c) % m for each number. Yet, in my experience the creation of objects (like you might do with every call of new Set and add, depending on which implementation you use) might cost you more speed than a call to nextInt(). Maybe you should try a profiler like e.g. this one: http://www.eclipse.org/tptp/

Searles
I'm running os x, so I cannot use the eclipse tptp profiler! I really miss a profiler!
Frederik Wordenskjold
I've once used JProfiler on Mac OS X. Afaik they have a free trial period of 14 days.
Searles
Thx, I'll give it a try.
Frederik Wordenskjold
NetBeans' profiler is an alternative on MAc OS X.
trashgod
+1  A: 

I don't have any input on your actual problem, and I don't know too much Java (just poking around). However it seems to me that you are trying to build a hand evaluator for poker and this thread http://pokerai.org/pf3/viewtopic.php?f=3&amp;t=16 contains some extremely fast java hand evaluators. Hopefully some of this code could be of help.

RD
I've actually been inspired by some algorithms in this thread. I'm implementing an omaha-evaluator though, so many of the stuff in this thread, like heads-up lookup tables, I cannot use.
Frederik Wordenskjold
+1  A: 

It looks like you want to select a k-combination from a set S without replacement, with S having n distinct values, k = 5 and n = 52. You can shuffle() the entire set and select k elements (as @Tesserex suggests), or pick() k elements while avoiding duplicates (as you've shown). You'll want to profile both in your particular environment and for your chosen generator. I often, but not always, see a modest edge for pick().

private static final Random rnd = new Random();
private static final int N = 52;
private static final int K = 5;
private static final List<Integer> S = new ArrayList<Integer>(N);
static {
    for (int i = 0; i < N; i++) {
        S.add(i + 1);
    }
}
private final List<Integer> combination = new ArrayList<Integer>(K);

...

private void shuffle() {
    Collections.shuffle(S, rnd);
    combination.addAll(S.subList(0, K));
}

private void pick() {
    for (int i = 0; i < K; i++) {
        int v = 0;
        do {
            v = rnd.nextInt(N) + 1;
        } while (combination.contains(v));
        combination.add(v);
    }
}
trashgod
A: 

If you're being slowed down by the fact that you have to skip over duplicates, you could solve that problem by creating a list of all the card values, and then removing from the list as cards are selected and choosing a random number in a smaller range the next time. Something like this:

// Assuming we're just numbering all the cards 0 to 51. This could be more sophisticated, of course.
ArrayList cards=new ArrayList(52);
for (int x=0;x<52;++x)
  cards=new Integer(x);

Integer[] hand=new Integer[5];
for (int h=0;h<5;++h)
{
  // Pick a card from those remaining
  int n=random.nextInt(cards.size());
  hand[h]=cards.get(n);
  // Remove the picked card from the list
  cards.remove(n);
}

For the first draw, cards.get(n) will return n, no matter what n is. But from then on, values will be removed so cards.get(3) might return 7, etc.

Creating the list and removing from it adds a bunch of overhead. My guess would be that if you're only picking 5 cards at a time, the probabilty of collisions is small enough that eliminating duplices after you find them would be faster than preventing them. Even on the last draw, the probability of a duplicate is only 4/52=1/13, so you'd rarely hit a duplicate at all and the probability that 2 draws in a row would both be duplicates would be tiny. It all depends on how long it takes to generate a random number compared to how long it takes to set up the array and do the removes. The easiest way to tell would be to do some experiments and measure. (Or profile!)

Jay
Exactly what I thought - that the probability of a duplicate is so small, that the times it takes to prevent them would take a longer time than just checking for them. I've updated the OP with my result.
Frederik Wordenskjold
A: 

Don't try to develop your known random num generator. Use a known one like SecureRandom instead:

http://www.owasp.org/index.php/Using_the_Java_Cryptographic_Extensions

Raj Kaimal