views:

69

answers:

2

So, I'm just playing around implementing some sorting algorithms in C++, but I'm finding it irritating to benchmark them at the moment, due to the length of time it takes to not run the algorithm, but to create the input data. I currently test each length of input (1000, 2000, ...) 10 times, to gain a somewhat averaged time. For each of these 10 times, I create a new random vector of the correct length, by doing:

    // Each of the 10 times.
    for(int j = 0; j < 10; j++) {

        A.clear();

        // 'i' is the current input size.
        for(int k = 0; k < i; k++) {
            A.push_back(rand() % 10000);
        }

        // Other stuff
    }

Is there a better way to do this? Should I be bothering to cap the rand() at 10000, or is that just my OCD brain liking round numbers? (I.e., could that modulo operation actually be taking a serious amount of time when you consider it's performed up to - at the moment - 10,000 for each loop of the 10.) Alternatively, should I really create a new vector each time I run the sort? I've been doing so because I felt that it's possible that a created vector might be biased, and so if that one was generated and then used 10 times the answer might be quite off...

+1  A: 

A quote from cplusplus.com (http://www.cplusplus.com/reference/stl/vector/), which offers a very useful hint:

"Reallocations may be a costly operation in terms of performance, since they generally involve the entire storage space used by the vector to be copied to a new location. Therefore, whenever large increases in size are planned for a vector, it is recommended to explicitly indicate a capacity for the vector using member function vector::reserve."

Using vector::reserve will almost certainly give a performance increase in your case.

EDIT: You could try using random_shuffle (http://www.cplusplus.com/reference/algorithm/random_shuffle/) to shuffle your vector once it's been created (apparently, random_shuffleis linear in the number of elements).

Greg S
It will almost certainly only do so for the first run. I don't think implementations are really dismissing their capacity on `clear()`. If they would, we wouldn't need the swap trick.
sbi
@sbi: thanks for pointing that out, you're absolutely right, `clear()` won't shrink the vector and apparently, not even `shrink_to_fit()` definitely will (http://stackoverflow.com/questions/2664051/why-is-shrink-to-fit-non-binding).
Greg S
+1  A: 

Is there a better way to do this?

Yup, there are a few things you might want to do here to help speed things up. As mentioned before, reserving space in the std::vector and then assigning values to the known elements, is faster. Also, pre incrementing ( ++var instead of var++ ) is faster when using non optimized compilers. Just for the sake of keeping your code fast, no matter who builds it, you might want to consider doing that from now on. As far as memory goes, you may find it trivial, but when I use known sizes that are unsigned, and not unreasonably large, I use unsigned short's for my for loops.

About the modulo, however. You might want to not use it, if you don't need it. Depending on the data type held in the vector, your results should wrap if they go above the maximum storage capacity of the type.

I don't know off hand if it eats up more processing power having variables wrap, and if it does, I'm still not sure if its a less expensive operation then preforming modulo. Might want to run some speed tests with known sizes before going with rand.

    A.reserve(i * i);
    for(unsigned short j = 0; j < 10; ++j) {            
        for(unsigned short k = 0; k < i; ++k) 
            A[k + (i*10)] = rand();                
        // Other stuff
    }

Edit

Very small change to note: The loop is going only 10 times, so you might as well use an unsigned char, rather than a short. On Win32 at the very least, it takes half the memory.

    A.reserve(i * i);
    for(unsigned char j = 0; j < 10; ++j) {            
        for(unsigned char k = 0; k < i; ++k) 
            A[k + (i*10)] = rand();                
        // Other stuff
    }
Xoorath