There are a couple of sources for random in LoadRunner scenarios:

  • The rand() function
  • Random think-time deltas (run-time settings)
  • Random pacing-time components (run-time settings)
  • Random parameters (as part of the VuGen script)

I use those features, and I can live with their pseudorandomness. What I cannot live with is that every scenario run containing at least one of them behaves pseudorandomly AND non-deterministically. For a given start state (random seed), I want two runs to generate EXACTLY the same load, including timing (pacing and think times); that is, I want two runs to be based on EXACTLY the same random sequences. That means I want to seed all random generators myself, as part of the initialization of each run.

I can use srand() to set a seed value for rand(). Setting a specific (hard-coded) seed on init always produces the same rand() sequence -- for all virtual users. If I seed with the Vuser ID instead, every vuser even gets a different rand() sequence, while each user's sequence still stays the same from run to run.

What about the other pseudorandom sources in LR, beyond rand()? Is there a way to seed them all, so that I get deterministic scenario behavior?

I think that would greatly help.

Without a mechanism like that, one has to plan very long and/or very high-traffic test scenarios in order to "average out" the randomness in the result statistics -- which is what I end up doing all day. (Do you agree with this?)

+1  A: 

The short answer to your question is: NO.

Random implies just what it says => "Random". 

If you use the "built-in" random features of parameters you are pretty much screwed, as you have no control over how the internal random seeds are initialized, and you cannot predict the next value in any way.

If what you ultimately want to achieve is to extrapolate the results and predict server behavior under load you are in for a very rocky road.

Extrapolating results

Say you run with 100 vusers and achieve an avg. of 50-60 hits/sec with
response times under 3 sec.

Logically, 1000 vusers (10x the load) would give you 500-600 hits/sec ...

But what about the response times? How do you extrapolate them? How do you know
when the web server(s) choke and reach their knee point?

Remember that hits/sec is directly tied to the response times... so predicting hits/sec (or pages/sec) becomes very difficult and inaccurate.

Things you can not control

Even if you could achieve an "exact" copy of another run, you still have to deal with the response times and network delays, which are in effect always different, regardless of circumstances (and totally out of your control).

A more "realistic" way to define load

Load testing in itself is not an exact science, and no load test can ever simulate the real world completely, but we can get close. The way we do it here is to simulate the individual users as closely as possible. This way we can set the load expectations according to user types, something that the "business" people usually have a clue about.

We also divide the "users" into types, such as power, normal or novice users -- the difference between them is the speed at which they operate (and the way they use the UI).

By doing this we can "load" the target application with a certain "expected user load" instead of pages/sec, hits/sec, or other technical metrics.

We also execute longer runs to see how the service behaves over time, so a 72h or longer test is not unusual for the endurance test phase. This also shows whether there are any memory leaks on the servers over time, and how background processes impact server performance during "night time".

K.Sandell
I see, and agree with the "72h or more test is not unusual for the Endurance test phase" statement, which is where I'm coming from. I strongly disagree, however, with "Random implies just what it says => 'Random'.", since we're talking about pseudo-random generators, which are 100% deterministic and thus reproducible if you have control over the seed values -- and still pseudo-random if you don't. So if I *do* have access to all seeds, I don't see why I shouldn't make a certain load test run more reproducible by generating exactly the same load -- depending on the test goal.
TheBlastOne
The situation at hand is that the randomly chosen test cases lead to very, very different data sets (in terms of complexity) to be processed, and I have no cheap or reliable way to categorize my test-case data set so as to select only the complex, or only the cheap, test cases. So selecting anything randomly (without controlling *all* seeds) forces me to execute very, very long load-test runs so that the huge differences between the test cases "average out" and the run returns useful results.
TheBlastOne
"This way we can set the load expectations according to user-types, something that the "business" people usually have a clue about." -- yeah, *usually* ;)
TheBlastOne