tags:
views: 202
answers: 5

Has anyone done temporal unit-testing?

I'm not even sure such lingo has been coined yet, but the point is to test that operations perform within temporal limits. I have a few algorithms, and I want to test that their execution time increases as expected with input size. I guess similar testing could be used for IO and whatnot, kind of like a test_timeout or something.

However, because the hardware affects execution speed, this doesn't seem trivial. So I was wondering if anyone has tried this sort of thing before and could share their experience.

Thanks

Edit: I'm trying to compile a list of things that need to be taken care of in this kind of situation.

+1  A: 

If you want to check whether the time increases, the hardware of different machines shouldn't matter, as long as you check for relative change rather than absolute values. Or am I missing something here?
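For example, a minimal sketch of that idea (in Java; the sort call, the input sizes and the 20x bound are just placeholders for whatever algorithm you're actually measuring):

```java
import java.util.Arrays;
import java.util.Random;

public class RelativeGrowthCheck {

    // Placeholder for the algorithm under test.
    static void algorithmUnderTest(int[] data) {
        Arrays.sort(data);
    }

    static long timeMillis(int n) {
        int[] data = new Random(42).ints(n).toArray();
        long start = System.nanoTime();
        algorithmUnderTest(data);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long small = timeMillis(100_000);
        long large = timeMillis(1_000_000);

        // For an O(n log n) algorithm, 10x the input should cost roughly
        // 10-13x the time; the bound is deliberately loose because the
        // absolute numbers depend entirely on the machine.
        double ratio = (double) large / Math.max(small, 1);
        if (ratio > 20.0) {
            throw new AssertionError("Runtime grew faster than expected: " + ratio + "x");
        }
    }
}
```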

Fabian Steeg
No, not missing anything, it's a good point.
Robert Gould
+1  A: 

I think you could do regression checks over unit-test runtime figures. With a lot of the unit test frameworks you can typically get a report that says testname, executiontime. I know junit/surefire does it. So basically you can compare this to previous runs and establish if any significant changes have taken place. If you keep all of this in a database (with the hostname) you can compare execution times for the same run-time environment to previous test-runs. In this way you don't really write tests for performance but you just assert separately that there have been no significant changes to execution-time.
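As a rough sketch of what that logging could look like (in Java; the file name, the columns and the idea of appending to a CSV are just one possible arrangement, not what surefire itself does):

```java
import java.io.IOException;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.time.Instant;

// Appends (timestamp, hostname, testname, millis) to a CSV after each run,
// so a separate job can diff the figures against earlier runs on the same host.
public class RuntimeLog {

    static final Path LOG = Paths.get("test-runtimes.csv");

    public static void record(String testName, long millis) throws IOException {
        String host = InetAddress.getLocalHost().getHostName();
        String line = String.format("%s,%s,%s,%d%n", Instant.now(), host, testName, millis);
        Files.write(LOG, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        // ... run the test or benchmark body here ...
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        record("MyAlgorithmTest", elapsed);
    }
}
```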

krosenvold
interesting idea!
Robert Gould
+2  A: 

The closest thing I know of that's built into a unit testing framework is the timed tests that were added in JUnit 4. These can be used to ensure that an algorithm's performance doesn't degrade as input size increases.
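For reference, a JUnit 4 timed test looks like this (the one-second limit, the input size and the sort call are only illustrative):

```java
import java.util.Arrays;
import java.util.Random;
import org.junit.Test;

public class AlgorithmTimingTest {

    // JUnit 4 fails the test if it runs longer than the timeout, in milliseconds.
    @Test(timeout = 1000)
    public void sortsLargeInputWithinOneSecond() {
        int[] data = new Random(1).ints(1_000_000).toArray();
        Arrays.sort(data);  // stand-in for the real algorithm under test
    }
}
```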

Bill the Lizard
good link, the idea of timeouts is cool, but can't be done easily on my stack :(
Robert Gould
+4  A: 

Just some notes from my experience... We care about the performance of many of our components and have a very unittest-like framework to exercise and time them (with hindsight, we should have just used CppUnit or boost::test like we do for unittests). We call these "component benchmarks" rather than unittests.

  • We don't specify an upper limit on time and then pass/fail... we just log the times (this is partly related to customer reluctance to actually give hard performance requirements, despite performance being something they care about a lot!). (We have tried pass/fail in the past and had a bad experience, especially on developer machines... too many false alarms because an email arrived or something was indexing in the background.)
  • Developers working on optimisation can just work on getting the relevant benchmark times down without having to build a whole system (much the same as unittests let you focus on one bit of the codebase).
  • Most benchmarks time several iterations of something. Lazy creation of resources can mean the first use of a component has considerably more "setup time" associated with it. We log "1st", "average subsequent" and "average all" times (see the sketch after this list). Make sure you understand the cause of any significant differences between these. In some cases we benchmark setup times explicitly as an individual case.
  • Ought to be obvious, but: just time the code you actually care about, not the test environment setup time!
  • For benchmarks you end up testing "real" cases a lot more than you do in unittests, so test setup and test runtime tend to be a lot longer.
  • We have an autotest machine run all the benchmarks nightly and post a log of all the results. In theory we could graph it or have it flag components which have fallen below target performance. In practice we haven't got around to setting anything like that up.
  • You do want such an autotest machine to be completely free of other duties (e.g. if it's also your SVN server, someone doing a big checkout will make it look like you've had a huge performance regression).
  • Think about other scalar quantities you might want to benchmark besides time and plan to support them from the start. For example, "compression ratio achieved", "Skynet AI IQ"...
  • Don't let people do any analysis of benchmark data from sub-minimum-spec hardware. I've seen time wasted because of a design decision made as a result of a benchmark run on someone's junk laptop, when a run on the target platform - a high-end server - would have indicated something completely different!
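As promised above, a much-simplified sketch of the "1st / average subsequent / average all" logging (our framework is C++, but the shape is the same in Java; the iteration count and the benchmarked body are placeholders):

```java
import java.util.Arrays;
import java.util.Random;

public class ComponentBenchmark {

    // Times several iterations of a body and logs first / average-subsequent /
    // average-all, rather than asserting a pass/fail threshold.
    static void timeIterations(String name, int iterations, Runnable body) {
        long[] millis = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            body.run();
            millis[i] = (System.nanoTime() - start) / 1_000_000;
        }

        long first = millis[0];
        long sumAll = 0, sumSubsequent = 0;
        for (int i = 0; i < iterations; i++) {
            sumAll += millis[i];
            if (i > 0) sumSubsequent += millis[i];
        }

        // A nightly job can collect and compare these figures across runs.
        System.out.printf("%s: 1st=%dms, avg-subsequent=%dms, avg-all=%dms%n",
                name, first,
                iterations > 1 ? sumSubsequent / (iterations - 1) : 0,
                sumAll / iterations);
    }

    public static void main(String[] args) {
        int[] master = new Random(7).ints(1_000_000).toArray();
        timeIterations("sort-1M-ints", 10, () -> Arrays.sort(master.clone()));
    }
}
```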
timday
Very nice and complete list!
Robert Gould
A: 

If you're working in C++, have a look at http://unittest-cpp.sourceforge.net/

I've not used the time stuff but it is the best (most concise) unit test framework I've found.

Patrick