Hello,

I have recently (and repeatedly) been asked by customers how many MIPS are needed to run our software. Usually we were able to deflect these questions by explaining to the customer that it really depends on the CPU/OS/hardware (our software is highly portable) and/or the use case (i.e. how our software is used).

But the latest customer is not only very stubborn but also has good reasons to be stubborn. :) He wants an estimate because he is not sure his hardware has enough power to run our software, so buying the software before getting this estimate makes no sense. (We can't provide a demo/evaluation version, since getting it to run on this specific platform would require a significant amount of work.)

And now the question: does anybody have experience with such a task, on any piece of hardware with any software? Any real-life example would be really helpful. I can run our software on many operating systems and many hardware platforms, so if you know of any tool for such an estimate on any hardware, there is a chance I can use it or at least get an idea. For now I only know how to measure CPU load on eCosPro OS.

Edit:

Using a probe is actually a good idea: assuming I can create a controlled environment where only my software is running, every instruction I count is mine, and I guess the probe has an interface to do it. I actually have a few different hardware debuggers, and if somebody has experience with how to do this, that would be really good. Anyway, I'm going to read some documentation tomorrow and will hopefully find something in this direction.

+1  A: 

Do you have a simulator or debugger probe which can give you an instruction count? You don't even need to do it for the right target platform; a rough instruction count will do.

If all else fails, run it on whatever platform you have access to and scale the runtime by the quotient your-MHz/customer's-MHz. This should give you a very rough estimate of what kind of runtimes the customer will experience.
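
As a minimal sketch of that fallback, the scaling is a single multiplication. The function name and the numbers below are hypothetical, and the model assumes runtime depends on clock frequency only, ignoring differences in architecture, caches and memory speed:

```python
def scale_runtime(measured_seconds, reference_mhz, customer_mhz):
    """Scale a runtime measured on a reference CPU to the customer's CPU.

    Assumes runtime is inversely proportional to clock frequency,
    which is only a very rough approximation.
    """
    if reference_mhz <= 0 or customer_mhz <= 0:
        raise ValueError("clock frequencies must be positive")
    return measured_seconds * (reference_mhz / customer_mhz)


# Hypothetical numbers: 0.8 s measured at 400 MHz, customer CPU at 100 MHz.
estimate = scale_runtime(0.8, reference_mhz=400.0, customer_mhz=100.0)
print(f"estimated runtime on customer CPU: {estimate:.2f} s")
```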

JesperE
Great idea, do you know *any* probe that can do it? An estimate on my platform will satisfy the customer.
Ilya
I don't have much experience with different debugger probes, unfortunately; I only have a rough idea of what you can do with them.
JesperE
Thanks anyway, I will post the result here once I get one :)
Ilya
A: 

Instructions per second (I/S) is a "weak" metric for a system with an operating system.

In the nitty-gritty, what you have to do is

  1. Figure out the worst-case instruction path and count how many cycles it takes to execute (this means reading the assembly for that CPU and reviewing the CPU handbook that tells you the number of cycles per instruction).
  2. Now figure out your real-time constraints.
  3. Then you use #1 for worst-case cycles. Adjust the CPU upward/downward until it fits in the real-time constraints.
  4. Add a fudge factor.
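
The four steps above can be sketched as a small calculation. The function name, the default fudge factor of 1.5 and the example numbers are assumptions for illustration; the worst-case cycle count itself still has to come from the manual analysis in step 1:

```python
def required_clock_hz(worst_case_cycles, deadline_seconds, fudge_factor=1.5):
    """Minimum clock frequency that fits the worst-case path into the deadline.

    worst_case_cycles: cycles of the worst-case path (step 1)
    deadline_seconds:  real-time constraint for that path (step 2)
    fudge_factor:      safety margin, >= 1.0 (step 4)
    """
    if worst_case_cycles <= 0 or deadline_seconds <= 0:
        raise ValueError("cycles and deadline must be positive")
    if fudge_factor < 1.0:
        raise ValueError("fudge factor should not reduce the requirement")
    # Step 3: cycles / time gives the clock rate that just meets the deadline.
    return worst_case_cycles / deadline_seconds * fudge_factor


# Hypothetical numbers: 120,000 cycles worst case, 3 ms deadline.
hz = required_clock_hz(120_000, 0.003)
print(f"required clock: {hz / 1e6:.1f} MHz")
```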
Paul Nathan
This is a very long road ... :) The software is quite big and complex.
Ilya
A: 

The first thing that you're going to need to do is to come up with some sort of criteria for correct operation. This is going to depend very much on the nature of the application - your criteria might include "must execute code x in 3ms", or "must have a latency lower than 100ms". Any criterion that doesn't relate back to a quantitative measurement is going to be difficult as it will be subjective.

Finding your criteria for correct operation will allow you to find the critical portions of code. Bear in mind that these may be found in corner cases rather than normal operation.

If those critical portions of code are small, then counting instructions for your target platform is going to be relatively straightforward. If you've got a simulator, that may be even easier. (Depending on the code, you may need to do a mock-up to ensure that it gets executed, but that will likely still be easier than counting instructions if you've got a big chunk of code to analyse.)

If your critical code is large, then you might have to do something similar to JesperE's suggestion. Unless your application is targeted at an incredibly price-sensitive industry, the chances are that the customer will be willing to accept a little slack in the calculations, so it is better to overestimate than underestimate your CPU requirements.

Where I would differ from JesperE's suggestion is to suggest not concentrating on MHz but rather the actual MIPS of the targets. For example, compile and execute your code on a test platform - if you've got a profiler that may be all the better. Then compile your code for the customer's target and do a rough comparison of the number of instructions in the resultant executable. You can then incorporate this ratio, along with the relative MIPS of the test and target processors, into the calculation of execution time.
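
One possible reading of that calculation, as a rough sketch: scale the time measured on the test platform by the instruction-count ratio of the two executables and by the relative MIPS of the two processors. The function and parameter names and all numbers are hypothetical, and comparing static instruction counts is itself only an approximation of what actually executes:

```python
def estimate_target_runtime(test_seconds, test_instructions,
                            target_instructions, test_mips, target_mips):
    """Estimate execution time on the customer's target.

    test_seconds:        time measured on the test platform
    test_instructions:   instruction count of the test-platform build
    target_instructions: instruction count of the target build
    test_mips, target_mips: MIPS ratings of the two processors
    """
    if min(test_instructions, target_instructions, test_mips, target_mips) <= 0:
        raise ValueError("instruction counts and MIPS ratings must be positive")
    instruction_ratio = target_instructions / test_instructions
    speed_ratio = test_mips / target_mips
    return test_seconds * instruction_ratio * speed_ratio


# Hypothetical numbers: 2.0 s on a 200 MIPS test CPU; the target build has
# 30% more instructions and the target CPU delivers 50 MIPS.
seconds = estimate_target_runtime(2.0, 100_000, 130_000, 200.0, 50.0)
print(f"estimated runtime on target: {seconds:.1f} s")
```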

Andrew Edgecombe
A: 

You say that your software is highly portable, so my suggestion would be to run the software on the platform that is closest in processor architecture, processor instruction set and memory / peripheral bus type. Measure the longest routine that has to run in realtime and then make an estimate as to how long it will run on their architecture.

geometrikal
The question was how to measure ... The idea of measuring did cross my mind :)
Ilya
1. Toggle a pin when you enter and exit the critical routine. Use the scope on the pin to measure the time. 2. Set your compiler to output the assembly files and count the lines of code for each routine for a rough guess of the number of instructions.
geometrikal
This is not really possible; my codebase is larger than the average embedded OS codebase. Finding the longest routine is not enough. Finding the longest execution path is more appropriate, but technically rather hard.
Ilya
+3  A: 

OK, you realize that this is fraught with disclaimers & warnings -- CPU speeds, memory speeds, cache hits, MMU page table flushes, bus contention, etc... (if it's a heavy-duty embedded system) all factor significantly into the decision....

Having said that.... what I would do is this. Get an RTOS (stay with me), perhaps something like FreeRTOS (free, what a surprise) or uC/OS-II (not free for commercial use, maybe $3K). These kernels allow you to instrument the code to count idle CPU cycles (idle task spin loop).

So run your whole application (or the customer's application) as the only (non-idle) task on a board that you guys agree on (e.g. MPC860 board, ARM7 board, etc...). Measure the % CPU on the board via the RTOS. (e.g. "on the Flibber board running at 60 MHz, our application used 12% of the CPU.")
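
As a rough sketch of the arithmetic behind such a measurement: the idle task increments a counter in its spin loop, and comparing the count reached with the application running against the count reached over the same interval with nothing else running gives the load. The function name and the counter values are hypothetical, and real kernels expose this through their own statistics facilities:

```python
def cpu_load_from_idle(idle_count, idle_count_unloaded):
    """CPU load as a fraction (0.0 to 1.0) from idle-loop counters.

    idle_count:          idle-loop iterations with the application running
    idle_count_unloaded: iterations over the same interval with only the
                         idle task running (calibration run)
    """
    if idle_count_unloaded <= 0:
        raise ValueError("calibration count must be positive")
    idle_fraction = idle_count / idle_count_unloaded
    # Clamp, since measurement jitter can push the ratio slightly out of range.
    idle_fraction = max(0.0, min(1.0, idle_fraction))
    return 1.0 - idle_fraction


# Hypothetical counts for a one-second measurement window.
load = cpu_load_from_idle(idle_count=880_000, idle_count_unloaded=1_000_000)
print(f"CPU load: {load * 100:.0f}%")
```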

Without them giving you more, or vice versa, that sounds like a pretty reasonable length to go to for them.

The good thing is that once you've done this, you can re-use the work for other targets and/or boards, maybe the figures will help you increase sales and/or tune/optimize your software.

Good luck!

Dan
That seems like a reasonable approximation.
Paul Nathan
This was the approach I planned to take before I read JesperE's post. And although CPU usage sounds like more relevant information to me, the customer wants MIPS, so I will try to see if I can get the MIPS information from the ICE.
Ilya
Ilya - Sorry, I didn't quite "close the loop" -- let's say at 60 MHz your CPU provides ~50 MIPS.... so if your program uses 10% of the CPU (90% idle), it's using approximately 10% of the 50 MIPS = 5 MIPS.
Dan
Sure, I understand this. It just seems to me that the probe approach is more general: if, for example, I find a way to get MIPS from a Lauterbach, I will be able to get MIPS information on any platform that has an ICE interface, without extra coding on each OS (and we support a large number of OSes).
Ilya
A: 

Most modern debuggers (e.g. RVDS) let you view the number of instructions consumed. In addition, you can use processor emulators to get a decent idea of the instructions consumed without actually running on the platform (if your code is standalone, such as a codec or a cryptographic module, and doesn't depend on the board). Note that this will give you instructions, not cycles. The cycles will be affected by your board details (e.g. wait states, memory access, etc.).

A: 

On two of the processor architectures that I use (MSP430F5X and AVR32) there is a hardware cycle count register built into the processor. I normally have a scheme where, when the processor is not busy, it is placed into a low-power idle state with the processor core halted. There are then two options for working out the actual processor load: I can either set a breakpoint in a periodic timer function and read the number of processor cycles executed, or I can instrument particular processes by reading this register at the start and end of their operation. The processor idle time does not appear in the cycle count, as the CPU is halted for this time.

You do not specify your processor architecture but this capability may be present.
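
A small sketch of the bookkeeping for either option, done on the host with register values read out through the debugger. The 32-bit counter width, the function names and the sample values are assumptions; check the register width for your part:

```python
COUNTER_BITS = 32  # assumed register width, not taken from any datasheet


def elapsed_cycles(start, end, bits=COUNTER_BITS):
    """Cycles between two counter readings, tolerating one wraparound."""
    return (end - start) % (1 << bits)


def cpu_load(busy_cycles, period_seconds, clock_hz):
    """Fraction of the period spent executing.

    Valid only if the counter stops while the core is halted, so the
    counted cycles are busy cycles.
    """
    if period_seconds <= 0 or clock_hz <= 0:
        raise ValueError("period and clock must be positive")
    return busy_cycles / (period_seconds * clock_hz)


# Hypothetical readings taken at two hits of a periodic timer breakpoint.
print(elapsed_cycles(0xFFFFFF00, 0x00000100))
print(f"load: {cpu_load(3_000_000, 1.0, 25e6) * 100:.0f}%")
```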

Ian