views: 1199
answers: 9

I have always been an embedded software engineer, but usually at Layer 3 or 2 of the OSI stack. I am not really a hardware guy. I have generally worked on telecoms products, usually mobile/cell phones, which typically means something like an ARM7 processor.

Now I find myself in a more generic embedded world, in a small start-up, where I might move to "not so powerful" processors (there's the subjective bit) - I cannot predict which.

I have read quite a bit of debate about using the STL in C++ in embedded systems and there is no clear-cut answer. There are some minor worries about portability, and a few about code size or run time, but I have two major concerns:
1 - exception handling; I am still not sure whether to use it (see http://stackoverflow.com/questions/2226227/embeeded-c-to-use-exceptions-or-not)
2 - I strongly dislike dynamic memory allocation in embedded systems, because of the problems it can introduce. I generally have a buffer pool which is statically allocated at compile time and which serves up only fixed size buffers (if no buffers, system reset). The STL, of course, does a lot of dynamic allocation.
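To make that concrete, here is a rough sketch of the kind of pool I mean (the buffer size, count and reset hook are placeholders for this example only):

    // Minimal sketch of a statically allocated, fixed-size buffer pool.
    // BUF_SIZE, BUF_COUNT and system_reset() are placeholders, not real values.
    #include <cstddef>

    const std::size_t BUF_SIZE  = 128;
    const std::size_t BUF_COUNT = 32;

    extern void system_reset();   // platform-specific reset hook (assumed)

    class BufferPool {
    public:
        BufferPool() : free_head_(0) {
            for (std::size_t i = 0; i < BUF_COUNT; ++i)
                next_[i] = i + 1;              // chain all buffers into a free list
        }

        void* acquire() {
            if (free_head_ == BUF_COUNT)
                system_reset();                // no buffers left: reset, as described
            const std::size_t idx = free_head_;
            free_head_ = next_[idx];
            return buffers_[idx];
        }

        void release(void* p) {
            const std::size_t idx =
                (static_cast<unsigned char*>(p) - &buffers_[0][0]) / BUF_SIZE;
            next_[idx] = free_head_;
            free_head_ = idx;
        }

    private:
        unsigned char buffers_[BUF_COUNT][BUF_SIZE];  // all storage fixed at build time
        std::size_t   next_[BUF_COUNT];
        std::size_t   free_head_;
    };

    static BufferPool g_pool;   // lives in static storage, no heap involved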

Now I have to make the decision whether to use or forego the STL - for the whole company, for ever (it's going into some very core s/w).

Which way do I jump? Super-safe & lose much of what constitutes C++ (imo, it's more than just the language definition) and maybe run into problems later or have to add lots of exception handling & maybe some other code now?

I am tempted to just go with Boost, but 1) I am not sure if it will port to every embedded processor I might want to use and 2) on their website, they say that they don't guarantee/recommend certain parts of it for embedded systems (especially FSMs, which seems weird). If I go for Boost & we find a problem later ....

A: 

The biggest problem with STL in embedded systems is the memory allocation issue (which, as you said, causes a lot of problems).

I'd seriously research creating your own memory management, built by overriding the new/delete operators. I'm pretty sure that with a bit of time, it can be done, and it's almost certainly worth it.
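Something along these lines, as a rough sketch only (pool_alloc/pool_free stand in for whatever pool implementation you end up writing):

    // Sketch: route all dynamic allocation through your own fixed pool.
    // pool_alloc()/pool_free() are hypothetical hooks, not a real library.
    #include <cstddef>
    #include <new>

    void* pool_alloc(std::size_t size);   // assumed to exist elsewhere
    void  pool_free(void* p);             // assumed to exist elsewhere

    void* operator new(std::size_t size) throw(std::bad_alloc) {
        if (void* p = pool_alloc(size))
            return p;
        // On a deeply embedded target you might reset here instead of throwing.
        throw std::bad_alloc();
    }

    void operator delete(void* p) throw() {
        if (p)
            pool_free(p);
    }

    // operator new[] and operator delete[] should be overridden the same way.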

As for the exceptions issue, I wouldn't go there. Exceptions are a serious slowdown of your code, because they cause every single block ({ }) to have code before and after, allowing the catching of the exception and the destruction of any objects contained within. I don't have hard data on this on hand, but every time I've seen this issue come up, I've seen overwhelming evidence of a massive slowdown caused by using exceptions.

Edit:
Since a lot of people wrote comments stating that exception handling is not slower, I thought I'd add this little note (thanks to the people who wrote this in the comments; I thought it'd be good to add it here).

The reason exception handling slows down your code is that the compiler must make sure that every block ({}), from the place an exception is thrown to the place it is dealt with, deallocates any objects within it. This is code that is added to every block, regardless of whether anyone ever throws an exception or not (since the compiler can't tell at compile time whether this block will be part of an exception "chain").

Of course, this might be an old way of doing things that has gotten much faster in newer compilers (I'm not exactly up-to-date on C++ compiler optimizations). The best way to know is just to run some sample code, with exceptions turned on and off (and which includes a few nested functions), and time the difference.

Edan Maor
-1 for complete lack of understanding of how exceptions are implemented.
Billy ONeal
Exceptions, when implemented by modern compilers, typically do not cause run time overhead unless an exception is actually thrown. And if you use exceptions carefully (and not for normal flow control) performance will not be an issue when things are going wrong.
Brian Neal
Have you timed that, Brian? The last time I tried measuring it (last summer), I found that simply enabling exceptions and stack unwinding in compiler settings caused a slowdown, regardless of whether I actually threw any exceptions or not.
Crashworks
@Brian: At least on Win32, every `try` block must set up an `EXCEPTION_REGISTRATION` block on the stack and point the FS register at it. This happens regardless of whether any exceptions actually occur. Source: http://www.microsoft.com/msj/0197/exception/exception.aspx Also the compiler *must* add code to every block that declares any objects with non-trivial destructors, unless it can prove that an exception cannot occur inside the block. Otherwise, how will those objects be destroyed during stack unwinding?
j_random_hacker
@j_random_hacker: You're talking about Win32 structured exceptions, not C++ exceptions. The two concepts are totally separate. As for the stack unwinding - that happens when a function exits, regardless of whether it exits via an exception or a return statement.
Joe Gauterin
@Edan Maor/Crashworks: Please be specific about what compiler you're using and what version. Up to date versions of the better compilers use zero-overhead exception handling.
Joe Gauterin
@Joe: MSVC++ uses Win32 SEH for C++ exceptions, you can catch a SEH exception in C++ using `catch (...)`. See here for more: http://msdn.microsoft.com/en-us/library/de5awhsw%28VS.71%29.aspx Other compilers may do differently of course, but SEH has special support from the OS and so is likely to be more efficient than alternatives.
j_random_hacker
@Joe: You seem to forget that stack unwinding needs to know which objects with non-trivial dtors need to have their dtors called. Consider this func: `void f() { X a; { X b; i_throw(); X c; } other_code(); X d; }`. `X` has a non-trivial dtor. Now `a` and `b` must be destroyed, without `other_code()` being run or `d` being constructed. To do this with no additional setup overhead would mean `JMP`-ing to a pre-existing address that performs this sequence of steps -- but because of `other_code()`, no such sequence exists. Let me know if that's not clear.
j_random_hacker
@Joe: Like I said, I'm no expert in this field (I haven't done embedded C++ in a while now), I just think the OP should test **their** compiler, to see if it involves overhead or not. Since a lot of compilers **do** involve overhead, it's worth checking out.
Edan Maor
@j_random_hacker: Your information on MSVC using SEH for C++ exceptions is years out of date - MSVC 7.1 acts like that by default, VC8, VC9 and VC10 can be made to act like that with a compiler switch, but by default they don't. SEH is much slower than C++ exceptions because the compiler must generate code as if every statement could throw an exception.
Joe Gauterin
@j_random_hacker: re. stack unwinding - You're comparing code which handles errors using exceptions to code that doesn't handle errors. If you replaced the call to `i_throw();` with an `if( i_return_error_code() == ERROR){return;}` the compiler would have to generate exactly the same stack unwinding code.
Joe Gauterin
@j_random_hacker - So you are telling me that the stack doesn't unwind and objects are not destroyed when you turn off exceptions? You aren't making any sense. That kind of behavior must occur regardless of exceptions being off or on, as Joe Gauterin points out.
Brian Neal
@Brian: Yes, stack unwinding must happen in any case. But, afaik, if exceptions are enabled, then an exception thrown somewhere 5 blocks in (a function calling a function, etc.), must now walk backwards through all 5 functions, and call the dtors in those functions. This needs code surrounding each function, and this code does NOT need to exist if exception handling isn't enabled (since the dtor will only be called at the end of the block, period). Am I making any sense?
Edan Maor
Also, obviously code that doesn't use exceptions will use some different method (error codes) with overhead. However, I'm talking about the overhead that *some* compilers will put in automatically, around every block, which I think still constitutes a slowdown.
Edan Maor
One last thing, this subject has a lot of questions already devoted to it on SO, which I think are a better place to discuss it. I'd like to direct your attention to this question: http://stackoverflow.com/questions/691168/how-much-footprint-does-c-exception-handling-add
Edan Maor
@Edan Maor: Certainly the compiler must generate more code to handle these cases when exceptions are turned on. That is unavoidable. However this code need not add any time overhead if no exception occurs. This extra code will not be executed in the "normal" case when exceptions are not thrown. This is why it is important to only use exceptions for violations of preconditions or other invariants and not "normal" flow control.
Brian Neal
@Edan Maor: And the next step is to compare this extra code to the amount of hand written code that must be developed to check and set return codes when you can't do exceptions, and weigh the costs in that light. Most people, if they even bother to check return codes, don't do such a great job at it, and this code is error prone and leaky.
Brian Neal
@Joe: Re stack unwinding: I see what you mean, since if you don't use exceptions then you should check every function's return code for errors. But many real-world functions *cannot fail*, and turning on any form of exception handling will force this overhead to occur whenever you have a block scope containing both an object with non-trivial dtor and a call to a C++ function from another translation unit (because then the compiler can't prove no C++ exceptions can take place). `/EHsc` will exempt `extern C` calls, but not calls to C++ functions.
j_random_hacker
@Joe: You claim that my statement about SEH is out of date, so you haven't checked the assembly generated under the various options. Regardless of whether you specify `/EHa` (the "old" way that specifically handles SEH exceptions) or `/EHsc` ("only C++ exceptions"), if you have an object with non-triv dtor occurring in the same block as a call to an external function then instructions to populate a `EXCEPTION_REGISTRATION` struct and set `FS` are generated. SEH is used either way. Compile (don't link) http://pastebin.com/m1fb29a45 with /Fa + /EH, /EHsc and /EHa to convince yourself.
j_random_hacker
@Brian: Of course objects are destroyed and dtors called on normal scope exit. But if exceptions are being used, then only a subset of dtors are called, and there is no other program code being run in-between. That means that in general, stack unwinding due to a `throw` can't simply `JMP` to some place in the pre-existing, normal-scope-exit code to perform cleanup -- instead it needs to somehow keep track of which objects require dtors to be run at any point. I.e. additional overhead is required.
j_random_hacker
@j_random_hacker, sure but my point was that this overhead is just extra code; a good compiler will not cause any runtime impact unless an exception actually occurs.
Brian Neal
Brian, have you actually tried it -- comparing a build with exceptions totally disabled to one with exceptions? If so, and it's the same, which compiler?
Crashworks
@Crashworks, compare what? Code size or execution time? Yes, you do get more code when you have exceptions turned on. This is unavoidable and to be expected. Maybe that is the deal breaker for you. However, in my experience, on a good compiler (and I'm not talking about MSVC++) this extra code doesn't actually execute until an exception is thrown. The compilers we use are GCC and Green Hills MULTI. The compilers are not perfect, it isn't true 0% overhead until an exception is thrown, but the results of doing this are acceptable for our application.
Brian Neal
@Brian: For the record, I wasn't talking about code size, I was talking about execution time. From the results I've seen (on older compilers), execution time *is* longer, even if no exceptions are thrown. Just turning them on will cause a significant slowdown.
Edan Maor
@Brian: Interestingly, I just tried a variation of my pastebin snippet on Linux x86 g++ 4.2.1 and to its credit, the only difference was an extra 32 bytes allocated on the stack -- but not written to. So it seems that in a function, if there are any local variables that don't fit in registers (meaning space has to be allocated on the stack anyway), *no additional instructions will be executed if no exceptions are caught or thrown*. Very impressive!
j_random_hacker
+6  A: 

Let me start out by saying I haven't done embedded work for a few years, and never in C++, so my advice is worth every penny you're paying for it...

The templates utilized by STL are never going to generate code you wouldn't need to generate yourself, so I wouldn't worry about code bloat.

The STL doesn't throw exceptions on its own, so that shouldn't be a concern. If your classes don't throw, you should be safe. Divide your object initialization into two parts, let the constructor create a bare bones object and then do any initialization that could fail in a member function that returns an error code.
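A hedged sketch of that two-phase pattern (the class and its details are made up purely for illustration):

    // Two-phase initialisation: the constructor cannot fail; anything that
    // can fail is moved into init(), which reports an error code instead.
    class UartDriver {                       // hypothetical example class
    public:
        enum Status { OK, ERR_BAD_BAUD, ERR_HW_FAULT };

        UartDriver() : initialised_(false), baud_(0) {}   // cannot fail

        Status init(unsigned long baud) {
            if (baud == 0)
                return ERR_BAD_BAUD;
            // ... touch hardware registers here, reporting failures via codes ...
            baud_ = baud;
            initialised_ = true;
            return OK;
        }

        bool isInitialised() const { return initialised_; }

    private:
        bool initialised_;
        unsigned long baud_;
    };

    // Usage:
    //   UartDriver uart;
    //   if (uart.init(115200) != UartDriver::OK) { /* handle the error */ }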

I think all of the container classes will let you define your own allocation function, so if you want to allocate from a pool you can make it happen.
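For illustration only, a rough C++03-style sketch of such an allocator (pool_alloc/pool_free are placeholders for whatever pool you already have):

    // Minimal allocator that draws from a pool instead of the heap.
    #include <cstddef>
    #include <new>

    void* pool_alloc(std::size_t bytes);   // assumed to exist elsewhere
    void  pool_free(void* p);              // assumed to exist elsewhere

    template <typename T>
    class PoolAllocator {
    public:
        typedef T               value_type;
        typedef T*              pointer;
        typedef const T*        const_pointer;
        typedef T&              reference;
        typedef const T&        const_reference;
        typedef std::size_t     size_type;
        typedef std::ptrdiff_t  difference_type;

        template <typename U> struct rebind { typedef PoolAllocator<U> other; };

        PoolAllocator() {}
        template <typename U> PoolAllocator(const PoolAllocator<U>&) {}

        pointer       address(reference x) const       { return &x; }
        const_pointer address(const_reference x) const { return &x; }

        pointer allocate(size_type n, const void* = 0) {
            // A real implementation would handle pool exhaustion (throw or reset).
            return static_cast<pointer>(pool_alloc(n * sizeof(T)));
        }
        void deallocate(pointer p, size_type) { pool_free(p); }

        void construct(pointer p, const T& v) { new (static_cast<void*>(p)) T(v); }
        void destroy(pointer p) { p->~T(); }

        size_type max_size() const { return static_cast<size_type>(-1) / sizeof(T); }
    };

    template <typename T, typename U>
    bool operator==(const PoolAllocator<T>&, const PoolAllocator<U>&) { return true; }
    template <typename T, typename U>
    bool operator!=(const PoolAllocator<T>&, const PoolAllocator<U>&) { return false; }

    // Usage: a vector whose storage comes from the pool rather than the heap.
    // std::vector<int, PoolAllocator<int> > packetLengths;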

Mark Ransom
+1, I think this is one of the few times it's a good idea to move construction work out of constructors.
j_random_hacker
What do you mean, "the STL doesn't throw exceptions on its own"? What if you call vector::at with an out-of-range index? And you can also configure IO streams to throw exceptions. Also, templates can generate more code than you would get if you wrote it by hand. See the example in Stroustrup about combining a template with void* to reduce such bloat.
Brian Neal
@Brian: `vector::at()` is a good example. It would be more accurate to say that the STL can be used in such a way that it will never generate exceptions (here, by using `operator[]()` instead of `at()`) and without making any additional compromises.
j_random_hacker
@Brian: Regarding code bloat, functions comprising identical object code will be removed at link time with MSVC++ if you specify /Gy to the compiler and /OPT:ICF to the linker. I believe the GNU linker can do much the same.
j_random_hacker
@Brian Neal, I forgot about `vector::at`, and probably a few others too - thanks for the clarification. It should be possible to search your standard library files for "throw" and find all of the 'exceptions' to my overly generalized statement.
Mark Ransom
@j_random_hacker We don't often get to use MSVC++ in embedded systems. GNU is much more common, but there are many more crappy compilers in the embedded systems world than in the desktop world.
Brian Neal
@j_random_hacker - I use templates all the time and we don't see excessive problems with template code bloat in our embedded work. However my comment, if you bothered to read it, said that you can *sometimes* generate less code than templates if you wrote it by hand. Again, see Stroustrup's example of how he combines a type-unsafe void* algorithm with a type-safe thin template versus having 2 separate and complete template instantiations on different types. That's all I was trying to say.
Brian Neal
@Brian: I read your comment, no need to repeat it or say "if you bothered to read it". You provided some useful info, and so did I. Right?
j_random_hacker
@j_random_hacker: Well you implied in your response to my answer that you didn't agree with some comment I made about code bloat. Since this is the only place were I mentioned that, I felt I had to add more information. In any event I think you missed the original point I was trying and failing to make. But who cares.
Brian Neal
+8  A: 

Super-safe & lose much of what constitutes C++ (imo, it's more than just the language definition) and maybe run into problems later or have to add lots of exception handling & maybe some other code now?

We have a similar debate in the game world and people come down on both sides. Regarding the quoted part, why would you be concerned about losing "much of what constitutes C++"? If it's not pragmatic, don't use it. It shouldn't matter if it's "C++" or not.

Run some tests. Can you get around STL's memory management in ways that satisfy you? If so, was it worth the effort? A lot of problems STL and boost are designed to solve just plain don't come up if you design to avoid haphazard dynamic memory allocation... does STL solve a specific problem you face?

Lots of people have tackled STL in tight environments and been happy with it. Lots of people just avoid it. Some people propose entirely new standards. I don't think there's one right answer.

Dan Olson
Mawg
+8  A: 

The other posts have addressed the important issues of dynamic memory allocation, exceptions and possible code bloat. I just want to add: Don't forget about <algorithm>! Regardless of whether you use STL vectors or plain C arrays and pointers, you can still use sort(), binary_search(), random_shuffle(), the functions for building and managing heaps, etc. These routines will almost certainly be faster and less buggy than versions you build yourself.

Example: unless you think about it carefully, a shuffle algorithm you build yourself is likely to produce skewed distributions; random_shuffle() won't.
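For instance, a small sketch that uses no containers at all:

    // <algorithm> works on plain C arrays just as well as on STL containers.
    #include <algorithm>

    int readings[8] = { 42, 7, 19, 3, 88, 3, 61, 25 };

    void process() {
        std::sort(readings, readings + 8);                            // in-place sort
        bool found = std::binary_search(readings, readings + 8, 19);  // valid once sorted
        (void)found;   // use the result as appropriate
    }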

j_random_hacker
+4  A: 

I work on real-time embedded systems every day. Of course, my definition of embedded system may be different than yours. But we make full use of the STL and exceptions and do not experience any unmanageable problems. We also make use of dynamic memory (at a very high rate; allocating lots of packets per second, etc.) and have not yet needed to resort to any custom allocators or memory pools. We have even used C++ in interrupt handlers. We don't use boost, but only because a certain government agency won't let us.

It is our experience that you can indeed use many modern C++ features in an embedded environment as long as you use your head and conduct your own benchmarks. I highly recommend you make use of Scott Meyers' Effective C++, 3rd edition, as well as Sutter and Alexandrescu's C++ Coding Standards to assist you in using C++ with a sane programming style.

Brian Neal
+1, useful answer. But I don't think you know as much about exceptions or code bloat as you think you do -- please see my comments in response to yours on others' posts.
j_random_hacker
Where exactly in my response does the phrase "code bloat" appear? I appreciate the +1 but please direct your comments to this particular answer.
Brian Neal
Sounds great (and, yes, both of those books, plus the complete Meyers "Effective..." series, are sitting beside my monitor right now). What sort of processors do you target?
Mawg
+6  A: 
Crashworks
You've linked an article from 2006 which is now out of date. C++ exceptions aren't slow on decent modern compilers. If you're dealing with an embedded system for which a decent modern compiler doesn't exist you have a problem - but to give a blanket "As to exceptions: they are slow" is flat out wrong.
Joe Gauterin
Recognized C++ experts like Herb Sutter and Andrei Alexandrescu disagree with your "exceptions are slow" statement. If you don't use exceptions, you yourself are now responsible for writing and checking error return codes, and this code is almost always less efficient than the code modern compilers emit for exceptions. Furthermore, the code people write (if they bother to write it at all) to check error codes is often rife with errors and mistakes.
Brian Neal
It hasn't been our experience that STL allocators are slow, bloated, or inefficient. In fact, it's often better to use them than something that you wrote yourself, since millions of people have used them and the bugs have been shaken out of them. As for your #3, you can supply your own allocator that meets your alignment requirements to just about anything in the STL. And I simply don't believe #4. I've seen the GCC STL make platform-specific optimizations. Also, did you know std::copy() turns into a std::memcpy() if it detects it is using native types on GCC?
Brian Neal
Exceptions are not very slow, but they do impose a nonzero runtime overhead on at least one popular modern compiler (MSVC++9) even when no exception is ever thrown. To see this, try compiling (not linking) http://pastebin.com/m1fb29a45 with `/EHa` and then with `/EHsc`, using /Fa to produce an assembly listing. In both cases, Win32 structured exception handling (SEH) management is introduced -- that's the additional pushing of data onto the stack and setting of the `FS` segment register.
j_random_hacker
The article is from 2006, but *my own* timings were from August 2009. I have read all the theory about how exceptions aren't slow any more, **but it is not corroborated by the actual measurements I have taken**.
Crashworks
Brian: those are EA's points, not mine, but #4 was determined empirically. Basically, they wrote their own implementations of the containers, and found that they ran much faster than the STL's. Therefore, the STL is not maximally efficient.
Crashworks
Recognized experts like data when profiling disagree with 'red book blind reading is always right' statement.
rama-jka toti
+3  A: 
  1. For memory management, you can implement your own allocator, which requests memory from the pool. All STL containers take the allocator as a template parameter.

  2. For exceptions, the STL doesn't throw many; the most common is out-of-memory, and in your case the system should reset, so you can do the reset in the allocator. Others, such as out-of-range, can be avoided on the caller's side.

  3. So, I think you can use the STL in an embedded system :)

doudehou
+3  A: 

It basically depends on your compiler and on the amount of memory you have. If you have more than a few KB of RAM, having dynamic memory allocation helps a lot. If the malloc implementation in your standard library is not tuned to your memory size, you can write your own, or use one of the nice examples around, such as mm_malloc from Ralph Hempel, on top of which you can build your new and delete operators.

I don't agree with those who repeat the meme that exceptions and STL containers are too slow, too bloated, etc. Of course they add a little more code than a simple C malloc, but judicious use of exceptions can make the code much clearer and avoid a lot of the error-checking clutter you would otherwise need in C.

One has to keep in mind that STL containers grow their storage geometrically (typically by powers of two), which means a container may perform several reallocations before it reaches the final size. You can prevent this with reserve(), which makes the growth as cheap as one malloc of the desired size if you know how much you need to allocate anyway.

If you have a big buffer in a vector, for example, at some point it might reallocate and end up using 1.5x the memory you intended while reallocating and moving data. (For example, at some point it has N bytes allocated; you add data via push_back or an insertion iterator and it allocates 2N bytes, copies the first N, and releases the original N, so 3N bytes are allocated at the peak.)
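A tiny sketch of the difference reserve() makes (the size here is purely illustrative):

    #include <cstddef>
    #include <vector>

    void fill(std::vector<unsigned char>& buf, std::size_t n) {
        buf.reserve(n);            // one allocation of capacity n up front
        for (std::size_t i = 0; i < n; ++i)
            buf.push_back(0);      // never exceeds capacity, so no reallocation
    }
    // Without the reserve() call, the vector grows geometrically and may
    // briefly hold both the old and the new storage while copying, as above.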

So in the end it has a lot of advantages, and it pays off if you know what you are doing. You should know a little about how C++ works to use it on embedded projects without surprises.

And to the guy with the fixed buffers and the reset: you can always reset inside operator new (or wherever) if you run out of memory, but that would mean you had a bad design that can exhaust your memory.

An exception being thrown with ARM RealView 3.1:

--- OSD\#1504 throw fapi_error("OSDHANDLER_BitBlitFill",res);
   S:218E72F0 E1A00000  MOV      r0,r0
   S:218E72F4 E58D0004  STR      r0,[sp,#4]
   S:218E72F8 E1A02000  MOV      r2,r0
   S:218E72FC E24F109C  ADR      r1,{pc}-0x94 ; 0x218e7268
   S:218E7300 E28D0010  ADD      r0,sp,#0x10
   S:218E7304 FA0621E3  BLX      _ZNSsC1EPKcRKSaIcE       <0x21a6fa98>
   S:218E7308 E1A0B000  MOV      r11,r0
   S:218E730C E1A0200A  MOV      r2,r10
   S:218E7310 E1A01000  MOV      r1,r0
   S:218E7314 E28D0014  ADD      r0,sp,#0x14
   S:218E7318 EB05C35F  BL       fapi_error::fapi_error   <0x21a5809c>
   S:218E731C E3A00008  MOV      r0,#8
   S:218E7320 FA056C58  BLX      __cxa_allocate_exception <0x21a42488>
   S:218E7324 E58D0008  STR      r0,[sp,#8]
   S:218E7328 E28D1014  ADD      r1,sp,#0x14
   S:218E732C EB05C340  BL       _ZN10fapi_errorC1ERKS_   <0x21a58034>
   S:218E7330 E58D0008  STR      r0,[sp,#8]
   S:218E7334 E28D0014  ADD      r0,sp,#0x14
   S:218E7338 EB05C36E  BL       _ZN10fapi_errorD1Ev      <0x21a580f8>
   S:218E733C E51F2F98  LDR      r2,0x218e63ac            <OSD\#1126>
   S:218E7340 E51F1F98  LDR      r1,0x218e63b0            <OSD\#1126>
   S:218E7344 E59D0008  LDR      r0,[sp,#8]
   S:218E7348 FB056D05  BLX      __cxa_throw              <0x21a42766>

Doesn't seem so scary, and no overhead is added inside {} blocks or functions if the exception isn't thrown.

piotr
+2  A: 

In addition to all the comments, I would suggest reading the Technical Report on C++ Performance, which specifically addresses the topics you are interested in: using C++ in embedded systems (including hard real-time systems), how exception handling is usually implemented and what overhead it has, and the overhead of free store allocation.

The report is really good, as it debunks many popular tales about C++ performance.

Alexander Poluektov