If you are someone who programs in C or C++, without the managed-language benefits of memory management, type checking or buffer overrun protection, using pointer arithmetic, how do you make sure that your programs are safe? Do you use a lot of unit tests, or are you just a cautious coder? Do you have other methods?

+21  A: 

All of the above. I use:

  1. A lot of caution
  2. Smart Pointers as much as possible
  3. Data structures which have been tested, a lot of STL
  4. Unit tests all the time
  5. Memory validation tools like MemValidator and AppVerifier
  6. Pray every night it doesn't crash on customer site.

Actually, I am just exaggerating. It's not too bad, and it's actually not too hard to keep control of resources if you structure your code properly.

An interesting note: I have a large application which uses DCOM and has managed and unmanaged modules. The unmanaged modules are generally harder to debug during development, but perform very well at the customer site because of the many tests run on them. The managed modules sometimes suffer from bad code: because the garbage collector is so flexible, programmers get lazy about checking resource usage.

Andrew Keith
Smart pointers are something I haven't heard of before. I'll have to check that out.
Robert Harvey
I have to use a lot of them since I am plagued by COM legacy code. Without smart pointers I would lose track of all those references and hemorrhage memory.
Andrew Keith
I've developed an allergy for seeing naked pointers in C++ code. If I see one my instinct is to wrap it up in a smart pointer, even if that is unnecessary. The instinct has served me well - I don't recall having a dangling pointer for probably ten years or more.
Phil Nash
IMO smart pointers are just a stop-gap measure. Garbage collection is the real deal. I think it is well known by now that smart pointers (the ref-counted variety at least, the auto_ptr<T> variety being of limited use anyway) cannot match the performance characteristics of a well-tuned garbage collector. But then again, this is from a person who thinks it's time for C++ to retire now.
SDX2000
@SDX2000: I think most experienced C++ developers would argue that garbage collection is inefficient at best and a crutch at worst, in comparison to the correct usage of smart pointers. There are garbage collectors available for C++, but they are not favored because of the efficient implementation and the variety of smart pointer implementations available. Your understanding of smart pointers seems to be affecting your opinion; I suggest further reading about how and when to use them (auto_ptr is not of limited use, it has a very precise and well-defined use: transfer of ownership).
Martin York
@SDX2000: The concept of retiring a language is laughable. Each language is good for solving problems in different application spaces. C#/Java/C++/C all have different (yet overlapping) areas where they shine and other areas where they are not as useful. You should not use a language because it is the one you know; you should use the language that best fits the problem domain you are trying to write a program for.
Martin York
@Martin - Thanks for your kind words of wisdom... Popular opinion, it seems, remains divided as usual; for example, see http://stackoverflow.com/questions/867114/why-no-reference-counting-garbage-collection-in-c/867141#867141 . There is no reason to believe either of the garbage collection mechanisms is faster or more efficient than the other without some quantitative evaluation. And just FYI, my understanding of smart pointers is not as shallow as you may think... I have on some occasions written TR1/shared_ptr<T>-like smart pointers and I am well aware of their strengths and weaknesses.
SDX2000
@Martin - In answer to your second comment, you are right, it's laughable indeed. I should have been more specific when I said C++ should retire now. What I meant was... it's high time that we re-evaluate the position of C++ as a generic problem-solving tool and discontinue its usage in the domains which are better served by other modern languages. If you have ever worked in C# you will know that C++ is a PITA. I have been programming in C++ for the past 15 years; my C++ chops are not in question here.
SDX2000
There is nothing *efficient* about smart pointers. Reference counting (assuming that's the kind of smart pointer we're talking about) is ridiculously inefficient compared to a decent GC. A good C++ programmer should accept that fact. Garbage collectors are very efficient, much more so than the primitive refcounting we use in C++. Smart pointers have other redeeming qualities of course, which a GC can't offer. But performance isn't among them.
jalf
+13  A: 

Just as relevant - how do you ensure your files and sockets are closed, your locks released, yada yada. Memory is not the only resource, and with GC, you inherently lose reliable/timely destruction.

Neither GC nor non-GC is automatically superior. Each has benefits, each has its price, and a good programmer should be able to cope with both.

I said as much in an answer to this question.

Steve314
There are techniques for doing RAII in managed languages: http://www.levelofindirection.com/journal/2009/9/24/raii-and-closures-in-java.html http://www.levelofindirection.com/journal/2009/9/24/raii-and-readability-in-c.html
Phil Nash
... and http://www.levelofindirection.com/journal/2009/9/24/raii-and-closures-in-java.html
Phil Nash
@Phil - interesting reading, but of course anyone who thinks "this proves C# and Java beat C++" should actually read those links. If an idiom was a magic cure, the idioms for ensuring proper deletion of heap-allocated objects in C++ would be magic cures too, and we wouldn't see garbage collection fans mocking C++.
Steve314
Sockets and file locks are a red herring. There are simple, well-established patterns for these in managed languages. In C# it's the "using" statement, which disposes of resources automatically when they are no longer needed.
Robert Harvey
@Harvey - not every socket or file lives only for the life of a single function call - and where they do, a C++ local variable using encapsulated RAII is cleaner and less error prone than try/finally. Consider e.g. the files underlying GUI app documents, which you may want to keep open (e.g. for locking). You may have multiple view objects referencing that document. Already, you're dealing with issues relevant to both GC and RAII. In both cases there are idioms to ensure part of the work gets done, but the programmer must apply those idioms correctly and generally take responsibility.
Steve314
Sorry - not just try/finally, but "using" too, and other idioms. The GC-language approaches need an idiom to be repeated for every use of a class that needs it, whereas C++ encapsulates the RAII in the class once and for all.
Steve314
I should also add that try/finally or whatever can be cleaner when what you might want is a simple local variable, but what you got is a reference to a heap-allocated object from some kind of factory. Cleaner still if memory is the only resource to release, of course.
Steve314
@Robert Harvey: There are simple, well established patterns for handling memory in standard C++. What are you complaining about? The languages do things differently. Each approach is better in some ways and worse in others. Both work.
David Thornley
Who's complaining?
Robert Harvey
A: 

C++ has all the features you mention.

There is memory management. You can use smart pointers for very precise control. Or there are a couple of garbage collectors available, though they are not part of the standard (but in most situations smart pointers are more than adequate).

C++ is a strongly typed language. Just like C#.

We are using buffers. You can opt to use the bounds-checked version of the interface. But if you know that there is not a problem, then you are free to use the unchecked version of the interface.

Compare method at() (checked) to operator[] (Unchecked).

Yes we use Unit Testing. Just like you should be using in C#.

Yes we are cautious coders. Just like you should be in C#. The only difference is the pitfalls are different in the two languages.

Martin York
I didn't see the question "does C++ have the modern benefits of memory management" being asked, but "If you program in C++, *without* the modern benefits of memory management,..., how do you make sure that your programs are safe? "
Pete Kirkham
If I program without smart pointers, it's a whole lot harder to make sure my programs are safe. I don't see the relevance, though. If you program in C# without using the "using" statement (which IIRC is a fairly recent addition), how do you make sure your other resources are disposed properly?
David Thornley
Aren't smart pointers adequate in the same situations where VB6 and COM reference counting were adequate? That's what Microsoft wanted to improve on when they chose the .NET style of garbage collection.
MarkJ
@MarkJ: Hardly. COM reference counting put the responsibility on the user. A smart pointer, like GC, puts the responsibility on the developer of the smart pointer/GC. Basically, smart pointers are a much finer-grained form of garbage collection that is deterministic (unlike GC, which is not).
Martin York
@MarkJ: In Java, GC adds so many other problems that destructors (or finalisers) are practically useless, while in .NET they had to add the concept of "using" to make garbage collection usable. So the real question is why you think the "using" concept is better than smart pointers, when "using" puts the responsibility back on the user of the object, just like COM reference counting did.
Martin York
Read this: http://stackoverflow.com/questions/1064325/why-not-use-pointers-for-everything-in-c/1064485#1064485 for a more detailed description of smart pointers and their advantages over GC.
Martin York
+12  A: 

I use lots and lots of asserts, and build both a "debug" version and a "release" version. My debug version runs much much slower than my release version, with all the checks it does.

I run frequently under Valgrind, and my code has zero memory leaks. Zero. It is a lot easier to keep a program leak-free than it is to take a buggy program and fix all the leaks.

Also, my code compiles with no warnings, despite the fact that I have the compiler set for extra warnings. Sometimes the warnings are silly, but sometimes they point right at a bug, and I fix it without any need to find it in the debugger.

I'm writing pure C (I can't use C++ on this project), but I'm doing C in a very consistent way. I have object-oriented classes, with constructors and destructors; I have to call them by hand, but the consistency helps. And if I forget to call a destructor, Valgrind hits me over the head until I fix it.

In addition to the constructor and destructor, I write a self-check function that looks over the object and decides whether it is sane or not; for example, if a file handle is null but associated file data is not zeroed out, that indicates some kind of error (either the handle got clobbered, or the file wasn't opened but those fields in the object have trash in them). Also, most of my objects have a "signature" field that must be set to a specific value (specific to each different object). Functions that use objects typically assert that the objects are sane.
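A minimal sketch of what such a self-check might look like in C. The `LogFile` type, its fields, and the signature constant are made up for illustration and are not taken from the project described above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical signature; any distinctive constant per object type would do. */
#define LOGFILE_SIGNATURE 0x4C4F4746u  /* ASCII "LOGF" */

typedef struct LogFile {
    unsigned signature;      /* must equal LOGFILE_SIGNATURE while the object is live */
    FILE    *handle;         /* NULL when no file is open */
    size_t   bytes_written;  /* file data; must be zero when handle is NULL */
} LogFile;

/* Self-check: returns true only if the object looks sane. */
bool logfile_is_sane(const LogFile *lf)
{
    if (lf == NULL)
        return false;
    if (lf->signature != LOGFILE_SIGNATURE)
        return false;   /* wrong type, wiped, or never constructed */
    if (lf->handle == NULL && lf->bytes_written != 0)
        return false;   /* handle clobbered, or fields contain trash */
    return true;
}

/* Functions that use the object assert its sanity on entry. */
void logfile_note_write(LogFile *lf, size_t n)
{
    assert(logfile_is_sane(lf));
    lf->bytes_written += n;
}
```

In a release build compiled with NDEBUG the assert disappears, so the check costs nothing there.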

Any time I malloc() some memory, my function fills the memory with 0xDC values. A structure that isn't fully initialized becomes obvious: counts are way too big, pointers are invalid (0xDCDCDCDC), and when I look at the structure in the debugger it's obvious that it's uninitialized. This is much better than zero-filling memory when calling malloc().
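The fill-on-allocate idea could be sketched roughly like this. The name `debug_malloc` is an assumption for illustration; only the 0xDC fill value comes from the description above:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FILL_BYTE 0xDC  /* fill value mentioned above */

/* Hypothetical wrapper: allocate and fill with a recognizable junk pattern,
 * so fields that were never initialized stand out in the debugger. */
void *debug_malloc(size_t size)
{
    assert(size > 0);
    void *p = malloc(size);
    if (p != NULL)
        memset(p, FILL_BYTE, size);
    return p;
}
```

A 32-bit field read from such memory before being set shows up as 0xDCDCDCDC, which is hard to mistake for a real count or pointer.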

Any time I free memory, I erase the pointer. That way, if I have a stupid bug where the code tries to use a pointer after its memory has been freed, I instantly get a null-pointer exception, which points me right at the bug. My destructor functions don't take a pointer to an object, they take a pointer to a pointer, and clobber the pointer after destructing the object. Also, destructors wipe their objects before freeing them, so if some chunk of code has a copy of a pointer and tries to use an object, the sanity check assert fires instantly.
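Roughly, a destructor in that style might look like the sketch below. The `Buffer` type and function names are invented for the example:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUFFER_SIGNATURE 0x42554646u  /* hypothetical signature, ASCII "BUFF" */

typedef struct Buffer {
    unsigned signature;
    size_t   length;
    char    *data;
} Buffer;

/* Constructor: returns NULL on allocation failure. */
Buffer *buffer_create(size_t length)
{
    assert(length > 0);
    Buffer *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->data = malloc(length);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->signature = BUFFER_SIGNATURE;
    b->length = length;
    return b;
}

/* Destructor: takes a pointer to the caller's pointer and clobbers it. */
void buffer_destroy(Buffer **pb)
{
    assert(pb != NULL);
    Buffer *b = *pb;
    if (b == NULL)
        return;                   /* already destroyed: harmless no-op */
    assert(b->signature == BUFFER_SIGNATURE);
    free(b->data);
    memset(b, 0, sizeof *b);      /* wipe, so stale copies fail the signature check */
    free(b);
    *pb = NULL;                   /* caller's pointer can no longer dangle */
}
```

One caveat: an optimizing compiler is allowed to drop the memset just before free, so the wipe is mainly a debug-build aid.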

Valgrind will tell me if any code writes off the end of a buffer. If I didn't have that, I would have put "canary" values after the ends of the buffers, and had the sanity check test them. These canary values, like the signature values, would be debug-build-only, so the release version would not have memory bloat.

I have a collection of unit tests, and when I make any major changes to the code, it is very comforting to run the unit tests and have some confidence I didn't horribly break things. Of course I run the unit tests on the debug version as well as the release version, so all my asserts have their chance to find problems.

Putting all this structure into place was a bit of extra effort, but it pays off every day. And I feel quite happy when an assert fires and points me right at a bug, instead of having to run the bug down in the debugger. In the long run, it's just less work to keep things clean all the time.

Finally, I have to say that I actually like Hungarian notation. I worked at Microsoft a few years back, and like Joel I learned Apps Hungarian and not the broken variant. It really does make wrong code look wrong.

steveha
It all sounds great... but I'm glad I have people like Eric Lippert putting the structure in place without me lifting a finger.
MarkJ
+2  A: 

Andrew's answer is a good one, but I'd also add discipline to the list. I find that after enough practice with C++ you get a pretty good feel for what's safe and what's begging for the velociraptors to come eat you. You tend to develop a coding style that feels comfortable when following the safe practices and leaves you feeling the heebie-jeebies should you try to, say, cast a smart pointer back to a raw pointer and pass it to something else.

I like to think of it like a power tool in a shop. It's safe enough once you've learned to use it correctly and as long as you make sure to always follow all the safety rules. It's when you think you can forgo the safety goggles that you get hurt.

Boojum
+2  A: 

I have done both C++ and C# and I don't see all the hype about managed code.

Oh right, there is a garbage collector for memory, that's helpful... unless you refrain from using plain old pointers in C++, of course; if you only use smart pointers, then you don't have so many problems.

But then I would like to know... does your garbage collector protect you from:

  • keeping database connections open?
  • keeping locks on files?
  • ...

There is much more to resource management than memory management. The good thing about C++ is that you rapidly learn what resource management and RAII mean, so that it becomes a reflex:

  • if I want a pointer, I want an auto_ptr, a shared_ptr or a weak_ptr
  • if I want a DB connection, I want an object 'Connection'
  • if I open a file, I want an object 'File'
  • ...

As for buffer overruns, well, it's not like we are using char* and size_t everywhere. We do have some things called 'string', 'iostream' and of course the already mentioned vector::at method, which free us from those constraints.

Tested libraries (stl, boost) are good, use them and get on to more functional problems.

Matthieu M.
Database connections and file locks are a red herring. There are simple, well-established patterns for these in managed languages. In C# it's the "using" statement, which disposes of resources automatically when they are no longer needed.
Robert Harvey
IMO the main problem with smart pointers in C++ is that there's no real standard. If you use 3rd party libraries/frameworks, it's very unlikely that they all use the same smart pointer type. So you can rely on them within a module, but as soon as you interface components from different vendors, you're back to manual memory management.
nikie
@nikie: when I use 3rd party components, I expect them to be very clear on their memory management strategy. But then, the only 3rd party libraries we have at work are open source, like Boost or Cryptopp, so I don't have much experience there.
Matthieu M.
+1  A: 

Besides a lot of the good tips given here, my most important tool is DRY -- Don't Repeat Yourself. I don't spread error-prone code (e.g. for handling memory allocations with malloc() and free()) all over my codebase. I have exactly one single location in my code where malloc and free are called. It is in the wrapper functions MemoryAlloc and MemoryFree.

That is where all the argument checking and initial error handling live, which would otherwise be repeated as boilerplate around every call to malloc. Additionally, any change needs to modify only one location, from simple debugging checks, like counting the successful calls to malloc and free and verifying at program termination that both numbers are equal, up to all kinds of extended security checks.
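As a rough sketch of such wrappers: the names MemoryAlloc and MemoryFree come from the answer, but the signatures, the abort-on-failure policy, and the MemoryBalanced helper are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Counters for the simple balance check described above. */
static unsigned long alloc_count = 0;
static unsigned long free_count  = 0;

/* The only place in the codebase that calls malloc. */
void *MemoryAlloc(size_t size)
{
    assert(size > 0);                       /* argument checking lives here, once */
    void *p = malloc(size);
    if (p == NULL) {                        /* error handling lives here, once */
        fprintf(stderr, "MemoryAlloc: out of memory (%zu bytes)\n", size);
        abort();                            /* assumed policy: fail fast */
    }
    alloc_count++;
    return p;
}

/* The only place in the codebase that calls free. */
void MemoryFree(void *p)
{
    if (p == NULL)
        return;                             /* freeing NULL is not counted */
    free(p);
    free_count++;
}

/* Hypothetical helper: nonzero when every allocation has been freed. */
int MemoryBalanced(void)
{
    return alloc_count == free_count;
}
```

At program termination, something like `assert(MemoryBalanced());` would then flag any allocation that was never freed.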

Sometimes, when I read a question here like "I always have to ensure that strncpy terminates the string, is there an alternative?"

strncpy(dst, src, n);
dst[n-1] = '\0';

followed by days of discussion, I always wonder if the art of extracting repeated functionality into functions is a lost art of higher programming that is no longer taught in programming lectures.

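#include <assert.h>
#include <string.h>

/* Copy at most n-1 characters and always terminate dst; n is the size of dst. */
char *my_strncpy(char *dst, const char *src, size_t n)
{
    assert((dst != NULL) && (src != NULL) && (n > 0));
    strncpy(dst, src, n);
    dst[n-1] = '\0';
    return dst;
}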

Primary problem of code duplication solved -- now let's think if strncpy really is the right tool for the job. Performance? Premature optimization! And one single location to begin with it after it proves to be the bottleneck.

Secure
+3  A: 

I have been using C++ for 10 years. I have used C, Perl, Lisp, Delphi, Visual Basic 6, C#, Java and various other languages which I can't remember off the top of my head.

The answer to your question is simple: you have to know what you're doing, more so than in C#/Java. That "more so" is what spawns such rants as Jeff Atwood's regarding "Java Schools".

Most of your questions, in a sense, are nonsensical. The 'problems' you bring up are simply facts of how hardware really works. I'd like to challenge you to write a CPU & RAM in VHDL/Verilog and see how stuff really works, even when really simplified. You'll start to appreciate that the C#/Java way is an abstraction papering over hardware.

An easier challenge would be to program an elementary operating system for an embedded system from initial power-on; it'll show you what you need to know as well.

(I've also written C# and Java)

Paul Nathan
+1 for "you have to know what you're doing."
Chris Lutz
Asking questions is part of the process of getting to the place where you "know what you are doing."
Robert Harvey
I'm not knocking you, Robert. I gave you my best understanding of how you program safely outside of VM code, plus a route to understanding the real machines.
Paul Nathan
I appreciate that, and the fact that C/C++ is used a lot in embedded systems; clearly it is closer to the metal than some other languages like Java.
Robert Harvey
+3  A: 

We write in C for embedded systems. Besides using some of the techniques common to any programming language or environment, we also employ:

  • A static analysis tool (e.g. PC-Lint).
  • Conformance to MISRA-C (enforced by the static analysis tool).
  • No dynamic memory allocation at all.
Steve Melnikoff