I have been wondering for quite some time how to manage memory in my next project, which is writing a DSL in C/C++.

It can be done in any of these three ways:

  1. Reference counted C or C++.
  2. Garbage collected C.
  3. In C++, copying classes and structures from stack to stack and managing strings separately with some kind of GC.

The community probably already has a lot of experience with each of these methods. Which one will be faster? What are the pros and cons of each?

A related side question: will malloc/free be slower than allocating a big chunk at the beginning of the program and running my own memory manager over it? .NET seems to do it. But I am confused about why we can't count on the OS to do this job better and faster than we can ourselves.

+4  A: 

Uh... it depends on how you write the garbage collection system for your DSL. Neither C nor C++ comes with a garbage collection facility built in, but either could be used to write a very efficient or a very inefficient garbage collector. Writing such a thing, by the way, is a non-trivial task.

DSLs are often written in higher level languages such as Ruby or Python specifically because the language writer can leverage the garbage collection and other facilities of the language. C and C++ are great for writing full, industrial strength languages but you certainly need to know what you are doing to use them - knowledge of yacc and lex is especially useful here but a good understanding of dynamic memory management is important also, as you say. You could also check out keykit, an open source music DSL written in C, if you still like the idea of a DSL in C/C++.

Joe Soul-bringer
+7  A: 

It all depends! That's a pretty open question. It needs an essay to answer it!

Hey, here's one somebody prepared earlier:

http://lambda-the-ultimate.org/node/2552

http://www.hpl.hp.com/personal/Hans_Boehm/gc/issues.html

It depends on how big your objects are, how many of them there are, how fast they're being allocated and discarded, and how much time you want to invest in optimizing and tweaking. If you know the limits of how much memory you need, then for fast performance I would think you can't really beat grabbing all the memory you need from the OS up front and managing it yourself.

The reason it can be slow to allocate memory from the OS is that the OS deals with lots of processes and with memory on disk and in RAM, so to get memory it has to decide whether there is enough. Possibly, it might have to page another process's memory out from RAM to disk so it can give you enough. There's a lot going on. So managing it yourself (or with a GC-collected heap) can be far quicker than going to the OS for each request. Also, the OS usually deals with bigger chunks of memory, so it might round up the size of requests you make, meaning you could waste memory.

Have you got a real hard requirement for going super quick? A lot of DSL applications don't need raw performance. I'd suggest going with whatever's simplest to code. You could spend a lifetime writing memory management systems and worrying about which one is best.

Scott Langham
Are you saying that all this will depend on which *language*, C or C++, one uses? Since in either of these languages one has to implement GC oneself, I think there is no per-language variation, just per-algorithm variation.
Joe Soul-bringer
Not entirely sure what you're asking, but the same memory management algorithms can be applied to both languages. However, in C++ you'd do something like overriding operator new and delete to implement it. In C I'm not sure; possibly name your own functions and be sure to use them for all allocations.
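
For illustration, here's a rough sketch of that C++ route. The pool functions are made-up placeholders (they just wrap malloc/free so the snippet stands alone); a real implementation would carve memory out of a block grabbed up front.

    #include <cstdlib>
    #include <new>

    // Hypothetical pool functions - stand-ins for whatever allocator you write.
    static void* my_pool_alloc(std::size_t n) { return std::malloc(n ? n : 1); }
    static void  my_pool_free(void* p)        { std::free(p); }

    // Route every new/delete expression in the program through the pool.
    void* operator new(std::size_t n) {
        if (void* p = my_pool_alloc(n)) return p;
        throw std::bad_alloc();
    }

    void operator delete(void* p) noexcept {
        my_pool_free(p);
    }

    int main() {
        int* x = new int(42);   // allocated via my_pool_alloc
        delete x;               // released via my_pool_free
        return 0;
    }
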
Scott Langham
+1  A: 

A related side question: will malloc/free be slower than allocating a big chunk at the beginning of the program and running my own memory manager over it? .NET seems to do it. But I am confused about why we can't count on the OS to do this job better and faster than we can ourselves.

The problem with letting the OS handle memory allocation is that it introduces nondeterministic behaviour. There's no way for the programmer to know how long the OS will take to return a new chunk of memory - an allocation may be quite costly if memory has to be paged out to disk.

Preallocating therefore might be a good idea, especially when using a copying garbage collector. It'll increase memory consumption, but allocation will be fast because in most cases it'll just be a pointer increment.
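
To make the "pointer increment" concrete, here's a minimal sketch of a bump allocator over one preallocated block (the sizes are arbitrary; a copying collector would trigger a collection and evacuate live objects when the block fills up, instead of just returning null):

    #include <cstddef>
    #include <cstdint>
    #include <cstdlib>

    // One big block grabbed up front; allocation is just a pointer bump.
    struct BumpHeap {
        std::uint8_t* base;
        std::size_t   used;
        std::size_t   capacity;

        explicit BumpHeap(std::size_t bytes)
            : base(static_cast<std::uint8_t*>(std::malloc(bytes))),
              used(0), capacity(bytes) {}

        ~BumpHeap() { std::free(base); }

        void* alloc(std::size_t n) {
            n = (n + 7) & ~std::size_t(7);            // keep 8-byte alignment
            if (used + n > capacity) return nullptr;  // a real GC would collect here
            void* p = base + used;
            used += n;
            return p;
        }
    };

    int main() {
        BumpHeap heap(1 << 20);   // 1 MiB, arbitrary size for the example
        void* a = heap.alloc(12);
        void* b = heap.alloc(32);
        (void)a; (void)b;
        return 0;
    }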

Christoph
The OS (or actually, the libraries, as malloc allocates big chunks and then doles them out piece by piece to your program) is a general purpose allocator. If you can constrain your allocator in some way, you can make it faster.
Michael Kohne
Reading/writing memory that has already been allocated can also be quite costly if it has been paged out to disk. :)
bk1e
+3  A: 

With most garbage collection implementations, allocation can see a speed improvement, but then you have the additional cost of the collection phase, which can be triggered at any point in your program's execution, leading to a sudden (seemingly random) delay.

As for your second question, it depends on your memory management algorithms. You'd be safe sticking with your library's default malloc implementation, but there are alternatives which boast better performance.

Mike Weller
+4  A: 

Why would garbage collected C be faster than C++? The only garbage collectors available for C are pretty inefficient things, more designed to plug memory leaks than to actually improve the quality of your code.

In any case, C++ has the potential for reaching better performance with less code (note that it's only a potential. It's also very possible to write C++ code that is far slower than the equivalent C).

Considering the current state of both languages, GCs are not currently going to improve performance in your code. GCs can be made very efficient in languages designed for them. C/C++ are not among those. ;)

Apart from that, it's impossible to say. Languages don't have a speed. It doesn't make sense to ask which language is faster. It depends on 1) the specific code, 2) the compiler that compiles it, and 3) the system it's running on (hardware as well as OS).

malloc is a fairly slow operation, far slower than the .NET equivalents, so yes, if you are performing a lot of small allocations, you may be better off allocating a large pool of memory once, and then using chunks of that.

The reason is that the allocator has to find a free chunk of memory, basically by following a linked list of all free memory areas. In .NET, a new() call is basically nothing more than moving the heap pointer forward by as many bytes as the allocation requires.
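
To see the difference, here's a toy first-fit allocator of the kind described above - every allocation walks a linked list of free blocks, where a .NET-style allocation is a single pointer bump. It's deliberately simplified (no splitting, coalescing, or thread safety), and the block sizes in main are arbitrary:

    #include <cstddef>
    #include <cstdlib>

    // Each free block starts with a small header linking it into the free list.
    struct FreeBlock {
        std::size_t size;
        FreeBlock*  next;
    };

    static FreeBlock* free_list = nullptr;

    void* ff_alloc(std::size_t n) {
        FreeBlock** link = &free_list;
        for (FreeBlock* b = free_list; b; link = &b->next, b = b->next) {
            if (b->size >= n) {   // first block big enough wins
                *link = b->next;  // unlink it from the free list
                return b + 1;     // usable memory starts after the header
            }
        }
        return nullptr;           // nothing suitable: would have to grow the heap
    }

    void ff_free(void* p) {
        FreeBlock* b = static_cast<FreeBlock*>(p) - 1;
        b->next = free_list;      // push the block back on the front of the list
        free_list = b;
    }

    int main() {
        // Seed the free list with a few blocks of different sizes.
        for (std::size_t sz : {64u, 128u, 256u}) {
            void* raw = std::malloc(sizeof(FreeBlock) + sz);
            FreeBlock* b = static_cast<FreeBlock*>(raw);
            b->size = sz;
            ff_free(b + 1);
        }
        void* p = ff_alloc(100);  // walks the list until a block of >= 100 bytes is found
        ff_free(p);
        return 0;
    }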

jalf
A: 

Neither C nor C++ will give you garbage collection for free. What they will give you is memory allocation libraries (which provide malloc/free, etc.). There are many online resources on algorithms for writing garbage collection libraries. A good start is link text

Denis Hennessy
A: 

Most non-GC languages allocate and deallocate memory as it is needed and no longer needed. GC'd languages usually allocate large chunks of memory beforehand and only free the memory when idle, not in the middle of an intensive task, so I am going to say yes, if the GC kicks in at the correct time.

The D programming language is a garbage-collected language that is ABI compatible with C and partly ABI compatible with C++. This page shows some benchmarks of string performance in C++ and D.

Tim Matthews
+1  A: 

As people have pointed out, GC is faster to allocate (because it just gives you the next block on its list) but slower overall (because it has to compact the heap regularly in order for allocs to stay fast).

So - go for the compromise solution (which is actually pretty damn good):

You create your own heaps, one for each size of object you generally allocate (or 4-byte, 8-byte, 16-byte, 32-byte, etc.). Then, when you want a new piece of memory, you grab the last 'block' on the appropriate heap. Because you pre-allocate from these heaps, all you need to do when allocating is grab the next free block. This works better than the standard allocator because you are happily wasting memory - if you want to allocate 12 bytes, you'll give up a whole 16-byte block from the 16-byte heap. You keep a bitmap of used vs. free blocks so you can allocate quickly without wasting loads of memory or needing to compact.
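
Here's a minimal sketch of one such fixed-size pool, using a bitmap to mark used blocks - a full allocator would keep one of these per size class (8, 16, 32 bytes, and so on); the 16-byte block size and pool size below are arbitrary:

    #include <cstdint>
    #include <cstdlib>
    #include <vector>

    // One pool of fixed-size blocks; a bitmap records which blocks are in use.
    class FixedPool {
    public:
        FixedPool(std::size_t block_size, std::size_t block_count)
            : block_size_(block_size),
              storage_(block_size * block_count),
              used_(block_count, false) {}

        void* alloc() {
            // Find the first free block. A production pool would keep a search
            // hint or a free list so this scan stays cheap.
            for (std::size_t i = 0; i < used_.size(); ++i) {
                if (!used_[i]) {
                    used_[i] = true;
                    return &storage_[i * block_size_];
                }
            }
            return nullptr;  // pool exhausted
        }

        void free(void* p) {
            std::size_t offset = static_cast<std::uint8_t*>(p) - storage_.data();
            used_[offset / block_size_] = false;
        }

    private:
        std::size_t               block_size_;
        std::vector<std::uint8_t> storage_;
        std::vector<bool>         used_;   // the bitmap of used vs. free blocks
    };

    int main() {
        FixedPool pool16(16, 1024);   // 16-byte blocks, 1024 of them
        void* a = pool16.alloc();     // a 12-byte request would be served from here
        pool16.free(a);
        return 0;
    }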

Also, because you're running several heaps, highly parallel systems work much better, as you don't need to lock so often (i.e. there's a separate lock for each heap, so you don't get contention nearly as much).

Try it - we used it to replace the standard heap in a very intensive application, and performance went up by quite a lot.

BTW, the reason the standard allocators are slow is that they try not to waste memory - so if you allocate a 5-byte, a 7-byte, and a 32-byte block from the standard heap, it'll keep those 'boundaries'. The next time you need to allocate, it'll walk through those looking for enough space to give you what you asked for. That worked well for low-memory systems, but you only have to look at how much memory most apps use today to see that GC systems go the other way and try to make allocations as fast as possible while caring nothing for how much memory is wasted.

gbjbaanb
+1  A: 

The problem has a lot of variables, but if your application is written with garbage collection in mind, and if you exploit the special features of the Boehm collector, such as different allocation calls for blocks that don't contain pointers, then as a general rule your application:

  1. Will have simpler interfaces.
  2. Will run somewhat faster.
  3. Will require from 1.2x to 2x the space of a similar application using explicit memory management.
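
For illustration, here's a minimal sketch of what that looks like with the Boehm collector. It assumes libgc is installed (link with -lgc; the header may be <gc.h> or <gc/gc.h> depending on your install), and the Node layout is made up:

    #include <gc.h>       // Boehm-Demers-Weiser collector
    #include <cstring>

    struct Node {
        Node*       next;
        const char* name;
    };

    int main() {
        GC_INIT();

        // Blocks that may contain pointers go through GC_MALLOC so the
        // collector scans them for references.
        Node* n = static_cast<Node*>(GC_MALLOC(sizeof(Node)));

        // Pointer-free data (strings, numeric buffers) can use GC_MALLOC_ATOMIC,
        // which the collector never scans - the cheaper call mentioned above.
        char* s = static_cast<char*>(GC_MALLOC_ATOMIC(16));
        std::strcpy(s, "hello");

        n->name = s;
        n->next = nullptr;

        // No free() anywhere: unreachable blocks are reclaimed by the collector.
        return 0;
    }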

For documentation and evidence supporting these claims, you can see the information on Boehm's web site, and also several papers by Ben Zorn on the measured cost of conservative garbage collection.

Most importantly you'll save a ton of effort and won't have to worry about a significant class of memory-management bugs.

The issue of C vs C++ is orthogonal, but GC will definitely be faster than reference counting, especially when there's no compiler support for reference counting.
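
The per-assignment cost of reference counting is easy to see with std::shared_ptr, which is roughly what hand-rolled reference counting in C++ ends up looking like (a rough sketch; the Value type is made up):

    #include <memory>
    #include <string>
    #include <vector>

    // A made-up DSL value type, just to have something to point at.
    struct Value {
        std::string text;
    };

    int main() {
        auto v = std::make_shared<Value>();   // reference count starts at 1
        std::vector<std::shared_ptr<Value>> env;

        // Every copy into the environment bumps an (atomic) counter, and every
        // destruction decrements it. That bookkeeping happens on each copy or
        // assignment, which is exactly the per-operation overhead a tracing GC avoids.
        for (int i = 0; i < 100; ++i)
            env.push_back(v);

        return 0;
    }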

Norman Ramsey
A: 

I suggest that if you have written a program where memory allocation and deallocation (explicit or GC'd) is the bottleneck, then you should rethink your architecture, design, and implementation.

Foredecker
A: 

If you don't want to explicitly manage memory, don't use C/C++. There are plenty of languages with either reference counting or compiler-supported garbage collectors that will probably work much better for you.

C and C++ were designed in an environment where the programmer manages their own memory. Trying to retrofit GC or ref counting onto them may help some, but you'll find that you either have to compromise the performance of the GC (because it doesn't have any compiler hinting about where pointers might be), or you'll find new and fascinating ways to screw up the reference counts or the GC or whatever.

I know it sounds like a good idea, but really, you should just grab a language more suited to the task.

Michael Kohne