Let's say I have a program (C++, for example) that allocates multiple objects, never bigger than a given size (let's call it MAX_OBJECT_SIZE).

I also have a region (I'll call it a "page") on the heap (allocated with, say, malloc(PAGE_SIZE), where PAGE_SIZE >= MAX_OBJECT_SIZE).
I keep reserving space in that page until the filled space equals PAGE_SIZE (or at least exceeds PAGE_SIZE - MAX_OBJECT_SIZE).

Now, I want to allocate more memory. Obviously my previous "page" won't be enough. So I have at least two options:

  1. Use realloc(page, NEW_SIZE), where NEW_SIZE > PAGE_SIZE;
  2. Allocate a new "page" (page2) and put the new object there.

If I wanted to have a custom allocate function, then:

  1. Using the first method, I'd see how much I had filled, then put my new object there (and add the size of the object to my filled-memory variable).
  2. Using the second method, I'd have a list (vector? array?) of pages, look for the current page, and then use a method similar to 1 on the selected page.
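The second method can be sketched like this; a minimal bump-pointer allocator over a list of pages. The class and member names are illustrative, not from the question, and for brevity it ignores alignment and malloc failure:

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical sketch of option 2: a vector of fixed-size pages with
// bump-pointer allocation inside the most recent page.
class PageAllocator {
public:
    explicit PageAllocator(std::size_t page_size) : page_size_(page_size) {}

    ~PageAllocator() {
        for (void* p : pages_) std::free(p);   // release every page
    }

    // Reserve `size` bytes; assumes size <= page_size (MAX_OBJECT_SIZE).
    void* allocate(std::size_t size) {
        if (pages_.empty() || filled_ + size > page_size_) {
            pages_.push_back(std::malloc(page_size_));  // start a new page
            filled_ = 0;
        }
        void* result = static_cast<char*>(pages_.back()) + filled_;
        filled_ += size;
        return result;
    }

    std::size_t page_count() const { return pages_.size(); }

private:
    std::size_t page_size_;
    std::size_t filled_ = 0;    // bytes used in the current page
    std::vector<void*> pages_;  // every page allocated so far
};
```

Because old pages are never moved, pointers handed out earlier stay valid, which is the main practical difference from the realloc approach.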

Eventually, I'd need a method to free memory too, but I can figure out that part.

So my question is: What is the most efficient way to solve a problem like this? Is it option 1, option 2 or some other option I haven't considered here? Is a small benchmark needed/enough to draw conclusions for real-world situations? I understand that different operations may perform differently, but I'm looking for an overall metric.

A: 

It is not clear from your question why you need to allocate a big block of memory in advance rather than allocating memory for each object as needed. I'm assuming you are using it as a contiguous array. Otherwise, it would make more sense to malloc the memory for each object as it is needed.

If it is indeed acting as an array, malloc-ing another block gives you another chunk of memory that you have to access via another pointer (in your case page2). It is then no longer one contiguous block, and you cannot use the two blocks as part of one array.

realloc, on the other hand, allocates one contiguous block of memory. You can use it as a single array and do all sorts of pointer arithmetic not possible if there are separate blocks. realloc is also useful when you actually want to shrink the block you are working with, but that is probably not what you are seeking to do here.

So, if you are using this as an array, realloc is basically the better option. Otherwise, there is nothing wrong with malloc. Actually, you might want to use malloc for each object you create rather than having to keep track of and micro-manage blocks of memory.
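A minimal sketch of the realloc approach described above; the names (Page, reserve) are illustrative. Note that since realloc may move the block, it is safer to hand out offsets rather than raw pointers:

```cpp
#include <cstdlib>

// One contiguous block that doubles its capacity when the next
// reservation would not fit.
struct Page {
    char* data = nullptr;
    std::size_t capacity = 0;
    std::size_t filled = 0;
};

// Reserve `size` bytes at the end of the page, growing it if needed.
// Returns the offset of the reservation; pointers into the block may be
// invalidated by realloc, so offsets survive a move and pointers do not.
std::size_t reserve(Page& p, std::size_t size) {
    if (p.filled + size > p.capacity) {
        std::size_t new_cap = p.capacity ? p.capacity * 2 : size;
        if (new_cap < p.filled + size) new_cap = p.filled + size;
        p.data = static_cast<char*>(std::realloc(p.data, new_cap));
        p.capacity = new_cap;
    }
    std::size_t offset = p.filled;
    p.filled += size;
    return offset;
}
```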

MAK
I suppose s/he expects a lot of "small" (at most MAX_OBJECT_SIZE big) alloc requests, so having a "page" and avoiding a "real" malloc is not a bad idea.
ShinTakezou
A: 

You have not given any details on what platform you are experimenting on. There are some performance differences in realloc between Linux and Windows, for example.

Depending on the situation, realloc might have to allocate a new memory block if it can't grow the current one and copy the old memory to the new one, which is expensive. If you don't really need a contiguous block of memory you should avoid using realloc.

My suggestion would be to use the second approach, or use a custom allocator (you could implement a simple buddy allocator).

You could also use a more advanced memory allocator, like APR, tcmalloc or nedmalloc.

the_void
I'm using Linux, but I expect to port the code to Windows as well.
luiscubal
As I've said, it would be easier to just use a `multiplatform memory allocator`. If you're also going to use `threads`, things will become *very complicated* to manage by yourself. Go with `APR` for a proven solution (it's the library used by the Apache webserver and *many others*), or use something newer, specifically designed for parallel environments, like `tcmalloc` or `nedmalloc` (http://www.nedprod.com/programs/portable/nedmalloc/).
the_void
A: 

In the worst case, option 1 could cause a "move" of the original memory, which is extra work. Even if the memory is not moved, the "extra" size may need to be initialized, which is work too. So realloc could be "defeated" by the malloc method, but to say by how much, you should run tests (and I think there's a bias from the state the system is in when the memory requests are made).

Depending on how many times you expect realloc/malloc to be performed, this could be a useful idea or a useless one. I would use malloc anyway.

The free strategy depends on the implementation. To free all the pages as a whole, it is enough to "traverse" them; instead of an array, I would use linked "pages": add sizeof(void *) to the "page" size, and you can use the extra bytes to store the pointer to the next page.
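The linked-pages idea can be sketched as follows; each page is over-allocated by sizeof(void *) and the extra bytes at the front hold the next-page pointer, so freeing everything is a simple list traversal (names are illustrative):

```cpp
#include <cstdlib>

// Intrusive singly-linked list of pages: the first sizeof(void*) bytes
// of each page store the pointer to the next page.
struct PageList {
    void* head = nullptr;
};

// Allocate a page with `payload` usable bytes and link it at the front.
// Returns a pointer to the usable area, just past the embedded link.
void* add_page(PageList& list, std::size_t payload) {
    void* page = std::malloc(payload + sizeof(void*));
    *static_cast<void**>(page) = list.head;           // store next pointer
    list.head = page;
    return static_cast<char*>(page) + sizeof(void*);  // usable area
}

// Free every page by walking the chain; returns how many were freed.
std::size_t free_all(PageList& list) {
    std::size_t freed = 0;
    void* p = list.head;
    while (p) {
        void* next = *static_cast<void**>(p);
        std::free(p);
        p = next;
        ++freed;
    }
    list.head = nullptr;
    return freed;
}
```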

If you have to free a single object, located anywhere in one of the pages, it becomes a little more complex. My idea is to keep a list of non-sequential free "blocks"/"slots" (each suitable to hold any object). When a new "block" is requested, first you pop a value from this list; if it is empty, you take the next "slot" in the last page in use, and eventually a new page is allocated. Freeing an object then just means pushing the empty slot's address onto a stack/list (whatever you prefer to use).
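A minimal sketch of that free-slot scheme, assuming fixed-size slots big enough for any object (class and parameter names are illustrative):

```cpp
#include <cstdlib>
#include <vector>

// Pool of fixed-size slots: freed slots go on a stack and are reused
// before new slots are bumped off the last page.
class SlotPool {
public:
    SlotPool(std::size_t slot_size, std::size_t slots_per_page)
        : slot_size_(slot_size), slots_per_page_(slots_per_page) {}

    ~SlotPool() { for (void* p : pages_) std::free(p); }

    void* allocate() {
        if (!free_list_.empty()) {        // reuse a freed slot first
            void* s = free_list_.back();
            free_list_.pop_back();
            return s;
        }
        if (pages_.empty() || used_ == slots_per_page_) {
            pages_.push_back(std::malloc(slot_size_ * slots_per_page_));
            used_ = 0;                    // fresh page, no slots used yet
        }
        return static_cast<char*>(pages_.back()) + slot_size_ * used_++;
    }

    // Freeing just records the slot for reuse; pages are never released.
    void deallocate(void* slot) { free_list_.push_back(slot); }

private:
    std::size_t slot_size_, slots_per_page_, used_ = 0;
    std::vector<void*> pages_;
    std::vector<void*> free_list_;        // stack of empty slots
};
```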

ShinTakezou
+1  A: 

In my experience option 2 is much easier to work with and has minimal overhead. realloc does not guarantee it will grow the existing memory block in place, and in practice it almost never does. If the block moves, you will need to go back and remap all of the old objects; that would require remembering where every allocated object was, which can be a ton of overhead.

But it's hard to qualify "most efficient" without knowing exactly what metrics you use.

This is the memory manager I always use. It works for the entire application not just one object.

allocs:

For every allocation, determine the size of the object being allocated.

  1. Look at a linked list of frees for objects of that size to see if anything has been freed; if so, take the first free slot.
  2. Otherwise, look the size up in a lookup table, and if it's not found:
     2.1. allocate an array of N objects of the size being allocated.
  3. Return the next free object of the desired size.
     3.1. If the array is full, add a new page.

N can be programmer-tuned. If you know you have a million 16-byte objects, you might want that N to be slightly higher.

For objects over some size X, do not keep an array; simply allocate a new object.

frees:

Determine the size of the object and add it to the linked list of frees.

If the size of the object allocated is less than the size of a pointer, the linked list does not need to incur any memory overhead: simply use the already-allocated memory to store the nodes.
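The steps above can be sketched roughly like this; a size-class pool with per-size free lists. The cutoff X, the array size N, and all names are illustrative choices, and this version stores the free list externally rather than inside freed slots:

```cpp
#include <cstdlib>
#include <unordered_map>
#include <vector>

constexpr std::size_t LARGE_CUTOFF = 256;  // the "size X" above
constexpr std::size_t N = 64;              // objects per array, tunable

struct SizePool {
    std::vector<void*> free_list;  // step 1: freed objects of this size
    std::vector<void*> arrays;     // step 2.1: arrays of N objects
    std::size_t used = 0;          // slots handed out from the last array
};

std::unordered_map<std::size_t, SizePool> pools;  // step 2: lookup table

void* my_alloc(std::size_t size) {
    if (size > LARGE_CUTOFF) return std::malloc(size);  // large: plain malloc
    SizePool& p = pools[size];
    if (!p.free_list.empty()) {                 // step 1: reuse a freed slot
        void* s = p.free_list.back();
        p.free_list.pop_back();
        return s;
    }
    if (p.arrays.empty() || p.used == N) {      // step 3.1: add a new page
        p.arrays.push_back(std::malloc(size * N));
        p.used = 0;
    }
    return static_cast<char*>(p.arrays.back()) + size * p.used++;  // step 3
}

void my_free(void* ptr, std::size_t size) {
    if (size > LARGE_CUTOFF) { std::free(ptr); return; }
    pools[size].free_list.push_back(ptr);       // just record it for reuse
}
```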

The problem with this method is that memory is never returned to the operating system until the application exits or the programmer decides to defragment the memory. Defragmenting is another post; it can be done.

madmik3
Take a look at the `slab allocator`: http://en.wikipedia.org/wiki/Slab_allocation
the_void
A: 

In Linux (and probably other POSIX systems) there is a third possibility: use a memory-mapped region with shm_open. Such a region is initialized to zeroes once you access it, but AFAIK pages that you never touch cost nothing, since it is just the address range in virtual memory that you reserve. So you could reserve a large chunk of memory (more than you would ever need) at the beginning of your execution and then fill it incrementally from the start.
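A small sketch of that reserve-then-touch idea, using an anonymous mmap instead of shm_open for brevity (Linux/POSIX only; the function names are illustrative). The kernel only backs the pages you actually touch, so reserving a large range up front costs little:

```cpp
#include <sys/mman.h>
#include <cstddef>

// Reserve a large range of virtual address space; physical pages are
// committed lazily, on first touch.
void* reserve_region(std::size_t bytes) {
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

void release_region(void* p, std::size_t bytes) {
    munmap(p, bytes);
}
```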

Jens Gustedt
A: 

What is the most efficient way to solve a problem like this? Is it option 1, option 2 or some other option I haven't considered here? Is a small benchmark needed/enough to draw conclusions for real-world situations?

Option 1. For it to be efficient, NEW_SIZE has to depend on the old size non-linearly. Otherwise you risk running into O(n^2) behavior of realloc() due to the redundant copying. I generally do new_size = old_size + old_size/4 (increase by 25%), since the theoretically best new_size = old_size*2 might in the worst case reserve too much unused memory.
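The 25% rule above, as a tiny helper (with an illustrative floor added so very small sizes still grow):

```cpp
#include <cstddef>

// Geometric growth keeps the total bytes copied by realloc proportional
// to the final size; growing by a constant amount instead would make the
// accumulated copying quadratic.
std::size_t grow(std::size_t old_size) {
    if (old_size < 16) return 16;      // floor, so tiny sizes still grow
    return old_size + old_size / 4;    // new_size = old_size * 1.25
}
```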

Option 2. It should be more optimal, as most modern OSes (thanks in part to C++'s STL) are already well optimized for a flood of small memory allocations. And small allocations have less chance of causing memory fragmentation.

In the end it all depends on how often you allocate new objects and how you handle freeing. With #1 you get some redundant copying when expanding, but freeing is dead simple since all objects are in the same page. If you need to free/reuse individual objects, with #2 you would spend some time walking through the list of pages.

From my experience #2 is better, as moving around large memory blocks might increase the rate of heap fragmentation. #2 also allows you to use raw pointers, since objects do not change their location in memory (though for some applications I prefer pool_id/index pairs instead of raw pointers). If walking through pages becomes a problem later, it too can be optimized.

In the end you should also consider option #3: libc. I think libc's malloc() is efficient enough for many tasks. Please test it before investing more of your time. Unless you are stuck on some backward *NIX, there should be no problem using malloc() for every smallish object. I used custom memory management only when I needed to put objects in exotic places (e.g. shm or mmap). Keep multi-threading in mind too: malloc()/realloc()/free() are generally already optimized and MT-ready; you would have to reimplement those optimizations anew to avoid threads constantly colliding on memory management. And if you want memory pools or zones, there are already a bunch of libraries for that too.

Dummy00001