tags:

views:

366

answers:

8
+2  Q: 

How malloc works?

Possible Duplicate:
How do free and malloc work in C?

Consider a scenario where i have to allocate some 20 bytes of memory through malloc. For the function call to malloc() to be successful, should the 20 bytes be available contiguously in memory or can it be scattered? For eg, in the above case, if there are 4 or 5 chunks of 10 bytes each, will malloc work? Or is this OS specific or compiler-specific?

+4  A: 

Standard malloc is defined in the C standard to allocate a contiguous block of memory (at least it appears so to you) - it will return a null pointer if the allocation fails.

At a lower level, the OS will be doing something like what kotlinski or Blank Xavier have described in their respective answers.

From §7.20.3 of the ISO/IEC 9899-1999 C Standard:

The pointer returned if the allocation (by calloc, realloc, or malloc) succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).

It is not that explicit, but the paragraph mentions 'accessing an array of such objects', and in the C standard, arrays are:

An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. (from §6.2.5)

Also note that subsequent calls to calloc, realloc, and malloc do not guarantee contiguity or ordering of memory (with other memory blocks already allocated).

This point is also specified in §7.20.3.

The order and contiguity of storage allocated by successive calls to the calloc, malloc, and realloc functions is unspecified.

birryree
Subsequent call to `realloc` WILL either return a pointer to contiguous memory of that size of fail. It will also protect your existing data in that area if correctly allocated before. The pointer can change, and the absolute address of your data can change, but if `realloc` is successful you have your original data and the resized block as one block. Your very last statement is false with respect to `realloc`...
drewk
My last statement may have been unclear - I meant that repeated calls to any of the allocation functions do not guarantee contiguity/ordering respective to other blocks of memory that have already been allocated, not that the memory allocated by them would not be contiguous.
birryree
+1  A: 

The memory has to be continuous - what you are mentioning is generally referred to as memory fragmentation, and is a real problem for applications acquiring and freeing memory very often.

BrokenGlass
+2  A: 

This is platform specific. To your eyes, on the software level, it will always be presented as you malloc:ed 20 contiguous bytes. But on some platforms, e.g. with paged memory, these bytes do not really need to be contiguous when it comes down to the actual hardware - it will just appear so.

kotlinski
+7  A: 

The question is kind of wrong.

In a typical OS, there exists the concepts of virtual memory and physical memory.

Physical memory exists typically in 4kb blocks, virtual memory likewise.

Each process has virtual memory - to each process the OS presents what appears to be the fully addressable memory range. So on a 32 bit machine, each process 'thinks' it has 4 GB of contigious memory.

In reality, the OS, behind the scenes, is busy mapping virtual memory allocations onto real blocks of physical memory. So a, say, 400kb virtual memory allocation, is mapped onto 100 physical blocks. Those physical blocks do not need to be contigious (and almost never are - nothing stops it from happening, but on a machine doing any kind of work, it's highly improbable) but the virtual memory allocation does need to be contigious.

So you can still run into virtual memory fragmentation. Here, a process requests a block of memory and there isn't, in that particular processes virtual memory map, a block of contigious virtual memory such that the request can be satisfied.

That problem is the problem you're thinking of.

Blank Xavier
+1 very informative.
birryree
humble bow :-) [extra padding for min post length]
Blank Xavier
This will never effect a pointer returned from `malloc` however.
drewk
drwek, what do you mean? malloc returns virtual memory pointers - if there is no block of contigious virtual memory addresses such that a request can be fufilled, it will affect malloc, since malloc will have to return NULL!
Blank Xavier
I couldn't understand the last part of your explanation. What exactly will happen when virtual memory fragmentation occurs?
Raj Kumar
Imagine you have a process. It has been doing lots of allocations and deallocations. It's virtual memory map is highly fragmented - lots of allocations, lots of gaps. You might find then you reach a point where the biggest gap is say, 5 mbytes long. You then try to malloc 10 mbytes. This will fail. There is nowhere in the virtual memory map for a contigious 10 mbyte allocation - even though there is plenty of physical memory available for that allocation.
Blank Xavier
A: 

Most likely it will not work. malloc can not rearrange already allocated memory so it can't make a continuous free block of 20 byte. It can only ask the OS to get another page of memory from, of which it can chop of 20 bytes + header.

Ben Strasser
Sorry, but there's a lot wrong with that. "malloc" *can* `realloc` memory, and in various circumstances it can give you the very same location back. That's because often malloc implementations give you more memory than you asked for (e.g. a popular scheme is to divide the memory into buckets, and each bucket contains memory regions of a certain size; then malloc will give you an element from a bucket that can hold at least the required size). Also, malloc will not ask the OS for a new page for tiny sizes, only if the buckets are empty. It does typically ask when alloc'ing big sizes, though.
DarkDust
@DarkDust: I agree that the answer is not yet well explained, but your deconstruction of it also leaves somewhat to be desired. I cannot construe 'malloc can realloc memory, and in various circumstances it can give you the very same location back' as a meaningful statement that accurately describes the behaviour of malloc.
Jonathan Leffler
+1  A: 
Loadmaster
+1  A: 

The call to malloc will either succeed in returning a logically contiguous block of memory from your program's HEAP memory space equal to the size requested or it will fail with a NULL pointer. "Logically contiguous" means that with a malloc of this type:

int *ip;      /* Nothing yet allocated (other than the size of a pointer... */
int ar[100];  /* 100 ints on the STACK */
ip = (int *)malloc(sizeof ar);   /* if that succeeds, 100 ints on the HEAP */

will allocate space for 100 ints on your OS on the HEAP and return either NULL or the pointer. Separately, the array ar is allocated on the STACK. Each array will be laid out with all the ints logically next to each other, at least as far your program knows. If they were not next to each other, you would not be able to address these blocks as arrays with the array[offset] notation or with pointer arithmetic.

You can then access either STACK or HEAP blocks of memory with an array access or pointer access interchangeably like this:

ip[2]=22;        /* the second element of ip[] is '22' */
*(ar+33)=3333;   /* the 33 element of ar is '3333' */

i=*(ip+2);       /* assign using pointers */
j=ar[33];        /* assign using array offsets */

If the memory block returned by malloc were not logically contiguous to your program, you would not be able to access the block with pointer arithmetic or array subscripting.

Behind the scenes, your OS may move other blocks of memory that are moveable, use virtual memory, swap other items to virtual memory, etc, in order to increase the HEAP allocated to your program. Malloc can either be a very fast or very expensive call -- depending what else is going on on that system and the HEAP space allocated to your program.

HEAP memory (that part accessed with dynamic system calls in C) is potentially subject to fragmentation. Say you allocated the number of 20 byte blocks where memory becomes scarce. Now image that you free every other block of those blocks. You will have highly fragmented memory since blocks allocated with malloc cannot be moved if it effects the pointer that the program uses to access the block. (It can be moved transparently, but don't count on that being efficient.)

If you are making many calls for HEAP memory, consider changing your logic to use realloc to grow and shrink the memory as needed. A big 'gotcha' with realloc is the pointer to your existing data may change, so only use 1 pointer to it. Realloc allows the OS to move the data as needed to have a better fit with what is available on the HEAP. You will (mostly) avoid the potential for memory fragmentation that way.

For quick blocks of 20 bytes, consider using the STACK. That is what it is for. Look at this SO post to see characteristics of STACK vs HEAP.

Read the C Guide of calloc, malloc, realloc, free for more info.

drewk
A: 

OS specific.

If it's a system that has virtual memory, malloc is just going to allocate another page of VM and add it to its heap, and then hand you the memory you need. The chunk of memory within the page has to be contiguous though. Interestingly, if your demand is bigger than the page size, the memory doesn't have to be contiguous in physical memory. Two pages of VM can be made contiguous, regardless of where they are located in physical memory.

If it's a system that does not have VM, then yes, all the memory you request needs to be contiguous in physical memory.

Ziffusion