ansaurus

Question

Does malloc() allocate a contiguous block of memory?

Answer 1

A:

The answer to question 1 and 2 is Yes

About ugliness (ie question 3) what is the programmer trying to do with that allocated memory?

hhafez 2009-03-09 07:22:48

Answer 2

+5 A:

1) Yes it does, or malloc will fail if there isn't a large enough contiguous block available. (A failure with malloc will return a NULL pointer)

2) Yes it will. The internal memory allocation will keep track of the amount of memory allocated with that pointer value and free all of it.

3)It's a bit of a language hack, and a bit dubious about it's use. It's still subject to buffer overflows as well, just may take attackers slightly longer to find a payload that will cause it. The cost of the 'protection' is also pretty hefty (do you really need >2mb per request buffer?). It's also very ugly, although your boss may not appreciate that argument :)

workmad3 2009-03-09 07:23:22

Answer 3

+3 A:

As for #3, without more code it's hard to answer. I don't see anything wrong with it, unless its happening a lot. I mean, you don't want to allocate 2mb chunks of memory all the time. You also don't want to do it needlessly, e.g. if you only ever use 2k.

The fact that you don't like it for some reason isn't sufficient to object to it, or justify completely re-writing it. I would look at the usage closely, try to understand what the original programmer was thinking, look closely for buffer overflows (as workmad3 pointed out) in the code that uses this memory.

There are lots of common mistakes that you may find. For example, does the code check to make sure malloc() succeeded?

jeffamaphone 2009-03-09 07:28:08

Answer 4

+3 A:

The exploit (question 3) is really up to the interface towards this structure of yours. In context this allocation might make sense, and without further information it is impossible to say if it's secure or not.
But if you mean problems with allocating memory bigger than the structure, this is by no means a bad C design (I wouldn't even say it's THAT old school... ;) )
Just a final note here - the point with having a char[1] is that the terminating NULL will always be in the declared struct, meaning there can be 2 * 1024 * 1024 characters in the buffer, and you don't have to account for the NULL by a "+1". Might look like a small feat, but I just wanted to point out.

E Dominique 2009-03-09 07:37:39

Also, the standard doesn't allow arrays of size 0, though some compilers do.

TrayMan 2009-03-09 07:42:56

That's right, but you could have a char*.

E Dominique 2009-03-09 07:52:03

No he can't; a char * would address memory somewhere else completely, instead of contiguous with the structure.For C99, the proper declaration for this is a flexible-size array "char package[]". But pretty much any compiler supporting that also supports the GNU extension for size 0.

puetzk 2009-03-09 14:47:24

@puetzk yes, you are indeed correct. Ofcourse.

E Dominique 2009-03-10 08:19:19

Answer 5

+9 A:

3 - That's a pretty common C trick to allocate a dynamic array at the end of a struct. The alternative would be to put a pointer into the struct and then allocate the array separately, and not forgetting to free it too. That the size is fixed to 2mb seems a bit unusual though.

TrayMan 2009-03-09 07:41:12

thanks a lot for your comments . basically we receive data from socket.we do not know the exact size we are going to receive and have capped it at 2 MB . the data we receive is copied into this structure . This change was done because this was the one with the min impact.

2009-03-09 07:49:22

@unknown (google), if the size is fixed, you can also change the array size from 1 to your fixed size. This trick makes only sense for arrays with variable lengths.

quinmars 2009-03-09 11:05:17

Answer 6

+6 A:

This is a standard C trick, and isn't more dangerous that any other buffer.

If you are trying to show to your boss that you are smarter than "very old school programmer", this code isn't a case for you. Old school not necessarily bad. Seems the "old school" guy knows enough about memory management ;)

qrdl 2009-03-09 08:23:16

I agree. Given the context of it's use (to store an unknown amount of data from a socket), I'd say it's pretty good (although I would read a much smaller chunk at a time from the socket, and then realloc() and keep reading until the socket was empty in order to minimize memory footprint).

Chris Lutz 2009-03-09 08:34:20

...(but with modern CPU speeds that's unlikely to be a HUGE issue regardless.)

Chris Lutz 2009-03-09 08:35:12

It is more dangerous in the sense that it's undefined behaviour.

Chris Young 2009-03-09 09:43:23

What do you call by "undefined behaviour" in this case? It doesn't violate C standard. The only problem I see is sizeof() will return invalid size but it is always the case for dynamic buffers.

qrdl 2009-03-09 10:14:06

why dont you just answer the questions and then shut up instead of trying to do my appraisal..

2009-03-14 08:57:11

Probably because you don't know the language good enough but trying to attack someone who knows much more then you, just to show how "smart" you are to your boss. Unfortunately there are enough people of this kind in our profession, but usually they quit programming and move to management.

qrdl 2009-03-14 11:42:12

@qrdl: See the edit of my answer at http://stackoverflow.com/questions/625270/does-malloc-allocate-a-contiguous-block-of-memory/625430#625430

Chris Young 2009-03-15 01:46:37

@Chris: I see what you mean. Indeed it is undefined behaviour, agreed. Thank you for pointing this out.

qrdl 2009-03-15 07:03:33

to qrdl : i strongly suspect you are the one who wrote this code :-) i will repeat .. either answer or shut up ... dont go for assumptions about my 'intentions' .. unfortunately there are enough people like you too "oh god . i know everything and people are always out to prove me wrong"

2009-03-16 04:22:06

@qrdl : so we agree its undefined behaviour and since its UNDEFINED behaviour and if this piece of code does cause some problems , i will be the one who will have to spend nights debugging this , and not you , you twit

2009-03-16 04:38:21

Too often I made the vary same mistake and then repeat to myself: "I need to learn to spot idiots faster and do not react".<hangup>

qrdl 2009-03-16 07:53:25

@qrdl : yes, had you learned that , you would have looked in the mirror and would have never touched a keyboard again . :-) ..anyway i dont understand why you come here again and again to get insulted ..and btw its "very" and not "vary" ..

2009-03-17 03:59:10

Answer 7

+2 A:

I've seen and used this pattern frequently.

Its benefit is to simplify memory management and thus avoid risk of memory leaks. All it takes is to free the malloc'ed block. With a secondary buffer, you'll need two free. However one should define and use a destructor function to encapsulate this operation so you can always change its behavior, like switching to secondary buffer or add additional operations to be performed when deleting the structure.

Access to array elements is also slightly more efficient but that is less and less significant with modern computers.

The code will also correctly work if memory alignment changes in the structure with different compilers as it is quite frequent.

The only potential problem I see is if the compiler permutes the order of storage of the member variables because this trick requires that the package field remains last in the storage. I don't know if the C standard prohibits permutation.

Note also that the size of the allocated buffer will most probably be bigger than required, at least by one byte with the additional padding bytes if any.

chmike 2009-03-09 08:29:48

The C standard requires members to be in the order you put them in the struct. However, it's undefined behaviour for reasons I explained in my answer.

Chris Young 2009-03-15 03:59:58

Answer 8

+1 A:

Would like to add that not is it common but I might also called it a standard practice because Windows API is full of such use.

Check the very common BITMAP header structure for example.

http://msdn.microsoft.com/en-us/library/aa921550.aspx

The last RBG quad is an array of 1 size, which depends on exactly this technique.

CDR 2009-03-09 08:38:42

Answer 9

+22 A:

To answer your numbered points.

Yes.
All the bytes. Malloc/free doesn't know or care about the type of the object, just the size.
It is strictly speaking undefined behaviour, but a common trick supported by many implementations. See below for other alternatives.

The latest C standard, ISO/IEC 9899:1999 (informally C99), allows flexible array members.

An example of this would be:

int main(void)
{       
    struct { size_t x; char a[]; } *p;
    p = malloc(sizeof *p + 100);
    if (p)
    {
        /* You can now access up to p->a[99] safely */
    }
}

This now standardized feature allowed you to avoid using the common, but non-standard, implementation extension that you describe in your question. Strictly speaking, using a non-flexible array member and accessing beyond its bounds is undefined behaviour, but many implementations document and encourage it.

Furthermore, gcc allows zero-length arrays as an extension. Zero-length arrays are illegal in standard C, but gcc introduced this feature before C99 gave us flexible array members.

In a response to a comment, I will explain why the snippet below is technically undefined behaviour. Section numbers I quote refer to C99 (ISO/IEC 9899:1999)

struct {
    char arr[1];
} *x;
x = malloc(sizeof *x + 1024);
x->arr[23] = 42;

Firstly, 6.5.2.1#2 shows a[i] is identical to (*((a)+(i))), so x->arr[23] is equivalent to (*((x->arr)+(23))). Now, 6.5.6#8 (on the addition of a pointer and an integer) says:

"If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."

For this reason, because x->arr[23] is not within the array, the behaviour is undefined. You might still think that it's okay because the malloc() implies the array has now been extended, but this is not strictly the case. Informative Annex J.2 (which lists examples of undefined behaviour) provides further clarification with an example:

An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).

Chris Young 2009-03-09 08:40:27

+1, for the flexible and zero-length arrays. You could maybe also add that the benefit of this practice is that you save the memory for one pointer and reduce it to only one (expensive) allocation.

quinmars 2009-03-09 08:54:12

I disagree about undefined behaviour. malloc() is guaranteed to return continuous block of memory so you can safely access memory beyond the struct using either pointer arithmetics of array index - according to the standard they are the same. So it is defined behaviour.

qrdl 2009-03-13 09:49:30

@qrdl: The standard specifically disallows accessing beyond the array. I have edited my post to explain why it's undefined.

Chris Young 2009-03-14 05:09:44

@ chris : thanks a lot for your answer .

2009-03-16 04:28:04

@unknown (google): You're welcome. Perhaps you could make the answer as accepted then? :)

Chris Young 2009-03-16 06:51:50

@Chris: This may be nitpicking, but as I understand it `malloc` will allocate contiguous virtual memory space, but the actual physical memory which backs it may not be contiguous. At least that's how it seems to work in Linux, AFAIK.

Robert S. Barnes 2009-10-14 09:26:53

@Robert S. Barnes: You are not incorrect, but the physical layout is entirely irrelevant to the C standard. It only matters that it appears contiguous to the program when accessed in a well-defined manner. It's equally true and irrelevant to point out that the memory might not be contiguous because it may span several pieces of silicon.

Chris Young 2009-10-15 06:13:16

Answer 10

+3 A:

I don't think the existing answers quite get to the essence of this issue. You say the old-school programmer is doing something like this;

typedef struct ts_request
{ 
  ts_request_buffer_header_def header; 
  char                         package[1]; 
} ts_request_def;

ts_request_buffer_def* request_buffer = 
malloc(sizeof(ts_request_def) + (2 * 1024 * 1024));

I think it's unlikely he's doing exactly that, because if that's what he wanted to do he could do it with simplified equivalent code that doesn't need any tricks;

typedef struct ts_request
{ 
  ts_request_buffer_header_def header; 
  char                         package[2*1024*1024 + 1]; 
} ts_request_def;

ts_request_buffer_def* request_buffer = 
malloc(sizeof(ts_request_def));

I'll bet that what he's really doing is something like this;

typedef struct ts_request
{ 
  ts_request_buffer_header_def header; 
  char                         package[1]; // effectively package[x]
} ts_request_def;

ts_request_buffer_def* request_buffer = 
malloc( sizeof(ts_request_def) + x );

What he wants to achieve is allocation of a request with a variable package size x. It is of course illegal to declare the array's size with a variable, so he is getting around this with a trick. It looks as if he knows what he's doing to me, the trick is well towards the respectable and practical end of the C trickery scale.

Bill Forster 2009-03-10 23:19:47

Answer 11

+1 A:

Yes. malloc returns only a single pointer - how could it possibly tell a requester that it had allocated multiple discontiguous blocks to satisfy a request?

Andrew Medico 2009-03-11 00:47:23

Right, that is the job for the OS and virtual memory through the MMU. The actual physical blocks of RAM are likely all over the place.

Zan Lynx 2009-03-11 00:59:58

Answer 12

+1 A:

In response to your third question.

free always releases all the memory allocated at a single shot.

int* i =(int*) malloc(1024*2);

free(i+1024); //gives error free(i); //releases all the 2KB memory

2009-03-13 07:31:54

Answer 13

+1 A:

This common C trick is also explained in this StackOverflow question (Can someone explain this definition of the dirent struct in solaris?).

X-Istence 2009-03-13 07:37:43

ansaurus

tags:

views:

answers:

Does malloc() allocate a contiguous block of memory?

related questions