Take the following code:

int *p = malloc(2 * sizeof *p);

p[0] = 10;  //Using the two spaces I
p[1] = 20;  //allocated with malloc before.

p[2] = 30;  //Using another space that I didn't allocate for. 

printf("%d", *(p+1)); //Correctly prints 20
printf("%d", *(p+2)); //Also, correctly prints 30
                      //although I didn't allocate space for it

With the line malloc(2 * sizeof *p) I am allocating space for two integers, right? But if I store an int in the third position, it still gets stored correctly and is retrievable.

So my question is, why do you specify a size when you use malloc?

A: 

When you use *(p+3), you're addressing out of bounds even though you allocated 2 * sizeof(*p) bytes, so you're accessing an invalid memory block, which is perfect for seg faults.

You specify the size because otherwise the function doesn't know how big a block of heap memory to allocate to your program for that pointer.

stanigator
*p isn't a pointer, it's an int; p is the pointer.
banister
That's wrong, sizeof(*p) is the size of the type pointed to by p. I like sizeof(*p) better than sizeof(int), because the latter makes changing the type later more error-prone.
Ville Laurikari
You're right about my first remarks. I already changed it.
stanigator
+27  A: 

C kindly lets you shoot yourself in the head. You have just used random memory on the heap, with unforeseeable consequences.

Disclaimer: My last real C programming was done some 15 years ago.

Igal Serban
No disclaimer needed, this answer is absolutely correct. Usually the next variable you allocate is the one that gets overwritten, but if you're really unlucky you could hit another program's variable space and screw with something random.
Ricket
By that I meant, if after declaring p you also declared int* q = malloc(sizeof(int)); (an array with one element), it is likely (but not guaranteed) that p[2] == q[0]. This also introduces cases where your program may continue on, and it may not wreak havoc, and then suddenly there arises a case where p[2] != q[0] and a bug happens once... Those come-and-go, unpredictable bugs are extremely hard to debug.
Ricket
+4  A: 

In fact, malloc is not allocating enough space for your third integer, but you got "lucky" and your program didn't crash. You can only rely on malloc having allocated what you asked for, and nothing beyond that. In other words, your program wrote to a piece of memory that was not allocated to it.

So malloc needs to know the size of the memory that you need because it doesn't know what you will end up doing with the memory, how many objects you plan on writing to the memory, etc...

Adam Batkin
I would argue that this is actually unlucky :)
bdonlan
You see how unlucky it is when the code is run on some system where the heap behaves differently (but within the standard of course). "It ran on my machine" is not the phrase customers want to hear.
sharptooth
+4  A: 

This all goes back to C letting you shoot yourself in the foot. Just because you can do this, doesn't mean you should. The value at p+3 is definitely not guaranteed to be what you put there unless you specifically allocated it using malloc.

Matt Kellogg
+12  A: 

You got (un)lucky. Accessing p[3] is undefined, since you haven't allocated that memory for yourself. Reading/writing off the end of an array is one of the ways that C programs can crash in mysterious ways.

For example, this might change some value in some other variable that was allocated via malloc. That means it might crash later, and it'll be very hard to find the piece of (unrelated) code that overwrote your data.

Worse yet, you might overwrite some other data and might not notice. Imagine this accidentally overwrites the amount of money you owe someone ;-)

Harold L
It's also likely to overwrite information about what's in the heap, meaning that malloc() and free() might do increasingly screwy things until there's a very mysterious crash with no apparent cause.
David Thornley
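To make the "overwrote some other data and didn't notice" scenario concrete without invoking undefined behavior, here is a small model. It is only a sketch: the struct `ledger`, its fields, and the helper `store_at` are hypothetical names, and the struct merely stands in for two blocks that happen to sit next to each other on the heap. A real heap gives no such layout guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout standing in for two neighbouring heap blocks. */
struct ledger {
    int slots[2];   /* the two ints that were "allocated" */
    int owed;       /* unrelated data that happens to sit next door */
};

/* The model only works if owed directly follows slots (no padding). */
_Static_assert(offsetof(struct ledger, owed) == 2 * sizeof(int),
               "model assumes owed directly follows slots");

/* Store value at int index i, counted from the start of the struct,
   the way p[i] counts from the start of a block. There is deliberately
   no check that i stays inside slots; the caller must keep i < 3 so
   the write stays inside the struct itself. */
void store_at(struct ledger *l, size_t i, int value) {
    memcpy((unsigned char *)l + i * sizeof(int), &value, sizeof value);
}
```

Storing at index 0 or 1 leaves `owed` alone. Storing at index 2 silently replaces `owed`, and nothing crashes or complains, which is exactly why this kind of bug is hard to trace.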
A: 

Because malloc() allocates in BYTES. So, if you want to allocate (for example) 2 integers you must specify the size in bytes of 2 integers. The size of an integer can be found by using sizeof(int) and so the size in bytes of 2 integers is 2 * sizeof(int). Put this all together and you get:

int * p = malloc(2 * sizeof(int));

Note: given that the above only allocates space for TWO integers you are being very naughty in assigning a 3rd. You're lucky it doesn't crash. :)

banister
actually, `int *p = malloc(2 * sizeof *p);` also assigns the correct amount because it multiplies by the size of the pointer, which in this case, is an `int`
Andreas Grech
yes, sizeof *p and sizeof(int) return the same value :)
banister
@Dreas: That's taking the `sizeof` the type pointed at by `p`, not of `p` itself. Universally, given `int *p`, `sizeof(int) == sizeof(*p)`.
Novelocrat
+82  A: 

Simple logic: If you do not park in a legal parking space, nothing might happen but occasionally your car might get towed and you might get stuck with a huge fine. And, sometimes, as you try to find your way to the pound where your car was towed, you might get run over by a truck.

malloc gives you as many legal parking spots as you asked for. You can try to park elsewhere, and it might seem to work, but sometimes it won't.

For questions such as this, the Memory Allocation section of the C FAQ is a useful reference to consult. See 7.3b.

On a related (humorous) note, see also a list of bloopers by ART.

Sinan Ünür
+1 I like this analogy
Andreas Grech
Perfect metaphor. You can park illegally day after day and maybe nothing bad will happen. But there is still the possibility of being towed or ticketed, so you shouldn't do it.
MatrixFrog
Thank you everyone for the upvotes. I appreciate it.
Sinan Ünür
You provide a good explanation, you get upvotes. Thanks for the explanation.
David Thornley
The analogy is good, but you don't explain what's going on. It seems obvious to people that understand the problem...but if he understood the problem, he wouldn't have asked the question. The fact that the program could crash or the memory he's using could get overwritten isn't obvious. Explain what's going on behind this analogy, and it's an easy +1.
Beska
+1. Excellent analogy. And one's time wasted^H^H^H^H^H spent hunting down the memory corruptions is the fine for the offense.
Andrew Y
+2  A: 

Memory is represented as an enumerable, contiguous line of slots that numbers can be stored in. The malloc function uses some of these slots for its own tracking info, and sometimes returns chunks larger than what you need, so that when you return them later it isn't stuck with an unusably small chunk of memory. Your third int is landing either on malloc's own data, on empty space left over in the returned chunk, or in the area of pending memory that malloc has requested from the OS but not otherwise parcelled out to you yet.
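On systems using glibc you can observe the "chunk larger than what you need" effect directly. The sketch below assumes glibc's `malloc_usable_size` extension from `<malloc.h>`, which is not standard C and may be missing elsewhere. The helper name `usable_bytes` is made up for illustration.

```c
#include <malloc.h>   /* malloc_usable_size: a glibc extension, not standard C */
#include <stdlib.h>

/* Hypothetical helper: how many bytes the allocator actually set aside
   for a request of `requested` bytes. Returns 0 if malloc fails. */
size_t usable_bytes(size_t requested) {
    void *p = malloc(requested);
    if (p == NULL) {
        return 0;
    }
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

Calling `usable_bytes(2 * sizeof(int))` will typically report more than the bytes requested. That slack is an implementation detail and is intended for diagnostics; it is not extra room you are entitled to write into.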

Michael Speer
A: 

Because malloc allocates space on the heap, the part of your program's memory that is allocated dynamically. The underlying OS then gives your program the requested amount of virtual memory (or fails to, which is why you should always check malloc's return value for an error), and maps it to physical memory (i.e. the chips) using some clever magic involving complex things like paging that we don't want to delve into unless we are writing an OS.
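Checking the return value can look something like the following. This is a minimal sketch; the helper name `alloc_ints` and the choice to zero-fill the array are assumptions made for the example, not anything the standard library provides.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n ints, all set to 0.
   Returns NULL if the size would overflow or malloc fails. */
int *alloc_ints(size_t n) {
    /* Refuse requests whose byte count would overflow size_t. */
    if (n > SIZE_MAX / sizeof(int)) {
        return NULL;
    }
    int *p = malloc(n * sizeof *p);
    if (p == NULL) {
        fprintf(stderr, "alloc_ints: could not allocate %zu ints\n", n);
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        p[i] = 0;
    }
    return p;
}
```

The caller still has to test the result for NULL before using it, and must only touch indices 0 through n - 1.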

insitu
A: 

As everyone has said, you're writing to memory that isn't actually allocated, meaning that something could happen to overwrite your data. To demonstrate the problem, you could try something like this:

int *p = malloc(2 * sizeof(int));
p[0] = 10; p[1] = 20; p[2] = 30;
int *q = malloc(2 * sizeof(int));
q[0] = 0; // This may or may not get written to p[2], overwriting your 30.

printf("%d", p[0]); // Correctly prints 10
printf("%d", p[1]); // Correctly prints 20
printf("%d", p[2]); // May print 30, or 0, or possibly something else entirely.

There's no way to guarantee your program will allocate space for q at p[2]. It may in fact choose a completely different location. But for a simple program like this, it seems likely, and if it does allocate q at the location where p[2] would be, it will clearly demonstrate the out-of-range error.

MatrixFrog
Actually, since p was allocated on the heap and q was allocated on the stack, there is virtually 0 chance of a hit.
Sanjaya R
Good point. I'll edit.
MatrixFrog
+15  A: 

Let me give you an analogy to why this "works".

Let's assume you need to draw a drawing, so you retrieve a piece of paper, lay it flat on your table, and start drawing.

Unfortunately, the paper isn't big enough, but you, not caring, or not noticing, just continue to draw your drawing.

When done, you take a step back, and look at your drawing, and it looks good, exactly as you meant it to be, and exactly the way you drew it.

Until someone comes along and picks up their piece of paper that they left on the table before you got to it.

Now there's a piece of the drawing missing. The piece you drew on that other person's paper.

Additionally, that person now has pieces of your drawing on his paper, probably messing with whatever he wanted to have on the paper instead.

So while your memory usage might appear to work, it only does so because your program finishes. Leave such a bug in a program that runs for a while and I can guarantee you that you get odd results, crashes and whatnot.

C is built like a chainsaw on steroids. There's almost nothing you cannot do. This also means that you need to know what you're doing, otherwise you'll saw right through the tree and into your foot before you know it.

Lasse V. Karlsen
+1 Another great analogy
Andreas Grech
A: 

You are asking for space for two integers. p[3] assumes that you have space for 4 integers!

===================

You need to tell malloc how much you need because it can't guess how much memory you need.

malloc can do whatever it wants as long as it returns at least the amount of memory you ask for.

It's like asking for a seat in a restaurant. You might be given a bigger table than you need. Or you might be given a seat at a table with other people. Or you might be given a table with one seat. Malloc is free to do anything it wants as long as you get your single seat.

As part of the "contract" for the use of malloc, you are required to never reference memory beyond what you have asked for because you are only guaranteed to get the amount you asked for.
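If it turns out you need room for a third int after all, the way to stay inside the contract is to ask for more memory, for example with realloc. The sketch below is one way to do that; `grow_ints` is a hypothetical helper name, not a library function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize an int array to hold new_n elements.
   Returns the (possibly moved) block, or NULL on failure, in which
   case the original block p is untouched and still valid. */
int *grow_ints(int *p, size_t new_n) {
    /* Reject zero-sized and overflowing requests up front. */
    if (new_n == 0 || new_n > SIZE_MAX / sizeof *p) {
        return NULL;
    }
    return realloc(p, new_n * sizeof *p);
}
```

Assign the result to a temporary first and overwrite the old pointer only when the call succeeded, otherwise a failed realloc leaks the original block. After a successful `grow_ints(p, 3)`, writing to index 2 of the returned block is legal and the first two values are preserved.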

+2  A: 

Depending on the platform, p[500] would probably "work" too.

Sanjaya R
+1  A: 

When using malloc(), you are accepting a contract with the runtime library in which you agree to ask for as much memory as you are planning to use, and it agrees to give it to you. It is the kind of all-verbal, handshake agreement between friends, that so often gets people in trouble. When you access an address outside the range of your allocation, you are violating your promise.

At that point, you have requested what the standard calls "Undefined Behavior" and the compiler and library are allowed to do anything at all in response. Even appearing to work "correctly" is allowed.

It is very unfortunate that it does so often work correctly, because this mistake can be difficult to write test cases to catch. The best approaches to testing for it involve either replacing malloc() with an implementation that keeps track of block size limits and aggressively tests the heap for its health at every opportunity, or using a tool like valgrind to watch the behavior of the program from "outside" and discover the misuse of buffer memory. Ideally, such misuse would fail early and fail loudly.
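As a rough illustration of the "replacement malloc that tracks block sizes" idea, here is a toy wrapper that places a few guard bytes (a canary) after each block and checks them later. All names here (`guarded_alloc`, `guarded_intact`, `guarded_free`, `CANARY_LEN`, `CANARY_BYTE`) are invented for this sketch. Real debugging allocators are considerably more thorough, and this one only notices overruns that land within the guard bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CANARY_LEN  8      /* guard bytes placed after the user's block */
#define CANARY_BYTE 0xA5   /* pattern the guard bytes must keep */

/* Header stored in front of each block; the union keeps the user
   pointer suitably aligned for any type. */
typedef union {
    size_t size;
    max_align_t align;
} guard_header;

/* Allocate size bytes followed by CANARY_LEN guard bytes. */
void *guarded_alloc(size_t size) {
    guard_header *h = malloc(sizeof *h + size + CANARY_LEN);
    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    unsigned char *user = (unsigned char *)(h + 1);
    memset(user + size, CANARY_BYTE, CANARY_LEN);
    return user;
}

/* Return 1 if the guard bytes after the block are untouched, else 0. */
int guarded_intact(const void *ptr) {
    const guard_header *h = (const guard_header *)ptr - 1;
    const unsigned char *canary = (const unsigned char *)ptr + h->size;
    for (size_t i = 0; i < CANARY_LEN; i++) {
        if (canary[i] != CANARY_BYTE) {
            return 0;
        }
    }
    return 1;
}

/* Free a guarded block, failing loudly if it was overrun. */
void guarded_free(void *ptr) {
    if (ptr != NULL) {
        assert(guarded_intact(ptr));
        free((guard_header *)ptr - 1);
    }
}
```

With `int *p = guarded_alloc(2 * sizeof *p);`, writing p[0] and p[1] leaves the check passing. Writing p[2] lands in the guard bytes, which the wrapper itself owns, so `guarded_intact(p)` then reports the overrun and `guarded_free(p)` aborts instead of letting the bug go unnoticed.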

One reason why using elements close to the original allocation often succeeds is that the allocator often gives out blocks that are related to convenient multiples of the alignment guarantee, and that often results in some "spare" bytes at the end of one allocation before the start of the next. However, the allocator often stores critical information that it needs to manage the heap itself near those bytes, so overstepping the allocation can result in destruction of the data that malloc() itself needs to successfully make a second allocation.

Edit: The OP fixed the side issue with *(p+2) confounded against p[1] so I've edited my answer to drop that point.

RBerteig
A: 

The reason for the size given to malloc() is for the memory manager to keep track of how much space has been given out to each process on your system. These tables help the system to know who allocated how much space, and what addresses are free()able.

Second, C allows you to write to any part of RAM at any time. Kernels may prevent you from writing to certain sections, causing protection faults, but there is nothing preventing the programmer from attempting it.

Third, in all likelihood, calling malloc() the first time doesn't simply allocate 8 bytes to your process. This is implementation dependent, but it is more likely that the memory manager allocates a full page for your use, just because it is easier to allocate page-size chunks. Subsequent malloc() calls would then further divide the previously allocated page.

KFro
C doesn't allow you to write to any part of memory. C guarantees that you can write to certain parts, and anything else is undefined behavior. This means that anything the implementation does is perfectly legal C, including formatting your hard disk, sending everybody on your contact list promotional material from Amway, or even doing what you expected. Just don't count on that last.
David Thornley
I guess my point is that "undefined behavior" is not undefined. It is defined by the implementation. There are many C programs out there that take advantage of the fact that writing to particular locations in memory does something specific due to the implementation allowing it. Special purpose registers, for instance, can control a hardware feature in a defined manner...and we understand that at the time we write the C code.
KFro
+3  A: 

Try this:

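#include <stdio.h>
#include <stdlib.h>

int main ( int argc, char *argv[] ) {
  int *p = malloc(2 * sizeof *p);   /* room for two ints */
  int *q = malloc(sizeof *q);       /* room for one int */
  *q = 100;

  /* Deliberately undefined behavior: only p[0] and p[1] are ours. */
  p[0] = 10;    p[1] = 20;    p[2] = 30;    p[3] = 40;
  p[4] = 50;    p[5] = 60;    p[6] = 70;

  printf("%d\n", *q);

  return 0;
}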

On my machine, it prints:

50

This is because you wrote past the end of the memory allocated for p, and stomped on q.

Note that malloc may not put p and q in contiguous memory because of alignment restrictions.

A: 

Do :

int *p = malloc(2 * sizeof(*p)); // wrong (if type is something greater than a machine word)

[type] *p = malloc(2 * sizeof([type])); // right.
Yuriy Y. Yermilov
`sizeof(*p)`: size of the thing pointed to by `p` not the size of `p`.
Sinan Ünür
No, because the datatype is not always an integer. In this case it is a pointer, but you need to allocate room for the datatype, which can be a custom one, say a struct which is 40 bytes or so. So in other words, <type> *p = malloc(2 * sizeof(<type>))
Yuriy Y. Yermilov
The * on the *p dereferences the pointer which makes that evaluate to sizeof(int). If p was defined as double*, you'd get sizeof(*p) == sizeof(double).
Matthew Iselin
If you give `sizeof` an expression that evaluates to a value (as opposed to a type) it will take the size of the type of that value. Therefore, for all types `X`, `X *p; sizeof(*p) == sizeof(X);` holds.
Logan Capaldo
Uh, that failed at being formatted right... The point is still clear though.
Matthew Iselin
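The point made in these comments can be checked with a type larger than a machine word. The following is a small sketch; `struct record` and `alloc_records` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type that is clearly larger than a pointer. */
struct record {
    char name[32];
    double balance;
};

/* Allocate n records. sizeof *r is the size of what r points to,
   i.e. sizeof(struct record), not the size of the pointer r. */
struct record *alloc_records(size_t n) {
    struct record *r = malloc(n * sizeof *r);
    return r;   /* may be NULL; the caller must check */
}
```

Here `sizeof *r` equals `sizeof(struct record)`, so the allocation is correct for any element type, and it stays correct if the declaration of `r` is later changed to a different type.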