ansaurus

Question

Validity of the code

Answer 1

+14 A:

Your array is two arrays of int[2], while your function populate() treats it as a single array of int[4]. Depending on exactly how the compiler decides to align the elements of array, this may not be a valid assumption.

Specifically, when j is 2 and you try to access arr[2], this is outside the bounds of main's array[0] and is therefore invalid.

Greg Hewgill 2010-01-10 04:48:38

But when you access `arr[2]`, `arr` is not an "array [2] of anything", it is a pointer. I am not sure if this argument applies.

Alok 2010-01-10 05:24:27

@Alok: In C++, multidimensional arrays are a different thing than arrays of pointers. Java only has the latter, but C and C++ have both forms.

Greg Hewgill 2010-01-10 05:50:30

@Greg, OK. In C, there is nothing like "multidimensional arrays", there are just arrays (which in turn can be composed of arrays, and so on). Not sure if you were implying that C++ has true multidimensional arrays.

Alok 2010-01-10 06:43:16

Sorry but some of this is garbage.1. int A[2][2]; is a multi-dimansion array. It is also an array of arrays (they are not mutually exclusive).2. The code is legal and will always work (C/C++ arrays are required to be contiguous).3. Never do it again. "Legal" != "Smart". If you handed that to me I'd break your knuckles. It is deliberately confusing.

Michael J 2010-01-10 08:39:11

Michael J is correct in that elements are contiguous, however the behavior is still undefined, the implementation is allowed to perform bounds checking for example.

avakar 2010-01-10 10:06:46

@avakar - The function "popuate()" gets a pointer to an integer. How can it do any bounds checking? You can check that the process owns the memory, which it does. You cannot reasonably check the object size.

Michael J 2010-01-10 11:26:46

Michael J, because the pointer itself can possibly carry the necessary information (the pointers are then called *fat*). I think there even were real implementations that did this.

avakar 2010-01-10 11:50:22

@Michael J: It doesn't matter *how* it'd do it. Just the fact that it is allowed is enough to render it undefined behavior.

jalf 2010-01-10 14:40:09

I am now convinced that it is undefined behavior. Please see the link in my post or the explanation in my post.

Alok 2010-01-10 15:32:13

@Greg/Alok: I'm not sure I understand what you mean by "depending on how the compiler decides to align the elements of array". What does align have to do with this? If I understood Alok's answer (please correct me if not), a standard-conformant compiler *has* to allocate the memory in a specific way for this definition. The undefined behaviour doesn't come from how the compiler chooses to lay out the memory, but by invalid indexing done later.

Edan Maor 2010-01-10 15:39:50

I should note that while it may actually work with a stack allocated array, if it were dynamically allocated there would be a problem because of the book-keeping of the array size usually just placed before the first element of the array.

Matthieu M. 2010-01-10 16:09:37

Alok 2010-01-10 16:12:45

Emile Cormier 2010-01-10 18:05:53

@Alok: So, if I inderstand correctly, the standard allows int* to be implemented as more than a mere 32/64-bit address and can have other "housekeeping" data? **Edit:** It seems you deleted the comment I'm replying to here.

Emile Cormier 2010-01-10 18:25:16

@Emile: Yes, the standard doesn't care how a pointer is implemented as long as it behaves as expected. Pointers are not memory addresses. Think of compilers targeting Flashbytecode for example. Raw pointers don't exist in that environment, so pointers have to be compiled to... some other representation. A pointer is a C/C++ concept. Just because it is *typically* compiled down to a memory address, that doesn't mean it's required to be equivalent.

jalf 2010-01-10 18:32:09

@Emile, yeah, my comment wasn't that well-formed, and I thought I would write it again, but by then jalf replied. Sorry.

Alok 2010-01-10 19:33:03

Guys, if your interpretation were true then casting would be illegal. How would malloc work? It returns a void * pointer and is cast to whatever. You have misunderstood the standard.

Michael J 2010-01-10 21:21:54

`void *` is not cast to whatever, it is guaranteed by the standard that a pointer to `void *` can be *converted* to any other pointer type. `malloc` guarantees alignment for any type. Also, a pointer can be converted to `void *` and back, and that round trip is guaranteed to yield the original pointer.

Alok 2010-01-10 21:46:28

Answer 2

+3 A:

That is not going to always work. C has arrays of arrays, not 2D arrays. The sub-arrays are not always specified to be contiguous in memory(static arrays might be, check C/C++ standard) In this particular example, I suspect it works correctly. However, if you had dynamically allocated the memory being passed in, you quite possibly would fail, because malloc(or new) might have put the subarrays quite far apart.

If, however, you want to linearly walk down '2d' memory, you can construct a 2D accessor against a 1D array and it will work fine and things like memset will work against the 1D array.

Paul Nathan 2010-01-10 04:51:07

Arrays are required to have contiguous memory (section 6.2.5, bullet 10 in the ISO C99 standard). And if multidimensional arrays are defined to be arrays of arrays, then doesn't that imply that they too are contiguous (the n+1st array must immediately follow the n-th array in memory)?

jamesdlin 2010-01-10 05:15:44

Oops, I meant bullet 20.

jamesdlin 2010-01-10 05:22:41

They must be contiguous, but there might be implementation-defined padding between them. I'm not aware of this being a problem for `int` on any compiler I've used, though.

Mark Ransom 2010-01-10 05:23:40

Ah, good point. Padding could be a problem for multidimensional `char` arrays.

jamesdlin 2010-01-10 05:25:58

Where would the padding be? Between `array[i]`'s last element and `array[i+1]`'s first element? But `array[i]` and `array[i+1]` are successive elements of an array, so there can't be any padding.

Alok 2010-01-10 05:39:04

malloc returns a block of contiguous memory. Treating the result of malloc as an array of int arrays will never result in 'subarrays quite far apart'.

Pete Kirkham 2010-01-10 10:13:49

If there is padding in arrays, how do you portably allocate them dynamically? `int (*multi_array)[3] = malloc(rows * sizeof(int[3]));` wouldn't work by that reasoning ( http://en.wikibooks.org/wiki/C_Programming/Common_practices#Dynamic_multidimensional_arrays )

Pete Kirkham 2010-01-10 10:28:49

The behavior is undefined not because of padding, but because "the standard says so".

Alok 2010-01-10 15:33:07

The standard says that `int[3][rows]` is an array of objects of type `int[3]`, and that elements in arrays are contiguous. So why is the code I posted undefined?

Pete Kirkham 2010-01-10 16:19:12

@Pete: `malloc` indeed allocates contiguous space, but the question isn't about `malloc` (or dynamically allocated memory) at all.

Alok 2010-01-10 16:29:06

Answer 3

+7 A:

You have an array of arrays. array[1] follows array[0] in memory, because arrays are contiguous. If p == array[0], then p[1] follows p[0], because arrays are contiguous. So, you are right: all the memory for array is contiguous.

In pictures, array looks like this.

+-----------------+-----------------+
|      [0]        |      [1]        |
+-----------------+-----------------+

Now, let's break down array[0] and array[1], they individually look like this:

+--------+--------+
|  [0]   |  [1]   |        
+--------+--------+

So, the final picture is:

+--------+--------+--------+--------+
| [0][0] | [0][1] | [1][0] | [1][1] |
+--------+--------+--------+--------+

Now, the question is, can you access this contiguous memory the way you are. The answer is, it is not guaranteed by the standard. The arrays are contiguous, but the standard doesn't allow indexing the way you have done. In other words:

&array[0][0]+2 == &array[1][0], but (&a[0][0] + 2) + 1 is undefined whereas &a[1][0] + 1 is valid. If this seems strange, it is, but as per the quote you posted from the standard, you are only allowed to calculate a pointer that is either inside an array or at most one past the array (without dereferencing that "one past" pointer).

In practice, I doubt that this would fail anywhere, but the according to the standard at least, your code is invalid because of undefined behavior.

See this post on comp.lang.c as well.

Alok 2010-01-10 05:22:00

Shouldn't the last element in the third picture be "[1][1]"?

vobject 2010-01-10 14:08:20

The question is not whether the memory is contiguous (it clearly is), but whether the behavior is well-defined. I don't see how this can be "half-right". He asks if it is UB, the answer is yes. The fact that it'll usually work doesn't make it any less undefined.

jalf 2010-01-10 14:41:20

jalf, you are right. When I said "half-right", I meant that some people thought that the memory need not be contiguous, therefore the behavior is undefined. On the other hand, many other people thought that the memory has to be continuous, so the behavior is well-defined.

Alok 2010-01-10 15:30:59

@vobject: fixed. @jalf: I have changed the wording of my answer.

Alok 2010-01-10 15:37:03

What also needs to be added to the mix is the CPU configuration. For instance, the DEC memory model is reversed from the PC memory model. So the arrays would also be reversed and accessing them as a single array would come up differently.

Dave 2010-01-12 01:47:19

My copy of the standard says this in 8.3.4 Arrays [dcl.arrays] "A consistent rule is followed for multidimensional arrays. If E is an n-dimensional array of rank i × j × . . . k, then E appearing in an expression is converted to a pointer to an (n − 1)-dimensional array with rank j× ... × k. If the * operator, either explicitly or implicitly as a result of subscripting, is applied to this pointer, the result is the pointed-to (n-1)-dimensional array, which itself is immediately converted into a pointer." This suggests that it may not be UB. as the array is converted to an array of size 4.

Michael Anderson 2010-01-12 02:18:03

@Michael: No, what it says is that the two-dimensional array is converted to a single-dimensional array of size 2. It's really just stating the obvious. Dereferencing a N-dimensional array yields the N-1 dimensional array that is stored as one of the array entries. In our case, each array entry in the "outer" array stores a single-dimensional array of size two. And one of those is returned when you dereference the 2D array.

jalf 2010-01-12 05:02:26

Answer 4

A:

In C everything is stored in linear memory segments. You are passing address of a[0][0] which would be same as address of a[0] so a[i][j] is same as a[i*ColSize+j] because everything is stored linearly. But if you allocate memory dynamically it would fail because that time all rows might not be stored in contiguous location. then a[i][j] would be *(&a[i]+j).

GG 2010-01-10 05:36:26

No -- the arrays are NOT required to be stored linearly by the standard. If it's declared as a two dimensional array, it's a two dimensional array -- not a single array big enough to hold all the sub-arrays.

Billy ONeal 2010-01-10 05:51:51

BillyONeal, an array of arrays is still an array and its elements are stored contiguously.

avakar 2010-01-10 10:42:21

ansaurus

tags:

views:

answers:

Validity of the code

related questions