ansaurus

Question

Why do C and C++ not allow passing arrays by value to functions?

Answer 1

+10 A:

You can pass an array by value, but you have to first wrap it in a struct or class. Or simply use a type like std::vector.
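A minimal sketch of the struct-wrapping idea, for illustration only. The names IntArray10 and sum_and_clear are hypothetical and not from the answer; the point is just that copying the struct copies the array inside it.

```cpp
#include <cassert>

// Hypothetical wrapper: the struct is copied as a whole, array member included.
struct IntArray10 {
    int data[10];
};

// Takes the wrapper by value, so the callee works on its own copy.
int sum_and_clear(IntArray10 a) {
    int sum = 0;
    for (int i = 0; i < 10; ++i) {
        sum += a.data[i];
        a.data[i] = 0;  // modifies only the local copy
    }
    return sum;
}
```

Calling sum_and_clear(a) on a caller-side IntArray10 leaves the caller's elements untouched, which is what pass-by-value means here.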

I think the decision was for the sake of efficiency. One wouldn't want to do this most of the time. It's the same reasoning as why there are no unsigned doubles. There is no associated CPU instruction, so a language like C++ makes the inefficient thing hard to do.

As @litb mentioned: "C++1x and boost both have wrapped native arrays into structs providing std::array and boost::array which i would always prefer because it allows passing and returning of arrays within structs"
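A small sketch of what that looks like with std::array (C++11 or later). The function name reversed and the fixed size 4 are assumptions chosen for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Both the parameter and the return value are whole arrays passed by value.
std::array<int, 4> reversed(std::array<int, 4> a) {
    for (std::size_t i = 0; i < a.size() / 2; ++i) {
        int tmp = a[i];
        a[i] = a[a.size() - 1 - i];
        a[a.size() - 1 - i] = tmp;
    }
    return a;  // returns a copy; the caller's array is unchanged
}
```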

An array's type includes both its element type and its size. Note that an array is not the same thing as a pointer to its first element.

Most people think that you have to pass an array as a pointer and specify the size as a separate parameter, but this is not needed. You can pass a reference to the actual array itself while maintaining its sizeof() status.

#include <cassert>

// Here you need the size because you have reduced
// your array to an int* pointing to the first element.
void test1(int *x, int size)
{
  assert(sizeof(x) == sizeof(int*));  // 4 on a typical 32-bit platform
}

// This function can take in an array of size 10
void test2(int (&x)[10])
{
  assert(sizeof(x) == 10 * sizeof(int));  // 40 with 4-byte int
}

// Same as test2 but by pointer
void test3(int (*x)[10])
{
  assert(sizeof(*x) == 10 * sizeof(int));
  // Note: to access elements you need to write (*x)[i]
}

Some people may say that the size of an array is not known. This is not true.

int x[10];
assert(sizeof(x) == 10 * sizeof(int));  // 40 with 4-byte int

But what about allocations on the heap? Allocations on the heap do not return an array. They return a pointer to the first element of an array. So new is not type safe. If you do indeed have an array variable, then you will know the size of what it holds.
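The difference can be checked at compile time. This is only a sketch, assuming C++11 for decltype and static_assert; the names declared and heap_pointer_size are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

int declared[10];

// A declared array keeps its full type, size included.
static_assert(std::is_same<decltype(declared), int[10]>::value,
              "declared array has type int[10]");
static_assert(sizeof(declared) == 10 * sizeof(int),
              "the size is part of the type");

// new int[n] yields int*: the element count is not in the resulting type.
static_assert(std::is_same<decltype(new int[10]), int*>::value,
              "array new returns a pointer to the first element");

// Only the first dimension is dropped, so this form keeps the inner size.
static_assert(std::is_same<decltype(new int[1][10]), int (*)[10]>::value,
              "pointer to an array of 10 int");

// sizeof on the heap pointer reports the pointer size, not the allocation size.
std::size_t heap_pointer_size() {
    int* p = new int[10];
    std::size_t s = sizeof(p);
    delete[] p;
    return s;
}
```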

Brian R. Bondy 2009-03-22 14:18:37

You see that [10] in the function parameter? That's where the size is coming from. The reference has nothing to do with it.

anon 2009-03-22 14:35:30

it's more type safe than the test1

Brian R. Bondy 2009-03-22 14:37:24

@Neil Butterworth: The size of an array is part of the array type itself. If you wanted variable size you'd be better off with an std::vector.

Brian R. Bondy 2009-03-22 14:39:59

You said that "people think you need to specify the size". People are right, as your needlessly complex code demonstrates.

anon 2009-03-22 14:46:04

It has everything to do with it. Had you said void test2(int x[10]); you would pass nothing more than a simple pointer, and the "10" there is completely ignored and useless (even dangerous). Now the reference accepts the parameter by-reference and avoids conversion of the argument to pointer.
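A sketch of that adjustment, checked with type traits (assumes C++11; the function names by_decayed, by_reference and elements_seen are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

void by_decayed(int x[10]);       // the "10" is ignored: same as int* x
void by_reference(int (&x)[10]);  // the full array type is kept

// The declared function types show what the compiler really sees.
static_assert(std::is_same<decltype(by_decayed), void(int*)>::value,
              "int x[10] as a parameter is adjusted to int*");
static_assert(std::is_same<decltype(by_reference), void(int (&)[10])>::value,
              "the reference parameter keeps the array type");

// Inside a by-reference function, sizeof still sees the whole array.
std::size_t elements_seen(int (&x)[10]) {
    return sizeof(x) / sizeof(x[0]);
}
```

Passing an int[9] to elements_seen would be a compile error, which is the type-safety argument made above.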

Johannes Schaub - litb 2009-03-22 14:49:54

@Neil Butterworth: Thanks for pointing out what needed clarification. I corrected my description to say "specify the size as a separate parameter". My above code demonstrates that there is a difference between a pointer to the first element and a pointer to an array.

Brian R. Bondy 2009-03-22 14:50:10

But you still need to know the size! My point is that using sizeof() in the function is pointless because you already know the size of the array is 10.

anon 2009-03-22 14:53:28

I think the above nicely demonstrates what the difference between a pointer and an array pointer is. The second variant does not pass the size, it is just specifying the type. Which happens to contain the size. It is important to know because someone might use code as in test1 with sizeof.

Brian R. Bondy 2009-03-22 14:57:38

Brian yeah i like that sample too. look at bottom of http://stackoverflow.com/questions/275994/whats-the-best-way-to-do-a-backwards-loop-in-c-c-c/276053#276053 too for a sizeof replacement for arrays.

Johannes Schaub - litb 2009-03-22 15:06:30

worthwhile to mention C++1x and boost both have wrapped native arrays into structs providing std::array<T, N> and boost::array<T, N> which i would always prefer because it allows passing and returning of arrays within structs

Johannes Schaub - litb 2009-03-22 15:09:24

@litb: nice array_size implementation, very cool

Brian R. Bondy 2009-03-22 15:30:16

+1, nice clear explanation.

j_random_hacker 2009-03-23 04:02:01

@Neil: It is in fact possible to capture the size of the array without specifying it in the function declaration by turning the function into a function template, with the array size as a non-type (i.e. integral) template parameter.
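Roughly, that could look like the sketch below. The names length_of and sum are hypothetical and this is not the array_size implementation linked earlier in the thread, just the same general technique.

```cpp
#include <cassert>
#include <cstddef>

// N is deduced from the argument's array type; nothing is hard-coded.
template <typename T, std::size_t N>
std::size_t length_of(T (&)[N]) {
    return N;
}

// Works for int arrays of any size known at compile time.
template <std::size_t N>
int sum(const int (&x)[N]) {
    int total = 0;
    for (std::size_t i = 0; i < N; ++i) {
        total += x[i];
    }
    return total;
}
```

Each distinct N instantiates a separate function, so this only works for real arrays, not for pointers obtained from new.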

j_random_hacker 2009-03-23 04:02:49

thanks for the praise, Brian. appreciated :)

Johannes Schaub - litb 2009-03-23 04:15:19

Answer 2

+6 A:

EDIT: I've left the original answer below, but I believe most of the value is now in the comments. I've made it community wiki, so if anyone involved in the subsequent conversation wants to edit the answer to reflect that information, feel free.

Original answer

For one thing, how would it know how much stack to allocate? That's fixed for structures and objects (I believe) but with an array it would depend on how big the array is, which isn't known until execution time. (Even if each caller knew at compile-time, there could be different callers with different array sizes.) You could force a particular array size in the parameter declaration, but that seems a bit strange.

Beyond that, as Brian says there's the matter of efficiency.

What would you want to achieve through all of this? Is it a matter of wanting to make sure that the contents of the original array aren't changed?

Jon Skeet 2009-03-22 14:21:56

char x[10]; assert(sizeof(x) == 10);

Brian R. Bondy 2009-03-22 14:23:04

Brian is right. arrays in C++ and C89 always have sizes known at compile time :) but i think your answer still has to do with it. there are many situations where an array decays into a pointer. and making arrays pass to functions would for one be dangerous (precisely because of that decay) ...

Johannes Schaub - litb 2009-03-22 14:29:01

and useless to a certain degree, because we would be limited for passing one size (i.e int[42] - but not int[41]).

Johannes Schaub - litb 2009-03-22 14:30:03

i believe your answer points out exactly that, just from a point of view which expects to make it work to accept different array sizes but which would not work in C++/C because it would need to accept array sizes that have different sizes. so i +1 anyway :D

Johannes Schaub - litb 2009-03-22 14:33:22

The size of the array is part of the type itself. If you wanted variable size you'd be better off with an std::vector.

Brian R. Bondy 2009-03-22 14:38:42

@litb: arrays certainly do *not* always have known sizes at compile time! An array allocated on the heap, for instance, can have an arbitrary size that varies depending on the state and inputs of the program.

thehouse 2009-03-22 14:52:33

i meant when the argument has an array type, then that size has to be known. passing an array on the heap around by value wouldn't be possible because one would need the size at compile time. now i see what jon means, thanks thehouse :)

Johannes Schaub - litb 2009-03-22 15:03:42

@thehouse: new loses some type information, i.e. it is not type safe for arrays. It will cast away the array part of the pointer. But if you indeed have a typesafe variable that holds an array, it will always know its size.

Brian R. Bondy 2009-03-22 16:15:10

...The type of an array includes its size. But it can be reduced to a simple pointer to the first element, hence losing the size. My point is that arrays and pointers are distinct beasts, but often treated together.

Brian R. Bondy 2009-03-22 16:24:33

Is this answer still useful? I'm not convinced it is, but I'm loath to delete it when there's potentially useful information in the comments. Suggestions anyone?

Jon Skeet 2009-03-22 18:26:37

@Jon: I would leave it for the comments. About your last comment: you are right, if the user does not want the contents to be changed, the language (C++, not C) has that facility through the use of the const keyword.
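As a small illustration of the const approach (a sketch only; first_negative_index is a made-up function):

```cpp
#include <cassert>
#include <cstddef>

// The callee can read the caller's elements but cannot assign to them.
int first_negative_index(const int* x, std::size_t size) {
    for (std::size_t i = 0; i < size; ++i) {
        if (x[i] < 0) {
            return static_cast<int>(i);
        }
        // x[i] = 0;  // would not compile: x points to const int
    }
    return -1;  // no negative element found
}
```

No copy is made, yet the function signature guarantees the array contents are not modified through x.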

David Rodríguez - dribeas 2009-03-22 20:13:57

Okay, I'll leave it but put a note in the answer to point to the comments :)

Jon Skeet 2009-03-22 20:45:11

Answer 3

+14 A:

In C/C++, internally, an array is passed as a pointer to some location, and basically, it is passed by value. The thing is, that copied value represents a memory address to the same location.
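A short sketch of what "the address is passed by value" means in practice (callee is a hypothetical name; nullptr assumes C++11):

```cpp
#include <cassert>

void callee(int* p) {
    p[0] = 99;    // writes through the copied address: the caller sees this
    p = nullptr;  // changes only the callee's local copy of the address
}
```

After the call, the caller's first element is 99, but the caller's own pointer still refers to the array.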

In C++, a vector<T> is copied and passed to another function, by the way.

Mehrdad Afshari 2009-03-22 14:22:28

Best answer! I think though it should also emphasize that C/C++ does not *prevent* passing the array by value.

hasen j 2009-03-22 14:39:46

This answer confuses arrays with pointers. It may sound simple, but it indeed causes confusion i think. It's important to keep the concepts of arrays and pointers apart, to understand their fundamental difference in the first place.

Johannes Schaub - litb 2009-03-22 14:46:26

@litb: Well, yeah, it's pretty important to know that it's not precisely the same as passing a pointer from the compiler's point of view (e.g. multidimensional arrays), but it's equally important to emphasize that the actual array address is *copied*, not passed by reference.

Mehrdad Afshari 2009-03-22 14:55:46

An array is indeed more than a pointer, otherwise for char a[1]; sizeof(a) would equal sizeof(char*) instead of being 1.

anon 2009-03-22 15:22:01

@Neil: I think that's exactly the thing `litb` pointed out. The real problem is the concept you might want to define as an array. While I admit litb's comment is perfectly valid, I'm not able to explain it in a more precise way. I think it's already obvious what I mean.

Mehrdad Afshari 2009-03-22 15:26:27

Yes, i think that is the fundamental problem. Both of us are right i think. Some talk about the concept of an array, and thus include buffers allocated at the heap and pointed to by a pointer, and some talk about declared arrays as they exist in C and C++ as aggregate objects.

Johannes Schaub - litb 2009-03-22 15:40:28

@litb: I think the problem is that new int[10] returns a plain int*, a pointer to the first element, even though an array was allocated. It would be better if it returned an int (*)[10] when you allocated that, but then you wouldn't be allowed to have dynamic allocations. I think you could hack something together with templates to get typesafe new

Brian R. Bondy 2009-03-22 16:02:00

Brian, yeah i agree, that special handling somewhat muddles pointers and arrays together (btw, here is the pet peeve http://stackoverflow.com/questions/423823/whats-your-favorite-programmer-ignorance-pet-peeve/484900#484900 :p)

Johannes Schaub - litb 2009-03-22 16:14:24

nice :), I don't think everyone understands that new doesn't return an array it returns a pointer to the first element of an array. The question on this page asks about arrays though, not pointers to the first element of an array.

Brian R. Bondy 2009-03-22 16:19:35

@Brian: All goes back to the definition of the array. Is it the buffer or the concept as litb pointed out? Isn't the pointer of the first element called an array? or is it? Is a vector an array (as it stores a sequence of elements linearly, which is one definition of an array)?

Mehrdad Afshari 2009-03-22 16:33:00

@Mehrdad: I think the definition of an "array type" is a variable that has a pointer to the first element and a known size. Otherwise what you have is just a pointer.

Brian R. Bondy 2009-03-22 16:37:15

A buffer is what an array points to, but I don't think of it as part of the array type.

Brian R. Bondy 2009-03-22 16:37:55

It's not part of an array *type*, but it can probably be considered a part of an array. You are certainly right about array type, and the way compiler treats arrays. Anyhow, I think it does not matter that much but the real controversial thing is the definition.

Mehrdad Afshari 2009-03-22 16:42:59

Ya I guess basically if you consider array == array type.

Brian R. Bondy 2009-03-22 16:51:41

Mehrdad, it's quite clear what an array is. a pointer is not an array. a vector is not, a boost::array is not... just a T[N] is. the other things either emulate an array, or model the conceptual array. But they are not real arrays.

Johannes Schaub - litb 2009-03-22 16:51:50

if you say an array is nothing more than a pointer, then that's plain wrong. a lambda expression then is nothing more than a chunk of bits. everything is the same then, if you abstract the language rules away and look at them at the assembler/machine code level

Johannes Schaub - litb 2009-03-22 16:53:10

@litb: Right. That statement should be clarified I think... This one is better.

Mehrdad Afshari 2009-03-22 17:01:22

@litb: Your last comment is terrific. It made me think of an answer like "an array is a sequence of electrons that get around the machine :))"

Mehrdad Afshari 2009-03-22 17:23:36

@Mehrdad: I know what you're trying to say, but I think your current explanation is very close to wrong. The problem is that it blurs the line -- *all* cases where pointer semantics are used instead of value semantics can be explained as "well, it's really value semantics on the underlying address".

j_random_hacker 2009-03-23 03:59:24

Yes, but isn't it the right thing nevertheless?! I mean, considering the question is a design decision on the way C works, and not a how-to question, I think it's OK to give the answer "this is basically the only way *C* works, everything is call by value, but that thing is now an address"

Mehrdad Afshari 2009-03-23 10:28:37

Mehrdad, hehe great i made you think of something fun :p dunno, i thought as you have it now it makes sense - the pointer is passed by value. after all it is, i think.

Johannes Schaub - litb 2009-03-23 16:06:34

C has two things: the "locator value" (= lvalue) and the "value of an expression" (= rvalue), as it terms those in a note. While there are rvalue arrays (arrays wrapped in a struct and returned from a function), the "value of an array in an expression" is intended to be the pointer i believe...

Johannes Schaub - litb 2009-03-23 16:10:34

@Mehrdad: Actually your answer is correct, it's just a question of emphasis. My reading of your answer (and maybe I'm alone :) ) is that you're trying to say "C *does* use value semantics everywhere" (using a broadened definition of "value semantics") which is confusing I think. :/

j_random_hacker 2009-03-24 13:48:30

Answer 4

+3 A:

I'm not actually aware of any languages that support passing naked arrays by value. To do so would not be particularly useful and would quickly chomp up the call stack.

Edit: To downvoters - if you know better, please let us all know.

anon 2009-03-22 14:27:27

C++ vector<...> objects, when passed by value, are copied. And although this wastes (heap-)memory, it won't "quickly chomp up the call stack".

vog 2009-03-22 14:58:34

The native array inside a vector is passed by reference - it's a pointer. C++ native arrays are exactly the same as C arrays.

anon 2009-03-22 15:02:18

MATLAB has value semantics even for arrays. This is one reason why non-trivial matlab programs often use a huge amount of memory, though current versions of the interpreter use copy-on-write to reduce the memory usage.

janneb 2009-03-22 15:04:54

Copy on write is passing by ref (R does the same). The fact the array may grow later (depending on use) is not the issue.

anon 2009-03-22 15:07:13

i agree with Neil. i know no language which can pass arrays by value. C# and Java pass them not at all (not even by reference), C++ can pass them by reference only. And C can't pass them at all either, only passing a pointer to their first element.

Johannes Schaub - litb 2009-03-22 15:10:34

though, is passing arrays contained in structs passing-of-arrays? probably one could argue about that hours long :D

Johannes Schaub - litb 2009-03-22 15:12:18

By native, I meant naked - I'll change my answer.

anon 2009-03-22 15:17:08

No, as I said, MATLAB has value semantics, not reference semantics. When you call a function, semantically you get copies of the arguments rather than references to them. That the interpreter can defer copying is an implementation detail; perhaps I shouldn't have brought that up, confusing the issue

janneb 2009-03-22 16:54:41

i don't know any language that does this, but certainly there are languages that do so. not saying you are wrong. just saying i don't know one that does so. i imagine it could have benefits where languages do long calcs with the arguments - aliasing could matter there. thanks for the insight janneb

Johannes Schaub - litb 2009-03-22 17:05:56

now that you told me matlab has value semantics, indeed i will have to correct myself and say i know at least one language that does :)

Johannes Schaub - litb 2009-03-22 17:07:25

You can pass arrays in Ada as "in" arguments. This is conceptually like passing them by value. However you can't modify "in" parameters. And certainly, under the sheets, the compiler is passing them by reference for efficiency.

Brian Neal 2009-03-22 17:43:59

@vog: Passing by value consumes stack. Passing vectors by value does not in so much as the memory is indeed allocated on the heap. The vector internals are copied, the copy constructor will allocate new heap memory and copy the arrays. No reference to the original is passed.
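A sketch of that behaviour (has_own_buffer is a hypothetical helper; data() assumes C++11):

```cpp
#include <cassert>
#include <vector>

// Returns true if the by-value parameter has its own element buffer.
bool has_own_buffer(std::vector<int> copy, const int* original_data) {
    copy[0] = -1;  // touches only the copy's heap buffer
    return copy.data() != original_data;
}
```

When called with an lvalue vector, the copy constructor allocates a fresh buffer on the heap, so only the small vector object itself lives in the callee's frame.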

David Rodríguez - dribeas 2009-03-22 20:20:43

'copy the arrays' should be: copy the array contents. @janneb: Copy on write is quite troublesome in multithreaded programs. Either you enforce locking on each access to the element, affecting performance, or you end up playing Russian roulette with pointers.

David Rodríguez - dribeas 2009-03-22 20:23:26

@dribeas: That might be; I'm not arguing that the MATLAB way is better or worse, just saying how it is.

janneb 2009-03-22 21:51:34

@Brian Neal: In Fortran you can set the intent(in) attribute for arguments, which is similar to Ada. But as you say yourself, this is not really value semantics. More like passing objects via const reference in C++.

janneb 2009-03-22 21:52:08

Answer 5

+3 A:

This is one of those "just because" answers. C++ inherited it from C, and had to follow it to keep compatibility. It was done that way in C for efficiency. You would rarely want to make a copy of a large array (remember, think PDP-11 here) on the stack to pass it to a function.

Brian Neal 2009-03-22 16:42:45

Answer 6

+4 A:

I think that there are 3 main reasons why arrays are passed as pointers in C instead of by value. The first 2 are mentioned in other answers:

efficiency
because there's no size information for arrays in general (if you include dynamically allocated arrays)

However, I think a third reason is due to:

the evolution of C from earlier languages like B and BCPL, where arrays were actually implemented as a pointer to the array data

Dennis Ritchie talks about the early evolution of C from languages like BCPL and B and in particular how arrays are implemented and how they were influenced by BCPL and B arrays and how and why they are different (while remaining very similar in expressions because array names decay into pointers in expressions).

http://plan9.bell-labs.com/cm/cs/who/dmr/chist.html

Michael Burr 2009-03-22 17:41:55
