Suppose we have an array say:

int arr[1000];

and I have a function that works on that array say:

void Func(void);

Why would there ever be a need to pass by reference (by changing the void), when I can have arr[1000] as an external variable outside main()?

  1. What is the difference? Is there any difference?
  2. Why do people prefer passing by reference rather than making it external? (I myself think that making it external is easier).
A: 
  1. There is a difference in scope. If you declare "int arr[1000]" in your main() for instance, you cannot access it in your function "another_function()". You would have to explicitly pass it by reference to every other function in which you want to use it. If it were external, it would be accessible in every function.

  2. See (1.)

I meant that my int arr[1000] is an external variable.
fahad
+1  A: 

It's largely a matter of scope. If you make all your variables external/global in scope, how confusing is that going to get?

Not only that, but you'll have a large number of variables that simply do not need to exist at any given time. Passing function arguments around instead of having lots of global variables lets you more easily get rid of things you no longer need.

Andrew Barber
Plus it adds the pain of having to pass it to each and every function, and it wastes some memory to accept it in the function.
fahad
The 'pain' of having to type in a variable name in a function call is much, much less than the alternative.
Andrew Barber
+5  A: 

I think you're asking if global variables are bad. Quoting an excellent answer:

The problem with global variables is that since every function has access to these, it becomes increasingly hard to figure out which functions actually read and write these variables.

To understand how the application works, you pretty much have to take into account every function which modifies the global state. That can be done, but as the application grows it will get harder to the point of being virtually impossible (or at least a complete waste of time).

If you don't rely on global variables, you can pass state around between different functions as needed. That way you stand a much better chance of understanding what each function does, as you don't need to take the global state into account.

Jacob
@jamietre: It's tagged `c`
Jacob
uh. oops.......
jamietre
+3  A: 

If arr is external then anyone can modify it, not just Func. This is Officially Bad.

Passing arguments ensures that you know what data you are changing and who is changing it.
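As a minimal sketch of that point (the names `sum_values` and `scale_values` are hypothetical, not from the question): when the array arrives as a parameter, the signature alone shows which data the function touches, and a `const` qualifier can document that a function only reads it.

```c
#include <stddef.h>

/* Read-only access: the const qualifier promises that this
   function will not modify the caller's array. */
long sum_values(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Read-write access: the non-const pointer signals that the
   caller's array may be changed. */
void scale_values(int *values, size_t count, int factor)
{
    for (size_t i = 0; i < count; i++) {
        values[i] *= factor;
    }
}
```

With a global `arr`, neither of those facts would be visible at the call site; you would have to read each function body to find out.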

EDIT: Where Officially Bad means "Usually bad, but not always. Generally don't do it unless you have a good reason." Just like all the other "rules" of software development :)

Cameron Skinner
+1 for the Capitalized Slogan For the Improvement of Programmers Worldwide. No, seriously, globals are yucky (in most, but not all cases).
rubenvb
I would extend that label (yucky) to all cases. Sometimes bad APIs make it difficult or impossible to solve a problem without globals, but that doesn't mitigate the yuck factor (global state, components interfering with one another, ...).
R..
True: even if your global is something fundamental to the hardware (an address with special meaning, like I/O or graphics), that doesn't make it *pleasant*. Possibly only *modifiable* globals are always yucky, though. Unmodifiable globals might be alright. The only thing that saves a C function from being a global is that functions aren't actually objects. In languages where functions are objects, you have to be a fairly hardcore dependency-injector before you go to the extent of using only anonymous functions...
Steve Jessop
@Steve: indeed, I was talking about modifiable objects. However, namespace pollution is also a form of 'global state' to consider. This is pretty irrelevant for `static const` objects, but also relevant to the example you brought up, functions.
R..
A: 

It's a maintenance issue too. Why would I want to have to track down some external somewhere when I can just look at the function and see what it is supposed to be?

Derek
+9  A: 

If you use a global variable arr, Func is limited to always being used with that one variable and nothing else. Here are some reasons why that might be bad:

  • arr is part of the "current document" you're working with, and you later decide you want your program to support having more than one document open.
  • You later decide (or someone using your code as a library decides) to use threads, and suddenly your program randomly crashes when two threads clobber each other's work in arr.
  • You later decide to make your code a library, and now it makes sense for the caller (in case there's more than one point at which the library gets used in a program) to provide the buffer; otherwise independent parts of the calling code would have to be aware of one another's implementations.

All of these problems go away as soon as you eliminate global variables and make your functions take pointers to the data they need to operate on.
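A rough sketch of the "more than one document" case, under the assumption that a document is just a fixed-size buffer plus a fill count. `struct document`, `DOC_CAPACITY`, and `document_append` are made-up names for illustration only.

```c
#include <stddef.h>

#define DOC_CAPACITY 1000  /* hypothetical size, mirroring arr[1000] */

/* All per-document state lives in one struct instead of in globals. */
struct document {
    int data[DOC_CAPACITY];
    size_t used;  /* number of slots currently filled */
};

/* Appends a value to the given document.
   Returns 0 on success, -1 if the document is full. */
int document_append(struct document *doc, int value)
{
    if (doc->used >= DOC_CAPACITY) {
        return -1;
    }
    doc->data[doc->used] = value;
    doc->used++;
    return 0;
}
```

Because the function only touches the document it is handed, a program can keep any number of `struct document` objects alive at once (one per open file, one per thread, one per library caller) without them clobbering each other.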

R..
This is the answer I was going to post :)
Zack
+2  A: 

By making the variable external to the function, the function is now tightly coupled to the module that defines the variable, and is thus harder to reuse in other programs. It also means that your function can only ever work on that one array, which limits the function's flexibility. Suppose one day your requirements change, and now you have to process multiple arrays with Func.

By passing the array as a parameter (along with the array size), the function becomes more easily decoupled from the module using it (meaning it can be more easily used by other programs/modules), and you can now use the function to process more than one array.
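To illustrate, here is a small side-by-side sketch. `Func_global` and `Func_param` are hypothetical names, and "add one to every element" is only a stand-in for whatever `Func` really does.

```c
#include <stddef.h>

int arr[1000];  /* file-scope array, as in the question */

/* Coupled version: can only ever work on arr, and only on 1000 elements. */
void Func_global(void)
{
    for (size_t i = 0; i < 1000; i++) {
        arr[i] += 1;
    }
}

/* Decoupled version: works on any int array of any length. */
void Func_param(int *values, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        values[i] += 1;
    }
}
```

`Func_param(arr, 1000)` does the same job as `Func_global()`, but `Func_param` can also be called on a three-element local array or on a buffer owned by a different module.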

From a general code maintenance standpoint, it's best that functions and their callers communicate through parameters and return values rather than rely on shared variables.

John Bode
+1  A: 

Passing by reference (rather than using a global variable) makes it more clear to someone reading the code that the function may change the values of the array.

Additionally, if you wanted to perform the action on more than one array, you could just use the same function over and over and pass a different array to it each time.

Another reason is that when writing multi-threaded code you usually want each thread to exclusively own as much of the data it works on as possible (sharing writable data is expensive and may result in race conditions if not done properly). By restricting global variable access, using local variables, and passing references, you can more easily write code that is more thread (and signal handler) friendly.

As an example, let's look at the simple puts function.

int puts(const char *s);

This function writes a C string to standard output, which can be useful. You might write some complicated code that outputs messages about what it is doing at different stages of execution using puts.

 int my_complicated_code( int x, int y, int z);

Now, imagine that you call the function several times in the program, but one of those times you actually don't want it to write to standard output, but to some other FILE *. If all of your calls to puts were actually fputs, which takes a FILE * that tells it which file to print to, this would be easy to accomplish by changing my_complicated_code to take a FILE * as well as its other arguments.

 int my_complicated_code(int x, int y, int z, FILE * out_file);

Now you can decide which file it will print to at the time when you call my_complicated_code by passing it a reference to any FILE * you have (that is open for writing).
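A possible body for that prototype might look like the sketch below. The messages and the `x + y + z` computation are placeholders I made up; only the signature comes from the example above.

```c
#include <stdio.h>

/* Reports progress to whichever stream the caller supplies.
   The arithmetic is only a stand-in for the real work. */
int my_complicated_code(int x, int y, int z, FILE *out_file)
{
    fputs("starting\n", out_file);
    int result = x + y + z;
    fputs("done\n", out_file);
    return result;
}
```

Calling `my_complicated_code(1, 2, 3, stdout)` behaves like the puts version, while passing `stderr` or a FILE * obtained from `fopen` sends the same messages somewhere else without touching the function.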

The same thing follows for arrays. The memcpy function would be much less useful if it only copied data to one particular location, or only from one particular location; it actually takes two references to arrays.

It is often easier to write unit tests for functions that take references too since they don't make assumptions about where the data they need is or what its name is. You don't have to keep updating an array with a certain name to mimic the input you want to test, just create a different array for each test and pass it to your function.
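For instance, a test for a hypothetical `max_value` function (not from the question; it assumes `count` is at least 1) can simply build a fresh array per case:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function under test: returns the largest element.
   Assumes count >= 1. */
int max_value(const int *values, size_t count)
{
    int best = values[0];
    for (size_t i = 1; i < count; i++) {
        if (values[i] > best) {
            best = values[i];
        }
    }
    return best;
}

/* Each case gets its own small array; no shared global needs
   to be reset between cases. */
void test_max_value(void)
{
    const int ascending[] = {1, 2, 3};
    const int negatives[] = {-5, -2, -9};
    const int single[] = {42};

    assert(max_value(ascending, 3) == 3);
    assert(max_value(negatives, 3) == -2);
    assert(max_value(single, 1) == 42);
}
```

If `max_value` read from a global array instead, every case would have to overwrite that one array first, and the cases could interfere with each other.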

In many simple programs it may seem like it is easier to write code using global variables like this, but as programs get bigger this is not the case.

nategoose
I agree with what you're saying, but note that C is strictly a pass-by-value language. You're not "passing-by-reference"; you're passing a pointer by value.
Scott Stanchfield
@Scott Stanchfield: The only semantic difference between passing by reference and passing by pointer value that I can think of is that when you pass by pointer value you are able to reassign that pointer (thereby losing the reference). Other possible differences that come to mind are themselves reliant on other features of the language (and may all be overcome in C++ references getting the pointer value that would have been passed in C). These include reinterpreting the pointer as another type (pointer to another type or integer). Reference in C++ is by pointer value with syntactic changes.
nategoose
+1  A: 

As an addition to all the other answers already giving good reasons: Every single decision in programming is a tradeoff between different advantages and disadvantages. Decades of programming experience by generations of programmers have shown that global state is a bad thing in most cases. There is even a programming paradigm built around the avoidance of it, taking it to the extreme of avoiding state at all:

http://en.wikipedia.org/wiki/Functional_programming

You may find it easier at the moment, but as your projects keep growing bigger and bigger, at some point you will find that you have implemented so many workarounds for the problems that came up in the meantime that you are unable to maintain your own code.

Secure