I'm fairly new to Java (been writing other stuff for many years) and unless I'm missing something (and I'm happy to be wrong here) the following is a fatal flaw...

String foo = new String();
thisDoesntWork(foo);
System.out.println(foo);//this prints nothing

public static void thisDoesntWork(String foo){
   foo = "howdy";
}

Now, I'm well aware of the (fairly poorly worded) concept that in Java everything is passed by "value" and not "reference", but String is an object and has all sorts of bells and whistles, so one would expect that, unlike with an int, a user would be able to operate on the thing that's passed into the method (and not be stuck with the value set by the overloaded =).

Can someone explain to me what the reasoning behind this design choice was? As I said, I'm not looking to be right here, and perhaps I'm missing something obvious?

+6  A: 

When you pass "foo", you're passing the reference to "foo" as a value to thisDoesntWork(). That means that when you assign to "foo" inside your method, you are merely making a local variable (foo) refer to your new string; the caller's variable is untouched.

Another thing to keep in mind when thinking about how strings behave in Java is that strings are immutable. It works the same way in C#, and for some good reasons:

  • Security: Nobody can jam data into your string and cause a buffer overflow error if nobody can modify it!
  • Speed: If you can be sure that your strings are immutable, you know each one's size never changes and you never have to move the data structure in memory when you manipulate it. You (the language designer) also don't have to worry about implementing String as a slow linked list. This cuts both ways, though: appending strings with the + operator in a loop can be expensive memory-wise, and you will have to use a StringBuilder object to do this in a high-performance, memory-efficient way.
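
To make the StringBuilder point concrete, here is a minimal sketch. The class and method names (AppendDemo, appendGreeting, joinWithBuilder) are made up for illustration and are not from the original post. It shows that a method can mutate a StringBuilder it receives and the caller sees the change, and that repeated appends go through one buffer.

```java
// Hypothetical demo class; the names are illustrative only.
class AppendDemo {

    // Mutates the object the parameter refers to; the caller sees the change.
    static void appendGreeting(StringBuilder sb) {
        sb.append("howdy");
    }

    // Builds one result through a single growable buffer.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder foo = new StringBuilder();
        appendGreeting(foo);
        System.out.println(foo); // prints howdy

        System.out.println(joinWithBuilder(new String[] {"a", "b", "c"})); // prints abc
    }
}
```

Note that reassigning the parameter inside appendGreeting (sb = new StringBuilder()) would still be invisible to the caller, exactly as with String.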

Now onto your bigger question. Why are objects passed this way? Well, if Java passed your string as what you'd traditionally call "by value", it would have to actually copy the entire string before passing it to your function. That's quite slow. If it passed the string by reference and let you change it (like C does), you'd have the problems I just listed.

Dave Markle
It really doesn't have to do with the fact that strings are immutable. If he passed a List, the same thing would happen, and lists are mutable.
Laplie
The problem at hand has nothing to do with immutability of String. The behaviour would be exactly the same with any other object. I'm not sure immutability of String has much to do with security; I think it's much more a performance issue.
Axelle Ziegler
Edited accordingly.
Dave Markle
Actually the behaviour would be exactly the same were arguments passed by value in Java. Or by any means, actually. The behaviour he shows only has to do with parameter binding.
Axelle Ziegler
I think the layman's concept of "by value" here is actually to copy the string's contents, that's why the quotation marks.
Dave Markle
Even if it did, the behaviour would still be the same, wouldn't it? It merely has to do with the fact that, whatever its value, a function parameter is a local variable (well, actually, a function parameter is not really a variable at all). Or am I missing something?
Axelle Ziegler
@Axelle: You're missing the meaning of "pass by reference". Java *does* pass arguments by value. If it had pass-by-reference, the effect would be very different. You need to understand what pass-by-reference really means. It's not the same as "pass a reference by value".
Jon Skeet
The comment about a List is confusing. You could modify the *contents* of a List parameter (by calling e.g. add), just not the List itself. If String had a "setValue" method, you could call it. Mutable Strings are actually called StringBuilder, so if you must modify a parameter, use one of those
Bill Michell
Let's all just imagine Java references as C pointers. Everything is so much easier then: to modify the parameter, you just do what that guy did (foo = "howdy"). To modify the object pointed to (if any), you use stuff like "add" and so on, if the object is not immutable.
Johannes Schaub - litb
@Axelle: Function parameters are local variables that are guaranteed to be definitely assigned to when the method executes. So they really are variables in all respects. You contradict yourself in one and the same sentence there (it is a variable - but actually it's not)
eljenso
Well I stand by my binding comment. For the rest, I guess that's what you get for trying to say things at 6 in the morning :/ Sorry.
Axelle Ziegler
A: 

In Java strings are immutable which means you can't change the value in the String object.

String temp = new String();
temp = "hi!";

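The code above actually involves two String objects: the empty one created by new String() and the "hi!" literal. After the assignment nothing refers to the first one any more, so it becomes eligible for garbage collection.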
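
A small sketch of the same idea, keeping a second reference to the first object so it can still be inspected after the reassignment. The class and method names (ReassignDemo, reassign) are hypothetical.

```java
// Hypothetical demo class; the names are illustrative only.
class ReassignDemo {

    // Returns both references so the caller can inspect the two objects.
    static String[] reassign() {
        String temp = new String(); // first object: an empty string
        String first = temp;        // a second reference to that same object
        temp = "hi!";               // temp now refers to a different object
        return new String[] { first, temp };
    }

    public static void main(String[] args) {
        String[] both = reassign();
        System.out.println(both[0].isEmpty());  // prints true: the first object is unchanged
        System.out.println(both[1]);            // prints hi!
        System.out.println(both[0] == both[1]); // prints false: two distinct objects
    }
}
```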

http://www.jchq.net/certkey/0802certkey.htm

Mr. Pig
A: 

Go do the really big tutorial on Sun's website.

You seem not to understand the different scopes variables can have. "foo" is local to your method. Nothing outside of that method can change what "foo" points to. The "foo" being referred to outside your method is a completely different field - it's a static field on your enclosing class.

Scoping is especially important as you don't want everything to be visible to everything else in your system.

mP
no, I understand scoping intimately. I don't understand the value of making it impossible to alter the value of a non primitive object passed into a method. I guess, ultimately, I'm just longing for the option to do either, as is the case in C, and asking why this was removed in Java.
Dr.Dredel
And look at the mess that is C: no encapsulation in any form. Any line of code anywhere in the system can magically change values and trash your program, if you're lucky.
mP
Dr Dredel. Your question was badly worded. It appeared from your example that you also did not understand scopes. If you did, you would appreciate that references are also scoped. When passing object references around, you are getting a copy of the reference pointer. Objects can have many pointers to them.
mP
I don't seriously have to defend C, here, do I?
Dr.Dredel
Passing by reference is equivalent to passing pointers - with a few restrictions. Passing by value simplifies this problem and leads to better encapsulation. Start to think of the problems C has, and then think about why Sun simplified certain aspects when designing Java.
mP
+3  A: 

This rant explains it better than I could ever even try to.

Ric Tokyo
Glad you liked my rant ;)
Scott Stanchfield
A: 

Are you sure it prints null? I think it will just be blank, since you provided an empty String when you initialized the foo variable.

The assignment to foo in the thisDoesntWork method does not change the reference held by the foo variable defined in the class, so the foo in System.out.println(foo) will still point to the old empty String object.

Bhushan
It's not initialized with an empty string, AFAICS.
Adeel Ansari
yea, sorry, typo.. I'll correct
Dr.Dredel
Vinegar I meant empty String object
Bhushan
A: 

Reference typed arguments are passed as references to objects themselves (not references to other variables that refer to objects). You can call methods on the object that has been passed. However, in your code sample:

public static void thisDoesntWork(String foo){
    foo = "howdy";
}

you are only storing a reference to the string "howdy" in a variable that is local to the method. That local variable (foo) was initialized to the value of the caller's foo when the method was called, but has no reference to the caller's variable itself. After initialization:

caller     data     method
------    ------    ------
(foo) -->   ""   <-- (foo)

After the assignment in your method:

caller     data     method
------    ------    ------
(foo) -->   ""
          "howdy" <-- (foo)

You have another issue there: String instances are immutable (by design, for security), so you can't modify a String's value.

If you really want your method to provide an initial value for your string (or at any time in its life, for that matter), then have your method return a String value which you assign to the caller's variable at the point of the call. Something like this, for example:

String foo = thisWorks();
System.out.println(foo);//this prints the value assigned to foo in initialization 

public static String thisWorks(){
    return "howdy";
}
joel.neely
No, reference types *aren't* passed by reference. The value of the argument (which is a reference) is passed by value. This is *not* the same as pass by reference semantics.
Jon Skeet
Java uses copy semantics, and it will pass a copy of the reference to a method. Hence, references are passed by value.
eljenso
+2  A: 

The problem is you are instantiating a Java reference type. Then you pass that reference type to a static method, and reassign it to a locally scoped variable.

It has nothing to do with immutability. Exactly the same thing would have happened for a mutable reference type.

Julien Chastang
+1  A: 

In Java all variables passed are actually passed around by value - even objects. All variables passed to a method are actually copies of the original value. In the case of your string example, the original pointer (it's actually a reference, but to avoid confusion I'll use a different word) is copied into a new variable, which becomes the parameter to the method.

It would be a pain if everything was by reference. One would need to make private copies all over the place which would definitely be a real pain. Everybody knows that using immutability for value types etc makes your programs infinitely simpler and more scalable.

Some benefits include:

  • No need to make defensive copies.
  • Thread safety: no need to worry about locking just in case someone else wants to change the object.
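
A rough sketch of the defensive-copy point, assuming a made-up Roster class (not from the original post): the immutable String field can be handed out as-is, while the mutable List has to be copied on the way in and on the way out to keep callers from changing the object's state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class; the names are illustrative only.
class Roster {
    private final String owner;                           // immutable, safe to share
    private final List<String> names = new ArrayList<>(); // mutable, must be protected

    Roster(String owner, List<String> initialNames) {
        this.owner = owner;              // no copy needed
        this.names.addAll(initialNames); // defensive copy on the way in
    }

    String getOwner() {
        return owner;                    // callers cannot alter the String
    }

    List<String> getNames() {
        return new ArrayList<>(names);   // defensive copy on the way out
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("alice");
        Roster roster = new Roster("bob", input);

        input.add("mallory");            // does not affect the roster
        roster.getNames().add("eve");    // modifies only the returned copy

        System.out.println(roster.getNames()); // prints [alice]
        System.out.println(roster.getOwner()); // prints bob
    }
}
```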

mP
Everything is by reference in Java.
Axelle Ziegler
WRONG WRONG WRONG. Primitive values are not by reference. You cannot pass a reference to an "int" living in one method (aka a local variable) to another method and modify the first local variable.
mP
Well, every object is passed by reference. Passing value types by reference doesn't make sense anyway.
Axelle Ziegler
@Axelle, no, objects *aren't* passed by reference. Passing value types by reference *does* make sense. Please use a language which allows real pass-by-reference semantics (e.g. C# using "ref" parameters). Java is *strictly* pass-by-value.
Jon Skeet
+3  A: 

Your question as asked doesn't really have to do with passing by value, passing by reference, or the fact that strings are immutable (as others have stated).

Inside the method, you actually create a local variable (I'll call that one "localFoo") that points to the same reference as your original variable ("originalFoo").

When you assign "howdy" to localFoo, you don't change where originalFoo is pointing.

If you did something like:

String a = "";
String b = a;
b = "howdy";

Would you expect:

System.out.print(a)

to print out "howdy" ? It prints out "".

You can't change what originalFoo points to by changing what localFoo points to. You can modify the object that both point to (if it wasn't immutable). For example,

List foo = new ArrayList();
System.out.println(foo.size());//this prints 0

thisDoesntWork(foo);
System.out.println(foo.size());//this prints 1

public static void thisDoesntWork(List foo){   
    foo.add(new Object());
}
Laplie
It has everything to do with pass by value vs pass by reference. With *real* pass-by-reference parameters (e.g. "ref" in C#) you could pass the string variable by reference and it would be changed.
Jon Skeet
A: 

Dave, you have to forgive me (well, I guess you don't "have to", but I'd rather you did) but that explanation is not overly convincing. The Security gains are fairly minimal since anyone who needs to change the value of the string will find a way to do it with some ugly workaround. And speed?! You yourself (quite correctly) assert that the whole business with the + is extremely expensive.

The rest of you guys, please understand that I GET how it works, I'm asking WHY it works that way... please stop explaining the difference between the methodologies.

(and I honestly am not looking for any sort of fight here, btw, I just don't see how this was a rational decision).

Dr.Dredel
Well, you are obviously missing something. The example you show **has nothing to do with name-passing** it's only a question of how parameters are bound inside a method (and it's actually helpful to understand the difference between a parameter and a variable).
Axelle Ziegler
@Dr.Dredel: It's for simplicity, both in the language and when reading code. When I call a method and pass in a string reference, I know that the method won't change it. While evil code could (in some cases) get round it, the more common case is non-evil code - where you just want to be able (cont)
Jon Skeet
to reason about the code easily. When you pass references by value, it's easier to understand what the code will do. In C#, you *can* pass parameters by reference - but it's still rare, because it makes the code harder to understand.
Jon Skeet
A: 

@Axelle

Mate do you really know the difference between passing by value and by reference ?

In Java even references are passed by value. When you pass a reference to an object, you are getting a copy of the reference pointer in the second variable. That's why the second variable can be changed without affecting the first.

mP
Yes. Yes I do. I'm just confounded by the (seemingly) retarded way in which Java implements String. It's an object, I SHOULD be able to modify one of its members (the one that's holding the chars) to whatever I want, wherever it goes. You pass in a list you can add() to it, right?
Dr.Dredel
Making String immutable was a *great* decision on the part of the Java designers (which is why that decision was also used in .NET). Otherwise we'd be defensively copying all over the place... <shudder>
Jon Skeet
+1  A: 

If we would make a rough C and assembler analogy:

void Main()
{ 
  // stack memory address of message is 0x8001.  memory address of Hello is 0x0001.  
  string message = "Hello"; 
  // assembly equivalent of: message = "Hello";
  // [0x8001] = 0x0001

  // message's stack memory address
  printf("%d", &message); // 0x8001

  printf("%d", message); // memory pointed to of message(0x8001): 0x0001
  PassStringByValue(message); // pass the pointer pointed to of message.  0x0001, not 0x8001
  printf("%d", message); // memory pointed to of message(0x8001): 0x0001.  still the same

  // message's stack memory address doesn't change
  printf("%d", &message); // 0x8001
}

void PassStringByValue(string foo)
{
 printf("%d", &foo); // &foo contains foo's *stack* address (0x4001)

 // foo(0x4001) contains the memory pointed to of message, 0x0001
 printf("%d", foo);  // 0x0001
 // World is in memory address 0x0002
 foo = "World";  // on foo's memory address (0x4001), change the memory it pointed to, 0x0002
 // assembly equivalent of: foo = "World":
 // [0x4001] = 0x0002

 // print the new memory pointed by foo
 printf("%d", foo); // 0x0002

 // Conclusion: Not in any way 0x8001 was involved in this function.  Hence you cannot change the Main's message value.
 // foo = "World"  is same as [0x4001] = 0x0002

}


void Main()
{
  // stack memory address of message is 0x8001.  memory address of Hello is 0x0001.  
  string message = "Hello"; 
  // assembly equivalent of: message = "Hello";
  // [0x8001] = 0x0001

  // message's stack memory address
  printf("%d", &message); // 0x8001

  printf("%d", message); // memory pointed to of message(0x8001): 0x0001
  PassStringByRef(ref message); // pass the stack memory address of message.  0x8001, not 0x0001
  printf("%d", message); // memory pointed to of message(0x8001): 0x0002. was changed

  // message's stack memory address doesn't change
  printf("%d", &message); // 0x8001
}


void PassStringByRef(ref string foo)
{
 printf("%d", &foo); // &foo contains foo's *stack* address (0x4001)

 // foo(0x4001) contains the address of message(0x8001)
 printf("%d", foo);  // 0x8001
 // World is in memory address 0x0002
 foo = "World"; // on message's memory address (0x8001), change the memory it pointed to, 0x0002
 // assembly equivalent of: foo = "World":
 // [0x8001] = 0x0002;


 // print the new memory pointed to of message
 printf("%d", foo); // 0x0002

 // Conclusion: 0x8001 was involved in this function.  Hence you can change the Main's message value.
 // foo = "World"  is same as [0x8001] = 0x0002

}

One possible reason why everything is passed by value in Java is that its designers wanted to simplify the language and have everything done in an OOP manner.

They would rather have you design an integer swapper using objects than provide first-class support for by-reference passing. The same goes for delegates (Gosling feels icky about pointers to functions; he would rather cram that functionality into objects) and enums.

They oversimplified the language (everything is an object) to the detriment of first-class support for many language constructs; passing by reference, delegates, enums, and properties come to mind.
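
For what it's worth, an "integer swapper using objects" might look roughly like this. IntHolder is a made-up wrapper class for illustration, not something from the standard library.

```java
// Hypothetical demo; IntHolder is an illustrative wrapper, not a standard class.
class SwapDemo {

    static final class IntHolder {
        int value;

        IntHolder(int value) {
            this.value = value;
        }
    }

    // Only swaps the method's own copies; the caller's variables are unaffected.
    static void swapCopies(int a, int b) {
        int tmp = a;
        a = b;
        b = tmp;
    }

    // Swaps the contents of the two holder objects, which the caller shares.
    static void swap(IntHolder a, IntHolder b) {
        int tmp = a.value;
        a.value = b.value;
        b.value = tmp;
    }

    public static void main(String[] args) {
        int x = 1, y = 2;
        swapCopies(x, y);
        System.out.println(x + " " + y); // prints 1 2

        IntHolder hx = new IntHolder(1);
        IntHolder hy = new IntHolder(2);
        swap(hx, hy);
        System.out.println(hx.value + " " + hy.value); // prints 2 1
    }
}
```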

Michael Buen
+1  A: 

It is because it creates a local variable inside the method. An easy way that would work is to keep the string in a static field and assign to that field instead of to the parameter:

static String foo = new String(); //a static field on the enclosing class

thisWorks("howdy");
System.out.println(foo); //this prints howdy

public static void thisWorks(String value) {
   foo = value; //this assigns to the static field, not to a local variable
}
A: 

If you think of an object as just the fields in the object, then objects are passed by reference in Java, because a method can modify the fields of a parameter and a caller can observe the modification. However, if you also think of an object as its identity, then objects are passed by value, because a method can't change the identity of a parameter in a way that the caller can observe. So I would say Java is pass-by-value.
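
A minimal sketch of the two cases, using a made-up Box class (the names are assumptions for illustration): relabel changes a field of the shared object, which the caller can observe, while replace only points the method's own parameter at a new object.

```java
// Hypothetical demo class; the names are illustrative only.
class IdentityDemo {

    static final class Box {
        String label;

        Box(String label) {
            this.label = label;
        }
    }

    // Modifies a field of the object both caller and method refer to.
    static void relabel(Box box) {
        box.label = "howdy";
    }

    // Rebinds only the local parameter; the caller's reference is untouched.
    static void replace(Box box) {
        box = new Box("howdy");
    }

    public static void main(String[] args) {
        Box first = new Box("");
        replace(first);
        System.out.println(first.label.isEmpty()); // prints true

        Box second = new Box("");
        relabel(second);
        System.out.println(second.label); // prints howdy
    }
}
```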

+2  A: 

Since my original answer was "Why it happened" and not "Why was the language designed so it happened," I'll give this another go.

To simplify things, I'll get rid of the method call and show what is happening in another way.

String a = "hello";
String b = a;
b = "howdy";

System.out.print(a); //prints hello

To get the last statement to print "howdy", b has to point to the same "hole" in memory that a points to (a pointer). This is what you want when you want pass by reference. There are a couple of reasons Java decided not to go this direction:

  • Pointers are Confusing: The designers of Java tried to remove some of the more confusing things about other languages. Pointers are among the most misunderstood and improperly used constructs of C/C++, along with operator overloading.

  • Pointers are Security Risks: Pointers cause many security problems when misused. A malicious program assigns something to that part of memory, and then what you thought was your object is actually someone else's. (Java already got rid of the biggest security problem, buffer overflows, with checked arrays.)

  • Abstraction Leakage: When you start dealing with exactly what's in memory and where, your abstraction becomes less of an abstraction. While abstraction leakage almost certainly creeps into any language, the designers didn't want to bake it in directly.

  • Objects Are All You Care About: In Java, everything is an object, not the space an object occupies. Adding pointers would make the space an object occupies important, though.

You could emulate what you want by creating a "Hole" object. You could even use generics to make it type safe. For example:

public class Hole<T> {
   private T objectInHole;

   public void putInHole(T object) {
      this.objectInHole = object;
   }
   public T getOutOfHole() {
      return objectInHole;
   }

   @Override
   public String toString() {
      return String.valueOf(objectInHole); // null-safe: an empty hole prints "null"
   }
   // ... equals, hashCode, etc.
}


Hole<String> foo = new Hole<String>();
foo.putInHole(new String());
System.out.println(foo); //this prints nothing
thisWorks(foo);
System.out.println(foo);//this prints howdy

public static void thisWorks(Hole<String> foo){
   foo.putInHole("howdy");
}
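
As a side note, the standard library's java.util.concurrent.atomic.AtomicReference can play the same role as a hand-written holder, if you don't mind that it was designed for thread-safe updates rather than for this purpose. A minimal sketch, where HolderDemo is a made-up class name:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; only AtomicReference comes from the standard library.
class HolderDemo {

    // Replaces the value inside the holder; the caller shares the same holder object.
    static void thisWorks(AtomicReference<String> foo) {
        foo.set("howdy");
    }

    public static void main(String[] args) {
        AtomicReference<String> foo = new AtomicReference<>("");
        System.out.println(foo.get()); // prints an empty line
        thisWorks(foo);
        System.out.println(foo.get()); // prints howdy
    }
}
```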
Laplie
Good answer, only last part with the code example is maybe a bit too contrived, and distracts from the important stuff you wrote above.
eljenso
"Uncontrolled" pointers can be confusing and risky. Not all languages use pointers like C/C++ and let you randomly point to odd places or objects. Java simply has well-controlled pointers.
Scott Stanchfield
A: 

This is because inside "thisDoesntWork", you are effectively overwriting the local value of foo. If you want to pass by reference in this way, you can always encapsulate the String inside another object, say in an array.

class Test {

    public static void main(String[] args) {
        String [] fooArray = new String[1];
        fooArray[0] = new String("foo");

        System.out.println("main: " + fooArray[0]);
        thisWorks(fooArray);
        System.out.println("main: " + fooArray[0]);
    }

    public static void thisWorks(String [] foo){
        System.out.println("thisWorks: " + foo[0]);
        foo[0] = "howdy";
        System.out.println("thisWorks: " + foo[0]);
    }
}

Results in the following output:

main: foo
thisWorks: foo
thisWorks: howdy
main: howdy
Nate