Hi, please see the following code.

String s = "Monday";                          // line 1
if (s.substring(0, 3).equals("Mon")) {}       // line 2

String s2 = new String(s.substring(0, 3));    // line 4
String s3 = s.substring(0, 3);                // line 5

I know that line 2 will still point to "Monday" and have a new String object with the offset and count set to 0,3.

The line 4 will create a new String "Mon" in string pool and point to it.

But I am not sure about line 5: will it behave like line 2 or like line 4?

If I am wrong about line 2 or line 4 as well, please correct me. Thanks.

A: 

Read this: http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html

"Returns a new string..."

Tobiask
I believe the question is really about when strings share an underlying char array and when they don't.
Jon Skeet
+4  A: 

As pointed out by Pete Kirkham, this is implementation specific. My answer is only correct for the current Sun JRE.

You're right about a normal substring call just creating a new string referring to the same character array as the original string. That's what happens on line 5 too. The fact that the new string object reference happens to be assigned to a variable doesn't change the behaviour of the method.

Just to be clear, you say that in line 2 the new string will still point to "Monday" - the char array reference inside the string will be to the same char array as one used for "Monday". But "Monday" is a string in itself, not a char array. In other words, by the time line 2 has finished (and ignoring GC) there are two string objects, both referring to the same char array. One has a count of 6 and the other has a count of 3; both have an offset of 0.

You're wrong about line 4 using a "string pool" though: there's no pooling going on there. However, it is different from the other lines. When you call the String(String) constructor, the new string takes a copy of the character data of the original, so it's completely separate. This can be very useful if you only need a string which contains a small part of a very large original string; it allows the original large char array to be garbage collected (assuming nothing else needs it) while you hold onto the copy of the small portion. A good example of this in my own experience is reading lines from a file. By default, BufferedReader will read lines using an 80-character buffer, so every string returned will use a char array of at least 80 characters. If you're reading lots of very short lines (single words), the difference in memory consumption just through the use of the odd-looking

line = new String(line);

can be very significant.
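As a minimal sketch of that idiom (the class name CompactLines and the helper readCompact are made up for illustration, and the memory benefit only applies on a JRE whose strings can hold on to an oversized buffer as described above):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CompactLines {
    // Reads every line of the given text and stores a copy of each one.
    // The text is passed as a String only to keep the sketch self-contained;
    // real code would wrap a FileReader instead of a StringReader.
    static List<String> readCompact(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // The copy is equal in content to the line that was read.
                // Whether it uses less memory depends on the JRE.
                lines.add(new String(line));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readCompact("a\nbb\nccc"));
    }
}
```

Whether the copy actually saves memory is worth measuring on the JRE in question rather than assuming.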

Does that help?

Jon Skeet
Downvoters: please add comments, or the downvote isn't particularly useful...
Jon Skeet
See my post. This is implementation defined and varies.
Pete Kirkham
Good point - will edit to make this clear.
Jon Skeet
+1  A: 

At line 5, s3 is "Mon".

Warrior
+4  A: 

I know that line 2 will still point to "Monday" and have a new String object with the offset and count set to 0,3.

That is currently true of the Sun JRE implementation. I seem to recall that was not true of the Sun implementation in the past, and is not true of other implementations of the JVM. Do not rely on behaviour which is not specified. GNU Classpath might copy the array (I can't remember offhand what ratio it uses to decide when to copy, but it does copy if the copy is a small enough fraction of the original, which turned one nice O(N) algorithm into O(N^2)).

The line 4 will create a new String "Mon" in string pool and point to it.

No, it creates a new string object in the heap, subject to the same garbage collection rules as any other object. Whether or not it shares the same underlying character array is implementation dependent. Do not rely on behaviour which is not specified.

The String(String) constructor says:

Initializes a newly created String object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string.

The String(char[]) constructor says:

Allocates a new String so that it represents the sequence of characters currently contained in the character array argument. The contents of the character array are copied; subsequent modification of the character array does not affect the newly created string.
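The copying guarantee quoted above can be checked directly. This is a small sketch (the class name CharArrayCopyDemo is made up for illustration) that mutates the source array after constructing the string:

```java
class CharArrayCopyDemo {
    // Builds a String from a char array, then changes the array.
    // Because the constructor copies the contents, the String is unaffected.
    static String buildThenMutate() {
        char[] chars = {'M', 'o', 'n'};
        String s = new String(chars);
        chars[0] = 'S'; // modify the source array after construction
        return s;       // still "Mon"
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutate()); // prints Mon
    }
}
```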

Following good OO principles, no method of String actually requires that it is implemented using a character array, so no part of the specification of String requires operations on a character array. Those operations which take an array as input specify that the contents of the array are copied to whatever internal storage is used in the String. A string could use UTF-8 or LZ compression internally and conform to the API.

However, if your JVM doesn't make the small-ratio sub-string optimisation, then there's a chance that it does copy only the relevant portion when you use new String(String), so it's a case of trying it and seeing if it improves the memory use. Not everything which affects Java runtimes is defined by Java.

To obtain a string in the string pool which is equal to a string, use the intern() method. This will either retrieve a string from the pool if one with the value already has been interned, or create a new string and put it in the pool. Note that pooled strings have different (again implementation dependent) garbage collection behaviour.
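A short sketch of the difference between a fresh heap object and a pooled one (the class and method names are made up for illustration). It relies only on two specified facts: string literals are interned, and new always produces a distinct object:

```java
class InternDemo {
    // A string created with new is a separate heap object,
    // so it is never the same reference as the pooled literal.
    static boolean freshCopyIsPooled() {
        String fresh = new String("Monday".substring(0, 3));
        return fresh == "Mon"; // false: equal content, different object
    }

    // intern() returns the canonical pooled instance for that content.
    static boolean internedIsPooled() {
        String fresh = new String("Monday".substring(0, 3));
        return fresh.intern() == "Mon"; // true: both refer to the pooled "Mon"
    }

    public static void main(String[] args) {
        System.out.println(freshCopyIsPooled()); // prints false
        System.out.println(internedIsPooled());  // prints true
    }
}
```

In ordinary code, compare strings with equals(); the reference comparisons here exist only to show which object is which.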

Pete Kirkham
Thanks, Pete, for the nice explanation. "the newly created string is a copy of the argument string." So where is the argument string stored?
harshit
Like any object in Java, it's stored in the heap until it's no longer referenced, and is then garbage collected sometime. All implementations I've seen use a char array to store the characters in the string, and some have different rules for sharing this array between Strings...
Pete Kirkham
... but I can't recall one where String(String) doesn't end up with the array being the same length as the new String. So the array used internally by the argument may be copied or may be referenced, but that's up to the implementation.
Pete Kirkham
A: 

In Sun's implementation, String objects have a private final char value[] field. When you create a new String by calling substring(), no new char array is created; the new instance uses the value of the original object. This is the case in lines 2 and 5: the new String objects will use the char array of s.

The constructor String(String) creates a new char array when the string length is less than the total length of the char array value. So the String created in line 4 will use a new char array.

You should have a look at the source code of the constructor public String(String original), it's really simple.
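To make the sharing and trimming behaviour concrete, here is a toy model. It is not Sun's source code: ToyString and all of its members are hypothetical, and it only imitates the value/offset/count layout described in this thread.

```java
import java.util.Arrays;

// Hypothetical model of a string that can share its backing array.
class ToyString {
    private final char[] value; // backing array, possibly shared
    private final int offset;   // index of the first character used
    private final int count;    // number of characters used

    ToyString(char[] source) {
        this.value = source.clone(); // copy, as String(char[]) is specified to do
        this.offset = 0;
        this.count = this.value.length;
    }

    // Shares the given array instead of copying it.
    private ToyString(char[] shared, int offset, int count) {
        this.value = shared;
        this.offset = offset;
        this.count = count;
    }

    // Copy constructor: trims the array when the original uses only part of it.
    ToyString(ToyString original) {
        if (original.value.length > original.count) {
            this.value = Arrays.copyOfRange(original.value, original.offset,
                                            original.offset + original.count);
        } else {
            this.value = original.value; // already exact size; safe because immutable
        }
        this.offset = 0;
        this.count = original.count;
    }

    // Returns a view onto the same array with a new offset and count.
    ToyString substring(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        return new ToyString(value, offset + begin, end - begin);
    }

    boolean sharesArrayWith(ToyString other) { return value == other.value; }

    int backingLength() { return value.length; }

    @Override
    public String toString() { return new String(value, offset, count); }

    public static void main(String[] args) {
        ToyString monday = new ToyString("Monday".toCharArray());
        ToyString sub = monday.substring(0, 3); // like lines 2 and 5
        ToyString copy = new ToyString(sub);    // like line 4
        System.out.println(sub.sharesArrayWith(monday));  // prints true
        System.out.println(copy.sharesArrayWith(monday)); // prints false
        System.out.println(copy.backingLength());         // prints 3
    }
}
```

In this model the substring keeps the six-character array alive, while the copy holds only the three characters it needs.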

Chei