I'm looking for a clear, concise and accurate answer.

Ideally as the actual answer, although links to good explanations welcome.

+36  A: 

Boxing and unboxing are the processes of converting a primitive value into an object-oriented wrapper class (boxing) and converting a value from a wrapper class back to the primitive value (unboxing).

For example, in Java, you may need to convert an int value into an Integer (boxing) if you want to store it in a Collection, because a Collection can hold only objects, not primitives. When you take it back out of the Collection, you may want the value as an int rather than an Integer, so you unbox it.
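As a minimal sketch of that round trip (the class and method names here are made up for illustration), the first method spells out the conversions by hand and the second lets the compiler insert them:

```java
import java.util.ArrayList;
import java.util.List;

class CollectionBoxing {
    // Stores an int in a List by boxing it explicitly, then reads it back.
    static int roundTrip(int value) {
        List<Integer> numbers = new ArrayList<>();
        Integer boxed = Integer.valueOf(value);   // boxing: int -> Integer
        numbers.add(boxed);
        int unboxed = numbers.get(0).intValue();  // unboxing: Integer -> int
        return unboxed;
    }

    // Same round trip, with the compiler inserting both conversions (autoboxing).
    static int roundTripAuto(int value) {
        List<Integer> numbers = new ArrayList<>();
        numbers.add(value);     // autoboxed to Integer
        return numbers.get(0);  // auto-unboxed to int
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));
        System.out.println(roundTripAuto(42));
    }
}
```

Both methods should return the value they were given; the only difference is who writes the conversion calls.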

Boxing and unboxing are not inherently bad, but they are a tradeoff. Depending on the language implementation, they can be slower and more memory-intensive than just using primitives. However, they may also allow you to use higher-level data structures and achieve greater flexibility in your code.

These days, it is most commonly discussed in the context of Java's (and other languages') "autoboxing/autounboxing" feature. Here is a Java-centric explanation of autoboxing.

Justin Standard
+8  A: 

In .NET:

Often you can't rely on the type of the variable a function will consume, so you need to use a variable of the lowest-common-denominator type; in .NET this is object.

However, object is a class and stores its contents as a reference.

List<int> notBoxed = new List<int> { 1, 2, 3 };
int i = notBoxed[1]; // this is the actual value

List<object> boxed = new List<object> { 1, 2, 3 };
int j = (int) boxed[1]; // this is an object that can be 'unboxed' to an int

While both lists hold the same information, the second list is larger and slower. Each value in the second list is actually a reference to an object that holds the int.

This is called boxing because the int is wrapped by the object. When it's cast back, the int is unboxed: converted back to its value.

For value types (i.e. all structs) this is slow, and potentially uses a lot more space.

For reference types (i.e. all classes) this is far less of a problem, as they are stored as a reference anyway.


Thanks for the clarification @Justin Standard

Keith
+13  A: 

From C# 3.0 in a Nutshell:

Boxing is the act of casting a value type into a reference type:

int x = 9; 
object o = x; // boxing the int

unboxing is... the reverse:

object o = 9;
int x = (int)o; // unboxing o
Christian Hagelid
+2  A: 

The .NET FCL generic collections:

List<T>
Dictionary<TKey, TValue>
SortedDictionary<TKey, TValue>
Stack<T>
Queue<T>
LinkedList<T>

were all designed to overcome the performance issues of boxing and unboxing in previous collection implementations.

For more, see chapter 16, CLR via C# (2nd Edition).

Jonathan Webb
+21  A: 

Boxed values are data structures that are minimal wrappers around primitive types*. Boxed values are typically stored as pointers to objects on the heap.

Thus, boxed values use more memory and take at minimum two memory lookups to access: once to get the pointer, and another to follow that pointer to the primitive. Obviously this isn't the kind of thing you want in your inner loops. On the other hand, boxed values typically play better with other types in the system. Since they are first-class data structures in the language, they have the expected metadata and structure that other data structures have.
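To make the inner-loop concern concrete, here is a small Java sketch (the class and method names are illustrative only). The two methods compute the same sum, but the second accumulates into a Long wrapper, so every iteration has to unbox the running total, add to it, and box the result again:

```java
class BoxedLoop {
    // Accumulates into a primitive long: no wrapper objects are involved.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Accumulates into a Long: each += unboxes total, adds, and boxes the result.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total; // unboxed once more on return
    }

    public static void main(String[] args) {
        System.out.println(sumPrimitive(1000));
        System.out.println(sumBoxed(1000));
    }
}
```

How much slower the boxed version is depends on the JVM and on what its optimizer manages to eliminate, so measure rather than assume.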

In Java and Haskell, generic collections can't contain unboxed values. Generic collections in .NET can hold unboxed values with no penalties. Where Java's generics are only used for compile-time type checking, .NET will generate specific classes for each generic type instantiated at run time.
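The Java side of that claim can be checked with a short sketch (names are hypothetical). Two lists with different type parameters share a single runtime class, and what a List<Integer> hands back is a boxed Integer object:

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // True when two differently parameterized lists share one runtime class.
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        return ints.getClass() == strings.getClass();
    }

    // True when the element stored in a List<Integer> is a boxed Integer object.
    static boolean elementIsBoxed() {
        List<Integer> ints = new ArrayList<>();
        ints.add(7);                  // autoboxed on the way in
        Object element = ints.get(0); // read back without unboxing
        return element instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(elementIsBoxed());
    }
}
```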

Java and Haskell have unboxed arrays, but they're distinctly less convenient than the other collections. However, when peak performance is needed it's worth a little inconvenience to avoid the overhead of boxing and unboxing.
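For Java, the inconvenience looks roughly like this sketch (class and method names are made up for the example). An int[] holds its values unboxed, but it does not plug directly into the collections API: Arrays.asList sees the whole array as a single element, and getting a real List<Integer> means boxing every value:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class UnboxedArrays {
    // An int[] stores its values unboxed; summing needs no wrapper objects.
    static int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Arrays.asList treats an int[] as one element of type int[].
    static int asListSize(int[] values) {
        return Arrays.asList(values).size();
    }

    // Converting to List<Integer> boxes each element.
    static List<Integer> toBoxedList(int[] values) {
        return Arrays.stream(values).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3};
        System.out.println(sum(values));
        System.out.println(asListSize(values));
        System.out.println(toBoxedList(values));
    }
}
```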

* For this discussion, a primitive value is any that can be stored on the call stack, rather than stored as a pointer to a value on the heap. Frequently that's just the machine types (ints, floats, etc.), structs, and sometimes statically sized arrays. .NET-land calls them value types (as opposed to reference types). Java folks call them primitive types. Haskellions just call them unboxed.

** I'm also focusing on Java, Haskell, and C# in this answer, because that's what I know. For what it's worth, Python, Ruby, and JavaScript all have exclusively boxed values. This is also known as the "Everything is an object" approach.

Peter Burns
Small world, Pete! Good answer.
Brian MacKay
I wish I could upvote twice
Derrick
Why box a value, though? What benefit does the CLR or whatever get from boxing values?
CoffeeAddict
A: 

Like anything else, autoboxing can be problematic if not used carefully. The classic is to end up with a NullPointerException and not be able to track it down. Even with a debugger. Try this:

public class TestAutoboxNPE
{
    public static void main(String[] args)
    {
        Integer i = null;

        // .. do some other stuff and forget to initialise i

        i = addOne(i);           // Whoa! NPE!
    }

    public static int addOne(int i)
    {
        return i + 1;
    }
}
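One possible way to guard against this, sketched here with hypothetical names (addOneOrDefault and the fallback parameter are not part of the example above), is to check the wrapper for null before it is unboxed. The second method isolates where the exception actually comes from: the implicit unboxing, not addOne itself.

```java
class SafeUnboxing {
    static int addOne(int i) {
        return i + 1;
    }

    // Hypothetical helper: uses a fallback when the wrapper is null,
    // so the implicit unboxing never runs on a null reference.
    static int addOneOrDefault(Integer boxed, int fallback) {
        int value = (boxed != null) ? boxed : fallback;
        return addOne(value);
    }

    // Shows that unboxing a null Integer is what throws the NPE.
    static boolean unboxingNullThrows() {
        Integer boxed = null;
        try {
            int ignored = boxed; // compiles to boxed.intValue()
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(addOneOrDefault(null, 0));
        System.out.println(addOneOrDefault(41, 0));
        System.out.println(unboxingNullThrows());
    }
}
```

Whether a silent fallback is appropriate depends on the program; failing fast with an explicit null check is an equally reasonable choice.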
fiddlesticks
This is just bad code, and has nothing to do with autoboxing. The variable `i` is prematurely initialized. Either make it an empty declaration (`Integer i;`) so that the compiler can point out that you forgot to initialize it, or wait to declare it until you know its value.
erickson
Hmm, and if I do something in between inside a try/catch block, then the compiler will force me to initialise it with something. This is no real code - it's an example of how it could happen.
fiddlesticks
What does this demonstrate? There is absolutely no reason to use the Integer object. Instead you now have to deal with a potential NullPointer.
Richard Clayton