I was reading this article, where they have this code:

// Serialization
XmlSerializer s = new XmlSerializer( typeof( ShoppingList ) );
TextWriter w = new StreamWriter( @"c:\list.xml" );
s.Serialize( w, myList );
w.Close();

// Deserialization
ShoppingList newList;
TextReader r = new StreamReader( @"c:\list.xml" );
newList = (ShoppingList)s.Deserialize( r );

Is the last line a cast statement? If so, doesn't it degrade the serialization performance?

+5  A: 

Yes, the last line is a cast statement. Casting does have a cost associated with it, but it is insignificant compared to the cost of serialization. I doubt it would even show up in a profiler.

Think of what serialization involves.

  • Processing a byte stream
  • Creating types based on metadata information
  • Conversion between byte arrays and data types

Any one of these operations is significantly more expensive than a single cast operation.

EDIT: As to why it requires casting at all.

There are a couple of reasons here. The first is that the deserialization APIs have no way of knowing what type the byte stream represents before they inspect it. So the only choice the API has in terms of a return type in metadata is Object.

Secondly, deserialization must support literally any type that is serializable. In order to function, it must pick a return type for the method that is applicable to all types which can be serialized. The only type available that meets that requirement is object.
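
As a rough illustration of the point above (a sketch only: the ShoppingList class and the ReturnTypeDemo helper are hypothetical stand-ins, and the round trip goes through an in-memory string rather than a file), the declared return type of Deserialize is object, while the instance behind that reference is already a ShoppingList. The cast checks the type; it does not build or convert anything.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the article's ShoppingList type.
public class ShoppingList
{
    public List<string> Items = new List<string>();
}

// Hypothetical demo class, not taken from the article.
public static class ReturnTypeDemo
{
    // Round-trips a list through XML in memory and reports the runtime
    // type name of whatever Deserialize handed back.
    public static string RuntimeTypeOfResult()
    {
        var s = new XmlSerializer(typeof(ShoppingList));
        var myList = new ShoppingList();
        myList.Items.Add("milk");

        var w = new StringWriter();
        s.Serialize(w, myList);

        // The declared return type is object, so no cast is needed here...
        object result = s.Deserialize(new StringReader(w.ToString()));

        // ...but the instance is already a ShoppingList. The cast only
        // verifies that and gives the same reference a more specific type.
        ShoppingList typed = (ShoppingList)result;
        return ReferenceEquals(result, typed) ? result.GetType().Name : "unexpected";
    }
}
```

Calling ReturnTypeDemo.RuntimeTypeOfResult() should report the ShoppingList type name, which is why the cast itself costs so little next to the parsing work.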

JaredPar
Thanks Jared. The reason I wondered is that I want to use serialization for the save file system in my app, but the files might contain millions of objects. Do you know why it requires a cast? i.e. why is it not type safe?
Joan Venge
@Joan it requires a cast because the Deserialization engine doesn't know the type before it reads the underlying data. Also, the API has been around since before generics so it was forced to return object as the type.
JaredPar
Thanks Jared, good to know. So is it possible to have a serializer that returns the type without casting?
Joan Venge
@Joan, sure but it will be typed as object. There is no way to avoid the cast if you want to get to the true type of the object.
JaredPar
Hmm, yeah, this is interesting. So one can't even write a new serializer with custom attributes, etc., using generics so that the serializer knows what the type is? Because you specify the type in the serialization, right? I added some code above.
Joan Venge
+2  A: 

Casts are hugely cheap compared with the deserialization cost itself. The process of deserialization is pretty complex - a single (working) cast hardly takes any time at all.

Of course, if you're interested in fast, portable, compact serialization with a good versioning story, you should be looking at Protocol Buffers:

(There are other serialization frameworks too, such as Thrift.)

Jon Skeet
Thanks Jon. I am not actually sure how serialization works under the hood. Do you know why it requires a cast? i.e. why is it not type safe?
Joan Venge
Look at the return type from the API - which was created before generics. There's really not much you could do to avoid that.
Jon Skeet
Thanks Jon. So is it possible to have a serializer that returns the type without casting? Just out of curiosity :)
Joan Venge
No. There is no way of going from object to the specified type without casting.
Darren Kopp
Yes, using generics - but of course it could still fail at execution time if the data isn't valid for that type. That's fundamental to the nature of serialization.
Jon Skeet
(I meant with your own serialization scheme, of course.)
Jon Skeet
Thanks Jon. In that case the generic version would be superior? Seeing that you already have to specify the type when you serialize (I added code above). And even if it would fail at runtime for wrong type, same problem exists with casting to the wrong type, right?
Joan Venge
It would entirely depend on what else it did. I mean you could fairly easily wrap the framework's serialization code in a generic type which did the cast for you - it wouldn't really make it safer.
Jon Skeet
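
The generic wrapper mentioned in the comment above might look roughly like this. It is a minimal sketch, assuming XML held in a string; XmlHelper and its two methods are hypothetical names, not framework APIs, and ShoppingList is a stand-in for the article's type. The cast has not gone away, it has only moved inside the helper, so data that does not match T still fails at execution time.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the article's ShoppingList type.
public class ShoppingList
{
    public List<string> Items = new List<string>();
}

// Hypothetical helper, not part of the framework.
public static class XmlHelper
{
    // Serializes a value of type T to an XML string.
    public static string Serialize<T>(T value)
    {
        var s = new XmlSerializer(typeof(T));
        using (var w = new StringWriter())
        {
            s.Serialize(w, value);
            return w.ToString();
        }
    }

    // Deserializes an XML string and returns it already typed as T.
    public static T Deserialize<T>(string xml)
    {
        var s = new XmlSerializer(typeof(T));
        using (var r = new StringReader(xml))
        {
            // Deserialize still returns object; the cast is hidden here.
            return (T)s.Deserialize(r);
        }
    }
}
```

With this in place the call site reads ShoppingList newList = XmlHelper.Deserialize<ShoppingList>(xml); which is tidier to write but no safer than the explicit cast.
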
Thanks Jon. I see what you mean. What I meant was, would one be able to write a serializer that doesn't require casting? Like it somehow directly creates the object of the right type, not a System.Object. I am just curious theoretically.
Joan Venge
Btw Jon, thanks for mentioning Protocol Buffers, never heard of them, but they look really good.
Joan Venge
Yes, it would have to create the object of the right type to start with - which is exactly what Protocol Buffers does :) Now I'm at home I can add links to my post, as well. (Previously I was typing on a phone - links aren't really convenient to add that way.)
Jon Skeet
Thanks Jon. I appreciate your extra answers.
Joan Venge
Btw Jon, so is this project yours?: http://code.google.com/p/protobuf-net/wiki/Performance where it says proto#?
Joan Venge
Nope, that's a different one: http://code.google.com/p/protosharp/ - I don't think it's under as active development though.
Jon Skeet
Thanks Jon for clearing it up. I will go with your implementation then. Thanks.
Joan Venge
A: 

Deserialize returns an Object, so the cast is there to get it into the correct class.

Whether or not it has any impact on the deserialization, you want the result to be a ShoppingList.

James Black
+1  A: 

The Deserialize() method returns an object and must be "cast" to the correct type.

Casting is primarily telling the compiler that you know what the object type is, since the compiler is unable to infer its type. The runtime will still throw an InvalidCastException if the object is not of the type you specified (or a sub-type of the type specified).

The actual cost of casting is minimal.
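
To make the runtime check concrete, here is a small sketch (CastDemo and its methods are hypothetical names chosen for illustration). An explicit cast on a mismatched object throws InvalidCastException, while the as operator performs the same check and yields null instead.

```csharp
using System;

// Hypothetical demo class for illustration only.
public static class CastDemo
{
    // Tries an explicit cast to string and reports what happened.
    public static string Describe(object value)
    {
        try
        {
            string s = (string)value;   // checked by the runtime
            return "cast ok: " + s;
        }
        catch (InvalidCastException)
        {
            return "InvalidCastException";
        }
    }

    // 'as' does the same type check but returns null on a mismatch
    // instead of throwing.
    public static string AsOrNull(object value)
    {
        return value as string;
    }
}
```

For example, CastDemo.Describe("milk") takes the success path, whereas CastDemo.Describe(42) ends up in the catch block because a boxed int is not a string.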

Andrew Robinson
I disagree with your statement about inferring the type. It doesn't even try to infer the type because serialization was in .NET long before generics.
Samuel
I hear you. I think I am speaking in very general terms. We cast because the type at runtime is unknown to the compiler (at compile time). Maybe not a correct use of the term 'infer'.
Andrew Robinson
A: 

If you change that last line of code

newList = (ShoppingList)s.Deserialize( r );

to

newList = s.Deserialize( r );

The compiler will add back in a cast. I just confirmed this with Red Gate's .NET Reflector. So regardless of the cost of casting, you are required to do it if you want to use that typed object.

Dave L