Hi everyone,

Here's a fictitious example of the problem I'm trying to solve. If I'm working in C#, and have XML like this:

<?xml version="1.0" encoding="utf-8"?>
<Cars>
  <Car>
    <StockNumber>1020</StockNumber>
    <Make>Nissan</Make>
    <Model>Sentra</Model>
  </Car>
  <Car>
    <StockNumber>1010</StockNumber>
    <Make>Toyota</Make>
    <Model>Corolla</Model>
  </Car>
  <SalesPerson>
    <Company>Acme Sales</Company>
    <Position>
       <Salary>
          <Amount>1000</Amount>
          <Unit>Dollars</Unit>
    ... and on... and on....
  </SalesPerson>
</Cars>

The XML inside SalesPerson can be very long, megabytes in size. I want to deserialize the Car elements but not the SalesPerson element, keeping that one in raw form "for later on".

Essentially, I would like to be able to use this as an object representation of the XML.

[System.Xml.Serialization.XmlRootAttribute("Cars", Namespace = "", IsNullable = false)]
public class Cars
{
    [XmlArrayItem(typeof(Car))]
    public Car[] Car { get; set; }

    public Stream SalesPerson { get; set; }
}

public class Car
{
    [System.Xml.Serialization.XmlElementAttribute("StockNumber")]
    public string StockNumber { get; set; }

    [System.Xml.Serialization.XmlElementAttribute("Make")]
    public string Make { get; set; }

    [System.Xml.Serialization.XmlElementAttribute("Model")]
    public string Model { get; set; }
}

Here, after the document has been run through an XmlSerializer, the SalesPerson property on the Cars object would contain a stream with the raw XML from inside the <SalesPerson> element.

Can this be done? Can I choose to only deserialize "part of" an xml document?

Thanks! -Mike

p.s. example xml stolen from http://stackoverflow.com/questions/364253/how-to-deserialize-xml-document

+1  A: 

You can control how your serialization is done by implementing the ISerializable interface in your class. Note that this also implies a constructor with the signature (SerializationInfo info, StreamingContext context). With that in place, you can do what you are asking.

However, take a close look at whether you really need to do this with streaming. If you don't have to use the streaming mechanism, achieving the same thing with LINQ to XML will be easier and simpler to maintain in the long term (IMO).

Tim Jarvis
+1  A: 

Out of the box, XML deserialization is typically an all-or-nothing proposition, so you'll probably need to customize. If you don't do a full deserialization, you run the risk that malformed XML inside the SalesPerson element goes unnoticed, even though it makes the whole document invalid.

If you are willing to accept that risk, you'll probably want to use plain text processing to break the SalesPerson element out into a separate document, then process the remaining XML.

This is a good example of why XML is not always the correct answer.

MikeD
+2  A: 

I think the previous answer is correct that XML might not be the best choice of backing store here.

If you are having issues of scale and aren't taking advantage of some of the other niceties you get with XML, like transforms, you might be better off using a database for your data. The operations you are doing really seem to fit more into that model.

I know this doesn't really answer your question, but I thought I would highlight an alternate solution you might use. A good database and an appropriate OR mapper like .netTiers, NHibernate, or more recently LINQ to SQL / Entity Framework would probably get you back up and running with minimal changes to the rest of your codebase.

Anderson Imes
A: 

You can control which parts of the Cars class are deserialized by implementing the IXmlSerializable interface on the Cars class. Within the ReadXml(XmlReader) method, you would read and deserialize the Car elements, but when you reach the SalesPerson element you would read its subtree as a string and then construct a Stream over the textual content using a StreamWriter.

If you never want the XmlSerializer to write out the SalesPerson element, use the [XmlIgnore] attribute. I am not sure what you want to happen when you serialize the Cars class to its XML representation. Are you only trying to prevent deserialization of the SalesPerson element, while still being able to serialize the XML held by the Stream?

I could probably provide a code example of this if you want a concrete implementation.
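
For reference, the approach described above could be sketched roughly as follows. This is only a minimal sketch under a few assumptions: the property name CarList and the use of ReadOuterXml plus a MemoryStream (instead of a StreamWriter) are illustrative choices, and writing the object back out is deliberately left unsupported. Note that ReadOuterXml re-emits the markup, so the stream holds equivalent XML rather than a byte-for-byte copy, and the whole SalesPerson element is held in memory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class Car
{
    public string StockNumber { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
}

[XmlRoot("Cars")]
public class Cars : IXmlSerializable
{
    private static readonly XmlSerializer CarSerializer = new XmlSerializer(typeof(Car));

    // Hypothetical name; with IXmlSerializable the property name is not tied to the XML.
    public List<Car> CarList { get; } = new List<Car>();

    // Unparsed <SalesPerson> markup as UTF-8 bytes, or null if the element is absent.
    public Stream SalesPerson { get; private set; }

    public XmlSchema GetSchema() { return null; }

    public void ReadXml(XmlReader reader)
    {
        if (reader.IsEmptyElement) { reader.Read(); return; }

        reader.ReadStartElement(); // consume <Cars>
        while (!reader.EOF && reader.MoveToContent() != XmlNodeType.EndElement)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "Car")
            {
                // Hand only this <Car> subtree to a regular XmlSerializer.
                using (XmlReader subtree = reader.ReadSubtree())
                {
                    CarList.Add((Car)CarSerializer.Deserialize(subtree));
                }
                reader.Read(); // step past </Car>
            }
            else if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "SalesPerson")
            {
                // ReadOuterXml returns the element with its markup and moves past it.
                string raw = reader.ReadOuterXml();
                SalesPerson = new MemoryStream(Encoding.UTF8.GetBytes(raw));
            }
            else
            {
                reader.Skip(); // ignore anything else
            }
        }
        reader.ReadEndElement(); // consume </Cars>
    }

    public void WriteXml(XmlWriter writer)
    {
        throw new NotSupportedException("Serialization is not part of this sketch.");
    }
}

public static class Program
{
    public static void Main()
    {
        const string xml =
            "<Cars><Car><StockNumber>1020</StockNumber><Make>Nissan</Make><Model>Sentra</Model></Car>" +
            "<SalesPerson><Company>Acme Sales</Company></SalesPerson></Cars>";

        var cars = (Cars)new XmlSerializer(typeof(Cars)).Deserialize(new StringReader(xml));
        Console.WriteLine(cars.CarList.Count);
        Console.WriteLine(new StreamReader(cars.SalesPerson).ReadToEnd());
    }
}
```

The Car elements still go through a normal XmlSerializer, so only the SalesPerson handling is hand-written.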

Oppositional
A: 

If all you want to do is parse out the SalesPerson element but keep it as a string, you should use an XSL transform rather than "deserialization". If, on the other hand, you want to parse out the SalesPerson element and populate an in-memory object only from the other, non-SalesPerson elements, then an XSL transform might also be the way to go. If the files are very big, you might consider splitting them and using XSL to combine the separate XML files, so that the SalesPerson I/O only occurs when you need it.

alord1689
The use case is that the Car data I want as objects so that my program can interact with it. The SalesPerson XML simply gets sent over the wire to another system, so I don't even need to inspect it. Basically, I need to get all the data, but only care about what the Car elements contain.
Mike
If that's the case, then all you have to do is not supply XmlElementAttributes to serialize the non-car data.
alord1689
* deserialize, I mean
alord1689
+1  A: 

Please try defining the SalesPerson property as type XmlElement. This works for output from ASMX web services, which use XML Serialization. I would think it would work on input as well. I would expect the entire <SalesPerson> element to wind up in the XmlElement.

John Saunders
They may also need the XmlAnyAttribute on that member.
Steven Sudit
Can you say why?
John Saunders
I may be mistaken, actually, since it looks like XmlAny is for a property that returns an *array* of XmlElements, not just one.
Steven Sudit
I just re-read the description more carefully, and it looks like XmlAnyElement and XmlAnyAttribute are for slicing. They're catch-alls for the stuff that the XSD doesn't find a place for.
Steven Sudit
I'm not talking about `XmlElementAttribute`. I'm talking about `System.Xml.XmlElement`.
John Saunders
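
For anyone who wants to try the XmlElement idea from this answer, here is a minimal sketch. It makes one assumption that goes beyond the answer itself: the member is marked with [XmlAnyElement("SalesPerson")], following the XmlAny suggestion in the comments, so that the whole <SalesPerson> element is captured. The Car array is mapped with [XmlElement("Car")] because the Car elements sit directly under Cars in the sample XML. Keep in mind that the XmlElement is a parsed DOM node, not a raw stream, so a multi-megabyte element will still be loaded into memory.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class Car
{
    public string StockNumber { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
}

[XmlRoot("Cars")]
public class Cars
{
    // The <Car> elements are direct children of <Cars>, so no wrapper element is expected.
    [XmlElement("Car")]
    public Car[] Car { get; set; }

    // Assumption: the named XmlAnyElement keeps <SalesPerson> as an unmapped DOM element.
    [XmlAnyElement("SalesPerson")]
    public XmlElement SalesPerson { get; set; }
}

public static class Program
{
    public static void Main()
    {
        const string xml =
            "<Cars><Car><StockNumber>1020</StockNumber><Make>Nissan</Make><Model>Sentra</Model></Car>" +
            "<SalesPerson><Company>Acme Sales</Company><Position><Salary><Amount>1000</Amount></Salary></Position></SalesPerson></Cars>";

        var cars = (Cars)new XmlSerializer(typeof(Cars)).Deserialize(new StringReader(xml));
        Console.WriteLine(cars.Car.Length);
        // OuterXml gives the element back as text, ready to be sent on to another system.
        Console.WriteLine(cars.SalesPerson.OuterXml);
    }
}
```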
A: 

I would suggest reading the XML manually, using a lightweight method such as XmlReader, XPathDocument, or LINQ to XML.

When you only have to read 3 properties, I suppose you can write code that reads them from that node by hand and have full control over how it executes, instead of relying on serialization/deserialization.
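
As an illustration of the LINQ to XML variant, something along these lines might work. It is a sketch, not a tuned solution: the helper class CarFile and its method names are made up for the example, and XDocument loads the whole document into memory, so an XmlReader would be the better fit if the SalesPerson part is too large for that.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Car
{
    public string StockNumber { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
}

// Hypothetical helper; the names are illustrative only.
public static class CarFile
{
    // Build Car objects by hand from the <Car> children of the root element.
    public static List<Car> ReadCars(XElement root)
    {
        return root.Elements("Car")
            .Select(c => new Car
            {
                // The explicit string conversion yields null when a child element is missing.
                StockNumber = (string)c.Element("StockNumber"),
                Make = (string)c.Element("Make"),
                Model = (string)c.Element("Model")
            })
            .ToList();
    }

    // Return the <SalesPerson> element as text without mapping it to any class.
    public static string ReadRawSalesPerson(XElement root)
    {
        XElement salesPerson = root.Element("SalesPerson");
        return salesPerson == null ? null : salesPerson.ToString(SaveOptions.DisableFormatting);
    }
}

public static class Program
{
    public static void Main()
    {
        const string xml =
            "<Cars><Car><StockNumber>1010</StockNumber><Make>Toyota</Make><Model>Corolla</Model></Car>" +
            "<SalesPerson><Company>Acme Sales</Company></SalesPerson></Cars>";

        XElement root = XDocument.Parse(xml).Root;
        foreach (Car car in CarFile.ReadCars(root))
        {
            Console.WriteLine(car.StockNumber + " " + car.Make + " " + car.Model);
        }
        Console.WriteLine(CarFile.ReadRawSalesPerson(root));
    }
}
```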

Bogdan_Ch
+2  A: 

This might be a bit of an old thread, but I will post anyway. I had the same problem (I needed to deserialize about 10 KB of data from a file that was more than 1 MB). In the main object (which has an InnerObject that needs to be deserialized) I implemented the IXmlSerializable interface, then changed the ReadXml method.

We have an XmlTextReader as input. The first step is to read up to the XML tag we are interested in:

reader.ReadToDescendant("InnerObjectTag"); //tag which matches the InnerObject

Then create an XmlSerializer for the type of the object we want to deserialize, and deserialize it:

XmlSerializer serializer = new XmlSerializer(typeof(InnerObject));

this.innerObject = (InnerObject)serializer.Deserialize(reader.ReadSubtree()); // hands the serializer only the part of the XML that holds the innerObject data

reader.Close(); // now skip the rest

This saved me a lot of deserialization time and lets me read just a part of the XML (a few details that describe the file, which might help the user decide whether it is the file he wants to load).