I have a serialized array of some type. Is there a way to append new objects to this array, in its serialized form, without reading the already saved collection into memory?

Example:

I have an XML-serialized array of Entity in file.xml, containing 10^12 elements. I need to add another 10^5 elements to the serialized file, but I don't want to read all the existing elements, append the new ones, and write a new array to a stream, because that would be very resource-intensive (especially in memory).

If it would require a binary serializer, I would have no problem with that.

+4  A: 

In general, the solution is to edit the XML bytes directly; that way you don't have to read the whole file, as you would when deserializing.

The steps in general are:

  1. Open the file stream
  2. Store the closing node of the array
  3. Serialize the new item
  4. Write the serialized bytes to the stream, starting where the closing node began
  5. Write the closing node

Example code that adds an integer to a serialized array:

using System.IO;
using System.Text;
using System.Xml.Linq;
using System.Xml.Serialization;

// Serialize the array - in your case this is the stream opened on file.xml
var ints = new[] { 1, 2, 3 };
var arraySerializer = new XmlSerializer(typeof(int[]));
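// For a real file, open it for reading AND writing (File.OpenWrite is write-only, so the Read below would fail):
// File.Open("file.xml", FileMode.Open, FileAccess.ReadWrite)
var memoryStream = new MemoryStream();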
arraySerializer.Serialize(new StreamWriter(memoryStream), ints);

// Save the closing node
int sizeOfClosingNode = 13; // In this case: "</ArrayOfInt>".Length
                            // Change the size to fit your array
                            // e.g. ("</ArrayOfOtherType>".Length)

// Set the location just before the closing tag
memoryStream.Position = memoryStream.Length - sizeOfClosingNode;

// Store the closing tag bytes
var buffer = new byte[sizeOfClosingNode];
memoryStream.Read(buffer, 0, sizeOfClosingNode);

// Set back to location just before the closing tag.
// In this location the new item will be written.
memoryStream.Position = memoryStream.Length - sizeOfClosingNode;

// Add an item to the serialized array
var itemBuilder = new StringBuilder();
// Write the serialized item as string to itemBuilder
new XmlSerializer(typeof(int)).Serialize(new StringWriter(itemBuilder), 4);
// Get the serialized item XML element (strip the XML document declaration)
XElement newXmlItem = XElement.Parse(itemBuilder.ToString());
// Convert the XML to bytes that can be written to the file.
// Use UTF-8 to match the encoding StreamWriter used for the array above.
byte[] bytes = Encoding.UTF8.GetBytes(newXmlItem.ToString());
// Write new item to file.
memoryStream.Write(bytes, 0, bytes.Length);
// Write the closing tag.
memoryStream.Write(buffer, 0, sizeOfClosingNode);

// Verify that it works (CollectionAssert comes from your unit-test framework)
memoryStream.Position = 0;
var modifiedArray = (int[]) arraySerializer.Deserialize(memoryStream);
CollectionAssert.AreEqual(new[] { 1, 2, 3, 4 }, modifiedArray);
Elisha
Hi Elisha, this seems like a very interesting answer, and I have voted it up. For folks like me who do not have much experience in this area, it would be helpful if you could add a few more comments to your code. For example, where does the value 13 used above come from? Is it based on your knowledge of exactly how many bytes writing the int array { 1, 2, 3 } into a MemoryStream takes? Another question: your example modifies a MemoryStream (the file write is commented out). Can a file stream be substituted for the MemoryStream with no major modifications? Thanks,
BillW
@BillW, I added some comments; it's not intuitive code, so I hope they help :) The number 13 changes according to the type serialized. It is the length of the closing tag of the array's root element, e.g. </ArrayOfString> takes 16 bytes (one byte per character). I used MemoryStream just to keep the answer readable and clear (I guess I wasn't too successful), but it shares the same base class with FileStream. Both are streams, so in real life you can replace the first block with a stream opened on file.xml, and the rest of the code, which adds the new item, is unaffected. Note that the stream must be opened for both reading and writing, e.g. File.Open("file.xml", FileMode.Open, FileAccess.ReadWrite), because File.OpenWrite returns a write-only stream and the Read call would fail.
Elisha
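
The file-based variant discussed in the comments above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the helper name XmlArrayAppender.Append and its rootName parameter are hypothetical, the file is assumed to be UTF-8 encoded and to end exactly with the array's closing tag (no trailing whitespace), and the item's serialized root element is assumed to have the same name the array uses for its elements. It derives the closing-tag length from the tag itself instead of hard-coding 13, and it checks the tail of the file before overwriting it.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public static class XmlArrayAppender
{
    // Hypothetical helper: appends one item to an XmlSerializer-written array file
    // in place, touching only the last few bytes of the file.
    // rootName is the array's root element name, e.g. "ArrayOfInt" for int[].
    public static void Append<T>(string path, T item, string rootName)
    {
        var utf8 = new UTF8Encoding(false); // UTF-8 without a byte-order mark
        byte[] closingTag = utf8.GetBytes("</" + rootName + ">");

        // Serialize the new item as a bare element, without the XML declaration.
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true, Encoding = utf8 };
        byte[] itemBytes;
        using (var buffer = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(buffer, settings))
            {
                new XmlSerializer(typeof(T)).Serialize(writer, item);
            }
            itemBytes = buffer.ToArray();
        }

        // Open for reading and writing: the tail is read first, then overwritten.
        using (var file = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            if (file.Length < closingTag.Length)
                throw new InvalidDataException("File is too short to contain the closing tag.");

            // Read the last bytes and make sure they really are the closing tag.
            file.Seek(-closingTag.Length, SeekOrigin.End);
            var tail = new byte[closingTag.Length];
            int read = 0;
            while (read < tail.Length)
            {
                int n = file.Read(tail, read, tail.Length - read);
                if (n == 0) break;
                read += n;
            }
            if (read != tail.Length || !tail.SequenceEqual(closingTag))
                throw new InvalidDataException("File does not end with </" + rootName + ">.");

            // Overwrite the closing tag with the new item, then write the tag again.
            file.Seek(-closingTag.Length, SeekOrigin.End);
            file.Write(itemBytes, 0, itemBytes.Length);
            file.Write(closingTag, 0, closingTag.Length);
        }
    }
}
```

A call such as XmlArrayAppender.Append("file.xml", 4, "ArrayOfInt") would then add one int to a file written by new XmlSerializer(typeof(int[])). For many new items it would likely be cheaper to serialize them all into one buffer and rewrite the closing tag once, rather than reopening the file per item.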