I'm trying to serialize a dictionary in C#. All the examples I've been able to find create XML like the following:

<Dictionary>
    <ArrayOfEntries>
        <Entry>
            <Key>myFirstKey</Key>
            <Value>myFirstValue</Value>
        </Entry>
        <Entry>
            <Key>mySecondKey</Key>
            <Value>mySecondValue</Value>
        </Entry>
    </ArrayOfEntries>
</Dictionary>

It varies: sometimes the ArrayOfEntries node isn't necessary, but I still see the same pattern of each key-value pair being stored in its own Entry node. What I would like is something like the following:

<Dictionary>
    <myFirstKey>myFirstValue</myFirstKey>
    <mySecondKey>mySecondValue</mySecondKey>
</Dictionary>

I have written ReadXml and WriteXml to do this before, and it works for a single dictionary. But if I serialize and deserialize a List<T> of my serializable dictionary instances, the deserialized list ends up with only the last serializable dictionary in it. I think my read or write method must be greedy, such that it doesn't know when to stop. Here are my serializable dictionary methods, which currently don't work when serializing Lists of serializable dictionaries:

public void WriteXml(XmlWriter writer)
{
    foreach (string key in getKeysToSerialize())
    {
        string cleanKey = key.SanitizeForXml();
        string value = getValueForKey(key).Trim();

        if (isCdata(key, value))
        {
            string cdataFriendlyValue = cleanStringForUseInCdata(value);
            if (string.IsNullOrEmpty(cdataFriendlyValue))
            {
                continue;
            }

            writer.WriteStartElement(cleanKey);
            writer.WriteCData(cdataFriendlyValue);
            writer.WriteEndElement();
        }
        else
        {
            writer.WriteElementString(cleanKey, value);
        }
    }
}

And ReadXml:

public void ReadXml(XmlReader reader)
{
    string key = null, value = null;
    bool wasEmpty = reader.IsEmptyElement;

    while (reader.Read())
    {
        if (wasEmpty)
        {
            return;
        }

        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                key = reader.Name;
                if (keyIsSubTree(key))
                {
                    using (XmlReader subReader = reader.ReadSubtree())
                    {
                        storeSubtree(key, subReader);
                    }

                    // Reset key to null so we don't try to parse it as
                    // a regular key-value pair later
                    key = null;
                }
                break;
            case XmlNodeType.Text:
                value = reader.Value;
                break;
            case XmlNodeType.CDATA:
                value = cleanCdataForStoring(reader.Value);
                break;
            case XmlNodeType.EndElement:
                if (!string.IsNullOrEmpty(key))
                {
                    string valueToStore =
                        string.IsNullOrEmpty(value) ? string.Empty : value;
                    Add(key, valueToStore);
                }
                key = null;
                value = null;
                break;
        }
    }
}

One difference I've noticed between other tutorials and what I'm doing is that many use XmlSerializer to serialize and deserialize objects in their ReadXml and WriteXml, whereas I use WriteElementString and the like.

My question is, how can I serialize a dictionary such that its XML is like my second XML example above, but so that serializing and deserializing a List<MySerializableDictionary> also works? Even just hints of "why are you doing blah, that might make it act funny" would help.

+1  A: 

My WriteXml seemed okay because it outputs the dictionary's content in the XML format I want. ReadXml was where I was going wrong. Based on the ReadXml implementation found in this tutorial, I came up with the following:

public override void ReadXml(XmlReader reader)
{
    reader.MoveToContent();
    bool isEmptyElement = reader.IsEmptyElement;
    reader.ReadStartElement();
    if (isEmptyElement)
    {
        return;
    }
    while (XmlNodeType.EndElement != reader.NodeType)
    {
        // Bypass XmlNodeType.Whitespace, as found in XML files
        while (XmlNodeType.Element != reader.NodeType)
        {
            if (XmlNodeType.EndElement == reader.NodeType)
            {
                reader.ReadEndElement();
                return;
            }
            reader.Read();
        }
        string key = reader.Name;
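        if (keyIsSubTree(key))
        {
            // Dispose the sub-reader before touching the parent reader again
            using (XmlReader subReader = reader.ReadSubtree())
            {
                storeSubtree(key, subReader);
            }
            // The parent reader is now on the subtree's last node; step past it
            reader.Read();
        }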
        else
        {
            string value = reader.ReadElementString();
            storeKeyValuePair(key, value ?? string.Empty);
        }
    }
    if (XmlNodeType.EndElement == reader.NodeType)
    {
        reader.ReadEndElement();
    }
}

This seemed to work for dictionaries that had only one key-value pair as well as dictionaries with multiple key-value pairs. My test of serializing/deserializing a List<T> of dictionaries also worked. I haven't added any special handling for CDATA yet, and I'm sure that will be necessary. This seems like a start, though.

Edit: actually, maybe I only need to worry about CDATA when I'm writing, not when reading.

Edit: I updated the ReadXml code above to account for XmlNodeType.Whitespace, which shows up when you deserialize from XML files. I was getting an XML error because ReadEndElement was being called when reader.NodeType was not XmlNodeType.EndElement. I put in a check so that ReadEndElement is only called on an EndElement, and I added the inner while loop so the reader can advance (via Read()) whenever it is on something that is neither an Element nor an EndElement (e.g., Whitespace).
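
For anyone who wants to try the idea end to end, here is a minimal, self-contained sketch of the same approach. It is not the exact class from the question: FlatDictionary and RoundTrip are hypothetical names, the subtree and CDATA handling are left out, keys are assumed to already be valid XML element names, and it leans on MoveToContent() to skip whitespace instead of the hand-written inner loop. The important part is that ReadXml consumes exactly its own wrapper element, start tag through end tag, and nothing more, which is what lets a List<T> of these deserialize correctly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical stand-in for the serializable dictionary discussed above.
// Assumes every key is already a valid XML element name.
public class FlatDictionary : Dictionary<string, string>, IXmlSerializable
{
    public XmlSchema GetSchema()
    {
        return null;
    }

    public void WriteXml(XmlWriter writer)
    {
        // One element per pair: <key>value</key>
        foreach (KeyValuePair<string, string> pair in this)
        {
            writer.WriteElementString(pair.Key, pair.Value);
        }
    }

    public void ReadXml(XmlReader reader)
    {
        reader.MoveToContent();
        bool isEmptyElement = reader.IsEmptyElement;
        reader.ReadStartElement(); // consume our own start tag
        if (isEmptyElement)
        {
            return; // an empty element has no end tag to consume
        }

        // MoveToContent() skips whitespace and comments and reports
        // the type of the next content node.
        while (reader.MoveToContent() == XmlNodeType.Element)
        {
            string key = reader.Name;
            // Reads the text (or CDATA) and moves past the element's end tag
            this[key] = reader.ReadElementContentAsString();
        }

        reader.ReadEndElement(); // consume our own end tag, then stop
    }
}

public static class RoundTrip
{
    // Serializes the list to an indented XML string and reads it back.
    public static List<FlatDictionary> Run(List<FlatDictionary> input)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(List<FlatDictionary>));
        XmlWriterSettings settings = new XmlWriterSettings { Indent = true };
        StringWriter buffer = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(buffer, settings))
        {
            serializer.Serialize(writer, input);
        }
        using (StringReader text = new StringReader(buffer.ToString()))
        {
            return (List<FlatDictionary>)serializer.Deserialize(text);
        }
    }
}
```

Passing a list that mixes populated and empty dictionaries through RoundTrip.Run is a quick way to check that each ReadXml call stops at its own end tag. A value that contains child elements would make ReadElementContentAsString throw, so the subtree case still needs its own branch as in the code above.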

Sarah Vessels
The user can also provide keys with invalid characters in them. Even though you "sanitize" them, two different keys can end up with the same sanitized name, which affects the uniqueness of the keys in your XML.
insipid
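
One possible way around the uniqueness problem raised in the comment above is to escape keys reversibly rather than strip characters from them. The sketch below uses XmlConvert.EncodeLocalName and XmlConvert.DecodeName from System.Xml; the KeyNames wrapper and its method names are hypothetical, and this is not what the SanitizeForXml extension in the question does (its implementation isn't shown). Empty keys are not handled here, since an empty string is never a valid element name.

```csharp
using System;
using System.Xml;

// Hypothetical helper: maps dictionary keys to element names and back.
public static class KeyNames
{
    // Escapes characters that are not allowed in an XML local name,
    // so distinct keys stay distinct after encoding.
    public static string ToElementName(string key)
    {
        return XmlConvert.EncodeLocalName(key);
    }

    // Reverses the escaping applied by ToElementName.
    public static string FromElementName(string elementName)
    {
        return XmlConvert.DecodeName(elementName);
    }
}
```

WriteXml would call ToElementName on each key before writing the element, and ReadXml would call FromElementName on reader.Name before storing the pair. The trade-off is that the element names in the file become less readable for keys that contain spaces or punctuation.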