I'm facing a problem that Google couldn't solve yet!

I'm trying to store URLs in an XML file. The problem is that these URLs contain equals signs (=), and that generates an error.

Here is my code (`token` is a variable that contains the URL):

Dim child As String = vbCrLf & "<Link URL='" & token & "'></Link>"
Dim fragment As XmlDocumentFragment = doc.CreateDocumentFragment
fragment.InnerXml = child

The error message (the reported line and position are meaningless here):

'=' is an unexpected token. The expected token is ';'. Line 2, position 133.

I've replaced all '&' symbols with '&amp' in case they were the ones causing the error, but no luck so far.

+3  A: 

You should never use string manipulation to create XML. If you use the XML APIs of .NET, they will take care of all the special characters for you. Try:

XmlElement linkElement = doc.CreateElement("Link");
XmlAttribute urlAttribute = doc.CreateAttribute("URL");
urlAttribute.Value = token;
linkElement.SetAttributeNode(urlAttribute);
fragment.AppendChild(linkElement);
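
Since the question is in VB.NET, here is a rough VB.NET equivalent of the snippet above as a self-contained sketch. The `Links` root element and the sample URL are assumptions made up for illustration, and `SetAttribute` is used as a shortcut for creating and attaching the attribute node.

```vbnet
Imports System
Imports System.Xml

Module LinkAttributeDemo
    Sub Main()
        ' Hypothetical existing document; the root element name is an assumption.
        Dim doc As New XmlDocument()
        doc.LoadXml("<Links></Links>")

        ' Made-up URL containing both '&' and '='.
        Dim token As String = "http://example.com/page?id=1&lang=en"

        ' Build the element through the DOM API instead of concatenating strings.
        Dim linkElement As XmlElement = doc.CreateElement("Link")
        linkElement.SetAttribute("URL", token)

        ' Inject it into the existing document by way of a fragment, as in the question.
        Dim fragment As XmlDocumentFragment = doc.CreateDocumentFragment()
        fragment.AppendChild(linkElement)
        doc.DocumentElement.AppendChild(fragment)

        ' The serialized form has '&' written as an entity reference...
        Console.WriteLine(doc.OuterXml)
        ' ...while reading the attribute back returns the original URL.
        Console.WriteLine(linkElement.GetAttribute("URL"))
    End Sub
End Module
```

No manual escaping or decoding is involved: the API escapes the value when the document is serialized and unescapes it when the attribute is read.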
John Saunders
I already use this code to create the document. The code I've shown above is for injecting new elements into an existing XML document. I'm sure there is a better way. I'm just agiling through the code, doing what I know and leaving what I don't know for later. Thanks for your contribution nevertheless :)
Ansari
@Ansari: if you already know how to use CreateElement etc., then why would you start creating elements by playing with strings?
John Saunders
A: 

Try adding this line below the first line:

' Insert this line; URL-encoding the token should make the "=" sign safe for XML...
token = System.Web.HttpUtility.UrlEncode(token)

Dim child As String = vbCrLf & "<Link URL='" & token & "'></Link>"
Dim fragment As XmlDocumentFragment = doc.CreateDocumentFragment
fragment.InnerXml = child
code4life
That will encode the whole element, not just the attribute value. It's no longer the string representation of an XML element.
Guffa
Good point - will fix that.
code4life
Encoding 'token' and decoding when reading worked like a charm!! Thanks mate!
Ansari
@code4life: -1 for still using string manipulation
John Saunders
@John, go ahead, -1 all you want... it doesn't change the fact that his problem is solved!!! Also, he's not looking for a complete code refactor.
code4life
@code4life: it's a bad practice, plain and simple. Not only for the OP, but for others who read this later, they need to know not to do this. Also, you may have solved the immediate problem, but when this code is copied and pasted elsewhere, it may break, and it will certainly encourage bad habits in those who use it.
John Saunders
@John: "it's bad practice, plain and simple". Can you explain why? Also are you sure that we should take such a dogmatic stance here? I think you're completely wrong in this instance by the way. Parsing XML, I would defer to your stance, but building or generating XML, you are incorrect to deprecate string manipulation. Just try and build a 50-100MB XML using your so-called "best practice approach", I dare you.
code4life
@code4life: the rules of strings and the rules of XML are different. By using string manipulation you will always have to add the "foreign" rules of XML, making the code an unnatural mix of string and XML semantics. Your answer is a perfect example of what not to do. You are using `UrlEncode` on the token. Will the recipient of the document know to use `UrlDecode`? Is that actually the correct encoding for an XML attribute? Any chance that `UrlEncode`, intended for use in a URL, will leave a character not valid in an XML Attribute?
John Saunders
@code4life: Also, to create large documents, I prefer LINQ to XML recently. It's much easier in constructing large XML documents, especially if they are being filled with data from some other source. If forced to use the `XmlDocument` API, refactoring would result in code nearly as usable as LINQ to XML.
John Saunders
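
For readers unfamiliar with the LINQ to XML approach mentioned in the comment above, a minimal VB.NET sketch might look like the following. The element names and the sample URL are assumptions for illustration, not taken from the asker's document.

```vbnet
Imports System
Imports System.Xml.Linq

Module LinqToXmlDemo
    Sub Main()
        ' Made-up URL containing both '&' and '='.
        Dim token As String = "http://example.com/page?id=1&lang=en"

        ' The attribute value is passed as a plain string; escaping happens on output.
        Dim link As New XElement("Link", New XAttribute("URL", token))
        Dim root As New XElement("Links", link)

        ' Serialized XML, with '&' escaped by the library.
        Console.WriteLine(root.ToString(SaveOptions.DisableFormatting))
        ' Reading the attribute back gives the original URL.
        Console.WriteLine(link.Attribute("URL").Value)
    End Sub
End Module
```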
@John, that's completely facetious. As you mentioned, by using XElement features, UrlEncoded values are completely abstracted safely. What does that have to do with originally packaging the values into the XML node? Also your comment about LINQ to XML is not necessarily an answer to my question about building 50-100MB file **quickly**. It's a facile way to do things, but I would take you to task if you are telling me that this is actually more **performant** than raw string manipulation (via StringBuilder).
code4life
@John: also AFAIK, UrlEncode is safe for attributes...
code4life
@code4life: Using the features of any of the XML APIs, it's not necessary to be concerned about encoding at all. You just set the string value, and it is properly encoded for the situation. Change an element to an attribute or vice-versa, and the encoding will change accordingly. I have not done performance comparisons of LINQ to XML over string manipulation for creating large files, but CPU and memory performance aren't usually the dominant factors when writing large files.
John Saunders
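
Neither commenter mentions it, but for the large-file case discussed above, `XmlWriter` is one option that streams output without building the whole document in memory while still handling escaping. The sketch below is an illustration under assumptions: the element names and sample URLs are made up, and it writes to a `StringWriter` only so the result can be printed (`XmlWriter.Create` also accepts a file path or stream).

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module XmlWriterDemo
    Sub Main()
        ' Made-up URLs containing characters that are special in XML.
        Dim urls As String() = {"http://example.com/a?x=1&y=2", "http://example.com/b?q=<tag>"}

        Dim output As New StringWriter()
        Dim settings As New XmlWriterSettings()
        settings.OmitXmlDeclaration = True

        Using writer As XmlWriter = XmlWriter.Create(output, settings)
            writer.WriteStartElement("Links")
            For Each url As String In urls
                ' Each element is written out as it is produced; the writer escapes the value.
                writer.WriteStartElement("Link")
                writer.WriteAttributeString("URL", url)
                writer.WriteEndElement()
            Next
            writer.WriteEndElement()
        End Using

        Console.WriteLine(output.ToString())
    End Sub
End Module
```

Whether this is faster than a `StringBuilder` for a given workload would have to be measured; the sketch only shows that streaming and correct escaping are not mutually exclusive.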
@code4life: also, if UrlEncode happens to work for attributes, then it's a fluke. It was not designed for that. The `Value` property of an attribute _is_ designed for it.
John Saunders
+1  A: 

You should not replace & with &amp. The replacement must be &amp;, including the trailing semicolon.
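
The error message in the question follows from this: in a URL such as `...?id=1&lang=en`, the parser reads `&lang` as the start of an entity reference and expects a closing `;`, but finds `=` instead. As a sketch of how the original string-based code could be made to parse, the example below escapes the value with `System.Security.SecurityElement.Escape` before concatenating. The choice of that helper, the `Links` root element, and the sample URL are assumptions for illustration.

```vbnet
Imports System
Imports System.Security
Imports System.Xml

Module EscapeDemo
    Sub Main()
        ' Hypothetical existing document; the root element name is an assumption.
        Dim doc As New XmlDocument()
        doc.LoadXml("<Links></Links>")

        ' Made-up URL containing both '&' and '='.
        Dim token As String = "http://example.com/page?id=1&lang=en"

        ' Escape &, <, >, " and ' so the value is valid inside an attribute.
        Dim escaped As String = SecurityElement.Escape(token)
        Dim child As String = "<Link URL='" & escaped & "'></Link>"

        Dim fragment As XmlDocumentFragment = doc.CreateDocumentFragment()
        fragment.InnerXml = child
        doc.DocumentElement.AppendChild(fragment)

        ' The parser unescapes the value, so the original URL comes back unchanged.
        Dim link As XmlElement = CType(doc.DocumentElement.FirstChild, XmlElement)
        Console.WriteLine(link.GetAttribute("URL") = token)
    End Sub
End Module
```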

Or better yet, create a node in your fragment and add an attribute to it. That way the object will encode the data correctly for you.

Dim fragment As XmlDocumentFragment = doc.CreateDocumentFragment()
Dim node As XmlElement = doc.CreateElement("Link")
Dim attr As XmlAttribute = doc.CreateAttribute("URL")
attr.Value = token
node.Attributes.Append(attr)
fragment.AppendChild(node)
Guffa
Bart van Heukelom
@Bart van Heukelom: The operators in the code will of course not be escaped, as they are never part of a string. I am talking about the characters inside the `token` string.
Guffa
Bart van Heukelom