views:

1089

answers:

5

The following code snippet on SQL server 2005 fails on the ampersand '&':

select cast('<name>Spolsky & Atwood</name>' as xml)

Does anyone know a workaround?

Longer explanation, I need to update some data in an XML column, and I'm using a search & replace type hack by casting the XML value to a varchar, doing the replace and updating the XML column with this cast.

+5  A: 

It's not valid XML. Use &amp;:

select cast('<name>Spolsky &amp; Atwood</name>' as xml)
John Saunders
Martin Liversage
Looks ok to me.
John Saunders
Martin Liversage
KM
@KM: You were right. I thought you were talking about the code. Please go ahead and fix the text.
John Saunders
+3  A: 
select cast('<name>Spolsky &amp; Atwood</name>' as xml)

A literal ampersand inside an XML tag is not allowed by the XML standard, and such a document will fail to parse by any XML parser.

An XMLSerializer() will output the ampersand HTML-encoded.

The following code:

using System.Xml.Serialization;

namespace xml
{
    public class MyData
    {
        public string name = "Spolsky & Atwood";
    }

    class Program
    {
        static void Main(string[] args)
        {
            new XmlSerializer(typeof(MyData)).Serialize(System.Console.Out, new MyData());
        }
    }
}

will output the following:

<?xml version="1.0" encoding="utf-8"?>
<MyData
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema"&gt;
  <name>Spolsky &amp; Atwood</name>
</MyData>

, with an &amp; instead of &.

Quassnoi
edosoft
Martin Liversage
+2  A: 

You'd need to XML escape the text, too.

So let's backtrack and assume you're building that string as:

SELECT '<name>' + MyColumn + '</name>' FROM MyTable

you'd want to do something more like:

SELECT '<name>' + REPLACE( MyColumn, '&', '&amp;' ) + '</name>' FROM MyTable

Of course, you probable should cater for the other entities thus:

SELECT '<name>' + REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( MyColumn, '&', '&amp;' ), '''', '&apos;' ), '"', '&quot;' ), '<', '&lt;' ), '>', '&gt;' ) + '</name>' FROM MyTable
Rowland Shaw
Thinking about it, you could probably do this better playing about using the 'FOR XML' to generate the XML, and the relevant XPath to read the source, if it already is properly formatted XML - would need a better description of what you're trying to achieve with schema to really make a judgement
Rowland Shaw
A: 

As John and Quassnoi state, & on it's own is not valid. This is because the ampersand character is the start of a character entity - used to specify characters that cannot be represented literally. There are two forms of entity - one specifies the character by name (e.g., &amp;, or &quot;), and one the specifies the character by it's code (I believe it's the code position within the Unicode character set, but not sure. e.g., &#34; should represent a double quote).

Thus, to include a literal & in a HTML document, you must specify it's entity: &amp;. Other common ones you may encounter are &lt; for <, &gt; for >, and &quot; for ".

Jason Musgrove
+2  A: 

When working with XML in SQL you're a lot safer using built-in functions instead of converting it manually.

The following code will build a proper SQL XML variable that looks like your desired output based on a raw string:

DECLARE @ExampleString nvarchar(40)
    , @ExampleXml xml

SELECT  @ExampleString = N'Spolsky & Atwood'

SELECT  @ExampleXml =
    (
        SELECT  'Spolsky & Atwood' AS 'name'
        FOR XML PATH (''), TYPE
    )

SELECT  @ExampleString , @ExampleXml
Aaron