I'm trying to write the dagger symbol '†' to an HTML page that gets converted to a PDF document, but it appears in the PDF as 'â€'.

I understand that I need to use the HTML code for this symbol, which is &#8224;.

I've done this successfully for '€', but in those cases I wrote the code directly into the HTML. In this case, I'm reading the symbol from an XML file. When I inspect the value of the variable that contains the symbol, it appears as '†'.

I should note that I've tried reading both the symbol and the code from the XML file, as follows:

<fund id="777" countryid="N0" append="&#8224;" />

and

<fund id="777" countryid="N0" append="†" />

but both are stored in the variable as the symbol, and when I write them to the page, both are rendered as 'â€'. Also, I've tried the following:

string code = "&#8224;";   // numeric character reference for the dagger
string symbol = "†";       // the literal dagger character
string htmlEncodedCode = HttpUtility.HtmlEncode(code);     // escapes the ampersand
string htmlEncodedSymbol = HttpUtility.HtmlEncode(symbol);

tc.Text = fund.Name + code + " " + symbol + " " +
    htmlEncodedCode + " " + htmlEncodedSymbol;

but only the first works. It appears in the document as:

FundName† â€ &#8224; â€

Can somebody suggest how I can get this to work?

Update:

@James Curran's answer below was correct. Just for the sake of clarity, I had to change the XML to:

<fund id="777" countryid="N0" append="&amp;dagger;" /> 

and in my C#:

tc.Text = fund.Name + append;
+1  A: 

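This is an encoding issue. 'â€' is probably the UTF-8 encoding of the dagger being displayed as Latin-1. Try converting the text from UTF-8 to the encoding the PDF converter expects. Note that strict ISO-8859-1 has no dagger; Windows-1252 has it at byte 0x86.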
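To make the mismatch concrete, here is a minimal sketch that encodes the dagger as UTF-8 and then decodes the same bytes as ISO-8859-1. The class and method names (`MojibakeDemo`, `MisreadUtf8AsLatin1`) are made up for illustration and are not part of the asker's code. Windows-1252 maps byte 0x80 to '€' and byte 0xA0 to a non-breaking space, which would explain why the PDF shows 'â€' followed by what looks like a blank.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Encodes text as UTF-8, then decodes those bytes as ISO-8859-1,
    // which is what a consumer does when it assumes the wrong charset.
    public static string MisreadUtf8AsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("ISO-8859-1").GetString(utf8Bytes);
    }

    public static void Main()
    {
        string dagger = "\u2020"; // the dagger character

        // The dagger occupies three bytes in UTF-8.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(dagger))); // E2-80-A0

        // Each byte becomes one character when read as ISO-8859-1:
        // U+00E2 (â), U+0080, U+00A0 (non-breaking space).
        foreach (char c in MisreadUtf8AsLatin1(dagger))
        {
            Console.WriteLine("U+{0:X4}", (int)c);
        }
    }
}
```

If the page is sent as UTF-8 but read as a single-byte encoding, every non-ASCII character is affected in the same way, while plain ASCII text such as `&#8224;` comes through unchanged.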

Sjoerd
Or try to get the PDF to render using UTF-8.
R. Bemrose
Indeed. And read Joel Spolsky's "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)": http://www.joelonsoftware.com/articles/Unicode.html Really!
Matt Gibson
@R. Bemrose it's more likely an HTML issue. The default encoding for HTTP content is ISO-8859-1. So if the encoding for the HTML page is not set to UTF-8, that is what the PDF renderer will use.
JeremyP
+3  A: 
James Curran
Storing this in the XML file as `<fund id="777" countryid="N0" append="&dagger;" />` returns the error: `Reference to undeclared entity 'dagger'.` Do I need to refer to it differently in the XML?
DaveDev
R. Bemrose
A: 

With the XML file what you probably want to do is something along the lines of:

<fund id="777" countryid="N0" append="&amp;#8224;" />

The reason is that the XML parser will interpret the &amp; as the & symbol and the rest as literal text. Thus in your HTML you will get &#8224; and that should do you.
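The difference between the three spellings can be checked with a small sketch using `XElement.Parse` from `System.Xml.Linq`. The helper name `ReadAppend` and the trimmed-down `<fund>` elements are illustrative assumptions, not the asker's actual loading code.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class XmlEntityDemo
{
    // Returns the parsed value of the append attribute.
    public static string ReadAppend(string xml)
    {
        return (string)XElement.Parse(xml).Attribute("append");
    }

    public static void Main()
    {
        // Numeric reference: the parser resolves it to the dagger character itself.
        Console.WriteLine(ReadAppend("<fund append=\"&#8224;\" />") == "\u2020"); // True

        // Escaped ampersand: the parser yields the literal ASCII text &#8224;
        Console.WriteLine(ReadAppend("<fund append=\"&amp;#8224;\" />")); // &#8224;

        // Named HTML entity: not predefined in XML, so parsing fails.
        try
        {
            ReadAppend("<fund append=\"&dagger;\" />");
        }
        catch (XmlException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

Only the escaped form reaches the page as plain ASCII, so it no longer depends on which encoding the PDF converter assumes.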

Chris