Hi,

I've created a UTF-8 encoded RSS feed that presents news data drawn from a database. I've set every aspect of my database to UTF-8, and I also saved the text I put into the database as UTF-8 by pasting it into Notepad and saving it as UTF-8. So everything should be encoded in UTF-8 when the RSS feed is presented to the browser, but I am still getting the weird question mark characters in place of pound signs :(

Here is my RSS feed code (ColdFusion):

<cfsilent>
<!--- Get News --->
<cfinvoke component="com.news" method="getAll" dsn="#Request.App.dsn#" returnvariable="news" />
</cfsilent>
<!--- If we have news items --->
<cfif news.RecordCount GT 0>
<!--- Serve RSS content-type --->
<cfcontent type="application/rss+xml">
<!--- Output feed --->
<cfcontent reset="true"><?xml version="1.0" encoding="utf-8"?>
<cfoutput>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>News RSS Feed</title>
        <link>#Application.siteRoot#</link>
        <description>Welcome to the News RSS Feed</description>
        <lastBuildDate>Wed, 19 Nov 2008 09:05:00 GMT</lastBuildDate>
        <language>en-gb</language>
        <atom:link href="#Application.siteRoot#news/rss/index.cfm" rel="self" type="application/rss+xml" />

    <cfloop query="news">
    <!--- Make data xml compliant --->
        <cfscript>
        news.headline = replace(news.headline, "<", "&lt;", "ALL");
        news.body = replace(news.body, "<", "&lt;", "ALL");
        news.date = dateformat(news.date, "ddd, dd mmm yyyy");
        news.time = timeformat(news.time, "HH:mm:ss") & " GMT"; 
        </cfscript>        
    <item>
        <title>#news.headline#</title>
        <link>#Application.siteRoot#news/index.cfm?id=#news.id#</link>
        <guid>#Application.siteRoot#news/index.cfm?id=#news.id#</guid>
        <pubDate>#news.date# #news.time#</pubDate>
        <description>#news.body#</description>
    </item>
    </cfloop>
    </channel>
</rss>
</cfoutput>
<cfelse>
<!--- If we have no news items, relocate to news page --->
<cflocation url="../news/index.cfm" addtoken="no">
</cfif>

Does anyone have any suggestions? I've done loads of research but can't find any answers :(

Thanks in advance,

Chromis

A: 

Your escaping function is too simple. You need to change & to &amp; first.

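If you use named entities (e.g. &pound;), that is the cause of the error. XML predefines only &amp;, &lt;, &gt;, &quot; and &apos;, so an HTML entity such as &pound; is undefined in an RSS feed.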
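
As a rough illustration of the ordering point, here is a minimal ColdFusion sketch. The function name escapeForXml is hypothetical and not part of the original feed code, and it assumes the stored text contains raw characters rather than HTML entities.

```cfml
<cfscript>
// Hypothetical helper, not part of the original feed code.
// The ampersand is replaced first; if it were replaced last, the "&"
// in the "&lt;" and "&gt;" produced by the other steps would be
// escaped a second time.
function escapeForXml(str) {
    var result = replace(str, "&", "&amp;", "ALL");
    result = replace(result, "<", "&lt;", "ALL");
    result = replace(result, ">", "&gt;", "ALL");
    return result;
}
</cfscript>
```

Inside the cfoutput block it could be called as #escapeForXml(news.headline)#. For example, escapeForXml("Fish & chips < 5") returns Fish &amp; chips &lt; 5. A pound sign can stay as a literal character in a UTF-8 feed, or be written as the numeric reference &#163;, which needs no entity declaration.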

porneL
+5  A: 

Get rid of your escaping code and use XMLFormat instead:

<item>
    <title>#XMLFormat(news.headline)#</title>
    <link>#Application.siteRoot#news/index.cfm?id=#XMLFormat(news.id)#</link>
    <guid>#Application.siteRoot#news/index.cfm?id=#XMLFormat(news.id)#</guid>
    <pubDate>#XMLFormat(news.date)# #XMLFormat(news.time)#</pubDate>
    <description>#XMLFormat(news.body)#</description>
</item>

See the XMLFormat page in the ColdFusion LiveDocs.

rip747
A: 

OK, that's fixed it, thanks. The problem was that I was using the &pound; entity. Part of my problem is that the text will come from user input and will often be copied and pasted from Word. How do I guard against out-of-range characters from Word?

chromis
A: 

Sanitize every input when it is entered into the database; that should simplify displaying the data afterwards.
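
One possible way to do this, sketched under assumptions: the helper name normalizeWordChars is hypothetical, and the list covers only the most common Word "smart" punctuation. If the whole pipeline really is UTF-8, these characters are valid and could be stored unchanged, so this step is only useful when plain ASCII punctuation is preferred.

```cfml
<cfscript>
// Hypothetical helper: maps common Word punctuation to plain ASCII
// before the text is saved. chr() takes the Unicode code point.
function normalizeWordChars(str) {
    var result = str;
    result = replace(result, chr(8216), "'", "ALL");   // left single quote
    result = replace(result, chr(8217), "'", "ALL");   // right single quote / apostrophe
    result = replace(result, chr(8220), '"', "ALL");   // left double quote
    result = replace(result, chr(8221), '"', "ALL");   // right double quote
    result = replace(result, chr(8211), "-", "ALL");   // en dash
    result = replace(result, chr(8212), "-", "ALL");   // em dash
    result = replace(result, chr(8230), "...", "ALL"); // horizontal ellipsis
    return result;
}
</cfscript>
```

It would be applied to the submitted form value before the insert query runs, with XMLFormat still applied when the feed is output.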

Keltia