I have a problem with an RSS feed.

When I do <title>This is a title </title>

The title appears nicely in the feed.

But when I do $title = "this is a title";

<title><![CDATA['$title']]></title>

The title doesn't appear at all.


It still doesn't work. I generate my RSS feed dynamically and it looks like this:

$item_template="
      <item>
         <title>[[title]]</title>
         <link>[[link]]</link>
         <description><![CDATA[[[description]]]]></description>
         <pubDate>[[date]]</pubDate>
      </item>
      ";

and in a loop:

$s.=str_replace(
array("[[title]]","[[link]]","[[description]]","[[date]]"),
array(htmlentities($row["title"]),$url,$description,$date),
$item_template);

The problem occurs specifically when the title contains a euro sign. It then shows up in my RSS validator like:

Â\x80


More detailed information:

OK, I have been struggling with this for the last few days and I can't find a solution, so I will start a bounty. Here is more information:

  • The information that goes in the feed is stored in a Latin-1 database (which I administer).
  • The problem appears when there is a euro sign in the database, no matter whether it is stored as € or &euro;
  • The euro sign sometimes appears as weird characters or as Â\x80
  • I am trying to solve the problem on the feed side, not on the reader side.
  • The complete code can be found over here: codedump
  • Also, sometimes when the euro sign cannot be parsed, the item's title or description is shown empty: if you view the source of the feed in a browser, you'll find <title></title>

If more information is needed, please ask.

+1  A: 

Which programming language or environment do you use? For instance, in PHP the single quotes prevent evaluating the variables inside.

Otherwise, in this case you don't really need those quotes. Maybe you were confused by the array syntax of PHP.

So you'd better write:

<title><![CDATA[$title]]></title>
Török Gábor
A: 

I believe the RSS Profile does not allow it: this document states that title holds character data, which is further defined as follows.

Anton Gogolev
A: 

So I understand it is not recommended to use CDATA in the title. The only problem I have left is the euro sign, which appears in the validator as "Â\x80 ".

Should I replace it with a regex? Or what would you advise?

sanders
+10  A: 

The problem is your outputting code; change

echo '<title><![CDATA[$title]]></title>';

to

echo '<title><![CDATA[' . $title . ']]></title>';
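
The difference between the two lines above can be checked in isolation. The following is a minimal sketch, assuming the sample title from the question; the variable names $literal, $interpolated and $concatenated are only illustrative.

```php
<?php
$title = "this is a title";

// Single quotes: PHP does not interpolate, so the text "$title" is emitted as-is.
$literal = '<title><![CDATA[$title]]></title>';

// Double quotes: the variable is interpolated (braces make the boundary explicit).
$interpolated = "<title><![CDATA[{$title}]]></title>";

// Concatenation: same result as the double-quoted form.
$concatenated = '<title><![CDATA[' . $title . ']]></title>';

echo $literal, "\n";
echo $interpolated, "\n";
echo $concatenated, "\n";
```

Only the first line still contains the characters $title in its output; the other two contain the actual title text.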

As a side note, please mind the following: Do not answer your own question with a follow-up, but edit the original one. Do not use regexps for no good reason. Do not guess.

Instead, do what you should have done all along: Wrap the title in htmlentities and be done, as in:

echo '<title>' . htmlentities($title, ENT_NOQUOTES, [encoding]) . '</title>';

Replace [encoding] with the character encoding you are using. Most likely, this is 'UTF-8'. This is necessary because PHP (< 6) uses ISO-8859-1 by default and there is no way to express e.g. the euro sign in that encoding. For further information, please refer to this well-written introduction.

I also suggest you read about XML. Start with the second chapter.

phihag
I have removed my follow-ups and edited my original post.
sanders
I can still see your answer, but thanks for the effort. I'm sorry, but my answer was incomplete and I should have seen that right away. You also need to specify a character encoding. Edited the answer.
phihag
Well, still problems with the euro sign. $s.=str_replace( array("[[title]]","[[link]]","[[description]]","[[datum]]"), array(htmlentities($row["title"],ENT_NOQUOTES,"UTF-8",false),$url,$description,$datum), $item_template); This gives back an empty title.
sanders
@sanders: Why are you setting the fourth parameter (double_encode) of htmlentities? Also, please check that your encoding is really UTF-8.
phihag
I removed the fourth parameter. I return it as UTF-8 with return utf8_encode($s); but still no result.
sanders
@sanders Sorry for the long delay. The problem is the encoding of $s. Try $row["title"] = "\xe2\x82\xac"; /* € in UTF-8 */. This should yield <title>€</title>. Then look at how your $s differs from € in UTF-8 and trace it to the original problem.
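
One possible way to trace the difference is sketched below. It assumes the database returns the euro sign as the single byte 0x80 (its Windows-1252 code), which is not confirmed anywhere in the thread. Under that assumption, converting from ISO-8859-1 (which is what utf8_encode does) yields the bytes C2 80, matching the Â\x80 the validator shows, and passing the raw byte to htmlentities with 'UTF-8' yields an empty string, because the input is not valid UTF-8. The names $fromDb, $wrong, $right and $empty are illustrative only.

```php
<?php
// Assumption: the euro sign arrives from the database as the byte 0x80.
$fromDb = "\x80";

// Interpreting the byte as ISO-8859-1 maps it to the control character U+0080.
$wrong = iconv('ISO-8859-1', 'UTF-8', $fromDb);

// Interpreting it as Windows-1252 maps it to U+20AC, the euro sign.
$right = iconv('Windows-1252', 'UTF-8', $fromDb);

// Escaping the raw byte as if it were UTF-8 fails: the byte sequence is invalid.
$empty = htmlentities($fromDb, ENT_NOQUOTES, 'UTF-8');

echo bin2hex($wrong), "\n";  // c280
echo bin2hex($right), "\n";  // e282ac
var_dump($empty === '');     // bool(true)
```

If the bytes in your own $s match the first case, converting from Windows-1252 instead of ISO-8859-1 before escaping may be worth trying.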
phihag
A: 

This article may be helpful for information about the euro sign and support in various contexts. Some of the suggestions from that article include using &#8364; or &euro; or just replacing the sign with the word "euro." Good luck!

Mike Ivanov
+1  A: 

Use htmlspecialchars() instead of htmlentities().

RSS/Atom feeds are not HTML, so you can't use HTML entities in them. XML has only five entities defined by default, so you can’t use &euro;. Since you’re using UTF-8, use a literal euro sign without conversion (no htmlentities), but escape the other sensitive characters (htmlspecialchars).

This would be completely valid RSS/XML. If it doesn’t solve the problem, then the problem lies somewhere else (please provide the generated raw source of the RSS for more help).
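
The difference between the two functions can be seen on a small sample. This is a minimal sketch, assuming the title is already valid UTF-8; the sample string and variable names are made up for illustration.

```php
<?php
// Sample title in UTF-8: "5 € & up" (the euro sign is the bytes E2 82 AC).
$title = "5 \xE2\x82\xAC & up";

// htmlentities turns the euro sign into &euro;, which plain XML does not define.
$withEntities = htmlentities($title, ENT_NOQUOTES, 'UTF-8');

// htmlspecialchars leaves the euro sign alone and escapes only &, < and >.
$withSpecialChars = htmlspecialchars($title, ENT_NOQUOTES, 'UTF-8');

echo '<title>' . $withEntities . '</title>', "\n";
echo '<title>' . $withSpecialChars . '</title>', "\n";
```

The first line contains &euro; and would need a DTD to be parsed as XML; the second contains the literal euro sign plus &amp;, which every XML parser understands, provided the feed declares UTF-8 as its encoding.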

Maciej Łebkowski