I am developing an MVC application in PHP that uses XML and XSLT to render the views. It needs to fully support UTF-8. I also use MySQL, correctly configured for UTF-8. My problem is the following.

I have an <input type="text"/> with a value like àáèéìíòóùú"><'@#~!¡¿?. This value is processed and added to the database: I call mysql_real_escape_string($_POST["name"]) and then run a MySQL INSERT. This adds a backslash \ before " and '.

The MySQL database has DEFAULT CHARACTER SET utf8 and COLLATE utf8_spanish_ci. The table field is a normal VARCHAR.
Then I have to print this value in an XML document that will be transformed with XSLT. I can use PHP inside the XML, so I echo it with <?php echo TexUtils::obtainSqlText($value_obtained_from_sql); ?>. For now, obtainSqlText() simply returns the value it is given unchanged; it is a placeholder until I settle on the final processing.
One of the first things I need to do with this input is convert > and < to &gt; and &lt;, because otherwise they cause problems with start/end tags. This can be done with <?php htmlspecialchars($string, ENT_QUOTES, "UTF-8"); ?>. It also converts & to &amp;, " to &quot; and ' to &#039;. This is a big problem: XSLT starts to fail because it doesn't recognize all HTML special characters.
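To make this concrete, here is a minimal sketch of what that call returns for my sample value. It assumes the script file itself is saved as UTF-8; the variable names are only for illustration.

```php
<?php
// Sample value from the text input (this file must be saved as UTF-8).
$value = 'àáèéìíòóùú"><\'@#~!¡¿?';

// ENT_QUOTES escapes both double and single quotes.
// Accented letters and the other symbols are left untouched.
$escaped = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// àáèéìíòóùú&quot;&gt;&lt;&#039;@#~!¡¿?
```

By contrast, htmlentities() would also turn à into &agrave;, a named entity that XML does not predefine, so that may be the kind of reference the XSLT processor rejects.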
There is another problem. So far I've talked about the àáèéìíòóùú"><'@#~!¡¿? input, but I will also have text from a CKEditor <textarea /> whose value will look like:
<p>
<a href="http://stackoverflow.com/">àáèéìíòóùú"><'@#~!¡¿?</a>
</p>
How should I handle this? To begin with, if I want this second value to render correctly I will need to use <xsl:value-of select="value" disable-output-escaping="yes" />. Will "><' print correctly?
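This is a minimal sketch of the two output modes I am comparing. It assumes the dom and xsl extensions are enabled. The element names item and value, the sample markup, and the helper renderValue() are made up for illustration and are not part of my real application.

```php
<?php
// Hypothetical CKEditor-style value containing markup.
$html = '<p><a href="http://stackoverflow.com/">àáèéìíòóùú</a></p>';

// Build the source XML with DOM so the serializer escapes the text node.
$xml = new DOMDocument('1.0', 'UTF-8');
$item = $xml->appendChild($xml->createElement('item'));
$valueNode = $item->appendChild($xml->createElement('value'));
$valueNode->appendChild($xml->createTextNode($html));

// Transform the source with xsl:value-of, with or without
// disable-output-escaping="yes" (illustrative helper).
function renderValue(DOMDocument $source, bool $disableEscaping): string
{
    $doe = $disableEscaping ? ' disable-output-escaping="yes"' : '';
    $xsl = new DOMDocument();
    $xsl->loadXML(
        '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
        . '<xsl:output method="xml" omit-xml-declaration="yes" encoding="UTF-8"/>'
        . '<xsl:template match="/item">'
        . '<div><xsl:value-of select="value"' . $doe . '/></div>'
        . '</xsl:template>'
        . '</xsl:stylesheet>'
    );
    $processor = new XSLTProcessor();
    $processor->importStylesheet($xsl);
    return (string) $processor->transformToXml($source);
}

echo renderValue($xml, false), PHP_EOL; // markup is escaped and shown as text
echo renderValue($xml, true), PHP_EOL;  // markup is emitted as real tags
```

With escaping left on, the stored markup is displayed as literal text. With it disabled, the markup is written out as-is, which also means any script stored in that value would be written out as-is.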
So what I am really looking for is how to handle these values and how to print them. Do I need one approach when the value comes from a VARCHAR that doesn't allow HTML and another when it comes from a TEXT (for example) that does allow HTML? Will I need to use disable-output-escaping="yes" every time?
I also want to know whether doing this really protects the application against XSS attacks.
Thank you in advance!