Hi,

I'm currently working on an admin section for a website. The admin can use the Infragistics WebHtmlEditor tool to create markup for pages, which will then be inserted into those pages when they load.

What is the best way to store this markup in the database? Should we just save the HTML generated by the WebHtmlEditor into a varchar field? Are there any issues with this, e.g. will any markup be lost, or will it cause problems with the DB?

Thanks

A: 

I don't think there are any issues with HTML and SQL. Just remember to escape it before inserting (or pass it as a query parameter) so that quotes in the markup cannot break the SQL statement.

eWolf
A: 

The database will store the raw data it is given. There is no need to do anything to it from that point on: you can output the HTML stored in that varchar field straight onto the page and it will work fine.

Remember to call mysql_real_escape_string() (or your language's equivalent) on the posted WebHtmlEditor value before you put it into the database, so that it does not cause errors in the SQL query.
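
As an alternative to escaping by hand, most database drivers accept the value as a query parameter, which avoids quoting problems altogether. Here is a minimal sketch in Python, using the standard-library sqlite3 module purely as a stand-in for SQL Server. The `pages` table and both function names are made up for illustration, and a SQL Server driver would use its own parameter syntax.

```python
import sqlite3

def save_page_markup(conn, page_id, markup):
    """Store markup with a parameterized query; the driver handles quoting."""
    conn.execute(
        "INSERT INTO pages (id, markup) VALUES (?, ?)",
        (page_id, markup),
    )
    conn.commit()

def load_page_markup(conn, page_id):
    """Return the stored markup unchanged, or None if the page is missing."""
    row = conn.execute(
        "SELECT markup FROM pages WHERE id = ?", (page_id,)
    ).fetchone()
    return row[0] if row else None

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, markup TEXT)")

# Quotes and SQL-looking text in the markup are stored as plain data.
html = "<p class=\"intro\">It's Tom's page'; DROP TABLE pages; -- </p>"
save_page_markup(conn, 1, html)
```

Reading the row back with `load_page_markup(conn, 1)` should return exactly the string that was saved, so nothing needs to be unescaped after selecting.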

Sam152
Well, it's actually tagged for SQL Server.
Charlie Martin
+1  A: 

Just save it directly into your database and be conscious of the type and length. You might find it should be an nvarchar(max) column. No markup should be lost, assuming you're not doing any transformations between collecting the value from the control and passing it to the DB.

Troy Hunt
I'm so glad you said nvarchar(max) instead of ntext.
DForck42
Any reason why I should use varchar over nvarchar?
Fermin
+1  A: 

SQL is not the problem, but if the admin is allowed to paste from Word, then you need to clean up the markup before storing it. I don't know WebHtmlEditor, but you can easily test: if pasting from Word yields things like

style="mso-fareast-font-family: 'Times New Roman'; mso-ansi-language: EN-US; 
  mso-fareast-language: EN-US; mso-bidi-language: AR-SA; 
  mso-bidi-font-family: 'Times New Roman'; mso-highlight: yellow"

or

<p class="MsoNormal"> .. </p><o:p></o:p>

or a lot of additional <span> and <div> tags, then you'd want to clean up the markup before storing it. Maybe you can test using some online demo and then click some View HTML button within the editor, though then you would not know if the editor might clean up upon saving.

Note that browsers respond differently to pasting from Word, so if you're relying on WebHtmlEditor to clean things up, you may need to test in several different browsers.

Some rich text editors offer a special button "Paste from Word", but that might effectively act as "Paste as Plain Text", after which your admin might stop using it... (And, of course, your admin might simply forget to use it, so cleaning up is required even if such button exists.)
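
For what it's worth, here is a rough sketch of the kind of server-side cleanup I mean, in Python. It only handles the three patterns shown above (`mso-` style declarations, `Mso...` class names and `<o:...>` tags). It assumes double-quoted attributes and no semicolons inside style values. The function name `clean_word_markup` is made up. Regular expressions are not a real HTML parser, so treat this as an illustration rather than something to ship.

```python
import re

# Assumption: attributes are double-quoted, as in the Word output shown above.
STYLE_ATTR = re.compile(r'\sstyle="([^"]*)"', re.IGNORECASE)
OFFICE_TAG = re.compile(r"</?o:[^>]*>", re.IGNORECASE)
MSO_CLASS = re.compile(r'\sclass="Mso[^"]*"', re.IGNORECASE)

def _strip_mso_declarations(match):
    """Drop mso-* declarations; remove the style attribute if none remain."""
    declarations = [d.strip() for d in match.group(1).split(";")]
    kept = [d for d in declarations if d and not d.lower().startswith("mso-")]
    if not kept:
        return ""
    return ' style="' + "; ".join(kept) + '"'

def clean_word_markup(markup):
    """Remove Office-specific tags, classes and style declarations."""
    markup = OFFICE_TAG.sub("", markup)
    markup = MSO_CLASS.sub("", markup)
    return STYLE_ATTR.sub(_strip_mso_declarations, markup)

dirty = (
    '<p class="MsoNormal" style="mso-highlight: yellow; color: red">'
    "Hi</p><o:p></o:p>"
)
print(clean_word_markup(dirty))  # prints <p style="color: red">Hi</p>
```

Running something like this on the server means the cleanup happens regardless of which browser the admin pasted from, or whether a "Paste from Word" button was used.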

Arjan
A: 

It depends on the size of the HTML that you are storing and character encoding.

Since this post is tagged SQL Server: the largest length you can declare for VARCHAR(n) is 8000 characters.

If you need any more than that, you can use VARCHAR(MAX) on SQL Server 2005 and later, or a TEXT type on older versions.

There are caveats with TEXT fields: they restrict the use of LIKE in queries and cause problems with UNION, replication and other features.

If you need advanced character sets you can also consider the Unicode types NVARCHAR and NTEXT, but these take up twice the storage of VARCHAR and TEXT since they use 2 bytes per character instead of 1.
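
To make the size trade-off concrete, here is a small Python sketch that approximates the two storage schemes with Python's own codecs. It assumes cp1252 as the single-byte code page (your collation may use a different one) and UTF-16 for the Unicode types. The function name `storage_bytes` is made up for illustration.

```python
def storage_bytes(text):
    """Approximate bytes needed in a single-byte column vs a 2-byte Unicode column.

    Assumption: cp1252 stands in for the VARCHAR code page, UTF-16 for NVARCHAR.
    """
    single_byte = len(text.encode("cp1252", errors="replace"))
    two_byte = len(text.encode("utf-16-le"))
    return single_byte, two_byte

# 16 characters: 16 bytes in the single-byte encoding, 32 bytes in UTF-16.
print(storage_bytes("<p>Caf\u00e9 menu</p>"))  # prints (16, 32)

# Characters outside the code page cannot be represented in one byte each;
# with errors="replace" they degrade to question marks.
print("\u65e5\u672c\u8a9e".encode("cp1252", errors="replace"))  # prints b'???'
```

So the Unicode types cost twice the space for plain Western text, but they are the only safe choice if the admin may ever paste characters that fall outside the column's code page.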

If any of this content is input by users, you should be extremely careful about XSS (cross-site scripting) attacks, which are pretty close to impossible to stop once you start allowing HTML from your users.
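
One common mitigation is to run the markup through an allowlist sanitizer before storing or rendering it. Below is a minimal Python sketch built on the standard-library `html.parser`. The tag and attribute lists, the class name `AllowlistSanitizer` and the function `sanitize` are all assumptions for illustration. This is nowhere near a complete defence, so a maintained sanitizer library would be a better choice in practice.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlist for illustration; adjust to what your pages really need.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li", "br", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_URL_PREFIXES = ("http://", "https://", "/")
VOID_TAGS = {"br"}

class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped, their text content is kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops event handlers such as onclick
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue  # drops javascript: and other unexpected schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.parts.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(escape(data, quote=False))

def sanitize(markup):
    """Rebuild the markup from allowlisted tags and attributes only."""
    parser = AllowlistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)

print(sanitize('<p onclick="steal()">Hi <b>there</b><script>alert(1)</script></p>'))
# prints <p>Hi <b>there</b></p>
```

The design choice here is to rebuild the output from known-good pieces instead of trying to strip known-bad ones, since a blocklist tends to miss new attack vectors.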

VARCHAR : http://msdn.microsoft.com/en-us/library/aa258242(SQL.80).aspx

TEXT : http://msdn.microsoft.com/en-us/library/aa260619(SQL.80).aspx

XSS Attack : http://en.wikipedia.org/wiki/Cross-site_scripting

Chad Grant