I have a text area that users typically paste content from Microsoft Word into. I am using TinyMCE for formatting. The problem is that the pasted string always contains style definitions wrapped in HTML comments. I need a way to strip this commented-out stuff from the string.

Here is an example of the comments that get added:

<!-- /* Font Definitions */ @font-face {font-family:"Courier New"; panose-1:2 7 3 9 2 2 5 2 4 4; mso-font-charset:0; mso-generic-font-family:auto; mso-font-pitch:variable; mso-font-signature:3 0 0 0 1 0;} @font-face {font-family:Wingdings; panose-1:5 2 1 2 1 8 4 8 7 8; mso-font-charset:2; -->

This is just a very small chunk of it; it usually goes on for hundreds of lines.

Anyway, I'm using strip_tags to get rid of unwanted HTML tags, and I've tried the following preg_replace, but the style comments are still always there:

$e_description = preg_replace('/<!--(.|\s)*?-->/', '',$_POST['description']);

Any suggestions on how to get rid of this junk?

Thanks.

A: 

Why not just add the ms modifiers (m is multi-line; s is "dot-all", where . matches all characters, including newlines):

preg_replace('/<!--.*?-->/ms', '', $_POST['description']);

That MAY work for you (try it out)...
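
To make the suggestion concrete, here is a minimal sketch run against a made-up sample string rather than the real form data. The `$input` value and the `null` fallback are assumptions for illustration only; `preg_replace()` does return `null` when the regex engine hits an error, which can happen on very large pasted input.

```php
<?php
// Hypothetical sample standing in for $_POST['description'].
$input = "<p>Hello</p><!-- /* Font Definitions */\n"
       . "@font-face {font-family:Wingdings;}\n"
       . "--><p>World</p>";

// 's' lets '.' match newlines; the lazy '.*?' stops at the first '-->'.
$clean = preg_replace('/<!--.*?-->/ms', '', $input);

// preg_replace() returns null on error; fall back to the raw input
// (an assumed policy, adjust as needed).
if ($clean === null) {
    $clean = $input;
}

echo $clean, PHP_EOL; // <p>Hello</p><p>World</p>
```

Only the s modifier matters for this pattern; m changes how ^ and $ behave, and neither anchor is used here.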

ircmaxell
I'd rather suggest `'/<!-- /\\* Font Definitions.*?-->/ims'`, since a user may want to input a simple comment. Even this is quite hazardous.
Mikulas Dite
This doesn't do anything: `/<!--.*?-->/ms`, and this replaces everything in the string, not just the commented area: `'/<!-- /\* Font Definitions.*?-->/ims'`. Thanks for the suggestions though.
Daelan
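
One possible explanation for the narrower pattern wiping the whole string: it contains an unescaped `/` inside a `/`-delimited pattern, so the regex ends early, `preg_replace()` raises a warning and returns `null`, and the result looks empty. A minimal sketch that avoids this by using `~` as the delimiter follows. The sample `$input` is made up, and the added `\s*` after `<!--` is an assumption about how Word formats the comment.

```php
<?php
// Hypothetical sample: one ordinary comment plus one Word style comment.
$input = "<!-- keep me --><p>Text</p><!-- /* Font Definitions */\n"
       . "@font-face {font-family:Wingdings;}\n"
       . "-->";

// '~' as delimiter, so the '/' in '/*' needs no escaping.
// '\*' is a literal asterisk; 'i' ignores case, 's' lets '.' span newlines.
$pattern = '~<!--\s*/\* Font Definitions.*?-->~is';

$clean = preg_replace($pattern, '', $input);

// null signals a regex error rather than "nothing left".
if ($clean === null) {
    $clean = $input;
}

echo $clean, PHP_EOL; // <!-- keep me --><p>Text</p>
```

This only removes comments that start with the Font Definitions marker, so ordinary comments typed by the user survive, at the cost of missing any Word comment that starts differently.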