I am reading in what I thought was just basic text from an .html file, and I want to display it on an ASP.NET web page.

I added some CSS formatting, but it doesn't seem to fully work. I got to the bottom of the issue: what I thought was raw text turns out to be:

<SPAN style="FONT-SIZE: 16pt">
<P style="TEXT-ALIGN: center; MARGIN: 0in 0in 0pt" class=MsoNormal                                 
align=center><SPAN    style="FONT-SIZE: 16pt"><?xml:namespace prefix = o ns = 
"urn:schemas-  microsoft-com:office:office" /><o:p></o:p></SPAN></P><SPAN 
style="FONT-SIZE: 16pt"><o:p> 
<P style="TEXT-ALIGN: center; MARGIN: 0in 0in 0pt" class=MsoNormal align=center><SPAN   
style="FONT-SIZE: 16pt">General Manager’s Corner<o:p></o:p></SPAN></P>  
<P style="TEXT-ALIGN: center; MARGIN: 0in 0in 0pt" class=MsoNormal align=center><SPAN   
style="FONT-SIZE: 16pt">July 2009<o:p></o:p></SPAN></P>  
<P style="TEXT-ALIGN: center; MARGIN: 0in 0in 0pt" class=MsoNormal align=center><SPAN   
style="FONT-SIZE: 16pt"><o:p>&nbsp;</o:p></SPAN></P>

This looks like it's coming from Microsoft Word or something similar, with some inline formatting.

Is there any way I can either:

  1. Remove all the inline formatting, or
  2. Have my CSS override the inline formatting?
A: 

Inline styles take precedence over any selector in a stylesheet, so I am pretty sure they will win out every time over normal CSS rules (short of marking a rule !important).

As for removing the inline formatting itself, a quick Google search turned up a few options you could use, some free and some not.

Trotts
A: 

You might be able to use the !important hack in your stylesheet to override the inline styles.

As for removing the inline formatting, you could try Googling "paste from Word", or come up with your own regular expression to discard everything in a tag after the tag name itself.
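
A rough sketch of that regular-expression idea, written in Python only for illustration (the same pattern should carry over to .NET's Regex class with little change). The names OPEN_TAG and strip_attributes are made up for this example, and it assumes attribute values never contain a ">" character, which holds for the sample in the question but not for arbitrary HTML:

```python
import re

# Matches an opening tag: group 1 is the tag name, group 2 is an optional
# trailing "/" for self-closing tags. Everything in between is discarded.
# Assumption: no attribute value contains ">".
OPEN_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9:]*)\b[^>]*?(/?)>")


def strip_attributes(html: str) -> str:
    """Return html with all attributes removed from opening tags."""
    return OPEN_TAG.sub(r"<\1\2>", html)


sample = (
    '<P style="TEXT-ALIGN: center; MARGIN: 0in 0in 0pt" class=MsoNormal '
    'align=center><SPAN style="FONT-SIZE: 16pt">July 2009<o:p></o:p></SPAN></P>'
)
print(strip_attributes(sample))
# prints: <P><SPAN>July 2009<o:p></o:p></SPAN></P>
```

Closing tags, comments, and the <?xml:namespace ...> declaration are left alone because the pattern only matches "<" followed by a letter. Word-specific elements such as <o:p> survive too, so they would need a separate pass.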

Ian Oxley
+1  A: 

There is a small API for stripping HTML generated by Word, called WordOff. Maybe you can use that one?

Emil Stenström
If you want to try it out, just go to http://wordoff.org/
Emil Stenström
A: 

You can apply several simple regex patterns to remove the formatting:

For style:

style="[^"]*"

For class and align:

(align|class)=[A-Za-z]*

To play with the expressions, you can use this online tool: http://www.regextester.com/
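
For what it's worth, here is how the two patterns above might be applied in sequence. This is a sketch in Python rather than .NET, and the function name remove_word_attributes is hypothetical. Two things are additions, not part of the patterns as posted: a leading \s* so the whitespace in front of each removed attribute goes too, and case-insensitive matching.

```python
import re

# The two patterns from above, each prefixed with \s* (an addition) so that
# no stray space is left behind inside the tag.
STYLE_ATTR = re.compile(r'\s*style="[^"]*"', re.IGNORECASE)
CLASS_ALIGN_ATTR = re.compile(r'\s*(align|class)=[A-Za-z]*', re.IGNORECASE)


def remove_word_attributes(html: str) -> str:
    """Drop style, class, and align attributes from Word-generated HTML."""
    html = STYLE_ATTR.sub("", html)
    return CLASS_ALIGN_ATTR.sub("", html)


sample = (
    '<P style="TEXT-ALIGN: center; MARGIN: 0in 0in 0pt" class=MsoNormal '
    'align=center><SPAN style="FONT-SIZE: 16pt">July 2009<o:p></o:p></SPAN></P>'
)
print(remove_word_attributes(sample))
# prints: <P><SPAN>July 2009<o:p></o:p></SPAN></P>
```

Note that the class/align pattern only handles unquoted values made of letters, as in the question's sample. A value like class="foo" would leave the quoted part behind, and the pattern would also match "class=" appearing in ordinary body text.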

czuk
A: 

I just hand-coded something that did a bunch of find-and-replaces. I spent too much time trying third-party tools that all almost did the job.

ooo