I am building an HTML scrubber, basically an internal tool for cleaning up a problematic HTML page. I am building the tool as a web app using ASP.NET 3.5.

It consists of a button and two multiline textboxes.

You paste the HTML you want scrubbed into the top box and hit the button, and the scrubbed HTML shows up in the bottom box. (Some regex methods perform the scrubbing.)

Everything seems to work fine, except the scrubbed HTML arrives in the bottom textbox all on one line. All the carriage returns and tabs are stripped out. I pared it down to this simple code-behind, which still strips out the formatting:

TransformHTML = RawHTML.Text;
BEHtml.Text = TransformHTML;

How can I ensure that the formatting of the first Textbox will carry through to the second textbox when I assign it? Thanks.

UPDATE: There seems to be a little confusion about what I am doing. I am giving the textbox a set of HTML like so:

    <tr>
        <td align="left" valign="middle" colspan="6" style="padding-left: 5px;"><strong><a name="Cell_line" id="Cell_line"></a>CELL LINE REFERENCE TABLE</strong>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
    </tr>

When I hit submit, the button's click handler runs the following code:

TransformHTML = RawHTML.Text;
BEHtml.Text = TransformHTML;

This just copies the Textbox.Text string to a string variable and then assigns that variable to the other Textbox.Text. I get the following output:

<tr>             <td align="left" valign="middle" colspan="6" style="padding-left: 5px;"><strong><a name="Cell_line" id="Cell_line"></a>CELL LINE REFERENCE TABLE</strong>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>         </tr>

I'd like the output to be multi-line, not single-line, when I hit submit.

The scrubbing itself has no bearing on the example I am providing.

+1  A: 

By stripping all the HTML, aren't you stripping all the formatting? If you want to keep it, you may need to strip a bit less aggressively.

Also, make sure the wordwrap option is set for the second textbox (could it be that simple?)

ANSWER FROM ORIGINAL QUESTIONER: The answer above made me recheck my ASP.NET TextBox tag again and it was missing TextMode="MultiLine"
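
For reference, a minimal sketch of what the corrected page markup might look like. The control IDs RawHTML and BEHtml come from the code-behind in the question; the button ID, its click handler name, and the Rows/Columns values are assumptions for illustration only. Without TextMode="MultiLine", an ASP.NET TextBox renders as a single-line input, which cannot show line breaks; with it, the control renders as a textarea.

```aspx
<%-- Input box: the raw HTML is pasted here --%>
<asp:TextBox ID="RawHTML" runat="server" TextMode="MultiLine" Rows="10" Columns="80" />

<%-- Hypothetical button; the ID and handler name are not from the original post --%>
<asp:Button ID="ScrubButton" runat="server" Text="Scrub" OnClick="ScrubButton_Click" />

<%-- Output box: TextMode="MultiLine" was the missing attribute --%>
<asp:TextBox ID="BEHtml" runat="server" TextMode="MultiLine" Rows="10" Columns="80" />
```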

EJB
You were close. It was as simple as this: I forgot to set TextMode="MultiLine" on the second Textbox. Thanks for making me second guess myself.
RedWolves
A: 

When you strip the HTML, instead of getting rid of the BR tags, replace them with \r\n so the textbox knows to break lines.
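
A rough sketch of that idea in C#, assuming the scrubber uses System.Text.RegularExpressions. The class and method names here are hypothetical and not taken from the asker's code, which was never posted.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; not part of the original scrubber.
public static class ScrubHelper
{
    // Matches <br>, <br/>, <BR />, and similar variants.
    private static readonly Regex BrTag =
        new Regex(@"<br\s*/?>", RegexOptions.IgnoreCase);

    // Replaces each BR tag with a CRLF pair instead of deleting it.
    public static string BrToNewline(string html)
    {
        return BrTag.Replace(html, "\r\n");
    }
}
```

It could be called as BEHtml.Text = ScrubHelper.BrToNewline(RawHTML.Text); the line breaks only show up if the target textbox is multi-line.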

Serapth
See my updated comments in the question.
RedWolves
Again, it's the same process. Set a breakpoint on your first textbox's Text before you scrub it and you will notice that line breaks are represented as \r\n. Somewhere during your scrub, you are removing them for some reason. Without seeing your code, I can't tell why.
Serapth