Hey folks.

I know this issue has been touched on here, but I have not found a viable solution for my situation yet, so I'd like to put the brain trust back to work and see what can be done.

I have a textarea in a form that needs to detect when something is pasted into it, and clean out any hidden HTML & quotation marks. The content of this form is getting emailed to a 3rd party system which is particularly bitchy, so sometimes even encoding it to the html entity characters isn't going to be a safe bet.

I unfortunately cannot use something like FCKEditor or TinyMCE; it's gotta stay a regular textarea in this instance. I have attempted to dissect FCKEditor's "paste from Word" function but have not had any luck tracking it down.

I am however able to use the jQuery library if need be, but haven't found a jQuery plugin for this just yet.

I am specifically looking for information geared towards cleaning the information pasted in, not how to monitor the element for change of content.

Any constructive help would be greatly appreciated.

A: 

Edited from the jQuery docs:

$("textarea").change( function() {
    // check input ($(this).val()) for validity here
});

That's for detecting the changes. The cleaning itself would probably be a regex of some sort.

(Edited the above to look for a textarea, not a textbox.)

David Archer
I'm familiar with how to monitor a paste action into the textarea, but I really need something more to do with actually cleaning the content. Thanks though.
FluidFoundation
+1  A: 

What about something like this:

function cleanHTML(pastedString) {
    var cleanString = "";
    var insideTag = false;
    for (var i = 0, len = pastedString.length; i < len; i++) {
        if (pastedString.charAt(i) == "<") insideTag = true;
        if (pastedString.charAt(i) == ">") {
            if (pastedString.charAt(i+1) != "<") {
                insideTag = false;
                i++;
            }
        }
        if (!insideTag) cleanString += pastedString.charAt(i);
    }
    return cleanString;
}

Then just use the event listener to call this function and pass in the pasted string.

peirix
Does that still work if there are unmatched < or > signs in html comments?
Nosredna
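
On the question about comments: a stray `<` or `>` inside an HTML comment can throw off a character-by-character scan or a single tag regex, and markup pasted from Word tends to be full of comments. One possible way around it, as a rough sketch rather than a real HTML parser, is to remove comments first and strip the remaining tags second. The name `stripCommentsAndTags` is made up for illustration.

```javascript
// Hypothetical helper: remove HTML comments first, then any remaining tags.
// Regex-based sketch only; it does not understand HTML the way a parser does.
function stripCommentsAndTags(pastedString) {
    return pastedString
        // Drop <!-- ... --> blocks, including Word-style conditional comments.
        .replace(/<!--[\s\S]*?-->/g, "")
        // Strip whatever tags are left.
        .replace(/<\/?[^>]+>/g, "");
}
```

Note that plain text such as `a < b and c > d` would still lose the part between the two signs, so this is only reasonable if the pasted text is not expected to contain literal angle brackets.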
+3  A: 

Hi FluidFoundation, I am looking at David Archer's answer and he pretty much answers it. I have used a similar solution in the past:

$("textarea").change( function() {
    // Convert any opening and closing angle brackets to their HTML-encoded equivalents.
    var strClean = $(this).val().replace(/</gi, '&lt;').replace(/>/gi, '&gt;');

    // Remove any double and single quotation marks.
    strClean = strClean.replace(/"/gi, '').replace(/'/gi, '');

    // put the data back in.
    $(this).val(strClean);
});

If you are looking for a way to completely REMOVE HTML tags:

$("textarea").change( function() {
    // Completely strips tags.  Taken from Prototype library.
    var strClean = $(this).val().replace(/<\/?[^>]+>/gi, '');

    // Remove any double and single quotation marks.
    strClean = strClean.replace(/"/gi, '').replace(/'/gi, '');

    // put the data back in.
    $(this).val(strClean);
});
Shane Tomlinson
I wish I knew RegEx, so I could write a solution as easy as this (:
peirix
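
Since the pasted text often comes from Word, the quotation marks may be typographic ("curly") ones rather than the plain `"` and `'` matched above. A minimal sketch that removes both kinds, assuming the receiving system should get no quotation marks at all; `removeQuotes` is an illustrative name, not something jQuery provides.

```javascript
// Hypothetical helper: remove straight and curly quotation marks.
function removeQuotes(text) {
    // Matches " and ' plus U+2018/U+2019 (single curly)
    // and U+201C/U+201D (double curly).
    return text.replace(/["'\u2018\u2019\u201C\u201D]/g, "");
}
```

It could be chained after the tag-stripping step, for example `removeQuotes(strClean)`.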
+1  A: 

You could check out Word HTML Cleaner by Connor McKay. It is a pretty strong cleaner, in that it removes a lot of stuff that you might want to keep, but if that's not a problem it looks pretty decent.

Tim Molendijk
A: 

It might be useful to use the blur event which would be triggered less often:

$("textarea").blur(function() {
    // check input ($(this).val()) for validity here
});
zilverdistel
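
To tie the event handling and the cleaning together, something along these lines might work. It is a sketch under a few assumptions: jQuery 1.7 or later is loaded (for `.on`), and the names `makeCleaningHandler` and `stripTagsAndQuotes` are invented for this example.

```javascript
// Hypothetical helper: builds an event handler that cleans the field in place.
function makeCleaningHandler(clean) {
    return function () {
        // jQuery calls handlers with `this` set to the DOM element.
        this.value = clean(this.value);
    };
}

// Example cleaner: strip tags, then plain double and single quotation marks.
function stripTagsAndQuotes(text) {
    return text.replace(/<\/?[^>]+>/g, "").replace(/["']/g, "");
}

// Only wire it up where jQuery is actually available (i.e. in the browser).
if (typeof jQuery !== "undefined") {
    jQuery("textarea").on("blur", makeCleaningHandler(stripTagsAndQuotes));
}
```

Keeping the cleaner separate from the handler means the cleaning logic can be tried out on plain strings without a page.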