ansaurus

Question

Transform URL into a link unless there already was a link

Answer 1

+3 A:

This problem is beyond the power of regular expressions. You might be able to write a regex that could avoid some links, but you wouldn't be able to avoid every existing link.

The good news is that a different approach will make the job much easier. Right now you are using document.body.innerHTML to manipulate the HTML as plain text. To do it correctly that way, you would basically need to parse the HTML yourself. But you don't have to, because the browser has already parsed it for you!

The web browser lets you access an HTML document as a tree of objects. This is called the Document Object Model (DOM), and if you do some reading on it, you should be able to learn how to traverse the HTML, skipping over anything inside an A element and applying the regex you have to plain text only.
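A minimal sketch of that traversal, under some assumptions: the function name collectTextNodes and the constants are hypothetical, and skipping SCRIPT, STYLE, and TEXTAREA in addition to A is an extra precaution not stated above. It only reads nodeType, nodeName, and childNodes, which every DOM node provides.

```javascript
// Numeric values of Node.ELEMENT_NODE and Node.TEXT_NODE.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// Elements whose contents should never be auto-linked (assumed set).
var SKIPPED = { A: true, SCRIPT: true, STYLE: true, TEXTAREA: true };

// Recursively gather the text nodes under `node`, without descending
// into any element listed in SKIPPED.
function collectTextNodes(node, found) {
    found = found || [];
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        var child = children[i];
        if (child.nodeType === TEXT_NODE) {
            found.push(child);
        } else if (child.nodeType === ELEMENT_NODE &&
                   !SKIPPED[child.nodeName.toUpperCase()]) {
            collectTextNodes(child, found);
        }
    }
    return found;
}

// In a browser: var nodes = collectTextNodes(document.body);
```

Each returned node holds plain text that is known not to sit inside an existing link, so the URL regex can be applied to its nodeValue without touching links that are already there.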

benzado 2010-10-17 02:36:15

Thanks! I'll try.

Matias 2010-10-17 23:36:53

Answer 2

+1 A:

Using the jQuery JavaScript library, this would look like (demo at http://jsfiddle.net/BRPRH/4):

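function autolink() {
    // \u0026, \u003c and \u003e are escapes for the ampersand and the angle
    // brackets, so the script itself contains none of those characters.
    var exp = /(\b(https?|ftp):\/\/[-A-Z0-9+\u0026@#\/%?=~_|!:,.;]*[-A-Z0-9+\u0026@#\/%=~_|])/gi,
        lt = '\u003c',
        gt = '\u003e';

    // Visit the child nodes of every element that is not itself a link,
    // script, style, or textarea.
    $('*:not(a, script, style, textarea)').contents().each(function() {
        if (this.nodeType === Node.TEXT_NODE) {
            var textNode = $(this);
            // .text() escapes the node's text, so .html() returns markup
            // that is safe to run the replacement on.
            var span = $(lt + 'span/' + gt).text(this.nodeValue);
            span.html(span.html().replace(exp, lt + 'a href=\'$1\'' + gt + '$1' + lt + '/a' + gt));
            textNode.replaceWith(span);
        }
    });
}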

$(autolink);

Edit: Excluded textareas, scripts, and embedded CSS. Note that this can also be done with the plain DOM method splitText, which has the advantage of not adding extra span elements.

Edit 2: Eliminated all ampersands and double quotes.

Edit 3: Got rid of < and > characters as well.
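For illustration, here is a rough sketch of the splitText variant mentioned in the first edit. The function name linkifyTextNode is hypothetical, and the sketch assumes the caller only passes text nodes that are not already inside a link, script, style, or textarea. It reuses the URL pattern from the script above, without the g flag so each search starts at the beginning of the remaining text.

```javascript
// Same URL pattern as the jQuery version (\u0026 matches an ampersand).
var URL_PATTERN = /\b(?:https?|ftp):\/\/[-A-Z0-9+\u0026@#\/%?=~_|!:,.;]*[-A-Z0-9+\u0026@#\/%=~_|]/i;

// Wrap every URL found in one text node in an A element, in place.
function linkifyTextNode(textNode) {
    var doc = textNode.ownerDocument;
    var match;
    while ((match = URL_PATTERN.exec(textNode.nodeValue)) !== null) {
        // Cut the node in three: text before the URL, the URL, the rest.
        var urlNode = textNode.splitText(match.index);
        var rest = urlNode.splitText(match[0].length);

        // Put a new link where the URL text was, then move the text into it.
        var link = doc.createElement('a');
        link.href = match[0];
        urlNode.parentNode.replaceChild(link, urlNode);
        link.appendChild(urlNode);

        // Keep scanning whatever follows the URL.
        textNode = rest;
    }
}
```

Because the existing text node is split rather than replaced, no wrapper span is added, and the surrounding text stays as ordinary text nodes next to the new links.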

idealmachine 2010-10-17 03:30:36

Matias 2010-10-17 23:31:57

Anyway, I learned some interesting things thanks to you. Maybe the script should exclude img tags as well...

Matias 2010-10-17 23:36:11

@Matias: I've edited the script to eliminate all ampersands and double quotes, if you think that's a problem.

idealmachine 2010-10-17 23:43:58

Also seems like Blogger replaces $('<span/>') with $("<span></span>").

Matias 2010-10-17 23:49:34

Thanks for your time and patience, but I have lost. Blogger doesn't accept the A tag without the quotes. Blogger says: «Open quote is expected for attribute "{1}" associated with an element type "href".»

Matias 2010-10-17 23:57:17

Blogger accepts your third edit as valid. Anyway, the script is not writing anything on the page. It keeps changing '\u003c' with '\u003c'. I do not want you to go crazy. Thanks for the help... I'll keep trying.

Matias 2010-10-18 00:14:28
