I want to copy a javascript: URL character for character. How, for example, would I successfully copy the JavaScript from the 'View Source' link on this page:

http://javascript.about.com/library/blsource.htm

doing something like(?):

(function(){
    var w=open('','');
    with(w.document) { 
        write(encodeBlahComponent(document.activeElement.href).replace(/blah/g,'asii equivalent').replace(/blah/g,'unicode equivalent').replace(/blah/g,'entity equivalent'));
        close();
    }
})()

What encoding should I use, and how do I script it properly?

A: 

Thank you David.

eligmatic
Please thank people by accepting their answer (the tick under the score), not by "answering" your own question.
David Dorward
A: 

If you're writing to an HTML document with document.write, any text you output has to be HTML-escaped:

function encodeHTML(s) { // for text content and attribute values with " delimiter
    return s.split('&').join('&amp;').split('<').join('&lt;').split('"').join('&quot;');
}

somedocument.write(encodeHTML(link.href));
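
As a quick check of what that escaping produces, here is a small sketch that runs encodeHTML on a made-up javascript: URL. The sample string and the variable names sampleHref and escaped are assumptions for illustration, not the link from the question:

```javascript
// Same helper as above: escapes the characters that matter in text content
// and in attribute values delimited by double quotes.
function encodeHTML(s) {
    return s.split('&').join('&amp;').split('<').join('&lt;').split('"').join('&quot;');
}

// Hypothetical sample URL, standing in for link.href.
var sampleHref = 'javascript:alert("a<b&&c")';
var escaped = encodeHTML(sampleHref);

console.log(escaped);
// javascript:alert(&quot;a&lt;b&amp;&amp;c&quot;)
```

The ampersand is replaced first so that the & introduced by the later replacements is not escaped a second time.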

However it would probably be easier to use DOM methods:

somedocument.write('<p id="out">x</p>');
somedocument.getElementById('out').firstChild.data= link.href;

Either way you don't have to worry about Unicode or &#...; character references. JavaScript strings are natively Unicode. And you would only need to think about using encodeURI if you were creating a URI from some script (e.g. var uri= 'javascript:'+encodeURI(somejscode)), which you're not doing here: you've already got a URI in the link. (encodeURIComponent would also work, but for this case, where you have a whole URI and not just a component, encodeURI will give simpler results.)
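
To illustrate the difference between the two functions, here is a rough sketch that builds a javascript: URI both ways. The contents of somejscode are a hypothetical example, and the variable names viaEncodeURI and viaEncodeURIComponent are made up for this sketch:

```javascript
// Hypothetical snippet of script to wrap in a javascript: URI.
var somejscode = 'alert("a b"); void(0/1)';

// encodeURI leaves URI-reserved punctuation such as ; and / alone.
var viaEncodeURI = 'javascript:' + encodeURI(somejscode);

// encodeURIComponent escapes that punctuation as well.
var viaEncodeURIComponent = 'javascript:' + encodeURIComponent(somejscode);

console.log(viaEncodeURI);
// javascript:alert(%22a%20b%22);%20void(0/1)
console.log(viaEncodeURIComponent);
// javascript:alert(%22a%20b%22)%3B%20void(0%2F1)
```

Both forms decode back to the same script; the encodeURI version is simply shorter and easier to read.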

PS. You don't want to use the with statement ever, if you can help it. (Or javascript: URLs, for that matter!)

ETA. If you really need the original source with all errors intact, you would have to do what web-sniffer does and fetch the page again from the network. You can do this for the current page, as long as it was the result of a GET request, using an XMLHttpRequest. For example:

var d= window.open().document, x= new XMLHttpRequest();
d.write('<body><pre>x</pre>');
d.close();
x.onreadystatechange= function() {
    if (this.readyState===4)
        d.body.firstChild.firstChild.data= this.responseText;
}
x.open('GET', location.href);
x.send(null);

Or, packed into a bookmarklet:

javascript:(function()%7Bvar%20d=window.open().document,x=new%20XMLHttpRequest();d.write('%3Cbody%3E%3Cpre%3Ex%3C/pre%3E');d.close();x.onreadystatechange=function()%7Bif(this.readyState===4)d.body.firstChild.firstChild.data=this.responseText%7D;x.open('GET',location.href);x.send(null)%7D)()
bobince
I understand you. However, I think I've not explained myself properly. The example link is a bookmarklet whose method involves escaping entities like greater/less than and ampersand twice! A mind-bender!!! For instance, a 'greater than' entity is escaped twice. First as: .replace(/greater than/g,'ampersand-g-t-semicolon') and then as: .replace(/greater than/g,'ampersand-aammpp-semicolon-g-t-semicolon') (don't ask me why!)
eligmatic
How on earth do I copy that, staying loyal to the template pattern of the original URL? Is there a custom dot-prototype function or RegExp out there for such an instance? Maybe my question should have been: How do I copy a JavaScript string that has HTML and DHTML entities in it using JavaScript? This is twisting my melon, man!
eligmatic
I don't see the problem... you want to put content in HTML, you just `encodeHTML` it as above. It's not at all complicated. In fact looking at the view-source bookmarklet, it's clear it's utterly bogus. It's trying to do all sorts of escapes and unescapes that are totally irrelevant, and some that are completely ineffective and broken syntax like the attempt to escape `<`.
bobince
A better view-source bookmarklet would be the much shorter: `javascript:(function()%7Bh=document.documentElement.innerHTML;document.write('%5Cx3Cbody%3E%5Cx3Cpre%3E_');document.close();document.body.firstChild.firstChild.data=h;%7D)()`.
bobince
Sorry, but sensible as your answers are, they don't help me. I need a reliable method to clone JavaScript strings! Surely in order to do that I need to encode each char in the string uniquely before I attempt to make my copy/clone? Otherwise, my "copy" won't be true to the original (although I may be able to run it the first time). Do you see my problem?
eligmatic
bobince, thank you for your attention.
eligmatic
No, I do not see your problem. Why do you think you need to encode anything to make a copy of a string? Encoding is a function that is needed when you go from one context to another, such as from raw text to HTML source. If you take the `href` of a link, you've got a URL in a string. If the context you want to use that URL in is just as a URL then there is nothing more you need to do. In that case the way to “clone” a JavaScript string is to say just `var newstring= oldstring`. If the target context is HTML source written by `document.write`, you only need to HTML-encode it; nothing else.
bobince
Thank you for hanging in there. This is my test script: `(function(){var d=document,p=prompt("a, b, c, or d?"),lh=d.links[17].href,W=open('',''),wd=W.document;function w(w){open(w+encodeURIComponent(d.URL),'')}function encodeHTML(s){return s.split('&').join('&amp;').split('<').join('&lt;').split('"').join('&quot;')}if (p=='a') w('http://web-sniffer.net/?url=');if (p=='b') w('http://redbot.org/webui.py?uri=');if (p=='c'){wd.write('<p id="out">x</p>');W.document.all.out.firstChild.data=lh}if (p=='d'){wd.write(encodeHTML(lh))};})();`
eligmatic
Results 'a' and 'b' both display the link contents correctly (as I want them copied). How do they do it?!!! Results 'c' and 'd' both display the link contents as: javascript:(function(){var d=document,c=unescape(d.documentElement.innerHTML);c=c.replace(/c=c.replace(/</g,'<');c=c.replace(/>/g,'>');c=c.replace(/</g,'<');c=c.replace(/>/g,'>');d.write('<html><head><title>Source of Page<\/title><\/head><body><pre>'+c+'<\/pre><\/body><\/html>');x.document.close()})();
eligmatic
A: 

Re: "Encoding is a function that is needed when you go from one context to another, such as from raw text to HTML source. If you take the href of a link, you've got a URL in a string. If the context you want to use that URL in is just as a URL then there is nothing more you need to do. In that case the way to “clone” a JavaScript string is to say just var newstring= oldstring. If the target context is HTML source written by document.write, you only need to HTML-encode it; nothing else.".

Looks like I've been making a mountain out of a molehill. Your advice didn't sink in until now. It was that crud "View Source" bookmarklet that threw me (I thought the author was supposed to be a JS guru!). Another one of those "learning experience" moments I suppose. Never mind. Many thanks for your support. Thread closed and thank you again.

eligmatic