I am developing my first Firefox extension and for that I need to get the complete source code of the current page. How can I do that with XUL?
Maybe you can get it via DOM, using
var source = document.getElementsByTagName("html")[0];
and fetch the source using DOMParser
Hello everybody,
It really looks like there is no way to get "all the source code". You may use
document.documentElement.innerHTML
to get the innerHTML of the top element (usually html). If you have a PHP error message printed before the document, like
<h3>fatal error</h3>
segfault
<html>
<head>
<title>bla</title>
<script type="text/javascript">
alert(document.documentElement.innerHTML);
</script>
</head>
<body>
</body>
</html>
the innerHTML would be
<head>
<title>bla</title></head><body><h3>fatal error</h3>
segfault
<script type="text/javascript">
alert(document.documentElement.innerHTML);
</script></body>
so the error message is still retained, although it has been moved into the body.
edit: documentElement is described here: https://developer.mozilla.org/en/DOM/document.documentElement
You can get the URL with var URL = document.location.href
and navigate to "view-source:" + URL.
Now you can fetch the whole source code (viewsource is the id of the body):
var code = document.getElementById('viewsource').innerHTML;
The problem is that the source code is formatted as HTML markup, so you have to run equivalents of PHP's strip_tags() and htmlspecialchars_decode() on it to recover the original text.
For example, line 1 should be the doctype and line 2 should look like:
&lt;<span class="start-tag">HTML</span>&gt;
So after strip_tags() it becomes:
&lt;HTML&gt;
And after htmlspecialchars_decode() we finally get the expected result:
<HTML>
The code is never passed through the DOM parser, so you can view invalid HTML too.
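Since strip_tags() and htmlspecialchars_decode() are PHP functions, an extension would need JavaScript stand-ins. The following is only a rough sketch: stripTags, decodeSpecialChars and viewSourceMarkupToText are hypothetical helper names, and the tag regex assumes the view-source markup never contains a raw ">" inside an attribute value.

```javascript
// Hypothetical stand-in for PHP's strip_tags(): drop anything tag-shaped.
function stripTags(html) {
  return html.replace(/<[^>]*>/g, "");
}

// Hypothetical stand-in for PHP's htmlspecialchars_decode().
// "&amp;" is decoded last so that "&amp;lt;" becomes "&lt;", not "<".
function decodeSpecialChars(text) {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#0?39;/g, "'")
    .replace(/&amp;/g, "&");
}

// Turn the innerHTML of the view-source page back into plain source text.
function viewSourceMarkupToText(markup) {
  return decodeSpecialChars(stripTags(markup));
}
```

Calling viewSourceMarkupToText() on the line 2 example above should walk through the same two steps and end with the plain HTML start tag.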
You will need a XUL browser object to load the content into.
Load the "view-source:" version of your page into the browser object, in the same way as the "View Page Source" menu does. See function viewSource() in chrome://global/content/viewSource.js. That function can load from cache, or not.
Once the content is loaded, the original source is given by:
var source = browser.contentDocument.getElementById('viewsource').textContent;
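The steps above could be wired together roughly as follows. This is a sketch under assumptions, not the code from viewSource.js: loadPageSource is a hypothetical helper, and it assumes the browser object exposes loadURI() and that a capturing "load" listener fires once the view-source document is ready. Check both against your Firefox version.

```javascript
// Hypothetical helper: load "view-source:" + url into a XUL browser
// object and pass the original source text to callback.
function loadPageSource(browser, url, callback) {
  function onLoad() {
    // Only the first load is of interest, so detach the listener.
    browser.removeEventListener("load", onLoad, true);
    var body = browser.contentDocument.getElementById("viewsource");
    // Pass null if the expected element is missing.
    callback(body ? body.textContent : null);
  }
  // Assumption: the content load event is observed in the capture phase.
  browser.addEventListener("load", onLoad, true);
  // Assumption: the browser object provides loadURI().
  browser.loadURI("view-source:" + url);
}
```

The browser object is passed in as a parameter so the function does not depend on any particular element id in your overlay.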
Serialize a DOM Document
This method will not get the original source, but may be useful to some readers.
You can serialize the document object to a string. See Serializing DOM trees to strings in the MDC. You may need to use the alternate method of instantiation in your extension.
That article talks about XML documents, but it also works on any HTML DOMDocument.
var serializer = new XMLSerializer();
var source = serializer.serializeToString(document);
This even works in a web page or the Firebug console.
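For the "alternate method of instantiation" mentioned above, one possible sketch is to fall back to the XPCOM component when the XMLSerializer constructor is not defined in the extension's scope. createSerializer and serializeDocument are hypothetical helper names; the contract ID and interface are the ones I remember from the MDC article, so verify them there.

```javascript
// Hypothetical helper: use the global XMLSerializer constructor when it
// exists, otherwise instantiate the serializer component through XPCOM.
function createSerializer() {
  if (typeof XMLSerializer !== "undefined") {
    return new XMLSerializer();
  }
  // Assumption: contract ID and interface as given in the MDC article.
  return Components.classes["@mozilla.org/xmlextras/xmlserializer;1"]
    .createInstance(Components.interfaces.nsIDOMSerializer);
}

// Serialize a DOM document to a string. This reflects the parsed DOM,
// not the original source.
function serializeDocument(doc) {
  return createSerializer().serializeToString(doc);
}
```

In an extension you would call serializeDocument() with the content document of the current tab rather than the chrome document.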
Use the first part of Sagi's answer, but read document.getElementById('viewsource').textContent
instead of innerHTML, so that no tag stripping or entity decoding is needed.
This is more in line with Lachlan's answer: there is a discussion of the internals here that goes into considerable depth, including the C++ code.
http://www.mail-archive.com/[email protected]/msg05391.html
and then follow the replies at the bottom.