<html>
 <head>
   <script>
     remove_line(11);
     // This should remove the line 11 from source code, 
     // eliminating the unordered list element.
   </script>
 </head>
 <body>
   <p>Some text</p>
   <ul><li>Some list item</li></ul>
   <a>Some link</a>
 </body>
</html>

I am totally aware that this code hurts your eyes. However, for very atypical technical reasons (performance mostly), this would be the most effective way to solve my problem. It is most likely not doable but it would really save me a lot of coding and performance issues.

If it is not doable (as expected), what is the most consistent and effective notation or technique for uniquely identifying and accessing every element of the DOM (including text that is mixed inline with elements)?

I was thinking of the following notation: tag_name[index], where index would represent the Xth element with that tag name by order of appearance in the source code. However, I'm not sure it's efficient, and I'm not sure how to implement it in JavaScript. I've also thought of XPath, but I'm not sure it's well supported in JavaScript (apart from some frameworks "simulating" it).

UPDATE: My original post wasn't very clear, so I'll clarify some points:

  • The ultimate goal of the library I'm working on is to "minimize page rendering", so doing it server side and sending it back to the user isn't an option. I'll post an update once I have a functional implementation of what I'm trying to do. Otherwise it'd be too long to explain here.

  • To clarify what I really need to do, let me give the following (fictional) example:

    • Script X is a server side script that randomly selects a DOM node from document.html.
    • Script X needs to tell script Y (a JavaScript script located in document.html) to delete the DOM node it has selected.
    • How does script X uniquely identify the DOM node it has selected so it can communicate it to script Y?

I'm really interested in how to uniquely identify the DOM node so that script Y can identify it and manipulate it. Preferably, it should work with text nodes as well.

+2  A: 

No, JavaScript has no such access to the source code. Why don't you do this on the server instead?

AnthonyWJones
The goal of the library I'm working on is to "minimize page rendering", so doing it server side and sending it back to the user isn't an option. I'll keep this question updated once I have a functional implementation of what I'm trying to do. Otherwise it'd be too long to explain here.
Olivier Lalonde
+1  A: 

By element_name do you mean the name of the tag, or the value of the name attribute? In the case of the former, you can do:

var nodeToRemove = document.getElementsByTagName('span')[7];

If you mean the value of the name attribute, you could use a library such as jQuery:

$('*[name=myName]:eq(7)');

Or, if using plain JavaScript, you'd have to iterate over the DOM manually (recursively).
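
A minimal sketch of that manual traversal, assuming the goal is the n-th element (in document order) whose name attribute matches. The function name findByNameAttribute is hypothetical and not part of any library:

```javascript
// Hypothetical helper: collects, in document order, every element at or
// below `root` whose name attribute equals `name`.
function findByNameAttribute(root, name) {
  var matches = [];

  function walk(node) {
    // nodeType 1 is an element; text and comment nodes are skipped.
    if (node.nodeType === 1 && node.getAttribute('name') === name) {
      matches.push(node);
    }
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
      walk(children[i]);
    }
  }

  walk(root);
  return matches;
}

// In a browser, the eighth match (mirroring :eq(7) above) would be:
// var nodeToRemove = findByNameAttribute(document.documentElement, 'myName')[7];
```

Where it is available, document.getElementsByName('myName')[7] should give the same element without a hand-written walk.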

David Hedlund
I meant "name of the tag". Thanks for pointing out the actual JS function for implementing my proposed notation.
Olivier Lalonde
A: 

It's possible in some limited circumstances. A clever workaround: if the page is static, you could make an XMLHttpRequest for the same page and then use responseText(), which would give you the exact source representation. (Note: this might produce browser-specific results.)

However, the DOM itself is an object graph, and retains no knowledge of the structure of the source code. Given a DOM, there are infinitely many raw sources that could have produced it.
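
A rough sketch of that workaround, assuming the page is static and is requested from its own origin. fetchPageSource and getSourceLine are hypothetical names; only XMLHttpRequest and its responseText property are real browser APIs here:

```javascript
// Hypothetical helper: re-requests `url` and hands the raw, unparsed
// source text to `callback`. Browser-only.
function fetchPageSource(url, callback) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      callback(xhr.responseText); // a property, not a method
    }
  };
  xhr.send(null);
}

// Hypothetical helper: returns line `lineNumber` (1-based) of `source`,
// or null when the source has no such line.
function getSourceLine(source, lineNumber) {
  var lines = source.split(/\r\n|\r|\n/);
  if (lineNumber < 1 || lineNumber > lines.length) {
    return null;
  }
  return lines[lineNumber - 1];
}

// In a browser:
// fetchPageSource(window.location.href, function (source) {
//   var line = getSourceLine(source, 11); // the <ul> line in the question
// });
```

This only recovers the text of a source line. Mapping that text back to a live DOM node remains a separate problem, for the reason given above.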

John Feminella
getResponseText() will not give you the exact source representation as required, when e.g. trying to count lines. At least in IE, getResponseText() returns the document as it is _after_ the browser has tried to repair incorrect HTML markup and hence made potentially greater changes to the source code.
jarnbjo
jarnbjo: reference? Are you not thinking of `innerHTML`, which behaves in that way? `XMLHttpRequest.responseText` (not `getResponseText()`! that's Java) doesn't generally process the returned content at all (as it may not even be HTML).
bobince
@bobince » Whoops! Fixed.
John Feminella
Thanks for the suggestion but that's not really what I'm trying to do - see update.
Olivier Lalonde
*@John Feminella* not quite fixed :) `responseText` is not a function, it's a property and is used like `var myVar = xhr.responseText;`.
Andy E
A: 

The browser parses the code into a DOM tree. You can iterate through a collection such as document.body.childNodes and query each node's .nodeType (1 for elements, 3 for text nodes), or query the nodeName values and collect all elements of a given type.

The ul would be the second element child of body, so document.body.getElementsByTagName('*')[1] would reference it. I'm not sure you can do it by line number unless you parse document.documentElement.innerHTML, break it up by newlines, and grab the 11th line, and there's a chance the browser reformats this.
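
One way to turn that traversal into an identifier that also covers text nodes is a path of child indexes from a known root. This is only a sketch: getNodePath and resolveNodePath are hypothetical names, and it assumes both sides see the same tree, which may not hold if whatever produces the path counts whitespace text nodes differently from the browser's parser.

```javascript
// Hypothetical helper: returns the child indexes leading from `root` down
// to `node` (e.g. [1, 0] is the first child of root's second child),
// or null if `node` is not inside `root`.
function getNodePath(node, root) {
  var path = [];
  while (node !== root) {
    var parent = node.parentNode;
    if (!parent) {
      return null;
    }
    var siblings = parent.childNodes;
    var index = -1;
    for (var i = 0; i < siblings.length; i++) {
      if (siblings[i] === node) {
        index = i;
        break;
      }
    }
    if (index === -1) {
      return null;
    }
    path.unshift(index);
    node = parent;
  }
  return path;
}

// Hypothetical helper: follows `path` down from `root` and returns the
// node it points at, or null if the path does not exist in this tree.
function resolveNodePath(root, path) {
  var node = root;
  for (var i = 0; i < path.length; i++) {
    node = node.childNodes && node.childNodes[path[i]];
    if (!node) {
      return null;
    }
  }
  return node;
}

// In a browser:
// var path = getNodePath(someTextNode, document.body);
// var target = resolveNodePath(document.body, path);
// target.parentNode.removeChild(target);
```

Because childNodes includes text nodes, the same path notation addresses elements and inline text alike, and the path itself is a plain array that can be serialized and sent between scripts.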

Then again, it would help if you told us more specifically what you're trying to do.

Edit: If you're doing this server-side, use a DOM library and remove the element there (for example with removeChild).

meder
A: 

If you access document.body.innerHTML you may get the source code of the body, but then again it may be "normalized" in some browsers (e.g. extra newlines and spaces removed).

I believe, though, that what you want is to keep things semantic and assign IDs to the elements you know you may remove later. For example:

<html>
 <head>
   <script>
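     // A script in <head> runs before <body> is parsed, so wait for the
     // DOM to be ready before looking the element up.
     document.addEventListener('DOMContentLoaded', function () {
       var myItem = document.getElementById('list-1-item-1');
       // Removes the list item by ID, whatever source line it sits on.
       myItem.parentNode.removeChild(myItem);
     });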
   </script>
 </head>
 <body>
   <p>Some text</p>
   <ul><li id="list-1-item-1">Some list item</li></ul>
   <a>Some link</a>
 </body>
</html>

Alternatively, you can avoid using parentNode altogether by setting an ID on the parent and looking it up as well.
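
A small sketch of that variant, assuming the ul is given the hypothetical ID list-1 (the markup above has no ID on it). removeChildById is an illustrative name, and the document object is passed in as doc rather than assumed to be global:

```javascript
// Hypothetical helper: looks up parent and child by ID and removes the
// child. Returns true on success, false if either lookup fails or the
// child is not a direct child of that parent.
function removeChildById(doc, parentId, childId) {
  var parent = doc.getElementById(parentId);
  var child = doc.getElementById(childId);
  if (!parent || !child || child.parentNode !== parent) {
    return false;
  }
  parent.removeChild(child);
  return true;
}

// In a browser, once the DOM is ready:
// removeChildById(document, 'list-1', 'list-1-item-1');
```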

I insist on the semantic approach because your code may get reformatted at any point by the browser, a proxy, and so on.

Sorin Mocanu