To track non-HTML documents via Google Analytics, I need an algorithm that determines whether a link points to the same domain as the current page. It should:

  • not hard-code the domain
  • ignore the protocol (i.e. http/https)
  • not worry about the presence/absence of "www" (any absolute links WILL prefix with "www" and all pages WILL be served via "www")

This is complicated by the fact that I need to access it via a function called from the IE-only 'attachEvent'.

UPDATE Sorry, I've worded this question really badly. The real problem is getting this to work via an event, since IE has its own made-up world of event handling. Take the following:

function add_event(obj) {
    if (obj.addEventListener)
        obj.addEventListener('click', track_file, true);
    else if (obj.attachEvent)
        obj.attachEvent("on" + 'click', track_file);
}

function track_file(obj) { }

It seems as if the "obj" in track_file is not the same across browsers - how can I refer to what was clicked in IE?

+3  A: 

I would like to point out that, if you're on so.com, the following links are URLs within the same domain:

(It may seem odd, but the last two are valid: if you're on http://so.com, the last one would take you to http://so.com/mail.google.com/index.php?var=value, which is perfectly valid.)

This doesn't really answer the question but I hope it will guide the rest of the answers. If there's anything else weird enough, feel free to add it.

Tom
Also, href="javascript:..." is same-domain, unless it changes document.location inside it, but then it's up to you what to do about it.
vsync
A: 

This sounds like a comedy answer, but in all seriousness you could also do something like:

$('a.external')

Certainly the regex comparison to your window.location is the programmatic answer.

annakata
A: 

Maybe this will help: http://www.quirksmode.org/js/events_properties.html#target

Tom
+2  A: 

The method of attachment is not the only way IE and W3 event listeners differ. For IE you must read window.event.srcElement; in W3 it's event.target where event is the parameter passed to the callback function.
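As a rough sketch of that difference (not code from this answer), a small helper can hide it from the rest of the script. The name getEventTarget is hypothetical, and the typeof guard is only there so the function can also be exercised outside a browser:

```javascript
// Hypothetical helper: returns the element that was clicked,
// whether the handler was bound with addEventListener or attachEvent.
function getEventTarget(evt) {
    // If no event object was passed, fall back to IE's global window.event.
    evt = evt || (typeof window !== 'undefined' ? window.event : undefined);
    if (!evt) {
        return null;
    }
    // W3C browsers expose target; IE exposes srcElement.
    return evt.target || evt.srcElement || null;
}
```

Inside track_file this would be used as `var link = getEventTarget(evt);`.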

If you don't need multiple event handlers on links, old-school DOM 0 event handlers are probably an easier way for you to approach this, allowing you to just use 'this' to get the object on any browser.

function bindtolinks() {
    for (var i= document.links.length; i-->0;)
        document.links[i].onclick= clicklink;
}

function clicklink() {
    if (this.host==window.location.host) {
        dosomething();
        return true; // I'm an internal link. Follow me.
    } else {
        dosomethingelse();
        return false; // I'm an external link. Don't follow, only do something else.
    }
}
bobince
Would be great if I could get away with the old-school approach, but this has to be a very generic script that just applies to all links on the page, some of which may well have event handlers already present.
Bobby Jack
If you know you're running after other event handlers, you could of course save (link.orig_onclick= link.onclick) and re-call the existing event handler.
bobince
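The save-and-re-call idea from the comment above might look something like the following. This is only a sketch: chainOnclick is a hypothetical name, and it assumes the existing handlers were assigned through the DOM 0 onclick property (handlers added with addEventListener or attachEvent are not visible there and are unaffected).

```javascript
// Hypothetical helper: installs handler on link.onclick while still
// running whatever DOM 0 handler was there before.
function chainOnclick(link, handler) {
    var orig_onclick = link.onclick; // may be null or undefined
    link.onclick = function (evt) {
        // Run the pre-existing handler first, with the same 'this' and event.
        var origResult = (typeof orig_onclick === 'function')
            ? orig_onclick.call(this, evt)
            : undefined;
        var newResult = handler.call(this, evt);
        // Cancel the click if either handler explicitly returned false.
        return origResult !== false && newResult !== false;
    };
}
```

In bindtolinks this would replace the direct assignment, e.g. `chainOnclick(document.links[i], clicklink);`.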
A: 

I will answer the question in the update, about events in IE:

function track_file(evt)
{
  if (evt == undefined)
  {
    evt = window.event; // For IE
  }
  // Use evt
}

is the classical way to get a consistent event object across browsers.

After that, I would use regexes to normalize the URL, but I am not sure what you are looking for.

[EDIT] Some real code to put in practice what I wrote above... :-)

function CheckTarget(evt)
{
  if (evt == undefined)
  {
    // For IE
    evt = window.event;
//~     evt.returnValue = false;
    var target = evt.srcElement;
    var console = { log: alert };
  }
  else
  {
    target = evt.target;
//~     evt.preventDefault();
  }
  alert(target.hostname + " vs. " + window.location.hostname);
  var re = /^https?:\/\/[\w.-]*?([\w-]+\.[a-z]+)\/.*$/;
  var strippedURL = window.location.href.match(re);
  if (strippedURL == null)
  {
    // Oops! (?)
    alert("Where are we?");
    return false;
  }
  alert(window.location.href + " => " + strippedURL);
  var strippedTarget = target.href.match(re);
  if (strippedTarget == null)
  {
    // Oops! (?)
    alert("What is it?");
    return false;
  }
  alert(target + " => " + strippedTarget);
  if (strippedURL[1] == strippedTarget[1])
  {
//~     window.location.href = target.href;  // Go there
    return true; // Accept the jump
  }
  return false;
}

That's test code, not production code, obviously!

The lines with //~ comments show the alternative way of preventing the click from following the link. It is, somehow, more effective: if I use Firebug's console.log, curiously the return false is ineffective.
Here I used the "follow the link or not" behavior, not knowing the real final purpose.

As pointed out in comments, the RE can be simpler by using hostname instead of href... I leave it as is because it was already coded and might be useful in other cases.
Some special precautions should be taken in both cases to handle special cases, like localhost, IP addresses, ports...
I got rid of the domain name, before re-reading the question and seeing it wasn't a problem... Well, perhaps it can be useful to somebody else.

Note: I showed a similar solution in a question about decorating links: Editing all external links with javascript

PhiLho
Actually, evt.srcElement.hostname (IE) and this.hostname (others) work for me
Bobby Jack
A: 
if( someDomElementWhichIsALink.href.indexOf(window.location.host) != -1 ) {
  // this is targeting your domain
}
Thomas Hansen
A: 

Given a click event and the original target element, this should work for the original question:

if(target.protocol == window.location.protocol && target.host == window.location.host){
}

Browsers nicely convert the link from the various patterns mentioned by @Tom into full links, so the protocol and host values simply need to match your domain.

jvenema
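Pulling the requirements from the question together (no hard-coded domain, protocol ignored, "www" not significant), a comparison along the lines of the answers above could be sketched as follows. The function names normalizeHost and isSameDomain are hypothetical, and the sketch assumes the link element exposes a resolved hostname property the way window.location does:

```javascript
// Hypothetical helper: lowercases a hostname and strips a leading "www.".
function normalizeHost(hostname) {
    return String(hostname).toLowerCase().replace(/^www\./, '');
}

// True if the link's hostname matches the page's hostname.
// Protocol and port are ignored because only hostname is compared.
function isSameDomain(link, loc) {
    return normalizeHost(link.hostname) === normalizeHost(loc.hostname);
}
```

In an event handler this would be called as `isSameDomain(target, window.location)`. Other subdomains (for example mail.so.com versus so.com) are treated as different domains here; whether that is desirable depends on the site.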