I'm trying to build a bookmarklet that will get the current page/article's author and date information, for referencing purposes. I know that I can get the page title and URL with document.title and document.URL, but I'm drawing a blank when it comes to the other information. Any ideas?

A: 

Does the HTML have a predefined format? If so, you could parse the HTML or query the DOM to get the other information you need.

NM
No, I want this to work on as many sites as possible.
Chris Armstrong
+2  A: 

If the site puts such information in a META tag you can do this:

var author = "";
var info = document.getElementsByTagName('meta');
for (var i = 0; i < info.length; i++) {
  // Not every meta tag has a name attribute, so guard against null
  var name = info[i].getAttribute('name');
  if (name && name.toLowerCase() == 'author') {
    author = info[i].getAttribute('content');
  }
}

For the site you mention in your comment, you need site-specific, non-standard processing:

  var author = "";
  var other = document.getElementsByTagName('li');
  for (var i = 0; i < other.length; i++) {
    if (other[i].className.toLowerCase() == 'author') {
      // Take the text of the first link inside the author list item, if any
      var link = other[i].getElementsByTagName('a')[0];
      if (link) author = link.innerHTML;
    }
  }
  alert(author);
mplungjan
For static files, you may also find document.lastModified useful if there is no date meta tag.
mplungjan
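The date lookup mentioned above could be sketched roughly as follows. This is only an illustration: the function name findDate and the list of meta names it checks are assumptions, not something every site follows, and document.lastModified is only a last resort (on dynamically generated pages it often just reflects the time the page was served).

```javascript
// Sketch: look for a date-like meta tag, then fall back to lastModified.
// The names in `wanted` are assumptions about what sites commonly use.
function findDate(doc) {
  var wanted = ['date', 'dc.date', 'article:published_time'];
  var metas = doc.getElementsByTagName('meta');
  for (var i = 0; i < metas.length; i++) {
    // Some sites use name="...", others property="..." (Open Graph style).
    var key = metas[i].getAttribute('name') ||
              metas[i].getAttribute('property') || '';
    if (wanted.indexOf(key.toLowerCase()) !== -1) {
      var content = metas[i].getAttribute('content');
      if (content) return content;
    }
  }
  // Last resort: may be unreliable for dynamically generated pages.
  return doc.lastModified || '';
}
```

In a bookmarklet this would be called as findDate(document). Passing the document in as a parameter is just a design choice that makes the function easy to try against a stub.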
PPS: Here is more information. Note the part about meta sometimes being replaced by link rel: http://www.w3.org/TR/html401/struct/global.html#h-7.4.4.2
mplungjan
Thanks, that looks like it makes sense, but I haven't been able to get it working yet: it returns blank when I test it on a Smashing Magazine post. Is this the kind of thing where every site is going to have a different way of putting this info in?
Chris Armstrong
Please see my edit for the answer to this part of your question
mplungjan
Thanks. Do you think I'm going to need several loops to catch the various methods of labelling 'author' info, or would there be a single method that would catch the majority of cases? I notice even the New York Times doesn't seem to use the author meta, but has an element with a class of author. Would I be better off looking for any element with a class of author?
Chris Armstrong
Hmm, I would say the website SHOULD use the standards, which would be the meta tag or perhaps some other method like http://wiki.foaf-project.org/w/Autodiscovery. You might need a small database telling you what to look for on given sites.
mplungjan
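One way to combine these ideas is to try several strategies in order, with a small per-site table consulted first. The sketch below is hypothetical: the names authorFromMeta, authorFromClass, siteRules and findAuthor are made up for illustration, and the news.example.com entry is a placeholder, not a rule verified against any real site.

```javascript
// Strategy 1: <meta name="author" content="...">
function authorFromMeta(doc) {
  var metas = doc.getElementsByTagName('meta');
  for (var i = 0; i < metas.length; i++) {
    var name = metas[i].getAttribute('name') || '';
    if (name.toLowerCase() === 'author') {
      return metas[i].getAttribute('content') || '';
    }
  }
  return '';
}

// Strategy 2: first element whose class list contains "author".
function authorFromClass(doc) {
  var all = doc.getElementsByTagName('*');
  for (var i = 0; i < all.length; i++) {
    var classes = String(all[i].className || '').toLowerCase().split(/\s+/);
    if (classes.indexOf('author') !== -1) {
      var text = all[i].textContent || all[i].innerText || '';
      return text.replace(/^\s+|\s+$/g, ''); // trim surrounding whitespace
    }
  }
  return '';
}

// Hypothetical per-site overrides, keyed by hostname (placeholder entry).
var siteRules = {
  'news.example.com': function (doc) { return authorFromClass(doc); }
};

// Try the site-specific rule (if any) first, then the generic strategies.
function findAuthor(doc, hostname) {
  var strategies = [authorFromMeta, authorFromClass];
  if (Object.prototype.hasOwnProperty.call(siteRules, hostname)) {
    strategies.unshift(siteRules[hostname]);
  }
  for (var i = 0; i < strategies.length; i++) {
    var result = strategies[i](doc);
    if (result) return result;
  }
  return '';
}
```

From a bookmarklet this might be called as findAuthor(document, location.hostname). The class-based strategy is only a guess at a common convention and will pick up unrelated elements on some pages, which is why the per-site table gets the first chance.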
Thanks for flagging as answer. Good luck
mplungjan