To put e-mail addresses on my sites, I use this JavaScript:

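function showEmailLink(user, domain, linkText) {
 // Fall back to the address itself when no link text is given.
 if (!linkText) {
  linkText = user + "@" + domain;
 }
 // "mail" + "to:" keeps the literal scheme string out of the source.
 // The href value is quoted so the attribute stays well-formed.
 document.write('<a href="' + "mail" + "to:" + user + "@" + domain
   + '">' + linkText + "<\/a>");
}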

so that in my HTML I can write this:

please send me an 
<script type="text/javascript">
  <!--
  showEmailLink("edward","tanguay.info","e-mail");
  //-->
</script>

This protects my site from spammers who collect e-mail addresses by screen-scraping the source code, since my e-mail address is nowhere in the text.

However, I can't imagine that a motivated spammer couldn't write a screen scraper that mechanically determines the e-mail address from this JavaScript and HTML code.

How safe is this method of JavaScript e-mail obfuscation really?

+9  A: 

It's not really a question of "safety" - anything which a regular user can see isn't "safe" because any really determined malicious entity can just act like a regular user and actually render/evaluate the page.

It's more a question of deterrence: how much do automated harvesters care? I don't have exact numbers, but my guess is that most harvesters don't bother to fully render or evaluate pages. There are plenty of "softer" targets for them, and fully evaluating a page's scripts takes much longer, which is poorly suited to rapid mass spidering.

If you really want to deter harvesters, probably the best deterrent currently available is something that requires solving a CAPTCHA to retrieve the address, like Mailhide. However, even this can be foiled if the harvester is determined enough (by methods such as knowingly or even unknowingly crowdsourcing CAPTCHA-breaking, et cetera).

Amber
Nice, I didn't know Mailhide. Thanks for the pointer.
André Hoffmann
+1 thanks for the Mailhide link
Edward Tanguay
+1  A: 

If someone wants to target your site specifically, this is 0% safe. If you're just trying to raise the bar against automated scripts, you may be fine. I haven't kept up with the state of the art.
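
To illustrate the "targeted" case, here is a minimal sketch of how little it takes once someone has looked at the page source. It assumes the call looks exactly like the one in the question (double-quoted string literals passed to showEmailLink); the function name extractAddresses is made up for this example.

```javascript
// Sketch of a targeted scraper: find showEmailLink("user","domain",...) calls
// in raw page source and rebuild the address without executing any script.
// Assumes double-quoted literal arguments, as in the question's HTML.
function extractAddresses(source) {
  var pattern = /showEmailLink\(\s*"([^"]*)"\s*,\s*"([^"]*)"/g;
  var found = [];
  var match;
  while ((match = pattern.exec(source)) !== null) {
    // match[1] is the user part, match[2] is the domain part.
    found.push(match[1] + "@" + match[2]);
  }
  return found;
}
```

A generic harvester would not contain this pattern, which is why the technique still raises the bar against untargeted scripts.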

I would like to point out, however, that you shouldn't inject arbitrary strings (such as the username and domain name) into your HTML via document.write(), since that is a security hole. You should be creating an A node and using the getter/setter methods.
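
A rough sketch of the node-based approach, under some assumptions: the function name createEmailLink and the optional doc parameter are invented for this example (doc just makes it possible to run the function outside a browser), and the "contact" id in the usage line is hypothetical.

```javascript
// Sketch: build the link as a DOM node instead of concatenating HTML.
// `doc` defaults to the browser's document; passing it in explicitly is an
// assumption made here so the function can also run outside a browser.
function createEmailLink(user, domain, linkText, doc) {
  doc = doc || document;
  var address = user + "@" + domain;
  var link = doc.createElement("a");
  // Assigning the property never parses the value as markup.
  link.href = "mail" + "to:" + address;
  // A text node is inserted literally, so "<" or ">" in the input stay text.
  link.appendChild(doc.createTextNode(linkText || address));
  return link;
}
```

In a page this might be used as document.getElementById("contact").appendChild(createEmailLink("edward", "tanguay.info", "e-mail")), in place of the inline document.write() call.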

jeffamaphone
What's the security hole? I guess I'm missing something but the only effect I can see is that the user can modify the HTML on their local computer.
nfm
+1  A: 

If you are to do this (which I disagree with on principle, as I believe that all content should be accessible to users without JavaScript), the trick will be to do something unique. If your method is unique, there's not much of a point for the authors of scrapers to code a workaround, eh?

However, some modern scrapers have been known to use the rendered source to scrape for addresses, rendering any JavaScript obfuscation methods worthless.

cpharmston
One advantage of reCAPTCHA (and Mailhide, which is done by them) is that it offers a non-Javascript version as well.
Amber
+1  A: 

It all depends on whether the cost of rendering the page is offset by the value of the email address. As Dav said, professional spammers can employ an army of cheap labor to render such pages, or to decipher CAPTCHAs. In some cases this is quite worthwhile, such as creating new email accounts at trusted domains.

You could increase the cost of rendering the page by performing some computation in showEmailLink().
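
One way this might look, as a sketch only: the address is stored XOR-encoded, and the key can only be obtained by running a deliberately long loop. All names here (deriveKey, xorWithKey, encodeAddress, decodeAddress) and the seed and round count are assumptions for illustration. This is not cryptography, and the loop also costs real visitors time.

```javascript
// Stretch a small seed into a 32-bit key by repeating a cheap mixing step.
// The loop itself is the deliberate cost; `rounds` controls how expensive it is.
function deriveKey(seed, rounds) {
  var k = seed >>> 0;
  for (var i = 0; i < rounds; i++) {
    // 32-bit linear congruential step.
    k = (Math.imul(k, 1664525) + 1013904223) >>> 0;
  }
  return k;
}

// XOR each character code with one byte of the key (cycling through 4 bytes).
// XOR is its own inverse, so the same function encodes and decodes.
function xorWithKey(codes, key) {
  var out = [];
  for (var i = 0; i < codes.length; i++) {
    out.push(codes[i] ^ ((key >>> (8 * (i % 4))) & 0xff));
  }
  return out;
}

// Run once, ahead of time, to produce the array that is embedded in the page.
function encodeAddress(address, seed, rounds) {
  var codes = [];
  for (var i = 0; i < address.length; i++) {
    codes.push(address.charCodeAt(i));
  }
  return xorWithKey(codes, deriveKey(seed, rounds));
}

// Run in the browser: pays the loop cost, then recovers the address.
function decodeAddress(encoded, seed, rounds) {
  return String.fromCharCode.apply(null, xorWithKey(encoded, deriveKey(seed, rounds)));
}
```

showEmailLink() would then take the encoded array instead of the plain user and domain, and call decodeAddress() before writing the link.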

nfm
A: 

Although I don't have any hard evidence, I believe that email harvesters have had the capability to execute JavaScript code for some years now. This is based only on my own experience: I used a function very similar to yours to "protect" email addresses that appeared on one public page and weren't used anywhere else. Sure enough, eventually they started to receive spam.

Fundamentally, anything you can do that doesn't require a human to interpret and type in an email address will eventually be scraped by email harvesters. If your browser can execute JavaScript to decode it, so can they. (They probably use browsers to do it.)

Greg Hewgill
A: 

If you're like me and don't mind using JavaScript, I found this page (http://reliableanswers.com/js/mailme.asp), which basically uses this snippet:

<script type="text/javascript">
function mailMe(sDom, sUser)
{
  return("mail"+"to:"+sUser+"@"+sDom.replace(/%23/g,"."));
}
</script>
<a href="/contact/" title="Contact Me!"
 onmouseover="javascript:this.href=mailMe('example%23com','me');"
 onfocus="javascript:this.href=mailMe('example%23com','me');">Contact
Me!</a>

Quite a good obfuscation.

Rocky Luck
Its adoption rate makes it bad. If you found it, the harvester will too. :)
bzlm
+1  A: 

Matt Cutts mentions in one of his webmaster videos that this technique is no longer "safe" (see http://www.youtube.com/watch?v=Ce6cLrrfS5E). He says that if you put the JavaScript in a location disallowed by robots.txt, you won't have to worry about robots rendering the HTML. However, Google is getting better at parsing JavaScript, and your address may become searchable in clear text if you use this method.

Carter Cole
or you could use a service like http://scr.im/ if you really wanted to
Carter Cole