ansaurus

Question

Programmatically determining if someone owns a website?

Answer 1

+1 A:

Have them put a file with a hard-to-guess name on the server?

Such as http://www.example.com/5gdbadcab234g3.txt
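A minimal sketch of that idea, assuming Python. The function names and the `fetch_status` callable (anything that maps a URL to an HTTP status code) are hypothetical and only there for illustration; a real check would plug in an HTTP client.

```python
import secrets
from urllib.parse import urljoin


def new_challenge_path(nbytes: int = 16) -> str:
    """Return a hard-to-guess file path such as '/3f9c...e1.txt'."""
    return "/" + secrets.token_hex(nbytes) + ".txt"


def challenge_url(site_root: str, path: str) -> str:
    """Build the full URL the site owner is asked to serve."""
    return urljoin(site_root, path)


def file_is_present(site_root: str, path: str, fetch_status) -> bool:
    """fetch_status is an assumed callable: URL -> HTTP status code."""
    return fetch_status(challenge_url(site_root, path)) == 200
```

Some servers answer 200 for any path, so it may be safer to also put the token inside the file and compare the body.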

GWW 2010-07-30 02:18:12

Answer 2

A:

It sounds like you are close. If your goal is to determine whether they have control over the contents of the website, consider sending them a time-limited GUID.

<!-- StampForHadees:76f44668-a7a7-4370-a8bb-4bd1559dbf26-->

If you can scrape their site and they've pasted that GUID in a comment, as you show above, then you can reasonably say that they have control of the site.
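A rough sketch of the time-limited check, assuming Python. The 24-hour lifetime and the function names are assumptions; the `StampForHadees` label is taken from the comment shown above. The page HTML is passed in as a string, so the scraping itself is left out.

```python
import re
import time
import uuid

TTL_SECONDS = 24 * 60 * 60  # assumed lifetime of a stamp


def issue_stamp(now=None):
    """Return (guid, expires_at) for a new verification attempt."""
    now = time.time() if now is None else now
    return str(uuid.uuid4()), now + TTL_SECONDS


def page_has_stamp(html, guid, expires_at, now=None, label="StampForHadees"):
    """True if the stamp has not expired and the comment is in the HTML."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False
    pattern = r"<!--\s*" + re.escape(label) + ":" + re.escape(guid) + r"\s*-->"
    return re.search(pattern, html) is not None
```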

Google uses a similar scheme when you attempt to give them control of a domain for hosted Google Mail.

p.campbell 2010-07-30 02:19:56

@random_downvoter: care to leave a comment on why this solution isn't helpful to the OP?

p.campbell 2010-07-30 03:33:55

Answer 3

+2 A:

Make it part of the requirement that the comment be inside the <head> tag. Typically, even user-generated content wouldn't make its way into the head.

Also, your concerns about the comment hack are probably unnecessary. Any comment system worth its salt knows to escape comments so that they are not rendered as actual HTML markup.

Mike Sherov 2010-07-30 02:21:13

Answer 4

+4 A:

Do what Google does for their Webmaster Tools. Generate a unique key, and have them put it in a meta tag in the head of their front page. It's pretty unlikely that a user who does not own the site will be able to change the contents within the <head></head> tags. If they can, the site is vulnerable to almost any kind of vandalism, and is hopeless.
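One way this check might look, sketched in Python with the standard-library HTML parser. The meta name `site-verification` and the helper names are made up for illustration; this is not Google's actual tag or implementation.

```python
from html.parser import HTMLParser


class HeadMetaFinder(HTMLParser):
    """Collect content values of <meta name=...> tags found inside <head>."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.in_head = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "body":
            self.in_head = False  # a <body> tag implicitly ends the head
        elif tag == "meta" and self.in_head:
            a = dict(attrs)
            if a.get("name") == self.name and a.get("content") is not None:
                self.values.append(a["content"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def head_has_key(html, key, name="site-verification"):
    """True if the front page's <head> carries the expected key."""
    finder = HeadMetaFinder(name)
    finder.feed(html)
    return key in finder.values
```

A key placed in a meta tag inside the body, for example through a comment form, is ignored by this check.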

Tyler McHenry 2010-07-30 02:21:44

Answer 5

+2 A:

You could keep your original idea but only accept the comment in, say, the <head> section of the website. This way you avoid having them paste the comment into a 'comments' section as you originally suggested.

In fact, I subscribed to a service that did just that: include the special comment in the header section of your page.

gnucom 2010-07-30 02:21:58

Answer 6

+8 A:

Here are the options that Google uses for Domain verification:

Create a CNAME or TXT record in your domain's DNS settings. These methods require accessing DNS settings for your domain at your domain host's website. Which method you can choose (CNAME or TXT record) depends on what's offered in your Google Apps control panel. We're currently rolling out the TXT record method but still ask many customers to create a CNAME record, instead.
Upload an HTML file to your domain's web server. This method requires being able to upload files to your domain's web server. Try doing this if you don't have access to your domain's DNS settings.
Add a tag to your home page. This method is available only for some customers (it's another new method we're rolling out). It requires accessing your domain's web server but not uploading to it. Try doing this if you have write access to files on the server but can't upload new files.

A CNAME/TXT record or an HTML file uploaded to the root of the domain is the most secure option, since either requires full control of the domain. If you want to be a bit more lax, you could use a meta tag in the head node, which would still prevent someone from passing verification just by adding a comment to a page. It all depends on how secure you want to be.
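For the TXT record option, the comparison step could be sketched like this in Python. Python's standard library has no function for querying TXT records, so `lookup_txt` is an assumed callable (domain -> list of TXT strings) that a DNS library would have to provide; the `site-verification=` prefix is also just an illustration.

```python
def domain_has_txt_token(domain, token, lookup_txt, prefix="site-verification="):
    """True if any TXT record on the domain equals prefix + token.

    lookup_txt is a hypothetical callable supplied by the caller.
    """
    expected = prefix + token
    # Some resolvers return TXT values wrapped in double quotes.
    return any(r.strip().strip('"') == expected for r in lookup_txt(domain))
```

In a test, `lookup_txt` can be a plain lambda returning canned records, which keeps the check independent of any particular resolver.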

Greg Bray 2010-07-30 02:22:18

Thanks. If anyone has figured out the best way to do this, I'd bet Google has. Copying them seems like the best solution.

hadees 2010-07-30 04:06:11

Answer 7

A:

The only true way is to be able to access their file server. Anything transferred through HTTP can be reproduced.

If you don't have access to their server, then the best way would be to have an encrypted string embedded on the page (or in an image or some binary file on that page).

The string should be composed of the URI, author, and timestamp. That way, even if someone copies this string to their own website, you would still be able to determine the original author and page. An added bonus is that you'll be able to detect a theft.

Granted, this is only as good as the algorithm that encrypts the page/author combination; hackers who are good at decrypting could get around it. Additionally, a dishonest author could create his own key for his page, so you'd need to host the encryption yourself so that no one can tinker with the timestamp. Also, this requires that all authors place the code on their pages.
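As a rough illustration in Python, the string could be built on your own server so authors never see the key. Note that this sketch swaps encryption for an HMAC signature, which is a different technique from the one described above: the fields stay readable, but any change to them invalidates the signature. The key and function names are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-private-random-key"  # hypothetical, server-side only


def make_stamp(uri, author, timestamp, key=SECRET_KEY):
    """Return 'uri|author|timestamp|signature'. Assumes no '|' in the fields."""
    payload = f"{uri}|{author}|{timestamp}"
    sig = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"


def verify_stamp(stamp, expected_uri, key=SECRET_KEY):
    """True if the signature is valid and the stamp was issued for this URI."""
    payload, sep, sig = stamp.rpartition("|")
    if not sep:
        return False
    good = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode("utf-8"), good.encode("utf-8")):
        return False
    return payload.split("|", 1)[0] == expected_uri
```

A stamp copied to another site still verifies cryptographically, but its URI field no longer matches the page it was found on, so the second check fails.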

vol7ron 2010-07-30 02:31:22

@random_downvoter: care to leave a comment on why this solution isn't helpful to the OP?

vol7ron 2010-07-30 03:59:02

Answer 8

A:

I know you mentioned that it isn't necessarily domain dependent, but that would help. You could hash the domain (as domains are unique) and send the person that string to put somewhere on their site, either in a .txt file or in the header as others have mentioned.

Then you store all their domains and their hashes in a database, and your scraper checks that the domain it is scraping matches the hashed comment string. If it checks out, then it's fine.
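A small Python sketch of that flow, with a dict standing in for the database. The private salt is an addition not mentioned above: without it, anyone could compute the hash of a public domain name themselves. All names here are illustrative.

```python
import hashlib
from urllib.parse import urlsplit

SERVER_SALT = b"replace-with-a-private-random-value"  # assumed, kept server-side


def domain_token(domain, salt=SERVER_SALT):
    """Salted SHA-256 of the lowercased domain, as a hex string."""
    return hashlib.sha256(salt + domain.lower().encode("utf-8")).hexdigest()


def register(db, domain):
    """Store the token for a domain and return it for the owner to publish."""
    token = domain_token(domain)
    db[domain.lower()] = token
    return token


def scraped_page_verifies(db, page_url, page_text):
    """True if the page's host is registered and its token appears in the text."""
    host = (urlsplit(page_url).hostname or "").lower()
    token = db.get(host)
    return token is not None and token in page_text
```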

James Hulse 2010-07-30 03:09:28

@random_downvoter: care to leave a comment on why this solution isn't helpful to the OP?

James Hulse 2010-07-30 03:49:14
