I'm writing an automated PHP script to check whether my link exists on my partner's website (link exchange). Besides making sure my link appears in the source code, I want to make sure he is not placing it inside an HTML comment like <!-- http://www.mywebsite.com --> and cheating me.

I tried to match it with a regular expression, but failed.

+1  A: 

Writing a regular expression that looks ahead and behind to confirm a link is not inside a comment may be more difficult than removing the comments from the page and then searching for your link.
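A minimal sketch of the strip-then-search idea, assuming the page source is already in a string. The function names stripHtmlComments and linkStatus and the status strings are made up for illustration. Comparing the search before and after stripping also tells you when the link exists only inside comments.

```php
<?php
// Hypothetical helpers for illustration; the names are not from any library.

// Remove every <!-- ... --> block. The /s flag lets "." span newlines,
// and the lazy ".*?" stops at the first closing "-->".
function stripHtmlComments(string $html): string
{
    $stripped = preg_replace('/<!--.*?-->/s', '', $html);
    return $stripped === null ? $html : $stripped; // null means the regex engine failed
}

// Returns "present", "commented_only", or "missing" for $needle in $html.
function linkStatus(string $html, string $needle): string
{
    if (strpos(stripHtmlComments($html), $needle) !== false) {
        return 'present';        // still found once comments are gone
    }
    if (strpos($html, $needle) !== false) {
        return 'commented_only'; // found only before stripping
    }
    return 'missing';
}
```

Note that "present" only means the text appears outside a closed comment. An unterminated <!-- is not matched by this pattern, and the link could still be hidden by other means.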

I will point out that there are other ways to hide your link. You should determine exactly what you want to verify, for example that it must be an <a href="[your_url]">. Just know that there are many crafty ways for the link to be in the code but not be displayed (JS, CSS, etc).

The question is, how crafty do you want to be?

Jason McCreary
Yes, he can make the link invisible with CSS using display:none or visibility:hidden, for example. But I want to solve all the issues that can practically be solved and noticed, and placing a link in a comment is one of them.
Naughty.Coder
Then I would strip the comments with one regular expression, then search for your URL.
Jason McCreary
But how can I know that it was in a comment after stripping the comments? I want to be notified if it IS in a comment.
Naughty.Coder
I thought you only care if it is displayed. That is to say not in a comment. You said you didn't want him to cheat.
Jason McCreary
why does it matter if it's in a comment? If your url is found 5 times or 0 times in comments, it's irrelevant, as long as it is in at least one anchor, right?
Chadwick
@Naughty Coder - then how about you search for the url before and after stripping the comments?
Vilx-
Yeah, I don't want him to cheat, and if he does, I want to be notified: a cron job would perform this check daily and tell me if the link is in a comment.
Naughty.Coder
Ah, you mean I should strip the comments along with their contents and then check whether my link exists. Nice idea.
Naughty.Coder
Yes, that's what I mean. It's easiest. But if you want something more, I like what MooGoo posted.
Jason McCreary
+3  A: 

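Use the DOM and XPath; a query for elements ignores comments:

libxml_use_internal_errors(true); // suppress warnings from malformed real-world HTML
$doc = new DOMDocument();
$doc->loadHTML($htmlstring);

$xpath = new DOMXPath($doc);

// Only real <a> elements match; text inside a comment is not part of the element tree
$result = $xpath->query('//a[contains(@href, "mywebsite.com")]');

if (!$result->length) echo "You've been cheated\n";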

And then, if you still want to know whether your link is being commented out:

if (strpos($htmlstring, 'mywebsite.com') !== false && !$result->length)
   echo "Your partner is hiding your link in a comment, sneaky bastard\n";
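The strpos check above fires whenever the text appears anywhere outside an anchor, which includes plain text or a script as well as comments. If you want to confirm it really sits in a comment, one option is to ask XPath for the comment nodes directly. This is only a sketch: commentsMentioning is a made-up name, and it assumes loadHTML keeps comments as nodes in the tree, which it does for ordinary pages.

```php
<?php
// Hypothetical helper for illustration: returns the text of every HTML
// comment in $html that contains $needle.
function commentsMentioning(string $html, string $needle): array
{
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $found = [];
    // comment() selects comment nodes; the substring filter is done in PHP
    // so the needle never has to be quoted inside the XPath expression.
    foreach ($xpath->query('//comment()') as $comment) {
        if (strpos($comment->nodeValue, $needle) !== false) {
            $found[] = trim($comment->nodeValue);
        }
    }
    return $found;
}
```

A non-empty result together with an empty anchor query would be a fairly strong sign that the link was deliberately commented out.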
MooGoo
It seems like this only checks that the URL exists. Will it notify me if my link is inside a comment?
Naughty.Coder
Updated to account for commented links
MooGoo
I will try this and let you know :) Thank you.
Naughty.Coder
+1  A: 

Sounds like a perfect use for an HTML parser: load the page with DOMDocument->loadHTML() and look for an anchor tag with your link. He could still remove it via JavaScript on the browser side, but that's a different issue.

If it's a cat-and-mouse game of "are you showing a link to my site", using a standard parser is your best bet. There are just too many ways for a regex to fail on HTML.
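As a rough sketch of "look for an anchor tag with your link", the check can compare the host of each href rather than doing a substring match, so that something like notmywebsite.com or a URL that merely carries your domain in its query string does not count. The function name hasAnchorToHost is hypothetical, and accepting a "www." prefix is an assumption about how the link might be written.

```php
<?php
// Hypothetical helper for illustration: true if some <a href> points at
// exactly $host or at www.$host. Commented-out anchors are never seen,
// because they are not elements in the parsed tree.
function hasAnchorToHost(string $html, string $host): bool
{
    $previous = libxml_use_internal_errors(true); // tolerate malformed HTML
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $linkHost = parse_url($anchor->getAttribute('href'), PHP_URL_HOST);
        if (!is_string($linkHost)) {
            continue; // relative, empty, or malformed href
        }
        $linkHost = strtolower($linkHost);
        if ($linkHost === $host || $linkHost === 'www.' . $host) {
            return true;
        }
    }
    return false;
}
```

Called as hasAnchorToHost($htmlstring, 'mywebsite.com'), it still says nothing about links hidden with CSS or removed by JavaScript.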

Chadwick