Hello,

I would like to rewrite the link location (the href of each anchor tag) in a page as follows.

Sample Input:

text text text <a href='http://test1.com/'> click </a> text text
other text <a class='links' href="gallery.html" title='Look at the gallery'> Gallery</a>
more text

Sample Output:

text text text <a href='http://example.com/p.php?q=http://test1.com/'> click </a> text text
other text <a class='links' href="http://example.com/p.php?q=gallery.html" title='Look at the gallery'> Gallery</a>
more text

I hope I have made that clear. I am trying to do this with PHP and regular expressions. Could you please point me in the right direction?

Thank you, Sadi

+7  A: 

Don't use regular expressions for parsing HTML.

Do use PHP's built-in XML parsing engine (DOMDocument, which is backed by libxml). It works quite well on your question page (and answers the question to boot):

<?php
  libxml_use_internal_errors(true);  // suppress warnings about malformed HTML
  $xml = new DOMDocument();
  $xml->loadHTMLFile("http://stackoverflow.com/questions/3099187/replace-links-location-href");
  // prefix the href of every anchor tag
  foreach ($xml->getElementsByTagName('a') as $link) {
    $link->setAttribute('href', "http://www.google.com/?q=" . $link->getAttribute('href'));
  }
  echo $xml->saveHTML();  // output to browser, save to file, etc.
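
The same idea applied to the sample input from the question, as a minimal sketch. The function name `rewriteLinks` and its `$prefix` parameter are hypothetical, introduced only for illustration; the prefix is the one shown in the question's sample output.

```php
<?php
// Hypothetical helper (not from the thread): prefix every <a href> in an HTML string.
function rewriteLinks(string $html, string $prefix): string
{
    // Buffer libxml warnings instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $link) {
        // Skip anchors that have no href (named anchors, for example).
        if ($link->hasAttribute('href')) {
            $link->setAttribute('href', $prefix . $link->getAttribute('href'));
        }
    }

    // saveHTML() serializes the whole document, including the html/body wrapper
    // that loadHTML() adds around a fragment.
    return $doc->saveHTML();
}

$input = "text text text <a href='http://test1.com/'> click </a> text text\n"
       . "other text <a class='links' href=\"gallery.html\" title='Look at the gallery'> Gallery</a>\n"
       . "more text";

echo rewriteLinks($input, 'http://example.com/p.php?q=');
```

If p.php reads the target from the query string, the original href may need to go through urlencode() first. The sample output in the question shows it unencoded, so the sketch leaves it as is.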
pygorex1
Thanks. Let me test it. One more thing: the HTML could be malformed :( So I might need to call libxml_use_internal_errors(false); instead, right?
Sadi
The libxml engine will actually fix invalidly nested tags. `libxml_use_internal_errors(true)` simply prevents the PHP script from polluting page output with warnings about bad HTML; with `false`, those warnings are emitted as usual.
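
A small sketch of that behaviour, assuming a deliberately mis-nested snippet as input (the markup is made up for illustration). Whatever libxml reports is buffered and can be read back with libxml_get_errors() rather than printed as PHP warnings; how many errors get reported may vary with the libxml version.

```php
<?php
// Buffer parser errors instead of emitting PHP warnings; remember the old setting.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML('<p><b>bold <i>both</b> italic</i></p>'); // mis-nested <b> and <i>

$errors = libxml_get_errors(); // array of LibXMLError objects (possibly empty)
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Show what the parser complained about, if anything.
foreach ($errors as $error) {
    echo trim($error->message), ' (line ', $error->line, ")\n";
}

// The serialized tree is always properly nested, whatever the input looked like.
echo $doc->saveHTML();
```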
pygorex1
Thanks +1 for the explanation :)
Sadi
A: 

Try using str_replace():

   $string = 'your text';
   $newstring = str_replace('href="', 'href="http://example.com/p.php?q=', $string);
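
A quick check of this against the sample input from the question, as a sketch. The single call only matches double-quoted attributes, so the first link (which uses single quotes) is left untouched. The array form further down is an assumed workaround, not part of the original suggestion, and it still misses unquoted values, uppercase HREF, or whitespace around the equals sign.

```php
<?php
// Sample input copied from the question.
$string = "text text text <a href='http://test1.com/'> click </a> text text\n"
        . "other text <a class='links' href=\"gallery.html\" title='Look at the gallery'> Gallery</a>\n"
        . "more text";

// The suggested call: only the double-quoted href (gallery.html) is rewritten.
$newstring = str_replace('href="', 'href="http://example.com/p.php?q=', $string);
echo $newstring, "\n\n";

// Assumed workaround: str_replace() accepts arrays, so both quote styles can be covered.
$both = str_replace(
    ['href="', "href='"],
    ['href="http://example.com/p.php?q=', "href='http://example.com/p.php?q="],
    $string
);
echo $both, "\n";
```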
Alexander.Plutov
:O!!!! Why would that work? :-/
Sadi
You can check this.
Alexander.Plutov