How would I go about taking input HTML and changing any src or href links that point to a local address (e.g. href="index.html") to their full, specified location (e.g. href="http://www.somesite.com/index.html")? This is for a site that fetches a file from another site and displays it (kind of like a proxy).
Take a look at the <base> tag. It lets you define the URL that all relative links are resolved against.
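As a rough sketch of the <base> approach in PHP: the function below inserts a <base> element as the first child of <head> in the fetched markup, so the browser resolves relative links itself. The name add_base_tag and its parameters are made up for illustration, and the sketch assumes non-empty HTML input and that any <base> tag already in the page can be ignored.

```php
<?php
// Hypothetical helper (not from the answer): add <base href="..."> to a page.
function add_base_tag(string $html, string $base): string
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings about malformed markup

    $head = $doc->getElementsByTagName('head')->item(0);
    if ($head === null) {
        // The page has no <head>: create one as the first child of <html>.
        $head = $doc->createElement('head');
        $root = $doc->documentElement;
        $root->insertBefore($head, $root->firstChild);
    }

    // Put <base> first so it applies to everything that follows it.
    $baseEl = $doc->createElement('base');
    $baseEl->setAttribute('href', $base);
    $head->insertBefore($baseEl, $head->firstChild);

    return $doc->saveHTML();
}
```

Calling add_base_tag($html, 'http://www.somesite.com/') leaves the individual href and src values untouched; only the added <base> element changes how they resolve.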
If you are doing this for arbitrary HTML pages that are not necessarily strict, regexps will be a huge headache for you, because you'll have to handle attributes quoted in different ways, like:
href="some_url"
href='some_url'
href=some_url
My advice is to use DOM functions for this task. You could do something along these lines (untested):
$doc = new DOMDocument();
@$doc->loadHTMLFile($url); // suppress warnings about HTML errors
$xpath = new DOMXPath($doc);
$hrefs = $xpath->query("//*[@href]/@href"); // select the href attribute of all elements that have one
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $href->nodeValue = make_new_url($href->nodeValue); // this is where the magic happens
}
// now do the same for src attributes
Again, this code might need some tweaking, especially the XPath query, which I am not very sure about.
Using the DOM extension might seem overly complex for the task at hand, but it will spare you a lot of headaches and time, on this task and future ones, too.
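The answer leaves make_new_url undefined, so here is one possible sketch of it, together with a loop that covers both href and src. Everything here is an assumption for illustration: the second $base parameter, the rewrite_links wrapper, and the simplified resolution rules, which do not cover every case in RFC 3986. It uses setAttribute rather than assigning nodeValue so that values containing & are escaped by the DOM extension.

```php
<?php
// Hypothetical helper (not from the answer): resolve $url against $base,
// where $base is an absolute http(s) URL such as "http://www.somesite.com/".
function make_new_url(string $url, string $base): string
{
    // Leave empty values, fragments and absolute URLs (scheme:...) untouched.
    if ($url === '' || $url[0] === '#' || preg_match('~^[a-z][a-z0-9+.-]*:~i', $url)) {
        return $url;
    }
    $parts  = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';
    $origin = $scheme . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');
    $path   = $parts['path'] ?? '/';

    if (strpos($url, '//') === 0) {   // protocol-relative: //cdn.example/x.js
        return $scheme . ':' . $url;
    }
    if ($url[0] === '/') {            // root-relative: /img/logo.png
        return $origin . $url;
    }
    if ($url[0] === '?') {            // query only: keep the base path
        return $origin . $path . $url;
    }

    // Path-relative: resolve against the directory part of the base path.
    $dir    = substr($path, 0, strrpos($path, '/') + 1);
    $cut    = strcspn($url, '?#');    // where the query/fragment starts
    $suffix = substr($url, $cut);
    $full   = $dir . substr($url, 0, $cut);

    $segments = [];
    foreach (explode('/', $full) as $seg) {
        if ($seg === '..') {
            array_pop($segments);     // go up one directory
        } elseif ($seg !== '.' && $seg !== '') {
            $segments[] = $seg;
        }
    }
    // Keep a trailing slash when the reference names a directory.
    $endsDir = substr($full, -1) === '/' || preg_match('~/\.\.?$~', $full);
    $resolved = '/' . implode('/', $segments) . ($endsDir && $segments ? '/' : '');

    return $origin . $resolved . $suffix;
}

// Hypothetical wrapper: rewrite every href and src attribute in $html.
function rewrite_links(string $html, string $base): string
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings about malformed markup
    $xpath = new DOMXPath($doc);

    foreach (['href', 'src'] as $name) {
        foreach ($xpath->query("//*[@$name]") as $el) {
            $el->setAttribute($name, make_new_url($el->getAttribute($name), $base));
        }
    }
    return $doc->saveHTML();
}
```

Because the parser normalises the markup, all three quoting styles from the list above end up as ordinary attribute values before make_new_url ever sees them.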
**You don't need any regular expressions for this problem**; you can build the full URL from $_SERVER['HTTP_HOST']:
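$cur_dir = dirname($_SERVER['PHP_SELF']); // directory of the current script, e.g. "/your/images"
$host = $_SERVER['HTTP_HOST'];
$filename = "index.html"; // example value: the relative link to expand
echo "http://" . $host . $cur_dir . "/" . $filename;
For a script located in /your/images/ on www.yourdomain.blabla, this will print http://www.yourdomain.blabla/your/images/index.html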