I am admittedly a PHP newbie, so I need some help.

I am creating a self-designed affiliate program for my site, and affiliates have the option to add a SubID to their link for tracking. Since I have no control over what is entered, I have been testing different scenarios and found a bug when a full URL is entered (e.g. "http://example.com").

In my PHP I can grab the variable from the query string with no problem. My problem occurs when I get the referring URL and parse it (I need to parse the referring URL to get the host name for other uses). Code below:

$refURL = getenv("HTTP_REFERER");

$parseRefURL = parse_url($refURL);

WORKS when incoming link is (for example):

http://example.com/?ref=REFERRER'S-ID&sid=www.test.com

ERROR when incoming link is (notice the addition of "http://" after "sid="):

http://example.com/?ref=REFERRER'S-ID&sid=http://www.test.com

Here is the warning message:

Warning: parse_url(/?ref=REFERRER'S-ID&sid=http://www.test.com) [function.parse-url]: Unable to parse url in /home4/'directory'/public_html/hosterdoodle/header.php on line 28

Any ideas on how to keep `parse_url()` from being thrown off when someone decides to place a URL in a variable? (I tested this down to the point that it will throw the error with as little as ":/" in the variable.)

A: 

Why not sanitize the $refURL by using str_replace to strip out the http://?
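A minimal sketch of one way to sanitize the string before parsing. Note that it swaps `str_replace` for a different technique: stripping "http://" from the whole referrer would also remove the leading scheme, so this version cuts the query string off instead. The helper name `without_query` is hypothetical, and the sketch assumes only the host is needed from the result.

```php
<?php
// Hypothetical helper (not from the original post): drop everything from the
// first "?" onwards so parse_url() never sees the query string.
function without_query($url)
{
    $parts = explode('?', (string) $url, 2);
    return $parts[0];
}

$url  = "http://example.com/?ref=REFERRER'S-ID&sid=http://www.test.com";
$data = parse_url(without_query($url)); // parses only "http://example.com/"
```

The query values are still available separately (for example through `$_GET` on the landing page), so nothing is lost by keeping them away from `parse_url()`.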

Mike Keller
Thank you. I will look into that.
Eric O
Cool let us know how it goes.
Mike Keller
I have created a variety of versions of str_replace functions, but I have determined that there is some yet-to-be-determined bug elsewhere causing my problem. `parse_url()` is getting the referring URL just fine, yet for some reason it is also grabbing the incoming link's query string, and it creates the error if one of the variables is a URL. It is not responding to str_replace. Thanks for the idea though!
Eric O
+1  A: 

The following portion of code :

$url = "http://example.com/?ref=REFERRER'S-ID&sid=http://www.test.com";
$data = parse_url($url);
var_dump($data);

is working fine for me (PHP 5.3.2), and gives the following output :

array
  'scheme' => string 'http' (length=4)
  'host' => string 'example.com' (length=11)
  'path' => string '/' (length=1)
  'query' => string 'ref=REFERRER'S-ID&sid=http://www.test.com' (length=41)


Are you sure that you're passing a full URL to parse_url ?

If I use this portion of code :

$url = "/?ref=REFERRER'S-ID&sid=http://www.test.com";
$data = parse_url($url);

I get the same warning as you :

Warning: parse_url(/?ref=REFERRER'S-ID&sid=http://www.test.com) 
[function.parse-url]: Unable to parse URL

But this is when not passing a full URL...
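To make the distinction concrete, here is a minimal sketch that only hands `parse_url()` a string that starts with a scheme and treats everything else as "no usable referrer". The helper name `referer_host` is hypothetical, and restricting the check to http/https is an assumption about what this site needs.

```php
<?php
// Hypothetical helper (not from the original post): returns the host of an
// absolute http(s) URL, or null when the input is missing or only partial.
function referer_host($referer)
{
    // getenv() returns false when the variable is not set.
    if (!is_string($referer) || $referer === '') {
        return null;
    }
    // Skip partial URLs such as "/?ref=...": they lack a leading scheme.
    if (!preg_match('#^https?://#i', $referer)) {
        return null;
    }
    // With a component constant, parse_url() returns a string, null or false.
    $host = parse_url($referer, PHP_URL_HOST);
    return is_string($host) ? $host : null;
}

$host = referer_host(getenv('HTTP_REFERER'));
```

Because the partial string never reaches `parse_url()`, the warning cannot be triggered from this call, and the calling code only has to handle `null`.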

Pascal MARTIN
Works fine for me, too. On PHP4 and 5.
webbiedave
Eric O
Ahhh...your example has opened my eyes to a possible looping issue I may have that would take the string into play. Will test...
Eric O
It must be a looping issue, since I use a header('Location:') further down to clean the URL address from the clutter of the variable strings. Weird though: try your test again with no "http://": `$url = "/?ref=REFERRER'S-ID$data = parse_url($url);` It will not create an error. It seems, though, that I will have to clean out the string before running header('Location:') further down.
Eric O
The manual page ( http://www.php.net/parse_url ) states that *Partial URLs are also accepted, `parse_url()` tries its best to parse them correctly.* With no `http://` at the beginning, the URL you're passing to `parse_url` is a partial one, and it's doing its best to find the information in the provided string. An `http://` right in the middle of it must feel strange to it, I suppose ^^
Pascal MARTIN
Good to know. I eliminated all `header('Location:')` elements from my code to prevent a looping issue from throwing the query string into the `parse_url()` function and the problem STILL persists! Why is `parse_url()` even looking at the query string of the incoming link when all it should be parsing is the referrer's url? There is no way it should even know what the query string is on the current page...right?
Eric O
That's a good question. Are you sure you are passing to `parse_url` what you think you are? What about echoing the URL before calling `parse_url` on it? As a side note: the Referer is not always sent by the browser *(and can be faked/forged)*, so your application must not rely on it to work.
Pascal MARTIN
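A minimal sketch of that check. The string in the warning from the question starts with "/?ref=", which looks more like the current request URI than a referrer, so this logs both candidates side by side. The helper name `describe_inputs` is hypothetical, and comparing against `REQUEST_URI` is only a guess at where the string might come from.

```php
<?php
// Hypothetical helper (not from the original post): describe the candidate
// inputs so that a missing value, '' and a real URL are distinguishable.
function describe_inputs(array $server)
{
    $lines = array();
    foreach (array('HTTP_REFERER', 'REQUEST_URI') as $key) {
        $value = isset($server[$key]) ? $server[$key] : null;
        // var_export() quotes strings and prints NULL for missing values.
        $lines[] = $key . ' = ' . var_export($value, true);
    }
    return $lines;
}

// Log right before the parse_url() call that raises the warning.
foreach (describe_inputs($_SERVER) as $line) {
    error_log($line);
}
```

If the logged `REQUEST_URI` matches the string shown in the warning, the value reaching `parse_url()` is not the referrer at that point in the code.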
Yes, I echoed the URL from a test site and got `http://randomabs.com/`. And yet it is still confused by the query string in the incoming link containing a URL. Frustrating. Thank you for the note, too. The referrer is a VERY helpful element for me to capture, but the application does not rely on it alone to produce the desired results. I would very much hate to not include it.
Eric O