Hello,

My site allows users to enter URLs into a database. I am using the code "$site = strtolower($site);" to make all of these URLs lower-case.

However, I just realized that Wikipedia URLs are case sensitive, so I would like to avoid using "$site = strtolower($site);" on Wikipedia URLs, all of which contain "wikipedia.org".

How could I write a function that will skip over the step "$site = strtolower($site);" if $site contains "wikipedia.org"?

Thanks in advance,

John

A: 

stripos() will tell you if a string exists at any point within another. Check that, and if it's not there, strtolower() will be safe.
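
As a rough sketch of that check (the function name lowercase_unless_wikipedia is made up for illustration; only stripos() and strtolower() come from this thread), it could look something like this:

```php
<?php
// Hypothetical helper: lowercases $site unless it mentions wikipedia.org.
function lowercase_unless_wikipedia($site) {
    // stripos() returns 0 when the match is at the very start of the string,
    // so the result must be compared with !== false rather than used as a boolean.
    if (stripos($site, 'wikipedia.org') !== false) {
        return $site; // leave Wikipedia URLs untouched
    }
    return strtolower($site);
}

echo lowercase_unless_wikipedia('HTTP://NYTimes.com/Index'), "\n";         // http://nytimes.com/index
echo lowercase_unless_wikipedia('http://en.wikipedia.org/wiki/PHP'), "\n"; // http://en.wikipedia.org/wiki/PHP
```

Note that this still lowercases the path of every non-Wikipedia URL, which may not be what you want for other case-sensitive sites.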

Jonathan Sampson
+3  A: 
if (stristr($site, 'wikipedia.org') === FALSE) {
    // stristr() is case-insensitive and returns FALSE when there is no match
    echo "doesn't contain wikipedia.org";
} else {
    echo "wikipedia.org!";
}

Update

Just a short note on storing URLs in your database. It is NOT uncommon for a directory, file, username, password or parameter on a server to contain uppercase characters. Although the interpretation depends on the underlying OS, web server and code, I would strongly recommend not using strtolower() on anything except maybe the domain and protocol.

merkuro
Nice, didn't know there was a "stristr" function
John Rasch
+8  A: 

All URLs on *nix servers are case-sensitive. Some URLs on Windows servers are also case-sensitive.

Edit: The domain name is case-insensitive (actually, the client converts it to lowercase).

http://user:pass@domain.tld/somedir/somefile.ext?someQueryString=someValue#fragment
=======----------==========--------------------------------------------------------

Legend:
    - : Case sensitive
    = : Case insensitive

Note: By specification fragments are supposed to be case sensitive but it is not implemented that way on all clients.

What you are trying to do is a very bad idea. The best approach is to lowercase only the domain name.

Edit 2: Since you asked, here is a function that will properly lowercase a given URL (scheme and domain only):

function urltolower($url) {
    // parse_url() returns FALSE on seriously malformed URLs
    $parts = @parse_url($url);
    if ($parts === FALSE) return FALSE;
    $url = '';

    // Only the scheme and host are lowercased; every other component is copied verbatim.
    // isset() is used where a legitimate value such as "0" would count as empty().
    if (!empty($parts['scheme'])) {
        $scheme = strtolower($parts['scheme']);
        $url .= $scheme . (($scheme == 'file') ? ':///' : '://');
    }
    if (isset($parts['user'])) $url .= $parts['user'] . (isset($parts['pass']) ? ':' . $parts['pass'] : '') . '@';
    if (!empty($parts['host'])) $url .= strtolower($parts['host']);
    if (!empty($parts['port'])) $url .= ':' . $parts['port'];
    if (isset($parts['path'])) $url .= $parts['path'];
    if (isset($parts['query'])) $url .= '?' . $parts['query'];
    if (isset($parts['fragment'])) $url .= '#' . $parts['fragment'];

    return $url;
}

[mixed] urltolower($url)

Lowercases the scheme and host of a URL. Returns the resulting URL on success, or FALSE on failure.

Example:

echo urltolower('HTTP://en.WikiPedia.org/wiki/PHP');
// prints http://en.wikipedia.org/wiki/PHP
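
One caveat worth sketching: parse_url() only reports a host when the input has a scheme, so a bare entry such as NYtimes.com is parsed as a path and would pass through urltolower() unchanged. The helper below is a hypothetical illustration (the name site_host_key and the decision to assume http:// for scheme-less input are my assumptions, not part of the function above) of how a lowercase host could be extracted for duplicate checks:

```php
<?php
// Hypothetical helper: returns the lowercased host of a user-entered site,
// or null when no host can be determined.
function site_host_key($input) {
    $input = trim($input);
    // Assumption: input without a scheme is meant as an http URL.
    if (strpos($input, '://') === false) {
        $input = 'http://' . $input;
    }
    // With a component argument, parse_url() returns null if the component
    // is absent and false if the URL is seriously malformed.
    $host = parse_url($input, PHP_URL_HOST);
    if ($host === false || $host === null) {
        return null;
    }
    return strtolower($host);
}

var_dump(site_host_key('NYtimes.com') === site_host_key('http://nytimes.com/')); // bool(true)
```

Only the host is compared here, so the case of the path (for example /wiki/PHP) is never altered.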
Andrew Moore
I think my site is hosted on a Unix server, and when I type all caps in the address bar, it goes to my site name in lower-case letters (at least on Google Chrome).
@John: This happens only when your server is configured to use mod_speling or an equivalent; that is uncommon.
J-16 SDiZ
@John: domain names are case-insensitive. So, let's say your domain name is example.com and you have a file called test.txt. EXAMPLE.COM/test.txt will work, EXAMPLE.COM/Test.txt won't, example.com/Test.txt won't, example.com/test.txt will.
Andrew Moore
Hmmm. You have me thinking. I might just drop the strtolower altogether. I just want to avoid redundant entries in the database, i.e., I don't want NYtimes.com and nytimes.com to both be in it.
@John: Then just lowercase the domain name.
Andrew Moore
Also, if you changed your mind, might want to change your accepted answer.
Andrew Moore
Yeah... is there an easy way to lower-case the domain only, or does it require a complicated regex?
@John: I just modified my post to include such a function.
Andrew Moore
@John: Any luck?
Andrew Moore
+1  A: 

This is a bad idea. URLs in general are allowed to be case-sensitive, so why would you throw away info? If you do this, you'll have to add exception after exception.

Matthew Flaschen
It seems to me that all URLs except Wikipedia are case-insensitive. Try typing in NYTIMES.com, and you're redirected to nytimes.com.
All domains are case-insensitive. This is not true for all URLs, however.
John Rasch
John, domains are case-insensitive (http://WIKIPEDIA.ORG goes to http://wikipedia.org too). Paths are not.
Matthew Flaschen
A: 

A simple if condition can solve your problem. Set the conditions to match your requirements.