Hello everybody. I need to check whether a URL has already been submitted to a database, using PHP.

I have a database table where I store URLs submitted by users. Before inserting a URL, I want to check whether it has already been submitted. For example, http://www.example.com, http://www.example.com/, http://example.com and http://example.com/ are all the same URL, so if any one of them is already in the database the check should return false for the others. I think this can be done with a regular expression, but I am a little weak in regex, so I need your help. Thank you.

Edited

Hello, to make this clearer, let's assume the URLs are in an array rather than a database. I know about unique keys and about matching a URL against the result from the database. But I have a different question, if you look at it carefully.

$urls = array('http://www.example.com/newpage.html', 'http://www.example.com/newpage.html');

Case: a user submits a URL, say http://example.com/newpage.html

The URL http://www.example.com/newpage.html is already in the $urls array, and http://www.example.com/newpage.html and http://example.com/newpage.html (the user input, without www) are the same page. So I need a function that checks for this and returns false if the URL is already in the array. I hope I have made myself clear now.

So I don't think this is about checking a domain or making the URL field a unique key in the MySQL table. I think we need a regular expression for it. Any help?

A: 

It's not clear what you mean by URLs in your question. URLs are resources. If you mean HTTP variables sent via GET and saved to a database, you can use the value of one of those variables as a primary key for searching in the database.

stillstanding
please see the edits above
askkirati
A: 

Like Ben James says, www.example.com and example.com are not the same. Also your meaning of URL is a bit vague.

But if you want to check whether example.com already exists, just do a count on your database with LIKE:

select count(*) from table where url like '%.example.com%'

Here example.com is extracted from the complete URL. If the count is greater than 0, the domain is already in the database. You will have to fine-tune this solution, but I would use something like this.

Stegeman
breaks on `http://badexample.com`
nickf
@nickf: Basically it breaks on a lot more, that's why I mentioned the finetune. Added a small fix for your issue.
Stegeman
Also, should example.com/test and example.com/blue be the same? I think not. Adding the request URI to the check should help. So `WHERE \`url\` LIKE '%example.com/test%'`.
henasraf
I think that will also match if the URL is http://example.com/newpage.html, but that's not the same URL. I am not looking to match the same domain, but the same URL.
askkirati
In that case it's not much different. Just use "url = 'www.example.com/whatever/comes/after' OR url = 'example.com/whatever/comes/after'"
Stegeman
please see the edits above
askkirati
A: 

The database is a different layer in your application. Regex won't help here, because you will have to check what is inside the database first to be able to use Regex on the resultset.

However, you can just make the column storing the URLs UNIQUE and use INSERT IGNORE.

From the MySQL manual:

If you use the IGNORE keyword, errors that occur while executing the INSERT statement are treated as warnings instead. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row still is not inserted, but no error is issued.

This would still insert example.com and www.example.com though as they are really different strings. You could use parse_url to check and prepare them before insertion.
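A rough sketch of that parse_url preparation, using the array form from the edited question. The function names `normalize_url` and `url_already_submitted` are made up for illustration, and the rules are assumptions rather than anything the question specifies: a leading "www." and a trailing slash are ignored, scheme and host are lowercased, and port and fragment are dropped.

```php
<?php
// Hypothetical helper: reduce a URL to a canonical string for comparison.
// Assumptions: "www." prefix and trailing slash do not matter; scheme and
// host are case-insensitive; port and fragment are ignored.
function normalize_url(string $url): ?string
{
    $parts = parse_url(trim($url));
    if ($parts === false || !isset($parts['host'])) {
        return null; // not an absolute URL we can compare
    }

    $scheme = strtolower($parts['scheme'] ?? 'http');
    $host   = strtolower($parts['host']);
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4); // drop the leading "www."
    }

    $path  = rtrim($parts['path'] ?? '', '/'); // "/" and "" compare equal
    $query = isset($parts['query']) ? '?' . $parts['query'] : '';

    return $scheme . '://' . $host . $path . $query;
}

// Returns true when an equivalent URL is already in $existing.
function url_already_submitted(string $url, array $existing): bool
{
    $needle = normalize_url($url);
    if ($needle === null) {
        return false;
    }
    foreach ($existing as $candidate) {
        if (normalize_url($candidate) === $needle) {
            return true;
        }
    }
    return false;
}
```

If the normalized form is what gets stored, the UNIQUE index can then do the duplicate check on its own.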

Gordon
A: 

Maybe making the field UNIQUE will help; that way MySQL will check the value. If you get error code 1062, you'll know that it's already in the database.

Of course, it may not be a good idea if you have a very large number of records. You should also normalize the URL with PHP so that you always insert it in the same form (for example, by always adding or always removing 'http://' or 'www').
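A minimal sketch of the 1062 check with PDO, assuming a hypothetical table `urls` with a UNIQUE index on its `url` column and a connection in exception mode (`PDO::ERRMODE_EXCEPTION`). The helper names are invented for illustration.

```php
<?php
// MySQL reports a UNIQUE / PRIMARY KEY violation as error 1062 (ER_DUP_ENTRY).
const MYSQL_ER_DUP_ENTRY = 1062;

// True when the exception carries MySQL's duplicate-key error code.
function is_duplicate_key_error(PDOException $e): bool
{
    // errorInfo is [SQLSTATE, driver-specific code, driver message]
    return isset($e->errorInfo[1])
        && (int) $e->errorInfo[1] === MYSQL_ER_DUP_ENTRY;
}

// Hypothetical: inserts into `urls`; returns false if the URL was a duplicate.
// Assumes $pdo was created with PDO::ERRMODE_EXCEPTION.
function insert_url(PDO $pdo, string $url): bool
{
    try {
        $stmt = $pdo->prepare('INSERT INTO urls (url) VALUES (?)');
        $stmt->execute([$url]);
        return true;
    } catch (PDOException $e) {
        if (is_duplicate_key_error($e)) {
            return false; // already in the database
        }
        throw $e; // any other failure is a real error
    }
}
```

The URL passed to `insert_url` should already be in its normalized form, otherwise the unique index will treat the www and non-www variants as different strings.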

Sinan