I am using PHP.
Given two URLs such as http://soccernet.com and http://soccernet.espn.go.com/index?cc=4716,
how can I tell that they actually point to the same site?
Also consider the case where the only difference is the scheme (http vs. https), such as https://gmail.com and http://gmail.com.
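For the scheme-only case, a minimal sketch of what I have in mind is below. It assumes that a leading "www." and a trailing slash do not matter, which is my assumption and not always true. The function names `normalizeUrl` and `sameIgnoringScheme` are made up for illustration. It only uses `parse_url`, so it does not help with the soccernet case.

```php
<?php
// Reduce a URL to a comparable form: lowercase host, scheme dropped,
// leading "www." dropped, trailing slash on the path dropped.
function normalizeUrl(string $url): ?string
{
    $parts = parse_url(trim($url));
    if (!is_array($parts) || !isset($parts['host'])) {
        return null; // not an absolute URL, nothing to compare
    }
    $host = strtolower($parts['host']);
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4); // assumption: "www." is equivalent to the bare host
    }
    $path  = rtrim($parts['path'] ?? '', '/');
    $query = isset($parts['query']) ? '?' . $parts['query'] : '';
    return $host . $path . $query;
}

// True when both URLs normalize to the same string, whatever their scheme.
function sameIgnoringScheme(string $a, string $b): bool
{
    $na = normalizeUrl($a);
    $nb = normalizeUrl($b);
    return $na !== null && $na === $nb;
}
```

With this, `sameIgnoringScheme('https://gmail.com', 'http://gmail.com')` returns true, but the two soccernet URLs still compare as different.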
Please advise. I am struggling to do this with regex, because pattern matching is not good at recognizing cases such as the soccernet example, where the two URLs look different.
I am open to any good ideas and am not limiting myself to regex.
Edit: Thanks for all the comments and answers below. Is there a good way to arrive at a level of certainty that two URLs are the same? What factors should I look at, and how do I check them in the most efficient way?
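To make the "level of certainty" idea concrete, here is a rough sketch of the kind of scoring I am imagining. The factors (host, first host label, path, query) and the weights 60/30/30/10 are arbitrary placeholders I picked for illustration, and `bareHost` and `urlSimilarity` are hypothetical names. It only compares the strings and never fetches either URL, so it cannot see redirects.

```php
<?php
// Lowercase the host and drop a leading "www." (assumed equivalent to the bare host).
function bareHost(string $host): string
{
    $host = strtolower($host);
    return strpos($host, 'www.') === 0 ? substr($host, 4) : $host;
}

// Heuristic score from 0 to 100; the scheme is ignored.
// The weights are arbitrary placeholders, not measured values.
function urlSimilarity(string $a, string $b): int
{
    $pa = parse_url(trim($a));
    $pb = parse_url(trim($b));
    if (!is_array($pa) || !is_array($pb) || !isset($pa['host'], $pb['host'])) {
        return 0; // cannot compare without a host on both sides
    }
    $hostA = bareHost($pa['host']);
    $hostB = bareHost($pb['host']);

    if ($hostA === $hostB) {
        $score = 60; // identical host
    } elseif (explode('.', $hostA)[0] === explode('.', $hostB)[0]) {
        $score = 30; // same first label, e.g. soccernet.com vs soccernet.espn.go.com
    } else {
        return 0;    // unrelated hosts: path and query are not worth comparing
    }
    if (rtrim($pa['path'] ?? '', '/') === rtrim($pb['path'] ?? '', '/')) {
        $score += 30; // same path, ignoring a trailing slash
    }
    if (($pa['query'] ?? '') === ($pb['query'] ?? '')) {
        $score += 10; // same query string
    }
    return $score;
}
```

Under these made-up weights the gmail pair scores 100 and the soccernet pair scores 30, so the soccernet case would still need some other check before I could trust it.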