I have a table that contains a list of websites.

I have a SQL variable @url, and I want to select all the rows where the website is equal to @url. However, if the URL has no subdomain, I want to match both the www version and the root domain.

So, if @url='http://website.com' it should match both 'http://website.com' and 'http://www.website.com'

I'm currently doing:

select * from websites a
where a.archived = 0
and @url like replace(a.WebsiteURL, '*', '%')
order by newid()

Obviously that doesn't take into account the www issue.

Any ideas how to solve this problem?

I was thinking of forcibly adding the www. when it is missing (and the URL doesn't have a subdomain) before inserting it into the DB. But what about websites that don't actually use www and redirect www to the root?

+1  A: 

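You could use a RegexMatch-style function (or similar) in SQL. Note that older SQL Server versions have no built-in regex support, so this is typically added as a CLR user-defined function. With a regex test of "http://(?:www\.)?whatever\.com", you should match both "http://whatever.com" and "http://www.whatever.com". The dots are escaped so that they match a literal "." rather than any character.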

John Fisher
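
To illustrate the optional-www pattern outside the database, here is a minimal sketch in Python rather than SQL. The helper name matches_site is hypothetical and only serves the example; the point is that the "www." group is optional and the dots are escaped.

```python
import re

def matches_site(url: str, root_domain: str) -> bool:
    """Return True if url is http://root_domain, with or without a leading www."""
    # re.escape makes the dots in the domain literal instead of "any character".
    pattern = r"http://(?:www\.)?" + re.escape(root_domain)
    return re.fullmatch(pattern, url) is not None

if __name__ == "__main__":
    for candidate in ("http://whatever.com",
                      "http://www.whatever.com",
                      "http://wwwXwhatever.com"):
        print(candidate, matches_site(candidate, "whatever.com"))
```

With an unescaped dot, a string such as "http://wwwXwhatever.com" would also pass, which is probably not intended.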
A: 
Take the two stored values:

http://website.com
http://www.website.com

Reverse the strings and keep them in a separate field:

moc.etisbew//:ptth
moc.etisbew.www//:ptth

-- Do this once to the input string

SearchStr = replace(replace(Str, 'http://', ''), 'www.', '')

Reverse the input string as well:

moc.etisbew

Now you can search with a prefix match (like 'moc.etisbew%') and use an index. You could create an update trigger to automatically populate the reversed field, which is never shown to the user; it is only an indexable field.

It's a little strange, but it should give you pretty good lookup performance.

Sparky
Very strange, actually. Can you explain the reversal to me?
Tommy
If you need to store both versions of the website, then this technique makes the rightmost portion of the website name the lookup key. A partial-name lookup such as 'JO*', which returns JONES, JOHNSON, etc., can use an index because the wildcard comes at the end. By reversing the website and the lookup string, the wildcard lookup *.COM (all websites ending in .COM) becomes MOC.*, which can also be done using the index. With the wildcard in the first position, a table scan is needed.
Sparky
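
The reversal idea can be sketched in a few lines of Python (not SQL). This is only an illustration under assumptions: the names search_key and reversed_column are hypothetical, the stored list stands in for the table, and startswith stands in for an index-friendly LIKE 'prefix%' lookup.

```python
def search_key(url: str) -> str:
    """Strip the scheme and a leading www., then reverse what is left."""
    host = url
    for prefix in ("http://", "www."):
        if host.startswith(prefix):
            host = host[len(prefix):]
    return host[::-1]

def reversed_column(url: str) -> str:
    """Value that would be kept in the hidden, indexable reversed field."""
    return url[::-1]

# Stand-in for the stored WebsiteURL values.
stored = ["http://website.com", "http://www.website.com", "http://other.org"]

key = search_key("http://website.com")  # reversed host, used as the prefix
matches = [u for u in stored if reversed_column(u).startswith(key)]

if __name__ == "__main__":
    print(key)
    print(matches)
```

Note that a pure prefix match like this would also accept other subdomains, for example http://mail.website.com, so an extra check may be needed if only www and the root should match.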
A: 

You could divide the WebsiteURL column into WebsiteScheme, WebsiteSubdomain, and WebsiteDomain columns. The check then becomes:

@url = WebsiteScheme + '://' + WebsiteSubdomain + '.' + WebsiteDomain
OR (@url = WebsiteScheme + '://' + WebsiteDomain AND WebsiteSubdomain IN ('', 'www'))
Alexander Prokofyev
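
A rough sketch of the split-column check in Python (not SQL) might look like the following. The function names split_url and row_matches are hypothetical, and the split assumes the root domain is always the last two labels of the host, which does not hold for suffixes such as .co.uk. Treating an empty subdomain like www is also an assumption made here so that root-only rows match.

```python
from urllib.parse import urlsplit

def split_url(url: str):
    """Split a URL into (scheme, subdomain, domain).

    Assumption: the root domain is the last two labels of the host.
    """
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    domain = ".".join(labels[-2:])
    subdomain = ".".join(labels[:-2])  # empty string when there is none
    return parts.scheme, subdomain, domain

def row_matches(url: str, row) -> bool:
    """Apply the two-branch check to one (scheme, subdomain, domain) row."""
    scheme, subdomain, domain = row
    # Branch 1: the URL names the row's subdomain explicitly.
    if subdomain and url == f"{scheme}://{subdomain}.{domain}":
        return True
    # Branch 2: a root-only URL matches rows with no subdomain or with www.
    return url == f"{scheme}://{domain}" and subdomain in ("", "www")

if __name__ == "__main__":
    rows = [split_url(u) for u in ("http://website.com",
                                   "http://www.website.com",
                                   "http://mail.website.com")]
    print([row for row in rows if row_matches("http://website.com", row)])
```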