I have a model like this:
from django.db import models

class CampaignPermittedURL(models.Model):
    hostname = models.CharField(max_length=255)
    path = models.CharField(max_length=255, blank=True)
I will frequently be handed a URL, which I can split with urlsplit into a hostname and a path. I'd like the end user to be able to enter a hostname (yahoo.com) and, optionally, a path (weddings).
I then need to detect when a URL does not 'match' that hostname/path combination, like so:
- success: www.yahoo.com/weddings/newyork
- success: yahoo.com/weddings
- failure: cnn.com
- failure: cnn.com/weddings
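To pin down what I mean by 'match', here is a minimal pure-Python sketch of the rule, with no database involved. The function name `is_permitted` is only for illustration. I'm assuming that the URL carries a scheme, that hostnames match on whole dot-separated labels, that paths match on whole slash-separated segments, and that a blank path permits anything on that host.

```python
from urllib.parse import urlsplit


def is_permitted(url, hostname, path=""):
    """Return True if url falls under the permitted hostname/path pair."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    permitted_host = hostname.lower()
    # Match the host itself or any subdomain, on whole labels only,
    # so "notyahoo.com" does not match "yahoo.com".
    if host != permitted_host and not host.endswith("." + permitted_host):
        return False
    permitted_path = path.strip("/")
    if not permitted_path:
        return True  # assumption: a blank path permits the whole host
    url_path = parts.path.strip("/")
    # Match the path itself or anything beneath it, on whole segments only.
    return url_path == permitted_path or url_path.startswith(permitted_path + "/")
```

Checking one URL against one record this way is simple; the hard part is doing it against every stored record at once.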
I think the best way to do this is:
from urllib.parse import urlsplit

url = urlsplit("http://www.yahoo.com/weddings/newyork")
# split hostname on "." and path on "/" to build the alternatives below
matches = CampaignPermittedURL.objects.filter(
    hostname__regex=r'^(com|yahoo\.com|www\.yahoo\.com)$',
    path__regex=r'^(weddings|weddings/newyork)$')
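The splitting step could look something like the sketch below. The helper names `hostname_candidates` and `path_candidates` are hypothetical, and I'm assuming stored paths have no leading or trailing slashes. Since the candidates are exact strings, one variation would be to pass them to Django's `__in` lookup instead of building a regex, which may let PostgreSQL use an ordinary index.

```python
from urllib.parse import urlsplit


def hostname_candidates(hostname):
    """Every dot-separated suffix of the hostname, longest first."""
    labels = hostname.lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def path_candidates(path):
    """Every slash-separated prefix of the path, starting with the blank path."""
    segments = [s for s in path.split("/") if s]
    return ["/".join(segments[:i]) for i in range(len(segments) + 1)]


url = urlsplit("http://www.yahoo.com/weddings/newyork")
hosts = hostname_candidates(url.hostname)
paths = path_candidates(url.path)

# With the model above, the lookup could then use exact matches:
# matches = CampaignPermittedURL.objects.filter(
#     hostname__in=hosts, path__in=paths)
```

Including the blank path among the candidates means a record with no path would match any URL on that host, which is the behaviour I'm assuming I want.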
Does anybody have better ideas? I am using PostgreSQL and would otherwise want to try Django Full Text Search but I'm not sure if that's worth it or if it really fits my needs any better than this. Are there other methods that are equally fast?
Keep in mind that my method has the URL passed to it and that the CampaignPermittedURL table may hold many hundreds of records. I am looking for extensible/maintainable solutions first and foremost, but it also needs to be efficient, since this will be scaled to several hundred calls a second.
I'm also fine with using another back-end (Sphinx?), but my main concern is staying with standard Django as much as possible.