My first reaction to this problem would be a middleware solution combined with some common SEO practices. Because you have a fairly narrow field of options in your URL schema, this can be a viable option.
The middleware would be responsible for performing two actions on each request.
- Parse
request.path
looking for the pieces of your url.
- Create a URL that is specific for the gender/size/color/material.
Quickly hacking something together, it may look something like this:
class ProductFilterMiddleware:
GENDERS = ("girls", "boys")
MATERIALS = ("cotton", "wool", "silk")
def proc_url(self, path):
""" Process a path looking for gender, color, material and size. """
pieces = [x for x in path.split('/') if x != '']
prod_details = {}
for piece in pieces:
if piece in self.GENDERS:
prod_details['gender'] = piece
elif piece in self.MATERIALS:
prod_details['material'] = piece
elif re.match(r'\d+', piece):
prod_details['size'] = piece
else:
prod_details['color'] = piece
return prod_details
def get_url(self, prod_details):
""" Parse the output of proc_url() to create the correct URL. """
pieces = []
if 'gender' in prod_details:
pieces.append(prod_details['gender'])
if 'material' in prod_details:
pieces.append(prod_details['material'])
if 'size' in prod_details:
pieces.append(prod_details['size'])
if 'color' in prod_details:
pieces.append(prod_details['color'])
return '/%s/' % '/'.join(pieces)
def process_view(self, request, view_func, args, options):
request.product_details = self.proc_url(request.path)
request.product_url = self.get_url(request.product_details)
This would allow arbitrary links to be created to your products without your advanced knowledge, allowing the system to be flexible with its URLs. This also includes partial URLs (just a size and material, show me all the color choices). Any of the following should be parsed without incident:
/56/cotton/red/boys/
/cotton/56/
/green/cotton/red/girls/
/cotton/
From here, your view can then create a list of products to return using request.product_details
as its guide.
Part two of this solution is to then include canonical tags in each of the pages you output. This should prevent duplicate content from adversely affecting your SEO.
<link rel="canonical" href="http://www.example.com/{{ request.product_url }}" />
Warning: Google and other search engines may still hammer your site, requesting information from each URL that it can find. This can create a nasty load on your server very quickly. Because content is available from so many different locations, the spider may dig around quite a bit, even though it knows that only one copy of each page is the real deal.