views:

168

answers:

7

I am a big proponent of using super clean URLs for all pages and lists. Generally my pagination URLs are just example.com/section/page/2 and tags are example.com/tags/tagname. I generally even try to leave the row id out of the URL.

But how would you guys suggest doing a filter list?

Say you have a list of cars and you want to sort by a type, color, price or a combination of those. Say you want to filter the list to get all sedans that are green.

It makes the most sense to me just to:

example.com/cars/?color=green&type=sedan&order=price

It doesn't look very nice at all... I can read it just fine though.

But..

example.com/cars/green/sedan/price

doesn't make any sense. Also it would be a mofo to try to figure out a routing scheme for this.

Also, how does this work with SEO? Will Google crawl past the ? at all? Is it good or bad if it does crawl the params? Would Google indexing endless permutations of the same data have ill effects?

A: 

http://httpd.apache.org/docs/2.0/mod/core.html#acceptpathinfo

Brad
that doesn't help at all. Yes I am using apache but my question is about usability and seo.
+2  A: 

I've seen a couple of schemes that use named parameters in the clean URL. For example:

example.com/cars/color:green/type:sedan/order:price
example.com/cars/page:2

The code to implement it can be a little tricky, but it's easy to understand for the user.

Jason
That does look pretty nice. I've never used a colon in a url before. I'm gonna have to think about how I could route that. Looks promising.
I believe the trick is grabbing the URL before passing it to the rest of your code and picking them out with a regular expression. Any non-named arguments are then passed normally, while the named arguments are stored in a keyed array.
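A minimal sketch of that trick, assuming a simple key:value convention in path segments (the function and regex names here are hypothetical, not from any particular framework):

```python
import re

# Matches a "named" segment of the form key:value.
NAMED_RE = re.compile(r'^(\w+):(.*)$')

def parse_clean_url(path):
    """Split a clean URL path into positional and named (key:value) arguments."""
    positional = []
    named = {}
    for segment in path.strip('/').split('/'):
        match = NAMED_RE.match(segment)
        if match:
            named[match.group(1)] = match.group(2)
        else:
            positional.append(segment)
    return positional, named

# Example: example.com/cars/color:green/type:sedan/order:price
args, params = parse_clean_url('/cars/color:green/type:sedan/order:price')
print(args)    # ['cars']
print(params)  # {'color': 'green', 'type': 'sedan', 'order': 'price'}
```

The named arguments end up in a keyed array (a dict here), while anything without a colon falls through as a normal positional argument.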
Jason
Beware of duplicate content, as your filters could produce duplicate content pages. Make them non-indexable using META robots tags.
dusoft
A: 

Google long ago started crawling dynamic pages with query strings, so a '?' in a URL is not a problem for Google.

The reason we use clean URLs is not that search engines dislike query strings; clean URLs are meant for the users. So if all you are concerned about is SEO, don't sweat it.

Kailash Badu
"the reason we use clean urls is not because search engines don't like it. " - don't say that. I could imagine that the cleaner the URL, the more weight a search term in it gets. That's just my speculation though.
Pekka
Yes I know. I said that it is for the users, "the param string doesn't look very nice". I did say I wondered about the google crawling part, but mainly because there could be limitless combinations of redundant content and how that would affect seo.
that's just your speculation and even the wrong one. yes, keywords in the URL will help, but only if someone links to you using the full URL as an anchor text.
dusoft
A: 

Also how does this work with SEO? Will google crawl after the ?.

Normally, it will.

Is that good if it does or doesn't crawl the params?

That's up to you to decide, isn't it? I could imagine a number of general queries that would be good to have in the index, and a great number of very detailed ones that would not. Maybe you want to prepare a few queries to be indexed and block out all the rest.

Pekka
+1  A: 

When you have a single list, it should only be indexed by Google once. Remember "duplicate content"?

So generally I would not suggest having the filtered lists in Google (though maybe it fits your service).

I avoid using clean URLs for filter params, and in my robots.txt I block all URLs with query params.
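For reference, blocking every URL that carries a query string can be done along these lines. Note that wildcard patterns like this are a Google extension, honoured by the major crawlers but not part of the original robots.txt specification:

```
User-agent: Googlebot
Disallow: /*?
```

With filters kept in the query string and filter-free category pages on clean URLs, this leaves only the canonical lists indexable.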

d0nut
This is exactly the info I was looking for. Thanks for the suggestions.
A: 

If your objective is purely SEO and indexing, then do not alter your URI addresses. Instead, use the canonical link tag and link to the address of the page that you wish to be indexed. The canonical link tag is supported by all major search engines (or soon will be) and prevents redundant indexing of the same page under different URI addresses.

<link rel="canonical" href="http://example.com/cars/green/sedan/price" type="text/html"/>

The type attribute is not required, but I find it is generally good practice to indicate mime types on all requested or directed media.

A: 

The main thing to think about is what happens when the user wants to find all sedans, but doesn't care about the colour?

Personally I would order the fields by how "high level" they are, as if they are categories. For example, put the type first, then maybe the colour. Any variables that only change the way you look at the data, like ordering, keep in the query string.

So you'd end up with these:

All sedans: example.com/cars/sedan
All green sedans: example.com/cars/sedan/green
Sedans, ordered: example.com/cars/sedan?order=price
Green sedans, ordered: example.com/cars/sedan/green?order=price
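A rough sketch of routing this scheme, assuming type and colour come from fixed vocabularies so ambiguous path segments can be told apart (the vocabularies and function name here are made up for illustration; in practice they would come from your database):

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical vocabularies for disambiguating path segments.
KNOWN_TYPES = {'sedan', 'coupe', 'wagon'}
KNOWN_COLORS = {'green', 'red', 'blue'}

def route_cars(url):
    """Map /cars[/type][/color][?order=...] to a filter dict."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.strip('/').split('/') if s]
    filters = {}
    for segment in segments[1:]:          # skip the leading 'cars'
        if segment in KNOWN_TYPES:
            filters['type'] = segment
        elif segment in KNOWN_COLORS:
            filters['color'] = segment
    query = parse_qs(parsed.query)
    if 'order' in query:                  # presentation-only params stay in the query string
        filters['order'] = query['order'][0]
    return filters

print(route_cars('/cars/sedan/green?order=price'))
# {'type': 'sedan', 'color': 'green', 'order': 'price'}
```

Because each segment is matched against a known vocabulary rather than a fixed position, /cars/sedan works just as well as /cars/sedan/green, which handles the "user doesn't care about colour" case above.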

DisgruntledGoat