So I'm working on a web app, and I want to filter search results.

A nice RESTful implementation might look like this:

1. mysite.com/clothes/men/hats+scarfs

But let's say we want to Ajax up the filtering, like the cool kids, while retaining deep linking. We might use the URL fragment (the part after #) and parse it with JavaScript to show the correct listings:

2. mysite.com/clothes#/men/hats+scarfs
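A rough sketch of that client-side parsing (assuming jQuery; show_listings and the exact hash format are hypothetical):

// On page load, read the fragment and show the matching listings,
// e.g. #/men/hats+scarfs -> ["men", "hats+scarfs"]
$(function() {
  var hash = window.location.hash.replace(/^#\/?/, "");
  if (hash) {
    show_listings(hash.split("/")); // hypothetical rendering function
  }
});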

However, if someone clicks the first link with JS enabled, and then changes filters, we might get:

3. mysite.com/clothes/men/hats+scarfs#/women/shoes

Urk.

Similarly, if someone does not have JS enabled and clicks link 2, the JavaScript will not parse the options and the correct listings will not be shown.

Are Ajax deep links and non-Ajax links incompatible? It would seem so, as servers cannot parse the # part of a URL: it is never sent to the server.

A: 

If you go to mysite.com/clothes/men/hats+scarfs with JavaScript enabled, then your JavaScript should automatically rewrite that to mysite.com/clothes#men/hats+scarfs. When you click on a filter, the click should be handled by JavaScript, meaning you'll only change the hash rather than the entire URL (as you're going to return false from the handler anyway).

The problem you have is non-JS users arriving at your JS-enabled deep links, as the server can't see that part of the URL. Unfortunately, the only thing you can do is take them to mysite.com/clothes and make them start their journey again (as far as I'm aware). You'll need to try to ensure that when people link to the site, they use the hardcoded deep link rather than the hashed deep link.
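A rough sketch of that approach (assuming jQuery; the selector, update_filter and the URL pattern are hypothetical, and since scripts can't silently change the path portion of a URL, the initial rewrite is done with a redirect):

// On load, bounce path-style deep links to the hash equivalent,
// e.g. /clothes/men/hats+scarfs -> /clothes#men/hats+scarfs
var match = window.location.pathname.match(/^\/clothes\/(.+)$/);
if (match) {
  window.location.replace("/clothes#" + match[1]);
}

// After that, filter clicks only ever change the hash
$("a.filter").click(function() {
  var filter = $(this).attr("href").split("#")[1];
  window.location.hash = filter; // updates the address bar only
  update_filter(filter);         // hypothetical Ajax re-render
  return false;                  // no full page load
});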

JavaScript can't rewrite the address from /clothes/men/hats+scarfs to /clothes#men/hats+scarfs - it can only affect the # part. Unless you mean a redirect, which would work but be a bit clunky: load one page, then redirect to the same thing using the # syntax.
joeformd
A: 

I don't recommend ever using the query string, as you are sending data back to the server without direct relevance to the previously specified destination. That is a potential security hole, since malicious code can be manually added to the query string to attempt an XSS or buffer overflow attack against your web server.

I believe REST was intended to work with absolute URIs without a query string, because then you're specifying only the location of a resource, and it is that location that is descriptive and semantically relevant (as, possibly, is the resource itself). Even if there is no resource at the specified path, you have still created a potentially unique and descriptive location that can be processed accordingly.

A: 

Users entering the site via deep links

Nonsensical links (like /clothes/men/hats#women/shoes) can be avoided if you construct your Ajax initialisation code in such a way that users who enter the site on filtered pages (e.g. /clothes/women/shoes) are taken to the /clothes page before any Ajax filtering happens. For example, you might do something like this (using jQuery):

$("a.filter")
  .each(function() {
    var href = $(this).attr("href").replace("/clothes/", "/clothes#");
    $(this).attr("href", href);
  })
  .click(function() {
    update_filter($(this).attr("href").split("#")[1]);
  });

Users without JavaScript

As you said in the question, there's no way for the server to know about the URL fragment, so filtering would not be applied for users without JavaScript enabled if they were given a link to /clothes#filter.

However, even without filtering, these links could be made more meaningful for non-JS users by using the filter strings as IDs in your /clothes page. To prevent this from interfering with the Ajax experience, the IDs would need to be changed (or the elements removed) with JavaScript before the Ajax links were initialised.

How practical this is depends on how many categories you have and what your /clothes page contains.
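A minimal sketch of that fallback (the class name and sanitised IDs are assumptions; raw filter strings like men/hats would need encoding, since slashes aren't valid in HTML IDs):

// Non-JS fallback: each listing section carries an ID derived from its
// filter string, e.g. <div class="listing" id="men-hats">...</div>,
// so a link to /clothes#men-hats still jumps to the right section.
$(function() {
  // With JS enabled, strip the fallback IDs before initialising the
  // Ajax links, so the fragment is free to carry filter state instead
  $("div.listing").removeAttr("id");
});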

georgebrock
thanks - yeah, that would work, but the first filtering click would be non-Ajax, right? To have the first click be non-Ajax and subsequent clicks Ajax could be weird for the user. As for users without JS, yes, I think IDs could work with simpler interfaces. However, here we have more than one filter dimension, which would make using IDs pretty tricky.
joeformd
Yeah, the first click wouldn't be Ajax, which isn't ideal.
georgebrock
A: 

There's a monkey wrench being thrown into this issue by Google: a proposal for making Ajax crawlable. Google is including recommendations for URL structure there that may give you ideas for your own application.

Here's the wrapup:

In summary, starting with a stateful URL such as http://example.com/dictionary.html#AJAX, it could be made available to both crawlers and users as http://example.com/dictionary.html#!AJAX, which could be crawled as http://example.com/dictionary.html?_escaped_fragment_=AJAX, which in turn would be shown to users and accessed as http://example.com/dictionary.html#!AJAX
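In other words, the crawler maps #! fragments onto a query parameter the server can see. A hypothetical helper illustrating the rewrite:

// Illustration of the proposal's URL mapping:
// http://example.com/dictionary.html#!AJAX
//   -> http://example.com/dictionary.html?_escaped_fragment_=AJAX
function crawlerUrl(url) {
  var parts = url.split("#!");
  return parts.length === 2
    ? parts[0] + "?_escaped_fragment_=" + encodeURIComponent(parts[1])
    : url;
}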

Google has also published a presentation on the proposal (note: it's a Google Docs presentation).

In general, I think it's useful to turn off JavaScript and CSS entirely, browse your website or web application, and see what ends up being exposed. Once you get a sense of what's visible, you will understand what most search engines see, and that in turn will show you what is and is not getting spidered.

artlung
Thanks for the link - very useful, and nice to know that Google consider it a problem worth tackling too. The site is entirely browsable without JavaScript - the problem comes when URLs are saved on a system without JS enabled and used on one with it, or vice versa.
joeformd
I've accepted this answer as the most useful, as it provided interesting information on how Google are trying to solve the problem. I don't think that right now there is an ideal solution.
joeformd
Facebook actually implemented this. Example: http://www.facebook.com/group.php?gid=15185147661#!/developers/
Kalmi