Yesterday morning I noticed Google Search was using hash parameters, with a URL along the lines of http://www.google.com/#q=Client-side+URL+parameters, which seems to give the same results as the more usual search (with search?q=Client-side+URL+parameters). (It seems they are no longer using it by default when doing a search using their form.)

Why would they do that?

More generally, I see hash parameters cropping up on a lot of web sites. Is it a good thing? Is it a hack? Is it a departure from REST principles? I'm wondering if I should use this technique in web applications, and when.

There's a discussion by the W3C of different use cases, but I don't see which one would apply to the example above. They also seem undecided about recommendations.

+2  A: 

Recently Google also stopped serving direct links in search results, offering redirects instead.

I believe both have to do with gathering usage statistics: what searches were performed by the same user, in what sequence, which of the search results the user followed, etc.

P.S. Now, that's interesting: direct links are back. I absolutely remember seeing only redirects there in the last couple of weeks. They are definitely experimenting with something.

Developer Art
+5  A: 

Google has many live experimental features that are turned on or off based on your preferences, location and other factors (probably random selection as well). I'm pretty sure the one you mention is one of those.

What happens in the background when a hash is used instead of a query string parameter is that it requests the "real" URL (http://www.google.com/search?q=hello) using JavaScript, then modifies the existing page with the content. This appears much more responsive to the user, since the page does not have to reload entirely. The reason for the hash is that browser history and state are maintained. If you go to http://www.google.com/#q=hello you'll find that you actually get the search results for "hello" (even though your browser is really only requesting http://www.google.com/). With JavaScript turned off, however, it wouldn't work, and you'd just get the Google front page.
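As a rough sketch of that mechanism (the "results" element id and the /search endpoint are my own assumptions for illustration, not Google's actual implementation), a page can watch the hash and fetch the equivalent query-string URL in the background:

    // Sketch: watch the hash and load the "real" URL via AJAX.
    var lastHash = null;

    function checkHash() {
        if (location.hash === lastHash) return;
        lastHash = location.hash;

        // "#q=hello" -> "/search?q=hello"
        var query = location.hash.replace(/^#/, '');
        var xhr = new XMLHttpRequest();
        xhr.open('GET', '/search?' + query, true);
        xhr.onreadystatechange = function () {
            if (xhr.readyState === 4 && xhr.status === 200) {
                // Swap only the results area; the rest of the page stays put.
                document.getElementById('results').innerHTML = xhr.responseText;
            }
        };
        xhr.send(null);
    }

    // Use onhashchange where available, polling as a fallback for older browsers.
    if ('onhashchange' in window) {
        window.onhashchange = checkHash;
    } else {
        setInterval(checkHash, 100);
    }
    checkHash();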

Hashes are appearing more and more as dynamic web sites are becoming the norm. Hashes are maintained entirely on the client and therefore do not incur a server request when changed. This makes them excellent candidates for maintaining unique addresses to different states of the web application, while still being on the exact same page.

I have been using them myself more and more lately, and you can find one example here: http://blixt.org/js -- If you have a look at the "Hash" library on that page, you'll see my implementation of supporting hashes across browsers.


Here's a little guide for using hashes for storing state:

How?

Maintaining state in hashes implies that your application (I'll call it an application, since you generally only use hashes for state in more advanced web solutions) relies on JavaScript. Without JavaScript, the only function of hashes is to tell the browser to find content somewhere on the page.

Once you have implemented some JavaScript to detect changes to the hash, the next step is to parse the hash into meaningful data (just as you would with query string parameters).
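A minimal way to do that parsing (the function name and hash format here are just an example of mine, not the API of the Hash library mentioned above) is to treat the hash exactly like a query string:

    // Sketch: parse "#tab=code&view=full" into { tab: 'code', view: 'full' }.
    function parseHash() {
        var params = {};
        var hash = location.hash.replace(/^#\??/, '');
        if (!hash) return params;

        var pairs = hash.split('&');
        for (var i = 0; i < pairs.length; i++) {
            var parts = pairs[i].split('=');
            params[decodeURIComponent(parts[0])] =
                decodeURIComponent(parts[1] || '');
        }
        return params;
    }

    window.onhashchange = function () {
        var state = parseHash();
        // React to the new state here, e.g. state.tab or state.view.
    };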

Why?

Once you've got the state in the hash, it can be modified by your code (or your user) to represent the current state of your application. There are many reasons why you would want to do this.

One common case is when only a small part of a page changes based on a variable, and it would be inefficient to reload the entire page to reflect that change (Example: You've got a box with tabs. The active tab can be identified in the hash.)
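For that tab case, the wiring could look something like this (the "tab" class and id scheme are assumptions made up for the example):

    // Sketch: activate the tab named in the hash, e.g. "#tab=settings".
    function showTabFromHash() {
        var match = location.hash.match(/tab=(\w+)/);
        var name = match ? match[1] : 'default';

        var tabs = document.getElementsByClassName('tab');
        for (var i = 0; i < tabs.length; i++) {
            tabs[i].style.display =
                (tabs[i].id === 'tab-' + name) ? 'block' : 'none';
        }
    }

    window.onhashchange = showTabFromHash;
    showTabFromHash();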

Other cases are when you load content dynamically in JavaScript and want to tell the client what content to load (Example: http://beta.multifarce.com/#?state=7001 will take you to a specific point in the text adventure.)

When?

If you have a look at my "JavaScript realm" you'll see a borderline overkill case. I did it simply because I wanted to cram as much JavaScript dynamics into that page as possible. In a normal project I would be conservative about when to do this, and only do it where it brings positive changes in one or more of the following areas:

  • User interactivity
    • Usually the user won't see much difference, but the URLs can be confusing
    • Remember loading indicators! Loading content dynamically can be frustrating to the user if it takes time.
  • Responsiveness (time from one state to another)
  • Performance (bandwidth, server CPU)

No JavaScript?

Here comes a big deterrent. While you can safely rely on 99% of your users to have a browser capable of using your page with hashes for state, there are still many cases where you simply can't rely on this. Search engine crawlers, for example. While Google is constantly working to make their crawler work with the latest web technologies (did you know that they index Flash applications?), it still isn't a person and can't make sense of some things.

Basically, you're at a crossroads between compatibility and user experience.

But you can always build a road in between, which of course requires more work. In less metaphorical terms: implement both solutions, so that there is a server-side URL for every client-side URL, each outputting the relevant content. For JavaScript-capable clients, the server-side URL would redirect them to the hash URL. This way, Google can index the "hard" URLs, and when users click them, they get the dynamic state stuff!
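One way to sketch that redirect (the URL scheme here is invented for illustration): the server renders full HTML at /search?q=hello for crawlers and non-JavaScript clients, and a small script on that page sends capable browsers over to the hash version, where navigation continues client-side:

    // Sketch, placed on the server-rendered page /search?q=hello:
    // crawlers read the full HTML; browsers with JavaScript get bounced
    // to the equivalent hash URL (location.replace avoids an extra
    // history entry).
    (function () {
        if (location.pathname === '/search' && location.search) {
            location.replace('/#' + location.search.substring(1));
        }
    })();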

Blixt
Ah, you're helping me put my finger on what bothers me about this: JavaScript is needed for these links to work. So, for example, if I post such a link on a web site, a simple robot won't be able to follow it, while the more standard links are of course crawlable. I see the win in terms of user experience, but it seems detrimental to the general linkability of the Web.
Kai Carver
Yes, you're perfectly right. I added another section to my "guide" on what to think about for these non-JavaScript cases.
Blixt
Thank you for your great answer. And yes, I know you were doing it as an extreme use example, but if I do wget http://blixt.org/js#project/hash?view=code I get the index of http://blixt.org/js instead of the page, which you probably wouldn't want for anything that's like a resource.
Kai Carver
True. I'd say all resources do have a proper path; the hash only represents the state of the client. In this particular case, if you wanted a list of projects you would `wget` http://blixt.org/js/projects.json, not http://blixt.org/js. The same goes for my text adventure game: http://beta.multifarce.com/api/get_frame?frame=7001. And if you want to present data to search engines, you can have public paths to HTML documents generated from the same data, which direct the user to the "proper" hash paths.
Blixt