I'm developing a website that helps people understand rap lyrics. Users see the lyrics of a rap song and can click certain lyrics to see an explanation:

[screenshot: an explanation shown for a clicked lyric, with its permalink]

As you can see, each explanation has a permalink (in this case http://RapExegesis.com/2636). Here's what happens when you visit one of these permalinks in your browser:

  1. The app looks up the correct song and artist and redirects you to http://rapexegesis.com/lyrics/ARTIST/SONG#note-NOTEID (in this case http://rapexegesis.com/lyrics/Jay-z/Empire-state-of-mind#note-2636)
  2. When a song page loads, the app checks to see whether there's a "note-\d+" in the URL fragment
  3. If there is, it automatically opens the correct explanation and scrolls it into view
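
Steps 2 and 3 can be sketched roughly as follows. This is a minimal sketch, not the site's actual code: `parseNoteId` and `openExplanation` are hypothetical names, and the only thing taken from the description above is the `note-\d+` fragment pattern.

```javascript
// Extract the numeric note id from a URL fragment such as "#note-2636".
// Returns null when the fragment does not match the "note-\d+" pattern.
function parseNoteId(hash) {
  const match = /^#?note-(\d+)$/.exec(hash || '');
  return match ? Number(match[1]) : null;
}

// Browser-only wiring; skipped when there is no window (e.g. under Node).
if (typeof window !== 'undefined') {
  window.addEventListener('load', function () {
    const noteId = parseNoteId(window.location.hash);
    // openExplanation is a hypothetical function assumed to be defined by the app.
    if (noteId !== null && typeof openExplanation === 'function') {
      openExplanation(noteId); // open the explanation and scroll it into view
    }
  });
}
```

Keeping the parsing in its own function makes it easy to check without a browser.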

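Ideally Google and other search engines would associate these permalinks with their corresponding explanations. However, because Google doesn't understand JavaScript, these two URLs look exactly the same to it:

  • http://rapexegesis.com/lyrics/Jay-z/Empire-state-of-mind
  • http://rapexegesis.com/lyrics/Jay-z/Empire-state-of-mind#note-2636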

And therefore, http://rapexegesis.com/lyrics/Jay-z/Empire-state-of-mind looks exactly the same as http://RapExegesis.com/2636 to Google as well.

Obviously this is not ideal. Any thoughts? Ideally I'd like to show search engines a different version of http://RapExegesis.com/2636 -- something like

Lyric: Catch me in the kitchen like a Simmons whipping pastry

Meaning: "In the kitchen" refers to cooking up crack (cf. here, here, and here)

Vanessa and Angela Simmons, the twentysomething daughters of Reverend Run of Run-DMC, run Pastry, an apparel and shoe brand

EDIT: The way I originally posed the question was a bit confusing. There are two separate issues:

  1. How do links to explanations on song pages work?
  2. How do URLs corresponding to standalone explanations work?

This diagram (full size here) should make things a bit clearer:

[diagram: http://img686.imageshack.us/img686/6696/googleajax.png]

+1  A: 

You could change the link to actually go to a separate page with the content, and change the behaviour of the JavaScript to nullify the default action of that link (return false) and load things the way it is now.

Like this:

<html>
<head>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js"></script>
<script type="text/javascript">
jQuery(document).ready(function() {
    $('.javascript-link').click(function(){
      alert('usual behavior');
      return false;
     });
});
</script>
</head>
<body>

<a class="javascript-link" href="somewhere.html">click me</a>

</body>
</html>
Dean
Are you referring to the in-page link or the permalink?
Crescent Fresh
the link on the page as well as the one the permalink generates.
Dean
This makes sense for the links on the song pages, but what exactly happens when someone / Google visits http://RapExegesis.com/2636?
Horace Loeb
@Dean: I believe @Horace is inquiring about the life of the permalink url once spidered, not about what happens when you click the in-page url.
Crescent Fresh
My latest edit and diagram (http://img686.imageshack.us/img686/6696/googleajax.png) should clear things up a bit
Horace Loeb
+1  A: 

Why not use a REST-style URL, similar to what this site uses (look at your address bar)? That way each link is distinct and leads to the right place. It will also work if JavaScript is disabled, since your server can process the URL.

If JavaScript is available, everything can be done without a page refresh, but this also covers the case where the Google spider doesn't execute JavaScript.

James Black
We shouldn't have to ignore new ways of doing things and all their benefits. Besides, he would still need to implement the same behavior in javascript when actually interacting with the page, so bringing it up statically in that state means implementing it twice, in different languages. That is a bad recipe.
ironfroggy
@ironfroggy - If you design your page to work well both with and without JavaScript, things will be implemented twice, once on the server and once in JavaScript. The other option is for things not to work when JavaScript isn't available, and for many sites that is bad. Ideally the controller logic should be implemented only once, with two ways to reach it: an AJAX call, or a direct request to the URL, unmodified by JavaScript.
James Black
+4  A: 

Don't use #, use the query string.

So, instead of http://rapexegesis.com/lyrics/ARTIST/SONG#note-2633, you'd use http://rapexegesis.com/lyrics/ARTIST/SONG?note=2633

The # is specifically meant to refer to a part of the same page; using it for something else is just not right. As I understand your question, this would achieve what you want.
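
To illustrate the difference, here is a small sketch using the standard `URL` API (available in modern browsers and in Node.js). The fragment stays on the client and is not part of the HTTP request, whereas the query string is sent to the server. `fragmentToQuery` is a hypothetical helper name, not something from the question.

```javascript
// Rewrite ".../SONG#note-2636" as ".../SONG?note=2636".
// URLs without a "note-<digits>" fragment are returned unchanged.
function fragmentToQuery(urlString) {
  const url = new URL(urlString);
  const match = /^#note-(\d+)$/.exec(url.hash);
  if (!match) return url.href;
  url.hash = '';                          // drop the fragment
  url.searchParams.set('note', match[1]); // carry the id in the query string
  return url.href;
}
```

With the id in the query string, the server can tell the two URLs apart and render the explanation itself.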

eglasius
The hash symbol `#` refers to an anchor `<a name="..." />` on the same page.
Paul Lammertsma
@Paul: modern browsers allow `#` to refer to any element with an `id` attribute - not just `<a>` elements
Jørn Schou-Rode
A: 

Ok, so the situation, as I understand it, is that for each permalink:

  • Google needs to receive static HTML that corresponds only to the specific lyric in question.
  • The user needs to see the entire song, with the lyric highlighted.

One solution for you: browser-sniff on the server. Send Google the snippet, and send the user the entire page. With ASP.NET, you can use Server.Transfer to "redirect" the user without actually redirecting them in the browser and without AJAX.

<%@ Page Language="C#" %>
<script runat="server">
  private void Page_Load(object sender, System.EventArgs e) {
    if (!Request.Browser.Crawler) {
      // Look up the real URL for the entire song.
      // LookUpSongUrl is a hypothetical helper; replace it with your own lookup.
      string realUrl = LookUpSongUrl(Request.RawUrl);
      Server.Transfer(realUrl);
    }
  }
</script>
The snippet for the specific lyric goes here; Google sees it, but users won't.
richardtallent
Questions: 1) Is this against Google's rules / will this get me removed from their index? 2) Is this approach better than setting window.location with JS on the explanation page (meaningless to Google, but will redirect everyone else)?
Horace Loeb
Doing this is very questionable when you look at Google webmaster guidelines. I'd recommend something along the lines of my solution to be safe.
philfreo
+13  A: 

Here's a good solution... based on looking at your descriptions/diagram as well as having thought through this for previous websites.

Two key concepts to remember... and the rest is all details:

  • Don't show Google things you're not showing regular users. Doing anything tricky here can get you in big trouble with them and isn't really necessary anyway.
  • Use Progressive Enhancement to give your JavaScript users a better experience

This can be done like this...

On your lyrics page, create a permalink to each explanation like this:

<a class="explanation-link" href="/lyrics/ARTIST/SONG?note=NOTEID" onclick="showPopUpExplanation(NOTEID); return false;">lyric snippet here</a>

Notice how we are using a QueryString (?) instead of a hash (#) so that Google (and JS-disabled users) treat this as a real, unique permalink URL.

Now, use Progressive Enhancement and add onclick handlers to all your .explanation-link links (as shown above or like @Dean suggested), to get the inpage popups to work.
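
One way to attach those handlers unobtrusively is sketched below. It assumes the links carry the `explanation-link` class and `?note=NOTEID` hrefs, and that `showPopUpExplanation` already exists on the page; `noteIdFromHref` is a hypothetical helper. Plain DOM calls are used here, but the same thing can be written with jQuery.

```javascript
// Pull the note id out of an href such as "/lyrics/ARTIST/SONG?note=2636".
// Returns null if the href has no numeric "note" parameter.
function noteIdFromHref(href) {
  const match = /[?&]note=(\d+)(?:&|#|$)/.exec(href || '');
  return match ? Number(match[1]) : null;
}

// Browser-only: enhance the plain links once the DOM is ready.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', function () {
    document.querySelectorAll('a.explanation-link').forEach(function (link) {
      link.addEventListener('click', function (event) {
        const noteId = noteIdFromHref(link.getAttribute('href'));
        if (noteId === null) return;  // fall back to normal navigation
        event.preventDefault();       // JS users stay on the page
        showPopUpExplanation(noteId); // assumed to be defined by the page
      });
    });
  });
}
```

Without JavaScript nothing is attached, so the link simply navigates to its `?note=` URL.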

You don't need to use # (hash) links at all for your JavaScript users. This is only necessary when you're trying to allow the browser's Back button to work with AJAX, but since you're showing an in-page popup (presumably with a "close" button), this isn't necessary at all.

Users who want to share a specific explanation with their friends will use your short (/NOTEID) url. These shortened URLs should be a 301 redirect to your real permalink (so search engines don't index both URLs).
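
The short-URL redirect might be sketched as below. This is only an illustration: the `notes` lookup table and the `resolveShortUrl` name are hypothetical stand-ins for a database lookup, and the function only computes the response; wiring it into a server is left out.

```javascript
// Hypothetical lookup table: note id -> song path. A real app would query its database.
const notes = {
  2636: '/lyrics/Jay-z/Empire-state-of-mind'
};

// Map a short path such as "/2636" to a 301 redirect to the real permalink.
// Returns a 404 when the path is not a known short URL.
function resolveShortUrl(path, table) {
  const match = /^\/(\d+)$/.exec(path);
  const songPath = match ? table[match[1]] : undefined;
  if (!songPath) return { status: 404 };
  return { status: 301, location: songPath + '?note=' + match[1] };
}
```

The 301 status is what tells search engines that the short URL and the permalink are the same resource.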

Finally, to make the permalinks work you'll use a little server-side scripting so that visiting that page will show the popup right away. Do something like this:

<?php if (isset($_GET['note'])): ?>
    <!-- place above the song lyrics; use CSS to style nicely for non-JS -->
    <div id="explanation"> 
        <?php printExplanation((int)$_GET['note']); ?>
    </div>

    <script type="text/javascript">
        $('#explanation').hide(); // hide HTML explanation for JS users
        showPopUpExplanationFromDiv($('#explanation'));
    </script>
<?php endif; ?>

The important part here is that you're printing the explanation in HTML, rather than in JavaScript/JSON so that Google and non-JS users can see it, but progressively enhance for JS-users.

philfreo
You're right, Phil, this is a great solution. I was getting stuck on the fact the OP wanted Google to see *only* his lyric notes in the page abstract when people search on those phrases. Up-voted you.
richardtallent
I really hate just avoiding this problem, even though that is usually my number one solution to all problems. Why? Hash tracking is awesome! Google does it, so they should crawl it properly. This is how an interactive app makes the back button and bookmarks work.
ironfroggy
But in this case hash tracking doesn't make sense because the user probably wouldn't expect the back button to get rid of the popup (rather, a close button is expected). If you had Ajax navigation that totally changed the content of the page, then it would be expected, and I do believe in the long run Google is providing support for hash tags/Ajax, but I wouldn't rely on it for now (there's no need to).
philfreo
So I'm confused, why isn't this accepted? Is this not the solution? I was just about to write the same thing till I re-read the solution. This solves every problem, is there something I'm missing?
William
This is a good solution, but there's one problem: showing the explanation for JS users requires having the lyrics to the song on the page (since, for a JS user, an explanation is a tooltip on a lyric). But we don't want Google to see the lyrics on the explanation page, because the main content of that page is the explanation specified in the URL. Therefore, to implement your solution, I have to download the lyrics with AJAX before I call `showPopUpExplanationFromDiv()`. This will hurt the user experience.
Horace Loeb
What makes you think it's so bad to show Google lyrics below the explanation? This is what you're doing for real users too, in a way. I really don't think duplicate content is going to be a problem in this situation.
philfreo
+1  A: 

You should just make it available both ways: as a plain URL for clients without JavaScript, and as a # link for those with it.

This way, your code will be more accessible (even blind people, and they love music, can read your website). Since Google is the most famous blind web user in the world, it will understand it too.
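
A minimal sketch of the "both ways" idea, assuming the crawlable form is a path ending in `/note-ID` (the form quoted in the comment below) and the JavaScript form is the `#note-ID` fragment. `pathToFragment` is a hypothetical helper name, and the `a[href*="/note-"]` selector is likewise just an assumption about how the links are marked up.

```javascript
// Turn "/lyrics/ARTIST/SONG/note-2636" into "/lyrics/ARTIST/SONG#note-2636".
// Paths that do not end in "/note-<digits>" are returned unchanged.
function pathToFragment(path) {
  return path.replace(/\/note-(\d+)$/, '#note-$1');
}

// Browser-only: rewrite the plain links so JavaScript users get the in-page version.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', function () {
    document.querySelectorAll('a[href*="/note-"]').forEach(function (link) {
      link.setAttribute('href', pathToFragment(link.getAttribute('href')));
    });
  });
}
```

Crawlers and clients without JavaScript never run the rewrite, so they keep following the plain path.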

e-satis
But what happens when someone WITH Javascript goes to http://rapexegesis.com/lyrics/Jay-z/Empire-state-of-mind/note-2636? How do I redirect him to http://rapexegesis.com/lyrics/Jay-z/Empire-state-of-mind#note-2636? (I could use a javascript-based window.location redirect, I suppose)
Horace Loeb
Someone with javascript activated has no reason to go to /note-2636 since the link is transformed by javascript into #2636.
e-satis
A: 

Use a URL shortening service such as Bit.ly (you have to type in the URL you would like it to redirect to). This will redirect Google to the URL. I haven't tested it, so I am unsure whether it will work.

EDIT: Hmm... Stackoverflow.com uses # in its URLs and gets them indexed by Google; maybe you could ask them...? :D

alexy13
Google doesn't have to run javascript to make this work, they just need to not treat /foo#bar and /foo#baz as the same path.
ironfroggy
Thanks, fixed :D
alexy13
A: 

It looks like Google wants to do the right thing. The truth is, I really hate their proposed solution. Still, it seems that they might be understanding any state fragment that starts with an exclamation point as stateful, and thus they should be treating them as unique.

http://rapexegesis.com/lyrics/Jay-z/Empire-state-of-mind#note-2636

becomes

http://rapexegesis.com/lyrics/Jay-z/Empire-state-of-mind#!note-2636
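
The rewrite itself is mechanical. A sketch, with `toHashBang` as a hypothetical helper name; whether crawlers actually treat such URLs as distinct is not something this snippet can establish.

```javascript
// Insert "!" after "#" so "...#note-2636" becomes "...#!note-2636".
// URLs with no fragment, or already in "#!" form, are returned unchanged.
function toHashBang(urlString) {
  const i = urlString.indexOf('#');
  if (i === -1 || i === urlString.length - 1) return urlString; // no fragment
  if (urlString[i + 1] === '!') return urlString;               // already "#!"
  return urlString.slice(0, i) + '#!' + urlString.slice(i + 1);
}
```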

I maintain a library to manage this kind of state, and if they are, in fact, actually implementing this in their spiders already, then I'll add support for it, myself.

ironfroggy