Let's say I have an ASP.NET Web Forms website with a paged GridView. Inside the GridView there are links to other pages in the site, and these may be the only links to that content anywhere on the site. Currently Google and other search engines can probably only follow the links that appear on the first page, since the GridView pager generates links like:

<a href="javascript:__doPostBack('GridTest','Page$2')">2</a>

So Google would never be able to load the other pages to crawl the links.

I need a quick and dirty way for Google to index all of the pages that might be hidden behind these JavaScript links.

I have thought about creating a link, hidden with CSS, that loads the GridView with all records visible and no paging. Would this be a good workaround?

If I did have this hidden link, how could I prevent the search engine from indexing that page (since I would not want normal visitors to get to it) but still have it follow and index the pages it links to?

Anyone got any ideas? Thanks for your help.

I liked Colin's idea. I am already using a custom pager control, so I can control the paging links. To implement his suggestion I created a control adapter for LinkButtons that renders the href attribute if you give it one and puts the postback JavaScript in the onclick (which means you can't use the OnClientClick property at the same time as the href).

using System;
using System.Collections.Generic;
using System.Web;
using System.Web.UI;
using System.Web.UI.Adapters;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.Adapters;

/// <summary>
/// This adapter allows us to specify the href attribute and have it rendered on the page.
/// When the href is specified, the postback JavaScript goes into the onclick. The JavaScript
/// cancels the href so the link posts back as normal, but for people without JavaScript
/// the link should still work, and search crawlers should be able to follow it.
///
/// Style properties such as Font-Bold and Font-Underline are not rendered by this adapter;
/// use the style attribute or set them in CSS instead.
/// </summary>
public class LinkButtonAdapter : WebControlAdapter
{
    protected override void Render(HtmlTextWriter writer)
    {
        // "as" yields null for an unexpected control type instead of throwing on the cast.
        LinkButton linkButton = Control as LinkButton;
        if (linkButton == null)
        {
            base.Render(writer);
            return;
        }

        writer.WriteBeginTag("a");
        writer.WriteAttribute("id", linkButton.ClientID);

        // The third argument asks the writer to HTML-encode the value, so quotes or
        // ampersands in these values cannot break the markup.
        if (!string.IsNullOrEmpty(linkButton.ToolTip))
        {
            writer.WriteAttribute("title", linkButton.ToolTip, true);
        }

        if (!string.IsNullOrEmpty(linkButton.CssClass))
        {
            writer.WriteAttribute("class", linkButton.CssClass, true);
        }

        if (!string.IsNullOrEmpty(linkButton.Attributes["style"]))
        {
            writer.WriteAttribute("style", linkButton.Attributes["style"], true);
        }

        if (linkButton.Enabled)
        {
            if (!string.IsNullOrEmpty(linkButton.Attributes["href"]))
            {
                // If the user has set the href, render it and put the postback script in the onclick.
                // This replaces any OnClientClick script, so be careful.
                writer.WriteAttribute("href", linkButton.Attributes["href"], true);

                // GetPostBackEventReference returns the script without the "javascript:" prefix,
                // which is what an onclick handler expects.
                string postBackScript = Page.ClientScript.GetPostBackEventReference(linkButton, "");
                if (!string.IsNullOrEmpty(postBackScript))
                {
                    writer.WriteAttribute("onclick", postBackScript + ";return false;");
                }
            }
            else
            {
                writer.WriteAttribute("href", Page.ClientScript.GetPostBackClientHyperlink(linkButton, ""));
                if (!string.IsNullOrEmpty(linkButton.OnClientClick))
                {
                    // WriteAttribute, not AddAttribute: AddAttribute only takes effect on the
                    // next RenderBeginTag call, so it would never reach this tag.
                    writer.WriteAttribute("onclick", linkButton.OnClientClick, true);
                }
            }
        }
        else
        {
            writer.WriteAttribute("disabled", "disabled");
        }

        writer.Write(HtmlTextWriter.TagRightChar);

        // Apparently LinkButtons can contain other controls, who knew. When they are
        // databound and you have the expression in between the tags, it gets created
        // as a literal control.
        //
        // The stock behavior seems to be to overwrite whatever you set in the Text property
        // with what is in between the begin and end tags, except in some databound cases.
        // I don't want to try to emulate all of that right now, it is too confusing.
        // Just don't use the Text property and the inner HTML at the same time.
        //
        // The text is collected in a local variable rather than assigned to linkButton.Text,
        // because setting Text clears the Controls collection while it is being enumerated.
        string text = linkButton.Text;
        foreach (Control c in linkButton.Controls)
        {
            LiteralControl literal = c as LiteralControl;
            DataBoundLiteralControl boundLiteral = c as DataBoundLiteralControl;

            if (literal != null)
            {
                text = literal.Text;
            }
            else if (boundLiteral != null)
            {
                text = boundLiteral.Text;
            }
            else
            {
                c.RenderControl(writer);
            }
        }

        writer.Write(text);
        writer.WriteEndTag("a");

        Page.ClientScript.RegisterForEventValidation(linkButton.UniqueID);
    }
}

Then put this in a .browser file in your App_Browsers folder:

<browsers>

    <browser refID="Default">
     <controlAdapters>
      <adapter controlType="System.Web.UI.WebControls.LinkButton"
         adapterType="LinkButtonAdapter" />
     </controlAdapters>
    </browser>

</browsers>
A: 

Google and other search engines frown heavily on the idea of pages containing hidden links or pages that are concealed from people but shown to search engines, as they are a standard spamming technique. You are more likely to get the site blacklisted than anything else.

Is ASP.NET really incapable of generating links that can be followed? Apart from anything else, users with JavaScript disabled aren't going to be able to use the site either.

It's been a while since I've used ASP.NET (and I wrote all my own controls as the Microsoft ones were so appalling back then), but you might be able to do something to create workable links by subclassing the control and overriding its Render method. There's really no reason why something that can be a simple link causing an HTTP GET should be emulated by using script to submit a form using HTTP POST.

NickFitz
I have heard that this can be true. I have a site with a hidden link to a page that lists all the content in the site for Google, because a lot of it is hidden behind JavaScript postbacks. The link is hidden with a CSS class that sets display to none. The site has been up for a year and so far has not been blacklisted by Google. I guess too many people use display: none for fancy JavaScript drop-down menus and such for Google to ban a site just for one hidden link. Still, I do not like the index page; I am afraid that Google will cache it and people will end up seeing it.
Chris Mullins
A: 

You could dump out a non-JavaScript version of the page inside a <noscript> tag. That way the links would be in the markup for any bot and visible in any browser that isn't running scripts. As a special added bonus, you can even make it accessible to people who can't use JavaScript: blind folks using a text-to-speech reader, for example.

Al Crowley
A: 

Subclass the GridView control, override the Render method, then make the href of each pager link point to the page you want (e.g. Default.aspx?Page=2) and move the original JavaScript postback call into the onclick event.

So your link would become

<a href="Default.aspx?Page=2" onclick="__doPostBack('GridTest','Page$2')">Page 2</a>

The important thing is to append return false; to the onclick event, so it becomes

<a href="Default.aspx?Page=2" onclick="__doPostBack('GridTest','Page$2');return false;">Page 2</a>

Because the onclick handler always runs first, it triggers the postback, and the return false stops the browser from following the href. This way, ASP.NET gets its postback and Google gets its href.
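As a rough illustration of the markup described above, here is a minimal sketch of a helper that builds one such pager link as a string. It is not the actual GridView pager code: the class name, the method name, and the gridUniqueId and baseUrl parameters are all hypothetical, and it assumes the control ID contains no quote characters.

```csharp
using System;
using System.Net;

public static class PagerLinkSketch
{
    // Builds one pager anchor: a crawlable href plus the postback call in onclick.
    // gridUniqueId and baseUrl are hypothetical inputs chosen for illustration.
    public static string BuildPagerLink(string gridUniqueId, int pageNumber, string baseUrl)
    {
        if (pageNumber < 1)
        {
            throw new ArgumentOutOfRangeException("pageNumber");
        }

        // Plain GET URL that a crawler or a script-less browser can follow.
        string href = baseUrl + "?Page=" + pageNumber;

        // Same postback call the stock pager emits; "return false" cancels the href.
        string onclick = "__doPostBack('" + gridUniqueId + "','Page$" + pageNumber + "');return false;";

        // HTML-encode both attribute values so they cannot break out of the quotes.
        return "<a href=\"" + WebUtility.HtmlEncode(href) + "\" onclick=\""
            + WebUtility.HtmlEncode(onclick) + "\">" + pageNumber + "</a>";
    }
}
```

Calling PagerLinkSketch.BuildPagerLink("GridTest", 2, "Default.aspx") yields an anchor whose href is Default.aspx?Page=2. The encoder writes the single quotes in the onclick value as character references, which the browser decodes before running the handler.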

Now, for the most important thing: you need to make sure the GridView responds to the Request.QueryString["Page"] value as well. Otherwise there would be no point to this whole exercise, as Google would never see the second page of content.
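One way the query string handling might look is sketched below, assuming the link carries a 1-based Page value as in the example above. The helper name ToPageIndex and the totalPages value are hypothetical; the caller has to work out the page count itself, because GridView.PageCount is only populated after data binding.

```csharp
using System;

public static class PageQuerySketch
{
    // Converts the 1-based "Page" query string value into a zero-based index
    // (GridView.PageIndex is zero-based). Falls back to the first page when
    // the value is missing, malformed, or out of range.
    public static int ToPageIndex(string rawPage, int pageCount)
    {
        int pageNumber;
        if (!int.TryParse(rawPage, out pageNumber))
        {
            return 0;
        }

        if (pageNumber < 1 || pageNumber > pageCount)
        {
            return 0;
        }

        return pageNumber - 1;
    }
}

// Hypothetical use in the page's code-behind, on the initial GET only:
//   if (!IsPostBack)
//   {
//       GridTest.PageIndex = PageQuerySketch.ToPageIndex(Request.QueryString["Page"], totalPages);
//       GridTest.DataBind();
//   }
```

Falling back to the first page for bad input keeps a crawler that requests a stale or mangled URL from hitting an error page.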

Colin
I think this idea works well in my situation. I like the idea of being able to have a fallback hyperlink as well.
Chris Mullins
Cool of you to post the eventual code as well, definite +1.
Colin
It works pretty well. You might wonder why not just use regular HTML links? Well, the page itself has a bunch of filters that get posted back and persisted in the viewstate, but the Google search bot can just load each page without any parameters because it is only going to crawl the unfiltered list. I guess users without JavaScript will also benefit from some limited functionality: they will be able to browse all the data, but only filter the first page.
Chris Mullins
And the even cooler part is that you could now also use AJAX partial refresh for the GridView. Google would still use the href attribute :-D.
Colin