The Baidu spider seems to be appending ¤ to the end of some crawled URLs (it seems to happen with URLs whose last character is a single Unicode character).

The URL requested by Baidu looks like this:

site.com/abc/ä¤

while

site.com/abc/ä is the valid URL, and it is linked that way from many places on my site.

Internally, the problem is that this kind of URL matches a different route, and an unhandled exception occurs.

I would rather not lose Baidu because of too many 500 errors on the site.

I would like to change the requested URL to a different URL by removing the added character before any ASP.NET MVC processing of the request starts.

Can I write a request filter, an HTTP module, or something similar in ASP.NET MVC to remove the trailing '¤' from the URLs? I would rather not alter my routes to counter-hack this behavior.

+1  A: 

I'm not sure, but maybe you could add a route like "{controller}/{action}/{id}¤", with the character at the end.

This route should come before the one you're currently using.

bortao
As I stated in my question, I would like to find a way to fix this without modifying my routes.
Marek
Well, filters are applied after the routes, so the best place to do this is in the routes. Filtering this before the controller would be much worse.
bortao
+1  A: 

I like the "fix it in the Route" answer, but here's an alternative (sorry, it's in VB):

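Public Class BaseController : Inherits System.Web.Mvc.Controller
    Protected Overrides Function CreateActionInvoker() As System.Web.Mvc.IActionInvoker
        ' If the crawler appended "¤", redirect to the URL without it.
        Dim rawUrl As String = HttpContext.Request.RawUrl
        If rawUrl.EndsWith("¤") Then
            Dim redir As String = Replace(rawUrl, "¤", "")
            Response.Redirect(redir)
        End If
        ' Otherwise fall through to the default invoker.
        Return MyBase.CreateActionInvoker()
    End Function
End Class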

Then inherit ALL of your controllers from BaseController.

Note: Replace removes every occurrence of the character, so it is not the "best" way if that character can be valid elsewhere in a normal URL. In that case you'll want to trim off only the trailing one some other way.
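For what it's worth, trimming only the trailing character could look roughly like the sketch below. The module and the `StripTrailingMarker` helper are hypothetical names made up for illustration, and the sketch assumes the marker arrives as the literal "¤" character rather than in percent-encoded form.

```vb
Imports System

Module TrailingMarkerSketch
    ' Hypothetical helper: drops a single trailing "¤" and leaves
    ' any earlier occurrence of the character untouched.
    Function StripTrailingMarker(url As String) As String
        Const Marker As String = "¤"
        If url.EndsWith(Marker, StringComparison.Ordinal) Then
            Return url.Substring(0, url.Length - Marker.Length)
        End If
        Return url
    End Function

    Sub Main()
        ' The trailing marker is removed; the rest of the path is kept as-is.
        Console.WriteLine(StripTrailingMarker("/abc/ä¤"))
        ' A URL without the marker is returned unchanged.
        Console.WriteLine(StripTrailingMarker("/abc/ä"))
    End Sub
End Module
```

In the controller above, the result of such a helper would take the place of the `Replace(...)` call when building the redirect target.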

rockinthesixstring
Thanks, this was exactly what I was looking for.
Marek