views:

2619

answers:

2

Hi all. Does anyone know how to screen scrape web-sites that use digest http authentication? I use code like this:

var request = (HttpWebRequest)WebRequest.Create(SiteUrl);
request.Credentials=new NetworkCredential(Login, Password)

I'm able to access the site's mainpage, but when I try to surf to any other pages (using another request with the same credentials) I get "HTTP/1.1 400 Bad Request" error.

I used Fiddler to compare requests of my C# application with Mozilla Firefox requests.

2 URLs that I try to access are: https://mysiteurl/forum/index.php https://mysiteurl/forum/viewforum.php?f=4&sid=d104363e563968b4e4c07e04f4a15203

Here are 2 requests () of my C# app:

Authorization: Digest username="xxx",realm="abc",nonce="NXa26+NjBAA=747dfd1776c9d585bd388377ef3160f1ff265429",uri="/forum/index.php",algorithm="MD5",cnonce="89179bf17dd27785aa1c88ad976817c9",nc=00000001,qop="auth",response="3088821620d9cbbf71e775fddbacfb6d"

Authorization: Digest username="xxx",realm="abc",nonce="1h7T6+NjBAA=4fed4d804d0edcb54bf4c2f912246330d96afa76",uri="/forum/viewforum.php",algorithm="MD5",cnonce="bb990b0516a371549401c0289fbacc7c",nc=00000001,qop="auth",response="1ddb95a45fd7ea8dbefd37a2db705e3a"

And that's what Firefox sending to the server:

Authorization: Digest username="xxx", realm="abc", nonce="T9ICNeRjBAA=4fbb28d42db044e182116ac27176e81d067a313c", uri="/forum/", algorithm=MD5, response="33f29dcc5d70b61be18eaddfca9bd601", qop=auth, nc=00000001, cnonce="ab96bbe39d8d776d"
Authorization: Digest username="xxx", realm="abc", nonce="T9ICNeRjBAA=4fbb28d42db044e182116ac27176e81d067a313c", uri="/forum/viewforum.php?f=4&sid=d104363e563968b4e4c07e04f4a15203", algorithm=MD5, response="a996dae9368a79d49f2f29ea7a327cd5", qop=auth, nc=00000002, cnonce="e233ae90908860e1"

So in my app I have different values in "nonce" field while in Firefox this field is the same. On the other hand I have same values in "nc" field while Firefox increments this field.

Also when my app tries to access site pages in Fiddler i can see that it always gets response "HTTP/1.1 401 Authorization Required", while Firefox authorizes only once. I've tried to set request.PreAuthenticate = true; but it seems to have no effect...

My question is: how to properly implement digest authentication using C#? Are there any standard methods or do I have to do it from scratch? Thanks in advance.

+2  A: 

This article from 4GuysFromRolla appears to be what you are looking for:

http://www.4guysfromrolla.com/articles/102605-1.aspx

Boo
A: 

I'm currently observing the same issue, though the web server I'm testing this against is my own. The server logs show:

Digest: uri mismatch - </var/path/some.jpg> does not match request-uri
        </var/path/some.jpg?parameter=123456789>

I tried removing the arguments from the URL (as that seemed to be what's different), but the error still occurred just like before.

My conclusion is that the URL arguments have to be included in the digest hash as well and that the HttpWebRequest is for some reason removing it.

Cygon