Your best bet to figure out screen-scraping problems like this one is to use Fiddler. Using Fiddler, you can compare exactly what is going over the wire in your app vs. when accessing the site from a browser. I suspect you'll see some difference between headers sent by your app vs. headers sent by the browser-- this will likley account for the difference you're seeing.
Next, you can do one of two things:
- change your app to send exactly the headers that the browser does (and, if you do this, you should get exactly the response that a real browser gets).
- using Fiddler's "request builder" feature, start removing headers one by one and re-issuing the request. At some point, you'll remove a header which makes the response not match the response you're looking for. That means that header is required. Continue for all other headers until you have a list of headers that are required by the site to yield the response you want.
Personally, I like option #2 since it requires a minimum amount of header-setting code, although it's harder initially to figure out which headers the site requires.
On your actual question of why you're seeing 2 cookies, only the diagnosis above will tell you for sure, but I suspect it may have to do with the mechanism that some sites use to detect clients who don't accept cookies. On the first request in a session, many sites will "probe" a client to see if the client accepts cookies. Typically they'll do this:
- if the request doesn't have a cookie on it, the site will redirect the client to a special "cookie setting" URL.
- The redirect response, in addition to having a Location: header which does the redirect, will also return a Set-Cookie header to set the cookie. The redirect will typically contain the original URL as a query string parameter.
- The server-side handler for the "cookie setter" page will then look at the incoming cookie. If it's blank, this means that the user's browser is set to not accept cookies, and the site will typically redirect the user to a "sorry, you must use cookies to use this site" page.
- If, however, there is a cookie header send to the "cookie setter" URL, then the client does in fact accept cookies, and the handler will simply redirect the client back to the original URL.
- The original URL, once you move on to the next page, may add an additional cookie (e.g. for a login token).
Anyway, that's one way you could end up with two cookies. Only diagnosis with Fiddler (or a similar tool) will tell you for sure, though.