Hi guys,

I am having a problem with Delphi's TIdHTTP component: the Get procedure cannot fetch one specific URL, although it works with other URLs. For example, the code below returns an empty Response.DataString. Response.DataString is empty only with this error_url; with other URLs it has a value. I need to fetch the content of that error_url.

procedure TForm1.Button1Click(Sender: TObject);
var
  Response : TStringStream;
  error_url: string;
begin
  error_url := 'http://www.chefscatalog.com/international/home.aspx'; //error url
  Response := TStringStream.Create;
  try
    IdHTTP1.Get(error_url, Response);
    Memo1.Text := Response.DataString;
  finally
    FreeAndNil(Response);
  end;
end;

By the way, the redirect property of IdHTTP1 (HandleRedirects) is set to True here, so following redirects is not the problem.

These are the exceptions I encountered:

1. HTTP/1.1 302 Found
2. EDecompressionError with message 'ZLib Error (-3)'

You can download the source code of this project (indytest.zip) from this link: http://www.yourfilelink.com/get.php?fid=534933

Please help me guys. Thanks in advance :)

+1  A: 

Check the OnRedirect event. For some reason, you are being redirected to an error page.

http://www.chefscatalog.com/error.aspx?impsid=0

Which, in turn, redirects you back to this same error page until you exhaust your RedirectMaximum (15).
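To see the loop for yourself, you can log each redirect as it happens. This is only a minimal sketch, assuming Indy 10 (the OnRedirect signature may differ in other Indy versions). The TRedirectLogger class and the program name are made up for illustration; the URL is the one from the question.

```pascal
program RedirectLog;

{$APPTYPE CONSOLE}

uses
  SysUtils, IdHTTP;

type
  // Hypothetical helper class: event handlers must be methods of an object.
  TRedirectLogger = class
    procedure HandleRedirect(Sender: TObject; var dest: string;
      var NumRedirect: Integer; var Handled: Boolean;
      var VMethod: TIdHTTPMethod);
  end;

procedure TRedirectLogger.HandleRedirect(Sender: TObject; var dest: string;
  var NumRedirect: Integer; var Handled: Boolean; var VMethod: TIdHTTPMethod);
begin
  // Print the redirect counter and the target URL; Handled is left untouched
  // so TIdHTTP keeps its default behaviour.
  Writeln(Format('Redirecting (%d) to: %s', [NumRedirect, dest]));
end;

var
  Http: TIdHTTP;
  Logger: TRedirectLogger;
begin
  Http := TIdHTTP.Create(nil);
  Logger := TRedirectLogger.Create;
  try
    Http.HandleRedirects := True;
    Http.OnRedirect := Logger.HandleRedirect;
    try
      Http.Get('http://www.chefscatalog.com/international/home.aspx');
    except
      // Report whatever exception the request ends with instead of crashing.
      on E: Exception do
        Writeln(E.ClassName, ': ', E.Message);
    end;
  finally
    Logger.Free;
    Http.Free;
  end;
end.
```

In a form-based project the same handler can simply be assigned to the OnRedirect event of IdHTTP1 in the Object Inspector.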

Update:

Once you are redirected to the error page, Wizzard explains below why it keeps redirecting back to the same error page over and over: cookies.

The reason you're being redirected in the first place is probably that the site doesn't recognize (or like) your user agent string (the Request.UserAgent property). By default, it's "Mozilla/3.0 (compatible; Indy Library)". Change it to a current string used by Firefox, IE or another recognized browser.

I tried it with "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.1) Gecko/20100122 firefox/3.6.1", and it seems to work just fine.
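As a rough sketch of that change, assuming the same Form1, IdHTTP1 and Memo1 as in the question (the user agent string is the one quoted above, nothing special about it):

```pascal
procedure TForm1.Button1Click(Sender: TObject);
begin
  IdHTTP1.HandleRedirects := True;
  // Replace Indy's default user agent with a browser-like string
  // before sending the request.
  IdHTTP1.Request.UserAgent :=
    'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.1) Gecko/20100122 firefox/3.6.1';
  // This Get overload returns the response body as a string.
  Memo1.Text := IdHTTP1.Get('http://www.chefscatalog.com/international/home.aspx');
end;
```

You can also set Request.UserAgent once at design time in the Object Inspector instead of in code.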

You can find more details in the Indy KB PDF.

Bruce McGee
I see no such redirect on that page. TIdHTTP's Redirect handling only applies to HTTP-level redirects. If that page were using such a redirect, the user would never see the error page content.
Remy Lebeau - TeamB
Hi Bruce, I will look into your suggestion and give you feedback. Thanks a lot :)
davy yabut
@Remy, I didn't look at the page code. I just enabled redirects and wired up the OnRedirect event. I'm not sure how it was redirected, just that the event fires with that destination.
Bruce McGee
Hi Bruce, yes, I saw that there is a redirection to the error page. I am also encountering an error "http/1.1 302 Security Redirect" and I don't know what that means. How can I stop the redirection to the error page? Is there something I can do, or is it up to the website administrator?
davy yabut
+1  A: 

The reason is that the website you are trying to hit looks for a cookie; if the cookie is not set, it tries to set it and then does a 302 redirect back to itself.

Because you haven't hooked up a cookie manager, you end up in a 302 redirect loop as the site keeps checking for the cookie, setting it and then redirecting.

Handle cookies and it will work just fine with only a single 302.
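Hooking up a cookie manager in code could look roughly like this. It is a sketch only, assuming Indy 10 and the Form1, IdHTTP1 and Memo1 from the question; the unit needs IdCookieManager in its uses clause, and the local variable name Cookies is my own.

```pascal
// Requires IdHTTP and IdCookieManager in the unit's uses clause.
procedure TForm1.Button1Click(Sender: TObject);
var
  Cookies: TIdCookieManager;
begin
  Cookies := TIdCookieManager.Create(nil);
  try
    IdHTTP1.AllowCookies := True;      // let TIdHTTP process Set-Cookie headers
    IdHTTP1.CookieManager := Cookies;  // store and resend cookies across redirects
    IdHTTP1.HandleRedirects := True;
    try
      Memo1.Text := IdHTTP1.Get('http://www.chefscatalog.com/international/home.aspx');
    except
      // Show the error (for example the final 302) instead of raising it.
      on E: Exception do
        Memo1.Lines.Add(E.ClassName + ': ' + E.Message);
    end;
  finally
    // Detach the manager before freeing it so IdHTTP1 holds no stale reference.
    IdHTTP1.CookieManager := nil;
    Cookies.Free;
  end;
end;
```

Dropping a TIdCookieManager on the form and assigning it to the CookieManager property at design time should amount to the same thing.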


However, for some reason Indy seems to be ignoring the cookies that this site sends. I whipped up some test code; if I hit http://www.google.com I get:

New cookie: PREF
New cookie: NID
Redirecting (1) to: http://www.google.co.nz/
New cookie: PREF
New cookie: NID

These are the headers that Google sends:

Set-Cookie: PREF=ID=3c7e441914b902ae:TM=1268686477:LM=1268686477:S=Z-Gwqx52jK0V1rYR; expires=Wed, 14-Mar-2012 20:54:37 GMT; path=/; domain=.google.com
Set-Cookie: NID=32=vsOZvkr4AOZ7320d_OBPf2zR2jau4E6pupbOe_ZaaX4DNjahTzSV-mSA55naTk-5cXQcn7SNEp7uSxbE_cFrL9ZftGApTGZMPGKzcz3_NZE_2MYpWG5PGbwWFw9t2d_R; expires=Tue, 14-Sep-2010 20:54:37 GMT; path=/; domain=.google.com; HttpOnly

However, for that other site I get this in my debug output:

Redirecting (1) to: http://www.chefscatalog.com/error.aspx?impsid=0
Redirecting (2) to: http://www.chefscatalog.com/error.aspx?impsid=0

and so on, all the way up to 15 attempts. If we look at the headers the site sends back:

Set-Cookie: ASP.NET_SessionId=4o0bpi45evee0d45qos1uy55; path=/; HttpOnly
Set-Cookie: ChefsSite=CartID=00000000-0000-0000-0000-000000000000&cst=f4t8YpBpAAkNiRUd9BEf2luKAA%3d%3d&act=c0f2VBCSbv30F4kasnvWS5OfJQ%3d%3d&CookiesEnabled=False; expires=Wed, 14-Apr-2010 20:54:22 GMT; path=/

I note that the site's Set-Cookie headers have no domain attribute at the end, which is weird, but I don't think the RFC requires it. If we look at the AddCookie and AddCookie2 methods of TIdCookieManager, they want a host in that parameter, so maybe it won't work for any Set-Cookie header that doesn't give the domain.

I have tested this on a couple more sites, and all work fine if the Set-Cookie header includes a domain attribute (such as domain=.google.com;).

It's also interesting to note that if, in the idHttp.OnRedirect event, you look at

idHttp.Response.RawHeaders.Text

for the site that doesn't work you don't see the Set-Cookie headers, but for the sites that do work you do see them.

However, if I set the idHttp user agent (Request.UserAgent) to

    Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.1) Gecko/20100122 firefox/3.6.1

(from another answer)

then it seems to pick up the cookies just fine:

    New cookie: ASP.NET_SessionId
    New cookie: ChefsSite
    Redirecting (1) to: http://www.chefscatalog.com/international/home.aspx
    New cookie: ChefsSite

Weird.

Wizzard
I tried to hook up a cookie manager, but it still does not work.
davy yabut
Hi, you're right, it was the user agent and the cookies. It is not being redirected to the error page anymore, and it's working now. The remaining problem is that if a compressor is attached to the TIdHTTP, it produces the error 'ZLib Error (-3)'. Anyway, I'll probably find a way to detach the compressor programmatically if the redirect count reaches the maximum. Thanks a lot, you nailed it :D
davy yabut
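On the remaining 'ZLib Error (-3)': in zlib, -3 is Z_DATA_ERROR, meaning the data handed to the decompressor was not valid compressed data. A simple way to detach the compressor in code might look like the sketch below. It assumes Indy 10, the Form1, IdHTTP1 and Memo1 from the question, and that a compressor component (such as TIdCompressorZLib) was assigned to IdHTTP1.Compressor at design time. It detaches the compressor up front rather than waiting for the redirect maximum, which is a simplification on my part.

```pascal
procedure TForm1.Button1Click(Sender: TObject);
begin
  // With no compressor attached, TIdHTTP does not try to decompress the body,
  // so the response should be read as plain data.
  IdHTTP1.Compressor := nil;
  IdHTTP1.HandleRedirects := True;
  try
    Memo1.Text := IdHTTP1.Get('http://www.chefscatalog.com/international/home.aspx');
  except
    // Show the error in the memo instead of letting it propagate.
    on E: Exception do
      Memo1.Lines.Add(E.ClassName + ': ' + E.Message);
  end;
end;
```

If you still want compression for other sites, assign the compressor component back to IdHTTP1.Compressor before those requests.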