I'm trying to use a Python script to upload a PDF file to a website that runs Hot Banana's content management system. I've successfully logged into the site and can log out, but I can't get file uploads to work.

The file upload is part of a large, complicated web form that submits the form data and the PDF file through a POST. Using Firefox with the Firebug and Tamper Data extensions, I looked at what the browser was sending in the POST and where it was going. I believe my code mimics the data the browser sends, but I'm still having trouble.

I'm importing cookielib to handle cookies, poster to encode the PDF, and urllib and urllib2 to build the request and send it to the URL.

Is it possible that registering the poster openers is clobbering the cookie processor openers? Am I doing this completely wrong?
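
To make the concern concrete: if I recall poster's behavior correctly, its register_openers() builds and installs a fresh global opener and returns it, so a cookie-aware opener installed earlier would no longer be the one in use. One way to rule that out is to build a single opener that carries the cookie processor and to encode the multipart body by hand. The following is only a minimal sketch under assumptions, not the code from the question: it uses the Python 3 standard library (http.cookiejar and urllib.request, the successors of cookielib and urllib2), and the URL and form field names are hypothetical placeholders.

```python
import http.cookiejar
import urllib.request
import uuid


def build_opener_with_cookies():
    """Return (opener, jar): a single opener that stores and resends cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def encode_multipart(fields, file_field, filename, file_bytes,
                     file_type="application/pdf"):
    """Encode plain form fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(('Content-Disposition: form-data; name="%s"' % name).encode("utf-8"))
        lines.append(b"")
        lines.append(str(value).encode("utf-8"))
    lines.append(delimiter)
    lines.append(('Content-Disposition: form-data; name="%s"; filename="%s"'
                  % (file_field, filename)).encode("utf-8"))
    lines.append(("Content-Type: %s" % file_type).encode("ascii"))
    lines.append(b"")
    lines.append(file_bytes)
    lines.append(delimiter + b"--")  # closing delimiter
    lines.append(b"")
    body = b"\r\n".join(lines)
    headers = {
        "Content-Type": "multipart/form-data; boundary=%s" % boundary,
        "Content-Length": str(len(body)),
    }
    return body, headers


def build_upload_request(url, fields, file_field, filename, file_bytes):
    """Build (but do not send) the POST request for the upload form."""
    body, headers = encode_multipart(fields, file_field, filename, file_bytes)
    return urllib.request.Request(url, data=body, headers=headers)


# Hypothetical usage: log in and upload through the SAME opener so the
# session cookie set at login is sent along with the file.
#   opener, jar = build_opener_with_cookies()
#   opener.open(login_request)
#   opener.open(build_upload_request(upload_url, {"title": "Report"},
#                                    "pdf_file", "report.pdf", pdf_bytes))
```

The point of the design is that login and upload share one opener and one cookie jar. If poster is kept, the equivalent would presumably be to add the cookie processor to the opener that register_openers() returns rather than installing a second opener.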


Edit: What's a good way to debug the process? At the moment, I'm just dumping out the urllib2 response to a text file and examining the output to see if it matches what I get when I do a file upload manually.

Edit 2: Chris Lively suggested I post the error I'm getting. The response from urllib2 doesn't generate an exception, but just returns:

<script>
    if (parent != window) { 
        parent.document.location.reload(); 
    } else { 
        parent.document.location = 'login.cfm'; 
    }
</script>

I'll keep at it.
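
The script above sends the browser to login.cfm, which might mean the server did not recognize the session on the upload request, though that is a guess about a black box. A small check like the one below could flag that case automatically and show which cookies the jar actually holds after login. This is a rough sketch: the helper names are made up for illustration, and the only marker it looks for is the login.cfm redirect seen in the response above.

```python
import http.cookiejar


def looks_like_login_bounce(body, login_page="login.cfm"):
    """Return True if the response body appears to redirect to the login page."""
    return "document.location" in body and login_page in body


def cookie_names(jar):
    """Return the sorted names of the cookies currently stored in the jar."""
    return sorted(cookie.name for cookie in jar)


# Hypothetical usage after posting the upload:
#   if looks_like_login_bounce(response_text):
#       print("Bounced to login; cookies held:", cookie_names(jar))
```

An empty list from cookie_names() after a supposedly successful login would point at cookie handling rather than at the upload encoding.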

A: 

You might be better off instrumenting the server to see why this is failing, rather than trying to debug this on the client side.

lacker
I'm a newb with this kind of thing. What does it mean to instrument the server?
afrosteve
@afrosteve: do you have access to the server? If so, you should look at the logs there to see what's going on from the server's point of view.
S.Lott
@S.Lott: Unfortunately I don't. I'm prodding a black box at this point.
afrosteve
+1  A: 

A tool like Wireshark will give you a more complete trace, at a much lower level, than the Firefox plugins.

Often this can be something as simple as not setting the Content-Type header correctly, or failing to include Content-Length.
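
Short of a packet capture, the script itself can print the headers it sends. Here is a minimal sketch, assuming the Python 3 standard library (urllib.request in place of urllib2); the function name is made up for illustration. Setting debuglevel=1 on the HTTP handlers makes the underlying connection print the outgoing request line and headers and the response headers to stdout, where a missing or wrong Content-Type or Content-Length is easy to spot.

```python
import http.cookiejar
import urllib.request


def build_debug_opener(jar=None):
    """Return an opener that prints HTTP traffic headers and keeps cookies."""
    if jar is None:
        jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(
        urllib.request.HTTPHandler(debuglevel=1),   # trace plain HTTP
        urllib.request.HTTPSHandler(debuglevel=1),  # trace HTTPS
        urllib.request.HTTPCookieProcessor(jar),    # keep the session cookie
    )


# Hypothetical usage:
#   opener = build_debug_opener()
#   opener.open(upload_request)  # headers are printed as the request is sent
```

Comparing that printout line by line with what Tamper Data shows for a manual upload should reveal any header the script omits or formats differently.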

jeffamaphone
I'll try Wireshark. Thanks.
afrosteve
A: 

"What's a good way to debug [a web services] process?"

"At the moment, I'm just dumping out the urllib2 response to a text file and examining the output to see if it matches what I get when I do a file upload manually."

Correct. That's about all there is.

HTTP is a very simple protocol -- you make a request (POST, in this case) and the server responds. Not much else involved and not much more you can do while debugging.

What else would you like? Seriously. What kind of debugger are you imagining might exist for this kind of stateless protocol?

S.Lott
I imagine a debugger that only works inside a Bentley Continental. Then I imagine my client paying for the Bentley instead of all those servers. <sigh>
S.Lott
Something magic? I don't do much web stuff so I thought there might be some tool that all the cool kids are using. Sorry for the ignorance.
afrosteve
I would also like your Bentley debugger.
afrosteve
Not ignorance -- you must have something in mind. What kind of debugger do you have in mind? Seriously. What did you expect to find? What would you want to find? What kinds of features would it have?
S.Lott
I had in mind something that would clearly indicate errors. Perhaps by testing actual output against some kind of expected output, maybe matching the headers you should get if the POST completes correctly. Which, after writing this, sounds like a unit-test.
afrosteve
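
The kind of check described in the comment above, comparing the actual response against an expected one, could look roughly like this. It is only a sketch: because the site has no documented API, the expected status, headers, and forbidden markers would have to be copied from a manual upload that is known to work, and every value shown in the usage note is a hypothetical example.

```python
def diff_response(expected_status, expected_headers, status, headers, body,
                  forbidden=()):
    """Return a list of human-readable mismatches (empty if all checks pass)."""
    problems = []
    if status != expected_status:
        problems.append("status %s, expected %s" % (status, expected_status))
    # Header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name, want in expected_headers.items():
        got = lowered.get(name.lower())
        if got is None:
            problems.append("missing header %s" % name)
        elif want not in got:
            problems.append("header %s is %r, expected it to contain %r"
                            % (name, got, want))
    # Text that must not appear, such as a redirect to the login page.
    for marker in forbidden:
        if marker in body:
            problems.append("body contains %r" % marker)
    return problems


# Hypothetical usage:
#   problems = diff_response(200, {"Content-Type": "text/html"},
#                            status, headers, body, forbidden=["login.cfm"])
#   for problem in problems:
#       print(problem)
```

This does not make an undocumented site any better defined, but it turns the manual comparison of dumped responses into something repeatable.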
"Clearly indicate errors" -- if the site has a documented API, then I agree. If you're probing a site without a well-documented API, then there can't be a clear indication of errors, can there?
S.Lott
This is where the magic debugger comes into play. But seriously, you make an excellent point.
afrosteve
@afrosteve: Actually, you made the point. If the API is well-defined, unit testing is good enough. If the API is not well-defined, there is no better way to debug than what you're doing.
S.Lott