Background:

Using urllib and urllib2 in Python, you can do a form submission.

You first create a dictionary.

import urllib
import urllib2

formdictionary = {'search': 'stackoverflow'}

Then you use the urlencode function of urllib to encode this dictionary.

params = urllib.urlencode(formdictionary)

You can now make a URL request with urllib2, passing the URL as the first argument and the variable params as the second.

response = urllib2.urlopen('http://www.searchpage.com', params)

From my understanding, urlencode automatically encodes the dictionary as HTML and adds the input tag. It takes each key to be the name attribute and the corresponding value in the dictionary to be the value for that name. urllib2 then sends this HTML code via an HTTP POST request.
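
The encoded result can be inspected directly. A minimal sketch, assuming Python 3, where the function lives at urllib.parse.urlencode (the Python 2 spelling used above is urllib.urlencode). It suggests the output is a plain key=value query string rather than HTML:

```python
from urllib.parse import urlencode

formdictionary = {'search': 'stackoverflow'}

# urlencode returns a percent-encoded query string, not HTML markup.
params = urlencode(formdictionary)
print(params)

# Spaces and reserved characters are escaped; pairs are joined with '&'.
multi = urlencode({'search': 'stack overflow', 'page': 2})
print(multi)
```

The keys end up as the field names the server sees, which is why the name attribute of the input tag matters for a submission.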

Problem:

This works if the HTML you are submitting to is formatted in the standard way, with the input tag having a name attribute.

<input id="32324" type="text" name="search" >

But there is the situation where the HTML is not properly formatted and the input tag has only an id attribute, with no name attribute. Is there a way to target the input tag via its id attribute? Or is there some other way?

Solution:

?

+2  A: 

According to the W3C HTML specification, an input field must have a name attribute to be submitted. A quick test on Firefox 3 and Safari 3.2 shows that an input field that is missing the name attribute but has an id attribute is not submitted.

With that said, if you have a form that you want to submit, and some of its fields have id but not name attributes, using the id attribute instead seems like the only available option. It could be that other browsers use the id attribute, or perhaps there is some JavaScript code that handles the submission event instead of letting the browser do it.

Ayman Hourieh
I think Ayman may be on to something. The input form may be a misdirection: the JavaScript handles the submission event onclick. I have no idea how to trigger that event in Python. I will try to search some more about this topic.
Ben Hast
@Ayman: Thanks for testing that - I was wrong, and I've removed my misleading answer.
RichieHindle
One easy way to emulate the JavaScript code is to inspect the HTTP headers to see what the browser sends, and then send the same request using urllib. The Live HTTP Headers extension for Firefox is helpful in this regard. https://addons.mozilla.org/en-US/firefox/addon/3829
Ayman Hourieh
A: 

An input tag without a name won't be submitted as a form parameter.

For example, create an HTML page containing just this:

<form>
    <input type="text" name="one" value="foo"/>
    <input type="text" value="bar"/>
    <input type="submit"/>
</form>

You can see that the second text field is missing a name attribute. If you click "Submit," the page will refresh with the query string:

test.html?one=foo
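
The same behaviour can be approximated offline. This is a rough sketch, assuming Python 3 and its standard html.parser module; NamedInputCollector is a hypothetical helper, and it simplifies what browsers do (it ignores disabled fields, checkboxes, and named submit buttons). It keeps only the input tags that carry a name attribute:

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

FORM_HTML = """
<form>
    <input type="text" name="one" value="foo"/>
    <input type="text" value="bar"/>
    <input type="submit"/>
</form>
"""


class NamedInputCollector(HTMLParser):
    """Collects (name, value) pairs for <input> tags that have a name."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input .../>.
        if tag != 'input':
            return
        attributes = dict(attrs)
        # Inputs without a name attribute are left out of the submission.
        if not attributes.get('name'):
            return
        self.fields.append((attributes['name'], attributes.get('value') or ''))


collector = NamedInputCollector()
collector.feed(FORM_HTML)
query = urlencode(collector.fields)
print(query)
```

Only the field named "one" survives, matching the query string above.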


A good strategy for this would be to look at a live POST request sent by your browser and start by emulating that. Use a tool like the Firebug extension for Firefox to see the POST request and the parameters sent by your browser. There might be parameters in there that you didn't notice before, possibly because they were hidden form elements or were created or set by JavaScript.
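
Once the parameters are known, the request can be rebuilt in Python. A minimal sketch, assuming Python 3 (urllib.request and urllib.parse take the place of urllib2 and urllib). The URL and the parameter names and values here are placeholders, not taken from any real site; copy the real ones from the captured request:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values: replace with the URL and parameters seen in the
# request your browser actually sent.
url = 'http://www.searchpage.com'
captured_params = {
    'search': 'stackoverflow',
    'token': 'abc123',  # e.g. a hidden field or a value set by JavaScript
}

# In Python 3 the POST body must be bytes.
body = urlencode(captured_params).encode('ascii')
request = Request(url, data=body)
request.add_header('Content-Type', 'application/x-www-form-urlencoded')

# Supplying data makes urllib issue a POST instead of a GET.
print(request.get_method())
print(request.data)

# To actually send it: urllib.request.urlopen(request).read()
```

Building the Request object separately from sending it makes it easy to compare the body against what the browser sent before hitting the server.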

a paid nerd
I just looked at the Net section on Firebug and activated it. I can now see the HTTP POST request itself and all the parameters sent. Thanks for the suggestion. I will post a follow-up here if I find anything.
Ben Hast