I am scraping a page for some data, however I need to insert text into a text box, submit the form and scrape the result page. I looked at the page source, but I'm not sure how to activate the button or pass down the argument for it.
Website is http://archive.org/web/web.php
I'm trying to look at some historical snapshots and have no idea what to use for this. I'm open to any solution.
First, you should know that clicking that button normally submits the form, i.e. the browser sends the form's data to some URL. Here is the form:
<form id="wwmform" name="wwmform" method="get" action="http://web.archive.org/form-submit.jsp" onsubmit="document.location.href='http://web.archive.org/web/*/'+document.getElementById('wwmurl').value;return false;" style="display:inline;">
<input id="wwmurl" type="text" name="url" size="50" value="http://">
<button type="submit" name="type" value="urlquery" class="roundbox5">Take Me Back</button>
</form>
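See the action attribute? That's where the data goes. Note that this form's method is get, so the data travels in the query string.
So in Python, you can use urllib and urllib2 to encode the data, send it to the target URL and then fetch the result.
PS: watch out for the onsubmit handler. It sends the browser to http://web.archive.org/web/*/ followed by the text-box value and returns false, so the form's action URL is never actually requested.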
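Since the onsubmit handler only concatenates a prefix and the text-box value, the same URL can be built directly in Python. This is a minimal sketch, assuming Python 3's urllib (the Python 2 equivalent would be urllib2.urlopen); the function names are made up for illustration and the page layout of the result is not covered here:

```python
from urllib.request import urlopen

# Prefix taken from the form's onsubmit handler.
WAYBACK_PREFIX = 'http://web.archive.org/web/*/'


def wayback_listing_url(target):
    """Build the URL the onsubmit handler would navigate to."""
    # The handler does a plain string concatenation, so we do the same.
    return WAYBACK_PREFIX + target


def fetch_listing(target):
    """Download the listing page for `target` (performs a network request)."""
    with urlopen(wayback_listing_url(target)) as response:
        return response.read()
```

Calling fetch_listing('http://example.com') would then return the raw bytes of the result page, which can be handed to whatever HTML parser you already use.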
In HTML, I have a form consisting of three radio buttons:
<form action="{{ url_for('handle_data') }}" method="post">
<input type="radio" name="option" value="banana"> Banana<br>
<input type="radio" name="option" value="apple">apple<br>
<input type="radio" name="option" value="peach"> peach<br>
<input type="submit" value="Submit Text">
</form>
In my app.py, I have the method defined for this:
@app.route('/result/<fruit>', methods=['POST'])
def handle_data(fruit):
    fruit = request.form['option']
    return render_template("result.html", fruit=fruit)
What I want to do is redirect from my current HTML page to a new URL that corresponds to the radio button I selected. For example, if I select "apple" and hit the "Submit Text" button, I should be redirected to "/result/apple".
When I do it the way shown above, I get a server error. What did I do wrong?
Either use a dynamic URL or parse the form data; you don't need to do both. In this case, I would use only the /result URL, without the dynamic part, because url_for in your template needs the dynamic segment to be filled in when the template is rendered. Since the template is rendered before you select a fruit, it cannot know which fruit to put at the end of the URL, so the request would go to /result/ instead of /result/apple when you submit apple. You can still use dynamic URLs, but I believe you would need some JavaScript to fill in the URL when you click the submit button.
@app.route('/result', methods=['POST'])
def handle_data():
    # Read the selected radio button from the submitted form
    fruit = request.form["option"]
    return render_template("result.html", fruit=fruit)
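If the goal is still to end up at /result/apple, one possibility is to let the POST handler read the form and then redirect to a dynamic URL, with no JavaScript involved. This is a minimal sketch, assuming Flask; the /choose route and the show_result view are hypothetical names, not from the question, and the view returns plain text instead of rendering result.html to keep the example self-contained:

```python
from flask import Flask, redirect, request, url_for

app = Flask(__name__)


@app.route('/choose', methods=['POST'])  # hypothetical route name
def handle_data():
    # Read the selected radio button; 'banana' is an arbitrary fallback
    # for the case where nothing was selected.
    fruit = request.form.get('option', 'banana')
    # Redirect the browser to the dynamic URL, e.g. /result/apple.
    return redirect(url_for('show_result', fruit=fruit))


@app.route('/result/<fruit>')
def show_result(fruit):
    # The real app would call render_template("result.html", fruit=fruit).
    return 'You picked %s' % fruit
```

The form in the template can keep action="{{ url_for('handle_data') }}", since handle_data no longer needs a fruit in its URL.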
I am implementing the SOAP Toolkit API. After the credit card info is submitted, it redirects me to a third-party page to get the password. I want to avoid that redirection. Is it possible to load that page within my domain, or on my custom page using an IFRAME?
Not exactly an answer, more a request for clarification: when you say "it redirects me towards the 3rd party page to get the password", are you talking about the 3D-Secure process? I think I understand what you are trying to accomplish, but the idea of 3D Secure is to guide the end user through an additional verification (security) step. So I think the redirection is required as part of how the business process (3D-Secure) works.
It's definitely possible to render it inside an iframe. Assuming you've called ics_pa_enroll to get the pareq and ACS URL you'd do something like this:
<body>
<form action="[ACS URL]" method="POST" target="payerAuthFrame">
<input type="hidden" name="PaReq" value="[Returned PaReq]" />
<input type="hidden" name="MD" value="[Whatever]" />
<input type="hidden" name="TermUrl" value="[Your Site]" />
</form>
<iframe name="payerAuthFrame" width="800" height="800"></iframe>
</body>
and either have the customer submit the form with e.g. a "Proceed" button or else use JavaScript to submit it automatically.
I'm looping through zip codes and retrieving information from this site http://www.airnow.gov/index.cfm?action=school_flag_program.sfp_createwidget
Here's the form and input elements:
<form name="groovyform">
<input type="text" name="Title" id="Title" size="20" maxlength="20" />
<input type="text" name="Zipcode" id="Zipcode" size="10" maxlength="10" />
My question is how do I make a post request if there are no attributes in the form element (such as action or method)?
My code (I've tried requests.get with the params argument, and requests.post with the data argument):
url = 'http://www.airnow.gov/index.cfm?action=school_flag_program.sfp_createwidget'
data_to_send = {'zipcode ':'37217',
'Title': 'ph'}
response = requests.get(url, params=data_to_send)
contents = response.text
print contents
This just returns the HTML of the URL, but I want the HTML of the page I get when I post the data. In other words, I don't think requests.get is submitting my data, and I think it has something to do with there being no action or method attribute.
Enlighten me!
Thanks!
That form isn't intended to be submitted anywhere. It's just there for the benefit of the Copy button:
<input type="button" value="Copy" onclick="copy(document.groovyform.simba.value)" />
There are also a number of references to document.groovyform in the buildCall JavaScript function, which is run when you click on Build your widget.
This is an old style of JavaScript programming. These days, most would assign IDs to these elements and use document.getElementById() to access them, so there would be no need to wrap them in a form. But before that approach was developed, the way to access DOM elements depended on the fact that forms are automatically added as properties of document, and input elements are properties of the containing form.
Reading Comprehension, I could learn it.
Ok, so like Barmar stated, the <form> I posted isn't supposed to be submitted. The form I was supposed to be filling out (top of the page) contained the following:
<form name="frmZipSearch" method="get" style="width:178px; float:left;">
Zip Code:
<input name="zipcode" type="text" size="5" maxlength="5" height="20">
Now my code works.
import requests

url = 'http://www.airnow.gov/index.cfm'
# For a GET request the fields belong in the query string, so pass them as params
data_to_send = {'action': 'airnow.local_city', 'zipcode': '37217', 'submit': 'Go'}
response = requests.get(url, params=data_to_send)
contents = response.text
print(contents)
Thanks, Barmar, for directing me to the right path.
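To avoid picking the wrong form in the first place, it can help to list every form on the page together with its method, action and field names before writing any request code. This is a rough sketch using only the standard library (Python 3); FormLister and list_forms are names made up for this example, and it treats a missing method as get, which is the HTML default:

```python
from html.parser import HTMLParser


class FormLister(HTMLParser):
    """Collect name, method, action and field names for each <form>."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'form':
            self._current = {
                'name': attrs.get('name'),
                'method': (attrs.get('method') or 'get').lower(),
                'action': attrs.get('action'),  # None if the attribute is absent
                'fields': [],
            }
            self.forms.append(self._current)
        elif tag in ('input', 'select', 'textarea') and self._current is not None:
            # Only named controls are sent when a form is submitted.
            if attrs.get('name'):
                self._current['fields'].append(attrs['name'])

    def handle_endtag(self, tag):
        if tag == 'form':
            self._current = None


def list_forms(html):
    """Return a list of dicts describing the forms found in `html`."""
    parser = FormLister()
    parser.feed(html)
    return parser.forms
```

Feeding it response.text from the page would show both groovyform (no action, no method) and frmZipSearch (method get), which makes it easier to see which one actually carries the zipcode field you want.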
I've been searching Google for almost every query I can think of related to this. The page I'm trying to submit is something similar to this. There is no form. The objects are not grouped in a form. Most other threads talk about the form not having a name, but in my case, the page doesn't have a form at all.
<div class="container">
<br/>
<img id="imageXYZ" />
<br/>
<input id="inputXYZ" />
<br/>
<button id="submitObject">Go</button>
<br/>
<script type="text/javascript">blah blah blah</script>
</div>
So when there is no form, simply just an input field and button, how do I select a form so I can fill in the text box and click the button?
Thank-you so much!
I had to do two things to get this to work with the HTML above. First, I had to use this line to select the global form, which holds all the controls that are not inside any form element:
Br.form = Br.global_form()
Second, the HTML is malformed, so I had to add a parameter to my initial browser call (which has to run before the line above):
Br = mechanize.Browser(factory=mechanize.RobustFactory())
My question is: what really happens when you hit the submit button on an HTML survey like this one?
<INPUT TYPE="radio" NAME="bev" VALUE="no" CHECKED>No beverage<BR>
<INPUT TYPE="radio" NAME="bev" VALUE="tea">Tea<BR>
<INPUT TYPE="radio" NAME="bev" VALUE="cof">Coffee<BR>
<INPUT TYPE="radio" NAME="bev" VALUE="lem">Lemonade<BR>
To be more specific, how does the browser send the data for my choice to the server? I want to write Python code that will vote for me in an HTML survey like this.
If the form's method attribute is post (which I think it is), then the browser sends a POST request. If you're using the requests library, this is the code:
import requests
# Keys are the name attributes of the inputs; values are the content you want to send
data = {'bev': 'tea'}
r = requests.post("http://awebsite.com/", data=data)
print(r.content)
See the Requests docs on POST requests.
And if you aren't using requests, then God help you write the code.
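For what it's worth, the standard library version is not too bad either. Here is a small sketch, assuming Python 3's urllib and the same placeholder URL as above; build_vote_request and send_vote are hypothetical helper names. Passing a data argument to Request is what turns it into a POST:

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_vote_request(url, choice):
    """Build a POST request that submits the 'bev' radio button value."""
    # Form fields are sent URL-encoded in the request body, as bytes.
    body = urlencode({'bev': choice}).encode('ascii')
    # Supplying data makes urllib send a POST instead of a GET.
    return Request(url, data=body)


def send_vote(url, choice):
    """Send the vote and return the response body (performs a network request)."""
    with urlopen(build_vote_request(url, choice)) as response:
        return response.read()
```

Something like send_vote("http://awebsite.com/", "tea") would then do the same job as the requests.post call.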