I'm a novice at Python. I'm trying to learn how to post data to a web form and grab the result but I couldn't understand any of the examples I found on the web, and they pointed to websites that no longer exist. So I found this website http://www.autotrader.co.uk/vehiclecheck which accepts a vehicle reg and gives you some more data.
Can anyone show me how to put that data in the form and grab the text that then appears (on a new page)? I'm hoping someone can explain what the code does rather than just tell me the answer. I just chose this site as a random example, so feel free to use a different one. Thanks.
You can make a get request passing the reg:
import requests
params = {"SC":"132","vrm":"foobar"}
req = requests.get("https://www.vehiclecheck.co.uk/",params=params)
print(req.content)
Which if you run you will see in the output:
<h3>Sorry, we didn't recognise that registration.</h3>
<p>Please check that you have entered your registration correctly and try again. For example, a zero (0) and a capital O (letter) or a one (1) and a capital I (letter) can appear very similar.</p>
Which is exactly what you see in the browser.
Using a proper reg you can see part of the html returned below, which contains the search result (I removed part of the reg):
<div id="searchResult">
<h1 class="HeaderMargin HideOnMobile">Vehicle details</h1>
<h2 class="NoMargin HideOnMobile">We've identified this car using the details you provided</h2>
<h2 class="NoMargin ShowOnMobile">We've identified this car</h2>
<h3>HANGLONG UNKNOWN 2007 </h3>
<div class="SearchResultCarImageContainer">
<img src="/VehicleCheck/ShowImage/?id=&CapType=0" alt="Vehicle image" id="searchResultCarImage">
<p class="SubLine">Vehicle Image for illustrative purposes only.</p>
</div>
<div class="Column">
<table id="vrmSearchTable">
<tr>
<td class="Reg"><strong>Registration number:</strong></td>
<td class="Reg">GN57###</td>
</tr>
<tr>
<td><strong>Body type:</strong></td>
<td>Scooter</td>
</tr>
<tr>
<td><strong>Colour:</strong></td>
<td>Red</td>
</tr>
<tr>
<td><strong>Date of first registration:</strong></td>
<td>November 2007</td>
</tr>
</table>
If you were to input foobar into the search box in the browser you see a new tab open with the url:
https://www.vehiclecheck.co.uk/?SC=132&vrm=foobar
You just need to mimic that with the request.
If you inspect the html you can see the form and what the input name is, i.e. input name="vrm":
<form action="https://www.vehiclecheck.co.uk" method="get" target="_blank" class="js-top-form vrm-lookup-form">
<input type="hidden" id="SC" name="SC" value="132">
<div class="gb-reg-icon--wrap">
<svg class="gb-reg-icon">
<title>Registration GB Icon</title>
<use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/templates/_generated/svg_icons/vehicle-check.svg#icon-gb-reg"></use>
</svg>
</div>
<span class="js-vrm-input input-error-wrap">
<input name="vrm" type="text" maxlength="8" class="reg-input-large input-large js-vrm-input-focus" placeholder="ENTER REG">
<span class="js-input-error input-error vrm-mileage-input__vrm-error is-hidden"></span>
</span>
<button type="submit" class="vrm-lookup-form__button button-green-large track-submitVrmLookup tracking-motoring-products-link" data-label="vehicle-check-start-check-initiation">
Start check
</button>
</form>
When you go to the page after clicking, i.e.:
https://www.vehiclecheck.co.uk/?SC=132&vrm=foobar
open the developer console in Chrome or Firefox, open the network tab and hit F5, then have a look at the requests being made. In Chrome you can see the first is ?SC=132&vrm=foobar; when you click on that you see under query string parameters:
SC=132&vrm=foobar
Or if you click view parsed:
SC:132
vrm:foobar
Which are the parameters that need to be passed.
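To see how the same two fields travel in a GET versus a POST, here is a small sketch that only builds the requests and never sends them. It uses the standard library's urllib instead of requests (my substitution, so that it runs without a network connection); the URL and field names are the ones from the form above.

```python
from urllib.parse import urlencode
from urllib.request import Request

fields = {"SC": "132", "vrm": "foobar"}
url = "https://www.vehiclecheck.co.uk/"

# GET: the fields are appended to the URL as a query string.
get_request = Request(url + "?" + urlencode(fields))

# POST: the same fields are form-encoded into the request body instead.
post_request = Request(url, data=urlencode(fields).encode("ascii"))

# Nothing is sent here; we only inspect what would be sent.
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

Whether a given site accepts GET, POST or both is up to the site, which is why checking the form's method attribute and the network tab matters.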
You have the input name, so if you were trying to post, you would use the name, i.e. vrm, as the key and the value you want to submit as the value. Using your own details:
import requests
data = {"vrm": "SM59TXS"}
req = requests.post("https://www.vehiclecheck.co.uk/",data=data)
print(req.content)
We see the searchResult again this time with your car details:
<div id="searchResult">
<h1 class="HeaderMargin HideOnMobile">Vehicle details</h1>
<h2 class="NoMargin HideOnMobile">We've identified this car using the details you provided</h2>
<h2 class="NoMargin ShowOnMobile">We've identified this car</h2>
<h3>VOLVO V70 R-DESIGN SE D 2009 </h3>
<div class="SearchResultCarImageContainer">
<img src="/VehicleCheck/ShowImage/?id=42B913362B14F8DF&CapType=0" alt="Vehicle image" id="searchResultCarImage">
<p class="SubLine">Vehicle Image for illustrative purposes only.</p>
</div>
<div class="Column">
<table id="vrmSearchTable">
<tr>
<td class="Reg"><strong>Registration number:</strong></td>
<td class="Reg">SM59TXS</td>
</tr>
<tr>
<td><strong>Body type:</strong></td>
<td>Estate</td>
</tr>
<tr>
<td><strong>Colour:</strong></td>
<td>Blue</td>
</tr>
<tr>
<td><strong>Date of first registration:</strong></td>
<td>December 2009</td>
</tr>
</table>
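If you only want the values rather than the raw HTML, one option is to pull the table cells out with the standard library's html.parser. This is a minimal sketch, assuming the response contains a table shaped like the vrmSearchTable fragment above (trimmed here). TableCellParser is a name made up for this example, and a library such as BeautifulSoup would do the same job with less code.

```python
from html.parser import HTMLParser

# Trimmed copy of the table from the response shown above.
RESULT_HTML = """
<table id="vrmSearchTable">
<tr>
<td class="Reg"><strong>Registration number:</strong></td>
<td class="Reg">SM59TXS</td>
</tr>
<tr>
<td><strong>Body type:</strong></td>
<td>Estate</td>
</tr>
<tr>
<td><strong>Colour:</strong></td>
<td>Blue</td>
</tr>
</table>
"""


class TableCellParser(HTMLParser):
    """Collect the text of every <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self._row[-1] += data.strip()


parser = TableCellParser()
parser.feed(RESULT_HTML)

# Each row is a [label, value] pair; drop the trailing colon from the label.
details = {label.rstrip(":"): value for label, value in parser.rows}
print(details)
```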
So the bottom line is you need to find the input id or name from the html, then post using the id/name as the keys and the values you want to submit as the values. There is no magic bullet that will work for every form, so you have to understand what is happening; that is why learning how to use the developer console will be invaluable.
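Finding the input names can itself be scripted. The sketch below reads a cut-down copy of the form shown earlier and collects its action, method and named inputs using only the standard library. FormFieldParser is a hypothetical helper written for this example, not part of requests or any other library.

```python
from html.parser import HTMLParser

# Cut-down copy of the form from the page source shown earlier.
FORM_HTML = """
<form action="https://www.vehiclecheck.co.uk" method="get">
  <input type="hidden" id="SC" name="SC" value="132">
  <input name="vrm" type="text" maxlength="8" placeholder="ENTER REG">
  <button type="submit">Start check</button>
</form>
"""


class FormFieldParser(HTMLParser):
    """Record the form's action and method plus every named <input>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
            self.method = attrs.get("method", "get").lower()
        elif tag == "input" and "name" in attrs:
            # Hidden inputs carry a preset value; text inputs start empty.
            self.fields[attrs["name"]] = attrs.get("value", "")


parser = FormFieldParser()
parser.feed(FORM_HTML)

# Keep the preset hidden fields and fill in the one the user would type.
payload = dict(parser.fields, vrm="foobar")
print(parser.method, parser.action, payload)
```

The resulting payload is what you would pass as params (for a get form) or data (for a post form).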
One library for manipulating forms with Python is Selenium. Here's an example of how you would interact with your given page with Selenium:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get('http://www.autotrader.co.uk/vehiclecheck')
# find the input you want to manipulate by checking the source code of the website
# for example, to enter a reg, first ID the reg input by its name
reg_form = driver.find_element_by_name('vrm')
# then send some input to it
reg_form.send_keys('test reg you want to send')
# then ID the start check button to be able to click it
start_check_button = driver.find_element_by_css_selector('button.vrm-lookup-form__button.button-green-large.track-submitVrmLookup.tracking-motoring-products-link')
# and send a click
start_check_button.click()
# then parse the page as you want
MechanicalSoup's tutorial shows you how to do things with named input boxes, but not everyone who writes html takes care with naming. I have an html page with a single form containing an un-named checkbox (which when you check it, checks all the others) plus a large number of other checkboxes all named 'alpkey'. Can anyone help me select the first checkbox, or alternatively, find and select all the checkboxes in the page? I need to do this and then follow the link in the 'More info' button. The html code goes like this:
<p></p>
<form name="summform" target="_blank" action="/cgi-bin/RBG" method="post">
<input type="hidden" name="form" value="PNG/png2">
<center>
<table width="850" frame="below">
<tr><td><div>39 matches found</div>
</td></tr>
<tr><td><i>Select the records you wish to see more information on</i> <b>:</b>
<br>
</td></tr>
<tr><td><input type="checkbox" onclick="if(this.checked){checkAll(true)}else{checkAll(false)}">
<b>Select ALL/None</b>
<input type="submit" value="More Info">
Click on thumbnail to see larger image.
</td></tr>
</table>
<table width="850" frame="below">
<tr>
<td></td>
<td width="50" align="center"><b>LAE No.</b></td>
<td align="center"><b>Summary of record</b></td>
<td align="center"><b>Specimen Images</b></td>
<tr>
<td><input type="checkbox" name="alpkey" value="83237" /></td>
<td width="50" align="center"></td>
<td align="left">
<b><font color="#FFCC33">
....and so on. Or do I need Selenium?
Update:
The code below works (or seems to: browser.launch_browser() doesn’t work unless I use StatefulBrowser, so I can’t visually inspect the results), but I can’t get any further. I can’t work out how to follow the ‘submit’ button link.
browser = mechanicalsoup.Browser()
page = browser.get(myurl)
soup = page.soup
form = soup.select("form")[0]
for i in range(3, len(form.select("input"))):
    form.select("input")[i].checked = 'checked'
    # print(form.select("input")[i].checked)
submit = soup.findAll(type='submit')
#form2 = soup.select_form()
#form2.choose_submit('More Info')
#page2 = browser.submit(form, page.url)
....alternatively, it seems that when I use browser = mechanicalsoup.StatefulBrowser() I’m unable to loop through the checkboxes.
This worked - to loop through a load of checkboxes, all with the same name but different 'value' attributes (e.g. value="83237"), check them all (by setting the appropriate values) and finally submit the form:
browser = mechanicalsoup.StatefulBrowser()
browser.open(myurl) # having set URL variable
browser.select_form('form[name="summform"]') # only one form on page, this was its name
check = browser.page.find_all('input')
rangeval = len(check)-2 # to loop through checkboxes, ignoring a couple of other input tags
names = []
for i in range(3, rangeval):
    names.append(check[i]['value']) # get list of values to set
browser["alpkey"] = names # all checkboxes have the same name so use list
browser.launch_browser() # verify that they're all checked
response = browser.submit_selected() # and submit form
print(response.url)
print(response.text)
...it would've been nicer to just check the 'check all' box, but that was unnamed and had no 'value' attribute.
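For anyone wondering why a list works for same-named checkboxes: a browser submits each checked box as its own name=value pair, so the key simply repeats in the form body. A small standard-library sketch of that encoding (this is not MechanicalSoup's internals, and the second and third alpkey values are invented for the illustration):

```python
from urllib.parse import urlencode

# The hidden field from the form plus three checked boxes.
# Only 83237 appears in the HTML above; the other two values are made up.
form_data = {
    "form": "PNG/png2",
    "alpkey": ["83237", "83238", "83239"],
}

# doseq=True expands the list into one alpkey=... pair per value.
body = urlencode(form_data, doseq=True)
print(body)
```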
I have a list of forms that I need to edit one by one, wherein I have to identify the form title first and then click the JavaScript link to open its editing template.
In the sample code below, I need to identify the text Another Custom Form (Mobile) which is the form title and then click the a href link whose onclick value is editProjectFormType. It is the second sibling of form title. I am trying to perform this task in Python.
<tr class="trbg2">
<td width="10%" align="left" nowrap="nowrap">
<div align="center">
<input type="checkbox" name="selectedFormType" value="2192454$$rmymiK" checked="checked">
</div>
</td>
<td width="10%" align="left" nowrap="nowrap"><img src="https://dmsak.qa.asite.com/images/dots.gif" width="6" height="15">!!!!!!!!!!!!!!!!!!!!!!!!!!!!Ashish_test!!!!!!!!!!</td>
<td width="10%" align="left">Custom forms</td>
<td width="10%" align="center">ACFM</td>
<td width="24%">Another Custom Form (Mobile)</td>
<td width="20%">Custom</td>
<td align="center" width="15%">
<a href="javascript:void(0);" onclick="editProjectFormType('2192454$$rmymiK');">
<img src="https://dmsak.qa.asite.com/images/i_editfgt.gif" width="16" height="20" border="0">
</a>
</td>
<td align="center" width="15%">
<a href="javascript:void(0);" onclick="downloadFormTemplate('2192454$$rmymiK');">
<img src="https://dmsak.qa.asite.com/images/f_dow_tmple.gif" width="22" height="22" border="0" alt="Click here to Download Template" title="Click here to Download Template">
</a>
</td>
</tr>
I have used the following incomplete code so far and am not sure what to do next:
button = browser.find_elements_by_tag_name('td')
for txt in button:
    if txt == "Another Custom Form (Mobile)":
You can click the link with the following single XPath, where the text Another Custom Form (Mobile) identifies the title cell and following-sibling reaches the link, as below:
link = browser.find_element_by_xpath("//td[contains(text(),'Another Custom Form (Mobile)')]/following-sibling::td/a[contains(@onclick, 'editProjectFormType')]")
link.click()
Edited..
Implement WebDriverWait to get the element as below :-
link = WebDriverWait(browser, 10).until(EC.visibility_of_element_located((By.XPATH, "//td[contains(text(),'Another Custom Form (Mobile)')]/following-sibling::td/a[contains(@onclick, 'editProjectFormType')]")))
link.click()
Note: if the target element is inside a frame, you need to switch to that frame first with browser.switch_to_frame("frame name or id") and then find the target element as above.
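To make it clearer what that XPath selects, here is the same walk written out step by step in plain Python. It uses xml.etree from the standard library on a simplified, well-formed copy of the row (the img tags and layout attributes are dropped), so it illustrates the logic only and is not Selenium code.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the table row from the question.
ROW = """<tr class="trbg2">
<td>ACFM</td>
<td>Another Custom Form (Mobile)</td>
<td>Custom</td>
<td><a href="javascript:void(0);" onclick="editProjectFormType('2192454$$rmymiK');">edit</a></td>
<td><a href="javascript:void(0);" onclick="downloadFormTemplate('2192454$$rmymiK');">download</a></td>
</tr>"""

row = ET.fromstring(ROW)
cells = list(row)

# Step 1: find the cell holding the form title.
title_index = next(
    i for i, td in enumerate(cells)
    if (td.text or "").strip() == "Another Custom Form (Mobile)"
)

# Step 2: look only at the cells after it (the "following siblings") and
# keep links whose onclick mentions editProjectFormType.
links = [
    a
    for td in cells[title_index + 1:]
    for a in td.iter("a")
    if "editProjectFormType" in a.get("onclick", "")
]
print(links[0].get("onclick"))
```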
Hope it will help you...:)
I would approach this by:
creating a class to represent a row in this table, say "MyRow", which exposes each column as a field and has methods to interact with them.
using Selenium to find all of the rows in the table, returning a list of instances of this class.
looping through the list to find the row that matches the target name:
for (MyRow myRow : allRows) {
    if (myRow.name.equals("Another Custom Form (Mobile)")) {
        return myRow;
    }
}
clicking on the link in the target row's column: myRow.editProjectForm()
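Since the question is about Python, the same idea might look roughly like this. It is only a sketch: FormRow and find_row are made-up names, and the rows are built from plain callables so the example runs without a browser. In real use the edit callable would wrap a Selenium click on that row's link.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class FormRow:
    """One table row: the form title plus an action that opens its editor."""
    name: str
    edit: Callable[[], None]


def find_row(rows: Iterable[FormRow], target_name: str) -> Optional[FormRow]:
    """Return the first row whose title matches, or None if there is none."""
    for row in rows:
        if row.name == target_name:
            return row
    return None


# Stand-in rows; each "click" is just recorded in a list.
clicked = []
rows = [
    FormRow("Some Other Form", lambda: clicked.append("other")),
    FormRow("Another Custom Form (Mobile)", lambda: clicked.append("mobile")),
]

match = find_row(rows, "Another Custom Form (Mobile)")
if match is not None:
    match.edit()
print(clicked)
```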
The first answer worked for me with some modification.
I have 3 buttons in a single td, and the other 4 tds contain text. I have to click a specific button based on the value of one td (e.g. customer_name). Below is the working code.
self.driver.find_element_by_xpath(
"//td[contains(text(),'" + customer_name + "')]/following-sibling::td/button[text()='Accept']"
).click()
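One caveat with building the XPath by string concatenation: a name containing an apostrophe (say O'Brien) would end the quoted string early and break the expression. A possible workaround is a small quoting helper like the one below. xpath_literal is a hypothetical function written for this example and is not part of Selenium.

```python
def xpath_literal(text):
    """Quote text as an XPath 1.0 string, whichever quotes it contains."""
    if "'" not in text:
        return "'" + text + "'"
    if '"' not in text:
        return '"' + text + '"'
    # Both quote types present: glue the pieces together with concat().
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in parts) + ")"


customer_name = "O'Brien"
xpath = (
    "//td[contains(text()," + xpath_literal(customer_name) + ")]"
    "/following-sibling::td/button[text()='Accept']"
)
print(xpath)
```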
I am using selenium-python binding. I am getting the following error while trying to select and manipulate an element. (using Chromedriver)
Message: invalid element state: Element is not currently interactable and may not be manipulated
I think the element is selected successfully with the syntax below, but I cannot manipulate it with, for example, clear() or send_keys("some value"). I would like to fill in the text area, but I cannot make it work. If you have experienced similar problems, please share your thoughts. Thank you.
UPDATE: I noticed that the html changes to style="display: none" as I type manually, which might be a reason for this error. I have modified the code below. Can you please point out a solution?
driver.find_element(by='xpath', value="//table[@class='input table']//input[@id='gwt-debug-url-suggest-box']")
or
driver.find_element(by='xpath', value="//input[@id='gwt-debug-url-suggest-box']")
or
driver.find_element_by_id("gwt-uid-47")
or
driver.find_element(by='xpath', value="//div[contains(@class, 'sppb-b')][normalize-space()='www.example.com/page']")
Here is the html source code:
<div>
<div class="spH-c" id="gwt-uid-64"> Your landing page </div>
<div class="spH-f">
<table class="input-table" width="100%">
<tbody>
<tr>
<td class="spA-e">
<div class="sppb-a" id="gwt-uid-47">
<div class="sppb-b spA-b" aria-hidden="true" style="display: none;">www.example.com/page</div>
<input type="text" class="spC-a sppb-c" id="gwt-debug-url-suggest-box" aria-labelledby="gwt-uid-64 gwt-uid-47" dir="">
</div>
<div class="error" style="display:none" id="gwt-debug-invalid-url-error-message" role="alert"> Please enter a valid URL. </div>
</td>
<td class="spB-b">
<div class="spB-a" aria-hidden="true" style="display: none;"></div>
</td>
</tr>
</tbody>
</table>
</div>
</div>
Have you tried selecting the input by its id and sending the keys to the element itself?
element = driver.find_element_by_id("gwt-debug-url-suggest-box")
element.send_keys("Your input")
This way you are selecting the input directly.
Anyway, a link to the page would help.
I am using Python and Flask to make a web app. I am new to it, but have gotten most of what I am trying to accomplish done. Where I am stuck is that I have a label whose value is a Python variable ({{id}}). This id is the id of a row I need to update in a sqlite database. My code is below. When I click the approve button, it takes me to a route which does the update query, but I have no way to pass the {{id}} with it. This would have been much easier if I could have just used JavaScript for the update query, but everything I've found using JavaScript is for Web SQL, not sqlite, even though some of the examples say they are for sqlite.
</script>
<table border='1' align="center">
{% for post in posts %}
<tr>
<td>
<label>{{post.id}}</label>
<h1 id ='grill1'>{{post.photoName}}</h1>
<span>
<img id = 'photo1' src='{{post.photo}}' alt="Smiley face" height="200" width="200">
</span><br>
<h5 id ='blurb1'>
{{post.blurb}}
</h5>
<br>
<div style=" padding:10px; text-align:center;">
<input type="button" value="Approve" name="approve" onclick="window.location='/approve';">
<input type="button" value="Deny" onclick="window.location='/deny';"><br>
</div>
</td>
</tr>
{% endfor %}
</table>
Why not just do:
...
<input type="button" value="Approve" name="approve" onclick="window.location='/approve/{{post.id}}';">
<input type="button" value="Deny" onclick="window.location='/deny/{{post.id}}';">
...
Then your flask route for approve and / or deny can just take a parameter for the post to approve or deny. i.e.:
@app.route("/approve/<int:post_id>")
def approve(post_id):
    """approve this post!"""
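Inside that route you would then run the update with the id you received. Here is a minimal sketch of the database side using only sqlite3. The table name posts and the column approved are assumptions, since the question does not show the schema, and approve_post is a made-up helper that the route could call with post_id.

```python
import sqlite3


def approve_post(conn, post_id):
    """Mark one post as approved and return how many rows were changed."""
    cursor = conn.execute(
        "UPDATE posts SET approved = 1 WHERE id = ?",  # hypothetical schema
        (post_id,),  # passed as a parameter, never formatted into the SQL
    )
    conn.commit()
    return cursor.rowcount


# Demo against an in-memory database holding two unapproved posts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, approved INTEGER DEFAULT 0)"
)
conn.executemany("INSERT INTO posts (id) VALUES (?)", [(1,), (2,)])

changed = approve_post(conn, 2)
rows = conn.execute("SELECT id, approved FROM posts ORDER BY id").fetchall()
print(changed, rows)
```

Using the ? placeholder rather than building the SQL string by hand keeps the id from being interpreted as SQL.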
I am creating a html template for a django based app. I am using the twitter bootstrap API for buttons here, but one of them (the cancel button) doesn't seem to be working correctly. I link it to another page using an href, but when I click on the button, it redirects to the current page's post method. See below:
<h2>Add new Schedule:</h2>
<form class="form-horizontal" method='post'>
<table>
{% load bootstrap %}
{{ form|bootstrap }}
{% csrf_token %}
<tr>
<td></td>
<td>
<input class="btn btn-primary" type='submit' name='reset' value='Save' />
</td>
<td></td>
<td><a href='{%url head.views.edit_instance_binding binding.id %}'><button>Cancel</button></a></td>
</tr>
</table>
</form>
However, if I get rid of the button and use it as a simple href it seems to work:
<td><a href='{%url head.views.edit_instance_binding binding.id %}'>Cancel</a></td>
What's going on here?
You have a <button> inside an <a> element - get rid of the button, otherwise you'll be submitting your form.
If you want your anchor to be styled as a button, give it a btn class.
And Bootstrap is just a big set of CSS facilities with little js thrown in - no APIs at all :))
EDIT: nowadays HTML semantics and appearance are well separated [though someone may argue that Bootstrap has its hacks regarding this, see its <i>'s use for icons].
In your case, you wanted to use a <button> to style a simple anchor like an embossed button. But a <button> tag is just a way to provide a richer <input type="submit">, in which you can insert images for example [see all the BS examples with icons beside buttons].
Both <input type="submit"> and <button> inside a <form> trigger the form's action, i.e. they post the data the user entered to that location.
If you just need to reach some URL without submitting anything, you need an anchor tag [<a>], which can be styled as you wish, e.g. with BS btn, btn-primary, btn-whateva classes.