I am using mechanize in Python to submit a form and print out the response, but it does not seem to work:
import mechanize
# The URL to this service
URL = 'http://sppp.rajasthan.gov.in/bidsearch.php'
def main():
    # Create a Browser instance
    b = mechanize.Browser()
    # Load the page
    b.open(URL)
    # Select the form
    b.select_form(nr=0)
    # Fill out the form
    b['ddlfinancialyear'] = '2015-2016'
    b.submit()
    b.response().read()
What I am trying to do is submit the form at 'http://sppp.rajasthan.gov.in/bidsearch.php' by passing the value '2015-2016' to the 'ddlfinancialyear' control. When the form is submitted, another page should be returned as the response, but I am not getting any output.
Try assigning the result of b.submit() to a variable before reading it:
S = b.submit()
S.read()
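Putting that together with the question's code, here is a minimal sketch that actually prints the result. It assumes the first form on the page is the search form; if 'ddlfinancialyear' is a dropdown (select) control, mechanize expects a list of selected values rather than a plain string:
import mechanize

URL = 'http://sppp.rajasthan.gov.in/bidsearch.php'

b = mechanize.Browser()
b.set_handle_robots(False)             # the site may disallow access via robots.txt
b.open(URL)
b.select_form(nr=0)                    # assumption: the first form is the search form
b['ddlfinancialyear'] = ['2015-2016']  # select controls take a list of selected values
response = b.submit()                  # keep the returned response object
print response.read()                  # print it, otherwise nothing is shown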
I want to know if there is a way to POST parameters after reading the page source, e.g. read a captcha before posting an ID.
My current code:
import requests
id_number = "1"
url = "http://www.submitmyforum.com/page.php"
data = dict(id = id_number, name = 'Alex')
post = requests.post(url, data=data)
There is a captcha that changes after every request to http://submitforum.com/page.php (obviously not a real site). I would like to read that parameter and submit it in the "data" dictionary.
As discussed in the OP's comments, Selenium can be used; methods without browser emulation may also exist.
Using Selenium (http://selenium-python.readthedocs.io/) instead of the requests module:
import re
import selenium
from selenium import webdriver
regexCaptcha = "k=.*&co="
url = "http://submitforum.com/page.php"
# Get to the URL
browser = webdriver.Chrome()
browser.get(url)
# Example for getting page elements (using CSS selectors)
# In this example, I'm getting the Google reCAPTCHA ID if present on the current page
try:
    element = browser.find_element_by_css_selector('iframe[src*="https://www.google.com/recaptcha/api2/anchor?k"]')
    captchaID = re.findall(regexCaptcha, element.get_attribute("src"))[0].replace("k=", "").replace("&co=", "")
    captchaFound = True
    print "Captcha found !", captchaID
except Exception, ex:
    print "No captcha found !"
    captchaFound = False

# Treat captcha
# --> Your treatment code, which should produce captcha_answer

# Enter the captcha response on the page
captchResponse = browser.find_element_by_id('captcha-response')
captchResponse.send_keys(captcha_answer)
# Validate the form
validateButton = browser.find_element_by_id('submitButton')
validateButton.click()
# --> Analysis of returned page if needed
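For completeness, here is a rough sketch of the non-browser approach with requests. The regex and the 'captcha' field name are pure placeholders (adjust them to the real markup), and solving the captcha itself is left to your own treatment code:
import re
import requests

url = "http://www.submitmyforum.com/page.php"

session = requests.Session()   # one Session so cookies persist between requests
html = session.get(url).text   # read the page source first

# Placeholder: extract whatever captcha parameter the page embeds
# (regex and field name are assumptions, not the real site's markup)
captcha_token = re.search(r'name="captcha" value="([^"]+)"', html).group(1)

data = {"id": "1", "name": "Alex", "captcha": captcha_token}
response = session.post(url, data=data)
print response.text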
I'm trying to create a simple login to a "Kahoot!" quiz.
The first thing I'm trying to do is load JSON objects from "https://kahoot.it/#/" so I can fill in the form (I tried to fill the form using 'mechanize', but it seems to support only HTML forms).
When I run the following script, I get an exception that the JSON could not be decoded:
import urllib, json
url = "https://kahoot.it/#/"
response = urllib.urlopen(url)
data = json.loads(response.read())
print data
output:
ValueError: No JSON object could be decoded
Any ideas?
Thanks.
type(response.read()) is str, representing the HTML of the page. Obviously that is not valid JSON, which is why you are getting that error.
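You can see this for yourself with a quick check in the same Python 2/urllib style as the question; the body is HTML markup, so json.loads() raises ValueError:
import urllib, json

body = urllib.urlopen("https://kahoot.it/#/").read()
print body[:60]            # HTML markup, not JSON
try:
    json.loads(body)
except ValueError as e:
    print "not JSON:", e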
EDIT: If you are trying to log in to that page, it is possible with Selenium:
from selenium import webdriver
url = "https://kahoot.it/#/"
driver = webdriver.Chrome() # or webdriver.Firefox()
driver.get(url)
# finding the text field and 'typing' the game pin
driver.find_element_by_xpath('//*[@id="inputSession"]').send_keys('your_game_pin')
# finding and clicking the submit button
driver.find_element_by_xpath('/html/body/div[3]/div/div/div/form/button').click()
I am trying to submit a form, and get the results of the page that it heads to after submitting the form. I'm using mechanize.
1) When I use the code to click the first button, I get a response. But when I read the response, it shows the source of the same page (the page where the form is located), not the page the browser is redirected to after the form is submitted.
from mechanize import Browser
br = Browser()
br.open("http://link.net/form_page.php")
br.select_form(nr=0)
br.form['number'] = '0123456789'
response = br.submit(nr=0)
print response.read()
Now, when I do this, the source of the same page (i.e. form_page.php) shows up. But it should have shown the source of "results.php" (which is where the browser ends up when I do it manually).
2) There are multiple submit buttons on the page. I am clicking only the first one. But when I try to click other submit buttons using nr=1 or nr=2, it shows this error.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/dist-packages/mechanize /_mechanize.py", line 524, in select_form
raise FormNotFoundError("no form matching "+description)
mechanize._mechanize.FormNotFoundError: no form matching nr 1
Can you please help me?
Make sure you are selecting the right form, and that the form you are selecting actually exists on the page. You can check it with code like this:
for form in br.forms():
    print form
and see what is returned to you.
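If the page does have several forms, printing them together with an index makes it obvious which nr= value to use. A small sketch, nothing site-specific:
for i, form in enumerate(br.forms()):
    print i, form.name, form.action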
This looks similar to this issue, where submit was calling some Javascript to validate the inputs before redirecting. It may be worth having a look at the HTML of the page and checking what it does on submit.
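One way to check without leaving mechanize is to print what the selected form would actually send: its action URL, method, and controls. A small inspection sketch, no site-specific assumptions:
br.select_form(nr=0)
print br.form.action    # where the form really posts to
print br.form.method    # GET or POST
for control in br.form.controls:
    print control.type, control.name, control.value
If the redirect to results.php only happens in JavaScript, mechanize will not execute it; in that case either post directly to the URL the script targets, or fall back to Selenium.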
Try the following:
import mechanize
br = mechanize.Browser()
br.open("http://link.net/form_page.php")
br.select_form(nr=0)
br['number'] = '0123456789' ### try instead of 'br.form[]'
response = br.submit() ### no need to specify form again
text = response.read()
Don't forget about 'br.set_handle_robots(False)', 'br.set_all_readonly(False)', etc...
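Putting those pieces together, a fuller sketch (the URL and the 'number' field are from the question; whether the robots/readonly/refresh tweaks are needed depends on the site):
import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)     # ignore robots.txt
br.set_handle_refresh(False)    # some sites loop on meta-refresh
br.addheaders = [('User-agent', 'Firefox')]

br.open("http://link.net/form_page.php")
br.select_form(nr=0)
br.set_all_readonly(False)      # allow changing readonly/hidden controls if needed
br['number'] = '0123456789'
response = br.submit()
print response.geturl()         # check whether we actually landed on results.php
print response.read()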
I'm making a simple Python POST script, but it is not working well.
There are two logins required.
The first login uses 'http://mybuddy.buddybuddy.co.kr/userinfo/UserInfo.asp',
and the second login uses 'http://user.buddybuddy.co.kr/usercheck/UserCheckPWExec.asp'.
I can log in on the first page, but I could not log in on the second one;
it returns an error such as 'illegal access'.
I heard this is related to cookies, but I don't know how to resolve the problem.
If anyone can help, it would be much appreciated! Thanks!
import re,sys,os,mechanize,urllib,time
import datetime,socket
params = urllib.urlencode({'ID':'ph896011', 'PWD':'pk1089' })
rq = mechanize.Request("http://mybuddy.buddybuddy.co.kr/userinfo/UserInfo.asp", params)
rs = mechanize.urlopen(rq)
data = rs.read()
logged_fail = r';history.back();</script>' in data
if not logged_fail:
    print 'login success'

try:
    params = urllib.urlencode({'PASSWORD':'pk1089'})
    rq = mechanize.Request("http://user.buddybuddy.co.kr/usercheck/UserCheckPWExec.asp", params)
    rs = mechanize.urlopen(rq)
    data = rs.read()
    print data
except:
    print 'error'
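Since the 'illegal access' error is most likely the second request arriving without the session cookie from the first one, here is a minimal sketch that keeps both requests in one cookie-carrying mechanize.Browser (URLs and field names taken from the question):
import urllib
import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)

# First login: the Browser stores whatever session cookies the server sets
params = urllib.urlencode({'ID': 'ph896011', 'PWD': 'pk1089'})
data = br.open("http://mybuddy.buddybuddy.co.kr/userinfo/UserInfo.asp", params).read()

if r';history.back();</script>' not in data:
    print 'login success'
    # Second step: the same Browser automatically sends the stored cookies back
    params = urllib.urlencode({'PASSWORD': 'pk1089'})
    print br.open("http://user.buddybuddy.co.kr/usercheck/UserCheckPWExec.asp", params).read()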
Can't you use Selenium? IMHO it's better to do this kind of automation with it.
To install it:
pip install selenium
An example:
from selenium import webdriver
browser = webdriver.Firefox()
# open site
browser.get('http://google.com.br')
# get page source
browser.page_source
A login example:
# different methods to locate an HTML element
form = browser.find_element_by_tag_name('form')
username = browser.find_element_by_id('input_username')
password = browser.find_element_by_css_selector('input[type=password]')
username.send_keys('myUser')
password.send_keys('myPass')
form.submit()
I need to fill form values on a target page and then click a button via Python. I've looked at Selenium and Windmill, but these are testing frameworks, and I'm not testing. I'm trying to log into a third-party website programmatically, then download and parse a file we need to insert into our database. The problem with the testing frameworks is that they launch instances of browsers; I just want a script I can schedule to run daily to retrieve the page I want. Is there any way to do this?
You are looking for Mechanize
Form submitting sample:
import re
from mechanize import Browser
br = Browser()
br.open("http://www.example.com/")
br.select_form(name="order")
# Browser passes through unknown attributes (including methods)
# to the selected HTMLForm (from ClientForm).
br["cheeses"] = ["mozzarella", "caerphilly"] # (the method here is __setitem__)
response = br.submit() # submit current form
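Once logged in, the response object behaves like a file, so the download-and-save part of a scheduled job is straightforward. A small sketch; the link text "export.csv" is only a placeholder for whatever file you actually need:
# Follow the link to the file and save it locally for later parsing
download = br.follow_link(text="export.csv")   # placeholder link text
with open("export.csv", "wb") as f:
    f.write(download.read())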
Have a look at this example, which uses Mechanize; it will give you the basic idea:
#!/usr/bin/python
import re
from mechanize import Browser
br = Browser()
# Ignore robots.txt
br.set_handle_robots( False )
# Google demands a user-agent that isn't a robot
br.addheaders = [('User-agent', 'Firefox')]
# Retrieve the Google home page, saving the response
br.open( "http://google.com" )
# Select the search box and search for 'foo'
br.select_form( 'f' )
br.form[ 'q' ] = 'foo'
# Get the search results
br.submit()
# Find the link to foofighters.com; why did we run a search?
resp = None
for link in br.links():
siteMatch = re.compile( 'www.foofighters.com' ).search( link.url )
if siteMatch:
resp = br.follow_link( link )
break
# Print the site
content = resp.get_data()
print content
You can use the standard urllib library to do this like so:
import urllib
urllib.urlretrieve("http://www.google.com/", "somefile.html", lambda x,y,z:0, urllib.urlencode({"username": "xxx", "password": "pass"}))
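Here the fourth argument is the url-encoded POST body (and the lambda is just a no-op progress hook). The same request written with urllib2 may be easier to follow; the URL and field names are only placeholders:
import urllib
import urllib2

data = urllib.urlencode({"username": "xxx", "password": "pass"})
response = urllib2.urlopen("http://www.google.com/", data)   # passing data makes it a POST
with open("somefile.html", "w") as f:
    f.write(response.read())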
The Mechanize example as suggested seems to work. In input fields where you must enter text, use something like:
br["kw"] = "rowling" # (the method here is __setitem__)
If some content is generated after you submit the form, as in a search engine, you get it via:
print response.read()
For checkboxes, mechanize treats them as list controls, so assign a list of the values you want selected (an empty list unchecks everything):
br["checkboxname"] = ["1"]   # checked
br["checkboxname2"] = []     # unchecked