Python script to get CSV from HTML button that runs PHP - python

I'm trying to trigger the download of a CSV using requests-html. I believe that when the button is clicked it triggers "export_csv.php":
<button class='tiny' type='submit' name='action' value='export' style='width:200px;'>Export All Fields</button>
</form>
<form name='export' method='POST' action='../export_csv.php'>
I'm just not sure how to trigger the PHP file with Python. I don't have to do it in requests if there's a better way, but I would like to avoid using Selenium if possible.
I'd share the URL but it's an internal resource and not available on the web.
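Since a submit button with a `name` contributes its name/value pair to the form data, POSTing `action=export` to the form's target may be all that's needed, without requests-html. A minimal, untested sketch; the host is a placeholder (the real site is internal), and any login step the site requires would have to be added:

```python
import requests

BASE = "http://intranet.example"  # placeholder: the real host is internal

def export_payload():
    # Clicking "Export All Fields" sends the submit button's own
    # name/value pair as form data: action=export.
    return {"action": "export"}

def download_csv(out_path="export.csv"):
    with requests.Session() as s:
        # If the site needs a login, authenticate here first so the
        # session cookie is sent along with the export request.
        resp = s.post(BASE + "/export_csv.php", data=export_payload())
        resp.raise_for_status()
        with open(out_path, "wb") as f:
            f.write(resp.content)
    return out_path
```

Inspecting the actual request in the browser's network tab would confirm the exact field names and whether a session cookie is required.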


HTTP GET request is not triggering an action on website

I wrote a bot that used Selenium to scrape all the needed data and performed a few simple tasks. I don't know why I didn't use HTTP requests from the start, but I am now trying to switch to that. One of the Selenium functions used a simple driver.get(url) to trigger an action on the site. Using requests.get, however, does not work.
This selenium code worked
import time
from selenium import webdriver

AM4_URL = 'https://www.airline4.net/?gameType=app&uid=102692112805909972638&uid_token=8adee69e774d89fb6e9f903e7d2afc70&mail=bsgpricecheck@gmail.com&mail_token=286f8bd25bcc32f49a02036102ce072c&device=ios&version=6&FCM=daf5d0d8bf4d7962061eac3a8e4bffa770d6593f31fd5b070d690f244dfb40d1#'

def depart():
    # Load driver and get login url
    if pax_rep > 80:
        driver = webdriver.Firefox(executable_path=r'C:\webdrivers\geckodriver.exe')
        driver.get(AM4_URL)
        driver.minimize_window()
        driver.get("https://www.airline4.net/route_depart.php?mode=all&ids=x")
        time.sleep(100)

def randfunc():
    depart()
But now I'm trying to switch over to requests because all the other bot functions work with it. I tried this and it doesn't perform the action.
import requests
# I was able to combine the URLs into one. It still performs the action when on a browser.
dep_url = 'https://www.airline4.net/route_depart.php?mode=all&ids=x?gameType=app&uid=102692112805909972638&uid_token=8adee69e774d89fb6e9f903e7d2afc70&mail=bsgpricecheck@gmail.com&mail_token=286f8bd25bcc32f49a02036102ce072c&device=ios&version=6&FCM=daf5d0d8bf4d7962061eac3a8e4bffa770d6593f31fd5b070d690f244dfb40d1#'
requests.get(dep_url)
I figured this code would work because the URL doesn't return any content; I thought the GET request was acting as a command.
I would also like to note that I got the route_depart.php URL from an Ajax button.
Here's the HTML from that
<div class="btn-group d-flex" role="group">
<button class="btn" style="display:none;" onclick="Ajax('def227_j22.php','runme');"></button>
<button class="btn w-100 btn-danger btn-xs" onclick="Ajax('route_depart.php?mode=all&ids=x','runme',this);">
<span class="glyphicons glyphicons-plane"></span> Depart <span id="listDepartAmount">5</span></button>
</div>
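A likely reason the bare requests.get does nothing is that the browser holds a session cookie, set when the login URL is visited, which a standalone GET never sends. Note also that the combined URL above contains two ?, so everything after the second one is swallowed into the ids parameter instead of being parsed as separate query fields. A hedged sketch using a Session (tokens replaced with placeholders):

```python
import requests

# Placeholder credentials -- substitute the real uid/token query string.
LOGIN_URL = "https://www.airline4.net/?gameType=app&uid=YOUR_UID&uid_token=YOUR_TOKEN"
DEPART_URL = "https://www.airline4.net/route_depart.php"

def depart():
    with requests.Session() as s:
        s.get(LOGIN_URL)  # establishes the session cookie the site expects
        # The button fires this endpoint via Ajax(), so mimicking an XHR
        # header may help if the server checks for it.
        r = s.get(
            DEPART_URL,
            params={"mode": "all", "ids": "x"},
            headers={"X-Requested-With": "XMLHttpRequest"},
        )
        r.raise_for_status()
        return r
```

Passing the query fields via params also avoids the double-? problem from the hand-combined URL.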

Hi, I'm writing a bot in requests to fill out an HTML form. Have some questions about values and the payload

I created a program to fill out an HTML webpage form in Selenium, but now I want to change it to requests. However, I've come across a bit of a roadblock. I'm new to requests, and I'm not sure how to emulate a request as if a button had been pressed on the original website. Here's what I have so far -
import requests
import random

emailRandom = ''
for i in range(6):
    add = random.randint(1, 10)
    emailRandom += str(add)

payload = {
    'email': emailRandom + '@redacted',
    'state_id': '34',
    'tnc-optin': 'on',
}
r = requests.get('redacted.com', data=payload)
The button I'm trying to "click" on the webpage looks like this -
<div class="button-container">
<input type="hidden" name="recaptcha" id="recaptcha">
<button type="submit" class="button red large">ENTER NOW</button>
</div>
What is the default/"clicked" value for this button? Will I be able to use it to submit the form using my requests code?
Using Selenium and using requests are two different things: Selenium drives your browser to submit the form through the rendered HTML UI, while Python requests just submits the data from your code without any UI. It does not involve "clicking" the submit button.
The "submit" button in this case merely triggers the browser to POST the form values.
However, your backend will validate the "recaptcha" token, so you will need to work around that.
I recommend capturing the requests with Fiddler:
https://www.telerik.com/fiddler
and then recreating them.
James's answer using Selenium is slower than this.
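Putting that advice together: the form fields from the question, plus the hidden recaptcha token (captured e.g. with Fiddler), would be sent with POST rather than GET. A sketch under those assumptions; the endpoint URL is redacted in the question, so it stays a parameter here:

```python
import random
import requests

def random_email(domain="redacted.example"):
    # Random numeric local part, mirroring the loop in the question.
    local = "".join(str(random.randint(1, 10)) for _ in range(6))
    return local + "@" + domain

def build_payload(recaptcha_token):
    # Field names come from the question's form; the hidden 'recaptcha'
    # input must carry a valid token or the backend will reject the POST.
    return {
        "email": random_email(),
        "state_id": "34",
        "tnc-optin": "on",
        "recaptcha": recaptcha_token,
    }

def submit(url, recaptcha_token):
    # Form submissions go out as POST, not GET.
    return requests.post(url, data=build_payload(recaptcha_token))
```

The button itself sends no value here (it has no name attribute), so only the input fields and the token make up the payload.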

Selenium to push button in form

Python: 3.4.1
Browser: Chrome
I'm trying to push a button which is located in a form using Selenium with Python. I'm fairly new to Selenium and HTML.
The HTML code is as follows:
<FORM id='QLf_437222' method='POST' action='xxxx'>
<script>document.write("<a href='javascript:void(0);' onclick='document.getElementById(\"QLf_437222\").submit();' title='xxx'>51530119</a>");</script>
<noscript><INPUT type='SUBMIT' value='51530119' title='xxx' name='xxxx'></noscript>
<INPUT type=hidden name="prodType" value="DDA"/>
<INPUT type=hidden name="BlitzToken" value="BlitzToken"/>
<INPUT type=hidden name="productInfo" value="40050951530119"/>
<INPUT type=hidden name="reDirectionURL" value="xxx"/>
</FORM>
I've been trying the following:
driver.execute("javascript:void(0)")
driver.find_element_by_xpath('//*[@id="QLf_437104"]/a').click()
driver.find_element_by_xpath('//*[@id="QLf_437104"]/a').submit()
driver.find_element_by_css_selector("#QLf_437104 > a").click()
driver.find_element_by_css_selector("#QLf_437104 > a").submit()
Python doesn't throw an exception, so it seems like I'm clicking something, but it doesn't do what I want.
In addition to this the webpage acts funny when the chrome driver is initialized from Selenium. When clicking the button in the initialized chrome driver, the webpage throws an error (888).
I'm not sure where to go from here. Might it be something with the hidden elements?
If I can provide additional information please let me know.
EDIT:
It looks like the form id changes sometimes.
What it sounds like you are trying to do is submit the form, right?
The <a> that you are pointing at simply submits that form. Since it is injected via JavaScript, it may not be present yet when you try to click it. What I'd recommend is doing:
driver.find_element_by_css_selector("form[id^='QLf']").submit()
That will avoid the button and submit the appropriate form.
In the above CSS selector, I used [id^=, which means: find a <form> whose ID attribute starts with QLf, because it looks like the numbers after it are automatically generated.

How to automate interaction for a website with POST method

I need to input text into the text box on this website:
http://www.link.cs.cmu.edu/link/submit-sentence-4.html
I then need the returned page's HTML. I have looked at other solutions, but I am aware that there is no one-size-fits-all answer. I have seen Selenium, but I do not understand its documentation or how I can apply it. Please help me out, thanks.
BTW, I have some experience with BeautifulSoup, if that helps. I asked before and requests was the only solution suggested, but I don't know how to use it.
First, IMHO automation via BeautifulSoup is overkill if you're looking at a single page. You're better off looking at the page source and getting the form structure from it. Your form is really simple:
<FORM METHOD="POST"
ACTION="/cgi-bin/link/construct-page-4.cgi#submit">
<input type="text" name="Sentence" size="120" maxlength="120"></input><br>
<INPUT TYPE="checkbox" NAME="Constituents" CHECKED>Show constituent tree
<INPUT TYPE="checkbox" NAME="NullLinks" CHECKED>Allow null links
<INPUT TYPE="checkbox" NAME="AllLinkages" OFF>Show all linkages
<INPUT TYPE="HIDDEN" NAME="LinkDisplay" VALUE="on">
<INPUT TYPE="HIDDEN" NAME="ShortLength" VALUE="6">
<INPUT TYPE="HIDDEN" NAME="PageFile" VALUE="/docs/submit-sentence-4.html">
<INPUT TYPE="HIDDEN" NAME="InputFile" VALUE="/scripts/input-to-parser">
<INPUT TYPE="HIDDEN" NAME="Maintainer" VALUE="sleator@cs.cmu.edu">
<br>
<INPUT TYPE="submit" VALUE="Submit one sentence">
<br>
</FORM>
so you should be able to extract the fields and populate them.
I'd do it with curl and -X POST (like here -- see the answer too :)).
If you really want to do it in python, then you need to do something like POST using requests.
Pulled straight from the docs and changed to your example.
from selenium import webdriver
# Create a new instance of the Firefox driver
driver = webdriver.Firefox()
# go to the page
driver.get("http://www.link.cs.cmu.edu/link/submit-sentence-4.html")
# the page is ajaxy so the title is originally this:
print(driver.title)
# find the element that's name attribute is Sentence
inputElement = driver.find_element_by_name("Sentence")
# type in the search
inputElement.send_keys("You're welcome, now accept the answer!")
# submit the form
inputElement.submit()
This will at least help you input the text. Then, take a look at this example to retrieve the html.
Following OP's requirement of having the process in python.
I wouldn't use selenium, because it's launching a browser on your desktop and is overkill for just filling up a form and getting its reply (you could justify it if your page would have JS or ajax stuff).
The form request code could be something like:
import requests
payload = {
'Sentence': 'Once upon a time, there was a little red hat and a wolf.',
'Constituents': 'on',
'NullLinks': 'on',
'AllLinkages': 'on',
'LinkDisplay': 'on',
'ShortLength': '6',
'PageFile': '/docs/submit-sentence-4.html',
'InputFile': "/scripts/input-to-parser",
'Maintainer': "sleator@cs.cmu.edu"
}
r = requests.post("http://www.link.cs.cmu.edu/cgi-bin/link/construct-page-4.cgi#submit",
data=payload)
print(r.text)
the r.text is the HTML body which you can parse via e.g. BeautifulSoup.
Looking at the HTML reply, I think your problem will be in processing the text within the <pre> tags, but that's an entirely different thing outside the scope of this question.
HTH,

Python web request with redirect

I am attempting to scrape the following website flow.gassco.no as one of my first python projects. I need to bypass the splash screen which redirects to the main page. I have isolated the following action,
<form method="get" action="acceptDisclaimer">
<input type="submit" value="Accept"/>
<input type="button" name="decline" value="Decline" onclick="window.location = 'http://www.gassco.no'" />
</form>
In a browser, appending 'acceptDisclaimer?' to the URL redirects to the target flow.gassco.no. However, if I try to replicate this in urllib, I appear to stay on the same page when outputting the source.
import urllib, urllib2
url="http://flow.gassco.no/acceptDisclaimer?"
url2="http://flow.gassco.no/"
#first pass to invoke disclaimer
req=urllib2.Request(url)
res=urllib2.urlopen(req)
#second pass to access main page
req1=urllib2.Request(url2)
res2=urllib2.urlopen(req1)
data=res2.read()
print data
I suspect that I have oversimplified the problem, but would appreciate any input into how I can accept the disclaimer and continue to output the main page source.
Use a cookiejar. See python: urllib2 how to send cookie with urlopen request.
Open the main URL first.
Open /acceptDisclaimer after that.
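The same idea is simpler with requests, whose Session object manages the cookie jar automatically. An untested sketch following those two steps:

```python
import requests

def fetch_flow_main():
    with requests.Session() as s:
        s.get("http://flow.gassco.no/")                  # splash page sets the session cookie
        s.get("http://flow.gassco.no/acceptDisclaimer")  # accept against that same session
        return s.get("http://flow.gassco.no/").text      # main page, past the splash
```

The original urllib2 code fails because each urlopen call is independent: the cookie set by the first request is never sent with the second.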
