Using python to download a file after clicking the submit button - python

I need to log in to a website and navigate to a report page. After I enter the required information and click the "Go" button (this is a multipart/form-data form that I am submitting), a pop-up window asks me where to save the file. I want to do this automatically with Python.
I have searched the internet for a couple of days but can't find a way to do this in Python. Using urllib2, I can get as far as submitting the multipart form, but how can I get the name and location of the file and download it?
Please note: there is no href associated with the "Go" button. After the form is submitted, a file-save dialog pops up asking me where to save the file.
Thanks in advance.

There are Python bindings for Selenium that let you script simulated browser behavior and do really complex things with it. Take a look; it should be enough for what you need.
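As a minimal sketch of that approach: Chrome can be told (via Selenium) to save downloads to a folder instead of showing the save-file dialog. The URL, credentials, and element names below are placeholders, not the asker's actual site, and this assumes selenium plus a matching chromedriver are installed.

```python
def chrome_download_prefs(save_dir):
    # Preferences that make Chrome save files into save_dir without
    # prompting, and save PDFs instead of previewing them.
    return {
        "download.default_directory": save_dir,
        "download.prompt_for_download": False,
        "plugins.always_open_pdf_externally": True,
    }

def fetch_report(save_dir):
    # Requires selenium and chromedriver; never called here, sketch only.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", chrome_download_prefs(save_dir))
    driver = webdriver.Chrome(options=options)
    try:
        driver.get("https://example.com/login")            # placeholder URL
        driver.find_element(By.NAME, "username").send_keys("user")
        driver.find_element(By.NAME, "password").send_keys("secret")
        driver.find_element(By.NAME, "login").click()
        driver.get("https://example.com/report")           # placeholder URL
        driver.find_element(By.NAME, "go").click()         # the "Go" button
        # Chrome now writes the file into save_dir; poll the folder for it.
    finally:
        driver.quit()
```

With these preferences set there is no save dialog at all, so the file simply appears in `save_dir` after the click.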

Related

Cannot identify and manipulate this website 'popup' advice please?

So I'm automating Exchange mailbox creation, with our email hosted by a 3rd party. They use software called Hosting Controller, which gives me the ability to create new mailboxes. Except it's all manual, so I'm working on a Python + Selenium script to automate the process.
I'm hitting a roadblock where I'm unable to identify this pop-up so I can manipulate it. I believe it's jQuery, but
Alert = alert.switch_to.alert()
refuses to work; I get tracebacks. I then examined switch_to.element/frame/window, but I couldn't get any of those to work either.
I'm very new to this stuff; this is only the second Python script I've ever tried outside of training coursework.
Here is a short video on what I'm talking about: https://streamable.com/3dx3z5
So, from looking at your video, the pop-up form that you're seeing is a modal. Without access to HostingController.com or login credentials, I cannot find the XPath for you. What you will need to do is inspect the HTML source using the browser's dev tools / DOM view (F12 if you are using Google Chrome or the new Chromium-based Edge) and check whether the modal is being displayed. I submitted an answer to a question in another thread regarding pop-up modals (here). I hope this guidance helps.
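To make the distinction concrete: `switch_to.alert` only handles native browser alerts, so an HTML/jQuery modal has to be located as an ordinary element instead. A sketch with Selenium's explicit waits, where the class name "modal" and the timeout are assumptions to be replaced with what the dev tools show:

```python
def modal_xpath(css_class="modal"):
    # XPath matching a div whose class attribute contains css_class.
    # "modal" is a guess; read the real class name from the DOM.
    return "//div[contains(@class, '{0}')]".format(css_class)

def wait_for_modal(driver, timeout=10):
    # Requires selenium; waits until the modal element is visible,
    # then returns it so its inner inputs/buttons can be driven.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    return WebDriverWait(driver, timeout).until(
        EC.visibility_of_element_located((By.XPATH, modal_xpath()))
    )
```

Once `wait_for_modal` returns the element, fields inside the modal can be filled with the usual `find_element` calls scoped to it.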

Python - Manipulate HTML in order to use printer

I am programming an application in Python that, among other functions, will print PDF files via a Xerox printer.
I am facing two options right now:
First one: find a way to communicate with the printer driver so I could easily send instructions to the printer and do whatever I wanted with it.
Second one: since the first one seems to be a bit tricky and I don't know of any Python API that does something like that, I had the idea of using the service that Xerox provides. Basically there is an IP address that redirects me to an administration page where I can get information about the status of the printer and... an option to select files to print (and set the number of pages, the tray where the pages will exit, etc...).
I think the best way is to follow the second option, but I don't know if that's doable.
Basically I want to be able to change that webpage's source code in order to change, for example, the textboxes, and in the end "press" the submit button.
I don't know if this is possible, but if it is, can anyone point me in the right path, please?
Or if you have another idea, I would like to hear it.
So far I have only managed to get the page's source code; I still don't know how to submit it after I change it.
import requests

# Fetch the admin page and print its raw HTML source
url = 'http://www.example.com'
response = requests.get(url)
print(response.content)
Unless Xerox has a Python API or library, your second option is the best choice.
When you visit the administration page and submit files for printing, try doing the following:
When you load the admin page, open Chrome's developer tools (right-click -> Inspect Element).
Open the "Network" tab in the developer console.
Try submitting some files for printing through the online form. Watch the Network panel for any activity. If a new row appears, click on it and view the request data.
Try to replicate the request's query parameters and headers with Python's requests library.
If you need any help replicating the exact request, feel free to start a new question with the request data and what you have tried.
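The steps above can be sketched as follows. The URL, field names, and Referer header are placeholders standing in for whatever the Network panel actually shows for the Xerox admin page:

```python
def build_print_payload(filename, copies, tray):
    # Mirror the form fields observed in the captured request;
    # these three names are assumptions, not the real Xerox fields.
    return {"file": filename, "copies": str(copies), "tray": tray}

def submit_print_job(base_url, payload):
    # Requires the requests library; replays the captured POST and
    # returns the HTTP status code. Sketch only, never called here.
    import requests
    session = requests.Session()
    response = session.post(base_url + "/print", data=payload,
                            headers={"Referer": base_url + "/admin"})
    return response.status_code
```

If the admin page requires a login, do the login POST through the same `Session` first so its cookies carry over to the print request.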

Using Python's Requests library to navigate webpages / Click buttons

I'm new to web programming and have recently begun looking into using Python to automate some manual processes. What I'm trying to do is log into a site, click some drop-down menus to select settings, and run a report.
I've found the acclaimed requests library: http://docs.python-requests.org/en/latest/user/advanced/#request-and-response-objects
and have been trying to figure out how to use it.
I've successfully logged in using bpbp's answer on this page: How to use Python to login to a webpage and retrieve cookies for later usage?
My understanding of "clicking" a button is to write a post() command that mimics a click: Python - clicking a javascript button
My question (since I'm new to web programming and this library) is how I would go about pulling the data I need to figure out how I would construct these commands. I've been looking into [RequestObject].headers, .text, etc. Any examples would be great.
As always, thanks for your help!
EDIT:::
To make this question more concrete, I'm having trouble interacting with different aspects of a web-page. The following image shows what I'm actually trying to do:
I'm on a web page that looks like this. There is a drop-down menu with clickable dates that can be changed. My goal is to automate changing the date to the most recent one, "clicking" 'Save and Run', and downloading the report when it has finished running.
The only solution to this I have found is Selenium. If it weren't a JavaScript-heavy website you could try mechanize, but for this you need to render the JavaScript and then inject JavaScript... like Selenium does.
Upside: you can record actions in Firefox (using Selenium IDE) and then export those actions to Python. The downside is that this code has to open a browser window to run.
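A sketch of what the exported Selenium script roughly looks like for this task. Everything here is a placeholder: the URL, the element locators, the button text, and the assumption that the drop-down labels are ISO-style dates (so the latest one sorts last):

```python
def most_recent_date(option_labels):
    # Assumes ISO-style labels like "2014-03-01", where lexicographic
    # order equals chronological order; adjust for other date formats.
    return max(option_labels)

def run_report(url):
    # Requires selenium + chromedriver; sketch only, never called here.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import Select

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        dropdown = Select(driver.find_element(By.ID, "report-date"))  # placeholder id
        labels = [o.text for o in dropdown.options]
        dropdown.select_by_visible_text(most_recent_date(labels))
        driver.find_element(By.XPATH, "//button[text()='Save and Run']").click()
    finally:
        driver.quit()
```

`Select` is Selenium's helper for `<select>` drop-downs; if the page builds its menu from styled divs instead, you would click the menu open and then click the option element directly.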

populate html form using python and open in browser without submitting

I am trying to write a Python script that populates the fields of an HTML form and then opens that form in a browser WITHOUT submitting it.
I can fill the form and submit it using urllib and urllib2; however, I don't want to submit it - I want the person to check the data and then submit it manually.
I have seen that this might be possible with Mechanize or Selenium, but I want to try to do this with what comes standard (the script will be run on various computers by people who don't know what Python is...).
Does anyone know how I could do this?
Opens the form in the browser? That part is actually straightforward: Python's standard library includes the webbrowser module, which opens a URL (including a file:// URL) in the user's default browser in a cross-platform way, so you don't need to shell out via subprocess to platform-specific commands like xdg-open on Linux or start on Windows.
As for the filling-it-out part... you might be able to replicate the page and serve the local, pre-filled copy with a basic web server, shutting it down when the user submits the form. I don't think this would be the best idea, though.
An alternative is using Sikuli Script to automate everything: open the user's web browser, populate the fields, and maybe even move the mouse cursor to the submit button or highlight it without clicking. That sounds more like what you're trying to achieve.
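Since the asker wants only what comes standard, here is a stdlib-only sketch of the "local pre-filled copy" idea: build an HTML form with the values already set, write it to a temp file, and open it in the default browser. The field names and action URL are placeholders; in practice the form's action must point at the real site's absolute submit URL.

```python
import html
import tempfile
import webbrowser

def prefilled_form(action_url, fields):
    # Build an HTML form with every field's value already filled in.
    inputs = "\n".join(
        '<label>{0}: <input name="{0}" value="{1}"></label><br>'.format(
            html.escape(name), html.escape(value))
        for name, value in fields.items()
    )
    return ('<html><body><form method="post" action="{0}">\n'
            '{1}\n<input type="submit" value="Submit"></form></body></html>'
            ).format(html.escape(action_url), inputs)

page = prefilled_form("https://example.com/submit",        # placeholder URL
                      {"name": "Ada", "email": "ada@example.com"})
with tempfile.NamedTemporaryFile("w", suffix=".html", delete=False) as f:
    f.write(page)
# webbrowser.open("file://" + f.name)  # uncomment to launch the browser
```

Nothing is submitted until the person reviews the values and clicks the button themselves, which is exactly the manual-check step the asker wants.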

Advanced screen-scraping using curl

I need to create a script that will log into an authenticated page and download a pdf.
However, the PDF that I need to download is not at a fixed URL; it is generated when a specific input button on the page is clicked. When I check the HTML source, it only gives me the URL of the button graphic, an obscure name for the button input, and action=".".
In addition, both the URL where the button lives and the form name are obscured, for example:
url = /WebObjects/MyStore.woa/wo/5.2.0.5.7.3
input name = 0.0.5.7.1.1.11.19.1.13.13.1.1
How would I log into the page, 'click' that button, and download the pdf file within a script?
Maybe the mechanize module can help.
I think the URL may be generated by JavaScript when the button is clicked. To run JavaScript code from a Python script, take a look at Spidermonkey.
Try mechanize or twill. HttpFox or Firebug can help you build your queries. Remember that you can also pickle cookies from the browser and use them later with Python libraries. If the URL is generated by JavaScript, it may be possible to 'reverse engineer' it. If not, you can run a JavaScript interpreter, or use Selenium or Windmill to script a real browser.
You could observe what requests are made when you click the button (using Firebug in Firefox or Developer Tools in Chrome). You may then be able to request the PDF directly.
It's difficult to help without seeing the page in question.
As Acorn said, you should try monitoring the actual requests and see if you can spot a pattern.
If not, then your best bet is to automate a fully-featured browser that can run JavaScript, so you exactly mimic what a regular user would do. Have a look at this page on the Python Wiki for ideas; check the section Python Wrappers around Web "Libraries" and Browser Technology.
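If monitoring the requests does reveal a pattern, the replay can be done with the standard library alone (Python 3 urllib shown here; the original question's era used urllib2, whose cookie handling is analogous). All URLs and field names below are placeholders for the ones captured with Firebug:

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

def make_session():
    # An opener that keeps cookies between requests, so the
    # authenticated login carries over to the PDF request.
    jar = CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

def encode_form(fields):
    # POST bodies must be bytes; urlencode the observed form fields.
    return urllib.parse.urlencode(fields).encode("ascii")

opener = make_session()
# Placeholder URLs and fields - substitute the captured ones:
# opener.open("https://example.com/login", encode_form({"user": "u", "pass": "p"}))
# pdf = opener.open("https://example.com/report", encode_form({"button": "1"})).read()
# with open("report.pdf", "wb") as f:
#     f.write(pdf)
```

The obfuscated WebObjects URL and input name from the question would simply be pasted in verbatim as the report URL and the single form field.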
