Automate browser interaction - python

I need to upload hundreds of files via an HTML form, so that they end up on a server. I have no other options and must go via the form.
I have tried doing this programmatically using Python, but I'm doing something wrong and the files are empty when I try to open them via the web interface. I also tried replaying the request via Firefox's TamperData, and the file is uploaded incorrectly in that case too.
So I'm interested in exploring the idea of uploading the files by automating my browser instead. All I need to do is:
for file in files:
    open a page
    click on the browse button
    select a file
    click the upload button
So what software/library can I use to do this? I don't necessarily need to do this using python as I will never need to do this again in the future. I just need to get the files up there by any means possible.
I have access to Windows 7, Mac OS X, and SUSE Linux.
I also don't care which browser I use.

Splinter works well for this kind of thing:
https://github.com/cobrateam/splinter
You could do something like:
from splinter import Browser

with Browser('firefox') as browser:
    browser.visit('http://yourwebsite.com')
    browser.find_by_name('element_name').click()
    # do some other stuff...
You just need to find the name or ID of the element you want to interact with on the page.
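For the upload loop in the question, a minimal sketch along the same lines might look like this. It assumes the form has a file input named "file" and a submit button named "upload", and that the files sit in a local folder called "to_upload"; all three names, and the upload URL, are hypothetical and need to be replaced with the real ones from the page. Splinter's attach_file fills the file input directly, so the native file chooser never has to be driven.

```python
import os


def upload_all(browser, paths, form_url, file_field, submit_name):
    """Upload each file through the form, reloading the page for every file."""
    for path in paths:
        browser.visit(form_url)
        # attach_file sets the <input type="file"> value directly, so the
        # operating system's "Browse" dialog is never opened.
        browser.attach_file(file_field, os.path.abspath(path))
        browser.find_by_name(submit_name).click()


def main():
    # Imported here so upload_all can be tried out without Splinter installed.
    from splinter import Browser

    folder = "to_upload"  # hypothetical folder holding the files
    paths = [os.path.join(folder, name) for name in sorted(os.listdir(folder))]
    with Browser("firefox") as browser:
        # URL, field name and button name are placeholders for the real form.
        upload_all(browser, paths, "http://yourwebsite.com/upload", "file", "upload")
```

Call main() once the placeholder names match the form. If the upload is slow, some wait between the click and the next visit may be needed.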

Please check out Selenium:
It's a browser automation tool (it behaves like a real user).
http://selenium-python.readthedocs.org/en/latest/
If you're more the techy type and understand HTML, you can also write your own Python code that sends the files straight to the server with httplib2:
https://code.google.com/p/httplib2/
For an auto clicker, try autopy (it emulates mouse and keyboard input):
http://www.autopy.org/
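With Selenium, the usual trick for a file upload is to send the file's path to the file input as keystrokes instead of clicking the browse button. A rough sketch, assuming Selenium 4 and hypothetical element names "file" and "upload" (check the form's HTML for the real ones):

```python
import os


def upload_file(driver, form_url, path, file_field="file", submit_name="upload"):
    """Upload one file through an HTML form using a Selenium WebDriver."""
    driver.get(form_url)
    # "name" is the value of selenium's By.NAME locator strategy.
    # Sending a path to the file input selects the file without opening
    # the operating system's file chooser.
    driver.find_element("name", file_field).send_keys(os.path.abspath(path))
    driver.find_element("name", submit_name).click()


def main():
    # Imported here so upload_file can be tried out without Selenium installed.
    from selenium import webdriver

    folder = "to_upload"  # hypothetical folder holding the files
    driver = webdriver.Firefox()
    try:
        for name in sorted(os.listdir(folder)):
            # The URL is a placeholder for the page that contains the form.
            upload_file(driver, "http://yourwebsite.com/upload", os.path.join(folder, name))
    finally:
        driver.quit()
```

Call main() after adjusting the placeholders.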

Related

Python and Firefox (Windows) - How to save a page as I see it

I am trying to implement automation in Python, because they changed the way we access data at my work. We used to log into a Unix-based terminal, while now, using a SIM card, we open a browser page in Firefox that emulates the terminal.
Here is my problem: I wrote a script to save a .txt file with the data I am interested in. But when I tried to run it at work, it didn't run, because the page wouldn't let me open it (even if I was logged in).
So I decided to use a dumb solution: I saved the Firefox page as it was open and ran the script locally. But... I can't find how to save the page opened in Firefox without having to type the URL in Python beforehand. I want to automate this step too.
Is there a way to save the page opened like this one, acting like a Ctrl + S shortcut?
I appreciate any kind of help! Thanks!

populate html form using python and open in browser without submitting

I am trying to write a python script that populates the fields of an html form and then opens that form in a browser WITHOUT submitting it.
I can fill the form and submit it using urllib and urllib2; however, I don't want to submit it. I want the person to check the data and then submit it manually.
I have seen this might be possible with Mechanize or Selenium but I want to try and do this with what comes standard (the script will be run on various computers by people who don't know what python is...)
Does anyone know how I could do this?
Opening the form in the browser is the easier part: Python's standard webbrowser module opens a URL in the default browser on Linux, Windows, and Mac OS X. Without it, you would fall back on the subprocess module and the platform's own launcher: xdg-open on Linux, start on Windows, and open on Mac OS X.
As for the filling it out part...you might be able to replicate the page and serve the local, pre-filled copy with a basic webserver, shutting it down when the user submits the form. I don't think this would be the best idea.
An alternative is using Sikuli script to automate everything - open the user's web browser, populate the fields, maybe even move the mouse cursor to the submit button or highlight it without clicking. That sounds more like what you're trying to achieve.
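Using only the standard library, one possible variation on the "local, pre-filled copy" idea is to write a small HTML page whose form posts to the real form's URL and open it with the webbrowser module. This is only a sketch: the action URL and field names below are hypothetical, and it works only if the server accepts a post from a page it did not serve itself (no CSRF token, hidden session fields, and so on).

```python
import html
import pathlib
import tempfile
import webbrowser


def build_prefilled_form(action_url, fields):
    """Return an HTML page whose form posts to action_url with fields filled in."""
    rows = []
    for name, value in fields.items():
        rows.append(
            '<label>{0} <input type="text" name="{0}" value="{1}"></label><br>'.format(
                html.escape(name, quote=True), html.escape(str(value), quote=True)
            )
        )
    return (
        "<!DOCTYPE html>\n<html><body>\n"
        '<form method="post" action="{}">\n{}\n'
        '<input type="submit" value="Submit">\n</form>\n</body></html>\n'
    ).format(html.escape(action_url, quote=True), "\n".join(rows))


def open_prefilled_form(action_url, fields):
    """Write the page to a temporary file and show it in the default browser."""
    page = build_prefilled_form(action_url, fields)
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(page)
    # Nothing is submitted here: the user reviews the values and presses
    # Submit in the browser.
    webbrowser.open(pathlib.Path(handle.name).as_uri())
    return handle.name
```

For example, open_prefilled_form("http://example.com/submit", {"name": "Alice"}) would show a one-field form ready for review.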

selenium firefox download

I am using Python Selenium and would like to download a PDF file, but it opens in my browser instead. How can I download it from my browser? Is there any way to click the following image?
Before, all I had to do was disable the Firefox download dialog box, but now I am not able to request the download. Any ideas?
What should I do to request the download? I am also not able to find the file on the server.
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', "application/vnd.csv")
There is no way to do it with Selenium in the way you described. However, you can tweak your browser so that it downloads PDFs instead of opening them, and that can be done with Selenium.
Here's a good example of how to do it: "How to auto save files using custom Firefox profile".
With AppLoader you can click on any button, icon, or shortcut, just like a real user.
Add
profile.set_preference("pdfjs.disabled",True)
That will disable the default pdf viewer in firefox.
Also your save-to-disk preference seems incorrect. It should be
profile.set_preference("browser.helperApps.neverAsk.saveToDisk","application/pdf")
The last thing to keep in mind is that if you have multiple file types to download, you need to put them all under the same set_preference saveToDisk statement, comma separated, otherwise the later statement will overwrite the earlier settings.
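Putting those preferences together might look roughly like the sketch below. It assumes Selenium 4, where preferences can be set on a FirefoxOptions object rather than the profile object used in the question. The two browser.download.* preferences and the download folder are additions not mentioned above, included on the assumption that the files should land in a known directory.

```python
def download_preferences(mime_types, download_dir):
    """Build Firefox preferences that save the given MIME types without asking."""
    return {
        # Disable the built-in PDF viewer so PDFs are treated as downloads.
        "pdfjs.disabled": True,
        # 2 tells Firefox to use the directory in browser.download.dir.
        "browser.download.folderList": 2,
        "browser.download.dir": download_dir,
        # All types go into ONE comma-separated value; setting the preference
        # a second time would replace the first value.
        "browser.helperApps.neverAsk.saveToDisk": ",".join(mime_types),
    }


def make_driver(mime_types, download_dir):
    """Start Firefox with the download preferences applied."""
    # Imported here so download_preferences can be used without Selenium.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    for name, value in download_preferences(mime_types, download_dir).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```

A call such as make_driver(["application/pdf", "text/csv"], "/tmp/downloads") would then cover both file types with a single saveToDisk value; the folder path is only an example.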

Advanced screen-scraping using curl

I need to create a script that will log into an authenticated page and download a pdf.
However, the pdf that I need to download is not at a URL, but it is generated upon clicking on a specific input button on the page. When I check the HTML source, it only gives me the url of the button graphic and some obscure name of the button input, and action=".".
In addition, both the url where the button is and the form name is obscured, for example:
url = /WebObjects/MyStore.woa/wo/5.2.0.5.7.3
input name = 0.0.5.7.1.1.11.19.1.13.13.1.1
How would I log into the page, 'click' that button, and download the pdf file within a script?
Maybe the Mechanize module can help.
I think the URL requested on clicking the button may be generated using JavaScript. So, to run JavaScript code from a Python script, take a look at Spidermonkey.
Try mechanize or twill. HttpFox or Firebug can help you build your queries. Remember that you can also pickle cookies from the browser and use them later with Python libraries. If the code is generated by JavaScript, it may be possible to reverse engineer it. If not, you can run a JavaScript interpreter, or use Selenium or Windmill to script a real browser.
You could observe what requests are made when you click the button (using Firebug in Firefox or Developer Tools in Chrome). You may then be able to request the PDF directly.
It's difficult to help without seeing the page in question.
As Acorn said, you should try monitoring the actual requests and see if you can spot a pattern.
If not, then your best bet is actually to automate a fully-featured browser, that will be able to run Javascript, so you'll exactly mimic what a regular user would do. Have a look at this page on the Python Wiki for ideas, check the section Python Wrappers around Web "Libraries" and Browser Technology.
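If the requests you observe turn out to be ordinary form posts, replaying them could be sketched with the standard library as below. Everything specific here is an assumption: the URLs, the credential field names, and the fields sent for the button all have to be copied from what Firebug or the developer tools actually show. Note that for an image button, browsers normally send the click coordinates as name.x and name.y rather than the bare name.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return an opener that keeps cookies between requests, like a browser."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def form_post(url, fields):
    """Build a POST request carrying URL-encoded form fields."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=data)


def download_pdf(opener, login_url, credentials, button_url, button_fields, out_path):
    """Log in, replay the button's form post, and save the response body."""
    # The login response sets the session cookie in the opener's cookie jar.
    opener.open(form_post(login_url, credentials)).close()
    with opener.open(form_post(button_url, button_fields)) as response:
        body = response.read()
    with open(out_path, "wb") as handle:
        handle.write(body)
    return len(body)
```

Because the URL and input names in the question look session-specific, they would probably have to be read from the page after logging in rather than hard-coded.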

python web script send job to printer

Is it possible for my Python web app to provide an option for the user to automatically send jobs to the locally connected printer? Or will the user always have to use the browser to manually print out everything?
If your Python webapp is running inside a browser on the client machine, I don't see any other way than manually for the user.
Some workarounds you might want to investigate:
If your web app is installed on the client machine, you will be able to connect directly to the printer, as you have access to the underlying OS.
You could potentially create a plugin, installed in the browser, that does this for the user, but I have no clue how this works technically.
What is it that you want to print? You could generate a PDF that contains everything the user needs to print in one go.
You can serve to the user's browser a webpage that includes the necessary Javascript code to perform the printing if the user clicks to request it, as shown for example here (a pretty dated article, but the key idea of using Javascript to call window.print has not changed, and the article has some useful suggestions, e.g. on making a printer-friendly page; you can locate lots of other articles mentioning window.print with a web search, if you wish).
Calling window.print (from the Javascript part of the page that your Python server-side code will serve) will actually (in all browsers/OSs I know) bring up a print dialog, so the user gets system-appropriate options (picking a printer if he has several, maybe saving as PDF instead of doing an actual print if his system supports that, etc, etc).
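As an illustration only, a Python server built on the standard library could serve such a page like this. The page content and the port are made up for the example; the only essential part is the button that calls window.print() in the browser.

```python
import http.server

# A minimal page: the button calls window.print(), which opens the browser's
# own print dialog on the user's machine.
PAGE = b"""<!DOCTYPE html>
<html>
<body>
<h1>Report</h1>
<p>Content the user may want to print.</p>
<button onclick="window.print()">Print this page</button>
</body>
</html>
"""


class PrintPageHandler(http.server.BaseHTTPRequestHandler):
    """Serve the same printable page for every GET request."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)


def serve(port=8000):
    """Run the server until interrupted (the port is an arbitrary choice)."""
    http.server.HTTPServer(("127.0.0.1", port), PrintPageHandler).serve_forever()
```

Running serve() and opening http://127.0.0.1:8000/ shows the page. The server never talks to the printer itself; the print dialog is entirely the browser's.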
