I'm trying to download files using Python 3. I use webbrowser.open_new(url) to open file locations. Some files are downloaded automatically by Chrome's downloader, and some are just opened in a Chrome window. How can I choose between the options?
You cannot influence that, not with the Python webbrowser module.
What is downloaded and what is displayed in the browser is a preference set in the browser itself.
You could try to set those preferences using Selenium; see "Set chrome.prefs with python binding for selenium in chromedriver". This is not going to be simple; you'll need to figure out the exact preference strings to alter. Perhaps the Chromium preferences list can be used as a guide there.
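As a rough sketch of that Selenium route: the preference keys below are commonly used Chrome preference names, but they are assumptions here and not taken from the linked answer, so check them against the Chromium preferences list. The function names and the download directory are hypothetical.

```python
def chrome_download_prefs(download_dir):
    """Build a Chrome preference dict that favours saving files over displaying them.

    The keys are assumed, commonly used Chrome preference names; verify them
    for your Chrome version.
    """
    return {
        "download.default_directory": download_dir,
        "download.prompt_for_download": False,
        "plugins.always_open_pdf_externally": True,
    }


def start_chrome(download_dir):
    """Start Chrome through Selenium with the download preferences applied."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", chrome_download_prefs(download_dir))
    return webdriver.Chrome(options=options)
```

Calling start_chrome("/tmp/downloads") and then driver.get(url) would, under these assumptions, save the file instead of displaying it.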
The Web server on which the file is hosted sends a header that suggests to the browser how it might handle the file, and the user's preferences hold some sway as well. You likely won't be able to override it easily.
You can avoid this by not using a Web browser from Python. urllib.request (urllib2 in Python 2) or, better yet, the third-party requests module is a much easier way to talk to the Web.
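A minimal sketch of the no-browser route, using only the Python 3 standard library. The function name and the example URL are placeholders, not part of the original question.

```python
import shutil
import urllib.request


def download(url, destination):
    """Stream the resource at `url` into the local file `destination`."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        # Copy in chunks so large files are not held in memory all at once.
        shutil.copyfileobj(response, out)
    return destination
```

For example, download("https://example.com/report.pdf", "report.pdf") always writes the file to disk, regardless of how a browser would have chosen to handle it.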
Related
I have deployed WordPress on my localhost and I am using Selenium WebDriver to automatically navigate to every link. I need to save each dynamically loaded HTML page of that WordPress site using a Python script. Please help me. I am using Ubuntu 14.04.
If you're just trying to get a plain HTML version of your WordPress site, you usually won't need to go through a full browser, as very few WordPress sites are AJAX-heavy.
Give WebHTTrack (or plain HTTrack) a try.
Install webhttrack on your machine and run it via the menu (it's usually found under Internet / Network) or from a terminal (by simply running "webhttrack").
It will start a local web server and point your browser to a web interface which you can use to set up your download project. Once you run it, you will find a plain copy of the WordPress site in ~/websites/.
I am trying to create a script that gets the data from a Google Keep list. I was thinking Google Takeout might do part of what I want, but I cannot find an API to automate the downloads. Does anyone know a way to grab this data via a script (Python/Bash) so that I can easily extract what I need?
I am not sure whether it is allowed, but you could log in with an HTTP session (for example, using the requests library), navigate to the page you wish to parse, and parse it with BeautifulSoup.
I've written a quite similar script in Python; you can find it on GitHub. I think it's pretty self-explanatory, but if you need any more help, feel free to ask.
You could use the Selenium library for that.
I used the framework to scrape the keep.google.com web page for all the notes and export them to a CSV file.
This might be helpful; I made the script to back up my notes to my computer:
https://github.com/darshkpatel/GoogleKeep_Backup
There is no API for Google Keep at this time. I don't think you're going to be able to automate Google Takeout either. The best you will be able to do is run it manually and then create your own application to import the data wherever you want it.
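A sketch of the "run Takeout manually, then process it yourself" approach. It assumes the export has been unpacked into a directory holding one JSON file per note, and that each file has "title" and "textContent" keys. That layout is an assumption about the export, so inspect your own files and adjust the keys. The function name is hypothetical.

```python
import csv
import json
from pathlib import Path


def takeout_notes_to_csv(takeout_dir, csv_path):
    """Collect the title and text of each exported note into one CSV file.

    Assumes each *.json file in `takeout_dir` holds one note with optional
    "title" and "textContent" keys (an assumption about the export format).
    """
    rows = []
    for note_file in sorted(Path(takeout_dir).glob("*.json")):
        note = json.loads(note_file.read_text(encoding="utf-8"))
        rows.append((note.get("title", ""), note.get("textContent", "")))

    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["title", "text"])
        writer.writerows(rows)
    return len(rows)
```

The return value is the number of notes written, which makes it easy to check that nothing was skipped.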
Here is an automated solution for this question: a link!
Or just execute these commands in the terminal:
git clone https://github.com/Dmitry9/exportKeep.git;
cd exportKeep;
npm install;
npm run scrape;
After all dependencies are installed (this could take a minute or so), a Chrome instance will navigate to the sign-in page. After you enter your credentials, it will scroll to the bottom of the window to force the browser to load all of the notes into the DOM. Inspect the terminal output to find the path to the saved JSON file.
In the meantime, an API has become available; see here: https://developers.google.com/keep/api/reference/rest
Also, there is an unofficial Python client library for Google Keep (I'm not the author of the library): https://github.com/kiwiz/gkeepapi
I am pretty new to Selenium testing with Electron apps; I know how to use Python to drive Chrome via the webdriver, and how to use Selenium IDE on Firefox, but I am having trouble finding a good source of info.
So far I have an app made with Electron, and I would like to use Selenium to drive it and automate the basics. I did some research and most of the results used Node.js, which I do not know at all. I would like to use Python, so before moving to a whole different language, I would like to ask a bigger audience whether there is already a way to do Selenium testing with Python on Electron apps.
In particular, how do you assign the variable that will contain the Electron app? With the browser I would write:
from selenium import webdriver
driver = webdriver.Chrome('/chromedriver')
but this won't make sense for an Electron app.
I did find a way to attach to the application.
You need to download ChromeDriver and run it on a port of your choice (for example, 8765).
Then you can access the Electron application from Python using:
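from selenium import webdriver

# Connect to the ChromeDriver instance already listening on port 8765.
remote_app = webdriver.remote.webdriver.WebDriver(
    command_executor='http://localhost:8765',
    # 'binary' points ChromeDriver at the Electron app instead of Chrome.
    desired_capabilities={'chromeOptions': {'binary': '/myapp'}},
    browser_profile=None,
    proxy=None,
    keep_alive=False)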
Then you can access the DOM elements of the app as usual. I am not sure whether it works on Windows, OS X and Linux; I will have to try.
Yes, you can do it with driver options and capabilities.
You need to set the binary path and add an argument to the options.
The binary path is the path to the electron executable under '.bin' in your project directory.
The argument path is your project's main directory.
For example:
Let's say your project is under the home directory and named 'ElectronProject'.
The binary path is '/Users/Home/ElectronProject/node_modules/.bin/electron'
The argument path is '/Users/Home/ElectronProject'
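In Python, the binary path and argument described above might be wired up roughly as follows. This is a sketch: the "app=" prefix on the argument follows the convention shown in Electron's Selenium documentation and is an assumption about what this answer intended. The function names are hypothetical.

```python
import os


def electron_launch_settings(project_dir):
    """Derive the Electron binary path and the app argument from the project directory."""
    binary = os.path.join(project_dir, "node_modules", ".bin", "electron")
    # Assumed argument form: tells Electron which app directory to load.
    return binary, "app=" + project_dir


def start_electron(project_dir):
    """Launch the Electron app through ChromeDriver and return the driver."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver

    binary, app_argument = electron_launch_settings(project_dir)
    options = webdriver.ChromeOptions()
    options.binary_location = binary
    options.add_argument(app_argument)
    return webdriver.Chrome(options=options)
```

With the example project, start_electron('/Users/Home/ElectronProject') would return a driver whose DOM calls act on the Electron window, provided the installed ChromeDriver matches the Chromium version bundled with Electron.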
Yes, it is possible. You can refer to the documentation at https://electronjs.org/docs/tutorial/using-selenium-and-webdriver
I want to run a Python file on the web that I have in a GitHub repository. Is it possible to do this?
And by running on the web, I mean putting #!/usr/bin/python and print 'Content-type:text/html\n' in the first two lines.
In general this is not possible. GitHub (Pages) serves only static content (e.g. HTML, CSS, JS). If you want Python to run (e.g. to generate dynamic content), you need a web server capable of running Python; your browser, where the contents of GitHub Pages get downloaded and run, can't do it.
That said, there are experimental ways of running subsets of Python in the browser. For an example, take a look at this question.
If you truly want to generate a complete HTTP response via standard output, then start by reading about CGI and Python's standard cgi module. You'll also need to have access to a CGI-compatible web server, perhaps running on a virtual host.
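For illustration, a complete CGI response written to standard output could look like the sketch below. It reads the query string straight from the environment instead of using the cgi module, which was removed from the standard library in Python 3.13. The "name" parameter and the function name are made up for the example.

```python
#!/usr/bin/env python3
import html
import os
import sys
from urllib.parse import parse_qs


def render_page(environ):
    """Build a complete CGI response (headers, blank line, body) as a string."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    # Escape user input before placing it in the HTML document.
    body = "<html><body><h1>Hello, {}!</h1></body></html>".format(html.escape(name))
    # A CGI response is the header block, one empty line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server sets QUERY_STRING and captures whatever is written here.
    sys.stdout.write(render_page(os.environ))
```

A CGI-capable server has to execute this file; a static host such as GitHub Pages would only serve its source text.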
However, CGI is quite obsolescent as a way to produce dynamic output for the web. @jjwon's suggestion to look at Python-based web application frameworks like Flask is a good one.
I'm trying to automate a web application validation performed by my team. I have chosen Python as the language to do this, although my experience with Python is very limited. I have done similar things in the past using Perl. Now the problem is that after posting the URL of the website, it redirects to a logon page which is built with JavaScript. From what little Python I know, I believe scraping or parsing a website built with JavaScript is not possible. I faced the same issue while doing this with Perl and wasn't able to proceed.
Any pointers or help in resolving the above issue would be highly appreciated.
Thanks
Spynner may help http://code.google.com/p/spynner/
Maybe you can take a look at Selenium. Selenium IDE is a Firefox plugin that enables automation, but Selenium also has a WebDriver system where you can write automation scripts in various languages (including Python), and a server executes the code in various browsers. I never tried the WebDriver part myself, but that should do what you want.
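A sketch of what such a WebDriver script might look like in Python. Because a real browser runs the page's JavaScript, the logon form exists by the time the script looks for it. The field names "username" and "password", the URL and the credentials are all hypothetical; inspect the real page to find the right locators.

```python
def log_in(driver, login_url, username, password):
    """Fill in and submit a logon form that is rendered by JavaScript.

    `driver` is a Selenium WebDriver instance. The field names used below
    are placeholders for whatever the real page uses.
    """
    # Retry element lookups for up to 10 seconds while scripts build the form.
    driver.implicitly_wait(10)
    driver.get(login_url)
    # "name" is the locator strategy that Selenium also exposes as By.NAME.
    driver.find_element("name", "username").send_keys(username)
    password_field = driver.find_element("name", "password")
    password_field.send_keys(password)
    password_field.submit()
    return driver.title


def main():
    # Imported here so log_in can be exercised without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        print(log_in(driver, "http://localhost:8000/login", "alice", "secret"))
    finally:
        driver.quit()
```

After log_in returns, the same driver object can be used to navigate to and check the pages that sit behind the logon.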