Firefox + Selenium WebDriver and download a csv file automatically - python

I have a problem with Selenium WebDriver and Firefox. I want to download a CSV file without the confirmation dialog appearing, and I have code like this:
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList",2)
fp.set_preference("browser.download.dir", download_dir)
fp.set_preference("browser.download.manager.showWhenStarting",False)
fp.set_preference("browser.helperApps.neverAsk.saveToDisk","text/csv")
but it does not seem to work.
I tried many combinations of values for browser.helperApps.neverAsk.saveToDisk:
fp.set_preference("browser.helperApps.neverAsk.saveToDisk","text/csv,application/csv,text/plan,text/comma-separated-values")
or
fp.set_preference("browser.helperApps.neverAsk.saveToDisk","application/csv")
fp.set_preference("browser.helperApps.neverAsk.saveToDisk","text/plain")
fp.set_preference("browser.helperApps.neverAsk.saveToDisk","text/comma-separated-values")
but it makes no difference, and Firefox won't download the file automatically.
How can I fix it?

Sometimes the content type is not what you'd expect.
Use the HttpFox Firefox plugin (or a similar tool) to find the real content type of the file, and use that value in your code.
For me, the content type was application/octet-stream:
fp.set_preference("browser.helperApps.neverAsk.openFile", "application/octet-stream");
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/octet-stream");
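If a browser plugin is not at hand, the content type can sometimes be checked from Python instead. The following is only a sketch under assumptions: the function names are made up for illustration, the server must answer a HEAD request, and the download must not require a logged-in session (otherwise the header seen here may differ from what Firefox receives).

```python
from urllib.request import Request, urlopen


def base_mime_type(content_type_header):
    """Strip parameters such as '; charset=utf-8' from a Content-Type value."""
    return content_type_header.split(";", 1)[0].strip().lower()


def fetch_content_type(url, timeout=10):
    """Request only the headers of url and return its base MIME type.

    Hypothetical helper: some servers reject HEAD or need cookies.
    """
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return base_mime_type(response.headers.get("Content-Type", ""))
```

The value returned by fetch_content_type(download_url) is what would go into browser.helperApps.neverAsk.saveToDisk.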

SetPreference("browser.helperApps.neverAsk.saveToDisk", "application/comma-separated-values ,text/csv"); //in java selenium
This should work for CSV files served with either of these MIME types.

Now (May 2016),
SetPreference("browser.helperApps.neverAsk.saveToDisk", "text/csv"); // C#
works for me
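Pulling the answers above together, the preferences can be kept in one place and the MIME type list joined programmatically. This is a minimal sketch; the helper name firefox_download_prefs is an assumption, and the preference keys are the ones already used in the question.

```python
def firefox_download_prefs(download_dir, mime_types):
    """Return the Firefox preferences discussed above as a plain dict."""
    return {
        # 2 tells Firefox to use the custom directory given below
        "browser.download.folderList": 2,
        "browser.download.dir": download_dir,
        "browser.download.manager.showWhenStarting": False,
        # Comma-separated list of MIME types to save without asking
        "browser.helperApps.neverAsk.saveToDisk": ",".join(
            mime.strip() for mime in mime_types
        ),
    }


# Usage with the profile object from the question (requires selenium):
#   fp = webdriver.FirefoxProfile()
#   for key, value in firefox_download_prefs(download_dir, ["text/csv"]).items():
#       fp.set_preference(key, value)
```

Keeping the MIME types in a list makes it easier to add the type actually reported by the server, such as application/octet-stream.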

Related

Save PDF by using Selenium and IE Browser

Saving a PDF with the Chrome browser does not cause any issues (I'm using these options):
options.add_experimental_option('prefs', {
    'credentials_enable_service': False,
    'plugins': {
        'always_open_pdf_externally': True
    },
    'profile': {
        'password_manager_enabled': False,
    },
    'download': {
        'prompt_for_download': False,
        'directory_upgrade': True,
        'default_directory': ''
    }
})
But how do I save a PDF using webdriver.Ie() (the Internet Explorer driver) with Python + Selenium?
P.S. AFAIK Internet Explorer cannot be run in headless mode, but if someone knows a way to do it, that would be amazing!
You can't use Selenium to deal with the download prompt in IE because it is an OS-level prompt, and Selenium WebDriver cannot automate OS-level prompt windows. You need a third-party tool to download files in IE with Selenium.
Here I use Wget to bypass the download prompt and download the file. You can refer to this article about how to use Wget.
For headless mode in IE with Selenium, you can use a third-party tool called headless_ie_selenium: download it and use headless_ie_selenium.exe instead of IEDriverServer.exe to automate IE.
Sample code to download a PDF file is below; change the paths in the code to your own:
from selenium import webdriver
import time
import os
url = "https://file-examples.com/index.php/sample-documents-download/sample-pdf-download/"
driver = webdriver.Ie('D:\\headless-selenium-for-win-v1-4\\headless_ie_selenium.exe')
driver.get(url)
time.sleep(3)
link = driver.find_elements_by_class_name("download-button")[0]
hrefurl = link.get_attribute("href")
os.system('cmd /c C:\\Wget\\wget.exe -P D:\\Download --no-check-certificate ' + hrefurl)
print("*******************")
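One caveat with the os.system call above: the URL is concatenated into a shell command, so characters such as & or spaces in hrefurl can break it. A possible alternative, sketched here with hypothetical helper names, passes the same Wget flags as an argument list to subprocess.run so that no shell parsing is involved.

```python
import subprocess


def build_wget_command(wget_exe, download_dir, url):
    """Build the Wget argument list with the same flags as the sample above."""
    return [wget_exe, "-P", download_dir, "--no-check-certificate", url]


def download_with_wget(wget_exe, download_dir, url):
    """Run Wget without a shell and return its exit code (0 means success)."""
    command = build_wget_command(wget_exe, download_dir, url)
    return subprocess.run(command, check=False).returncode
```

With the paths from the sample, the call would look like download_with_wget('C:\\Wget\\wget.exe', 'D:\\Download', hrefurl).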

send files to a website through a 'browse files' on pc

I'm browsing a website using dryscrape in Python and I need to upload a file to the site. The only way to do it is to click a button, browse my files, and select the one I want. How can I do this with Python? I would appreciate help using dryscrape, but I'm open to any answer.
Here's the example image:
You can use Selenium. I tested this code and it works.
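from time import sleep
from selenium import webdriver
url = "https://example.com/"
driver = webdriver.Chrome("./chromedriver")
driver.get(url)
input_element = driver.find_element_by_css_selector("input[type=\"file\"]")
# Absolute path to the file to upload
abs_file_path = "/Users/foo/Downloads/bar.png"
# Sending the path to the file input selects the file without opening the OS dialog
input_element.send_keys(abs_file_path)
sleep(5)  # give the upload time to finish before closing the browser
driver.quit()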
Resources
Selenium python
Chrome driver download
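Because send_keys on a file input expects an absolute path, it can help to resolve and check the path first. This is only a small sketch; resolve_upload_path is a hypothetical helper, not part of Selenium.

```python
from pathlib import Path


def resolve_upload_path(path):
    """Return an absolute path string, or raise if the file does not exist."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"No file to upload at {resolved}")
    return str(resolved)
```

The result can then be passed to input_element.send_keys(...) as in the answer above, so a missing file fails early with a clear error.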
For those looking for the answer in dryscrape, I translated the Selenium code to dryscrape:
element = session.at_xpath("xpath...") # session is a dryscrape session
# The XPath points at the "Browse..." button, like the one in the image
element.set("fullpath")
It's as simple as that.

How to read a file downloaded by selenium webdriver in python

I am using Selenium WebDriver in Python to download a CSV file from a site. The file is downloaded into the specified download directory. Here is an overview of my code:
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir",'xx/yy')
fp.set_preference('browser.helperApps.neverAsk.saveToDisk', "text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream")
driver = webdriver.Firefox(fp)
driver.get('url')
I need to print the contents of this CSV to the terminal. Many similar files with random names will be downloaded into the same folder, so accessing the file by name won't work because I don't know in advance what it will be.
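You can get the most recently downloaded file from that location and then read it:
import os
path = "/path/to/folder"  # placeholder for the download directory
# os.listdir returns bare names, so join them with the directory before calling getmtime
files = [os.path.join(path, name) for name in os.listdir(path)]
time_sorted_list = sorted(files, key=os.path.getmtime)
file_name = time_sorted_list[-1]
You can then read from this file. This assumes no parallel processes are writing other files into the folder at the same time.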
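Since the file name is not known in advance, another option is to snapshot the folder before triggering the download and wait for a new, finished file. The sketch below rests on assumptions: the helper names are invented, and it assumes Firefox keeps a sibling file ending in .part next to the target until the download completes.

```python
import os
import time


def finished_files(directory):
    """Names in directory that are neither .part files nor still have one."""
    names = set(os.listdir(directory))
    return {
        name for name in names
        if not name.endswith(".part") and name + ".part" not in names
    }


def wait_for_new_file(directory, known_files, timeout=30.0, poll=0.5):
    """Return the path of a finished file not in known_files, or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        new_names = finished_files(directory) - set(known_files)
        if new_names:
            newest = max(
                new_names,
                key=lambda name: os.path.getmtime(os.path.join(directory, name)),
            )
            return os.path.join(directory, newest)
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

In use, before = finished_files(download_dir) would be taken just before the click that starts the download, and the path returned by wait_for_new_file(download_dir, before) can then be opened and printed. Like the answer above, this does not distinguish between files written by several parallel downloads.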
EDIT:
I just saw the comment that multiple instances are downloading at once, so alternatively you can use urllib and download the file directly from its URL:
from urllib.request import urlretrieve  # Python 3; in Python 2 this was urllib.urlretrieve
urlretrieve("http://www.example.com/yourfile.ext", "your-file-name.ext")  # you can add a unique ID to the file name
This answer combines previous Stack Overflow questions and answers with comments on this post, so thank you, everyone.
I combined Selenium WebDriver and the Python requests module for this solution. I logged into the site using Selenium, copied the cookies from the WebDriver session, and then used requests.get(url, cookies=webdriver_cookies) to get the file.
Here's the gist of my solution:
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir",'xx/yy')
fp.set_preference('browser.helperApps.neverAsk.saveToDisk', "text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream")
driver = webdriver.Firefox(fp)
# selenium login code ...
driver_cookies = driver.get_cookies()
# Copy the browser session's cookies into a plain dict for requests
cookies_copy = {}
for driver_cookie in driver_cookies:
    cookies_copy[driver_cookie["name"]] = driver_cookie["value"]
r = requests.get('url', cookies=cookies_copy)  # needs "import requests" at the top
print(r.text)
I hope that this helps someone
Downloading files in Selenium is never a good idea. You cannot control where and under which filename the file is downloaded, and if you want to find out, then you have to use dirty hacks. It depends on the browser and its settings and if the same file has already been downloaded before or not.
Plus, you have to delete the file after the download, because otherwise numerous copies of the same file will pile up on your hard drive until it is full.
If possible, you should call something like
string downloadUrl = ButtonDownloadPdf.GetAttribute("href");
and then handle the downloading yourself, using conventional methods, not Selenium.
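In Python, "handling the downloading yourself" might look like the following standard-library sketch. The function names are assumptions for illustration; the cookie dicts are expected in the shape returned by Selenium's driver.get_cookies(), and the destination path is fully under your control.

```python
import shutil
from urllib.request import Request, urlopen


def cookie_header(driver_cookies):
    """Join Selenium-style cookie dicts into one Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in driver_cookies)


def download(url, destination, driver_cookies=()):
    """Stream url to destination, sending the given cookies if there are any."""
    headers = {}
    if driver_cookies:
        headers["Cookie"] = cookie_header(driver_cookies)
    request = Request(url, headers=headers)
    with urlopen(request, timeout=30) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination
```

The URL would come from the element's href attribute, as in the line above, and the caller decides the file name and when to delete the file.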

How to set preferences for FireFox in Robot Framework

I'm trying to write a test case in robot framework to download an excel file automatically from a web-site. I want to set preferences for my browser using robot scripts to download files automatically in my desired destination directory without asking me!
I have tried this solution; but it didn't work.
I also tried to set an existing firefox profile as this says which works fine, but I want to be capable of automatically adjusting preferences.
Any idea?
As @Sachin said, I wrote a Python script to set preferences for Firefox as well:
from selenium import webdriver
class WebElement(object):
    @staticmethod
    def create_ff_profile(path):
        fp = webdriver.FirefoxProfile()
        fp.set_preference("browser.download.folderList", 2)
        fp.set_preference("browser.download.manager.showWhenStarting", False)
        fp.set_preference("browser.download.dir", path)
        fp.set_preference("browser.helperApps.neverAsk.saveToDisk", 'application/csv')
        fp.update_preferences()
        return fp
And I used it in a Robot scenario:
*** Settings ***
Library    Selenium2Library
Library    Selenium2LibraryExtensions
Library    OperatingSystem
Library    ../../../Libraries/WebElement.py
*** Variables ***
${profileAddress}    C:\\Users\\user\\AppData\\Roaming\\Mozilla\\Firefox\\Profiles\\VdtJKHal.default
${destinationUrl}    http://www.principlesofeconometrics.com/excel.htm
${browserType}    firefox
${downloadDir}    C:\\Users\\user\\Desktop
${acceptedTypes}    text/csv/xls/xlsx
${itemXpath}    //*[text()="airline"]
*** Test Cases ***
My Test Method
    log to console    Going to open browser with custom firefox profile!
    ${profile} =    create_ff_profile    ${downloadDir}
    Open Browser    ${destinationUrl}    ${browserType}    ff_profile_dir=${profile}
    Maximize Browser Window
    Click Element    xpath=${itemXpath}
    Sleep    10
    Close Browser
But I got the error TypeError: coercing to Unicode: need string or buffer, FirefoxProfile found, raised in the method _make_browser of the library _browsermanagement.py.
I edited the code, removed return fp, and changed the Robot test case like this:
*** Test Cases ***
My Test Method
    log to console    Going to open browser with custome firefox profile!
    create_ff_profile    ${downloadDir}
    Open Browser    ${destinationUrl}    ${browserType}    ff_profile_dir=${profileAddress}
    Maximize Browser Window
    Click Element    xpath=${itemXpath}
    Sleep    10
    Close Browser
It removed the exception and set my preferences as well, but I still need to pass the profile address.
I have written the following Python code to create a profile:
def create_profile(path):
    from selenium import webdriver
    fp = webdriver.FirefoxProfile()
    fp.set_preference("browser.download.folderList", 2)
    fp.set_preference("browser.download.manager.showWhenStarting", False)
    fp.set_preference("browser.download.dir", path)
    fp.set_preference("browser.helperApps.neverAsk.saveToDisk", 'application/csv')
    fp.update_preferences()
I use the above function in a test case as follows:
    ${random_string}    generate random string    3
    ${path}    Catenate    SEPARATOR=\\    ${TEMPDIR}    ${random_string}
    ${profile}=    create_profile    ${path}
    open browser    ${app_url}    ff    ff_profile_dir=${profile}
It saves the excel file to the location specified in the path variable.
Your keyword should return the path to the created Firefox profile:
def create_profile(path):
    from selenium import webdriver
    fp = webdriver.FirefoxProfile()
    fp.set_preference("browser.download.folderList", 2)
    fp.set_preference("browser.download.manager.showWhenStarting", False)
    fp.set_preference("browser.download.dir", path)
    fp.set_preference("browser.helperApps.neverAsk.saveToDisk", 'application/csv')
    fp.update_preferences()
    # Open Browser expects a directory path string, not a FirefoxProfile object
    return fp.path
Only then can you use it:
    ${profile_path}    Create Profile    ${path}
    Open Browser    ${app_url}    ff    ff_profile_dir=${profile_path}
If it is not working, try the MIME type application/octet-stream for the CSV file:
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/octet-stream")
You can return the profile path from the create_profile function and then use it in Open Browser.
Make sure to delete the profile directory in the test or suite teardown.
def create_profile(path):
    from selenium import webdriver
    fp = webdriver.FirefoxProfile()
    fp.set_preference("browser.download.folderList", 2)
    fp.set_preference("browser.download.manager.showWhenStarting", False)
    fp.set_preference("browser.download.dir", path)
    fp.set_preference("browser.helperApps.neverAsk.saveToDisk", 'application/csv')
    fp.update_preferences()
    return fp.path
Use the path in the Open Browser keyword:
    ${random_string}    generate random string    3
    ${path}    Catenate    SEPARATOR=\\    ${TEMPDIR}    ${random_string}
    ${profile_path}=    create_profile    ${path}
    open browser    ${app_url}    ff    ff_profile_dir=${profile_path}
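The teardown cleanup mentioned in this answer could be a second keyword in the same Python library. This is a sketch under assumptions: remove_profile is a hypothetical name, and it simply deletes whatever directory path it is given.

```python
import os
import shutil


def remove_profile(profile_path):
    """Delete a temporary Firefox profile directory.

    Returns True when the directory is gone afterwards, including the case
    where it did not exist in the first place.
    """
    shutil.rmtree(profile_path, ignore_errors=True)
    return not os.path.exists(profile_path)
```

In the Robot file it could be called from a test or suite teardown as Remove Profile with ${profile_path} as its argument, since Robot Framework matches keyword names regardless of case and underscores.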

Using Selenium with Python and PhantomJS to download file to filesystem

I've been grappling with using PhantomJS/Selenium/python-selenium to download a file to the filesystem.
I'm able to easily navigate through the DOM and click, hover, etc. Downloading a file is, however, proving to be quite troublesome. I've tried a headless approach with Firefox and pyvirtualdisplay, but that wasn't working well either and was unbelievably slow. I know that CasperJS allows for file downloads. Does anyone know how to integrate CasperJS with Python, or how to use PhantomJS to download files? Much appreciated.
Although this question is quite old, downloading files through PhantomJS is still a problem. However, we can use PhantomJS to get the download link and collect all the needed cookies, such as CSRF tokens. Then we can use requests to actually download the file:
import requests
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get('page_with_download_link')
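download_link = driver.find_element_by_id('download_link')
# requests needs the URL string, not the WebElement (assumes the element has an href)
download_url = download_link.get_attribute('href')
session = requests.Session()
cookies = driver.get_cookies()
for cookie in cookies:
    session.cookies.set(cookie['name'], cookie['value'])
response = session.get(download_url)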
The actual file content should now be in response.content. We can then write it to disk with open or do whatever we want with it.
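Writing the content to disk could be sketched as follows. The helper names and the fallback file name are assumptions; the file name is taken from the Content-Disposition header when the server sends one, which with requests would be read via response.headers.get("Content-Disposition").

```python
import os
from email.message import Message


def filename_from_content_disposition(header_value, fallback="download.bin"):
    """Extract the filename parameter from a Content-Disposition header value."""
    if not header_value:
        return fallback
    message = Message()
    message["Content-Disposition"] = header_value
    name = message.get_filename()
    # Keep only the last path component so the header cannot point outside the folder
    name = os.path.basename(name) if name else ""
    return name or fallback


def save_bytes(content, directory, filename):
    """Write content (bytes) to directory/filename and return the full path."""
    path = os.path.join(directory, filename)
    with open(path, "wb") as handle:
        handle.write(content)
    return path
```

With the response object from the answer above, this would be used as save_bytes(response.content, target_dir, filename_from_content_disposition(response.headers.get("Content-Disposition"))), where target_dir is a directory of your choice.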
PhantomJS doesn't currently support file downloads. Relevant issues with workarounds:
File download
How to handle file save dialog box using Selenium webdriver and PhantomJS?
As far as I understand, you have at least 3 options:
switch to casperjs (and you should leave python here)
try with headless on xvfb
switch to normal non-headless browsers
Here are also some links that might help too:
Selenium Headless Automated Testing in Ubuntu
XWindows for Headless Selenium (with further links inside)
How to run browsers(chrome, IE and firefox) in headless mode?
Tutorial: How to use Headless Firefox for Scraping in Linux
My use case required a form submission to retrieve the file. I was able to accomplish this using the driver's execute_async_script() function.
js = '''
var callback = arguments[0];
var theForm = document.forms['theFormId'];
data = new FormData();
data.append('eventTarget', "''' + target + '''"); // this is the id of the file clicked
data.append('otherFormField', theForm.otherFormField.value);
var xhr = new XMLHttpRequest();
xhr.open('POST', theForm.action, true);
'''
for cookie in driver.get_cookies():
    js += ' xhr.setRequestHeader("' + cookie['name'] + '", "' + cookie['value'] + '"); '
js += '''
xhr.onload = function () {
callback(this.responseText);
};
xhr.send(data);
'''
driver.set_script_timeout(30)
file = driver.execute_async_script(js)
This is not possible that way. You can use alternatives such as wget or curl to download files.
Use Firefox to find the right request, use Selenium to get the values it needs, and finally use an external tool to download the file:
import subprocess
curlCall = " curl 'http://www_sitex_org/descarga.jsf' -H '...allCurlRequest....' > file.xml"
subprocess.call(curlCall, shell=True)
