I am trying to automate clicking through a website and currently have the following code:
def open_site(self, url, headless=True, **kwargs):
    options = webdriver.ChromeOptions()
    options.add_argument("--incognito")
    options.add_argument("--disable-popup-blocking")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    options.add_argument("--start-maximized")
    options.add_argument("--disable-extensions")
    options.add_argument('--disable-infobars')
    options.add_argument("--disable-web-security")
    options.add_argument("--allow-running-insecure-content")
    options.add_experimental_option('prefs', {
        "download.default_directory": str(Path(self.download_dir).resolve()),  # Change default directory for downloads
        "download.prompt_for_download": False,  # To auto download the file
        "download.directory_upgrade": True,
        "plugins.always_open_pdf_externally": True,  # It will not show PDF directly in chrome
        "download.extensions_to_open": "applications/pdf",
        "profile.default_content_settings.popups": 1
    })
    self.create_webdriver('Chrome', desired_capabilities=options.to_capabilities(), executable_path="/path/to/chrome/driver")
    self.go_to(url=url)
However, when the website opens I get stuck at the point of loading the PDF. It seems as though Chrome fails on the .aspx part. The website does load the PDF in Safari, but Safari can only be used locally. Firefox isn't supported at all by the website, so Chrome is the only option. Any assistance would be appreciated.
Related
I'm building a Python program that uses Selenium and ChromeDriver to download .xml and .pdf files from a website. It runs fine, but when I switch the driver to headless mode, a CORS policy error occurs. I have already tried adding "disable-web-security" and "--disable-site-isolation-trials" after spending a lot of time searching the internet, but still no luck. What am I missing? What am I doing wrong? This is how I set up the Chrome driver:
options = webdriver.ChromeOptions()
options.add_argument('no-sandbox')
options.add_argument('--disable-gpu')
options.add_argument('--disable-extensions')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--disable-blink-features=AutomationControlled')
options.add_argument('--safebrowsing-disable-download-protection')
options.add_argument('--start-maximized')
options.headless = True
extset = ['enable-automation', 'ignore-certificate-errors']
options.add_experimental_option('excludeSwitches', extset)
options.add_experimental_option('useAutomationExtension', False)
options.add_argument('--ignore-certificate-errors')
options.add_argument('--allow-insecure-localhost')
options.add_argument('--safebrowsing-disable-download-protection')
options.add_argument('--disable-web-security')
options.add_argument('--disable-site-isolation-trials')
options.add_argument('safebrowsing-disable-extension-blacklist')
prefs = {
'download.default_directory' : DOWNLOAD_DIRECTORY, # set up download directory
'safebrowsing.enabled': True, # disable xml download asking
'profile.default_content_setting_values.automatic_downloads': 1, # allow multiple files download
'download.prompt_for_download': False
}
if rpa.get('options') == 'AUTO_DOWNLOAD_PDF':
    prefs.update({ 'plugins.plugins_list': [{ 'enabled': False, 'name': 'Chrome PDF Viewer' }] })
    prefs.update({ 'plugins.always_open_pdf_externally': True })
    prefs.update({ 'browser.helperApps.neverAsk.saveToDisk': 'application/pdf,application/vnd.adobe.xfdf,application/vnd.fdf,application/vnd.adobe.xdp+xml' })
options.add_experimental_option('prefs', prefs)
driver = webdriver.Chrome(os.getcwd()+'\\webdriver\\chromedriver.exe', options=options)
I read a lot of answers about setting a download directory for Chrome, but couldn't find anything covering the case where you want to save a file to a specific directory, different from the default one, while using a specific profile.
The below code works fine, and when I download a file, it gets saved to the intended downloadFolderPath:
downloadFolderPath = 'P:\\reports\\attachments'
options = webdriver.ChromeOptions()
preferences = {"download.default_directory": downloadFolderPath,
"directory_upgrade": True,
"safebrowsing.enabled": True }
options.add_experimental_option("prefs", preferences)
options.headless = False
browser = webdriver.Chrome(executable_path='chromedriver.exe', options=options)
# browse to the page, click download, etc.
But when I try to do the same, this time using a specific Chrome profile, by adding user-data-dir and profile-directory arguments:
downloadFolderPath = 'P:\\reports\\attachments'
options = webdriver.ChromeOptions()
options.add_argument(r'--user-data-dir=C:\\Users\\my.name\\AppData\\Local\\Google\\Chrome\\User Data')
options.add_argument(r'--profile-directory=Profile 5')
preferences = {
"download.default_directory": downloadFolderPath,
"directory_upgrade": True,
"safebrowsing.enabled": True
}
options.add_experimental_option("prefs", preferences)
options.headless = False
browser = webdriver.Chrome(executable_path='chromedriver.exe', options=options)
# browse the page and download the file
Chrome indeed gets opened with the chosen profile, but the file that I download gets saved under the user profile default directory, C:\Users\my.name\Downloads instead of the one that I defined (P:\reports\attachments). It seems therefore that the preferences passed with options.add_experimental_option("prefs", preferences) are ignored when a profile different from the default one is used.
Or is there any different argument that needs to be used when working with specific profiles?
For info, I am on Windows 10, Python 3.9, using Chrome 95.0.4638.54 and chromedriver.exe is 95.0.4638.17
Thanks
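One workaround that is sometimes suggested for this kind of situation, and that I have not verified against this exact Chrome and profile combination, is to set the download directory at runtime through the DevTools command Page.setDownloadBehavior instead of relying on prefs. Below is a minimal sketch, assuming the driver object exposes execute_cdp_cmd (Selenium's Chrome driver does). The helper names download_behavior_params and set_download_dir are made up for illustration.

```python
from pathlib import Path


def download_behavior_params(download_dir):
    """Build the parameter dict for the CDP command Page.setDownloadBehavior."""
    # Resolve to an absolute path so the target does not depend on the
    # current working directory of the script.
    return {"behavior": "allow", "downloadPath": str(Path(download_dir).resolve())}


def set_download_dir(driver, download_dir):
    """Ask an already-started Chrome session to save downloads in download_dir.

    `driver` is assumed to be a selenium.webdriver.Chrome instance
    (anything with an execute_cdp_cmd(name, params) method will do).
    """
    params = download_behavior_params(download_dir)
    driver.execute_cdp_cmd("Page.setDownloadBehavior", params)
    return params
```

In the script above this would be called as set_download_dir(browser, downloadFolderPath) right after the browser is created and before the download link is clicked.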
I have written a simple Selenium script that downloads files into a custom directory. The code for Firefox is:
profile = webdriver.FirefoxProfile()
profile.set_preference("browser.download.folderList",2)
# 0 for desktop
# 1 for default download folder
# 2 for specific folder
# You can specify directory by using profile.set_preference("browser.download.dir","<>")
profile.set_preference("browser.download.dir",dest_dir1)
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/octet-stream")
profile.set_preference("browser.helperApps.alwaysAsk.force", False);
# If you don't have some download manager then you can remove these
profile.set_preference("browser.download.manager.showWhenStarting",False)
profile.set_preference("browser.download.manager.useWindow", False);
profile.set_preference("browser.download.manager.focusWhenStarting", False);
profile.set_preference("browser.download.manager.alertOnEXEOpen", False);
profile.set_preference("browser.download.manager.showAlertOnComplete", False);
driver=webdriver.Firefox(firefox_profile=profile,executable_path="geckodriver.exe")
Now, I want to create a similar type of script for chrome. I just want the chrome script to download into the dest_dir1. I don't see any options like webdriver.ChromeProfile similar to webdriver.FirefoxProfile().
from selenium import webdriver
options = webdriver.ChromeOptions()
prefs = {"download.default_directory": "YOUR DOWNLOAD DIRECTORY",
         "download.directory_upgrade": True,
         "download.manager.showWhenStarting": False,
         "download.manager.useWindow": False,
         "helperApps.alwaysAsk.force": False,
         "download.manager.showAlertOnComplete": False}
options.add_experimental_option("prefs", prefs)
# executable_path is the Selenium 3 keyword argument; newer Selenium versions use a Service object instead
driver = webdriver.Chrome(options=options, executable_path="EXECUTABLE CHROME DRIVER PATH")
I have added a few settings as examples; hopefully they show how it works so you can build your own preference options.
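If the same preferences are needed in several scripts, they can be collected in a small helper. This is only a sketch: chrome_download_prefs is a made-up name, and the keys are the ones that appear in the snippets in this thread, not an exhaustive or authoritative list of Chrome preferences.

```python
from pathlib import Path


def chrome_download_prefs(download_dir, open_pdf_externally=True):
    """Return a dict meant for options.add_experimental_option("prefs", ...)."""
    return {
        # An absolute path is the safe choice for the download directory.
        "download.default_directory": str(Path(download_dir).resolve()),
        # Save files without showing the "Save as" dialog.
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        # When True, PDFs are downloaded instead of opened in Chrome's viewer.
        "plugins.always_open_pdf_externally": bool(open_pdf_externally),
    }
```

With Selenium installed, the result would be passed along as options.add_experimental_option("prefs", chrome_download_prefs(dest_dir1)) before the driver is created.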
I need to download a PDF. I use headless mode so the browser window doesn't open, and because the PDF is shown in a viewer, I set the "plugins.always_open_pdf_externally" parameter to True.
To avoid rendering the browser I use options.add_argument("--headless").
If I comment out options.add_argument("--headless") the PDF downloads normally, but if I leave it enabled the download doesn't work.
How can I solve this problem?
parameters:
options = Options()
options.add_argument("--headless")
options.add_argument("--window-size=1920,1080")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
appState = {
"recentDestinations": [
{
"id": "Save as PDF",
"origin": "local"
}
],
"selectedDestinationId": "Save as PDF",
"version": 2
}
profile = {"plugins.plugins_list": [{"enabled": False, "name": "Chrome PDF Viewer"}], # Disable Chrome's PDF Viewer
"download.extensions_to_open": "applications/pdf",
# "plugins.always_open_pdf_externally": True,
"printing.print_preview_sticky_settings.appState": json.dumps(appState)}
options.add_experimental_option("prefs", profile)
driver = webdriver.Chrome(chrome_options=options, executable_path=r'D:\Mega\Raiz\Dados_brcaptura\chromedriver.exe')
print ("Headless Chrome Initialized")
params = {'behavior': 'allow', 'downloadPath': r'C:\Users\dieinimy\Downloads'}
driver.execute_cdp_cmd('Page.setDownloadBehavior', params)
To fire click events in a headless browser you have to set the window size, because without one the headless browser can't work out where to click.
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--window-size=1920,1080')
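A small sketch of building these arguments from numbers, so the width and height cannot be mistyped. headless_arguments is a hypothetical helper, not part of Selenium, and it assumes Chrome's --window-size=WIDTH,HEIGHT format with a comma between the two values.

```python
def headless_arguments(width=1920, height=1080):
    """Return the Chrome command-line arguments for a headless session."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return ["--headless", f"--window-size={int(width)},{int(height)}"]
```

Each item of the returned list would then be passed to options.add_argument(...) in turn.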
I'm using Selenium to download an embedded pdf accessed through many complex layers of logins and other browser actions. I've set up my chromedriver with the following options per instruction from various other posts:
chromedriver = r'C:\Users\cj9250\AppData\Local\Continuum\anaconda3\chromedriver.exe'
download_dir = "C:\\Users\\CJ9250\\Downloads\\" # for linux/*nix, download_dir="/usr/Public"
options = webdriver.ChromeOptions()
profile = {
"plugins.plugins_list": [{"enabled": False, "name": "Chrome PDF Viewer"}],
"download.default_directory": download_dir ,
"download.extensions_to_open": "applications/pdf",
"plugins.always_open_pdf_externally": True,
"download.prompt_for_download": False,
"safebrowsing.enabled": True
}
options.add_experimental_option("prefs", profile)
browser = webdriver.Chrome(chromedriver, chrome_options=options)
However, I get this box that I have to click before it downloads to my specified directory:
The 'Open' element doesn't have an xpath that I can find through the inspector. I'm guessing that this is some kind of internal security setting for the ChromeDriver but I can't find a way past it.
My end goal is just to download an embedded PDF in an open Selenium test page; this seemed the only suggested course of action.
reportSho.do OPEN
I'm not sure why it worked, but I modified my profile variable to:
profile = {
"plugins.plugins_list": [{"enabled": False, "name": "Chrome PDF Viewer"}], # Disable Chrome's PDF Viewer
"download.default_directory": download_dir ,
"download.extensions_to_open": "applications/pdf",
"safebrowsing.enabled": False
}
From there I was able to get a frame element on the page with the open button. One of the element attributes had a URL. When I instructed the browser to go to the URL, it downloaded the file to my specified directory!
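That last step could look roughly like the sketch below. It assumes the frame is an iframe and that the attribute holding the URL is src; the post does not say which element or attribute it was, so both are guesses to adjust. frame_download_url and download_from_frame are made-up names.

```python
from urllib.parse import urljoin


def frame_download_url(page_url, attribute_value):
    """Turn the value of a frame attribute into an absolute URL."""
    if not attribute_value:
        raise ValueError("the frame attribute is empty")
    # urljoin leaves absolute URLs unchanged and resolves relative ones
    # against the address of the page that contains the frame.
    return urljoin(page_url, attribute_value)


def download_from_frame(driver, tag_name="iframe", attribute="src"):
    """Navigate the browser straight to the URL found on a frame element.

    `driver` is assumed to be a Selenium WebDriver. "tag name" is the
    locator strategy string that selenium's By.TAG_NAME stands for.
    """
    frame = driver.find_element("tag name", tag_name)
    url = frame_download_url(driver.current_url, frame.get_attribute(attribute))
    driver.get(url)
    return url
```

With the download preferences from the profile above in place, calling download_from_frame(browser) would point Chrome at the file directly instead of going through the frame's Open button.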