With Selenium I am trying to download a file (in order to verify its content), using the following code as a proof of concept:
import time
from selenium import webdriver
profile = webdriver.FirefoxProfile()
#Set Location to store files after downloading.
profile.set_preference("browser.download.dir", "/tmp")
profile.set_preference("browser.download.folderList", 2)
#Set Preference to not show file download confirmation dialogue using MIME types Of different file extension types.
#profile.set_preference("browser.helperApps.neverAsk.saveToDisk",
# "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;")
profile.set_preference("browser.download.manager.showWhenStarting", False )
profile.set_preference("pdfjs.disabled", True )
profile.set_preference("browser.helperApps.neverAsk.saveToDisk","application/zip")
profile.set_preference("plugin.disable_full_page_plugin_for_types", "application/zip")
browser = webdriver.Firefox(profile)
browser.implicitly_wait(10)
browser.get('https://www.thinkbroadband.com/download')
time.sleep(15)
elem = browser.find_element_by_xpath('//a[@href="http://ipv4.download.thinkbroadband.com/5MB.zip"]')
elem.click()
time.sleep(15)
However, nothing 'happens' (i.e. the download is not performed), and also no error message is shown. When I click on that download link manually, the test-file is being downloaded into /tmp.
Is there anything I am missing?
The issue could be that the click in this case needs to go through a child element:
elem = browser.find_element_by_xpath('//a[@href="http://ipv4.download.thinkbroadband.com/5MB.zip"]/img')
elem.click()
Otherwise, note that when you click a link the browser checks the target in the background, and the site itself seems to have a problem: when I open the target link directly in a browser I get an empty response.
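Since the stated goal is to verify the downloaded content, a check along the following lines could tell a real archive apart from an empty response. This is only a sketch using the standard library; the function name zip_is_intact and the example path are assumptions for illustration, not part of the original code.

```python
import zipfile


def zip_is_intact(path):
    """Return True if `path` is a readable zip archive whose members pass their CRC checks."""
    # An empty or truncated file is rejected here.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        return archive.testzip() is None


# Hypothetical usage once the browser has saved the file:
# zip_is_intact("/tmp/5MB.zip")
```

An empty response saved to disk would make this return False, which matches the symptom described above.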
Related
I am trying to scrape a website and download all of its webpages as .html files (including all the HTML assets) so that each locally saved page opens just as it does on the server.
Currently using Selenium, Chrome Webdriver, and Python.
Approach:
I tried updating the prefs of the Chrome browser and then logging in to the website. After logging in, I want to save the webpage the same way we do manually by pressing Ctrl + S.
The code below opens the page I want to download, but it neither suppresses the Windows "Save As" pop-up nor saves the page to the specified path.
from selenium import webdriver
import pyautogui
chrome_options = webdriver.ChromeOptions()
preferences = {
"download.default_directory":"C:\\Users\\pathtodir",
"download.prompt_for_download": False,
"download.directory_upgrade": True,
"safebrowsing.enabled": True
}
chrome_options.add_experimental_option("prefs", preferences)
driver = webdriver.Chrome(options=chrome_options)
driver.get(***URL to the website***)
driver.find_element("xpath", '//*[@id="id_username"]').send_keys('username')
driver.find_element("xpath", '//*[@id="id_password"]').send_keys('password')
driver.find_element("xpath", '//*[@id="datagrid-0"]/div[2]/div[1]/div[1]/table/tbody/tr[1]/td[2]/a').click()
pyautogui.hotkey('ctrl', 's')
pyautogui.typewrite('hello1' + '.html')
pyautogui.hotkey('enter')
Can somebody please help me to understand what I am doing wrong? Please suggest if there is any other alternative library that can be used in python.
To save a page, first obtain the HTML behind the webpage from the driver's page_source attribute.
Then open a file with the codecs.open method, in write mode (w) and with utf-8 as the encoding, and use the write method to store the content obtained from page_source.
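import codecs
import os
from selenium import webdriver
driver = webdriver.Chrome(executable_path="path to chromedriver.exe")
driver.implicitly_wait(0.5)
driver.get(***URL to the website***)
# Grab the HTML of the page that is currently loaded.
h = driver.page_source
# Raw string so the backslash in the Windows path is taken literally.
n = os.path.join(r"C:\ANYPATH", "Page.html")
# Write the HTML as UTF-8; the with-block closes the file.
with codecs.open(n, "w", "utf-8") as f:
    f.write(h)
driver.quit()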
I was able to fix the issue. The problem was that my code quit before the browser was able to download the file; adding time.sleep() fixed it.
Updated code:
import time
from selenium import webdriver
import pyautogui
driver.get(***URL to the website***)
driver.find_element("xpath", '//*[@id="id_username"]').send_keys('username')
driver.find_element("xpath", '//*[@id="id_password"]').send_keys('password')
driver.find_element("xpath", '//*[@id="datagrid-0"]/div[2]/div[1]/div[1]/table/tbody/tr[1]/td[2]/a').click()
FILE_NAME = r'C:\ANYPATH\Page.html'
pyautogui.typewrite(FILE_NAME)
pyautogui.press('enter')
time.sleep(10)
driver.quit()
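A fixed time.sleep(10) works but waits either too long or not long enough. As a hedged alternative, the script could poll the target directory until a finished file appears. The sketch below uses only the standard library; the helper name wait_for_download is hypothetical, and it assumes the directory starts out empty and that unfinished downloads carry a .crdownload (Chrome) or .part (Firefox) suffix.

```python
import os
import time


def wait_for_download(directory, timeout=30.0, poll_interval=0.5):
    """Block until `directory` holds a finished file and no partial ones.

    Returns the sorted list of finished file names, or raises TimeoutError.
    Assumes the directory was empty before the download started.
    """
    partial_suffixes = (".crdownload", ".part")
    deadline = time.monotonic() + timeout
    while True:
        names = sorted(os.listdir(directory))
        in_progress = [n for n in names if n.endswith(partial_suffixes)]
        finished = [n for n in names if not n.endswith(partial_suffixes)]
        if finished and not in_progress:
            return finished
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"no finished download in {directory!r} after {timeout}s"
            )
        time.sleep(poll_interval)
```

It would be called in place of time.sleep(10), just before driver.quit(), with the folder the file is saved to.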
I am having trouble figuring out the next step: I am trying to download a PDF file from a website and am getting stuck.
"https://www.southtechhosting.com/SanJoseCity/CampaignDocsWebRetrieval/Search/SearchByElection.aspx"
Page with Links to PDF Files
PDF file to download
I was able to click on the pdf link from the "Page with Links" using Selenium & ChromeDriver but then I get a popup form instead of a download.
I tried disabling the Chrome PDF Viewer ("plugins.plugins_list":[{"enabled":False,"name":"Chrome PDF Viewer"}]), but that doesn't work.
The popup form (viewed in "PDF file to download") has a hover link to download the pdf file. I've tried ActionChains(), but I get this exception after running this line:
from selenium.webdriver.common.action_chains import ActionChains
element_to_hover = driver.find_element_by_xpath("//paper-icon-button[@id='download']")
hover = ActionChains(driver).move_to_element(element_to_hover)
hover.perform()
Looking for the most efficient way to download pdf files in this type of situation. Thanks!
Please try this:
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys
chromeOptions = webdriver.ChromeOptions()
# Download PDFs instead of opening them in Chrome's built-in viewer.
prefs = {"plugins.always_open_pdf_externally": True}
chromeOptions.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(chrome_options=chromeOptions)
driver.get('https://www.southtechhosting.com/SanJoseCity/CampaignDocsWebRetrieval/Search/SearchByElection.aspx')
# Code to open the pop-up
driver.find_element_by_xpath('//*[@id="ctl00_DefaultContent_ASPxRoundPanel1_btnFindFilers_CD"]').click()
driver.find_element_by_xpath('//*[@id="ctl00_GridContent_gridFilers_DXCBtn0"]').click()
driver.find_element_by_xpath('//*[@id="ctl00_DefaultContent_gridFilingForms_DXCBtn0"]').click()
# The PDF link lives inside the pop-up's iframe.
driver.switch_to.frame(driver.find_element_by_tag_name('iframe'))
a = driver.find_element_by_link_text("Click here")
# Ctrl+click the link so the PDF is fetched without leaving the page.
ActionChains(driver).key_down(Keys.CONTROL).click(a).key_up(Keys.CONTROL).perform()
UPDATE:
To exit the popup, you can try this:
driver.switch_to.default_content()
driver.find_element_by_xpath('//*[@id="ctl00_GenericPopupSizeable_InnerPopupControl_HCB-1"]/img').click()
I am trying to download PDF files using Firefox in Selenium, but the preferences I have set below do not seem to be working. Whenever I run the code, I still get the "You have chosen to open:" dialog box even though the preferences state that PDF files should be downloaded automatically.
Am I missing something?
def setUp(self):
    downloads_folder = initialSearch.download_path(self)
    profile = webdriver.FirefoxProfile()
    profile.set_preference("browser.download.manager.showWhenStarting", False)
    profile.set_preference("browser.download.folderList", 2)
    profile.set_preference("browser.download.dir", downloads_folder)
    profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf")
    self.driver = webdriver.Firefox(profile)
Try calling profile.update_preferences() before
webdriver.Firefox(profile)
It may help.
I currently have a script that will log on to my company's wiki, visit a page, and select a "download to PDF" option available on the page. However, when this option is chosen, a dialogue box pops up asking me to tell Firefox what to do with the file. I just need Selenium to interact with it and hit the "OK" button.
I'm not sure how to inspect this window for elements, and I am in need of direction. Any documentation helps.
from splinter import Browser
browser = Browser()
browser.visit('https://company.wiki.com')
browser.find_by_id('login-link').click()
browser.fill('os_username', 'user')
browser.fill('os_password', 'pass')
browser.find_by_name('login').click()
browser.visit('https://pageoncompany.wiki.com')
browser.find_by_xpath('//*[@id="navigation"]/ul/li[4]').click()
browser.find_by_id('action-export-pdf-link').click()
I was able to set the preferences through the web browser, then call my profile:
browser = Browser('firefox', profile=r'C:\Users\craab\AppData\Roaming\Mozilla\Firefox\Profiles\0lot9hun.default')
You can set preferences to prevent the download popup from appearing and to save the file to a pre-defined folder.
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2) # custom folder as set by repo
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", <download_folder_path>)
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", content_type)
# Enable auto download, Avoid popup during downloads
fp.set_preference("browser.download.panel.shown", False)
fp.set_preference("browser.helperApps.neverAsk.openFile", content_type)
driver = webdriver.Firefox(fp)
I am trying to extract an image. I successfully triggered the "Save Image as..." dialog but couldn't send any keys to it. Is there a way to solve this problem?
driver = webdriver.Firefox()
actions = webdriver.ActionChains(driver)
actions.move_to_element(img).context_click(img).send_keys('v').perform()
time.sleep(2)
# and this line does not work
actions.send_keys('image.jpg').perform()
I am only one step away from making everything work. What should I do?
This is a kind of popup you cannot control with selenium.
In this case you need to ask the browser to save the file automatically by tweaking its preferences (aka desired capabilities):
profile = webdriver.FirefoxProfile()
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.dir", "/path/to/file")
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "image/jpeg")
driver = webdriver.Firefox(firefox_profile=profile)
where browser.helperApps.neverAsk.saveToDisk setting value should have a mime-type (or a comma-separated list of mime-types) of the files that should be downloaded automatically.
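As a rough sketch, the same preferences can be collected in a plain dict so that the list of MIME types is joined in one place. The helper name firefox_download_prefs is hypothetical, and the second MIME type is only an example; the preference names are the ones used above.

```python
def firefox_download_prefs(download_dir, mime_types):
    """Return the Firefox download preferences as a dict of name -> value."""
    return {
        "browser.download.folderList": 2,  # 2 = use the custom directory below
        "browser.download.manager.showWhenStarting": False,
        "browser.download.dir": download_dir,
        # Comma-separated MIME types that are saved without asking.
        "browser.helperApps.neverAsk.saveToDisk": ",".join(mime_types),
    }


prefs = firefox_download_prefs("/path/to/file", ["image/jpeg", "image/png"])

# With Selenium available, the dict could be applied to a profile like this:
# profile = webdriver.FirefoxProfile()
# for name, value in prefs.items():
#     profile.set_preference(name, value)
# driver = webdriver.Firefox(firefox_profile=profile)
```

Keeping the values in a dict also makes it easy to print them when a download dialog still appears and the preferences need checking.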
See also:
Access to file download dialog in Firefox