Cannot download csv using pyvirtualdisplay + selenium + firefox on Pythonanywhere

Cannot download csv using pyvirtualdisplay + selenium + firefox on Pythonanywhere - python

I am trying to host my selenium script on Pythonanywhere.
However, I cannot see any .csv being downloaded through my code.
I have searched around for a while. such a headache!
Any help would be greatly appreciated!
from pyvirtualdisplay import Display
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.common.by import By
from os import getcwd
with Display():
profile = webdriver.FirefoxProfile()
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.dir", getcwd())
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/csv")
browser = webdriver.Firefox(firefox_profile=profile)
try:
browser.get("https://track.cruxsystems.com/login")
browser.implicitly_wait(30)
WebDriverWait(browser, 50).until(
expected_conditions.element_to_be_clickable(
(By.XPATH, '//button[#uib-tooltip="Download"]')))
browser.find_elements_by_xpath("//button[#uib-tooltip='Download']")[0].click()
time.sleep(30)
finally:
browser.quit()
print('finished')
I have a screenshot after clicking the download button and it shows the pre-downloading loader and seems like it's about to download the file. However, nothing has been downloaded after.

Related

Python, selenium find_element_by_link_text not working

I am trying to scrape a website. Where in I have to press a link. for this purpose, I am using selenium library with chrome drive.
from selenium import webdriver
url = 'https://sjobs.brassring.com/TGnewUI/Search/Home/Home?partnerid=25222&siteid=5011&noback=1&fromSM=true#Applications'
browser = webdriver.Chrome()
browser.get(url)
time.sleep(3)
link = browser.find_element_by_link_text("Don't have an account yet?")
link.click()
But it is not working. Any ideas why it is not working? Is there a workaround?

You can get it done in several ways. Here is one of such. I've used driver.execute_script() command to force the clicking. You should not go for hardcoded delay as they are very inconsistent.
Modified script:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium.webdriver.support import expected_conditions as EC
url = 'https://sjobs.brassring.com/TGnewUI/Search/Home/Home?partnerid=25222&siteid=5011&noback=1&fromSM=true#Applications'
driver = webdriver.Chrome()
driver.get(url)
item = wait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "a[ng-click='newAccntScreen()']")))
driver.execute_script("arguments[0].click();",item)

The file download path setting in python selenium chrome headless does not apply

I am a web developer in Korea. We've recently been using this Python to implement the website crawl feature.
I'm new to Python. We looked for a lot of things for about two days, and we applied them. Current issues include:
Click the Excel download button to display a new window (pop up).
Clicking Download in the new window opens a new tab in the parent window and shuts down all browsers down as soon as the download starts.
Download page is PHP and data is set to Excel via header so that browser automatically recognizes download.
The problem is that the browser has shut down and the download is not complete, nor is the file saved.
I used the following source code.
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
chrome_driver = './browser_driver/chromedriver'
options = webdriver.ChromeOptions()
options.add_argument('--headless')
download_path = r"C:\Users\files"
timeout = 10
driver = webdriver.Chrome(executable_path=chrome_driver, chrome_options=options)
driver.command_executor._commands["send_command"] = (
"POST", '/session/$sessionId/chromium/send_command')
params = {'cmd': 'Page.setDownloadBehavior',
'params': {'behavior': 'allow', 'downloadPath': download_path}}
command_result = driver.execute("send_command", params)
driver.get("site_url")
#download new window
down_xls_btn = driver.find_element_by_id("download")
down_xls_btn.click()
driver.switch_to_window(driver.window_handles[1])
#download start
down_xls_btn = driver.find_element_by_id("download2")
down_xls_btn.click()
The browser itself shuts down as soon as the download is started during testing without headless mode.
The headless mode does not download the file itself.
Annotating a DevTools source related to Page.setDownloadBehavior removes the shutdown but does not change the download path.
I am not good at English, so I translated it into a translator. It's too hard because I'm a beginner. Please help me.
I just tested it with the Firefox web browser.
Firefox, unlike Chrome, shows a download window in a new form rather than a new tab, which runs an automatic download and closes the window automatically.
There is a problem here.
In fact, the download was successful even in headless mode in the Firefox.
However, the driver of the previously defined driver.get() was not recognized when the new window was closed.
import os
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.firefox.options import Options
import json
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir",download_path)
fp.set_preference("browser.helperApps.neverAsk.saveToDisk","application/octet-stream, application/vnd.ms-excel")
fp.set_preference("dom.webnotifications.serviceworker.enabled",False)
fp.set_preference("dom.webnotifications.enabled",False)
timeout = 10
driver = webdriver.Firefox(executable_path=geckodriver, firefox_options=options, firefox_profile=fp)
driver.get(siteurl)
down_btn = driver.find_element_by_xpath('//*[#id="searchform"]/div/div[1]/div[6]/div/a[2]')
down_btn.click()
#down_btn Click to display a new window
#Automatic download starts in new window and closes window automatically
driver.switch_to_window(driver.window_handles[0])
#window_handles Select the main window and output the table to output an error.
print(driver.title)
Perhaps this is the same problem as the one we asked earlier.
Since the download is currently successful in the Firefox, we have written code to define a new driver and proceed with postprocessing.
Has anyone solved this problem?

I came across the same issue and I managed to solve it that way:
After you switch to the other window, you should enable the download again:
Isolate this code into a function
def enable_download_in_headless_chrome(driver, download_path):
driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
params = {
'cmd': 'Page.setDownloadBehavior',
'params': {'behavior': 'allow', 'downloadPath': download_path}
}
driver.execute("send_command", params)
Call it whenever you need to download a file from another window.
Your code will then be:
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
chrome_driver = './browser_driver/chromedriver'
options = webdriver.ChromeOptions()
options.add_argument('--headless')
download_path = r"C:\Users\files"
timeout = 10
driver = webdriver.Chrome(executable_path=chrome_driver, chrome_options=options)
enable_download_in_headless_chrome(driver, download_path)
driver.get("site_url")
#download new window
down_xls_btn = driver.find_element_by_id("download")
down_xls_btn.click()
driver.switch_to_window(driver.window_handles[1])
enable_download_in_headless_chrome(driver, download_path) # THIS IS THE MISSING AND SUPER IMPORTANT PART
#download start
down_xls_btn = driver.find_element_by_id("download2")
down_xls_btn.click()

How to click and download with python selenium

I try to download CSV data from GoogleTrend by selenium(python).
In previous, I tried to print source page and extract data that I want later.
It worked for some period, but now it does not work.
I try to click download button to got CSV file but nothing happen.
Do you have any idea for this case?
I got button path from firebug+firepath (firefox plugin).
html/body/div[2]/div[2]/div/md-content/div/div/div[1]/trends-widget/ng-include/widget/div/div/div/widget-actions/div/button[1]
I try on chrome driver and firefox drive.
This code; put 1 (word)argument that you want to get trend of search.
import sys
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
def run_text_extract(search_word):
try:
print(search_word)
driver = webdriver.Firefox('/home/noah/Desktop/Google_Trend_downloader/chromedriver/geckodriver')
# driver = webdriver.Chrome('/home/noah/Desktop/Google_Trend_downloader/chromedriver')
driver.get("https://trends.google.com/trends/explore?date=all&geo=TH&q="+search_word)
driver.find_element_by_xpath('html/body/div[2]/div[2]/div/md-content/div/div/div[1]/trends-widget/ng-include/widget/div/div/div/widget-actions/div/button[1]').click()
try:
driver.manage().deleteAllCookies()
clear_cache(driver)
except TimeoutException as ex:
isrunning = 0
print("Exception has been thrown. " + str(ex))
print("Timeout line is", line ,".")
driver.close()
except Exception:
print ("Here 5")
pass
time.sleep(2)
driver.close()
print("======== END_OF_FILE ===============")
except:
pass
if name == 'main':
run_text_extract(sys.argv[1])
time.sleep(8)
# run_text_extract()

I have navigated to the link you have provided.
If you search for any term, you can see download csv button link will appear at the right side. But there will be 3 download csv buttton links with the same class or css selector are present. So you need to collect all the elements and loop through it so that you can click on specific element. In your case, I assume you want to click on first element. so below code should work. If you want 2nd or 3rd element to click change the index accordingly.
def run_text_extract(search_word):
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
import time
profile = webdriver.FirefoxProfile()
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.dir", 'C:\\Python27')
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/csv")
driver = webdriver.Firefox(firefox_profile=profile,executable_path=r'C:\\Python27\\geckodriver.exe')
driver.get("https://trends.google.com/trends/explore?date=all&geo=TH&q="+ search_word)
time.sleep(7)
lst = driver.find_elements_by_css_selector(".widget-actions-item.export")
lst[0].click()
run_text_extract("selenium")

Need to download mutilple PDF link in one go from web table [duplicate]

I want to download csv files from "https://clinicaltrials.gov/ct2/results?cond=&term=lomitapide&cntry1=&state1=&SearchAll=Search+all+studies&recrs=" website
I am using python and selenium script as written below:But i get the exception "ElementNotInteractableException" and unable to download the page
from selenium import webdriver
fp=webdriver.FirefoxProfile()
fp.set_preference("browser.helperApps.neverAsk.saveToDisk","text/csv")
browser = webdriver.Firefox(fp)
browser.get("https://clinicaltrials.gov/ct2/results?cond=&term=lomitapide&cntry1=&state1=&SearchAll=Search+all+studies&recrs=")
browser.find_element_by_id("submit-download-list")

Here is the Answer to your Question:
The element you referred as find_element_by_id("submit-download-list") actually downloads a PDF file. So for the benefit of future programmers and readers of this question/post/thread/discussion, you may consider to change your question header to Download and Save PDF file using selenium and python from popup
Here is the code block to Download and Save PDF file using selenium and python from popup:
import os
from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
binary = FirefoxBinary('C:\\Program Files\\Mozilla Firefox\\firefox.exe')
newpath = 'C:\\home\\DebanjanB'
if not os.path.exists(newpath):
os.makedirs(newpath)
profile = webdriver.FirefoxProfile()
profile.set_preference("browser.download.dir",newpath)
profile.set_preference("browser.download.folderList",2)
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/plain,text/x-csv,text/csv,application/vnd.ms-excel,application/csv,application/x-csv,text/csv,text/comma-separated-values,text/x-comma-separated-values,text/tab-separated-values,application/pdf")
profile.set_preference("browser.download.manager.showWhenStarting",False)
profile.set_preference("browser.helperApps.neverAsk.openFile","text/plain,text/x-csv,text/csv,application/vnd.ms-excel,application/csv,application/x-csv,text/csv,text/comma-separated-values,text/x-comma-separated-values,text/tab-separated-values,application/pdf")
profile.set_preference("browser.helperApps.alwaysAsk.force", False)
profile.set_preference("browser.download.manager.useWindow", False)
profile.set_preference("browser.download.manager.focusWhenStarting", False)
profile.set_preference("browser.helperApps.neverAsk.openFile", "")
profile.set_preference("browser.download.manager.alertOnEXEOpen", False)
profile.set_preference("browser.download.manager.showAlertOnComplete", False)
profile.set_preference("browser.download.manager.closeWhenDone", True)
profile.set_preference("pdfjs.disabled", True)
caps = DesiredCapabilities.FIREFOX
browser = webdriver.Firefox(firefox_profile=profile, capabilities=caps, firefox_binary=binary, executable_path='C:\\Utility\\BrowserDrivers\\geckodriver.exe')
browser.maximize_window()
browser.get("https://clinicaltrials.gov/ct2/results?cond=&term=lomitapide&cntry1=&state1=&SearchAll=Search+all+studies&recrs=")
browser.find_element_by_id("save-list-link").click()
download_link = WebDriverWait(browser, 10).until(
EC.presence_of_element_located((By.XPATH, "//input[#id='submit-download-list']"))
)
download_link.click()
Let me know if this Answers your Question.

You are getting the exception ElementNotInteractableException because the element will be accessible after once the popup opens up. You are missing to click the download link which opens up the popup. please try the following,
from selenium import webdriver
fp=webdriver.FirefoxProfile()
fp.set_preference("browser.helperApps.neverAsk.saveToDisk","text/csv")
browser = webdriver.Firefox(fp)
browser.get("https://clinicaltrials.gov/ct2/results?cond=&term=lomitapide&cntry1=&state1=&SearchAll=Search+all+studies&recrs=")
browser.find_element_by_id("save-list-link").click()
browser.find_element_by_id("submit-download-list")

Download and save multiple csv files using selenium and python from popup

I want to download csv files from "https://clinicaltrials.gov/ct2/results?cond=&term=lomitapide&cntry1=&state1=&SearchAll=Search+all+studies&recrs=" website
I am using python and selenium script as written below:But i get the exception "ElementNotInteractableException" and unable to download the page
from selenium import webdriver
fp=webdriver.FirefoxProfile()
fp.set_preference("browser.helperApps.neverAsk.saveToDisk","text/csv")
browser = webdriver.Firefox(fp)
browser.get("https://clinicaltrials.gov/ct2/results?cond=&term=lomitapide&cntry1=&state1=&SearchAll=Search+all+studies&recrs=")
browser.find_element_by_id("submit-download-list")

Here is the Answer to your Question:
The element you referred as find_element_by_id("submit-download-list") actually downloads a PDF file. So for the benefit of future programmers and readers of this question/post/thread/discussion, you may consider to change your question header to Download and Save PDF file using selenium and python from popup
Here is the code block to Download and Save PDF file using selenium and python from popup:
import os
from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
binary = FirefoxBinary('C:\\Program Files\\Mozilla Firefox\\firefox.exe')
newpath = 'C:\\home\\DebanjanB'
if not os.path.exists(newpath):
os.makedirs(newpath)
profile = webdriver.FirefoxProfile()
profile.set_preference("browser.download.dir",newpath)
profile.set_preference("browser.download.folderList",2)
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/plain,text/x-csv,text/csv,application/vnd.ms-excel,application/csv,application/x-csv,text/csv,text/comma-separated-values,text/x-comma-separated-values,text/tab-separated-values,application/pdf")
profile.set_preference("browser.download.manager.showWhenStarting",False)
profile.set_preference("browser.helperApps.neverAsk.openFile","text/plain,text/x-csv,text/csv,application/vnd.ms-excel,application/csv,application/x-csv,text/csv,text/comma-separated-values,text/x-comma-separated-values,text/tab-separated-values,application/pdf")
profile.set_preference("browser.helperApps.alwaysAsk.force", False)
profile.set_preference("browser.download.manager.useWindow", False)
profile.set_preference("browser.download.manager.focusWhenStarting", False)
profile.set_preference("browser.helperApps.neverAsk.openFile", "")
profile.set_preference("browser.download.manager.alertOnEXEOpen", False)
profile.set_preference("browser.download.manager.showAlertOnComplete", False)
profile.set_preference("browser.download.manager.closeWhenDone", True)
profile.set_preference("pdfjs.disabled", True)
caps = DesiredCapabilities.FIREFOX
browser = webdriver.Firefox(firefox_profile=profile, capabilities=caps, firefox_binary=binary, executable_path='C:\\Utility\\BrowserDrivers\\geckodriver.exe')
browser.maximize_window()
browser.get("https://clinicaltrials.gov/ct2/results?cond=&term=lomitapide&cntry1=&state1=&SearchAll=Search+all+studies&recrs=")
browser.find_element_by_id("save-list-link").click()
download_link = WebDriverWait(browser, 10).until(
EC.presence_of_element_located((By.XPATH, "//input[#id='submit-download-list']"))
)
download_link.click()
Let me know if this Answers your Question.

You are getting the exception ElementNotInteractableException because the element will be accessible after once the popup opens up. You are missing to click the download link which opens up the popup. please try the following,
from selenium import webdriver
fp=webdriver.FirefoxProfile()
fp.set_preference("browser.helperApps.neverAsk.saveToDisk","text/csv")
browser = webdriver.Firefox(fp)
browser.get("https://clinicaltrials.gov/ct2/results?cond=&term=lomitapide&cntry1=&state1=&SearchAll=Search+all+studies&recrs=")
browser.find_element_by_id("save-list-link").click()
browser.find_element_by_id("submit-download-list")

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Cannot download csv using pyvirtualdisplay + selenium + firefox on Pythonanywhere - python

Related

Python, selenium find_element_by_link_text not working

The file download path setting in python selenium chrome headless does not apply

How to click and download with python selenium

Need to download mutilple PDF link in one go from web table [duplicate]

Download and save multiple csv files using selenium and python from popup

Categories

Resources