I am using Selenium WebDriver to automate IE11 and am stuck on automatically downloading a file while the screen is locked.
The download starts after pressing a button. That button does not link to a URL for the file; it seems to call a JavaScript function. I have everything working up to pressing the button, but I am stuck on IE11's bottom notification bar (Open or Save). For security reasons I have to use IE11 and run the automation with the screen locked. I have tried sending Alt+S using WScript.Shell, but it only seems to work after I unlock the screen. Here's what I've tried:
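import time
from pathlib import Path
import win32com.client as win32
shell = win32.Dispatch("WScript.Shell")
config = Path(localpath + localfile)
# Keep sending Alt+S ("%s") until the downloaded file exists
while not config.is_file():
    shell.SendKeys("%s", 0)
    time.sleep(2)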
Is there a way to bypass the IE prompt and automatically save the file to the download folder while the screen is locked?
As far as I know, WebDriver cannot access the download dialog that IE shows when you click a download link or button. However, we can bypass the dialog by using a separate program called "wget".
We can read the file's URL with WebDriver and then run wget from the command line to download it. The code is below (it is C#, but you could convert it to Python):
var options = new InternetExplorerOptions()
{
    InitialBrowserUrl = URL, // "http://demo.guru99.com/test/yahoo.html";
    IntroduceInstabilityByIgnoringProtectedModeSettings = true
};
// IE_DRIVER_PATH: @"D:\Downloads\webdriver\IEDriverServer_x64_3.14.0";
var driver = new InternetExplorerDriver(IE_DRIVER_PATH, options);
driver.Navigate().GoToUrl(URL);
Thread.Sleep(5000);
// Get the download file link.
String sourceLocation = driver.FindElementById("messenger-download").GetAttribute("href");
Console.WriteLine(sourceLocation);
// Run wget through cmd.exe to download the file.
// "/c" tells cmd.exe to run the command that follows and then exit.
String wget_command = @"/c D:\temp\wget.exe -P D:\temp --no-check-certificate " + sourceLocation;
try
{
    Process exec = Process.Start("CMD.exe", wget_command);
    exec.WaitForExit();
    Console.WriteLine("success");
}
catch (Exception ex)
{
    Console.WriteLine(ex.Message);
}
driver.Close(); // closes browser
driver.Quit(); // closes IEDriverServer process
For more details, you could refer to this article.
Edit: Since you are using Python, you could use the find_element_by_id method to find the element and then the get_attribute method to read its value. For more details, see these articles:
Get Element Attribute
Python + Selenium for Dummies like Me
WebDriver API
Get value of an input box using Selenium (Python)
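A rough Python counterpart of the C# snippet could look like the sketch below. It is only a sketch under assumptions: the helper names build_wget_command and download_href are made up for illustration, the wget path and target directory are whatever you pass in, and the driver is assumed to be an already-started Selenium 3 style driver that offers find_element_by_id.

```python
import subprocess


def build_wget_command(wget_exe, target_dir, url):
    """Return the wget argument list; -P sets the directory to save into."""
    return [wget_exe, "-P", target_dir, "--no-check-certificate", url]


def download_href(driver, element_id, wget_exe, target_dir):
    """Read the element's href with Selenium, then let wget fetch it.

    `driver` is assumed to be an already-started Selenium driver that
    provides find_element_by_id, as used elsewhere in this thread.
    """
    url = driver.find_element_by_id(element_id).get_attribute("href")
    # Passing a list (no shell) avoids quoting problems with '&' in URLs.
    subprocess.run(build_wget_command(wget_exe, target_dir, url), check=True)
    return url
```

Note that this approach only helps when the element really exposes a plain href; wget starts its own HTTP session and does not share the browser's login state.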
Saving a PDF with the Chrome browser does not cause any issues (I'm using these options):
options.add_experimental_option('prefs', {
    'credentials_enable_service': False,
    'plugins': {
        'always_open_pdf_externally': True
    },
    'profile': {
        'password_manager_enabled': False,
    },
    'download': {
        'prompt_for_download': False,
        'directory_upgrade': True,
        'default_directory': ''
    }
})
But how can I save a PDF with webdriver.Ie(), the Internet Explorer driver, using Python + Selenium?
P.S. AFAIK Internet Explorer cannot be run in headless mode, but if someone knows a way to do it, that would be amazing!
You can't use Selenium to deal with the download prompt in IE because that's an OS-level prompt. Selenium WebDriver has no capability to automate OS-level prompt windows, so you need a third-party tool to download files in IE with Selenium.
Here I use Wget to bypass the download prompt and download the file in IE. You can refer to this article about how to use Wget.
As for running IE in headless mode with Selenium, you can use a third-party tool called headless_ie_selenium. Download the tool and use headless_ie_selenium.exe instead of IEDriverServer.exe to automate IE.
Sample code to download a PDF file is below; remember to change the paths in the code to your own:
from selenium import webdriver
import time
import os
url = "https://file-examples.com/index.php/sample-documents-download/sample-pdf-download/"
driver = webdriver.Ie('D:\\headless-selenium-for-win-v1-4\\headless_ie_selenium.exe')
driver.get(url)
time.sleep(3)
link = driver.find_elements_by_class_name("download-button")[0]
hrefurl = link.get_attribute("href")
os.system('cmd /c C:\\Wget\\wget.exe -P D:\\Download --no-check-certificate ' + hrefurl)
print("*******************")
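If installing Wget is not an option, the same href could in principle be fetched with Python's standard library instead. This swaps Wget for urllib.request, so it is a different technique from the one in the answer. The helper name download_to_dir is made up for illustration, and unlike `--no-check-certificate` this sketch keeps TLS certificate verification on.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import unquote, urlsplit


def download_to_dir(url, target_dir, timeout=60):
    """Stream `url` into `target_dir`, naming the file after the URL path."""
    # Fall back to a generic name when the URL path has no file name.
    name = Path(unquote(urlsplit(url).path)).name or "download.bin"
    destination = Path(target_dir) / name
    destination.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(destination, "wb") as out:
        # Copy in chunks so large files are not held in memory.
        shutil.copyfileobj(response, out)
    return destination
```

In the sample above, the call would replace the os.system line, for example download_to_dir(hrefurl, 'D:\\Download').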
As described in my own answer to this question:
How to download files headless in Selenium (Java) when download happens in new tab?
If the download button triggers the download in a newly opened tab, I switch to the new tab and send a command (code below) to enable the download.
def enable_download_in_headless_chrome(self, driver, download_path):
    driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
    params = {
        'cmd': 'Page.setDownloadBehavior',
        'params': {'behavior': 'allow', 'downloadPath': download_path}
    }
    driver.execute("send_command", params)
I found that the above method occasionally fails on Linux.
When the driver hits the error, it stays unresponsive for a long time.
(I detect the situation through my error handling with retry.)
Error message:
Message: timeout
(Session info: headless chrome=77.0.3865.120)
I have looked at the issue discussion, but none of the suggestions there solves it: https://bugs.chromium.org/p/chromium/issues/detail?id=696481
My guess is that the root cause is sending the enable-download command before the new tab reaches the ready state.
Please help me find a solution. Thanks.
If clicking the download link opens a new tab (for example, because the link has a target="_blank" attribute), the enable_download_in_headless_chrome approach does not work for the headless download. In that case you can remove the target="_blank" attribute with JavaScript, or read the href and download the file by opening the link directly in the same tab.
If the element is a link that opens in a new tab, you can make it open in the same tab by overwriting its target with JavaScript:
def open_link_same_tab_download_file(current_user_driver, element):
    # Force the link to open in the same tab by setting its target to _self
    current_user_driver.execute_script('arguments[0].target="_self"', element)
    # Click the element to download the file
    current_user_driver.execute_script("arguments[0].click()", element)
If the element has no link attribute, you can override the JavaScript window.open function so that it navigates the current tab instead of opening a new one:
def open_new_tab_download_file(current_user_driver, element):
    # Override window.open so the URL loads in the current tab
    current_user_driver.execute_script('window.open = function(url) {window.location=url}')
    # Click the element to download the file
    current_user_driver.execute_script("arguments[0].click()", element)
On Linux, I found that the file is downloaded twice by the code below:
# Open new tab by clicking download button
download_button.click()
# Switch to new tab
driver.switch_to.window(driver.window_handles[-1])
# Wait for ready state
WebDriverWait(driver, 60).until(lambda driver: driver.execute_script('return document.readyState') == 'complete')
##### The file will be downloaded to original download path #####
# Change download directory
enable_download_in_headless_chrome(driver, download_path)
# Refresh to trigger download behavior again
driver.refresh()
##### The file will be downloaded to your specific path again #####
If you send the "enable download" command before the new tab reaches the ready state,
it does not change the download directory successfully.
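One way to act on that observation is to wait until the new tab both exists and reports document.readyState == 'complete' before calling enable_download_in_headless_chrome. The helper below is a minimal sketch with a made-up name (switch_to_new_tab_when_ready); it only relies on the driver's window_handles, switch_to.window and execute_script, and it does not by itself avoid the double download described above.

```python
import time


def switch_to_new_tab_when_ready(driver, known_handles, timeout=60.0, poll=0.5):
    """Switch to the first tab not in `known_handles` once it has loaded.

    `known_handles` should be the window handles captured before clicking
    the download button. Returns the handle of the new tab.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        new_handles = [h for h in driver.window_handles if h not in known_handles]
        if new_handles:
            driver.switch_to.window(new_handles[-1])
            state = driver.execute_script("return document.readyState")
            if state == "complete":
                return new_handles[-1]
        time.sleep(poll)
    raise TimeoutError("no new tab reached readyState 'complete' in time")
```

Typical use would be to record handles = set(driver.window_handles) before download_button.click(), then call switch_to_new_tab_when_ready(driver, handles) and only afterwards send the enable-download command.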
Here is my problem: I'm trying to use Selenium to access a web page that automatically redirects (you open the page and, after a few seconds, it redirects to another page). When I use driver = webdriver.Firefox(), my IDM catches that link perfectly after a few seconds.
Because I don't want the browser window to appear, I use PhantomJS instead, but it is not working. My application only gets the URL of the loading page (bitdl-1336...), not the redirected link. Please help!
This is my code:
link = 'http://torrent.ajee.sh/hash.php?hash=' + self.global_hash_code
driver = webdriver.PhantomJS('phantomjs.exe')
driver.get(str(link))
element = driver.find_element_by_link_text('Download Zip')
element.click()
time.sleep(10)
msg = QMessageBox.information(self, QString('Thành công'),QString(driver.current_url))
And this is the result:
Please help!
Sorry about my english
Not exactly an answer to your PhantomJS-specific question, but a workaround to the problem.
And because i don't want the browser to come up so i use Phantomjs instead
You can continue using Firefox, but start it in a Virtual Display, see more information at:
How do I run Selenium in Xvfb?
You may also need to let the browser automatically save the archive in a specified directory, see:
How do I automatically download files from a pop up dialog using selenium-python
Access to file download dialog in Firefox
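For the automatic-save part, Firefox is usually steered through a handful of download preferences. The sketch below only builds that preference dictionary and applies it to any object that has set_preference(name, value), such as Selenium's FirefoxProfile or Firefox Options. The helper names and the MIME types you pass in are assumptions for illustration; check which content type your server actually sends.

```python
def firefox_download_prefs(download_dir, mime_types):
    """Return Firefox preferences that save matching downloads without a dialog."""
    return {
        # 2 = use the custom directory given in browser.download.dir
        "browser.download.folderList": 2,
        "browser.download.dir": download_dir,
        "browser.download.manager.showWhenStarting": False,
        # Comma-separated MIME types that are saved without asking
        "browser.helperApps.neverAsk.saveToDisk": ",".join(mime_types),
    }


def apply_prefs(target, prefs):
    """Apply prefs to any object exposing set_preference(name, value)."""
    for name, value in prefs.items():
        target.set_preference(name, value)
    return target
```

With Selenium this might be used as apply_prefs(webdriver.FirefoxProfile(), firefox_download_prefs('/tmp/downloads', ['application/zip'])) before creating the driver.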
I am trying to download a Google Doc as a PDF using Selenium in Python. Unfortunately, my HTML knowledge is quite minimal, so I don't know which elements I need to target to have it click File and then Download as PDF. I realize that I can use the web developer tools to inspect the HTML, but that isn't working so well for me.
Here is what I have tried so far:
from selenium import webdriver
url = 'https://docs.google.com/document/d/1Y1n-RR5j_FQ9WFMG8E_ajO0OpWLNANRu4lQCxTw9T5g/edit?pli=1'
browser = webdriver.Firefox()
browser.get(url)
Any help would be appreciated; thanks!
As you mention in your comment, Google Drive doesn't like being scraped.
The drive command looks like the right tool for this sort of job. It'll do what you're trying to do, but not the way you want to do it. According to the docs (i.e. I haven't tested it), this command looks like it would download your file:
drive pull --export docx --id 1Y1n-RR5j_FQ9WFMG8E_ajO0OpWLNANRu4lQCxTw9T5g
(Also, in general, I find the easiest way to use Selenium is to use the Selenium IDE to tell Selenium what you want to do, then export the resulting test case by going to File > Export Test Case As... > Python 2 / unittest / Web Driver.)
Hope that helps.
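Another route that avoids clicking through the menus is to build an export URL from the document ID. Google Docs has commonly served exports at a URL of the form .../document/d/ID/export?format=pdf, but treat that pattern as an assumption that may change, and it only helps for documents the requesting session is allowed to read. The helper name below is made up for illustration.

```python
import re


def google_doc_export_url(edit_url, export_format="pdf"):
    """Turn a Google Docs edit URL into an export URL (assumed URL pattern)."""
    match = re.search(r"/document/d/([A-Za-z0-9_-]+)", edit_url)
    if match is None:
        raise ValueError("no document id found in URL")
    return "https://docs.google.com/document/d/{}/export?format={}".format(
        match.group(1), export_format
    )
```

The resulting URL could then be opened with browser.get(...) or fetched with an HTTP client.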
I have a working solution, though I don't know whether Google will update the page in a way that breaks it. This is in C#, but the Selenium functionality is basically the same.
Show all the menu items except the "Download as" submenu, and return the "Download as" WebElement. Click it with Selenium, then select a format and return that WebElement to click as well. I couldn't trigger the click using JavaScript alone, because I was unable to figure out how they trigger it, but clicking through the Selenium driver worked just fine.
Make most of the menus visible and return the "Download as" WebElement:
document.querySelector(`#docs-file-menu`).className = 'menu-button goog-control goog-inline-block goog-control-open docs-menu-button-open-below';
document.querySelector(`#docs-file-menu`).setAttribute('aria-expanded', 'true');
document.querySelectorAll(`.goog-menu:not(.goog-menu-noaccel)`)[0].className = 'goog-menu goog-menu-vertical docs-material docs-menu-hide-mnemonics docs-menu-attached-button-above';
document.querySelectorAll(`.goog-menu:not(.goog-menu-noaccel)`)[0].setAttribute('style', 'user-select: none; visibility: visible; left: 64px; top: 64px;');
// download as
// 2 parents above
document.querySelector(`[aria-label='Download as d']`).parentElement.parentElement.className = 'goog-menuitem apps-menuitem goog-submenu goog-submenu-open goog-menuitem-highlight'
return document.querySelector(`[aria-label='Download as d']`).parentElement.parentElement;
Click download as btn:
IWebElement btn = (IWebElement)((IJavaScriptExecutor)driver).ExecuteScript(btnClickJs);
btn.Click();
Select format:
var formatCss = document.querySelectorAll(`.goog-menu.goog-menu-noaccel`)[6].querySelectorAll(`.goog-menuitem.apps-menuitem`)
var format = 'injectformathere' ? 'injectformathere' : '.html'
for (let i = 0; i < formatCss.length; i++) {
    if (formatCss[i].innerText.indexOf(format) != -1)
        return formatCss[i]
}
return null
Click format:
btn = (IWebElement)((IJavaScriptExecutor)driver).ExecuteScript(btnClickJs);
if (btn != null)
btn.Click();
I've been grappling with using PhantomJS/Selenium/python-selenium to download a file to the filesystem.
I'm able to easily navigate through the DOM and click, hover, etc. Downloading a file is, however, proving to be quite troublesome. I've tried a headless approach with Firefox and pyvirtualdisplay, but that wasn't working well either and was unbelievably slow. I know that CasperJS allows for file downloads. Does anyone know how to integrate CasperJS with Python, or how to use PhantomJS to download files? Much appreciated.
Although this question is quite old, downloading files through PhantomJS is still a problem. However, we can use PhantomJS to get the download link and collect all the needed cookies (CSRF tokens and so on), and then use requests to actually download the file:
import requests
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get('page_with_download_link')
# find_element_by_id returns a WebElement; requests needs the URL string from its href
download_link = driver.find_element_by_id('download_link').get_attribute('href')
session = requests.Session()
# Copy the browser's cookies into the requests session
cookies = driver.get_cookies()
for cookie in cookies:
    session.cookies.set(cookie['name'], cookie['value'])
response = session.get(download_link)
And now in response.content actual file content should appear. We can next write it with open or do whatever we want.
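For larger files it may be preferable to write the body to disk in chunks rather than holding all of response.content in memory. The sketch below assumes a requests-style response object (anything with raise_for_status() and iter_content()); the helper name save_response is made up for illustration.

```python
from pathlib import Path


def save_response(response, path, chunk_size=8192):
    """Write a requests-style response to `path` in chunks; return bytes written."""
    # Fail loudly on 4xx/5xx instead of saving an error page as the file.
    response.raise_for_status()
    path = Path(path)
    with open(path, "wb") as out:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                out.write(chunk)
    return path.stat().st_size
```

To actually stream, the request itself would need to be made with session.get(url, stream=True); otherwise requests has already read the whole body before iter_content is called.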
PhantomJS doesn't currently support file downloads. Relevant issues with workarounds:
File download
How to handle file save dialog box using Selenium webdriver and PhantomJS?
As far as I understand, you have at least 3 options:
switch to casperjs (and you should leave python here)
try with headless on xvfb
switch to normal non-headless browsers
Here are also some links that might help too:
Selenium Headless Automated Testing in Ubuntu
XWindows for Headless Selenium (with further links inside)
How to run browsers(chrome, IE and firefox) in headless mode?
Tutorial: How to use Headless Firefox for Scraping in Linux
My use case required a form submission to retrieve the file. I was able to accomplish this using the driver's execute_async_script() function.
js = '''
var callback = arguments[0];
var theForm = document.forms['theFormId'];
data = new FormData();
data.append('eventTarget', "''' + target + '''"); // this is the id of the file clicked
data.append('otherFormField', theForm.otherFormField.value);
var xhr = new XMLHttpRequest();
xhr.open('POST', theForm.action, true);
'''
for cookie in driver.get_cookies():
    js += ' xhr.setRequestHeader("' + cookie['name'] + '", "' + cookie['value'] + '"); '
js += '''
xhr.onload = function () {
callback(this.responseText);
};
xhr.send(data);
'''
driver.set_script_timeout(30)
file = driver.execute_async_script(js)
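A variation on the same idea is to pass the value into the script as an argument instead of splicing it into the JavaScript source, which sidesteps quoting problems. Selenium hands extra arguments to the script through `arguments` and appends the completion callback as the last one. The sketch below reuses the form and field names from the snippet above ('theFormId', 'eventTarget', 'otherFormField'); the helper name is made up, and the cookie headers are left out on the assumption that the POST is same-origin, in which case the browser sends its cookies with the request anyway.

```python
FORM_POST_JS = """
var target = arguments[0];
// Selenium appends the completion callback as the last argument.
var callback = arguments[arguments.length - 1];
var theForm = document.forms['theFormId'];
var data = new FormData();
data.append('eventTarget', target);
data.append('otherFormField', theForm.otherFormField.value);
var xhr = new XMLHttpRequest();
xhr.open('POST', theForm.action, true);
xhr.onload = function () { callback(this.responseText); };
xhr.onerror = function () { callback(null); };
xhr.send(data);
"""


def fetch_form_response(driver, target, timeout=30):
    """Submit the form via XHR in the page and return the response text."""
    driver.set_script_timeout(timeout)
    return driver.execute_async_script(FORM_POST_JS, target)
```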
It is not possible that way. You can use alternatives such as wget or curl to download the file.
Use Firefox to find the right request and Selenium to get the values it needs, and finally use a tool outside the browser to download the file:
import subprocess
curlCall = " curl 'http://www_sitex_org/descarga.jsf' -H '...allCurlRequest....' > file.xml"
subprocess.call(curlCall, shell=True)
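If the header values come from Selenium at run time, building the curl call as an argument list avoids shell quoting issues. This is only a sketch: the helper names are made up, the headers dictionary stands in for whatever request headers you copied from the browser, and curl is assumed to be on the PATH.

```python
import subprocess


def build_curl_command(url, headers, output_path):
    """Return a curl argument list that saves `url` to `output_path`."""
    command = ["curl", "--fail", "--silent", "--show-error", "--output", output_path]
    for name, value in headers.items():
        command += ["--header", "{}: {}".format(name, value)]
    command.append(url)
    return command


def run_curl(url, headers, output_path):
    """Run curl without a shell; raises CalledProcessError if curl fails."""
    subprocess.run(build_curl_command(url, headers, output_path), check=True)
```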