How to pass Desired Capabilities to Undetected Chromedriver with Selenium Python? - python

I'm using the Python package Undetected Chromedriver as I need to be able to log into a Google account with the webdriver, and I want to pass the options {"credentials_enable_service": False, "profile.password_manager_enabled": False} to the driver so that it doesn't bring up the popup to save the password. I was trying to pass those options using:
import undetected_chromedriver.v2 as uc
uc_options = uc.ChromeOptions()
uc_options.add_argument("--start-maximized")
uc_options.add_experimental_option("prefs", {"credentials_enable_service": False, "profile.password_manager_enabled": False})
driver2 = uc.Chrome(options=uc_options)
The argument --start-maximized works perfectly fine, and if I run the code with just that it starts maximised as intended. However, when adding the experimental options and running the code it returns the error:
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument: cannot parse capability: goog:chromeOptions
from invalid argument: unrecognized chrome option: prefs
So I thought that I would try to pass the arguments as Desired Capabilities instead, making the code:
import undetected_chromedriver.v2 as uc
uc_options = uc.ChromeOptions()
uc_options.add_argument("--start-maximized")
uc_options.add_experimental_option("prefs", {"credentials_enable_service": False, "profile.password_manager_enabled": False})
uc_caps = uc_options.to_capabilities()
driver2 = uc.Chrome(desired_capabilities=uc_caps)
While this code runs and doesn't generate any errors, it also doesn't do anything at all. The password popup still shows up, and the driver doesn't even start maximised, despite the fact that the latter part worked as an option.
So my question is: how do I correctly pass Desired Capabilities to the Undetected Chromedriver?
Or, alternatively: how do I correctly pass the Experimental Options to the Undetected Chromedriver?

Undetected Chromedriver start webdriver service and Chrome as a normal browser with arguments, and after attaches a webdriver. Probably experimental preferents cannot be used on already running instance.
As workaround you can use Undetected Chromedriver patcher to modify the chromedriver and then use the it. But you need to check if the chrome is steal not detectable for your website. There're additional settings done for headless browser, so if you need headless check the source code.
import undetected_chromedriver.v2 as uc
from selenium import webdriver
patcher = uc.Patcher()
patcher.auto()
options = webdriver.ChromeOptions()
options.add_argument("--start-maximized")
options.add_experimental_option(
"prefs", {"credentials_enable_service": False, "profile.password_manager_enabled": False})
with webdriver.Chrome(options=options, executable_path=patcher.executable_path) as driver:
driver.get("")

Earlier this year, there were changes made to fix this issue and allow preferences:
https://github.com/ultrafunkamsterdam/undetected-chromedriver/issues/524
See https://github.com/ultrafunkamsterdam/undetected-chromedriver/commit/487969811851be6bcf6e3c55c8fc0d471940c6c3 for the commit.
That made important updates to https://github.com/ultrafunkamsterdam/undetected-chromedriver/blob/master/undetected_chromedriver/options.py in order to handle preferences.
Upgrade to a newer version to fix your issue.

Try using Options() instead of ChromeOptions() with #Sers answer
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_experimental_option(
"prefs", {"credentials_enable_service": False, "profile.password_manager_enabled": False})

Related

Can I get the gecko driver to launch in an existing Firefox window?

I'm trying to use Selenium in Python to fill out a form, which is why I need it to run in an existing Firefox window where I'm logged in due to the sensitive login data. But I can't seem to find a way to prevent the driver launching in a new window where I am logged out. Is there a way to do this?
I've tried using options to set preference:
`from selenium.webdriver.firefox.options import Options
options = Options()
options.set_preference("browser.tabs.loadInBackground", False)
driver = webdriver.Firefox(options=options)`
I also tried using the -no-remote command on Firefox in cmd before running the script, but it didn't work either.
From chrome driver you can achieve this.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("user-data-dir=C:/Users/Designer1/AppData/Local/Google/Chrome/User Data/Profile 1")
driver = webdriver.Chrome(chrome_options=options,executable_path="C:\webdrivers\chromedriver.exe")

Selenium: How can I find the correct command for add_arguments()?

I am using selenium python and chrome driver and wanted to know if there is any way you can get the command to enable or disable options in add_arguments() function. For example, there are '--disable-infobars', etc., but if I come across a new setting, how do I find its appropriate command?
An example being the settings to auto-download pdfs.
Any help is appreciated.
Chromium has lots of command switches, such as --disable-extensions or --disable-popup-blocking that can be enabled at runtime using Options().add_argument()
Here is a list of some of the Chromium Command Line Switches.
Chromium also allows for other runtime enabled features, such as useAutomationExtension or plugins.always_open_pdf_externally. These features are enabled using DesiredCapabilities.
I normal review the source code for Chromium when I need to find find other features to control with DesiredCapabilities.
The code below uses both command switches and runtime enabled features to automatically save a PDF file to disk without being prompted.
For my answer I downloaded a PDF file from the Library of Congress.
If you have any questions related to this code or something else related to your question please let me know.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
chrome_options = Options()
chrome_options.add_argument('--disable-infobars')
chrome_options.add_argument('--start-maximized')
chrome_options.add_argument('--disable-extensions')
chrome_options.add_argument('--disable-popup-blocking')
# disable the banner "Chrome is being controlled by automated test software"
chrome_options.add_experimental_option("useAutomationExtension", False)
chrome_options.add_experimental_option("excludeSwitches", ['enable-automation'])
# you can set the path for your download_directory
prefs = {
'download.default_directory': 'download_directory',
'download.prompt_for_download': False,
'plugins.always_open_pdf_externally': True
}
capabilities = DesiredCapabilities().CHROME
chrome_options.add_experimental_option('prefs', prefs)
capabilities.update(chrome_options.to_capabilities())
driver = webdriver.Chrome('/usr/local/bin/chromedriver', options=chrome_options)
url_main = 'https://www.loc.gov/aba/publications/FreeLCC/freelcc.html'
driver.get(url_main)
driver.implicitly_wait(20)
download_pdf_file = driver.find_element_by_xpath('//*[#id="main_body"]/ul[2]/li[1]/a')
download_pdf_file.click()
You can add options arguments to a chromium webdriver using Python the following way:
options = webdriver.ChromeOptions()
# Arguments go below
options.add_argument("--no-sandbox")
options.add_argument("--disable-gpu")
options.add_argument("--window-size=800,600")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--user-agent={}".format("your user agent string"))
# Etc etc..
options.binary_location = "absolute/path/to/chrome.exe"
driver = webdriver.Chrome(
desired_capabilities=caps,
executable_path="absolute/path/to/chromium-driver.exe",
options=options,
)
Here you can find the list of all the supported arguments for chrome.

Is there anyway to make selenium without headless mode on Firefox? [duplicate]

Is there a way to make your Selenium script undetectable in Python using geckodriver?
I'm using Selenium for scraping. Are there any protections we need to use so websites can't detect Selenium?
There are different methods to avoid websites detecting the use of Selenium.
The value of navigator.webdriver is set to true by default when using Selenium. This variable will be present in Chrome as well as Firefox. This variable should be set to "undefined" to avoid detection.
A proxy server can also be used to avoid detection.
Some websites are able to use the state of your browser to determine if you are using Selenium. You can set Selenium to use a custom browser profile to avoid this.
The code below uses all three of these approaches.
profile = webdriver.FirefoxProfile('C:\\Users\\You\\AppData\\Roaming\\Mozilla\\Firefox\\Profiles\\something.default-release')
PROXY_HOST = "12.12.12.123"
PROXY_PORT = "1234"
profile.set_preference("network.proxy.type", 1)
profile.set_preference("network.proxy.http", PROXY_HOST)
profile.set_preference("network.proxy.http_port", int(PROXY_PORT))
profile.set_preference("dom.webdriver.enabled", False)
profile.set_preference('useAutomationExtension', False)
profile.update_preferences()
desired = DesiredCapabilities.FIREFOX
driver = webdriver.Firefox(firefox_profile=profile, desired_capabilities=desired)
Once the code is run, you will be able to manually check that the browser run by Selenium now has your Firefox history and extensions. You can also type "navigator.webdriver" into the devtools console to check that it is undefined.
The fact that selenium driven Firefox / GeckoDriver gets detected doesn't depends on any specific GeckoDriver or Firefox version. The Websites themselves can detect the network traffic and can identify the Browser Client i.e. Web Browser as WebDriver controled.
As per the documentation of the WebDriver Interface in the latest editor's draft of WebDriver - W3C Living Document the webdriver-active flag which is initially set as false, is set to true when the user agent is under remote control i.e. when controlled through Selenium.
Now that the NavigatorAutomationInformation interface should not be exposed on WorkerNavigator.
So,
webdriver
Returns true if webdriver-active flag is set, false otherwise.
where as,
navigator.webdriver
Defines a standard way for co-operating user agents to inform the document that it is controlled by WebDriver, for example so that alternate code paths can be triggered during automation.
So, the bottom line is:
Selenium identifies itself
However some generic approaches to avoid getting detected while web-scraping are as follows:
The first and foremost attribute a website can determine your script/program is through your monitor size. So it is recommended not to use the conventional Viewport.
If you need to send multiple requests to a website, you need to keep on changing the User Agent on each request. Here you can find a detailed discussion on Way to change Google Chrome user agent in Selenium?
To simulate human like behavior you may require to slow down the script execution even beyond WebDriverWait and expected_conditions inducing time.sleep(secs). Here you can find a detailed discussion on How to sleep webdriver in python for milliseconds
As per the current WebDriver W3C Editor's Draft specification:
The webdriver-active flag is set to true when the user agent is under remote control. It is initially false.
Hence, the readonly boolean attribute webdriver returns true if webdriver-active flag is set, false otherwise.
Further the specification further clarifies:
navigator.webdriver Defines a standard way for co-operating user
agents to inform the document that it is controlled by WebDriver, for
example so that alternate code paths can be triggered during
automation.
There had been tons and millions of discussions demanding Feature: option to disable navigator.webdriver == true ? and #whimboo in his comment concluded that:
that is because the WebDriver spec defines that property on the
Navigator object, which has to be set to true when tests are running
with webdriver enabled:
https://w3c.github.io/webdriver/#interface
Implementations have to be conformant to this requirement. As such we
will not provide a way to circumvent that.
Generic Conclusion
From the above discussions it can be concluded that:
Selenium identifies itself
and there is no way to conceal the fact that the browser is WebDriver driven.
Recommendations
However some users have suggested approaches which can conceal the fact that the Mozilla Firefox browser is WebDriver controled through the usage of Firefox Profiles and Proxies as follows:
selenium4 compatible python code
from selenium.webdriver import Firefox
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.firefox.options import Options
profile_path = r'C:\Users\Admin\AppData\Roaming\Mozilla\Firefox\Profiles\s8543x41.default-release'
options=Options()
options.set_preference('profile', profile_path)
options.set_preference('network.proxy.type', 1)
options.set_preference('network.proxy.socks', '127.0.0.1')
options.set_preference('network.proxy.socks_port', 9050)
options.set_preference('network.proxy.socks_remote_dns', False)
service = Service('C:\\BrowserDrivers\\geckodriver.exe')
driver = Firefox(service=service, options=options)
driver.get("https://www.google.com")
driver.quit()
Other Alternatives
It is observed that in some specific os variants a couple of diverse settings/configuration can bypass the bot detectation which are as follows:
selenium4 compatible code block
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.chrome.service import Service
options = Options()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('excludeSwitches', ['enable-logging'])
options.add_experimental_option('useAutomationExtension', False)
options.add_argument('--disable-blink-features=AutomationControlled')
s = Service('C:\\BrowserDrivers\\geckodriver.exe')
driver = webdriver.Chrome(service=s, options=options)
Potential Solution
A potential solution would be to use the tor browser as follows:
selenium4 compatible python code
from selenium.webdriver import Firefox
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.firefox.options import Options
import os
torexe = os.popen(r'C:\Users\username\Desktop\Tor Browser\Browser\TorBrowser\Tor\tor.exe')
profile_path = r'C:\Users\username\Desktop\Tor Browser\Browser\TorBrowser\Data\Browser\profile.default'
firefox_options=Options()
firefox_options.set_preference('profile', profile_path)
firefox_options.set_preference('network.proxy.type', 1)
firefox_options.set_preference('network.proxy.socks', '127.0.0.1')
firefox_options.set_preference('network.proxy.socks_port', 9050)
firefox_options.set_preference("network.proxy.socks_remote_dns", False)
firefox_options.binary_location = r'C:\Users\username\Desktop\Tor Browser\Browser\firefox.exe'
service = Service('C:\\BrowserDrivers\\geckodriver.exe')
driver = webdriver.Firefox(service=service, options=firefox_options)
driver.get("https://www.tiktok.com/")
It may sound simple, but if you look how the website detects selenium (or bots) is by tracking the movements, so if you can make your program slightly towards like a human is browsing the website you can get less captcha, such as add cursor/page scroll movements in between your operations, and other actions which mimics the browsing. So between two operations try to add some other actions, Add some delay etc. This will make your bot slower and could get undetected.
Thanks

How can I make a Selenium script undetectable using GeckoDriver and Firefox through Python?

Is there a way to make your Selenium script undetectable in Python using geckodriver?
I'm using Selenium for scraping. Are there any protections we need to use so websites can't detect Selenium?
There are different methods to avoid websites detecting the use of Selenium.
The value of navigator.webdriver is set to true by default when using Selenium. This variable will be present in Chrome as well as Firefox. This variable should be set to "undefined" to avoid detection.
A proxy server can also be used to avoid detection.
Some websites are able to use the state of your browser to determine if you are using Selenium. You can set Selenium to use a custom browser profile to avoid this.
The code below uses all three of these approaches.
profile = webdriver.FirefoxProfile('C:\\Users\\You\\AppData\\Roaming\\Mozilla\\Firefox\\Profiles\\something.default-release')
PROXY_HOST = "12.12.12.123"
PROXY_PORT = "1234"
profile.set_preference("network.proxy.type", 1)
profile.set_preference("network.proxy.http", PROXY_HOST)
profile.set_preference("network.proxy.http_port", int(PROXY_PORT))
profile.set_preference("dom.webdriver.enabled", False)
profile.set_preference('useAutomationExtension', False)
profile.update_preferences()
desired = DesiredCapabilities.FIREFOX
driver = webdriver.Firefox(firefox_profile=profile, desired_capabilities=desired)
Once the code is run, you will be able to manually check that the browser run by Selenium now has your Firefox history and extensions. You can also type "navigator.webdriver" into the devtools console to check that it is undefined.
The fact that selenium driven Firefox / GeckoDriver gets detected doesn't depends on any specific GeckoDriver or Firefox version. The Websites themselves can detect the network traffic and can identify the Browser Client i.e. Web Browser as WebDriver controled.
As per the documentation of the WebDriver Interface in the latest editor's draft of WebDriver - W3C Living Document the webdriver-active flag which is initially set as false, is set to true when the user agent is under remote control i.e. when controlled through Selenium.
Now that the NavigatorAutomationInformation interface should not be exposed on WorkerNavigator.
So,
webdriver
Returns true if webdriver-active flag is set, false otherwise.
where as,
navigator.webdriver
Defines a standard way for co-operating user agents to inform the document that it is controlled by WebDriver, for example so that alternate code paths can be triggered during automation.
So, the bottom line is:
Selenium identifies itself
However some generic approaches to avoid getting detected while web-scraping are as follows:
The first and foremost attribute a website can determine your script/program is through your monitor size. So it is recommended not to use the conventional Viewport.
If you need to send multiple requests to a website, you need to keep on changing the User Agent on each request. Here you can find a detailed discussion on Way to change Google Chrome user agent in Selenium?
To simulate human like behavior you may require to slow down the script execution even beyond WebDriverWait and expected_conditions inducing time.sleep(secs). Here you can find a detailed discussion on How to sleep webdriver in python for milliseconds
As per the current WebDriver W3C Editor's Draft specification:
The webdriver-active flag is set to true when the user agent is under remote control. It is initially false.
Hence, the readonly boolean attribute webdriver returns true if webdriver-active flag is set, false otherwise.
Further the specification further clarifies:
navigator.webdriver Defines a standard way for co-operating user
agents to inform the document that it is controlled by WebDriver, for
example so that alternate code paths can be triggered during
automation.
There had been tons and millions of discussions demanding Feature: option to disable navigator.webdriver == true ? and #whimboo in his comment concluded that:
that is because the WebDriver spec defines that property on the
Navigator object, which has to be set to true when tests are running
with webdriver enabled:
https://w3c.github.io/webdriver/#interface
Implementations have to be conformant to this requirement. As such we
will not provide a way to circumvent that.
Generic Conclusion
From the above discussions it can be concluded that:
Selenium identifies itself
and there is no way to conceal the fact that the browser is WebDriver driven.
Recommendations
However some users have suggested approaches which can conceal the fact that the Mozilla Firefox browser is WebDriver controled through the usage of Firefox Profiles and Proxies as follows:
selenium4 compatible python code
from selenium.webdriver import Firefox
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.firefox.options import Options
profile_path = r'C:\Users\Admin\AppData\Roaming\Mozilla\Firefox\Profiles\s8543x41.default-release'
options=Options()
options.set_preference('profile', profile_path)
options.set_preference('network.proxy.type', 1)
options.set_preference('network.proxy.socks', '127.0.0.1')
options.set_preference('network.proxy.socks_port', 9050)
options.set_preference('network.proxy.socks_remote_dns', False)
service = Service('C:\\BrowserDrivers\\geckodriver.exe')
driver = Firefox(service=service, options=options)
driver.get("https://www.google.com")
driver.quit()
Other Alternatives
It is observed that in some specific os variants a couple of diverse settings/configuration can bypass the bot detectation which are as follows:
selenium4 compatible code block
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.chrome.service import Service
options = Options()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('excludeSwitches', ['enable-logging'])
options.add_experimental_option('useAutomationExtension', False)
options.add_argument('--disable-blink-features=AutomationControlled')
s = Service('C:\\BrowserDrivers\\geckodriver.exe')
driver = webdriver.Chrome(service=s, options=options)
Potential Solution
A potential solution would be to use the tor browser as follows:
selenium4 compatible python code
from selenium.webdriver import Firefox
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.firefox.options import Options
import os
torexe = os.popen(r'C:\Users\username\Desktop\Tor Browser\Browser\TorBrowser\Tor\tor.exe')
profile_path = r'C:\Users\username\Desktop\Tor Browser\Browser\TorBrowser\Data\Browser\profile.default'
firefox_options=Options()
firefox_options.set_preference('profile', profile_path)
firefox_options.set_preference('network.proxy.type', 1)
firefox_options.set_preference('network.proxy.socks', '127.0.0.1')
firefox_options.set_preference('network.proxy.socks_port', 9050)
firefox_options.set_preference("network.proxy.socks_remote_dns", False)
firefox_options.binary_location = r'C:\Users\username\Desktop\Tor Browser\Browser\firefox.exe'
service = Service('C:\\BrowserDrivers\\geckodriver.exe')
driver = webdriver.Firefox(service=service, options=firefox_options)
driver.get("https://www.tiktok.com/")
It may sound simple, but if you look how the website detects selenium (or bots) is by tracking the movements, so if you can make your program slightly towards like a human is browsing the website you can get less captcha, such as add cursor/page scroll movements in between your operations, and other actions which mimics the browsing. So between two operations try to add some other actions, Add some delay etc. This will make your bot slower and could get undetected.
Thanks

Headless Chrome Browser - Using Selenium and ChromeDriver

Running latest version of all features
(chrome version 66, selenium version 3.12, chromedriver version 2.39, python version 3.6.5)
I have tried all of the solutions that I have found online but nothing seems to be working. I have automated something using Selenium for Chrome and it does exactly what I need it to do.
The last thing I need to to is hide the browser because I do not need to see the UI. I have attempted to make the browser headless using the following code but when I do the program crashes.
I have also tried to alter the window size to 0x0 and even tried the command: options.set_headless(headless=True) instead but nothing seems to work.
NOTE: I am running on Windows and have also tried with the command:
options.add_argument('--disable-gpu')
Try to move browser over the monitor
driver = webdriver.Chrome()
driver.set_window_position(-2000,0) # if -20000 don't help put -10000
What is happening when you run, does it produce any error or just run as if you hadn't even added the headless argument?
Here's what I do for mine to run headlessly -
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--disable-gpu')
selenium_chrome_driver_path = "blah blah my path is here"
my_driver = webdriver.Chrome(chrome_options = options, executable_path = selenium_chrome_driver_path)
I'm running this code with python 3.6 with up to date selenium drivers on Windows 8 (unfortunately)
Got the solutions after combining a few sources. It seems that the libraries get often updated and eventually it becomes deprecated. Atm this one worked for me in Windows, using Python 3.4:
from selenium import webdriver
options = webdriver.ChromeOptions()
options.headless = True
driver = webdriver.Chrome(executable_path="C:/Users/Admin/Documents/chromedriver_win32/chromedriver", options=options)
Try to use user agent in chrome_options
ua = UserAgent()
userAgent = ua.random
print(userAgent)
chrome_options.add_argument('user-agent={userAgent}')

Categories

Resources