I am using the following code to run my script on my local machine:
from seleniumwire import webdriver
import pytest
from selenium.webdriver.chrome.options import Options
import time
import allure

class Test_main():

    @pytest.fixture()
    def test_setup(self):
        # instantiate browser
        chrome_options = Options()
        chrome_options.add_argument('--start-maximized')
        chrome_options.add_argument('--headless')
        self.driver = webdriver.Chrome(executable_path=r"D:/Python/Sel_python/drivers/chromedriverv86/chromedriver.exe", chrome_options=chrome_options)
        # terminate script
        yield
        self.driver.close()
        self.driver.quit()
        print("Test completed")

    ## Remaining functions/test cases followed. Not adding the entire script here
I pushed this code to Git and then tried to run it in Jenkins using the following build commands:
cd "D:\Python\Sel_python\Pytest"
pip install -r requirements.txt
pytest Test_Tracking_code_scripts.py -s -v
But then Jenkins threw an error saying that chromedriver cannot be located. My questions are:
Do I need to upload chromedriver.exe to my Git repository as well?
Does Jenkins have its own Chrome browser? If yes, how do I use it and what path has to be specified?
I am new to Jenkins, please help me out here.
Check the Chrome version on the Jenkins system.
Download the chromedriver build that matches that Chrome version.
Copy chromedriver to "C:/drivers/" on the Jenkins server (the C: drive is common to all Windows systems).
Update the code as below, changing
self.driver = webdriver.Chrome(executable_path=r"D:/Python/Sel_python/drivers/chromedriverv86/chromedriver.exe", chrome_options=chrome_options)
to
self.driver = webdriver.Chrome(executable_path=r"C:/drivers/chromedriver.exe", chrome_options=chrome_options)
Let me know if you face any issues with this.
NOTE:
On your local system, move the driver to "C:/drivers/" as well, so that the path is the same on both the remote and the local system.
If Chrome is updated on the local or the remote system, update chromedriver.exe to the matching version.
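The version-matching rule in the note can be checked mechanically before a test run. The sketch below is only an illustration: the helper names `major_version` and `versions_match` are hypothetical, the example strings are made up rather than captured from a real machine, and comparing only the major version number is an assumption about what "matching" needs to mean here.

```python
import re


def major_version(version_text):
    """Return the major number of the first dotted version in the text, or None."""
    match = re.search(r"(\d+)\.\d+", version_text)
    return int(match.group(1)) if match else None


def versions_match(chrome_text, driver_text):
    """True when both strings contain a version with the same major number."""
    chrome_major = major_version(chrome_text)
    driver_major = major_version(driver_text)
    return chrome_major is not None and chrome_major == driver_major


if __name__ == "__main__":
    # Illustrative strings only; substitute the versions found on your machines.
    print(versions_match("Google Chrome 86.0.4240.75", "ChromeDriver 86.0.4240.22"))
```

A check like this could run at the start of the Jenkins job so that a Chrome update fails fast with a clear message instead of a driver error in the middle of the tests.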
I found the solution. My code was missing the Chrome binary path. Setting it on the Options() object through binary_location resolved the error.
Sharing the updated code:
from seleniumwire import webdriver
import pytest
from selenium.webdriver.chrome.options import Options
import time
import allure

class Test_main():

    @pytest.fixture()
    def test_setup(self):
        # initiating browser
        chrome_options = Options()
        chrome_options.binary_location=r"C:\Users\libin.thomas\AppData\Local\Google\Chrome\Application\chrome.exe"
        chrome_options.add_argument('--start-maximized')
        chrome_options.add_argument('--headless')
        self.driver = webdriver.Chrome(executable_path=r"D:/Python/Sel_python/drivers/chromedriver v86/chromedriver.exe",options=chrome_options)
        # terminate script
        yield
        self.driver.close()
        self.driver.quit()
        print("Test completed")

    # test cases followed below
I'm trying some things out with Selenium and I really want it to be quick.
My thought is that running it with headless Chrome would make my script faster.
First, is that assumption correct, or does it not matter whether I run my script with a headless driver?
Either way, I still want to get it running headless, but somehow I can't. I tried different things, and most suggested that it would work as described in the October update here:
How to configure ChromeDriver to initiate Chrome browser in Headless mode through Selenium?
But when I try that, I get weird console output and it still doesn't seem to work.
Any tips appreciated.
To run chrome-headless just add --headless via chrome_options.add_argument, i.e.:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
#chrome_options.add_argument("--disable-extensions")
#chrome_options.add_argument("--disable-gpu")
#chrome_options.add_argument("--no-sandbox") # linux only
chrome_options.add_argument("--headless")
# chrome_options.headless = True # also works
driver = webdriver.Chrome(options=chrome_options)
start_url = "https://duckgo.com"
driver.get(start_url)
print(driver.page_source.encode("utf-8"))
# b'<!DOCTYPE html><html xmlns="http://www....
driver.quit()
"So my thought is that running it with headless chrome would make my script faster."
Try using Chrome options like --disable-extensions or --disable-gpu and benchmark it, but I wouldn't count on much improvement.
References: headless-chrome
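The benchmarking suggestion above can be sketched with a small timing helper. This is a minimal illustration, not a rigorous benchmark: `time_call` and `compare` are hypothetical names, and Selenium is deliberately left out so the sketch runs on its own.

```python
import time


def time_call(func, repeats=3):
    """Run func() `repeats` times and return the fastest wall-clock time in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return min(durations)


def compare(candidates, repeats=3):
    """Return {label: best time} for a dict of {label: zero-argument callable}."""
    return {label: time_call(func, repeats) for label, func in candidates.items()}
```

In practice each callable would start a driver with one set of options, load the page under test, and quit, so that the option sets are compared under the same conditions.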
Install & run containerized Chrome:
docker pull selenium/standalone-chrome
docker run --rm -d -p 4444:4444 --shm-size=2g selenium/standalone-chrome
Connect using webdriver.Remote:
from selenium import webdriver
driver = webdriver.Remote('http://localhost:4444/wd/hub', webdriver.DesiredCapabilities.CHROME)
driver.set_window_size(1280, 1024)
driver.get('https://www.google.com')
from time import sleep
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome(executable_path="./chromedriver", options=chrome_options)
url = "https://stackoverflow.com/questions/53657215/running-selenium-with-headless-chrome-webdriver"
driver.get(url)
sleep(5)
h1 = driver.find_element_by_xpath("//h1[@itemprop='name']").text
print(h1)
Then I ran the script on my local machine:
➜ python script.py
Running Selenium with Headless Chrome Webdriver
It works, and it runs with headless Chrome.
If you are on Linux, you may have to add --no-sandbox as well, along with specific window size settings. The --no-sandbox flag is not needed on Windows if you set the user container properly.
Use --disable-gpu only on Windows. Other platforms no longer require it. The --disable-gpu flag is a temporary workaround for a few bugs.
// Configure headless Chrome
WebDriverManager.chromedriver().setup();
ChromeOptions chromeOptions = new ChromeOptions();
chromeOptions.addArguments("--no-sandbox");
chromeOptions.addArguments("--headless");
chromeOptions.addArguments("--disable-gpu");
// chromeOptions.addArguments("--window-size=1400,2100"); // enable this on Linux
driver = new ChromeDriver(chromeOptions);
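The platform rules described above (--no-sandbox and a window size on Linux, --disable-gpu on Windows) could be expressed in Python along these lines. This is a sketch under stated assumptions: `headless_arguments` is a hypothetical helper, the window size is taken from the commented-out line in the Java snippet, and whether each flag is actually needed depends on your Chrome version and environment.

```python
import sys


def headless_arguments(platform=None):
    """Return a list of Chrome arguments for headless runs on the given platform.

    `platform` uses the same values as sys.platform ("linux", "win32", "darwin").
    """
    platform = sys.platform if platform is None else platform
    args = ["--headless"]
    if platform.startswith("linux"):
        # Assumption: sandboxing is unavailable (e.g. running as root in CI).
        args.append("--no-sandbox")
        args.append("--window-size=1400,2100")
    elif platform.startswith("win"):
        args.append("--disable-gpu")
    return args
```

Each returned string can then be passed to `add_argument` on the options object before the driver is created.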
Once you have Selenium and the web driver installed, the following worked for me with headless Chrome on a Linux cluster:
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("--headless")
options.add_argument("--disable-extensions")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--no-sandbox")
options.add_experimental_option("prefs",{"download.default_directory":"/databricks/driver"})
driver = webdriver.Chrome(chrome_options=options)
To do (tested on a headless Debian Linux 9.4 server):
Run these commands:
# install chrome
curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list
apt-get -y update
apt-get -y install google-chrome-stable
# install chrome driver
wget https://chromedriver.storage.googleapis.com/77.0.3865.40/chromedriver_linux64.zip
unzip chromedriver_linux64.zip
mv chromedriver /usr/bin/chromedriver
chown root:root /usr/bin/chromedriver
chmod +x /usr/bin/chromedriver
Install selenium:
pip install selenium
and run this Python code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("no-sandbox")
options.add_argument("headless")
options.add_argument("start-maximized")
options.add_argument("window-size=1900,1080")
driver = webdriver.Chrome(chrome_options=options, executable_path="/usr/bin/chromedriver")
driver.get("https://www.example.com")
html = driver.page_source
print(html)
As stated by the accepted answer:
options.add_argument("--headless")
These tips might help speed things up, especially in headless mode:
There are quite a few things you can do in headless mode that you can't do in non-headless mode.
Since you will be using Chrome Headless, I've found that adding this flag reduces CPU usage by about 20% for me (Chrome was a CPU and memory hog when I looked at htop):
--disable-crash-reporter
This only disables the crash reporter when you are running in headless mode. It might speed things up for you.
My current settings are as follows; they reduce CPU usage by about 20%, although the time saving is only marginal:
options.add_argument("--no-sandbox");
options.add_argument("--disable-dev-shm-usage");
options.add_argument("--disable-renderer-backgrounding");
options.add_argument("--disable-background-timer-throttling");
options.add_argument("--disable-backgrounding-occluded-windows");
options.add_argument("--disable-client-side-phishing-detection");
options.add_argument("--disable-crash-reporter");
options.add_argument("--disable-oopr-debug-crash-dump");
options.add_argument("--no-crash-upload");
options.add_argument("--disable-gpu");
options.add_argument("--disable-extensions");
options.add_argument("--disable-low-res-tiling");
options.add_argument("--log-level=3");
options.add_argument("--silent");
I found this to be a pretty good list (full list I think) of command line switches with explanations: https://peter.sh/experiments/chromium-command-line-switches/
Some additional things you can turn off are also mentioned here: https://github.com/GoogleChrome/chrome-launcher/blob/main/docs/chrome-flags-for-tools.md
I hope this helps someone
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome(executable_path=r"C:\Program Files\Google\Chrome\Application\chromedriver.exe", options=chrome_options)
This is ok for me.
What I have: CURRENT_BROWSER=chrome set as an environment variable on Windows.
import os
from behave import fixture, use_fixture
from selenium import webdriver

def before_scenario(context, scenario):
    use_fixture(browser, context)

def after_scenario(context, scenario):
    context.cache.clear()
    context.driver.quit()

@fixture
def browser(context):
    browser_type = os.getenv('CURRENT_BROWSER', 'chrome')
    if browser_type is None:
        raise Exception(f"Unable to identify test browser which is {browser_type}")
    if browser_type == 'chrome':
        chrome_options = webdriver.ChromeOptions()
        chrome_options.add_argument('--headless')
        # chrome_options.add_argument('--incognito')
        context.driver = webdriver.Chrome(desired_capabilities=chrome_options.to_capabilities())
    if browser_type == 'firefox':
        pass
    yield context.driver
What I need: an answer on how to deal with chromedriver on CI/CD (Azure DevOps). Should I also set an environment variable, similar to the one for the browser, add the driver to the PATH, and do the same on CI/CD, or is there a different way to deal with chromedriver? I need the above code to work both locally and on CI/CD, and I have never done that before. Locally I use the above code plus chromedriver.exe added to the project structure.
If you are using a Microsoft-hosted agent (windows-latest, windows-2019 or vs2017-win2016), Chrome Driver 87.0.4280.88 is already installed.
If you want to use another version of Chrome Driver, you can download it using npm:
- script: npm install chromedriver --chromedriver_version=LATEST
See the linked document for detailed information.
If you are using a self-hosted agent on a machine that already has Chrome Driver downloaded and PATH configured, you can use Chrome Driver just as you do on your own machine.
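To make the same code work locally and on the agent, the driver location could be resolved at run time instead of being hard-coded. The sketch below is an assumption-laden illustration: `find_chromedriver` is a hypothetical helper, and the default variable name `ChromeWebDriver` is an assumption about what the hosted image exposes, so verify it on your agent or pass a different name.

```python
import os
import shutil


def find_chromedriver(driver_dir_var="ChromeWebDriver", name="chromedriver"):
    """Return the chromedriver path from an env-var directory or PATH, else None."""
    # Assumption: the variable, if set, holds the directory containing the driver.
    driver_dir = os.environ.get(driver_dir_var)
    if driver_dir:
        found = shutil.which(name, path=driver_dir)
        if found:
            return found
    # Fall back to whatever is on PATH (the self-hosted or local case).
    return shutil.which(name)
```

If the function returns None, the test setup can fail early with a message that names both places it looked.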
I tried to follow an example of how to parse websites with Python and Selenium.
But I always run into the following problem: calling webdriver.Firefox
opens a Firefox instance, but no website can be loaded via get. The whole code seems to block inside Firefox() (see: print("open call never reached")). The browser opens, and after about 30 seconds an exception causes the browser to exit, with the message:
selenium.common.exceptions.WebDriverException: Message: Can't load the profile. Possible firefox version mismatch. You must use GeckoDriver instead for Firefox 48+. Profile Dir: /tmp/tmpl5dm_azd If you specified a log_file in the FirefoxBinary constructor, check it for details
So what am I doing wrong here? How can I set the profile correctly?
I tried setting marionette to True, but got the error: "Unable to find a matching set of capabilities"
from selenium.webdriver import Firefox
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
cap = DesiredCapabilities().FIREFOX
cap["marionette"] = False
options = Options()
options.log.level = "trace"
options.headless = True
binary = FirefoxBinary("/usr/bin/firefox")
pathDriver = "./geckodriver"
testUrl="https://duckduckgo.com/"
print("will create firefox instance")
browser = webdriver.Firefox(firefox_binary=binary,options=options,capabilities=cap,executable_path=pathDriver)
print("open call never reached")
browser.get(testUrl)
browser.quit()
My test environment:
$ uname -a
Linux 5.5.0-0.bpo.2-amd64 #1 SMP Debian 5.5.17-1~bpo10+1 (2020-04-23) x86_64 GNU/Linux
I also downloaded the latest Selenium and geckodriver.
Here are the versions I use:
$ python3 --version
Python 3.7.3
$ pip3 freeze | grep sel
selenium==3.141.0
$ geckodriver -V
geckodriver 0.27.0 (7b8c4f32cdde 2020-07-28 18:16 +0000)
$ which firefox
/usr/bin/firefox
$ firefox -v
Mozilla Firefox 68.10.0esr
When you use GeckoDriver to start a new browsing context (i.e. a Firefox browser session) with Firefox 48 or later, using Marionette is mandatory.
Solution
The solution is either to work with the default marionette setting or to set marionette to True as follows:
cap = DesiredCapabilities().FIREFOX
cap["marionette"] = True
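The Firefox 48+ rule above can be turned into a quick check on the output of `firefox -v`. This is only a sketch: `firefox_major` and `requires_marionette` are hypothetical helpers, and the parsing assumes output shaped like the "Mozilla Firefox 68.10.0esr" line shown in the question.

```python
import re


def firefox_major(version_output):
    """Extract the major version from `firefox -v` style output, or None."""
    match = re.search(r"Firefox\s+(\d+)", version_output)
    return int(match.group(1)) if match else None


def requires_marionette(version_output):
    """True when the reported Firefox version is 48 or later."""
    major = firefox_major(version_output)
    return major is not None and major >= 48
```

For the version in the question, this confirms that Marionette has to stay enabled, which is why setting the capability to False leads to the profile error.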
You added parentheses after DesiredCapabilities. Use it without them:
cap = DesiredCapabilities.FIREFOX
cap['marionette'] = False
Or you can use the webdriver_manager library, which saves a lot of headaches:
pip install webdriver_manager
and use it like this:
from selenium import webdriver
from webdriver_manager.firefox import GeckoDriverManager
from selenium.webdriver import DesiredCapabilities
options = webdriver.FirefoxOptions()
options.log.level = "trace"
options.headless = True
capabilities = DesiredCapabilities.FIREFOX
capabilities["marionette"] = False
driver = webdriver.Firefox(executable_path=GeckoDriverManager().install(), options=options)
This setup gives you the latest driver version for Selenium; your error could be caused by mismatched versions.
I have a website made using Django; clicking a button on the website triggers a scraper to start. This scraper uses Selenium. I have added the following two buildpacks needed for Selenium to my Heroku app:
1) https://github.com/heroku/heroku-buildpack-chromedriver
2) https://github.com/heroku/heroku-buildpack-google-chrome
chrome_options = webdriver.ChromeOptions()
chrome_options.binary_location='/app/.apt/usr/bin/google-chrome'
os.environ.get("GOOGLE_CHROME_BIN", "chromedriver")
browser=webdriver.Chrome(executable_path=os.environ.get("GOOGLE_CHROME_BIN", "chromedriver"),chrome_options=chrome_options)
Yet it fails to find chromedriver and throws the error that chromedriver needs to be in PATH. How do I fix this issue? Where is the chromedriver executable?
I wanted to comment with a link to where I previously answered this question, but I don't have enough rep to comment, so here you go.
Set the following paths using the heroku config:set command:
heroku config:set CHROMEDRIVER_PATH=/app/.chromedriver/bin/chromedriver
heroku config:set GOOGLE_CHROME_BIN=/app/.apt/usr/bin/google-chrome
Verify the paths using the heroku config command.
You can use this snippet to configure your driver:
import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def load_chrome_driver(proxy):
    options = Options()
    options.binary_location = os.environ.get('GOOGLE_CHROME_BIN')
    options.add_argument('--headless')
    options.add_argument('--disable-gpu')
    options.add_argument('--no-sandbox')
    options.add_argument('--remote-debugging-port=9222')
    options.add_argument('--proxy-server='+proxy)
    return webdriver.Chrome(executable_path=str(os.environ.get('CHROMEDRIVER_PATH')), chrome_options=options)
I'm using proxies, but you can probably avoid that.
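Because the snippet depends on both config variables being set, it may help to validate them before starting Chrome. The following is a minimal sketch: `read_chrome_settings` and `REQUIRED_VARS` are hypothetical names, and only the two variable names come from the answer above.

```python
import os

# The two config vars set with `heroku config:set` in the answer above.
REQUIRED_VARS = ("CHROMEDRIVER_PATH", "GOOGLE_CHROME_BIN")


def read_chrome_settings(environ=None):
    """Return the required settings as a dict, or raise if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing config vars: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Calling this at the top of `load_chrome_driver` would surface a missing variable as a clear error rather than as the string "None" being passed as the driver path.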