I am trying to run ChromeDriver from a Docker container but have not succeeded. So far I have done the following:
Implemented the Microsoft Azure Function
Written the Dockerfile
Run the Docker container successfully
Python File
import logging
import os

import azure.functions as func
from selenium import webdriver


def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request. 11:38')
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-dev-shm-usage')
    path_to_chrome = os.path.join(os.getcwd(), '/usr/local/bin/chromedriver')
    driver = webdriver.Chrome(executable_path=path_to_chrome, chrome_options=chrome_options)
Dockerfile
FROM mcr.microsoft.com/azure-functions/python:3.0-python3.8
# 0. Install essential packages
RUN apt-get update \
&& apt-get install -y \
build-essential \
cmake \
git \
wget \
unzip \
&& rm -rf /var/lib/apt/lists/*
# 1. Install Chrome (root image is debian)
# See https://stackoverflow.com/questions/49132615/installing-chrome-in-docker-file
ARG CHROME_VERSION="google-chrome-stable"
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
&& echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list \
&& apt-get update -qqy \
&& apt-get -qqy install \
${CHROME_VERSION:-google-chrome-stable} \
&& rm /etc/apt/sources.list.d/google-chrome.list \
&& rm -rf /var/lib/apt/lists/* /var/cache/apt/*
# 2. Install Chrome driver used by Selenium
RUN LATEST=$(wget -q -O - http://chromedriver.storage.googleapis.com/LATEST_RELEASE) && \
wget http://chromedriver.storage.googleapis.com/$LATEST/chromedriver_linux64.zip && \
unzip chromedriver_linux64.zip && ln -s $PWD/chromedriver /usr/local/bin/chromedriver
ENV PATH="/usr/local/bin/chromedriver:${PATH}"
# 3. Install selenium in Python
RUN pip install -U selenium
# 4. Finally, copy python code to image
COPY . /home/site/wwwroot
# 5. Install other packages in requirements.txt
RUN cd /home/site/wwwroot && \
pip install -r requirements.txt
Check whether chromedriver is in that path or not.
To check, open CMD, type chromedriver (assuming your chromedriver executable is still named like this), and hit Enter. If "Starting ChromeDriver 2.15.322448" appears, the PATH is set appropriately.
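The same check can be done from Python, which is handy inside a container where there is no CMD window. This is only a minimal sketch using the standard library; the helper name `find_chromedriver` is made up for illustration.

```python
import shutil
from typing import Optional


def find_chromedriver(name: str = "chromedriver") -> Optional[str]:
    """Return the full path of the driver if it is on PATH, else None."""
    # shutil.which searches the directories listed in PATH,
    # the same lookup the shell does when you type the command.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_chromedriver()
    if path is None:
        print("chromedriver is not on PATH")
    else:
        print(f"chromedriver found at {path}")
```

If this prints that the driver is not on PATH, pass the direct path to the driver instead.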
Alternatively you can use a direct path to the chromedriver like this:
driver = webdriver.Chrome('/path/to/chromedriver')
So in your specific case:
driver = webdriver.Chrome("C:/Users/Username/Downloads/chromedriver_win32/chromedriver.exe")
Also, you need to install the selenium package.
Try installing the Alpine-compatible version of chromedriver, available in the Alpine repositories, using apk add chromium-chromedriver:
https://pkgs.alpinelinux.org/package/v3.9/community/x86_64/chromium-chromedriver
I'm trying to dockerize and run a web scraper developed using the selenium library in Python. I used Windows 10 for development, and it ran well there. When running the same script as a Docker image, I'm getting multiple issues. This is how I connect the driver on Windows.
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
I didn't use any options because I had no need for them. Because I got a root user error when running in Docker, I added the --no-sandbox option and ran the code as below.
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
driver = webdriver.Chrome(options = chrome_options, service=Service(ChromeDriverManager().install()))
Still, it didn't start. So I configured it by hardcoding the driver path.
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
driver = webdriver.Chrome(executable_path=driverPath, options=chrome_options)
Even then it didn't start, because no display was configured. So I added the headless argument and ran it again, but in the end I got the error below.
Tkinter.TclError: no display name and no $DISPLAY environment variable
So I tried to start the display by the below code.
if platform.system() == 'Linux':
    from pyvirtualdisplay import Display
    display = Display(visible=0, size=(800, 800))
    display.start()
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
driver = webdriver.Chrome(executable_path=driverPath, options=chrome_options)
But it does not run; it freezes and never creates the driver session.
This is my Dockerfile:
FROM python
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
&& echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list
RUN apt-get update && apt-get -y install google-chrome-stable
RUN apt-get install -yqq unzip
RUN wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip
RUN unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
RUN apt-get install xvfb mesa-utils -y \
&& apt install freeglut3-dev -y
ENV DISPLAY=:99
RUN mkdir -p /app/drivers
ADD requirements.txt /app
ADD sample.py /app
COPY run.sh /app
COPY drivers /app/drivers
COPY csv /app/csv
WORKDIR /app
RUN pip3 install -r requirements.txt
CMD ./run.sh
run.sh
#!/bin/sh
#Xvfb :99 -screen 0 640x480x8 -nolisten tcp &
python3 ./sample.py
requirements.txt
selenium==4.3.0
webdriver-manager==3.8.2
chromedriver-py==103.0.5060.53
pyvirtualdisplay==3.0
What mistakes did I make in the code? And how can I run the Selenium Python app with a display in Docker? Thank you.
It seems the display can't be enabled in the python image. So I built my own image from the ubuntu image, as described on this site, and installed Python and the other dependencies required for my application there. Now I'm able to run the application without any issues.
FROM ubuntu
#Enabling noninteractive environment and setting Timezone to install python3-tk without any interruption
ENV TZ=Asia/Kolkata
ENV DEBIAN_FRONTEND=noninteractive
# python
RUN apt-get update
RUN apt-get install -y python3 python3-setuptools python3-pip python3-tk
# Essential tools and xvfb
RUN apt-get update && apt-get install -y \
software-properties-common \
unzip \
curl \
xvfb
# Chrome browser to run the tests
RUN curl https://dl-ssl.google.com/linux/linux_signing_key.pub -o /tmp/google.pub \
&& cat /tmp/google.pub | apt-key add -; rm /tmp/google.pub \
&& echo 'deb http://dl.google.com/linux/chrome/deb/ stable main' > /etc/apt/sources.list.d/google.list \
&& mkdir -p /usr/share/desktop-directories \
&& apt-get -y update && apt-get install -y google-chrome-stable
# Disable the SUID sandbox so that chrome can launch without being in a privileged container
RUN dpkg-divert --add --rename --divert /opt/google/chrome/google-chrome.real /opt/google/chrome/google-chrome \
&& echo "#!/bin/bash\nexec /opt/google/chrome/google-chrome.real --no-sandbox --disable-setuid-sandbox \"\$@\"" > /opt/google/chrome/google-chrome \
&& chmod 755 /opt/google/chrome/google-chrome
# Chrome Driver
RUN mkdir -p /opt/selenium \
&& curl http://chromedriver.storage.googleapis.com/2.45/chromedriver_linux64.zip -o /opt/selenium/chromedriver_linux64.zip \
&& cd /opt/selenium; unzip /opt/selenium/chromedriver_linux64.zip; rm -rf chromedriver_linux64.zip; ln -fs /opt/selenium/chromedriver /usr/local/bin/chromedriver;
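# display
# "RUN export" and background processes started by RUN do not persist beyond their build step,
# so set DISPLAY with ENV and start "Xvfb :20 -screen 0 1366x768x16 &" from run.sh at container start.
ENV DISPLAY=:20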
RUN mkdir -p /app
ADD requirements.txt /app
ADD app.py /app
WORKDIR /app
RUN pip3 install -r requirements.txt
CMD ./run.sh
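The display check from the question can also live in the script itself, so the same code runs on a Windows development machine and in the Linux container. This is a rough sketch, assuming pyvirtualdisplay and Xvfb are installed in the image as in the Dockerfile above; `needs_virtual_display` and `start_display_if_needed` are hypothetical helper names.

```python
import os
import platform


def needs_virtual_display(system: str, environ: dict) -> bool:
    """True when running on Linux without an X display configured."""
    return system == "Linux" and not environ.get("DISPLAY")


def start_display_if_needed():
    """Start an Xvfb-backed display via pyvirtualdisplay when required.

    Returns the Display object (so the caller can stop it later) or None.
    """
    if not needs_virtual_display(platform.system(), dict(os.environ)):
        return None
    # Imported lazily so the module still loads where pyvirtualdisplay
    # is not installed (for example on a Windows development machine).
    from pyvirtualdisplay import Display

    display = Display(visible=0, size=(1366, 768))
    display.start()
    return display
```

Call `start_display_if_needed()` before creating the driver, and call `stop()` on the returned object (if any) after `driver.quit()`.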
Inside the Dockerfile, I have the following code, where I'm setting the base image to python.
FROM python:3.9
ENV DISPLAY=:99
ENV DISPLAY_CONFIGURATION=1080x820x24
ADD ./requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
RUN apt-get update && apt-get install -y xvfb wget unzip libnss3-tools
RUN echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list \
&& wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
RUN apt-get update && apt-get install -y google-chrome-stable
RUN apt-get install -yqq unzip
# Download the Chrome Driver
RUN wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip
# Unzip the Chrome Driver into /usr/local/bin directory
RUN unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
In the Robot Framework file:
Navigate To
    Start Virtual Display    1080    800
    go to    ${BASE_URL}
    Wait Until Element Is Visible    ${LOGIN_PAGE_Text}
    click element    ${LOGIN_SIGNIN_BUTTON}
    Set Window Size    1080    820
    maximize browser window
I need a Dockerfile to run my Python script. The script uses Selenium, so I need to load a driver for it to work. An ordinary .exe driver is not suitable, so on the advice of the administrators of the hosting where the script is located, I need to create a Dockerfile for the script to work properly.
The main problem is that I simply cannot run my script, because I do not understand how to load the required driver on the server.
This is a sample of what should be in the Dockerfile:
FROM python:3
RUN apt-get update -y
RUN apt-get install -y wget
RUN wget -O $HOME/geckodriver.tar.gz https://github.com/mozilla/geckodriver/releases/download/v0.23.0/geckodriver-v0.23.0-linux64.tar.gz
RUN tar xf $HOME/geckodriver.tar.gz -C $HOME
RUN cp $HOME/geckodriver /usr/local/bin/geckodriver
RUN chmod +x /usr/local/bin/geckodriver
RUN rm -f $HOME/geckodriver $HOME/geckodriver.tar.gz
This is the code used in the Python script
options = Options()
options.add_argument('headless')
options.add_argument('window-size=1920x935')
driver = webdriver.Chrome(options=options, executable_path=r"chromedriver.exe")
driver.get(f"https://www.wildberries.ru/catalog/{id}/feedbacks?imtId={imt_id}")
time.sleep(5)
big_stat = driver.find_element(by=By.CLASS_NAME, value="rating-product__numb")
I can redo this snippet of code to make it work on Firefox, if necessary.
This is what the directories of the hosting where all the files are located look like
For getting Selenium to work with Python using a Dockerfile, here's an existing SeleniumBase Dockerfile.
For instructions on using it, see the README.
For building, it's basically this:
On a machine other than an Apple M1 Mac:
docker build -t seleniumbase .
If running on an Apple M1 Mac, use this instead:
docker build --platform linux/amd64 -t seleniumbase .
Before building the Dockerfile, you'll need to clone SeleniumBase.
Here's what the Dockerfile currently looks like:
FROM ubuntu:18.04
#=======================================
# Install Python and Basic Python Tools
#=======================================
RUN apt-get -o Acquire::Check-Valid-Until=false -o Acquire::Check-Date=false update
RUN apt-get install -y python3 python3-pip python3-setuptools python3-dev python-distribute
RUN alias python=python3
RUN echo "alias python=python3" >> ~/.bashrc
#=================================
# Install Bash Command Line Tools
#=================================
RUN apt-get -qy --no-install-recommends install \
sudo \
unzip \
wget \
curl \
libxi6 \
libgconf-2-4 \
vim \
xvfb \
&& rm -rf /var/lib/apt/lists/*
#================
# Install Chrome
#================
RUN curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - && \
echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list && \
apt-get -yqq update && \
apt-get -yqq install google-chrome-stable && \
rm -rf /var/lib/apt/lists/*
#=================
# Install Firefox
#=================
RUN apt-get -qy --no-install-recommends install \
$(apt-cache depends firefox | grep Depends | sed "s/.*ends:\ //" | tr '\n' ' ') \
&& rm -rf /var/lib/apt/lists/* \
&& cd /tmp \
&& wget --no-check-certificate -O firefox-esr.tar.bz2 \
'https://download.mozilla.org/?product=firefox-esr-latest&os=linux64&lang=en-US' \
&& tar -xjf firefox-esr.tar.bz2 -C /opt/ \
&& ln -s /opt/firefox/firefox /usr/bin/firefox \
&& rm -f /tmp/firefox-esr.tar.bz2
#===========================
# Configure Virtual Display
#===========================
RUN set -e
RUN echo "Starting X virtual framebuffer (Xvfb) in background..."
RUN Xvfb -ac :99 -screen 0 1280x1024x16 > /dev/null 2>&1 &
RUN export DISPLAY=:99
RUN exec "$@"
#=======================
# Update Python Version
#=======================
RUN apt-get update -y
RUN apt-get -qy --no-install-recommends install python3.8
RUN rm /usr/bin/python3
RUN ln -s python3.8 /usr/bin/python3
#=============================================
# Allow Special Characters in Python Programs
#=============================================
RUN export PYTHONIOENCODING=utf8
RUN echo "export PYTHONIOENCODING=utf8" >> ~/.bashrc
#=====================
# Set up SeleniumBase
#=====================
COPY sbase /SeleniumBase/sbase/
COPY seleniumbase /SeleniumBase/seleniumbase/
COPY examples /SeleniumBase/examples/
COPY integrations /SeleniumBase/integrations/
COPY requirements.txt /SeleniumBase/requirements.txt
COPY setup.py /SeleniumBase/setup.py
RUN find . -name '*.pyc' -delete
RUN find . -name __pycache__ -delete
RUN pip3 install --upgrade pip
RUN pip3 install --upgrade setuptools
RUN pip3 install --upgrade setuptools-scm
RUN cd /SeleniumBase && ls && pip3 install -r requirements.txt --upgrade
RUN cd /SeleniumBase && pip3 install .
#=====================
# Download WebDrivers
#=====================
RUN wget https://github.com/mozilla/geckodriver/releases/download/v0.31.0/geckodriver-v0.31.0-linux64.tar.gz
RUN tar -xvzf geckodriver-v0.31.0-linux64.tar.gz
RUN chmod +x geckodriver
RUN mv geckodriver /usr/local/bin/
RUN wget https://chromedriver.storage.googleapis.com/2.44/chromedriver_linux64.zip
RUN unzip chromedriver_linux64.zip
RUN chmod +x chromedriver
RUN mv chromedriver /usr/local/bin/
#==========================================
# Create entrypoint and grab example tests
#==========================================
COPY integrations/docker/docker-entrypoint.sh /
COPY integrations/docker/run_docker_test_in_firefox.sh /
COPY integrations/docker/run_docker_test_in_chrome.sh /
RUN chmod +x *.sh
COPY integrations/docker/docker_config.cfg /SeleniumBase/examples/
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["/bin/bash"]
I need to put my little Flask app into a Docker container. It checks which types of Google Tags my company's clients have installed, and for that I need selenium-wire. You supply a website and you get a JSON back telling you which tags are installed (a bit like http://gachecker.com/). It works just fine as a Flask app. The issue arises when I try to put it into Docker. Here is my Dockerfile:
FROM python:3.9
WORKDIR /bziiit_checker_app
RUN pip install flask flask_restful requests BeautifulSoup4 selenium-wire undetected-chromedriver chromedriver-py
COPY ./app ./app
CMD ["python", "./app/main.py"]
Once it's in Docker and I try to run it, I get this message:
"selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH"
This is a common issue when the chromedriver.exe file is not in the working directory. But it IS.
Do I need to set the PATH when I'm creating the virtual environment, and if so, how do I do that?
Again, I'm good at A.I., terrible at app development.
I'm using Python 3.9 on Windows 10, with Visual Studio Code and Flask.
Thank you
After a few days of pain and suffering I finally worked it out, so here is the Dockerfile I created to get chromedriver to work in a Docker container.
This works on Windows 10 using VS Code.
FROM python:3.8
# Adding trusting keys to apt for repositories
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
# Adding Google Chrome to the repositories
RUN sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list'
# Updating apt to see and install Google Chrome
RUN apt-get -y update
# Magic happens
RUN apt-get install -y google-chrome-stable
# Installing Unzip
RUN apt-get install -yqq unzip
# Download and install the Chrome Driver
RUN wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip
RUN unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
# Set display port as an environment variable
ENV DISPLAY=:99
COPY ./app ./app
WORKDIR /app
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
CMD ["python", "./main.py"]
Then, in your script, add these arguments to Chromedriver's options; otherwise it'll give you an error message telling you that "Chromedriver has exited abnormally".
option = webdriver.ChromeOptions()
option.add_argument("--disable-gpu")
option.add_argument("--disable-extensions")
option.add_argument("--disable-infobars")
option.add_argument("--start-maximized")
option.add_argument("--disable-notifications")
option.add_argument('--headless')
option.add_argument('--no-sandbox')
option.add_argument('--disable-dev-shm-usage')
I hope this will save someone all the headache that problem gave me
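Put together, the container-related arguments above might be applied like this. This is a sketch, assuming Selenium 4 and a chromedriver at /usr/local/bin/chromedriver (where the Dockerfile above unzips it); `CONTAINER_ARGS` and `build_driver` are illustrative names, not part of Selenium.

```python
# Arguments taken from the list above that matter most inside a container.
CONTAINER_ARGS = [
    "--headless",               # no display is available in the container
    "--no-sandbox",             # needed when Chrome runs as root
    "--disable-dev-shm-usage",  # /dev/shm is small in Docker by default
    "--disable-gpu",
]


def build_driver(driver_path: str = "/usr/local/bin/chromedriver"):
    """Create a Chrome driver configured for running inside a container."""
    # Imported here so the module can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for arg in CONTAINER_ARGS:
        options.add_argument(arg)
    return webdriver.Chrome(service=Service(driver_path), options=options)
```

Keeping the arguments in one list makes it easy to add or drop the remaining ones (for example --disable-extensions) without touching the driver setup.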
You will also have to install Chrome and chromedriver inside your container:
RUN add-apt-repository -y ppa:openjdk-r/ppa
RUN apt-get install -y openjdk-12-jre cron wget unzip
ARG CHROME_VERSION=78.0.3904.87-1
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
&& echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list \
&& apt-get update -qqy \
&& apt-get -qqy install google-chrome-stable=$CHROME_VERSION \
&& rm /etc/apt/sources.list.d/google-chrome.list \
&& rm -rf /var/lib/apt/lists/* /var/cache/apt/* \
&& sed -i 's/"$HERE\/chrome"/"$HERE\/chrome" --no-sandbox/g' /opt/google/chrome/google-chrome
ARG CHROME_DRIVER_VERSION=78.0.3904.70
RUN wget --no-verbose -O /tmp/chromedriver_linux64.zip https://chromedriver.storage.googleapis.com/$CHROME_DRIVER_VERSION/chromedriver_linux64.zip \
&& rm -rf /opt/chromedriver \
&& unzip /tmp/chromedriver_linux64.zip -d /opt \
&& rm /tmp/chromedriver_linux64.zip \
&& mv /opt/chromedriver /opt/chromedriver-$CHROME_DRIVER_VERSION \
&& chmod 755 /opt/chromedriver-$CHROME_DRIVER_VERSION \
&& ln -fs /opt/chromedriver-$CHROME_DRIVER_VERSION /usr/bin/chromedriver
I have a Python script that scrapes some web information using Selenium. I've built a Docker image of my project:
FROM python:3.7-slim
WORKDIR /
COPY requirements.txt ./
RUN pip install --upgrade pip && pip install -r requirements.txt
COPY . .
RUN pip install -e .
CMD ["python", "src/project/scraper.py"]
I get the following error when I run it: selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
The chromedriver.exe file is located in a data folder and the .py script refers to the right place (it does run locally).
Does anyone know how I would be able to run chrome in this container?
The folder structure is as follows:
|-- data
|   |-- chromedriver.exe
|   |-- file.csv
|-- src
|   |-- project
|       |-- scraper.py
|-- Dockerfile
|-- requirements.txt
Thanks!
Let me share what has worked for me in the past.
Try installing Chrome and chromedriver, and setting the PATH, from within the Dockerfile.
Note:
This uses Python 3.8. You can try changing it to 3.7 and see if it works for you.
It is NOT configured for multi-stage builds, i.e. you may want to remove "FROM python:3.7-slim" from your own Dockerfile before appending your part at the end.
FROM python:3.8 AS builder
RUN apt-get update; apt-get clean
# Install chrome dependencies
RUN apt-get install -y x11vnc xvfb fluxbox wget wmctrl unzip
# Set up the Chrome PPA
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
RUN echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list
# Update the package list and install chrome
RUN apt-get update -y
RUN apt-get install -y google-chrome-stable
# Set up Chromedriver Environment variables
ENV CHROMEDRIVER_VERSION 87.0.4280.88
ENV CHROMEDRIVER_DIR /chromedriver
RUN mkdir $CHROMEDRIVER_DIR
# Download and install Chromedriver
RUN wget -q --continue -P $CHROMEDRIVER_DIR "http://chromedriver.storage.googleapis.com/$CHROMEDRIVER_VERSION/chromedriver_linux64.zip"
RUN unzip $CHROMEDRIVER_DIR/chromedriver* -d $CHROMEDRIVER_DIR
# Put Chromedriver into the PATH
ENV PATH $CHROMEDRIVER_DIR:$PATH
RUN python -m venv /opt/venv
# Make sure we use the virtualenv:
ENV PATH="/opt/venv/bin:$PATH"
...
<YOUR_DOCKERFILE_PARTS>
As of today (Jan 25th, 2021), the latest stable version of Google Chrome (released Jan 19th, 2021) is 88.0.4324.96. So if the above doesn't work, try changing the Chromedriver version so that it matches the installed Chrome browser.
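A quick way to spot such a mismatch is to compare the major versions reported by the browser and the driver. This is a sketch only: it assumes the version text looks like the output of `google-chrome --version` and `chromedriver --version` (a name followed by a four-part version number), and the helper names are made up for illustration.

```python
import re
from typing import Optional


def major_version(version_output: str) -> Optional[int]:
    """Extract the major version from text such as 'Google Chrome 88.0.4324.96'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    return int(match.group(1)) if match else None


def versions_match(chrome_output: str, driver_output: str) -> bool:
    """True when browser and driver report the same major version."""
    chrome = major_version(chrome_output)
    driver = major_version(driver_output)
    return chrome is not None and chrome == driver


if __name__ == "__main__":
    # Version numbers taken from the text above: Chrome 88 vs. driver 87.
    print(versions_match("Google Chrome 88.0.4324.96", "ChromeDriver 87.0.4280.88"))
```

With the two versions mentioned here the check reports a mismatch, which is the situation where changing CHROMEDRIVER_VERSION is worth trying.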
I assume you are using Linux containers in Docker, since python:3.7-slim is a Linux image. You cannot execute Windows binaries (.exe files) on Linux, so you need to install chromedriver for Linux: How to Setup Selenium with ChromeDriver on Ubuntu 18.04 & 16.04
Your Dockerfile should look something like this
FROM python:3.7-slim
# install Chrome and chromedriver (curl, wget and gnupg are not part of the slim image)
RUN apt-get update && \
apt-get install -y curl wget gnupg unzip xvfb libxi6 libgconf-2-4 && \
apt-get install -y default-jdk && \
curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - && \
echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list && \
apt-get -y update && \
apt-get -y install google-chrome-stable && \
wget https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip && \
unzip chromedriver_linux64.zip && \
mv chromedriver /usr/bin/chromedriver && \
chown root:root /usr/bin/chromedriver && \
chmod +x /usr/bin/chromedriver
WORKDIR /
COPY requirements.txt ./
RUN pip install --upgrade pip && pip install -r requirements.txt
COPY . .
RUN pip install -e .
CMD ["python", "src/project/scraper.py"]
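Because the script currently points at data/chromedriver.exe, it also needs to pick a different driver file when it runs in the Linux container. One possible way to do that is sketched below; it assumes you keep a driver binary per platform in the same folder, and `driver_filename` and `driver_path` are hypothetical helpers, not part of Selenium.

```python
import platform
from pathlib import Path


def driver_filename(system: str) -> str:
    """Windows builds of chromedriver end in .exe; Linux and macOS builds do not."""
    return "chromedriver.exe" if system == "Windows" else "chromedriver"


def driver_path(data_dir: str, system: str = "") -> Path:
    """Build the path to the driver inside data_dir for the given OS."""
    # Fall back to the OS the script is actually running on.
    system = system or platform.system()
    return Path(data_dir) / driver_filename(system)


if __name__ == "__main__":
    print(driver_path("data"))
```

If the driver is installed to /usr/bin/chromedriver as in the Dockerfile above, it is already on the PATH inside the container, and the script can simply skip the explicit path there.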