How to make a Dockerfile? - python

I need a Dockerfile to run my Python script. The script uses Selenium, so it needs a browser driver to work. An ordinary Windows .exe driver is not suitable, so on the advice of the administrators of the hosting where the script is located, I need to create a Dockerfile for the script to work properly.
The main problem is that I cannot run my script at all, because I do not understand how to install the required driver on the server.
This is a sample of what should be in the Dockerfile:
FROM python:3
RUN apt-get update -y
RUN apt-get install -y wget
RUN wget -O $HOME/geckodriver.tar.gz https://github.com/mozilla/geckodriver/releases/download/v0.23.0/geckodriver-v0.23.0-linux64.tar.gz
RUN tar xf $HOME/geckodriver.tar.gz -C $HOME
RUN cp $HOME/geckodriver /usr/local/bin/geckodriver
RUN chmod +x /usr/local/bin/geckodriver
RUN rm -f $HOME/geckodriver $HOME/geckodriver.tar.gz
This is the code used in the Python script
options = Options()
options.add_argument('headless')
options.add_argument('window-size=1920x935')
driver = webdriver.Chrome(options=options, executable_path=r"chromedriver.exe")
driver.get(f"https://www.wildberries.ru/catalog/{id}/feedbacks?imtId={imt_id}")
time.sleep(5)
big_stat = driver.find_element(by=By.CLASS_NAME, value="rating-product__numb")
I can redo this snippet of code to make it work on Firefox, if necessary.
This is what the directories on the hosting, where all the files are located, look like:
[image: the directories of the hosting]

For getting Selenium to work with Python using a Dockerfile, here's an existing SeleniumBase Dockerfile.
For instructions on using it, see the README.
Building it is basically this:
On anything other than an Apple M1 Mac:
docker build -t seleniumbase .
If running on an Apple M1 Mac, use this instead:
docker build --platform linux/amd64 -t seleniumbase .
Before building the Dockerfile, you'll need to clone SeleniumBase.
Here's what the Dockerfile currently looks like:
FROM ubuntu:18.04
#=======================================
# Install Python and Basic Python Tools
#=======================================
RUN apt-get -o Acquire::Check-Valid-Until=false -o Acquire::Check-Date=false update
RUN apt-get install -y python3 python3-pip python3-setuptools python3-dev python-distribute
RUN alias python=python3
RUN echo "alias python=python3" >> ~/.bashrc
#=================================
# Install Bash Command Line Tools
#=================================
RUN apt-get -qy --no-install-recommends install \
sudo \
unzip \
wget \
curl \
libxi6 \
libgconf-2-4 \
vim \
xvfb \
&& rm -rf /var/lib/apt/lists/*
#================
# Install Chrome
#================
RUN curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - && \
echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list && \
apt-get -yqq update && \
apt-get -yqq install google-chrome-stable && \
rm -rf /var/lib/apt/lists/*
#=================
# Install Firefox
#=================
RUN apt-get -qy --no-install-recommends install \
$(apt-cache depends firefox | grep Depends | sed "s/.*ends:\ //" | tr '\n' ' ') \
&& rm -rf /var/lib/apt/lists/* \
&& cd /tmp \
&& wget --no-check-certificate -O firefox-esr.tar.bz2 \
'https://download.mozilla.org/?product=firefox-esr-latest&os=linux64&lang=en-US' \
&& tar -xjf firefox-esr.tar.bz2 -C /opt/ \
&& ln -s /opt/firefox/firefox /usr/bin/firefox \
&& rm -f /tmp/firefox-esr.tar.bz2
#===========================
# Configure Virtual Display
#===========================
RUN set -e
RUN echo "Starting X virtual framebuffer (Xvfb) in background..."
RUN Xvfb -ac :99 -screen 0 1280x1024x16 > /dev/null 2>&1 &
RUN export DISPLAY=:99
RUN exec "$@"
#=======================
# Update Python Version
#=======================
RUN apt-get update -y
RUN apt-get -qy --no-install-recommends install python3.8
RUN rm /usr/bin/python3
RUN ln -s python3.8 /usr/bin/python3
#=============================================
# Allow Special Characters in Python Programs
#=============================================
RUN export PYTHONIOENCODING=utf8
RUN echo "export PYTHONIOENCODING=utf8" >> ~/.bashrc
#=====================
# Set up SeleniumBase
#=====================
COPY sbase /SeleniumBase/sbase/
COPY seleniumbase /SeleniumBase/seleniumbase/
COPY examples /SeleniumBase/examples/
COPY integrations /SeleniumBase/integrations/
COPY requirements.txt /SeleniumBase/requirements.txt
COPY setup.py /SeleniumBase/setup.py
RUN find . -name '*.pyc' -delete
RUN find . -name __pycache__ -delete
RUN pip3 install --upgrade pip
RUN pip3 install --upgrade setuptools
RUN pip3 install --upgrade setuptools-scm
RUN cd /SeleniumBase && ls && pip3 install -r requirements.txt --upgrade
RUN cd /SeleniumBase && pip3 install .
#=====================
# Download WebDrivers
#=====================
RUN wget https://github.com/mozilla/geckodriver/releases/download/v0.31.0/geckodriver-v0.31.0-linux64.tar.gz
RUN tar -xvzf geckodriver-v0.31.0-linux64.tar.gz
RUN chmod +x geckodriver
RUN mv geckodriver /usr/local/bin/
RUN wget https://chromedriver.storage.googleapis.com/2.44/chromedriver_linux64.zip
RUN unzip chromedriver_linux64.zip
RUN chmod +x chromedriver
RUN mv chromedriver /usr/local/bin/
#==========================================
# Create entrypoint and grab example tests
#==========================================
COPY integrations/docker/docker-entrypoint.sh /
COPY integrations/docker/run_docker_test_in_firefox.sh /
COPY integrations/docker/run_docker_test_in_chrome.sh /
RUN chmod +x *.sh
COPY integrations/docker/docker_config.cfg /SeleniumBase/examples/
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["/bin/bash"]
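On the script side, the snippet from the question could be switched to Firefox roughly as follows. This is a minimal sketch, assuming Selenium 4 and assuming geckodriver ends up in /usr/local/bin as in the Dockerfile above; the helper names resolve_driver and make_firefox are made up for illustration and are not part of SeleniumBase.

```python
import shutil


def resolve_driver(name, fallback):
    """Return the driver found on PATH, or the fallback path if none is found."""
    return shutil.which(name) or fallback


def make_firefox(driver_path=None):
    """Create a headless Firefox session using the Selenium 4 style API."""
    # Imported lazily so resolve_driver() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    options = Options()
    options.add_argument("-headless")  # headless mode needs no display server
    # Assumption: geckodriver was moved to /usr/local/bin by the Dockerfile.
    path = driver_path or resolve_driver("geckodriver", "/usr/local/bin/geckodriver")
    return webdriver.Firefox(service=Service(executable_path=path), options=options)
```

Inside the container, driver = make_firefox() followed by driver.get(...) would then replace the webdriver.Chrome(...) line from the question.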

Related

Running a Selenium(Python) based app in Docker

I'm trying to dockerize and run a web scraper developed with the Selenium library in Python. I developed it on Windows 10, where it ran well. When running the same script as a Docker image, I'm getting multiple issues. This is how I create the driver on Windows:
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
I didn't use options because I had no use case for them. Since I got a root-user error while running in Docker, I added the --no-sandbox option and ran the code as below.
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
driver = webdriver.Chrome(options = chrome_options, service=Service(ChromeDriverManager().install()))
Still, it didn't start. So I hardcoded the driver path.
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
driver = webdriver.Chrome(executable_path=driverPath, options=chrome_options)
Even then it didn't start, as the display was not configured. So I added the headless argument and ran it, but in the end I got the error below.
Tkinter.TclError: no display name and no $DISPLAY environment variable
So I tried to start the display by the below code.
if platform.system() == 'Linux':
    from pyvirtualdisplay import Display
    display = Display(visible=0, size=(800, 800))
    display.start()
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
driver = webdriver.Chrome(executable_path=driverPath, options=chrome_options)
But it does not run: it freezes and never creates the driver session.
This is my Dockerfile
FROM python
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
&& echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list
RUN apt-get update && apt-get -y install google-chrome-stable
RUN apt-get install -yqq unzip
RUN wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip
RUN unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
RUN apt-get install xvfb mesa-utils -y \
&& apt install freeglut3-dev -y
ENV DISPLAY=:99
RUN mkdir -p /app/drivers
ADD requirements.txt /app
ADD sample.py /app
COPY run.sh /app
COPY drivers /app/drivers
COPY csv /app/csv
WORKDIR /app
RUN pip3 install -r requirements.txt
CMD ./run.sh
run.sh
#!/bin/sh
#Xvfb :99 -screen 0 640x480x8 -nolisten tcp &
python3 ./sample.py
requirements.txt
selenium==4.3.0
webdriver-manager==3.8.2
chromedriver-py==103.0.5060.53
pyvirtualdisplay==3.0
What are the mistakes I made in the code? And how to run the selenium python app with display in docker? Thank you.
It seems the display can't be enabled in the official python image. So I built my own Python image on top of the ubuntu image, as described on this site. There I installed Python and the other dependencies required for my application, and now I'm able to run the application without any issues.
FROM ubuntu
# Enable the noninteractive frontend and set the timezone so python3-tk installs without prompting
ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=Asia/Kolkata
# python
RUN apt-get update
RUN apt-get install -y python3 python3-setuptools python3-pip python3-tk
# Essential tools and xvfb
RUN apt-get update && apt-get install -y \
software-properties-common \
unzip \
curl \
xvfb
# Chrome browser to run the tests
RUN curl https://dl-ssl.google.com/linux/linux_signing_key.pub -o /tmp/google.pub \
&& cat /tmp/google.pub | apt-key add -; rm /tmp/google.pub \
&& echo 'deb http://dl.google.com/linux/chrome/deb/ stable main' > /etc/apt/sources.list.d/google.list \
&& mkdir -p /usr/share/desktop-directories \
&& apt-get -y update && apt-get install -y google-chrome-stable
# Disable the SUID sandbox so that chrome can launch without being in a privileged container
RUN dpkg-divert --add --rename --divert /opt/google/chrome/google-chrome.real /opt/google/chrome/google-chrome \
&& echo "#!/bin/bash\nexec /opt/google/chrome/google-chrome.real --no-sandbox --disable-setuid-sandbox \"\$@\"" > /opt/google/chrome/google-chrome \
&& chmod 755 /opt/google/chrome/google-chrome
# Chrome Driver
RUN mkdir -p /opt/selenium \
&& curl http://chromedriver.storage.googleapis.com/2.45/chromedriver_linux64.zip -o /opt/selenium/chromedriver_linux64.zip \
&& cd /opt/selenium; unzip /opt/selenium/chromedriver_linux64.zip; rm -rf chromedriver_linux64.zip; ln -fs /opt/selenium/chromedriver /usr/local/bin/chromedriver;
# display
RUN export DISPLAY=:20
RUN Xvfb :20 -screen 0 1366x768x16 &
RUN mkdir -p /app
ADD requirements.txt /app
ADD app.py /app
WORKDIR /app
RUN pip3 install -r requirements.txt
CMD ./run.sh
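One thing to keep in mind with Dockerfiles like these: a RUN instruction only executes while the image is being built, so a variable exported there, or an Xvfb process started in the background there, is not present when the container later runs. A possible alternative, sketched here under the assumption that the xvfb package is installed in the image, is to start Xvfb from the Python program itself. The function names are illustrative only.

```python
import os
import subprocess


def xvfb_command(display=":20", width=1366, height=768, depth=16):
    """Build the Xvfb command line for one virtual screen."""
    return ["Xvfb", display, "-screen", "0", f"{width}x{height}x{depth}"]


def start_xvfb(display=":20"):
    """Start Xvfb in the background and point this process at it.

    Assumption: the Xvfb binary is on PATH inside the container.
    """
    process = subprocess.Popen(
        xvfb_command(display),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Child processes (the browser, Tk) inherit DISPLAY from this environment.
    os.environ["DISPLAY"] = display
    return process
```

Calling start_xvfb() once at the top of the application, before creating the driver, and process.terminate() at the end keeps the display tied to the lifetime of the script.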

How to create Dockerfile with R + Anaconda3 + non-root User

I need to create a Dockerfile that emulates a normal workspace.
We have a virtual machine where we train models.
We use R and Python 3.
I want to automate some of the processes without changing the codebase,
e.g. ~ must point to /home/<some user>.
The biggest problem is Anaconda3 in Docker, because every RUN is a standalone login.
Basis for my answer: https://github.com/xychelsea/anaconda3-docker/blob/main/Dockerfile
I've created my own mini R package installer:
install_r_packages.sh
#!/bin/bash
input="r-requirements.txt"
Rscript -e "install.packages('remotes')"
IFS='='
while IFS= read -r line; do
    read -r package version <<<$line
    package=$(echo "$package" | sed 's/ *$//g')
    version=$(echo "$version" | sed 's/ *$//g')
    if ! [[ ($package =~ ^#.*) || (-z $package) ]]; then
        Rscript -e "remotes::install_version('$package', version = '$version')"
    fi
done <$input
r-requirements.txt
# packages for rmarkdown
htmltools=0.5.2
jsonlite=1.7.2
...
rmarkdown=2.11
# more packages
...
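The parsing rule the installer script applies to this file (skip blank lines and # comments, split each remaining line at the = sign) can be sketched in Python as below. This is only an illustration of the file format, not a replacement for the script; parse_requirements is a hypothetical name, and unlike the shell version it trims whitespace on both sides of each field.

```python
def parse_requirements(text):
    """Return (package, version) pairs from 'package=version' lines."""
    pins = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Blank lines and comment lines carry no package pin.
        if not line or line.startswith("#"):
            continue
        package, _, version = line.partition("=")
        pins.append((package.strip(), version.strip()))
    return pins
```

Each returned pair corresponds to one remotes::install_version(package, version = version) call in the shell script.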
Dockerfile
FROM debian:bullseye
RUN apt-get update
# install R
RUN apt-get install -y r-base r-base-dev libatlas3-base r-recommended libssl-dev openssl \
libcurl4-openssl-dev libfontconfig1-dev libxml2-dev xml2 pandoc lua5.3 clang
ENV ARROW_S3=ON \
LIBARROW_MINIMAL=false \
LIBARROW_BINARY=true \
RSTUDIO_PANDOC=/usr/lib/rstudio-server/bin/pandoc \
TZ=Etc/UTC
COPY r-requirements.txt .
COPY scripts/install_r_packages.sh scripts/install_r_packages.sh
RUN bash scripts/install_r_packages.sh
# create user
ENV REPORT_USER="reporter"
ENV PROJECT_HOME=/home/${REPORT_USER}/<project>
RUN useradd -ms /bin/bash ${REPORT_USER} \
&& mkdir /data \
&& mkdir /opt/mlflow \
&& chown -R ${REPORT_USER}:${REPORT_USER} /data \
&& chown -R ${REPORT_USER}:${REPORT_USER} /opt/mlflow
# copy project files
WORKDIR ${PROJECT_HOME}
COPY src src
... bla bla bla ...
COPY requirements.txt .
RUN chown -R ${REPORT_USER}:${REPORT_USER} ${PROJECT_HOME}
# Install python Anaconda env
ENV ANACONDA_PATH="/opt/anaconda3"
ENV PATH=${ANACONDA_PATH}/bin:${PATH}
ENV ANACONDA_INSTALLER=Anaconda3-2021.11-Linux-x86_64.sh
RUN mkdir ${ANACONDA_PATH} \
&& chown -R ${REPORT_USER}:${REPORT_USER} ${ANACONDA_PATH}
RUN apt-get install -y wget
USER ${REPORT_USER}
RUN wget https://repo.anaconda.com/archive/${ANACONDA_INSTALLER} \
&& /bin/bash ${ANACONDA_INSTALLER} -b -u -p ${ANACONDA_PATH} \
&& chown -R ${REPORT_USER} ${ANACONDA_PATH} \
&& rm -f ${ANACONDA_INSTALLER} \
&& echo ". ${ANACONDA_PATH}/etc/profile.d/conda.sh" >> ~/.bashrc \
&& echo "conda activate base" >> ~/.bashrc
RUN pip3 install --upgrade pip \
&& pip3 install -r requirements.txt \
&& pip3 install awscli
# run training and report
ENV PYTHONPATH=/home/${REPORT_USER}/<project> \
MLFLOW_TRACKING_URI=... \
MLFLOW_EXPERIMENT_NAME=...
CMD dvc config core.no_scm true \
&& dvc repro

cronjob inside docker - can't capture logs

I'm trying to run a cron job inside a Docker container and read its logs (created with Python logging) either with docker logs my_container or from /var/log/cron.log. Neither is working. I tried a bunch of solutions I found on Stack Overflow.
This is my Dockerfile:
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04
ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=Europe/Minsk
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
RUN apt-get update && apt-get install -y \
python3-dev \
python3-tk \
python3-pip \
libglib2.0-0 \
libsm6 \
postgresql-server-dev-all \
postgresql-common \
openssh-client \
libxext6 \
nano \
pkg-config \
rsync \
cron \
&& \
apt-get clean && \
apt-get autoremove && \
rm -rf /var/lib/apt/lists/*
RUN pip3 install --upgrade setuptools
RUN pip3 install numpy
ADD requirements.txt /requirements.txt
RUN pip3 install -r /requirements.txt && rm /requirements.txt
RUN touch /var/log/cron.log
COPY crontab /etc/cron.d/cjob
RUN chmod 0644 /etc/cron.d/cjob
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
ENV PYTHONUNBUFFERED 1
ADD . /code
WORKDIR /code
COPY ssh_config /etc/ssh/ssh_config
CMD cron -f
and this is how I run it:
nvidia-docker run -d \
-e DISPLAY=unix$DISPLAY \
-v /tmp/.X11-unix:/tmp/.X11-unix \
-v /media/storage:/opt/images/ \
-v /home/user/.aws/:/root/.aws/ \
--net host \
my_container
I tried different things such as:
Docker ubuntu cron tail logs not visible
See cron output via docker logs, without using an extra file
But I don't get any logs.
Change the chmod mode to 755 if you're trying to execute something from there. You might also want to add the -R flag while you're at it.
Next, add the following to your Dockerfile before the chmod layer.
# Symlink the cron to stdout
RUN ln -sf /dev/stdout /var/log/cron.log
And add this as your final layer
# Run the command on container startup
CMD cron && tail -F /var/log/cron.log 2>&1
I took this from the first link that you mentioned. This should work.
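On the Python side, it may also help to make the job's logging explicit. Without any configuration, the logging module only emits WARNING and above, and it writes to stderr, so INFO messages never reach a log file that captures stdout only. The following is a minimal sketch; the logger name "cronjob" and the function name are made up, and it assumes the crontab entry (not shown in the question) redirects the job's output, for example with >> /var/log/cron.log 2>&1.

```python
import logging
import sys


def configure_logging(stream=None, level=logging.INFO):
    """Send log records of the 'cronjob' logger to stdout (or a given stream)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger = logging.getLogger("cronjob")  # hypothetical logger name
    logger.handlers.clear()  # avoid duplicate lines if called twice
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False
    return logger
```

A job script would call log = configure_logging() once at startup and then use log.info(...) as usual.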

Selenium within a Docker container can't find chromedriver

I need to put my little Flask app into a Docker container. It checks which types of Google tags my company's clients have installed, and for that I need selenium-wire. You supply a website and you get JSON back telling you which tags are installed (a bit like http://gachecker.com/). It works just fine as a plain Flask app. The issue arises when I try to put it into Docker. Here is my Dockerfile:
FROM python:3.9
WORKDIR /bziiit_checker_app
RUN pip install flask flask_restful requests BeautifulSoup4 selenium-wire undetected-chromedriver chromedriver-py
COPY ./app ./app
CMD ["python", "./app/main.py"]
Once it's in Docker and I try to run it, I get this message:
"selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH"
This is a common issue when the chromedriver.exe file is not in the working directory. But it IS.
Do I need to set the PATH when I'm creating the virtual environment, and if so, how do I do that?
Again, I'm good at AI, terrible at app development.
I'm using Python 3.9 on Windows 10, with Visual Studio Code and Flask.
Thank you
After a few days of pain and suffering I finally worked it out, so here is the Dockerfile I created to get chromedriver to work in a Docker container.
This works on Windows 10 using VS Code.
FROM python:3.8
# Adding trusting keys to apt for repositories
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
# Adding Google Chrome to the repositories
RUN sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list'
# Updating apt to see and install Google Chrome
RUN apt-get -y update
# Magic happens
RUN apt-get install -y google-chrome-stable
# Installing Unzip
RUN apt-get install -yqq unzip
# Download and install the Chrome driver
RUN wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip
RUN unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
# Set display port as an environment variable
ENV DISPLAY=:99
COPY ./app ./app
WORKDIR /app
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
CMD ["python", "./main.py"]
Then, in your script, add these arguments to the Chrome options; otherwise you'll get an error message telling you that "Chromedriver has exited abnormally".
option = webdriver.ChromeOptions()
option.add_argument("--disable-gpu")
option.add_argument("--disable-extensions")
option.add_argument("--disable-infobars")
option.add_argument("--start-maximized")
option.add_argument("--disable-notifications")
option.add_argument('--headless')
option.add_argument('--no-sandbox')
option.add_argument('--disable-dev-shm-usage')
I hope this will save someone all the headache that problem gave me
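The container-related flags from that list can be collected in one place so they are only applied when the script runs in Docker. This is a sketch under assumptions: the function names are hypothetical, the selection of flags is a subset of the list above, and Selenium is imported lazily so the first helper works without it.

```python
def container_chrome_arguments(headless=True, window_size=(1920, 1080)):
    """Return Chrome flags commonly needed when running as root in a container."""
    arguments = [
        "--no-sandbox",             # Chrome refuses to run as root with the sandbox on
        "--disable-dev-shm-usage",  # /dev/shm is small by default in containers
        "--disable-gpu",
    ]
    if headless:
        arguments.append("--headless")
    width, height = window_size
    arguments.append(f"--window-size={width},{height}")
    return arguments


def build_chrome_options(arguments):
    """Turn a list of flags into a ChromeOptions object."""
    from selenium import webdriver  # lazy import; requires Selenium at call time

    options = webdriver.ChromeOptions()
    for argument in arguments:
        options.add_argument(argument)
    return options
```

The result would be passed as webdriver.Chrome(options=build_chrome_options(container_chrome_arguments())).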
You will also have to install chromedriver and Chrome inside your container:
RUN add-apt-repository -y ppa:openjdk-r/ppa
RUN apt-get install -y openjdk-12-jre cron wget unzip
ARG CHROME_VERSION=78.0.3904.87-1
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
&& echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list \
&& apt-get update -qqy \
&& apt-get -qqy install google-chrome-stable=$CHROME_VERSION \
&& rm /etc/apt/sources.list.d/google-chrome.list \
&& rm -rf /var/lib/apt/lists/* /var/cache/apt/* \
&& sed -i 's/"$HERE\/chrome"/"$HERE\/chrome" --no-sandbox/g' /opt/google/chrome/google-chrome
ARG CHROME_DRIVER_VERSION=78.0.3904.70
RUN wget --no-verbose -O /tmp/chromedriver_linux64.zip https://chromedriver.storage.googleapis.com/$CHROME_DRIVER_VERSION/chromedriver_linux64.zip \
&& rm -rf /opt/chromedriver \
&& unzip /tmp/chromedriver_linux64.zip -d /opt \
&& rm /tmp/chromedriver_linux64.zip \
&& mv /opt/chromedriver /opt/chromedriver-$CHROME_DRIVER_VERSION \
&& chmod 755 /opt/chromedriver-$CHROME_DRIVER_VERSION \
&& ln -fs /opt/chromedriver-$CHROME_DRIVER_VERSION /usr/bin/chromedriver

Running a .py file with selenium in docker

I have a Python script that scrapes some web information using Selenium. I've built a Docker image of my project:
FROM python:3.7-slim
WORKDIR /
COPY requirements.txt ./
RUN pip install --upgrade pip && pip install -r requirements.txt
COPY . .
RUN pip install -e .
CMD ["python", "src/project/scraper.py"]
I get the following error when I run it: selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
The chromedriver.exe file is located in a data folder and the .py script refers to the right place (it does run locally).
Does anyone know how I would be able to run chrome in this container?
The folder structure is as follows:
|-- data
|   |-- chromedriver.exe
|   |-- file.csv
|-- src
|   |-- project
|       |-- scraper.py
|-- Dockerfile
|-- requirements.txt
Thanks!
Let me share what has worked for me in the past.
Try installing Chrome and chromedriver, and setting the PATH, from within the Dockerfile.
Note:
This uses Python 3.8. You can try changing it to 3.7 and see if it works for you.
It is NOT configured for multi-stage builds,
i.e. you may want to remove "FROM python:3.7-slim" before appending your part at the end.
FROM python:3.8 AS builder
RUN apt-get update; apt-get clean
# Install chrome dependencies
RUN apt-get install -y x11vnc xvfb fluxbox wget wmctrl unzip
# Set up the Chrome PPA
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
RUN echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list
# Update the package list and install chrome
RUN apt-get update -y
RUN apt-get install -y google-chrome-stable
# Set up Chromedriver Environment variables
ENV CHROMEDRIVER_VERSION 87.0.4280.88
ENV CHROMEDRIVER_DIR /chromedriver
RUN mkdir $CHROMEDRIVER_DIR
# Download and install Chromedriver
RUN wget -q --continue -P $CHROMEDRIVER_DIR "http://chromedriver.storage.googleapis.com/$CHROMEDRIVER_VERSION/chromedriver_linux64.zip"
RUN unzip $CHROMEDRIVER_DIR/chromedriver* -d $CHROMEDRIVER_DIR
# Put Chromedriver into the PATH
ENV PATH $CHROMEDRIVER_DIR:$PATH
RUN python -m venv /opt/venv
# Make sure we use the virtualenv:
ENV PATH="/opt/venv/bin:$PATH"
...
<YOUR_DOCKERFILE_PARTS>
As of today (Jan 25th, 2021), the latest stable version of Google Chrome (released Jan 19th, 2021) is 88.0.4324.96. So if the above doesn't work, try changing the chromedriver version so that it matches the installed Chrome browser.
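A quick way to check for such a mismatch inside the container is to compare the major versions reported by the two binaries. The sketch below assumes both google-chrome --version and chromedriver --version print a four-part version number like the ones quoted above; the function names are illustrative.

```python
import re
import subprocess


def major_version(version_output):
    """Extract the major number from output such as 'Google Chrome 88.0.4324.96'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no four-part version found in {version_output!r}")
    return int(match.group(1))


def versions_match(chrome_output, driver_output):
    """True when browser and driver report the same major version."""
    return major_version(chrome_output) == major_version(driver_output)


def installed_versions_match():
    """Run both binaries (assumed to be on PATH) and compare their versions."""
    chrome = subprocess.run(
        ["google-chrome", "--version"], capture_output=True, text=True, check=True
    ).stdout
    driver = subprocess.run(
        ["chromedriver", "--version"], capture_output=True, text=True, check=True
    ).stdout
    return versions_match(chrome, driver)
```

With the versions mentioned in this answer, Chrome 88.0.4324.96 and chromedriver 87.0.4280.88 would be reported as not matching.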
I assume you are using Linux containers in Docker, since python:3.7-slim is a Linux image. You cannot execute Windows binaries (.exe files) on Linux. Therefore you need to install chromedriver for Linux: How to Setup Selenium with ChromeDriver on Ubuntu 18.04 & 16.04
Your Dockerfile should look something like this
FROM python:3.7-slim
# install Chrome and chromedriver (adjust the chromedriver version to match the installed Chrome)
RUN apt-get update && \
apt-get install -y curl wget gnupg unzip xvfb libxi6 libgconf-2-4 && \
apt-get install -y default-jdk && \
curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - && \
echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list && \
apt-get -y update && \
apt-get install -y google-chrome-stable && \
wget https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip && \
unzip chromedriver_linux64.zip && \
mv chromedriver /usr/bin/chromedriver && \
chown root:root /usr/bin/chromedriver && \
chmod +x /usr/bin/chromedriver
WORKDIR /
COPY requirements.txt ./
RUN pip install --upgrade pip && pip install -r requirements.txt
COPY . .
RUN pip install -e .
CMD ["python", "src/project/scraper.py"]
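If the same scraper should keep running on Windows during development and on Linux in the container, the driver path can be chosen at run time instead of being hardcoded. This is a minimal sketch: chromedriver_path is a hypothetical helper, the data folder is taken from the question's layout, and /usr/bin/chromedriver is the location used in the Dockerfile above.

```python
import os
import platform


def chromedriver_path(data_dir="data", system=None):
    """Pick the chromedriver binary for the current operating system."""
    system = system or platform.system()
    if system == "Windows":
        # Local development: the .exe shipped in the project's data folder.
        return os.path.join(data_dir, "chromedriver.exe")
    # Inside the Linux container: the binary installed by the Dockerfile.
    return "/usr/bin/chromedriver"
```

The scraper would pass chromedriver_path() to Selenium wherever it currently refers to data/chromedriver.exe.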
