Below is my Dockerfile:
FROM python:3.9.0
ARG WORK_DIR=/opt/quarter_1
RUN apt-get update && apt-get install cron -y && apt-get install -y default-jre
# Install python libraries
COPY requirements.txt /tmp/requirements.txt
RUN pip install --upgrade pip && pip install -r /tmp/requirements.txt
WORKDIR $WORK_DIR
EXPOSE 8888
VOLUME /home/data/quarter_1/
# Copy etl code
# Copy code into the container under your workdir "/opt/quarter_1"
COPY . .
I connected to the server, then built the image with docker build -t my-python-app .
When I tried to run a container from the built image, I got nothing and was not able to connect:
docker run -p 8888:8888 -v /home/data/quarter_1/:/opt/quarter_1 image_id
The working directory here is /opt/quarter_1.
Update based on comments
If I understand everything you've posted correctly, my suggestion here is to use a base Docker Jupyter image, modify it to add your pip requirements, and then add your files to the work path. I've tested the following:
Start with a dockerfile like below
FROM jupyter/base-notebook:python-3.9.6
COPY requirements.txt /tmp/requirements.txt
RUN pip install --upgrade pip && pip install -r /tmp/requirements.txt
COPY ./quarter_1 /home/jovyan/quarter_1
The above assumes you are running the build from the folder containing the Dockerfile, requirements.txt, and the quarter_1 folder with your build files.
Note that /home/jovyan is the default working folder in this image.
Build the image
docker build -t biwia-jupyter:3.9.6 .
Start the container with port 8888 open, e.g.
docker run -p 8888:8888 biwia-jupyter:3.9.6
Connect to the container to get the access token. There are a few ways to do this, but for example:
docker exec -it CONTAINER_NAME bash
jupyter notebook list
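Alternatively, the notebook server normally prints the tokenized URL to the container's startup logs, so you should also be able to read it without exec'ing in:
docker logs CONTAINER_NAME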
Copy the token from the URL and connect using your server IP and port. You should be able to paste the token there and afterwards access the folder you copied into the build, as I did below.
Jupyter screenshot
If you are deploying the image to different hosts, this is probably the best way to do it, using COPY/ADD etc. Otherwise, look at using Docker volumes, which give you access to a folder (for example quarter_1) from the host, so you don't constantly have to rebuild during development.
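As a sketch of that volume approach (host path taken from the original question), you could skip copying quarter_1 at build time and mount it at run time instead:
docker run -p 8888:8888 -v /home/data/quarter_1:/home/jovyan/quarter_1 biwia-jupyter:3.9.6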
Second edit for Python 3.9.0 request
Using the method above, 3.9.0 is not immediately available from Docker Hub. I doubt you'll have many compatibility issues between 3.9.0 and 3.9.6, but we'll build it anyway. We can download the Dockerfile folder from GitHub, update a build argument, create our own variant with 3.9.0, and proceed as above.
This assumes you have git; otherwise, download the repo manually.
Download the Jupyter Docker stack repo
git clone https://github.com/jupyter/docker-stacks
Change into the base-notebook directory of the cloned repo
cd ./base-notebook
Build the image with python 3.9.0 instead
docker build --build-arg PYTHON_VERSION=3.9.0 -t jupyter-base-notebook:3.9.0 .
Create the version with your copied folders and the 3.9.0 base from the steps above, replacing the first line of that Dockerfile with:
FROM jupyter-base-notebook:3.9.0
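Put together, the 3.9.0 variant Dockerfile would look roughly like this (assuming the same requirements.txt and quarter_1 layout as before), built and run exactly as above:
FROM jupyter-base-notebook:3.9.0
COPY requirements.txt /tmp/requirements.txt
RUN pip install --upgrade pip && pip install -r /tmp/requirements.txt
COPY ./quarter_1 /home/jovyan/quarter_1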
I've tested this and it works, running Python 3.9.0 without issue.
There are lots of ways to build Jupyter images; this is just one method. Check out Docker Hub for Jupyter to see their variants.
Related
I'm experiencing differences in the contents of a container depending on whether I open a bash shell via docker run -i -t <container> bash or docker-compose run <container> bash, and I don't understand how this is possible.
To aid the explanation, please see this screenshot from my terminal. In both instances, I am running the image called blaze, which has been built from the Dockerfile in my code. One of the steps during the build is to create a virtualenv called venv; however, when I open a bash shell via docker-compose this virtualenv doesn't seem to exist, unlike when I run docker run ....
I am relatively new to setting up my own builds with Docker, but surely if they are both referencing the same image, the output of ls within a bash shell should be the same? I would greatly appreciate any help or guidance to resources that would explain what exactly is going wrong here...
As an additional point, running docker images shows that both commands must be using the same image...
Thanks in advance!
This is my Dockerfile:
FROM blaze-base-image:latest
# Add a URL that pip automatically searches (e.g., an Azure Artifact Store URL)
ARG INDEX_URL
ENV PIP_EXTRA_INDEX_URL=$INDEX_URL
# Copy source code to docker image
RUN mkdir /opt/app
COPY . /opt/app
RUN ls /opt/app
# Install Blaze pip dependencies
WORKDIR /opt/app
RUN python3.7 -m venv /opt/app/venv
RUN /opt/app/venv/bin/python -m pip install --upgrade pip
RUN /opt/app/venv/bin/python -m pip install keyring artifacts-keyring
RUN touch /opt/app/venv/pip.conf
RUN echo $'[global]\nextra-index-url=https://www.index.com' > /opt/app/venv/pip.conf
RUN /opt/app/venv/bin/python -m pip install -r /opt/app/requirements.txt
RUN /opt/app/venv/bin/python -m spacy download en_core_web_sm
# Comment
CMD ["echo", "Container build complete"]
And this is my docker-compose.yml:
version: '3'
services:
  blaze:
    build: .
    image: blaze
    volumes:
      - .:/opt/app
There are two intersecting things going on here:
When you have a Compose volumes: or docker run -v option mounting host content over a container directory, the host content completely replaces what's in the image. If you don't have a ./venv directory on the host, then there won't be a /opt/app/venv directory in the container. That's why, when you docker-compose run blaze ..., the virtual environment is missing.
If you docker run a container, the only options that are considered are those in that specific docker run command. docker run doesn't know about the docker-compose.yml file and won't take options from there. That means there isn't this volume mount in the docker run case, which is why the virtual environment reappears.
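A quick way to see both behaviours side by side (using the blaze image tag from the original compose file) is to list the app directory each way:
docker run --rm blaze ls /opt/app
docker-compose run --rm blaze ls /opt/app
The first command lists only what was baked into the image (including venv); the second lists whatever is currently in the host directory mounted over /opt/app.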
Typically in Docker you don't need a virtual environment at all: the Docker image is isolated from other images and Python installations, and so it's safe and normal to install your application into the "system" Python. You also typically want your image to be self-contained and not depend on content from the host, so you wouldn't generally need the bind mount you show.
That would simplify your Dockerfile to:
FROM blaze-base-image:latest
# Any ARG will automatically appear as an environment variable to
# RUN directives; this won't be needed at run time
ARG PIP_EXTRA_INDEX_URL
# Creates the directory if it doesn't exist
WORKDIR /opt/app
# Install the Python-level dependencies
RUN pip install --upgrade pip
COPY requirements.txt .
RUN pip install -r requirements.txt
# The requirements.txt file should list every required package
# Install the rest of the application
COPY . .
# Set the main container command to run the application
CMD ["./app.py"]
The docker-compose.yml file can be similarly simplified to
version: '3.8' # '3' means '3.0'
services:
  blaze:
    build: .
    # Compose picks its own image name
    # Do not need volumes:, the image is self-contained
and then it will work consistently with either docker run or docker-compose run (or docker-compose up).
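For example, after a docker-compose build, both of these should now list the same /opt/app contents (the image tag Compose assigns depends on your project directory name, so check docker images for the exact name):
docker-compose run --rm blaze ls /opt/app
docker run --rm <compose-assigned-image> ls /opt/app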
I'm fairly new to Sagemaker and Docker. I am trying to train my own custom object detection algorithm in Sagemaker using an ECS container. I'm using this repo's files:
https://github.com/svpino/tensorflow-object-detection-sagemaker
I've followed the instructions exactly, and I'm able to run the image in a container perfectly fine on my local machine. But when I push the image to ECS to run in Sagemaker, I get the following message in Cloudwatch:
I understand that for some reason, when deployed to ECS, the image suddenly can't find Python. At the top of my training script is the text #!/usr/bin/env python. I've tried running the which python command and changing the shebang to point to #!/usr/local/bin python, but I just get additional errors. I don't understand why this image works on my local machine (tested with both Docker on Windows and Docker CE for WSL). Here's a snippet of the Dockerfile:
ARG ARCHITECTURE=1.15.0-gpu
FROM tensorflow/tensorflow:${ARCHITECTURE}-py3
RUN apt-get update && apt-get install -y --no-install-recommends \
wget zip unzip git ca-certificates curl nginx python
# We need to install Protocol Buffers (Protobuf). Protobuf is Google's language and platform-neutral,
# extensible mechanism for serializing structured data. To make sure you are using the most updated code,
# replace the linked release below with the latest version available on the Git repository.
RUN curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v3.10.1/protoc-3.10.1-linux-x86_64.zip
RUN unzip protoc-3.10.1-linux-x86_64.zip -d protoc3
RUN mv protoc3/bin/* /usr/local/bin/
RUN mv protoc3/include/* /usr/local/include/
# Let's add the folder that we are going to be using to install all of our machine learning-related code
# to the PATH. This is the folder used by SageMaker to find and run our code.
ENV PATH="/opt/ml/code:${PATH}"
RUN mkdir -p /opt/ml/code
WORKDIR /opt/ml/code
RUN pip install --upgrade pip
RUN pip install cython
RUN pip install contextlib2
RUN pip install pillow
RUN pip install lxml
RUN pip install matplotlib
RUN pip install flask
RUN pip install gevent
RUN pip install gunicorn
RUN pip install pycocotools
# Let's now download Tensorflow from the official Git repository and install Tensorflow Slim from
# its folder.
RUN git clone https://github.com/tensorflow/models/ tensorflow-models
RUN pip install -e tensorflow-models/research/slim
# We can now install the Object Detection API, also part of the Tensorflow repository. We are going to change
# the working directory for a minute so we can do this easily.
WORKDIR /opt/ml/code/tensorflow-models/research
RUN protoc object_detection/protos/*.proto --python_out=.
RUN python setup.py build
RUN python setup.py install
# If you are interested in using COCO evaluation metrics, you can run the following commands to add the
# necessary resources to your Tensorflow installation.
RUN git clone https://github.com/cocodataset/cocoapi.git
WORKDIR /opt/ml/code/tensorflow-models/research/cocoapi/PythonAPI
RUN make
RUN cp -r pycocotools /opt/ml/code/tensorflow-models/research/
# Let's put the working directory back to where it needs to be, copy all of our code, and update the PYTHONPATH
# to include the newly installed Tensorflow libraries.
WORKDIR /opt/ml/code
COPY /code /opt/ml/code
ENV PYTHONPATH=${PYTHONPATH}:tensorflow-models/research:tensorflow-models/research/slim:tensorflow-models/research/object_detection
RUN chmod +x /opt/ml/code/train
CMD ["/bin/bash","-c","chmod +x /opt/ml/code/train && /opt/ml/code/train"]
I have created a docker container for my pure python program and have set python main.py to be executed when the container is run. Running the container works as expected on my local machine. However, I want to run the container on my institution's high-performance cluster. The cluster machines use Singularity, which I am using to pull my docker image hosted on Dockerhub (the repo is darshank11/ga_paci_final). However, when I try to run the Singularity container, I get the following error: python3: can't open file 'main.py': [Errno 2] No such file or directory.
I've tried to change the base image in the Dockerfile, for example from FROM python:latest to FROM ubuntu:latest. I've made sure the docker container worked on my local machine, and then got one of my co-workers to pull the container from Dockerhub and run it too. Everything works fine until I get to Singularity.
Here is my docker file:
FROM ubuntu:16.04
RUN apt-get update -y && \
apt-get install -y python3-pip python3-dev
RUN mkdir src
WORKDIR /src
COPY . /src
RUN pip3 install --upgrade pip
RUN pip3 install -r requirements.txt
CMD ["python3", "-u", "main.py"]
You're getting that error because the execution context is not what you're expecting. The run path in Singularity is the current directory on the host OS (e.g., ~/ga_paci_final), which has been mounted into the Singularity image.
As mentioned in the comments, one solution is to give the full path to the python file in the Docker CMD statement. Another option is to modify the %runscript block of the Singularity definition file to something like:
%runscript
cd /src
python3 -u main.py
That way you ensure the run environment is identical between Docker and Singularity.
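The first option just amounts to changing the last Dockerfile line to CMD ["python3", "-u", "/src/main.py"]. For the second, a minimal Singularity definition file along these lines should work, assuming the Docker Hub image name from the question:
Bootstrap: docker
From: darshank11/ga_paci_final

%runscript
    cd /src
    python3 -u main.py
Build and run it with something like:
sudo singularity build ga_paci_final.sif ga_paci_final.def
singularity run ga_paci_final.sif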
I have pulled the Jenkins container from Docker Hub like this:
docker pull jenkins
The container runs, and I can access the Jenkins UI at:
http://localhost:8080
My question is:
If I want to create a Jenkins job that pulls from a GitHub repo and run some Python tests from one of the test files of that repo, how can I install extra packages such as virtualenvwrapper, pip, pytest, nose, selenium, etc.?
It appears that the Docker container does not share any reference with the local host file system.
How can I install such packages in this running container?
Thanks
You will need to install all your dependencies at docker container build time.
You can make your own Dockerfile off of the jenkins library image, and then put custom stuff in there. Your Dockerfile can look like:
FROM jenkins:latest
MAINTAINER Becks
RUN apt-get update && apt-get install -y {space-delimited list of packages}
Then, you can do something like...
docker build -t jenkins-docker --file Dockerfile .
docker run -it -d --name=jenkins-docker jenkins-docker
I might not have written all the syntax correctly, but this is basically what you need to do. If you want the run step to spin up Jenkins, follow along with what they are doing in the existing official Jenkins Dockerfile and add the relevant sections to your own Dockerfile, so it includes the RUN steps that start Jenkins.
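As a rough sketch for the Python tooling mentioned in the question (the exact apt package names are an assumption and depend on the base image's Debian release), it could look like:
FROM jenkins:latest
USER root
RUN apt-get update \
    && apt-get install -y python-pip \
    && rm -rf /var/lib/apt/lists/*
RUN pip install virtualenvwrapper pytest nose selenium
USER jenkins
Switching to root is needed for apt-get; switching back to the jenkins user afterwards keeps the container running Jenkins as the unprivileged user the base image expects.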
I came across this page, which approaches a similar problem, although it also mounts the Docker socket inside another container, to kind of connect one container to another. Given that it's an external link, here's the relevant Dockerfile from there:
FROM jenkins:1.596
USER root
RUN apt-get update \
&& apt-get install -y sudo \
&& rm -rf /var/lib/apt/lists/*
RUN echo "jenkins ALL=NOPASSWD: ALL" >> /etc/sudoers
USER jenkins
COPY plugins.txt /usr/share/jenkins/plugins.txt
RUN /usr/local/bin/plugins.sh /usr/share/jenkins/plugins.txt
And this is how you can spin it up.
docker build -t myjenk .
...
Successfully built 471fc0d22bff
$ docker run -d -v /var/run/docker.sock:/var/run/docker.sock \
-v $(which docker):/usr/bin/docker -p 8080:8080 myjenk
I strongly suggest going through that post. It's pretty awesome.
Currently, for my Python project, I have a deploy.sh file which runs some apt-gets and pip installs, creates some dirs, and copies some files... so the process is: git clone my private repo, then run deploy.sh.
Now I'm playing with Docker, and the basic question is: should the Dockerfile RUN a git clone and then RUN deploy.sh, or should the Dockerfile have its own RUNs for each apt-get, pip, etc. and ignore deploy.sh? The latter seems like duplicating work (typing) and has the possibility of going out of sync.
That work should be duplicated in the Dockerfile. The reason for this is to take advantage of Docker's layer and caching system. Take this example:
# this will only execute the first time you build the image, all future builds will use a cached layer
RUN apt-get update && apt-get install somepackage -y
# this will only run pip if your requirements file changes
ADD requirements.txt /app/requirements.txt
RUN pip install -r requirements.txt
ADD . /app/
Also, you should not do a git checkout of your code in the docker build. Just add the files from the local checkout to the image, as in the above example.
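With that approach the clone happens on the host, outside the build, for example (repo URL and image tag are placeholders):
git clone git@github.com:you/your-private-repo.git
cd your-private-repo
docker build -t my-app .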