Setting
I have 10+ GB of images on my local machine, and there could be terabytes of images in the future, in which case they would be hosted on, e.g., AWS. The images will be served to a website and will also be inputs to a machine learning pipeline.
Right now I am developing on my local machine. The source code is at path/to/src, and the data is at path/to/images. I have already set up a Docker environment with this Dockerfile:
FROM bamos/openface
ADD . /face-off
WORKDIR /face-off
RUN pip install -r requirements.txt
EXPOSE 5000
CMD [ "python", "app.py" ]
And docker-compose file:
version: '2'
services:
web:
build: .
image: face-off-web
command: python app.py
ports:
- "5000:5000"
volumes:
- .:/face-off
redis:
image: "redis:alpine"
Problem
Since I am developing in the Docker container, I need to access all the images in path/to/images. For now, let's keep it simple and say I read images using a pre-specified path on disk. I think my options are:
One obvious way is to move all the images to path/to/src, but this seems dirty to me.
Another possibility is to ADD the directory in the Dockerfile. I would then need to move both directories under path/to/project, where the Dockerfile would live, with the source at path/to/project/src and the data at path/to/project/data. But again, this smells a lot like option 1.
Bring the data in somehow using docker-compose. I do not know how to do this right now, despite reading the docs on docker volume.
What is the idiomatic way to handle this problem? If the way is option 3, could someone explain how?
Bring the data in somehow using docker-compose. I do not know how to do this right now, despite reading the docs on docker volume.
For your use case you want to mount a host path into your container, much like you already do with your project working dir. Just add another line for the images:
version: '2'
services:
web:
build: .
image: face-off-web
command: python app.py
ports:
- "5000:5000"
volumes:
- .:/face-off
- /images/on/host:/path/in/container
redis:
image: "redis:alpine"
On a side note: use either build: or image:, not both.
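For example, with the images mounted at /path/in/container as above, code inside the container reads them like any local directory. A minimal sketch (the loop body is a placeholder for whatever the web app or ML pipeline does with each file):
import os

IMAGE_DIR = "/path/in/container"  # the container-side path from the volume mapping

for name in sorted(os.listdir(IMAGE_DIR)):
    with open(os.path.join(IMAGE_DIR, name), "rb") as f:
        data = f.read()  # raw image bytes; hand off to the web app or ML pipeline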
Related
I have searched but couldn't find a solution to my problem. My docker-compose.yml file is below.
#
version: '2.1'
services:
mongo:
image: mongo_db
build: mongo_image
container_name: my_mongodb
restart: always
networks:
- isolated_network
ports:
- "27017"
environment:
- MONGO_INITDB_ROOT_USERNAME=root
- MONGO_INITDB_ROOT_PASSWORD=root_pw
entrypoint: ["python3", "/tmp/script/get_api_to_mongodb.py", "&"]
networks:
isolated_network:
So here I use a custom Dockerfile, which is shown below.
FROM mongo:latest
RUN apt-get update -y
RUN apt-get install python3-pip -y
RUN pip3 install requests
RUN pip3 install pymongo
RUN apt-get clean -y
RUN mkdir -p /tmp/script
COPY get_api_to_mongodb.py /tmp/script/get_api_to_mongodb.py
#CMD ["python3","/tmp/script/get_api_to_mongodb.py","&"]
Here I want to create a container which has MongoDB, and after creating the container I collect data using an API and send it to MongoDB. But when I run the Python script, MongoDB is not yet initialized, so I need to run my script after the container is created and right after MongoDB has initialized. Thanks in advance.
You should run this script as a separate container. It's not "part of the database", like an extension or plugin, but rather an ordinary client process that happens to connect to the database and that you want to run relatively early on. In general, if you're thinking about trying to launch a background process in a container, it's often a better approach to run foreground processes in two separate containers.
This setup means you can use a simpler Dockerfile that starts from an image with Python preinstalled:
FROM python:3.10
RUN pip install requests pymongo
WORKDIR /app
COPY get_api_to_mongodb.py .
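# Note: for the exec-form CMD below to work, get_api_to_mongodb.py needs a
# "#!/usr/bin/env python3" shebang and the executable bit set (COPY preserves
# the permission bit from the build context); otherwise use
# CMD ["python3", "./get_api_to_mongodb.py"] instead.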
CMD ["./get_api_to_mongodb.py"]
Then in your Compose setup, declare this as a second container alongside the first one. Since the script is in its own image, you can use the unmodified mongo image.
version: '2.4'
services:
mongo:
image: mongo:latest
restart: always
ports:
- "27017"
environment:
- MONGO_INITDB_ROOT_USERNAME=root
- MONGO_INITDB_ROOT_PASSWORD=root_pw
loader:
build: .
restart: on-failure
depends_on:
      - mongo
# environment:
# - MONGO_HOST=mongo
# - MONGO_USERNAME=root
# - MONGO_PASSWORD=root_pw
Note that the loader will re-run every time you run docker-compose up -d. You also may have to wait for the database to do its initialization before you can run the loader process; see Docker Compose wait for container X before starting Y.
It's likely you have an existing Compose service for your real application:
version: '2.4'
services:
mongo: { ... }
app:
build: .
...
If that image contains the loader script, then you can docker-compose run it. This launches a new temporary container, using most of the attributes from the Compose service declaration, but you provide an alternate command: and the ports: are ignored.
docker-compose run app ./get_api_to_mongodb.py
One might ideally like a workflow where first the database container starts; then once it's accepting requests, run the loader script as a temporary container; then once that's completed start the main application server. This is mostly beyond Compose's capabilities, though you can probably get close with a combination of extended depends_on: declarations and a healthcheck: for the database.
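A rough sketch of that combination, building on the Compose 2.4 file above (the exact healthcheck command depends on the MongoDB image version; newer images ship mongosh rather than mongo):
version: '2.4'
services:
  mongo:
    image: mongo:latest
    healthcheck:
      test: ["CMD", "mongo", "--eval", "db.adminCommand('ping')"]
      interval: 10s
      timeout: 5s
      retries: 5
  loader:
    build: .
    depends_on:
      mongo:
        condition: service_healthy   # wait until the healthcheck passes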
I have two services, on two different GitLab repositories, deployed to the same host. I am currently using supervisord to run all of the services. The CI/CD for each repository pushes the code to the host.
I am trying to replace supervisord with Docker. What I did was the following:
Set up a Dockerfile for each service.
Created a third repository with only a docker-compose.yml, that runs docker-compose up in its CI to build and run the two services. I expect this repository to only be deployed once.
I am looking for a way to have the docker-compose automatically update when I deploy one of the two services.
Edit: Essentially, I am trying to figure out the best way to use docker-compose with a multi repository setup and one host.
My docker-compose:
version: "3.4"
services:
redis:
image: "redis:alpine"
api:
build: .
command: gunicorn -c gunicorn_conf.py --bind 0.0.0.0:5000 --chdir server "app:app" --timeout 120
volumes:
- .:/app
ports:
- "8000:8000"
depends_on:
- redis
celery-worker:
build: .
command: celery worker -A server.celery_config:celery
volumes:
- .:/app
depends_on:
- redis
celery-beat:
build: .
command: celery beat -A server.celery_config:celery --loglevel=INFO
volumes:
- .:/app
depends_on:
- redis
other-service:
build: .
command: python other-service.py
volumes:
- .:/other-service
depends_on:
- redis
If you're setting this up in the context of a CI system, the docker-compose.yml file should just run the images; it shouldn't also take responsibility for building them.
Do not overwrite the code in a container using volumes:.
You mention each service's repository has a Dockerfile, which is a normal setup. Your CI system should run docker build there (and typically docker push). Then your docker-compose.yml file just needs to mention the image: that the CI system builds:
version: "3.4"
services:
redis:
image: "redis:alpine"
api:
image: "me/django:${DJANGO_VERSION:-latest}"
ports:
- "8000:8000"
depends_on:
- redis
celery-worker:
image: "me/django:${DJANGO_VERSION:-latest}"
command: celery worker -A server.celery_config:celery
depends_on:
- redis
I hint at docker push above. If you're using Docker Hub, or a cloud-hosted Docker image repository, or are running a private repository, the CI system should run docker push after it builds each image, and (if it's not Docker Hub) the image: lines need to include the repository address.
The other important question here is what to do on rebuilds. I'd recommend giving each build a unique Docker image tag; a timestamp or a source-control commit ID both work well. In the docker-compose.yml file I show above, I use an environment variable to specify the actual image tag, so your CI system can run
DJANGO_VERSION=20200113.1114 docker-compose up -d
Then Compose will know about the changed image tag, and will be able to recreate the containers based on the new images.
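A sketch of what the CI job might run (registry.example.com/me/django is a placeholder image name; substitute your own registry and repository):
TAG=$(git rev-parse --short HEAD)          # or a timestamp, e.g. $(date +%Y%m%d.%H%M)
docker build -t registry.example.com/me/django:"$TAG" .
docker push registry.example.com/me/django:"$TAG"
# on the deployment host, point Compose at the new tag
DJANGO_VERSION="$TAG" docker-compose up -d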
(This approach is highly relevant in the context of cluster systems like Kubernetes. Pushing images to a registry is all but required there. In Kubernetes changing the name of an image: triggers a redeployment, so it's also all but required to use a unique image tag per build. Except that there are multiple and more complex YAML files, the overall approach in Kubernetes would be very similar to what I've laid out here.)
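For reference, a minimal hypothetical Kubernetes Deployment for the same image; editing the image: tag is what triggers the rolling update:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: django
spec:
  replicas: 1
  selector:
    matchLabels:
      app: django
  template:
    metadata:
      labels:
        app: django
    spec:
      containers:
        - name: django
          image: registry.example.com/me/django:20200113.1114  # bump this tag on each build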
I am trying to test my setup using Docker and TensorFlow. I am using the official TensorFlow image tensorflow/tensorflow:1.15.0rc2-gpu-py3
My project has the minimum structure:
project/
Dockerfile
docker-compose.yml
jupyter/
README.md
I have the following Dockerfile:
# from official image
FROM tensorflow/tensorflow:1.15.0rc2-gpu-py3-jupyter
# add my notebooks so they are a part of the container
ADD ./jupyter /tf/notebooks
# copy-paste from tf github dockerfile in attempt to troubleshoot
# https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/dockerfiles/dockerfiles/gpu-jupyter.Dockerfile
WORKDIR /tf
RUN which jupyter
CMD ["jupyter-notebook --notebook-dir=/tf/notebooks --ip 0.0.0.0 --no-browser --allow-root"]
and the docker-compose.yml
version: '3'
services:
tf:
image: tensorflow/tensorflow:1.15.0rc2-gpu-py3-jupyter
# mount host system volume to save updates from container
volumes:
- jupyter:/tf/notebooks
ports:
- '8888:8888'
# added as part of troubleshooting
build:
context: .
dockerfile: Dockerfile
volumes:
jupyter:
Running docker-compose build and docker-compose up succeeds (if the CMD in the Dockerfile is commented out), but the container just exits. From the Docker Hub repository, I thought adding the volume would auto-start a notebook.
Trying to run jupyter-notebook or jupyter notebook fails.
Thoughts on how to correct?
If you want to create a custom image from the official one, adding the notebook directory, then the image property in the docker-compose file should be the name of your local image, not tensorflow/tensorflow:1.15.0rc2-gpu-py3-jupyter. All you need in this case is the following Dockerfile:
FROM tensorflow/tensorflow:1.15.0rc2-gpu-py3-jupyter
ADD ./jupyter /tf/notebooks
In this case the docker-compose.yaml file should look like the following:
version: '3'
services:
tf:
image: tensorflow
# mount host system volume to save updates from container
volumes:
- jupyter:/tf/notebooks
ports:
- '8888:8888'
# added as part of troubleshooting
build:
context: .
dockerfile: Dockerfile
volumes:
jupyter:
Note that the image is tensorflow.
However, there is really no need to use the custom Dockerfile. Just use the following docker-compose.yaml file:
version: '3'
services:
tf:
image: tensorflow/tensorflow:1.15.0rc2-gpu-py3-jupyter
# mount host system volume to save updates from container
volumes:
- ./jupyter:/tf/notebooks:Z
ports:
- '8888:8888'
It will directly map your local jupyter directory into the container and will use the official image without modification.
Note, though, that it might not work as expected on Windows due to issues with mapping host directories.
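After docker-compose up, the notebook server prints its login URL (with the access token) to the container log; if you started it detached, docker-compose logs tf should show it.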
Try this:
RUN pip3 install nvidia-tensorflow
This will install TensorFlow 1.15.
I have been working with Docker previously, using services to run a website made with Django.
Now I would like to know how I should create a Docker setup to just run Python scripts, without a web server or any website-related services.
An example of a normal docker-compose file that I am used to working with is:
version: '2'
services:
nginx:
image: nginx:latest
container_name: nz01
ports:
- "8001:8000"
volumes:
- ./src:/src
- ./config/nginx:/etc/nginx/conf.d
depends_on:
- web
web:
build: .
container_name: dz01
depends_on:
- db
volumes:
- ./src:/src
expose:
- "8000"
db:
image: postgres:latest
container_name: pz01
ports:
- "5433:5432"
volumes:
- postgres_database:/var/lib/postgresql/data:Z
volumes:
postgres_database:
external: true
What should the docker-compose.yml file look like?
Simply remove everything from your Dockerfile that has nothing to do with your script and start with something simple, like
FROM python:3
ADD my_script.py /
CMD [ "python", "./my_script.py" ]
You do not need Docker Compose for containerizing a single Python script.
The example is taken from this simple tutorial about containerizing Python applications: https://runnable.com/docker/python/dockerize-your-python-application
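For reference, building and running that image would look roughly like this (my-script is just a placeholder tag):
docker build -t my-script .
docker run --rm my-script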
You can easily overwrite the command specified in the Dockerfile (via CMD) when starting a container from the image. Just append the desired command to your docker run command, e.g.:
docker run IMAGE /path/to/script.py
You can easily run Python interactively without even having to build an image:
docker run -it python
If you want to have access to some code you have written within the container, simply change that to:
docker run -it -v /path/to/code:/app python
Making a Dockerfile is unnecessary for this simple application.
Most Linux distributions come with Python preinstalled. Using Docker here adds significant complexity and I'd pretty strongly advise against Docker just to run a simple script. You can use a virtual environment to isolate a particular Python package's dependencies from the rest of the system.
(There is a pretty consistent stream of SO questions around getting filesystem permissions and user IDs right for scripts that principally want to interact with the host system. Also remember that running docker anything implies root-equivalent permissions. If you don't want Docker's filesystem and user namespace isolation, IMHO it's easier to just not use Docker where it doesn't make sense.)
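A minimal virtual-environment workflow, for comparison (my_script.py and requirements.txt are placeholders for your own files):
python3 -m venv .venv
. .venv/bin/activate              # on Windows: .venv\Scripts\activate
pip install -r requirements.txt   # only if the script has dependencies
python my_script.py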
I am running a Python application as a Docker container, and in my application I use Python's logging module to log execution steps with logger.info, logger.debug and logger.error. The problem is that the log file is only persistent within the Docker container: if the container goes away, the log file is lost, and every time I want to view the log file I have to manually copy it from the container to the local system. What I want is for whatever is written to the container's log file to be persistent on the local system, either by writing to a log file on the local system or by auto-mounting the container's log file to the local system.
A few things about my host machine:
I run multiple docker containers on the machine.
My sample docker-core file is:
FROM server-base-v1
ADD . /app
WORKDIR /app
ENV PATH /app:$PATH
CMD ["python","-u","app.py"]
My sample docker-base file is:
FROM python:3
ADD ./setup /app/setup
WORKDIR /app
RUN pip install -r setup/requirements.txt
A sample of my docker-compose.yml file is:
version: "2"
networks:
server-net:
services:
mongo:
container_name: mongodb
image: mongodb
hostname: mongodb
networks:
- server-net
volumes:
- /dockerdata/mongodb:/data/db
ports:
- "27017:27017"
- "28017:28017"
server-core-v1:
container_name: server-core-v1
image: server-core-v1:latest
depends_on:
- mongo
networks:
- server-net
ports:
- "8000:8000"
volumes:
- /etc/localtime:/etc/localtime:ro
The yml sample above is just part of my actual yml file. I have multiple server-core-v1 containers (with different names) running in parallel, each with its own log file.
I would also appreciate any better strategies for doing logging in Python with Docker and making it persistent. I read a few articles which mentioned using sys.stderr.write() and sys.stdout.write(), but I am not sure how to use that, especially with multiple containers running and logging.
Volumes are what you need.
You can create volumes to bind an internal container folder to a folder on the local system, so you can store your logs in a logs folder inside the container and map it to any folder on your local machine.
You can specify a volume in the docker-compose.yml file for each service you are creating. See the docs.
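A minimal sketch of that idea, assuming the application writes its logs to /app/logs inside the container (both the host directory ./logs/server-core-v1 and the container path are placeholders):
services:
  server-core-v1:
    image: server-core-v1:latest
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./logs/server-core-v1:/app/logs   # host directory : container directory
Point the Python FileHandler at a file under /app/logs (for example /app/logs/app.log); the file then lives on the host and survives container removal, and each server-core-v1 container can get its own host subdirectory.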
Bind mounts are what you need.
As you can see, bind mounts are accessible from your host file system. They are very similar to shared folders in a VM architecture.
You can simply achieve that by mounting your volume directly to a path on your PC.
In your case:
version: "2"
networks:
server-net:
services:
mongo:
container_name: mongodb
image: mongodb
hostname: mongodb
networks:
- server-net
volumes:
- /dockerdata/mongodb:/data/db
ports:
- "27017:27017"
- "28017:28017"
server-core-v1:
container_name: server-core-v1
image: server-core-v1:latest
depends_on:
- mongo
networks:
- server-net
ports:
- "8000:8000"
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./yours/example/host/path:/app/logs
Just replace ./yours/example/host/path with the target directory on your host, and /app/logs with the directory inside the container where your application writes its log files.
In this scenario, I believe the logger is on the server side.
If you are working on Windows, remember to bind-mount within the current user's directory!