I'm working on an OpenCV project and, as many know, installing it on Windows is irritating. So what I want to do is run the project in a Docker container and store the output in a folder on the host computer. In simple terms it is something like this:
Program python / opencv code
Build Docker image
Run Docker image --> Saves the output data somewhere
In some way - get access to the output data on host.
Now, I have been trying to find ways of doing this, and I will probably send the output by other means at a later time. However, for development I need this slightly more direct approach. It also has something to do with collaboration with others.
A simple Dockerfile that can be used as the base:
FROM python:3
WORKDIR /usr/src/app
COPY . .
CMD [ "python", "./script.py" ]
Let's say that script.py creates a file called output.txt. I want that output.txt stored on my E: drive.
How can I do this automatically, without having to run multiple command-line operations?
TLDR; How to get files from Docker container to host? Goal: File physically stored on E:
There are two ways to do this. The first is to mount a host directory into the container as a volume:
docker run --name=name -d -v /path/in/host:/path/in/docker name
By mounting a volume like this, whatever you write to the mounted directory inside the container is written to the host automatically. See the Docker documentation on volumes for more information.
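To make the volume approach concrete, here is a minimal sketch of what script.py could look like. It is only an illustration: the OUTPUT_DIR environment variable, the default path /usr/src/app/output and the write_output helper are assumptions of mine, not something from the question.

```python
import os
from pathlib import Path

# Hypothetical default: a directory inside the container that you would
# mount from the host with -v. Not part of the original question.
DEFAULT_OUTPUT_DIR = "/usr/src/app/output"


def write_output(text, output_dir=None):
    """Write text to output.txt inside output_dir and return the file path."""
    # OUTPUT_DIR is an assumed environment variable name for this sketch.
    target = Path(output_dir or os.environ.get("OUTPUT_DIR", DEFAULT_OUTPUT_DIR))
    target.mkdir(parents=True, exist_ok=True)
    out_file = target / "output.txt"
    out_file.write_text(text, encoding="utf-8")
    return out_file


# Assumed invocation on Windows (path syntax may vary with your shell and
# Docker setup):
#   docker run -v E:\output:/usr/src/app/output <image>
# Inside the container, script.py would then call:
#   write_output("result data")
```

Because the script only ever writes to the mounted directory, the file should show up on the host side without any docker cp step.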
The second way is to copy the files from a container to the host like this:
docker cp <containerId>:/file/path/within/container /host/path/target
See the docker cp documentation for more info on this.
I have built a base image named centos+ssh from a Dockerfile. In the centos+ssh Dockerfile, I use CMD to run the ssh service.
Then I want to build an image that runs another service, rabbitmq. The Dockerfile:
FROM centos+ssh
EXPOSE 22
EXPOSE 4149
CMD /opt/mq/sbin/rabbitmq-server start
To start the rabbitmq container, I run:
docker run -d -p 222:22 -p 4149:4149 rabbitmq
But the ssh service doesn't work. It seems that the CMD in rabbitmq's Dockerfile overrides the CMD in the centos+ssh one.
How does CMD work inside a Docker image?
If I want to run multiple services, how do I do it? Using supervisor?
You are right, the second Dockerfile will overwrite the CMD command of the first one. Docker always runs a single command, so at the end of your Dockerfile you can specify only one command to run.
But you can execute both commands in one line:
FROM centos+ssh
EXPOSE 22
EXPOSE 4149
CMD service sshd start && /opt/mq/sbin/rabbitmq-server start
To make your Dockerfile a little cleaner, you could also put your commands into a separate file:
FROM centos+ssh
EXPOSE 22
EXPOSE 4149
CMD sh /home/centos/all_your_commands.sh
And a file like this:
service sshd start &
/opt/mq/sbin/rabbitmq-server start
Even though CMD is written down in the Dockerfile, it really is runtime information, just like EXPOSE, but contrary to e.g. RUN and ADD. By this I mean that you can override it later, in an extending Dockerfile or simply in your run command, which is what you are experiencing. At all times, there can be only one CMD.
If you want to run multiple services, I indeed would use supervisor. You can make a supervisor configuration file for each service, ADD these in a directory, and run the supervisor with supervisord -c /etc/supervisor to point to a supervisor configuration file which loads all your services and looks like
[supervisord]
nodaemon=true
[include]
files = /etc/supervisor/conf.d/*.conf
If you would like more details, I wrote a blog on this subject here: http://blog.trifork.com/2014/03/11/using-supervisor-with-docker-to-manage-processes-supporting-image-inheritance/
While I respect the answer from qkrijger explaining how you can work around this issue I think there is a lot more we can learn about what's going on here ...
To actually answer your question of "why" ... I think it would be helpful for you to understand how the docker stop command works, and that all processes should be shut down cleanly to prevent problems when you try to restart them (file corruption etc).
Problem: What if Docker did start SSH from its command and also started RabbitMQ from your Dockerfile? "The docker stop command attempts to stop a running container first by sending a SIGTERM signal to the root process (PID 1) in the container." Which process is Docker tracking as PID 1 that will get the SIGTERM? Will it be SSH or RabbitMQ? "According to the Unix process model, the init process -- PID 1 -- inherits all orphaned child processes and must reap them. Most Docker containers do not have an init process that does this correctly, and as a result their containers become filled with zombie processes over time."
Answer: Docker simply takes that last CMD as the one that will get launched as the root process with PID 1 and get the SIGTERM from docker stop.
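To illustrate what "getting the SIGTERM" means for the process that ends up as PID 1, here is a minimal Python sketch. It is not from the answer above: the serve_until_sigterm function and the do_work callback are hypothetical names made up for this example. The idea is simply that the main process installs a SIGTERM handler and leaves its loop cleanly when docker stop signals it.

```python
import signal
import time


def serve_until_sigterm(do_work, poll_interval=0.1):
    """Call do_work repeatedly until SIGTERM arrives; return the call count.

    do_work is a hypothetical callback standing in for the real service loop.
    """
    stop = {"requested": False}

    def handle_sigterm(signum, frame):
        # Only record the request; the loop below exits at a safe point.
        stop["requested"] = True

    previous = signal.signal(signal.SIGTERM, handle_sigterm)
    try:
        iterations = 0
        while not stop["requested"]:
            do_work()
            iterations += 1
            time.sleep(poll_interval)
        return iterations
    finally:
        # Restore whatever handler was installed before this function ran.
        signal.signal(signal.SIGTERM, previous)
```

A process running as PID 1 does not get the default "terminate" behaviour for SIGTERM, so without a handler like this docker stop would typically wait out its timeout and then send SIGKILL.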
Suggested solution: You should use (or create) a base image specifically made for running more than one service, such as phusion/baseimage.
It is important to note that tini exists for exactly this reason, and as of Docker 1.13, tini is officially part of Docker, which tells us that running more than one process in a container IS VALID. So even if someone claims to be more skilled with Docker and insists that you are absurd for thinking of doing this, know that you are not. There are perfectly valid situations for doing so.
Good to know:
https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/
http://www.techbar.me/stopping-docker-containers-gracefully/
https://www.ctl.io/developers/blog/post/gracefully-stopping-docker-containers/
https://github.com/phusion/baseimage-docker#docker_single_process
The official Docker answer is the documentation page "Run multiple services in a container".
It explains how you can do it with an init system (systemd, sysvinit, upstart), a script (CMD ./my_wrapper_script.sh) or a supervisor like supervisord.
The && workaround only works for services that start in the background (daemons) or that finish quickly without interaction and release the prompt. If you do this with a service that stays in the foreground (keeps the prompt), only the first service will start.
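As a rough sketch of the wrapper-script idea, here is one way it could look in Python instead of shell. This is my own illustration under stated assumptions, not the script from the Docker documentation: run_services is a hypothetical helper that starts every command itself (so a foreground service cannot block the next one), waits until any of them exits, and then stops the rest.

```python
import subprocess
import sys
import time


def run_services(commands, poll_interval=0.1):
    """Start all commands, wait until one exits, then terminate the others.

    Returns the exit code of the first process seen to have exited.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        while True:
            for proc in procs:
                code = proc.poll()
                if code is not None:
                    return code
            time.sleep(poll_interval)
    finally:
        # Ask the remaining processes to stop, then reap every child.
        for proc in procs:
            if proc.poll() is None:
                proc.terminate()
        for proc in procs:
            proc.wait()


# Possible use as the container's single CMD (the sshd path is an assumption,
# the rabbitmq path is taken from the question):
#   sys.exit(run_services([
#       ["/usr/sbin/sshd", "-D"],
#       ["/opt/mq/sbin/rabbitmq-server", "start"],
#   ]))
```

Exiting as soon as one service dies means the container stops instead of limping along half-working, which lets Docker's restart policy (if you use one) take over.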
To address why CMD is designed to run only one service per container, let's just realize what would happen if the secondary servers run in the same container are not trivial / auxiliary but "major" (e.g. storage bundled with the frontend app). For starters, it would break down several important containerization features such as horizontal (auto-)scaling and rescheduling between nodes, both of which assume there is only one application (source of CPU load) per container. Then there is the issue of vulnerabilities - more servers exposed in a container means more frequent patching of CVEs...
So let's admit that it is a 'nudge' from Docker (and Kubernetes/Openshift) designers towards good practices and we should not reinvent workarounds (SSH is not necessary - we have docker exec / kubectl exec / oc rsh designed to replace it).
More info
https://devops.stackexchange.com/questions/447/why-it-is-recommended-to-run-only-one-process-in-a-container
I am new to Docker and I am somewhat confused about containers and images. I want to use Docker for Tensorflow development. All I need is an easy way to write Jupyter Notebooks and use GPU-powered Tensorflow.
I have the latest Tensorflow Jupyter Python 3 Image already. I run the Image with
docker run --rm --runtime=nvidia -v -it -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter
How can I make it so that the data I create while working in that image, such as the Jupyter Notebooks I add and edit, won't get lost after I exit the process? I know that Docker images aren't meant to persist state, but I am so new to this that I just want something to work in with persistent data. Can someone guide me through this or point me to a resource which will answer all my prayers?
I would also like to move some stuff into the container that is going to be run, so that I can access some custom Python libs that my Notebooks need to import!
Side questions:
--rm removes the container (or whatever) by default? I ran it without this flag and my data was still lost.
-v is for volumes? I tried -v Bachelor:/app to mount a volume, but it apparently doesn't make any difference. I don't know how to use the volume Bachelor that I created. Instead, a multitude of unnamed, unusable volumes is created whenever I run this.
-it also does something, no idea what.
-p is the port number, right?
Use Docker volumes:
Volumes are the preferred mechanism for persisting data generated by and used by Docker containers
Example:
docker run --runtime=nvidia -v ${SOURCE_FOLDER}:${DEST_FOLDER} -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter
Change SOURCE_FOLDER and DEST_FOLDER accordingly (use absolute paths!).
Now if you navigate to localhost:8888 and create a notebook in DEST_FOLDER, it should also be available in SOURCE_FOLDER.
As for your side questions:
-it runs a container in interactive mode. You generally add /bin/bash after the run command, so you can start an interactive bash session inside the container.
--rm removes the container after it exits.
Those options aren't really necessary for your use case. Just remember to use docker ps and docker rm <ID> to clean up your container after you're done.
I'm trying to create a container to run a program. I'm using a pre-configured image and now I need to run the program. However, it's a machine learning program and I need a dataset from my computer to run.
The file is too large to be copied to the container. It would be best if the program running in the container searched the dataset in a local directory of my computer, but I don't know how I can do this.
Well, I got the shared folder from my machine to appear in the container using docker run -it -v ~/Volumes/Data/Studies/PhD\Work/gitlab/J2/ydk-py:/ydk-py ydkdev/ydk-py, but none of the files in the ydk-py folder are shown. This is the safe, usually desired behavior, but for development and instance setup it would be immensely useful to have access to an existing file structure.
docker run with -v automatically mounts subdirectories as well. In your case you are using a relative path, but you need to use an absolute path, as per the documentation.
So change your command from
docker run -it -v ~/Volumes/Data/Studies/PhD\Work/gitlab/J2/ydk-py:/ydk-py ydkdev/ydk-py
to
docker run -it -v /home/<what ever user>/Volumes/Data/Studies/PhD\Work/gitlab/J2/ydk-py:/ydk-py ydkdev/ydk-py
and it will work.
Make sure you have sufficient permissions on the directory that you are trying to mount.
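If you launch the container from a script, a small helper can take care of the absolute path for you. This is just a sketch: volume_arg is a hypothetical function of mine, and it only uses os.path.expanduser and os.path.abspath from the standard library to turn the host side into an absolute path before it is handed to -v.

```python
import os


def volume_arg(host_path, container_path):
    """Return a 'host:container' string with the host side made absolute."""
    absolute = os.path.abspath(os.path.expanduser(host_path))
    return f"{absolute}:{container_path}"


# Possible use with the command from the question. Passing the arguments as a
# list avoids shell quoting problems with unusual characters in the path:
#   import subprocess
#   subprocess.run(["docker", "run", "-it",
#                   "-v", volume_arg("~/some/dir/ydk-py", "/ydk-py"),
#                   "ydkdev/ydk-py"])
```

The "~/some/dir/ydk-py" path in the comment is a placeholder; substitute the real location of your data.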
What's the proper development workflow for code that runs in a Docker container?
Solomon Hykes said that the "official" workflow involves building and running a new Docker image for each Git commit. That makes sense, but what if I want to test a change before committing it to the Git repo?
I can think of two ways to do it:
Run the code on a local development server (e.g., the Django development server). Edit a file; test on the dev server; make a Git commit; rebuild the Docker image with the new code; test again on the local Docker container.
Don't run a local dev server. Instead, build and run a new Docker image each time I edit a file, and then test the change on local Docker container.
Both approaches are pretty inefficient. Is there a better way?
A more efficient way is to run a new container from the latest image that was built (which then has the latest code).
You could start that container with a bash shell so that you are able to edit files from inside the container:
docker run -it <some image> bash -l
You would then run the application in that container to test the new code.
Another way to alter files in that container is to start it with a volume. The idea is to alter files in a directory on the docker host instead of messing with files from the command line from the container itself:
docker run -it -v /home/joe/tmp:/data <some image>
Any file that you put in /home/joe/tmp on your docker host will be available under /data/ in the container. Change /data to whatever path is suitable for your case and hack away.