I have a simple Python program that I want to run in IBM Cloud Functions. Alas, it needs two libraries (O365 and PySnow), so I have to Dockerize it, and it needs to be able to accept a JSON feed from STDIN. I succeeded in doing this:
FROM python:3
ADD requirements.txt ./
RUN pip install -r requirements.txt
ADD ./main ./main
WORKDIR /main
CMD ["python", "main.py"]
This runs with: cat env_var.json | docker run -i f9bf70b8fc89
I've added the Docker container to IBM Cloud Functions like this:
ibmcloud fn action create e2t-bridge --docker [username]/e2t-bridge
However when I run it, it times out.
Now I did see a possible solution route, where I Dockerize it as an OpenWhisk application. But for that I need to create a binary from my Python application and then load it into a rather complicated OpenWhisk skeleton, I think?
But having a file you can simply run is the whole point of my Docker image, so creating a binary from an interpreted language and then adding it to an OpenWhisk Docker image just feels awfully clunky.
What would be the best way to approach this?
It turns out you don't need to create a binary, you just need to edit the OpenWhisk skeleton like so:
# Dockerfile for example whisk docker action
FROM openwhisk/dockerskeleton
ENV FLASK_PROXY_PORT 8080
### Add source file(s)
ADD requirements.txt /action/requirements.txt
RUN cd /action; pip install -r requirements.txt
# Copy the application files into /action
ADD ./main /action
# Rename our executable Python action
ADD /main/main.py /action/exec
CMD ["/bin/bash", "-c", "cd actionProxy && python -u actionproxy.py"]
And make sure that your Python code accepts the JSON input as its first command-line argument (the skeleton passes it via argv rather than STDIN):
json_input = json.loads(sys.argv[1])
The whole explanation is here: https://github.com/iainhouston/dockerPython
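As a rough illustration of the argv-based input described above, here is a minimal sketch of what the script copied to /action/exec might look like. The handle function and its return value are hypothetical placeholders, and the sketch assumes the action proxy passes the parameters as a single JSON string in argv[1] and reads the result from the last line printed to stdout.

```python
import json
import sys


def handle(params):
    # Hypothetical business logic: report which parameter names arrived.
    return {"received_keys": sorted(params)}


def run(raw):
    """Turn the raw JSON argument into the JSON line the action prints."""
    try:
        params = json.loads(raw)
    except json.JSONDecodeError as err:
        return json.dumps({"error": f"invalid JSON input: {err}"})
    if not isinstance(params, dict):
        return json.dumps({"error": "expected a JSON object"})
    return json.dumps(handle(params))


if __name__ == "__main__":
    # Assumption: the proxy passes the parameters as one JSON string in argv[1].
    print(run(sys.argv[1] if len(sys.argv) > 1 else "{}"))
```

Keeping the parsing in run makes it possible to try the logic locally, for example with python main.py '{"name": "value"}', before building the image.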
My folder structure looked like this:
My Dockerfile looked like this:
FROM python:3.8-slim-buster
WORKDIR /src
COPY src/requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
COPY src/ .
CMD [ "python", "main.py"]
When I ran these commands:
docker build --tag free .
docker run free
my main.py file ran and gave the correct print statements as well. Now, I have added another file, tests.py, in the src folder. I want to run tests.py first and then main.py.
I tried modifying the CMD within my Dockerfile like this:
CMD [ "python", "test.py"] && [ "python", "main.py"]
but then it gives me the print statements from only the first file, test.py.
I read about docker-compose and added this docker-compose.yml file to the root folder:
version: '3'
services:
  tests:
    image: free
    command: >
      /bin/sh -c 'python tests.py'
  main:
    image: free
    command: >
      /bin/sh -c 'python main.py'
Then I changed my Dockerfile by removing the CMD:
FROM python:3.8-slim-buster
WORKDIR /src
COPY src/requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
COPY src/ .
Then I ran the following commands:
docker compose build
docker compose run tests
docker compose run main
When I run these commands separately, I get the correct print statements for both tests and main. However, I am not sure if I am using docker-compose correctly or not.
Am I supposed to run both scripts separately? Or is there a way to run one after another using a single docker command?
What is my Dockerfile supposed to look like if I am running the Python scripts from the docker-compose.yml instead?
Edit:
Ideally looking for solutions based on docker-compose
In the Bourne shell, in general, you can run two commands in sequence by putting && between them. It sounds like you're already aware of this.
# without Docker, at a normal shell prompt
python test.py && python main.py
The Dockerfile CMD has two syntactic forms. The JSON-array form does not run a shell, and so it is slightly more efficient and has slightly more consistent escaping rules. If it's not a JSON array then Docker automatically runs it via a shell. So for your use you can use the shell form:
CMD python test.py && python main.py
In comments to other answers you ask about providing this as an override in the docker-compose.yml file. Compose will not normally run a shell for you, so you need to explicitly specify it as part of the command: override.
command: /bin/sh -c 'python test.py && python main.py'
Your Dockerfile should generally specify a CMD and the docker-compose.yml often will not include a command:. This makes it easier to run the image in other contexts (via docker run without Compose; in Kubernetes) since you won't have to retype the command every different way you want to run the container. The entrypoint wrapper pattern highlighted in @sytech's answer is very useful in general and it's easy to add to a container that uses a CMD without an ENTRYPOINT; but it requires the Dockerfile to use CMD as a normal well-formed shell command.
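If you would rather keep the JSON-array form of CMD and avoid a shell entirely, the same "stop at the first failure" behaviour of && can be sketched in a small Python wrapper. The function name run_in_order and the idea of a separate wrapper file are assumptions for illustration, not something from the question.

```python
import subprocess
import sys


def run_in_order(commands):
    """Run each command in turn; stop at the first non-zero exit code, like `&&`."""
    for cmd in commands:
        code = subprocess.run(cmd).returncode
        if code != 0:
            # Later commands are skipped, and the failing exit code is reported.
            return code
    return 0
```

Saved as a hypothetical run_all.py ending with sys.exit(run_in_order([[sys.executable, "tests.py"], [sys.executable, "main.py"]])), it could be started with CMD ["python", "run_all.py"], and the container's exit code would then reflect a test failure.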
You have to change CMD to ENTRYPOINT. And run the 1st script as daemon in the background using &.
ENTRYPOINT ["/docker_entrypoint.sh"]
docker_entrypoint.sh
#!/bin/bash
set -e
exec python tests.py &
exec python main.py
In general, it is a good rule of thumb that a container should only run a single process, and that essential process should be PID 1.
Using an entrypoint can help you do multiple things at runtime and optionally run user-defined commands using exec, as according to the best practices guide.
For example, if you always want the tests to run whenever the container starts, then execute the defined command in CMD.
First, create an entrypoint script (be sure to make it executable with chmod +x):
#!/usr/bin/env bash
# always run tests first
python /src/tests.py
# then run user-defined command
exec "$@"
Then configure the dockerfile to copy the script and set it as the entrypoint:
#...
COPY entrypoint.sh /docker-entrypoint.sh
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["python", "main.py"]
Then when you build an image from this dockerfile and run it, the entrypoint will first execute the tests then run the command to run main.py
The command can also still be overridden by the user when running the image like docker run ... myimage <new command> which will still result in the entrypoint tests being executed, but the user can change the command being run.
You can achieve this by creating a bash script (let's name it entrypoint.sh) that contains the python commands. If you want, you can run them as background processes.
#!/usr/bin/env bash
set -e
python tests.py
python main.py
Edit your docker file as follows:
FROM python:3.8-slim-buster
# Create workDir
RUN mkdir code
WORKDIR code
ENV PYTHONPATH=/code
#upgrade pip if you like here
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy Code
COPY . .
RUN chmod +x entrypoint.sh
ENTRYPOINT ["./entrypoint.sh"]
In the docker compose file, add the following line to the service.
entrypoint: [ "./entrypoint.sh" ]
Have you tried this in your docker-compose.yaml?
version: '3'
services:
  main:
    image: free
    command: >
      /bin/sh -c 'python3 tests.py && python3 main.py'
Here main.py starts only after tests.py has finished successfully.
then run in terminal
docker-compose up --build
A sample repo with the directory structure of what I'm working on is on GitHub here. To run the GitHub Action, you just need to go to the Action tab of the repo and run the Action manually.
I have a custom GitHub Action I've written as well with python as the base image in the Docker container but want the python version to be an input for the GitHub Action. In order to do so, I am creating a second intermediate Docker container to run with the python version input argument.
The problem I'm running into is that I don't have access to the files of the repo that calls the GitHub Action. For example, say the repo is called python-sample-project and has this folder structure:
python-sample-project
│ main.py
│ file1.py
│
└───folder1
│ │ file2.py
I see main.py, file1.py, and folder1/file2.py in entrypoint.sh. However, in docker-action/entrypoint.sh I only see the linux folder structure and the entrypoint.sh file copied over in docker-action/Dockerfile.
In the Alpine example I'm using, the action entrypoint.sh script looks like this:
#!/bin/sh -l
ALPINE_VERSION=$1
cd /docker-action
docker build -t docker-action --build-arg alpine_version="$ALPINE_VERSION" . && docker run docker-action
In docker-action/ I have a Dockerfile and entrypoint.sh script that should run for the inner container with the dynamic version of Alpine (or Python)
The docker-action/Dockerfile is as follows:
# Container image that runs your code
ARG alpine_version
FROM alpine:${alpine_version}
# Copies your code file from your action repository to the filesystem path `/` of the container
COPY entrypoint.sh /entrypoint.sh
RUN ["chmod", "+x", "/entrypoint.sh"]
# Code file to execute when the docker container starts up (`entrypoint.sh`)
ENTRYPOINT ["/entrypoint.sh"]
In the docker-action/entrypoint I run ls but I do not see the repository files.
Is it possible to access the main.py, file1.py, and folder1/file2.py in entrypoint.sh in the docker-action/entrypoint.sh?
There are generally two ways to make files from your repository available to a Docker container you build and run. You either (1) add the files to the image when you build it or (2) mount the files into the container when you run it. There are some other ways, like specifying volumes, but that's probably out of scope for this case.
The Dockerfile docker-action/Dockerfile does not copy any files except for the entrypoint.sh script. Your entrypoint.sh also does not provide any mount points when running the container. Hence, the outcome you observe is the expected outcome based on these facts.
In order to resolve this, you must either (1) add COPY/ADD statements to your Dockerfile to copy files into the image (and set appropriate build context) OR (2) mount the files into the container when it runs by adding -v /source-path:/container-path to the docker run command in your entrypoint.sh.
See references:
COPY reference
Docker run reference
Though, this approach of building another container just to get a user-provided python version is a highly questionable practice for GitHub Actions and should probably be avoided. Consider leaning on the setup-python action instead.
The docker-in-docker problem
Nevertheless, if you continue this route and want to go about mounting the directory, you'll have to keep in mind that, when invoking docker from within a docker action on GitHub, the filesystem in the mount specification refers to the filesystem of the docker host, not the filesystem of the container.
It works on my machine?!
Counter to what you might experience running docker on a local system for example, this does not work in GitHub -- the working directory is not mounted:
docker run -v $(pwd):/opt/workspace \
--workdir /opt/workspace \
--entrypoint /bin/ls \
my-container "-R"
This doesn't work either:
docker run -v $GITHUB_WORKSPACE:$GITHUB_WORKSPACE \
--workdir $GITHUB_WORKSPACE \
--entrypoint /bin/ls \
my-container "-R"
This kind of thing would work perfectly fine if you tried it on a system running docker locally. What gives?
Dealing with the devil (daemon)
In Actions, the starting working directory, where files are checked out, is $GITHUB_WORKSPACE. In Docker actions, that's /github/workspace. The files appear in the workspace when your action runs because the Actions runner mounts the workspace from the host where the Docker daemon is running.
You can see that in the command run when your action starts:
/usr/bin/docker run --name f884202608aa2bfab75b6b7e1f87b3cd153444_f687df --label f88420 --workdir /github/workspace --rm -e INPUT_ALPINE-VERSION -e HOME -e GITHUB_JOB -e GITHUB_REF -e GITHUB_SHA -e GITHUB_REPOSITORY -e GITHUB_REPOSITORY_OWNER -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RETENTION_DAYS -e GITHUB_RUN_ATTEMPT -e GITHUB_ACTOR -e GITHUB_WORKFLOW -e GITHUB_HEAD_REF -e GITHUB_BASE_REF -e GITHUB_EVENT_NAME -e GITHUB_SERVER_URL -e GITHUB_API_URL -e GITHUB_GRAPHQL_URL -e GITHUB_WORKSPACE -e GITHUB_ACTION -e GITHUB_EVENT_PATH -e GITHUB_ACTION_REPOSITORY -e GITHUB_ACTION_REF -e GITHUB_PATH -e GITHUB_ENV -e RUNNER_OS -e RUNNER_NAME -e RUNNER_TOOL_CACHE -e RUNNER_TEMP -e RUNNER_WORKSPACE -e ACTIONS_RUNTIME_URL -e ACTIONS_RUNTIME_TOKEN -e ACTIONS_CACHE_URL -e GITHUB_ACTIONS=true -e CI=true -v "/var/run/docker.sock":"/var/run/docker.sock" -v "/home/runner/work/_temp/_github_home":"/github/home" -v "/home/runner/work/_temp/_github_workflow":"/github/workflow" -v "/home/runner/work/_temp/_runner_file_commands":"/github/file_commands" -v "/home/runner/work/my-repo/my-repo":"/github/workspace" f88420:2608aa2bfab75b6b7e1f87b3cd153444 "3.9.5"
The important bits are this:
-v "/home/runner/work/my-repo/my-repo":"/github/workspace"
-v "/var/run/docker.sock":"/var/run/docker.sock"
/home/runner/work/my-repo/my-repo is the path on the host, where the repository files are. As mentioned, that first line is what gets it mounted into /github/workspace in your action container when it gets run.
The second line is mounting the docker socket from the host to the action container. This means any time you call docker within your action, you're actually talking to the docker daemon outside of your container. This is important because that means when you use the -v argument inside your action, the arguments need to reflect directories that exist outside of the container.
So, what you would actually need to do instead is this:
docker run -v /home/runner/work/my-repo/my-repo:/opt/workspace \
--workdir /opt/workspace \
--entrypoint /bin/ls \
my-container "-R"
Becoming useful to others
And that works, if you only use it for the project itself. However, you have (among others) a remaining problem if you want this action to be consumable by other projects. How do you know where the workspace is on the host? This path will change for each repository, after all. GitHub does not guarantee these paths, either. They may be different on different platforms, or your action may be running on a self-hosted runner.
So how do you contend with that problem? There is no inbuilt environment variable that points to this directory you need specifically, unfortunately. However, by relying on an implementation detail, you might be able to get away with using the $RUNNER_WORKSPACE variable, which will point, in this case, to /home/runner/work/your-project. This is not the same place as the origin of $GITHUB_WORKSPACE, but it's close. You can use the GITHUB_REPOSITORY variable to build the path, though this isn't guaranteed to always be the case afaik:
PROJECT_NAME="$(basename ${GITHUB_REPOSITORY})"
WORKSPACE="${RUNNER_WORKSPACE}/${PROJECT_NAME}"
You also have some other things to fix, like the working directory from which you build.
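For an action whose wrapper is written in Python rather than shell, the same path construction can be sketched as below. The function name host_workspace is made up for illustration, and the result relies on the same runner implementation detail as the shell snippet, so treat it as a guess rather than a guarantee.

```python
import os
import posixpath


def host_workspace(env=os.environ):
    """Guess the host-side checkout path from runner-provided variables."""
    # GITHUB_REPOSITORY has the form "owner/repo"; only the repo part is needed.
    repo_name = posixpath.basename(env["GITHUB_REPOSITORY"])
    # RUNNER_WORKSPACE is the parent directory of the checkout on the host.
    return posixpath.join(env["RUNNER_WORKSPACE"], repo_name)
```

The returned value is what would go on the left-hand side of a -v host-path:container-path argument when the inner container is started.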
TL;DR
You need to mount files into the container when you run it. In GitHub, you're running docker-in-docker, so the paths you use to mount files work differently; you need to find the correct host paths to pass to docker when it is called from within your action container.
A minimal working solution for the example project you linked is this entrypoint.sh in the root of the repo:
#!/usr/bin/env sh
ALPINE_VERSION=$1
docker build -t docker-action \
-f ./docker-action/Dockerfile \
--build-arg alpine_version="$ALPINE_VERSION" \
./docker-action
PROJECT_NAME="$(basename ${GITHUB_REPOSITORY})"
WORKSPACE="${RUNNER_WORKSPACE}/${PROJECT_NAME}"
docker run --workdir=$GITHUB_WORKSPACE \
-v $WORKSPACE:$GITHUB_WORKSPACE \
docker-action "$@"
There are probably further concerns with your action, depending on what it does, like making available all the default and user-defined environment variables for the action to the 'inner' container, if that's important.
So, is this possible? Sure. Is it reasonable just to get a dynamic version of alpine/python? I don't think so. There's probably better ways of accomplishing what you want to do, like using setup-python, but that sounds like a different question.
So basically I have a python script that will write to a file once it is done running. How do I access this file? My end goal is to run the docker image on jenkins and then read the xml file that the python script generates.
FROM python:3
ADD WebChecker.py /
ADD requirements.txt /
ADD sites.csv /
RUN pip install -r requirements.txt
CMD [ "python", "./WebChecker.py" ]
That is my Dockerfile. I have a print("Finished") in there and it is printing so that means everything is working fine. It's just now I need to see my output.xml file.
You may have solved this already by following the comments above. In case you are still stuck, you can try the following:
Build:
docker build -t some_tag_name_to_your_image .
After build is completed, you may run a container and get the xml file as below:
1. Write output file to bind volume
Run your container as below:
docker run -d --rm --name my_container \
-v ${WORKSPACE}:/path/to/xml/file/in/container \
some_tag_name_to_your_image
Once the XML file has been generated, it will be available on the Jenkins host at ${WORKSPACE}.
Notes:
${WORKSPACE} is an env variable set by Jenkins. Read more env-vars here
Read more about bind mount here
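For the bind mount to be useful, the script has to write output.xml into the directory that is mounted. A minimal sketch of that, assuming a hypothetical OUTPUT_DIR environment variable and a made-up results mapping (the real WebChecker.py is not shown in the question):

```python
import os
import xml.etree.ElementTree as ET
from pathlib import Path


def write_report(results, out_dir=None):
    """Write results to output.xml inside out_dir (default: $OUTPUT_DIR or cwd)."""
    target = Path(out_dir or os.environ.get("OUTPUT_DIR", ".")) / "output.xml"
    target.parent.mkdir(parents=True, exist_ok=True)
    root = ET.Element("sites")
    for url, status in results.items():
        # One <site> element per checked URL; attribute values must be strings.
        ET.SubElement(root, "site", url=url, status=str(status))
    ET.ElementTree(root).write(str(target), encoding="utf-8", xml_declaration=True)
    return target
```

With this in place, the container path on the right-hand side of -v and the value of OUTPUT_DIR (passed with -e) would need to match.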
A very simple Python program. Suppose the current directory is /PYTHON. I want to pass file.txt as an argument to the Python script boot.py. Here is my Dockerfile:
FROM python
COPY boot.py ./
COPY file.txt ./
RUN pip install numpy
CMD ["python", "boot.py", "file.txt"]
Then I build the Docker container with:
docker build -t boot:latest .
Then run the container
docker run -t boot:latest python boot.py file.txt
I got the correct results.
But if I copy another file, file1.txt, to the current directory (from a different directory (not /PYTHON)), then I run the container again:
docker run -t boot:latest python boot.py file1.txt
I got the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'file1.txt'
so the error is due to fact that file1.txt is not in the container, but if I share this container with a friend and the friend wants to pass a very different file as argument, how do I write the Dockerfile so anybody with my container can pass very different files as argument without errors?
It won't work that way. Like you said, file1.txt is not in the container.
The workaround is to use Docker volumes to inject files from your host machine to the container when running it.
Something like this:
docker run -v /local/path/to/file1.txt:/container/path/to/file1.txt -t boot:latest python boot.py /container/path/to/file1.txt
Then /local/path/to/file1.txt would be the path on your host machine which will override /container/path/to/file1.txt on the container.
You may also make your script read from STDIN and then pass data to docker using cat. Have a look at how to get docker container to read from stdin?
The trick is to keep STDIN open even if not attached with
--interactive or -i (alias) option for Docker.
Something like:
cat /path/to/file | docker run -i --rm boot python boot.py
Or:
docker run -i --rm boot python boot.py < /path/to/file
EOF is the end of the input.
If I understand the question correctly, you are acknowledging that the file isn't in the container, and you are asking how best to share your container with the world, allowing people to add their own content to it.
You have a couple of options. One is to use Docker volumes, which allow your friends (and other interested parties) to mount local directories inside your Docker containers. That is, you can overlay a folder on your local filesystem onto a folder inside the container (this is generally quite nifty when you are developing locally as well).
Or, again, depending on the purpose of your container, somebody could extend your image. For example, a Dockerfile like
FROM yourdockerimage:latest
COPY file1.txt ./
CMD ["python", "boot.py", "file1.txt"]
Choose whichever option suits your project the best.
One option is to make use of volumes.
This way all collaborators on the project are able to mount them in the containers.
You could change your Dockerfile to:
FROM python
COPY boot.py ./
COPY file.txt ./
RUN pip install numpy
ENTRYPOINT ["python", "boot.py"]
And then run it to read from STDIN:
docker run -i boot:latest - < file1.txt
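The STDIN variants only work if boot.py itself is prepared to read from standard input. A minimal sketch of that logic, assuming the convention that a missing argument or a lone "-" means "read STDIN" (the helper name read_input is made up, and the real boot.py is not shown in the question):

```python
import sys


def read_input(argv, stdin=sys.stdin):
    """Return text from the file named in argv[1], or from stdin if absent or '-'."""
    if len(argv) > 1 and argv[1] != "-":
        # A real path was given: it must exist inside the container (e.g. via -v).
        with open(argv[1], encoding="utf-8") as handle:
            return handle.read()
    # No path, or "-": consume whatever was piped in through docker run -i.
    return stdin.read()
```

Inside boot.py this would be called as read_input(sys.argv), so the same image supports a mounted file path as well as piped input.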
I have a Docker file trying to deploy Django code to a container
FROM ubuntu:latest
MAINTAINER { myname }
#RUN echo "deb http://archive.ubuntu.com/ubuntu/ $(lsb_release -sc) main universe" >> /etc/apt/sou$
RUN apt-get update
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y tar git curl dialog wget net-tools nano buil$
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y python python-dev python-distribute python-p$
RUN mkdir /opt/app
WORKDIR /opt/app
#Pull Code
RUN git clone git@bitbucket.org/{user}/{repo}
RUN pip install -r website/requirements.txt
#EXPOSE = ["8000"]
CMD python website/manage.py runserver 0.0.0.0:8000
And then I build my code as docker build -t dockerhubaccount/demo:v1 ., and this pulls my code from Bitbucket to the container. I run it as docker run -p 8000:8080 -td felixcheruiyot/demo:v1 and things appear to work fine.
Now I want to update the code. Since I used git clone ..., I am confused about the following:
How can I update my code when I have new commits, so that the Docker build ships with the new code? (Note: when I run the build, it does not fetch the new commits because of the cache.)
What is the best workflow for this kind of approach?
There are a couple of approaches you can use.
1. You can use docker build --no-cache to avoid using the cached Git clone layer.
2. Have the startup command call git pull. So instead of running python manage.py, you'd have something like CMD cd /repo && git pull && python manage.py, or use a start script if things are more complex.
I tend to prefer 2. You can also run a cron job to update the code in your container, but that's a little more work and goes somewhat against the Docker philosophy.
I would recommend that you check out the code on your host and COPY it into the image. That way it will be updated whenever you make a change. Also, during development you can bind mount the source directory over the code directory in the container, meaning any changes are reflected immediately in the container.
A docker command for git repositories that checks for the last update would be very useful though!
Another solution.
The docker build command uses the cache as long as an instruction string is exactly the same as the one in the cached image. So, if you write
RUN echo '2014122400' >/dev/null && git pull ...
On next update, you change as follows.
RUN echo '2014122501' >/dev/null && git pull ...
This prevents Docker from using the cache.
I would like to offer another possible solution. I need to warn however that it's definitely not the "docker way" of doing things and relies on the existence of volumes (which could be a potential blocker in tools like Docker Swarm and Kubernetes)
The basic principle that we will be taking advantage of is the fact that the contents of container directories that are used as Docker Volumes, are actually stored in the file system of the host. Check out this part of the documentation.
In your case you would make /opt/app a Docker Volume. You don't need to map the Volume explicitly to a location on the host's file system since, as I will describe below, the mapping can be obtained dynamically.
So for starters leave your Dockerfile exactly as it is and switch your container creation command to something like:
docker run -p 8000:8080 -v /opt/app --name some-name -td felixcheruiyot/demo:v1
The command docker inspect -f '{{index .Volumes "/opt/app"}}' some-name will print the full file system path on the host where your code is stored (this is where I picked up the inspect trick).
Armed with that knowledge, all you have to do is replace that code and you're all set.
So a very simple deploy script would be something like:
code_path=$(docker inspect -f '{{index .Volumes "/opt/app"}}' some-name)
rm -rfv "$code_path"/*
cd "$code_path"
git clone git@bitbucket.org/{user}/{repo}
The benefits you get with an approach like this are:
There are no potentially costly cacheless image rebuilds
There is no need to move application-specific running information into the run command. The Dockerfile is the only source of information needed for instrumenting the application.
UPDATE
You can achieve the same results I have mentioned above using docker cp (starting with Docker 1.8). This way the container need not have volumes, and you can replace code in the container as you would on the host file system.
Of course as I mentioned in the beginning of the answer, this is not the "docker way" of doing things, which advocates containers being immutable and reproducible.
If you use GitHub you can use the GitHub API to not cache specific RUN commands.
You need to have jq installed to parse JSON: apt-get install -y jq
Example:
docker build --build-arg SHA=$(curl -s 'https://api.github.com/repos/Tencent/mars/commits' | jq -r '.[0].sha') -t imageName .
In Dockerfile (ARG command should be right before RUN):
ARG SHA=LATEST
RUN SHA=${SHA} \
git clone https://github.com/Tencent/mars.git
Or if you don't want to install jq:
SHA=$(curl -s 'https://api.github.com/repos/Tencent/mars/commits' | grep sha | head -1)
If a repository has new commits, git clone will be executed.
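If neither jq nor a fragile grep is appealing, the SHA lookup can also be sketched with the Python standard library. The function names below are made up for illustration; the sketch assumes the commits endpoint returns a JSON array with the most recent commit first, and that unauthenticated requests are acceptable despite their rate limits.

```python
import json
import urllib.request


def latest_sha(commits_json):
    """Return the SHA of the first (most recent) commit in a commits listing."""
    commits = json.loads(commits_json)
    return commits[0]["sha"]


def fetch_latest_sha(owner, repo):
    """Download the commits listing for owner/repo and extract the newest SHA."""
    url = f"https://api.github.com/repos/{owner}/{repo}/commits"
    with urllib.request.urlopen(url, timeout=10) as response:
        return latest_sha(response.read().decode("utf-8"))
```

The value printed by a small script around fetch_latest_sha("Tencent", "mars") could then be passed to docker build via --build-arg SHA=... in the same way as the curl and jq pipeline.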