How to access generated files inside Docker image - python

So basically I have a python script that will write to a file once it is done running. How do I access this file? My end goal is to run the docker image on jenkins and then read the xml file that the python script generates.
FROM python:3
ADD WebChecker.py /
ADD requirements.txt /
ADD sites.csv /
RUN pip install -r requirements.txt
CMD [ "python", "./WebChecker.py" ]
That is my Dockerfile. I have a print("Finished") in there and it is printing so that means everything is working fine. It's just now I need to see my output.xml file.

You should have it working by now if you followed the comments above. In case you are still stuck, you can try the following:
Build:
docker build -t some_tag_name_to_your_image .
After the build completes, you can run a container and retrieve the xml file as below:
1. Write the output file to a bind mount
Run your container as below:
docker run -d --rm --name my_container \
-v ${WORKSPACE}:/path/to/xml/file/in/container \
some_tag_name_to_your_image
Once the xml file is generated, it will be available on the Jenkins host at ${WORKSPACE}.
Notes:
${WORKSPACE} is an environment variable set by Jenkins. Read more about Jenkins env vars here
Read more about bind mounts here
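For this to work, WebChecker.py itself has to write output.xml into the directory you mount. A minimal sketch of that idea (the OUTPUT_DIR variable name is just an assumption; use whatever fits your script):
import os
import xml.etree.ElementTree as ET

# OUTPUT_DIR is a made-up variable name for this sketch; default to the current directory.
output_dir = os.getenv("OUTPUT_DIR", ".")

# Build a trivial XML document as a stand-in for the real check results.
root = ET.Element("results")
ET.SubElement(root, "status").text = "Finished"

# Write into the (possibly bind-mounted) output directory.
ET.ElementTree(root).write(os.path.join(output_dir, "output.xml"))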

Related

How to not stop nohup and get output files

I’m new to working on Linux. I apologize if this is a dumb question. Despite searching for more than a week, I was not able to derive a clear answer to my question.
I'm running a very long Python program on Nvidia GPUs. The output is several csv files. It takes a long time to compute the output, so I use nohup to be able to exit the process.
Let’s say main.py file is this
import numpy as np
import pandas as pd

if __name__ == '__main__':
    a = np.arange(1, 1000)
    data = a * 2
    filename = 'results.csv'
    output = pd.DataFrame(data, columns=["Output"])
    output.to_csv(filename)
The calculations for data are more complicated, of course. I build a docker container and run this program inside it. When I use python main.py for a smaller-sized example, there is no problem. It writes the csv files.
My question is this:
When I do nohup python main.py & and check what's going on with tail -f nohup.out inside the docker container, I can see what it is doing at that time, but I cannot exit that view and let the execution run its course. It just stops there. How can I safely exit the screen that comes with tail -f nohup.out?
I tried not checking on the code and letting it run for two days, then I returned. The output of tail -f nohup.out indicated that the execution had finished, but the csv files were nowhere to be seen. Is the output somehow bundled up inside nohup.out, or does this indicate something else is wrong?
If you're going to run this setup in a Docker container:
A Docker container runs only one process, as a foreground process; when that process exits, the container completes. That process is almost always the script or server you're trying to run and not an interactive shell. However:
It's possible to use Docker constructs to run the container itself in the background, and collect its logs while it's running or after it completes.
A typical Dockerfile for a Python program like this might look like:
FROM python:3.10
# Create and use some directory; it can be anything, but do
# create _some_ directory.
WORKDIR /app
# Install Python dependencies as a separate step. Doing this first
# saves time if you repeat `docker build` without changing the
# requirements list.
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy in the rest of the application.
COPY . .
# Set the main container command to be the script.
CMD ["./main.py"]
The script should be executable (chmod +x main.py on your host) and begin with a "shebang" line (#!/usr/bin/env python3) so the system knows where to find the interpreter.
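For example, the top of main.py might look like this (a sketch, not your actual program):
#!/usr/bin/env python3
# The shebang above lets the container run ./main.py directly as its command.
import numpy as np
import pandas as pd

if __name__ == '__main__':
    # ... the real computation that produces the csv files goes here ...
    pass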
You will hear recommendations to use both CMD and ENTRYPOINT for the final line. It doesn't matter much to your immediate question. I prefer CMD for two reasons: it's easier to launch an alternate command to debug your container (docker run --rm your-image ls -l vs. docker run --rm --entrypoint ls your-image -l), and there's a very useful pattern of using ENTRYPOINT to do some initial setup (creating environment variables dynamically, running database migrations, ...) and then launching CMD.
Having built the image, you can use the docker run -d option to launch it in the background, and then run docker logs to see what comes out of it.
# Build the image.
docker build -t long-python-program .
# Run it, in the background.
docker run -d --name run1 long-python-program
# Review its logs.
docker logs run1
If you're running this to produce files that need to be read back from the host, you need to mount a host directory into your container at the time you start it. You need to make a couple of changes to do this successfully.
In your code, you need to write the results somewhere different than your application code. You can't mount a host directory over the /app directory since it will hide the code you're actually trying to run.
import os  # needed for os.getenv and os.path.join

data_dir = os.getenv('DATA_DIR', 'data')
filename = os.path.join(data_dir, 'results.csv')
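Putting those pieces together, the adapted script might look roughly like this (a sketch based on the example above; the DATA_DIR default is an assumption):
#!/usr/bin/env python3
import os

import numpy as np
import pandas as pd

if __name__ == '__main__':
    a = np.arange(1, 1000)
    data = a * 2

    # Write outside the /app code directory so a bind mount can capture the result.
    data_dir = os.getenv('DATA_DIR', 'data')
    os.makedirs(data_dir, exist_ok=True)
    filename = os.path.join(data_dir, 'results.csv')

    output = pd.DataFrame(data, columns=["Output"])
    output.to_csv(filename)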
Optionally, in your Dockerfile, create this directory and set a pointer to it. Since my sample code gets its location from an environment variable you can again use any path you want.
# Create the data directory.
RUN mkdir /data
ENV DATA_DIR=/data
When you launch the container, the docker run -v option mounts filesystems into the container. For this sort of output file you're looking for a bind mount that directly attaches a host directory to the container.
docker run -d --name run2 \
-v "$PWD/results:/data" \
long-python-program
In this example so far we haven't set the USER of the program, and it will run as root. You can change the Dockerfile to set up an alternate USER (which is good practice); you do not need to chown anything except the data directory to be owned by that user (leaving your code owned by root and not world-writeable is also good practice). If you do this, when you launch the container (on native Linux) you need to provide the host numeric user ID that can write to the host directory; you do not need to make other changes in the Dockerfile.
docker run -d --name run2 \
-u $(id -u) \
-v "$PWD/results:/data" \
long-python-program
1- A container is a foreground process. Use CMD or ENTRYPOINT in the Dockerfile.
2- Map a volume in docker to a Linux directory.

How to run a docker image in IBM Cloud functions?

I have a simple Python program that I want to run in IBM Cloud functions. Alas it needs two libraries (O365 and PySnow) so I have to Dockerize it and it needs to be able to accept a Json feed from STDIN. I succeeded in doing this:
FROM python:3
ADD requirements.txt ./
RUN pip install -r requirements.txt
ADD ./main ./main
WORKDIR /main
CMD ["python", "main.py"]
This runs with: cat env_var.json | docker run -i f9bf70b8fc89
I've added the Docker container to IBM Cloud Functions like this:
ibmcloud fn action create e2t-bridge --docker [username]/e2t-bridge
However when I run it, it times out.
Now I did see a possible solution route, where I dockerize it as an Openwhisk application. But for that I need to create a binary from my Python application and then load it into a rather complicated Openwhisk skeleton, I think?
But having a file you can simply run is the whole point of my Docker setup, so creating a binary of an interpreted language and then adding it into an OpenWhisk docker image just feels awfully clunky.
What would be the best way to approach this?
It turns out you don't need to create a binary, you just need to edit the OpenWhisk skeleton like so:
# Dockerfile for example whisk docker action
FROM openwhisk/dockerskeleton
ENV FLASK_PROXY_PORT 8080
### Add source file(s)
ADD requirements.txt /action/requirements.txt
RUN cd /action; pip install -r requirements.txt
# Move the source file(s) into the action directory
ADD ./main /action
# Rename our executable Python action
ADD /main/main.py /action/exec
CMD ["/bin/bash", "-c", "cd actionProxy && python -u actionproxy.py"]
And make sure that your Python code reads its JSON input from the first command-line argument (that is how the action proxy passes it in), for example:
import json
import sys

json_input = json.loads(sys.argv[1])
The whole explanation is here: https://github.com/iainhouston/dockerPython
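If it helps, here is a rough sketch of what main.py's entry point might look like under that setup (the parameter names are assumptions; as far as I understand the proxy's contract, the invocation parameters arrive as a JSON string in argv[1] and the result must be the last JSON line printed to stdout):
#!/usr/bin/env python
import json
import sys

def main():
    # Parameters arrive as a JSON string in argv[1], not on stdin.
    params = json.loads(sys.argv[1])

    # ... call O365 / PySnow with the parameters here ...

    # The last line written to stdout must be the JSON result of the action.
    print(json.dumps({"received_keys": sorted(params.keys())}))

if __name__ == '__main__':
    main()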

Exporting files from docker volume to another directory

I have a python code which reads data from a file, does some calculations, and saves the result to an output file. The code also saves logs to a log file. So in my current directory, I have the below files:
1. code.py --> The main python application
2. input.json --> This json file is used to take input data
3. output.json --> The output data is saved in this file.
4. logfile.log --> This file saves the log.
All the above files are inside the directory Application. The full path is /home/user/Projects/Application/. Now when I am running code.py I am getting the expected results. So I converted the above code into docker by using the below Dockerfile:
FROM python:3
ADD code.py /
ADD input.json /
ADD output.json /
ADD logfile.log /
CMD [ "python3", "./code.py" ]
When I am running the docker container, it is running fine, but I cannot see the output data and logs in output.json and logfile.log. Then I searched for these files in the file system and found them in the below directory:
/var/lib/docker/overlay2/7c237c143f9f2e711832daccecdfb29abaf1e37a4714f34f34870e0ee4b1af07/diff/home/user/Projects/Application/
and all my files were in that directory. I checked the logs and the data, and they were there. Then I understood that all the files are saved inside docker's storage and not in my current directory.
Is there any way I can keep the files and all the data in my current directory /home/user/Projects/Application/ instead of inside docker? That way it will be easy for me to check the outputs.
Thanks
The files are located under the docker overlay volume because you didn’t do volume mounting. To overcome this, you can modify your Dockerfile to look similar to this:
FROM python:3
RUN mkdir /app
ADD code.py /app
ADD input.json /app
ADD output.json /app
ADD logfile.log /app
WORKDIR /app
VOLUME /app
CMD [ "python3", "./code.py" ]
Then in your docker run command, make sure you pass this option:
-v /home/user/Projects/Application:/app
More information about container options can be found at https://www.aquasec.com/wiki/display/containers/Docker+Containers.
If you are using docker compose, you need to add:
volumes:
- /home/user/Projects/Application:/app
You may try to run your container as below: [you may not need to build your image]
docker run --rm -v /home/user/Projects/Application/:/home/user/Projects/Application/ -d python:3 python3 /home/user/Projects/Application/code.py
-v: bind mounts the local folder into your container at /home/user/Projects/Application/.
Feel free to take out --rm if you don't need that.
Please make sure code.py writes its logs to /home/user/Projects/Application/logfile.log, as in the sketch below.
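For example, something along these lines inside code.py (the APP_DIR variable is a made-up name for this sketch):
import logging
import os

# Write the log next to the rest of the application data so it ends up
# inside the bind-mounted directory.
app_dir = os.getenv('APP_DIR', '/home/user/Projects/Application')
logging.basicConfig(
    filename=os.path.join(app_dir, 'logfile.log'),
    level=logging.INFO,
)
logging.info('calculation started')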
To verify whether the files and folders are there or not, run this command:
docker run --rm -it -v /home/user/Projects/Application/:/home/user/Projects/Application/ python:3 sh
This will drop you into a shell, where you can list files and make sure the required files and configs are there.

Passing a file as an argument to a Docker container

A very simple Python program. Suppose the current directory is /PYTHON. I want to pass file.txt as an argument to the Python script boot.py. Here is my Dockerfile:
FROM python
COPY boot.py ./
COPY file.txt ./
RUN pip install numpy
CMD ["python", "boot.py", "file.txt"]
Then I build the Docker container with:
docker build -t boot:latest .
Then run the container
docker run -t boot:latest python boot.py file.txt
I got the correct results.
But if I copy another file, file1.txt, into the current directory (from a different directory, not /PYTHON) and then run the container again:
docker run -t boot:latest python boot.py file1.txt
I got the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'file1.txt'
so the error is due to the fact that file1.txt is not in the container. But if I share this container with a friend and the friend wants to pass a very different file as an argument, how do I write the Dockerfile so that anybody with my container can pass very different files as arguments without errors?
It won't work that way. Like you said, file1.txt is not in the container.
The workaround is to use Docker volumes to inject files from your host machine to the container when running it.
Something like this:
docker run -v /local/path/to/file1.txt:/container/path/to/file1.txt -t boot:latest python boot.py /container/path/to/file1.txt
Then /local/path/to/file1.txt would be the path on your host machine which will override /container/path/to/file1.txt on the container.
You may also make your script read from STDIN and then pass data to docker using cat. Have a look at how to get docker container to read from stdin?
The trick is to keep STDIN open even if not attached with
--interactive or -i (alias) option for Docker.
Something like:
cat /path/to/file | docker run -i --rm boot python boot.py
Or:
docker run -i --rm boot python boot.py < /path/to/file
EOF is the end of the input.
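A sketch of what the stdin-reading side of boot.py could look like (just an illustration, not the asker's actual script):
#!/usr/bin/env python3
import sys

def main():
    # Read the whole input from stdin instead of opening a named file,
    # so data can be piped in via `docker run -i`.
    data = sys.stdin.read()
    print("received {} characters".format(len(data)))

if __name__ == "__main__":
    main()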
If I understand the question correctly, you are acknowledging that the file isn't in the container, and you are asking how best to share your container with the world, allowing people to add their own content to it.
You have a couple of options. Either use Docker volumes, which allow your friends (and other interested parties) to mount local volumes inside your Docker containers; that is, you can overlay a folder on your local filesystem onto a folder inside the container (this is generally quite nifty when you are developing locally as well).
Or, again, depending on the purpose of your container, somebody could extend your image. For example, a Dockerfile like
FROM yourdockerimage:latest
COPY file1.txt ./
CMD ["python", "boot.py", "file1.txt"]
Choose whichever option suits your project the best.
One option is to make use of volumes.
This way all collaborators on the project are able to mount them in the containers.
You could change your Dockerfile to:
FROM python
COPY boot.py ./
COPY file.txt ./
RUN pip install numpy
ENTRYPOINT ["python", "boot.py"]
And then run it to read from STDIN:
docker run -i boot:latest - < file1.txt
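For that last command to work, boot.py has to understand that "-" means "read from stdin". A hedged sketch of that convention (the argument handling here is an assumption, not the asker's actual code):
#!/usr/bin/env python3
import sys

def main():
    # Treat "-" (or no argument at all) as "read from stdin";
    # anything else is a path baked into the image or bind-mounted in.
    source = sys.argv[1] if len(sys.argv) > 1 else "-"
    if source == "-":
        data = sys.stdin.read()
    else:
        with open(source) as f:
            data = f.read()
    print("read {} characters".format(len(data)))

if __name__ == "__main__":
    main()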

Docker interactive mode and executing script

I have a Python script in my docker container that needs to be executed, but I also need to have interactive access to the container once it has been created ( with /bin/bash ).
I would like to be able to create my container, have my script executed and be inside the container to see the changes/results that have occurred (no need to manually execute my python script).
The current issue I am facing is that if I use the CMD or ENTRYPOINT commands in the docker file I am unable to get back into the container once it has been created. I tried using docker start and docker attach but I'm getting the error:
sudo docker start containerID
sudo docker attach containerID
"You cannot attach to a stepped container, start it first"
Ideally, something close to this:
sudo docker run -i -t image /bin/bash python myscript.py
Assume my python script contains something like (It's irrelevant what it does, in this case it just creates a new file with text):
open('newfile.txt','w').write('Created new file with text\n')
When I create my container I want my script to execute and I would like to be able to see the content of the file. So something like:
root@66bddaa892ed# sudo docker run -i -t image /bin/bash
bash4.1# ls
newfile.txt
bash4.1# cat newfile.txt
Created new file with text
bash4.1# exit
root@66bddaa892ed#
In the example above my python script would have executed upon creation of the container to generate the new file newfile.txt. This is what I need.
My way of doing it is slightly different with some advantages.
It is actually a multi-session server rather than a script, but it could be even more useful in some scenarios:
# Just create interactive container. No start but named for future reference.
# Use your own image.
docker create -it --name new-container <image>
# Now start it.
docker start new-container
# Now attach bash session.
docker exec -it new-container bash
The main advantage is that you can attach several bash sessions to a single container. For example, I can exec one session with bash for tailing logs and in another session do the actual commands.
BTW, when you detach from the last 'exec' session your container is still running, so it can keep performing operations in the background.
You can run a docker image, perform a script and have an interactive session with a single command:
sudo docker run -it <image-name> bash -c "<your-script-full-path>; bash"
The second bash will keep the interactive terminal session open, irrespective of the CMD command in the Dockerfile the image has been created with, since the CMD command is overridden by the bash -c command above.
There is also no need to append a command like local("/bin/bash") to your Python script (or bash in the case of a shell script).
Assuming that the script has not yet been transferred from the Docker host to the docker image by an ADD Dockerfile command, we can map the volumes and run the script from there:
sudo docker run -it -v <host-location-of-your-script>:/scripts <image-name> bash -c "/scripts/<your-script-name>; bash"
Example: assuming that the python script in the original question is already on the docker image, we can omit the -v option and the command is as simple as follows:
sudo docker run -it image bash -c "python myscript.py; bash"
Why not this?
docker run --name="scriptPy" -i -t image /bin/bash python myscript.py
docker cp scriptPy:/path/to/newfile.txt /path/to/host
vim /path/to/host
Or if you want it to stay on the container
docker run --name="scriptPy" -i -t image /bin/bash python myscript.py
docker start scriptPy
docker attach scriptPy
Hope it was helpful.
I think this is what you mean.
Note: This uses Fabric (because I'm too lazy and/or don't have the time to work out how to wire up stdin/stdout/stderr to the terminal properly, but you could spend the time and use straight subprocess.Popen):
Output:
$ docker run -i -t test
Entering bash...
[localhost] local: /bin/bash
root@66bddaa892ed:/usr/src/python# cat hello.txt
Hello World!root@66bddaa892ed:/usr/src/python# exit
Goodbye!
Dockerfile:
# Test Docker Image
FROM python:2
ADD myscript.py /usr/bin/myscript
RUN pip install fabric
CMD ["/usr/bin/myscript"]
myscript.py:
#!/usr/bin/env python
from __future__ import print_function
from fabric.api import local
with open("hello.txt", "w") as f:
f.write("Hello World!")
print("Entering bash...")
local("/bin/bash")
print("Goodbye!")
Sometimes your python script may need other files in your folder, like other python scripts, CSV files, JSON files, etc.
I think the best approach is to share the directory with your container, which makes it easier to create one environment that has access to all the required files.
Create a shell script
sudo nano /usr/local/bin/dock-folder
Add this script as its content
#!/bin/bash
echo "IMAGE = $1"
## image name is the first param
IMAGE="$1"
## container name is created combining the image and the folder address hash
CONTAINER="${IMAGE}-$(pwd | md5sum | cut -d ' ' -f 1)"
echo "${IMAGE} ${CONTAINER}"
# remove the image from this dir, if exists
## rm remove container command
## pwd | md5 get the unique code for the current folder
## "${IMAGE}-$(pwd | md5sum)" create a unique name for the container based in the folder and image
## --force force the container be stopped and removed
if [[ "$2" == "--reset" || "$3" == "--reset" ]]; then
echo "## removing previous container ${CONTAINER}"
docker rm "${CONTAINER}" --force
fi
# create one special container for this folder based in the python image and let this folder mapped
## -it interactive mode
## pwd | md5 get the unique code for the current folder
## --name="${CONTAINER}" create one container with unique name based in the current folder and image
## -v "$(pwd)":/data create ad shared volume mapping the current folder to the /data inside your container
## -w /data define the /data as the working dir of your container
## -p 80:80 some port mapping between the container and host ( not required )
## python: name of the image used as the starting point
echo "## creating container ${CONTAINER} as ${IMAGE} image"
docker create -it --name="${CONTAINER}" -v "$(pwd)":/data -w /data -p 80:80 "${IMAGE}"
# start the container
docker start "${CONTAINER}"
# enter in the container, interactive mode, with the shared folder and running python
docker exec -it "${CONTAINER}" bash
# remove the container after exit
if [[ "$2" == "--remove" || "$3" == "--remove" ]]; then
echo "## removing container ${CONTAINER}"
docker rm "${CONTAINER}" --force
fi
Add execution permission
sudo chmod +x /usr/local/bin/dock-folder
Then you can call the script from your project folder:
# create, if it does not exist, a unique container for this folder and image, and open an interactive shell in it
dock-folder python
# destroy the container if it already exists and recreate it
dock-folder python --reset
# destroy the container after closing the interactive session
dock-folder python --remove
This call will create a new python container sharing your folder. This makes all the files in the folder, such as CSVs or binary files, accessible.
Using this strategy, you can quickly test your project in a container and interact with the container to debug it.
One issue with this approach is reproducibility. That is, you may install something from your shell that is required for your application to run, but that change only happened inside your container. So anyone who tries to run your code will have to figure out what you did and do the same.
So, if you can run your project without installing anything special, this approach may suit you well. But if you had to install or change things in your container to be able to run your project, you probably need to create a Dockerfile to record these commands. That will make all the steps, from loading the container to making the required changes and loading the files, easy to replicate.
