Debug Python in Docker Container - python

I have a Docker container running a Python server, with the code mounted from a local volume (so it gets updated when I restart the container, for instance).
However, this gets tremendously hard to debug. I'm using PyCharm Professional.
I've tried following the guides on how to debug inside Docker containers, but they only show how to do it when you start the container from PyCharm. In my case, a large Terraform setup provisions the whole environment, so I need to find a way of attaching to the Python interpreter of the running container, or something like that.
Would anyone have any ideas or guides on this?
Thanks!

There are many details missing that would be needed to get a full view, but there are generally two ways to debug containers: 1) debug a running container and 2) debug a container image.
Debugging Container Images and Failed Builds
The latter is much easier because you can look at the history of a particular image and start a container from any layer in it.
First, we take a look at our locally built images:
$ docker images
REPOSITORY   TAG      IMAGE ID       CREATED        SIZE
<none>       <none>   77af4d6b9913   19 hours ago   1.089 GB
committ      latest   b6fa739cedf5   19 hours ago   1.089 GB
Next, we can pick a particular image and run docker history on it:
$ docker history 77af4d6b9913
IMAGE          CREATED       CREATED BY                                      SIZE       COMMENT
3e23a5875458   8 days ago    /bin/sh -c #(nop) ENV LC_ALL=C.UTF-8            0 B
8578938dd170   8 days ago    /bin/sh -c dpkg-reconfigure locales && loc      1.245 MB
be51b77efb42   8 days ago    /bin/sh -c apt-get update && apt-get install    338.3 MB
4b137612be55   6 weeks ago   /bin/sh -c #(nop) ADD jessie.tar.xz in /        121 MB
Then we can pick a layer anywhere in the history of the image and run that interactively:
$ docker run -it --rm 3e23a5875458 /bin/sh
This will dump you into a shell where you can run whatever the next command in the image-build process would be. This is super useful if your docker build command failed and you need to understand why, but it can also be useful if you just want to look at how things are set up inside a particular container (such as your Python interpreter, dependencies, PATH, etc.).
Attaching to a Running Container
This can be a little more confusing, but similarly, you can run a command inside a running container using docker exec. For instance, I often want to make sure my environment variables are set correctly, so I'll run something like this:
$ docker exec my_container env
You can use this to create a shell inside the running container as well:
$ docker exec -it my_container /bin/sh
This is generic stuff, but useful broadly for debugging containers.
Note: I am using /bin/sh above because a lot of small base images (like Alpine) don't have bash installed.
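The environment check above can also be scripted. The following is a minimal Python sketch, assuming the docker CLI is on the PATH and the container is running; parse_env_output and container_env are hypothetical helper names, not part of Docker, and values that span several lines are not handled.

```python
import subprocess


def parse_env_output(text):
    """Turn the KEY=VALUE lines printed by `env` into a dict."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that are not KEY=VALUE
            env[key] = value
    return env


def container_env(container):
    """Run `docker exec <container> env` and return the variables as a dict."""
    result = subprocess.run(
        ["docker", "exec", container, "env"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if docker exec fails
    )
    return parse_env_output(result.stdout)
```

Calling container_env("my_container") would then give a dict you can compare against the variables your Python server expects.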

Related

How to efficiently input files with docker

I am starting to get the hang of Docker and am trying to containerize some of the applications I use. Thanks to the tutorial I was able to create Docker images and containers, but now I am trying to think about the most efficient and practical ways to do things.
To present my use case, I have a Python script (let's call it process.py) that takes a single .jpg image as input, does some operations on this image, and then outputs the processed .jpg image.
Normally I would run it through :
python process.py -i path_of_the_input_image -o path_of_the_output_image
Then, this is how I connect the input and output with Docker. First I create the Dockerfile:
FROM python:3.6.8
COPY . /app
WORKDIR /app
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
CMD python ./process.py -i ./input_output/input.jpg -o ./input_output/output.jpg
And then, after building the image, I run docker run, mapping a local folder to the input_output folder in the container:
docker run -v C:/local_folder/:/app/input_output my_docker_image
This seems to work, but is not really practical, as I have to create a specific local folder to mount into the Docker container. So here are the questions I am asking myself:
Is there a more practical way of doing things? Can I directly send one single input file to a Docker container and directly receive one single output file from it?
When I run the Docker image, what happens (if I understand correctly) is that it creates a Docker container that runs process.py once and then just sits there doing nothing. Even after process.py has finished, it is still listed by the command "docker ps -a". Is this behaviour expected? Is there a way to automatically delete finished containers? Am I using docker run the right way?
Is there a more practical way of keeping a container running continuously, which I can query to run process.py on demand with a given input?
I have a Python script (let's call it process.py) that takes a single .jpg image as input, does some operations on this image, and then outputs the processed .jpg image.
That's most efficiently done without Docker; just run the python command you already have. If your application has interesting Python library dependencies, you can install them in a virtual environment to avoid conflicts with the system Python installation.
When I run the Docker image...
...the container runs its main command (the docker run command arguments or the Dockerfile CMD, possibly combined with an entrypoint from the same sources), and when that command exits, the container exits. It will still be listed in docker ps -a output, but as "Exited" (probably with status 0 for a successful completion). You can use docker run --rm to have the container delete itself automatically.
Is there a more practical way of keeping a container running continuously, which I can query to run process.py on demand with a given input?
Wrap it in a network service, like a Flask application. As long as this is running, you can use a tool like curl to do an HTTP POST with the input JPEG file as the body, and get the output JPEG file as the response. Avoid using local files and Docker together whenever that's an option (prefer network I/O for process inputs and outputs; prefer a database to local-file storage).
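A minimal sketch of such a network service follows. It uses only Python's standard library rather than Flask, to stay dependency-free. The names process_image, ProcessHandler and make_server, and port 8000, are hypothetical choices for illustration, and process_image simply echoes its input where the real logic from process.py would go.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def process_image(data: bytes) -> bytes:
    """Placeholder for the real work done by process.py (hypothetical)."""
    return data  # replace with the actual image processing


class ProcessHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the uploaded JPEG from the request body.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        result = process_image(body)
        # Send the processed JPEG back as the response body.
        self.send_response(200)
        self.send_header("Content-Type", "image/jpeg")
        self.send_header("Content-Length", str(len(result)))
        self.end_headers()
        self.wfile.write(result)


def make_server(host="0.0.0.0", port=8000):
    """Create (but do not start) the HTTP server."""
    return ThreadingHTTPServer((host, port), ProcessHandler)
```

With make_server().serve_forever() as the container's main command and the port published (for example -p 8000:8000), a request such as curl --data-binary @input.jpg -o output.jpg http://localhost:8000/ should send one file and receive one file, with no volume mount involved.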
Why are volume mounts not practical?
I would argue that Dockerising your application is not practical, but you've chosen to do so, presumably for very good reasons. Volume mounts are simply an extension of this. If you want to get data in and out of your container, the 'normal' way to do this is by using volume mounts, as you have done. Sure, you could use docker cp to copy the files manually, but that's not really practical either.
As far as the process exiting goes: normally, once the main process exits, the container exits. docker ps -a shows stopped containers as well as running ones. You should see that it says Exited n minutes (or hours, days, etc.) ago. This means that your container has run and exited correctly. You can remove it with docker rm <containerid>.
docker ps (no -a) will only show the running ones, btw.
If you use the --rm flag in your docker run command, the container will be removed when it exits, so you won't see it in the ps -a output. Stopped containers can be started again, but that's rather unusual.
Another solution might be to change your script to wait for incoming files and process them as they are received. Then you can leave the container running, and it will just process them as needed. If doing this, make sure that your idle loop has a sleep or something in it to ensure that you don't consume too many resources.
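The idea of waiting for incoming files could look roughly like the sketch below. It assumes the input and output directories are mounted into the container; process_file, process_pending and watch are hypothetical names, process_file merely copies the file where the real processing would go, and a file that is still being written when it is picked up is not handled.

```python
import time
from pathlib import Path


def process_file(src: Path, dst: Path) -> None:
    """Placeholder for the real processing in process.py (hypothetical)."""
    dst.write_bytes(src.read_bytes())


def process_pending(in_dir: Path, out_dir: Path) -> int:
    """Process every .jpg currently in in_dir; return how many were handled."""
    count = 0
    for src in sorted(in_dir.glob("*.jpg")):
        process_file(src, out_dir / src.name)
        src.unlink()  # remove the input so it is not processed twice
        count += 1
    return count


def watch(in_dir: Path, out_dir: Path, interval: float = 1.0) -> None:
    """Poll in_dir forever, sleeping when idle to keep CPU usage low."""
    while True:
        if process_pending(in_dir, out_dir) == 0:
            time.sleep(interval)
```

The sleep only happens when a pass found nothing to do, so a burst of files is handled back to back while an idle container stays cheap.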

Run .sh script from python as sudo

I'm working on a Python project where I want to automate the creation of Docker containers. I already have the project folder, which includes all the files required to create the image.
One of these is create_image.sh
docker build -t my_container:latest .
Currently I do:
sudo bash create_image.sh
But now I need to automate this process from python.
I have tried:
import os
import subprocess
subprocess.check_call("bash -c '. create_image.sh'", shell=True)
But I get this error:
CalledProcessError: Command 'bash -c '. create_image.sh'' returned non-zero exit status 1.
EDIT:
The use case is to automate container creation through an API. I have the code in Flask and Python up to this point, where I got stuck building the images from the Dockerfile. The rest is automated from templates.
You can try:
subprocess.call(['sudo', 'bash', 'create_image.sh' ])
which is equivalent to
sudo bash create_image.sh
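When the call still fails, the exit status alone says little. A minimal sketch for capturing the script's output along with its exit code is shown here; run_script is a hypothetical helper name, and the commented usage assumes sudo rights and create_image.sh in the working directory.

```python
import subprocess


def run_script(argv):
    """Run a command given as an argument list and return (exit_code, output).

    stdout and stderr are merged so the failure reason is not lost.
    """
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.returncode, result.stdout


# Intended use (assumption: sudo is allowed and the script is in the cwd):
# code, output = run_script(["sudo", "bash", "create_image.sh"])
# if code != 0:
#     print(output)
```

Passing the command as a list avoids the extra shell and quoting layers of the shell=True variant in the question.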
Note: Let me say that there are better ways of automating Docker container creation. Please check docker-compose, which can build and start the container easily. If you can elaborate more on the use case, we could help you with an elegant solution for Docker. It might not be a Python problem.
EDIT:
Following the comments, it would be better to create a docker-compose file and use a Makefile to issue the docker commands. Inspiration - https://medium.com/#daniel.carlier/how-to-build-a-simple-flask-restful-api-with-docker-compose-2d849d738137
If that's because your user can't run docker without sudo, it's probably better to grant that user access to the Docker API by adding them to the docker group: https://askubuntu.com/questions/477551/how-can-i-use-docker-without-sudo
Simply add the user to the docker group:
sudo gpasswd -a $USER docker
Also, if you want to automate Docker operations from Python, I'd recommend using the Python library for Docker; see "How to build an Image using Docker API Python Client?".

How to persist my notebooks and data in my Docker image/container

I am new to Docker and somewhat confused about containers and images. I want to use Docker for TensorFlow development. All I need is an easy way to write Jupyter Notebooks and use GPU-powered TensorFlow.
I have the latest Tensorflow Jupyter Python 3 Image already. I run the Image with
docker run --rm --runtime=nvidia -v -it -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter
How can I make sure that my data is not lost after I exit the process when I work in that image and add and edit my Jupyter Notebooks? I know that Docker images aren't meant to persist state, but I am so new to this that I just want something to work in with persistent data. Can someone guide me through this or point to a resource which will answer all my prayers?
I would also like to move some files into the container that is going to be run, so that I can access some custom Python libs that my Notebooks need to import!
Side questions:
--rm removes the container or something? By default I ran it without this flag and my data was still lost.
-v is for volumes? I tried -v Bachelor:/app to mount a volume. It apparently doesn't make any difference, and I don't know how to use the volume Bachelor that I created. Instead, a multitude of unusable unnamed volumes gets created whenever I run this.
-it also does something, no idea what.
-p is the port number, right?
Use Docker volumes:
Volumes are the preferred mechanism for persisting data generated by and used by Docker containers
Example:
docker run --runtime=nvidia -v ${SOURCE_FOLDER}:${DEST_FOLDER} -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter
Change SOURCE_FOLDER and DEST_FOLDER accordingly (use absolute paths!).
Now if you navigate to localhost:8888 and create a notebook in DEST_FOLDER, it should also be available in SOURCE_FOLDER.
As for your side questions:
-it runs the container in interactive mode with a terminal attached (-i keeps STDIN open, -t allocates a pseudo-TTY). You generally add /bin/bash after the run command, so you can start an interactive bash session inside the container.
--rm removes the container after it exits.
Those options aren't really necessary for your use case. Just remember to use docker ps and docker rm <ID> to clean up your container after you're done.
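Since the volume mapping above depends on absolute paths, a small helper can build it for you. This is only a sketch: volume_arg and jupyter_command are hypothetical names, and the container path you pass in is your own choice, not something the image prescribes.

```python
from pathlib import Path


def volume_arg(source, dest):
    """Build the value for `docker run -v`, forcing an absolute host path."""
    if not dest.startswith("/"):
        raise ValueError("container path must be absolute")
    host = Path(source).expanduser().resolve()
    return f"{host}:{dest}"


def jupyter_command(source, dest,
                    image="tensorflow/tensorflow:latest-gpu-py3-jupyter"):
    """Return the docker run command from the example above as a list."""
    return [
        "docker", "run", "--runtime=nvidia",
        "-v", volume_arg(source, dest),
        "-p", "8888:8888",
        image,
    ]
```

The resulting list can be passed to subprocess.run, or joined with spaces and pasted into a shell.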

Using Jenkins and Docker to run Python scripts

I would like to run Python scripts in various stages of Jenkins (pipeline) jobs, across a wide range of agents. I want the same Python environment for all of these, so I'm considering using Docker for this purpose.
I'm considering using Docker to build an image that contains the Python environment (with installed packages, etc.) and then runs an external Python script passed as an argument:
docker run my_image my_python_file.py
My question now is: how should the infrastructure be set up? I see that the Python docker distribution is 688MB, and transferring this image to all steps would surely be an overhead? However, they are all on the same network, so perhaps it wouldn't be a big issue.
Updates. So, my Dockerfile looks like this:
FROM python:3.6-slim-jessie
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
CMD ["python3"]
Then I build the image using
> docker build ./ -t my-app
which successfully builds the image and installs my requirements. Then I want to start the image as a daemon using
> docker run -dit my-app
Then I execute the process using
> docker exec -d {DAEMON_ID} my-script.py
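Run your Docker container as a daemon process, and every time you need to run your Python script, call docker exec. The script has to exist inside the container (copied into the image or mounted as a volume) and, unless it is executable and on the PATH, has to be passed to the interpreter:
docker exec -d <your-container> python3 <your-python-file.py>
Using Docker agents for builds is an effective way to get distributed and reproducible builds.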
I see that the Python docker distribution is 688MB, and transferring this image to all steps would surely be an overhead?
You should consider using smaller Docker images first: there are alpine and slim Docker images for Python. The size of the alpine Python image is 89.2MB.
Also, most of the image layers will be cached by Docker, so after the first pull you will only be pulling a few layers with significantly smaller sizes.

Spring Boot Application which triggers another App with CLI

I am currently developing a Spring Boot application which triggers a Python program via the CLI. I've used ProcessBuilder to do that and it's been working OK so far.
Now I'm trying to get the Spring Boot Application and the Python program in a Docker container. Since I'm new to Docker I don't know the best way to do this. I've tried using COPY to copy the whole folder to create an image but for some reason the folder pythonapp in the Container is always empty.
Am I missing something or is there a better way to do this?
FROM openjdk:8u151-jdk-slim
EXPOSE 8080
ADD springbootapp-0.0.1.jar app.jar
COPY . /root/pythonapp
RUN sh -c 'touch /app.jar'
RUN apt-get update && apt-get install -y python \
python-gi \
gir1.2-gtk-3.0
ENV JAVA_OPTS=""
ENTRYPOINT [ "sh", "-c", "java $JAVA_OPTS -Djava.security.egd=file:/dev/./urandom -jar /app.jar" ]
Normally the idea of Docker is that one container does one thing and does it well. So it is usually not a good idea to put two things in one Docker container. Think about two containers :-)
Other than that it might be a good idea to add files separately or as a tar/zip file and extract it in the image.
