Docker: simple code containerized from python:2 not running nohup parallel commands - python

I have sample code that runs a model in Python, which needs to run 4000 times to complete the process. I built a Docker image using the Dockerfile below.
FROM python:2
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
RUN pip install fbprophet
CMD ["python", "./startup.py"]
Inside startup.py I create one shell script file containing 4000 nohup commands that each run a Python script.
Here is an example of a nohup command that gets started at the end of the startup.py script:
nohup python runprocess.py arg1 arg2
My problem: if I start the image with the docker run command (let's say the image name is startup-build)
docker run startup-build
this creates the shell script inside the container but starts only 2 or 3 of the nohup commands from the file, not all of them. Ideally it should start 100 processes at a time, because the script file has a wait command after every 100 lines.
I don't know why this is happening. I am running this Docker image on a GCP Container-Optimized OS VM. The real problem is that the container started by docker run doesn't use all the resources available on the VM and doesn't complete the process on time.
Is it because a Docker container can't run shell commands in parallel, or does nohup have some limitation?
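For context, the generated script presumably batches the commands something like this (a sketch; runprocess.py and its arguments stand in for the real ones, and each command is backgrounded with & so the wait can gate each batch of 100):
#!/bin/sh
# batch 1: start 100 background jobs, then wait for all of them
nohup python runprocess.py arg1 arg2 &
nohup python runprocess.py arg3 arg4 &
# ... 98 more nohup lines ...
wait
# batch 2, and so on, until all 4000 commands have run
nohup python runprocess.py arg5 arg6 &
# ...
wait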

Related

Why does a dockerized script have a different behaviour when I docker run it vs. when I docker exec it?

I'm using a Python script to send a websocket notification,
as suggested here.
The script is _wsdump.py and I have a script script.sh that is:
#!/bin/sh
set -o allexport
. /root/.env set
env
python3 /utils/_wsdump.py "wss://mywebsocketserver:3000/message" -t "message" &
If I try to dockerize this script with this Dockerfile:
FROM python:3.8-slim-buster
RUN set -xe && \
pip install --upgrade pip wheel && \
pip3 install websocket-client
ENV TZ="Europe/Rome"
ADD utils/_wsdump.py /utils/_wsdump.py
ADD .env /root/.env
ADD script.sh /
ENTRYPOINT ["./script.sh"]
CMD []
I have a strange behaviour:
If I execute docker run -it --entrypoint=/bin/bash mycontainer and then call script.sh manually, everything works fine and I receive the notification.
If I run it with docker run mycontainer, I see no errors but the notification doesn't arrive.
What could be the cause?
Your script doesn't launch a long-running process; it tries to start something in the background and then completes. Since the script completes, and it's the container's ENTRYPOINT, the container exits as well.
The easy fix is to remove the & from the end of the last line of the script to cause the Python process to run in the foreground, and the container will stay alive until the process completes.
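With that change the script would run the Python process in the foreground, something like this (the same script.sh, only with the trailing & removed):
#!/bin/sh
set -o allexport
. /root/.env set
env
python3 /utils/_wsdump.py "wss://mywebsocketserver:3000/message" -t "message"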
There's a more general pattern of an entrypoint wrapper script that I'd recommend adopting here. If you look at your script, it does two things: (1) set up the environment, then (2) run the actual main container command. I'd suggest using the Docker CMD for that actual command:
# end of Dockerfile
ENTRYPOINT ["./script.sh"]
CMD python3 /utils/_wsdump.py "wss://mywebsocketserver:3000/message" -t "message"
You can end the entrypoint script with the magic line exec "$@" to run the CMD as the actual main container process. (Technically, it replaces the current shell with a command built from the script's command-line arguments; in a Docker context the CMD is passed as arguments to the ENTRYPOINT.)
#!/bin/sh
# script.sh
# set up the environment
. /root/.env set
# run the main container command
exec "$#"
With this setup you can debug the container by replacing the command part (only), for example
docker run --rm your-image env
to print out its environment. The alternate command env will replace the Dockerfile CMD but the ENTRYPOINT will remain in place.
You install script.sh to the root dir /, but your ENTRYPOINT is defined to run the relative path ./script.sh.
Try changing ENTRYPOINT to reference the absolute path /script.sh instead.
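In the Dockerfile that would read:
ENTRYPOINT ["/script.sh"]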

Run python script using a docker image

I downloaded a python script and a docker image containing commands to install all the dependencies. How can I run the python script using the docker image?
Copy the Python file into the Docker image, then execute:
docker run image-name PATH-OF-SCRIPT-IN-IMAGE/script.py
Or you can add RUN python PATH-OF-SCRIPT-IN-IMAGE/script.py inside the Dockerfile and build it.
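As a sketch, that Dockerfile variant could end with the lines below (the paths are placeholders; note that RUN executes the script at image build time, whereas the docker run form above executes it when the container starts):
COPY script.py /script.py
RUN python /script.py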
How to copy a file from the container to the host:
docker cp <containerId>:/file/path/within/container /host/path/target
How to copy a file from the host to the container:
docker cp /host/local/path/file <containerId>:/file/path/in/container/file
Run in interactive mode:
docker run -it image_name python filename.py
or if you want the file mounted and a port published (the -v paths must be absolute):
docker run -it -v "$PWD/filename.py:/filename.py" -p 8888:8888 image_name python /filename.py
First, copy your Python script and other required files into your Docker container:
docker cp /path_to_file <containerId>:/path_where_you_want_to_save
Second, open the container CLI (for example via Docker Desktop) and run your Python script.
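That second step can also be done from a terminal with docker exec (the container ID and path are placeholders):
docker exec -it <containerId> python /path_where_you_want_to_save/script.py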
The best way, I think, is to make your own image that contains the dependencies and the script.
When you say you've been given an image, I'm guessing that you've been given a Dockerfile, since you talk about it containing commands.
Place the Dockerfile and the script in the same directory. Add the following lines to the bottom of the Dockerfile.
# Existing part of Dockerfile goes here
COPY my-script.py .
CMD ["python", "my-script.py"]
Replace my-script.py with the name of the script.
Then build and run it with these commands
docker build -t my-image .
docker run my-image

How to not stop nohup and get output files

I’m new to working on Linux. I apologize if this is a dumb question. Despite searching for more than a week, I was not able to derive a clear answer to my question.
I'm running a very long Python program on Nvidia GPUs. The output is several csv files. It takes a long time to compute the output, so I use nohup so that I can disconnect while it runs.
Let's say the main.py file is this:
import numpy as np
import pandas as pd

if __name__ == '__main__':
    a = np.arange(1, 1000)
    data = a * 2
    filename = 'results.csv'
    output = pd.DataFrame(data, columns=["Output"])
    output.to_csv(filename)
The calculation of data is more complicated, of course. I build a Docker container and run this program inside it. When I use python main.py for a smaller-sized example, there is no problem; it writes the csv files.
My question is this:
When I do nohup python main.py & and then check what's going on with tail -f nohup.out inside the Docker container, I can see what it is doing at that moment, but I cannot exit that view and let the execution run its course; it just stops there. How can I exit safely from the screen that tail -f nohup.out brings up?
I also tried not checking on the code and letting it run for two days before returning. The output of tail -f nohup.out indicated that the execution had finished, but the csv files were nowhere to be seen. Is the output somehow bundled up inside nohup.out, or does this indicate that something else is wrong?
If you're going to run this setup in a Docker container:
A Docker container runs only one process, as a foreground process; when that process exits, the container is done. That process is almost always the script or server you're trying to run and not an interactive shell. But:
it's possible to use Docker constructs to run the container itself in the background, and to collect its logs while it's running or after it completes.
A typical Dockerfile for a Python program like this might look like:
FROM python:3.10
# Create and use some directory; it can be anything, but do
# create _some_ directory.
WORKDIR /app
# Install Python dependencies as a separate step. Doing this first
# saves time if you repeat `docker build` without changing the
# requirements list.
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy in the rest of the application.
COPY . .
# Set the main container command to be the script.
CMD ["./main.py"]
The script should be executable (chmod +x main.py on your host) and begin with a "shebang" line (#!/usr/bin/env python3) so the system knows where to find the interpreter.
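Concretely (the shebang has to be the very first line of main.py):
chmod +x main.py
head -n 1 main.py   # should print: #!/usr/bin/env python3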
You will hear recommendations to use both CMD and ENTRYPOINT for the final line. It doesn't matter much to your immediate question. I prefer CMD for two reasons: it's easier to launch an alternate command to debug your container (docker run --rm your-image ls -l vs. docker run --rm --entrypoint ls your-image -l), and there's a very useful pattern of using ENTRYPOINT to do some initial setup (creating environment variables dynamically, running database migrations, ...) and then launching CMD.
Having built the image, you can use the docker run -d option to launch it in the background, and then run docker logs to see what comes out of it.
# Build the image.
docker build -t long-python-program .
# Run it, in the background.
docker run -d --name run1 long-python-program
# Review its logs.
docker logs run1
If you're running this to produce files that need to be read back from the host, you need to mount a host directory into your container at the time you start it. You need to make a couple of changes to do this successfully.
In your code, you need to write the results somewhere different than your application code. You can't mount a host directory over the /app directory since it will hide the code you're actually trying to run.
import os
data_dir = os.getenv('DATA_DIR', 'data')
filename = os.path.join(data_dir, 'results.csv')
Optionally, in your Dockerfile, create this directory and set a pointer to it. Since my sample code gets its location from an environment variable you can again use any path you want.
# Create the data directory.
RUN mkdir /data
ENV DATA_DIR=/data
When you launch the container, the docker run -v option mounts filesystems into the container. For this sort of output file you're looking for a bind mount that directly attaches a host directory to the container.
docker run -d --name run2 \
-v "$PWD/results:/data" \
long-python-program
In this example so far we haven't set the USER of the program, and it will run as root. You can change the Dockerfile to set up an alternate USER (which is good practice); you do not need to chown anything except the data directory, which should be owned by that user (leaving your code owned by root and not world-writable is also good practice). If you do this, then when you launch the container (on native Linux) you need to provide a host numeric user ID that can write to the host directory; you do not need to make other changes in the Dockerfile.
docker run -d --name run2 \
-u $(id -u) \
-v "$PWD/results:/data" \
long-python-program
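A sketch of that Dockerfile change, assuming the /data directory created above (the user name is arbitrary):
# Create an unprivileged user and give it ownership of the data
# directory only; the application code stays owned by root.
RUN useradd -r appuser && chown appuser /data
USER appuser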
1. A container is a foreground process. Use CMD or ENTRYPOINT in the Dockerfile.
2. Map a Docker volume to a Linux directory.
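For example (the image name and host path are placeholders):
docker run -d -v /path/on/host:/data my-image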

Committing a container on windows breaks python code

A strange python script failure happens when running the python script through docker run
Circumstances:
I have an image (built on Linux) which contains a simple Python file named /opt/test.py with permissions -rwxr-xr-x:
#!/usr/bin/python3
from os import mkdir
print("hello world")
Running the script in the original image
stepping into the image (docker run -it --entrypoint bash myimage) then run: bash-4.4# /opt/test.py
result: hello world
using docker run: docker run myimage /opt/test.py result: hello world
Modify the image in the following way
commit the container without any change, on Windows: docker commit container_id newimage
Running the script in the new image
stepping into the image (docker run -it --entrypoint bash newimage) then run: bash-4.4# /opt/test.py
result: hello world
so far so good...
using docker run: docker run newimage /opt/test.py
result: /opt/test.py: line 2: from: command not found
This error means the OS doesn't know how to run the file; that's why we provide the interpreter via the #!/usr/bin/python3 shebang, which we already did. The permissions are still -rwxr-xr-x.
And as you can see it can run the script successfully when you step into the container created from the image.
docker commit has a magical property: it carries over all of the properties of the existing container to the new image.
You did docker run --entrypoint /bin/bash
This means that when you do docker run newimage /some/file.py you are executing that file AS A SHELL SCRIPT. And, generally, python scripts don't work very well in bash. The equivalent is /bin/bash /opt/test.py
Read your error message: it's literally saying that there's an error on line 2 of /opt/test.py, because bash is trying to interpret the Python source.
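You can see the carried-over entrypoint, and work around it, with something like this (a sketch; the image names are the ones used above, and /usr/bin/python3 matches the script's shebang):
# Compare the entrypoints of the two images; the committed one
# should show the bash entrypoint carried over from the container.
docker inspect --format '{{json .Config.Entrypoint}}' myimage
docker inspect --format '{{json .Config.Entrypoint}}' newimage
# Put the interpreter back explicitly when running the new image.
docker run --entrypoint /usr/bin/python3 newimage /opt/test.py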
I managed to 'solve' this; however, I still don't understand the behaviour behind it.
The key point is how the container we committed the new image from was created:
1. Committing a container which was created by stepping into it (like docker run -it --entrypoint bash myimage) will break the image.
2. Committing a container which was created any other way (like docker run myimage /some_command) won't break it.

AWS Batch: /usr/local/bin/python: cannot execute binary file

I built an AWS Batch compute environment. I want to run a python script in jobs.
Here is the docker file I'm using :
FROM python:slim
RUN apt-get update
RUN pip install boto3 matplotlib awscli
COPY runscript.py /
ENTRYPOINT ["/bin/bash"]
The command in my task definition is:
python /runscript.py
When I submit a job in AWS console I get this error in CloudWatch:
/usr/local/bin/python: /usr/local/bin/python: cannot execute binary file
And the job gets the status FAILED.
What is going wrong? When I run the container locally I can launch the script without any errors.
Delete your ENTRYPOINT line, but replace it with a CMD that says what the container should actually do.
There are two parts to the main command that a Docker container runs, ENTRYPOINT and CMD; these are combined together into one command when the container starts. The command your container is running is probably something like
/bin/bash python /runscript.py
So bash finds a python in its $PATH (successfully), and tries to run it as a shell script (leading to that error).
You don't strictly need an ENTRYPOINT, and here it's causing trouble. Conversely there's a single thing you usually want the container to do, so you should just specify it in the Dockerfile.
# No ENTRYPOINT
CMD ["python", "/runscript.py"]
You can try the following Dockerfile and task definition.
Dockerfile
FROM python:slim
RUN apt-get update
RUN pip install boto3 matplotlib awscli
COPY runscript.py /
CMD ["/bin/python"]
Task Definition
['/runscript.py']
Passing the script name in the task definition gives you the flexibility to run any script when submitting a job. Please refer to the example below to submit a job and override the task definition.
import boto3

session = boto3.Session()
batch_client = session.client('batch')
response = batch_client.submit_job(
    jobName=job_name,
    jobQueue=AWS_BATCH_JOB_QUEUE,
    jobDefinition=AWS_BATCH_JOB_DEFINITION,
    containerOverrides={
        'command': [
            '/main.py'
        ]
    }
)
