Using Docker and Python, how to access a CSV file created in a volume?

Edit
Added suggestions from daudnadeem.
Created a folder called temp_folder in the directory with my Dockerfile.
Updated the last line of the Python file to be df.to_csv('/temp_folder/temp.csv').
Ran the docker build and then the new run command docker run -v temp_folder:/temp_folder alexf/myapp.
I have a very simple Python example using Docker. The code runs fine, but I can't figure out how to access the CSV file created by the Python code. I created a volume in Docker and used docker inspect to try to access the CSV file, but I'm unsure of the syntax and can't find an example online that makes sense to me.
Python Code
import pandas as pd
import numpy as np
import os
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=['A', 'B', 'C', 'D'])
df.to_csv('temp.csv')
Dockerfile
FROM python:3.7.1
RUN mkdir -p /var/docker-example
WORKDIR /var/docker-example
COPY ./ /var/docker-example
RUN pip install -r requirements.txt
ENTRYPOINT python /var/docker-example/main.py
Docker commands
$ docker build -t alexf/myapp -f ./Dockerfile .
$ docker volume create temp-vol
$ docker run -v temp-vol alexf/myapp .
$ docker inspect -f temp.csv temp-vol
temp.csv

Your temp.csv lives inside the container's filesystem, which is ephemeral. So in order for you to access it outside the container, the best thing to do is mount a volume.
In the directory where you have your Dockerfile, make a folder called this_folder.
Then when you run your image, mount that folder into your container: docker run -v "$(pwd)/this_folder:/this_folder" <image-name>. (The host path must be absolute; with a bare name like this_folder, Docker creates a named volume instead of binding the host folder.)
Then change this code to:
import pandas as pd
import numpy as np
import os
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=['A', 'B', 'C', 'D'])
df.to_csv('/this_folder/temp.csv')
this_folder is now mutually accessible by your Docker container and the host machine. So outside of your container, if you ls this_folder you should see that temp.csv now lives there.
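One pitfall worth noting: pandas' to_csv will not create missing parent directories, so if you write into a subfolder of the mount, create it first. A minimal sketch of that guard (using a temporary directory as a stand-in for the mounted /this_folder, and plain open instead of pandas so it runs anywhere):

```python
import os
import tempfile

mount = tempfile.mkdtemp()            # stand-in for the mounted /this_folder
target = os.path.join(mount, 'out')   # hypothetical subfolder inside the mount
os.makedirs(target, exist_ok=True)    # to_csv would fail if this didn't exist

path = os.path.join(target, 'temp.csv')
with open(path, 'w') as f:            # df.to_csv(path) in the real script
    f.write('A,B,C,D\n')
print(os.path.exists(path))  # True
```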
If you don't want to mount a volume, you could upload the file somewhere, and download it later. But in a local env, just mount the folder and use it to share files between your container and your local machine.
Edit
When things are not going as planned with Docker, you may want to access the container interactively, i.e. get a shell inside it.
You do that with: docker run -it pandas_example /bin/bash
When I logged in, I saw that temp.csv is created in the same folder as main.py.
Solving the issue further is up to you: you need to move temp.csv into the directory that is shared with your local machine.
FROM python:3.7.1
RUN pip3 install pandas
COPY test.py /my_working_folder/
CMD python3 /my_working_folder/test.py
For a quick fix add
import subprocess
subprocess.call("mv temp.csv /temp_folder/", shell=True)
to the end of main.py. But this is not recommended.
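A slightly cleaner version of the same quick fix uses the standard library's shutil instead of spawning a shell. This sketch uses temporary directories as stand-ins for the container's working directory and the mounted /temp_folder, so it can run anywhere:

```python
import os
import shutil
import tempfile

# Stand-ins for the real paths: inside the container these would be
# the script's working directory and the mounted /temp_folder volume.
workdir = tempfile.mkdtemp()
mounted = tempfile.mkdtemp()

src = os.path.join(workdir, 'temp.csv')
with open(src, 'w') as f:
    f.write('A,B\n1,2\n')

# Equivalent of `mv temp.csv /temp_folder/`, without shell=True
dest = shutil.move(src, os.path.join(mounted, 'temp.csv'))
print(dest)
```

shutil.move returns the destination path and works across filesystems, which matters here because a volume mount is often a different filesystem from the container's writable layer.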

Let's make things simple if your only goal is to understand how volumes work and where to find the file on the host created by Python code inside the container.
Dockerfile:
FROM python:3.7.1
RUN mkdir -p /var/docker-example
WORKDIR /var/docker-example
COPY . /var/docker-example
ENTRYPOINT python /var/docker-example/main.py
main.py - will create /tmp/temp.txt inside the container with hi inside
with open('/tmp/temp.txt', 'w') as f:
    f.write('hi')
Docker commands (run inside the project folder):
Build image:
docker build -t alexf/myapp .
Use a named volume vol, mapped to the /tmp folder inside the container.
Run the container: docker run -d -v vol:/tmp alexf/myapp
Inspect the volume: docker inspect vol
[
{
"CreatedAt": "2019-11-05T22:07:02+02:00",
"Driver": "local",
"Labels": null,
"Mountpoint": "/var/lib/docker/volumes/vol/_data",
"Name": "vol",
"Options": null,
"Scope": "local"
}
]
Bash commands run on the Docker host:
sudo ls /var/lib/docker/volumes/vol/_data
temp.txt
sudo cat /var/lib/docker/volumes/vol/_data/temp.txt
hi
You can also use bind mounts and anonymous volumes to achieve the same result; see the Docker storage docs.

Related

Run python script using a docker image

I downloaded a python script and a docker image containing commands to install all the dependencies. How can I run the python script using the docker image?
Copy the Python file into the Docker image, then execute it with:
docker run image-name PATH-OF-SCRIPT-IN-IMAGE/script.py
Or you can run it at build time by adding RUN python PATH-OF-SCRIPT-IN-IMAGE/script.py to the Dockerfile.
How to copy container to host
docker cp <containerId>:/file/path/within/container /host/path/target
How to copy host to the container
docker cp /host/local/path/file <containerId>:/file/path/in/container/file
Run in interactive mode:
docker run -it image_name python filename.py
or, if you want the host file and a port to be mapped (the host path in a bind mount must be absolute):
docker run -it -v "$(pwd)/filename.py:/filename.py" -p 8888:8888 image_name python /filename.py
Answer
First, copy your Python script and other required files into your Docker container:
docker cp /path_to_file <containerId>:/path_where_you_want_to_save
Second, open the container CLI using Docker Desktop and run your Python script.
The best way, I think, is to make your own image that contains the dependencies and the script.
When you say you've been given an image, I'm guessing that you've been given a Dockerfile, since you talk about it containing commands.
Place the Dockerfile and the script in the same directory. Add the following lines to the bottom of the Dockerfile.
# Existing part of Dockerfile goes here
COPY my-script.py .
CMD ["python", "my-script.py"]
Replace my-script.py with the name of the script.
Then build and run it with these commands
docker build -t my-image .
docker run my-image

Docker Container Not Executing Python Commands

I have a simple script writecsv.py which queries an external API, parses the response, and writes to 2 CSVs.
My Dockerfile reads:
FROM python:3.8.5-slim-buster
COPY . /src
RUN pip install -r /src/requirements.txt
CMD ["python", "./src/writecsv.py"]
In the directory containing my Dockerfile, I have 4 files:
writecsv.py # My script that queries API and writes 2 csvs
keys.yaml # Stores my API keys which are read by writecsv.py
requirements.txt
Dockerfile
When I build this image I use docker build -t write-to-csv-application ., and to run it I use docker run write-to-csv-application.
I am able to show that the script runs, and the 2 CSV files are successfully created by printing the contents of the current working directory before and after calling csv.DictWriter.
So far, so good. Now I'd like to expose these files on localhost:5000, to be downloaded.
My current approach isn't working. I don't know Docker very well, so any suggestions are welcomed. Here's where things go wrong:
I then add two more lines to my Dockerfile, EXPOSE 5000 and an http.server CMD, to get:
FROM python:3.8.5-slim-buster
COPY . /src
RUN pip install -r /src/requirements.txt
EXPOSE 5000/tcp
CMD ["python", "./src/writecsv.py"]
CMD python -m http.server -dp ./src/ 5000
Now when I build this image (again with docker build -t write-to-csv-application .) and run it with docker run -p 5000:5000 write-to-csv-application, I don't get any of the command-line output from writecsv.py that I saw previously. I can access localhost:5000 and see the image's file structure, but I find that the files weren't created. The command line hangs indefinitely (which I would expect, since nothing terminates the HTTP server).
What I've tried:
I wrote the files to ./src/data/ and pointed the http.server to /src/data/, which doesn't exist.
I pointed the http.server to ./ and checked the entire file structure: they aren't being written anywhere when ran with docker run -p 5000:5000 write-to-csv-application
This worked for me; I left out the requirements.txt and just made a very basic example.
"The main purpose of a CMD is to provide defaults for an executing container". See the official docs for more information.
writecsv.py:
import csv

fields = ['Name', 'Branch', 'Year', 'CGPA']
rows = [['Nikhil', 'COE', '2', '9.0'],
        ['Sanchit', 'COE', '2', '9.1'],
        ['Aditya', 'IT', '2', '9.3'],
        ['Sagar', 'SE', '1', '9.5'],
        ['Prateek', 'MCE', '3', '7.8'],
        ['Sahil', 'EP', '2', '9.1']]
filename = "university_records.csv"
with open(filename, 'w') as csvfile:
    csvwriter = csv.writer(csvfile)
    csvwriter.writerow(fields)
    csvwriter.writerows(rows)
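To sanity-check the writer, you could read the file back with csv.reader. A self-contained sketch (writing to a temporary directory rather than the working directory, and passing newline='' as the csv docs recommend, which the script above omits):

```python
import csv
import os
import tempfile

fields = ['Name', 'Branch', 'Year', 'CGPA']
rows = [['Nikhil', 'COE', '2', '9.0'],
        ['Sanchit', 'COE', '2', '9.1']]

path = os.path.join(tempfile.mkdtemp(), 'university_records.csv')
with open(path, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(fields)
    writer.writerows(rows)

# Read the file back to confirm the header and rows round-trip
with open(path, newline='') as f:
    data = list(csv.reader(f))
print(data[0])  # ['Name', 'Branch', 'Year', 'CGPA']
```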
Dockerfile:
FROM python:3.8.5-slim-buster
WORKDIR /src
COPY . /src
EXPOSE 5000/tcp
RUN ["python", "./writecsv.py"]
CMD python -m http.server -d . 5000
docker build -t python-test .
docker run --rm -it -p 5000:5000 --name python-test python-test

Running .env files within a docker container

I have been struggling to add env variables into my container for the past 3 hrs :( I have looked through the docker run docs but haven't managed to get it to work.
I have built my image using docker build -t sellers_json_analysis . which works fine.
I then go to run it with: docker run -d --env-file ./env sellers_json_analysis
As per the docs: $ docker run --env-file ./env.list ubuntu bash but I get the following error:
docker: open ./env: no such file or directory.
The .env file is in my root directory
But when running docker run --help I am unable to find anything about env variables; it does provide the following usage line:
Usage: docker run [OPTIONS] IMAGE [COMMAND] [ARG...]
So I am not sure where I am placing things incorrectly. I could add my variables into the Dockerfile, but I want to keep it as a public repo, as it's a project I would like to display.
Your problem is the wrong path: use .env or ./.env. When you write ./env, it means a file named env in the current directory.
docker run -d --env-file .env sellers_json_analysis
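For reference, --env-file expects one KEY=VALUE pair per line, with blank lines and # comment lines ignored. A rough sketch of that format in Python (the parsing rules here are simplified assumptions, not Docker's exact implementation):

```python
def parse_env_file(text):
    """Parse KEY=VALUE lines the way an env file is laid out."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        key, _, value = line.partition('=')
        env[key] = value
    return env

sample = "# API credentials\nAPI_KEY=abc123\n\nDEBUG=1\n"
print(parse_env_file(sample))  # {'API_KEY': 'abc123', 'DEBUG': '1'}
```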

Reading an HDF file outside a Docker container from a Python script inside a container

I have a Python script, python_script.py, that reads an HDF5 file, hdf_file.h5, on my local machine. The directory path to the files is
folder1
  folder2
    python_script.py
    hdf_file.h5
I have the following sample code:
from pandas import read_hdf
df = read_hdf('hdf_file.h5')
When I run this code on my local machine, it works fine.
However, I need to place the Python script inside a Docker container, keep the HDF file out of the container, and have the code read the file. I want to have something like the following directory path for the container:
folder1
  folder2
    hdf_file.h5
    docker-folder
      python_script.py
      requirements.txt
      Dockerfile
I use the following Dockerfile:
FROM python:3
WORKDIR /project
COPY ./requirements.txt /project/requirements.txt
RUN pip install -r requirements.txt
COPY . /project
CMD [ "python", "python_script.py" ]
I am new to Docker and am having a lot of trouble figuring out how to get a Python script inside a container to read a file outside the container. What commands do I use or code changes do I make to be able to do this?
It seems you need to use docker volumes (https://docs.docker.com/storage/volumes/).
Try the following:
docker run -v /path/where/the/hdf5/lives:/path/inside/the/container your_docker_image:your_tag
The part before the : refers to the host machine, and the part after refers to the container.
Hope it helps!
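One way to wire this up in the script itself is to read the data location from an environment variable, so the same code works locally and inside the container. A sketch (DATA_DIR is an assumed variable name, not something from the question; the real script would then call read_hdf(hdf_path)):

```python
import os

# DATA_DIR is a hypothetical env var; default to the current directory
# so the script still works when run locally next to the file.
data_dir = os.environ.get('DATA_DIR', '.')
hdf_path = os.path.join(data_dir, 'hdf_file.h5')
print(hdf_path)
```

You would then run something like docker run -v /path/to/folder2:/data -e DATA_DIR=/data your_docker_image, so that /data inside the container maps to the host folder holding hdf_file.h5.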

Creating files (pdf, xls) with a python script inside a docker container

I'm trying to create simple files with a Python script called scriptfile.py. When I run it, it outputs a PDF with a sine wave and an XLS file containing a 3x10 dataframe that was initially imported from a CSV file called csv_file.csv. In addition, the sine wave plot is shown. This all works fine.
Now I've created a Dockerfile, based on the app.py example in the Docker documentation. I build an image using
sudo docker build --tag=testrun .
and run it using
sudo docker run -p 4000:80 testrun
The console output is normal, but no files are created and no plot is displayed. The code of the Dockerfile and the scriptfile.py are given below.
It reads
FROM python:3
WORKDIR /app
COPY . /app
ADD scriptfile.py /
RUN pip install matplotlib
RUN pip install xlwt
RUN pip install pandas
EXPOSE 80
ENV NAME DockerTester
CMD ["python","/scriptfile.py"]
The scriptfile.py reads
import math
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('csv_file.csv', sep=",", header=None)
df.to_excel(r'xlx_file.xls')
print("plotting ...")
sinusoid = []
for i in range(100):
    sinusoid.append(math.sin(i))
f = plt.figure()
plt.plot(sinusoid)
plt.show()
f.savefig("sin.pdf")
plt.close()
print("... success")
Question: Where are the files?
There are multiple ways to do this, here are some.
Using docker cp
First figure out your container id by using docker ps -a, then do:
docker cp <containerid>:/app /tmp/mydir
You will find the content on your host at /tmp/mydir.
Using Dockerfile VOLUME
Add this line to your Dockerfile after your COPY:
VOLUME /app
Now run your container like you are:
docker run -p 4000:80 testrun
Now do:
docker inspect -f '{{ .Mounts }}' <containerid>
Where <containerid> is obtained from docker ps -a. You will see something like:
[{volume 511961d95cd5de9a32afe3358c7b9af3eabd50179846fdebd9c882d50c7ffee7 /var/lib/docker/volumes/511961d95cd5de9a32afe3358c7b9af3eabd50179846fdebd9c882d50c7ffee7/_data /app local true }]
As you can see there is a path:
/var/lib/docker/volumes/511961d95cd5de9a32afe3358c7b9af3eabd50179846fdebd9c882d50c7ffee7/_data
That is where the container's /app directory contents are located.
Using docker run -v
Change your python script to write a location other than /app, something like f.savefig("/tmp/sin.pdf").
Then run docker like this:
docker run -it -v /tmp/share/:/tmp -p 4000:80 testrun
Now you will find your file on your host at /tmp/share/
