I've encountered a weird problem and I do not know how to proceed.
I have docker 18.09.2, build 6247962 on a VMware ESXi 6.5 virtual machine running Ubuntu 18.04. I have docker 19.03.3, build a872fc2f86 on an Azure virtual machine running Ubuntu 18.04. I have the following little test script that I run on both hosts and in different docker containers:
#!/usr/bin/python3
import fcntl
import struct
image_path = 'foo.img'
f_obj = open(image_path, 'rb')
binary_data = fcntl.ioctl(f_obj, 2, struct.pack('I', 0))
bsize = struct.unpack('I', binary_data)[0]
print('bsize={0}'.format(bsize))
exit(0)
I run "ps -ef >foo.img" to get the foo.img file. The output of the above script on both virtual machines is bsize=4096.
I have the following Dockerfile on both VMs:
FROM ubuntu:19.04
RUN apt-get update && \
apt-get install -y \
python \
python3 \
vim
WORKDIR /root
COPY testfcntl01.py foo.img ./
RUN chmod 755 testfcntl01.py
If I create a docker image with the above Dockerfile on the VM running docker 18.09.2, the above gives me the same results as the host.
If I create a docker image with the above Dockerfile on the VM running docker 19.03.3, the above gives me the following error:
root@d317404714a6:~# ./testfcntl01.py
Traceback (most recent call last):
File "./testfcntl01.py", line 9, in <module>
binary_data = fcntl.ioctl(f_obj, 2, struct.pack('I', 0))
OSError: [Errno 22] Invalid argument
I compared the docker directory structure, the daemon.json file, the logs, the "docker info" between the hosts. They look to be identical. I tried with a FROM ubuntu:18.04 as well as ubuntu:19.04. I've tried with python2 as well as python3. Same results.
I do not know why the fcntl fails only on a docker container on the Azure VM running docker 19.03.3. Did something change in docker between 18 and 19 that might have caused this? Is there some configuration change that I need to make to get this to work? Something else I'm missing?
Any help would be greatly appreciated.
Thank you
Lewis Muhlenkamp
UPDATE01:
I was following the steps here to prepare my own custom Ubuntu 18.04 VHD to use in Azure. I started with a generic install of Ubuntu Server 18.04 using ubuntu-18.04.3-live-server-amd.iso, which I had just downloaded from Ubuntu's website. The test script works just fine on that freshly installed VM. I then complete the step
sudo apt-get install linux-generic-hwe-18.04 linux-cloud-tools-generic-hwe-18.04
and then my test fails. So, I believe there is some issue with these hardware enablement packages.
I had a very similar error and found that the call does not fail if the file is in a mounted volume (or at least one owned by the host), e.g.:
docker run -it -v $PWD:/these_work ubuntu:18.04 bash
Files under the /these_work directory in the container worked, however other files that were solely accessible from within the container resulted in [Errno 22] Invalid Argument.
I came here from a Yocto build error caused by a nearly identical method of reading the block size in filemap.py:
# Get the block size of the host file-system for the image file by calling
# the FIGETBSZ ioctl (number 2).
try:
binary_data = fcntl.ioctl(file_obj, 2, struct.pack('I', 0))
except OSError:
raise IOError("Unable to determine block size")
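The block-size lookup discussed above can be sketched as a small helper. This is a minimal sketch, not the Yocto implementation: the function name get_block_size is hypothetical, and the fallback to st_blksize is an assumption added for illustration (st_blksize is the preferred I/O block size and may differ from what FIGETBSZ reports).

```python
import fcntl
import os
import struct

FIGETBSZ = 2  # ioctl request number used in the snippets above


def get_block_size(path):
    """Return a block size in bytes for the file system holding path."""
    with open(path, "rb") as f_obj:
        try:
            # Same call as in the question: ask the kernel for the block size.
            binary_data = fcntl.ioctl(f_obj, FIGETBSZ, struct.pack("I", 0))
            return struct.unpack("I", binary_data)[0]
        except OSError:
            # Assumption (not from the original posts): when the ioctl is
            # rejected, e.g. with EINVAL, fall back to the stat() value.
            return os.fstat(f_obj.fileno()).st_blksize
```

Calling get_block_size('foo.img') on a file inside the container would then return a value instead of raising, at the cost of possibly reporting a different number than the ioctl would.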
I'm completely new to Docker and I'm trying to create and run a very simple example using instructions defined in a DockerFile.
DockerFile->
FROM ubuntu:latest
RUN apt-get update
RUN apt-get install -y python3 pip
COPY ./ .
RUN python3 test.py
contents of test.py ->
import pandas as pd
import numpy as np
print('test code')
command being used to create a Docker Container ->
docker build --no-cache . -t intro_to_docker -f abs/path/to/DockerFile
folder structure -> (both files are present at abs/path/to)
abs/path/to:
-DockerFile
-test.py
Error message ->
error from sender: open .Trash: operation not permitted
(Using sudo su did not resolve the issue, which I believe is linked to the COPY command.)
I'm using a Mac.
Any help in solving this would be much appreciated!
The Dockerfile should be inside a folder. Navigate to that folder and then run the docker build command. I was facing the same issue, but it was resolved when I moved the Dockerfile inside a folder.
Usually the error would look like:
error: failed to solve: failed to read dockerfile: error from sender: open .Trash: operation not permitted
And in my case, it clearly says that it is unable to read the Dockerfile.
Also, the . after --no-cache in your command is the build context: Docker sends that whole directory to the daemon, so if you run the command from your home directory it also tries to read .Trash.
So try navigating to the directory that contains the Dockerfile and run the build from there, dropping the -f option and passing . as the build context so that only that folder is used for the build.
In your case
cd abs/path/to/
docker build --no-cache -t intro_to_docker .
It seems the system policies are not allowing the application to execute this command. The application "Terminal" might not have approval to access the entire file system.
Enable full disk access to terminal. Change it using "System Preferences > Security & Privacy > Privacy > Full Disk Access"
I had the same error message; my Dockerfile was located in the HOME directory. I moved the Dockerfile to a different location, ran docker build from that new location, and it executed successfully.
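The answers above agree that building from the home directory is what triggers the .Trash error. A minimal sketch of a pre-build check along those lines follows; the helper name context_is_risky and its home parameter are hypothetical, introduced only for illustration.

```python
from pathlib import Path


def context_is_risky(context_dir, home=None):
    """Return True if context_dir is the home directory or one of its parents.

    Such a build context would include protected folders like ~/.Trash.
    """
    home_path = (Path(home) if home is not None else Path.home()).resolve()
    context = Path(context_dir).resolve()
    return context == home_path or context in home_path.parents
```

For example, context_is_risky(".") evaluated from the home directory returns True, which suggests moving the Dockerfile and its files into a dedicated project folder before building.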
I have somewhat successfully dockerized a software repository (KPConv) that I plan to work with and extend with the following Dockerfile
FROM tensorflow/tensorflow:1.12.0-devel-gpu-py3
# Install other required python stuff
RUN apt-get update && apt install -y --fix-missing --no-install-recommends \
    python3-setuptools python3-pip python3-tk
RUN pip install --upgrade pip
RUN pip3 install numpy scikit-learn psutil matplotlib pyqt5 laspy
# Compile the custom operations and CPP wrappers
# For some reason this must be done within container, cannot access libcuda.so during docker build
# Ref: https://stackoverflow.com/questions/66575232
#COPY . /kpconv
#WORKDIR /kpconv/tf_custom_ops
#RUN sh compile_op.sh
#WORKDIR /kpconv/cpp_wrappers
#RUN sh compile_wrappers.sh
# Set the working directory to kpconv
WORKDIR /kpconv
# Set root user password so we can su/sudo later if need be
RUN echo "root:pass" | chpasswd
# Create a user and group akin to the host within the container
ARG USER_ID
ARG GROUP_ID
RUN addgroup --gid $GROUP_ID user
RUN adduser --disabled-password --gecos '' --uid $USER_ID --gid $GROUP_ID user
USER user
#Build
#sudo docker build -t kpconv-test \
# --build-arg USER_ID=$(id -u) \
# --build-arg GROUP_ID=$(id -g) \
# .
At the end of this Dockerfile I followed a post found here which describes a way to correctly set the permissions of files generated by/within a container so that the host machine/user can access them without having to alter the file permissions.
Also, this software repository makes use of custom tensorflow operations in C++ (KPConv/tf_custom_ops) along with Python wrappers for custom C++ code (KPConv/cpp_wrappers). The author of KPConv, Thomas Hugues, provides a bash script which compiles each to generate various .so files.
If I COPY the repository into the image during the build process (COPY . /kpconv), startup the container, call both of the compile bash scripts, and run the code then Python correctly loads the C++ wrapper (the generated .so grid_subsampling.cpython-35m-x86_64-linux-gnu.so) and begins running the software as expected/intended.
$ sudo docker run -it \
> -v /<myhostpath>/data_sets:/data \
> -v /<myhostpath>/_output:/output \
> --runtime=nvidia kpconv-test /bin/bash
user@eec8553dcb5d:/kpconv$ cd tf_custom_ops
user@eec8553dcb5d:/kpconv/tf_custom_ops$ sh compile_op.sh
user@eec8553dcb5d:/kpconv/tf_custom_ops$ cd ..
user@eec8553dcb5d:/kpconv$ cd cpp_wrappers/
user@eec8553dcb5d:/kpconv/cpp_wrappers$ sh compile_wrappers.sh
running build_ext
building 'grid_subsampling' extension
<Redacted for brevity>
user@eec8553dcb5d:/kpconv/cpp_wrappers$ cd ..
user@eec8553dcb5d:/kpconv$ python training_ModelNet40.py
Dataset Preparation
*******************
Loading training points
1620.2 MB loaded in 0.6s
Loading test points
411.6 MB loaded in 0.2s
<Redacted for brevity>
This works well and allows me to run the KPConv software.
Also to note for later the .so file has the hash
user@eec8553dcb5d:/kpconv/cpp_wrappers/cpp_subsampling$ sha1sum grid_subsampling.cpython-35m-x86_64-linux-gnu.so
a17eef453f6d2370a15bc2a0e6714c978390c5c3 grid_subsampling.cpython-35m-x86_64-linux-gnu.so
It also has the permissions
user@eec8553dcb5d:/kpconv/cpp_wrappers/cpp_subsampling$ ls -al grid_subsampling.cpython-35m-x86_64-linux-gnu.so
-rwxr-xr-x 1 user user 561056 Mar 14 02:16 grid_subsampling.cpython-35m-x86_64-linux-gnu.so
However, this makes for a cumbersome workflow when I want to quickly edit the software for my purposes and run it within the container: every change to the code requires a new build of the image. Thus, I would much rather mount the KPConv code from the host into the container as a volume at runtime, so that edits are "live" within the container while it is running.
When I do this, using the Dockerfile at the top of the post (no COPY . /kpconv) to build the image, performing the same compilation steps, and running the code
$ sudo docker run -it \
> -v /<myhostpath>/data_sets:/data \
> -v /<myhostpath>/KPConv_Tensorflow:/kpconv \
> -v /<myhostpath>/_output:/output \
> --runtime=nvidia kpconv-test /bin/bash
user@a82e2c1af21a:/kpconv$ cd tf_custom_ops/
user@a82e2c1af21a:/kpconv/tf_custom_ops$ sh compile_op.sh
user@a82e2c1af21a:/kpconv/tf_custom_ops$ cd ..
user@a82e2c1af21a:/kpconv$ cd cpp_wrappers/
user@a82e2c1af21a:/kpconv/cpp_wrappers$ sh compile_wrappers.sh
running build_ext
building 'grid_subsampling' extension
<Redacted for brevity>
user@a82e2c1af21a:/kpconv/cpp_wrappers$ cd ..
user@a82e2c1af21a:/kpconv$ python training_ModelNet40.py
I receive the following Python ImportError
user@a82e2c1af21a:/kpconv$ python training_ModelNet40.py
Traceback (most recent call last):
File "training_ModelNet40.py", line 36, in <module>
from datasets.ModelNet40 import ModelNet40Dataset
File "/kpconv/datasets/ModelNet40.py", line 40, in <module>
from datasets.common import Dataset
File "/kpconv/datasets/common.py", line 29, in <module>
import cpp_wrappers.cpp_subsampling.grid_subsampling as cpp_subsampling
ImportError: /kpconv/cpp_wrappers/cpp_subsampling/grid_subsampling.cpython-35m-x86_64-linux-gnu.so: failed to map segment from shared object
Why is this Python wrapper for C++ only usable when COPY'ing the code into the docker image and not when it is mounted as a volume?
This .so file has the same hash and permissions as the first described situation
user@a82e2c1af21a:/kpconv/cpp_wrappers/cpp_subsampling$ sha1sum grid_subsampling.cpython-35m-x86_64-linux-gnu.so
a17eef453f6d2370a15bc2a0e6714c978390c5c3 grid_subsampling.cpython-35m-x86_64-linux-gnu.so
user@a82e2c1af21a:/kpconv/cpp_wrappers/cpp_subsampling$ ls -al grid_subsampling.cpython-35m-x86_64-linux-gnu.so
-rwxr-xr-x 1 user user 561056 Mar 14 02:19 grid_subsampling.cpython-35m-x86_64-linux-gnu.so
On my host machine the file has the following permissions (it's on the host because /kpconv was mounted as a volume) (for some reason the container is in the future too, check the timestamps)
$ ls -al grid_subsampling.cpython-35m-x86_64-linux-gnu.so
-rwxr-xr-x 1 <myusername> <myusername> 561056 Mar 13 21:19 grid_subsampling.cpython-35m-x86_64-linux-gnu.so
After some research on the error message, it looks like every result is specific to a particular situation, though most mention that the error results from some sort of permissions issue.
This Unix & Linux Stack Exchange answer, I think, explains what the actual problem is, but I am a bit too far from my days of working with C++ as an intern in college to understand how to use it to fix this issue. I think the issue lies with the permissions between the container and host and between the users on each (that is, root on the container, user (Dockerfile) on the container, root on host, and <myusername> on host).
I have also attempted to first elevate permissions within the container using the root password created in the Dockerfile, then compile the code and run the software, but this results in the same issue. I have also tried compiling the code as user in the container but running the software as root, again with the same issue.
Another clue is that there is seemingly something different about the .so when it is compiled "only within" the container (no --volume) versus within the --volume (hence my attempt to compare the file hashes). So maybe it's not so much permissions but how the .so is loaded within the container by the kernel, or how its location within the --volume affects that loading process?
EDIT: As for a SSCCE you should be able to clone the linked repository to your machine and use the same Dockerfile. You do not need to specify the /data or /output volumes or alter the code in any way (It attempts to load the .so before loading the data (which will just error and end execution))
If you do not have a GPU or do not want to install nvidia-runtime you should be able to alter the Dockerfile base image to tensorflow:1.12.0-devel-py3 and run the code on CPU.
Your problem is caused by the dynamic linker failing to load the library. There could be several root causes for this:
Permissions. The user must have permission to load the library. When mounting file systems in Docker, the owner ID and group ID on the host are not necessarily the same IDs in the container, even though the names might match.
Wrong binary format. The host OS compiles the binary in the wrong format. This can happen if you compile on, for example, macOS and use the result in a Linux container.
Wrong mounting. Mounting with, for example, noexec will also prevent the library from being loaded.
Differences in libraries between the two environments. Because the environment where the library was compiled differs, you might be missing some libraries, so use ldd grid_subsampling.cpython-35m-x86_64-linux-gnu.so and ldd -r -d -v grid_subsampling.cpython-35m-x86_64-linux-gnu.so to check all the libraries that are linked.
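Of the causes listed above, the noexec mount option is the easiest to check from inside the container. The following is a minimal sketch under stated assumptions: the helper names mount_options_for and is_noexec are hypothetical, the input is text in the /proc/mounts format, and mount points containing escaped spaces are not handled.

```python
import os


def mount_options_for(path, mounts_text):
    """Return the options of the deepest mount point that contains path."""
    path = os.path.abspath(path)
    best_point, best_opts = "", []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest match; later entries win on ties.
            if len(mount_point) >= len(best_point):
                best_point, best_opts = mount_point, options
    return best_opts


def is_noexec(path, mounts_text):
    """Return True if path lives on a file system mounted with noexec."""
    return "noexec" in mount_options_for(path, mounts_text)
```

Inside the container this could be called as is_noexec("/kpconv/cpp_wrappers", open("/proc/mounts").read()); a True result would point at the mount options rather than at file permissions.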
I'm building an application using Django and I wanted to add Docker to this project.
I'm trying to run
sudo docker-compose up
Which gives me this output:
ERROR: .IOError: [Errno 13] Permission denied: './docker-compose.yml'
I checked the permissions using GUI. Everything is fine.
I'm trying to run my app from a mounted drive. I also tested it on other drives. The only drive on which this problem does not appear is my main drive, which runs Ubuntu 18.04.
Looking forward to some answers
I found a working solution.
Don't use the snap installation and do this instead (tested Ubuntu 20.04)
apt install docker.io docker-compose
adding the directory where I am running my docker-compose.yml using the apparmor reconfigure tool:
$ sudo dpkg-reconfigure apparmor
You need to update your AppArmor configuration:
Docker installed via snap is heavily confined by AppArmor.
To diagnose whether this is really the case, check the kernel log after you have triggered the error:
dmesg | grep docker-compose
You should see a denial for a snap.docker profile:
kernel: [ ] audit: type=1400 audit(....):
apparmor="DENIED" operation="exec" profile="snap.docker.dockerd"
name="/bin/kmod" pid=7213 comm="exe" requested_mask="x"
denied_mask="x" fsuid=0 ouid=0
To correct this, go to the AppArmor tunables directory:
cd /etc/apparmor.d/tunables
And edit the HOMEDIRS variable in the 'home' file, for example from:
@{HOMEDIRS}=/home/
to
@{HOMEDIRS}=/home/ /media/aUser/Linux/
hope that helps.
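To make denial lines like the one quoted above easier to read, the key=value pairs can be pulled out with a short script. This is only an illustrative sketch: the function name parse_audit_fields is hypothetical and the regular expression covers just the simple quoted and unquoted forms seen in that sample line.

```python
import re


def parse_audit_fields(line):
    """Extract key=value and key="value" pairs from an audit log line."""
    fields = {}
    for key, quoted, bare in re.findall(r'(\w+)=(?:"([^"]*)"|(\S+))', line):
        # Exactly one of the two capture groups is filled for each match.
        fields[key] = quoted if quoted else bare
    return fields
```

The profile and name fields then show which AppArmor profile denied access and to which path, which helps decide whether the HOMEDIRS change applies.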
All the other answers didn't work for me.
docker --version
Docker version 20.10.17, build 100c701
docker-compose -v
docker-compose version 1.29.2, build unknown
Instead of
docker-compose up
please use
docker compose up
Trying to run docker command :
nvidia-docker run -d -p 8888:8888 -e PASSWORD="123abcChangeThis" theano_secure start-notebook.sh
# Then open your browser at http://HOST:8888
taken from https://github.com/nouiz/Theano-Docker
returns error :
Error: image library/theano_secure:latest not found
Appears the theano_secure image is not currently available ?
Searching for theano_secure :
$ nvidia-docker search theano_secure:latest
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
The return of this command is empty so image is not available ?
If so is there an alternative Theano docker image from nvidia ?
Update :
building from source :
docker build -t theano_secure -f Dockerfile.0.8.X.jupyter.cuda.secure .
returns :
Err http://developer.download.nvidia.com Release.gpg
Unable to connect to developer.download.nvidia.com:http: [IP: 184.24.98.231 80]
and :
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/trusty/InRelease
Manually checking the URLs: http://developer.download.nvidia.com and http://archive.ubuntu.com/ubuntu/dists/trusty/InRelease are both unavailable. Should I build with an alternative Dockerfile?
Update 2 :
I think this error is occurring as http://archive.ubuntu.com/ubuntu/dists/trusty/InRelease does not exist. However http://archive.ubuntu.com/ubuntu/dists/trusty/Release does exist.
Can docker be modified to use http://archive.ubuntu.com/ubuntu/dists/trusty/Release instead of http://archive.ubuntu.com/ubuntu/dists/trusty/InRelease ?
OS version :
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.4 LTS
Release: 14.04
Codename: trusty
Update 3 :
Regarding "you are supposed to docker build first, before nvidia-docker run": I did try
docker build -t theano_secure -f Dockerfile.0.8.X.jupyter.cuda.secure .
which returns :
Err http://developer.download.nvidia.com Release.gpg Unable to connect to developer.download.nvidia.com:http: [IP: 184.24.98.231 80]
I can pull an image with docker pull kaixhin/theano, but it does not run via Jupyter notebook in the same way as nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:latest-gpu, documented at https://hub.docker.com/r/tensorflow/tensorflow/ . There does not appear to be a Docker Jupyter Theano container available.
How to expose the docker instance kaixhin/theano via Jupyter notebook ?
I tried : nvidia-docker run -d -p 8893:8893 -v --name theano2 kaixhin/theano start-notebook.sh but receive error :
docker: Error response from daemon: invalid header field value "oci runtime error: container_linux.go:247:
starting container process caused \"exec: \\\"start-notebook.sh\\\": executable file not found in $PATH\"\n".
Does the kaixhin/theano docker container need to be modified in order to expose it via Jupyter notebook?
Error: image library/theano_secure:latest not found
Unlike ubuntu or centos, theano_secure is not an official repository on Docker Hub, so you need to build it yourself.
Err http://developer.download.nvidia.com Release.gpg Unable to connect to developer.download.nvidia.com:http: [IP: 184.24.98.231 80]
Please check your internet connection first, e.g. telnet 184.24.98.231 80.
You may be on a restricted network; if so, try again from behind a proxy. You may want to look at how to build an image behind a proxy.
From what I understand of the nouiz/Theano-Docker README, you are supposed to docker build first, before nvidia-docker run.
But since the build is tricky, I would try instead docker pull kaixhin/theano (from kaixhin/cuda-theano/), much more recent (3 days ago), which is based on theano Dockerfile.
That image does rely on CUDA and needs to be run on an Ubuntu host OS with NVIDIA Docker installed. The driver requirements can be found on the NVIDIA Docker wiki.
I installed Tensorflow on Ubuntu 16.04 LTS following the tutorial given here (with GPU support): Docker Installation for Tensorflow
Managed to run docker with this command:
nvidia-docker run -it -p 8888:8888 -v /home/myusername/notebooks:/notebooks gcr.io/tensorflow/tensorflow:latest-gpu
docker exec -it [my_DOCKER_ID] bash
Once I managed to get into the docker bash successfully, I found that there is tensorflow directory here:
cd /usr/local/lib/python2.7/dist-packages/tensorflow/models/image/mnist/
I proceeded to try the example code and successfully reached Test error of 0.8%:
python convolutional.py
Next, following https://www.tensorflow.org/versions/r0.11/tutorials/mnist/pros/index.html tutorial page, I would like to try mnist_softmax.py. So I cloned tensorflow's package to /notebooks:
cd /notebooks
git clone https://github.com/tensorflow/tensorflow.git
However, I found problem when running the code:
cd tensorflow/tensorflow/examples/tutorials/mnist/
python mnist_softmax.py --data_dir /notebooks/tensorflow/tensorflow/examples/tutorials/mnist
Traceback (most recent call last):
File "mnist_softmax.py", line 78, in <module>
tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
TypeError: run() got an unexpected keyword argument 'argv'
At this point I'm pretty clueless whether the error was caused by a bad installation or because there are steps that I haven't done. My questions:
Is my installation complete? I assumed I had a clean installation, given that I can run docker and get into the docker bash. Plus, I managed to run convolutional.py.
If I understand Docker correctly, I do not need to clone and build tensorflow package at all?
I had the same problem, and it was caused by running tutorial code from a later version (e.g. v0.12) against an older version of tensorflow in my docker container (v0.11 in my case).
The same problem is discussed here: https://github.com/tensorflow/tensorflow/issues/5643
The app.run() method didn't have the argv parameter until v0.12.
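The version dependency described above can be expressed as a small check on the version string. This is a sketch under stated assumptions: the helper name supports_argv is hypothetical, and the 0.12 threshold is taken from the answer above rather than verified against the TensorFlow source.

```python
import re


def supports_argv(version):
    """Return True if this TensorFlow version is 0.12 or later.

    Per the answer above, tf.app.run() only accepts argv from v0.12 on.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    major_minor = (int(match.group(1)), int(match.group(2)))
    return major_minor >= (0, 12)
```

Inside the container, the installed version can be printed with python -c "import tensorflow as tf; print(tf.__version__)" and passed to this helper; if it returns False, checking out the matching release branch of the tutorials is likely the simpler fix.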