Using a pip cache directory in docker builds - python

I'm hoping to get my pip install instructions inside my docker builds as fast as possible.
I've read many posts explaining how adding your requirements.txt before the rest of the app helps you take advantage of Docker's own image cache if your requirements.txt hasn't changed. But this is no help at all when dependencies do change, even slightly.
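For reference, the pattern those posts describe looks roughly like this (a minimal sketch; the file names and paths are illustrative):
FROM python:3.6-alpine
WORKDIR /app
# requirements.txt is copied on its own first, so the pip install layer below
# is reused from Docker's image cache as long as requirements.txt is unchanged
COPY requirements.txt .
RUN pip install -r requirements.txt
# copying the rest of the app afterwards doesn't invalidate the layers above
COPY . .
With this layout only a change to requirements.txt invalidates the install layer, but as soon as it does change, every dependency is downloaded again.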
The next step would be if we could use a consistent pip cache directory. By default, pip will cache downloaded packages in ~/.cache/pip (on Linux), and so if you're ever installing the same version of a module that has been installed before anywhere on the system, it shouldn't need to go and download it again, but instead simply use the cached version. If we could leverage a shared cache directory for docker builds, this could help speed up dependency installs a lot.
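On recent pip versions (20.1+) you can confirm the cache location with:
pip cache dir
which prints /root/.cache/pip when running as root, as is typical inside a container.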
However, there doesn't appear to be any simple way to mount a volume while running docker build. The build environment seems to be basically impenetrable. I found one article suggesting a genius but complex method of running an rsync server on the host and then, with a hack inside the build to get the host IP, rsyncing the pip cache in from the host. But I'm not relishing the idea of running an rsync server in Jenkins (which isn't the most secure platform at the best of times).
Does anyone know if there's any other way to achieve a shared cache volume more simply?

I suggest you use BuildKit; also see this.
Dockerfile:
# syntax = docker/dockerfile:experimental
FROM python:3.6-alpine
RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml
NOTE: the # syntax = docker/dockerfile:experimental line is a must; you have to add it at the beginning of the Dockerfile to enable this feature.
1. The first build:
export DOCKER_BUILDKIT=1
docker build --progress=plain -t abc:1 . --no-cache
The first log:
#9 [stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install...
#9 digest: sha256:55b70da1cbbe4d424f8c50c0678a01e855510bbda9d26f1ac5b983808f3bf4a5
#9 name: "[stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml"
#9 started: 2019-09-20 03:11:35.296107357 +0000 UTC
#9 1.955 Collecting pyyaml
#9 3.050 Downloading https://files.pythonhosted.org/packages/e3/e8/b3212641ee2718d556df0f23f78de8303f068fe29cdaa7a91018849582fe/PyYAML-5.1.2.tar.gz (265kB)
#9 5.006 Building wheels for collected packages: pyyaml
#9 5.007 Building wheel for pyyaml (setup.py): started
#9 5.249 Building wheel for pyyaml (setup.py): finished with status 'done'
#9 5.250 Created wheel for pyyaml: filename=PyYAML-5.1.2-cp36-cp36m-linux_x86_64.whl size=44104 sha256=867daf35eab43c2d047ad737ea1e9eaeb4168b87501cd4d62c533f671208acaa
#9 5.250 Stored in directory: /root/.cache/pip/wheels/d9/45/dd/65f0b38450c47cf7e5312883deb97d065e030c5cca0a365030
#9 5.267 Successfully built pyyaml
#9 5.274 Installing collected packages: pyyaml
#9 5.309 Successfully installed pyyaml-5.1.2
#9 completed: 2019-09-20 03:11:42.221146294 +0000 UTC
#9 duration: 6.925038937s
From the log above, you can see that the first time, the build downloads pyyaml from the internet.
2. The second build:
docker build --progress=plain -t abc:1 . --no-cache
The second log:
#9 [stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install...
#9 digest: sha256:55b70da1cbbe4d424f8c50c0678a01e855510bbda9d26f1ac5b983808f3bf4a5
#9 name: "[stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml"
#9 started: 2019-09-20 03:16:58.588157354 +0000 UTC
#9 1.786 Collecting pyyaml
#9 2.234 Installing collected packages: pyyaml
#9 2.270 Successfully installed pyyaml-5.1.2
#9 completed: 2019-09-20 03:17:01.933398002 +0000 UTC
#9 duration: 3.345240648s
From the log above, you can see the build no longer downloads the package from the internet; it just uses the cache. NOTE: this is not the traditional Docker layer cache (I built with --no-cache); it is the /root/.cache/pip directory that is mounted into the build.
3. The third build, after deleting the BuildKit cache:
docker builder prune
docker build --progress=plain -t abc:1 . --no-cache
The third log:
#9 [stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install...
#9 digest: sha256:55b70da1cbbe4d424f8c50c0678a01e855510bbda9d26f1ac5b983808f3bf4a5
#9 name: "[stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml"
#9 started: 2019-09-20 03:19:07.434792944 +0000 UTC
#9 1.894 Collecting pyyaml
#9 2.740 Downloading https://files.pythonhosted.org/packages/e3/e8/b3212641ee2718d556df0f23f78de8303f068fe29cdaa7a91018849582fe/PyYAML-5.1.2.tar.gz (265kB)
#9 3.319 Building wheels for collected packages: pyyaml
#9 3.319 Building wheel for pyyaml (setup.py): started
#9 3.560 Building wheel for pyyaml (setup.py): finished with status 'done'
#9 3.560 Created wheel for pyyaml: filename=PyYAML-5.1.2-cp36-cp36m-linux_x86_64.whl size=44104 sha256=cea5bc4689e231df7915c2fc3abca225d4ee2e869a7540682aacb6d42eb17053
#9 3.560 Stored in directory: /root/.cache/pip/wheels/d9/45/dd/65f0b38450c47cf7e5312883deb97d065e030c5cca0a365030
#9 3.580 Successfully built pyyaml
#9 3.585 Installing collected packages: pyyaml
#9 3.622 Successfully installed pyyaml-5.1.2
#9 completed: 2019-09-20 03:19:12.530742712 +0000 UTC
#9 duration: 5.095949768s
From the log above, you can see that once the BuildKit cache is deleted, the package is downloaded again.
In short, this gives you a cache shared across builds, and the cache directory is only mounted while the image is being built. The image itself does not contain the cache, so you avoid carrying a lot of cached files and intermediate layers around in the image.
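To apply this to a real project, the cache mount combines naturally with the requirements.txt pattern from the question; a minimal sketch, assuming a requirements.txt next to the Dockerfile:
# syntax = docker/dockerfile:experimental
FROM python:3.6-alpine
WORKDIR /app
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
COPY . .
Even when requirements.txt changes and the layer has to be rebuilt, pip still finds previously downloaded packages and built wheels in the mounted cache.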
EDIT for folks who are using Docker Compose and haven't read the comments:
You can also do this with docker-compose if you set COMPOSE_DOCKER_CLI_BUILD=1. For example:
COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose build
UPDATE in response to a question from 2020/09/02:
I don't know from which version this changed (mine is now 19.03.11), but if no mode is specified for the cache directory, the cache is not reused by the next build.
I don't know the underlying reason, but you can add mode=0755 to the Dockerfile to make it work again:
Dockerfile:
# syntax = docker/dockerfile:experimental
FROM python:3.6-alpine
RUN --mount=type=cache,mode=0755,target=/root/.cache/pip pip install pyyaml

Related

Python GDAL cannot find app GDAL on docker image

I'm trying to build an application into a Docker container that uses the OSGEO/GDAL libraries in Python, which are wrappers around the GDAL program. The GDAL program itself appears to install OK; at least Docker reports => CACHED [3/9] RUN apk add --no-cache gdal without any errors that I can see from that step. However, when I get to the step where pip is supposed to bring in the GDAL Python libraries, it fails looking for files that don't exist, which some initial searching suggests means it can't find the GDAL program. Does anyone know how to resolve or work around this?
Collecting GDAL~=3.5.1
#11 15.03 Downloading GDAL-3.5.1.tar.gz (752 kB)
#11 15.16 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 752.4/752.4 KB 7.3 MB/s eta 0:00:00
#11 15.30 Preparing metadata (setup.py): started
#11 15.71 Preparing metadata (setup.py): finished with status 'error'
#11 15.73 error: subprocess-exited-with-error
#11 15.73
#11 15.73 × python setup.py egg_info did not run successfully.
#11 15.73 │ exit code: 1
#11 15.73 ╰─> [120 lines of output]
#11 15.73 WARNING: numpy not available! Array support will not be enabled
#11 15.73 running egg_info
#11 15.73 creating /tmp/pip-pip-egg-info-2jyd4zlx/GDAL.egg-info
#11 15.73 writing /tmp/pip-pip-egg-info-2jyd4zlx/GDAL.egg-info/PKG-INFO
#11 15.73 writing dependency_links to /tmp/pip-pip-egg-info-2jyd4zlx/GDAL.egg-info/dependency_links.txt
#11 15.73 writing requirements to /tmp/pip-pip-egg-info-2jyd4zlx/GDAL.egg-info/requires.txt
#11 15.73 writing top-level names to /tmp/pip-pip-egg-info-2jyd4zlx/GDAL.egg-info/top_level.txt
#11 15.73 writing manifest file '/tmp/pip-pip-egg-info-2jyd4zlx/GDAL.egg-info/SOURCES.txt'
#11 15.73 Traceback (most recent call last):
#11 15.73 File "/tmp/pip-install-tjr9j_9m/gdal_6b994752ac484434b194dfc7ccf64728/setup.py", line 105, in fetch_config
#11 15.73 p = subprocess.Popen([command, args], stdout=subprocess.PIPE)
#11 15.73 File "/usr/local/lib/python3.9/subprocess.py", line 951, in __init__
#11 15.73 self._execute_child(args, executable, preexec_fn, close_fds,
#11 15.73 File "/usr/local/lib/python3.9/subprocess.py", line 1821, in _execute_child
#11 15.73 raise child_exception_type(errno_num, err_msg, err_filename)
#11 15.73 FileNotFoundError: [Errno 2] No such file or directory: '../../apps/gdal-config'
Here's my Dockerfile (the base image is non-negotiable, I'm afraid):
FROM python:3.9.12-alpine3.15
RUN apk add --no-cache gdal
RUN mkdir -p /usr/src/app/file
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN pip install --no-cache-dir -r requirements.txt
COPY src/ /usr/src/app/src/
COPY main.py /usr/src/app/
CMD ["python", "main.py"]
The requirements.txt is as follows.
requests>= 2.25.0
numpy>=1.23.1
pillow>=9.2.0
GDAL~=3.5.1
DateTime~=4.3
bitstring~=3.1.9
behave~=1.2.6
I'm not sure if there is a way to install GDAL to a particular spot so that Python can find it when installing the GDAL bindings, or if I need to give pip some kind of hint, or if something else entirely is going on. If anyone has worked with this library inside a Docker container before, thanks in advance!
The main problem was the missing gdal-dev package, which provides the development files (including gdal-config) needed to build the Python bindings. However, once pip was able to find GDAL and start installing the bindings, they are compiled from C code, so several additional build tools were needed in the image, plus some environment settings to point them to the right place. Here is the Dockerfile that ultimately worked:
FROM python:3.9.12-alpine3.15
RUN apk add --no-cache gcc
RUN apk add --no-cache gdal
RUN apk add --no-cache gdal-dev
RUN apk add --no-cache build-base
RUN apk add --no-cache zlib
# use ENV rather than RUN export, so the variables persist into later layers
ENV CPLUS_INCLUDE_PATH=/usr/include/gdal
ENV C_INCLUDE_PATH=/usr/include/gdal
ENV LDFLAGS="-L/usr/local/opt/zlib/lib"
ENV CPPFLAGS="-I/usr/local/opt/zlib/include"
RUN mkdir -p /usr/src/app/file
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN pip3 install --no-cache-dir -r requirements.txt
COPY src/ /usr/src/app/src/
COPY main.py /usr/src/app/
CMD ["python", "main.py"]
Lastly, in requirements.txt, the GDAL 3.5.1 bindings target Python 3.10, and since the Python version in the base image is pinned, I had to downgrade GDAL until I found a version that built without errors.
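A related trick, if you'd rather not guess at versions (a sketch, assuming gdal-config is on PATH once gdal-dev is installed), is to pin the bindings to whatever the system library reports:
RUN pip install GDAL==$(gdal-config --version)
The exact patch release isn't always published on PyPI, so this can still need a manual fallback, but it keeps the bindings and the C library in step.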

Error while installing psycopg2 using pip [duplicate]

This question already has answers here:
pg_config executable not found
(54 answers)
Closed 6 months ago.
I am trying to run Docker for a Django project using the command docker build -t myimage . The Dockerfile runs RUN pip install -r /app/requirements.txt --no-cache-dir, but when it gets to the Downloading psycopg2-2.9.3.tar.gz (380 kB) step, it throws the error below.
NOTE: I do not have psycopg2 in my requirements.txt file, only psycopg2-binary.
requirements.txt file
...
dj-database-url==0.5.0
Django==3.2.7
django-filter==21.1
django-formset-js-improved==0.5.0.2
django-heroku==0.3.1
psycopg2-binary
python-decouple==3.5
...
Downloading pytz-2022.2.1-py2.py3-none-any.whl (500 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 500.6/500.6 kB 2.6 MB/s eta 0:00:00
Collecting psycopg2
Downloading psycopg2-2.9.3.tar.gz (380 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 380.6/380.6 kB 2.7 MB/s eta 0:00:00
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'error'
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [23 lines of output]
running egg_info
creating /tmp/pip-pip-egg-info-383i9hb2/psycopg2.egg-info
writing /tmp/pip-pip-egg-info-383i9hb2/psycopg2.egg-info/PKG-INFO
writing dependency_links to /tmp/pip-pip-egg-info-383i9hb2/psycopg2.egg-info/dependency_links.txt
writing top-level names to /tmp/pip-pip-egg-info-383i9hb2/psycopg2.egg-info/top_level.txt
writing manifest file '/tmp/pip-pip-egg-info-383i9hb2/psycopg2.egg-info/SOURCES.txt'
Error: pg_config executable not found.
pg_config is required to build psycopg2 from source. Please add the directory
containing pg_config to the $PATH or specify the full executable path with the
option:
python setup.py build_ext --pg-config /path/to/pg_config build ...
or with the pg_config option in 'setup.cfg'.
If you prefer to avoid building psycopg2 from source, please install the PyPI
'psycopg2-binary' package instead.
For further information please check the 'doc/src/install.rst' file (also at
<https://www.psycopg.org/docs/install.html>).
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
The command '/bin/sh -c pip install -r /app/requirements.txt --no-cache-dir' returned a non-zero code: 1
Dockerfile
FROM python:3.8.13-slim-buster
WORKDIR /app
COPY ./my_app ./
RUN pip install --upgrade pip --no-cache-dir
RUN pip install -r /app/requirements.txt --no-cache-dir
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]
# CMD ["gunicorn", "main_app.wsgi:application", "--bind"]
You need to install the system dependencies (pg_config and the libpq headers) if you want to build psycopg2 from source; otherwise, use the all-in-one package that bundles them by replacing psycopg2 with psycopg2-binary. Since psycopg2 is not in your requirements.txt, it is most likely being pulled in as a dependency of another package you list, such as django-heroku.
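If you do want to build psycopg2 from source on that slim image, a minimal sketch would be to install a compiler and the libpq headers before pip runs (package names assume Debian's gcc and libpq-dev, which provide a C compiler and pg_config):
FROM python:3.8.13-slim-buster
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc libpq-dev \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY ./my_app ./
RUN pip install --upgrade pip --no-cache-dir
RUN pip install -r /app/requirements.txt --no-cache-dir
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]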

How do I fix setuptools not available in build environment when trying to install libraries in python?

I'm trying to install a requirements.txt file in Docker, and I make it about 30 packages in before hitting this error while trying to install importlib:
Can not execute setup.py since setuptools is not available in the build environment.
Here is the full error message:
Collecting importlib==1.0.4
#9 14.42 Downloading importlib-1.0.4.zip (7.1 kB)
#9 14.43 Preparing metadata (setup.py): started
#9 14.45 Preparing metadata (setup.py): finished with status 'error'
#9 14.45 error: subprocess-exited-with-error
#9 14.45
#9 14.45 × python setup.py egg_info did not run successfully.
#9 14.45 │ exit code: 1
#9 14.45 ╰─> [1 lines of output]
#9 14.45 ERROR: Can not execute `setup.py` since setuptools is not available in the build environment.
#9 14.45 [end of output]
#9 14.45
#9 14.45 note: This error originates from a subprocess, and is likely not a problem with pip.
#9 14.45 error: metadata-generation-failed
#9 14.45
#9 14.45 × Encountered error while generating package metadata.
#9 14.45 ╰─> See above for output.
#9 14.45
#9 14.45 note: This is an issue with the package mentioned above, not pip.
#9 14.45 hint: See above for details.
In the Dockerfile, I have tried installing setuptools before installing the packages:
RUN python3 -m pip install setuptools
RUN python3 -m pip install -r requirements.txt
Some further details: my Python version is 3.9 and my pip version is 22.1.2.
When I run easy_install --version, I get:
setuptools 41.0.1 from /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python (Python 2.7)
However, when I try to install setuptools with pip, it says I have version 62.3.3 for python3.9
Requirement already satisfied: setuptools in /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages (62.3.3)
Any help is greatly appreciated, and let me know if I should provide any further details.
As mentioned in the comments, the error appears because you are trying to install importlib on Python 3.9, while it's meant for much older Python versions only.
Just remove importlib from your requirements.txt file.
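If you want to convince yourself that nothing is lost by dropping it, you can check that the module already ships with the interpreter (a quick sketch; run it inside the image or any Python 3.9 environment):
python3 -c "import importlib; print(importlib.__file__)"
This prints a path inside the standard library, which is why the old PyPI backport has nothing to add on Python 3.9.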

Error in pip install transformers: Building wheel for tokenizers (pyproject.toml): finished with status 'error'

I'm building a Docker image on a cloud server via the following Dockerfile:
# base image
FROM python:3
# add python file to working directory
ADD ./ /
# install and cache dependencies
RUN pip install --upgrade pip
RUN pip install RUST
RUN pip install transformers
RUN pip install torch
RUN pip install slack_sdk
RUN pip install slack_bolt
RUN pip install pandas
RUN pip install gensim
RUN pip install nltk
RUN pip install psycopg2
RUN pip install openpyxl
......
When installing the transformers package, the following error occurs:
STEP 5: RUN pip install transformers
Collecting transformers
Downloading transformers-4.15.0-py3-none-any.whl (3.4 MB)
Collecting filelock
......
Downloading click-8.0.3-py3-none-any.whl (97 kB)
Building wheels for collected packages: tokenizers
Building wheel for tokenizers (pyproject.toml): started
ERROR: Command errored out with exit status 1:
command: /usr/local/bin/python /usr/local/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py build_wheel /tmp/tmp_3y7hw5q
cwd: /tmp/pip-install-bsy5f4da/tokenizers_e09b9f903acd40f0af4a997fe1d8fdb4
Complete output (50 lines):
running bdist_wheel
......
copying py_src/tokenizers/trainers/__init__.pyi -> build/lib.linux-x86_64-3.10/tokenizers/trainers
copying py_src/tokenizers/tools/visualizer-styles.css -> build/lib.linux-x86_64-3.10/tokenizers/tools
running build_ext
error: can't find Rust compiler
If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.
To update pip, run:
pip install --upgrade pip
and then retry package installation.
If you did intend to build this package from source, try installing a Rust compiler from your system package manager and ensure it is on the PATH during installation. Alternatively, rustup (available at https://rustup.rs) is the recommended way to download and update the Rust compiler toolchain.
----------------------------------------
ERROR: Failed building wheel for tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects
Building wheel for tokenizers (pyproject.toml): finished with status 'error'
Failed to build tokenizers
subprocess exited with status 1
subprocess exited with status 1
error building at STEP "RUN pip install transformers": exit status 1
time="2022-01-18T07:24:56Z" level=error msg="exit status 1"
Dockerfile build failed - exit status 1
I'm not very sure about what's happening here. Can anyone help me? Thanks in advance.
The logs say
error: can't find Rust compiler
You need to install a rust compiler. See https://www.rust-lang.org/tools/install. You can modify the installation instructions for a docker image like this (from https://stackoverflow.com/a/58169817/5666087):
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
Just use a clean Python 3.8 environment and try again.
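In a Dockerfile, the equivalent is to pin the base image so pip can usually pick up prebuilt tokenizers wheels instead of compiling them; a minimal sketch (the 3.8 tag follows the suggestion above and is an assumption, not the only option):
FROM python:3.8
# upgrading pip first helps it select prebuilt binary wheels where available
RUN pip install --upgrade pip
RUN pip install transformers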

Beeware 'briefcase create' asks for cairo >= 1.15.10

I'm following the BeeWare tutorial and am unable to run 'briefcase create'.
At some point it shows this:
Collecting pygobject>=3.14.0
Downloading PyGObject-3.38.0.tar.gz (712 kB)
|████████████████████████████████| 712 kB 6.9 MB/s
Installing build dependencies ... error
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3.8 /usr/local/lib/python3.8/dist-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-6esqaemw/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel pycairo
cwd: None
Complete output (36 lines):
WARNING: The directory '/home/brutus/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting setuptools
Downloading setuptools-50.3.2-py3-none-any.whl (785 kB)
Collecting wheel
Downloading wheel-0.35.1-py2.py3-none-any.whl (33 kB)
Collecting pycairo
Downloading pycairo-1.20.0.tar.gz (344 kB)
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing wheel metadata: started
Preparing wheel metadata: finished with status 'done'
Building wheels for collected packages: pycairo
Building wheel for pycairo (PEP 517): started
Building wheel for pycairo (PEP 517): finished with status 'error'
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3.8 /usr/local/lib/python3.8/dist-packages/pip/_vendor/pep517/_in_process.py build_wheel /tmp/tmp426eh9du
cwd: /tmp/pip-install-rmj9v5en/pycairo
Complete output (12 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.8
creating build/lib.linux-x86_64-3.8/cairo
copying cairo/__init__.py -> build/lib.linux-x86_64-3.8/cairo
copying cairo/__init__.pyi -> build/lib.linux-x86_64-3.8/cairo
copying cairo/py.typed -> build/lib.linux-x86_64-3.8/cairo
running build_ext
Requested 'cairo >= 1.15.10' but version of cairo is 1.14.6
Command '['pkg-config', '--print-errors', '--exists', 'cairo >= 1.15.10']' returned non-zero exit status 1.
----------------------------------------
ERROR: Failed building wheel for pycairo
Failed to build pycairo
ERROR: Could not build wheels for pycairo which use PEP 517 and cannot be installed directly
----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3.8 /usr/local/lib/python3.8/dist-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-6esqaemw/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel pycairo Check the logs for full command output.
Unable to install dependencies. This may be because one of your dependencies is invalid, or because pip was unable to connect to the PyPI server.
I believe the main problem is
Requested 'cairo >= 1.15.10' but version of cairo is 1.14.6
But what I don't understand is where cairo 1.14.6 is coming from, since I only have cairo 1.20.0 installed.
I tried updating Docker, reinstalling cairo and pycairo, and updating my Python packages, but the error is still there.
The BeeWare build process is trying to install the pycairo package. This package is just a Python interface to the cairo graphics library (libcairo2).
The pycairo changelog shows that the newest pycairo version, 1.20.0, requires cairo (libcairo2) version 1.15.10+.
If you are lucky, you can simply update your cairo package to a version which satisfies the requirement. Info is on the official site: cairographics.org/download/.
I have the same problem as you. In my case it is yet another reminder to upgrade my OS to a newer version of Ubuntu, since libcairo2 is only available as version 1.14.6 in the official Ubuntu 16.04 LTS repositories; in Ubuntu 20.04 LTS the available libcairo2 is 1.16.0. I suspect you have a similar OS, because your installed cairo version is the same as mine.
You can build the package from source to bypass the error:
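The typical autotools flow for cairo looks roughly like this (a sketch, assuming the usual build prerequisites such as a C compiler, pkg-config, and the pixman/freetype development headers are already installed; 1.16.0 is just an example release that satisfies the >= 1.15.10 requirement):
wget https://cairographics.org/releases/cairo-1.16.0.tar.xz
tar xf cairo-1.16.0.tar.xz
cd cairo-1.16.0
./configure
make
sudo make install
sudo ldconfig   # refresh the shared-library cache so pkg-config finds the new cairo
After that, pkg-config --modversion cairo should report the new version and the pycairo wheel should build.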
