pip: downloading dependencies to specific platform including non binaries - python

I'm trying to download the dependencies of paramiko on a Linux host for a Windows target that has no internet access.
After reading the example in pip's documentation, I used the following command to download the dependencies recursively for a 64-bit Windows platform:
pip3 download --only-binary=:all: --platform win_amd64 --implementation cp paramiko
I was able to download the dependencies recursively until pip reached pycparser. That is not surprising, since I used the --only-binary=:all: flag. The problem is that pip forces this flag whenever the --platform flag is passed:
ERROR: --only-binary=:all: must be set and --no-binary must not be set (or must be set to :none:) when restricting platform and interpreter constraints using --python-version, --platform, --abi, or --implementation.
The terminal produced the following output:
Collecting paramiko
Downloading paramiko-2.3.0-py2.py3-none-any.whl (182kB)
100% |████████████████████████████████| 184kB 340kB/s
Saved ./paramiko-2.3.0-py2.py3-none-any.whl
Collecting pynacl>=1.0.1 (from paramiko)
Using cached PyNaCl-1.1.2-cp35-cp35m-win_amd64.whl
Saved ./PyNaCl-1.1.2-cp35-cp35m-win_amd64.whl
Collecting cryptography>=1.5 (from paramiko)
Using cached cryptography-2.0.3-cp35-cp35m-win_amd64.whl
Saved ./cryptography-2.0.3-cp35-cp35m-win_amd64.whl
Collecting pyasn1>=0.1.7 (from paramiko)
Using cached pyasn1-0.3.5-py2.py3-none-any.whl
Saved ./pyasn1-0.3.5-py2.py3-none-any.whl
Collecting bcrypt>=3.1.3 (from paramiko)
Using cached bcrypt-3.1.3-cp35-cp35m-win_amd64.whl
Saved ./bcrypt-3.1.3-cp35-cp35m-win_amd64.whl
Collecting cffi>=1.4.1 (from pynacl>=1.0.1->paramiko)
Using cached cffi-1.11.0-cp35-cp35m-win_amd64.whl
Saved ./cffi-1.11.0-cp35-cp35m-win_amd64.whl
Collecting six (from pynacl>=1.0.1->paramiko)
Using cached six-1.11.0-py2.py3-none-any.whl
Saved ./six-1.11.0-py2.py3-none-any.whl
Collecting asn1crypto>=0.21.0 (from cryptography>=1.5->paramiko)
Using cached asn1crypto-0.22.0-py2.py3-none-any.whl
Saved ./asn1crypto-0.22.0-py2.py3-none-any.whl
Collecting idna>=2.1 (from cryptography>=1.5->paramiko)
Using cached idna-2.6-py2.py3-none-any.whl
Saved ./idna-2.6-py2.py3-none-any.whl
Collecting pycparser (from cffi>=1.4.1->pynacl>=1.0.1->paramiko)
Could not find a version that satisfies the requirement pycparser (from cffi>=1.4.1->pynacl>=1.0.1->paramiko) (from versions: )
No matching distribution found for pycparser (from cffi>=1.4.1->pynacl>=1.0.1->paramiko)
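A note for readers of the output above: each saved filename carries the wheel's compatibility tags (Python tag, ABI tag, platform tag), which is how you can tell which files are tied to win_amd64 and which are pure Python. The sketch below splits a wheel filename into those parts. The helper names parse_wheel_filename and is_pure_python are made up for illustration, and the parsing assumes well-formed wheel filenames; it is not pip's own code.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts normally; six when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(name, version, python_tag, abi_tag, platform_tag)


def is_pure_python(filename: str) -> bool:
    """True when the wheel is not tied to a specific ABI or platform."""
    tags = parse_wheel_filename(filename)
    return tags.abi_tag == "none" and tags.platform_tag == "any"


# Two filenames taken from the pip output above.
for wheel in (
    "paramiko-2.3.0-py2.py3-none-any.whl",
    "cffi-1.11.0-cp35-cp35m-win_amd64.whl",
):
    kind = "pure Python" if is_pure_python(wheel) else "platform-specific"
    print(f"{wheel}: {kind}")
```

A package with no wheel at all, which is what the empty "(from versions: )" list for pycparser indicates under --only-binary=:all:, never reaches this check because pip has nothing to offer.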
Is there a way of overcoming this issue? Will I have to manually install non-binary packages (and their dependencies)?
Thanks,
Joey.

You have two options:
run the download on the same platform as the target (take care that it really is the same)
fix the internet access on the target machine
Don't try any other fancy method or you will shoot yourself in the foot: some dependencies will need to compile!

You can use the --prefer-binary option in pip. That'll make pip consider wheels as more important, even if they're an older version than an existing sdist (sdist is short for source distribution). An sdist would be selected if no wheels are found to be compatible.
This was released in pip 18.0 (so that's mid-2018; pip uses CalVer now).
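To make the rule above concrete, here is a toy model of the selection. It is only a sketch of the behaviour as described in this answer, not pip's implementation: real pip also weighs tag priority and other factors. Candidate and pick are hypothetical names, and versions are plain tuples for simplicity.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    version: Tuple[int, ...]
    is_wheel: bool


def pick(candidates: List[Candidate], prefer_binary: bool) -> Candidate:
    """Pick one candidate using a simplified version of the rule described above."""
    if prefer_binary:
        # Any wheel outranks any sdist; the version only orders candidates of the same kind.
        return max(candidates, key=lambda c: (c.is_wheel, c.version))
    # Default: the newest version wins; being a wheel only breaks ties between equal versions.
    return max(candidates, key=lambda c: (c.version, c.is_wheel))


available = [
    Candidate(version=(1, 0), is_wheel=True),   # older release, wheel available
    Candidate(version=(2, 0), is_wheel=False),  # newer release, sdist only
]
print(pick(available, prefer_binary=True))
print(pick(available, prefer_binary=False))
```

With only sdists available, pick still returns an sdist when prefer_binary is set, which mirrors the fallback mentioned above.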

@sorin is right: your only real option is to download the dependencies in exactly the same environment as the one you'll be installing them on.
My solution is to use Docker to build wheels that match the target platform. In my case it is Debian 10, but it will work just the same for any operating system and version as long as there is a Docker image available.
Example Dockerfile to build wheels for the dependencies in requirements.txt for Debian 10 with CPython 3.9:
FROM python:3.9-slim-buster
COPY requirements.txt requirements.txt
# curl is not part of the slim image but is needed to fetch get-pip.py below
RUN set -eux; \
    apt-get update && \
    apt-get install -y build-essential curl && \
    python3 -m venv .venv --without-pip
ENV VIRTUAL_ENV=.venv
ENV PATH="${VIRTUAL_ENV}/bin:${PATH}"
RUN set -eux; \
    curl --silent https://bootstrap.pypa.io/get-pip.py | python && \
    pip download --prefer-binary setuptools wheel setuptools-rust -d deps && \
    pip download --prefer-binary -r requirements.txt -d deps && \
    mkdir -p main-wheel && \
    pip wheel --wheel-dir=main-wheel -r requirements.txt
Build the image and extract the wheel:
docker build -t buildwheel -f Dockerfile .
mkdir -p artifacts
CONTAINER=$(docker create buildwheel || exit 1)
docker cp "${CONTAINER}":main-wheel artifacts/. || exit 1
docker rm "${CONTAINER}"
docker image rm buildwheel
Congrats, you now have wheels built specifically for Debian 10 with CPython 3.9 in the directory artifacts/main-wheel. Copy that directory to the target machine, run pip install --no-index --find-links=artifacts/main-wheel -r requirements.txt, and everything should work.
PS: You might need to add build-time dependencies to the apt-get install inside the Dockerfile.

In Python 3 you can download the dependencies as shown below.
Run this from inside the folder where you want to save the files:
pip download -r requirements.txt
Once the files are downloaded, move them to the machine where you want to install them.
Then run this command:
pip install -r requirements.txt --no-index --find-links="/path/to/downloaded/files"
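If you want to script the two commands above, a minimal sketch could look like this. The function names and the "deps" directory are made up for illustration. It calls pip through the current interpreter so that the files are picked for the Python that runs the script. Keep in mind that pip download chooses files for the machine it runs on, so this only helps when that machine matches the target.

```python
import shlex
import sys
from typing import List


def download_cmd(requirements: str, dest: str) -> List[str]:
    """Command that downloads everything in `requirements` into `dest`."""
    return [sys.executable, "-m", "pip", "download", "-r", requirements, "-d", dest]


def offline_install_cmd(requirements: str, links: str) -> List[str]:
    """Command that installs from the local directory `links` without contacting an index."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index", "--find-links", links,
        "-r", requirements,
    ]


# Print the commands instead of running them; pass the lists to subprocess.run to execute.
print(shlex.join(download_cmd("requirements.txt", "deps")))
print(shlex.join(offline_install_cmd("requirements.txt", "deps")))
```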

Related

Install packages on EMR via bootstrap actions not working in Jupyter notebook

I have an EMR cluster using EMR-6.3.1.
I am using the Python3 Kernel.
I have a very simple bootstrap script in S3:
#!/bin/bash
sudo python3 -m pip install Cython==0.29.4 boto==2.49.0 boto3==1.18.50 numpy==1.19.5 pandas==1.3.2 pyarrow==5.0.0
These are the bootstrap logs
+ sudo python3 -m pip install Cython==0.29.4 boto==2.49.0 boto3==1.18.50 numpy==1.19.5 pandas==1.3.2 pyarrow==5.0.0
WARNING: Running pip install with root privileges is generally not a good idea. Try `python3 -m pip install --user` instead.
WARNING: The scripts cygdb, cython and cythonize are installed in '/usr/local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The scripts f2py, f2py3 and f2py3.7 are installed in '/usr/local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script plasma_store is installed in '/usr/local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
and
Collecting Cython==0.29.4
Downloading Cython-0.29.4-cp37-cp37m-manylinux1_x86_64.whl (2.1 MB)
Requirement already satisfied: boto==2.49.0 in /usr/local/lib/python3.7/site-packages (2.49.0)
Collecting boto3==1.18.50
Downloading boto3-1.18.50-py3-none-any.whl (131 kB)
Collecting numpy==1.19.5
Downloading numpy-1.19.5-cp37-cp37m-manylinux2010_x86_64.whl (14.8 MB)
Collecting pandas==1.3.2
Downloading pandas-1.3.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.3 MB)
Collecting pyarrow==5.0.0
Downloading pyarrow-5.0.0-cp37-cp37m-manylinux2014_x86_64.whl (23.6 MB)
Collecting s3transfer<0.6.0,>=0.5.0
Downloading s3transfer-0.5.2-py3-none-any.whl (79 kB)
Collecting botocore<1.22.0,>=1.21.50
Downloading botocore-1.21.65-py3-none-any.whl (8.0 MB)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /usr/local/lib/python3.7/site-packages (from boto3==1.18.50) (0.10.0)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/site-packages (from pandas==1.3.2) (2021.1)
Collecting python-dateutil>=2.7.3
Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting urllib3<1.27,>=1.25.4
Downloading urllib3-1.26.13-py2.py3-none-any.whl (140 kB)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas==1.3.2) (1.13.0)
Installing collected packages: Cython, python-dateutil, urllib3, botocore, s3transfer, boto3, numpy, pandas, pyarrow
Successfully installed Cython-0.29.4 boto3-1.18.50 botocore-1.21.65 numpy-1.19.5 pandas-1.3.2 pyarrow-5.0.0 python-dateutil-2.8.2 s3transfer-0.5.2 urllib3-1.26.13
From a notebook, importing pandas gives me the wrong version, 1.2.3.
Further, pyarrow fails to import.
I've printed the import path of pandas, which Python binary is run, and sys.path.
import os
import pandas
import sys
print(sys.path)
print(pandas.__version__)
print(os.path.abspath(pandas.__file__))
print(os.popen('echo $PYTHONPATH').read())
print(os.popen('which python3').read())
# sys.path.append('/usr/local/lib64/python3.7/site-packages') # if I add this, pyarrow can import
import pyarrow
['/', '/emr/notebook-env/lib/python37.zip', '/emr/notebook-env/lib/python3.7', '/emr/notebook-env/lib/python3.7/lib-dynload', '', '/emr/notebook-env/lib/python3.7/site-packages', '/emr/notebook-env/lib/python3.7/site-packages/awseditorssparkmonitoringwidget-1.0-py3.7.egg', '/emr/notebook-env/lib/python3.7/site-packages/IPython/extensions', '/home/emr-notebook/.ipython']
1.2.3
/emr/notebook-env/lib/python3.7/site-packages/pandas/__init__.py
/usr/bin/python3
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-2-aea9862499ce> in <module>
9
10 # sys.path.append('/usr/local/lib64/python3.7/site-packages') # if I add this, pyarrow can import
---> 11 import pyarrow
ModuleNotFoundError: No module named 'pyarrow'
I found I can import pyarrow if I add /usr/local/lib64/python3.7/site-packages to sys.path. This seems like an improvement, but the wrong version of pandas is still imported.
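For anyone debugging the same thing, the check can be made systematic: list every directory that holds a copy of the package, in search order. Python imports from the first match on sys.path, so appending a directory does not override a copy that appears earlier. This is a rough sketch; find_package_dirs is a made-up helper and it only looks for a package directory or a single-file module, not for zip or egg imports.

```python
import os
import sys
from typing import Iterable, List


def find_package_dirs(package: str, search_paths: Iterable[str]) -> List[str]:
    """Return every search path that contains a copy of `package`, in search order."""
    hits = []
    for path in search_paths:
        base = path or os.getcwd()  # an empty entry on sys.path means the current directory
        as_package = os.path.isdir(os.path.join(base, package))
        as_module = os.path.isfile(os.path.join(base, package + ".py"))
        if as_package or as_module:
            hits.append(base)
    return hits


# Where could pandas be imported from if the bootstrap location were appended?
candidates = sys.path + ["/usr/local/lib64/python3.7/site-packages"]
print(find_package_dirs("pandas", candidates))
```

If the first entry printed is under /emr/notebook-env, that copy is the one the kernel imports, whatever the bootstrap script installed elsewhere.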
I've tried:
SSH'ing into the master node and mucking with the configuration.
sudo python3 -m pip install --user ...
export PYTHONPATH=/usr/local/lib64/python3.7/site-packages && sudo python3 -m pip install ...
sudo pip3 install --upgrade setuptools && sudo python3 -m pip install ...
Using a pyspark kernel and running sc.install_pypi_package("pandas==1.3.2")
Any help is appreciated. Thank you.
The bootstrap configuration on EMR is not the last step before the cluster is WAITING and EMR Steps start running.
On my EMR cluster I found that at least these packages were logged as installed after the bootstrap configuration ran. I was having issues with numpy not upgrading.
Python packages installed post bootstrap
2022-12-10 00:10:28,250 INFO
main: Took 1 minute, 3 seconds and 451 milliseconds to install
packages:
[emr-scripts, emr-s3-select, aws-sagemaker-spark-sdk,
python27-numpy, python27-sagemaker_pyspark, python37-numpy,
python37-sagemaker_pyspark, emr-ddb, hadoop-yarn-nodemanager, docker,
hadoop-yarn, spark-yarn-shuffle, bigtop-utils, cloudwatch-sink,
hadoop, hadoop-lzo, emr-goodies, emrfs, hadoop-mapreduce, hadoop-hdfs,
R-core, aws-hm-client, emr-kinesis, hadoop-hdfs-datanode,
spark-datanucleus]
A workaround is to add the installation as the first step of your cluster:
...
Steps: [
    {
        "Name": "Install Pandas",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar",
            "Args": [
                "s3://bucket/prefix/install_packages.sh"
            ]
        }
    },
]
Or if you want to use command-runner
{
    "Name": "Install Pandas",
    "ActionOnFailure": "CONTINUE",
    "HadoopJarStep": {
        "Jar": "command-runner.jar",
        "Args": [
            "bash",
            "-c",
            "aws s3 cp s3://bucket/prefix/install.sh .; chmod +x install.sh; ./install.sh; rm install.sh"
        ]
    }
}
In my example my install.sh file looks like
#!/bin/bash
set -x
sudo pip3 freeze # for debugging to view the previous package versions
sudo pip3 uninstall numpy -y -v
sudo yum install python3-devel -y
sudo pip3 install boto3==1.26.26 -v
sudo pip3 install numpy==1.21.6 -v
sudo pip3 install pandas==1.3.5 -v
sudo pip3 freeze # for debugging to view the post-install package versions
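If you submit steps from Python, the command-runner variant above could be built roughly like this. Treat it as a sketch under assumptions: install_step and submit are names I made up, the S3 URI is the placeholder from this answer, and submit assumes boto3 is installed and AWS credentials are configured. The submit function is defined here but not called.

```python
from typing import Any, Dict


def install_step(script_s3_uri: str, name: str = "Install Pandas") -> Dict[str, Any]:
    """Build a command-runner step that copies a script from S3, runs it, and removes it."""
    script = script_s3_uri.rsplit("/", 1)[-1]  # file name part of the S3 URI
    command = f"aws s3 cp {script_s3_uri} .; chmod +x {script}; ./{script}; rm {script}"
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["bash", "-c", command],
        },
    }


def submit(cluster_id: str, step: Dict[str, Any]) -> str:
    """Add the step to a running cluster and return its step id."""
    import boto3  # imported lazily so the builder above works without boto3 installed

    response = boto3.client("emr").add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]


print(install_step("s3://bucket/prefix/install.sh"))
```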

ERROR: Could not find a version that satisfies the requirement apturl==0.5.2 (from versions: none)-- [duplicate]

I am trying to install from a requirements file, and pip tells me that a package cannot be found. When I comment that package out, the same thing happens for other packages.
I just deployed an Ubuntu 18.04 server and created the virtual env with the command python3 -m venv --system-site-packages env, but every single time I run pip install -r requirements.txt it fails with
Collecting apparmor==2.12 (from -r requirements.txt (line 1))
Could not find a version that satisfies the requirement apparmor==2.12 (from -r requirements.txt (line 1)) (from versions: )
No matching distribution found for apparmor==2.12 (from -r requirements.txt (line 1))
If I try to install it directly, say with pip install apparmor, it tells me
Collecting apparmor
Could not find a version that satisfies the requirement apparmor (from versions: )
No matching distribution found for apparmor
But then if I comment out apparmor it tells me this
Collecting apturl==0.5.2 (from -r requirements.txt (line 2))
Could not find a version that satisfies the requirement apturl==0.5.2 (from -r requirements.txt (line 2)) (from versions: )
No matching distribution found for apturl==0.5.2 (from -r requirements.txt (line 2))
and it goes on like this for other packages, seemingly at random. The requirements file was generated on my local machine, which also runs Ubuntu 18, so I'm unsure why this works locally but not on a new deployment.
I have also made sure that pip is at the newest version.
apparmor and apturl are Ubuntu packages, you can safely ignore them if your code doesn't use their code; just remove them from requirements.txt. If your code depends on them, ensure they are installed via apt:
apt install -y apparmor apturl && pip install -r requirements.txt
This is a common problem when you don't use a virtual environment for Python work: your requirements.txt ends up listing every Python package on your system, when it should list only the packages your project needs. At some point you ran pip freeze > requirements.txt outside a virtual environment, which wrote all of the OS-level Python packages into the file along with your project's, and perhaps you pushed that to a repository. When you then try to install everything on another computer, you get this kind of error.
Python is installed by default on Ubuntu, and on other systems too, so you must take this into account.
The first rule is to always work in a virtual environment (see the virtual env documentation).
I know it's hard work, but you can back up that requirements.txt and clean it out. Then try to run your program in a clean install without any packages, add each package that a missing-package error asks for, and update the file with pip freeze > requirements.txt.
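As a starting point for the clean-up, you could filter the known OS-level entries out of the file with a few lines of Python. This is only a sketch: drop_system_packages and requirement_name are made-up helpers, you have to supply the set of system package names yourself, and the name parsing only handles simple lines such as name==version. The requests line in the example is a hypothetical project dependency.

```python
import re
from typing import Iterable, List, Set


def requirement_name(line: str) -> str:
    """Extract and normalise the project name from a simple requirement line."""
    name = re.split(r"[=<>!~;\[\s]", line.strip(), maxsplit=1)[0]
    # Treat runs of '-', '_' and '.' as equivalent and ignore case.
    return re.sub(r"[-_.]+", "-", name).lower()


def drop_system_packages(lines: Iterable[str], system_names: Set[str]) -> List[str]:
    """Keep every line except requirements whose name is in `system_names`."""
    kept = []
    for line in lines:
        stripped = line.strip()
        is_requirement = bool(stripped) and not stripped.startswith("#")
        if is_requirement and requirement_name(stripped) in system_names:
            continue
        kept.append(line)
    return kept


frozen = ["apparmor==2.12", "apturl==0.5.2", "requests==2.25.1"]
print(drop_system_packages(frozen, {"apparmor", "apturl"}))
```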

Cryptography Python Docker multistage build

I have a Python project that runs in a docker container and I am trying to convert to a multistage docker build process. My project depends on the cryptography package. My Dockerfile consists of:
# Base
FROM python:3.6 AS base
RUN pip install cryptography
# Production
FROM python:3.6-alpine
COPY --from=base /root/.cache /root/.cache
RUN pip install cryptography \
&& rm -rf /root/.cache
CMD python
Which I try to build with e.g:
docker build -t my-python-app .
This process works for a number of other Python requirements I have tested, such as pycrypto and psutil, but throws the following error for cryptography:
Step 5/6 : RUN pip install cryptography && rm -rf /root/.cache
---> Running in ebc15bd61d43
Collecting cryptography
Downloading cryptography-2.1.4.tar.gz (441kB)
Collecting idna>=2.1 (from cryptography)
Using cached idna-2.6-py2.py3-none-any.whl
Collecting asn1crypto>=0.21.0 (from cryptography)
Using cached asn1crypto-0.24.0-py2.py3-none-any.whl
Collecting six>=1.4.1 (from cryptography)
Using cached six-1.11.0-py2.py3-none-any.whl
Collecting cffi>=1.7 (from cryptography)
Downloading cffi-1.11.5.tar.gz (438kB)
Complete output from command python setup.py egg_info:
No working compiler found, or bogus compiler options passed to
the compiler from Python's standard "distutils" module. See
the error messages above. Likely, the problem is not related
to CFFI but generic to the setup.py of any Python package that
tries to compile C code. (Hints: on OS/X 10.8, for errors about
-mno-fused-madd see http://stackoverflow.com/questions/22313407/
Otherwise, see https://wiki.python.org/moin/CompLangPython or
the IRC channel #python on irc.freenode.net.)
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-uyh9_v63/cffi/
Obviously I was hoping not to have to install any compiler on my production image. Do I need to copy across another directory other than /root/.cache?
There is no manylinux wheel for Alpine, so you need to compile it yourself. The text below is pasted from the installation documentation. Install and remove the build dependencies in the same RUN command so that only the installed package is saved in the Docker image layer.
If you are on Alpine or just want to compile it yourself then
cryptography requires a compiler, headers for Python (if you’re not
using pypy), and headers for the OpenSSL and libffi libraries
available on your system.
Alpine: replace python3-dev with python-dev if you’re using Python 2.
$ sudo apk add gcc musl-dev python3-dev libffi-dev openssl-dev
If you get an error with openssl-dev you may have to use libressl-dev.
Docs can be found here
I hope my answer will be useful.
You should use the --user option when installing cryptography via pip in the base stage. Example: RUN pip install --user cryptography. With this option, all files are installed in the .local directory of the current user’s home directory.
Then use COPY --from=base /root/.local /root/.local, because cryptography is installed in /root/.local.
That's all. Full multistage Docker example:
# Base
FROM python:3.6 AS base
RUN pip install --user cryptography
# Production
FROM python:3.6-alpine
COPY --from=base /root/.local /root/.local
RUN pip install cryptography \
&& rm -rf /root/.cache
CMD python

pip install -d not creating wheel for cffi

I'm trying to create an "offline package" for Python code.
I'm running
pip install -d <dest dir> -r requirements.txt
The thing is that cffi==1.6.0 (listed in requirements.txt) doesn't get built into a wheel.
Is there a way to make that happen? (I'm trying to avoid a dependency on gcc on the target machine.)
install -d just downloads the packages; it doesn't build them. To force everything to be built into wheels, use the pip wheel command instead:
pip wheel -w <dir> -r requirements.txt
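To confirm that nothing in the output directory still needs a compiler on the target, you can look for leftover source archives. A small sketch follows; find_sdists is a made-up helper and the list of suffixes is an assumption about what sdists usually look like.

```python
import os
from typing import List

# Common source-archive suffixes; anything else (such as .whl) is ignored.
SDIST_SUFFIXES = (".tar.gz", ".zip", ".tar.bz2")


def find_sdists(directory: str) -> List[str]:
    """Return the names of source archives found directly inside `directory`."""
    return sorted(f for f in os.listdir(directory) if f.endswith(SDIST_SUFFIXES))
```

Calling find_sdists on the wheel directory should return an empty list when every requirement was built into a wheel.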

Pip Wheel and coverage: command not found error

I want to use wheels on my Linux server as it seems much faster, but when I do:
pip install wheel
pip wheel -r requirements_dev.txt
Which contains the following packages
nose
django_coverage
coverage
I get coverage: command not found; it's as if it is not being installed.
Is there a fallback to pip install if a wheel is not found, or have I misunderstood or set this up incorrectly?
Can you try this?
virtualenv venv
source venv/bin/activate
pip install -r requirements_dev.txt
I also get this when using pip wheel:
pip wheel -r check.txt
Collecting nose (from -r check.txt (line 1))
Using cached nose-1.3.7-py2-none-any.whl
Saved ./nose-1.3.7-py2-none-any.whl
Collecting django_coverage (from -r check.txt (line 2))
Saved ./django_coverage-1.2.4-cp27-none-any.whl
Collecting coverage (from -r check.txt (line 3))
Using cached coverage-4.2-cp27-cp27m-macosx_10_10_x86_64.whl
Saved ./coverage-4.2-cp27-cp27m-macosx_10_10_x86_64.whl
Skipping nose, due to already being wheel.
Skipping django-coverage, due to already being wheel.
Skipping coverage, due to already being wheel.
Installing from wheels is what pip already does by default. pip wheel is for creating wheels from your requirements file.
