How to Bootstrap numpy installation in setup.py - python

I have a project which has a C extension which requires numpy. Ideally, I'd like whoever downloads my project to just be able to run python setup.py install or use one call to pip. The problem I have is that in my setup.py I need to import numpy to get the location of the headers, but I'd like numpy to be just a regular requirement in install_requires so that it will automatically be downloaded from the Python Package Index.
Here is a sample of what I'm trying to do:
from setuptools import setup, Extension
import numpy as np
ext_modules = [Extension('vme', ['vme.c'], extra_link_args=['-lvme'],
                         include_dirs=[np.get_include()])]
setup(name='vme',
      version='0.1',
      description='Module for communicating over VME with CAEN digitizers.',
      ext_modules=ext_modules,
      install_requires=['numpy','pyzmq', 'Sphinx'])
Obviously, I can't import numpy at the top before it's installed. I've seen a setup_requires argument passed to setup() but can't find any documentation on what it is for.
Is this possible?

The following works at least with numpy 1.8 and Python 2.6, 2.7, and 3.3:
from setuptools import setup
from setuptools.command.build_ext import build_ext as _build_ext
class build_ext(_build_ext):
    def finalize_options(self):
        _build_ext.finalize_options(self)
        # Prevent numpy from thinking it is still in its setup process:
        __builtins__.__NUMPY_SETUP__ = False
        import numpy
        self.include_dirs.append(numpy.get_include())
setup(
    ...
    cmdclass={'build_ext':build_ext},
    setup_requires=['numpy'],
    ...
)
For a short explanation of why it fails without the "hack", see this answer.
Note that using setup_requires has a subtle downside: numpy will be compiled not only before building extensions, but also when running python setup.py --help, for example. To avoid this, you could check the command-line options, as suggested in https://github.com/scipy/scipy/blob/master/setup.py#L205, but I don't really think it's worth the effort.
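The command-line check mentioned in this answer could look roughly like the following. This is a minimal sketch, not SciPy's actual code: the function name needs_numpy and the list of informational commands are assumptions chosen for illustration.

```python
import sys

# Commands and flags that only print information and never compile anything.
# This list is an illustrative assumption, not an exhaustive one.
INFO_ONLY = ("--help", "-h", "--help-commands", "--version", "egg_info", "clean")


def needs_numpy(argv):
    """Return True if this setup.py invocation is expected to build something."""
    args = argv[1:]
    if not args:
        # A bare "python setup.py" only prints a usage message.
        return False
    return not any(arg in INFO_ONLY for arg in args)


# Only request numpy for commands that really need it.
setup_requires = ["numpy"] if needs_numpy(sys.argv) else []
```

The resulting list would then be passed to setup() as setup_requires=setup_requires, so that python setup.py --help no longer triggers a numpy build.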

I found a very easy solution in this post:
Or you can stick to https://github.com/pypa/pip/issues/5761. Here you install cython and numpy using setuptools.dist before actual setup:
from setuptools import dist
dist.Distribution().fetch_build_eggs(['Cython>=0.15.1', 'numpy>=1.10'])
Works well for me!

This is a fundamental problem with packages that need to use numpy (for distutils or get_include). I do not know of a way to "boot-strap" it using pip or easy-install.
However, it is easy to make a conda package for your module and provide the list of dependencies so that someone can just do a conda install pkg-name which will download and install everything needed.
Conda is available in Anaconda or in Miniconda (python + just conda).
See this website: http://docs.continuum.io/conda/index.html
or this slide-deck for more information: https://speakerdeck.com/teoliphant/packaging-and-deployment-with-conda

The key is to defer importing numpy until after it has been installed. A trick I learned from this pybind11 example is to import numpy in the __str__ method of a helper class (get_numpy_include below).
from setuptools import setup, Extension
class get_numpy_include(object):
    """Defer numpy.get_include() until after numpy is installed."""
    def __str__(self):
        import numpy
        return numpy.get_include()
ext_modules = [Extension('vme', ['vme.c'], extra_link_args=['-lvme'],
                         include_dirs=[get_numpy_include()])]
setup(name='vme',
      version='0.1',
      description='Module for communicating over VME with CAEN digitizers.',
      ext_modules=ext_modules,
      install_requires=['numpy','pyzmq', 'Sphinx'])
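To see why the __str__ trick defers the work, here is a small self-contained sketch that does not need numpy. LazyPath and fake_get_include are hypothetical names standing in for get_numpy_include and numpy.get_include(), and the returned path is made up. The __fspath__ method is an extra assumption, added so that the object also works with os.fspath(); whether a particular setuptools version accepts such an object in include_dirs is not guaranteed.

```python
import os


class LazyPath(object):
    """Hold a callable and only call it when a string is actually needed."""

    def __init__(self, resolver):
        self._resolver = resolver

    def __str__(self):
        return self._resolver()

    def __fspath__(self):
        # Lets os.fspath() and os.path functions accept the object as well.
        return str(self)


calls = []


def fake_get_include():
    # Stand-in for numpy.get_include(); records that it was called.
    calls.append("resolved")
    return "/hypothetical/numpy/include"


include_dirs = [LazyPath(fake_get_include)]
calls_after_construction = list(calls)  # still empty: nothing resolved yet

resolved = str(include_dirs[0])  # the resolver runs only now
```

Building the list does not call the resolver; it runs only when something converts the entry to a string, which in the answer above happens after numpy has been installed.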

To get pip to work, you can do what SciPy does: https://github.com/scipy/scipy/blob/master/setup.py#L205
Namely, the egg_info command needs to be handled by standard setuptools/distutils, but other commands can use numpy.distutils.

Perhaps a more practical solution is to just require numpy to be installed beforehand and import numpy inside a function scope. @coldfix's solution works, but compiling numpy takes forever. It is much faster to pip install it first as a wheel, especially now that we have wheels for most systems thanks to efforts like manylinux.
from __future__ import print_function
import sys
import textwrap
import pkg_resources
from setuptools import setup, Extension
def is_installed(requirement):
    try:
        pkg_resources.require(requirement)
    except pkg_resources.ResolutionError:
        return False
    else:
        return True
if not is_installed('numpy>=1.11.0'):
    print(textwrap.dedent("""
        Error: numpy needs to be installed first. You can install it via:
        $ pip install numpy
        """), file=sys.stderr)
    sys.exit(1)
def ext_modules():
    import numpy as np
    some_extension = Extension(..., include_dirs=[np.get_include()])
    return [some_extension]
setup(
    ext_modules=ext_modules(),
)
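Since pkg_resources is discouraged in recent setuptools releases, the presence check could also be written with the standard library alone. The following is a minimal Python 3 sketch under that assumption; is_importable is a hypothetical helper name, and unlike the pkg_resources version it only checks that the module can be found, not that it satisfies a minimum version such as numpy>=1.11.0.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


# Example with a standard-library module and a name that should not exist.
json_available = is_importable("json")
bogus_available = is_importable("surely_not_a_real_module_xyz")
```

In a setup.py, is_importable('numpy') would take the place of is_installed('numpy>=1.11.0') in the guard that prints the error and exits.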

This should now (since 2018-ish) be solved by adding numpy as a build-system dependency in pyproject.toml, so that pip install makes numpy available before it runs setup.py.
The pyproject.toml file should also specify that you're using Setuptools to build the project. It should look something like this:
[build-system]
requires = ["setuptools", "wheel", "numpy"]
build-backend = "setuptools.build_meta"
See Setuptools' Build System Support docs for more details.
This doesn't cover uses of setup.py other than install, but since those are mainly for you (and other developers of your project), an error message saying to install numpy might be enough.
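The error message suggested here could be produced by a small guard in setup.py. This is only a sketch: numpy_include_dir is a hypothetical helper name and the wording of the message is an assumption.

```python
def numpy_include_dir():
    """Return numpy's header directory, or exit with a readable message."""
    try:
        import numpy
    except ImportError:
        # Reached when setup.py is run directly in an environment without
        # numpy, i.e. outside the isolated build that pip sets up.
        raise SystemExit(
            "numpy is required to build this project; "
            "install it first, e.g. with: pip install numpy"
        )
    return numpy.get_include()
```

The extension would then be declared with include_dirs=[numpy_include_dir()]. Under pip install the import succeeds because of the pyproject.toml entry, and a developer running setup.py by hand sees a clear message instead of a traceback.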

@coldfix's solution doesn't work for Cython extensions if Cython isn't pre-installed on the target machine; it fails with the error
error: unknown file type '.pyx' (from 'xxxxx/yyyyyy.pyx')
The reason for the failure is the premature import of setuptools.command.build_ext, because when imported, it tries to use Cython's build_ext functionality:
try:
    # Attempt to use Cython for building extensions, if available
    from Cython.Distutils.build_ext import build_ext as _build_ext
    # Additionally, assert that the compiler module will load
    # also. Ref #1229.
    __import__('Cython.Compiler.Main')
except ImportError:
    _build_ext = _du_build_ext
Normally setuptools succeeds, because the import happens after the setup_requires dependencies have been fulfilled. However, when it is already imported at the top of setup.py, only the fallback can be used, which doesn't know anything about Cython.
One way to bootstrap Cython alongside numpy is to postpone the import of setuptools.command.build_ext with the help of an indirection/proxy:
# factory function
def my_build_ext(pars):
    # import delayed:
    from setuptools.command.build_ext import build_ext as _build_ext
    # include_dirs adjusted:
    class build_ext(_build_ext):
        def finalize_options(self):
            _build_ext.finalize_options(self)
            # Prevent numpy from thinking it is still in its setup process:
            __builtins__.__NUMPY_SETUP__ = False
            import numpy
            self.include_dirs.append(numpy.get_include())
    # object returned:
    return build_ext(pars)
...
setup(
    ...
    cmdclass={'build_ext' : my_build_ext},
    ...
)
There are other possibilities, discussed for example in this SO-question.

You can simply add numpy into your pyproject.toml file. This works for me.
[build-system]
requires = [
    "setuptools>=42",
    "wheel",
    "Cython",
    "numpy"
]
build-backend = "setuptools.build_meta"

Related

How do I include the numpy includedir for a setuptools extension?

I have a setup.py that looks like this, minus the irrelevant bits.
import numpy
my_module = Extension(
    'my_module',
    ......
    include_dirs=[numpy.get_include()],
)
setup(
    ...
    ext_modules=[my_module],
    install_requires=['numpy'],
)
I cannot figure out how to rewrite this so that numpy is automatically installed if it is not already in the environment. In its current form, I get an error from the import numpy about numpy not being defined.
I need the extension to know where the numpy include files are. But there seems no way to delay the definition of the extension (or at least of its include_dirs) until after the installation of numpy from the install_requires has occurred.
Is there a workaround?

Importing package in another package's module?

I am building a package (which I am then going to upload to PyPI). I have 3 modules in this package. These modules have a few dependencies like numpy, cv2, and more.
So, I mentioned my dependencies in the setup.py file.
# -*- coding: utf-8 -*-
import setuptools
setuptools.setup(
    name='StyleTransferTensorFlow',
    url='https://github.com/LordHarsh/Neural_Style_Transfer',
    author='Hash Banka',
    author_email='harshbanka321#gmail.com',
    packages=setuptools.find_packages(),
    # Needed for dependencies
    install_requires=['matplotlib','tensorflow','os-win','ffmpy','glob2', 'pytest-shutil', 'pytube', 'opencv-python', 'pillow', 'numpy', 'tensorflow_hub'],
    version='0.0.8',
    # The license can be anything you like
    license='MIT',
    description="Package to apply style transfer on different frames of a video",
    # We will also need a readme eventually (there will be a warning)
    # long_description=open('README.txt').read(),
    python_requires='>=3.6',
)
I have also imported them in the __init__.py file present in the same directory as the modules.
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_hub as hub
import numpy as np
from pytube import YouTube
import os
import cv2
from PIL import Image
import shutil
import glob
import ffmpy
But I still get an error when I execute code in the module:
NameError: name 'cv2' is not defined
I am not sure why my imports are not running.
I have used cv2 inside a module to do the task.
So I am not sure what I am doing wrong. Please help me out.
I am not sure you fully understand what install_requires and packages are for.
install_requires lists the third-party distributions your package needs at run time. When someone installs your package with pip or python setup.py install, any distribution listed there that is missing from the environment is downloaded and installed along with it.
numpy is a very common entry there. If setup.py itself imports numpy, for example because you have some C or Fortran code that is compiled at install time, listing it in install_requires is not enough, because it must already be available when setup.py runs.
The packages argument lists the import packages of your own project (the directories containing an __init__.py) that should be included in the distribution; setuptools.find_packages() discovers them for you. It has nothing to do with dependencies.
Neither argument is the cause of your error. The NameError comes from the imports: names imported in __init__.py are bound only in the package's own namespace, not in its modules. Add import cv2 (and any other imports a module uses) at the top of each module that uses them.
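The NameError in the question can be reproduced without any third-party package. The sketch below writes a throwaway package to a temporary directory and uses json as a stand-in for cv2; demo_pkg, broken, and fixed are hypothetical names. It shows that a name imported in __init__.py is not visible inside a module of the package unless that module imports it itself.

```python
import importlib
import os
import sys
import tempfile

# Build a tiny package on disk: demo_pkg/__init__.py, broken.py, fixed.py.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("import json\n")  # imported here only

with open(os.path.join(pkg_dir, "broken.py"), "w") as f:
    f.write("def dump(x):\n    return json.dumps(x)\n")  # no import of its own

with open(os.path.join(pkg_dir, "fixed.py"), "w") as f:
    f.write("import json\n\ndef dump(x):\n    return json.dumps(x)\n")

sys.path.insert(0, root)
importlib.invalidate_caches()
broken = importlib.import_module("demo_pkg.broken")
fixed = importlib.import_module("demo_pkg.fixed")

try:
    broken.dump([1])
    broken_error = None
except NameError as exc:
    broken_error = str(exc)  # the module never bound the name "json"

fixed_result = fixed.dump([1])
```

Each module has its own global namespace, so the import has to appear in every module that uses the name.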

AWS Lambda function in python not running and saying scikit-learn has not been built correctly

I have the following, very simple python code in a lambda function:
from sklearn.externals import joblib
import praw
import datetime
from operator import attrgetter
import sys
def handler_name(event, context):
return "I am a cat dog and i meow."
I have also done pip installs for scikit-learn, praw, datetime, numpy and scipy from within a python 2.7 virtualenv. I then compressed my .py file along with everything in my virtualenv's /lib/python2.7/site-packages folder into a zip and uploaded it to AWS lambda. Unfortunately when I run the code I get the following error:
Unable to import module 'mainLambda': /var/task/sklearn/__check_build/_check_build.so: invalid ELF header
___________________________________________________________________________
Contents of /var/task/sklearn/__check_build:
setup.py _check_build.so __init__.pyc
__init__.py setup.pyc
___________________________________________________________________________
It seems that scikit-learn has not been built correctly.
If you have installed scikit-learn from source, please do not forget
to build the package before using it: run `python setup.py install` or
`make` in the source directory.
If you have used an installer, please check that it is suited for your
Python version, your operating system and your platform.
Obviously the issue is with scikit-learn, but I have no idea what it is. It could be a versioning issue, but I downloaded all the libraries from within a virtualenv and chose a Python 2.7 Lambda function. Any ideas? I am stumped!
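The "invalid ELF header" message means the loader did not recognize _check_build.so as an ELF file, the binary format Linux uses. One plausible cause, which the question does not confirm, is that the packages were installed on a non-Linux machine. A quick way to check a deployment folder is to look at the first four bytes of every .so file. The helper names below are hypothetical, and the demo files are fabricated stand-ins rather than real libraries.

```python
import os
import tempfile

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF binary


def is_elf(path):
    """Return True if the file starts with the ELF magic number."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC


def non_elf_shared_objects(root):
    """List .so files under root that are not ELF binaries."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".so"):
                full = os.path.join(dirpath, name)
                if not is_elf(full):
                    bad.append(full)
    return sorted(bad)


# Demo with two fake files: one with an ELF header, one with the
# 64-bit Mach-O magic number that macOS binaries start with.
demo_root = tempfile.mkdtemp()
with open(os.path.join(demo_root, "good.so"), "wb") as f:
    f.write(ELF_MAGIC + b"\x02\x01\x01")
with open(os.path.join(demo_root, "bad.so"), "wb") as f:
    f.write(b"\xcf\xfa\xed\xfe")

bad_files = non_elf_shared_objects(demo_root)
```

Running non_elf_shared_objects on the unzipped deployment package would list any compiled modules that this check flags.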

Can I define optional packages in setuptools?

Currently one of my packages requires a JSON parser/encoder, and is designed to use simplejson if available falling back to the json module (in the standard library) if necessary (as benchmarks show simplejson is faster).
However, recently it's been hit or miss as to whether simplejson will install when using zc.buildout - something with the move to github, I believe. Which got me wondering; is it possible to define optional packages in my setup.py file which, if unavailable, won't stop the installation of my package?
optional packages at installation time.
I am assuming you are talking about your setup.py script.
You could change it to have:
# mypackage/setup.py
extras = {
    'with_simplejson': ['simplejson>=3.5.3']
}
setup(
    ...
    extras_require=extras,
    ...)
then you can do either of:
pip install mypackage,
pip install mypackage[with_simplejson]
with the latter installing simplejson>=3.5.3.
Instead of trying to install everything and fallback to a known good version,
you would want to install the subset of packages you know work.
optional packages at execution time.
Once you have two different sets of packages that could be installed, you need
to make sure you can use them if they are available. E.g. for your json import:
try:
    # helpful comment saying this should be faster.
    import simplejson as json
except ImportError:
    import json
Another more complex example:
try:
    # xml is dangerous
    from defusedxml.cElementTree import parse
except ImportError:
    try:
        # cElementTree is not available in older python
        from xml.etree.cElementTree import parse
    except ImportError:
        from xml.etree.ElementTree import parse
But you can also find this pattern in some packages:
try:
    optional_package = None
    import optional.package as optional_package
except ImportError:
    pass
...
if optional_package:
    # do additional behavior
AFAIK there is no way to define an optional package and there would be no use to do so. What do you expect when you define an optional package? That it is installed when it is not yet available? (that would somehow make it mandatory)
No, IMHO the correct way to address this is in your imports where you want to use the package. E.g:
try:
    from somespecialpackage import someapi as myapi
except ImportError:
    from basepackage import theapi as myapi
This of course requires that the two APIs are compatible, but this is the case with simplejson and the standard library json package.
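The fallback-import pattern used in both answers can be wrapped in a small helper when several modules need it. This is a sketch only; import_first_available is a hypothetical name, and it assumes the candidate modules expose compatible APIs, as simplejson and json do.

```python
import importlib


def import_first_available(*names):
    """Import and return the first module in names that can be imported."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate
    raise ImportError(
        "none of these modules could be imported: %s" % ", ".join(names)
    )


# Prefer simplejson when it is installed, otherwise use the standard library.
json = import_first_available("simplejson", "json")
encoded = json.dumps([1])
```

The rest of the code then uses json without caring which implementation was picked.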

Cython ImportError: No module named parallel

I am trying to access the new parallel features of Cython 0.15 (using
Cython 0.15.1). However, if I try this minimal example (testp.py), taken from http://docs.cython.org/src/userguide/parallelism.html:
from cython.parallel import prange, parallel, threadid
cdef int i
cdef int sum = 0
for i in prange(n, nogil=True):
    sum += i
print sum
with this setup.py:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
import numpy
ext = Extension("testp", ["testp.pyx"], include_dirs=[numpy.get_include()],
                extra_compile_args=['-fopenmp'], extra_link_args=['-fopenmp'])
setup(ext_modules=[ext], cmdclass={'build_ext': build_ext})
when I import testp, Python tells me: ImportError: No module named
parallel. And in fact, if I browse the Cython package in the
site-packages, I cannot find any file or directory that is called
parallel. But I thought it should be included somewhere in the
release? Could someone please clarify for a confused user?
I'm using Cython 0.15+
cython.parallel exists in Shadow.py:
import sys
sys.modules['cython.parallel'] = CythonDotParallel()
Shadow.py can be found in your Python's dist-packages directory, for example /usr/local/lib/python2.6/dist-packages/ on Linux.
You can list all of your Python modules from the Python command line using:
>>> help('modules')
And then try to install/reinstall cython using easy_install or pip.
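Before reinstalling, it can help to check what the interpreter actually sees. The following sketch reports whether Cython is installed and whether cython.parallel can be imported from plain Python; cython_parallel_status is a hypothetical helper name and the message wording is an assumption.

```python
def cython_parallel_status():
    """Describe whether Cython and cython.parallel are importable."""
    try:
        import Cython
    except ImportError:
        return "Cython is not installed"
    version = getattr(Cython, "__version__", "unknown")
    try:
        # Registered by Cython's Shadow module when the cython shim loads.
        import cython.parallel  # noqa: F401
    except ImportError:
        return "Cython %s is installed, but cython.parallel is missing" % version
    return "Cython %s provides cython.parallel" % version


status = cython_parallel_status()
print(status)
```

If the second message appears, the installed Cython probably predates the parallel module or the installation is incomplete, which is the situation in which reinstalling is the suggested fix.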
