Does building pure python modules w/ conda require setuptools?

Does building pure python modules w/ conda require setuptools? - python

This weekend I've been reading up on conda and the python packaging user guide because I have a simple pure python project that depends on numpy. It seemed to me that distributing/installing this project via conda was better than pip due to this dependency.
One thing on which I'm still not clear: conda will install a python package from a recipe in build.sh, but it seems like build.sh just ends up calling python setup.py install for most python packages.
So even if I want to distribute/install my python package with conda, I still end up depending on setuptools (or distutils) for the actual installation, correct? I was unable to find a conda utility analogous to setuptools; am I missing something?
FWIW, I posted this question on the conda issue tracker.
Thanks!

Typically you will still be using distutils (or setuptools if the library requires it) to install things, yes. It is not technically required. The build.sh can be anything. If you wanted to, you could just copy the code into site-packages. Using setup.py install is recommended, though, as libraries will already have setup.py working, it will install metadata that can be read by pip, and it will compile any extension modules and install any data files.

Related

python bdist_rpm not using install_requires?

I've created a new RPM using python bdist_rpm . Normally python setup.py install would install python dependencies like websocket-client or any other package. But the RPM just refuses to install anything.
Apparently the suggestion from various other posts seem to be in the line of just requiring them in setup.cfg as rpm packages. This doesn't make sense to me since most of the rpm packages seem to be on really old version and I can't possibly create rpms for all the python packages i require. I need a much recent version and it doesn't make sense that the yum installs don't actually install the packages.
What is the right (clean and easiest) way to do it ? I believe if a setup.py has something like
install_requires=[
"validictory",
"requests",
"netlogger>=4.3.0",
"netifaces",
"pyzmq",
"psutil",
"docopt"
],
Then it should try to either include them in the rpm or try to install it.
I am trying on a clean centos vm using vagrant which I keep destroying and then install the rpm.

Well the super hack way i used was to just add a post install script with all the requirements as easy_install installation (instead of pip because older versions may not have pip and even after installing pip, the approach failed on systems with python 2.6)
#Adding this in setup.py
options = {'bdist_rpm':{'post_install' : 'scripts/rpm_postinstall.sh'}},
Then the script is as follows:
easy_install -U <pkgnames>
Of course a post_uninstall can also be added if you want to clean up which I wouldn't because you have no clue what is using the packages installed apart from this app.
The logic of the rpm approach seems to be for this but its honestly over engineering and I'd rather package all the modules with the rpm to ensure it always works. ** Screaming out for a cleaner solution **

Automatically installing Python dependencies using CMake

I've had a quick look around, but because of terminology like dependencies and packages being used in different ways, it's quite tricky to pin down an answer.
I'm building a mixed-language source (Fortran, some C and Python) and the Fortran calls a Python script which depends on the networkx Python package in the PyPI. Normally, I just have networkx installed anyway, so it isn't a problem for me when rebuilding.
However, for distribution, I want the best way to:
Install pip or equivalent, if it is not installed.
Possibly install virtualenv and create a virtual environment, if appropriate.
Download and install networkx using the --user option with pip.
Is there a standard way? Or should I just use CMake dependencies with custom commands that install pip etc.?

it depends. for "manual" install, you definitely should detect if all required (to build) tools are installed, and issue an error if they don't. then use execute_process() to run pip and whatever you want.
from other side, if you are going to produce a real package for some particular Linux, you just pack your binaries and require (via corresponding syntax of particular package format like *.rpm or *.deb that your package depends on some other packages. so, you can be sure that they will be installed w/ (or even before) your package.

Python setup.py develop vs install

Two options in setup.py develop and install are confusing me. According to this site, using develop creates a special link to site-packages directory.
People have suggested that I use python setup.py install for a fresh installation and python setup.py develop after any changes have been made to the setup file.
Can anyone shed some light on the usage of these commands?

python setup.py install is used to install (typically third party) packages that you're not going to develop/modify/debug yourself.
For your own stuff, you want to first install your package and then be able to frequently edit the code without having to re-install the package every time — and that is exactly what python setup.py develop does: it installs the package (typically just a source folder) in a way that allows you to conveniently edit your code after it’s installed to the (virtual) environment, and have the changes take effect immediately.
Note: It is highly recommended to use pip install . (regular install) and pip install -e . (developer install) to install packages, as invoking setup.py directly will do the wrong things for many dependencies, such as pull prereleases and incompatible package versions, or make the package hard to uninstall with pip.
Update:
The develop counterpart for the latest python -m build approach is as follows (as per):

From the documentation. The develop will not install the package but it will create a .egg-link in the deployment directory back to the project source code directory.
So it's like installing but instead of copying to the site-packages it adds a symbolic link (the .egg-link acts as a multiplatform symbolic link).
That way you can edit the source code and see the changes directly without having to reinstall every time that you make a little change. This is useful when you are the developer of that project hence the name develop. If you are just installing someone else's package you should use install

Another thing that people may find useful when using the develop method is the --user option to install without sudo. Ex:
python setup.py develop --user
instead of
sudo python setup.py develop

Why use pip over easy_install?

A tweet reads:
Don't use easy_install, unless you
like stabbing yourself in the face.
Use pip.
Why use pip over easy_install? Doesn't the fault lie with PyPI and package authors mostly? If an author uploads crap source tarball (eg: missing files, no setup.py) to PyPI, then both pip and easy_install will fail. Other than cosmetic differences, why do Python people (like in the above tweet) seem to strongly favor pip over easy_install?
(Let's assume that we're talking about easy_install from the Distribute package, that is maintained by the community)

From Ian Bicking's own introduction to pip:
pip was originally written to improve on easy_install in the following ways
All packages are downloaded before installation. Partially-completed installation doesn’t occur as a result.
Care is taken to present useful output on the console.
The reasons for actions are kept track of. For instance, if a package is being installed, pip keeps track of why that package was required.
Error messages should be useful.
The code is relatively concise and cohesive, making it easier to use programmatically.
Packages don’t have to be installed as egg archives, they can be installed flat (while keeping the egg metadata).
Native support for other version control systems (Git, Mercurial and Bazaar)
Uninstallation of packages.
Simple to define fixed sets of requirements and reliably reproduce a set of packages.

Many of the answers here are out of date for 2015 (although the initially accepted one from Daniel Roseman is not). Here's the current state of things:
Binary packages are now distributed as wheels (.whl files)—not just on PyPI, but in third-party repositories like Christoph Gohlke's Extension Packages for Windows. pip can handle wheels; easy_install cannot.
Virtual environments (which come built-in with 3.4, or can be added to 2.6+/3.1+ with virtualenv) have become a very important and prominent tool (and recommended in the official docs); they include pip out of the box, but don't even work properly with easy_install.
The distribute package that included easy_install is no longer maintained. Its improvements over setuptools got merged back into setuptools. Trying to install distribute will just install setuptools instead.
easy_install itself is only quasi-maintained.
All of the cases where pip used to be inferior to easy_install—installing from an unpacked source tree, from a DVCS repo, etc.—are long-gone; you can pip install ., pip install git+https://.
pip comes with the official Python 2.7 and 3.4+ packages from python.org, and a pip bootstrap is included by default if you build from source.
The various incomplete bits of documentation on installing, using, and building packages have been replaced by the Python Packaging User Guide. Python's own documentation on Installing Python Modules now defers to this user guide, and explicitly calls out pip as "the preferred installer program".
Other new features have been added to pip over the years that will never be in easy_install. For example, pip makes it easy to clone your site-packages by building a requirements file and then installing it with a single command on each side. Or to convert your requirements file to a local repo to use for in-house development. And so on.
The only good reason that I know of to use easy_install in 2015 is the special case of using Apple's pre-installed Python versions with OS X 10.5-10.8. Since 10.5, Apple has included easy_install, but as of 10.10 they still don't include pip. With 10.9+, you should still just use get-pip.py, but for 10.5-10.8, this has some problems, so it's easier to sudo easy_install pip. (In general, easy_install pip is a bad idea; it's only for OS X 10.5-10.8 that you want to do this.) Also, 10.5-10.8 include readline in a way that easy_install knows how to kludge around but pip doesn't, so you also want to sudo easy_install readline if you want to upgrade that.

Another—as of yet unmentioned—reason for favoring pip is because it is the new hotness and will continue to be used in the future.
The infographic below—from the Current State of Packaging section in the The Hitchhiker's Guide to Packaging v1.0—shows that setuptools/easy_install will go away in the future.
Here's another infographic from distribute's documentation showing that Setuptools and easy_install will be replaced by the new hotness—distribute and pip. While pip is still the new hotness, Distribute merged with Setuptools in 2013 with the release of Setuptools v0.7.

Two reasons, there may be more:
pip provides an uninstall command
if an installation fails in the middle, pip will leave you in a clean state.

REQUIREMENTS files.
Seriously, I use this in conjunction with virtualenv every day.
QUICK DEPENDENCY MANAGEMENT TUTORIAL, FOLKS
Requirements files allow you to create a snapshot of all packages that have been installed through pip. By encapsulating those packages in a virtualenvironment, you can have your codebase work off a very specific set of packages and share that codebase with others.
From Heroku's documentation https://devcenter.heroku.com/articles/python
You create a virtual environment, and set your shell to use it. (bash/*nix instructions)
virtualenv env
source env/bin/activate
Now all python scripts run with this shell will use this environment's packages and configuration. Now you can install a package locally to this environment without needing to install it globally on your machine.
pip install flask
Now you can dump the info about which packages are installed with
pip freeze > requirements.txt
If you checked that file into version control, when someone else gets your code, they can setup their own virtual environment and install all the dependencies with:
pip install -r requirements.txt
Any time you can automate tedium like this is awesome.

pip won't install binary packages and isn't well tested on Windows.
As Windows doesn't come with a compiler by default pip often can't be used there. easy_install can install binary packages for Windows.

UPDATE: setuptools has absorbed distribute as opposed to the other way around, as some thought. setuptools is up-to-date with the latest distutils changes and the wheel format. Hence, easy_install and pip are more or less on equal footing now.
Source: http://pythonhosted.org/setuptools/merge-faq.html#why-setuptools-and-not-distribute-or-another-name

As an addition to fuzzyman's reply:
pip won't install binary packages and isn't well tested on Windows.
As Windows doesn't come with a compiler by default pip often can't be
used there. easy_install can install binary packages for Windows.
Here is a trick on Windows:
you can use easy_install <package> to install binary packages to avoid building a binary
you can use pip uninstall <package> even if you used easy_install.
This is just a work-around that works for me on windows.
Actually I always use pip if no binaries are involved.
See the current pip doku: http://www.pip-installer.org/en/latest/other-tools.html#pip-compared-to-easy-install
I will ask on the mailing list what is planned for that.
Here is the latest update:
The new supported way to install binaries is going to be wheel!
It is not yet in the standard, but almost. Current version is still an alpha: 1.0.0a1
https://pypi.python.org/pypi/wheel
http://wheel.readthedocs.org/en/latest/
I will test wheel by creating an OS X installer for PySide using wheel instead of eggs. Will get back and report about this.
cheers - Chris
A quick update:
The transition to wheel is almost over. Most packages are supporting wheel.
I promised to build wheels for PySide, and I did that last summer. Works great!
HINT:
A few developers failed so far to support the wheel format, simply because they forget to
replace distutils by setuptools.
Often, it is easy to convert such packages by replacing this single word in setup.py.

Just met one special case that I had to use easy_install instead of pip, or I have to pull the source codes directly.
For the package GitPython, the version in pip is too old, which is 0.1.7, while the one from easy_install is the latest which is 0.3.2.rc1.
I'm using Python 2.7.8. I'm not sure about the underlay mechanism of easy_install and pip, but at least the versions of some packages may be different from each other, and sometimes easy_install is the one with newer version.
easy_install GitPython

Python packages installation in Windows

I recently began learning Python, and I am a bit confused about how packages are distributed and installed.
I understand that the official way of installing packages is distutils: you download the source tarball, unpack it, and run: python setup.py install, then the module will automagically install itself
I also know about setuptools which comes with easy_install helper script. It uses eggs for distribution, and from what I understand, is built on top of distutils and does the same thing as above, plus it takes care of any dependencies required, all fetched from PyPi
Then there is also pip, which I'm still not sure how it differ from the others.
Finally, as I am on a windows machine, a lot of packages also offers binary builds through a windows installer, especially the ones that requires compiling C/Fortran code, which otherwise would be a nightmare to manually compile on windows (assumes you have MSVC or MinGW/Cygwin dev environment with all necessary libraries setup.. nonetheless try to build numpy or scipy yourself and you will understand!)
So can someone help me make sense of all this, and explain the differences, pros/cons of each method. I'd like to know how each keeps track of packages (Windows Registry, config files, ..). In particular, how would you manage all your third-party libraries (be able to list installed packages, disable/uninstall, etc..)

I use pip, and not on Windows, so I can't provide comparison with the Windows-installer option, just some information about pip:
Pip is built on top of setuptools, and requires it to be installed.
Pip is a replacement (improvement) for setuptools' easy_install. It does everything easy_install does, plus a lot more (make sure all desired distributions can be downloaded before actually installing any of them to avoid broken installs, list installed distributions and versions, uninstall, search PyPI, install from a requirements file listing multiple distributions and versions...).
Pip currently does not support installing any form of precompiled or binary distributions, so any distributions with extensions requiring compilation can only be installed if you have the appropriate compiler available. Supporting installation from Windows binary installers is on the roadmap, but it's not clear when it will happen.
Until recently, pip's Windows support was flaky and untested. Thanks to a lot of work from Dave Abrahams, pip trunk now passes all its tests on Windows (and there's a continuous integration server helping us ensure it stays that way), but a release has not yet been made including that work. So more reliable Windows support should be coming with the next release.
All the standard Python package installation mechanisms store all metadata about installed distributions in a file or files next to the actual installed package(s). Distutils uses a distribution_name-X.X-pyX.X.egg-info file, pip uses a similarly-named directory with multiple metadata files in it. Easy_install puts all the installed Python code for a distribution inside its own zipfile or directory, and places an EGG-INFO directory inside that directory with metadata in it. If you import a Python package from the interactive prompt, check the value of package.__file__; you should find the metadata for that package's distribution nearby.
Info about installed distributions is only stored in any kind of global registry by OS-specific packaging tools such as Windows installers, Apt, or RPM. The standard Python packaging tools don't modify or pay attention to these listings.
Pip (or, in my opinion, any Python packaging tool) is best used with virtualenv, which allows you to create isolated per-project Python mini-environments into which you can install packages without affecting your overall system. Every new virtualenv automatically comes with pip installed in it.
A couple other projects you may want to be aware of as well (yes, there's more!):
distribute is a fork of setuptools which has some additional bugfixes and features.
distutils2 is intended to be the "next generation" of Python packaging. It is (hopefully) adopting the best features of distutils/setuptools/distribute/pip. It is being developed independently and is not ready for use yet, but eventually should replace distutils in the Python standard library and become the de facto Python packaging solution.
Hope all that helped clarify something! Good luck.

I use windows and python. It is somewhat frustrating, because pip doesn't always work to install things. Python is moving to pip, so I still use it. Pip is nice, because you can uninstall items and use
pip freeze > requirements.txt
pip install -r requirements.txt
Another reason I like pip is for virtual environments like venv with python 3.4. I have found venv a lot easier to use on windows than virtualenv.
If you cannot install a package you have to find the binary for it. http://www.lfd.uci.edu/~gohlke/pythonlibs/
I have found these binaries to be very useful.
Pip is trying to make something called a wheel for binary installations.
pip install wheel
wheel convert path\to\binary.exe
pip install converted_wheel.whl
You will also have to do this for any required libraries that do not install and are required for that package.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.