Python packages on windows: pip or native installers?

Python packages on windows: pip or native installers? - python

What are the relative merits of installing python packages on windows using pip as opposed to using packaged installers (exe/msi)?

For starters, some just wouldn't work for me (MySQLdb being the major culprit).
My new rule:
Try pip or easy_install
If that doesn't work, browse this library of python .msi/.exe installers for Windows.
If neither works for you, post a question on StackOverflow. You really have no benefit weighing the merits of one or the other; just get what you need in the manner that provides the least friction, and move on to more interesting things.

Native installers are used mainly when a package contains C extensions that need to be compiled. As you have to use the same compiler used to build Python itself and configure environment properly it's not something that many users see themselves doing. To avoid these problems they instead choose native installer. However, installing by running exe/msi installer directly doesn't allow to choose which Python instance to install a package in so you can't install in virtualenv. This seems like a major drawback of using native installers but only because many people are not aware of the fact that it is possible to use native installer to install in virtualenv - see Can I install Python windows packages into virtualenvs? (Unfortunately in this case you can't use pip as it doesn't handle installing binary packages).
To summarize:
If a package has no C extensions use pip
If a package has C extensions and
you can find native installer or binary egg use easy_install with native installer/binary egg
you can't find native installer nor binary egg use pip to compile C extensions and install package

Related

Combining Python precompiled installers with package managers

I am using Python in Windows. For performance reasons I need certain Python packages built against Intel MKL, most notably numpy. So far I have been installing all packages I use from precompiled installers from http://www.lfd.uci.edu/~gohlke/pythonlibs/. Obviously, manual package management is somewhat inefficient.
I know package managers and distributions exist (pip, Anaconda, Enthought). Is there a way to combine package management for most of the packages with manual install of specific package builds?
So far I have briefly tried pip. I see that after manually updating a package from an exe installer pip freeze still reports the previous version, while Python picks up the new version. So something seems to go at least a bit wrong.
Very related discussions are Anaconda vs. EPD Enthought vs. manual installation of Python and Python packages installation in Windows, but I did not find an answer to my particular question there.

Conda has the ability to convert Golhke installers into conda packages. You'll need to specify the dependencies manually, since the metadata isn't include in the installers. For example, to convert the cvxopt installer to a conda package use:
conda convert cvxopt-1.1.7.win-amd64-py2.7.exe -d 'numpy >=1.8'

Automatically installing Python dependencies using CMake

I've had a quick look around, but because of terminology like dependencies and packages being used in different ways, it's quite tricky to pin down an answer.
I'm building a mixed-language source (Fortran, some C and Python) and the Fortran calls a Python script which depends on the networkx Python package in the PyPI. Normally, I just have networkx installed anyway, so it isn't a problem for me when rebuilding.
However, for distribution, I want the best way to:
Install pip or equivalent, if it is not installed.
Possibly install virtualenv and create a virtual environment, if appropriate.
Download and install networkx using the --user option with pip.
Is there a standard way? Or should I just use CMake dependencies with custom commands that install pip etc.?

it depends. for "manual" install, you definitely should detect if all required (to build) tools are installed, and issue an error if they don't. then use execute_process() to run pip and whatever you want.
from other side, if you are going to produce a real package for some particular Linux, you just pack your binaries and require (via corresponding syntax of particular package format like *.rpm or *.deb that your package depends on some other packages. so, you can be sure that they will be installed w/ (or even before) your package.

Do I need to install Homebrew if I am planning to install Virtualenv?

Being fairly new to programming, I am having trouble understanding exactly what Homebrew does... or rather - why it is needed. I know it contains pip for package management, but so does Virtualenv and I'm planning on installing this in due course.
Does Homebrew install another version of python that is not the system version, upon which you would install Virtualenv and manage the different development environments from there?
I have a clean install of OSX Lion and I want to keep my projects separated, but am unsure why I need Homebrew.
I realise this is basic stuff, but if someone could explain it, I would be grateful.

Homebrew is just a package manager for Mac, like pip for Python. Of course you never need a package manager, you can just get all the programs, or libraries in case of pip and Pypi yourself. The point of package managers however is to ease this process and give you a simple interface to install the software, and also to remove it as that is usually not so simply when compiling things yourself etc.
That being said, Homebrew will only install things you tell it to install, so by just having Homebrew you don’t randomly get new versions of something. Homebrew is just a nice way to install general OSX stuff you need/want in general.

pip and virtualenv are python libraries and can be installed in any working python install including the one supplied by Apple as part of OSX and the python.org version.
Then it depends on what you need from python - if you just have to install python libraries or simple C linraries then you can just use easy_install and then pip, vittualenv other python tools.
If you are using more complex C libraries e.g. python interface for mysql then it helps to use a package manager like macports, homebrew or fink as the port writers will have sorted out the tricky dependencies. There are also other python installs from Enthought and Activestate that deal with some of the non simple cases e.g. scipy but are not general purpose package managers.
Macports and fink will install a separate version of python in /opt/local/bin or /sw/bin whilst I think homebrew will use Apple's python. *The difference is due to a difference of view of the package mangers design. Macports and fink were developed by people who experienced a lot of issues with different versions of software and so said that all our installs will be in a place only the package manager uses whilst Homebrew trys to use as much of the Apple supplied tools as possible so to add as little as needed.

What is the best way to install python 2 on OS X?

A colleague of mine wants to use my python 2 code on his OS X (10.6) machine. My code imports several built-in python packages, including Tkinter and shelve, and also uses third-party packages, including numpy, scipy, matplotlib, and ipython.
I've encountered a few problems with OS X's built-in python. (IDLE doesn't work, for example*). I suspect I should install a more recent version of python, and a different version of Tk.
My questions:
Will having two different versions of python/Tk on the same machine cause problems?
I would like to associate the terminal commands 'python', 'ipython', and 'easy_install' with the more recent version of python. How should I do this?
When I install third-party packages like numpy using a .dmg file, how do I control which version of python numpy installs into?
Is there a better way to do this?
If this process goes well, I'd consider adding OS X instructions to my code's documentation, so I'd like to boil down this process to the simplest, most general approach.
*EDIT: Also, this
EDIT: Thank you everyone for the useful answers. My colleague tried MacPorts, which seems to work well, but has a few speedbumps. First we had to install Xcode from the system install disk. This is not a fast or lightweight install (several GB). Luckily we still had the disk! Once Xcode was installed, MacPorts was easy to install. Python and the python subpackages we needed were also easy to install, but he told me this installation took several hours. Presumably this delay is due to compilation? He had an easy time setting the MacPorts python as default. However, I think we have to change the 'Python Launcher' application by hand, this seems to still default to the system python.
Even though he has a working system now, I'm tempted to ask him to try one of the other solutions. I'm not sure all of my code's potential users will tolerate a multi-hour, multi-gigabyte installation.

I use brew to install all my libraries/compilers/interpreters.
To install python try this:
brew install python
Then add Python's binaries directory to your $PATH in your ~/.profile:
export PATH=`brew --prefix python`/bin:$PATH
I'd recommend you to install pip, virtualenv and virtualenvwrapper to have better control over your environment too.

Have you tried ActivePython?
It includes a package manager (PyPM) that, by default, installs into your home directory (eg: ~/Library/Python/2.7). Main scripts get symlinked in /usr/local/bin; use the included pythonselect to set the active Python version.
You don't have to bother installing .dmg packages, as PyPM is a binary package manager ... therefore you can install non-pure Python packages like NumPy without having to compile things yourself.
ActivePython can use Apple's Tcl/Tk or, if installed, ActiveTcl.
A "simplest, most general approach" in your documentation could be:
Install ActivePython 2.7
Open Terminal and type pypm-2.7 install matplotlib ipython

Using MacPorts, you can install python 2.6, 2.7, 3.1 and 3.2 at the same time, with their own packages, without ever touching the built-in python.
numpy, scipy, matplotlib, and ipython are also available as ports for most of those python versions.
Moreover, if you install the python_select port, you'll be able:
to choose which one of those (plus the built-in python) is the "default" python;
to install python packages through easy_install/pip for the "selected" python, if they're not available as ports.
Add virtualenv to the mix, and you'll have a very, very flexible Python development environment.
As for your questions:
Q1: with MacPorts, no. while not a frequent user, I've installed and used matplotlib in 2.6 and 2.7, switching between the two using python_select.
Q2: easy_install, pip, ipython will be "linked" to the python they were installed by. (but see tip 1)
Q3: it's easier to install one of the py{26,27,xx}-numpy ports, or pip install numpy under your python_select'ed python.
Q4: well, MacPorts is the best thing I know after APT on Debian/Ubuntu... :-)
Now, two tips if you try MacPorts:
MacPorts cleanly installs ports separately from the OS X installation, in an /opt/local directory, and each python version is installed in a /opt/local/Library/Frameworks/Python.framework/Versions/{2.5,2.6,2.7,...} directory. Using python_select cleanly switch the "python" command using links. BUT... the Versions/{2.5,2.6,2.7,...}/bin directory, where python scripts are installed, is not added to the PATH. Just adding: export PATH=/opt/local/Library/Frameworks/Python.framework/Versions/Current/bin:$PATH to your ~/.profile will always give you direct access to the scripts installed for the selected python.
to avoid bad surprises, I've added a echo Selected python is \"$(python_select -s)\" line to my ~/.profile, so I always know which is my currently selected python when opening a session... :-)
Regards,
Georges

In almost all cases, the best python to use is the one from http://python.org/. It sets up the paths correctly and doesn't overwrite anything. DMG package installs usually work automatically, as does python setup.py install, and it's not too hard to get setuptools to work. If you want per-user installs, it is easy to set up .pydistutils.cfg and python automatically recognizes the path install_lib = ~/Library/Python/$py_version_short/site-packages

An addendum regarding the usage of brew:
Since some time, brew install python will install python3.
If you intend to install python2, you want to use
brew install python#2
It is perfectly fine to install both python and python3 using brew!

Here is an old post that answers your questions too.
In general it is not a problem at all to have more than one python installation on your machine. You just have to watch out which one you are calling on the command line.
>> which python
... helps to identify where your python binary is located. The original Mac OS X python is usually at "/usr/bin/python"
I personally use the MacPorts python installation. It also supports you with the installation of modules. (see link above)

I have 4 versions of python on my MacBook Pro. 2 from the original install of OS X 10.6 and a subsequent update, then self installed copies of python 2.7 and 3.2. You can update the python command to point at any of the versions. They all install in separate directories and cause no problems with each other.
I'm not sure what will happen when you install from a .dmg file. I believe it will simply use whatever version python points to.
This post on superuser.com answers your questions on changing default paths.

Python packages installation in Windows

I recently began learning Python, and I am a bit confused about how packages are distributed and installed.
I understand that the official way of installing packages is distutils: you download the source tarball, unpack it, and run: python setup.py install, then the module will automagically install itself
I also know about setuptools which comes with easy_install helper script. It uses eggs for distribution, and from what I understand, is built on top of distutils and does the same thing as above, plus it takes care of any dependencies required, all fetched from PyPi
Then there is also pip, which I'm still not sure how it differ from the others.
Finally, as I am on a windows machine, a lot of packages also offers binary builds through a windows installer, especially the ones that requires compiling C/Fortran code, which otherwise would be a nightmare to manually compile on windows (assumes you have MSVC or MinGW/Cygwin dev environment with all necessary libraries setup.. nonetheless try to build numpy or scipy yourself and you will understand!)
So can someone help me make sense of all this, and explain the differences, pros/cons of each method. I'd like to know how each keeps track of packages (Windows Registry, config files, ..). In particular, how would you manage all your third-party libraries (be able to list installed packages, disable/uninstall, etc..)

I use pip, and not on Windows, so I can't provide comparison with the Windows-installer option, just some information about pip:
Pip is built on top of setuptools, and requires it to be installed.
Pip is a replacement (improvement) for setuptools' easy_install. It does everything easy_install does, plus a lot more (make sure all desired distributions can be downloaded before actually installing any of them to avoid broken installs, list installed distributions and versions, uninstall, search PyPI, install from a requirements file listing multiple distributions and versions...).
Pip currently does not support installing any form of precompiled or binary distributions, so any distributions with extensions requiring compilation can only be installed if you have the appropriate compiler available. Supporting installation from Windows binary installers is on the roadmap, but it's not clear when it will happen.
Until recently, pip's Windows support was flaky and untested. Thanks to a lot of work from Dave Abrahams, pip trunk now passes all its tests on Windows (and there's a continuous integration server helping us ensure it stays that way), but a release has not yet been made including that work. So more reliable Windows support should be coming with the next release.
All the standard Python package installation mechanisms store all metadata about installed distributions in a file or files next to the actual installed package(s). Distutils uses a distribution_name-X.X-pyX.X.egg-info file, pip uses a similarly-named directory with multiple metadata files in it. Easy_install puts all the installed Python code for a distribution inside its own zipfile or directory, and places an EGG-INFO directory inside that directory with metadata in it. If you import a Python package from the interactive prompt, check the value of package.__file__; you should find the metadata for that package's distribution nearby.
Info about installed distributions is only stored in any kind of global registry by OS-specific packaging tools such as Windows installers, Apt, or RPM. The standard Python packaging tools don't modify or pay attention to these listings.
Pip (or, in my opinion, any Python packaging tool) is best used with virtualenv, which allows you to create isolated per-project Python mini-environments into which you can install packages without affecting your overall system. Every new virtualenv automatically comes with pip installed in it.
A couple other projects you may want to be aware of as well (yes, there's more!):
distribute is a fork of setuptools which has some additional bugfixes and features.
distutils2 is intended to be the "next generation" of Python packaging. It is (hopefully) adopting the best features of distutils/setuptools/distribute/pip. It is being developed independently and is not ready for use yet, but eventually should replace distutils in the Python standard library and become the de facto Python packaging solution.
Hope all that helped clarify something! Good luck.

I use windows and python. It is somewhat frustrating, because pip doesn't always work to install things. Python is moving to pip, so I still use it. Pip is nice, because you can uninstall items and use
pip freeze > requirements.txt
pip install -r requirements.txt
Another reason I like pip is for virtual environments like venv with python 3.4. I have found venv a lot easier to use on windows than virtualenv.
If you cannot install a package you have to find the binary for it. http://www.lfd.uci.edu/~gohlke/pythonlibs/
I have found these binaries to be very useful.
Pip is trying to make something called a wheel for binary installations.
pip install wheel
wheel convert path\to\binary.exe
pip install converted_wheel.whl
You will also have to do this for any required libraries that do not install and are required for that package.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.