For research related Python programs, I require Python 2.6 (or 2.7), numpy, scipy and matplotlib. Occasionally, I'm going to use other modules such as mayavi2 or numexpr.
The programs in questions will be exchanged between (Ubuntu) Linux and Windows and can be modified to work across platforms. The setup on the Windows side should resemble the Linux one as closely as possible. Integration with COM, .NET, or the Windows OS is not required.
I'm aware of the following options:
Python for Windows from python.org
ActivePython
Python(x,y), a bundle of Python with modules and GUIs
Enthought Python distribution, a bundle of Python with modules
Which of those will provide me most efficiently with a setup that just works? And how would they differ?
EDIT 2017-11-4: At this point in time Anaconda seems by far to be the best option. It is multi-platform, doesn't require admin/root permission, and it allows you to install multiple python versions in parallel.
Original post
The easiest way to install all the python libraries necessary for scientific computing is to install either Python(x,y) or Enthought Python Distribution (EPD). Both offer a fairly similar set of packages (including numexpr and mayavi2), so it's probably just a matter of personal preference. I prefer Python(x,y) because it is fully open source, whereas EPD is a commercial product with a free edition. You can compare the included packages for EPD and Python(x,y).
Both these options are much better than using the standard python (or ActiveState) then manually installing all the required scientific packages. Both should work well with code transported from Linux. It's worth mentioning that EPD also has a Linux version, so if you need all packages and versions to be absolutely identical between Windows and Linux setups, this might be the way to go.
Edit:The win32-superpack is a good option if you just want a few basic scientific packages, but if you want more complex things like mayavi, you'd need to install them yourself.
Edit 2013-05-03:
There are now a couple of other options which are also worth considering: winpython and anaconda
The question is old, but the answer nowadays to this question will always be Anaconda, so I thought I provide you a link to it.
Not only for scientfic purposes - it has all the libraries and tools for this, installed - but its also the best python distribution in general:
https://www.anaconda.com/download/
I have used the win32-superpack from the official SciPy distribution. It includes Python, NumPy, SciPy, matplotlib, etc. and everything works out of the box.
Maybe I should also comment on the packages on your list:
The standard Python distribution from Python.org does not include SciPy, as far as I know.
The Enthought distribution is installed on some of the computer clusters that I am using. It is linked against the Intel MKL and could be faster for linear algebra than the SciPy one. But it is a commercial package. Python programs developed with the SciPy distribution should, however, run without problems under Enthought, too.
I don't know anything about the other two distributions.
Related
up until recently I have only worked with one version of Python and used virtual environments every now and then. Now, I am working with some libraries that require older version of Python. So, I am very confused. Could anyone please clear up some of my confusion?
How do I install multiple Python versions?
I initially had Python version 3.8.x but upgraded to 3.10.x last month. There is currently only that one version on my PC now.
I wanted to install one of the Python 3.8.x version and went to https://www.python.org/downloads/. It lists a lot of versions and subversions like 3.6, 3.7, 3.8 etc. etc. with 3.8.1, 3.8.2 till 3.8.13. Which one should I pick?
I actually went ahead with 3.8.12 and downloaded the Tarball on the page: https://www.python.org/downloads/release/python-3812/
I extracted the tarball (23.6MB) and it created a folder with a setup.py file.
Is Python 3.8.12 now installed? Clicking on the setup.py file simply flashes the terminal for a second.
I have a few more questions. Hopefully, they won't get me downvoted. I am just confused and couldn't find proper answers for them.
Why does Python have such heavy dependency on the exact versions of libraries and packages etc?
For example, this question
How can I run Mozilla TTS/Coqui TTS training with CUDA on a Windows system?. This seems very beginner unfriendly. Slightly mismatched package
version can prevent any program from running.
Do virtual environments copy all the files from the main Python installation to create a virtual environment and then install specific packages inside it? Isn't that a lot of wasted resources in duplication because almost all projects require there own virtual environment.
Your questions depend a bit on "all the other software". For example, as #leiyang indicated, the answer will be different if you use conda vs just pip on vanilla CPython (the standard Windows Python).
I'm also going to assume you're actually on Windows, because on Linux I would recommend looking at pyenv. There is a pyenv-win, which may be worth looking into, but I don't use it myself because it doesn't play as nice if you also want (mini)conda environments.
1. (a) How do I install multiple Python versions?
Simply download the various installers and install them in sensible locations. E.g. "C:\Program Files\Python39" for Python 3.9, or some other location where you're allowed to install software.
Don't have Python add itself to the PATH though, since that'll only find the last version to do so and can really confuse things.
Also, you probably want to use virtual environments consistently, as this ties a specific project very clearly to a specific Python version, avoiding future confusion or problems.
1. (b) "3.8.1, 3.8.2 till 3.8.13" which should I pick?
Always pick the latest 3.x.y, so if there's a 3.8.13 for Windows, but no 3.8.14, pick that. Check if the version is actually available for your operating system, sometimes there are later versions for one OS, but not for another.
The reason is that between a verion like 3.6 and 3.7, there may be major changes that change how Python works. Generally, there will be backwards compatibility, but some changes may break how some of your packages work. However, when going up a minor version, there won't be any such breaking changes, just fixes and additions that don't get in the way of what was already there. A change from 2.x to 3.x only happens if the language itself goes through a major change, and rarely happens (and perhaps never will again, depending on who you ask).
An exception to the "no minor version change problems" is of course if you run some script that very specifically relies on something that was broken in 3.8.6, but no fixed in 3.8.7+ (as an example). However, that's very bad coding, to rely on what's broken and not fixing it later, so only go along with that if you have no other recourse. Otherwise, just the latest minor version of any version you're after.
Also: make sure you pick the correct architecture. If there's no specific requirement, just pick 64-bit, but if your script needs to interact with other installed software at the binary level, it may require you to install 32-bit Python (and 32-bit packages as well). If you have no such requirement, 64-bit allows more memory access and has some other benefits on modern computers.
2. Why does Python have such heavy dependency on the exact versions of libraries and packages etc?
It's not just Python, this is true for many languages. It's just more visible to the end user for Python, because you run it as an interpreted language. It's only compiled at the very last moment, on the computer it's running on.
This has the advantage that the code can run on a variety of computers and operating systems, but the downside that you need the right environment where you're running it. For people who code in languages like C++, they have to deal with this problem when they're coding, but target a much smaller number of environments (although there's still runtimes to contend with, and DirectX versions, etc.). Other languages just roll everything up into the program that's being distributed, while a Python script by itself can be tiny. It's a design choice.
There are a lot of tools to help you automate the process though and well-written packages will make the process quite painless. If you feel Python is very shakey when it comes to this, that's probable to blame on the packages or scripts you're using, not really the language. The only fault of the language is that it makes it very easy for developers to make such a mess for you and make your life hard with getting specific requirements.
Look for alternatives, but if you can't avoid using a specific script or package, once you figure out how to install or use it, document it or better yet, automate it so you don't have to think about it again.
3. Do virtual environments copy all the files from the main Python installation to create a virtual environment and then install specific packages inside it? Isn't that a lot of wasted resources in duplication because almost all projects require there own virtual environment.
Not all of them, but quite a few of them. However, you still need the original installation to be present on the system. Also, you can't pick up a virtual environment and put it somewhere else, not even on the same PC without some careful changes (often better to just recreate it).
You're right that this is a bit wasteful - but this is a difficult choice.
Either Python would be even more complicated, having to manage many different version of packages in a single environment (Java developers will be able to tell you war stories about this, with their dependency management - or wax lyrically about it, once they get it themselves).
Or you get what we have: a bit wasteful, but in the end diskspace is a lot cheaper than your time. And unlike your time, diskspace is almost infinitely expandable.
You can share virtual environments between very similar projects though, but especially if you get your code from someone else, it's best to not have to worry and just give up a few dozen MB for the project. On the upside: you can just delete a virtual environment directory and that pretty much gets rid of the whole things. Some applications like PyCharm may remember that it was once there, but other than that, that's the virtual environment gone.
Just install them. You can have any number of Python installations side by side. Unless you need to have 2 different minor versions, for example 3.10.1 and 3.10.2, there is no need to do anything special. (And if you do need that then you don't need any advice.) Just set up separate shortcuts for each one.
Remember you have to install any 3rd-party libraries you need in each version. To do this, navigate to the Scripts folder in the version you want to do the install in, and run pip from that folder.
Python's 3rd-party libraries are open-source and come from projects that have release schedules that don't necessarily coincide with Python's. So they will not always have a version available that coincides with the latest version of Python.
Often you can get around this by downloading unofficial binaries from Christoph Gohlke's site. Google Python Gohlke.
Install Python using the windows executable installers from python.org. If the version is 3.x.y, use the highest y that has a windows executable installer. Unless your machine is very old, use the 64-bit versions. Do not have them add python to your PATH environment variable, but in only one of the installs have it install the python launcher py. That will help you in using multiple versions. See e.g. here.
Python itself does not. But some modules/libraries do. Especially those that are not purely written in Python but contain extensions written in C(++). The reason for this is that compiling programs on ms-windows can be a real PITA. Unlike UNIX-like operating systems with Linux, ms-windows doesn't come with development tools as standard. Nor does it have decent package management. Since the official Python installers are built with microsoft tools, you need to use those with C(++) extensions as well. Before 2015, you even had to use exactly the same version of the compiler that Python was built with. That is still a good idea, but no longer strictly necessary. So it is a signigicant amount of work for developers to release binary packages for each supported Python version. It is much easier for them to say "requires Python 3.x".
I followed the advice of most pythonistas and set up a different version of Python with which to play than the one that comes built into Mac OS X. After scanning around, it seemed like the best way to handle things was to use homebrew, and then to follow up with pip. All was good up through numpy, and then things went bad. I can't get scipy to install nor matplotlib. After searching here at StackOverflow and trying a number of solutions, I finally stumbled across Chris Fonnesbeck's "Scipy Superpack", which promises to:
install recent 64-bit builds of Numpy (1.8) and Scipy (0.12), Matplotlib (1.3), iPython (0.14), Pandas (0.10), Statsmodels (0.5.0), Scikit-Learn 0.13, as well as PyMC (2.2) for OS X 10.8 (Mountain Lion) on Intel Macintosh.
That all sounds great to my noobie ears, but when I look at the install script, install_superpack.sh, it seems to be directing things to work with the system's version of python:
#!/bin/sh
PYTHON='/usr/bin/python'
GIT_FILENAME='git-1.7.7.3-intel-universal-snow-leopard'
GIT_VOLUME='/Volumes/Git 1.7.7.3 Snow Leopard Intel Universal/'
GFORTRAN='gcc-42-5666.3-darwin11.pkg'
SUDO='sudo'
Should I change the PYTHON variable above or leave it be and make adjustments to the PYTHON ENVIRONMENT (yes?! No!?) thingamabob I have read about elsewhere? What else, if anything, should I edit? Or should I just back away from this script since I clearly am out of my depth?
I should note that I would dearly love to get matplotlib running on my machine because I'd like to play with making histograms for some text analysis I am pursuing.
The superpack was compiled for the Apple python. It might work with your python from homebrew, but it's not recommended.
And by the way, when you say:
it seemed like the best way to handle things was to use homebrew, and then to follow up with pip
If this was true, then there wouldn't be a ton of questions here of people having trouble installing scipy with homebrew/pip. Homebrew and pip are great for minimalistic, pure python packages. But they stumble spectacularly with scipy or packages that require external non-python packages.
With Macports now having a buildbot for OSX 10.8, I personally see no reason why anyone would want to bother with homebrew/pip for a scientific python install. With a good internet connection it will take minutes to install a full setup, and you can have matplotlib with as many backends as you want.
I'm currently an undergrad Electrical Engineering student. I've been using MATLab for a while, but have grown weary of its syntax and subtleties. I've been trying to find an alternative, and after much searching have found Enthought. Since I'm a student, I can install the academic version of EPD; but, looking at the modules it contains, I'm wondering if I'd really need everything I'd get with that distribution. My question is, would EPDFree suffice for undergraduate study? Or am I better off with the academic version? In either case, should I install the libraries in the distribution separately (i.e. without installing EPD) if I can? Or, should I just go with EPD distribution? I primarily use the Ubuntu Linux distribution if that helps. Thanks to all in advance!
The ultimate answer mainly depends on your specific needs. EPD is really neat and it contains a lot of packages. I doubt that you'll use all of them. During your studies you will most probably use Numpy, Scipy, Matplotlib and ipython for matrix manipulations, solving linear systems, visualisation etc. Installing these packages separately on Ubuntu (with aptitude for example) is as effortless as installing EPD.
I would say that start by installing these packages separately from the Ubuntu repository, try them out, learn more about them, try to get comfortable in the environment. As your studies evolve you'll see what tools you lack, you need more and you can reconsider using EPD.
Enthought now provides the full EPD (not just the free EPD) to all academics. All you need is an EDU mail address. You can sign up at http://www.enthought.com/products/edudownload.php and you will receive a username/password that allows you to install and upgrade packages easily, just like a paid subscriber.
I am not a regular Linux user so this might be completely trivial question. I am running 6.2 PUIAS version i386_64 on one of my GPU based "super" computers due to the unavailability of NVidia drivers for NetBSD. The installed version of Python is 2.6.6. I need 2.7.2 Python and newer version of scipy, numpy, matlibplot and friends. I have PUIAS and EPEL repositories enabled. However they do not have newer versions of Python. What is the "recommended" way to install newer version of Python without braking the system which depends on it. I am not interested in Python 3.2 due to the lack of libraries for scientific computing.
When the install-Python-from-source routine tells you to use make install, type make altinstall instead. This will leave the normal python executable untouched and instead create python2.7 for you to use. Install the other packages from source using this new executable. Don't forget to change the shebang line in your scripts accordingly.
I am going to answer my own question. For people who are using Python for scientific computing on RedHat clones (PUIAS for example) the easiest way to get all they need is to use rpm package manager and Enthought Python Distribution (EPD for short). EPD installs everything in a sandbox so system tools which are based on an obsolete version of Python are not massed up. However, paths have to be adjusted for system or even easier on the user base so that the using shell invokes non-system tools. One should never compile Python from source unless you are interesting in Python itself or in porting it to your favorite operating system rather than in your own research!
I bought a low-end MacBook about a month ago and am finally getting around to configuring it for Python. I've done most of my Python work in Windows up until now, and am finding the choices for OS X a little daunting. It looks like there are at least five options to use for Python development:
"Stock" Apple Python
MacPython
Fink
MacPorts
roll-your-own-from-source
I'm still primarily developing for 2.5, so the stock Python is fine from a functionality standpoint. What I want to know is: why should I choose one over the other?
Update:
To clarify, I am looking for a discussion of the various options, not links to the documentation. I've marked this as a Community Wiki question, as I don't feel there is a "correct" answer. Thanks to everyone who has already commented for their insight.
One advantage I see in using the "stock" Python that's included with Mac OS X is that it makes deployment to other Macs a piece of cake. I don't know what your deployment scenario is, but for me this is important. My code has to run on any number of Macs at work, and I try to minimize the amount of work it takes to run my code on all of those systems.
I would highly recommend using MacPorts with Porticus for managing your Python installation. It takes a while to build everything, but the advantage is that whatever you build yourself will be built against the same libraries, so you won't have to futz around with statically linked shared objects, etc. if you want your Python stuff to work with Apache, PostgreSQL, etc.
If you choose to go this way, remember to install the python_select port and use it to make your system use the Python installed from MacPorts.
As an added bonus, MacPorts has packages for most main-stream Python eggs, so if you should be able to have MacPorts keep you up-to-date with the latest versions of all that stuff :)
Here's some helpful info to get you started. http://www.python.org/download/mac/
Depends what you are using python for. If you are using MacOS funitionality and things like PyObjC you are probably best of with MacPython or the python provided by Apple.
I use Python on my Mac mostly for development of server side applications which later will run on FreeBSD & Linux boxes. For that I have used fink python for a few years and ever since MacPorts python. With mac ports it is simple to add required c modules (like database driver etc). It's also easy to keep two python Versions (2.5 & 2.6 in my case) around.
I used "compile your own" python to test pre-3.0 python but generally I find managing dependencies to c modules painfull if done by hand.
Thanks to easy_install installing pure python modules is fast and easy for all the options mentioned above.
I was never very much an IDE person. For development I use command line subversion installed by MacPorts, Textmate and occasionaly Expandrive do directly access files on servers. I personally are very dependent on Bicyclerepairman for Textmade to handle my refactoring needs.
Others seem to be very happy with Eclipse & Pydev.
How about EPD from Enthought? Yes, it's large but it is a framework build and includes things like wxPython, vtk, numpy, scipy, and ipython built-in.
I recommend using Python Virtual environments, especially if you use a Timecapsule because Timecapsule will back everything up, except modules you added to Python!
Based on the number of bugs and omissions people have been encountering in Leopard python (just here on SO!), I couldn't recommend that version. e.g., see:
Why do I get wrong results for hmac in Python but not Perl?
Problems on select module on Python 2.5
I would choose MacPorts.
It does not eliminate your existing python supplied by Apple since it installs by default in /opt/local/bin (plays nice with it) and plus it is easy to download and install additional python modules (even binary modules that you need to compile!). I use Porticus GUI to maintain my MacPorts installed list of packages, including python.
In my windows environment I use Eclipse and PyDev, which works quite well together, even if it's a bit sparse. Apparently the exact same environment is available for the Mac as well, so I suggest downloading Eclipse and using the internal update software function to update PyDev with the URL http://pydev.sourceforge.net/updates/. To look further into PyDev, look here.
Apple's supplied python is quite old – my tiger install has 2.3.5. This may not be a problem for you, but you would be missing out on a lot. Also, there is a risk that Apple will update it. I'm not sure if moving from 2.3.5 to (say) 2.4 would cause code to break, but I guess it's possible. This happened to perl people recently: http://developers.slashdot.org/article.pl?sid=09/02/18/1435227
Macpython is a framework build (as is Apple's, I believe). To be honest, I'm not sure exactly what that means, but it's a prerequisite for some modules, in particular wxPython. If you get python from macports or fink, you will not be able to run wxPython (unless you run it through X11).
And guess what was forgotten by every answer here ... ActivePython.
No compilation required, even for third-party modules such as numpy, lxml, pyqt and thousands of others.
I recommend python (any python?) plus the ipython shell. My most recent experience with MacPython was MacPython 2.5, and I found IDLE frustrating to use as an editor. It's not very featureful, and its' very slow to scroll large quantities of output.