Best way to install python packages locally for development

Best way to install python packages locally for development - python

Being new to the python games I seem to have missed out on some knowledge on how you can develop on a program but also keep it in your live environment.
Programs like gpodder can be run directly from the source checkout which is really handy however others want to be "installed" to run.
A lot of programs are distributed with a setup.py with instructions to run "python ./setup.py install" as root which will put stuff somewhere in your file-system. There are even install commands like "develop" which seem to hold the promise of what I want. So I tried:
export PYTHONPATH=/home/alex/python
python ./setup.py develop --install-dir=/home/alex/python
Which downloaded a bunch of stuff locally and seems magically ensure the application I'm hacking on is still being run out of the src tree. So I guess my roundabout question is is this the correct way of developing python code? How do things like easy_install and pip factor into this?
So I tried the following:
python /usr/share/pyshared/virtualenv.py /home/alex/src/goobook
cd /home/alex/src/goobook/googbook.git
/home/alex/src/goobook/bin/python ./setup.py develop
And finally linked the program in question to my ~/bin
cd /home/alex/src/goobook
linkbin.pl bin/goobook
However invocation throws up a load of extra chatter which seems to imply it's wrong:
17:17 alex#socrates/i686 [goobook] >goobook --help
/home/alex/bin/goobook:5: UserWarning: Module pkg_resources was already imported from /home/alex/src/goobook/lib/python2.5/site-packages/setuptools-0.6c8-py2.5.egg/pkg_resources.py, but /home/alex/src/goobook/lib/python2.5/site-packages/distribute-0.6.10-py2.5.egg is being added to sys.path
from pkg_resources import load_entry_point
/home/alex/bin/goobook:5: UserWarning: Module site was already imported from /home/alex/src/goobook/lib/python2.5/site.pyc, but /home/alex/src/goobook/lib/python2.5/site-packages/distribute-0.6.10-py2.5.egg is being added to sys.path
from pkg_resources import load_entry_point

Install:
http://pypi.python.org/pypi/virtualenv
to set up a localized virtual environment for your libraries, and:
http://pypi.python.org/pypi/setuptools
i.e. "easy_install" to install new things.

Virtualenv allows you to work in completely independent and isolated Python environments. It will let you easily create multiple environments which have different Python packages installed or different versions of a same package. Virtualenv also lets you easily switch between your different environments.
As of 2012, the de facto preferred tool for package management in Python is pip rather than setuptools. Pip is able to handle dependencies and to install/uninstall globally or inside a virtual environment. Pip even comes out-of-the-box with virtualenv.
Python 3
Also worth mentioning is the fact that virtual environments are becoming a part of Python itself in release 3.3, with the implementation of PEP 405.

The Python Packaging User Guide, which "aims to be the authoritative resource on how to package, publish and install Python distributions using current tools", recommends using pip to install in "development mode":
pip install -e <path>
Thus in the root directory of your package you can simply
pip install -e .
See installing from a local source tree.

The best way to develop Python apps with dependencies is to:
Download desired version of the python interpreter.
Install and use buildout (http://www.buildout.org/).
Buildout is something like Maven for Java (will fetch all needed packages automatically).
This way your Python interpreter will not be polluted by third party packages (this is important if you will be running developed application on other machines). Additionally you can integrate buildout with virtualenv package (this allows you to create virtual python interpreters for each project).

Related

Installing packages in python and setting up the working environment

I've been coding with R for quite a while but I want to start learning and using python more for its machine learning applications. However, I'm quite confused as to how to properly install packages and set up the whole working environment. Unlike R where I suppose most people just use RStudio and directly install packages with install.packages(), there seems to be a variety of ways this can be done in python, including pip install conda install and there is also the issue of doing it in the command prompt or one of the IDEs. I've downloaded python 3.8.5 and anaconda3 and some of my most burning questions right now are:
When to use which command for installing packages? (and also should I always do it in the command prompt aka cmd on windows instead of inside jupyter notebook)
How to navigate the cmd syntax/coding (for example the python documentation for installing packages has this piece of code: py -m pip install "SomeProject" but I am completely unfamiliar with this syntax and how to use it - so in the long run do I also have to learn what goes on in the command prompt or does most of the operations occur in the IDE and I mostly don't have to touch the cmd?)
How to set up a working directory of sorts (like setwd() in R) such that my .ipynb files can be saved to other directories or even better if I can just directly start my IDE from another file destination?
I've tried looking at some online resources but they mostly deal with coding basics and the python language instead of these technical aspects of the set up, so I would greatly appreciate some advice on how to navigate and set up the python working environment in general. Thanks a lot!

Python uses a different way of installing packages. Python has a thing named venv which stands for Virtual Environment. You install all of your packages in venv. Usually for each new project you make a new venv.
By using Anaconda on windows you install everything within the anaconda environment that you have specified.
python -m pip install "modulename" is a command that will install modulename to your default venv. You will be able to use this module when no other venv is specified. Here is the docs page. And here is a tutorial on how to use venv
By default python uses the same directory you have your code in. e.g. C:/Users/me/home/mypythonfile.py will run in C:/Users/me/home/ and will be able to access files in this directory. However you can use ../ to navigate directories or you can specify an absolute path to file you want to open e.g. with open("C:/system32/somesystemfile.sys") as file

Going over the technical differences of conda and pip:
So Conda is a packaging tool and installer that aims to do more than what pip does; handle library dependencies outside of the Python packages as well as the Python packages themselves. Both have many similar functionalities as well, you can install packages or create virtual environments with both.
It is generally advisable to generally have both conda and pip installed since there are some packages which might not be available with conda but with pip and vice versa.
The commands to install in both the ways is easy enough, but one thing to keep in mind is that
conda stores packages in the anaconda/pkgs directory
pip stores it in directory under /usr/local/bin/ for a Unix-based system, or \Program Files\ for Windows
You can use both pip or conda inside the jupyter notebook, it will work just fine, but it may be possible that you get multiple versions of the same package.
Most of the times, you will use cmd only to install a module used in your code, or to create environments, py -m pip install "SomeProject" here basically means that the module "SomeProject" will be downloaded in base env.

You could think of conda as python with a variety of additional functionalities, such as certain pre-installed packages and tools, such as spyder and jupyter. Hence, you must be precise when you say:
I've downloaded python 3.8.5 and anaconda3
Does it mean you installed python in your computer and then also anaconda?
In general, or at least in my opinion, using anaconda has advantages for development, but typically you'll just use a simple python installation in production (if that applies to you).
Anaconda has it's own package registry/repository . When you call conda install <package>, it will search for the package there and install it if available. You would better search it first, for instance matplotlib.
pip is a package manager for the Python Package Index. pip also ships with anaconda. Hence, in an anaconda environment you may install packages from either sources (either using pip install or conda install). For instance, pandas from PyPI and pandas from conda. There is no guarantee that packages exist in both sources. You must either search it first or simply try it.
In your first steps, I would suggest you to stick to only one dev env (either simple python or anaconda, recommend the second). Because that simplifies the question: "which python and which pip is executed in the cmd line?". That said, those commands should work as expected in any terminal, it be a simple cmd or an embedded one like in PyCharm or VS Code.
You could inspect that by running (on windows and linux at least):
which python, which pip.
Honestly, this is a question/answer that falls outside the scope of SO and for more info you would better check official websites, such as for anaconda or search for python vs anaconda blogs.

install flask for python2 when python 3 is default

I am trying to install flask and a few other modules for Python2.
When I try to install them using command pip install flask, it installs these for Python3.
This has created major issues because things like django are not compatible with Python3.
When I want to run a program using Python2, I cant use any of these modules.
Question
How do I use pip to install modules into a specified version of Python?

Try:
python2.7 -m pip install flask

Python “Virtual Environments” allow Python packages to be installed in an isolated location for a particular application, rather than being installed globally.
Imagine you have an application that needs version 1 of LibFoo, but another application requires version 2. How can you use both these applications? If you install everything into /usr/lib/python2.7/site-packages (or whatever your platform’s standard location is), it’s easy to end up in a situation where you unintentionally upgrade an application that shouldn’t be upgraded.
Or more generally, what if you want to install an application and leave it be? If an application works, any change in its libraries or the versions of those libraries can break the application.
Also, what if you can’t install packages into the global site-packages directory? For instance, on a shared host.
In all these cases, virtual environments can help you. They have their own installation directories and they don’t share libraries with other virtual environments.
Currently, there are two common tools for creating Python virtual environments:
venv is available by default in Python 3.3 and later, and installs pip and setuptools into created virtual environments in Python 3.4 and later.
virtualenv needs to be installed separately, but supports Python 2.6+ and Python 3.3+, and pip, setuptools and wheel are always installed into created virtual environments by default (regardless of Python version).
The basic usage is like so:
Using virtualenv:
virtualenv <DIR>
source <DIR>/bin/activate
Using venv:
python3 -m venv <DIR>
source <DIR>/bin/activate
For more information, see the virtualenv docs or the venv docs.
Managing multiple virtual environments directly can become tedious, so the dependency management tutorial introduces a higher level tool, Pipenv, that automatically manages a separate virtual environment for each project and application that you work on.
https://packaging.python.org/tutorials/installing-packages/#creating-virtual-environments

What is the difference between venv, pyvenv, pyenv, virtualenv, virtualenvwrapper, pipenv, etc?

Python 3.3 includes in its standard library the new package venv. What does it do, and how does it differ from all the other packages that match the regex (py)?(v|virtual|pip)?env?

This is my personal recommendation for beginners: start by learning virtualenv and pip, tools which work with both Python 2 and 3 and in a variety of situations, and pick up other tools once you start needing them.
Now on to answer the question: what is the difference between these similarly named things: venv, virtualenv, etc?
PyPI packages not in the standard library:
virtualenv is a very popular tool that creates isolated Python environments for Python libraries. If you're not familiar with this tool, I highly recommend learning it, as it is a very useful tool.
It works by installing a bunch of files in a directory (eg: env/), and then modifying the PATH environment variable to prefix it with a custom bin directory (eg: env/bin/). An exact copy of the python or python3 binary is placed in this directory, but Python is programmed to look for libraries relative to its path first, in the environment directory. It's not part of Python's standard library, but is officially blessed by the PyPA (Python Packaging Authority). Once activated, you can install packages in the virtual environment using pip.
pyenv is used to isolate Python versions. For example, you may want to test your code against Python 2.7, 3.6, 3.7 and 3.8, so you'll need a way to switch between them. Once activated, it prefixes the PATH environment variable with ~/.pyenv/shims, where there are special files matching the Python commands (python, pip). These are not copies of the Python-shipped commands; they are special scripts that decide on the fly which version of Python to run based on the PYENV_VERSION environment variable, or the .python-version file, or the ~/.pyenv/version file. pyenv also makes the process of downloading and installing multiple Python versions easier, using the command pyenv install.
pyenv-virtualenv is a plugin for pyenv by the same author as pyenv, to allow you to use pyenv and virtualenv at the same time conveniently. However, if you're using Python 3.3 or later, pyenv-virtualenv will try to run python -m venv if it is available, instead of virtualenv. You can use virtualenv and pyenv together without pyenv-virtualenv, if you don't want the convenience features.
virtualenvwrapper is a set of extensions to virtualenv (see docs). It gives you commands like mkvirtualenv, lssitepackages, and especially workon for switching between different virtualenv directories. This tool is especially useful if you want multiple virtualenv directories.
pyenv-virtualenvwrapper is a plugin for pyenv by the same author as pyenv, to conveniently integrate virtualenvwrapper into pyenv.
pipenv aims to combine Pipfile, pip and virtualenv into one command on the command-line. The virtualenv directory typically gets placed in ~/.local/share/virtualenvs/XXX, with XXX being a hash of the path of the project directory. This is different from virtualenv, where the directory is typically in the current working directory. pipenv is meant to be used when developing Python applications (as opposed to libraries). There are alternatives to pipenv, such as poetry, which I won't list here since this question is only about the packages that are similarly named.
Standard library:
pyvenv (not to be confused with pyenv in the previous section) is a script shipped with Python 3.3 to 3.7. It was removed from Python 3.8 as it had problems (not to mention the confusing name). Running python3 -m venv has exactly the same effect as pyvenv.
venv is a package shipped with Python 3, which you can run using python3 -m venv (although for some reason some distros separate it out into a separate distro package, such as python3-venv on Ubuntu/Debian). It serves the same purpose as virtualenv, but only has a subset of its features (see a comparison here). virtualenv continues to be more popular than venv, especially since the former supports both Python 2 and 3.

I would just avoid the use of virtualenv after Python3.3+ and instead use the standard shipped library venv. To create a new virtual environment you would type:
$ python3 -m venv <MYVENV>
virtualenv tries to copy the Python binary into the virtual environment's bin directory. However it does not update library file links embedded into that binary, so if you build Python from source into a non-system directory with relative path names, the Python binary breaks. Since this is how you make a copy distributable Python, it is a big flaw. BTW to inspect embedded library file links on OS X, use otool. For example from within your virtual environment, type:
$ otool -L bin/python
python:
#executable_path/../Python (compatibility version 3.4.0, current version 3.4.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.0.0)
Consequently I would avoid virtualenvwrapper and pipenv. pyvenv is deprecated. pyenv seems to be used often where virtualenv is used but I would stay away from it also since I think venv also does what pyenv is built for.
venv creates virtual environments in the shell that are fresh and sandboxed, with user-installable libraries, and it's multi-python safe.
Fresh: because virtual environments only start with the standard libraries that ship with python, you have to install any other libraries all over again with pip install while the virtual environment is active.
Sandboxed: because none of these new library installs are visible outside the virtual environment, so you can delete the whole environment and start again without worrying about impacting your base python install.
User-installable libraries: because the virtual environment's target folder is created without sudo in some directory you already own, so you won't need sudo permissions to install libraries into it.
multi-python safe: because when virtual environments activate, the shell only sees the python version (3.4, 3.5 etc.) that was used to build that virtual environment.
pyenv is similar to venv in that it lets you manage multiple python environments. However with pyenv you can't conveniently rollback library installs to some start state and you will likely need admin privileges at some point to update libraries. So I think it is also best to use venv.
In the last couple of years I have found many problems in build systems (emacs packages, python standalone application builders, installers...) that ultimately come down to issues with virtualenv. I think python will be a better platform when we eliminate this additional option and only use venv.
EDIT: Tweet of the BDFL,
I use venv (in the stdlib) and a bunch of shell aliases to quickly switch.
— Guido van Rossum (#gvanrossum) October 22, 2020

UPDATE 20200825:
Added below "Conclusion" paragraph
I've went down the pipenv rabbit hole (it's a deep and dark hole indeed...) and since the last answer is over 2 years ago, felt it was useful to update the discussion with the latest developments on the Python virtual envelopes topic I've found.
DISCLAIMER:
This answer is NOT about continuing the raging debate about the merits of pipenv versus venv as envelope solutions- I make no endorsement of either. It's about PyPA endorsing conflicting standards and how future development of virtualenv promises to negate making an either/or choice between them at all. I focused on these two tools precisely because they are the anointed ones by PyPA.
venv
As the OP notes, venv is a tool for virtualizing environments. NOT a third party solution, but native tool. PyPA endorses venv for creating VIRTUAL ENVELOPES: "Changed in version 3.5: The use of venv is now recommended for creating virtual environments".
pipenv
pipenv- like venv - can be used to create virtual envelopes but additionally rolls-in package management and vulnerability checking functionality. Instead of using requirements.txt, pipenv delivers package management via Pipfile. As PyPA endorses pipenv for PACKAGE MANAGEMENT, that would seem to imply pipfile is to supplant requirements.txt.
HOWEVER: pipenv uses virtualenv as its tool for creating virtual envelopes, NOT venv which is endorsed by PyPA as the go-to tool for creating virtual envelopes.
Conflicting Standards:
So if settling on a virtual envelope solution wasn't difficult enough, we now have PyPA endorsing two different tools which use different virtual envelope solutions. The raging Github debate on venv vs virtualenv which highlights this conflict can be found here.
Conflict Resolution:
The Github debate referenced in above link has steered virtualenv development in the direction of accommodating venv in future releases:
prefer built-in venv: if the target python has venv we'll create the
environment using that (and then perform subsequent operations on that
to facilitate other guarantees we offer)
Conclusion:
So it looks like there will be some future convergence between the two rival virtual envelope solutions, but as of now pipenv- which uses virtualenv - varies materially from venv.
Given the problems pipenv solves and the fact that PyPA has given its blessing, it appears to have a bright future. And if virtualenv delivers on its proposed development objectives, choosing a virtual envelope solution should no longer be a case of either pipenv OR venv.
Update 20200825:
An oft repeated criticism of Pipenv I saw when producing this analysis was that it was not actively maintained. Indeed, what's the point of using a solution whose future could be seen questionable due to lack of continuous development? After a dry spell of about 18 months, Pipenv is once again being actively developed. Indeed, large and material updates have since been released.

Let's start with the problems these tools want to solve:
My system package manager don't have the Python versions I wanted or I want to install multiple Python versions side by side, Python 3.9.0 and Python 3.9.1, Python 3.5.3, etc
Then use pyenv.
I want to install and run multiple applications with different, conflicting dependencies.
Then use virtualenv or venv. These are almost completely interchangeable, the difference being that virtualenv supports older python versions and has a few more minor unique features, while venv is in the standard library.
I'm developing an /application/ and need to manage my dependencies, and manage the dependency resolution of the dependencies of my project.
Then use pipenv or poetry.
I'm developing a /library/ or a /package/ and want to specify the dependencies that my library users need to install
Then use setuptools.
I used virtualenv, but I don't like virtualenv folders being scattered around various project folders. I want a centralised management of the environments and some simple project management
Then use virtualenvwrapper. Variant: pyenv-virtualenvwrapper if you also use pyenv.
Not recommended
pyvenv. This is deprecated, use venv or virtualenv instead. Not to be confused with pipenv or pyenv.

Jan 2020 Update
#Flimm has explained all the differences very well. Generally, we want to know the difference between all tools because we want to decide what's best for us. So, the next question would be: which one to use? I suggest you choose one of the two official ways to manage virtual environments:
Python Packaging now recommends Pipenv
Python.org now recommends venv

pyenv - manages different python versions,
all others - create virtual environment (which has isolated python
version and installed "requirements"),
pipenv want combine all, in addition to previous it installs "requirements" (into the active virtual environment or create its own
if none is active)
So maybe you will be happy with pipenv only.
But I use: pyenv + pyenv-virtualenvwrapper, + pipenv (pipenv for installing requirements only).
In Debian:
apt install libffi-dev
install pyenv based on https://www.tecmint.com/pyenv-install-and-manage-multiple-python-versions-in-linux/, but..
.. but instead of pyenv-virtualenv install pyenv-virtualenvwrapper (which can be standalone library or pyenv plugin, here the 2nd option):
$ pyenv install 3.9.0
$ git clone https://github.com/pyenv/pyenv-virtualenvwrapper.git $(pyenv root)/plugins/pyenv-virtualenvwrapper
# inside ~/.bashrc add:
# export $VIRTUALENVWRAPPER_PYTHON="/usr/bin/python3"
$ source ~/.bashrc
$ pyenv virtualenvwrapper
Then create virtual environments for your projects (workingdir must exist):
pyenv local 3.9.0 # to prevent 'interpreter not found' in mkvirtualenv
python -m pip install --upgrade pip setuptools wheel
mkvirtualenv <venvname> -p python3.9 -a <workingdir>
and switch between projects:
workon <venvname>
python -m pip install --upgrade pip setuptools wheel pipenv
Inside a project I have the file requirements.txt, without fixing the versions inside (if some version limitation is not neccessary).
You have 2 possible tools to install them into the current virtual environment: pip-tools or pipenv. Lets say you will use pipenv:
pipenv install -r requirements.txt
this will create Pipfile and Pipfile.lock files, fixed versions are in the 2nd one. If you want reinstall somewhere exactly same versions then (Pipfile.lock must be present):
pipenv install
Remember that Pipfile.lock is related to some Python version and need to be recreated if you use a different one.
As you see I write requirements.txt. This has some problems: You must remove a removed package from Pipfile too. So writing Pipfile directly is probably better.
So you can see I use pipenv very poorly. Maybe if you will use it well, it can replace everything?
EDIT 2021.01: I have changed my stack to: pyenv + pyenv-virtualenvwrapper + poetry. Ie. I use no apt or pip installation of virtualenv or virtualenvwrapper, and instead I install pyenv's plugin pyenv-virtualenvwrapper. This is easier way.
Poetry is great for me:
poetry add <package> # install single package
poetry remove <package>
poetry install # if you remove poetry.lock poetry will re-calculate versions

As a Python newcomer this question frustrated me endlessly and confused me for months. Which virtual environment and package manager(s) should I invest in learning when I know that I will be using it for years to come?
The best article answering this vexing question is https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/ by Jake Vanderplas. Although a few years old, it provides practical answers and the history of Python package and virtual environment managers from the trenches as these state-of-the-art was developing.
It was particularly frustrating for me in the data science and "big data cloud computing" communities, because conda is widely used as a virtual environment manager and full function package manager for Python and JavaScript, SQL, Java, HTML5, and Jupyter Notebooks.
So why use pip at all, when conda does everything that pip and venv variants do?
The answer is, "because you MUST use pip if a conda package is simply not available." Many times a required package is only available in pip format and there is no easy solution but to use pip. You can learn to use conda build but if you are not the package maintainer, then you must convince the package owner to generate a conda package for each new release (or do it yourself.)
These pip-based packages differ along many important and practical dimensions:
stability
maturity
complexity
active support (versus dying or dead)
levels of adoption near the Python ecosystem "core" versus "on the
fringes" (i.e., integrated into Python.org distro)
easy to figure out and use (for beginners)
I will answer your question for two packages from dimension of package maturity and stability.
venv and virtualenv are the most mature, stability, and community support. From the online documentation you can see that virtualenv is in version 20.x as of today. virtualenv
virtualenv is a tool to create isolated Python environments. Since
Python 3.3, a subset of it has been integrated into the standard
library under the venv module. The venv module does not offer all
features of this library, to name just a few more prominent:
is slower (by not having the app-data seed method),
is not as extendable,
cannot create virtual environments for arbitrarily installed python versions (and automatically discover these),
is not upgrade-able via pip,
does not have as rich programmatic API (describe virtual environments without creating them).
virtualenvwrapper is set of scripts to help people use virtualenv (it is a "wrapper" that not well-maintained, its last update was in 2019. virtualenvwrapper
My recommendation is to avoid ALL pip virtual environments whenever possible. Use conda instead. Conda provides a unified approach. It is maintained by teams of professional open source developers and has a reputable company providing funding and a commercially supported version. The teams that maintain pip, venv, virtualenv, pipenv, and many other pip variants have limited resources by comparison. The pip virtual environment plurality is frustrating for beginners. The pip-based virtual environment tools complexity, fragmentation, fringe and unsupported packages, and wildly inconsistent support drove me to use conda. For data science work, my recommendation is that to use a pip-based virtual environment manager as a last resort when conda packages do not exist.
The differences between the venv variants still scare me because my time is limited to learn new packages. pipenv, venv, pyvenv, pyenv, virtualenv, virtualenvwrapper, poetry, and others have dozens of differences and complexities that take days to understand. I hate going down a path and find support for a package goes belly-up when a maintainer resigns (or gets too busy to maintain it). I just need to get my job done.
In the spirit of being helpful, here are a few links to help you dive in over your head, but not get lost in Dante's Inferno (re: pip).
A Guide to Python’s Virtual Environments
Choosing "core" Python packages to invest in for your career (long-term), versus getting a job done short term) is important. However, it is a business analysis question. Are you trying to simply get a task done, or a professional software engineer who builds scalable performant systems that require the least amount of maintenance effort over time? IMHO, conda will take you to the latter place more easily than dealing with pip-plurality problems. conda is still missing 1-step pip-package migration tools that make this a moot question. If we could simply convert pip packages into conda packages then pypi.org and conda-forge could be merged. Pip is necessary because conda packages are not (yet) universal. Many Python programmers are either too lazy to create conda packages, or they only program in Python and don't need conda's language-agnostic / multi-lingual support.
conda has been a god-send for me, because it supports cloud software engineering and data science's need for multilingual support of JavaScript, SQL, and Jupyter Notebook extensions, and conda plays well within Docker and other cloud-native environments. I encourage you to learn and master conda, which will enable you to side-step many complex questions that pip-based tools may never answer.
Keep it simple! I need one package that does 90% of what I need and guidance and workarounds for the 10% remaining edge cases.
Check out the articles linked herein to learn more about pip-based virtual environments.
I hope this is helpful to the original poster and gives pip and conda aficionados some things to think about.

I want to add docker into this list, as well as conda that several answer already mentioned.
conda is heavier than the virtual environments the title mentioned. It also give isolation on some system-python tools, such as ffmpeg or gpu drivers.
docker is even better, it gives you a whole new OS to play with. With a good Dockerfile and a docker build, docker run script, you have good documentation of how your environment is built, and it is easy to populate, migrate to other environment (staging, production, cloud). It helps you in the long run.
Another thing: PyCharm provides several options to select your virtual environment. It helps the new-comers not to worry about this thing. Recommend to use it before you know what the virtual environment is.

How to manage libraries in deployment

I run Vagrant on Mac OS X. I am coding inside a virtual machine with CentOS 6, and I have the same versions of Python and Ruby in my development and production environment. I have these restrictions:
I cannot manually install. Everything must come through RPM.
I cannot use pip install and gem install to install the libraries I want as the system is managed through Puppet, and everything I add will be removed.
yum has old packages. I usually cannot find the latest versions of the libraries.
I would like to put my libraries locally in a lib directory near my scripts, and create an RPM that includes those frozen versions of dependencies. I cannot find an easy way to bundle my libraries for my scripts and push everything into my production server. I would like to know the easiest way to gather my dependencies in Python and Ruby.
I tried:
virtualenv (with --relocatable option)
PYTHONPATH
sys.path.append("lib path")
I don't know which is the right way to go. Also for ruby, is there any way to solve my problems with bundler? I see that bundler is for rails. Does it work for custom small scripts?
I like the approach in Node.JS and NPM; all packages are stored locally in node_modules. I have nodejs rpm installed, and I deploy a folder with my application on the production server. I would like to do it this way in Ruby and Python.

I don't know Node, but what you describe for NPM seems to be exactly what a virtualenv is. Once the virtualenv is activated, pip installs only within that virtualenv - so puppet won't interfere. You can write out your current list of packages to a requirements.txt file with pip freeze, and recreate the whole thing again with pip install -r requirements.txt. Ideally you would then deploy with puppet, and the deploy step would involve creating or updating the virtualenv, activating it, then running that pip command.

Maybe take a look at Docker?
With Docker you could create a image of your specific environment and deploy that.
https://www.docker.com/whatisdocker/

Python ImportError while module is installed [Ubuntu]

I'd like to make a switch from Windows to Linux (Ubuntu) writing my python programs but I just can't get things to work. Here's the problem: I can see that there are quite the number of modules pre-installed (like numpy, pandas, matplotlib, etc.) in Ubuntu. They sit nicely in the /host/Python27/Lib/site-packages directory. But when I write a test python script and try to execute it, it gives me an ImportError whenever I try to import a module (for instance import numpy as np gives me ImportError: No module named numpy). When I type which python in the commandline I get the /usr/bin/python path. I think I might need to change things related to the python path, but I don't know how to do that.

You can use the following command in your terminal to see what folders are in your PYTHONPATH.
python -c "import sys, pprint; pprint.pprint(sys.path)"
I'm guessing /host/Python27/Lib/site-packages wont be in there (it doesn't sound like a normal python path. How did you install these packages?).
If you want to add folders to your PYTHONPATH then use the following:
export PYTHONPATH=$PYTHONPATH:/host/Python27/Lib/site-packages
Personally here are some recommendations for developing with Python:
Use virtualenv. It is a very powerful tool that creates sandboxed python environments so you can install modules and keep them separate from the main interpreter.
Use pip - When you've created a virtualenv, and activated it you can use pip install to install packages for you. e.g. pip install numpy will install numpy into your virtual environment and will be accessible from only this virtualenv. This means you can also install different versions for testing etc. Very powerful. I would recommend using pip to install your python packages over using ubuntu apt-get install as you are more likely to get the newer versions of modules (apt-get relies on someone packaging the latest versions of your python libraries and may not be available for as many libraries as pip).
When writing python scripts that you will make executable (chmod +x my_python_script.py) make sure you put #!/usr/bin/env python at the top as this will pick up the python interpreter in your virtual environment. If you don't (and put #!/usr/bin/python) then running ./my_python_script.py will always use the system python interpreter.

/host/Python27/Lib/site-packages is not a default python directory on linux installations as far as I am aware.
The normal python installation (and python packages) should be found under /usr/lib or /usr/lib64 depending on your processor architecture.
If you want to check where python is searching in addition to these directories you can use a terminal with the following command:
echo $PYTHONPATH
If the /host/Python27/Lib/site-packages path is not listed, attempt to use the following command and try it again:
export PYTHONPATH=$PYTHONPATH:host/Python27/Lib/site-packages
If this should work and you do not want to write this in a terminal every time you want to use these packages, simply put it into a file called .bashrc in your home folder (normally /home/<username>).

When installing other python libraries, specify the pip version you want to install it to, if it's python2 you use, then enter this syntax:
pip2 install <package>
For python3
pip3 install <package>

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.