Does miniconda installation affect standard python installation? - python

I first installed Python using the standard distribution available on the official website, and I have been using pip to install all necessary packages.
However, I now wish to use Miniconda, since it is a better choice for data science.
But it installs Python along with it, and I don't want to disturb my earlier pip + Python setup.
Will installing Miniconda affect my Python installation?
Is there a way of installing it without disturbing the Python installation?
I am on a Windows operating system.

You can safely install Anaconda (or Miniconda) alongside other Python installations. It goes into a completely different folder on your local disk. Just leave the installer options at their defaults; in particular, don't add its Python to your PATH.
The important thing is that you activate your environment before you use it via
conda activate
and then start Python from there (or let your IDE do that for you).
(base)> python
Without activation, conda doesn't work, and calling python from the command prompt will start your 'standard installation' again.
The advantage of Anaconda is that it guarantees maximum consistency for the 'scientific stack', and if you are still missing some third-party packages you can always install them additionally via `pip install` into an activated conda environment.
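To confirm which installation a given prompt or IDE is actually using, a small check like the following can help. It is only a sketch: the helper name describe_interpreter is made up for illustration, and it relies on the CONDA_DEFAULT_ENV environment variable, which conda sets when an environment is activated.

```python
import os
import sys


def describe_interpreter():
    """Report which Python is running and whether a conda env is active.

    describe_interpreter is a hypothetical helper name used for this sketch.
    """
    return {
        # Full path of the interpreter executing this code.
        "executable": sys.executable,
        # Root folder of the installation or environment in use.
        "prefix": sys.prefix,
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Run from a plain command prompt, the executable should point at the standard installation; run after conda activate, it should point into the Miniconda folder.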

Related

Installing packages in python and setting up the working environment

I've been coding with R for quite a while but I want to start learning and using python more for its machine learning applications. However, I'm quite confused as to how to properly install packages and set up the whole working environment. Unlike R, where I suppose most people just use RStudio and directly install packages with install.packages(), there seems to be a variety of ways this can be done in python, including pip install and conda install, and there is also the issue of doing it in the command prompt or in one of the IDEs. I've downloaded python 3.8.5 and anaconda3 and some of my most burning questions right now are:
When to use which command for installing packages? (and also should I always do it in the command prompt aka cmd on windows instead of inside jupyter notebook)
How to navigate the cmd syntax/coding (for example the python documentation for installing packages has this piece of code: py -m pip install "SomeProject" but I am completely unfamiliar with this syntax and how to use it - so in the long run do I also have to learn what goes on in the command prompt or does most of the operations occur in the IDE and I mostly don't have to touch the cmd?)
How to set up a working directory of sorts (like setwd() in R) such that my .ipynb files can be saved to other directories or even better if I can just directly start my IDE from another file destination?
I've tried looking at some online resources but they mostly deal with coding basics and the python language instead of these technical aspects of the set up, so I would greatly appreciate some advice on how to navigate and set up the python working environment in general. Thanks a lot!
Python uses a different way of installing packages. Python has a thing named venv which stands for Virtual Environment. You install all of your packages in venv. Usually for each new project you make a new venv.
By using Anaconda on windows you install everything within the anaconda environment that you have specified.
python -m pip install "modulename" is a command that installs modulename into the environment of whichever python runs it: the active venv if one is activated, otherwise your default installation. In the latter case you will be able to use the module whenever no venv is active. Here is the docs page. And here is a tutorial on how to use venv
By default, Python resolves relative paths against the current working directory, which is the directory you start it from. For example, if you run C:/Users/me/home/mypythonfile.py from within C:/Users/me/home/, it will be able to access files in that directory by name. You can also use ../ to move up a directory, or specify an absolute path to the file you want to open, e.g. with open("C:/system32/somesystemfile.sys") as file
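As a rough equivalent of setwd() in R, the working directory can be inspected and changed from inside Python with the standard library. The sketch below uses a throwaway temporary folder and a made-up file name (notes.txt); read_relative is a hypothetical helper, not part of any library.

```python
import os
import tempfile
from pathlib import Path


def read_relative(directory, filename):
    """Switch the working directory, then open a file by its relative name."""
    previous = Path.cwd()          # like getwd() in R
    os.chdir(directory)            # like setwd() in R
    try:
        # The relative name is resolved against the new working directory.
        with open(filename) as handle:
            return handle.read()
    finally:
        os.chdir(previous)         # restore the original working directory


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "notes.txt").write_text("hello")
        print(read_relative(tmp, "notes.txt"))
```

The same os.chdir call works inside a Jupyter notebook cell, although passing full paths built with pathlib is often less error-prone than changing directories.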
Going over the technical differences of conda and pip:
Conda is a packaging tool and installer that aims to do more than pip does: it handles library dependencies outside of the Python packages as well as the Python packages themselves. The two also overlap a lot; you can install packages with both, and you can create isolated environments either with conda alone or with pip together with venv.
It is generally advisable to have both conda and pip installed, since some packages are available through conda but not through pip, and vice versa.
The commands to install either way are easy enough, but one thing to keep in mind is that
conda stores packages in the anaconda/pkgs directory
pip stores them in the site-packages directory of the Python installation or environment that runs it; the exact location depends on the operating system and on how Python was installed
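Rather than guessing the folder, you can ask the interpreter where pip would place packages for it. This is a minimal sketch using the standard sysconfig module; install_locations is a hypothetical helper name.

```python
import sys
import sysconfig


def install_locations():
    """Return the folders this interpreter uses for installed packages."""
    paths = sysconfig.get_paths()
    return {
        "interpreter": sys.executable,
        # Pure-Python packages installed by pip end up here (site-packages).
        "site_packages": paths["purelib"],
        # Command-line entry points such as pip itself are placed here.
        "scripts": paths["scripts"],
    }


if __name__ == "__main__":
    for key, value in install_locations().items():
        print(f"{key}: {value}")
```

Running it once with the standard Python and once inside an activated conda environment shows two different site-packages folders, which is why a package installed in one is not visible in the other.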
You can use both pip or conda inside the jupyter notebook, it will work just fine, but it may be possible that you get multiple versions of the same package.
Most of the time, you will use cmd only to install a module used in your code or to create environments. py -m pip install "SomeProject" basically means that py, the Python launcher for Windows, runs pip with the Python installation it selects, so "SomeProject" is downloaded and installed into that installation.
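When installing from inside a notebook or script, one common way to make sure the package lands in the same Python that is running the code is to invoke pip through sys.executable. The following is a sketch, not an official recipe; pip_install_command and pip_install are hypothetical helper names, and "SomeProject" is the placeholder from the Python documentation rather than a real package.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the equivalent of `python -m pip install <package>`
    for the interpreter that is currently running."""
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package):
    """Run the install; raises CalledProcessError if pip fails."""
    return subprocess.run(pip_install_command(package), check=True)


if __name__ == "__main__":
    # Only print the command here; call pip_install(...) to actually run it.
    print(" ".join(pip_install_command("SomeProject")))
```

Building the command from sys.executable avoids the question of which pip happens to be first on the PATH.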
You could think of conda as python with a variety of additional functionalities, such as certain pre-installed packages and tools, such as spyder and jupyter. Hence, you must be precise when you say:
I've downloaded python 3.8.5 and anaconda3
Does it mean you installed python in your computer and then also anaconda?
In general, or at least in my opinion, using anaconda has advantages for development, but typically you'll just use a simple python installation in production (if that applies to you).
Anaconda has its own package registry/repository. When you call conda install <package>, it will search for the package there and install it if available. It is best to search for the package first, for instance matplotlib.
pip is a package manager for the Python Package Index. pip also ships with anaconda. Hence, in an anaconda environment you may install packages from either source (using either pip install or conda install). For instance, pandas from PyPI and pandas from conda. There is no guarantee that packages exist in both sources. You must either search for the package first or simply try it.
In your first steps, I would suggest you stick to only one dev env (either simple python or anaconda; I recommend the second), because that simplifies the question: "which python and which pip is executed in the cmd line?". That said, those commands should work as expected in any terminal, be it a simple cmd or an embedded one like in PyCharm or VS Code.
You could inspect that by running the following (which on Linux and macOS, where in the Windows command prompt):
which python, which pip
where python, where pip
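The same lookup can be done from Python itself, which works identically on Windows and Linux. This is a small sketch using shutil.which from the standard library; which_tools is a hypothetical helper name.

```python
import shutil
import sys


def which_tools(names=("python", "pip")):
    """Map each command name to the executable the PATH resolves it to.

    A value of None means the command is not on the PATH at all.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("running interpreter:", sys.executable)
    for name, location in which_tools().items():
        print(f"{name} -> {location}")
```

If the python found on the PATH differs from the running interpreter, the terminal and the IDE are probably using different installations.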
Honestly, this is a question/answer that falls outside the scope of SO and for more info you would better check official websites, such as for anaconda or search for python vs anaconda blogs.

Should Anaconda be used to manage system python? Or is it just for creating isolated environments?

I already have Python 2.7 installed in Windows. I have normally used pip to install packages. However, Pandas recommends using Anaconda and it appears that it has many benefits so I wanted to try it.
I installed miniconda and it just reinstalled Python under its own directory. Does Anaconda always duplicate the python libraries, or can it be used to manage the system's python?
I use python to develop and also wanted to use Pandas to analyse data. However, I would like to avoid having two copies of Python. I want to have one python environment that is constant, with all the packages that I intend to have. Otherwise, I feel that I will have to install the same packages multiple times.
I know that Anaconda is meant to separate different environments. Does this mean that I am trying to use it for something that is not its purpose, or have I installed it incorrectly?
Anaconda has a root environment that includes a bit more than 100 of the most popular Python packages.
Yes, you can use the root Python as your system's Python executable.
The anaconda installation comes with Conda, which is a robust environment manager. If you want to keep your root environment stable, you can use Conda to create new environments for each project, and Conda handles the dependencies of each environment as well.
You can create a new environment named "analysis" that has Python, IPython, and Pandas using:
conda create --name analysis python ipython pandas
After installing all of the packages, you can use the environment by running (from the CMD prompt):
conda activate analysis
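To check from a script that an environment such as "analysis" was actually created, the output of conda env list --json can be parsed. The sketch below assumes that this command prints a JSON object with an "envs" list of environment folders; the helper names and the sample paths in the usage line are made up for illustration.

```python
import json
import os
import shutil
import subprocess


def parse_env_names(json_text):
    """Extract folder names from the JSON printed by `conda env list --json`.

    Note: the root environment shows up under its install folder name
    (for example "miniconda3"), not as "base".
    """
    data = json.loads(json_text)
    return [os.path.basename(path) for path in data.get("envs", [])]


def list_conda_envs():
    """Return environment names, or an empty list if conda is not on PATH."""
    conda = shutil.which("conda")
    if conda is None:
        return []
    result = subprocess.run(
        [conda, "env", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return parse_env_names(result.stdout)


if __name__ == "__main__":
    print(list_conda_envs())
```

For example, "analysis" in list_conda_envs() tells you whether the environment from the command above exists on this machine.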

What is the relationship between a python virtual environment and specific system libraries?

We have an application which does some of its work in Python in a python virtual environment setup using virtualenv.
We've hit a problem where the version of a system library does not match the version expected by the virtual environment. That is, we have NetCDF4 installed into the virtual environment and previously had libnetcdf.so.7 installed through yum. The python package appears to be dependent on having libnetcdf.so.7 available.
Due to a system update libnetcdf.so.7 no longer exists and has been replaced by libnetcdf.so.11.
So the question is this: Does setting up the virtual environment detect the system library version or is there some other mechanism? Also do we need to re-build the environment to fix this or is there another option?
When you use virtualenv to create a virtual environment, you have the option of whether or not to include the standard site packages as part of the environment. Excluding them is now the default behaviour (though it can be asserted by using --no-site-packages on the command line), so it's possible that you are using an older version of virtualenv that doesn't insist on this.
In that case you should be able to re-create the environment fairly easily. First of all, capture the currently installed packages in the existing environment with the command
pip freeze > /tmp/requirements.txt
Then delete the virtual environment, and re-create it with the following commands:
virtualenv --no-site-packages envname
source envname/bin/activate
pip install -r /tmp/requirements.txt
However, none of this addresses the tricky issue of not having the required support libraries installed. You might try creating a symbolic link to the new library from the old library's position - it may be that NetCDF4 can work with multiple versions of libnetCDF and is simply badly configured to use a specific version. If not, then solving this issue might turn out to be long and painful.
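Before re-building anything, it can help to see which shared library the system currently offers under a given name. The sketch below uses ctypes.util.find_library from the standard library; describe_library is a hypothetical helper, and the result depends entirely on the machine it runs on.

```python
from ctypes.util import find_library


def describe_library(name):
    """Report what the system's library search finds for a short name.

    `name` is given without the "lib" prefix or version suffix,
    for example "netcdf" rather than "libnetcdf.so.7".
    """
    found = find_library(name)
    if found is None:
        return f"{name}: not found by the system library search"
    return f"{name}: resolves to {found}"


if __name__ == "__main__":
    print(describe_library("netcdf"))
```

If this reports a different version than the one the Python package was built against, the mismatch lies between the compiled package and the system, not in the virtual environment itself.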

Do anaconda packages interfere with system python

I have a system with a certain python version and packages installed using the distribution repositories. For some project (calculation) I need newer versions of the packages. I am thinking of installing anaconda and using conda virtual environments. Will this break programs that must use the system packages?
(note: I tried a virtual environment, but I couldn't install a newer version of matplotlib because of problems with pygtk)
No, this will not break your system's python, as long as you don't tick the option "register miniconda as the default system python" (or whatever that option is called depending on your OS).
One of the key benefits of conda is that you can create isolated python environments, fully independent of each other.

How to install Python libraries under specific environments

I have two Anaconda installations on my computer. The first one is based on Python 2.7 and the other is based on Python 3.4. The default Python version is the 3.4 though. What is more, I can start Python 3.4 either by typing /home/eualin/.bin/anaconda3/bin/python or just python. I can do the same but for Python 2.7 by typing /home/eualin/.bin/anaconda2/bin/python. My problem is that I don't know how to install new libraries under certain environments (either under Python 2.7 or Python 3.4). For example, when I do pip install seaborn the library gets installed under Python 3.4 by default when in fact I want to install it under Python 2.7. Any ideas?
EDIT
This is what I am doing so far: the ~/.bashrc file contains the following two blocks, of which only one is enabled at any given time.
# added by Anaconda 2.1.0 installer
export PATH="/home/eualin/.bin/anaconda2/bin:$PATH"
# added by Anaconda3 2.1.0 installer
#export PATH="/home/eualin/.bin/anaconda3/bin:$PATH"
Depending on which version I want to work with, I open the file, comment out the other block and run source ~/.bashrc. Then I install the libraries I want to use one by one. But is this the recommended way?
You don't need multiple anaconda distributions for different python versions. I would suggest keeping only one.
conda basically lets you create environments for your different needs.
conda create -n myenv python=3.3 creates a new environment named myenv, which works with a python3.3 interpreter.
source activate myenv switches to the newly created environment. This basically sets the PATH such that pip, conda, python and other binaries point to the correct environment and interpreter.
conda install pip is the first thing you may want to do. Afterwards you can use pip and conda to install the packages you need.
After activating your environment pip install <mypackage> will point to the right version of pip so no need to worry too much.
You may want to create environments for different python versions or different sets of packages. Of course you can easily switch between those environments using source activate <environment name>.
For more examples and details you may want to have a look at the docs.
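The reason activation (or editing PATH in ~/.bashrc) changes which python and pip you get is that commands are looked up in the PATH directories in order, and the first match wins. The sketch below demonstrates that rule with two temporary folders and a made-up command name (mytool); make_fake_tool and resolve are hypothetical helpers.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_fake_tool(directory, name="mytool"):
    """Create a dummy executable so the lookup has something to find."""
    suffix = ".bat" if os.name == "nt" else ""
    tool = Path(directory) / (name + suffix)
    tool.write_text("echo fake\n")
    tool.chmod(tool.stat().st_mode | stat.S_IXUSR)  # mark as executable
    return tool


def resolve(name, directories):
    """Look up a command in the given directories, in order, like PATH."""
    search_path = os.pathsep.join(str(d) for d in directories)
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as env_a, tempfile.TemporaryDirectory() as env_b:
        make_fake_tool(env_a)
        make_fake_tool(env_b)
        print(resolve("mytool", [env_a, env_b]))  # found in env_a
        print(resolve("mytool", [env_b, env_a]))  # found in env_b
```

Activating an environment effectively puts that environment's bin folder at the front of the list, which is what the two alternative export PATH lines in the question do by hand.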
Virtualenv seems like the obvious answer here, but I do want to suggest an alternative that we've been using to great effect lately: Fig - this is particularly effective since we use Docker in production as well, but I imagine that using Fig as a replacement for virtualenv would be quite effective regardless of your production environment.
Using virtualenv is your best option, as @Dettorer has mentioned.
I found this method of installing and using virtualenv the most useful.
Check it out:
Proper way to install virtualenv
