Why use Anaconda environments to install TensorFlow on Windows? - python

The TensorFlow installation guide says that I should use an "environment" to install TensorFlow: https://www.tensorflow.org/install/install_windows#installing_with_anaconda
Why? Can't I just install it with pip?
If it is installed in an environment, do I have to "activate" the environment each time I use TensorFlow?
If I use TensorFlow from within something else, like Keras and/or PyCharm, how can I activate the environment?

The question is about Windows. I assume you installed Python using Anaconda. Then you have a default environment, called root. You can create as many environments as you want; think of each as a separate installation of Python. Using conda or pip installs packages into your current environment. Conda packages come pre-compiled to work with your machine/Anaconda environment, while pip packages are usually compiled on the spot. I assume compiling TensorFlow might not be completely trivial...
'Activate' switches from one environment to another, so unless you have multiple environments you shouldn't need it. You run all of these commands in the command prompt.
Bottom line: unless you have multiple environments (which I highly recommend, so you can try different things), I cannot see you needing activate. Install TensorFlow and Keras in the same one and only root environment you have. You should be able to access both (it is also possible that just installing Keras would install TensorFlow, if it is a dependency).
If you see no prompt prefix, you are in the default root environment. You can see all your environments with: conda info --envs. Unless you create an environment (using e.g. conda create --name py python=2) you probably only have root. One of the nice things about environments is that you can have one with python=2 (the latest Python 2), one with python=3, another with python=2.7, etc.
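For illustration only, a sketch of the create/activate workflow described above (the environment name tf_env and the Python version are placeholders, not anything this answer prescribes):
conda create --name tf_env python=3.6   # a separate environment just for TensorFlow
conda activate tf_env                   # older conda versions use: activate tf_env
pip install tensorflow                  # or: conda install tensorflow
conda deactivate                        # leave the environment when done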
On your follow-up: if you have multiple environments, you can switch between them in PyCharm by changing the interpreter. In the image you can see me selecting e.g. py2_olv.

Professional answer:
Quote from https://machinelearningspace.com/installing-tensorflow-2-0-in-anaconda-environment/:
What is Anaconda and why do I recommend it?
...
[dropped intro to Anaconda]
...
For a Python developer or a data science researcher, using Anaconda
has a lot of advantages, such as independently installing/updating
packages without ruining the system. So, we do not need to worry about
the system library or anything like that. This can save time and energy
for other things.
Anaconda can be used across different platforms, Windows, macOS, and
Linux. If we want to use a different Python version or package
libraries, just create a different environment and play around without
any risk of crashing the system library.
####
Unprofessional research:
Now, in addition, my own research. I am not a professional, and I have little knowledge of the seemingly chaotic world of different install methods. This refers to some first research at https://superuser.com/questions/1572640/do-i-need-to-install-cuda-separately-after-installing-the-nvidia-display-driver/1572762#1572762. Mind that I am guessing a lot here. Please comment if I am wrong.
We see that at the moment, PyTorch supports CUDA version 10.2 and TensorFlow supports 10.1, and it is not just the version that differs: mind that "CUDA Toolkit" (standalone) and cudatoolkit (conda binary install) are different things! One is a standalone/executable install, the other is a binary install. And TensorFlow needs tensorflow-gpu to reach the standalone CUDA install.
Therefore you should consider a separate environment each for TensorFlow and PyTorch, since any update of the conda cudatoolkit to version 11.0 could break PyTorch's dependencies (though this is not completely right: PyTorch uses a CUDA runtime that ships inside PyTorch; it still serves to motivate the recommended separate envs). For TensorFlow, you have to install CUDA Toolkit 10.1 although 11.0 is already available, so your whole card must run on a lower version than possible only to support TensorFlow, even if some games would like to have version 11.0.
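A hedged sketch of that separation (the environment names are arbitrary and the versions simply mirror the ones mentioned above; check the current compatibility tables before pinning anything):
conda create --name tf_gpu tensorflow-gpu cudatoolkit=10.1           # TensorFlow with its own pinned toolkit
conda create --name torch_gpu pytorch cudatoolkit=10.2 -c pytorch    # PyTorch pinned independently
Updating the cudatoolkit in one environment then cannot disturb the other.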
Unprofessional answer:
If all of the dependencies are so important and so easily broken when updated separately, as you could do with pip, any install that you do by yourself using pip might crash your sensitive TensorFlow install. Therefore it is recommended to stick to the full-service approach that Anaconda offers, where all dependencies are kept consistent, even if you run conda update --all. That is why you had better search for an Anaconda guide, for example https://machinelearningspace.com/installing-tensorflow-2-0-in-anaconda-environment/.

If you had read through the entire document, you would have seen that the Anaconda installation is community supported, not officially supported. They want you to install TensorFlow using native pip through Python 3.5.x. That being said, from personal experience, I will tell you that if you are looking to run basic-level TensorFlow Python scripts, such as training and testing an MNIST model, a Windows installation will be fine; using a model that has already been trained for some purpose will also be fine. However, if you want to train advanced models such as Inception, which are state-of-the-art image classifiers with less than 5% error on normal images, Windows is not suitable. You should use a Linux installation for any training purposes. I would recommend using VirtualBox, having used it in the past.
As for activating the environment: as long as you include the line "import tensorflow as tf" in any script / in the shell, you should be fine, at least for a native pip installation.
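A quick sanity check for a native pip installation (just an illustrative one-liner; the printed version will be whatever you installed):
python -c "import tensorflow as tf; print(tf.__version__)"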
Good luck!

Related

Creating an Anaconda environment satisfying the prerequisites below

I have installed conda 4.5.12 and managed to install an environment with a .yml file flawlessly.
Now I need to set up an environment to let me run a simulation of this project here.
Its prerequisites list contains these components:
Python 2: HDF5, OpenCV 2 interfaces for python.
C++: HDF5, OpenCV 2, Boost
Lua JIT and Torch 7.
Torch 7 packages: class, GPU support cunn and cutorch, Matlab support mattorch, JSON support lunajson, Torch image library image
Please note that mattorch is an outdated package which is no longer maintained.
So my question is mainly whether I can generate a yaml file to cover this list and create a virtual environment to start development while I keep researching.
You can find the GitHub branch [HERE].
If all these packages are available on conda or PyPI, then you can indeed just write a yml file and install from that.
From personal experience, it's often a good idea to gradually add packages and test installing the conda environment as you go. That way, you can better identify when a dependency conflict arises and you need to pin some versions manually.
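A minimal sketch of that incremental approach (the package names below are just examples drawn from the prerequisites above; the Lua/Torch pieces and their channels would need checking separately):
conda create --name sim python=2.7      # start from the Python version the project expects
conda activate sim
conda install hdf5                      # add one dependency at a time and re-test
conda install opencv
conda env export > environment.yml      # snapshot the working combination into a yml file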

Can I let people use a different Tensorflow-gpu version above what they had installed with different CUDA dependencies?

I was trying to package and release a project which uses tensorflow-gpu. Since my intention is to make the installation as easy as possible, I do not want to make the user compile tensorflow-gpu from scratch, so I decided to use pipenv to install whatever version pip provides.
I realized that although everything works in my original local version, I cannot import tensorflow in the virtualenv version.
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
Although this seems easily fixable by changing local symlinks, that may break my local tensorflow and goes against the concept of virtualenv, and I will have no idea how people installed CUDA on their machines, so it doesn't seem promising for portability.
What can I do to ensure that tensorflow-gpu works when someone from the internet gets my project with only the guidance of "install CUDA X.X"? Should I fall back to tensorflow to ensure compatibility, and let my users install tensorflow-gpu manually?
Getting a working tensorflow-gpu on a machine involves a series of steps, including installation of CUDA and cuDNN, the latter requiring an NVIDIA developer account. There are a lot of machines that would not even meet the required configuration for tensorflow-gpu, e.g. any machine that doesn't have a modern NVIDIA GPU. You may want to state the tensorflow-gpu requirement and leave it to the user to meet it, with appropriate pointers for guidance. If the project can work acceptably on CPU-only tensorflow, that would be a much easier fallback option.
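For users who do attempt the GPU path, a few hedged diagnostic commands can go into the install guide (these assume an NVIDIA driver and a pip-installed TensorFlow 1.x; tf.test.is_gpu_available is the 1.x API):
nvidia-smi                                                                  # is a compatible NVIDIA driver present at all?
python -c "import tensorflow as tf; print(tf.test.is_built_with_cuda())"   # was this TensorFlow build compiled against CUDA?
python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"     # can it actually see a GPU at runtime?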

Is it ok to use the anaconda distribution for web development?

I started learning Python working on projects involving data and following tutorials advising to install the anaconda bundle, to take advantage of the other libraries coming with it.
So I did, and I got comfortable with it; I liked the way it manages environments.
During the last months, I have been teaching myself web development with Django and Flask, continuing to use the Anaconda Python. Most of the time, though, I install the dependencies I need with pip install (inside the conda environment).
I've never seen any tutorials or podcasts mentioning conda environments as an option for developing web apps, so I am starting to get worried. Is that for a good reason?
Everywhere it's the combination of pip and virtualenv that prevails. And virtualenv isn't compatible with Anaconda, which has its own environment management system.
My newbie question is: Will I run into problems later (dependencies management in production or deployment maybe?) using the anaconda distribution to develop my web apps?
Yes, albeit with a few caveats. First, I don't recommend using the big Anaconda distribution. I recommend installing Miniconda(3) (link).
To set up the second caveat, it's important to figure out which part of Conda you are talking about using. Conda is two things: it has both the functionality of virtualenv (an environment manager) and of pip (a package manager).
So you certainly can use Conda in place of virtualenv (as an environment manager) and still use pip within that Conda environment as your package manager. This is actually my preference. Jake VanderPlas has a good comparison of virtualenv vs Conda as environment managers. Conda has a more limited offering of packages, so I try to keep everything under one package manager (pip) within that environment. One problem I've found with virtualenv is that you can't choose an arbitrary flavor of Python (e.g. 2.7, 3.3, 3.6, etc.), whereas with Conda you can seamlessly install that version of Python within your environment.
Here's a list of command comparisons of Conda, virtualenv, and pip if that helps clear things up a bit on how you can utilize Conda and/or virtualenv and/or pip.
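A minimal sketch of that "Conda as environment manager, pip as package manager" workflow (the environment name and package list are purely illustrative):
conda create --name webapp python=3.6   # conda provides the environment and the Python version
conda activate webapp
pip install django flask                # pip manages the packages inside it
pip freeze > requirements.txt           # so a pip/virtualenv-based deployment can reproduce it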

Confusion between Python and Anaconda

Recently I have started programming in Python (Python 3.5) on my Linux OS. But I am confused about Anaconda. What is it actually? Is it a version of Python or something else? If I do not install Anaconda will there be any limitations?
Anaconda is a free and open-source Python distribution and a collection of hundreds of packages related to data science, scientific programming, development and more. Python is included in the Anaconda distribution. It is not an IDE (like PyCharm, mentioned in the comments), though it can be configured to work with most IDEs. I will note that the distribution includes an IDE called Spyder. It also comes with a platform-agnostic package manager called conda.
You can read more here: https://docs.continuum.io/anaconda/
Anaconda is a popular Python data science platform.
Anaconda is a commercial open-source distribution of the Python and R programming languages for large-scale data processing, predictive analytics, and scientific computing, and it aims to simplify package management and deployment.
Also, you can install Anaconda on any operating system, i.e. Linux, Windows, or macOS. It comes with Navigator, which is of great use for launching the available modules.
While installing, Anaconda asks for the Python version:
Find more about Anaconda at:
Official Website
Anaconda Docs
The Anaconda distribution has been on my computer for the last 2 years, on and off, so I feel that I have some experience using it.
Anaconda tries to be a Swiss army knife, but the fact remains that everything available with Anaconda can be manually installed using pip.
If you're a beginner and don't intend to do comprehensive work in the data science/ML field, I don't see any reason why you would need to install Anaconda. If you still want to have conda on your machine, go for it, but if you have Python pre-installed, remove it first, and then use conda. (Otherwise you'll have to keep track of exactly where new Python packages are being installed on your computer.)
The conda distribution easily occupies 2-4 GB of disk space. (There is a lighter installer known as Miniconda, but it too ends up consuming considerable space.)
When you use the conda command to install a Python package, it usually pulls in additional (maybe unnecessary for a beginner) packages along with it, thus consuming more and more space on your device. So, if your machine is slow and you have little space, Anaconda is a big NO-NO for you.
Anaconda (IMHO) is finely tuned hype in the internet space of beginner Python users.
And even if you have sufficient disk space and a capable device, I don't see why you should spend it on things that you may never use, unless you have a significant benefit in doing so, which could be more pronounced for those in a professional environment.
There are ways to bulk install everything you need using pip, and pip only installs what we demand/command from the terminal, nothing additional, unless we ask for it.
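For example (a hedged illustration; the package list is arbitrary):
python -m pip install numpy pandas matplotlib   # installs exactly what was asked for, nothing more
python -m pip freeze > requirements.txt         # record it; later: pip install -r requirements.txt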
Also, keep in mind that if you want to do data science, ML, or deep learning work, go for the 64-bit version of Python, so that every module you need can be installed without encountering errors.
Anaconda is nothing but a Python and R distribution. If you are working in the machine learning or data science field, you will find Anaconda very useful. Installing Anaconda will also install Python, conda (which is the package manager in Anaconda), a lot of third-party Python packages, an IDE (like Spyder), and Jupyter Notebook (which is very helpful for writing code, visualising results, and running code cell by cell). However, if you are just a beginner, installing only Python would be enough. Python comes with certain standard libraries that are installed along with it, and when you need new packages, you can use pip to install them.
P.S. If you have little disk space and you are just beginning, Anaconda is a no-no, as it installs many packages by default which you might never use. Installing plain Python requires less space, and when you need a third-party library, you can use pip to install it.

Does Conda replace the need for virtualenv?

I recently discovered Conda after I was having trouble installing SciPy, specifically on a Heroku app that I am developing.
With Conda you create environments, very similar to what virtualenv does. My questions are:
If I use Conda will it replace the need for virtualenv? If not, how do I use the two together? Do I install virtualenv in Conda, or Conda in virtualenv?
Do I still need to use pip? If so, will I still be able to install packages with pip in an isolated environment?
Conda replaces virtualenv. In my opinion it is better. It is not limited to Python but can be used for other languages too. In my experience it provides a much smoother experience, especially for scientific packages. The first time I got MayaVi properly installed on Mac was with conda.
You can still use pip. In fact, conda installs pip in each new environment. It knows about pip-installed packages.
For example:
conda list
lists all installed packages in your current environment.
Conda-installed packages show up like this:
sphinx_rtd_theme 0.1.7 py35_0 defaults
and the ones installed via pip have the <pip> marker:
wxpython-common 3.0.0.0 <pip>
Short answer is, you only need conda.
Conda effectively combines the functionality of pip and virtualenv in a single package, so you do not need virtualenv if you are using conda.
You would be surprised how many packages conda supports. If it is not enough, you can use pip under conda.
Here is a link to the conda page comparing conda, pip and virtualenv:
https://docs.conda.io/projects/conda/en/latest/commands.html#conda-vs-pip-vs-virtualenv-commands.
I use both and (as of Jan, 2020) they have some superficial differences that lend themselves to different usages for me. By default Conda prefers to manage a list of environments for you in a central location, whereas virtualenv makes a folder in the current directory. The former (centralized) makes sense if you are e.g. doing machine learning and just have a couple of broad environments that you use across many projects and want to jump into them from anywhere. The latter (per project folder) makes sense if you are doing little one-off projects that have completely different sets of lib requirements that really belong more to the project itself.
The empty environment that Conda creates is about 122MB whereas the virtualenv's is about 12MB, so that's another reason you may prefer not to scatter Conda environments around everywhere.
Finally, another superficial indication that Conda prefers its centralized envs is that (again, by default) if you do create a Conda env in your own project folder and activate it, the name prefix that appears in your shell is the (way too long) absolute path to the folder. You can fix that by giving it a name, but virtualenv does the right thing by default.
I expect this info to become stale rapidly as the two package managers vie for dominance, but these are the trade-offs as of today :)
EDIT: I reviewed the situation again in 04/2021 and it is unchanged. It's still awkward to make a local directory install with conda.
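For reference, a hedged sketch of that project-local workflow (the ./env path and the env_prompt setting are just one way to do it, on reasonably recent conda versions):
conda create --prefix ./env python=3.9        # the environment lives inside the project folder, like virtualenv
conda activate ./env                          # by default the prompt now shows the full absolute path...
conda config --set env_prompt '({name}) '     # ...this condarc setting shortens it to the folder name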
Virtual Environments and pip
I will add that creating and removing conda environments is simple with Anaconda.
> conda create --name <envname> python=<version> <optional dependencies>
> conda remove --name <envname> --all
In an activated environment, install packages via conda or pip:
(envname)> conda install <package>
(envname)> pip install <package>
These environments are strongly tied to conda's pip-like package management, so it is simple to create environments and install both Python and non-Python packages.
Jupyter
In addition, installing ipykernel in an environment adds a new entry to the Kernels dropdown menu of Jupyter notebooks, extending reproducible environments to notebooks. As of Anaconda 4.1, nbextensions were added, making it easier to add extensions to notebooks.
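A short sketch of registering an environment's kernel (envname is a placeholder):
(envname)> conda install ipykernel
(envname)> python -m ipykernel install --user --name envname --display-name "Python (envname)"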
Reliability
In my experience, conda is faster and more reliable at installing large libraries such as numpy and pandas. Moreover, if you wish to transfer the preserved state of an environment, you can do so by sharing or cloning an env.
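For example, a hedged sketch of sharing vs. cloning (envname and envcopy are placeholders):
> conda env export --name envname > environment.yml   # share the pinned state with someone else...
> conda env create -f environment.yml                  # ...who recreates it from the file
> conda create --name envcopy --clone envname          # or duplicate it locally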
Comparisons
A non-exhaustive, quick look at features from each tool:
Feature        virtualenv   conda
Global         n            y
Local          y            n
PyPI           y            y
Channels       n            y
Lock File      n            n
Multi-Python   n            y
Description
virtualenv creates project-specific, local environments usually in a .venv/ folder per project. In contrast, conda's environments are global and saved in one place.
PyPI works with both tools through pip, but conda can add additional channels, which can sometimes install faster.
Sadly neither has an official lock file, so reproducing environments has not been solid with either tool. However, both have a mechanism to create a file of pinned packages.
Python is needed to install and run virtualenv, but conda already ships with Python. virtualenv creates environments using the same Python version it was installed with. conda allows you to create environments with nearly any Python version.
See Also
virtualenvwrapper: global virtualenv
pyenv: manage python versions
mamba: "faster" conda
In my experience, conda fits well in a data science application and serves as a good general env tool. However in software development, dropping in local, ephemeral, lightweight environments with virtualenv might be convenient.
Installing Conda will enable you to create and remove Python environments as you wish, therefore providing you with the same functionality that virtualenv would.
With both tools you can create an isolated filesystem tree in which you can install and remove Python packages (probably with pip) as you wish. This can come in handy if you want different versions of the same library for different use cases, or if you just want to try some distribution and remove it afterwards to conserve disk space.
Differences:
License agreement. While virtualenv comes under most liberal MIT license, Conda uses 3 clause BSD license.
Conda provides you with its own package control system. This package control system often provides precompiled versions (for the most popular systems) of popular non-Python software, which can ease one's way to getting some machine learning packages working; namely, you don't have to compile optimized C/C++ code for your system. While this is a great relief for most of us, it might affect the performance of such libraries.
Unlike virtualenv, Conda duplicates some system libraries, at least on Linux systems. These libraries can get out of sync, leading to inconsistent behavior of your programs.
Verdict:
Conda is great and should be your default choice when starting out with machine learning. It will save you some time messing with gcc and numerous packages. Yet, Conda does not replace virtualenv. It introduces some additional complexity which might not always be desired. It comes under a different license. You might want to avoid using conda on distributed environments or on HPC hardware.
Another, newer option, and my currently preferred method of getting an environment up and running, is Pipenv.
It is currently the officially recommended Python packaging tool from Python.org
Conda has a better API, no doubt. But I would like to touch upon the negatives of using conda, since conda has had its share of glory in the rest of the answers:
"Solving environment" issues - one big thorn in the side of conda environments. As a remedy, you get advised not to use the conda-forge channel. But since it is the most prevalent channel, and some packages (not just trivial ones, even really important ones like pyspark) are exclusively available on conda-forge, you get cornered pretty fast.
Packing the environment is also an issue.
There are other known issues as well. virtualenv is an uphill journey, but rarely a wall in the road. conda, on the other hand, IMO, has these occasional hard walls where you just have to take a deep breath and use virtualenv.
1. No, if you're using conda, you don't need to use any other tool for managing virtual environments (such as venv, virtualenv, pipenv, etc.).
Maybe there's some edge case which conda doesn't cover but virtualenv (being more heavyweight) does, but I haven't encountered any so far.
2. Yes, not only can you still use pip, but you will probably have to. The conda package repository contains less than pip's does, so conda install will sometimes not be able to find the package you're looking for, especially if it's not a data-science package.
And, if I remember correctly, conda's repository isn't updated as fast/often as pip's, so if you want to use the latest version of a package, pip might once again be your only option.
Note: if the pip command isn't available within a conda virtual environment, you will have to install it first, by running:
conda install pip
Yes, conda is a lot easier to install than virtualenv, and pretty much replaces the latter.
I work in a corporate environment, behind several firewalls, on a machine on which I have no admin access.
In my limited experience with Python (2 years) I have come across a few libraries (JayDeBeApi, sasl) which, when installed via pip, threw C++ dependency errors:
error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools
These installed fine with conda, so since then I have been working with conda environments.
However, it isn't easy to stop conda from installing dependencies inside C:\Program Files, where I don't have write access.
