TL;DR: Is there really no way to just tell jupyter console to run in some conda environment, without first unnecessarily installing (and hence depending on) Jupyter in that environment?
I really did try to make this not appear entirely like a rant... I hope you see there is an actual question here.
It seems like getting Jupyter to work with conda environments requires either
Installing a new Jupyter in every conda environment you want to use, or
Installing ipykernel in every conda environment you want to use (which depends on the jupyter package...), and creating a new kernel from within the environment.
I find this a bit astonishing, since I do not think of Jupyter as a requirement of the project, but rather as just another editor/IDE-like thing, making use of environments. Conda's purpose is to manage reproducible dependencies; Jupyter's should be to interpret code within the environment I tell it. Since I'd like to store the environment.yml in git and share it with others, I see no purpose in also requiring them to install Jupyter; they might not even use it.
Yet, it seems not to work that way at all. It feels like when I would like to use Emacs to make use of an environment, I'd have to install an "emacskernel" package in every environment. That's not how it works.
What I would like is to have one globally installed Jupyter, which can just be pointed to different environments -- similar to how the Julia REPL with julia --project=... works (yeah I know that conda is not a built-into-the-language package manager, but you should get the analogy...). (This would kinda work if conda environments would "inherit", i.e., fall back to the "global environment" for unfound dependencies, and you could just use the global Jupyter from within each one; but as I understand, they don't?)
Is this possible at all? What am I missing? Are there any better alternatives providing global Jupyter + local environments? (I must admit I have never used virtualenv or the like...)
(This older question seems to cover the same topic for pipenv, but there's no real answer there... neither a definite NO, nor an explanation why.)
I am working on a research project in which I need to use some scientific packages, each of which comes with its own requirements file listing the libraries it needs. I am coding Python in Jupyter Notebook using Anaconda on Windows 10.
Based on what I've read on the web, each project needs to have its own environment, so I created an environment (say project_env) using conda. In some parts of my project, I need to use external scientific packages (let's call them 'bst' and 'MDN'), cloned from GitHub, each of which has its own dependencies.
My current practice is to install all these dependencies in the same environment (project_env) and code the whole project in one notebook. However, as the project has grown, things have become more complicated, and I am running into conflicts between installed packages even when installing with conda. So I came up with the idea of keeping things apart as much as possible, i.e. creating two more environments for the external packages (bst_env and MDN_env) and using them whenever I need them in the project. Under this scenario, I cannot keep all my project code in one Jupyter notebook, because as far as I know there is no way to switch between environments from inside a notebook. Running different notebooks for different parts of the project, though, is quite difficult and messy.
My question is: is there a way to use more than one environment from a single notebook? If not, what would be the best practice for handling these environments in a project? Should I export my variables from my source code (run in project_env) to the other environments (bst_env or MDN_env) each time, then activate and run the corresponding environments and notebooks, or is there a better way to do this?
I found this great package (nb_conda_kernels), which is exactly what I wanted. It enables you to switch between environments (kernels) inside a jupyter notebook, just by selecting from a list of available environments.
As mentioned here (https://github.com/Anaconda-Platform/nb_conda_kernels), just type: 'conda install nb_conda_kernels' in conda terminal to install this package in the environment (kernel) from which you want to run other environments (kernels). In my case (the above question) it is 'project_env'. Also, make sure to have 'ipykernel' installed in the external environments you want to use in your notebook (in my case: 'bst_env' and 'MDN_env').
Now, while working in a notebook under environment "A", you can use dependencies installed in environments "B" or "C" just by selecting these environments from the list of kernels in Jupyter Notebook.
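For context, a Jupyter kernel is described by a small kernel.json file whose "argv" entry names the interpreter to launch, which is why each target environment needs ipykernel. The following is only a minimal sketch of what such a spec looks like for one environment; it is not how nb_conda_kernels is implemented, and the prefix /opt/conda/envs/bst_env and the helper names are hypothetical.

```python
import json
import sys
from pathlib import Path


def env_python(env_prefix, platform=sys.platform):
    """Return the interpreter path inside an environment prefix."""
    prefix = Path(env_prefix)
    # conda on Windows keeps python.exe at the top of the prefix;
    # on other platforms the interpreter lives in bin/.
    if platform.startswith("win"):
        return str(prefix / "python.exe")
    return str(prefix / "bin" / "python")


def build_kernel_spec(env_prefix, display_name, platform=sys.platform):
    """Build the contents of a kernel.json that starts ipykernel from env_prefix."""
    return {
        "argv": [
            env_python(env_prefix, platform),
            "-m", "ipykernel_launcher",
            # Jupyter substitutes the real connection file path at launch time.
            "-f", "{connection_file}",
        ],
        "display_name": display_name,
        "language": "python",
    }


if __name__ == "__main__":
    # Hypothetical environment prefix, used for illustration only.
    spec = build_kernel_spec("/opt/conda/envs/bst_env", "Python (bst_env)")
    print(json.dumps(spec, indent=2))
```

Note that selecting a different kernel starts a separate Python process, so variables from the previous kernel are not carried over.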
I use miniconda to manage my Python environments on Windows 10. Additionally, I use software called ESRI ArcGIS Pro that comes bundled with its own versions of conda and Python that are somewhat modified to work with their software. I must use ESRI's conda to manage environments that interact with this application.
I have this same setup on both my laptop and desktop, and until recently had no issues. However, recently ESRI's conda stopped working on my laptop. Any conda commands (e.g. conda list, conda info --envs, conda create -n myenv, even just conda by itself) produce no output whatsoever. At first I suspected that PATH was set incorrectly, but I've checked that this is not the case (even calling ESRI's conda.exe with a full path still does not work). I then suspected that the conda.exe file itself was corrupted, but this also is not the case (I copied it to my desktop and it works fine there).
I suspect it may have something to do with my separate miniconda installation. It doesn't seem to be an issue of environment variables being set incorrectly (again, checked against the working system), but I'm wondering if there is any possibility that there are Registry entries (perhaps set by my Miniconda install) that could be causing this issue?
Any thoughts on why this might be the case? Or advice on how to proceed with diagnosing the issue?
EDIT:
Per merv's request, my conda environmental variables:
CONDA_DEFAULT_ENV=C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3
CONDA_PREFIX=C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3
CONDA_PS1_BACKUP=$P$G
Clearly these paths are different than normal due to the custom distribution.
To answer your other questions, no other conda commands generate any output whatsoever. As for activate I don't have any other environments to try activating (the arcgispro-py3 env you see above is the name of the 'base' env that ships with the software), but deactivate seems to work. Another slight difference to mention is that conda activate ... is not a command in this special conda, you have to just use activate by itself which AFAICT calls a shell script.
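When a command prints nothing at all, its exit code and stderr stream can still carry information. The following is a minimal sketch of capturing them separately; run_and_report is a hypothetical helper, and the demo runs the current Python interpreter as a stand-in, since the full path to the ArcGIS conda.exe is not given here.

```python
import subprocess
import sys


def run_and_report(command, timeout=60):
    """Run a command and return its exit code, stdout and stderr separately."""
    result = subprocess.run(
        command,
        capture_output=True,  # keep stdout and stderr instead of printing them
        text=True,
        timeout=timeout,
        check=False,          # a non-zero exit code is data here, not an error
    )
    return {
        "returncode": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }


if __name__ == "__main__":
    # Stand-in command; substitute the full path to the conda executable
    # being diagnosed, followed by e.g. "info".
    report = run_and_report([sys.executable, "--version"])
    for key, value in report.items():
        print(f"{key}: {value!r}")
```

A non-zero return code with empty stdout and stderr would at least confirm that the process starts and exits early rather than hanging or never launching.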
Is it correct to expect a Conda environment to provide complete isolation and containment for pip/pipenv usage?
Let's say I create and activate a Conda environment and name it "pip-pip", then proceed with my project, which uses pipenv, while completely ignoring the fact that this is happening with a Conda environment activated.
Will all traces of that pipenv project be contained in "pip-pip", or is there a possibility of a spillover?
Will the fact that pip/pipenv is used from within "pip-pip" negatively affect the experience in any way?
This arrangement should work fine, as long as your shell and environment variables are configured correctly.
If you try to activate the Pipenv environment without the "pip-pip" Conda environment active, you might have breakage or other unpredictable behavior, as Pipenv was installed with one Python and is being run with another. The extent of the breakage depends on the implementation details of Pipenv.
As a general rule, it should be possible to nest such "environment" programs arbitrarily, as long as they are well-designed, and as long as you activate the chain of environments in the order that they were originally installed. Whether this negatively affects your experience depends on your tolerance for annoyance.
However, Pipenv by default creates virtual environments in a global location. I'm not sure what that location is, but it's possible that you could end up with Pipenv environments installed alongside each other that depend on different Python versions. This, I think, might constitute "spillover" in the sense of your question.
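To check which layers are actually active at a given moment, you can ask the running interpreter. This is a minimal sketch, assuming that conda exposes the active environment through CONDA_DEFAULT_ENV and that the Pipenv-managed virtual environment sets VIRTUAL_ENV when active; describe_environment is a hypothetical helper name.

```python
import os
import sys


def describe_environment(environ=None):
    """Summarize which interpreter and environment layers are in effect."""
    environ = os.environ if environ is None else environ
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Inside a virtual environment, sys.prefix points at the environment
        # while sys.base_prefix points at the Python it was created from.
        "in_virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
        "virtual_env": environ.get("VIRTUAL_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this from inside the Pipenv project, with and without the Conda environment active, shows whether the virtual environment's base interpreter is the one you expect.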
I currently have a rather complex Python configuration that has evolved over the years, and I'd like to clean it up and "modernize" it.
The existing configuration has the default macOS Python and Homebrew's Python 3 and Python 2 all existing side by side, along with their associated pips. I also have some Python command-line tools that these Pythons or their installed packages have created, and which I use more or less frequently.
What I'd like to do is:
Leave macOS Python untouched
Eliminate all Homebrew Pythons
Remove non-macOS Python 2 entirely
Switch to Conda Python as my Python 3
Have access to mkvirtualenv (as an alternative to creating environments) with virtualenvwrapper
Have access to Jupyter
I'm not sure how to do this without creating problems, and want to confirm that the obvious thing is the safe thing:
use Homebrew to uninstall its Pythons,
install Conda, and then
use (Conda's) pip to install mkvirtualenv, virtualenvwrapper, and Jupyter (and any other tools I subsequently need)
Is that the correct procedure? If so, are there particular forms of the commands I should use or options I should choose for them?
The biggest and/or first issue is how to not break existing functionality that relies on Python. There are two broad camps here:
1) tools and other scripts that hard-code the Python executable's location, and
2) tools and other scripts that rely on the/a system PATH variable.
#1 is the easier one. If you aren't going to remove any Python versions, then these are no work at all...these will keep working. If you do want to uninstall some Python versions, then you have to work to switch any tools relying on those versions you want to remove to another version that also works for that tool. The path in question is commonly in a shebang ('#! xxx') line at the top of each main Python binary, but there are other ways that the path to the Python binary can be formed. In short, why uninstall anything? Disk space is cheap. Maybe instead just make sure that these unwanted versions are not referenced by any PATH variables.
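To find the tools in camp #1, you can scan a directory of installed scripts for shebang lines that mention a particular install location. This is a minimal sketch; read_shebang and scripts_using are hypothetical helper names, and the directory and search string in the demo are only examples.

```python
from pathlib import Path


def read_shebang(script_path):
    """Return the interpreter line of a script without the leading '#!', or None."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8", errors="replace").strip()


def scripts_using(directory, needle):
    """List files directly inside directory whose shebang mentions needle."""
    matches = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            shebang = read_shebang(path)
            if shebang and needle in shebang:
                matches.append(path.name)
    return matches


if __name__ == "__main__":
    # Example values only: adjust the directory and the search string
    # to the installation you are thinking of removing.
    bin_dir = Path("/usr/local/bin")
    if bin_dir.is_dir():
        print(scripts_using(bin_dir, "python"))
```

This only catches hard-coded paths in shebang lines; as noted above, there are other ways the path to the Python binary can be formed.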
#2 is the hard one. It isn't necessarily the case that all of the tools in this category are using the version of Python you get when you just type "python" at a command prompt for your primary account. There can be other modes of operation that initialize the execution environment (the PATH variable) in different ways, and so may be running different Python versions despite depending on the value of PATH.
Part of #2 is worrying about not just "python" references, but "python2", "python3", and possibly other variants as well.
Only once you've got a plan for dealing with the above so you don't break things can you worry about possibly getting rid of Python versions and installing new ones. Hopefully, Brew does a good job of uninstalling the versions it's installed, so if you can remove dependencies on one or more of them, they can potentially be easily removed. If you've got self-installed Python versions, those should be easy to uninstall as well by just removing references to them in PATH variables (or not...shouldn't be a big problem if you miss some) and then deleting the install directory.
Then there's adding the new version(s) of Python. This can only affect #2 above. You have to think about that one and know what effect you're going to have if the new install(s) manipulate any PATH variables. If it only manipulates your own user's PATH, or it leaves it to you to do so, the task is much easier to understand, but any change to the environment is a chance to break existing functionality.
Finally, there are the mechanisms for choosing different Python versions for new development, including the use of virtual envs. This is probably the easiest part, as you can do research, try things, and test that you can do whatever you want to do. This part of the problem is the best bounded.
I don't know anything about Jupyter, other than knowing vaguely what it is, so I don't know how that complicates all this.
UPDATE: A final note. As you may already know, Python does a good job of isolating itself in terms of each version keeping its unique identity. If you use the right 'pip' and 'easy_install' that are sitting right next to the 'python' binary you're going to run with, you should be cleanly affecting just that one environment. I can't know that it's this easy for all Python versions, but I've never seen this convention broken by a version of Python that I've used. The complications here, of course, involve which versions of these tools you're getting in various situations when they are found via a PATH variable.
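To see which of these tools a given PATH actually resolves to, you can look each name up the same way the shell would. This is a minimal sketch using the standard library's shutil.which; resolve_on_path is a hypothetical helper, and the list of names is just a reasonable default.

```python
import shutil


def resolve_on_path(names=("python", "python2", "python3", "pip", "pip3"), path=None):
    """Map each command name to the first match on PATH, or None if not found.

    path is an optional PATH-style string; by default the current
    process's PATH is searched.
    """
    return {name: shutil.which(name, path=path) for name in names}


if __name__ == "__main__":
    for name, location in resolve_on_path().items():
        print(f"{name:8} -> {location}")
```

If the pip found this way does not sit next to the python you intend to use, running pip as a module of that interpreter (python -m pip install ...) makes sure the package lands in that interpreter's environment.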
First, install anaconda or miniconda. The installation is non-destructive and does not conflict with your other Python installations. Check that it works before you consider removing homebrew installed Pythons.
The conda command is used both as a package manager and as an environment manager. You cannot avoid creating conda environments: the default installation is already part of an environment named base. I'm not sure why you would want to, either.
You can use pip to install any package you choose into a conda environment, but since you can use conda install for any package available on any conda channel (e.g. 'defaults', 'conda-forge'), using pip often is redundant.
You could use non-conda virtual environments, but again: why? conda create -n foo python=x.x jupyter #etc and then conda activate foo is all you need to get one up and running.
I am planning to implement some functionality in Python to be used as part of a larger non-Python project, so as to take advantage of some Python libraries. I have done some scripting in Python before, but nothing this substantial.
From the advice I've gotten, it seems like we will definitely want to use a virtual environment to manage dependencies. I am exploring venv and conda and haven't committed to either yet, though it seems like conda would have the advantage of providing pre-built versions of Cython dependencies.
With both conda and venv, though, the documentation I've been finding seems to be oriented toward working interactively inside the environment. For our purposes, I want to be able to run the programs we've written in Python programmatically, without going through the system shell.
Is there an established, recommended way to do this?
I've been trying to look at what the Bash scripts to activate a virtual environment actually do, and it looks like they basically just set up some environment variables. Both add their virtual environment's bin directory to the beginning of the PATH, venv sets VIRTUAL_ENV, and conda sets a bunch of CONDA_ environment variables. Interestingly, it doesn't look like either sets, say, PYTHONPATH.
For programmatic use, is it sufficient to set these environment variables and then run the equivalent of python3 -m mymodule, or is there more setup that needs to be done? I would particularly like to know if this is documented anywhere, for conda, venv, or both: relying on having figured out what environment variables need to be set to what values seems a bit fragile.
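For a venv, the standard library documentation states that activation is optional and that you can invoke the environment's interpreter by its full path. The following is a minimal sketch that does this and also mirrors the environment variables described above; activated_env and run_module are hypothetical helpers, and mymodule and the environment path are placeholders. It is written for venv-style layouts; for conda environments, conda run -n <env> <command> is an alternative, since some conda packages rely on activation scripts that this sketch does not reproduce.

```python
import os
import subprocess
from pathlib import Path


def _bin_dir(env_dir):
    """Return the directory holding the environment's executables."""
    return Path(env_dir) / ("Scripts" if os.name == "nt" else "bin")


def activated_env(env_dir, base_env=None):
    """Return a copy of the environment variables, adjusted as 'activate' would."""
    env = dict(os.environ if base_env is None else base_env)
    # Put the environment's executables first on PATH.
    env["PATH"] = str(_bin_dir(env_dir)) + os.pathsep + env.get("PATH", "")
    env["VIRTUAL_ENV"] = str(env_dir)
    # venv's activate script also unsets PYTHONHOME.
    env.pop("PYTHONHOME", None)
    return env


def run_module(env_dir, module, *args):
    """Run 'python -m module' with the environment's own interpreter."""
    python = _bin_dir(env_dir) / ("python.exe" if os.name == "nt" else "python")
    return subprocess.run(
        [str(python), "-m", module, *args],
        env=activated_env(env_dir),
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    # Placeholder path: shows only the PATH entry that would be prepended.
    env = activated_env("/path/to/venv")
    print(env["PATH"].split(os.pathsep)[0])
```

Calling the interpreter by full path is what selects the environment's packages; the modified PATH matters mainly when the program itself spawns other commands by name.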