installing R kernel to jupyter notebook in a different conda environment

I have a conda environment containing all packages for jupyter notebook (say it's called jupyter_env). In a different conda environment I have R installed, including r-irkernel (say the env is called R_env).
For python kernels I can easily make a python kernel in a specific environment (called e.g. pyth27) available to my jupyter installation in a different environment:
(pyth27) > python -m ipykernel install --prefix=/path/to/jupyter/env --name "python27"
Is there anything similar possible for the R kernel? So far I can only run the R kernel using a jupyter installation within the same environment (R_env).
One solution might be the nb_conda_kernels package. However, it is not clear to me whether it always adds all available kernels from all environments or whether I can specify which environments should be searched.
My question is similar to this one: https://github.com/jupyter/jupyter/issues/397, except that I don't want to use the base environment to start jupyter but a dedicated environment.

As described on https://github.com/IRkernel/IRkernel, the r-irkernel package provides a mechanism similar to python -m ipykernel install, to be run in R:
R> IRkernel::installspec()
To run this from Bash, you can do
(R_env)> Rscript -e "IRkernel::installspec()"
Now the tricky part, due to Jupyter and R being in different environments: According to https://github.com/IRkernel/IRkernel/issues/499, IRkernel::installspec() requires the jupyter-kernelspec command. I've tested two methods to provide it (to be done before issuing the above commands):
Method 1: jupyter-kernelspec is part of Jupyter and hence in the file tree of jupyter_env, so add its directory to PATH (I found it's better to append it to the end so as not to disrupt other path lookups during the Rscript call):
(R_env)> export PATH="$PATH:</path/to/conda>/envs/jupyter_env/bin"
Method 2: jupyter-kernelspec is included in the jupyter_client conda package, so you can do
(R_env)> conda install jupyter_client
Caveat: this installs a number of dependencies, including Python.
I opted for the first method to keep R_env free of Python packages.
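For anyone who wants to script the first method, here is a minimal Python sketch. The helper names env_with_appended_path and install_r_kernelspec are my own (hypothetical), the directory is a placeholder, and it assumes Rscript from R_env is the one found on PATH. The demo at the bottom only prints the resulting PATH; nothing is installed unless you call install_r_kernelspec yourself.

```python
import os
import subprocess


def env_with_appended_path(extra_dir, environ=None):
    """Return a copy of the environment with extra_dir appended to PATH."""
    env = dict(os.environ if environ is None else environ)
    current = env.get("PATH", "")
    # Append (not prepend) so existing lookups, e.g. Rscript, are undisturbed.
    env["PATH"] = current + os.pathsep + extra_dir if current else extra_dir
    return env


def install_r_kernelspec(jupyter_bin_dir):
    """Run IRkernel::installspec() with jupyter-kernelspec reachable via PATH.

    Assumes Rscript from R_env is already on PATH (e.g. R_env is activated).
    """
    env = env_with_appended_path(jupyter_bin_dir)
    subprocess.run(["Rscript", "-e", "IRkernel::installspec()"], env=env, check=True)


# Placeholder path: replace with the real bin directory of jupyter_env.
demo = env_with_appended_path("/path/to/conda/envs/jupyter_env/bin", {"PATH": "/usr/bin"})
print(demo["PATH"])
```

Passing a modified copy of the environment to the child process keeps the change local to that one Rscript call, which is the same idea as appending to PATH only in the current shell.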

Related

How to import modules into Jupyter Notebook kernel

I am having problems installing modules and then importing them into specific Jupyter Notebook kernels. I want to install them directly into the kernel as opposed to throughout anaconda to separate dependencies in projects. Here is how the problem goes:
I firstly want a package, for example, nltk
I navigate to and activate the conda environment (called python3) and run 'conda install nltk'
I then load that environment into Jupyter using ipykernel with the command 'python -m ipykernel install --user --name python3'
When trying to import the package into the notebook it tells me that it cannot be found
I have been struggling with this for a while. Where am I going wrong? I greatly appreciate all the help.
NOTE: I have somehow managed to install and import many packages into notebooks using the aforementioned process. I'd really like a method to do this in a foolproof manner.

Not entirely clear where things go wrong, but perhaps clarifying some of the terminology could help:
"navigate to...the conda environment" - navigating has zero effect on anything. Most end-users should never enter or directly write to any environment directories.
"...and activate the conda environment" - activation is unnecessary - a more robust installation command is always to use a -n,--name argument:
conda install -n python3 nltk
This is more robust because it is not context-sensitive, i.e., it doesn't matter what (if any) environment is currently activated.
"load that environment into Jupyter using ipykernel" - that command registers the environment as a kernel at a user-level. That only ever needs to be run once per kernel - not after each new package installation. Loading the kernel happens when you are creating (or changing the settings of) a notebook. That is, you choose the kernel in the Jupyter GUI.
Even better, keep jupyter in a dedicated environment with an installation of nb_conda_kernels and Jupyter (launched from that dedicated environment) will auto-discover all Conda environments that have valid kernels installed (e.g., ipykernel, r-irkernel).
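To see which environment a notebook is really running in, a quick check from inside a cell can help. This is only a diagnostic sketch; kernel_report is a hypothetical helper name and nltk is just the package from the question.

```python
import importlib.util
import sys


def kernel_report(module_name):
    """Describe the running interpreter and whether module_name is importable."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # the Python binary behind this kernel
        "prefix": sys.prefix,          # the environment that binary belongs to
        "module": module_name,
        "importable": spec is not None,
        "origin": spec.origin if spec is not None else None,
    }


print(kernel_report("nltk"))
```

If the prefix is not the environment you installed the package into, the notebook is using a different kernel than you think, and the fix is to pick the right kernel rather than to reinstall the package.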

Why I can import packages in my jupyter notebooks only when using `pipenv run papermill`?

In a project where I have to run some Jupyter notebooks, I created a virtual environment using pipenv and installed some packages (note that I used the --site-packages flag).
Although I am now able to run the notebooks with pipenv run papermill ..., I cannot run them from Jupyter using pipenv run or pipenv shell because of some ModuleNotFoundError exceptions.
In particular, the modules that are not found in the second case are the ones installed only in the virtual environment and not inherited from the global site-packages.
Indeed, if I check the sys.path I can see the difference in the two cases: in the second there is no ~/.local/share/virtualenvs/... entry.
Why am I having this issue and how can it be solved? (If possible, I would prefer not to pollute my ~/.local/share/jupyter/kernels with other kernels from virtualenvs).

As was suggested here, you also need to make sure that the kernel itself is registered from inside the venv, i.e. run the following with the virtual environment's python:
python -c "import IPython"
python -m ipykernel install --user --name=my-virtualenv-name
Then switch to the kernel named "my-virtualenv-name" in the Jupyter user interface.
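To confirm that the selected kernel really runs inside the virtual environment, something like the following can be run in a notebook cell. It is a rough sketch: in_virtualenv is a hypothetical helper, and the "virtualenvs" substring check simply mirrors the ~/.local/share/virtualenvs/... location mentioned in the question.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv/virtualenv-style environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
# Entries that come from a pipenv-managed virtualenv directory, if any.
print("virtualenv entries on sys.path:", [p for p in sys.path if "virtualenvs" in p])
```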

Explanation about Miniconda environments

I'm new to using Jupyter on Miniconda and I was having a problem while importing packages (ImportError: DLL load failed). Looking for answers, I found that the solution was to initialize a base environment in my bash.
I used to start Jupyter by typing jupyter notebook in bash, but with the solution given I have to run conda activate bash and then type jupyter notebook. What is the difference between starting Jupyter the way I used to and this new way?

The conda activate command activates a virtual environment. It is an isolated environment, so packages you installed in the virtual environment cannot be used outside it. When you start bash, you are in the base environment, and it seems that you installed Jupyter in the bash environment, so you cannot use the bash environment's Jupyter from the base environment and vice versa. It may be a little annoying at the beginning, but it lets you use different environments for different purposes. For example, since pip only allows one version of a specific package to be installed, different environments let you test a new version of a package without breaking the functionality of the original program.
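If you are unsure which environment is active when you start Jupyter, you can ask Python directly. A small sketch, assuming the CONDA_DEFAULT_ENV and CONDA_PREFIX variables that conda activate sets; conda_env_info is a hypothetical helper name.

```python
import os
import sys


def conda_env_info(environ=None):
    """Report the active conda environment as seen by this process."""
    environ = os.environ if environ is None else environ
    return {
        "active_env": environ.get("CONDA_DEFAULT_ENV"),  # None if nothing is activated
        "env_prefix": environ.get("CONDA_PREFIX"),
        "python": sys.executable,
    }


print(conda_env_info())
```

Running this once before and once after conda activate shows the difference between the two ways of starting Jupyter: the interpreter path and the environment prefix change.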

Jupyter notebook, how to execute system shell commands in the right conda environment?

I'm currently experiencing some trouble with jupyter notebook and system shell commands. I use nb_conda_kernels to be able to access all of my conda environments from a jupyter notebook launched in the base environment, and this works perfectly in most of my use cases. For simplicity's sake, let's assume I have 2 environments, the base one, and one named work_env. I launch jupyter notebook in the base environment, and select the work_env kernel upon opening the notebook I'm working on.
Today I came across this line:
! pip install kaggle --upgrade
Upon execution of the cell (with the work_env kernel correctly activated), pip installed the kaggle package in my base environment. The intended result was to install this package in my work_env. Any ideas on how to make shell commands execute in the "right" environment from jupyter notebook?

Try specifying the current python interpreter.
import sys
!$sys.executable -m pip install kaggle --upgrade
sys.executable holds the path to the Python interpreter you are currently running. $ passes that variable to your terminal (! runs the command on the terminal). From the IPython documentation (https://ipython.org/ipython-doc/3/interactive/magics.html):
"Aliases expand Python variables just like system calls using ! or !! do: all expressions prefixed with ‘$’ get expanded. For details of the semantic rules, see PEP-215"
-m is used to run a library module (pip in this case) as a script (check python -h). Running pip as a script guarantees that you are using the pip linked to the current python interpreter rather than the one specified by your system variables.
So, in this way you can be sure that pip installs the package for the very same Python interpreter you are working with (the one in your current environment), which does the trick.
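The same idea also works without the ! shell escape, e.g. in a plain script. A minimal sketch, where pip_install_command is a hypothetical helper; it only builds and prints the command so that nothing gets installed by accident.

```python
import sys


def pip_install_command(package, upgrade=False):
    """Build a pip command bound to the interpreter that is running this code."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        cmd.append("--upgrade")
    cmd.append(package)
    return cmd


cmd = pip_install_command("kaggle", upgrade=True)
print(cmd)
# To actually install, pass the list to subprocess:
#   import subprocess; subprocess.check_call(cmd)
```

Because the first element is sys.executable, the pip that runs is the one belonging to the kernel's environment, regardless of which pip comes first on PATH.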

Running Jupyter with multiple Python and IPython paths

I'd like to work with Jupyter notebooks, but have had difficulty doing basic imports (such as import matplotlib). I think this was because I have several user-managed python installations. For instance:
> which -a python
/usr/bin/python
/usr/local/bin/python
> which -a ipython
/Library/Frameworks/Python.framework/Versions/3.5/bin/ipython
/usr/local/bin/ipython
> which -a jupyter
/Library/Frameworks/Python.framework/Versions/3.5/bin/jupyter
/usr/local/bin/jupyter
I used to have anaconda, but removed it from the ~/anaconda directory. Now, when I start a Jupyter Notebook, I get a Kernel Error:
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/subprocess.py", line 947, in __init__
restore_signals, start_new_session)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/subprocess.py", line 1551, in _execute_child
raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/npr1/anaconda/envs/py27/bin/python'
What should I do?!

This is fairly straightforward to fix, but it involves understanding three different concepts:
How Unix/Linux/OSX use $PATH to find executables (%PATH% in Windows)
How Python installs and finds packages
How Jupyter knows what Python to use
For the sake of completeness, I'll try to do a quick ELI5 on each of these, so you'll know how to solve this issue in the best way for you.
1. Unix/Linux/OSX $PATH
When you type any command at the prompt (say, python), the system has a well-defined sequence of places that it looks for the executable. This sequence is defined in a system variable called PATH, which the user can specify. To see your PATH, you can type echo $PATH.
The result is a list of directories on your computer, which will be searched in order for the desired executable. From your output above, I assume that it contains this:
$ echo $PATH
/usr/bin/:/Library/Frameworks/Python.framework/Versions/3.5/bin/:/usr/local/bin/
Probably with some other paths interspersed as well. (On Windows, the equivalent command is echo %PATH%.) What this means is that when you type python, the system will go to /usr/bin/python. When you type ipython, in this example, the system will go to /Library/Frameworks/Python.framework/Versions/3.5/bin/ipython, because there is no ipython in /usr/bin/.
It's always important to know what executable you're using, particularly when you have so many installations of the same program on your system. Changing the path is not too complicated; see e.g. "How to permanently set $PATH on Linux?" or, for Windows, "How to set environment variables in Windows 10".
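To make the lookup order concrete, here is a rough Python imitation of which -a. It is a sketch for POSIX-style systems only (Windows additionally consults file extensions via PATHEXT), and which_all is a hypothetical helper name, not a standard function.

```python
import os


def which_all(name, path=None):
    """Return every executable called name on PATH, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries in this simplified version
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


# The first entry is the one the shell would run when you type the bare name.
for command in ("python", "ipython", "jupyter"):
    print(command, which_all(command))
```

If the first python and the first jupyter live in different directories, you are in exactly the situation described in the question.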
2. How Python finds packages
When you run python and do something like import matplotlib, Python has to play a similar game to find the package you have in mind. Similar to $PATH in unix, Python has sys.path that specifies these:
$ python
>>> import sys
>>> sys.path
['',
'/Users/jakevdp/anaconda/lib/python3.5',
'/Users/jakevdp/anaconda/lib/python3.5/site-packages',
...]
Some important things: by default, the first entry in sys.path is the current directory. Also, unless you modify this (which you shouldn't do unless you know exactly what you're doing) you'll usually find something called site-packages in the path: this is the default place that Python puts packages when you install them using python setup.py install, or pip, or conda, or a similar means.
The important thing to note is that each python installation has its own site-packages, where packages are installed for that specific Python version. In other words, if you install something for, e.g. /usr/bin/python, then ~/anaconda/bin/python can't use that package, because it was installed on a different Python! This is why in our twitter exchange I recommended you focus on one Python installation, and fix your $PATH so that you're only using the one you want to use.
There's another component to this: some Python packages come bundled with stand-alone scripts that you can run from the command line (examples are pip, ipython, jupyter, pep8, etc.) By default, these executables will be put in the same directory path as the Python used to install them, and are designed to work only with that Python installation.
That means that, as your system is set-up, when you run python, you get /usr/bin/python, but when you run ipython, you get /Library/Frameworks/Python.framework/Versions/3.5/bin/ipython which is associated with the Python version at /Library/Frameworks/Python.framework/Versions/3.5/bin/python! Further, this means that the packages you can import when running python are entirely separate from the packages you can import when running ipython or a Jupyter notebook: you're using two completely independent Python installations.
So how to fix this? Well, first make sure your $PATH variable is doing what you want it to. You likely have a startup script called something like ~/.bash_profile or ~/.bashrc that sets this $PATH variable; you can manually modify it if you want your system to search things in a different order. (On Windows, you can modify the user-specific environment variables instead.) When you first install anaconda/miniconda, there will be an option to do this automatically (add Python to the PATH): say yes to that, and then python will always point to ~/anaconda/bin/python, which is probably what you want.
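A quick way to check which installation you are actually in is to print the interpreter path next to its site-packages directory. A small sketch using only the standard library; interpreter_summary is a hypothetical helper name.

```python
import sys
import sysconfig


def interpreter_summary():
    """Collect where this Python lives and where it looks for packages."""
    return {
        "executable": sys.executable,
        # Default install location for pure-Python packages of this interpreter.
        "site_packages": sysconfig.get_paths()["purelib"],
        "sys_path": list(sys.path),
    }


summary = interpreter_summary()
print("python:       ", summary["executable"])
print("site-packages:", summary["site_packages"])
for entry in summary["sys_path"]:
    print("  sys.path entry:", repr(entry))
```

Run it once with python at the prompt and once in a notebook cell: if the two outputs differ, the prompt and the notebook are using two separate installations.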
3. How Jupyter knows what Python to use
We're not totally out of the woods yet. You mentioned that in the Jupyter notebook, you're getting a kernel error: this indicates that Jupyter is looking for a non-existent Python version.
Jupyter is set-up to be able to use a wide range of "kernels", or execution engines for the code. These can be Python 2, Python 3, R, Julia, Ruby... there are dozens of possible kernels to use. But in order for this to happen, Jupyter needs to know where to look for the associated executable: that is, it needs to know which path the python sits in.
These paths are specified in jupyter's kernelspec, and it's possible for the user to adjust them to their desires. For example, here's the list of kernels that I have on my system:
$ jupyter kernelspec list
Available kernels:
python2.7 /Users/jakevdp/.ipython/kernels/python2.7
python3.3 /Users/jakevdp/.ipython/kernels/python3.3
python3.4 /Users/jakevdp/.ipython/kernels/python3.4
python3.5 /Users/jakevdp/.ipython/kernels/python3.5
python2 /Users/jakevdp/Library/Jupyter/kernels/python2
python3 /Users/jakevdp/Library/Jupyter/kernels/python3
Each of these is a directory containing some metadata that specifies the kernel name, the path to the executable, and other relevant info.
You can adjust kernels manually, editing the metadata inside the directories listed above.
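To find out which kernel points at a Python that no longer exists (as in the error from the question), you can scan the kernel.json files yourself. This is a sketch under assumptions: broken_kernels is a hypothetical helper, and the directory at the bottom is just one of the locations from the listing above, so substitute whatever jupyter kernelspec list prints for you.

```python
import json
import os


def broken_kernels(kernels_dir):
    """Map kernel name -> missing executable for kernelspecs under kernels_dir."""
    broken = {}
    if not os.path.isdir(kernels_dir):
        return broken
    for name in sorted(os.listdir(kernels_dir)):
        spec_file = os.path.join(kernels_dir, name, "kernel.json")
        if not os.path.isfile(spec_file):
            continue
        with open(spec_file, encoding="utf-8") as handle:
            argv = json.load(handle).get("argv", [])
        # Only absolute paths can be checked directly; bare names such as
        # "python" are resolved through PATH when the kernel starts.
        if argv and os.path.isabs(argv[0]) and not os.path.exists(argv[0]):
            broken[name] = argv[0]
    return broken


# Example location taken from the listing above; adjust to your own system.
print(broken_kernels(os.path.expanduser("~/Library/Jupyter/kernels")))
```

Any kernel reported here either needs its argv fixed or can be removed and reinstalled from a working Python.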
The command to install a kernel can change depending on the kernel. IPython relies on the ipykernel package which contains a command to install a python kernel: for example
$ python -m ipykernel install
It will create a kernelspec associated with the Python executable you use to run this command. You can then choose this kernel in the Jupyter notebook to run your code with that Python.
You can see other options that ipykernel provides using the help command:
$ python -m ipykernel install --help
usage: ipython-kernel-install [-h] [--user] [--name NAME]
                              [--display-name DISPLAY_NAME] [--prefix PREFIX]
                              [--sys-prefix]
Install the IPython kernel spec.
optional arguments:
  -h, --help            show this help message and exit
  --user                Install for the current user instead of system-wide
  --name NAME           Specify a name for the kernelspec. This is needed to
                        have multiple IPython kernels at the same time.
  --display-name DISPLAY_NAME
                        Specify the display name for the kernelspec. This is
                        helpful when you have multiple IPython kernels.
  --prefix PREFIX       Specify an install prefix for the kernelspec. This is
                        needed to install into a non-default location, such as
                        a conda/virtual-env.
  --sys-prefix          Install to Python's sys.prefix. Shorthand for
                        --prefix='/Users/bussonniermatthias/anaconda'. For use
                        in conda/virtual-envs.
Note: recent versions of anaconda ship with an extension for the notebook that should automatically detect your various conda environments, provided the ipykernel package is installed in them.
Wrap-up: Fixing your Issue
So with that background, your issue is quite easy to fix:
Set your PATH so that the desired Python version is first. For example, you could run export PATH="/path/to/python/bin:$PATH" to specify (one time) which Python you'd like to use. To do this permanently, add that line to your .bash_profile/.bashrc (note that anaconda can do this automatically for you when you install it). I'd recommend using the Python that comes with anaconda or miniconda: this will allow you to conda install all the tools you need.
Make sure the packages you want to use are installed for that python. If you're using conda, you can type, e.g. conda install jupyter matplotlib scikit-learn to install those packages for anaconda/bin/python.
Make sure that your Jupyter kernels point to the Python versions you want to use. When you conda install jupyter it should set this up for anaconda/bin/python automatically. Otherwise you can use the jupyter kernelspec command or python -m ipykernel install command to adjust existing kernels or install new ones.
For installing modules into other Python Jupyter kernels not managed by Anaconda, you need to copy the path to the Python executable for the kernel and run /path/to/python -m pip install <package>

@jakevdp explained it very well.
When I updated my Ubuntu I had the same problem, and I solved it by changing the kernel configuration file (kernel.json).
To list the locations of the kernel files, use
jupyter kernelspec list
It will return
Available kernels:
python3 /home/user1/.local/share/jupyter/kernels/python3
python2 /usr/local/share/jupyter/kernels/python2
I was using python3, so I changed the file in
/home/user1/.local/share/jupyter/kernels/python3
with the following step:
nano /home/user1/.local/share/jupyter/kernels/python3/kernel.json
There, inside argv, I changed the first parameter (i.e. the path to the python3 executable) from
"/usr/bin/python3.5"
to
"/usr/bin/python3"
then saved and exited with Ctrl+X and restarted jupyter-notebook.
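If you prefer not to edit the file by hand, the same change can be made with a few lines of Python. A minimal sketch: set_kernel_python is a hypothetical helper, and the demonstration works on a throwaway kernel.json with illustrative contents rather than on a real kernelspec.

```python
import json
import os
import tempfile


def set_kernel_python(kernel_json_path, new_python):
    """Replace argv[0] in a kernel.json file and return the previous value."""
    with open(kernel_json_path, encoding="utf-8") as handle:
        spec = json.load(handle)
    old_python = spec["argv"][0]
    spec["argv"][0] = new_python
    with open(kernel_json_path, "w", encoding="utf-8") as handle:
        json.dump(spec, handle, indent=1)
    return old_python


# Demonstration on a temporary file, so no real kernel is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "kernel.json")
    sample = {
        "argv": ["/usr/bin/python3.5", "-m", "ipykernel_launcher",
                 "-f", "{connection_file}"],
        "display_name": "Python 3",
        "language": "python",
    }
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(sample, handle)
    previous = set_kernel_python(path, "/usr/bin/python3")
    with open(path, encoding="utf-8") as handle:
        updated = json.load(handle)

print(previous, "->", updated["argv"][0])
```

On a real system you would pass the kernel.json path reported by jupyter kernelspec list and then restart Jupyter, as described above.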

I also found that you should not put your virtual environment inside the git repo, because the Python packages in it can become unreadable. It seems different permissions are used for reading and writing: writing (installing a package with pip) worked, but reading did not. Hence, for me the Python libraries were being read from the system installation and not from the virtual environment.

@jakevdp's answer above and his blog post https://jakevdp.github.io/blog/2017/12/05/installing-python-packages-from-jupyter/ give a fairly good idea of what's going wrong. However, just updating the path from the shell did not work for me. There are 2 ways that did:
Either update the path in the notebook using magic commands, by running the lines below in a cell:
originalPath = %env PATH
%env PATH = [local anaconda path]/kernels/[custom_kernel]/bin/:$originalPath
Or you can update kernel.json and set PATH in its env section:
{
"argv": [
"[custom kernel path]/bin/python",
"-m",
"ipykernel_launcher",
"-f",
"{connection_file}"
],
"env": {
"PATH": "[custom kernel path]/bin/:[rest of the paths]"
},
"display_name": "custom_kernel",
"language": "python"
}

If you just want to install a package into the current environment to be able to import it, you can use the %pip and %conda magics.
As you mention anaconda, you probably should use conda to install:
# Install a conda package in the current Jupyter kernel
%conda install <dependency_name>
Alternatively, if you need to use pip:
# Install a pip package in the current Jupyter kernel
%pip install <python_package_name>
