Installing RDKit in Google Colab

I cannot figure out how to fix the following issue. Up until today I was using the following code snippet for installing RDKit in Google Colab:
!wget -c https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
!chmod +x Miniconda3-latest-Linux-x86_64.sh
!time bash ./Miniconda3-latest-Linux-x86_64.sh -b -f -p /usr/local
!time conda install -q -y -c conda-forge rdkit
import sys
sys.path.append('/usr/local/lib/python3.7/site-packages/')
However, today I started to get the following error:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-2-d24c24e2d1f9> in <module>()
----> 1 from rdkit import Chem
2 import networkx as nx
ModuleNotFoundError: No module named 'rdkit'
I've tried using the full Anaconda distribution instead of Miniconda, as well as changing the python version to 3.6 and 3.8 but nothing seems to work.
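One way to investigate this kind of error is to list which pythonX.Y/site-packages directories actually exist under the install prefix and compare them with the interpreter the notebook is running. The sketch below assumes the /usr/local prefix used in the snippet above; the helper names find_site_packages and running_version are made up for illustration.

```python
import sys
from pathlib import Path


def find_site_packages(prefix="/usr/local"):
    """Map "X.Y" -> Path for every python3.*/site-packages under prefix/lib."""
    found = {}
    for candidate in sorted(Path(prefix, "lib").glob("python3.*/site-packages")):
        if candidate.is_dir():
            # "python3.8" -> "3.8"
            version = candidate.parent.name[len("python"):]
            found[version] = candidate
    return found


def running_version():
    """Return the running interpreter's version as "X.Y"."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


# Show every candidate directory and whether it matches this interpreter.
print("interpreter:", running_version())
for version, path in find_site_packages().items():
    note = "matches" if version == running_version() else "different version"
    print(version, path, note)
```

If conda installed rdkit under a directory marked "different version", appending that directory to sys.path is unlikely to help, because compiled extensions are built for one specific Python version.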

I created a Python package to simplify the setup. You can find it here.
It will install Miniconda (or any other flavour) and patch a couple things that make Colab tricky.
Use it like this (first cell in your notebook):
!pip install -q condacolab
import condacolab
condacolab.install()
The kernel will restart and then you will be able to run conda or mamba with the !shell syntax:
!mamba install -c conda-forge rdkit
Check the repository for more details!

I think you need to pin Python 3.7 when you install Miniconda. The current rdkit build supports Python 3.7, while the latest Miniconda installer ships Python 3.8:
!wget -c https://repo.continuum.io/miniconda/Miniconda3-py37_4.8.3-Linux-x86_64.sh
!chmod +x Miniconda3-py37_4.8.3-Linux-x86_64.sh
!time bash ./Miniconda3-py37_4.8.3-Linux-x86_64.sh -b -f -p /usr/local
!time conda install -q -y -c conda-forge rdkit
import sys
sys.path.append('/usr/local/lib/python3.7/site-packages/')
https://colab.research.google.com/drive/1MAZyv3O4-TrI8c1MD4JVmwExDquaprRT?usp=sharing
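To avoid hardcoding the Python version, the installer name can be derived from the running interpreter. This is only a sketch: the URL pattern and the 4.8.3 release number are copied from the commands above, the function name miniconda_installer_url is hypothetical, and not every Python/release combination is actually published, so check that the file exists before downloading.

```python
import sys


def miniconda_installer_url(python_version=None, conda_release="4.8.3",
                            base="https://repo.continuum.io/miniconda"):
    """Build a pinned Miniconda installer URL for a (major, minor) Python version.

    Defaults to the version of the interpreter running this code.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    major, minor = python_version
    return f"{base}/Miniconda3-py{major}{minor}_{conda_release}-Linux-x86_64.sh"


# Reproduces the URL used in the answer above.
print(miniconda_installer_url((3, 7)))
# URL matching whatever interpreter the notebook is running (may not exist).
print(miniconda_installer_url())
```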

If you want to avoid installing Conda, you can just extract the Anaconda package:
# version 2018 is quite easy
# download & extract
url = 'https://anaconda.org/rdkit/rdkit/2018.09.1.0/download/linux-64/rdkit-2018.09.1.0-py36h71b666b_1.tar.bz2'
!curl -L $url | tar xj lib
# move to python packages directory
!mv lib/python3.6/site-packages/rdkit /usr/local/lib/python3.6/dist-packages/
x86 = '/usr/lib/x86_64-linux-gnu'
!mv lib/*.so.* $x86/
# rdkit needs libboost_python3.so.1.65.1
!ln -s $x86/libboost_python3-py36.so.1.65.1 $x86/libboost_python3.so.1.65.1
For the latest version, it's a bit more complicated due to libboost 1.67, so I put it in my kora library.
!pip install kora -q
import kora.install.rdkit
You'll get version 2020.09.1
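When extracting a conda package by hand as in the 2018 example above, the package must have been built for the Python version you are running. The build string in the file name usually carries a pyXY tag, so a quick check is possible before downloading. This is a heuristic sketch that assumes the tag is present; python_tag and matches_interpreter are hypothetical helper names.

```python
import re
import sys


def python_tag(package_filename):
    """Return (major, minor) parsed from a "-pyXY" build tag, or None if absent."""
    match = re.search(r"-py(\d)(\d+)", package_filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def matches_interpreter(package_filename):
    """True if the package's Python tag equals the running interpreter's version."""
    return python_tag(package_filename) == tuple(sys.version_info[:2])


name = "rdkit-2018.09.1.0-py36h71b666b_1.tar.bz2"
print(python_tag(name))           # tag parsed from the file name above
print(matches_interpreter(name))  # whether it fits this interpreter
```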

First, install condacolab in Colab:
!pip install -q condacolab
import condacolab
condacolab.install()
Then install rdkit with conda:
!conda install -c rdkit rdkit
With these steps, rdkit should install and import correctly.

Related

Installing Open3d-Ml with Pytorch (on MacOs)

I created a virtualenv with Python 3.10 and installed Open3D and PyTorch according to the instructions on the Open3D-ML webpage, but when I test it with import open3d.ml.torch I get the error:
Exception: Open3D was not built with PyTorch support!
Steps to reproduce
python3.10 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install open3d
pip install torch torchvision torchaudio
Error
% python -c "import open3d.ml.torch as ml3d"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/xx/.venv/lib/python3.10/site-packages/open3d/ml/torch/__init__.py", line 34, in <module>
raise Exception('Open3D was not built with PyTorch support!')
Exception: Open3D was not built with PyTorch support!
Environment:
% python3 --version
Python 3.10.9
% pip freeze
open3d==0.16.1
torch==1.13.1
torchaudio==0.13.1
torchvision==0.14.1
OS
macOS 12.6
Kernel Version: Darwin 21.6.0
I also checked below similar issues but they don't have answers:
https://github.com/isl-org/Open3D/discussions/5849
https://github.com/isl-org/Open3D-ML/issues/557
Open3D-ML and pytorch
According to discussion 5849, the problem is not specific to macOS: a similar error occurs in a Docker container running Ubuntu 20.04.
Does anyone know how we can tackle this?

You will need to enable PyTorch / TensorFlow when building with CMake; have a look at https://medium.com/@kidargueta/installing-open3d-ml-for-3d-computer-vision-with-pytorch-d640a6862e19 and http://www.open3d.org/docs/release/compilation.html
cmake -DBUILD_CUDA_MODULE=ON -DGLIBCXX_USE_CXX11_ABI=OFF -DBUILD_PYTORCH_OPS=ON -DBUILD_CUDA_MODULE=ON -DBUNDLE_OPEN3D_ML=ON -DOPEN3D_ML_ROOT=Open3D-ML -DBUILD_JUPYTER_EXTENSION:BOOL=ON -DBUILD_WEBRTC=ON -DOPEN3D_ML_ROOT=https://github.com/isl-org/Open3D-ML.git -DPython3_ROOT=/path/to/your/conda-env/bin/python ..
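Before rebuilding, it may help to confirm how the installed wheel was configured. The sketch below assumes Open3D exposes a private dictionary named _build_config whose keys mirror the CMake options (for example BUILD_PYTORCH_OPS); that attribute is an internal detail and may differ between releases, so the code falls back to an empty mapping if it is missing. The helper pytorch_support is hypothetical.

```python
def pytorch_support(build_config):
    """Summarize PyTorch-related flags from an Open3D-style build-config mapping."""
    return {
        "pytorch_ops": bool(build_config.get("BUILD_PYTORCH_OPS", False)),
        "cuda": bool(build_config.get("BUILD_CUDA_MODULE", False)),
    }


try:
    import open3d
    # Assumption: private attribute, present in the releases I have seen.
    config = getattr(open3d, "_build_config", {})
except Exception:
    # Open3D is not installed or failed to load; report everything as off.
    config = {}

print(pytorch_support(config))
```

If pytorch_ops comes back False, the wheel was built without the PyTorch ops and no combination of pip installs will change that; a build with -DBUILD_PYTORCH_OPS=ON is needed.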

Finally, I decided to build Open3D from source for Mac M1. I mostly followed the official Open3D build instructions, with help from the Medium article linked in one of the replies.
Build Open3d-ml with Pytorch on Mac M1
For the OS environment see the main question.
conda create --name open3d-ml-build python==3.10
conda activate open3d-ml-build
# install pytorch from pytorch.org
conda install pytorch torchvision torchaudio -c pytorch
# now clone open3d in a desired location
git clone --branch v0.16.1 git@github.com:isl-org/Open3D.git ./foo/open3d-0.16-build
cd ./foo/open3d-0.16-build
mkdir build && cd build
git clone git@github.com:isl-org/Open3D-ML.git
Now make sure you are in the activated conda env.
Build
(takes very long and a lot of memory)
Note: on Mac M1 you don't have CUDA but Metal Performance Shaders (MPS), so I turned the CUDA flag OFF in the CMake configuration.
which python3
>> /Users/XX/miniconda3/envs/open3d-ml-build/bin/python3
# in the build directory
cmake -DBUILD_CUDA_MODULE=OFF -DGLIBCXX_USE_CXX11_ABI=OFF \
-DBUILD_PYTORCH_OPS=ON \
-DBUNDLE_OPEN3D_ML=ON -DOPEN3D_ML_ROOT=Open3D-ML \
-DBUILD_JUPYTER_EXTENSION:BOOL=OFF \
-DPython3_ROOT=/Users/XX/miniconda3/envs/open3d-ml-build/bin/python3 ..
make -j$(sysctl -n hw.physicalcpu)  # append VERBOSE=1 for verbose output
If it fails, try again, or rerun with VERBOSE=1 and look for "fatal error" in the output.
Install
# Install pip package in the current python environment
make install-pip-package
# if you get a ModuleNotFoundError for yapf
pip install yapf
# Create Python package in build/lib
make python-package
# Create pip wheel in build/lib
# This creates a .whl file that you can install manually.
make pip-package
Sanity check
Again, in the activated conda environment:
# if not installed
pip install tensorboard
python3 -c "import open3d; import open3d.ml.torch"
pip freeze | grep torch
torch==1.13.1
torchaudio==0.13.1
torchvision==0.14.1
If you don't get any errors you should be good to go.

Install conda package to base python env

I'm using databricks for Spark and one of the packages I want to install is cyipopt for Python. cyipopt documentation recommends installation of the package from conda-forge using the command
conda install -c conda-forge cyipopt
The problem is that Databricks has recently disabled conda due to an update to Anaconda's terms and conditions, and only supports pip, where cyipopt isn't available. So I'm looking for alternative methods of installing this package, with no luck so far. One of the things I tried is to install conda in Databricks using:
%sh mkdir -p ./miniconda3
%sh wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ./miniconda3/miniconda.sh
%sh bash ./miniconda3/miniconda.sh -b -u -p ./miniconda3
%sh ./miniconda3/bin/conda install -c conda-forge cyipopt
The above commands installed conda successfully, but the last command, which installs cyipopt, kept running for several hours on each of my many attempts, so I assumed it had failed.
Secondly, I manually downloaded the tar.bz2 files of cyipopt and its dependencies from conda-forge and was able to install all of them successfully, including cyipopt. However, it looks like they are not accessible from the default Python of Databricks (likely because conda installed them in a different env).
Can I get conda to install packages in the same env as the default Python instead of in its own new environment? Or is there any other easier way to install cyipopt package in the default environment?

I got a solution from this article, which describes a similar process for Colab: https://towardsdatascience.com/conda-google-colab-75f7c867a522
Note that %sh in the code below is the Databricks magic command for running shell commands.
## install conda
%sh mkdir -p ./miniconda3
# ## download miniconda compatible with below python
# ## from https://docs.conda.io/en/latest/miniconda.html
import sys
print(sys.version)
## change Miniconda download link relevant to your python version, the below link is for Python 3.8 only
%sh wget https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-Linux-x86_64.sh -O ./miniconda3/miniconda.sh
%sh bash ./miniconda3/miniconda.sh -b -u -p /usr/local
%sh which conda
%sh rm -rf ./miniconda3/miniconda.sh
%sh conda init bash
%sh conda init zsh
%sh conda --version
## update conda
## change python version below to your python version
%sh conda install --channel defaults conda python=3.8 --yes
%sh conda update --channel defaults --all --yes
%sh conda --version
## add conda packages path to sys.path so that Python can find those packages
import sys
sys.path
## change python version to your respective python version
_ = (sys.path.append("/usr/local/lib/python3.8/site-packages"))
## install your package
%sh conda install -c conda-forge cyipopt --yes
import cyipopt
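The comments above ask you to edit the Python version by hand in several places. For the sys.path step, the directory can be derived from the running interpreter instead. This is a sketch assuming the /usr/local prefix used above and that the conda install uses the same Python minor version as the notebook; conda_site_packages and add_to_sys_path are hypothetical helper names.

```python
import sys


def conda_site_packages(prefix="/usr/local", version_info=None):
    """Return prefix/lib/pythonX.Y/site-packages for the given (or running) version."""
    if version_info is None:
        version_info = sys.version_info
    return f"{prefix}/lib/python{version_info[0]}.{version_info[1]}/site-packages"


def add_to_sys_path(path, search_path=None):
    """Append path once; return True if it was added, False if already present."""
    if search_path is None:
        search_path = sys.path
    if path in search_path:
        return False
    search_path.append(path)
    return True


# Safe to rerun: the directory is only appended the first time.
add_to_sys_path(conda_site_packages())
```

Appending rather than inserting at the front keeps the platform's own packages first on the search path, which matches what the answer above does.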

How to change the pytorch version in Google colab

I need to change the PyTorch version in Google Colab, so I installed Miniconda:
%%bash
MINICONDA_INSTALLER_SCRIPT=Miniconda3-4.5.4-Linux-x86_64.sh
MINICONDA_PREFIX=/usr/local
wget https://repo.continuum.io/miniconda/$MINICONDA_INSTALLER_SCRIPT
chmod +x $MINICONDA_INSTALLER_SCRIPT
./$MINICONDA_INSTALLER_SCRIPT -b -f -p $MINICONDA_PREFIX
import sys
_ = (sys.path
.append("/usr/local/lib/python3.6/site-packages"))
and then ran:
!conda install pytorch==1.0.0 torchvision==0.2.1 cuda100 -c pytorch --yes
But when I run
import torch
torch.__version__
the reported version is 1.9+cuda120.
What's more, when I try to run
pip uninstall torch
Colab asks whether I want to uninstall pytorch-1.0.0.
How does this happen?

First, you have to run
!pip uninstall torch -y
The -y flag skips the confirmation prompt. This uninstalls torch and takes roughly 5 minutes.
Then, you have to
!pip install torch==1.0.0
And finally
import torch
torch.__version__
# '1.0.0'
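As for why the question saw two different versions: one plausible explanation is that two copies of torch were installed in different directories, and Python imports whichever one comes first on sys.path. The sketch below scans the search path for every copy of a package. It is a simplification that only looks for plain directories and .py files (no zip imports, namespace packages or .pth handling), and find_copies is a hypothetical helper.

```python
import os
import sys


def find_copies(package, search_path=None):
    """Return the sys.path entries that contain `package`, in import order."""
    if search_path is None:
        search_path = sys.path
    hits = []
    for entry in search_path:
        # Skip the empty entry (current directory) and non-directories such as zip files.
        if not entry or not os.path.isdir(entry):
            continue
        candidate = os.path.join(entry, package)
        if os.path.isdir(candidate) or os.path.isfile(candidate + ".py"):
            hits.append(entry)
    return hits


# The first hit is the copy that `import torch` would load; later ones are shadowed.
for index, location in enumerate(find_copies("torch")):
    label = "imported" if index == 0 else "shadowed"
    print(label, location)
```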

Python Name Error ('XLMProphetNetTokenizer' is not defined)

I'm working on a cheminformatics project that requires quite a few packages to complete. I'm working in Google Colab.
TLDR
Google Colab throws error NameError: name 'XLMProphetNetTokenizer' is not defined even though I explicitly load the relevant package with !pip install transformers.
Long Explanation
To complete my project, I require packages such as RDKit, Moleculekit from Acellera, and the NLP tool allennlp. Google Colab seemed to have issues loading the cheminformatics package RDKit so I had to install using conda. I used !pip install for the rest of the packages. Code as follows:
# Installing RDKit
!wget -c https://repo.continuum.io/miniconda/Miniconda3-py37_4.8.3-Linux-x86_64.sh
!chmod +x Miniconda3-py37_4.8.3-Linux-x86_64.sh
!time bash ./Miniconda3-py37_4.8.3-Linux-x86_64.sh -b -f -p /usr/local
!time conda install -q -y -c conda-forge rdkit
import sys
sys.path.append('/usr/local/lib/python3.7/site-packages/')
!apt-get install python-rdkit librdkit1 rdkit-data -qq
# Installing ProphetNet Packages (Needed for ElmoEmbedder)
!pip install transformers
# Installing Moleculekit
!pip install moleculekit
# Installing AllenNLP
!pip install allennlp
However, when I call from allennlp.commands.elmo import ElmoEmbedder, Google Colab throws the following error: NameError: name 'XLMProphetNetTokenizer' is not defined
According to the following link, https://huggingface.co/transformers/master/model_doc/xlmprophetnet.html, I should be able to load the relevant XLMProphetNetTokenizer by using !pip install transformers. I'm not sure why it still throws the error. Regardless, I would appreciate any help.

How to install mayavi on google Colab?

I tried installing mayavi on Colab using pip:
!pip install mayavi
This threw the following error:
Running setup.py bdist_wheel for mayavi ... error
The rest of the error output is available at the Colab document.
Solution: Work in Progress
Following the response from @Bob-Smith, I found that his solution needed a slight change for installing the dependencies:
!apt-get install vtk6
!apt-get install libvtk6-dev python-vtk6
Problems Faced and Workaround Found (PFWF)
PFWF-001 !apt-get install python-vtk throws the following error:
Package 'python-vtk' has no installation candidate
I found a command-reference for this:
!apt-get install libvtk5-dev python-vtk
However, this command did not work either. The package name has changed from libvtk5-dev to libvtk6-dev, and the Python binding for VTK has changed from python-vtk to python-vtk6. This kind of change will clearly keep happening, so you may need to check the current package name and the Python binding for VTK before running the following statement:
!apt-get install libvtk6-dev python-vtk6
Note: If you are here looking to solve VTK installation problems for python and this does not solve that you may want to look here: installing-vtk-for-python
Installing mayavi still throws error:
Although the two steps above install the dependencies, the last line: !pip install mayavi spits out the following error:
Could not connect to any X display.
The latest progress on Mayavi installation can be found here.
https://colab.research.google.com/drive/1K_VIP9izNLKalD_IgBSiTowyNkU7aWcW
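The "Could not connect to any X display" message suggests the code is running on a machine with no X server, which is typical for hosted notebook runtimes. A quick first check is whether the DISPLAY environment variable is set at all. This sketch only inspects the variable and does not test whether a server is actually reachable; has_x_display is a hypothetical helper.

```python
import os


def has_x_display(environ=None):
    """True if DISPLAY is set to a non-empty value in the given environment."""
    if environ is None:
        environ = os.environ
    return bool(environ.get("DISPLAY", "").strip())


print("X display configured:", has_x_display())
```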

You'll first need to install deps. Run
!apt-get install vtk6
!apt-get install python-vtk
!pip install mayavi
If you've attempted to import mayavi before installing the deps, you may need to restart your runtime, using the Runtime -> Restart runtime menu, before executing the !pip install mayavi command.

I was trying to do the same thing and was getting an error like this, so I tried installing the vtk package with conda. You need conda to install vtk, of course, so:
!wget -c https://repo.anaconda.com/miniconda/Miniconda3-4.5.4-Linux-x86_64.sh
!chmod +x Miniconda3-4.5.4-Linux-x86_64.sh
!bash ./Miniconda3-4.5.4-Linux-x86_64.sh -b -f -p /usr/local
!conda install -q -y --prefix /usr/local python=3.6 ujson
import sys
sys.path.append('/usr/local/lib/python3.6/site-packages')
import ujson
print(ujson.dumps({1:2}))
!conda --version
Then install the vtk package with conda:
!apt-get install vtk6
!conda install -c anaconda vtk
I was trying to install mayavi for the mne package, so:
!conda activate mne
!conda install gxx_linux-64=7.3
!pip install https://api.github.com/repos/enthought/mayavi/zipball/226189a6ad3dc3c01d031ef21d0d0cde554ac851
Be careful: you need the mne package installed before installing mayavi (as I said, I was setting this up for mne):
!pip install mne
