Install conda package manually on Databricks ML Runtime - python

I have a Databricks ML Runtime cluster. I am trying to install the fbprophet library using a cluster init script, following the example in the Databricks documentation.
#!/bin/bash
set -x
. /databricks/conda/etc/profile.d/conda.sh
conda info --envs
conda activate /databricks/python
conda install -y --offline /dbfs/conda_packages/linux-64/fbprophet.tar.bz2
But the init script logs show that it cannot activate the conda env and cannot locate the package at the given path.
: invalid option
set: usage: set [-abefhkmnptuvxBCHP] [-o option-name] [--] [arg ...]
bash: line 2: /databricks/conda/etc/profile.d/conda.sh
: No such file or directory
usage: conda [-h] [-V] command ...
conda: error: unrecognized arguments: --envs
CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run
$ conda init <SHELL_NAME>
Currently supported shells are:
- bash
- fish
- tcsh
- xonsh
- zsh
- powershell
See 'conda init --help' for more information and options.
IMPORTANT: You may need to close and restart your shell after running 'conda init'.
PackagesNotFoundError: The following packages are not available from current channels:
- /dbfs/conda_packages/linux-64/fbprophet.tar.bz2
Current channels:
- https://repo.anaconda.com/pkgs/main/linux-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/free/linux-64
- https://repo.anaconda.com/pkgs/free/noarch
- https://repo.anaconda.com/pkgs/r/linux-64
- https://repo.anaconda.com/pkgs/r/noarch
To search for alternate channels that may provide the conda package you're
looking for, navigate to
https://anaconda.org
and use the search bar at the top of the page.
Can you please guide me on what is wrong with the script, and how I can install the conda package from a path on DBFS?
Or is there any other way to install conda packages? If I try to install via the UI or the Libraries API, the fbprophet package fails to install.
Regards
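The stray ": invalid option" and the path error ending in a bare colon are classic symptoms of a script saved with Windows (CRLF) line endings: bash treats the trailing carriage return as part of each line, so `set -x` becomes `set -x\r` and the sourced path gains an invisible `\r`. This is a guess at the root cause, not confirmed against the asker's file; a small sketch of how to detect and strip the carriage returns before uploading the init script:

```shell
#!/bin/bash
# Simulate an init script saved with CRLF line endings.
script=/tmp/demo_init.sh
printf 'set -x\r\n. /databricks/conda/etc/profile.d/conda.sh\r\n' > "$script"

# Detect: count lines containing a carriage return (non-zero means CRLF).
grep -c $'\r' "$script"

# Fix: delete every carriage return, then replace the original file.
tr -d '\r' < "$script" > "$script.lf" && mv "$script.lf" "$script"
grep -c $'\r' "$script" || echo "clean: no CR characters remain"
```

If the file is clean and the PackagesNotFoundError persists, copying the .tar.bz2 off /dbfs to a local path such as /tmp before `conda install -y --offline` is worth trying, though that is an untested assumption here.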

Related

How to remove anaconda environments that are not in the default env directories?

I created a conda environment in a path of my choice, rather than anaconda's default directory, with:
~$ conda create --prefix=/data/sfy_envs/test python=3.8
After it succeeded, the environments are visible to conda:
~$ conda info --envs
# conda environments:
#
base          *  /data/miniconda3
maskrcnn_sfy     /data/miniconda3/envs/maskrcnn_sfy
torch16-sfy      /data/miniconda3/envs/torch16-sfy
                 /data/sfy_envs/test
                 /data/sfy_envs/tf2-sfy
The last two environments were created with the --prefix parameter and have no name. I can activate them by referring directly to their path:
~$ conda activate /data/sfy_envs/test
But I cannot remove them. For example to remove test, I tried:
~$ conda remove /data/sfy_envs/test
Collecting package metadata (repodata.json): done
Solving environment: failed
PackagesNotFoundError: The following packages are missing from the target environment:
- /data/sfy_envs/test
and
~$ conda remove -p /data/sfy_envs/test
CondaValueError: no package names supplied,
try "conda remove -h" for more details
Neither works, and I have no idea why.
Or could I just delete the environment directory manually and remove its path from the file .conda/environments.txt? I'm not sure whether that is a safe approach.
Use
conda env remove --prefix /data/sfy_envs/test
or
conda remove --prefix /data/sfy_envs/test --all

pycafe installation on conda fails in MacBook Air Catalina

I use a MacBook Air with the Catalina OS. The conda version I use is 4.9.2.
I am trying to install pycafe with conda, using the following command in my terminal:
conda install -c paulscherrerinstitute pycafe
And I get the following error:
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
PackagesNotFoundError: The following packages are not available from current channels:
- pycafe
Current channels:
- https://conda.anaconda.org/paulscherrerinstitute/osx-64
- https://conda.anaconda.org/paulscherrerinstitute/noarch
- https://repo.anaconda.com/pkgs/main/osx-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/r/osx-64
- https://repo.anaconda.com/pkgs/r/noarch
- https://conda.anaconda.org/conda-forge/osx-64
- https://conda.anaconda.org/conda-forge/noarch
To search for alternate channels that may provide the conda package you're
looking for, navigate to
https://anaconda.org
and use the search bar at the top of the page.
Can you please help me install pycafe?
Package Unavailable for OS X
As mentioned in the comments, this particular package is only available through a user channel, and it was built only for the linux-64 platform and Python 3.5. If you're keen to avoid compiling it yourself (and note that this build is still the latest), then I'd suggest using Docker to build and host the environment.
Alternative: Docker Container
I put together a basic Dockerfile and YAML as a starter, but I'd expect some additional configuration (e.g., setting environment variables) may be needed.
pycafe.yaml
name: pycafe
channels:
  - paulscherrerinstitute
  - conda-forge
  - defaults
dependencies:
  - python=3.5
  - pycafe
  - cafe
  - numpy
Dockerfile
FROM continuumio/miniconda3
SHELL ["/bin/bash", "--login", "-c"]
# Install PyCafe
COPY pycafe.yaml .
RUN conda env create -f pycafe.yaml -n pycafe \
&& conda config --set auto_activate_base false \
&& sed -i "s/conda activate base/conda activate pycafe/" ~/.bashrc \
&& conda clean -a -y
Placing both of these in a directory, one can build the Docker image with
docker build -t "pycafe:1.3.0" .
Then one could launch a Python session with
docker run -it pycafe:1.3.0 bash
python

How to cache pip packages within Azure Pipelines

Although this source provides a lot of information on caching within Azure Pipelines, it is not clear how to cache pip packages for a Python project.
How should one proceed to cache pip packages in an Azure Pipelines build?
According to this, pip caching may become enabled by default in the future; as far as I know, that is not yet the case.
I used the pre-commit documentation as inspiration:
https://pre-commit.com/#azure-pipelines-example
https://github.com/asottile/azure-pipeline-templates/blob/master/job--pre-commit.yml
and configured the following Python pipeline with Anaconda:
pool:
  vmImage: 'ubuntu-latest'

variables:
  CONDA_ENV: foobar-env
  CONDA_HOME: /usr/share/miniconda/envs/$(CONDA_ENV)/

steps:
  - script: echo "##vso[task.prependpath]$CONDA/bin"
    displayName: Add conda to PATH
  - task: Cache@2
    displayName: Use cached Anaconda environment
    inputs:
      key: conda | environment.yml
      path: $(CONDA_HOME)
      cacheHitVar: CONDA_CACHE_RESTORED
  - script: conda env create --file environment.yml
    displayName: Create Anaconda environment (if not restored from cache)
    condition: eq(variables.CONDA_CACHE_RESTORED, 'false')
  - script: |
      source activate $(CONDA_ENV)
      pytest
    displayName: Run unit tests
To cache a standard pip install, use this:
variables:
  # variables are automatically exported as environment variables,
  # so this will override pip's default cache dir
  - name: pip_cache_dir
    value: $(Pipeline.Workspace)/.pip

steps:
  - task: Cache@2
    inputs:
      key: 'pip | "$(Agent.OS)" | requirements.txt'
      restoreKeys: |
        pip | "$(Agent.OS)"
      path: $(pip_cache_dir)
    displayName: Cache pip
  - script: |
      pip install -r requirements.txt
    displayName: "pip install"
I wasn't very happy with the standard pip cache implementation mentioned in the official documentation. You still install your dependencies normally, which means pip performs many checks that take time; it eventually finds the cached builds (*.whl, *.tar.gz), but the whole process is slow. You can opt to use venv or conda instead, but for me that led to buggy situations with unexpected behaviour. What I ended up doing instead was using pip download and pip install separately:
variables:
  pipDownloadDir: $(Pipeline.Workspace)/.pip

steps:
  - task: Cache@2
    displayName: Load cache
    inputs:
      key: 'pip | "$(Agent.OS)" | requirements.txt'
      path: $(pipDownloadDir)
      cacheHitVar: cacheRestored
  - script: pip download -r requirements.txt --dest=$(pipDownloadDir)
    displayName: "Download requirements"
    condition: eq(variables.cacheRestored, 'false')
  - script: pip install -r requirements.txt --no-index --find-links=$(pipDownloadDir)
    displayName: "Install requirements"

Install python modules on Elastic Beanstalk with conda

Problem
I am trying to install Python packages that have C dependencies on AWS Elastic Beanstalk (namely fbprophet and xgboost).
On Amazon Linux 2, Elastic Beanstalk installs Python packages from requirements.txt by default with pip or pipenv.
However, fbprophet and xgboost have C dependencies that need to be compiled before they can be installed with pip. conda ships these libraries precompiled, so they are much easier to install with conda.
What I have tried
Here is my attempt at installing them with conda, using a .config file in the .ebextensions folder:
commands:
  00_download_conda:
    command: 'wget http://repo.continuum.io/archive/Anaconda3-2020.02-Linux-x86_64.sh'
    test: test ! -d /anaconda
  01_install_conda:
    command: 'bash Anaconda3-2020.02-Linux-x86_64.sh -b -f -p /anaconda'
    test: test ! -d /anaconda
  02_reload_bash:
    command: 'source ~/.bashrc'
  03_create_home:
    command: 'mkdir -p /home/wsgi'
  04_conda_env:
    command: 'conda env create -f environment.yml'
  05_activate_env:
    command: 'conda activate demo_forecast'
However, this does not work and throws this error:
[2020-04-21T18:18:22.285Z] INFO [3699] - [Application update app-8acc-200421_201612#4/AppDeployStage0/EbExtensionPreBuild/Infra-EmbeddedPreBuild/prebuild_0_test_empty_dash/Command 03_conda_env] : Activity execution failed, because: /bin/sh: conda: command not found
(ElasticBeanstalk::ExternalInvocationError)
[2020-04-21T18:18:22.285Z] INFO [3699] - [Application update app-8acc-200421_201612#4/AppDeployStage0/EbExtensionPreBuild/Infra-EmbeddedPreBuild/prebuild_0_test_empty_dash/Command 03_conda_env] : Activity failed.
So it seems that sourcing the .bashrc does not make the conda command available.
I am aware of this question and its answer; however, it is a little old and does not provide enough detail for my case, because it does not go through installing packages with conda.
Another way would be to install and compile the C dependencies before pip-installing the requirements, but I have had no success there so far.
Thank you for the help!
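One likely explanation for the "conda: command not found" error, offered here as an assumption rather than a verified Beanstalk behavior: each entry under commands: runs in its own shell, so the `source ~/.bashrc` in 02_reload_bash cannot affect the environment of 04_conda_env. A tiny demonstration of the effect (nothing Beanstalk-specific):

```shell
#!/bin/bash
# Each .ebextensions command runs in a fresh shell, like these two
# separate bash -c invocations: the PATH edit made in the first shell
# is gone by the time the second one runs.
bash -c 'export PATH=/anaconda/bin:$PATH; echo "$PATH"' | grep -q '/anaconda/bin' \
  && echo "PATH modified inside the first shell"
bash -c 'echo "$PATH"' | grep -q '/anaconda/bin' \
  || echo "change did not survive into the second shell"
```

Calling conda by its absolute path (e.g. /anaconda/bin/conda env create -f environment.yml) sidesteps the problem; whether that alone is enough for conda activate on Beanstalk is untested here.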

Multiple python versions with anaconda and travis

I have a Python project that makes use of libraries that need to be built, and I use anaconda. I want to create a Travis plan that lets me test against multiple Python versions, but I have not been able to do it. Here is what I have:
I want to test it against multiple python versions (e.g. 2.7, 3.5, 3.6)
I have an environment.yml file which looks like the following:
channels:
  - kne
dependencies:
  - numpy
  - pytest
  - numpy
  - scipy
  - matplotlib
  - seaborn
  - pybox2d
  - pip:
    - gym
    - codecov
    - pytest
    - pytest-cov
My .travis.yml contains:
language: python
# sudo false implies containerized builds
sudo: false
python:
- 3.5
- 3.4
before_install:
# Here we download miniconda and install the dependencies
- export MINICONDA=$HOME/miniconda
- export PATH="$MINICONDA/bin:$PATH"
- hash -r
- wget http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh
- bash miniconda.sh -b -f -p $MINICONDA
- conda config --set always_yes yes
- conda update conda
- conda info -a
- echo "Python version var"
- echo $TRAVIS_PYTHON_VERSION
- conda env create -n testenv -f environment.yml python=$TRAVIS_PYTHON_VERSION
- source activate testenv
install:
- python setup.py install
script:
- python --version
- python -m pytest --cov=.
- codecov
If I put the python version into environment.yml it works fine, but then I can't use multiple Python versions. It seems that if -f is provided, conda env create ignores any additional packages listed on the command line.
Also, adding - conda install -n testenv python=$TRAVIS_PYTHON_VERSION after environment creation does not work:
UnsatisfiableError: The following specifications were found to be in conflict:
- functools32 -> python 2.7.*
- python 3.5*
Use "conda info <package>" to see the dependencies for each package.
What am I supposed to do in order to make it work?
If you would like to see more details, they are available here: https://travis-ci.org/mbednarski/Chiron/jobs/220644726
You can use sed to modify the python dependency in environment.yml file before creating the conda environment.
Include python in the environment.yml:
channels:
  - kne
dependencies:
  - python=3.6
  - numpy
  - pytest
  - numpy
  - scipy
  - matplotlib
  - seaborn
  - pybox2d
  - pip:
    - gym
    - codecov
    - pytest
    - pytest-cov
And then modify your .travis.yml:
before_install:
  # Here we download miniconda and install the dependencies
  - export MINICONDA=$HOME/miniconda
  - export PATH="$MINICONDA/bin:$PATH"
  - hash -r
  - wget http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh
  - bash miniconda.sh -b -f -p $MINICONDA
  - conda config --set always_yes yes
  - conda update conda
  - conda info -a
  - echo "Python version var"
  - echo $TRAVIS_PYTHON_VERSION
  # Edit the environment.yml file for the target Python version
  - sed -i -E 's/(python=)(.*)/\1'$TRAVIS_PYTHON_VERSION'/' ./environment.yml
  - conda env create -n testenv -f environment.yml
  - source activate testenv
The sed regex will replace the text python=3.6 with the equivalent for the target python version.
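To see the substitution in isolation, the same sed command can be run against a throwaway copy of the file (GNU sed; the paths and version strings here are arbitrary examples):

```shell
#!/bin/bash
# Reproduce the answer's sed edit on a scratch environment.yml.
TRAVIS_PYTHON_VERSION=3.5
printf 'dependencies:\n  - python=3.6\n  - numpy\n' > /tmp/environment.yml
sed -i -E 's/(python=)(.*)/\1'$TRAVIS_PYTHON_VERSION'/' /tmp/environment.yml
grep 'python=' /tmp/environment.yml    # prints:   - python=3.5
```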
BTW: I see in your repository that you've worked around this by using multiple environment.yml files. That seems reasonable, even necessary for certain dependencies, but perhaps tedious to maintain across many Python versions.
