I would like to be able to install R packages from GitHub in an R conda environment created by Snakemake, as well as Python libraries via pip in a Python environment. I'll use these environments in a whole set of rules thereafter.
My initial thought was to create a rule running a script to install the specified packages.
For instance, my initial run was: snakemake -j1 --use-conda -R create_r_environment.
My Snakefile:
rule create_r_environment:
    conda:
        "envs/r.yaml"
    script:
        "scripts/r-dependencies.R"

rule create_python_environment:
    conda:
        "envs/python.yaml"
    script:
        "scripts/python-dependencies.py"
My envs/r.yaml file:
channels:
  - conda-forge
dependencies:
  - r=4.0
My r-dependencies.R file:
remotes::install_github("ramiromagno/gwasrapidd", upgrade = "never")
My envs/python.yaml file:
channels:
  - conda-forge
dependencies:
  - python=3.8.2
My python-dependencies.py file:
!pip install gseapy
The log output:
Building DAG of jobs...
Creating conda environment envs/r.yaml...
Downloading and installing remote packages.
Environment for envs/r.yaml created (location: .snakemake/conda/388f7df8)
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 create_r_environment
1
[Fri Oct 30 22:38:56 2020]
rule create_r_environment:
jobid: 0
Activating conda environment: /home/cmcouto-silva/cmcouto.silva#usp.br/lab_files/phd_data/SO/.snakemake/conda/388f7df8
[Fri Oct 30 22:38:57 2020]
Error in rule create_r_environment:
jobid: 0
conda-env: /home/cmcouto-silva/cmcouto.silva#usp.br/lab_files/phd_data/SO/.snakemake/conda/388f7df8
RuleException:
CalledProcessError in line 5 of /home/cmcouto-silva/cmcouto.silva#usp.br/lab_files/phd_data/SO/Snakefile:
Command 'source /home/cmcouto-silva/miniconda3/bin/activate '/home/cmcouto-silva/cmcouto.silva#usp.br/lab_files/phd_data/SO/.snakemake/conda/388f7df8'; set -euo pipefail; Rscript --vanilla /home/cmcouto-silva/cmcouto.silva#usp.br/lab_files/phd_data/SO/.snakemake/scripts/tmpa6jdxovx.r-dependencies.R' returned non-zero exit status 1.
File "/home/cmcouto-silva/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 2168, in run_wrapper
File "/home/cmcouto-silva/cmcouto.silva#usp.br/lab_files/phd_data/SO/Snakefile", line 5, in __rule_create_r_environment
File "/home/cmcouto-silva/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 529, in _callback
File "/home/cmcouto-silva/miniconda3/envs/snakemake/lib/python3.8/concurrent/futures/thread.py", line 57, in run
File "/home/cmcouto-silva/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
File "/home/cmcouto-silva/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 2199, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/cmcouto-silva/cmcouto.silva#usp.br/lab_files/phd_data/SO/.snakemake/log/2020-10-30T223743.852983.snakemake.log
My folder structure:
.
├── envs
│ ├── python.yaml
│ └── r.yaml
├── scripts
│ ├── python-dependencies.py
│ └── r-dependencies.R
└── Snakefile
It successfully creates the environment but fails when running the script, and I don't know why. I've changed the content of scripts/r-dependencies.R to install.packages("data.table") to see whether the GitHub package was the issue, but it isn't: the rule fails anyway. The same occurs when I run the rule create_python_environment (output not shown here).
Any help?
Edit after the accepted answer
As @dariober pointed out, I forgot to install the remotes package before calling it in the script. I added it to the .yaml file, and it worked well. Also, I installed the pip libraries using shell instead of a Python file.
I would like to highlight some points though, just in case anyone's facing the same or similar problem:
First, I could successfully install the further packages I needed, but some of them require specific system libraries (e.g. libcurl). These are installed on my system but are not recognized inside the Snakemake conda environment, which forces me either to install them in that conda environment (good for reproducibility, although I don't know how to do that yet) or to specify the library path. Maybe a better option would be to use a container, as @merv commented.
Second, I figured out that Snakemake already provides a way to install pip libraries through the .yaml file. From the documentation, it looks like this:
name: stats2
channels:
  - javascript
dependencies:
  - python=3.6   # or 2.7
  - bokeh=0.9.2
  - numpy=1.9.*
  - nodejs=0.10.*
  - flask
  - pip:
    - Flask-Testing
I think there are quite a few things wrong:
remotes::install_github("ramiromagno/gwasrapidd", upgrade = "never"): your r.yaml should include the remotes package.
!pip install gseapy is not valid Python code. If anything, it is code to be executed by a shell, but I'm not sure that leading ! is correct. Also, gseapy is available from bioconda, so I don't see why you should install it with pip.
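For reference, the leading ! is IPython/Jupyter notation for shelling out, which a plain Python interpreter rejects. If you still want to install with pip from a Python script, a minimal sketch of a valid scripts/python-dependencies.py could look like the following. It calls pip through the interpreter that is running the script, so the package should land in the active conda environment. The helper names pip_install_command and pip_install are made up for illustration.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the pip command for the interpreter running this script."""
    # sys.executable is the Python of the active (conda) environment,
    # so "-m pip" installs into that environment rather than a system one.
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package):
    """Run pip and raise CalledProcessError if the install fails."""
    subprocess.run(pip_install_command(package), check=True)
```

Calling pip_install("gseapy") at the end of the script would then perform the install; because of check=True, a failed install makes the rule fail instead of passing silently.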
Before OP edited the question
My envs/r.yaml file:
remotes::install_github("ramiromagno/gwasrapidd", upgrade = "never")
It's odd that you get the conda environment correctly created since that r.yaml is not a valid environment file.
This is what I tried to recreate your issue:
r.yaml
cat r.yaml
remotes::install_github("ramiromagno/gwasrapidd", upgrade = "never")
Snakefile:
cat Snakefile
rule create_r_environment:
    conda:
        "r.yaml"
    script:
        "r-dependencies.R"
Execute:
snakemake -j1 --use-conda -R create_r_environment
Building DAG of jobs...
Creating conda environment r.yaml...
Downloading and installing remote packages.
CreateCondaEnvironmentException:
Could not create conda environment from /home/dario/Downloads/r.yaml:
# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<
Traceback (most recent call last):
File "/home/dario/miniconda3/lib/python3.7/site-packages/conda/exceptions.py", line 1079, in __call__
return func(*args, **kwargs)
File "/home/dario/miniconda3/lib/python3.7/site-packages/conda_env/cli/main.py", line 80, in do_call
exit_code = getattr(module, func_name)(args, parser)
File "/home/dario/miniconda3/lib/python3.7/site-packages/conda_env/cli/main_create.py", line 80, in execute
directory=os.getcwd())
File "/home/dario/miniconda3/lib/python3.7/site-packages/conda_env/specs/__init__.py", line 40, in detect
if spec.can_handle():
File "/home/dario/miniconda3/lib/python3.7/site-packages/conda_env/specs/yaml_file.py", line 18, in can_handle
self._environment = env.from_file(self.filename)
File "/home/dario/miniconda3/lib/python3.7/site-packages/conda_env/env.py", line 151, in from_file
return from_yaml(yamlstr, filename=filename)
File "/home/dario/miniconda3/lib/python3.7/site-packages/conda_env/env.py", line 137, in from_yaml
data = validate_keys(data, kwargs)
File "/home/dario/miniconda3/lib/python3.7/site-packages/conda_env/env.py", line 35, in validate_keys
new_data = data.copy() if data else {}
AttributeError: 'str' object has no attribute 'copy'
`$ /home/dario/miniconda3/bin/conda-env create --file /home/dario/Downloads/.snakemake/conda/095b0ca2.yaml --prefix /home/dario/Downloads/.snakemake/conda/095b0ca2`
environment variables:
CIO_TEST=<not set>
CMAKE_PREFIX_PATH=/home/dario/miniconda3/envs/tritume:/home/dario/miniconda3/envs/tritum
e/x86_64-conda-linux-gnu/sysroot/usr
CONDA_AUTO_UPDATE_CONDA=false
CONDA_BUILD_SYSROOT=/home/dario/miniconda3/envs/tritume/x86_64-conda-linux-gnu/sysroot
CONDA_DEFAULT_ENV=tritume
CONDA_EXE=/home/dario/miniconda3/bin/conda
CONDA_PREFIX=/home/dario/miniconda3/envs/tritume
CONDA_PROMPT_MODIFIER=(tritume)
CONDA_PYTHON_EXE=/home/dario/miniconda3/bin/python
CONDA_ROOT=/home/dario/miniconda3
CONDA_SHLVL=1
DEFAULTS_PATH=/usr/share/gconf/ubuntu.default.path
MANDATORY_PATH=/usr/share/gconf/ubuntu.mandatory.path
PATH=/home/dario/miniconda3/envs/tritume/bin:/home/dario/miniconda3/condabi
n:/opt/gradle/gradle-5.2/bin:/home/dario/.local/share/umake/bin:/home/
dario/.local/bin:/home/dario/bin:/opt/gradle/gradle-5.2/bin:/usr/local
/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/loc
al/games:/snap/bin:/usr/lib/jvm/java-10-oracle/bin:/usr/lib/jvm/java-1
0-oracle/db/bin
REQUESTS_CA_BUNDLE=<not set>
SSL_CERT_FILE=<not set>
WINDOWPATH=2
active environment : tritume
active env location : /home/dario/miniconda3/envs/tritume
shell level : 1
user config file : /home/dario/.condarc
populated config files : /home/dario/.condarc
conda version : 4.8.3
conda-build version : not installed
python version : 3.7.6.final.0
virtual packages : __glibc=2.27
base environment : /home/dario/miniconda3 (writable)
channel URLs : https://conda.anaconda.org/conda-forge/linux-64
https://conda.anaconda.org/conda-forge/noarch
https://conda.anaconda.org/bioconda/linux-64
https://conda.anaconda.org/bioconda/noarch
https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
package cache : /home/dario/miniconda3/pkgs
/home/dario/.conda/pkgs
envs directories : /home/dario/miniconda3/envs
/home/dario/.conda/envs
platform : linux-64
user-agent : conda/4.8.3 requests/2.22.0 CPython/3.7.6 Linux/4.15.0-91-generic ubuntu/18.04.4 glibc/2.27
UID:GID : 1001:1001
netrc file : None
offline mode : False
An unexpected error has occurred. Conda has prepared the above report.
If submitted, this report will be used by core maintainers to improve
future releases of conda.
Would you like conda to send this report to the core maintainers?
[y/N]:
Timeout reached. No report sent.
File "/home/dario/miniconda3/envs/tritume/lib/python3.6/site-packages/snakemake/deployment/conda.py", line 320, in create
Anyway, your error says:
... r-dependencies.R' returned non-zero exit status 1
What do you have in r-dependencies.R?
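The AttributeError in the report above follows from how that one-line r.yaml parses: it is a plain string rather than a mapping, and conda then calls .copy() on it. Below is a rough sketch of the kind of check an environment file has to pass, written against an already-parsed object so it needs no YAML library. check_environment_data is a hypothetical helper, not part of conda or Snakemake.

```python
def check_environment_data(data):
    """Return a list of problems found in a parsed environment file."""
    problems = []
    if not isinstance(data, dict):
        # A file containing only an R call parses to a single string.
        problems.append(
            "top level is %s, expected a mapping" % type(data).__name__
        )
        return problems
    if not isinstance(data.get("dependencies"), list):
        problems.append("'dependencies' should be a list")
    return problems


# What the one-line r.yaml amounts to once parsed:
bad = 'remotes::install_github("ramiromagno/gwasrapidd", upgrade = "never")'
# What a proper envs/r.yaml amounts to once parsed:
good = {"channels": ["conda-forge"], "dependencies": ["r=4.0"]}

print(check_environment_data(bad))
print(check_environment_data(good))
```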
Related
I am struggling to install packages on a Win10 computer where I have no admin privileges and only restricted Internet access through our corporate infrastructure. I have Python 3.6 with a couple of other packages installed and I'd like to clone the environment to do some experiments. As I must install packages only from the corporate repository, I have added the following lines to the .condarc file placed in my %userprofile% directory:
ssl_verify: false
channels:
  - https://uname:password@corporateURL/repository/type
Now if I try to clone the base environment with conda create --clone base --name exp_env, it gets all the packages except one. Downloading and extracting packages shows 0% at that package, and conda tells me that
CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://repo.anaconda.com/pkgs/main/win-64/pckname-version-pyversion.tar.bz2>
Elapsed: -
An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
However, I am able to download that package with my browser, both from the Internet and from the corporate repository.
Why does conda look for that package outside the channel I gave it in the .condarc file?
Why can't conda download that package even though I can download it with my browser?
How can I solve the problem by either
telling conda to use the corporate repository, or
downloading the package manually and feeding conda that file?
Additional info
I masked out some of the irrelevant info, but these are the things conda tells:
(base) %userprofile%>conda info
active environment : base
active env location : %ProgramFiles%\Anaconda3
shell level : 1
user config file : %userprofile%\.condarc
populated config files : %userprofile%\.condarc
conda version : X
conda-build version : T
python version : Z.final.0
base environment : %ProgramFiles%\Anaconda3 (read only)
channel URLs : https://user:password@corporateURL/repository/type/win-64
https://user:password@corporateURL/repository/type/noarch
https://user:password@www.corporateURL/repository/type2/win-64
https://user:password@www.corporateURL/repository/type2/noarch
https://user:password@corporateURL/repository/type3/win-64
https://user:password@corporateURL/repository/type3/noarch
package cache : %ProgramFiles%\Anaconda3\pkgs
%userprofile%\AppData\Local\conda\conda\pkgs
envs directories : %userprofile%\AppData\Local\conda\conda\envs
%ProgramFiles%\Anaconda3\envs
%userprofile%\.conda\envs
platform : win-64
user-agent : conda/X requests/Y CPython/Z Windows/10 Windows/S
administrator : False
netrc file : None
offline mode : False
I have a Python script that I am trying to run daily using Windows Task Scheduler. Based on some SO threads, I wrote a batch script and created the scheduler task. I am using Anaconda for the virtual environment, and the conda version is 4.5.11. When I run the Python script from "Anaconda Prompt" or "Windows CMD" it works just fine.
However, when I use the batch script to execute my python script from "Windows CMD" it fails to find my intended conda environment.
Here is my batch script,
set original_dir=%CD%
set conda_root_dir=C:\Anaconda3\Scripts
call %conda_root_dir%\activate.bat
cd C:\Anaconda3\Scripts
call activate yttv_crawler
python "C:\__ Work Station\Py_Projects\YT_TV_Crawler\index.py"
call deactivate
cd %original_dir%
exit /B 1
After running the batch script manually from CMD for test purposes, I am getting this-
C:\Users\aafaysal\Desktop>yt_tv_crawler.bat
C:\Users\aafaysal\Desktop>set original_dir=C:\Users\aafaysal\Desktop
C:\Users\aafaysal\Desktop>set venv_root_dir=C:\Anaconda3\envs\yttv_crawler
C:\Users\aafaysal\Desktop>set conda_root_dir=C:\Anaconda3\Scripts
C:\Users\aafaysal\Desktop>call C:\Anaconda3\Scripts\activate.bat
(base) C:\Users\aafaysal\Desktop>cd C:\Anaconda3\Scripts
(base) C:\Anaconda3\Scripts>call activate yttv_crawler
Could not find conda environment: yttv_crawler
You can list all discoverable environments with `conda info --envs`.
(base) C:\Anaconda3\Scripts>python "C:\__ Work
Station\Py_Projects\YT_TV_Crawler\index.py"
Traceback (most recent call last):
File "C:\__ Work Station\Py_Projects\YT_TV_Crawler\index.py", line 1, in
<module>
from data_access_layer import save_videos
File "C:\__ Work Station\Py_Projects\YT_TV_Crawler\data_access_layer.py",
line 1,
in <module> import pymysql
ModuleNotFoundError: No module named 'pymysql'
(base) C:\Anaconda3\Scripts>call deactivate
C:\Anaconda3\Scripts>cd C:\Users\aafaysal\Desktop
C:\Users\aafaysal\Desktop>exit /B 1
But my conda environment does exist.
(base) C:\Users\aafaysal>conda --version
conda 4.5.11
(base) C:\Users\aafaysal>conda info --envs
# conda environments:
#
base * C:\Anaconda3
iqtools C:\Anaconda3\envs\iqtools
yttracker C:\Anaconda3\envs\yttracker
yttv_crawler C:\Anaconda3\envs\yttv_crawler
Here is the list of System Path Environment variables.
Please help me out.
NB: I know there are a lot of similar questions out there and I have nearly checked/tried everything and still have not been able to fix this.
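One way to sidestep activation in a scheduled task is to call the environment's own interpreter directly. The sketch below assumes the usual Windows layout, in which python.exe sits at the root of the environment directory (here the C:\Anaconda3\envs\yttv_crawler location listed above). env_python and run_in_env are hypothetical helper names. Running the interpreter without activating the environment may leave some PATH entries unset, so packages that depend on bundled DLLs might still need activation.

```python
import subprocess
from pathlib import PureWindowsPath


def env_python(conda_root, env_name):
    """Path of the interpreter inside a named conda environment (Windows)."""
    return PureWindowsPath(conda_root) / "envs" / env_name / "python.exe"


def run_in_env(conda_root, env_name, script):
    """Run a script with the environment's interpreter, no activation step."""
    cmd = [str(env_python(conda_root, env_name)), script]
    return subprocess.run(cmd, check=True)
```

For example, run_in_env(r"C:\Anaconda3", "yttv_crawler", r"C:\__ Work Station\Py_Projects\YT_TV_Crawler\index.py") would start the crawler with that environment's Python.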
I am trying to set up the environment in my Unix terminal. When I run my yaml file with make, I get the error
make: *** [Makefile:105: environment-dev] Error 247
But when I remove conda-forge from the yaml file, all my packages get installed, but at the end I get the error
Adding activation of '/home/xxx/yyy/.env' to conda 'env-abc' environment...
/bin/sh: 1: .: Can't open /home/xxx/yyy/.env
/bin/sh: 1: cannot create : Directory nonexistent
Here is the yaml file:
channels:
  - defaults
  - conda-forge
dependencies:
  - python==3.7.1
  - pip:
    - -r src/requirements.txt
    - jupyterlab==0.35.*
    - flake8==3.7.*
    - -e .  # The project package.
Command I used:
make filename
Please advise.
My bad. I just created the .env file and it worked well.
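The fix above amounts to making sure the file that the activation step sources actually exists. A small sketch of that follows, with ensure_env_file as a hypothetical helper name. It assumes an empty .env is acceptable for the project.

```python
from pathlib import Path


def ensure_env_file(project_dir):
    """Create an empty .env in project_dir if it is missing; return its path."""
    env_file = Path(project_dir) / ".env"
    env_file.touch(exist_ok=True)  # leaves an existing file's content untouched
    return env_file
```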
Python newbie here. I have encountered a permission problem with Anaconda. Everything runs OK, but I do not seem to be able to update conda, create new environments or install new packages.
When I try to update it (conda update conda), I get:
Fetching package metadata ..... An unexpected error has occurred.
Please consider posting the following information to the
conda GitHub issue tracker at:
https://github.com/conda/conda/issues
Current conda version:
platform : osx-64
conda version : 4.3.29
conda is private : False
conda-env version : 4.3.29
conda-build version : not installed
python version : 2.7.11.final.0
requests version : 2.14.2
root environment : /anaconda (writable)
default environment : /anaconda
envs directories : /anaconda/Users/Tina/.conda/envs
package cache : /anaconda/Users/Tina/.conda/pkgs
channel URLs : https://conda.anaconda.org/anaconda-fusion/osx-64
https://conda.anaconda.org/anaconda-fusion/noarch
https://repo.continuum.io/pkgs/main/osx-64
https://repo.continuum.io/pkgs/main/noarch
https://repo.continuum.io/pkgs/free/osx-64
https://repo.continuum.io/pkgs/free/noarch
https://repo.continuum.io/pkgs/r/osx-64
https://repo.continuum.io/pkgs/r/noarch
https://repo.continuum.io/pkgs/pro/osx-64
https://repo.continuum.io/pkgs/pro/noarch
config file : /Users/Tina/.condarc
netrc file : None
offline mode : False
user-agent : conda/4.3.29 requests/2.14.2 CPython/2.7.11 Darwin/15.5.0 OSX/10.11.5
UID:GID : 501:20
$ /anaconda/bin/conda update conda
Traceback (most recent call last):
File "/anaconda/lib/python2.7/site-packages/conda/exceptions.py", line 640, in conda_exception_handler
return_value = func(*args, **kwargs)
File "/anaconda/lib/python2.7/site-packages/conda/cli/main.py", line 140, in _main
exit_code = args.func(args, p)
File "/anaconda/lib/python2.7/site-packages/conda/cli/main_update.py", line 65, in execute
install(args, parser, 'update')
File "/anaconda/lib/python2.7/site-packages/conda/cli/install.py", line 231, in install
unknown=index_args['unknown'], prefix=prefix)
File "/anaconda/lib/python2.7/site-packages/conda/core/index.py", line 101, in get_index
index = fetch_index(channel_priority_map, use_cache=use_cache)
File "/anaconda/lib/python2.7/site-packages/conda/core/index.py", line 120, in fetch_index
repodatas = collect_all_repodata(use_cache, tasks)
File "/anaconda/lib/python2.7/site-packages/conda/core/repodata.py", line 75, in collect_all_repodata
repodatas = _collect_repodatas_serial(use_cache, tasks)
File "/anaconda/lib/python2.7/site-packages/conda/core/repodata.py", line 485, in _collect_repodatas_serial
for url, schan, pri in tasks]
File "/anaconda/lib/python2.7/site-packages/conda/core/repodata.py", line 115, in func
res = f(*args, **kwargs)
File "/anaconda/lib/python2.7/site-packages/conda/core/repodata.py", line 467, in fetch_repodata
touch(cache_path)
File "/anaconda/lib/python2.7/site-packages/conda/gateways/disk/update.py", line 64, in touch
utime(path, None)
OSError: [Errno 13] Permission denied: '/anaconda/pkgs/cache/9cd9d6b5.json'
I get the same error when trying to install seaborn or creating an environment. I am reluctant to use sudo because I do not want to break things.
I do not understand what is going on here, so any help would be highly appreciated.
Thanks so much;
T
The user that you are using to run conda update conda does not have write permission on /anaconda/pkgs/cache/.
If you don't want to manage Anaconda as the superuser, I would recommend that you create a new user group (e.g. anaconda_admin) and run:
sudo groupadd anaconda_admin
sudo chown -R :anaconda_admin /anaconda
Then you will need to ensure that permissions are something like:
sudo chmod -R 775 /anaconda
And finally that your user is in the anaconda_admin group:
sudo adduser <<<your_user>>> anaconda_admin
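To confirm the diagnosis before changing ownership, and to verify the result afterwards, you can ask the OS whether the current user may write to the paths involved. This is a minimal sketch using only the standard library. unwritable_paths is a made-up helper name, and the two paths are the ones from the traceback in the question.

```python
import os


def unwritable_paths(paths):
    """Return the paths the current user cannot write to (or that are missing)."""
    return [p for p in paths if not os.access(p, os.W_OK)]


# Paths taken from the traceback in the question.
print(unwritable_paths([
    "/anaconda/pkgs/cache",
    "/anaconda/pkgs/cache/9cd9d6b5.json",
]))
```

An empty list means both paths are writable for the user running the script.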
For humble Windows users that cannot use sudo: You have to open the conda console as Administrator by right clicking on the console icon and then select run as administrator. Then conda update conda should work fine.
You need sudo in order to write certain files to the system. That is perfectly fine and will not break your OS unless you are working with sophisticated, low-level packages and installers (conda and Python libraries are absolutely fine).
sudo conda update conda should do the job, not only for updating conda but also for other dependencies and packages you wish to install.
In short, the installer tries to write or modify a file in a directory that it does not have access to. With sudo, you let it do that by running it with elevated privileges.
On RedHat Enterprise 7, I am trying to install node.js inside of a nodeenv (0.13.6) in a Python virtual environment (Python 2.7). When I do nodeenv -p, I get OSError: Command make --jobs=2 failed with error code 2. After googling, the only reference to this is here. Not super useful for me, because I am already trying to install the newest version of node (4.2.1). Full trace of this is below:
$ nodeenv -p
* Install node (4.2.1..Traceback (most recent call last):
File "/usr/local/pythonenvs/producer/bin/nodeenv", line 11, in <module>
sys.exit(main())
File "/usr/local/pythonenvs/producer/lib/python2.7/site-packages/nodeenv.py", line 891, in main
create_environment(env_dir, opt)
File "/usr/local/pythonenvs/producer/lib/python2.7/site-packages/nodeenv.py", line 732, in create_environment
install_node(env_dir, src_dir, opt)
File "/usr/local/pythonenvs/producer/lib/python2.7/site-packages/nodeenv.py", line 608, in install_node
build_node_from_src(env_dir, src_dir, node_src_dir, opt)
File "/usr/local/pythonenvs/producer/lib/python2.7/site-packages/nodeenv.py", line 577, in build_node_from_src
callit([make_cmd] + make_opts, opt.verbose, True, node_src_dir, env)
File "/usr/local/pythonenvs/producer/lib/python2.7/site-packages/nodeenv.py", line 461, in callit
% (cmd_desc, proc.returncode))
OSError: Command make --jobs=2 failed with error code 2
I then tried to install from prebuilt, using the instructions in this GitHub issue.
nodeenv -p --prebuilt
That seemed to work...
* Install node (4.2.1... done.
* Appending data to /usr/local/pythonenvs/producer/bin/activate
Except nothing actually installed -- tab completing shows no node or npm install (I have deactivated and re-activated the virtual environment):
$ no
nodeenv nohup nologin notify-send
$ np
$ nproc
My other installs worked with the same instructions, so I'm at a loss for debugging this. Any hints or suggestions? If this is a permissions issue, where do I need to set / change them? The user already owns the virtual environment directory...
Okay, so I don't have a solution to the root cause (I suspect some sort of issue / conflict with make on my server), but I managed to get it installed via --prebuilt. I had to manually delete the node.js source from /usr/local/pythonenvs/producer/src/node-v4.2.1/, because the --prebuilt option was trying to copy those as if they were binaries. After deleting the directory, I downloaded / extracted from nodejs.org into the virtual environment's src directory. Then, the nodeenv -p --prebuilt command works fine.
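To check whether an install like this really put the binaries in place, rather than relying on tab completion, you can look for them in the virtual environment's bin directory. This is a small sketch. missing_tools is a hypothetical helper, and the directory in the usage note is the one from the log above.

```python
import shutil


def missing_tools(bin_dir, tools=("node", "npm")):
    """Return the tools that are not executable files inside bin_dir."""
    return [t for t in tools if shutil.which(t, path=str(bin_dir)) is None]
```

For example, missing_tools("/usr/local/pythonenvs/producer/bin") returns an empty list once both node and npm are installed there.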