How to use rpy2 within a packrat environment?

I am trying to use an R package that I installed with the R package 'packrat', which creates a virtual environment similar to virtualenv in Python, but I have not succeeded.
From a shell, I can successfully run the following code in an R console:
cd /path/to/packrat/environment
R # this launches an R console in the packrat environment
library(mycustompackage)
result = mycustompackage::myfunc()
q()
I would like to do the same using rpy2, but I'm unable to activate the packrat environment. Here is what I have tried, without success:
from rpy2.robjects import r
from rpy2.robjects.packages import importr
packrat_dir = r.setwd('/path/to/packrat/environment')
importr('mycustompackage')
result = r.mycustompackage.myfunc()
But it fails at 'importr' because it cannot find the package 'mycustompackage'. This is also unsuccessful:
importr('mycustompackage', lib_loc='/path/to/packrat/environment')
As is this:
os.environ['R_HOME'] = '/path/to/packrat/environment'
importr('mycustompackage', lib_loc ='/path/to/packrat/environment')
Any suggestion on how to use rpy2 with packrat environments?

I am not familiar with the R package packrat, but I notice that the bash + R code and the Python/rpy2 code have a subtle difference that might matter a lot: in the bash + R case, R starts already inside your packrat project directory, whereas in the Python/rpy2 case R starts from a different directory and is then moved to the packrat project directory with setwd().
I read that packrat uses a .Rprofile file (https://rstudio.github.io/packrat/limitations.html), which R evaluates at startup if it is in the current directory. I suspect the issue comes down to how packrat is used rather than a problem with rpy2.

Very good remark (hidden file = forgotten file). I found out how to make it work:
from rpy2.robjects import r
from rpy2.robjects.packages import importr
# Init the packrat environment
r.setwd('/path/to/packrat/environment')
r.source('.Rprofile')
# use the packages it contains
importr('mycustompackage')
result = r.myfunc()
lgautier, you made my day, thanks a lot.
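For anyone who wants to reuse the fix above, here is a minimal sketch that wraps it in a function and checks for the .Rprofile first. The function names packrat_profile and activate_packrat are made up for illustration, and rpy2 is imported lazily so that the path check can run even where rpy2 is not installed.

```python
import os


def packrat_profile(project_dir):
    """Return the path to the project's .Rprofile, or raise if it is missing."""
    profile = os.path.join(project_dir, ".Rprofile")
    if not os.path.isfile(profile):
        raise FileNotFoundError(f"no .Rprofile found in {project_dir!r}")
    return profile


def activate_packrat(project_dir):
    """Source the packrat .Rprofile from inside the embedded R session."""
    profile = packrat_profile(project_dir)  # fail early, before touching R
    # Imported lazily (an assumption of this sketch) so the check above
    # does not require rpy2 or R to be installed.
    from rpy2.robjects import r
    r.setwd(project_dir)
    r.source(os.path.basename(profile))
```

After calling activate_packrat('/path/to/packrat/environment'), importr('mycustompackage') should behave as in the snippet above.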

Related

scipy.stats is lost when I import my module

The problem appears only when I run the code via the Linux command line, i.e. the Windows Subsystem for Linux. It does not occur when run via a conda environment on Windows. In both cases, scipy is properly installed.
I have created a function to perform linear regressions on the row values of two dataframes, df_1 and df_2. Their column names are the same as the keys of the dictionary data_dict.
from scipy.stats import linregress
import numpy as np
def foo(df_1, df_2, data_dict):
    for index, row in df_2.iterrows():
        x = []
        for d in data_dict:
            x.append(row[d])
        x = np.array(x)
        for index, row in df_1.iterrows():
            y = []
            for d in data_dict:
                y.append(row[d])
            y = np.array(y)
            s, i, r, p, se = linregress(x, y)
This works fine as long as I run it from within the script it is written in, however as soon as I import it into a different script, 'bar' and try to run it I get the error AttributeError: module 'scipy' has no attribute 'stats', and the traceback refers to the line in which linregress is actually used, not the import line.
I have tried importing in other ways, i.e.
from scipy import stats
As well as importing directly before the linregress operation, i.e.
from scipy.stats import linregress
s, i, r, p, se = linregress(x, y)
And finally I've tried seeing if any of the other modules imported to 'bar' are interfering with scipy.stats, and this is not the case.
Any idea why python is 'forgetting' scipy.stats?
I also tried checking that scipy.stats was imported by writing a list of all modules imported in 'bar' before calling foo:
import sys
with open('modules_on_import.txt', 'a') as f:
    for s in sys.modules:
        f.write(f"{s}\n")
and scipy.stats can be found in modules_on_import.txt
Some more details:
I'm not running in virtual environment, echo $VIRTUAL_ENV returns nothing.
Everything is run via the command line, i.e. directly in Bash. In this case I simply type python3 bar.py.
All modules installed using pip, via command line - i.e. pip install scipy
Unsure if it matters, but I'm editing in vim.
A (simplified) example of bar.py.
from psd_processing import process_psd # function to make df_2 and data_dict
from uptake_processing import process_uptake # function to make df_1
from foo_test import foo
project = '0020'
loading_df = process_uptake(project, 'co2', 298) # this works
param_df, data_dict = process_psd(project, 'n2', 'V') # this works
correlation_df = foo(loading_df, param_df, data_dict) # this breaks on linregress in foo.py
It's not the installation method of scipy. I uninstalled and reinstalled with pip3 to be sure.
However, when I run the code via the Spyder IDE, it works!
Some pertinent information;
I was originally running the code via Ubuntu 20.04.3 LTS on Windows 10 x86_64. My Python installation is in /usr on Ubuntu.
When running in Spyder, the code is run directly on Windows. The python installation is in C:\Users\<user>\Anaconda3.
How do I get this code to run properly via the command line?
As noted in the question, this code works on a conda venv in windows but not in python3 directly installed on the Ubuntu WSL. As my preference is to use the linux command line I did the following workaround;
Install Anaconda on the Ubuntu WSL.
Create and activate a virtual environment.
Install required packages in virtual environment via conda install <pkg>.
Run everything in the new virtual environment.

Attempting to run RPY2 in Python and receiving error 0X7e

I'm attempting to use rpy2 to access the TTR package in R. I'm running Python 3.8.3 and R 4.0.2. However, when I run the code
import os
os.environ['R_HOME'] = "C:\\Program Files\\R\\R-4.0.2\\bin\\x64"
from rpy2.robjects.packages import importr
this results in:
OSError: cannot load library 'C:\Program Files\R\R-4.0.2\bin\x64\bin\x64\R.dll': error 0x7e
I proactively ran python -m rpy2.situation, yielding
C:\Users\XXXXX>python -m rpy2.situation
rpy2 version:
3.3.4
Python version:
3.8.3rc1 (tags/v3.8.3rc1:802eb67, Apr 29 2020, 21:39:14) [MSC v.1924 64 bit (AMD64)]
Looking for R's HOME:
Environment variable R_HOME: None
InstallPath in the registry: C:\Program Files\R\R-4.0.2
Environment variable R_USER: None
Environment variable R_LIBS_USER: None
R version:
R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.
In the PATH:
Loading R library from rpy2: OK
Additional directories to load R packages from:
None
C extension compilation:
'sh' is not recognized as an internal or external command,
operable program or batch file.
Warning: Unable to get R compilation flags.
Any help on why rpy2 is causing this 0x7e error is greatly appreciated. I have also uninstalled and reinstalled both R and rpy2, as that was suggested as a solution in some other posts.
I had the same issue trying to import the rpy2 library. I got it sorted when I added a path for R to my environment variables, using the install path reported in your output:
InstallPath in the registry: C:\Program Files\R\R-4.0.2
Try creating a system environment variable with the above path and see if it works.
I had the same error and for me the problem was that SciPy was imported before rpy2. Moving the SciPy import below rpy2 solved it.
I had the exact same problem. The reason was that python was running in an Anaconda environment. The environment has its own version of R installed. (Maybe search your computer for "Rcmd.exe" to see all the R copies on your machine.)
The solution was to modify os.environ['R_HOME'] to the appropriate copy of R:
For me it worked by adding this to the top of my python script:
import os
os.environ["R_HOME"] = "C:\\Users\\<Name>\\anaconda3\\envs\\<environment_name>\\Lib\\R\\"
But the exact path might be different for you depending on from where you are running rpy2.
And also note that just like Aidan mentioned you should not add \\bin\\x64 to your R_HOME path.
The line Loading R library from rpy2: OK when running rpy2.situation suggests that the R DLL is loading properly. There is likely something different between the environment in which you are running your Python script and the terminal where you are running C:\Users\XXXXX>python -m rpy2.situation.
Try running rpy2.situation from a Python script (for example take the content of the if __name__ == '__main__': block - https://github.com/rpy2/rpy2/blob/master/rpy2/situation.py#L358)
Note in your output:
OSError: cannot load library 'C:\Program Files\R\R-4.0.2\bin\x64\bin\x64\R.dll': error 0x7e
The bin\x64 part appears twice, so your R_HOME just needs to be 'C:\Program Files\R\R-4.0.2'. In fact, remove the code that changes environment variables and it should just work.
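As an illustration of the point above, here is a small sketch of a helper that strips a trailing bin or bin\x64 from a candidate R_HOME before it is assigned. The name normalize_r_home is hypothetical and not part of rpy2; ntpath is used so the Windows path rules apply on any operating system.

```python
import ntpath  # Windows path rules, usable on any OS


def normalize_r_home(path):
    """Strip a trailing 'bin' or 'bin/<arch>' component from an R_HOME candidate."""
    path = ntpath.normpath(path)
    head, tail = ntpath.split(path)
    if tail.lower() in ("x64", "i386"):
        # Only strip the architecture folder when it sits directly under bin.
        parent_head, parent_tail = ntpath.split(head)
        if parent_tail.lower() == "bin":
            return parent_head
        return path
    if tail.lower() == "bin":
        return head
    return path


r_home = normalize_r_home("C:\\Program Files\\R\\R-4.0.2\\bin\\x64")
```

The result could then be assigned to os.environ['R_HOME'] before rpy2 is imported, if setting the variable is needed at all.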
You need to do both of these things together:
set R_HOME
add R's bin directory to the "path" environment variable
For example:
#set R_HOME dynamically
import os
os.environ['R_HOME'] = r'YOUR R HOME PATH' #e.g. r'C:\Users\STEMLab\Miniconda2\envs\myenv\Lib\R' for my case
#set R bin
os.environ['path'] += r';YOUR R BIN;' #e.g. r';C:\Users\STEMLab\Miniconda2\envs\myenv\Lib\R\bin;' for my case again
cheers :)
I was in the exact same situation and changing the environment variable settings didn't solve the problem. I had installed 32-bit python by mistake; installing 64-bit python worked fine for me!
You can check it by
import platform; platform.architecture()
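A possible alternative check, in case platform.architecture() is ambiguous on your system: the pointer size and sys.maxsize both reflect the running interpreter itself. The function name interpreter_bits is just an illustrative choice.

```python
import struct
import sys


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    return struct.calcsize("P") * 8


# sys.maxsize exceeds 2**32 only on a 64-bit build.
is_64bit = sys.maxsize > 2**32
```

If interpreter_bits() returns 32 while R is installed as 64-bit, the mismatch described above is the likely culprit.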

Integrating R code via rpy2 into a Python package

I'm attempting to build a Python package, and use rpy2 and a handful of R scripts to integrate R seamlessly into that package.
This is code that I've prototyped previously in a Jupyter notebook. What this usually looks like is:
import rpy2.robjects
# load in R script containing some useful functions
rpy2.robjects.r("source('feature.R')")
# generate a python binding for 'useful_func' described in the R script
useful_func = rpy2.robjects.globalenv['useful_func']
result = useful_func(data)
This has worked well in Jupyter, as long as all my R scripts are in the same directory as the notebook I'm working with.
The package I'm trying to build looks something like:
package/
  -__init__.py
  -package.py
  -lib/
    -__init__.py
    -feature1.py
    -feature1.R
I can import feature1 easily, but when it tries to source feature1.R, R can't find the file. I can fix this by providing an absolute path to feature1.R but obviously this won't work when I attempt to distribute the package. How can I generate an absolute path to a resource file within a package in a way that is zip-safe?
...and I figured it out. Answering in case other folks have a similar form of this issue.
In feature1.py:
import importlib.resources as pkg_resources
import rpy2.robjects
with pkg_resources.path('lib', 'feature1.R') as filepath:
    rpy2.robjects.r("source('" + str(filepath) + "')")
useful_func = rpy2.robjects.globalenv['useful_func']
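A variation on the same idea, sketched under the assumption of Python 3.9+ (where importlib.resources.files exists): read the R file's text from the package and hand the text itself to R, so no filesystem path is needed at all. The package name demo_pkg and the helper read_r_source are made up for this example; the throwaway package is built in a temporary directory only so the sketch runs on its own.

```python
import importlib
import importlib.resources
import os
import sys
import tempfile

# Build a throwaway package on disk so the sketch is self-contained.
# "demo_pkg" and its feature1.R are hypothetical stand-ins for package/lib.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "feature1.R"), "w") as fh:
    fh.write("useful_func <- function(x) x + 1\n")
sys.path.insert(0, root)
importlib.invalidate_caches()


def read_r_source(package, resource):
    """Return the text of an R file bundled inside an importable package."""
    ref = importlib.resources.files(package).joinpath(resource)
    return ref.read_text(encoding="utf-8")


code = read_r_source("demo_pkg", "feature1.R")
# With rpy2 installed, the text could then be evaluated directly:
#     rpy2.robjects.r(code)
```

Evaluating the text instead of calling source() also sidesteps backslash escaping of Windows paths inside the R string.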
You have already resolved the path issue in your package yourself. The following just points to convenience code in rpy2 that lets you automagically map your R source file to a Python module (just like rpy2's importr() does, but without needing the R code to be in an R package):
https://rpy2.github.io/doc/v3.1.x/html/robjects_rpackages.html#importing-arbitrary-r-code-as-a-package

Retrieving the requirements of a Python single script

I would like to know how to extract the requirements from a single python script. I tried the following way, at the beginning of my file, immediately after the imports:
try:
    from pip._internal.operations import freeze
except ImportError: # pip < 10.0
    from pip.operations import freeze
x = freeze.freeze()
for p in x:
    print(p)
The piece of code above, however, returns every Python package installed locally. I would like to extract only the requirements the script actually needs, so that I can deploy the final application.
I hope I was clear.
pipreqs is simple to use.
Install:
pip install pipreqs
On Linux, from the folder containing your script, run:
pipreqs .
The requirements.txt file is then created.
pipreqs home page:
https://pypi.org/project/pipreqs/
You can do this easily with 'modulefinder' python module.
I think you want to print all the modules required by a script.
So, you can refer to
http://blog.rtwilson.com/how-to-find-out-what-modules-a-python-script-requires/
or for your ease the code is here:
from modulefinder import ModuleFinder
f = ModuleFinder()
# Run the main script
f.run_script('run.py')
# Get names of all the imported modules
names = list(f.modules.keys())
# Get a sorted list of the root modules imported
basemods = sorted(set([name.split('.')[0] for name in names]))
# Print it nicely
print ("\n".join(basemods))
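The ModuleFinder output above also includes standard-library modules. As a rough complement, here is a sketch of a different technique: parsing the script with ast and filtering out the standard library. It relies on sys.stdlib_module_names, which only exists on Python 3.10+ (on older versions nothing is filtered). The function names are illustrative, and import names do not always match the package name on PyPI, so treat the result as a starting point for requirements.txt.

```python
import ast
import sys


def top_level_imports(source):
    """Return the sorted top-level module names imported by Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import x"
            names.add(node.module.split(".")[0])
    return sorted(names)


def third_party_imports(source):
    """Drop standard-library names (requires Python 3.10+ to filter)."""
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return [name for name in top_level_imports(source) if name not in stdlib]


example = (
    "import os\n"
    "import numpy as np\n"
    "from scipy.stats import linregress\n"
    "from . import helper\n"
)
print(third_party_imports(example))
```

For a real script, pass open('run.py').read() instead of the example string. Local project modules will also show up and need to be removed by hand.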

Using PythonService.exe to host python service while using virtualenv

I've got a Windows 7 environment where I need to develop a Python Windows Service using Python 3.4. I'm using pywin32's win32service module to setup the service and most of the hooks seem to be working ok.
The problem arises when I attempt to run the service from source code (using python service.py install followed by python service.py start). This uses PythonService.exe to host service.py, but I'm using a venv virtual environment and the script can't find its modules (error message discovered with python service.py debug).
Pywin32 is installed in the virtualenv and in looking at the source code of PythonService.exe, it dynamically links in Python34.dll, imports my service.py and invokes it.
How can I get PythonService.exe to use my virtualenv when running my service.py?
Thanks very much for posting this question and a solution. I took a slightly different approach which might also be useful. It is pretty difficult to find working tips for Python services, let alone doing it with a virtualenv. Anyway...
Steps
This is using Windows 7 x64, Python 3.5.1 x64, pywin32-220 (or pypiwin32-219).
1. Open an Administrator command prompt.
2. Create a virtualenv. C:\Python35\python -m venv myvenv
3. Activate the virtualenv. call myvenv\scripts\activate.bat
4. Install pywin32, either:
   - From Pypi: pip install pypiwin32,
   - From http://www.lfd.uci.edu/~gohlke/pythonlibs/: pip install path\to\pywin32.whl
5. Run the post-install script python myvenv\Scripts\pywin32_postinstall.py -install.
   This script registers the DLL's in the system, and copies them to C:\Windows\System32. The DLL's are named pythoncom35.dll and pywintypes35.dll. So virtual environments on the same machine on the same major Python point release will share these... it's a minor tradeoff :)
6. Copy myvenv\Lib\site-packages\win32\pythonservice.exe to myvenv\Scripts\pythonservice.exe
7. On the service class (whatever subclasses win32serviceutil.ServiceFramework), set the class property _exe_path_ to point to this relocated exe. This will become the service binPath. For example: _exe_path_ = os.path.join(*[os.environ['VIRTUAL_ENV'], 'Scripts', 'pythonservice.exe']).
Discussion
I think why this works is that Python looks upwards to figure out where the Libs folders are and based on that sets package import paths, similar to the accepted answer. When pythonservice.exe is in the original location, that doesn't seem to work smoothly.
It also resolves DLL linking problems (discoverable with depends.exe from http://www.dependencywalker.com/). Without the DLL business sorted out, it won't be possible to import the *.pyd files from venv\Lib\site-packages\win32 as modules in your scripts. For example, it's needed to allow import servicemanager, as servicemanager.pyd is not in the package as a .py file, and it has some cool Windows Event Log capabilities.
One of the problems I had with the accepted answer is that I couldn't figure out how to get it to accurately pick up on package.egg-link paths that are created when using setup.py develop. These .egg-link files include the path to the package when it's not located in the virtualenv under myvenv\Lib\site-packages.
If it all went smoothly, it should be possible to install, start and test the example win32 service (from an Admin prompt in the activated virtualenv):
python venv\Lib\site-packages\win32\Demos\service\pipeTestService.py install
python venv\Lib\site-packages\win32\Demos\service\pipeTestService.py start
python venv\Lib\site-packages\win32\Demos\service\pipeTestServiceClient.py
The Service Environment
Another important note in all this is that the service will execute the python code in a completely separate environment to the one you might run python myservice.py debug. So for example os.environ['VIRTUAL_ENV'] will be empty when running the service. This can be handled by either:
- Setting environment variables from inside the script, e.g.
  - Find current path starting from the sys.executable, as described in the accepted answer.
  - Use that path to locate a config file.
  - Read the config file and put them in the environment with os.environ.
- Add registry keys to the service with the environment variables.
  - See Accessing Environment Variables from Windows Services for doing this manually with regedit.exe
  - See REG ADD a REG_MULTI_SZ Multi-Line Registry Value for doing this from the command line.
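The first option (a config file located relative to the executable) could look roughly like the sketch below. Everything specific here is an assumption for illustration: the file name service.env, the KEY=VALUE line format, and the helper names load_env_file and default_config_path.

```python
import os
import sys


def load_env_file(path, environ=os.environ):
    """Copy KEY=VALUE lines from a text file into the environment mapping."""
    loaded = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments and malformed lines
            key, value = line.split("=", 1)
            loaded[key.strip()] = value.strip()
    environ.update(loaded)
    return loaded


def default_config_path(executable=sys.executable, name="service.env"):
    """Hypothetical convention: keep the config file next to the executable."""
    return os.path.join(os.path.dirname(os.path.abspath(executable)), name)
```

Inside the service, calling load_env_file(default_config_path()) early on would make a value such as VIRTUAL_ENV available even though the service did not inherit it from an activated shell.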
I read all the answers, but none of the solutions fixed my problem.
After carefully studying David K. Hess's code, I made some changes, and it finally works.
I don't have enough reputation to comment, so I'm just posting the code here.
# 1. Customize your project's name and the virtual environment folder's name.
# 2. Import this before all third-party modules.
# 3. If it still fails, check the link below:
# https://stackoverflow.com/questions/34696815/using-pythonservice-exe-to-host-python-service-while-using-virtualenv
# 2019-05-29 by oraant, modified from David K. Hess's answer.
import os, sys, site
project_name = "PythonService" # Change this for your own project !!!!!!!!!!!!!!
venv_folder_name = "venv" # Change this for your own venv path !!!!!!!!!!!!!!
if sys.executable.lower().endswith("pythonservice.exe"):
    # Get root path for the project
    service_directory = os.path.abspath(os.path.dirname(__file__))
    project_directory = service_directory[:service_directory.find(project_name)+len(project_name)]
    # Get venv path for the project
    def file_path(x): return os.path.join(project_directory, x)
    venv_base = file_path(venv_folder_name)
    venv_scripts = os.path.join(venv_base, "Scripts")
    venv_packages = os.path.join(venv_base, 'Lib', 'site-packages')
    # Change current working directory from PythonService.exe location to something better.
    os.chdir(project_directory)
    sys.path.append(".")
    prev_sys_path = list(sys.path)
    # Manually activate a virtual environment inside an already initialized interpreter.
    os.environ['PATH'] = venv_scripts + os.pathsep + os.environ['PATH']
    site.addsitedir(venv_packages)
    sys.real_prefix = sys.prefix
    sys.prefix = venv_base
    # Move the newly added sys.path entries in front of the others
    new_sys_path = []
    for item in list(sys.path):
        if item not in prev_sys_path:
            new_sys_path.append(item)
            sys.path.remove(item)
    sys.path[:0] = new_sys_path
How to use it? It's simple: paste it into a new Python file (service_in_venv.py here) and import it before any third-party module, like this:
import service_in_venv # import at top
import win32serviceutil
import win32service
import win32event
import servicemanager
import time
import sys, os
........
This should fix the problem.
It appears this used to work correctly with the virtualenv module before virtual environments were added to Python 3.3. There's anecdotal evidence (see this answer: https://stackoverflow.com/a/12424980/1055722) that Python's site.py used to look upward from the executable file until it found a directory that would satisfy imports. It would then use that for sys.prefix and this was sufficient for PythonService.exe to find the virtualenv it was inside of and use it.
If that was the behavior, it appears that site.py no longer does that with the introduction of the venv module. Instead, it looks one level up for a pyvenv.cfg file and configures for a virtual environment in that case only. This of course doesn't work for PythonService.exe which is buried down in the pywin32 module under site-packages.
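To make the lookup described above concrete, here is a simplified model of it. This is not the actual site.py code, just a sketch of the rule "check the executable's folder and the folder one level up for pyvenv.cfg"; the function name find_pyvenv_cfg is made up.

```python
import os


def find_pyvenv_cfg(executable):
    """Return the pyvenv.cfg next to or one level above executable, else None."""
    exe_dir = os.path.dirname(os.path.abspath(executable))
    for candidate_dir in (exe_dir, os.path.dirname(exe_dir)):
        candidate = os.path.join(candidate_dir, "pyvenv.cfg")
        if os.path.isfile(candidate):
            return candidate
    return None
```

Under this model, an executable in my-venv\Scripts finds my-venv\pyvenv.cfg, while one in my-venv\Lib\site-packages\win32 is too deep and finds nothing, which matches the behaviour seen with PythonService.exe.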
To work around it, I adapted the activate_this.py code that comes with the original virtualenv module (see this answer: https://stackoverflow.com/a/33637378/1055722). It is used to bootstrap an interpreter embedded in an executable (which is the case with PythonService.exe) into using a virtualenv. Unfortunately, venv does not include this.
Here's what worked for me. Note, this assumes the virtual environment is named my-venv and is located one level above the source code location.
import os
import sys
if sys.executable.endswith("PythonService.exe"):
    # Change current working directory from PythonService.exe location to something better.
    service_directory = os.path.dirname(__file__)
    source_directory = os.path.abspath(os.path.join(service_directory, ".."))
    os.chdir(source_directory)
    sys.path.append(".")
    # Adapted from virtualenv's activate_this.py
    # Manually activate a virtual environment inside an already initialized interpreter.
    old_os_path = os.environ['PATH']
    venv_base = os.path.abspath(os.path.join(source_directory, "..", "my-venv"))
    os.environ['PATH'] = os.path.join(venv_base, "Scripts") + os.pathsep + old_os_path
    site_packages = os.path.join(venv_base, 'Lib', 'site-packages')
    prev_sys_path = list(sys.path)
    import site
    site.addsitedir(site_packages)
    sys.real_prefix = sys.prefix
    sys.prefix = venv_base
    new_sys_path = []
    for item in list(sys.path):
        if item not in prev_sys_path:
            new_sys_path.append(item)
            sys.path.remove(item)
    sys.path[:0] = new_sys_path
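The closing loop in the code above is the least obvious part: it moves whatever site.addsitedir() appended to the front of sys.path, so that the virtual environment's packages win over the base installation. The same reordering can be sketched as a pure function; the name promote_new_entries and the example paths are hypothetical.

```python
def promote_new_entries(path, previous):
    """Return path with entries absent from previous moved to the front."""
    known = set(previous)
    added = [item for item in path if item not in known]
    kept = [item for item in path if item in known]
    return added + kept


# Hypothetical paths: the venv entry was appended last but should come first.
before = ["C:/project", "C:/Python35/Lib"]
after = before + ["C:/project/venv/Lib/site-packages"]
reordered = promote_new_entries(after, before)
```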
One other factor in my troubles - there is a new pypi wheel for pywin32 that is provided by the Twisted folks that makes it easier to install with pip. The PythonService.exe in that package was acting oddly (couldn't find a pywin32 dll when invoked) compared to the one you get when installing the official win32 exe package into the virtual env using easy_install.
For anyone reading in 2018, I didn't have any luck with either solution above (Win10, Python 3.6), so this is what I did to get it working. The working directory is site-packages/win32 on launch, so you need to change the working directory and fix sys.path before you try to import any project code. This assumes the venv sits in your project dir; otherwise you may just need to hard-code some paths:
import sys
import os
if sys.executable.lower().endswith("pythonservice.exe"):
    for i in range(4): # goes up 4 directories to project folder
        os.chdir("..")
    # insert site-packages 2nd in path (behind project folder)
    sys.path.insert(1, os.path.join("venv",'Lib','site-packages'))
[REST OF IMPORTS]
class TestService(win32serviceutil.ServiceFramework):
    [...]
Don't use "pythonservice.exe"; register python.exe as the service directly:
import win32serviceutil
import win32service
import servicemanager
import sys
import os
import os.path
import multiprocessing
#
def main():
    import time
    time.sleep(600)
class ProcessService(win32serviceutil.ServiceFramework):
    _svc_name_ = "SleepService"
    _svc_display_name_ = "Sleep Service"
    _svc_description_ = "Sleeps for 600"
    _exe_name_ = sys.executable # python.exe from venv
    _exe_args_ = '-u -E "' + os.path.abspath(__file__) + '"'
    proc = None
    def SvcStop(self):
        self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)
        if self.proc:
            self.proc.terminate()
    def SvcRun(self):
        self.proc = multiprocessing.Process(target=main)
        self.proc.start()
        self.ReportServiceStatus(win32service.SERVICE_RUNNING)
        self.SvcDoRun()
        self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)
    def SvcDoRun(self):
        self.proc.join()
def start():
    if len(sys.argv)==1:
        import win32traceutil
        servicemanager.Initialize()
        servicemanager.PrepareToHostSingle(ProcessService)
        servicemanager.StartServiceCtrlDispatcher()
    elif '--fg' in sys.argv:
        main()
    else:
        win32serviceutil.HandleCommandLine(ProcessService)
if __name__ == '__main__':
    try:
        start()
    except (SystemExit, KeyboardInterrupt):
        raise
    except:
        import traceback
        traceback.print_exc()
This makes virtualenv support work on Python 3.5+ by pointing the service installation at the right interpreter.
