While using Python's rpy2 library to work with R, I get the following error when trying to import a function from the bnlearn package:
# Using R inside python
import rpy2
import rpy2.robjects as robjects
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector
from rpy2.robjects.packages import importr
utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1)
# Install packages
packnames = ('visNetwork', 'bnlearn')
utils.install_packages(StrVector(packnames))
# Load packages
visNetwork = importr('visNetwork')
bnlearn = importr('bnlearn')
tabu = bnlearn.tabu
fit = bnlearn.bn.fit
With the error:
AttributeError: module 'bnlearn' has no attribute 'bn'
In Python, bnlearn.bn.fit is read as the attribute fit of an attribute bn, and the imported package has no attribute bn. rpy2 translates the dots in R names into underscores, so the way to find the right name is to list the attributes of the imported package:
bnlearn.__dict__['_rpy2r']
You should get output similar to the following, which maps each Python attribute name to the R name it stands for:
...
...
'bn_boot': 'bn.boot',
'bn_cv': 'bn.cv',
'bn_cv_algorithm': 'bn.cv.algorithm',
'bn_cv_structure': 'bn.cv.structure',
'bn_fit': 'bn.fit',
'bn_fit_backend': 'bn.fit.backend',
'bn_fit_backend_continuous': 'bn.fit.backend.continuous',
'bn_fit_backend_discrete': 'bn.fit.backend.discrete',
'bn_fit_backend_mixedcg': 'bn.fit.backend.mixedcg',
'bn_fit_barchart': 'bn.fit.barchart',
'bn_fit_dotplot': 'bn.fit.dotplot',
...
...
Then, running the following will solve the issue:
bn_fit = bnlearn.bn_fit
Now you can, for example, learn and fit a Bayesian network:
structure = tabu(datos, score = "loglik-g")
bn_mod = bn_fit(structure, data = datos, method = "mle")
In general, this approach works for any R function whose name contains dots when it is imported into Python through rpy2.
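As a rough illustration of the naming rule described above, the following sketch mimics the dot-to-underscore translation in plain Python, without needing R installed. The helper names `python_name_for` and `find_python_name`, and the small `sample_mapping` dictionary (copied from the output excerpt above), are assumptions made for this example. They are not part of rpy2, whose real translation also has to deal with name clashes.

```python
# Illustrative only: a simplified stand-in for the name translation that
# importr() performs. These helpers are hypothetical, not rpy2 API.

def python_name_for(r_name):
    """Return a Python-safe name for an R name by replacing dots."""
    return r_name.replace(".", "_")


def find_python_name(mapping, r_name):
    """Look up the Python attribute name for an R name in a
    {python_name: r_name} mapping such as the one shown above."""
    for py_name, mapped_r_name in mapping.items():
        if mapped_r_name == r_name:
            return py_name
    raise KeyError(f"R name {r_name!r} not found in mapping")


# A few entries copied from the output excerpt above.
sample_mapping = {
    "bn_boot": "bn.boot",
    "bn_cv": "bn.cv",
    "bn_fit": "bn.fit",
}

print(python_name_for("bn.fit"))
print(find_python_name(sample_mapping, "bn.cv"))
```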
I'm trying to run an apriori algorithm in Python using rpy2.
I've hit a wall because I want to give the algorithm some parameters, but then the code doesn't work.
If I leave the parameter blank, it runs. Is there a way to make the apriori algorithm work with parameters?
I've got some R experience, and in R my code would look something like this:
output <- apriori(input, parameter = list(support=.01, confidence=.01, minlen=2))
Python code:
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
import rpy2.robjects.packages as rpackages
# import R packages
base = importr('base')
arules = importr('arules')
arulesViz = importr('arulesViz')
Matrix = importr('Matrix')
utils = importr('utils')
grid = importr('grid')
data = robjects.r('read.transactions("input_data.csv", sep = ",",rm.duplicates=FALSE)')
summary_r = arules.itemFrequency(data, type="absolute")
apr = arules.apriori(data,parameter=list(support=0.001, confidence=0.001, minlen=2))
print(apr)
I've found the answer to the question above on a different forum.
You need to pass the parameters as an R list built with ListVector:
from rpy2.robjects.vectors import ListVector
apr = arules.apriori(data, parameter=ListVector({"support": 0.01, "confidence": 0.01, "minlen": 2}))
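To see why the original call failed, it may help to note that `list(...)` in Python is the built-in list constructor, not R's `list()`, and it rejects keyword arguments. The sketch below is plain Python and does not call R. The `parameters` dictionary is simply the kind of named mapping that is then handed to `ListVector`.

```python
# Python's built-in list() is not R's list(): it takes no keyword arguments.
try:
    list(support=0.001, confidence=0.001, minlen=2)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Named R list elements are expressed as a Python dict instead.
# With rpy2 installed, this dict would be wrapped as ListVector(parameters).
parameters = {"support": 0.01, "confidence": 0.01, "minlen": 2}
print(sorted(parameters))
```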
I need to import the R library rugarch into Python for volatility forecasting.
This is just an example that could be done entirely in Python since it is univariate; however, I will later have to apply a multivariate method for which I have no Python solution.
So I have done the following:
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
from rpy2.robjects import numpy2ri
The error happens when I run:
rugarch = importr('rugarch')
RRuntimeError: Error in loadNamespace(name) : there is no package called 'rugarch'
I also tried pointing it to the right folder:
import rpy2.rinterface
utils = importr("utils")
base = importr('base')
print(base._libPaths())
Got: C:/Users/simeone/Anaconda3/envs/Luigi/Lib/R/library
rugarch = importr('rugarch', lib_loc="C:/Users/simeone/Anaconda3/envs/Luigi/Lib/R/library")
Still the same error: RRuntimeError: Error in loadNamespace(name) : there is no package called 'rugarch'.
In addition I tried forcing the installation of rugarch as follows:
utils.install_packages('rugarch')
but I get this error: RRuntimeError: Error in contrib.url(repos, "source") :
trying to use CRAN without setting a mirror.
Can anybody help? I am stuck.
I decided to post an answer that works and may help other people.
The last command was working, but the CRAN mirror was missing.
So the final code is:
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
from rpy2.robjects import numpy2ri
utils = importr("utils")
utils.chooseCRANmirror(ind=1) # this was missing
utils.install_packages('rugarch')
rugarch = importr('rugarch')
I'm trying to test a tool I'm building which uses some jMetalPy functionality. I had/have a previous version working but I am now trying to refactor out some external dependencies (such as the aforementioned jMetalPy).
Project Code & Structure
Here is a minimalist structure of my project.
MyToolDirectory
¦--/MyTool
¦----/__init__.py
¦----/_jmetal
¦------/__init__.py
¦------/core
¦--------/quality_indicator.py
¦----/core
¦------/__init__.py
¦------/run_manager.py
¦----/tests
¦------/__init__.py
¦------/test_run_manager.py
The _jmetal directory is to remove external dependency on the jMetalPy package - and I have copied only the necessary packages/modules that I need.
Minimal contents of run_manager.py
# MyTool\core\run_manager.py
import jmetal
# from jmetal.core.quality_indicators import HyperVolume # old working version
class RunManager:
    def __init__(self):
        pass

    @staticmethod
    def calculate_hypervolume(front, ref_point):
        if front is None or len(front) < 1:
            return 0.
        hv = jmetal.core.quality_indicator.HyperVolume(ref_point)
        # hv = HyperVolume(ref_point)
        hypervolume = hv.compute(front)
        return hypervolume
Minimal contents of test_run_manager.py
# MyTool\tests\test_run_manager.py
import unittest
from unittest.mock import MagicMock, Mock, patch
from MyTool import core
class RunManagerTest(unittest.TestCase):
    def setUp(self):
        self.rm = core.RunManager()

    def test_calculate_hypervolume(self):
        ref_points = [0.0, 57.5]
        front = [None, None]
        # with patch('MyTool.core.run_manager.HyperVolume') as mock_HV:  # old working version
        with patch('MyTool.core.run_manager.jmetal.core.quality_indicator.HyperVolume') as mock_HV:
            mock_HV.return_value = MagicMock()
            res = self.rm.calculate_hypervolume(front, ref_points)
            mock_HV.assert_called_with(ref_points)
            mock_HV().compute.assert_called_with(front)
Main Question
When I run a test with the code as-is, I get this error message:
E ModuleNotFoundError: No module named 'MyTool.core.run_manager.jmetal'; 'MyTool.core.run_manager' is not a package
But when I change it to:
with patch('MyTool.core.run_manager.jmetal.core') as mock_core:
    mock_HV = mock_core.quality_indicator.HyperVolume
    mock_HV.return_value = MagicMock()
    res = self.rm.calculate_hypervolume(front, ref_points)
    mock_HV.assert_called_with(ref_points)
    mock_HV().compute.assert_called_with(front)
... now the test passes. What gives?!
Why can't (or rather, how can) I surgically patch the exact class I want (i.e., HyperVolume) without patching out an entire sub-package as well? Is there a way around this? There may be code in jmetal.core that needs to run normally.
Is the reason this isn't working only because there is no from . import quality_indicator statement in jMetalPy's jmetal\core\__init__.py ?
Because even with patch('MyTool.core.run_manager.jmetal.core.quality_indicator') throws:
E AttributeError: <module 'jmetal.core' from 'path\\to\\venv\\lib\\site-packages\\jmetal\\core\\__init__.py'> does not have the attribute 'quality_indicator'
Or is there something I'm doing wrong?
In the case that it is just about adding those import statements, I could do that in my _jmetal sub-package, but I was hoping to let the user default to their own jMetalPy installation if they already had one by adding this to MyTool\__init__.py:
try:
    import jmetal
except ModuleNotFoundError:
    from . import _jmetal as jmetal
and then replacing all instances of import jmetal with from MyTool import jmetal. However, I'd run into the same problem all over again.
I feel that there is some core concept I am not grasping. Thanks for the help.
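One possible way to patch only the class, offered as a sketch rather than a confirmed fix for this project: `patch.object` replaces a single attribute on an already imported module object, so nothing else in the package is mocked. The example uses the standard library's `json.decoder.JSONDecoder` as a stand-in for `jmetal.core.quality_indicator.HyperVolume`, and the function `decode_with_json` is hypothetical. It assumes the submodule has been imported explicitly, so that it exists as an attribute of its parent package.

```python
import json
import json.decoder  # explicit submodule import, so json.decoder is an attribute
from unittest.mock import patch


def decode_with_json(text):
    # Looks the class up through the package at call time, in the same way
    # the code under test reaches jmetal.core.quality_indicator.HyperVolume.
    return json.decoder.JSONDecoder().decode(text)


# Patch only the class; the rest of json.decoder stays real.
with patch.object(json.decoder, "JSONDecoder") as mock_decoder_cls:
    mock_decoder_cls.return_value.decode.return_value = {"patched": True}
    patched_result = decode_with_json("{}")
    mock_decoder_cls.assert_called_once_with()
    mock_decoder_cls.return_value.decode.assert_called_once_with("{}")

# Outside the context manager the real class is restored.
real_result = decode_with_json('{"a": 1}')
print(patched_result, real_result)
```

Applied to the test above, the equivalent would be along the lines of `patch.object(jmetal.core.quality_indicator, 'HyperVolume')` after an explicit `import jmetal.core.quality_indicator`. Whether that import succeeds in this project layout is untested here.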
I'm trying to extract features using the tsfresh package and its extract_features() function.
tsfresh Version: 0.4.0.post0.dev1+ng19fa136
However, I get the following error:
AttributeError: type object 'MinimalFeatureExtractionSettings' has no attribute 'n_processes'
Code:
import numpy as np
import pandas as pd
column_names = ['time_series1', 'time_series2','time_series3']
ts = np.random.rand(6,3)
df_to_extract = pd.DataFrame(data=ts, columns = column_names)
df_to_extract['id'] = 1
df_to_extract['time'] = np.arange(1,7)
#print(df_to_extract)
import tsfresh
from tsfresh import extract_features
from tsfresh import select_features
from tsfresh.utilities.dataframe_functions import impute
from tsfresh import extract_relevant_features
from tsfresh.feature_extraction import extract_features, MinimalFeatureExtractionSettings
from tsfresh.feature_extraction.settings import *
from tsfresh.feature_extraction.settings import FeatureExtractionSettings
import tsfresh.feature_extraction.settings
from tsfresh import utilities
from tsfresh import feature_extraction
extracted_features = extract_features(df_to_extract,
                                      column_id="id",
                                      column_sort="time",
                                      parallelization='per_kind',
                                      feature_extraction_settings=MinimalFeatureExtractionSettings)
Package source code: https://github.com/blue-yonder/tsfresh/blob/master/tsfresh/feature_extraction/extraction.py
I'm using Python 3.5 (Anaconda) on Win10.
I suppose it could be some kind of import error.
How can I solve this issue?
Problem solved.
To make it work, pass an instance of the settings class rather than the class itself:
settings = MinimalFeatureExtractionSettings()
extracted_features = extract_features(df_to_extract,
                                      column_id="id",
                                      column_sort="time",
                                      parallelization='per_kind',
                                      feature_extraction_settings=settings)
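The error message hints at why the instance matters: an attribute assigned in `__init__` exists only on instances, not on the class itself. The sketch below reproduces that behaviour in plain Python. `DemoSettings` is a hypothetical stand-in, and it is only an assumption that tsfresh's settings class sets `n_processes` this way.

```python
class DemoSettings:
    """Hypothetical stand-in for a settings class."""

    def __init__(self):
        # Instance attribute: created only when the class is instantiated.
        self.n_processes = 2


# Using the class itself: the attribute does not exist yet.
try:
    DemoSettings.n_processes
    class_has_attribute = True
except AttributeError as exc:
    class_has_attribute = False
    print("AttributeError:", exc)

# Using an instance: the attribute is available.
settings = DemoSettings()
print(settings.n_processes)
```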
In newer versions of tsfresh, MinimalFeatureExtractionSettings no longer exists; it is now called MinimalFCParameters. You would therefore write the following code:
from tsfresh.feature_extraction import extract_features, MinimalFCParameters
...
minimalFCParametersForTsFresh = MinimalFCParameters()
extracted_features = extract_features(df_to_extract, column_id="id", default_fc_parameters=minimalFCParametersForTsFresh)
Using rpy2, I want to check if a given package is installed. If it is, I import it. If not, I install it first.
How do I check if it's installed?
from rpy2 import *
if not *my package is installed*:
    import rpy2.interactive as r
    r.importr("utils")
    package_name = "my_package"
    r.packages.utils.install_packages(package_name)
myPackage = importr("my_package")
Here is a function that does it on the Python side (note that contriburl should be set to a CRAN mirror, and that the case where installing the package fails is not handled).
from rpy2.rinterface import RRuntimeError
from rpy2.robjects.packages import importr
utils = importr('utils')
def importr_tryhard(packname, contriburl):
    try:
        rpack = importr(packname)
    except RRuntimeError:
        utils.install_packages(packname, contriburl = contriburl)
        rpack = importr(packname)
    return rpack
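The same try-then-install pattern can be written so that the failure case mentioned above is handled as well. This is a sketch in plain Python with the import and install steps passed in as callables, so it can be exercised without R. `import_or_install`, `fake_importr`, and `fake_install` are hypothetical names; with rpy2 one would pass `importr` and a wrapper around `utils.install_packages` instead.

```python
def import_or_install(packname, importer, installer, error_type=Exception):
    """Try to import a package; install it once and retry on failure."""
    try:
        return importer(packname)
    except error_type:
        installer(packname)
    try:
        return importer(packname)
    except error_type as exc:
        # Installation did not make the package importable.
        raise RuntimeError(f"could not install {packname!r}") from exc


# Fake importer/installer standing in for R, for demonstration only.
installed = {"utils"}


def fake_importr(name):
    if name not in installed:
        raise LookupError(f"there is no package called {name!r}")
    return f"<package {name}>"


def fake_install(name):
    installed.add(name)


package = import_or_install("rugarch", fake_importr, fake_install,
                            error_type=LookupError)
print(package)
```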
You can use the following function I got from @SaschaEpskamp's answer to another SO post:
pkgTest <- function(x)
{
  if (!require(x, character.only = TRUE))
  {
    install.packages(x, dep = TRUE)
    if (!require(x, character.only = TRUE)) stop("Package not found")
  }
}
And use this instead to load your packages:
r.source("file_with_pkgTest.r")
r.pkgTest("utils")
In general, I would recommend not trying to write much R code inside Python. Just create a few high-level R functions that do what you need, and use those as a minimal interface between R and Python.
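import subprocess
import sys

your_package = 'nltk'
# Ask pip for the installed Python packages, one "name==version" per line
output = subprocess.check_output([sys.executable, '-m', 'pip', 'freeze'], text=True)
# Keep only the package name from each line
packages = [line.split('==')[0].lower() for line in output.splitlines()]
if your_package.lower() in packages:
    print('true')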