Reading rds file into python - python

I am trying to read an rds file in Python using the following two snippets that I found on Stack Overflow:
import pyreadr
from collections import OrderedDict
result = pyreadr.read_r('Datasets.rds')
df = result["History"]
This gives me an OrderedDict of size 0. The second snippet is:
import rpy2.robjects as robjects
import tzlocal
from rpy2.robjects import pandas2ri
pandas2ri.activate()
readRDS = robjects.r['readRDS']
df = readRDS('Datasets.rds')
df = pandas2ri.ri2py(df)
This runs without error but does not show anything.
Could you please let me know what might be wrong with this code?
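One thing worth checking with the first snippet: as far as I know, pyreadr stores the contents of an .rds file under the key None (an .rds file holds a single unnamed object), so result["History"] would not find it, and an empty result may mean the file holds something other than a data frame. A minimal sketch of that lookup follows; extract_rds_frame is a hypothetical helper, not part of pyreadr, and a stand-in mapping replaces the real file so the snippet runs on its own:

```python
from collections import OrderedDict


def extract_rds_frame(result):
    """Return the single object held in a pyreadr-style result mapping.

    Hypothetical helper: assumes `result` is the mapping returned by
    pyreadr.read_r, which (to my knowledge) uses the key None for .rds files.
    """
    if len(result) == 0:
        # Nothing was converted; the file may not contain a data frame.
        raise ValueError("no objects were read from the file")
    if None in result:
        return result[None]
    # Fall back to the first named object (the .RData case).
    return next(iter(result.values()))


# Stand-in for pyreadr.read_r('Datasets.rds'), so the sketch runs without the file.
fake_result = OrderedDict([(None, "the data frame")])
print(extract_rds_frame(fake_result))
```

With the real library this would presumably be called as extract_rds_frame(pyreadr.read_r('Datasets.rds')).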

Related

How can I make arules.apriori work in Python using rpy2

I'm trying to run the apriori algorithm in Python using rpy2.
I've hit a wall: I want to pass the algorithm some parameters, but then the code doesn't work.
If I leave the parameter argument out, it runs. Is there a way to make the apriori algorithm work with parameters?
I have some R experience, and in R my code would look something like this:
output <- apriori(input, parameter = list(support=.01, confidence=.01, minlen=2))
Python code:
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
import rpy2.robjects.packages as rpackages
# import R packages
base = importr('base')
arules = importr('arules')
arulesViz = importr('arulesViz')
Matrix = importr('Matrix')
utils = importr('utils')
grid = importr('grid')
data = robjects.r('read.transactions("input_data.csv", sep = ",",rm.duplicates=FALSE)')
summary_r = arules.itemFrequency(data, type="absolute")
apr = arules.apriori(data,parameter=list(support=0.001, confidence=0.001, minlen=2))
print(apr)
I found the answer to the question above on a different forum.
You need to pass the parameters as an rpy2 ListVector instead of calling Python's list:
from rpy2.robjects.vectors import ListVector
apr = arules.apriori(data, parameter=ListVector({"support": 0.01, "confidence": 0.01, "minlen": 2}))
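The reason the original call fails is on the Python side: list(...) there is Python's built-in, which does not accept keyword arguments the way R's list() does, so the error is raised before rpy2 is involved. A small sketch that runs without rpy2; the dict shown is the kind of mapping a ListVector is built from:

```python
# Python's built-in list() rejects keyword arguments, unlike R's list().
try:
    list(support=0.01, confidence=0.01, minlen=2)
    failed = False
except TypeError:
    failed = True

# A plain dict carries the same named parameters and can be handed to ListVector.
params = {"support": 0.01, "confidence": 0.01, "minlen": 2}

print(failed)
print(sorted(params))
```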

Importing rugarch R library into python

I need to import the R library rugarch into Python for volatility forecasting.
This is just an example that could be done entirely in Python since it is univariate; however, I later have to apply a multivariate method for which I do not have a Python solution.
So I have done the following:
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
from rpy2.robjects import numpy2ri
The error happens at:
rugarch = importr('rugarch')
RRuntimeError: Error in loadNamespace(name) : there is no package called 'rugarch'
I also tried pointing it to the right folder:
import rpy2.rinterface
utils = importr("utils")
base = importr('base')
print(base._libPaths())
This printed: C:/Users/simeone/Anaconda3/envs/Luigi/Lib/R/library
rugarch = importr('rugarch', lib_loc="C:/Users/simeone/Anaconda3/envs/Luigi/Lib/R/library")
This still gives the same error: RRuntimeError: Error in loadNamespace(name) : there is no package called 'rugarch'.
In addition I tried forcing the installation of rugarch as follows:
utils.install_packages('rugarch')
but I get this error: RRuntimeError: Error in contrib.url(repos, "source") :
trying to use CRAN without setting a mirror.
Can anybody help? I am stuck
I decided to post an answer that works and may help other people.
The last command was correct, but the CRAN mirror had not been set.
So the final code is:
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
from rpy2.robjects import numpy2ri
utils = importr("utils")
utils.chooseCRANmirror(ind=1) # this was missing
utils.install_packages('rugarch')
rugarch = importr('rugarch')
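Since install_packages downloads the package on every run, it may be worth installing only what is missing. The sketch below shows that check with injected callables so it can be tried without an R session; install_missing is a hypothetical helper, and the assumption is that with rpy2 the two callables would be rpackages.isinstalled and a wrapper around utils.install_packages:

```python
def install_missing(names, is_installed, install):
    """Install only the packages that are not yet available.

    `is_installed` and `install` are injected so the logic can run without R.
    Assumption: with rpy2 one would pass rpackages.isinstalled and
    lambda pkgs: utils.install_packages(StrVector(pkgs)),
    after calling utils.chooseCRANmirror(ind=1).
    """
    missing = [name for name in names if not is_installed(name)]
    if missing:
        install(missing)
    return missing


# Dry run with stand-ins instead of a real R session.
already_there = {"utils", "base"}
installed_now = []
missing = install_missing(
    ["rugarch", "utils"], already_there.__contains__, installed_now.extend
)
print(missing)
```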

Rpy2 Conversion of R dataframe to pandas

How to properly convert an R dataframe object to a pandas Dataframe through rpy2?
I've tried everything from the documentation and stackoverflow to no avail.
My version of rpy2 is 2.9.4 running in Conda.
I've tried:
from rpy2.robjects import pandas2ri
pandas2ri.activate()
import rpy2.robjects as ro
pd_dt = ro.conversion.rpy2py(r_from_pd_df)
print(pd_dt)
where r_from_pd_df is an R data frame.
This raises the error:
AttributeError: module 'rpy2.robjects.conversion' has no attribute 'rpy2py'
Then I used the documentation example:
import pandas as pd
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri
from rpy2.robjects.conversion import localconverter
r_df = ro.DataFrame({'int_values': ro.IntVector([1,2,3]),
                     'str_values': ro.StrVector(['abc', 'def', 'ghi'])})
with localconverter(ro.default_converter + pandas2ri.converter):
    pd_from_r_df = ro.conversion.rpy2py(r_df)
that raised the same error.
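A likely cause, stated as an assumption rather than a confirmed answer: rpy2py was introduced with rpy2 3.x, while the 2.9 series named the same conversion ri2py, so documentation written for the newer release does not match version 2.9.4. Below is a sketch of a version-tolerant lookup; pick_r_to_py is a hypothetical helper, and a stand-in namespace replaces rpy2.robjects.conversion so the snippet runs without R:

```python
from types import SimpleNamespace


def pick_r_to_py(conversion):
    """Return the R-to-Python conversion function under either name."""
    for name in ("rpy2py", "ri2py"):  # newer name first, then the older one
        func = getattr(conversion, name, None)
        if func is not None:
            return func
    raise AttributeError("no R-to-Python conversion function found")


# Stand-in for an older rpy2.robjects.conversion module that only has ri2py.
old_style = SimpleNamespace(ri2py=lambda obj: ("converted", obj))
to_py = pick_r_to_py(old_style)
print(to_py("r_df"))
```

With a real installation the call would presumably be pick_r_to_py(ro.conversion)(r_df), still inside the localconverter block.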

Importing any function from an R package into python

While using the rpy2 library to work with R from Python, I get the following error when trying to import a function of the bnlearn package:
# Using R inside python
import rpy2
import rpy2.robjects as robjects
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector
from rpy2.robjects.packages import importr
utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1)
# Install packages
packnames = ('visNetwork', 'bnlearn')
utils.install_packages(StrVector(packnames))
# Load packages
visNetwork = importr('visNetwork')
bnlearn = importr('bnlearn')
tabu = bnlearn.tabu
fit = bnlearn.bn.fit
With the error:
AttributeError: module 'bnlearn' has no attribute 'bn'
Checking the bnlearn documentation, one finds that bn is a class structure. To see which names the imported package actually exposes in Python, inspect its name mapping by running:
bnlearn.__dict__['_rpy2r']
You should get output similar to the following, which shows the Python name under which each bnlearn object is available:
...
...
'bn_boot': 'bn.boot',
'bn_cv': 'bn.cv',
'bn_cv_algorithm': 'bn.cv.algorithm',
'bn_cv_structure': 'bn.cv.structure',
'bn_fit': 'bn.fit',
'bn_fit_backend': 'bn.fit.backend',
'bn_fit_backend_continuous': 'bn.fit.backend.continuous',
'bn_fit_backend_discrete': 'bn.fit.backend.discrete',
'bn_fit_backend_mixedcg': 'bn.fit.backend.mixedcg',
'bn_fit_barchart': 'bn.fit.barchart',
'bn_fit_dotplot': 'bn.fit.dotplot',
...
...
rpy2 replaces the dots in R names with underscores, so the following solves the issue:
bn_fit = bnlearn.bn_fit
Now you can, for example, fit a Bayesian network:
structure = tabu(datos, score = "loglik-g")
bn_mod = bn_fit(structure, data = datos, method = "mle")
In general, this approach lets you find and import any function from an R package into Python through rpy2.
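The mapping shown above follows a simple pattern, which can be sketched in plain Python. This is a simplified illustration only: r_name_to_python is a hypothetical function, and the real rpy2 translation also has to deal with cases such as name clashes that are ignored here.

```python
def r_name_to_python(r_name):
    """Simplified version of the dot-to-underscore renaming."""
    return r_name.replace(".", "_")


# A few R names taken from the bnlearn output above, plus one without dots.
r_names = ["bn.fit", "bn.cv", "tabu"]
mapping = {r_name_to_python(name): name for name in r_names}
print(mapping)
```

Names without a dot, such as tabu, are unchanged, which is why bnlearn.tabu worked while the dotted bn.fit did not.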

AttributeError: type object 'MinimalFeatureExtractionSettings' has no attribute 'n_processes'

I'm trying to extract features using tsfresh package and extract_features() function.
tsfresh Version: 0.4.0.post0.dev1+ng19fa136
However, I get the following error:
AttributeError: type object 'MinimalFeatureExtractionSettings' has no attribute 'n_processes'
Code:
import numpy as np
import pandas as pd
column_names = ['time_series1', 'time_series2','time_series3']
ts = np.random.rand(6,3)
df_to_extract = pd.DataFrame(data=ts, columns = column_names)
df_to_extract['id'] = 1
df_to_extract['time'] = np.arange(1,7)
#print(df_to_extract)
import tsfresh
from tsfresh import extract_features
from tsfresh import select_features
from tsfresh.utilities.dataframe_functions import impute
from tsfresh import extract_relevant_features
from tsfresh.feature_extraction import extract_features, MinimalFeatureExtractionSettings
from tsfresh.feature_extraction.settings import *
from tsfresh.feature_extraction.settings import FeatureExtractionSettings
import tsfresh.feature_extraction.settings
from tsfresh import utilities
from tsfresh import feature_extraction
extracted_features = extract_features(df_to_extract,
                                      column_id="id",
                                      column_sort="time",
                                      parallelization='per_kind',
                                      feature_extraction_settings=MinimalFeatureExtractionSettings)
Package source code: https://github.com/blue-yonder/tsfresh/blob/master/tsfresh/feature_extraction/extraction.py
I'm using Python 3.5 (Anaconda) on Win10.
I suppose it could be some kind of import error.
How can I solve this issue?
Problem solved.
To make it work, pass an instance of the settings class rather than the class itself:
settings = MinimalFeatureExtractionSettings()
extracted_features = extract_features(df_to_extract,
                                      column_id="id",
                                      column_sort="time",
                                      parallelization='per_kind',
                                      feature_extraction_settings=settings)
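The wording "type object ... has no attribute" suggests that the class itself was passed where an instance was expected; presumably n_processes is only set when the settings object is constructed. A minimal sketch of that difference, using a hypothetical DemoSettings class that is not tsfresh code:

```python
class DemoSettings:
    """Hypothetical stand-in for a settings class; not tsfresh code."""

    def __init__(self):
        # Attribute created per instance, so the bare class does not have it.
        self.n_processes = 2


class_has_it = hasattr(DemoSettings, "n_processes")       # the class itself
instance_has_it = hasattr(DemoSettings(), "n_processes")  # an instance
print(class_has_it, instance_has_it)
```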
In newer versions of tsfresh, MinimalFeatureExtractionSettings no longer exists; it is now called MinimalFCParameters. You would therefore write the following code:
from tsfresh.feature_extraction import extract_features, MinimalFCParameters
...
minimalFCParametersForTsFresh = MinimalFCParameters()
extracted_features = extract_features(df_to_extract, column_id="id", default_fc_parameters=minimalFCParametersForTsFresh)
