rpy2 : DTW package not working with step pattern provided - python

I am currently trying to use rpy2 to access "R" DTW (Dynamic time warping) package to calculate distance between multivariate time series. Maybe since the time series are really different that I am getting the error "No warping path exists that is allowed by costraints"
I think the default step pattern in the library is symmetric but i want to test it out with asymmetric step pattern , but the toy code is not working when i try to give step pattern as "asymmetric" . Following is the code
import numpy as np
from sklearn import cluster
import rpy2.robjects.numpy2ri
rpy2.robjects.numpy2ri.activate()
from rpy2.robjects.packages import importr
import rpy2.robjects as robj
"""Example of DTW calculation, it's 2 variables , 5 timestamps and 16 timestamps each"""
R = rpy2.robjects.r
DTW = importr('dtw')
# Generate our data
template = np.array([[1,2,3,4,5],[1,2,3,4,5]]).transpose()
rt,ct = template.shape
query = np.array([[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]]).transpose()
rq,cq = query.shape
templateR=R.matrix(template,nrow=rt,ncol=ct)
queryR=R.matrix(query,nrow=rq,ncol=cq)
alignment = R.dtw(templateR,queryR,keep=True, step_pattern=R.asymmetric)
alignment = R.dtw(templateR, queryR, keep=True)
dist = alignment.rx('distance')[0][0]
print dist
I have also looked into following example but it's not working:
rpy2 dtw missing argument window.size
Thanks !

The default step pattern in the library is symmetric2. To test with others type:
alignment = R.dtw(templateR,queryR,keep=True, step_pattern=DTW.asymmetric)
not R.asymmetric
Hope it helps

Related

How to import a function from an R package as if it was native Python function and use all its outputs?

There is a function called dea(x, y, *args) in library(Benchmarking) which returns useful objects. I've described 3 key ones below:
crs = dea(mydata_matrix_x, my_data_matrix_y, RTS="IN", ORIENTATION= "in") # both matrixes have N rows
efficiency(crs) # a 'numeric' type object which looks like a 1xN vector
peers(crs) # A matrix: Nx2 (looks to me like a pandas dataframe when run in .ipynb file with R kernel)
lambda(crs) # A matrix: Nx2 of type dbl (also looks like a dataframe)
Now I would like to programatically vary my_data_matrix_x. This matrix represents my inputs. At first it will be a Nx10 matrix. However I intend to drop each column sequentially and run dea() on the Nx9 matrix, then graph the efficiency(crs) scores that come out. The issue is I have no idea how to achieve this in R (amongst other things) and would rather circumvent the issue by writing all my code in Python and importing this dea() function somehow from an R script
I believe the best solution available to me will be to read and write from files:
from Benchmarking_script.r import dea
def test_inputs(data, input):
INPUTS = ['input 1', 'input2', 'input3', 'input4,' 'input5']
OUTPUTS = ['output1', 'output2']
data_inputs = data.drop(f"{input}", axis=1)
data_outputs = data[OUTPUTS]
data_inputs.to_csv("my_inputs.csv")
data_outputs.to_csv("my_outputs.csv")
run Benchmarking.dea(data_inputs, data_outputs, RTS="crs", ORIENTATION="in")
clearly this last line won't work: I am interested to hear flexible (and simple!) ways to run this dea() function idiomatically as if it was a native Python function
Related SO questions
The closest answer on SO I've found has been Importing any function from an R package into python
When adapting the code I've written
import pandas as pd
data = pd.read_csv("path/to_data.csv")
import rpy2
import rpy2.robjects as robjects
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector
from rpy2.robjects.packages import importr
utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1)
packnames = ('Benchmarking')
utils.install_packages(StrVector(packnames))
Benchmarking = importr('Benchmarking')
crs = Benchmarking.dea(data['Age'], data['CO2'], RTS='crs', ORIENTATION='in')
--------------------------------------------------------------
NotImplementedError: Conversion 'py2rpy' not defined for objects of type '<class 'pandas.core.series.Series'>'
So importing the function natively as a Python file hasn't worked
The second approach is the way to go. You need to use a converter context so python and r variables would be converted automatically. Specifically, try pandas2ri submodule shipped with rpy2. Something like this:
from rpy2.robjects import pandas2ri
with pandas2ri:
crs = Benchmarking.dea(data['Age'], data['CO2'], RTS='crs', ORIENTATION='in')
If this doesn't work, update your post with the error.

RPY2 to use as.xts from XTS library

I am using RPY2 in python to call as.xts object from xts library. 'as' is a reserved word in python. Hence, i am unsure how to proceed use as.xts in my python code.
My aim is to use as.xts on an existing dataframe with time series column.
from rpy2.robjects import pandas2ri
pandas2ri.activate()
r_dataframe = pandas2ri.py2ri(pandas_df)
from rpy2.robjects.packages import importr
xts= importr('xts', lib_loc="local path to R library" , robject_translations = {".subset.xts": "_subset_xts2", "to.period": "to_period2"})
r_ts = as.xts(r_dataframe) # i am unsure of this step usage.
i expect a time series object in output of the last line of code. I am going to use forecast package on top of time series object.

Error calling a R function from python using rpy2 with survival library

When calling a function in the survival package in R from within python with the rpy2 interface I get the following error:
RRuntimeError: Error in formula[[2]] : subscript out of bounds
Any pointer to solve the issue please?
Thanks
Code:
import pandas as pd
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector
from rpy2.robjects.packages import importr
import rpy2.robjects as ro
R = ro.r
from rpy2.robjects import pandas2ri
pandas2ri.activate()
## install the survival package
utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1) # select the first mirror in the list
utils.install_packages(StrVector('survival'))
#Load the library and example data set
survival=importr('survival')
infert = R('infert')
## Linear model works fine
reslm=R.lm('case~spontaneous+induced',data=infert)
#Run the example clogit function, which fails
rescl=R.clogit('case~spontaneous+induced+strata(stratum)',data=infert)
After trying around, I found out, there is a difference, whether you offer the R instance of rpy2 the full R-code string to execute, or not.
Thus, you can make your function run, by giving as much as possible as R code:
#Run the example clogit function, which fails
rescl=R.clogit('case~spontaneous+induced+strata(stratum)',data=infert)
#But give the R code to be executed as one complete string - this works:
rescl=R('clogit(case ~ spontaneous + induced + strata(stratum), data = infert)')
If you capture the return value to a variable within R, you can inspect the data and get out the critical information of the model
by the usual functions in R.
E.g.
R('rescl.in.R <- clogit(case ~ spontaneous + induced + strata(stratum), data = infert)')
R('str(rescl.in.R)')
# or:
R('coef(rescl.in.R)')
## array([1.98587552, 1.40901163])
R('names(rescl.in.R)')
## array(['coefficients', 'var', 'loglik', 'score', 'iter',
## 'linear.predictors', 'residuals', 'means', 'method', 'n', 'nevent',
## 'terms', 'assign', 'wald.test', 'y', 'formula', 'xlevels', 'call',
## 'userCall'], dtype='<U17')
It helps a lot - at least in this first phase of using rpy2 (for me, too), to have your r instance open and trying the code in parallel which you do, since the output in R is far more readable and you know and see what you are doing and what you could address.
In Python, the output is stripped off of important informations (like the name etc) - and in addition, it is not pretty-printed.
This fails when including the strata() function within the formula because it's not evaluated in the right environment. In R, formulas are special language constructs and so they need to be treated separately by rpy2.
So, for your example, this would look like:
rescl = R.clogit(ro.Formula('case ~ spontaneous + induced + strata(stratum)'),
data = infert)
See the documentation for rpy2.robjects.Formula for more details. That documentation also discusses the pros & cons of this approach vs that provided by #Gwang-jin-kim

How to use Stargazer to print fit in rpy2

I'm learning how to use rpy2, and I would like to create formatted regression output using the stargazer package. My best guess of how to do this is the following code:
import pandas as pd
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
stargazer = importr('stargazer')
from rpy2.robjects import pandas2ri
pandas2ri.activate()
r = robjects.r
df = pd.DataFrame({'x': [1,2,3,4,5],
'y': [2,1,3,5,4]})
fit = r.lm('y~x', data=df)
print fit
print r.stargazer(fit)
However, when I run it, the I get the following output:
Coefficients:
(Intercept) x
0.6 0.8
[1] "\n"
[2] "% Error: Unrecognized object type.\n"
So the fit is being generated, and prints fine. But stargazer doesn't seem to recognize the fit object as something it can parse.
Any suggestions? Am I calling stargazer incorrectly in this context?
I should mention that I am running this in Python 2.7.5 on a windows 10 machine, with R 3.3.2, and rpy2 version 2.7.8 from the unofficial windows binary. So it could just be a problem with the windows build, but it seems odd that everything except stargazer would work.
I am not familiar with the R package stargazer but from a quick look at the documentation this seems to be the correct usage.
Before anything, you may want to check whether the issue is with execution or with printing. At which one of the two lines is this failing ?
p = r.stargazer(fit)
print(p)
If the failure is with the execution, you may want to move more code to R and see if you reach a point where you get it to work. If not, this is likely an issue with the R code and/or stargazer. If you get it to work the issue is on the rpy2/conversion side.
rcode = """
df <- data.frame(x = c(1,2,3,4,5),
y = c(2,1,3,5,4))
fit <- lm('y~x', data=df)
p <- stargazer(fit)
"""
# parse and evaluate the R code
r(rcode)
# intermediate objects can be retrieved from the `globalenv` to
# investigate where they differ from the ones obtained earlier.
# For example:
print(robjects.globalenv["p"])
Now that we showed that it is likely an issue on the stargazer side, we can make the use of arbitrary data frames a matter of binding it to a symbol in R's globalenv:
robjects.globalenv["df"] = df
rcode = """
fit <- lm('y~x', data=df)
p <- stargazer(fit)
"""
# parse and evaluate the R code
r(rcode)
print(robjects.globalenv["p"])

rpy2 for Holt Winters and UCM Model

I am new to modelling and I am trying to design 2 models through python which are :
Holt Winters
Unobserved Component Model
And I saw that these models are available in R. Can i use rpy2 to call these functions from R into python ?
Thank you
The rpy2 package allows you to pull in any r functions, bind to them in python, and interface with them in much the same way you would in R. Using a Pandas Dataframe with a timeseries index and a column called "consumptionPower". Of course your pandas Dataframe will be different.
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri
pandas2ri.activate()
ts = robjects.r('ts')
c = robjects.r('c')
forecast=importr('forecast')
HoltWinters = robjects.r('HoltWinters')
training_pd = hourly['consumptionPower'][:"2017-06-01 00:00:00"]
trainingRTS=ts(training_pd.values, start=c(2015,6), frequency=8760)
hwPower = HoltWinters(trainingRTS, seasonal="additive")

Categories

Resources