I am facing an issue when trying to execute the R code below from Python with rpy2.
from rpy2.robjects import r
import rpy2.robjects as ro
from rpy2.robjects.conversion import localconverter
from rpy2.robjects import pandas2ri
from rpy2.robjects.packages import importr
stats = importr("stats")
with localconverter(ro.default_converter + pandas2ri.converter):
    Rdataframe2 = ro.conversion.py2rpy(dtw)
rdism = r["as.dist"](Rdataframe2)
ttclust = r.hclust(rdism)
ttclusterange = r.cutree(ttclust, k='1:3')
I can't find a way to pass the argument k=1:3 to the cutree function.
I keep receiving this error message:
""elements of 'k' must be between 1 and %d", :
missing value where TRUE/FALSE needed
It seems that I can't find the right syntax to execute the last line.
Can someone please help me solve this issue?
In R, the expression 1:3 generates the vector c(1, 2, 3). However, you are not evaluating it in R: rpy2 passes '1:3' to cutree as a character string. Try passing an equivalent list [1, 2, 3] instead, or build it with list(range(1, 3 + 1)). That is:
r.cutree(ttclust, k=list(range(1, 3 + 1)))
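A slightly more explicit variant is to build the R integer vector yourself. This is only a sketch: it assumes ttclust is the hclust result from the question, the helper names k_values and cut_tree_range are made up for illustration, and rpy2 is imported lazily inside the function so the pure-Python part can be tried without R installed.

```python
def k_values(start, stop):
    """Python equivalent of R's start:stop, which includes both endpoints."""
    return list(range(start, stop + 1))


def cut_tree_range(tree, start, stop):
    """Call R's cutree with k = start:stop (hypothetical helper)."""
    # Imported here so that k_values() can be used without rpy2 or R.
    from rpy2.robjects import IntVector, r

    # IntVector makes the R-side type explicit: an integer vector.
    return r.cutree(tree, k=IntVector(k_values(start, stop)))


print(k_values(1, 3))  # [1, 2, 3]
```

With the objects from the question, the call would be cut_tree_range(ttclust, 1, 3).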
I have an rpy2 script:
from rpy2.robjects.packages import importr
binom = importr('binom')
from rpy2 import robjects
robjects.r('''library(binom)
p = seq(0,1,.01)
coverage = binom.coverage(p, 10, method="bayes", type = "central")$coverage
''')
I'd like to use it to compare the results from a list of methods please:
methods = [("bayes", type = "central"),("asymptotic")]
for method in methods:
    robjects.globalenv["method"] = robjects.r(method)
    robjects.r('''library(binom)
    p = seq(0,1,0.01)
    coverage = binom.coverage(p, 10, method=method)$coverage
    ''')
The first line gives me:
invalid syntax
I'd also like to include the 'type' argument for the Bayes method, but even when I drop it to make the list syntactically valid, I still get the error:
object 'bayes' not found
robjects.r() takes a string of R code, so for this particular task you can substitute a placeholder with the right text before evaluating it. Using both kinds of quotes does the trick: the outer double quotes only delimit the Python string, while the inner single quotes are part of its content, so .replace() carries them into the R code as R string quotes.
from rpy2.robjects.packages import importr
binom = importr('binom')
from rpy2 import robjects
methods = ["'bayes', type='central'","'asymptotic'"]
for method in methods:
    r_string = """library(binom)
    p = seq(0,1,0.01)
    coverage = binom.coverage(p, 10, method=TECHNIQUE)$coverage
    """.replace('TECHNIQUE',method)
    robjects.r(r_string)
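If the methods differ only in their arguments, another option is to skip string substitution and pass keyword arguments to the function object that importr returns. The following is a sketch, assuming the binom package is installed in R and that importr exposes binom.coverage under the usual dot-to-underscore name binom_coverage; method_args and coverage_for are illustrative names, not part of any library.

```python
# One dict of R keyword arguments per method to compare.
method_args = [
    {"method": "bayes", "type": "central"},
    {"method": "asymptotic"},
]


def coverage_for(kwargs, n=10):
    """Return the coverage column of binom.coverage for one method."""
    # Imported lazily so the argument list above can be inspected without R.
    from rpy2.robjects import r
    from rpy2.robjects.packages import importr

    binom = importr("binom")
    p = r.seq(0, 1, by=0.01)
    # Assumption: importr maps the R name binom.coverage to binom_coverage.
    result = binom.binom_coverage(p, n, **kwargs)
    # rx2("coverage") is the rpy2 counterpart of R's result$coverage.
    return result.rx2("coverage")
```

Looping with `for kwargs in method_args: coverage = coverage_for(kwargs)` then evaluates each method in turn, and the 'type' argument travels with the Bayes entry only.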
I have two lists, and I would like to run the Wilcoxon rank-sum test on them.
I saw that there is a scipy.stats.ranksums function, but it only offers the two-sided test.
How can I do a one-sided test in Python?
As far as I can tell, a one-sided option only exists for the Wilcoxon signed-rank test.
This may not be ideal, but you can call R's wilcox.test through rpy2:
import numpy as np
from rpy2.robjects import FloatVector
from rpy2.robjects.packages import importr
stats = importr('stats')
x = np.random.poisson(1,size=20)
y = np.random.poisson(3,size=20)
test = stats.wilcox_test(FloatVector(x),FloatVector(y),alternative='less')
d = { key : test.rx2(key)[0] for key in ['statistic','p.value','alternative'] }
d
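For a route that stays in Python, the Mann-Whitney U test is equivalent to the Wilcoxon rank-sum test, and scipy.stats.mannwhitneyu accepts an alternative argument. The sketch below assumes a SciPy version that provides this argument; the sample data are made up for illustration.

```python
from scipy.stats import mannwhitneyu

# Made-up samples: every value in x is smaller than every value in y.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

# alternative="less" tests whether x tends to be smaller than y,
# mirroring alternative='less' in R's wilcox.test(x, y).
statistic, p_less = mannwhitneyu(x, y, alternative="less")

# The opposite one-sided hypothesis, for comparison.
_, p_greater = mannwhitneyu(x, y, alternative="greater")

print(statistic, p_less, p_greater)
```

Here p_less should be very small and p_greater close to 1, since the data were chosen so that x lies entirely below y.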
I wrote a function in Python 2.7:
# Python #
def function_py(par):
    #something happens
    return(value)
and I want to use this function as an argument for another function in R. More precisely, I want to compute the Sobol' indices using the following function:
# R #
library('sensitivity')
sobol(function_py_translated, X1,X2)
where function_py_translated would be the R equivalent of function_py.
I'm trying to use the rpy2 module, and for a simple function, I could make a working case:
import rpy2.rinterface as ri
import rpy2.robjects.numpy2ri
from rpy2 import robjects
from rpy2.robjects.packages import importr
sensitivity = importr('sensitivity')
# Fetch R's own "+" so the addition runs on R objects.
radd = ri.baseenv.get('+')
def costfun(X):
    a = X[0]
    b = X[1]
    return radd(a, b)
# Expose the Python function to R as a callable.
costfunr = ri.rternalize(costfun)
X1 = robjects.r('data.frame(matrix(rnorm(2*1000), nrow = 1000))')
X2 = robjects.r('data.frame(matrix(rnorm(2*1000), nrow = 1000))')
sobinde = sensitivity.sobol(costfunr, X1, X2)
print(sobinde[11])
The main problem is that I had to redefine the "+". Is there a way to work around this, so that I can pass an arbitrary function without prior transformation? The function I want to analyze is much more complicated.
Thank you very much for your time
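One possible way around redefining "+" is to convert the R columns to plain Python values inside the wrapper, run the unmodified Python function on them, and convert only the result back to an R vector. The following is a sketch under that assumption and has not been checked against the sensitivity package; model and make_r_model are hypothetical names, and model stands in for the real, more complicated function.

```python
def model(a, b):
    """Plain-Python cost function: element-wise sum of two columns."""
    return [x + y for x, y in zip(a, b)]


def make_r_model():
    """Wrap model() so that R code such as sobol() can call it (sketch)."""
    # Lazy imports: importing rpy2.robjects also initializes the embedded R.
    import rpy2.robjects  # noqa: F401
    import rpy2.rinterface as ri

    def costfun(X):
        # X arrives as an R data.frame, i.e. a list of numeric columns.
        a = list(X[0])
        b = list(X[1])
        # Hand the result back as an R numeric vector, one value per row.
        return ri.FloatSexpVector(model(a, b))

    return ri.rternalize(costfun)


print(model([1.0, 2.0], [0.5, 0.25]))  # [1.5, 2.25]
```

With the objects from the question, the call would become sensitivity.sobol(make_r_model(), X1, X2). The trade-off is that every evaluation copies the columns into Python lists, which may be slow for large designs.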
It's easy to use the apriori algorithm from the arules package:
import rpy2.interactive as r
arules = r.packages.importr("arules")
from collections import OrderedDict
from rpy2.robjects.vectors import ListVector
od = OrderedDict()
od["supp"] = 0.0005
od["conf"] = 0.7
od["target"] = 'rules'
result = ListVector(od)
my_rules = arules.apriori(dataset, parameter=result)
However, the subset function for the resulting rules takes an R expression in its subset parameter:
rules.sub <- subset(rules, subset = rhs %in% "marital-status=Never-married" & lift > 2)
Is it possible to use this subset function with rpy2?
If subset is (re)defined in the R package arules, the object arules obtained from importr will contain it. In your Python code this will look like arules.subset.
The parameter subset is a slightly different story because it is an R expression. There can be several ways to tackle this. One of them is to wrap it in an ad-hoc R function.
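from rpy2.robjects import r
def mysubset(rules, subset_str):
    # Build an ad-hoc R function around the expression, then call it.
    rfunc = r("function(rules) { arules::subset(rules, subset = %s) }" % subset_str)
    return rfunc(rules)
# Single quotes inside the expression avoid clashing with the Python string.
rules_sub = mysubset(rules,
                     "rhs %in% 'marital-status=Never-married' & lift > 2")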
I need to do some computations in a Python program, and I would prefer to do some of them in R. Is it possible to embed R code in Python?
You should take a look at rpy and its documentation.
This allows you to do:
from rpy import *
And then you can use the object called r to do computations just like you would do in R.
Here is an example extracted from the doc:
>>> from rpy import *
>>>
>>> degrees = 4
>>> grid = r.seq(0, 10, length=100)
>>> values = [r.dchisq(x, degrees) for x in grid]
>>> r.par(ann=0)
>>> r.plot(grid, values, type='lines')
RPy is your friend for this type of thing.
The scipy, numpy and matplotlib packages all do similar things to R and are very complete, but if you want to mix the languages, RPy is the way to go!
from rpy2.robjects import *
def main():
    degrees = 4
    grid = r.seq(0, 10, length=100)
    values = [r.dchisq(x, degrees) for x in grid]
    r.par(ann=0)
    r.plot(grid, values, type='l')
if __name__ == '__main__':
    main()
When I need to do R calculations, I usually write R scripts and run them from Python using the subprocess module. I chose this approach because the version of R I had installed (2.16, I think) wasn't compatible with RPy at the time (which wanted 2.14).
So if you already have your R installation "just the way you want it", this may be a better option.
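As a rough illustration of the subprocess approach, the sketch below runs a script through the Rscript command-line tool. It assumes Rscript is on the PATH; the helper names build_command and run_r_script and the file names are made up for the example.

```python
import subprocess


def build_command(script_path, args=()):
    """Command line for running an R script non-interactively."""
    # --vanilla skips saved workspaces and user profiles.
    return ["Rscript", "--vanilla", script_path, *map(str, args)]


def run_r_script(script_path, args=()):
    """Run the script and return whatever it printed to stdout."""
    completed = subprocess.run(
        build_command(script_path, args),
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout


print(build_command("analysis.R", [10, "input.csv"]))
```

Inside the R script, commandArgs(trailingOnly = TRUE) returns the extra arguments, and anything the script prints can be parsed from the string that run_r_script returns.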
Using rpy2.robjects (I tried it and ran some sample R programs):
from rpy2.robjects import r
print(r('''
# Create a vector.
apple <- c('red','green',"yellow")
print(apple)
# Get the class of the vector.
print(class(apple))
##########################
# Create the data for the chart.
v <- c(7,12,28,3,41)
# Give the chart file a name.
png(file = "line_chart.png")
# Plot the line chart.
plot(v,type = "o")
# Save the file.
dev.off()
##########################
# Give the chart file a name.
png(file = "scatterplot_matrices.png")
# Plot the matrices between 4 variables giving 12 plots.
# One variable with 3 others and total 4 variables.
pairs(~wt+mpg+disp+cyl,data = mtcars,
main = "Scatterplot Matrix")
# Save the file.
dev.off()
install.packages("plotly") # Please select a CRAN mirror for use in this session
library(plotly) # to load "plotly"
'''))