rpy2: how to convert str vector into numeric vector

Does anyone know how to convert a str vector in rpy2 to a numeric vector?
r('x_num = as.numeric(x)')
works, but x_num then exists only in the R environment, so I can't use it from Python.
I tried:
x_num = base.as_numeric(x)
r('class(x_num)')
which shows:
'<StrVector - Python:0x7fe602a54d88 / R:0xa06bb28>
[str]'
I want to do this because, when I pass a numpy array to robjects.FloatVector, the resulting object is reported as a str vector, which causes problems in my further analysis.
e.g.
x = pd.read_csv('x.csv', index_col=0).values.flatten()
x_ro = robjects.FloatVector(x)
r('class(x_ro)')
'<StrVector - Python:0x7fe605062098 / R:0xa16c158>
[str]'
Thank you very much!
Edit:
I had already added x_ro to the R environment; I forgot to copy that line here:
robjects.globalenv["x_ro"] = x_ro

Regarding your first problem: if x_num has the type you want in the R environment, you can get a view of it in Python with numpy.asarray() (as described in the documentation). Changes you make to this array in Python will also affect the underlying R vector:
my_view = numpy.asarray(r("x_num"))
The conversion can also happen automatically if you add these lines:
from rpy2.robjects import numpy2ri
numpy2ri.activate()
After that, calling r("x_num") should return a numpy array where possible.
Also, in your last snippet, are you sure this is the same x_ro object, given that you are not binding it in the R environment?
I think you should do something like:
x_ro = robjects.FloatVector(x)
robjects.globalenv["x_ro"] = x_ro
Then run r('class(x_ro)') again and check whether the output is correct.

It is easier to identify an issue with a fully working example. Without one, I am tempted to say that it is working as expected.
In [1]: import rpy2.robjects as ro
In [2]: ro.vectors.FloatVector((1,2,3,4,5))
Out[2]:
<FloatVector - Python:0x7f3541c68788 / R:0x3541468>
[1.000000, 2.000000, 3.000000, 4.000000, 5.000000]
In [3]: ro.vectors.FloatVector(('1','2','3','4','5'))
Out[3]:
<FloatVector - Python:0x7f353bff7d88 / R:0x3541398>
[1.000000, 2.000000, 3.000000, 4.000000, 5.000000]
In [4]: ro.vectors.FloatVector(('1','2','3','a','5'))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-263bdc61f184> in <module>()
----> 1 ro.vectors.FloatVector(('1','2','3','a','5'))
/usr/local/lib/python3.5/dist-packages/rpy2/robjects/vectors.py in __init__(self, obj)
454
455 def __init__(self, obj):
--> 456 obj = FloatSexpVector(obj)
457 super(FloatVector, self).__init__(obj)
458
ValueError: Error while trying to convert element 3 to a double.
In [5]: ro.vectors.FloatVector(ro.vectors.StrVector(('1','2','3','a','5')))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-26578834d7ec> in <module>()
----> 1 ro.vectors.FloatVector(ro.vectors.StrVector(('1','2','3','a','5')))
/usr/local/lib/python3.5/dist-packages/rpy2/robjects/vectors.py in __init__(self, obj)
454
455 def __init__(self, obj):
--> 456 obj = FloatSexpVector(obj)
457 super(FloatVector, self).__init__(obj)
458
ValueError: Invalid SEXP type '16' (should be 14).
Having established that we are able to build R vectors of float from Python, we can look at whether binding it to a symbol in R and accessing that object from R makes any difference. It does not:
In [1]: import rpy2.robjects as ro
In [2]: v = ro.vectors.FloatVector((1,2,3,4,5))
In [3]: ro.globalenv['v'] = v
In [4]: ro.r("print(v)")
[1] 1 2 3 4 5
Out[4]:
<FloatVector - Python:0x7fb4791e5f08 / R:0x2f7eed0>
[1.000000, 2.000000, 3.000000, 4.000000, 5.000000]
In [5]: ro.r("class(v)")
Out[5]:
<StrVector - Python:0x7fb4791e5548 / R:0x2d02658>
['numeric']
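Note that the result of class(...) is itself a character vector, so rpy2 always wraps it in a StrVector; what matters is the element it contains ('numeric' above). If the values read from the CSV really are strings, one option is to convert them on the Python side before building the FloatVector. The following is only a sketch under that assumption: the sample array is a hypothetical stand-in for the CSV data, and the rpy2 calls are left as comments so the snippet runs without R installed.

```python
import numpy as np

# Hypothetical stand-in for pd.read_csv('x.csv', index_col=0).values.flatten()
# when the column was read as text.
x = np.array(["1.5", "2", "3.25"], dtype=object)
print(x.dtype)               # object: the elements are Python strings

# Convert to floats explicitly; this raises ValueError if an element
# cannot be parsed as a number.
x_float = x.astype(float)
print(x_float.dtype)

# With rpy2 available, the converted array could then be handed to R:
#   import rpy2.robjects as ro
#   ro.globalenv["x_ro"] = ro.FloatVector(x_float)
#   ro.r("class(x_ro)")      # a StrVector holding the class name
```

Checking x.dtype first shows whether the problem lies in the data or only in how the class(...) output is being read.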

Related

TypeError: uniform() got an unexpected keyword argument 'low'/'size'

I am visualizing a neural network result, and this is what shows up:
def apply_net(y_in):
    global w, b
    z=dot(w, y_in)+b
    return(1/(1+exp(-z)))
N0=2
N1=1
w=random.uniform(low=-10,high=+10,size=(N1,N0)) # random weights: N1xN0
b=random.uniform(low=-1,high=+1,size=N1) #biases: N1 vector
TypeError Traceback (most recent call last)
in ()
2 N1=1
3
----> 4 w=random.uniform(low=-10,high=+10,size=(N1,N0)) # random weights: N1xN0
5 b=random.uniform(low=-1,high=+1,size=N1) #biases: N1 vector
TypeError: uniform() got an unexpected keyword argument 'low'
If I remove low and high and keep (-10, 10, size=(N1,N0)), it says:
TypeError: uniform() got an unexpected keyword argument 'size'
If I remove size then it says:
TypeError: uniform() takes 3 positional arguments but 4 were given
What is going wrong?

You must call these functions through the library name or its alias:
import numpy as np
def apply_net(y_in):
    global w, b
    z=np.dot(w, y_in)+b
    return(1/(1+np.exp(-z)))
N0=2
N1=1
w=np.random.uniform(low=-10,high=+10,size=(N1,N0)) # random weights: N1xN0
b=np.random.uniform(low=-1,high=+1,size=N1) #biases: N1 vector

random.uniform must refer to numpy's random module, so either import that module under an alias (using 'as') or just import numpy and call numpy.random.uniform.
An example using an alias:
from numpy import random as np_random
Then call np_random.uniform().
(I figured this out and had to use it to solve the problem; note that I ran it on Colab.)
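The error messages in the question are consistent with the name random referring to Python's standard library module rather than numpy.random. A small side-by-side sketch of the two functions (the variable names follow the question; the comparison itself is an illustration, not part of the original answers):

```python
import random

import numpy as np

# Standard library: random.uniform(a, b) takes two positional bounds and
# returns a single float; it has no low/high/size keywords.
scalar = random.uniform(-10, 10)

# NumPy: numpy.random.uniform accepts low, high and size and returns an array.
N0, N1 = 2, 1
w = np.random.uniform(low=-10, high=10, size=(N1, N0))  # weights, shape (N1, N0)
b = np.random.uniform(low=-1, high=1, size=N1)          # biases, shape (N1,)

print(type(scalar).__name__, w.shape, b.shape)
```

Because the standard library function is a bound method taking only the two bounds, passing a third positional value produces the "takes 3 positional arguments but 4 were given" message.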

How to deal with the different type of np.array(list) in py2 and py3 when using pyhdf?

I want to save something as a variable in an HDF file using pyhdf.
This is my code:
import numpy as np
from pyhdf.SD import *
var = 'PRESSURE_INDEPENDENT_SOURCE'
vartype = 4
hdf4 = SD('./a.hdf', 2 | 4)
dset = hdf4.create(var, vartype, (1,13))
a = 'AFGL_1976'
b = np.array([list(a.ljust(13))])
dset[:] = b
It works in py2, where b.dtype is |S1.
In py3, however, b.dtype is <U1, and I get this error when running the last line of my code:
TypeError: Cannot cast array data from dtype('<U1') to dtype('S1') according to the rule 'safe'
If I add b = b.astype('S1') in py3, I get the same error, even though b.dtype is then |S1.

Try specifying the dtype when building the array:
b = np.array(list(a.ljust(13)),dtype='S1')
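As a sketch of what the dtype argument changes on the NumPy side (pyhdf itself is not used here, and whether this resolves the pyhdf cast error is not confirmed by the thread), the following builds the same 1 x 13 byte-string array in two ways under Python 3:

```python
import numpy as np

a = "AFGL_1976"
padded = a.ljust(13)  # pad with spaces to the dataset width of 13

# Option 1: let NumPy encode each character while building the array.
# The outer list keeps the (1, 13) shape used in the question.
b1 = np.array([list(padded)], dtype="S1")

# Option 2: encode explicitly, then view the bytes as 1-byte strings.
b2 = np.frombuffer(padded.encode("ascii"), dtype="S1").reshape(1, 13)

print(b1.dtype, b1.shape)
print((b1 == b2).all())
```

Both arrays hold bytes rather than unicode characters, which is what the S1 dataset type expects.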

Unable to create a tensor using torch.Tensor

I was trying to create a tensor as below.
import torch
t = torch.tensor(2,3)
I got the following error:
TypeError Traceback (most recent call last)
in ()
----> 1 a=torch.tensor(2,3)
TypeError: tensor() takes 1 positional argument but 2 were given
So I tried the following:
import torch
t = torch.Tensor(2,3)
# No error while creating the tensor
# When I print it, I get an error
print(t)
I get the following error:
RuntimeError Traceback (most recent call last)
in ()
----> 1 print(a)
D:\softwares\anaconda\lib\site-packages\torch\tensor.py in __repr__(self)
55 # characters to replace unicode characters with.
56 if sys.version_info > (3,):
---> 57 return torch._tensor_str._str(self)
58 else:
59 if hasattr(sys.stdout, 'encoding'):
D:\softwares\anaconda\lib\site-packages\torch\_tensor_str.py in _str(self)
216 suffix = ', dtype=' + str(self.dtype) + suffix
217
--> 218 fmt, scale, sz = _number_format(self)
219 if scale != 1:
220 prefix = prefix + SCALE_FORMAT.format(scale) + ' ' * indent
D:\softwares\anaconda\lib\site-packages\torch\_tensor_str.py in _number_format(tensor, min_sz)
94 # TODO: use fmod?
95 for value in tensor:
---> 96 if value != math.ceil(value.item()):
97 int_mode = False
98 break
RuntimeError: Overflow when unpacking long
However, according to another SO post, the author there was able to create a tensor this way. Am I missing something here? Also, why was I able to create a tensor with Tensor (capital T) but not with tensor (lowercase t)?

torch.tensor() expects data (a sequence or array_like) from which to create a tensor, whereas the torch.Tensor() class can create a tensor from shape information alone. A tensor created that way is uninitialized, so its contents are arbitrary.
Here's the signature of torch.tensor():
Docstring:
tensor(data, dtype=None, device=None, requires_grad=False) -> Tensor
Constructs a tensor with :attr:data.
Args:
data (array_like): Initial data for the tensor. Can be a list, tuple,
NumPy ndarray, scalar, and other types.
dtype (:class:torch.dtype, optional): the desired data type of returned tensor.
Regarding the RuntimeError: I cannot reproduce the error on Linux. Printing the tensor works perfectly fine from an ipython terminal.
Taking a closer look at the error, this seems to be a problem only on Windows. As mentioned in the comments, have a look at issue 6339: Error when printing tensors containing large values

numpy TypeError: ufunc 'invert' not supported for the input types, and the inputs

For the code below:
def makePrediction(mytheta, myx):
    # -----------------------------------------------------------------
    pr = sigmoid(np.dot(myx, mytheta))
    pr[pr < 0.5] =0
    pr[pr >= 0.5] = 1
    return pr
# -----------------------------------------------------------------
# Compute the percentage of samples I got correct:
pos_correct = float(np.sum(makePrediction(theta,pos)))
neg_correct = float(np.sum(np.invert(makePrediction(theta,neg))))
tot = len(pos)+len(neg)
prcnt_correct = float(pos_correct+neg_correct)/tot
print("Fraction of training samples correctly predicted: %f." % prcnt_correct)
I get this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-33-f0c91286cd02> in <module>()
13 # Compute the percentage of samples I got correct:
14 pos_correct = float(np.sum(makePrediction(theta,pos)))
---> 15 neg_correct = float(np.sum(np.invert(makePrediction(theta,neg))))
16 tot = len(pos)+len(neg)
17 prcnt_correct = float(pos_correct+neg_correct)/tot
TypeError: ufunc 'invert' not supported for the input types, and the inputs
Why is it happening and how can I fix it?

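np.invert requires ints or bools, so convert the prediction array with astype(bool) first, or use np.logical_not, which also accepts floats. (np.linalg.inv is not a substitute: it computes a matrix inverse.)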

From the documentation:
Parameters:
x : array_like.
Only integer and boolean types are handled.
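Your original array is a floating-point type (the return value of sigmoid()); setting values in it to 0 and 1 won't change the type. You need to convert it first. Note that astype(np.int) no longer works in recent NumPy releases because the np.int alias was removed, and that inverting an integer array is a bitwise NOT (0 becomes -1, 1 becomes -2), so converting to bool is what gives a logical flip:
neg_correct = float(np.sum(np.invert(makePrediction(theta,neg).astype(bool))))
should do it (untested).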
Doing that, the float() cast you have also makes more sense. Though I would just remove the cast, and rely on Python doing the right thing.
In case you are still using Python 2 (but please use Python 3), just add
from __future__ import division
to let Python do the right thing (it won't hurt if you do it in Python 3; it just doesn't do anything). With that (or in Python 3 anyway), you can remove numerous other float() casts you have elsewhere in your code, improving readability.
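To make the difference between the input types concrete, here is a minimal sketch with a made-up array standing in for the sigmoid outputs (the values and variable names are illustrative only):

```python
import numpy as np

pr = np.array([0.2, 0.7, 0.4, 0.9])   # hypothetical stand-in for sigmoid outputs
pred = pr >= 0.5                       # boolean predictions

flipped = np.invert(pred)              # logical NOT on a bool array
neg_correct = int(np.sum(flipped))     # number of predictions below 0.5

as_int = np.invert(pred.astype(int))   # bitwise NOT on ints: 0 -> -1, 1 -> -2

try:
    np.invert(pred.astype(float))      # float input is rejected by the ufunc
    float_ok = True
except TypeError:
    float_ok = False

print(flipped, neg_correct, as_int, float_ok)
```

Building the predictions as a boolean array from the start (pr >= 0.5) avoids the conversion step altogether.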

Python TypeError on Load Object using Dill

Trying to render a large and (possibly very) unpicklable object to a file for later use.
No complaints on the dill.dump(file) side:
In [1]: import echonest.remix.audio as audio
In [2]: import dill
In [3]: audiofile = audio.LocalAudioFile("/Users/path/Track01.mp3")
en-ffmpeg -i "/Users/path/audio/Track01.mp3" -y -ac 2 -ar 44100 "/var/folders/X2/X2KGhecyG0aQhzRDohJqtU+++TI/-Tmp-/tmpWbonbH.wav"
Computed MD5 of file is b3820c166a014b7fb8abe15f42bbf26e
Probing for existing analysis
In [4]: with open('audio_object_dill.pkl', 'wb') as f:
   ...:     dill.dump(audiofile, f)
   ...:
In [5]:
But trying to load the .pkl file:
In [1]: import dill
In [2]: with open('audio_object_dill.pkl', 'rb') as f:
   ...:     audio_object = dill.load(f)
   ...:
Returns following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-2-203b696a7d73> in <module>()
1 with open('audio_object_dill.pkl', 'rb') as f:
----> 2 audio_object = dill.load(f)
3
/Users/mikekilmer/Envs/GLITCH/lib/python2.7/site-packages/dill-0.2.2.dev-py2.7.egg/dill/dill.pyc in load(file)
185 pik = Unpickler(file)
186 pik._main_module = _main_module
--> 187 obj = pik.load()
188 if type(obj).__module__ == _main_module.__name__: # point obj class to main
189 try: obj.__class__ == getattr(pik._main_module, type(obj).__name__)
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.pyc in load(self)
856 while 1:
857 key = read(1)
--> 858 dispatch[key](self)
859 except _Stop, stopinst:
860 return stopinst.value
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.pyc in load_newobj(self)
1081 args = self.stack.pop()
1082 cls = self.stack[-1]
-> 1083 obj = cls.__new__(cls, *args)
1084 self.stack[-1] = obj
1085 dispatch[NEWOBJ] = load_newobj
TypeError: __new__() takes at least 2 arguments (1 given)
The AudioObject is much more complex (and large) than the class object the above calls are made on (from SO answer), and I'm unclear as to whether I need to send a second argument via dill, and if so, what that argument would be or how to tell if any approach to pickling is viable for this specific object.
Examining the object itself a bit:
In [4]: for k, v in vars(audiofile).items():
   ...:     print k, v
   ...:
returns:
is_local False
defer False
numChannels 2
verbose True
endindex 13627008
analysis <echonest.remix.audio.AudioAnalysis object at 0x103c61bd0>
filename /Users/mikekilmer/Envs/GLITCH/glitcher/audio/Track01.mp3
convertedfile /var/folders/X2/X2KGhecyG0aQhzRDohJqtU+++TI/-Tmp-/tmp9ADD_Z.wav
sampleRate 44100
data [[0 0]
[0 0]
[0 0]
...,
[0 0]
[0 0]
[0 0]]
And audiofile.analysis seems to contain an attribute called audiofile.analysis.source which contains (or apparently points back to) audiofile.analysis.source.analysis

In this case, the answer lay within the module itself.
The LocalAudioFile class provides (and each of its instances can therefore use) its own save method, called via LocalAudioFile.save or, more likely, the_audio_object_instance.save.
In the case of an .mp3 file, the LocalAudioFile instance consists of a pointer to a temporary .wav file which is the decompressed version of the .mp3, along with a whole bunch of analysis data which is returned from the initial audiofile, after it's been interfaced with the (internet-based) Echonest API.
LocalAudioFile.save calls shutil.copyfile(path_to_wave, wav_path) to save the .wav file with same name and path as original file linked to audio object and returns an error if the file already exists. It calls pickle.dump(self, f) to save the analysis data to a file also in the directory the initial audio object file was called from.
The LocalAudioFile object can be reintroduced simply via pickle.load().
Here's an IPython session in which I used dill, a very useful wrapper or interface that offers most of the standard pickle methods plus a bunch more:
In [1]: import echonest.remix.audio as audio
In [2]: import dill
# create the audio_file object
In [3]: audiofile = audio.LocalAudioFile("/Users/mikekilmer/Envs/GLITCH/glitcher/audio/Track01.mp3")
en-ffmpeg -i "/Users/path/audio/Track01.mp3" -y -ac 2 -ar 44100 "/var/folders/X2/X2KGhecyG0aQhzRDohJqtU+++TI/-Tmp-/tmp_3Ei0_.wav"
Computed MD5 of file is b3820c166a014b7fb8abe15f42bbf26e
Probing for existing analysis
#call the LocalAudioFile save method
In [4]: audiofile.save()
Saving analysis to local file /Users/path/audio/Track01.mp3.analysis.en
#confirm the object is valid by reading its duration
In [5]: audiofile.duration
Out[5]: 308.96
#delete the object - there's probably a "correct" way to do this
In [6]: audiofile = 0
#confirm it's no longer an audio_object
In [7]: audiofile.duration
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-12-04baaeda53a4> in <module>()
----> 1 audiofile2.duration
AttributeError: 'int' object has no attribute 'duration'
#open the pickled version (using dill)
In [8]: with open('/Users/path/audio/Track01.mp3.analysis.en') as f:
   ....:     audiofile = dill.load(f)
   ....:
#confirm it's a valid LocalAudioFile object
In [9]: audiofile.duration
Out[8]: 308.96
Echonest is a very robust API and the remix package provides a ton of functionality. There's a small list of relevant links assembled here.
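As for the original TypeError, the traceback shows pickle calling cls.__new__(cls, *args) while rebuilding the object. One common way to hit that error is a class whose __new__ requires an argument that pickle does not know about. The sketch below reproduces this with a toy class using the standard pickle module under Python 3; the classes Plain and WithNewArgs are hypothetical and are not the echonest implementation.

```python
import pickle


class Plain:
    """Hypothetical class whose __new__ requires an extra argument."""

    def __new__(cls, path):
        return super().__new__(cls)

    def __init__(self, path):
        self.path = path


class WithNewArgs(Plain):
    """Same class, but it tells pickle which arguments to pass to __new__."""

    def __getnewargs__(self):
        return (self.path,)


blob = pickle.dumps(Plain("Track01.mp3"))  # dumping succeeds without complaint
try:
    pickle.loads(blob)
    load_failed = False
except TypeError:
    load_failed = True                     # __new__ was called without path

restored = pickle.loads(pickle.dumps(WithNewArgs("Track01.mp3")))
print(load_failed, restored.path)
```

This mirrors the pattern in the question, where dumping raised no error but loading did. When a library ships its own save and load mechanism, as described above, using it avoids having to work out such details.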
