I occasionally meet this python construction: number + array
And I wonder what is the return value, is it number or array. What it does?
Example, where I met it is this:
def __init__(self, n):
self.wins = np.zeros( n )
self.trials = np.zeros(n )
def sample( self, n=1 ):
for k in range(n):
choice = np.argmax( rbeta( 1 + self.wins, 1 + self.trials - self.wins) )
choices[ k ] = choice
return
Note: I know almost nothing about Python
Your question is not about the syntax per se (addition is nothing special syntax-wise), but about addition methods of numpy arrays. For the numpy array objects, addition of scalars is implemented so that the result is an array where all elements are added the scalar.
In [1]: import numpy as np
In [2]: a = np.arange(0, 5)
In [3]: a
Out[3]: array([0, 1, 2, 3, 4])
In [4]: 1+a
Out[4]: array([1, 2, 3, 4, 5])
Suggested reading:
Python Data Model, specifically the part about object.__add__ and object.__radd__;
Tentative NumPy tutorial.
this isn't number + array
it is scalar + nparray.
it adds the scalar to each element of the np array
Related
I'm implementing this equation and using it for the set of frequencies nos:
The non vectorized code works:
import numpy as np
h = np.array([1,2,3])
nos = np.array([4, 5, 6, 7])
func = lambda h, no: np.sum([hk * np.exp(-1j * no * k) for k, hk in enumerate(h)])
# Not vectorized
resps = np.zeros(nos.shape[0], dtype='complex')
for i, no in enumerate(nos):
resps[i] = func(h, no)
print(resps)
> Out: array([-0.74378734-1.45446975j,
> -0.94989022+3.54991188j,
> 5.45190245+2.16854975j,
> 2.91801616-4.28579526j])
I'd like to vectorize the call in order to pass nos at once instead of explicitly iterating:
H = np.vectorize(func, excluded={'h'}, signature='(k),(n)->(n)')
resps = H(h, nos)
When calling H:
Error: ValueError: 0-dimensional argument does not have enough dimensions for all core dimensions ('n',)
I'm using the signature parameter but I'm not sure I use it in the correct way. Without this parameter there is an error in func:
TypeError: 'numpy.int32' object is not iterable
I don't understand where the problem is.
A list comprehension version of your loop:
In [15]: np.array([func(h,n) for n in nos])
Out[15]:
array([-0.74378734-1.45446975j, -0.94989022+3.54991188j,
5.45190245+2.16854975j, 2.91801616-4.28579526j])
vectorize - excluding the first argument (by position, not name), and scalar iteration on second.
In [16]: f=np.vectorize(func, excluded=[0])
In [17]: f(h,nos)
Out[17]:
array([-0.74378734-1.45446975j, -0.94989022+3.54991188j,
5.45190245+2.16854975j, 2.91801616-4.28579526j])
No need to use signature.
With true numpy vectorization (not the pseudo np.vectorize):
In [23]: np.sum(h * np.exp(-1j * nos[:,None] * np.arange(len(h))), axis=1)
Out[23]:
array([-0.74378734-1.45446975j, -0.94989022+3.54991188j,
5.45190245+2.16854975j, 2.91801616-4.28579526j])
I have a NumPy array with the following properties:
shape: (9986080, 2)
dtype: np.float32
I have a method that loops over the range of the array, performs an operation and then inputs result to new array:
def foo(arr):
new_arr = np.empty(arr.size, dtype=np.uint64)
for i in range(arr.size):
x, y = arr[i]
e, n = ''
if x < 0:
e = '1'
else:
w = '2'
if y > 0:
n = '3'
else:
s = '4'
new_arr[i] = int(f'{abs(x)}{e}{abs(y){n}'.replace('.', ''))
I agree with Iguananaut's comment that this data structure seems a bit odd. My biggest problem with it is that it is really tricky to try and vectorize the putting together of integers in a string and then re-converting that to an integer. Still, this will certainly help speed up the function:
def foo(arr):
x_values = arr[:,0]
y_values = arr[:,1]
ones = np.ones(arr.shape[0], dtype=np.uint64)
e = np.char.array(np.where(x_values < 0, ones, ones * 2))
n = np.char.array(np.where(y_values < 0, ones * 3, ones * 4))
x_values = np.char.array(np.absolute(x_values))
y_values = np.char.array(np.absolute(y_values))
x_values = np.char.replace(x_values, '.', '')
y_values = np.char.replace(y_values, '.', '')
new_arr = np.char.add(np.char.add(x_values, e), np.char.add(y_values, n))
return new_arr.astype(np.uint64)
Here, the x and y values of the input array are first split up. Then we use a vectorized computation to determine where e and n should be 1 or 2, 3 or 4. The last line uses a standard list comprehension to do the string merging bit, which is still undesirably slow for super large arrays but faster than a regular for loop. Also vectorizing the previous computations should speed the function up hugely.
Edit:
I was mistaken before. Numpy does have a nice way of handling string concatenation using the np.char.add() method. This requires converting x_values and y_values to Numpy character arrays using np.char.array(). Also for some reason, the np.char.add() method only takes two arrays as inputs, so it is necessary to first concatenate x_values and e and y_values and n and then concatenate these results. Still, this vectorizes the computations and should be pretty fast. The code is still a bit clunky because of the rather odd operation you are after, but I think this will help you speed up the function greatly.
You may use np.apply_along_axis. When you feed this function with another function that takes row (or column) as an argument, it does what you want to do.
For you case, You may rewrite the function as below:
def foo(row):
x, y = row
e, n = ''
if x < 0:
e = '1'
else:
w = '2'
if y > 0:
n = '3'
else:
s = '4'
return int(f'{abs(x)}{e}{abs(y){n}'.replace('.', ''))
# Where you want to you use it.
new_arr = np.apply_along_axis(foo, 1, n)
I have a problem I'm working on where I have to produce a function which mirrors a mathematical one given:
Probability = e^beta / 1 + e^beta. So far I produced code that works when I feed it an integer, but I need to use the function to calculate the probabilities of an array.
My code so far:
import math
e = math.e
def likelihood(beta):
for i in range(beta):
return (e**(beta)/(1+ e**(beta)))
beta_candidate = np.random.uniform(-5, 5, 50)
likelihood_candidate = likelihood(beta_candidate)
Whenever I run the code I'm met with an error stating: only integer scalar arrays can be converted to a scalar index.
In [3]: import math
In [4]: e = math.e
In [5]: def likelihood(beta):
...: return [e**i/(1+e**i) for i in beta]
...:
In [7]: likelihood_candidate = likelihood(beta_candidate)
Since you have your beta_candidate as numpy array, you can just do vectorized numpy operations:
l = np.exp(beta_candidate)/(1+np.exp(beta_candidate))
I'm trying to insert a small multidimensional array into a larger one inside a numba jitclass. The small array is set specific positions of the larger array defined by an index list.
The following MWE shows the problem without numba - everything works as expected
import numpy as np
class NumbaClass(object):
def __init__(self, n, m):
self.A = np.zeros((n, m))
# solution 1 using pure python
def nonNumbaFunction1(self, idx, values):
self.A[idx[:, None], idx] = values
# solution 2 using pure python
def nonNumbaFunction2(self, idx, values):
self.A[np.ix_(idx, idx)] = values
if __name__ == "__main__":
n = 6
m = 8
obj = NumbaClass(n, m)
print(f'A =\n{obj.A}')
idx = np.array([0, 2, 5])
values = np.arange(len(idx)**2).reshape(len(idx), len(idx))
print(f'values =\n{values}')
obj.nonNumbaFunction1(idx, values)
print(f'A =\n{obj.A}')
obj.nonNumbaFunction2(idx, values)
print(f'A =\n{obj.A}')
Both functions nonNumbaFunction1 and nonNumbaFunction2 do not work inside a numba class. So my current solution looks like this which is not really nice in my opinion
import numpy as np
from numba import jitclass
from numba import int64, float64
from collections import OrderedDict
specs = OrderedDict()
specs['A'] = float64[:, :]
#jitclass(specs)
class NumbaClass(object):
def __init__(self, n, m):
self.A = np.zeros((n, m))
# solution for numba jitclass
def numbaFunction(self, idx, values):
for i in range(len(values)):
idxi = idx[i]
for j in range(len(values)):
idxj = idx[j]
self.A[idxi, idxj] = values[i, j]
if __name__ == "__main__":
n = 6
m = 8
obj = NumbaClass(n, m)
print(f'A =\n{obj.A}')
idx = np.array([0, 2, 5])
values = np.arange(len(idx)**2).reshape(len(idx), len(idx))
print(f'values =\n{values}')
obj.numbaFunction(idx, values)
print(f'A =\n{obj.A}')
So my questions are:
Does anyone know a solution to this indexing in numba or is there another vectorized solution?
Is there a faster solution for nonNumbaFunction1?
It might be useful to know that inserted array is small (4x4 to 10x10), but this indexing appears in nested loops so it has to be quiet fast as well! Later I need a similar indexing for three dimensional objects too.
Because of limitations on numba's indexing support, I don't think you can do any better than writing out the for loops yourself. To make it generic across dimensions, you could use the generated_jit decorator to specialize. Something like this:
def set_2d(target, values, idx):
for i in range(values.shape[0]):
for j in range(values.shape[1]):
target[idx[i], idx[j]] = values[i, j]
def set_3d(target, values, idx):
for i in range(values.shape[0]):
for j in range(values.shape[1]):
for k in range(values.shape[2]):
target[idx[i], idx[j], idx[k]] = values[i, j, l]
#numba.generated_jit
def set_nd(target, values, idx):
if target.ndim == 2:
return set_2d
elif target.ndim == 3:
return set_3d
Then, this could be used in your jitclass
specs = OrderedDict()
specs['A'] = float64[:, :]
#jitclass(specs)
class NumbaClass(object):
def __init__(self, n, m):
self.A = np.zeros((n, m))
def numbaFunction(self, idx, values):
set_nd(self.A, values, idx)
I programmed class which looks something like this:
import numpy as np
class blank():
def __init__(self,a,b,c):
self.a=a
self.b=b
self.c=c
n=5
c=a/b*8
if (a>b):
y=c+a*b
else:
y=c-a*b
p = np.empty([1,1])
k = np.empty([1,1])
l = np.empty([1,1])
p[0]=b
k[0]=b*(c-1)
l[0]=p+k
for i in range(1, n, 1):
p=np.append(p,l[i-1])
k=np.append(k,(p[i]*(c+1)))
l=np.append(l,p[i]+k[i])
komp = np.zeros(shape=(n, 1))
for i in range(0, n):
pl_avg = (p[i] + l[i]) / 2
h=pl_avg*3
komp[i]=pl_avg*h/4
self.tot=komp+l
And when I call it like this:
from ex1 import blank
import numpy as np
res=blank(1,2,3)
print(res.tot)
everything works well.
BUT I want to call it like this:
res = blank(np.array([1,2,3]), np.array([3,4,5]), 3)
Is there an easy way to call it for each i element of this two arrays without editing class code?
You won't be able to instantiate a class with NumPy arrays as inputs without changing the class code. #PabloAlvarez and #NagaKiran already provided alternative: iterate with zip over arrays and instantiate class for each pair of elements. While this is pretty simple solution, it defeats the purpose of using NumPy with its efficient vectorized operations.
Here is how I suggest you to rewrite the code:
from typing import Union
import numpy as np
def total(a: Union[float, np.ndarray],
b: Union[float, np.ndarray],
n: int = 5) -> np.array:
"""Calculates what your self.tot was"""
bc = 8 * a
c = bc / b
vectorized_geometric_progression = np.vectorize(geometric_progression,
otypes=[np.ndarray])
l = np.stack(vectorized_geometric_progression(bc, c, n))
l = np.atleast_2d(l)
p = np.insert(l[:, :-1], 0, b, axis=1)
l = np.squeeze(l)
p = np.squeeze(p)
pl_avg = (p + l) / 2
komp = np.array([0.75 * pl_avg ** 2]).T
return komp + l
def geometric_progression(bc, c, n):
"""Calculates array l"""
return bc * np.logspace(start=0,
stop=n - 1,
num=n,
base=c + 2)
And you can call it both for sole numbers and NumPy arrays like that:
>>> print(total(1, 2))
[[2.6750000e+01 6.6750000e+01 3.0675000e+02 1.7467500e+03 1.0386750e+04]
[5.9600000e+02 6.3600000e+02 8.7600000e+02 2.3160000e+03 1.0956000e+04]
[2.1176000e+04 2.1216000e+04 2.1456000e+04 2.2896000e+04 3.1536000e+04]
[7.6205600e+05 7.6209600e+05 7.6233600e+05 7.6377600e+05 7.7241600e+05]
[2.7433736e+07 2.7433776e+07 2.7434016e+07 2.7435456e+07 2.7444096e+07]]
>>> print(total(3, 4))
[[1.71000000e+02 3.39000000e+02 1.68300000e+03 1.24350000e+04 9.84510000e+04]
[8.77200000e+03 8.94000000e+03 1.02840000e+04 2.10360000e+04 1.07052000e+05]
[5.59896000e+05 5.60064000e+05 5.61408000e+05 5.72160000e+05 6.58176000e+05]
[3.58318320e+07 3.58320000e+07 3.58333440e+07 3.58440960e+07 3.59301120e+07]
[2.29323574e+09 2.29323590e+09 2.29323725e+09 2.29324800e+09 2.29333402e+09]]
>>> print(total(np.array([1, 3]), np.array([2, 4])))
[[[2.67500000e+01 6.67500000e+01 3.06750000e+02 1.74675000e+03 1.03867500e+04]
[1.71000000e+02 3.39000000e+02 1.68300000e+03 1.24350000e+04 9.84510000e+04]]
[[5.96000000e+02 6.36000000e+02 8.76000000e+02 2.31600000e+03 1.09560000e+04]
[8.77200000e+03 8.94000000e+03 1.02840000e+04 2.10360000e+04 1.07052000e+05]]
[[2.11760000e+04 2.12160000e+04 2.14560000e+04 2.28960000e+04 3.15360000e+04]
[5.59896000e+05 5.60064000e+05 5.61408000e+05 5.72160000e+05 6.58176000e+05]]
[[7.62056000e+05 7.62096000e+05 7.62336000e+05 7.63776000e+05 7.72416000e+05]
[3.58318320e+07 3.58320000e+07 3.58333440e+07 3.58440960e+07 3.59301120e+07]]
[[2.74337360e+07 2.74337760e+07 2.74340160e+07 2.74354560e+07 2.74440960e+07]
[2.29323574e+09 2.29323590e+09 2.29323725e+09 2.29324800e+09 2.29333402e+09]]]
You can see that results are in compliance.
Explanation:
First of all I'd like to note that your calculation of p, k, and l doesn't have to be in the loop. Moreover, calculating k is unnecessary. If you see carefully, how elements of p and l are calculated, they are just geometric progressions (except the 1st element of p):
p = [b, b*c, b*c*(c+2), b*c*(c+2)**2, b*c*(c+2)**3, b*c*(c+2)**4, ...]
l = [b*c, b*c*(c+2), b*c*(c+2)**2, b*c*(c+2)**3, b*c*(c+2)**4, b*c*(c+2)**5, ...]
So, instead of that loop, you can use np.logspace. Unfortunately, np.logspace doesn't support base parameter as an array, so we have no other choice but to use np.vectorize which is just a loop under the hood...
Calculating of komp though is easily vectorized. You can see it in my example. No need for loops there.
Also, as I already noted in a comment, your class doesn't have to be a class, so I took a liberty of changing it to a function.
Next, note that input parameter c is overwritten, so I got rid of it. Variable y is never used. (Also, you could calculate it just as y = c + a * b * np.sign(a - b))
And finally, I'd like to remark that creating NumPy arrays with np.append is very inefficient (as it was pointed out by #kabanus), so you should always try to create them at once - no loops, no appending.
P.S.: I used np.atleast_2d and np.squeeze in my code and it could be unclear why I did it. They are necessary to avoid if-else clauses where we would check dimensions of array l. You can print intermediate results to see what is really going on there. Nothing difficult.
if it is just calling class with two different list elements, loop can satisfies well
res = [blank(i,j,3) for i,j in zip(np.array([1,2,3]),np.array([3,4,5]))]
You can see list of values for res variable
The only way I can think of iterating lists of arrays is by using a function on the main program for iteration and then do the operations you need to do inside the loop.
This solution works for each element of both arrays (note to use zip function for making the iteration in both lists if they have a small size as listed in this answer here):
for n,x in zip(np.array([1,2,3]),np.array([3,4,5])):
res=blank(n,x,3)
print(res.tot)
Hope it is what you need!