I'm trying to plot some complex functions using numpy. Example of some working code:
import numpy as np
from PIL import Image
size = 1000
w = np.linspace(-10, 10, size)
x, y = np.meshgrid(w, w)
r = x + 1j*y
def f(q):
    return np.angle(q)
z = f(r)
normalized = ((255/(np.amax(z) - np.amin(z)))*(z+abs(np.amin(z)))).astype(int)
data = [i for j in normalized for i in j]
img = Image.new('L', (size, size))
img.putdata(data[::-1]) #pixels are done bottom to top
img.show()
However, suppose I want the function f to have a simple comparison in it, like this:
def f(q):
    if np.abs(q) < 4:
        return 1
    else:
        return 0
I get the error
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
for the np.abs(q) < 4 check.
I did some digging and realized it's because Python is doing the operation on the entire r array, and it can't compare an array to an integer. So, I tried looking for ways to do element-wise comparisons.
This page looked promising: it says I can do element-wise comparisons by using np.less(a, b), so I tried
def f(q):
    if np.less(np.abs(q), 4):
        return 1
    else:
        return 0
and got the same ValueError. It seems as though both arguments for np.less() need to be arrays of the same size.
What I want is to compare each element of my array to a single, non-array quantity. I suppose I could make a dummy array of the same size filled with identical 4's, but there has to be a more elegant way of doing this.
The key is to return an array value instead of trying to coerce an array into a single bool, which is what if (some_array): keeps trying to do. There being no unambiguous way to decide what single boolean np.array([True, False]) should convert to, it doesn't even try.
So don't even branch:
def f(q):
    return abs(q) < 4
gives an array like
>>> f(np.array([1,3,5]))
array([ True, True, False], dtype=bool)
which as numbers will behave like
>>> f(np.array([1,3,5])).astype(int)
array([1, 1, 0])
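To tie this back to the plotting code in the question, here is a minimal sketch showing that a scalar is broadcast against the whole array, both with < and with np.less, so no dummy array of 4's is needed. The small grid size and the names mask, same and pixels are my own choices for illustration, not part of the original code.

```python
import numpy as np

size = 5                           # tiny grid, for illustration only
w = np.linspace(-10, 10, size)     # [-10, -5, 0, 5, 10]
x, y = np.meshgrid(w, w)
r = x + 1j * y

def f(q):
    # Element-wise comparison: the scalar 4 is broadcast against q
    return np.abs(q) < 4

mask = f(r)                        # boolean array, same shape as r
same = np.less(np.abs(r), 4)       # np.less broadcasts the scalar too

# Turn the mask into 8-bit grey levels (0 or 255) for an image
pixels = mask.astype(np.uint8) * 255
```

The error in the question comes from the if statement, not from the comparison itself. Both < and np.less already return an element-wise result.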
This is how I got the two arrays (array1 and array2) for my function:
x = np.arange(-5, 5,0.01)
prob=stats.norm.pdf(x,0,1)
prob_array=numpy.array(prob).reshape(1000,1) #array1
x_tran=m.transpose()
x_tran_array=array(x_tran)
mu_array=array(mu) # mu is stock return
mu_array1=numpy.array(mu_array).reshape(54966,1)
sigma_array=array(sigma) #sigma is the historical volatility
sigma_array1=numpy.array(sigma_array).reshape(54966,1)
mu1_mat=mat(ones((1,1000))) #for matrix calculation
original_x=mu_array1*mu1_mat+sigma_array1*x_tran_array #array2
I defined a function:
def TK_value(z,p):
    if z >= 0:
        utility=z**0.88
        prob=(p**0.61)/(p**0.61+(1-p)**0.61)**(1/0.61)
    else:
        utility= -2.25*(-z)**0.88
        prob=(p**0.69)/(p**0.69+(1-p)**0.69)**(1/0.69)
    return utility*prob
tks=TK_value(original_x,prob_array)
I have two arrays: original_x with shape (54966, 1000) and prob_array with shape (1000, 1).
I want to use original_x as z and prob_array as p in this function.
But I get this error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Welcome to SO! The problem seems to be this line: if z >= 0:
If you use the '>'/'<' operator on an array it will return the following:
>>> import numpy as np
>>> a = np.array([1,2,3])
>>> a > 2
array([False, False, True])
This array can't be converted to a single bool by default. You have to be more specific, for example by using any() to test whether at least one element satisfies the condition.
Numpy arrays can do it like this: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.any.html.
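If the goal is to evaluate the function element by element rather than to reduce the condition with any() or all(), one option is np.where. The following is only a sketch under some assumptions of mine: the name tk_value and the helper variables are hypothetical, p is assumed to be a 1-D array whose length matches the last axis of z (shapes (54966, 1000) and (1000, 1) do not broadcast against each other), and np.abs(z) replaces -z so that no negative number is raised to a fractional power in the branch that is discarded.

```python
import numpy as np

def tk_value(z, p):
    z = np.asarray(z, dtype=float)
    p = np.asarray(p, dtype=float)
    gain = z >= 0                        # boolean mask, same shape as z

    # np.where evaluates both branches, so use |z| to keep them finite
    mag = np.abs(z) ** 0.88
    utility = np.where(gain, mag, -2.25 * mag)

    # Probability weights for the gain and loss branches
    w_gain = p**0.61 / (p**0.61 + (1 - p)**0.61) ** (1 / 0.61)
    w_loss = p**0.69 / (p**0.69 + (1 - p)**0.69) ** (1 / 0.69)
    prob = np.where(gain, w_gain, w_loss)  # p broadcasts along the last axis

    return utility * prob

z = np.array([[1.0, -1.0], [0.0, 2.0]])
p = np.array([0.5, 0.25])                # one probability per column of z
tks = tk_value(z, p)
```

With the arrays from the question, something like prob_array.ravel() could be passed as p, assuming each column of original_x corresponds to one probability.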
I have this function:
if elem < 0:
    elem = 0
else:
    elem = 1
I want to apply this function to every element of a NumPy array. For an array of known dimensions I could do this with nested for loops, but I need it to work regardless of the array's dimensions and shape. Is there a way to achieve this in Python with NumPy?
Or would there be any general way to apply any def to every element in a NumPy n-dimensional array?
Isn't it
arr = (arr >= 0).astype(int)
You can use np.where:
np.where(arr < 0, 0, 1)
You can use a boolean mask to define an array of decisions. Let's work through a concrete example. You have an array of positive and negative numbers and you want to take the square root only at non-negative locations:
arr = np.random.normal(size=100)
You compute a mask like
mask = arr >= 0
The most straightforward way to apply the mask is to create an output array, and fill in the required elements:
result = np.empty(arr.shape)
result[mask] = np.sqrt(arr[mask])
result[~mask] = arr[~mask]
This is not super efficient because you have to compute the inverse of the mask and apply the masks multiple times. For this specific example, you can take advantage of the fact that np.sqrt is a ufunc and use its where keyword:
result = arr.copy()
np.sqrt(arr, where=mask, out=result)
One popular way to apply the mask would be to use np.where but I specifically constructed this example to show the caveats. The simplistic approach would be to compute
result = np.where(mask, np.sqrt(arr), arr)
np.where chooses the value from either np.sqrt(arr) or arr depending on whether mask is truthy or not. This is a very good method in many cases, but you have to have the values pre-computed for both branches, which is exactly what you want to avoid with a square root.
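The difference can be made visible with a small sketch. The three-element array and the variable names below are my own, chosen so that the results are easy to check. Turning invalid floating-point operations into exceptions with np.errstate shows that np.where really does evaluate np.sqrt on the negative entry, while the ufunc's where keyword leaves that entry alone.

```python
import numpy as np

arr = np.array([4.0, -1.0, 9.0])
mask = arr >= 0

# np.where needs both branches computed up front, including sqrt(-1)
try:
    with np.errstate(invalid="raise"):
        np.where(mask, np.sqrt(arr), arr)
    both_branches_evaluated = False
except FloatingPointError:
    both_branches_evaluated = True

# Silencing the warning gives the right answer, but the work was still done
with np.errstate(invalid="ignore"):
    eager = np.where(mask, np.sqrt(arr), arr)

# The ufunc's where keyword only touches the selected positions
lazy = arr.copy()
np.sqrt(arr, where=mask, out=lazy)
```

Both eager and lazy end up holding the same values. The difference is only in which elements np.sqrt was asked to process.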
TL;DR
Your specific example is looking for a representation of the mask itself. If you don't care about the type:
result = arr >= 0
If you do care about the type:
result = (arr >= 0).astype(int)
OR, for integer arrays only (fractional values between -1 and 0 would not map to 0),
result = np.clip(arr, -1, 0) + 1
These solutions create a different array from the input. If you want to replace values in the same buffer,
mask = arr >= 0
arr[mask] = 1
arr[~mask] = 0
You can do something like this:
import numpy as np
a=np.array([-2,-1,0,1,2])
a[a>=0]=1
a[a<0]=0
>>> a
array([0, 0, 1, 1, 1])
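One thing to watch with two in-place assignments is their order. The sketch below (the names wrong, safe and mask are mine) shows what happens if the negative values are replaced first: the zeros that were just written also satisfy >= 0, so everything ends up as 1. Computing the mask once, before modifying anything, avoids depending on the order.

```python
import numpy as np

a = np.array([-2, -1, 0, 1, 2])

# Order-dependent version, done in the "wrong" order
wrong = a.copy()
wrong[wrong < 0] = 0
wrong[wrong >= 0] = 1      # the freshly written zeros match too

# Order-independent version: decide first, then write
safe = a.copy()
mask = safe >= 0
safe[mask] = 1
safe[~mask] = 0
```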
An alternative to the above solutions could be combining a list comprehension with a ternary operator.
my_array = np.array([-1.2, 3.0, -10.11, 5.2])
sol = np.asarray([0 if val < 0 else 1 for val in my_array])
Take a look at these sources:
https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions
https://book.pythontips.com/en/latest/ternary_operators.html
Use numpy.vectorize():
import numpy as np
def unit(elem):
    if elem < 0:
        elem = 0
    else:
        elem = 1
    return elem  # without a return the vectorized function yields only None
a = np.array([[1, 2, -0.5], [0.5, 2, 3]])
vfunc = np.vectorize(unit)
vfunc(a)
# array([[1, 1, 0], [1, 1, 1]])
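For this particular function, np.vectorize and a plain boolean mask should give the same answer on an array of any shape. The following sketch compares the two. The compact form of unit and the 3-D array cube are my own additions for illustration.

```python
import numpy as np

def unit(elem):
    # Scalar version of the function from the question
    return 0 if elem < 0 else 1

a = np.array([[1, 2, -0.5], [0.5, 2, 3]])

via_vectorize = np.vectorize(unit)(a)   # calls unit once per element
via_mask = (a >= 0).astype(int)         # single array operation

# The mask approach does not care about the number of dimensions
cube = np.arange(-4, 4).reshape(2, 2, 2)
cube_result = (cube >= 0).astype(int)
```

np.vectorize is mainly a convenience and still calls the Python function element by element, so the mask version is usually preferable when the logic can be written as array operations.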
I am trying to vectorize a function that takes a numpy array as input. I have a 2-d numpy array (shape is 1000,100) on which the function is to be applied on each of the 1000 rows. I tried to vectorize the function using np.vectorize. Here is the code:
def fun(i):
    print(i)
    location = geocoder.google([i[1], i[0]], method="reverse")
    #print type(location)
    location = str(location)
    location = location.split("Reverse")
    if len(location) > 1:
        location1 = location[1]
    return [i[0], i[1], location1]
#using np.vectorize
vec_fun = np.vectorize(fun)
Which raises the error
<ipython-input-19-1ee9482c6161> in fun(i)
1 def fun(i):
2 print(i)
----> 3 location = geocoder.google([i[1], i[0]], method="reverse")
4 #print type(location)
5 location = lstr(location)
IndexError: invalid index to scalar variable.
I printed the argument passed to fun, and it is a single value (the first element of the row) rather than the whole row vector, which explains the IndexError. However, I have no idea how to resolve this.
By now you have probably solved your problem, but I found a way to solve this that may help other people with the same question. You can pass a signature string to np.vectorize to specify the input and output shapes. For example, the signature "(n) -> ()" expects a 1-D input of length n (a row) and outputs a scalar (). The function is therefore broadcast over the rows:
def my_sum(row):
    return np.sum(row)
row_sum = np.vectorize(my_sum, signature="(n) -> ()")
my_mat = np.array([
    [1, 1, 1],
    [2, 2, 2],
])
row_sum(my_mat)
OUT: array([3, 6])
vectorize runs your function on each element of an array, so it's not the right choice. Use a regular loop instead:
for row in some_array:
    i0, i1, loc = fun(row)
It's up to you as to what you want to do with the output. Keep in mind that your function does not assign location1 if len(location) <= 1, and will raise an error in that case. It also returns a string rather than a numerical value in the third output.
Once you fix those issues, if you want to make an array of the output:
output = np.empty((some_array.shape[0], 3))
for i, row in enumerate(some_array):
    output[i, :] = fun(row)
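As a rough sketch of the loop approach with the "no result" case handled, the code below collects one tuple per row in a plain list, which sidesteps the problem of storing a string in a numeric array. Note that fake_lookup is a hypothetical stand-in I made up so the example runs without network access. It is not part of geocoder or any other library, and the rule it uses to return None is arbitrary.

```python
import numpy as np

def fake_lookup(row):
    # Hypothetical stand-in for the reverse-geocoding call in the question
    lat, lon = row[0], row[1]
    # Always assign the label, using None when there is no result
    label = f"{lat:.1f},{lon:.1f}" if lat >= 0 else None
    return lat, lon, label

coords = np.array([
    [1.0, 2.0, 9.0],
    [-3.0, 4.0, 9.0],
])

# Iterating over a 2-D array yields one row at a time
results = [fake_lookup(row) for row in coords]
```

Keeping the output as a list of tuples (or converting it to an object array later) is a design choice for mixed numeric and string results. A float array such as np.empty((n, 3)) only works once every returned value is numeric.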
For example, let's consider this toy code
import numpy as np
import numpy.random as rnd
a = rnd.randint(0,10,(10,10))
k = (1,2)
b = a[:,k]
for col in np.arange(np.size(b,1)):
    b[:,col] = b[:,col]+col*100
This code will work when the size of k is bigger than 1. However, with the size equal to 1, the extracted sub-matrix from a is transformed into a row vector, and applying the function in the for loop throws an error.
Of course, I could fix this by checking the dimension of b and reshaping:
if np.ndim(b) == 1:
    b = np.reshape(b, (np.size(b), 1))
in order to obtain a column vector, but this is expensive.
So, the question is: what is the best way to handle this situation?
This seems like something that would arise quite often and I wonder what is the best strategy to deal with it.
If you index with a list or tuple, the 2d shape is preserved:
In [638]: a=np.random.randint(0,10,(10,10))
In [639]: a[:,(1,2)].shape
Out[639]: (10, 2)
In [640]: a[:,(1,)].shape
Out[640]: (10, 1)
And I think b iteration can be simplified to:
a[:,k] += np.arange(len(k))*100
This sort of calculation will also be easier if k is always a list or tuple, and never a scalar (a scalar does not have a len).
np.column_stack ensures its inputs are 2d (and expands at the end if not) with:
if arr.ndim < 2:
    arr = array(arr, copy=False, subok=True, ndmin=2).T
np.atleast_2d does
elif len(ary.shape) == 1:
    result = ary[newaxis,:]
which of course could be changed in this case to
if b.ndim == 1:
    b = b[:, None]
Anyway, I think it is better to ensure that k is a tuple than to adjust the shape of b afterwards. But keep both options in your toolbox.
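One way to make sure k is never a scalar is to pass it through np.atleast_1d before indexing. This is only a sketch: the helper name add_offsets and the small test array are my own, and it returns a new array rather than modifying a in place.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

def add_offsets(a, k):
    # A scalar such as 1 becomes array([1]); tuples and lists become 1-D arrays
    k = np.atleast_1d(k)
    # Indexing with an array keeps the result 2-D: shape (rows, len(k))
    b = a[:, k]
    # Add 0, 100, 200, ... to successive selected columns
    b += np.arange(len(k)) * 100
    return b

one_col = add_offsets(a, 1)        # shape (3, 1)
two_cols = add_offsets(a, (1, 2))  # shape (3, 2)
```

Because indexing with an array returns a copy, a itself is left untouched here.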
I defined a function
def softthresh(u, LAMBDA):
    if np.fabs(u) <= LAMBDA:
        return 0
    else:
        return ((np.fabs(u) - LAMBDA) * u / np.fabs(u))
u is a numpy array, and np.fabs will check the relations for each array element (np.fabs(u_i)). It gives me the following error:
The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Follow up Question:
Strange behaviour in simple function.
def softthresh(u,LAMBDA):
    for i in u:
        if np.fabs(i)<=LAMBDA:
            return 0
        else:
            return ((np.fabs(i)-LAMBDA)*u/np.fabs(i))
ll = 5.0
xx = np.arange(-10,11)
yy = softthresh(xx,ll)
What I get is not what I expect. For the elements of u (= xx) that are smaller than 5 in absolute value, I should get zero, but I don't. Why?
You are calling return from inside the inner loop. Therefore, your function returns just after it evaluates the first member of u.
Since you are using NumPy, you should take advantage of NumPy's ability to operate on the whole array at once, and also of NumPy's smart indexing.
def softthreshold(u, LAMBDA):
    notzero = np.fabs(u) > LAMBDA  # boolean mask of the elements that need to be scaled
    rr = np.zeros_like(u, dtype=float)  # float array the same shape as u, already initialized to 0 (float avoids truncation for integer u)
    rr[notzero] = (np.fabs(u[notzero]) - LAMBDA) * u[notzero] / np.fabs(u[notzero])  # scale each of the members that aren't zero
    return rr
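The same result can also be written without any indexing, since u / |u| is just the sign of u and the subtraction can be clipped at zero. This is an equivalent reformulation of mine rather than the code above, and the names soft_threshold and lam are my own.

```python
import numpy as np

def soft_threshold(u, lam):
    u = np.asarray(u, dtype=float)
    # Shrink the magnitude by lam, never below zero, and restore the sign
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

ll = 5.0
xx = np.arange(-10, 11)
yy = soft_threshold(xx, ll)
```

With the values from the question, every element with absolute value up to 5 maps to zero and the remaining ones are pulled towards zero by 5.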
Your problem depends on the NumPy array. If you are working with a list, it works.
Otherwise if you need the numpy array you can use code like
def softthresh(u,LAMBDA):
    for i in u:
        if np.fabs(i)<=LAMBDA:
            return 0
        else:
            return ((np.fabs(u)-LAMBDA)*u/np.fabs(u))
You get the array through the dependency of <= logic and the numpy.array definition.
If u is an array, you need to loop through all its elements in your function.
Alternatively, you can keep u as a single element of your array and call the function in a loop like this:
tbl = np.array([1, 2, 3, 4, 5])
for elt in tbl:
    print(softthresh(elt, 3))
The results would be:
0
0
0
1.0
2.0