I have 4 arrays: A, B, C and D. A and B have shape (n,n), and C and D have shape (n,n,m). I want to set things up so that wherever an element of A is greater than the corresponding element of B, the length-m array at that position is taken from C. In essence:
C_new = np.where(A > B, C, D), D_new = np.where(A < B, D, C). However, this gives me a ValueError (operands could not be broadcast together with shapes).
Can I use np.where here instead of looping through each element?
Edit: example:
A = np.ones((2,2))
B = 2*np.eye(2)
C = np.ones((2,2,3))
D = np.zeros((2,2,3))
# Cnew = np.where(A > B, C,D)-> ValueError: operands could not be broadcast together with shapes (2,2) (2,2,3) (2,2,3)
Cnew should be zeros at indices (0,0) and (1,1).
You need to add a new axis at the end of the condition in order for it to broadcast correctly:
C_new = np.where((A > B)[..., np.newaxis], C, D)
D_new = np.where((A < B)[..., np.newaxis], D, C)
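A minimal sketch of why this works, using the example arrays from the question (the name cond is only illustrative): the extra axis turns the (2,2) condition into shape (2,2,1), which broadcasts against the (2,2,3) arrays.

```python
import numpy as np

A = np.ones((2, 2))
B = 2 * np.eye(2)
C = np.ones((2, 2, 3))
D = np.zeros((2, 2, 3))

# Append a length-1 axis so the condition has shape (2, 2, 1),
# which broadcasts against the (2, 2, 3) arrays C and D.
cond = (A > B)[..., np.newaxis]

C_new = np.where(cond, C, D)
print(cond.shape)   # (2, 2, 1)
print(C_new.shape)  # (2, 2, 3)
print(C_new[0, 0])  # [0. 0. 0.] because A[0, 0] < B[0, 0]
print(C_new[0, 1])  # [1. 1. 1.] because A[0, 1] > B[0, 1]
```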
I'm currently struggling with a probably rather simple question but I can't get my head around it.
Assuming I have the following two 2D arrays with different shapes, I can combine them into a new array using:
a = np.zeros((2, 3))
b = np.zeros((4, 5))
c = np.array([a, b], dtype=object)  # dtype=object is needed for differently shaped arrays in NumPy 1.24+
print(c.shape)
# Output
# (2,)
for elements in c:
    print(elements.shape)
# Output:
# (2, 3)
# (4, 5)
So far so good!
But how would I do this if I have a large list that I have to iterate over? Here is a simple example with just 4 different 2D arrays:
This works as expected:
a = np.zeros((2,3))
b = np.zeros((4,5))
c = np.zeros((6,7))
d = np.zeros((8,9))
e = np.array([a, b, c, d], dtype=object)
print(e.shape)
# Output
# (4,)
for elements in e:
    print(elements.shape)
# Output
# (2, 3)
# (4, 5)
# (6, 7)
# (8, 9)
This doesn't work as expected and my question would be how to do this in an iterative way:
a = np.zeros((2,3))
b = np.zeros((4,5))
c = np.zeros((6,7))
d = np.zeros((8,9))
e = None
for elements in [a, b, c, d]:
    e = np.array([e, elements], dtype=object)
print(e.shape)
# Output
# (2,) <--- This should be (4,) as in the upper example, but I don't know how to achieve that :-/
for elements in e:
    print(elements.shape)
# (2,)
# (8, 9)
I understand that in each iteration I'm just combining two arrays, which is why the shape always stays at (2,), but I wonder how this can be done in an elegant way.
So basically I want an outer dimension that gives the number of arrays stored. E.g. if I iterate over 1000 different 2D arrays, I'd expect a shape of (1000,).
Hope my question is understandable - if not let me know!
Thanks a lot!
If I understood your issue correctly, you can achieve what you want in a list comprehension. This will yield the exact same solution as your code above that you described as working.
a = np.zeros((2,3))
b = np.zeros((4,5))
c = np.zeros((6,7))
d = np.zeros((8,9))
e = np.array([element for element in [a, b, c, d]], dtype=object)
print(e.shape)
for elements in e:
    print(elements.shape)
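If the arrays only become available one at a time, one possible approach (a sketch, not the only way) is to collect them in a plain Python list and convert once at the end. Pre-allocating an object array with np.empty and filling it slot by slot keeps NumPy from trying to merge the inner arrays; the names shapes, collected and e are illustrative.

```python
import numpy as np

shapes = [(2, 3), (4, 5), (6, 7), (8, 9)]

# Collect the arrays in a list while iterating.
collected = []
for shape in shapes:
    collected.append(np.zeros(shape))

# Build a 1-D object array with one slot per collected array.
e = np.empty(len(collected), dtype=object)
for i, arr in enumerate(collected):
    e[i] = arr

print(e.shape)  # (4,)
for element in e:
    print(element.shape)
```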
One can use numpy.where for selecting values from two arrays depending on a condition:
import numpy
a = numpy.random.rand(5)
b = numpy.random.rand(5)
c = numpy.where(a > 0.5, a, b) # okay
If the array has more dimensions, however, this does not work anymore:
import numpy
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where(a[:, 0] > 0.5, a, b) # !
Traceback (most recent call last):
  File "p.py", line 10, in <module>
    c = numpy.where(a[:, 0] > 0.5, a, b) # okay
  File "<__array_function__ internals>", line 6, in where
ValueError: operands could not be broadcast together with shapes (5,) (5,2) (5,2)
I would have expected a numpy array of shape (5,2).
What's the issue here? How to work around it?
Remember that NumPy broadcasting aligns shapes from the right (the trailing dimensions). A (5,)-shaped array can therefore broadcast with a (2,5)-shaped array, but not with a (5,2)-shaped one. To broadcast with a (5,2)-shaped array, the condition needs a second dimension of length 1, i.e. shape (5,1), because a dimension of length 1 broadcasts against any length.
Thus, you need to keep the second dimension when indexing (indexing with a single integer removes that dimension). You can do this by putting the index in a one-element list:
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where(a[:, [0]] > 0.5, a, b) # works
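To make the shape difference concrete, here is a small sketch comparing the two ways of indexing. The form a[:, 0, np.newaxis] is shown as an equivalent alternative and is not part of the answer above.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((5, 2))
b = rng.random((5, 2))

print((a[:, 0] > 0.5).shape)    # (5,)   -> cannot broadcast with (5, 2)
print((a[:, [0]] > 0.5).shape)  # (5, 1) -> broadcasts with (5, 2)

c = np.where(a[:, [0]] > 0.5, a, b)
c_alt = np.where(a[:, 0, np.newaxis] > 0.5, a, b)

# Each row of c is taken entirely from a or entirely from b.
print(c.shape)                   # (5, 2)
print(np.array_equal(c, c_alt))  # True
```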
You can use c = numpy.where(a > 0.5, a, b) if the condition should be evaluated element-wise on all of a.
However, if you want to use only the first column of a, you need to consider the shape of the condition.
Let's first look at the shape of this operation:
(a[:, 0] > 0.5).shape # outputs (5,)
It is one-dimensional, while a and b have shape (5, 2) and are two-dimensional, so the condition cannot be broadcast against them.
The solution is to reshape the mask to shape (5, 1).
Your code should look like this:
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where((a[:, 0] > 0.5).reshape(-1, 1), a, b) # !
You can try:
import numpy
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where(a > 0.5, a, b)
instead of: c = np.where(a>0.5,a,b)
you can use: c = np.choose((a>0.5).astype(int), [b,a])
which works for multidimensional arrays if a and b have the same shape.
A is a 1D array with shape (100,), and B is a 2D array with shape (50000, 100). I want to calculate the Hamming distance between A and each row of B, and get an array X with shape (50000,).
I can do it with a loop:
X = np.empty(50000, dtype=int)
for i in range(50000):
    X[i] = np.count_nonzero(A != B[i, :])
Can I skip the loop or do something else to make it faster?
You can directly compare A and B with A != B, which will broadcast due to the different number of dimensions A and B have, and then you can use np.count_nonzero per row with axis=1:
np.count_nonzero(A != B, axis=1)
A = np.array([1,2])
B = np.array([[1,2],[3,2],[1,3],[2,4]])
np.count_nonzero(A != B, axis=1)
# array([0, 1, 1, 2])
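As a sanity check, the following sketch compares the loop and the broadcast version on random binary data. The sizes are reduced from the question's (50000, 100) only to keep the example quick, and the names X_loop and X_vec are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 2, size=100)         # shape (100,)
B = rng.integers(0, 2, size=(500, 100))  # shape (500, 100)

# Loop version: one Hamming distance per row of B.
X_loop = np.empty(B.shape[0], dtype=int)
for i in range(B.shape[0]):
    X_loop[i] = np.count_nonzero(A != B[i, :])

# Vectorized version: A is broadcast against every row of B.
X_vec = np.count_nonzero(A != B, axis=1)

print(X_vec.shape)                    # (500,)
print(np.array_equal(X_loop, X_vec))  # True
```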
I'm trying to do an integration with numpy:
A = np.trapz(B, C)
but I have some issues with the shapes of B and C.
B is an array initialized with the NumPy zeros function and filled afterwards:
B = np.zeros((N, 1))
C is a column extracted from a matrix, which is also initialized with NumPy:
D = np.zeros((N, 2))
C = D[:, 0]
The problem is that:
np.shape(B) # (N, 1)
np.shape(C) # (N,)
How can I manage this?
Try
B = np.zeros(N)
np.trapz(B, C)
Also, np.trapz accepts multi-dimensional arrays, so arrays of shape (N, 1) are OK; you just need to specify the axis to integrate along.
B = np.zeros((N, 1))
C = D[:, 0]
np.trapz(B, C.reshape(N, 1), axis=0) # integrate along the N samples; the result has shape (1,)
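Another option, sketched here under the assumption that B really is a single column, is to flatten B so that both arguments are 1-D. Note that newer NumPy releases provide this function as np.trapezoid and deprecate np.trapz, so the sketch picks whichever is available. N and the sample values are made up for illustration.

```python
import numpy as np

# np.trapezoid is the newer name; fall back to np.trapz on older NumPy.
trapezoid = getattr(np, "trapezoid", None) or np.trapz

N = 5
D = np.zeros((N, 2))
D[:, 0] = np.linspace(0.0, 1.0, N)  # x values in the first column
C = D[:, 0]                         # shape (N,)
B = (2.0 * C).reshape(N, 1)         # y = 2x stored as a column, shape (N, 1)

# Flatten the column so y and x are both 1-D.
area_flat = trapezoid(B.ravel(), C)

# Or keep the column shape and integrate along axis 0.
area_col = trapezoid(B, C.reshape(N, 1), axis=0)

print(area_flat)       # integral of 2x over [0, 1]
print(area_col.shape)  # (1,)
```

Both calls integrate over the N samples; the second one simply keeps the trailing length-1 dimension in the result.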
I have a matrix
A = [[ 1. 1.]
     [ 1. 1.]]
and two arrays (a and b), each containing 20 floats. How can I multiply them using this formula:
( x' )       ( x )
( y' ) = A * ( y )
Is this correct? m = A * [a, b]
Matrix multiplication with NumPy arrays can be done with np.dot.
If X has shape (i,j) and Y has shape (j,k), then np.dot(X,Y) is the matrix product and has shape (i,k). The last axis of X and the second-to-last axis of Y are multiplied and summed over.
Now, if a and b have shape (20,), then np.vstack([a,b]) has shape (2, 20):
In [66]: np.vstack([a,b]).shape
Out[66]: (2, 20)
You can think of np.vstack([a, b]) as a 2x20 matrix with the values of a on the first row, and the values of b on the second row.
Since A has shape (2,2), we can perform the matrix multiplication
m = np.dot(A, np.vstack([a,b]))
to arrive at an array of shape (2, 20).
The first row of m contains the x' values, the second row contains the y' values.
NumPy also has a matrix subclass of ndarray (a special kind of NumPy array) which has convenient syntax for matrix multiplication with 2D arrays. If we define A to be a matrix (rather than a plain ndarray, which is what np.array(...) creates), then matrix multiplication can be done with the * operator. Note that the NumPy documentation no longer recommends np.matrix; plain ndarrays are preferred.
I show both ways (with A being a plain ndarray and A2 being a matrix) below:
import numpy as np
A = np.array([[1.,1.],[1.,1.]])
A2 = np.matrix([[1.,1.],[1.,1.]])
a = np.random.random(20)
b = np.random.random(20)
c = np.vstack([a,b])
m = np.dot(A, c)
m2 = A2 * c
assert np.allclose(m, m2)
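On Python 3.5+ with plain ndarrays, the @ operator is another way to write the same product. This is an addition to the answer above, shown as a brief sketch; the names x_new and y_new are illustrative.

```python
import numpy as np

A = np.array([[1., 1.], [1., 1.]])
a = np.random.random(20)
b = np.random.random(20)

c = np.vstack([a, b])  # shape (2, 20): a in row 0, b in row 1
m = A @ c              # same result as np.dot(A, c)

x_new, y_new = m       # first row holds x', second row holds y'
print(m.shape)         # (2, 20)
```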