I am using Python 3.5 and I have a question: why does np.dot() behave like this?
>>> a = np.array([[1,2,3,4]])
>>> b = np.array([123])
>>> np.dot(a,b)
Traceback (most recent call last):
File "<input>", line 1, in <module>
ValueError: shapes (1,4) and (1,) not aligned: 4 (dim 1) != 1 (dim 0)
>>> np.dot(b,a)
array([123, 246, 369, 492])
From help(np.dot) we learn that np.dot(x, y) is a sum product over the last axis of x and the second-to-last axis of y.
In the case of np.dot(a, b), the last axis of a has length 4 and the only axis of b has length 1. They don't match: fail.
In the case of np.dot(b, a), the last axis of b has length 1 and the second-to-last axis of a has length 1. They match: success.
Workarounds
Depending on what your intention is for np.dot(a,b), you may want:
>>> np.dot(a, np.resize(b,a.shape[-1]))
array([1230])
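To see what this workaround actually computes, here is a minimal sketch. It assumes the intent was to repeat b until it matches the last axis of a; that is only one possible reading of the question.

```python
import numpy as np

a = np.array([[1, 2, 3, 4]])   # shape (1, 4)
b = np.array([123])            # shape (1,)

# np.resize repeats b until it has the requested number of elements
b_resized = np.resize(b, a.shape[-1])
print(b_resized)               # [123 123 123 123]

# The dot product is then 123 * (1 + 2 + 3 + 4)
result = np.dot(a, b_resized)
print(result)                  # [1230]
```

In other words, the result equals 123 * a.sum(axis=-1), which may or may not be what you wanted.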
From the documentation for numpy.dot(x, y):
For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors... For N dimensions it is a sum product over the last axis of x and the second-to-last of y:
So, where you have:
a = np.array([[1,2,3,4]]) # shape is (1, 4), 2-D array (matrix)
b = np.array([123]) # shape is (1,), 1-D array (vector)
np.dot(b, a) works ((1,) * (1, 4), the relevant dimensions agree)
np.dot(a, b) doesn't ((1, 4) * (1,), the relevant dimensions don't agree, the operation is undefined. Note that the 'second-to-last' axis of (1,) corresponds to its one and only axis)
This is the same behaviour as if you have two 2-D arrays, i.e. matrices:
a = np.array([[1,2,3,4]]) # shape is (1, 4)
b = np.array([[123]]) # shape is (1, 1)
np.dot(b, a) works ((1, 1) * (1, 4), inner matrix dimensions agree)
np.dot(a, b) doesn't ((1, 4) * (1, 1), inner matrix dimensions don't agree)
If however you have two 1-D arrays, i.e. vectors, neither operation works:
a = np.array([1,2,3,4]) # shape is (4,)
b = np.array([123]) # shape is (1,)
np.dot(b, a) doesn't work ((1,) * (4,), but can only define the inner product for vectors of the same length)
np.dot(a, b) doesn't work ((4,) * (1,), same)
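The three cases above can be checked with a small helper. This is only a sketch: dot_shapes_agree is a hypothetical name, not part of NumPy, and it just restates the "last axis of x vs. second-to-last axis of y" rule from the documentation.

```python
import numpy as np

def dot_shapes_agree(x, y):
    """Hypothetical helper: True if np.dot(x, y) contracts axes of equal length."""
    x, y = np.asarray(x), np.asarray(y)
    if x.ndim == 0 or y.ndim == 0:
        return True  # a scalar operand is plain multiplication
    # A 1-D y contributes its only axis; otherwise the second-to-last axis is used.
    y_axis = 0 if y.ndim == 1 else -2
    return x.shape[-1] == y.shape[y_axis]

a2 = np.array([[1, 2, 3, 4]])   # shape (1, 4)
a1 = np.array([1, 2, 3, 4])     # shape (4,)
b1 = np.array([123])            # shape (1,)
b2 = np.array([[123]])          # shape (1, 1)

print(dot_shapes_agree(b1, a2), dot_shapes_agree(a2, b1))   # True False
print(dot_shapes_agree(b2, a2), dot_shapes_agree(a2, b2))   # True False
print(dot_shapes_agree(b1, a1), dot_shapes_agree(a1, b1))   # False False
```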
Related
I am just starting to learn Python/NumPy. I want to write a function that takes 2 inputs and a weight matrix, i.e. two NumPy arrays of shape (2,1), applies tanh, and returns a NumPy array of shape (1,1). Here is what I came up with:
import numpy as np
def test_neural(inputs, weights):
    result = np.matmul(inputs, weights)
    print(result)
    z = np.tanh(result)
    return z
x = np.array([[1],[1]])
y = np.array([[1],[1]])
z=test_neural(x,y)
print("final result:",z)
But I am getting the following matmul error:
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 1)
Can someone please tell me what I am missing?
The problem is the dimensions of the matrix multiplication.
You can multiply matrices whose inner dimensions match, like this (read more here):
(M, N) * (N, K) => result dimensions are (M, K)
You are trying to multiply:
(2, 1) * (2, 1)
but the inner dimensions (1 and 2) don't match.
So you have to transpose inputs before multiplying (just apply .T to the matrix) to get valid dimensions for the multiplication:
(1, 2) * (2, 1) => result dimensions are (1, 1)
Code:
import numpy as np
def test_neural(inputs, weights):
    result = np.matmul(inputs.T, weights)
    print(result)
    z = np.tanh(result)
    return z
x = np.array([[1],[1]])
y = np.array([[1],[1]])
z=test_neural(x,y)
# final result: [[0.96402758]]
print("final result:",z)
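As a quick check of the (M, N) * (N, K) rule, here is a short sketch. The sizes M, N, K are arbitrary values chosen for illustration. It also shows that which operand you transpose changes the shape of the result.

```python
import numpy as np

M, N, K = 3, 2, 5                     # arbitrary illustrative sizes
left = np.ones((M, N))
right = np.ones((N, K))
print(np.matmul(left, right).shape)   # (3, 5)

x = np.array([[1], [1]])              # shape (2, 1)
y = np.array([[1], [1]])              # shape (2, 1)
inner = np.matmul(x.T, y)             # (1, 2) * (2, 1) -> (1, 1)
outer = np.matmul(x, y.T)             # (2, 1) * (1, 2) -> (2, 2)
print(inner.shape, outer.shape)       # (1, 1) (2, 2)
```

For the function in the question, only the x.T variant gives the (1,1) result that was asked for.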
One can use numpy.where for selecting values from two arrays depending on a condition:
import numpy
a = numpy.random.rand(5)
b = numpy.random.rand(5)
c = numpy.where(a > 0.5, a, b) # okay
If the array has more dimensions, however, this does not work anymore:
import numpy
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where(a[:, 0] > 0.5, a, b) # !
Traceback (most recent call last):
File "p.py", line 10, in <module>
c = numpy.where(a[:, 0] > 0.5, a, b) # okay
File "<__array_function__ internals>", line 6, in where
ValueError: operands could not be broadcast together with shapes (5,) (5,2) (5,2)
I would have expected a numpy array of shape (5,2).
What's the issue here? How to work around it?
Remember that broadcasting in NumPy aligns shapes from the right, so while (5,)-shaped arrays can broadcast with (2,5)-shaped arrays, they can't broadcast with (5,2)-shaped arrays. To broadcast with a (5,2)-shaped array you need to keep the second dimension so that the shape is (5,1) (a length-1 axis broadcasts with anything).
Thus, you need to keep the second dimension when indexing (indexing with a single integer removes that dimension). You can do this by putting the index in a one-element list:
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where(a[:, [0]] > 0.5, a, b) # works
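A short sketch of a few equivalent ways to keep the second dimension, plus a row-by-row check of the result. The list comprehension is only there as a reference for what the call is expected to select.

```python
import numpy as np

a = np.random.rand(5, 2)
b = np.random.rand(5, 2)

print(a[:, 0].shape)         # (5,)   integer index drops the axis
print(a[:, [0]].shape)       # (5, 1) list index keeps it
print(a[:, :1].shape)        # (5, 1) slice keeps it
print(a[:, 0, None].shape)   # (5, 1) drop it, then add a new axis

c = np.where(a[:, [0]] > 0.5, a, b)

# Reference: pick the whole row from a or b based on the first column of a
expected = np.array([a[i] if a[i, 0] > 0.5 else b[i] for i in range(5)])
print(np.array_equal(c, expected))   # True
```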
You can use c = numpy.where(a > 0.5, a, b).
However, if you want to use only the first column of a, then you need to consider the shape of the output.
Let's first look at the shape of the condition:
(a[:, 0] > 0.5).shape # outputs (5,)
It is one-dimensional,
while a and b have shape (5, 2).
They are two-dimensional, and a (5,) mask can't be broadcast against (5, 2).
The solution is to reshape the mask to shape (5, 1).
Your code should look like this:
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where((a[:, 0] > 0.5).reshape(-1, 1), a, b) # works
You can try:
import numpy
a = numpy.random.rand(5, 2)
b = numpy.random.rand(5, 2)
c = numpy.where(a > 0.5, a, b)
instead of: c = np.where(a>0.5,a,b)
you can use: c = np.array([a,b])[a>0.5]
which works for multidimensional arrays if a and b have the same shape.
I need to calculate the dot product of two matrices. Probably tensordot would do the job, but I am struggling to figure out the exact solution.
The simple option
res = np.dot(x, fullkernel[:, :-1].transpose())
works fine, where x is of shape (9999,), fullkernel of shape (980,10000), and res is of shape (1, 980).
Now I need to do a similar thing with 2 dimensions. Thus my x now has shape (9999, 2) and fullkernel has shape (2, 980, 10000).
I want my result res to have 2 dimensions, where each one is the dot product of one column of x and one slice of fullkernel along its first axis.
You can do that like this:
res = np.einsum('ki,ijk->ij', x, fullkernel[:, :, :-1])
print(res.shape)
# (2, 980)
If you want to have the additional singleton dimension in the middle just do:
res = np.expand_dims(res, 1)
An equivalent solution with @ / np.matmul would be:
res = np.expand_dims(x.T, 1) @ np.moveaxis(fullkernel[:, :, :-1], 2, 1)
print(res.shape)
# (2, 1, 980)
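To convince yourself that both expressions compute what was asked for, here is a sketch on much smaller arrays. The sizes n, m, d are stand-ins I picked for 10000, 980 and 2. The loop uses the original one-dimensional np.dot expression as a reference.

```python
import numpy as np

n, m, d = 6, 4, 2                       # small stand-ins for 10000, 980 and 2
x = np.random.rand(n - 1, d)            # plays the role of shape (9999, 2)
fullkernel = np.random.rand(d, m, n)    # plays the role of shape (2, 980, 10000)

res_einsum = np.einsum('ki,ijk->ij', x, fullkernel[:, :, :-1])
res_matmul = np.expand_dims(x.T, 1) @ np.moveaxis(fullkernel[:, :, :-1], 2, 1)

# Reference: one plain np.dot per column of x and matching slice of fullkernel
res_loop = np.stack([np.dot(x[:, i], fullkernel[i, :, :-1].T) for i in range(d)])

print(res_einsum.shape, res_matmul.shape)          # (2, 4) (2, 1, 4)
print(np.allclose(res_einsum, res_loop))           # True
print(np.allclose(res_matmul[:, 0, :], res_loop))  # True
```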
We all know that the dot product between vectors must return a scalar:
import numpy as np
a = np.array([1,2,3])
b = np.array([3,4,5])
print(a.shape) # (3,)
print(b.shape) # (3,)
a.dot(b) # 26
b.dot(a) # 26
Perfect. But why, if we use a "real" row vector or column vector (take a look at Difference between numpy.array shape (R, 1) and (R,)), does the NumPy dot product return a dimension error?
arow = np.array([[1,2,3]])
brow = np.array([[3,4,5]])
print(arow.shape) # (1,3)
print(brow.shape) # (1,3)
arow.dot(brow) # ERROR
brow.dot(arow) # ERROR
acol = np.array([[1,2,3]]).reshape(3,1)
bcol = np.array([[3,4,5]]).reshape(3,1)
print(acol.shape) # (3,1)
print(bcol.shape) # (3,1)
acol.dot(bcol) # ERROR
bcol.dot(acol) # ERROR
Because by explicitly adding a second dimension, you are no longer working with vectors but with two dimensional matrices. When taking the dot product of matrices, the inner dimensions of the product must match.
You therefore need to transpose one of your matrices. Which one you transpose will determine the meaning and shape of the result.
A 1x3 times a 3x1 matrix will result in a 1x1 matrix (i.e., a scalar). This is the inner product. A 3x1 times a 1x3 matrix will result in a 3x3 outer product.
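A minimal sketch of both choices, reusing the row vectors from the question. The .item() call is just one way to pull the scalar out of the 1x1 result.

```python
import numpy as np

arow = np.array([[1, 2, 3]])   # shape (1, 3)
brow = np.array([[3, 4, 5]])   # shape (1, 3)

inner = arow.dot(brow.T)       # (1, 3) x (3, 1) -> shape (1, 1)
outer = arow.T.dot(brow)       # (3, 1) x (1, 3) -> shape (3, 3)

print(inner)                   # [[26]]
print(inner.item())            # 26
print(outer)
# [[ 3  4  5]
#  [ 6  8 10]
#  [ 9 12 15]]
```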
You can also use the @ operator, which is matrix multiplication.
In this case, as with the dot product, you need to be aware of the matrix sizes (the ndarrays must always have compatible dimensions), but it's more readable:
>>> a = np.array([1,2,3])
>>> a.shape
(3,)
>>> b= np.array([[1,2,3]])
>>> b.shape
(1, 3)
>>> a@b
Traceback (most recent call last):
File "<input>", line 1, in <module>
ValueError: shapes (3,) and (1,3) not aligned: 3 (dim 0) != 1 (dim 0)
>>> a@b.T
array([14])
You can also compute it by hand like this:
import numpy as npy
Vector1 = npy.array([0,2,3])
Vector2 = npy.array([3,5,1])
print("Dot Product of", Vector1, "and", Vector2,)
def DotProduct(a, b):
    NetValue = 0
    for i in range(len(a)):
        NetValue += a[i] * b[i]
    return NetValue
ans = DotProduct(Vector1,Vector2)
print("The answer is =",ans)
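For comparison, here is a sketch that checks a hand-written dot product against np.dot on the same two vectors. dot_product_loop is a hypothetical name, and the length check is my addition rather than part of the answer above.

```python
import numpy as np

def dot_product_loop(a, b):
    """Hypothetical pure-Python dot product for two equal-length 1-D sequences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))

v1 = np.array([0, 2, 3])
v2 = np.array([3, 5, 1])

print(dot_product_loop(v1, v2))   # 13
print(np.dot(v1, v2))             # 13
```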
I have some problems understanding how Python/NumPy casts array shapes when comparing to an empty list, which, as far as I understand, is an implicit (element-wise) comparison with False.
In the following example the shape decreases by one in the last dimension if it is not greater than 1.
z = N.zeros((2,2,1))
z == []
>> array([], shape=(2, 2, 0), dtype=bool)
z2 = N.zeros((2,2,2))
z2 ==[]
>> False
If, however, I compare with False directly, I get the expected output.
z = N.zeros((2,2,1))
(z == False).shape
>> (2, 2, 1)
z2 = N.zeros((2,2,2))
(z2 == False).shape
>> (2, 2, 2)
This is ordinary broadcasting at work. When you do
z = N.zeros((2,2,1))
z == []
[] is interpreted as an array of shape (0,), and then the shapes are broadcast against each other:
(2, 2, 1)
vs (0,)
Since (0,) is shorter than (2, 2, 1), it gets expanded, as if the array were copied repeatedly:
(2, 2, 1)
vs (2, 2, 0)
and since there's a 1 in the first shape and the other shape doesn't have a 1 there, the first shape gets "expanded" as if it were copied zero times:
(2, 2, 0)
vs (2, 2, 0)
The comparison thus results in an array of booleans with shape (2, 2, 0).
When z has shape (2, 2, 2):
z2 = N.zeros((2,2,2))
z2 ==[]
broadcasting fails, since a length-2 axis and a length-0 axis can't be broadcast against each other. NumPy reports that it doesn't know how to perform the comparison:
>>> numpy.zeros([2, 2, 2]).__eq__([])
NotImplemented
The list doesn't know how either, so Python falls back on the default comparison by identity, and gets a result of False.
When you compare against False:
z = N.zeros((2,2,1))
(z == False).shape
False gets interpreted as an array of shape () - an empty shape! That gets broadcast out to shape (2, 2, 1), as if copied out to an array full of Falses, so the result has the same shape as z.
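The shape arithmetic described above can be reproduced without any list comparison. This sketch converts [] to an explicit array and uses np.broadcast to inspect the broadcast shape. It deliberately avoids z2 == [] itself, since how that comparison is handled may depend on the NumPy version.

```python
import numpy as np

z = np.zeros((2, 2, 1))
empty = np.array([])                  # what [] is interpreted as
print(empty.shape)                    # (0,)
print(np.broadcast(z, empty).shape)   # (2, 2, 0)
print((z == empty).shape)             # (2, 2, 0)

z2 = np.zeros((2, 2, 2))
try:
    np.broadcast(z2, empty)
    mismatch = False
except ValueError:
    mismatch = True                   # length-2 and length-0 axes don't broadcast
print(mismatch)                       # True

print(np.array(False).shape)          # ()
print((z == False).shape)             # (2, 2, 1)
print((z2 == False).shape)            # (2, 2, 2)
```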