numpy addition between different dimensional arrays - python

I am running the following code:
import numpy as np
a = np.array([1, 2])
b = np.array([[1, 2]])
a = a + b
print(a)
[[2 4]]
As you can see, a is 1-dimensional and b is 2-dimensional.
Mathematically, it is not possible to add arrays of different dimensions,
so how can this work in NumPy? And what does [[2 4]] mean?
a.shape is (2,)
b.shape is (1, 2)
(a + b).shape is (1, 2)
However, the following code yields an error:
import numpy as np
a = np.array([1, 2])
b = np.array([[1, 2]])
a += b
Why doesn't this work? What makes the result different?

As already implied in the comments, it always helps to check the documentation; quoting:
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when
they are equal,
or one of them is 1
In your case a has shape (2,) and b has shape (1, 2). For broadcasting, the 1-D array a is treated as if it had shape (1, 2): the trailing dimensions are equal (2 and 2), and the remaining dimension of b is 1, so the shapes are compatible. NumPy therefore adds them even though this is not a well-defined operation between arrays of different dimensions in the strict mathematical sense; the result [[2 4]] is simply a + b broadcast to shape (1, 2).
The in-place version a += b fails because the broadcast result has shape (1, 2), which cannot be written back into a, whose shape is (2,); NumPy refuses to silently change the shape of the output operand.
If the shapes were neither equal nor 1 in some dimension (for example (2,) and (1, 3)), you would get a broadcasting error even for a + b.
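A short sketch of both cases, using the shapes from the question (the error text in the comment is roughly what current NumPy versions print):
import numpy as np

a = np.array([1, 2])      # shape (2,)
b = np.array([[1, 2]])    # shape (1, 2)

# a is treated as shape (1, 2) for broadcasting, so a + b has shape (1, 2)
print((a + b).shape)      # (1, 2)

# The in-place version has to store a (1, 2) result into the (2,) array a,
# which is not possible, so NumPy raises a ValueError.
try:
    a += b
except ValueError as e:
    print(e)              # non-broadcastable output operand ...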

Related

Applying matrix functions like scipy.linalg.eigh to higher dimensional arrays

I am new to numpy but have been using python for quite a while as an engineer.
I am writing a program that currently stores stress tensors as 3x3 numpy arrays within another NxM array, which represents values through time and through the thickness of a wall, so overall it is an NxMx3x3 numpy array. I want to efficiently calculate the eigenvalues and eigenvectors of each 3x3 array within this larger array. So far I have tried using "fromiter", but this doesn't seem to work because the function returns two arrays. I have also tried apply_along_axis, which also doesn't work because it says the inner 3x3 is not a square matrix. I can do it with a list comprehension, but resorting to lists doesn't seem ideal.
Example, just calculating eigenvalues using a list comprehension:
import numpy as np
from scipy import linalg
a=np.random.random((2,2,3,3))
f=linalg.eigvalsh
ans=np.asarray([f(x) for x in a.reshape((4,3,3))])
ans.shape=(2,2,3)
I thought something like this would work but I have played around with it and can't get it working:
np.apply_along_axis(f,0,a)
BTW the 2x2 bit could be up to 5000x100, and this code is repeated ~50x50x200 times, hence the need for efficiency. Any help would be greatly appreciated!
You can use numpy.linalg.eigh. It accepts an array like your example a.
Here's an example. First, create an array of 3x3 symmetric arrays:
In [96]: a = np.random.random((2, 2, 3, 3))
In [97]: a = a + np.transpose(a, axes=(0, 1, 3, 2))
In [98]: a[0, 0]
Out[98]:
array([[0.61145048, 0.85209618, 0.03909677],
[0.85209618, 1.79309413, 1.61209077],
[0.03909677, 1.61209077, 1.55432465]])
Compute the eigenvalues and eigenvectors of all the 3x3 arrays:
In [99]: evals, evecs = np.linalg.eigh(a)
In [100]: evals.shape
Out[100]: (2, 2, 3)
In [101]: evecs.shape
Out[101]: (2, 2, 3, 3)
Take a look at the result for a[0, 0]:
In [102]: evals[0, 0]
Out[102]: array([-0.31729364, 0.83148477, 3.44467813])
In [103]: evecs[0, 0]
Out[103]:
array([[-0.55911658, 0.79634401, 0.23070516],
[ 0.63392772, 0.23128064, 0.73800062],
[-0.53434473, -0.55887877, 0.63413738]])
Verify that it is the same as computing the eigenvalues and eigenvectors for a[0, 0] separately:
In [104]: np.linalg.eigh(a[0, 0])
Out[104]:
(array([-0.31729364, 0.83148477, 3.44467813]),
array([[-0.55911658, 0.79634401, 0.23070516],
[ 0.63392772, 0.23128064, 0.73800062],
[-0.53434473, -0.55887877, 0.63413738]]))

How to use Numpy Matrix operation to calculate multiple samples at once?

How do I use Numpy matrix operations to calculate over multiple vector samples at once?
Please see below the code I came up with; 'd' is the outcome I'm trying to get, but this is only one sample. How do I calculate the output without repeating the code for every sample or looping through every sample?
a = np.array([[1, 2, 3]])
b = np.array([[1, 2, 3]])
c = np.array([[1, 2, 3]])
d = ((a.T * b).flatten() * c.T)
a1 = np.array([[2, 3, 4]])
b1 = np.array([[2, 3, 4]])
c1 = np.array([[2, 3, 4]])
d1 = ((a1.T * b1).flatten() * c1.T)
a2 = np.array([[3, 4, 5]])
b2 = np.array([[3, 4, 5]])
c2 = np.array([[3, 4, 5]])
d2 = ((a2.T * b2).flatten() * c2.T)
The way broadcasting works is to repeat your data along an axis of size one as many times as necessary to make your element-wise operation work. That is what is happening to axis 1 of a.T and axis 0 of b. Similar for the product of the result. My recommendation would be to concatenate all your inputs along another dimension, to allow broadcasting to happen along the existing two.
Before showing how to do that, let me just mention that you would be much better off using ravel instead of flatten in your example. flatten always makes a copy of the data, while ravel returns a view when it can (as it can here). Since a.T * b is a temporary matrix anyway, there is really no reason to make the copy.
The easiest way to combine some arrays along a new dimension is np.stack. I would recommend combining along the first dimension for a couple of reasons. It's the default for stack and your result can be indexed more easily: d[0] will be d, d[1] will be d1, etc. If you ever add matrix multiplication into your pipeline, np.dot will work out of the box since it operates on the last two dimensions.
a = np.stack((a0, a1, a2, ..., aN))
b = np.stack((b0, b1, b2, ..., bN))
c = np.stack((c0, c1, c2, ..., cN))
Now a, b and c are all 3D arrays; the first dimension is the measurement index. The second and third correspond to the two dimensions of the original arrays.
With this structure, what you called transpose before is just swapping the last two dimensions (since one of them is 1), and raveling/flattening is just multiplying out the last two dimensions, e.g. with reshape:
d = (a.reshape(N, -1, 1) * b).reshape(N, 1, -1) * c.reshape(N, -1, 1)
If you set one of the dimensions to have size -1 in the reshape, it will absorb the remaining size. In this case, all your arrays have 3 elements, so the -1 will be equivalent to 3.
You have to be a little careful when you convert the ravel operation to 3D. In 2D, x.ravel() * c.T implicitly transforms x into a 1xN array before broadcasting. In 3D, x.reshape(3, -1) creates a 2D 3x9 array (9 elements per sample), which you multiply by c.reshape(3, -1, 1), which is 3x3x1. Broadcasting rules state that you are effectively multiplying a 1x3x9 array by a 3x3x1, but you really want to multiply a 3x1x9 array by the 3x3x1, so you need to specify all three axes for the 3D "ravel" explicitly.
Here is an IDEOne link with your sample data for you to play with: https://ideone.com/p8vTlx
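Here is a runnable sketch of the whole pipeline, stacking the question's three samples (what the question calls a/a1/a2 and so on) and checking that the vectorized expression matches the per-sample one:
import numpy as np

# Stack the question's three samples along a new first axis.
a = np.stack(([[1, 2, 3]], [[2, 3, 4]], [[3, 4, 5]]))   # shape (3, 1, 3)
b = a.copy()
c = a.copy()
N = a.shape[0]

# Vectorized version of d = ((a.T * b).flatten() * c.T) for all samples at once.
d = (a.reshape(N, -1, 1) * b).reshape(N, 1, -1) * c.reshape(N, -1, 1)
print(d.shape)   # (3, 3, 9)

# Compare against the per-sample computation from the question.
a0 = np.array([[1, 2, 3]])
d0 = (a0.T * a0).ravel() * a0.T
print(np.array_equal(d[0], d0))   # True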

Selecting a column of a numpy array

I am somewhat confused about selecting a column of a NumPy array, because the result is different from MATLAB and even from the NumPy matrix type. Please see the following cases.
In Matlab, we use the following command to select a column vector out of a matrix.
x = [0, 1; 2 3]
out = x(:, 1)
Then out becomes [0; 2], which is a column vector.
To do the same thing with a NumPy Matrix
x = np.matrix([[0, 1], [2, 3]])
out = x[:, 0]
Then the output is np.matrix([[0], [2]]) which is expected, and it is a column vector.
However, in case of NumPy array
x = np.array([[0, 1], [2, 3]])
out = x[:, 0]
The output is np.array([0, 2]) which is 1 dimensional, so it is not a column vector. My expectation is it should have been np.array([[0], [2]]).
I have two questions.
1. Why is the output from the NumPy array case different from the NumPy matrix case? This is causing a lot of confusion for me, but I think there might be some reason for this.
2. To get a column vector from a 2D NumPy array, do I then have to do something extra, like expand_dims?
x = np.array([[0, 1], [2, 3]])
out = np.expand_dims(x[:, 0], axis = 1)
In MATLAB everything has at least 2 dimensions. In older MATLAB versions, 2d was all there was; now arrays can have more. np.matrix is modeled on that old MATLAB.
What does MATLAB do when you index a 3d matrix?
np.array is more general. It can have 0, 1, 2 or more dimensions.
x[:, 0]
x[0, :]
both select one column or row, and return an array with one less dimension.
x[:, [0]]
x[[0], :]
would return 2d arrays, with a singleton dimension.
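For example, a quick check of the shapes with the x from the question:
import numpy as np

x = np.array([[0, 1], [2, 3]])

print(x[:, 0].shape)    # (2,)   -- scalar index drops the column dimension
print(x[:, [0]].shape)  # (2, 1) -- list index keeps a 2-D column
print(x[[0], :].shape)  # (1, 2) -- list index keeps a 2-D row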
In Octave (a MATLAB clone), indexing produces inconsistent results depending on which side of the matrix I index:
octave:7> x=ones(2,3,4);
octave:8> size(x)
ans =
2 3 4
octave:9> size(x(1,:,:))
ans =
1 3 4
octave:10> size(x(:,:,1))
ans =
2 3
MATLAB/Octave adds dimensions at the end, and apparently readily squeezes them down on that side as well.
numpy orders the dimensions in the other direction, and can add dimensions at the start as needed. But it is consistent in squeezing out singleton dimensions when indexing.
The fact that numpy can have any number of dimensions, while MATLAB has a minimum of 2, is a crucial difference that often trips up MATLAB users. But one isn't any more logical than the other. MATLAB's practice is more a matter of history than general principles.

Python numpy: Dimension [0] in vectors (n-dim) vs. arrays (nxn-dim)

I'm currently wondering how the numpy array behaves. It feels like the dimensions are not consistent between vectors (Nx1 dimensional) and 'real arrays' (NxN dimensional).
I don't get why this isn't working:
a = array(([1,2],[3,4],[5,6]))
concatenate((a[:,0],a[:,1:]), axis = 1)
# ValueError: all the input arrays must have same number of dimensions
It seems like the colon slice (the 1: part) makes the difference, but using :0 doesn't work either.
Thanks in advance!
Detailed version: I would expect shape(b)[0] to refer to the vertical direction for Nx1 arrays, like in a 2D (NxN) array. But it seems like dimension [0] is the horizontal direction for these (Nx1) arrays?
from numpy import *
a = array(([1,2],[3,4],[5,6]))
b = a[:,0]
print shape(a) # (3L, 2L), [0] is vertical
print a # [1,2],[3,4],[5,6]
print shape(b) # (3L, ), [0] is horizontal
print b # [1 3 5]
c = b * ones((shape(b)[0],1))
print shape(c) # (3L, 3L), I'd expect (3L, 1L)
print c # [[ 1. 3. 5.], [ 1. 3. 5.], [ 1. 3. 5.]]
What did I get wrong? Is there a nicer way than
d = b * ones((1, shape(b)[0]))
d = transpose(d)
print shape(d) # (3L, 1L)
print d # [[ 1.], [ 3.], [ 5.]]
to get the (Nx1) vector that I expect or want?
There are two overall issues here. First, b is not an (N, 1) shaped array, it is an (N,) shaped array. In numpy, 1D and 2D arrays are different things. 1D arrays simply have no direction. Vertical vs. horizontal, rows vs. columns, these are 2D concepts.
The second has to do with something called "broadcasting". In numpy arrays, you are able to broadcast lower-dimensional arrays to higher-dimensional ones, and the lower-dimensional part is applied elementwise to the higher-dimensional one.
The broadcasting rules are pretty simple:
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when
they are equal, or
one of them is 1
In your case, it starts with the last dimension of ones((shape(b)[0],1)), which is 1. This meets the second criterion. So b (shape (3,)) is treated as a (1, 3) row and multiplied elementwise against the (3, 1) column of ones, resulting in a 2D (3, 3) array.
So it is roughly equivalent to:
c = np.array([x*b for x in ones(shape(b))])
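A quick check of the shapes involved (a small sketch using the question's b, i.e. a[:, 0]):
import numpy as np

b = np.array([1, 3, 5])              # shape (3,)
c = b * np.ones((b.shape[0], 1))     # (3,) is broadcast as a (1, 3) row against (3, 1)

print(c.shape)   # (3, 3)
print(c)         # every row is a copy of b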
Edit:
To answer your original question, what you want to do is to keep both the first and second arrays as 2D arrays.
numpy has a very simple rule for this: indexing reduces the number of dimensions, slicing doesn't. So all you need is a length-1 slice. In your example, just change a[:,0] to a[:,:1]. This means 'get every column up to the second one'. Of course that only includes the first column, but it is still considered a slice operation rather than getting an element, so it still preserves the number of dimensions:
>>> print(a[:, 0])
[1 3 5]
>>> print(a[:, 0].shape)
(3,)
>>> print(a[:, :1])
[[1]
[3]
[5]]
>>> print(a[:, :1].shape)
(3, 1)
>>> print(concatenate((a[:,:1],a[:,1:]), axis = 1))
[[1 2]
[3 4]
[5 6]]

python numpy ValueError: operands could not be broadcast together with shapes

In numpy, I have two "arrays", X is (m,n) and y is a vector (n,1)
using
X*y
I am getting the error
ValueError: operands could not be broadcast together with shapes (97,2) (2,1)
When (97,2)x(2,1) is clearly a legal matrix operation and should give me a (97,1) vector
EDIT:
I have corrected this using X.dot(y) but the original question still remains.
dot is matrix multiplication, but * does something else.
We have two arrays:
X, shape (97,2)
y, shape (2,1)
With Numpy arrays, the operation
X * y
is done element-wise, but one or both of the values can be expanded in one or more dimensions to make them compatible. This operation is called broadcasting. Dimensions where the size is 1, or which are missing, can be used in broadcasting.
In the example above the dimensions are incompatible, because:
97 2
2 1
Here there are conflicting numbers in the first dimension (97 and 2). That is what the ValueError above is complaining about. The second dimension would be ok, as number 1 does not conflict with anything.
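A minimal reproduction with dummy arrays of the same shapes:
import numpy as np

X = np.ones((97, 2))
y = np.ones((2, 1))

try:
    X * y            # element-wise, so the shapes must broadcast -- here they don't
except ValueError as e:
    print(e)         # operands could not be broadcast together with shapes (97,2) (2,1)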
For more information on broadcasting rules: http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
(Please note that if X and y are of type numpy.matrix, then the asterisk can be used as matrix multiplication. My recommendation is to keep away from numpy.matrix; it tends to complicate more than it simplifies.)
Your arrays should be fine with numpy.dot; if you get an error on numpy.dot, you must have some other bug. If the shapes are wrong for numpy.dot, you get a different exception:
ValueError: matrices are not aligned
If you still get this error, please post a minimal example of the problem. An example multiplication with arrays shaped like yours succeeds:
In [1]: import numpy
In [2]: numpy.dot(numpy.ones([97, 2]), numpy.ones([2, 1])).shape
Out[2]: (97, 1)
Per numpy docs:
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when:
they are equal, or
one of them is 1
In other words, if you are trying to multiply two matrices (in the linear algebra sense) then you want X.dot(y) but if you are trying to broadcast scalars from matrix y onto X then you need to perform X * y.T.
Example:
>>> import numpy as np
>>>
>>> X = np.arange(8).reshape(4, 2)
>>> y = np.arange(2).reshape(1, 2) # create a 1x2 matrix
>>> X * y
array([[0,1],
[0,3],
[0,5],
[0,7]])
You are looking for np.matmul(X, y). In Python 3.5+ you can use X @ y.
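A quick check with dummy arrays of the question's shapes:
import numpy as np

X = np.ones((97, 2))
y = np.ones((2, 1))

print(np.matmul(X, y).shape)   # (97, 1)
print((X @ y).shape)           # (97, 1), the @ operator is equivalent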
It's possible that the error didn't occur in the dot product, but after.
For example try this
a = np.random.randn(12,1)
b = np.random.randn(1,5)
c = np.random.randn(5,12)
d = np.dot(a,b) * c
np.dot(a, b) will be fine; however, np.dot(a, b) * c is clearly wrong (12x1 times 1x5 gives 12x5, which cannot element-wise multiply 5x12), and numpy will give you
ValueError: operands could not be broadcast together with shapes (12,5) (5,12)
The traceback points at the line containing np.dot, which can make it look like the dot product failed, but the real issue is the element-wise multiplication after it.
Use np.mat(x) * np.mat(y), that'll work.
We might confuse ourselves into thinking that a * b is a dot product.
But in fact, it is a broadcast operation.
Dot Product :
a.dot(b)
Broadcast:
The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is broadcast across the larger array so that they have compatible shapes.
(m,n) +-/* (1,n) → (m,n) : the operation will be applied to m rows
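For instance, a small sketch of the (m,n) op (1,n) case:
import numpy as np

a = np.arange(6).reshape(2, 3)   # shape (2, 3)
b = np.array([[10, 20, 30]])     # shape (1, 3)

# The single row of b is applied to each of the 2 rows of a.
print(a + b)   # [[10 21 32]
               #  [13 24 35]]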
Convert the arrays to matrices, and then perform the multiplication.
X = np.matrix(X)
y = np.matrix(y)
X*y
We should consider two points about broadcasting:
first, what is possible in principle;
second, how much of what is possible NumPy actually does.
I know it might look a bit confusing, but I will make it clear with an example.
Suppose we have two arrays: the first has three dimensions (call it A) and the second has five (call it B).
NumPy tries to match the last/trailing dimensions, so it does not care about the first two dimensions of B.
It then compares those trailing dimensions with each other, and only if they are equal, or one of them is 1, does NumPy say "OK, you two match". If those conditions are not satisfied, NumPy says "sorry... it's not my job!" and raises a broadcasting error.
You might argue that the comparison would be better if it could also handle dimensions that are divisible (4 and 2, or 9 and 3), replicating/broadcasting the smaller one by a whole number (2 or 3 in those examples). I agree that this is conceivable, but it is not what NumPy does, and this is why I started the discussion by distinguishing what is possible from what NumPy is capable of.
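A sketch of that trailing-dimension matching, with made-up shapes for a 3-D A and a 5-D B:
import numpy as np

A = np.ones((2, 3, 4))           # three dimensions
B = np.ones((7, 8, 2, 3, 4))     # five dimensions; only the trailing (2, 3, 4) is compared with A

print((A * B).shape)             # (7, 8, 2, 3, 4) -- trailing dimensions match, so it broadcasts

C = np.ones((7, 8, 4, 3, 2))     # trailing (4, 3, 2) vs A's (2, 3, 4): last dims 4 vs 2 conflict
try:
    A * C
except ValueError as e:
    print(e)                     # operands could not be broadcast together ...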
This is because X and y are not the same type; for example, X is a numpy matrix and y is a numpy array!
Error: operands could not be broadcast together with shapes (2,3) (2,3,3)
This kind of error occurs when the two arrays do not have compatible shapes.
To correct it, you need to reshape one array to match the other.
See the example below.
a2 = array([[1., 2., 3.],
[2., 3., 2.]])   # a (2, 3) array; the exact values here are just for illustration
a3 =array([[[1., 2., 3.],
[2., 3., 2.],
[2., 4., 5.]],
[[1., 0., 3.],
[2., 3., 7.],
[2., 4., 6.]]])
with shape = (2,3,3)
If I try to run np.multiply(a2, a3), it will return the error below:
Error: operands could not be broadcast together with shapes (2,3) (2,3,3)
To solve this, check out the broadcasting rules, which state that two dimensions are compatible when:
#1. they are equal, or
#2. one of them is 1
Therefore, let's reshape a2:
reshaped = a2.reshape(2, 3, 1)
Now np.multiply(reshaped, a3) runs successfully, because shapes (2, 3, 1) and (2, 3, 3) are broadcast-compatible.
ValueError: operands could not be broadcast together with shapes (x, y) (a, b)
where x, y, a, b stand for the actual shapes.
Basically, this error occurs when the number of columns y of one array does not equal the number of elements in the other array.
Let's go through an example:
import numpy as np
arr1 = np.arange(12).reshape(3, 4)
Output of arr1:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
arr2 = np.arange(4).reshape(1, 4)
or (both are the same: 1 row and 4 columns)
arr2 = np.arange(4)
Output of arr2:
array([0, 1, 2, 3])
Since the number of elements in arr2 equals the number of columns in arr1, the two broadcast together:
for x, y in np.nditer([arr1, arr2]):
    print(x, y)
Output:
0 0
1 1
2 2
3 3
4 0
5 1
6 2
7 3
8 0
9 1
10 2
11 3
