In Python, how do I write a program that creates two 4 * 4 matrices A and B whose elements are random numbers, and then builds a matrix C that looks like
C = ⎡A B⎤
⎣B A⎦
Find the diagonal of the matrix C. The diagonal elements are to be presented in a 4 * 2 matrix.
import numpy as np
matrix_A = np.random.randint(10, size=(4, 4))
matrix_B = np.random.randint(10, size=(4, 4))
matrix_C = np.array([[matrix_A, matrix_B], [matrix_B, matrix_A]])
d = matrix_C.diagonal()
D = d.reshape(2, 4)
print(f'This is matrix C:\n{matrix_C}')
print(f'These are the diagonals of Matrix C:\n{D}')
The construction
matrix_C = np.array([[matrix_A, matrix_B], [matrix_B, matrix_A]])
does not concatenate the matrices; it creates a 4th-order tensor (it nests the matrices inside an outer array). You can check that with
print(matrix_C.shape) # (2, 2, 4, 4)
To lay the blocks out into a single 8 * 8 matrix, call np.block; then all other parts of your code work fine:
matrix_C = np.block([[matrix_A, matrix_B], [matrix_B, matrix_A]])
print(matrix_C.shape) # (8, 8)
d = matrix_C.diagonal()
D = d.reshape(2, 4)  # equals np.array([matrix_A.diagonal(), matrix_A.diagonal()])
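As a quick sanity check (a minimal sketch reusing the variables above): the diagonal of [[A, B], [B, A]] passes only through the two A blocks, so both rows of D equal A's diagonal.
import numpy as np
matrix_A = np.random.randint(10, size=(4, 4))
matrix_B = np.random.randint(10, size=(4, 4))
matrix_C = np.block([[matrix_A, matrix_B], [matrix_B, matrix_A]])
D = matrix_C.diagonal().reshape(2, 4)
# both rows are A's diagonal
assert (D == np.vstack([matrix_A.diagonal()] * 2)).all()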
I have been using NumPy for a while, but there are still cases where broadcasting is impenetrable to me. Please see the code below:
import numpy as np
np.random.seed(123)
# Number of measurements for the x variable
nx = 7
# Number of measurements for the y variable
ny = 11
# Number of items for which we run the simulation
nc = 23
# Fake some data
x = np.random.uniform(0, 1, size=(nx, ))
y = np.random.uniform(0, 1, size=(ny, ))
# histogram_2d represents the 2d frequency of the x, y measurements
histogram_2d = np.random.randint(0, 20, size=(nx, ny))
# c is the actual simulation results, size=(nx*ny, nc)
c = np.random.uniform(0, 9, size=(nx*ny, nc))
# Try broadcasting
c_3d = c.reshape((nc, nx, ny))
numpy_sum = (c_3d * histogram_2d).sum()
# Attempt possible replacement with a simple loop
partial_sum = 0.0
for i in range(nc):
    c_2d = np.reshape(c[:, i], (nx, ny))
    partial_sum += (c_2d * histogram_2d).sum()
print('Numpy broadcasting: ', numpy_sum)
print('Actual loop : ', partial_sum)
In my naivete, I was expecting the two approaches to give the same results (up to some multiple of machine precision). But on my system I get this:
Numpy broadcasting: 74331.4423599
Actual loop : 73599.8596346
As my ignorance is showing: given that histogram_2d is a 2D array and c_3d is a 3D array, I was simply thinking that NumPy would magically expand histogram_2d with nc copies of itself in the first axis and do the multiplication. But it appears I am not quite correct.
I would like to know how to replace the condensed, broadcast multiplication-plus-sum with a proper for loop. I am looking at Fortran code that is meant to do the same thing, and this Fortran code:
hist_3d = spread(histogram_2d, 1, nc)
c_3d = reshape(c, [nc, nx, ny])
partial_sum = sum(c_3d*hist_3d)
does not do what the NumPy broadcasting does... which means I am doing something fundamentally wrong somewhere, and/or my understanding of broadcasting is still very limited.
In [3]: c.shape
Out[3]: (77, 23)
This isn't a good reshape; it works, but will mess up the layout
In [5]: c_3d = c.reshape((nc, nx, ny)); c_3d.shape
Out[5]: (23, 7, 11)
This is good - splitting the 77 into 7 and 11:
In [6]: c_3d = c.reshape((nx, ny, nc)); c_3d.shape
Out[6]: (7, 11, 23)
To multiply with:
In [7]: histogram_2d.shape
Out[7]: (7, 11)
Use:
In [8]: (histogram_2d[:,:,None]*c_3d).shape
Out[8]: (7, 11, 23)
In [9]: (histogram_2d[:,:,None]*c_3d).sum()
Out[9]: 73599.85963455029
With this broadcasting:
(7,11,1) and (7,11,23) => (7,11,23)
The 2 key rules are:
add leading dimensions as needed to match the total ndim
change all size-1 dimensions to match
(I used "change" because the 1 may actually be changed to 0. That's not a common case, but it illustrates the generality of broadcasting.)
New trailing dimensions have to be explicit. This avoids ambiguity, as when trying to add a (2,) and a (3,): either one could be expanded to (1,2) or (1,3), but which? If one is written explicitly as (3,1), then expanding the other to (1,2) is unambiguous, and the sum has shape (3,2).
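A tiny illustration of both rules (a minimal sketch):
import numpy as np
a = np.arange(3)[:, None]  # shape (3, 1) -- trailing 1 made explicit
b = np.arange(2)           # shape (2,)
# rule 1: b gains a leading axis -> (1, 2)
# rule 2: size-1 axes are stretched -> both become (3, 2)
print((a + b).shape)  # (3, 2)
# without the explicit trailing dimension, (3,) + (2,) is ambiguous and raises:
try:
    np.arange(3) + np.arange(2)
except ValueError as e:
    print(e)  # operands could not be broadcast together ...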
Suppose I have a 5x10x3 array, which I interpret as 5 'sub-arrays', each consisting of 10 rows and 3 columns. I also have a separate 1D array of length 5, which I call b.
I am trying to insert a new column into each sub-array, where the column inserted into the ith (i=0,1,2,3,4) sub-array is a 10x1 vector where each element is equal to b[i].
For example:
import numpy as np
np.random.seed(777)
A = np.random.rand(5,10,3)
b = np.array([2,4,6,8,10])
A[0] should look like the original first 10x3 sub-array with a fourth column appended whose entries are all 2, and A[1] like the second sub-array with a fourth column of all 4s. And similarly for the other 'sub-arrays'. (Notice b[0]=2 and b[1]=4.)
What about this?
# Make an array B with the same leading dimensions as A
B = np.tile(b, (1, 10, 1)).transpose(2, 1, 0) # shape: (5, 10, 1)
# Concatenate both
np.concatenate([A, B], axis=-1) # shape: (5, 10, 4)
One method would be np.pad:
np.pad(A, ((0,0),(0,0),(0,1)), 'constant', constant_values=[[[],[]],[[],[]],[[],b[:, None,None]]])
# array([[[9.36513084e-01, 5.33199169e-01, 1.66763960e-02, 2.00000000e+00],
# [9.79060284e-02, 2.17614285e-02, 4.72452812e-01, 2.00000000e+00],
# etc.
Or (more typing but probably faster):
i,j,k = A.shape
res = np.empty((i,j,k+1), np.result_type(A, b))
res[...,:-1] = A
res[...,-1] = b[:, None]
Or dstack after broadcast_to:
np.dstack([A, np.broadcast_to(b[:, None], A.shape[:2])])
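A quick check that the three approaches agree (a minimal sketch reusing the question's data):
import numpy as np
np.random.seed(777)
A = np.random.rand(5, 10, 3)
b = np.array([2, 4, 6, 8, 10])
# tile + concatenate
out1 = np.concatenate([A, np.tile(b, (1, 10, 1)).transpose(2, 1, 0)], axis=-1)
# preallocate + fill
i, j, k = A.shape
out2 = np.empty((i, j, k + 1), np.result_type(A, b))
out2[..., :-1] = A
out2[..., -1] = b[:, None]
# dstack after broadcast_to
out3 = np.dstack([A, np.broadcast_to(b[:, None], A.shape[:2])])
assert out1.shape == (5, 10, 4)
assert np.allclose(out1, out2) and np.allclose(out1, out3)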
I want to use tensordot to compute the dot product of a specific dim of two tensors. Like:
A is a tensor, whose shape is (3, 4, 5)
B is a tensor, whose shape is (3, 5)
I want to take a dot product over A's third dim and B's second dim, and get an output whose dims are (3, 4).
Like below:
for i in range(3):
    C[i] = dot(A[i], B[i])
How to do it by tensordot?
Well, do you want this in numpy or in Theano?
In the case where, as you state, you would like to contract axis 3 of A against axis 2 of B, both are straightforward:
import numpy as np
a = np.arange(3 * 4 * 5).reshape(3, 4, 5).astype('float32')
b = np.arange(3 * 5).reshape(3, 5).astype('float32')
result = a.dot(b.T)
in Theano this writes as
import theano.tensor as T
A = T.ftensor3()
B = T.fmatrix()
out = A.dot(B.T)
out.eval({A: a, B: b})
however, the output then is of shape (3, 4, 3). Since you seem to want an output of shape (3, 4), the numpy alternative uses einsum, like so
einsum_out = np.einsum('ijk, ik -> ij', a, b)
However, einsum does not exist in Theano. So the specific case here can be emulated as follows
out = (a * b[:, np.newaxis]).sum(2)
which can also be written in Theano
out = (A * B.dimshuffle(0, 'x', 1)).sum(2)
out.eval({A: a, B: b})
In this specific case, einsum is probably easier to understand than tensordot. For example:
c = np.einsum('ijk,ik->ij', a, b)
I'm going to over-simplify the explanation a bit to make things more immediately understandable. We have two input arrays (separated by the comma) and this yields our output array (to the right of the ->).
a has shape 3, 4, 5 and we'll refer to it as ijk
b has shape 3, 5 (ik)
We want the output c to have shape 3, 4 (ij)
Seems a bit magical, right? Let's break that down a bit.
The letters we "lose" as we cross the -> are axes that will be summed over. That's what dot is doing, as well.
We want output with shape 3, 4, so we're eliminating k
Therefore, the output c should be ij
This means we'll refer to b as ik.
As a full example:
import numpy as np
a = np.random.random((3, 4, 5))
b = np.random.random((3, 5))
# Looping through things
c1 = []
for i in range(3):
    c1.append(a[i].dot(b[i]))
c1 = np.array(c1)
# Using einsum instead
c2 = np.einsum('ijk,ik->ij', a, b)
assert np.allclose(c1, c2)
You can do this with tensordot as well. I'll add an example of that as soon as I have a bit more time. (Of course, if anyone else would like to add a tensordot example as another answer in the meantime, feel free!)
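In the meantime, here is one possible tensordot route (a sketch, not the promised answer): tensordot contracts axes but does not batch, so it materializes the full (3, 4, 3) contraction, and the batched products we want sit on the diagonal of its first and last axes.
import numpy as np
a = np.random.random((3, 4, 5))
b = np.random.random((3, 5))
t = np.tensordot(a, b, axes=([2], [1]))  # shape (3, 4, 3)
c3 = t.diagonal(axis1=0, axis2=2).T      # shape (3, 4)
assert np.allclose(c3, np.einsum('ijk,ik->ij', a, b))
Note that this builds an intermediate array a factor of 3 larger than needed, so einsum is usually preferable here.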
I'm trying to do an integration with numpy:
A = n.trapz(B,C)
but I have some issues with B and C shapes
B is a filled array initialized with NumPy's zeros function:
B=np.zeros((N,1))
C is a column extracted from a matrix, also initialized with NumPy:
D = np.zeros((N, 2))
C = D[:, 0]
the problem is that:
n.shape(B) # (N,1)
n.shape(C) # (N,)
how can I manage this?
Try
B = np.zeros(N)
np.trapz(B, C)
Also, np.trapz accepts multi-dimensional arrays, so arrays of shape (N, 1) are fine; you just need to specify an axis to handle them properly:
B = np.zeros((N, 1))
C = D[:, 0]
np.trapz(B, C.reshape(N, 1), axis=0)  # axis=0 is the sample axis
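A runnable version of the second variant (a sketch with made-up sample data, integrating y = x^2 over [0, 1]):
import numpy as np
N = 101
D = np.zeros((N, 2))
D[:, 0] = np.linspace(0.0, 1.0, N)  # sample points
C = D[:, 0]
B = (C ** 2).reshape(N, 1)          # y = x^2 as an (N, 1) column
# axis=0 is the sample axis, so this integrates down the column
print(np.trapz(B, C.reshape(N, 1), axis=0))  # ~[0.33335]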
I have a matrix
A = [[ 1. 1.]
     [ 1. 1.]]
and two arrays (a and b); each array contains 20 floats. How can I multiply them using the formula:
⎡x'⎤ = A ⎡x⎤
⎣y'⎦     ⎣y⎦
Is this correct? m = A * [a, b]
Matrix multiplication with NumPy arrays can be done with np.dot.
If X has shape (i,j) and Y has shape (j,k) then np.dot(X,Y) will be the matrix product and have shape (i,k). The last axis of X and the second-to-last axis of Y are multiplied and summed over.
Now, if a and b have shape (20,), then np.vstack([a,b]) has shape (2, 20):
In [66]: np.vstack([a,b]).shape
Out[66]: (2, 20)
You can think of np.vstack([a, b]) as a 2x20 matrix with the values of a on the first row, and the values of b on the second row.
Since A has shape (2,2), we can perform the matrix multiplication
m = np.dot(A, np.vstack([a,b]))
to arrive at an array of shape (2, 20).
The first row of m contains the x' values, the second row contains the y' values.
NumPy also has a matrix subclass of ndarray (a special kind of NumPy array) which has convenient syntax for doing matrix multiplication with 2D arrays. If we define A to be a matrix (rather than a plain ndarray which is what np.array(...) creates), then matrix multiplication can be done with the * operator.
I show both ways (with A being a plain ndarray and A2 being a matrix) below:
import numpy as np
A = np.array([[1.,1.],[1.,1.]])
A2 = np.matrix([[1.,1.],[1.,1.]])
a = np.random.random(20)
b = np.random.random(20)
c = np.vstack([a,b])
m = np.dot(A, c)
m2 = A2 * c
assert np.allclose(m, m2)
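On Python 3.5+ the same product can also be written with the @ operator (np.matmul), which works on plain ndarrays without the matrix subclass; a minimal sketch:
import numpy as np
A = np.array([[1., 1.], [1., 1.]])
a = np.random.random(20)
b = np.random.random(20)
# (2, 2) @ (2, 20) -> (2, 20)
m = A @ np.vstack([a, b])
x_new, y_new = m  # first row: x' values, second row: y' values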