I am currently converting some MATLAB code to Python, and I am wondering whether there is a function similar to MATLAB's shiftdim(A, n). From the MATLAB docs:
B = shiftdim(A,n) shifts the dimensions of an array A by n positions. shiftdim shifts the dimensions to the left when n is a positive integer and to the right when n is a negative integer. For example, if A is a 2-by-3-by-4 array, then shiftdim(A,2) returns a 4-by-2-by-3 array.
If you use numpy you can use np.moveaxis.
From the docs:
>>> x = np.zeros((3, 4, 5))
>>> np.moveaxis(x, 0, -1).shape
(4, 5, 3)
>>> np.moveaxis(x, -1, 0).shape
(5, 3, 4)
numpy.moveaxis(a, source, destination)
Parameters
a: np.ndarray
The array whose axes should be reordered.
source: int or sequence of int
Original positions of the axes to move. These must be unique.
destination: int or sequence of int
Destination positions for each of the original axes.
These must also be unique.
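As a quick illustration of the sequence form of source and destination, here is a minimal sketch. The array x mirrors the docs example above; the names single and multi are only for illustration.

```python
import numpy as np

x = np.zeros((3, 4, 5))

# Move a single axis: axis 0 becomes the last axis.
single = np.moveaxis(x, 0, -1)

# Move several axes at once: axis 0 -> last, axis 1 -> second to last.
multi = np.moveaxis(x, [0, 1], [-1, -2])

print(single.shape)  # (4, 5, 3)
print(multi.shape)   # (5, 4, 3)
```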
shiftdim does a bit more than shift axes around.
For shiftdim(A, n): if n is positive, the axes are rotated to the left by n; if n is negative, the axes are shifted to the right by prepending -n leading dimensions of size 1.
For shiftdim(A) with no n, any leading dimensions of size 1 are removed, and the number of removed dimensions is returned as well.
from collections import deque
import numpy as np

def shiftdim(array, n=None):
    if n is not None:
        if n >= 0:
            axes = tuple(range(len(array.shape)))
            new_axes = deque(axes)
            new_axes.rotate(n)
            return np.moveaxis(array, axes, tuple(new_axes))
        return np.expand_dims(array, axis=tuple(range(-n)))
    else:
        idx = 0
        for dim in array.shape:
            if dim == 1:
                idx += 1
            else:
                break
        axes = tuple(range(idx))
        # Note that this returns a tuple of 2 results
        return np.squeeze(array, axis=axes), len(axes)
Same examples as the Matlab docs
a = np.random.uniform(size=(4, 2, 3, 5))
print(shiftdim(a, 2).shape) # prints (3, 5, 4, 2)
print(shiftdim(a, -2).shape) # prints (1, 1, 4, 2, 3, 5)
a = np.random.uniform(size=(1, 1, 3, 2, 4))
b, nshifts = shiftdim(a)
print(nshifts) # prints 2
print(b.shape) # prints (3, 2, 4)
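For the non-negative n case only, the same left rotation of axes can probably also be written with np.transpose and np.roll. This is a sketch under that assumption, not a full shiftdim replacement; the helper name rotate_axes_left is hypothetical.

```python
import numpy as np

def rotate_axes_left(array, n):
    """Rotate the axes of `array` to the left by n positions (n >= 0 assumed)."""
    # Output axis p takes input axis (p + n) mod ndim.
    order = tuple(np.roll(np.arange(array.ndim), -n).tolist())
    return np.transpose(array, order)

a = np.zeros((4, 2, 3, 5))
print(rotate_axes_left(a, 2).shape)  # (3, 5, 4, 2)
print(rotate_axes_left(a, 1).shape)  # (2, 3, 5, 4)
```

np.transpose returns a view where possible, so no data is copied by the rotation itself.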
I just want to get the dot product of some sets of multidimensional data.
For simplicity, I am posting the pieces small, and demonstrating my efforts
thus far.
To just get 'a' dot 'q', and the 4 numbers that I want is easy enough.
import numpy as np
a = np.arange(1,4) # shape = (3,)
q = np.array([[x, x, x] for x in range(4)])+1 # shape = (4, 3)
c = np.dot(a, q.T) # array([ 6, 12, 18, 24]) shape = (4,)
If I want to add another set to 'a', I can expand the dimensions. Again, pretty easy. The dot product simply reflects the additional dimension.
a = np.arange(1,4).reshape(1,3) # shape = (1,3)
c = np.dot(a, q.T) # array([[ 6, 12, 18, 24]]) shape = (1,4)
and the other set...
a = np.vstack((a,a+1)) # shape = (2,3)
c = np.dot(a, q.T) # array([[ 6, 12, 18, 24], [ 9, 18, 27, 36]]) shape = (2,4)
To add another dimension to q, the transpose needs to be a little more complicated.
q = np.expand_dims(q, axis=0) # shape = (1, 4, 3)
c = np.dot(a, np.transpose(q, (0, 2, 1))) # shape = (2, 1, 4)
now stack 'q' matrix
q = np.vstack((q, q+1)) # shape = (2, 4, 3)
c = np.dot(a, np.transpose(q, (0, 2, 1))) # shape = (2, 2, 4)
Though, what I am going for is the diagonal of c. While I have not tried it yet, I am imagining that when 'a' and 'q' start to reach >(2000, 3) and >(2000, 4, 3) c will be (2000, 2000, 4) and I only need 1/2000th of that. Does anyone know how to make this more efficient than doing the calculation and then taking the diagonal?
Again, what I want is...
c = np.dot(a, np.transpose(q, (0, 2, 1)))
c = c[np.arange(2), np.arange(2)]
or
c[0] = np.dot(a[0:1], np.transpose(q[0:1], (0, 2, 1)))
c[1] = np.dot(a[1:2], np.transpose(q[1:2], (0, 2, 1)))
but without having to make the enormous matrix first and then trim it later.
I have read a couple of other, somewhat similar questions, though I hope this question is seen as more complicated than the dot product of a vector with itself and its diagonal. Also, if the answer is np.einsum(), could you explain the process a bit more than the numpy docs do?
I have reposted the question below, with the equivalent einsum() call added after each c. In fact, Alexander Korovin linked to an excellent einsum summary.
I just want to get the dot product of some sets of multidimensional data.
For simplicity, I am posting the pieces small, and demonstrating my efforts
thus far.
To just get 'a' dot 'q', and the 4 numbers that I want is easy enough.
import numpy as np
a = np.arange(1,4) # shape = (3,)
q = np.array([[x, x, x] for x in range(4)])+1 # shape = (4, 3)
c = np.dot(a, q.T) # array([ 6, 12, 18, 24]) shape = (4,)
c = np.einsum('i,ji->j', a, q)
If I want to add another set to 'a', I can expand the dimensions. Again, pretty easy. The dot product simply reflects the additional dimension.
a = np.arange(1,4).reshape(1,3) # shape = (1,3)
c = np.dot(a, q.T) # array([[ 6, 12, 18, 24]]) shape = (1,4)
c = np.einsum('ij,gj->ig', a, q) # shape = (1,4), same as np.dot(a, q.T)
and the other set...
a = np.vstack((a,a+1)) # shape = (2,3)
c = np.dot(a, q.T) # array([[ 6, 12, 18, 24], [ 9, 18, 27, 36]]) shape = (2,4)
c = np.einsum('ij,gj->ig', a, q)
To add another dimension to q, the transpose needs to be a little more complicated.
q = np.expand_dims(q, axis=0) # shape = (1, 4, 3)
c = np.dot(a, np.transpose(q, (0, 2, 1))) # shape = (2, 1, 4)
c = np.einsum('ij,fgj->ifg', a, q) # shape = (2, 1, 4), same axis order as np.dot
now stack 'q' matrix
q = np.vstack((q, q+1)) # shape = (2, 4, 3)
c = np.dot(a, np.transpose(q, (0, 2, 1))) # shape = (2, 2, 4)
c = np.einsum('ij,fgj->ifg', a, q) # shape = (2, 2, 4), same axis order as np.dot
Though, what I am going for is the diagonal of c. While I have not tried it yet, I am imagining that when 'a' and 'q' start to reach >(2000, 3) and >(2000, 4, 3) c will be (2000, 2000, 4) and I only need 1/2000th of that. Does anyone know how to make this more efficient than doing the calculation and then taking the diagonal?
Again, what I want is...
c = np.dot(a, np.transpose(q, (0, 2, 1)))
c = c[np.arange(2), np.arange(2)]
or
c[0] = np.dot(a[0:1], np.transpose(q[0:1], (0, 2, 1)))
c[1] = np.dot(a[1:2], np.transpose(q[1:2], (0, 2, 1)))
but without having to make the enormous matrix first and then trim it later.
So do this...
c = np.einsum('ik,ijk->ij', a, q)
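To unpack what the subscripts 'ik,ijk->ij' mean, here is a small sketch that spells the same sum out as explicit loops and checks it against the diagonal of the full np.dot result. It rebuilds the small a and q from above; the names q0, full, diag and loop are only for illustration.

```python
import numpy as np

a = np.vstack((np.arange(1, 4), np.arange(2, 5)))   # shape (2, 3)
q0 = np.array([[x, x, x] for x in range(4)]) + 1    # shape (4, 3)
q = np.stack((q0, q0 + 1))                          # shape (2, 4, 3)

# Reference: build the full (2, 2, 4) result, then keep only the diagonal.
full = np.dot(a, np.transpose(q, (0, 2, 1)))
diag = full[np.arange(2), np.arange(2)]             # shape (2, 4)

# einsum: i is shared by both inputs and kept, j is kept, k is summed.
c = np.einsum('ik,ijk->ij', a, q)

# The same thing as explicit loops: loop[i, j] = sum_k a[i, k] * q[i, j, k]
loop = np.empty((2, 4), dtype=a.dtype)
for i in range(2):
    for j in range(4):
        loop[i, j] = sum(a[i, k] * q[i, j, k] for k in range(3))

print(c)  # rows: [6 12 18 24] and [18 27 36 45]
print(np.array_equal(c, diag), np.array_equal(c, loop))  # True True
```

Because i appears in both inputs and in the output, row i of a is only ever paired with block i of q, so the (2000, 2000, 4) intermediate is never formed. Since k appears only in the inputs, it is the index that gets summed.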
This is the entire code related to my question. You should be able to see the plots it creates just by pasting it into your IDE and running it.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)
x = np.random.randn(4, 3, 3, 2)
x_pad = np.pad(x, ((0,0), (2, 2), (2, 2), (0,0)),
               mode='constant', constant_values = (0,0))
print ("x.shape =\n", x.shape)
print ("x_pad.shape =\n", x_pad.shape)
print ("x[1,1] =\n", x[1,1])
print ("x_pad[1,1] =\n", x_pad[1,1])
fig, axarr = plt.subplots(1, 2)
axarr[0].set_title('x')
axarr[0].imshow(x[0,:,:,0])
axarr[1].set_title('x_pad')
axarr[1].imshow(x_pad[0,:,:,0])
Specifically, my question is related to these two lines of code:
x = np.random.randn(4, 3, 3, 2)
x_pad = np.pad(x, ((0,0), (2, 2), (2, 2), (0,0)), mode='constant', constant_values = (0,0))
I want to pad the 2nd and 3rd dimensions of x. That is, I want to pad axis 1 (x.shape[1], which is 3) and axis 2 (x.shape[2], which is also 3). In the problem that I am solving, axis 0 and axis 3, with lengths 4 and 2 respectively, represent something else: x.shape[0] is the number of such 3*3 matrices and x.shape[3] is the number of channels.
My question is about how Python is representing this information and about how we are interpreting it. Are these the same?
The statement x = np.random.randn(4, 3, 3, 2) created a matrix of 4 rows by 3 columns, and each element in this 4*3 matrix is a 3-row by 2-column matrix. That is how Python is representing the x_pad. Is this understanding correct?
If so, then in the np.pad statement we are padding the number of columns in the outer matrix (which is 3 in the 4*3). We are also padding the number of rows in the inner matrix (which is 3 in the "3*2").
But the 3, 3 in (4, 3, 3, 2) was supposed to be part of just one matrix, not the columns of the outer matrix and the rows of the inner matrix. I am having trouble visualizing this. Can someone please clarify? Thank you!
These lines:
x = np.random.randn(4, 3, 3, 2)
x_pad = np.pad(x, ((0,0), (2, 2), (2, 2), (0,0)), mode='constant', constant_values = (0,0))
are equivalent to:
x = np.random.randn(4, 3, 3, 2)
x_pad = np.zeros((4, 3+2+2, 3+2+2, 2))
x_pad[:, 2:-2, 2:-2, :] = x
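A quick way to convince yourself of this equivalence is to build both versions and compare them. This is a small check, with padded and manual as illustrative names:

```python
import numpy as np

np.random.seed(1)
x = np.random.randn(4, 3, 3, 2)

# Version 1: np.pad with 2 zeros on each side of axes 1 and 2.
padded = np.pad(x, ((0, 0), (2, 2), (2, 2), (0, 0)),
                mode='constant', constant_values=(0, 0))

# Version 2: allocate zeros and copy x into the centre.
manual = np.zeros((4, 3 + 2 + 2, 3 + 2 + 2, 2))
manual[:, 2:-2, 2:-2, :] = x

print(padded.shape)                    # (4, 7, 7, 2)
print(np.array_equal(padded, manual))  # True
```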
You could interpret a 4-D array as being a 2-D array of 2-D arrays if that fits whatever this data represents for you, but numpy internally stores an array as a 1-D block of data, with x[i,j,k,l] pointing (for the default C-contiguous layout) to data[l + n3*(k + n2*(j + n1*i))], where n1, n2, n3 are the lengths of axes 1, 2 and 3.
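The index formula can be checked directly on a small array. This sketch assumes the default C-contiguous (row-major) layout that a freshly created array has; the indices i, j, k, l are arbitrary picks for illustration.

```python
import numpy as np

x = np.arange(4 * 3 * 3 * 2).reshape(4, 3, 3, 2)
n0, n1, n2, n3 = x.shape
data = x.ravel()  # 1-D view of the same buffer, in C order

i, j, k, l = 2, 1, 0, 1
flat = l + n3 * (k + n2 * (j + n1 * i))

print(flat)                                                 # 43
print(data[flat] == x[i, j, k, l])                          # True
print(np.ravel_multi_index((i, j, k, l), x.shape) == flat)  # True
```

Note that n0, the length of the first axis, never enters the formula; it only bounds the range of i.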
Visualizing 4-D (and higher) arrays is very difficult for humans. You just have to keep track of the indices for the four axes when you deal with such arrays.
This is my first nontrivial use of numpy, and I'm having some trouble in one spot.
So, I have colors, a (xsize + 2, ysize + 2, 3) ndarray, and newlife, a (xsize + 2, ysize + 2) ndarray of booleans. I want to add a random value between -5 and 5 to all three values in colors at all positions where newlife is true. In other words newlife maps 2D vectors to whether or not I want to add a random value to the color in colors at that position.
I've tried a million variations on this:
colors[np.nonzero(newlife)] += (np.random.random_sample((xsize + 2,ysize + 2, 3)) * 10 - 5)
but I keep getting stuff like
ValueError: operands could not be broadcast together with shapes (589,3) (130,42,3) (589,3)
How do I do this?
I think this does what you want:
# example data
colors = np.random.randint(0, 100, (5,4,3))
newlife = np.random.randint(0, 2, (5,4), bool)
# create values to add, then mask with newlife
to_add = np.random.randint(-5,6, (5,4,3))
to_add[~newlife] = 0
# modify in place
colors += to_add
This changes colors in place and assumes a uint8 dtype. Neither choice is essential:
import numpy as np
n_x, n_y = 2, 2
colors = np.random.randint(5, 251, (n_x+2, n_y+2, 3), dtype=np.uint8)
mask = np.random.randint(0, 2, (n_x+2, n_y+2), dtype=bool)
n_change = np.count_nonzero(mask)
print(colors)
print(mask)
colors[mask] += np.random.randint(-5, 6, (n_change, 3), dtype=np.int8).view(np.uint8)
print(colors)
The easiest way of understanding this is to look at the shape of colors[mask].
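As a small sketch of that point: indexing a (rows, cols, 3) array with a (rows, cols) boolean mask yields one row of 3 values per True entry, so the values being added need shape (n_change, 3), not the full (rows, cols, 3). The array sizes and the signed working copy work are assumptions made for this illustration.

```python
import numpy as np

colors = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[0, 1] = mask[2, 3] = mask[3, 0] = True

n_change = np.count_nonzero(mask)
print(n_change)            # 3
print(colors[mask].shape)  # (3, 3): one RGB triple per True position

# Assumption: use a signed working copy so negative offsets are easy to see.
work = colors.astype(np.int16)
delta = np.random.randint(-5, 6, (n_change, 3), dtype=np.int16)
work[mask] += delta

print(np.array_equal(work[mask], delta))  # True, since colors started at zero
print(np.all(work[~mask] == 0))           # True, unmasked pixels are untouched
```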
I have an M-dimensional np.ndarray, where M <= N. Beyond this condition, the array may have any shape. I want to convert this array to N-dimensional, with dimensions 0 through M kept the same and dimensions M through N set to 1.
I can almost accomplish this behavior by copying the array using np.array and supplying the ndmin argument. However, this places the extra axes in the 'first' rather than the 'last' positions:
>>> a3d = np.zeros((2,3,4))
>>> a5d = np.array(a3d, ndmin = 5)
>>> a5d.shape
(1, 1, 2, 3, 4) #actual shape
(2, 3, 4, 1, 1) #desired shape
Is there a way to specify where the added dimensions should go? Is there an alternate function I can use here which can result in my desired output?
Obviously, in the example above I could manipulate the array after the fact to put the axes in the order I want, but since the original array could have had anywhere from 0 to 5 dimensions (and I want to keep the original dimensions in their original order), I can't think of a way to do that without a tedious series of checks on the original shape.
I'd use .reshape ...
>>> a3d = a3d.reshape(a3d.shape + (1, 1))
>>> a3d.shape
(2, 3, 4, 1, 1)
If you want to pad up to a certain dimensionality:
>>> a3d = np.zeros((2,3,4))
>>> ndim = 5
>>> padded_shape = (a3d.shape + (1,)*ndim)[:ndim]
>>> new_a3d = a3d.reshape(padded_shape)
>>> new_a3d.shape
(2, 3, 4, 1, 1)
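The same idea can be wrapped in a small function that also handles 0-d inputs. This is only a sketch; the name pad_trailing_dims is hypothetical, and returning inputs that already have ndim or more dimensions unchanged is my own choice here.

```python
import numpy as np

def pad_trailing_dims(array, ndim):
    """Append size-1 axes so the result has at least `ndim` dimensions."""
    array = np.asarray(array)
    extra = max(ndim - array.ndim, 0)  # never remove existing axes
    return array.reshape(array.shape + (1,) * extra)

print(pad_trailing_dims(np.zeros((2, 3, 4)), 5).shape)  # (2, 3, 4, 1, 1)
print(pad_trailing_dims(np.float64(7.0), 3).shape)      # (1, 1, 1)
print(pad_trailing_dims(np.zeros((2, 3)), 2).shape)     # (2, 3)
```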
Just set
a5d = np.array(a3d)
a5d.shape = a3d.shape + (1, 1)
print(a5d.shape)
(2, 3, 4, 1, 1)
since the arrays are of the same physical size