numpy pad with zeros creates 2d array instead of desired 1d - python

I am trying to pad a 1d numpy array with zeros.
Here is my code:
v = np.random.rand(100, 1)
pad_size = 100
v = np.pad(v, (pad_size, 0), 'constant')
The result is a 200x101 array whose last column is [0,0,0,... <v>] (100 leading zeros)
and whose first 100 columns are all zeros.
How do I get my desired array
[0,0,0,..0,<v>]
of size (len(v)+pad_size, 1)?

The pad output is 2D because the pad input was 2D. You made a 2D array with rand for some reason:
v = np.random.rand(100, 1)
If you wanted a 1D array, you should have made a 1D array:
v = np.random.rand(100)
If you wanted a 1-column 2D array, then you're using pad incorrectly. The second argument should be ((100, 0), (0, 0)): padding 100 elements before in the first axis, 0 elements after in the first axis, 0 elements before in the second axis, 0 elements after in the second axis:
v = np.random.rand(100, 1)
pad_size = 100
v = np.pad(v, ((pad_size, 0), (0, 0)), 'constant')
For a 1-row 2D array, you would need to adjust both the rand call and the pad call:
v = np.random.rand(1, 100)
pad_size = 100
v = np.pad(v, ((0, 0), (pad_size, 0)), 'constant')
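As a quick check of the cases above, here is a minimal sketch that prints the resulting shapes. The variable names `v1`, `col`, `row` and `bad` are only illustrative and do not come from the question.

```python
import numpy as np

pad_size = 100

# 1D input: a single (before, after) pair pads the only axis
v1 = np.pad(np.random.rand(100), (pad_size, 0), 'constant')

# 1-column 2D input: pad only the first axis
col = np.pad(np.random.rand(100, 1), ((pad_size, 0), (0, 0)), 'constant')

# 1-row 2D input: pad only the second axis
row = np.pad(np.random.rand(1, 100), ((0, 0), (pad_size, 0)), 'constant')

# What the question did: the single pair is applied to every axis
bad = np.pad(np.random.rand(100, 1), (pad_size, 0), 'constant')

print(v1.shape, col.shape, row.shape, bad.shape)
```

The last shape reproduces the 200x101 result from the question, because a single (before, after) pair is broadcast to all axes of a 2D input.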

Maybe this is what you want:
np.vstack((np.zeros((pad_size, 1)), v))
np.concatenate((np.zeros((pad_size, 1)), v), axis=0)

Related

Concatenate Numpy arrays of different shape

I would like to create a numpy array by concatenating two or more numpy arrays with shape (1, x, 1) where x is variable.
Here is the problem in detail.
x1 = #numpy array with shape (x,)
x2 = #numpy array with shape (y,)
#create batch
x1 = np.expand_dims(x1, 0) #shape (1, x)
x2 = np.expand_dims(x2, 0) #shape (1, y)
#add channel dimension
x1 = np.expand_dims(x1, -1) #shape (1, x, 1)
x2 = np.expand_dims(x2, -1) #shape (1, y, 1)
#merge the two arrays
x = np.concatenate((x1, x2), axis=0)
#expected shape (2, ??, 1)
Note the expected shape (2, ??, 1). I am wondering if what I am trying to do is doable.
Executing this code raises a ValueError:
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 138241 and the array at index 1 has size 104321
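A regular NumPy array has to be rectangular, so a shape like (2, ??, 1) with two different lengths along axis 1 cannot exist. One possible workaround, sketched here under the assumption that zero-padding the shorter array at its end is acceptable for the downstream use, is to pad both arrays to the common maximum length before concatenating. The lengths 5 and 3 and the helper `to_batch_item` are placeholders invented for this example.

```python
import numpy as np

x1 = np.arange(5, dtype=float)  # placeholder data, shape (5,)
x2 = np.arange(3, dtype=float)  # placeholder data, shape (3,)

max_len = max(len(x1), len(x2))

def to_batch_item(a, length):
    """Zero-pad a 1D array at the end, then add batch and channel axes."""
    padded = np.pad(a, (0, length - len(a)), 'constant')
    return padded[None, :, None]  # shape (1, length, 1)

x = np.concatenate(
    [to_batch_item(x1, max_len), to_batch_item(x2, max_len)], axis=0
)
print(x.shape)  # (2, 5, 1)
```

If the original lengths matter later, they would need to be stored separately, since the padding zeros are indistinguishable from real zeros in the data.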

padding a input vector, a 4-D matrix, using numpy for a convolutional neural network (CNN)

This is the entire code related to my question. You should be able to see the plots it creates just by pasting it into your IDE and running it.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)
x = np.random.randn(4, 3, 3, 2)
x_pad = np.pad(x, ((0,0), (2, 2), (2, 2), (0,0)),
               mode='constant', constant_values=(0,0))
print("x.shape =\n", x.shape)
print("x_pad.shape =\n", x_pad.shape)
print("x[1,1] =\n", x[1,1])
print("x_pad[1,1] =\n", x_pad[1,1])
fig, axarr = plt.subplots(1, 2)
axarr[0].set_title('x')
axarr[0].imshow(x[0,:,:,0])
axarr[1].set_title('x_pad')
axarr[1].imshow(x_pad[0,:,:,0])
Specifically, my question is related to these two lines of code:
x = np.random.randn(4, 3, 3, 2)
x_pad = np.pad(x, ((0,0), (2, 2), (2, 2), (0,0)), mode='constant', constant_values = (0,0))
I want to pad the 2nd and 3rd dimensions of x, that is, x.shape[1] and x.shape[2], which both have the value 3. In the problem I am solving, x.shape[0] and x.shape[3], which are 4 and 2 respectively, represent something else: x.shape[0] is the number of such 3*3 matrices and x.shape[3] is the number of channels.
My question is about how Python represents this information and how we interpret it. Are these the same?
The statement x = np.random.randn(4, 3, 3, 2) created a matrix of 4 rows by 3 columns, and each element of this 4*3 matrix is a 3-row by 2-column matrix. That is how Python is representing x_pad. Is this understanding correct?
If so, then in the np.pad statement we are padding the number of columns in the outer matrix (which is 3 in the 4*3). We are also padding the number of rows, which is 3, in the "3*2" (that is, the number of rows in the inner matrix).
Weren't the 3, 3 in (4, 3, 3, 2) supposed to be part of just one matrix, and not the columns of the outer matrix and the rows of the inner matrix? I am having trouble visualizing this. Can someone please clarify? Thank you!
These lines:
x = np.random.randn(4, 3, 3, 2)
x_pad = np.pad(x, ((0,0), (2, 2), (2, 2), (0,0)), mode='constant', constant_values = (0,0))
are equivalent to:
x = np.random.randn(4, 3, 3, 2)
x_pad = np.zeros((4, 3+2+2, 3+2+2, 2))
x_pad[:, 2:-2, 2:-2, :] = x
You could interpret a 4-D array as a 2-D array of 2-D arrays if that fits whatever this data represents for you, but numpy internally stores the data in a flat 1D buffer. For a C-contiguous array of shape (n0, n1, n2, n3), x[i,j,k,l] points to data[l+n3*(k + n2*(j + n1*i))], where n1, n2, n3 are the lengths of axes 1, 2 and 3.
Visualizing 4-D (and higher) arrays is very difficult for humans. You just have to keep track of the indices for the four axes when you deal with such arrays.
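Both claims in this answer can be checked directly. The sketch below assumes the default C (row-major) memory order; the names `manual`, `flat` and `offset`, and the sample index (1, 2, 0, 1), are chosen only for illustration.

```python
import numpy as np

np.random.seed(1)
x = np.random.randn(4, 3, 3, 2)
x_pad = np.pad(x, ((0, 0), (2, 2), (2, 2), (0, 0)),
               mode='constant', constant_values=0)

# Manual equivalent: allocate zeros, then copy x into the middle
manual = np.zeros((4, 3 + 2 + 2, 3 + 2 + 2, 2))
manual[:, 2:-2, 2:-2, :] = x
same = np.array_equal(x_pad, manual)

# Flat-index formula for a C-ordered array of shape (n0, n1, n2, n3)
n0, n1, n2, n3 = x.shape
i, j, k, l = 1, 2, 0, 1
flat = x.ravel()  # 1D view of the data in C order
offset = l + n3 * (k + n2 * (j + n1 * i))

print(x_pad.shape, same, flat[offset] == x[i, j, k, l])
```

Only axes 1 and 2 grow (from 3 to 7), so the padding applies to each 3*3 matrix as a whole, for every example and every channel.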

Indexing a vector with a matrix of indices with numpy, similar to MATLAB

I want to pull out a matrix filled with the values from a vector, indexed with a matrix of indices,
i.e. output(i, j) = vector(indices(i, j))
In Matlab, this can be achieved with output = vector(indices).
In Python/numpy I have the following loop for this purpose but I was wondering if there was a more efficient way to do it:
idx = np.random.randint(0, 100, (25, 10))
data = np.random.random(100)
output = np.empty((np.size(idx, 0), np.size(idx, 1)))
for i in range(0, np.size(idx, 0)):
    output[i, :] = np.squeeze(data[idx[i, :]])
Many thanks
In [547]: idx = np.random.randint(0, 100, (25, 10))
     ...: data = np.random.random(100)
     ...: output = np.empty((np.size(idx, 0), np.size(idx, 1)))
     ...: for i in range(0, np.size(idx, 0)):
     ...:     output[i, :] = np.squeeze(data[idx[i, :]])
In [553]: idx.shape
Out[553]: (25, 10)
In [554]: output.shape
Out[554]: (25, 10)
Simply index; there is no need to iterate:
In [555]: np.allclose(output, data[idx])
Out[555]: True
There are differences between MATLAB and numpy when indexing with two arrays, one for each dimension. To put it simply, in MATLAB it's easier to index a block, while in numpy indexing a diagonal is more direct. But that's not relevant here.
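To illustrate the block versus diagonal remark, here is a small sketch on a made-up 3x4 array (`m`, `rows` and `cols` are illustrative names). Indexing with two index arrays pairs them element by element, while `np.ix_` builds the block that MATLAB would return for `m(rows, cols)`.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)
rows = np.array([0, 2])
cols = np.array([1, 3])

# Paired ("diagonal") selection: elements (0, 1) and (2, 3)
pairs = m[rows, cols]

# Block selection: every combination of the given rows and columns
block = m[np.ix_(rows, cols)]

print(pairs)  # [ 1 11]
print(block)  # [[ 1  3]
              #  [ 9 11]]
```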
output = vector[x, y]
where x and y are the coordinates that you choose.
Similarly, if you want to specify an interval:
output = vector[x1:x2, y1:y2]
Just use [] instead of ().

NumPy: Concatenating 1D array to 3D array

Suppose I have a 5x10x3 array, which I interpret as 5 'sub-arrays', each consisting of 10 rows and 3 columns. I also have a separate 1D array of length 5, which I call b.
I am trying to insert a new column into each sub-array, where the column inserted into the ith (i=0,1,2,3,4) sub-array is a 10x1 vector where each element is equal to b[i].
For example:
import numpy as np
np.random.seed(777)
A = np.random.rand(5,10,3)
b = np.array([2,4,6,8,10])
A[0] should look like:
A[1] should look like:
And similarly for the other 'sub-arrays'.
(Notice b[0]=2 and b[1]=4)
What about this?
# Make an array B with the same number of dimensions as A
B = np.tile(b, (1, 10, 1)).transpose(2, 1, 0) # shape: (5, 10, 1)
# Concatenate both
np.concatenate([A, B], axis=-1) # shape: (5, 10, 4)
One method would be np.pad:
np.pad(A, ((0,0),(0,0),(0,1)), 'constant', constant_values=[[[],[]],[[],[]],[[],b[:, None,None]]])
# array([[[9.36513084e-01, 5.33199169e-01, 1.66763960e-02, 2.00000000e+00],
# [9.79060284e-02, 2.17614285e-02, 4.72452812e-01, 2.00000000e+00],
# etc.
Or (more typing but probably faster):
i,j,k = A.shape
res = np.empty((i,j,k+1), np.result_type(A, b))
res[...,:-1] = A
res[...,-1] = b[:, None]
Or dstack after broadcast_to:
np.dstack([A, np.broadcast_to(b[:, None], A.shape[:2])])
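As a sanity check, the sketch below builds the result in two of the ways suggested above and confirms that they agree. It uses `np.broadcast_to` to create the extra column instead of `np.tile`, which is a swapped-in variation rather than any answerer's exact code; `res_cat` and `res_fill` are illustrative names.

```python
import numpy as np

np.random.seed(777)
A = np.random.rand(5, 10, 3)
b = np.array([2, 4, 6, 8, 10])

# Variant 1: broadcast b to shape (5, 10, 1), then concatenate on the last axis
B = np.broadcast_to(b[:, None, None], (5, 10, 1))
res_cat = np.concatenate([A, B], axis=-1)

# Variant 2: preallocate the result and fill it in two steps
res_fill = np.empty((5, 10, 4), np.result_type(A, b))
res_fill[..., :-1] = A
res_fill[..., -1] = b[:, None]  # (5, 1) broadcasts over the 10 rows

print(res_cat.shape, np.array_equal(res_cat, res_fill))
print(res_cat[1, :, -1])  # new column of sub-array 1, filled with b[1]
```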

Alter a 3D ndarray at the positions represented by a 2d ndarray

This is my first nontrivial use of numpy, and I'm having some trouble in one spot.
So, I have colors, a (xsize + 2, ysize + 2, 3) ndarray, and newlife, a (xsize + 2, ysize + 2) ndarray of booleans. I want to add a random value between -5 and 5 to all three values in colors at all positions where newlife is true. In other words newlife maps 2D vectors to whether or not I want to add a random value to the color in colors at that position.
I've tried a million variations on this:
colors[np.nonzero(newlife)] += (np.random.random_sample((xsize + 2,ysize + 2, 3)) * 10 - 5)
but I keep getting stuff like
ValueError: operands could not be broadcast together with shapes (589,3) (130,42,3) (589,3)
How do I do this?
I think this does what you want:
# example data
colors = np.random.randint(0, 100, (5,4,3))
newlife = np.random.randint(0, 2, (5,4), bool)
# create values to add, then mask with newlife
to_add = np.random.randint(-5,6, (5,4,3))
to_add[~newlife] = 0
# modify in place
colors += to_add
This changes colors in place and assumes a uint8 dtype. Neither assumption is essential:
import numpy as np
n_x, n_y = 2, 2
colors = np.random.randint(5, 251, (n_x+2, n_y+2, 3), dtype=np.uint8)
mask = np.random.randint(0, 2, (n_x+2, n_y+2), dtype=bool)
n_change = np.count_nonzero(mask)
print(colors)
print(mask)
colors[mask] += np.random.randint(-5, 6, (n_change, 3), dtype=np.int8).view(np.uint8)
print(colors)
The easiest way of understanding this is to look at the shape of colors[mask].
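A minimal sketch of that shape check follows, using a small hand-made mask so the numbers are easy to follow. A signed integer dtype is assumed here to sidestep the uint8 wraparound handled above; `selected` is an illustrative name.

```python
import numpy as np

colors = np.zeros((4, 4, 3), dtype=np.int64)  # assumed signed dtype
mask = np.zeros((4, 4), dtype=bool)
mask[1, 2] = mask[3, 0] = True                # two positions selected

# Boolean indexing with a 2D mask collapses the first two axes:
# one row of 3 color values per True entry
selected = colors[mask]
print(selected.shape)  # (2, 3)

# So the random values must have that same shape, not the full grid shape
colors[mask] += np.random.randint(-5, 6, selected.shape)
```

This is why the attempt in the question failed: the right-hand side had the shape of the whole grid, (130, 42, 3), while the selected positions had shape (589, 3).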
