How do I apply padding to a list of arrays? - python

I have this padding function.
It takes X (a list of numpy arrays of shape (13, n), where n varies between 0 and 99 from array to array) and returns X_new, which should also be a list of numpy arrays, all of shape (13, 99) after padding.
X_new = []
for x in X:
    shp_1 = len(x[1])
    if shp_1 != targetdim:
        X_new.append(np.pad(x[1], (0, targetdim - shp_1), 'constant', constant_values=0))
    else:
        X_new.append(x)
Checking the shapes of the arrays in X_new gives:
(13, 99) #correct dimensions
(13, 99)
(99,) #wrong
(13, 99)
(13, 99)
(13, 99)
(13, 99)
(99,)
(13, 99)
(99,)
(13, 99)
X_new.append(np.pad(x[1], (0, targetdim - shp_1), 'constant', constant_values=0)) works as intended in that it pads row x[1] to length 99 if needed.
The problem is that the function only appends the padded x[1] to X_new; the rest of x is discarded.
The result is that wherever padding is applied, the output shape is (99,) instead of the desired (13, 99).
My question is: how do I resolve this issue with append?
In short, my goal is to reproduce list X as a list of padded arrays. Any alternative methods of achieving this are also welcome.
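One possible approach, offered as a sketch rather than a definitive fix: pass the whole 2-D array to np.pad and give a pad width per axis, so that axis 0 is left alone and only the end of axis 1 is padded. The helper name pad_to_width and the sample list X below are made up for illustration and are not from the question.

```python
import numpy as np

def pad_to_width(arrays, targetdim=99):
    """Zero-pad every (13, n) array on the right so it becomes (13, targetdim)."""
    padded = []
    for x in arrays:
        missing = targetdim - x.shape[1]  # number of columns to add
        # ((0, 0), (0, missing)): nothing on axis 0, `missing` zeros after axis 1
        padded.append(np.pad(x, ((0, 0), (0, missing)),
                             mode="constant", constant_values=0))
    return padded

# Hypothetical input: arrays of ones with different widths
X = [np.ones((13, n)) for n in (99, 40, 1, 7)]
X_new = pad_to_width(X)
shapes = [a.shape for a in X_new]
print(shapes)
```

When an array is already 99 columns wide, missing is 0 and np.pad returns an unchanged copy, so no if/else is needed. An array wider than targetdim would make missing negative and np.pad would raise a ValueError.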

Related

Why can't numpy remove this useless dimension?

No matter what I do to this array:
data = np.mean(np.mat(segment_data), axis=0)
print(data)
print(data.shape)
print(data[0].shape)
print(data[0,:].shape)
print(data.squeeze().shape)
print(data.flatten().shape)
print(data.transpose().shape)
print(data.transpose()[:,0].shape)
The output is still two-dimensional:
[[-0.48134436 13.09216948 10.63232405 10.6977263 11.95639315 13.83434023
13.61501793 8.21932062 8.93592935 26.15871746 58.73205665]]
(1, 11)
(1, 11)
(1, 11)
(1, 11)
(1, 11)
(11, 1)
(11, 1)
What is happening? Why does numpy refuse to give me a 1-dimensional array?
You specifically used numpy.matrix, which refuses to be 1-dimensional. Don't use numpy.matrix! Remove that np.mat call.
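To illustrate the difference, here is a small sketch. The contents of segment_data are invented, since the question does not show them, and np.asmatrix stands in for np.mat because, as far as I know, the np.mat alias is no longer available in newer NumPy releases.

```python
import numpy as np

# Hypothetical stand-in for the question's segment_data: 3 segments, 11 values each
segment_data = np.arange(33, dtype=float).reshape(3, 11)

# A matrix stays 2-D through reductions, indexing, squeeze, flatten, ...
as_matrix = np.mean(np.asmatrix(segment_data), axis=0)

# A plain ndarray drops the reduced axis and becomes 1-D
as_array = np.mean(np.asarray(segment_data), axis=0)

# If a matrix is already in hand, convert it to an ndarray first, then flatten
flat_from_matrix = np.asarray(as_matrix).ravel()

print(as_matrix.shape, as_array.shape, flat_from_matrix.shape)
```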

Padding an input vector, a 4-D matrix, using numpy for a convolutional neural network (CNN)

This is the entire code related to my question. You should be able to see the plots it creates just by pasting it into your IDE and running it.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)
x = np.random.randn(4, 3, 3, 2)
x_pad = np.pad(x, ((0, 0), (2, 2), (2, 2), (0, 0)),
               mode='constant', constant_values=(0, 0))
print ("x.shape =\n", x.shape)
print ("x_pad.shape =\n", x_pad.shape)
print ("x[1,1] =\n", x[1,1])
print ("x_pad[1,1] =\n", x_pad[1,1])
fig, axarr = plt.subplots(1, 2)
axarr[0].set_title('x')
axarr[0].imshow(x[0,:,:,0])
axarr[1].set_title('x_pad')
axarr[1].imshow(x_pad[0,:,:,0])
Specifically, my question is related to these two lines of code:
x = np.random.randn(4, 3, 3, 2)
x_pad = np.pad(x, ((0,0), (2, 2), (2, 2), (0,0)), mode='constant', constant_values = (0,0))
I want to pad the 2nd and 3rd dimensions of x. That is, I want to pad x.shape[1], which has a value of 3, and x.shape[2], which also has a value of 3. In the problem I am solving, x.shape[0] and x.shape[3], which are 4 and 2 respectively, represent something else: x.shape[0] is the number of such 3*3 matrices and x.shape[3] is the number of channels.
My question is about how Python is representing this information and how we are interpreting it. Are these the same?
The statement x = np.random.randn(4, 3, 3, 2) created a matrix of 4 rows by 3 columns, and each element in this 4*3 matrix is a 3-row by 2-column matrix. That is how Python is representing x. Is this understanding correct?
If so, then in the np.pad statement we are padding the number of columns in the outer matrix (the 3 in the 4*3). We are also padding the number of rows in the inner matrix (the 3 in the 3*2).
But the 3, 3 in (4, 3, 3, 2) was supposed to be the shape of just one matrix, not the columns of the outer matrix and the rows of the inner matrix. I am having trouble visualizing this. Can someone please clarify? Thank you!
These lines:
x = np.random.randn(4, 3, 3, 2)
x_pad = np.pad(x, ((0,0), (2, 2), (2, 2), (0,0)), mode='constant', constant_values = (0,0))
are equivalent to:
x = np.random.randn(4, 3, 3, 2)
x_pad = np.zeros((4, 3+2+2, 3+2+2, 2))
x_pad[:, 2:-2, 2:-2, :] = x
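A quick way to convince yourself of this equivalence is to build both arrays and compare them. This is only a sketch; the names manual and same are mine, not from the question.

```python
import numpy as np

np.random.seed(1)
x = np.random.randn(4, 3, 3, 2)

# Pad axes 1 and 2 with two zeros on each side; leave axes 0 and 3 untouched
x_pad = np.pad(x, ((0, 0), (2, 2), (2, 2), (0, 0)),
               mode='constant', constant_values=0)

# The same result built by hand: a zero array with x copied into the middle
manual = np.zeros((4, 3 + 2 + 2, 3 + 2 + 2, 2))
manual[:, 2:-2, 2:-2, :] = x

same = np.array_equal(x_pad, manual)
print(x_pad.shape, same)
```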
You could interpret a 4-D array as being a 2-D array of 2-D arrays if that fits whatever this data represents for you, but numpy internally stores the array as a 1-D block of data (in C order by default), with x[i,j,k,l] pointing to data[l + n3*(k + n2*(j + n1*i))], where n1, n2, n3 are the lengths of axes 1, 2 and 3.
Visualizing 4-D (and higher) arrays is very difficult for humans. You just have to keep track of the indices for the four axes when you deal with such arrays.

python nested for loop zip

I have the following list:
grid = [[0] *50 for n in range(50)]
I want to replace the values in grid (with 1) for each coordinate contained in the list:
area = [(30, 28), (27, 32), (32, 34), (43,23), (43, 2) ...] # Continues on
Is there a simple way to do this?
A simple for loop is what is needed.
for i, j in area:
    grid[i][j] = 1

Why is irfftn(rfftn(x)) not equal to x?

If the trailing dimension of an array x is odd, the transform y = irfftn(rfftn(x)) does not have the same shape as the input array. Is this by design? And if so, what is the motivation? Example code is below.
import numpy as np
shapes = [(10, 10), (11, 11), (10, 11), (11, 10)]
for shape in shapes:
    x = np.random.uniform(0, 1, shape)
    y = np.fft.irfftn(np.fft.rfftn(x))
    if x.shape != y.shape:
        print("expected shape %s but got %s" % (shape, y.shape))
# Output
# expected shape (11, 11) but got (11, 10)
# expected shape (10, 11) but got (10, 10)
You need to pass x.shape as the second parameter.
In your case the code will look like:
import numpy as np
shapes = [(10, 10), (11, 11), (10, 11), (11, 10)]
for shape in shapes:
    x = np.random.uniform(0, 1, shape)
    y = np.fft.irfftn(np.fft.rfftn(x), x.shape)
    if x.shape != y.shape:
        print("expected shape %s but got %s" % (shape, y.shape))
From the docs:
This function computes the inverse of the N-dimensional discrete
Fourier Transform for real input over any number of axes in an
M-dimensional array by means of the Fast Fourier Transform (FFT). In
other words, irfftn(rfftn(a), a.shape) == a to within numerical
accuracy. (The a.shape is necessary like len(a) is for irfft, and for
the same reason.)
The description of the s parameter (where x.shape is passed) from the same docs:
s : sequence of ints, optional Shape (length of each transformed axis)
of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). s is also
the number of input points used along this axis, except for the last
axis, where s[-1]//2+1 points of the input are used. Along any axis,
if the shape indicated by s is smaller than that of the input, the
input is cropped. If it is larger, the input is padded with zeros. If
s is not given, the shape of the input along the axes specified by
axes is used.
https://docs.scipy.org/doc/numpy/reference/generated/numpy.fft.irfftn.html
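The reason the shape is needed can be seen from the forward transform: rfftn keeps only n//2 + 1 entries along the last axis, so last-axis lengths 10 and 11 both produce 6 entries, and irfftn cannot tell them apart without being told. A small sketch follows. The variable names are mine, and I pass axes together with s because, as far as I know, newer NumPy releases prefer that to passing s alone.

```python
import numpy as np

rng = np.random.default_rng(0)
x_even = rng.uniform(0, 1, (10, 10))
x_odd = rng.uniform(0, 1, (10, 11))

f_even = np.fft.rfftn(x_even)
f_odd = np.fft.rfftn(x_odd)
# Both spectra have the same shape: 10 // 2 + 1 == 11 // 2 + 1 == 6
print(f_even.shape, f_odd.shape)

# Without s, irfftn assumes an even last axis of length 2 * (6 - 1) = 10
default_back = np.fft.irfftn(f_odd)

# With the original shape, the odd length is recovered
exact_back = np.fft.irfftn(f_odd, s=x_odd.shape, axes=(0, 1))
print(default_back.shape, exact_back.shape)
```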

Shape of numpy array comparison with empty list

I have some problems understanding how python/numpy is casting array shapes when comparing to an empty list, which, as far as I understand, is an implicit (element-wise) comparison with False.
In the following example the shape decreases by one in the last dimension, if it is not greater than 1.
z = N.zeros((2,2,1))
z == []
>> array([], shape=(2, 2, 0), dtype=bool)
z2 = N.zeros((2,2,2))
z2 ==[]
>> False
If, however, I compare with False directly, I get the expected output.
z = N.zeros((2,2,1))
(z == False).shape
>> (2, 2, 1)
z2 = N.zeros((2,2,2))
(z2 == False).shape
>> (2, 2, 2)
This is ordinary broadcasting at work. When you do
z = N.zeros((2,2,1))
z == []
[] is interpreted as an array of shape (0,), and then the shapes are broadcast against each other:
(2, 2, 1)
vs (0,)
Since (0,) is shorter than (2, 2, 1), it gets expanded, as if the array were copied repeatedly:
(2, 2, 1)
vs (2, 2, 0)
and since there's a 1 in the first shape and the other shape doesn't have a 1 there, the first shape gets "expanded" as if it were copied zero times:
(2, 2, 0)
vs (2, 2, 0)
The comparison thus results in an array of booleans with shape (2, 2, 0).
When z has shape (2, 2, 2):
z2 = N.zeros((2,2,2))
z2 ==[]
broadcasting fails, since a length-2 axis and a length-0 axis can't be broadcast against each other. NumPy reports that it doesn't know how to perform the comparison:
>>> numpy.zeros([2, 2, 2]).__eq__([])
NotImplemented
The list doesn't know how either, so Python falls back on the default comparison by identity, and gets a result of False.
When you compare against False:
z = N.zeros((2,2,1))
(z == False).shape
False gets interpreted as an array of shape () - an empty shape! That gets broadcast out to shape (2, 2, 1), as if copied out to an array full of Falses, so the result has the same shape as z.
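The three cases can be reproduced without any comparison at all by asking NumPy for the broadcast shape. This is a sketch that assumes a NumPy version providing np.broadcast_shapes (1.20 or later, as far as I know); the variable names are mine. Note also that, to my knowledge, recent NumPy releases raise an error for the incompatible (2, 2, 2) vs (0,) comparison instead of returning False, so the z2 == [] result shown in the question may depend on the NumPy version.

```python
import numpy as np

z = np.zeros((2, 2, 1))
empty = np.array([])  # what [] is converted to: shape (0,)

# (2, 2, 1) vs (0,): the length-1 axis stretches to length 0
shape_vs_empty = np.broadcast_shapes(z.shape, empty.shape)

# (2, 2, 1) vs (): a scalar such as False broadcasts to the full shape
shape_vs_scalar = np.broadcast_shapes(z.shape, ())

# (2, 2, 2) vs (0,): a length-2 axis cannot be matched with a length-0 axis
try:
    np.broadcast_shapes((2, 2, 2), empty.shape)
    incompatible = False
except ValueError:
    incompatible = True

result = np.equal(z, empty)  # element-wise comparison with zero elements
print(shape_vs_empty, shape_vs_scalar, incompatible, result.shape)
```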
