Why isn't Tensorflow/Keras Flatten layer flattening my array?

Why isn't Tensorflow/Keras Flatten layer flattening my array? - python

I am trying to use the tensorflow.keras.layers.Flatten layer outside of a model to flatten a 4x4 tensor. I can't figure out why the Flatten layer isn't actually flattening my array.
Here is my code:
import tensorflow as tf
import numpy as np
flayer = tf.keras.layers.Flatten()
X = tf.constant(np.random.random((4,4)),dtype=tf.float32)
Xf = flatten_layer(X)
print(Xf)
and print(Xf) shows
tf.Tensor(
[[0.9866459 0.52488756 0.86211777 0.06254051]
[0.32552275 0.23201537 0.8646714 0.80754006]
[0.55823076 0.51929855 0.538077 0.4111973 ]
[0.95845264 0.14468837 0.30223057 0.09648433]], shape=(4, 4), dtype=float32)
Why doesn't my flatten layer output a 16x1 tensor?

That's because the Flatten() layer assumes that the first dimension is the number of samples, so it returns 4 flattened rows. You have 4 observations, and 1D input for each of these already. It would behave differently if you had data with shape (32, 28, 28, 1), for example, which has a higher dimensionality for each row.
import tensorflow as tf
import numpy as np
flayer = tf.keras.layers.Flatten()
X = tf.constant(np.random.random((32, 28, 28, 1)),dtype=tf.float32)
Xf = flayer(X)
print(Xf.shape)
(32, 784)
If you meant to flatten one observation with shape (4, 4), you should add a batch dimension for it to work:
X = tf.constant(np.random.random((1, 4, 4)),dtype=tf.float32)
Xf = flayer(X)
print(Xf.shape)
(1, 16)

Related

Pooling for 1D tensor

I am looking for a way to reduce the length of a 1D tensor by applying a pooling operation. How can I do it? If I apply MaxPool1d, I get the error max_pool1d() input tensor must have 2 or 3 dimensions but got 1.
Here is my code:
import numpy as np
import torch
A = np.random.rand(768)
m = nn.MaxPool1d(4,4)
A_tensor = torch.from_numpy(A)
output = m(A_tensor)

Your initialization is fine, you've defined the first two parameters of nn.MaxPool1d: kernel_size and stride. For one-dimensional max-pooling both should be integers, not tuples.
The issue is with your input, it should be two-dimensional (the batch axis is missing):
>>> m = nn.MaxPool1d(4, 4)
>>> A_tensor = torch.rand(1, 768)
Then inference will result in:
>>> output = m(A_tensor)
>>> output.shape
torch.Size([1, 192])

I think you meant the following instead:
m = nn.MaxPool1d((4,), 4)
As mentioned in the docs, the arguments are:
torch.nn.MaxPool1d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
As you can see, it's one kernel_size, it's not something like kernel_size1 kernel_size2. Instead it's just only kernel_size

For posterity: the solution is to reshape the tensor using A_tensor.reshape(768,1).

Keras expected embedding_13_input to have 2 dimensions, but got array with shape (20, 7, 12)

I know this question has been asked a bunch of times but I just can't resolve this error using any of the answers at SO. I am trying to build embeddings and my data is shaped: (20, 7, 12) i.e. 20 training samples, that have 7 words each, with one-hot encoded to 12 dimensions.
When I fit my model using the below specs, I get the error:
Error when checking input: expected embedding_24_input to have 2
dimensions, but got array with shape (20, 7, 12)
embedding_dims = 10
model = Sequential()
model.add(Embedding(12, embedding_dims,input_length=7))
I then tried to Flatten before Embedding, but that failed complaining that "input_length" is 7, but received input has shape (None, 84)". I then changed the input_length on the embedding layer to match that but no luck with that either:
Error when checking target: expected embedding_26 to have 3
dimensions, but got array with shape (20, 12)
model = Sequential()
model.add(Flatten())
model.add(Embedding(12, embedding_dims,input_length=84))
I would really appreciate any help with some explanation, please!

Embedding layer don't expect the one hot encoding in input data, you should using function like numpy.argmax or tf.argmax to convert data to integer matrix of size (batch, input_length) before feed data in Embedding layer.
Following example working without any error:
import tensorflow as tf
import numpy as np
embedding_dims = 10
batch_size = 20
vocabulary_size = 12
# The model will take as input an integer matrix of size (batch, input_length),
# and the largest integer (i.e. word index) in the input should be no larger than 11 (vocabulary size - 1)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(vocabulary_size, embedding_dims, input_length=7))
model.compile('rmsprop', 'mse')
raw_input_array = np.random.randint(2, size=(batch_size, 7, vocabulary_size))
input_array = np.argmax(raw_input_array, axis=-1)
output_array = model.predict(input_array)
print("One hot encodeing data shape: {}\ninput integer matrix shape: {}\noutput shape: {}\n".format(raw_input_array.shape, input_array.shape, output_array.shape))
Outputs:
One hot encodeing data shape: (20, 7, 12)
input integer matrix shape: (20, 7)
output shape: (20, 7, 10)

Increase the size of a np.array

I ran a conv1D on a X matrix of shape (2000, 20, 28) for batch size of 2000, 20 time steps and 28 features.
I would like to move forward to a conv2D CNN and increase the dimensionality of my matrix to (2000, 20, 28, 10) having 10 elements for which I can build a (2000, 20, 28) X matrix. Similarly, I want to get a y array of size (2000, 10) i.e. 5 times the y array of size (2000, ) that I used to get for LSTM and Conv1D networks.
The code I used to create the 20 time-steps from input dataX, dataY, was
def LSTM_create_dataset(dataX, dataY, seq_length, step):
Xs, ys = [], []
for i in range(0, len(dataX) - seq_length, step):
v = dataX.iloc[i:(i + seq_length)].values
Xs.append(v)
ys.append(dataY.iloc[i + seq_length])
return np.array(Xs), np.array(ys)
I use this function within the loop I prepared to create the data of my conv2D NN :
for ric in rics:
dataX, dataY = get_model_data(dbInput, dbList, ric, horiz, drop_rows, triggerUp1, triggerLoss, triggerUp2 = 0)
dataX = get_model_cleanXset(dataX, trigger) # Clean X matrix for insufficient data
Xs, ys = LSTM_create_dataset(dataX, dataY, seq_length, step) # slide over seq_length for a 3D matrix
Xconv.append(Xs)
yconv.append(ys)
Xconv.append(Xs)
yconv.append(ys)
I obtain a (10, 2000, 20, 28) Xconv matrix instead of the (2000, 20, 28, 10) targeted output matrix X and a (10, 2000) matrix y instead of the targeted (2000, 10).
I know that I can easily reshape yconv with yconv = np.reshape(yconv, (2000, 5)). But the reshape function for Xconv Xconv = np.reshape(Xconv, (2000, 20, 28, 10)) seems hazardous as I cannot vizualize output and even erroneous.
How could I do it safely (or could you confirm my first attempt ?
Thanks a lot in advance.

If your matrix for y has shape (10, 2000) then you will not be able to shape it to your desired (2000,5). I've demonstrated this below.
# create array of same shape as your original y
arr_1 = np.arange(0,2000*10).reshape(10,2000)
print(arr_1.shape) # returns (10,2000)
arr_1 = arr_1.reshape(2000,5)
This returns the following error message as it is critical that the dimensions of the before and after shapes must match.
ValueError: cannot reshape array of size 20000 into shape (2000,5)
I do not fully understand the statement that you cannot visualise the output - you could manually check that the reshape function had done so correctly if you wished, for your dataset (or a small part of it to confirm that the function is working effectively) using print statements, as below - by comparing the output to your original data and what you expect the data to look like afterwards.
import numpy as np
arr = np.arange(0,2000)
arr = arr.reshape(20,10,10,1) # reshape array to shape (20, 10, 10, 1)
# these statements let you examine the array contents at varying depths
print(arr[0][0][0])
print(arr[0][0])

Incorrect number of dimensions in Keras input

I'm attempting to follow along on what I'm thinking is the 5th or 6th simple introductory tutorial for keras that almost but never quite works.
Stripping everything out, I appear to come down to a problem with the format of my input. I read in an array of images, and extract two types, images of sign language ones and images of sign language zeros. I then set up an array of ones and zeros to correspond to what the images actually are, then make sure of sizes and types.
import numpy as np
from subprocess import check_output
print(check_output(["ls", "../data/keras/"]).decode("utf8"))
## load dataset of images of sign language numbers
x = np.load('../data/keras/npy_dataset/X.npy')
# Get the zeros and ones, construct a list of known values (Y)
X = np.concatenate((x[204:409], x[822:1027] ), axis=0) # from 0 to 204 is zero sign and from 205 to 410 is one sign
Y = np.concatenate((np.zeros(205), np.ones(205)), axis=0).reshape(X.shape[0],1)
# test shape and type
print("X shape: " , X.shape)
print("X class: " , type(X))
print("Y shape: " , Y.shape)
print("Y type: " , type(Y))
This gives me:
X shape: (410, 64, 64)
X class: <class 'numpy.ndarray'>
Y shape: (410, 1)
Y type: <class 'numpy.ndarray'>
which is all good. I then load the relevant bits from Keras, using Tensorflow as the backend and try to construct a classifier.
# get the relevant keras bits.
from keras.models import Sequential
from keras.layers import Convolution2D
# construct a classifier
classifier = Sequential() # initialize neural network
classifier.add(Convolution2D(32, (3, 3), input_shape=(410, 64, 64), activation="relu", data_format="channels_last"))
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
classifier.fit(X, Y, batch_size=32, epochs=10, verbose=1)
This results in:
ValueError: Error when checking input: expected conv2d_1_input to have 4 dimensions, but got array with shape (410, 64, 64)
This SO question, I think, suggests that my input shape needs to be altered to have a 4th dimension added to it - though it also says it's the output shape that needs to altered, I haven't been able to find anywhere to specify an output shape, so I'm assuming it is meant that I should alter the input shape to input_shape=(1, 64, 64, 1).
If I change my input shape however, then I immeadiately get this:
ValueError: Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=5
Which this github issue suggests is because I no longer need to specify the number of samples. So I'm left with the situation of using one input shape and getting one error, or changing it and getting another error.
Reading this and this made me think I might need to reshape my data to include information about the channels in X, but if I add in
X = X.reshape(X.shape[0], 64, 64, 1)
print(X.shape)
Then I get
ValueError: Error when checking target: expected conv2d_1 to have 4 dimensions, but got array with shape (410, 1)
If I change the reshape to anything else, i.e.
X = X.reshape(X.shape[0], 64, 64, 2)
Then I get a message saying it's unable to reshape the data, so I'm obviously doing something wrong with that, if that is, indeed, the problem.
I have read the suggested Conv2d docs which shed exactly zero light on the matter for me. Anyone else able to?

At first I used the following data sets (similar to your case):
import numpy as np
import keras
X = np.random.randint(256, size=(410, 64, 64))
Y = np.random.randint(10, size=(410, 1))
x_train = X[:, :, :, np.newaxis]
y_train = keras.utils.to_categorical(Y, num_classes=10)
And then modified your code as follows to work:
from keras.models import Sequential
from keras.layers import Convolution2D, Flatten, Dense
classifier = Sequential() # initialize neural network
classifier.add(Convolution2D(32, (3, 3), input_shape=(64, 64, 1), activation="relu", data_format="channels_last"))
classifier.add(Flatten())
classifier.add(Dense(10, activation='softmax'))
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
classifier.fit(x_train, y_train, batch_size=32, epochs=10, verbose=1)
Changed the shape of X from 410 x 64 x 64 to 410 x 64 x 64 x 1 (with channel 1).
input_shape be the shape of a sample data, that is, 64 x 64 x 1.
Changed the shape of Y using keras.utils.to_categorical() (one-hot encoding with num_classes=10).
Before compiling, Flatten() and Dense() were applied because you want categorical_crossentropy.

Simple network for arbitrary shape input

I am trying to create an autoencoder in Keras with Tensorflow backend. I followed this tutorial in order to make my own. Input to the network is kind of arbitrary i.e. each sample is a 2d array with fixed number of columns (12 in this case) but rows range between 4 and 24.
What I have tried so far is:
# Generating random data
myTraces = []
for i in range(100):
num_events = random.randint(4, 24)
traceTmp = np.random.randint(2, size=(num_events, 12))
myTraces.append(traceTmp)
myTraces = np.array(myTraces) # (read Note down below)
and here is my sample model
input = Input(shape=(None, 12))
x = Conv1D(64, 3, padding='same', activation='relu')(input)
x = MaxPool1D(strides=2, pool_size=2)(x)
x = Conv1D(128, 3, padding='same', activation='relu')(x)
x = UpSampling1D(2)(x)
x = Conv1D(64, 3, padding='same', activation='relu')(x)
x = Conv1D(12, 1, padding='same', activation='relu')(x)
model = Model(input, x)
model.compile(optimizer='adadelta', loss='binary_crossentropy')
model.fit(myTraces, myTraces, epochs=50, batch_size=10, shuffle=True, validation_data=(myTraces, myTraces))
NOTE: As per Keras Doc, it says that input should be a numpy array, if I do so I get following error:
ValueError: Error when checking input: expected input_1 to have 3 dimensions, but got array with shape (100, 1)
And if I dont convert it in to numpy array and let it be a list of numpy arrays I get following error:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 100 arrays: [array([[0, 1, 0, 0 ...
I don't know what I am doing wrong here. Also I am kind of new to Keras. I would really appreciate any help regarding this.

Numpy does not know how to handle a list of arrays with varying row sizes (see this answer). When you call np.array with traceTmp, it will return a list of arrays, not a 3D array (An array with shape (100, 1) means a list of 100 arrays).
Keras will need a homogeneous array as well, meaning all input arrays should have the same shape.
What you can do is pad the arrays with zeroes such that they all have the shape (24,12): then np.array can return a 3-dimensional array and the keras input layer does not complain.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Why isn't Tensorflow/Keras Flatten layer flattening my array? - python

Related

Pooling for 1D tensor

Keras expected embedding_13_input to have 2 dimensions, but got array with shape (20, 7, 12)

Increase the size of a np.array

Incorrect number of dimensions in Keras input

Simple network for arbitrary shape input

Categories

Resources