Using Keras fit_generator gives an error of wrong shape

I am getting an error on fit_generator. My generator returns the following:
yield(row.values, label)
For example, using it:
myg = generate_array()
for i in myg:
    print((i[0].shape))
    print(i)
    break
(9008,)
(array([0.116516, 0.22419 , 0.03373 , ..., 0. , 0. , 0. ]), 0)
But the following throws an exception:
model = Sequential()
model.add(Dense(84, activation='relu', input_dim=9008))
ValueError: Error when checking input: expected dense_1_input to have shape (9008,) but got array with shape (1,)
Any idea?

As suggested by Kota Mori: the data generator needs to yield a batch of data, not a single sample. See e.g.: https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly
Since I want stochastic gradient descent (a batch size of one), the following code fixed the problem:
def generate_array():
    while True:
        X = np.empty((1, 9008))
        y = np.empty((1), dtype=int)
        # Some processing
        X[0] = row
        y[0] = label
        yield(X,y)
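A self-contained sketch of the batch-of-one generator above, in case it helps. The rows and labels arguments are hypothetical stand-ins for the processing step that the answer leaves out, and the random data is made up. Keras reads the first axis as the batch axis, so each yielded X needs shape (1, 9008) rather than (9008,).

```python
import numpy as np

N_FEATURES = 9008  # matches input_dim in the question


def generate_array(rows, labels):
    """Yield (X, y) batches of size one, forever.

    `rows` and `labels` are hypothetical stand-ins for the real data source.
    """
    while True:
        for row, label in zip(rows, labels):
            # Add a leading batch axis: (9008,) -> (1, 9008).
            X = np.asarray(row, dtype=np.float64).reshape(1, N_FEATURES)
            y = np.array([label], dtype=int)
            yield X, y


rows = np.random.rand(3, N_FEATURES)  # made-up samples
labels = [0, 1, 0]                    # made-up labels
X, y = next(generate_array(rows, labels))
print(X.shape, y.shape)  # (1, 9008) (1,)
```

A larger batch would only change the leading dimension of X and y.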

Related

Could I set a part of a tensor untrainable?

It's easy to set a whole tensor untrainable with trainable=False. But could I set only part of a tensor untrainable?
Suppose I have a 2*2 tensor; I want only one element to be untrainable and the other three to be trainable.
Like this (I want the (1,1) element to always be zero, and the other three elements to be updated by the optimizer):
untrainable trainable
trainable trainable
Thanks.

Short answer: you can't.
Longer answer: you can mimic that effect by setting part of the gradient to zero after the computation of the gradient so that part of the variable is never updated.
Here is an example:
import tensorflow as tf
tf.random.set_seed(0)
model = tf.keras.Sequential([tf.keras.layers.Dense(2, activation="sigmoid", input_shape=(2,), name="first"), tf.keras.layers.Dense(1,activation="sigmoid")])
X = tf.random.normal((1000,2))
y = tf.reduce_sum(X, axis=1)
ds = tf.data.Dataset.from_tensor_slices((X,y))
In that example, the first layer has a weight W of the following:
>>> model.get_layer("first").trainable_weights[0]
<tf.Variable 'first/kernel:0' shape=(2, 2) dtype=float32, numpy=
array([[ 0.13573623, -0.68269 ],
[ 0.8938798 , 0.6792033 ]], dtype=float32)>
We then write a custom loop that updates only the first row of that weight W:
loss = tf.losses.MSE
opt = tf.optimizers.SGD(1.) # high learning rate to see the change
for xx,yy in ds.take(1):
    with tf.GradientTape() as tape:
        l = loss(model(xx),yy)
    g = tape.gradient(l,model.get_layer("first").trainable_weights[0])
    gradient_slice = g[:1] # first row
    new_grad = tf.concat([gradient_slice, tf.zeros((1,2), dtype=tf.float32),], axis=0) # replacing the rest with zeros
    opt.apply_gradients(zip([new_grad], [model.get_layer("first").trainable_weights[0]]))
And then, after running that loop, we can inspect the weights again:
model.get_layer("first").trainable_weights[0]
<tf.Variable 'first/kernel:0' shape=(2, 2) dtype=float32, numpy=
array([[-0.08515069, -0.51738167],
[ 0.8938798 , 0.6792033 ]], dtype=float32)>
And only the first row changed.
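Applied to the 2*2 case in the question, the same idea can be sketched with an elementwise mask instead of a row slice. This is only a minimal sketch under assumptions: the variable w, the toy data, and the mask are made up, and it takes a plain SGD step by hand with assign_sub rather than going through an optimizer.

```python
import tensorflow as tf

# Hypothetical 2x2 weight; element [0, 0] starts at zero and should stay there.
w = tf.Variable([[0.0, 0.5], [-0.3, 0.8]])
# 1.0 marks trainable entries, 0.0 marks the frozen one.
mask = tf.constant([[0.0, 1.0], [1.0, 1.0]])

x = tf.constant([[1.0, 2.0]])        # made-up input
target = tf.constant([[1.0, -1.0]])  # made-up target
learning_rate = 0.1

for _ in range(5):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - target))
    grad = tape.gradient(loss, w)
    # Zero the gradient of the frozen entry, then take a plain SGD step.
    w.assign_sub(learning_rate * grad * mask)

print(w.numpy())  # w[0, 0] is still 0.0; the other three entries have moved
```

Optimizers that add weight decay could still move the frozen entry, so that is worth checking for whichever optimizer is actually used.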

How to efficiently assign to a slice of a tensor in TensorFlow

I want to assign some values to slices of an input tensor in one of my models in TensorFlow 2.x (I am using 2.2 but am ready to accept a solution for 2.1).
A non-working template of what I am trying to do is:
import tensorflow as tf
from tensorflow.keras.models import Model
class AddToEven(Model):
    def call(self, inputs):
        outputs = inputs
        outputs[:, ::2] += inputs[:, ::2]
        return outputs
Of course, when building this (AddToEven().build(tf.TensorShape([None, None]))), I get the following error:
TypeError: 'Tensor' object does not support item assignment
I can achieve this simple example via the following:
class AddToEvenScatter(Model):
    def call(self, inputs):
        batch_size = tf.shape(inputs)[0]
        n = tf.shape(inputs)[-1]
        update_indices = tf.range(0, n, delta=2)[:, None]
        scatter_nd_perm = [1, 0]
        inputs_reshaped = tf.transpose(inputs, scatter_nd_perm)
        outputs = tf.tensor_scatter_nd_add(
            inputs_reshaped,
            indices=update_indices,
            updates=inputs_reshaped[::2],
        )
        outputs = tf.transpose(outputs, scatter_nd_perm)
        return outputs
(you can sanity-check with:
model = AddToEvenScatter()
model.build(tf.TensorShape([None, None]))
model(tf.ones([1, 10]))
)
But as you can see it's very complicated to write. And this is only for a static number of updates (here 1) on a 1D (+ batch size) tensor.
What I want to do is a bit more involved and I think writing it with tensor_scatter_nd_add is going to be a nightmare.
A lot of the current QAs on the topic cover the case for variables but not tensors (see e.g. this or this).
It is mentioned here that PyTorch does support this, so I am surprised to see no recent response from any TF members on that topic.
This answer doesn't really help me, because I will need some kind of mask generation which is going to be awful as well.
The question is thus: how can I do slice assignment efficiently (computation-wise, memory-wise and code-wise) w/o tensor_scatter_nd_add? The trick is that I want this to be as dynamic as possible, meaning that the shape of the inputs could be variable.
(For anyone curious I am trying to translate this code in tf).
This question was originally posted in a GitHub issue.

Here is another solution based on binary mask.
"""Solution based on binary mask.
- We just add this mask to inputs, instead of multiplying."""
class AddToEven(tf.keras.Model):
    def __init__(self):
        super(AddToEven, self).__init__()
    def build(self, inputshape):
        self.built = True # Actually nothing to build, because we don't have any variables or weights here.
    #tf.function
    def call(self, inputs):
        w = inputs.get_shape()[-1]
        # 1-d mask generation for w-axis (activate even indices only)
        m_w = tf.range(w) # [0, 1, 2,... w-1]
        m_w = ((m_w%2)==0) # [True, False, True ,...] with dtype=tf.bool
        # Apply 1-d mask to 2-d input
        m_w = tf.expand_dims(m_w, axis=0) # just extend dimension as to be (1, W)
        m_w = tf.cast(m_w, dtype=inputs.dtype) # in advance, we need to convert dtype
        # Here, we just add this (1, W) mask to (H,W) input magically.
        outputs = inputs + m_w # This add operation is allowed in both TF and numpy!
        return tf.reshape(outputs, inputs.get_shape())
Sanity-check here.
# sanity-check as model
model = AddToEven()
model.build(tf.TensorShape([None, None]))
z = model(tf.zeros([2,4]))
print(z)
Result (with TF 2.1) is like this.
tf.Tensor(
[[1. 0. 1. 0.]
[1. 0. 1. 0.]], shape=(2, 4), dtype=float32)
-------- Below is the previous answer --------
You need to create a tf.Variable in the build() method.
It also allows a dynamic size via shape=(None,).
In the code below, I specified the input shape as (None, None).
class AddToEven(tf.keras.Model):
    def __init__(self):
        super(AddToEven, self).__init__()
    def build(self, inputshape):
        self.v = tf.Variable(initial_value=tf.zeros((0,0)), shape=(None, None), trainable=False, dtype=tf.float32)
    #tf.function
    def call(self, inputs):
        self.v.assign(inputs)
        self.v[:, ::2].assign(self.v[:, ::2] + 1)
        return self.v.value()
I tested this code with TF 2.1.0 and TF1.15
# test
add_to_even = AddToEven()
z = add_to_even(tf.zeros((2,4)))
print(z)
Result:
tf.Tensor(
[[1. 0. 1. 0.]
[1. 0. 1. 0.]], shape=(2, 4), dtype=float32)
P.S. There are some other ways, such as using tf.numpy_function(), or generating mask function.
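For the original request (outputs[:, ::2] += inputs[:, ::2] with a width that is only known at run time), the mask idea can be sketched with tf.shape and tf.where. This is a sketch under assumptions rather than a tested drop-in: add_to_even is a made-up name, and it is written as a tf.function with a [None, None] signature only to show that no static shape is needed; the same body should fit in a Model.call.

```python
import tensorflow as tf


@tf.function(input_signature=[tf.TensorSpec([None, None], tf.float32)])
def add_to_even(inputs):
    """Equivalent of `outputs[:, ::2] += inputs[:, ::2]` without item assignment."""
    # tf.shape returns the runtime width, so a static shape of None is fine.
    width = tf.shape(inputs)[-1]
    even = tf.equal(tf.range(width) % 2, 0)  # [True, False, True, ...]
    # The (W,) boolean mask broadcasts against the (batch, W) input.
    return tf.where(even, inputs + inputs, inputs)


out = add_to_even(tf.ones([2, 5]))
print(out.numpy())
# [[2. 1. 2. 1. 2.]
#  [2. 1. 2. 1. 2.]]
```

The mask costs one extra boolean tensor of width W, which is usually small next to the input itself.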

This seems to produce no errors:
import tensorflow as tf
from tensorflow.keras.models import Model
class AddToEven(Model):
    def call(self, inputs):
        outputs = inputs
        outputs = outputs[:, ::2] + 1
        return outputs
# tf.Tensor.__iadd__ does not seem to exist, but tf.Tensor.__add__ does.

Variable batch_size in call function

I am trying to implement an attention network with TensorFlow 2. Thus, for every image, I want to take only some glimpses, i.e. a small part from the image. For this I have implemented a subclass from tensorflow.keras.models.Model, here is a snippet out of it.
class RecurrentAttentionModel(models.Model):
    # ...
    def call(self, inputs):
        l = tf.random.uniform((40,2,), minval=0, maxval=1)
        for _ in range(0, self.glimpses):
            glimpse = tf.image.extract_glimpse(inputs, size=(self.retina_size, self.retina_size), offsets=l, centered=False, normalized=True)
            # some other code...
            # update l to take a glimpse somewhere else
        return result
Now, the code above works and trains perfectly, but my issue is, that I have the hardcoded 40 in it, the batch_size which I have defined in my dataset. I am not able to read/get the batch_size in the call method since the variable "inputs" is of the form Tensor("input_1_77:0", shape=(None, 250, 500, 1), dtype=float32) where the None for the batch_size seems to be expected behavior.
When I just initialize l with the following code (without the batch_size)
l = tf.random.uniform((2,), minval=0, maxval=1)
it throws this error
ValueError: Shape must be rank 2 but is rank 1 for 'recurrent_attention_model_86/ExtractGlimpse' (op: 'ExtractGlimpse') with input shapes: [?,250,500,1], [2], [2]
which I totally understand, but I have no idea how to set the initial values according to the batch_size.

You can extract the batch size dimension dynamically by using tf.shape.
l = tf.random.uniform(tf.stack([tf.shape(inputs)[0], 2]), minval=0, maxval=1)
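A small sketch to check that this works when the batch dimension is None. The function name first_glimpse, the 16x16 image size, and the 4x4 glimpse size are made up for illustration; only the tf.shape pattern comes from the answer.

```python
import tensorflow as tf


@tf.function(input_signature=[tf.TensorSpec([None, 16, 16, 1], tf.float32)])
def first_glimpse(inputs):
    # tf.shape gives the batch size at run time, even though it is None statically.
    batch_size = tf.shape(inputs)[0]
    offsets = tf.random.uniform(tf.stack([batch_size, 2]), minval=0, maxval=1)
    glimpse = tf.image.extract_glimpse(
        inputs, size=(4, 4), offsets=offsets, centered=False, normalized=True)
    return offsets, glimpse


shapes = {}
for n in (3, 7):  # two different batch sizes, same traced function
    offsets, glimpse = first_glimpse(tf.zeros([n, 16, 16, 1]))
    shapes[n] = (tuple(offsets.shape), tuple(glimpse.shape))
    print(n, shapes[n])
```

inputs.shape[0] would be None inside the traced function, which is why tf.shape(inputs)[0] is used instead.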

ValueError: Error when checking input: expected input_1 to have shape (168, 5) but got array with shape (5808, 5)

I'm trying to implement a hybrid LSTM-DNN forecaster with multiple inputs using the code from Hvass-Labs Time Series tutorial #23. Basically I want to forecast day-ahead prices (just a 24 time step into the future for now) of electricity using sequential and non-sequential data. The model I'm using is two sets of inputs feeding an LSTM (for the sequential data) and Dense for the non-sequential data, with their outputs concatenated. It looks like this:
Model diagram: https://imgur.com/a/x15FfIy
Basically whenever I try to fit the model after one epoch it shows this error:
UPDATE:
ValueError: Error when checking input: expected input_1 to have shape (168, 5) but got array with shape (5808, 5)
The changes I have implemented:
# Chop off x_test_scaled into two parts:
x_test1_scaled = x_test_scaled[:,0:5] # shape is (5808, 5)
x_test2_scaled = x_test_scaled[:,5:12] # shape is (5808, 7)
validation_data = [np.expand_dims(x_test1_scaled, axis=0), np.expand_dims(x_test2_scaled, axis=0)], np.expand_dims(y_test_scaled, axis=0)
I'm confused because I do pass the generator as the generator argument of model.fit_generator, and I'm not passing x_test1_scaled, which does have the shape (5808, 5). Edit: (not validation_data)
%%time
model.fit_generator(generator=generator,
                    epochs=10,
                    steps_per_epoch=30,
                    validation_data=validation_data,
                    callbacks=callbacks)
If this helps, this is my model:
# first input model
input_1 = Input(shape=((168,5)))
dense_1 = Dense(50)(input_1)
# second input model
input_2 = Input(shape=((168,7)))
lstm_1 = LSTM(units=64, return_sequences=True, input_shape=(None, 7,))(input_2)
# merge input models
merge = concatenate([dense_1, lstm_1])
output = Dense(num_y_signals, activation='sigmoid')(merge)
model = Model(inputs=[input_1, input_2], outputs=output)
# summarize layers
print(model.summary())
EDIT: Cleared this problem, replaced with error on top.
Thus far I've managed everything up to actually fitting the model.
Whenever an epoch finishes however it goes into the error:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[[0.4 , 0.44444442, 0. , ..., 0.1734707 ,
0.07272629, 0.07110982],
[0.3904762 , 0.43434343, 0.04347826, ..., 0.1740398 ,
0.07282589, 0.06936309],
...
I have tried the solutions from other stackexchange posts of the same error message. They haven't been successful, but I was able to eventually isolate the problem array to that of the validation_data. I just don't know how to "reshape" it into the required 2 array.
The batch generator, which already yields the two sets of inputs, x_batch_1 and x_batch_2:
def batch_generator(batch_size, sequence_length):
    """
    Generator function for creating random batches of training-data.
    """
    # Infinite loop.
    while True:
        # Allocate a new array for the batch of input-signals.
        x_shape = (batch_size, sequence_length, num_x_signals)
        x_batch = np.zeros(shape=x_shape, dtype=np.float16)
        # Allocate a new array for the batch of output-signals.
        y_shape = (batch_size, sequence_length, num_y_signals)
        y_batch = np.zeros(shape=y_shape, dtype=np.float16)
        # Fill the batch with random sequences of data.
        for i in range(batch_size):
            # Get a random start-index.
            # This points somewhere into the training-data.
            idx = np.random.randint(num_train - sequence_length)
            # Copy the sequences of data starting at this index.
            x_batch[i] = x_train_scaled[idx:idx+sequence_length]
            y_batch[i] = y_train_scaled[idx:idx+sequence_length]
        x_batch_1 = x_batch[ :, :, 0:5]
        x_batch_2 = x_batch[ :, :, 5:12]
        yield ([x_batch_1, x_batch_2], y_batch)
batch_size = 32
sequence_length = 24 * 7
generator = batch_generator(batch_size=batch_size,
                            sequence_length=sequence_length)
Validation set:
validation_data = np.expand_dims(x_test_scaled, axis=0), np.expand_dims(y_test_scaled, axis=0)
And lastly the model fit:
%%time
model.fit_generator(generator=generator,
                    epochs=10,
                    steps_per_epoch=30,
                    validation_data=validation_data,
                    callbacks=callbacks)
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[[0.4 , 0.44444442, 0. , ..., 0.1734707 ,
0.07272629, 0.07110982],
[0.3904762 , 0.43434343, 0.04347826, ..., 0.1740398 ,
0.07282589, 0.06936309],
...
The array is the same one as in validation_data. Also, the error crops up as soon as the first epoch finishes, which strengthens the case for validation_data being the problem.

It's because your model needs two sets of inputs, like x_batch_1 and x_batch_2 in your batch_generator, while your validation_data has only one input array, np.expand_dims(x_test_scaled, axis=0).
You need to make validation_data look like the output of batch_generator, probably [np.expand_dims(x_test1_scaled, axis=0), np.expand_dims(x_test2_scaled, axis=0)], np.expand_dims(y_test_scaled, axis=0).
If that is still unclear, please provide information about x_test1_scaled, such as its shape or how you load it.
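On the follow-up error in the question (expected (168, 5) but got (5808, 5)): np.expand_dims turns the test set into one sample of 5808 time steps, while Input(shape=(168, 5)) expects samples of 168 steps. One possible fix, sketched here with NumPy only, is to cut the test arrays into 168-step windows. This is my reading of the error rather than something the answer states; to_windows is a hypothetical helper, the data is random, and a single output signal is assumed.

```python
import numpy as np

sequence_length = 24 * 7  # 168, as in the question

# Random stand-ins for x_test_scaled (12 input signals) and
# y_test_scaled (one output signal is an assumption).
x_test_scaled = np.random.rand(5808, 12).astype(np.float32)
y_test_scaled = np.random.rand(5808, 1).astype(np.float32)


def to_windows(a, length):
    """Cut a (time, features) array into non-overlapping (n, length, features) windows."""
    n = a.shape[0] // length  # leftover rows at the end are dropped
    return a[:n * length].reshape(n, length, a.shape[1])


x_val = to_windows(x_test_scaled, sequence_length)
y_val = to_windows(y_test_scaled, sequence_length)

# Same structure as what batch_generator yields: ([input_1, input_2], targets).
validation_data = ([x_val[:, :, 0:5], x_val[:, :, 5:12]], y_val)
print(x_val[:, :, 0:5].shape, x_val[:, :, 5:12].shape, y_val.shape)
# (34, 168, 5) (34, 168, 7) (34, 168, 1)
```

The last 96 rows are dropped here because 5808 is not a multiple of 168; overlapping windows would be another option.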

How can I resolve the, "feed a value for placeholder tensor" error?

I'm brand new to Tensorflow and while following through a book I have found that their sample data is too verbose for me to follow what Tensorflow is doing. That being the case I have made my own ultra small csv file instead. After working through several errors I am at the end of my script and can't seem to work through this final error:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'y' with dtype float and shape [?,1]
[[Node: y = Placeholder[dtype=DT_FLOAT, shape=[?,1], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Below is my code, and below that I have included the output of the print statements. Can anyone help me understand this error? Also, I am aware that my mock data will not produce any sensible model; I just want to get it working first before switching to more complicated data. Thanks!
import tensorflow as tf
import numpy as np
import pandas as pd
import tarfile
import os
def load_data():
    return pd.read_csv("datasets/housing/mock.csv")
#load the data
mockData = load_data()
print("mock data:")
print(mockData)
#add the bias
mockDataPlusBias = np.c_[np.ones((3,1)), mockData]
print("Mock data and bias:")
print(mockDataPlusBias)
#create placeholders
X = tf.constant(mockDataPlusBias, dtype=tf.float32, name="X")
y = tf.placeholder(tf.float32, shape=(None,1), name="y")
#for use with matmul
XT = tf.transpose(X)
print("X:")
print(X)
print("XT:")
print(XT)
print("y:")
print(y)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, XT)), XT), y)
with tf.Session() as sess:
    theta_value = theta.eval()
print(theta_value)
And lastly the print statements:
mock data:
col1 col2
0 1 2
1 4 5
2 7 8
Mock data and bias:
[[1. 1. 2.]
[1. 4. 5.]
[1. 7. 8.]]
X:
Tensor("X:0", shape=(3, 3), dtype=float32)
XT:
Tensor("transpose:0", shape=(3, 3), dtype=float32)
y:
Tensor("y:0", shape=(?, 1), dtype=float32)

It seems you are successfully declaring the shape and type of y but never actually giving y a value. For placeholders, you also need to use feed_dict to set the value(s) of y when running the tf.Session.
See example here.
A useful starting point is the official TensorFlow guides.
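A minimal sketch of feeding the placeholder, assuming a TensorFlow 2.x install where the old API lives under tf.compat.v1. The data here is made up (a bias column plus one feature, with y = 1 + x), because in the question's mock data the third column is the sum of the first two, which makes the matrix to invert singular. Note also that the normal equation multiplies XT by X, whereas the question's line multiplies XT by XT.

```python
import numpy as np
import tensorflow as tf

# Made-up, well-conditioned data: bias column plus one feature, y = 1 + x.
X_data = np.array([[1.0, 1.0], [1.0, 4.0], [1.0, 7.0]], dtype=np.float32)
y_data = np.array([[2.0], [5.0], [8.0]], dtype=np.float32)

graph = tf.Graph()
with graph.as_default():
    X = tf.constant(X_data, name="X")
    y = tf.compat.v1.placeholder(tf.float32, shape=(None, 1), name="y")
    XT = tf.transpose(X)
    # Normal equation: theta = (X^T X)^-1 X^T y
    theta = tf.matmul(tf.matmul(tf.linalg.inv(tf.matmul(XT, X)), XT), y)

with tf.compat.v1.Session(graph=graph) as sess:
    # The placeholder has no value of its own; feed_dict supplies one for this run.
    theta_value = sess.run(theta, feed_dict={y: y_data})

print(theta_value)  # approximately [[1.], [1.]]
```

With the question's own code, the equivalent change would be theta.eval(feed_dict={y: ...}) with a (3, 1) array of targets.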
