This is a toy model I am trying to implement with TensorFlow. The input is a set of 10 real-number pairs, and the underlying function I want to approximate is f(x1, x2) = 1 / (1 + x1^2 + x2^4), averaged over the 10 pairs (see the data generation below). The implemented model should look like this:
I also need to mention that "Hidden Layer" is the same layer (same parameters) for all X_i.
What I implemented so far:
import tensorflow as tf
import numpy as np
def tf_model():
    # Define the inputs
    inputs = tf.keras.Input(shape=[10, 2])
    # Define common hidden layer
    hidden_layer = tf.keras.layers.Dense(64, activation="relu")(inputs)
    # Propagate and average
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(hidden_layer)
    outputs = tf.keras.layers.Average()(outputs)
    return tf.keras.Model(inputs=inputs, outputs=outputs)

X = np.random.rand(1000, 10, 2) * 100
y = 1 / (1 + X[..., 0]**2 + X[..., 1]**4)
y = np.average(y, axis=1)

model = tf_model()
model.fit(X, y)
What I get from running this:
Traceback (most recent call last):
File "model_test.py", line 21, in <module>
model = tf_model()
File "model_test.py", line 13, in tf_model
outputs = tf.keras.layers.Average()(outputs)
File "/home/redbull/.local/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/redbull/.local/lib/python3.8/site-packages/keras/layers/merge.py", line 88, in build
raise ValueError(
ValueError: A merge layer should be called on a list of inputs. Received: input_shape=(None, 10, 1) (not a list of shapes)
I think the issue is that tf.keras.layers.Average() only works with a list of inputs, not with a single layer/tensor.
Since tf.keras.layers.Average() does not seem to be suitable in this scenario, how can I implement the desired functionality?
You can use tf.reduce_mean as below.
outputs = tf.reduce_mean(outputs, axis=1)
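For completeness, a minimal sketch of the full corrected model (the compile step with a mean-squared-error loss is an addition beyond the answer above, since fit requires a compiled model):

import tensorflow as tf

def tf_model():
    inputs = tf.keras.Input(shape=[10, 2])
    # Dense acts on the last axis, so the same weights are shared
    # across all 10 input pairs (the common hidden layer)
    hidden_layer = tf.keras.layers.Dense(64, activation="relu")(inputs)
    # Per-pair prediction, shape (batch, 10, 1)
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(hidden_layer)
    # Average over the 10 pairs, shape (batch, 1)
    outputs = tf.reduce_mean(outputs, axis=1)
    return tf.keras.Model(inputs=inputs, outputs=outputs)

model = tf_model()
model.compile(optimizer="adam", loss="mse")
model.fit(X, y)  # X, y as generated in the question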
I am working on a model that trains on simulation data and should ideally be able to predict N timesteps forward from a given state in a simulation. I have attempted to model this by feeding the output of the model back into itself N times, where N is a hyperparameter of the model. I have done this in the call method of the tensorflow.keras.Model class.
The relevant code:
def call(self, inputs):
    x = inputs[0]
    outputs = tf.TensorArray(
        dtype=tf.float32, size=0, dynamic_size=True, infer_shape=False
    )
    window = inputs[1]
    # Feed the model's output back into itself `window` times,
    # collecting every intermediate prediction
    for i in tf.range(window):
        x = self.model(x)
        outputs = outputs.write(i, x)
    outputs = tf.transpose(outputs.stack(), [1, 2, 3, 0, 4])
    return outputs
This works, and the model trains, but I want to save the model using the tensorflow.keras.Model.save() function. Trying this leads to the following error:
Traceback (most recent call last):
File "/zhome/22/4/118839/Masters_Thesis/Model_files/Unet.py", line 562, in <module>
model_.save(savepath + "/saved_model/Model")
File "/zhome/22/4/118839/Masters_Thesis/Menv/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/appl/python/3.9.11/lib/python3.9/contextlib.py", line 126, in __exit__
next(self.gen)
File "/zhome/22/4/118839/Masters_Thesis/Model_files/Unet.py", line 485, in call
for i in tf.range(4):
tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: Iterating over a symbolic `tf.Tensor` is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature.
Is there a better way of doing what I'm trying to do? Other threads I have found recommend using tf.map_fn(), but that does not work for me due to the sequential nature of the model. Any help is appreciated!
I was in the middle of training my GAN when a very unexpected error came up. I have no idea how to fix it. The error doesn't come right away; it happens about 2-3 minutes into training. Here is the error:
Traceback (most recent call last):
File "gan.py", line 103, in <module>
train(X_train_dataset,200)
File "gan.py", line 80, in train
train_step(images) # takes images and improves both the generator and the discriminator
File "gan.py", line 91, in train_step
discriminator_loss = get_discriminator_loss(real_output,fake_output)
File "gan.py", line 48, in get_discriminator_loss
return fake_loss+real_loss
File "/home/jake/.local/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1125, in binary_op_wrapper
return func(x, y, name=name)
File "/home/jake/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "/home/jake/.local/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1447, in _add_dispatch
return gen_math_ops.add_v2(x, y, name=name)
File "/home/jake/.local/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 486, in add_v2
_ops.raise_from_not_ok_status(e, name)
File "/home/jake/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 6843, in raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [100] vs. [13] [Op:AddV2]
From what I can tell from this traceback, my error occurs during get_discriminator_loss(), so here is that code:
def get_discriminator_loss(real_predictions, fake_predictions):
    real_predictions = tf.sigmoid(real_predictions)
    fake_predictions = tf.sigmoid(fake_predictions)
    real_loss = tf.losses.binary_crossentropy(tf.ones_like(real_predictions), real_predictions)
    fake_loss = tf.losses.binary_crossentropy(tf.zeros_like(fake_predictions), fake_predictions)
    return fake_loss + real_loss
Does anyone have any ideas? And remember, this is after running successfully for about 2-3 minutes; the error doesn't occur in the first many passes.
I've found the source of my error, but I don't know why it's occurring:
my real loss at one of the passes has only 13 values instead of the normal 100.
How can this be?
Here is my full code.
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import time
import pickle
pickle_in_X = open("X.pickle", "rb")
pickle_in_y = open("y.pickle", "rb")
X = pickle.load(pickle_in_X)
y = pickle.load(pickle_in_y)
y = np.array(y)

# last 70% for training, first 30% for testing
X_train = X[int(len(X)*.3):]
y_train = y[int(len(y)*.3):]
X_test = X[:int(len(X)*.3)]
y_test = y[:int(len(y)*.3)]

X_train = (X_train - 127.5) / 127.5  # scale pixel values to [-1, 1]

BATCH_SIZE = 100
X_train_dataset = tf.data.Dataset.from_tensor_slices(X_train).batch(BATCH_SIZE)
#creates a discriminator model.
#the discriminator will output 0-1, which represents the probability that the image is real
def make_discriminator():
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Conv2D(7, (3,3), padding="same", input_shape=(40,40,1)))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.LeakyReLU())
    model.add(tf.keras.layers.Dense(50, activation="relu"))
    model.add(tf.keras.layers.Dense(1))
    return model
model_discriminator = make_discriminator()
discriminator_optimizer = tf.optimizers.Adam(1e-3)
#real_loss is the amount of error when trying to guess that the real images are in fact real, i.e. loss will be zero if our discriminator guesses that there is a 100% chance that this real image is real
#fake_loss is the amount of error when trying to guess that the fake images are in fact fake, i.e. loss will be zero if our discriminator guesses that there is a 0% chance that this fake image is real
#returns the total loss
def get_discriminator_loss(real_predictions, fake_predictions):
    real_predictions = tf.sigmoid(real_predictions)
    fake_predictions = tf.sigmoid(fake_predictions)
    real_loss = tf.losses.binary_crossentropy(tf.ones_like(real_predictions), real_predictions)
    fake_loss = tf.losses.binary_crossentropy(tf.zeros_like(fake_predictions), fake_predictions)
    return fake_loss + real_loss
#takes an input of random numbers and outputs an image (either a dog or a cat)
def make_generator():
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(10*10*256, input_shape=(100,)))
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.Reshape((10,10,256)))
    model.add(tf.keras.layers.Conv2DTranspose(128, (3,3), padding="same"))
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.Conv2DTranspose(64, (3,3), strides=(2,2), padding="same"))
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.Conv2DTranspose(1, (3,3), strides=(2,2), padding="same"))
    return model
model_generator = make_generator()
#generator gets rewarded when it fools the discriminator
def get_generator_loss(fake_predictions):
    fake_predictions = tf.sigmoid(fake_predictions)
    fake_loss = tf.losses.binary_crossentropy(tf.ones_like(fake_predictions), fake_predictions)
    return fake_loss
generator_optimizer = tf.optimizers.Adam(1e-3)
#training
def train(X_train_dataset, epochs):
    for _ in range(epochs):
        for images in X_train_dataset:
            images = tf.cast(images, tf.dtypes.float32)
            train_step(images)  # takes images and improves both the generator and the discriminator

def train_step(images):
    fake_image_noise = np.random.randn(BATCH_SIZE, 100)  # BATCH_SIZE noise vectors of 100 numbers that will be converted to images
    with tf.GradientTape() as generator_gradient, tf.GradientTape() as discriminator_gradient:
        generated_images = model_generator(fake_image_noise)
        real_output = model_discriminator(images)
        fake_output = model_discriminator(generated_images)
        generator_loss = get_generator_loss(fake_output)
        discriminator_loss = get_discriminator_loss(real_output, fake_output)
    # gradient of each loss with respect to the corresponding trainable variables
    gradients_of_generator = generator_gradient.gradient(generator_loss, model_generator.trainable_variables)
    gradients_of_discriminator = discriminator_gradient.gradient(discriminator_loss, model_discriminator.trainable_variables)
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, model_discriminator.trainable_variables))
    generator_optimizer.apply_gradients(zip(gradients_of_generator, model_generator.trainable_variables))
    print("generator loss: ", np.mean(generator_loss))
    print("discriminator loss: ", np.mean(discriminator_loss))
train(X_train_dataset,200)
model_generator.save('genModel')
model_discriminator.save('discModel')
If the size of your dataset is not a multiple of your batch size, your last batch will have fewer samples than the other batches. Here, fake_image_noise always produces BATCH_SIZE (100) samples, while the last batch of real images contains only 13, hence the Incompatible shapes: [100] vs. [13] error when the two losses are added. To avoid this, you can force a tf.data.Dataset to drop the last batch if it is smaller than the batch size. See the documentation for more information.
tf.data.Dataset.from_tensor_slices(X_train).batch(BATCH_SIZE, drop_remainder=True)
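In context, only the dataset line in the code above needs to change; a minimal sketch:

BATCH_SIZE = 100

# drop_remainder=True discards the final partial batch, so real_output and
# fake_output always have matching first dimensions of size BATCH_SIZE
X_train_dataset = tf.data.Dataset.from_tensor_slices(X_train).batch(
    BATCH_SIZE, drop_remainder=True
)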
I am building a prediction model for sequence data using the Conv1D layer provided by Keras. This is how I did it:
def create_cnn_model():
    input_layer = Input(shape=(500,))
    layer = Conv1D(128, 5, activation="relu")(input_layer)
    layer = MaxPooling1D(pool_size=2)(layer)
    layer = Flatten()(layer)
    layer = Dense(128, activation='relu')(layer)
    output_layer = Dense(10, activation='softmax')(layer)
    classifier = Model(input_layer, output_layer)
    classifier.summary()
    classifier.compile(optimizer=optimizers.Adam(), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return classifier
However, I am facing the following error:
Traceback (most recent call last):
File "train.py", line 71, in <module>
classifier = create_cnn_model()
File "train.py", line 60, in create_cnn_model
layer = Conv1D(128,5, activation="relu")(input_layer)
File "C:\Python368\lib\site-packages\keras\backend\tensorflow_backend.py", line 75, in symbolic_fn
_wrapper
return func(*args, **kwargs)
File "C:\Python368\lib\site-packages\keras\engine\base_layer.py", line 446, in __call__
self.assert_input_compatibility(inputs)
File "C:\Python368\lib\site-packages\keras\engine\base_layer.py", line 342, in assert_input_compat
ibility
str(K.ndim(x)))
ValueError: Input 0 is incompatible with layer conv1d_1: expected ndim=3, found ndim=2
I think the input_shape in the first layer is not set up right. How do I set it up?
Right, conv layers need three-dimensional input.
I am assuming you have a univariate time series with 500 samples.
You need to write a function to split the time series into steps; see the sketch below.
For example:
x                       y
[t-n, ..., t-2, t-1]    t
So you are basically using the last n values to predict the next value in your series.
Then your input shape will be [len(x), n, 1].
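A minimal windowing sketch under that assumption (the function name make_windows and the choice n = 20 are illustrative, not from the original answer):

import numpy as np

def make_windows(series, n):
    # Slide a window of length n over the series; each window is one
    # input sample x, and the value right after it is the target y
    x, y = [], []
    for i in range(len(series) - n):
        x.append(series[i:i + n])
        y.append(series[i + n])
    x = np.array(x).reshape(-1, n, 1)  # shape [len(x), n, 1], as described above
    return x, np.array(y)

series = np.random.rand(500)  # stand-in for the 500-sample series
x, y = make_windows(series, n=20)
print(x.shape, y.shape)  # (480, 20, 1) (480,)

With this n, the Input layer in the question would become Input(shape=(20, 1)) instead of Input(shape=(500,)).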
I am trying to see what my data looks like after going through a convolutional layer in my Keras model. I am using the Theano backend. The code I have has been cobbled together from the Keras GitHub:
def get_layer0_outputs(model, test_data):
    output = model.layers[0].output
    inputs = [K.learning_phase()] + model.inputs
    func = K.function(inputs, [output])
    return func([0] + [test_data])
What I'm trying to do here is compile a function for the first layer (a Conv2D layer) in my network. The test_data argument is an np.ndarray. My model is loaded correctly, and I have already trained it with decent accuracy.
However, when I call this function, I get a cryptic stacktrace:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/theano/compile/function_module.py", line 884, in __call__
self.fn() if output_subset is None else\
File "/usr/local/lib/python3.5/dist-packages/theano/gof/op.py", line 872, in rval
r = p(n, [x[0] for x in i], o)
File "/usr/local/lib/python3.5/dist-packages/theano/tensor/nnet/abstract_conv.py", line 1626, in perform
conv_out = self.conv(img, kern, mode="valid", dilation=self.filter_dilation)
File "/usr/local/lib/python3.5/dist-packages/theano/tensor/nnet/abstract_conv.py", line 1531, in conv
dilated_kern[n, im0, ...],
IndexError: index 1 is out of bounds for axis 1 with size 1
What does this mean? Am I calling my function incorrectly?
Your function works for me using the following model:
import numpy as np
from keras.layers import Input, Conv2D
from keras.models import Model
from keras import backend as K

a = Input(shape=(224, 224, 3))
b = Conv2D(8, 3, strides=(2, 2))(a)
model = Model(inputs=a, outputs=b)
model.compile(optimizer='sgd', loss='mse')

def get_layer0_outputs(model, test_data):
    output = model.layers[0].output
    inputs = [K.learning_phase()] + model.inputs
    func = K.function(inputs, [output])
    return func([0] + [test_data])

print(get_layer0_outputs(model, np.zeros((1, 224, 224, 3)))[0].shape)
Note that layer 0 is an Input layer, not a Conv2D, but the code also works for layer 1. I'm using the TensorFlow backend, so I don't know whether the difference is your model or the Theano backend.
I have a network implemented in Keras, with Theano as the backend, which currently initialises the convolution filters with random weights:
# Combine and reshape for convolution
seq = concat(embeddings)
cshape = (config.window_size, sum(f.output_dim for f in features))
seq = Reshape((1,) + cshape)(seq)

# Convolutions
conv_outputs = []
for filter_size, filter_num in zip(config.filter_sizes, config.filter_nums):
    conv = Convolution2D(filter_num, filter_size, cshape[1], activation='relu')(seq)
    cout = Flatten()(conv)
    conv_outputs.append(cout)
seq = concat(conv_outputs)
But now I want to be able to load in previously generated weights which were stored as numpy arrays.
This is the altered code which attempts to read in the arrays:
# Combine and reshape for convolution
seq = concat(embeddings)
cshape = (config.window_size, sum(f.output_dim for f in features))
seq = Reshape((1,) + cshape)(seq)

# Convolutions
conv_outputs = []
for filter_size, filter_num in zip(config.filter_sizes, config.filter_nums):
    filters = np.load('path/to/filters/size-%d.npy' % filter_size)
    conv = Convolution2D(filter_num, filter_size, cshape[1], weights=filters)(seq)
    cout = Flatten()(conv)
    conv_outputs.append(cout)
seq = concat(conv_outputs)
When I try to run the augmented script, I run into the following error:
Traceback (most recent call last):
File "conv.py", line 78, in <module>
weights=filters)(seq)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 458, in __call__
self.build(input_shapes[0])
File "/usr/local/lib/python2.7/dist-packages/keras/layers/convolutional.py", line 324, in build
self.set_weights(self.initial_weights)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 846, in set_weights
' weights. Provided weights: ' + str(weights))
Exception: You called `set_weights(weights)` on layer "convolution2d_1" with a weight list of length 3, but the layer was expecting 2 weights.
I've tried searching online, but I can't figure out why Keras is throwing this error.