I am trying to tidy up my code by moving from the Keras functional API to the subclassing API. The class I came up with so far is below:
# Imports assumed; the original post did not show them.
from tensorflow.keras import Model
from tensorflow.keras.layers import BatchNormalization, Dense, Dropout

class FeedForwardNN(Model):
    def __init__(self, params):
        super().__init__()
        self.params = params
        self.layout = params['layout']
        # Define layers
        self.dense = Dense(units=params['layout'][1],
                           activation=params['activation'],
                           kernel_initializer=params['initializer'])
        self.output_layer = Dense(units=params['layout'][-1],
                                  kernel_initializer=params['initializer'])
        self.dropout = Dropout(params['dropout'])
        self.batch_norm = BatchNormalization()

    def call(self, x):
        for layer in self.layout[1:-1]:
            x = self.dropout(self.dense(x))
            if self.params['batch_norm']:
                x = self.batch_norm(x)
        x = self.output_layer(x)
        return x
Here, layout is a list of the number of neurons in each layer (including the input and output layers).
However, when fitting the model, the following error is raised:
ValueError: Input 0 of layer "dense" is incompatible with the layer: expected axis -1 of input shape to have value 5, but received input with shape (None, 100)
Call arguments received:
• x=tf.Tensor(shape=(None, 5), dtype=float32)
which seems to occur on the line:
x = self.dropout(self.dense(x))
I checked the shape of the training data X that is passed to the fit() method, and it appears to have the right shape i.e. (number of observations, number of predictors).
Does anyone have an idea of where my mistake is?
The problem is that you are using the same self.dense layer over and over again in your for loop:
for layer in self.layout[1:-1]:
    x = self.dropout(self.dense(x))
After the first loop, x has shape (batch, 100). Then in the second loop, instead of passing this x to the second Dense layer (which you don't seem to have created in the first place), you re-pass it to the first Dense layer, which expects shape (batch, 5), causing the error.
You can create a list of Dense layers in __init__, one per hidden layer, as follows:
self.denses = [Dense(units=units,
                     activation=params['activation'],
                     kernel_initializer=params['initializer'])
               for units in self.layout[1:-1]]
and call them in sequence:
for dense_layer in self.denses:
    x = self.dropout(dense_layer(x))
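The shape argument can be checked without Keras. Below is a minimal NumPy sketch in which each Dense layer is stood in for by a plain weight matrix. The layout [5, 100, 100, 1] and the batch size of 8 are hypothetical values chosen for illustration, not taken from the question.

```python
import numpy as np

layout = [5, 100, 100, 1]  # hypothetical: input, two hidden layers, output
rng = np.random.default_rng(0)
x = rng.normal(size=(8, layout[0]))  # batch of 8 samples with 5 predictors

# One shared "layer": its kernel is fixed at (5, 100) after the first use.
shared_kernel = rng.normal(size=(layout[0], layout[1]))
h = x @ shared_kernel  # first pass works: (8, 5) @ (5, 100) -> (8, 100)
try:
    h @ shared_kernel  # second pass: (8, 100) @ (5, 100) is not defined
    reuse_failed = False
except ValueError:
    reuse_failed = True

# One kernel per layer, sized from consecutive layout entries
# (this list also covers the output layer).
kernels = [rng.normal(size=(n_in, n_out))
           for n_in, n_out in zip(layout[:-1], layout[1:])]
out = x
for kernel in kernels:
    out = out @ kernel
```

The second pass through the shared kernel fails for the same reason the Keras layer does: its input width was fixed at 5 by the first call.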
I have written a generator with Keras. Before returning X, y from __getitem__ I double-checked the shapes of the X's and y's and they are correct, but the generator gives a dimension mismatch error and warnings.
(Colab Code to reproduce: https://colab.research.google.com/drive/1bSJm44MMDCWDU8IrG2GXKBvXNHCuY70G?usp=sharing)
My training and validation generators are pretty much the same:
import numpy as np
from tensorflow.keras.utils import Sequence  # import assumed; not shown in the original post

class ValidGenerator(Sequence):
    # num_classes and shuffle are passed at the call site, so they are accepted here;
    # the default values are an assumption.
    def __init__(self, df, batch_size=64, num_classes=None, shuffle=True):
        self.batch_size = batch_size
        self.df = df
        self.indices = self.df.index.tolist()
        self.num_classes = num_classes
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        return int(len(self.indices) // self.batch_size)

    def __getitem__(self, index):
        index = self.index[index * self.batch_size:(index + 1) * self.batch_size]
        batch = [self.indices[k] for k in index]
        X, y = self.__get_data(batch)
        return X, y

    def on_epoch_end(self):
        self.index = np.arange(len(self.indices))
        if self.shuffle == True:
            np.random.shuffle(self.index)

    def __get_data(self, batch):
        # some logic is written here
        # that prepares 3 X features and 3 Y outputs
        X = [input_array_1,input_array_2,input_array_3]
        y = [out_1,out_2,out_3]
        #print(len(X))
        return X, y
I am returning a tuple X, y from __getitem__, with 3 input features and 3 output features each, so the shape of X is (3,32,10,1).
I am using the functional API to build the model (I have things like concatenation and multiple inputs/outputs, which aren't possible with Sequential), with the following structure
When I try to fit the model with the generator using the following code
train_datagen = TrainGenerator(df=train_df, batch_size=32, num_classes=None, shuffle=True)
valid_datagen = ValidGenerator(df=train_df, batch_size=32, num_classes=None, shuffle=True)
model.fit(train_datagen, epochs=2,verbose=1,callbacks=[checkpoint,es])
I get these warnings and errors, which don't go away:
Epoch 1/2
WARNING:tensorflow:Model was constructed with shape (None, 10) for input Tensor("input_1:0", shape=(None, 10), dtype=float32), but it was called on an input with incompatible shape (None, None, None).
WARNING:tensorflow:Model was constructed with shape (None, 10) for input Tensor("input_2:0", shape=(None, 10), dtype=float32), but it was called on an input with incompatible shape (None, None, None).
WARNING:tensorflow:Model was constructed with shape (None, 10) for input Tensor("input_3:0", shape=(None, 10), dtype=float32), but it was called on an input with incompatible shape (None, None, None).
...
...
call
return super(RNN, self).call(inputs, **kwargs)
/home/eduardo/.virtualenvs/kgpu3/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py:975
call
input_spec.assert_input_compatibility(self.input_spec, inputs,
/home/eduardo/.virtualenvs/kgpu3/lib/python3.8/site-packages/tensorflow/python/keras/engine/input_spec.py:176
assert_input_compatibility
raise ValueError('Input ' + str(input_index) + ' of layer ' +
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, None, None, 88]
I have rechecked the whole code and it shouldn't be possible to have an input of shape (None,None,None) as in the warning or the error; my input dimension is (3,32,10,1).
Update
I have also tried writing a plain Python generator function and got exactly the same error.
My generator function
def generate_arrays_from_file(batchsize,df):
    #print(bat)
    inputs = []
    targets = []
    batchcount = 0
    while True:
        df3 = df.loc[np.arange(batchcount*batchsize,(batchcount*batchsize)+batchsize)]
        #Some pre processing
        X = [input_array_1,input_array_2,input_array_3]
        y = [out_1,out_2,out_3]
        yield X,y
        batchcount = batchcount +1
It seems like something is wrong internally with Keras (maybe due to the fact that I am using the functional API).
Update 2
I also tried to output a tuple
X = (input1_X,input2_X,input3_X)
y = (output1_y,output2_y,output3_y)
and also named inputs/outputs, but it doesn't work
X = {"input_1": input1_X, "input_2": input2_X,"input_3": input3_X}
y = {"output_1": output1_y, "output_2": output2_y,"output_3": output3_y}
Note about problem formulation:
Changing the individual X features to shape (32,10) instead of (32,10,1) might help to get rid of this error, but that is not what I want: it changes my problem (I no longer have 10 time steps with one feature each).
Keras uses None for dynamic dimensions.
As you can see in the model.summary() chart, the model expects shape (None, 10) for each of your inputs, which is two-dimensional including the batch dimension. Stacked as a list of three inputs, that makes three-dimensional data, (3, batch, 10).
But you are feeding four-dimensional data, (3, 32, 10, 1).
I would guess that your model doesn't split your input list into three inputs. Try changing your inputs to a tuple:
X = (input_array_1,input_array_2,input_array_3)
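To make the role of None concrete, here is a small pure-Python sketch of how a declared shape can be compared against the shape of an actual batch. The helper is_compatible is hypothetical and written only for illustration; it is not part of Keras.

```python
def is_compatible(expected, actual):
    """Return True if `actual` fits `expected`, where None matches any size.

    Hypothetical helper for illustration; it is not part of Keras.
    """
    if len(expected) != len(actual):
        return False  # different number of dimensions (ndim mismatch)
    return all(e is None or e == a for e, a in zip(expected, actual))

expected_input = (None, 10)    # what each Input layer was built with
per_input_batch = (32, 10, 1)  # one element of X as produced by the generator

print(is_compatible(expected_input, per_input_batch))  # False: ndim 2 vs ndim 3
print(is_compatible(expected_input, (32, 10)))         # True: batch axis is dynamic
```

Under this reading, each of the three arrays is checked on its own against (None, 10), so the trailing axis of size 1 is what breaks the match.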
In order to resolve this error:
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, None, None, 88]
TrainGenerator should be changed in the following way.
Current code:
input1_X = np.array(df3['input1_X'].to_list()).reshape(dlen,pad_len,1)
input2_X = np.array(df3['input2_X'].to_list()).reshape(dlen,pad_len,1)
input3_X = np.array(df3['input3_X'].to_list()).reshape(dlen,pad_len,1)
Should be changed to:
input1_X = np.array(df3['input1_X'].to_list()).reshape(dlen,pad_len)
input2_X = np.array(df3['input2_X'].to_list()).reshape(dlen,pad_len)
input3_X = np.array(df3['input3_X'].to_list()).reshape(dlen,pad_len)
The reason is that each of the 3 Inputs expects a 2-dimensional array, but the generator provides a 3-dimensional one. The expected shape is (batch_size, 10).
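As a quick check of what that reshape changes, here is a minimal NumPy sketch. The values dlen = 32 and pad_len = 10 follow the shapes quoted in the question, and the raw array is a made-up stand-in for the DataFrame column.

```python
import numpy as np

dlen, pad_len = 32, 10  # batch size and sequence length quoted in the question
raw = np.arange(dlen * pad_len, dtype=np.float32)  # stand-in for the column data

as_3d = raw.reshape(dlen, pad_len, 1)  # what the generator currently produces
as_2d = raw.reshape(dlen, pad_len)     # what an input of shape (None, 10) accepts

# Both layouts hold the same values; only the trailing axis of size 1 differs.
same_values = np.array_equal(as_3d[..., 0], as_2d)
squeezed = np.squeeze(as_3d, axis=-1)  # equivalent way to drop the trailing axis
```

No data is lost by dropping the trailing axis; the array simply has one dimension fewer.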
I had a similar issue with a custom generator that just had to pass a NumPy array of size 10 as input and one single output.
To solve this problem I had to transform the shape of the two vectors passed to the neural network like this:
def slides_generator(integer_list):
    # stuff happens
    x = np_ts[np_index:np_index+10]  # numpy array
    y = np_ts[np_index+10]  # numpy array
    yield tf.convert_to_tensor(x)[np.newaxis, ...], tf.convert_to_tensor(y)[np.newaxis, ...]
doge_gen = slides_generator(integer_list)  # next(doge_gen)
Basically, you need to pass the two arrays with shape (None, size), so in my case they were (None, 10) and (None, 1); to achieve this I just passed two reshaped tensors.
You need the None dimension as the batch size.
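The effect of indexing with np.newaxis can be seen on plain NumPy arrays. This is a minimal sketch with a made-up series; here y is taken as a length-1 slice so that it ends up with shape (1, 1), which is an assumption about the intended target shape.

```python
import numpy as np

series = np.arange(20, dtype=np.float32)  # hypothetical 1-D time series
window = 10

x = series[:window]            # shape (10,): one sample, no batch axis
y = series[window:window + 1]  # shape (1,): the value to predict

x_batched = x[np.newaxis, ...]  # shape (1, 10): a batch holding one sample
y_batched = y[np.newaxis, ...]  # shape (1, 1)
```

The leading axis of size 1 is what Keras matches against the None (batch) dimension.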
I'm trying to build an RNN in Keras. I don't quite understand the required input format. I can build dense networks no problem, but I think that the RNN layers expect input dimension x batch x time step? Can anyone verify this?
Here is the code I would like to update:
Original code:
def get_generative(G_in, dense_dim=200, out_dim=50, lr=1e-3):
    x = Dense(dense_dim)(G_in)
    x = Activation('tanh')(x)
    G_out = Dense(out_dim, activation='tanh')(x)
    G = Model(G_in, G_out)
    opt = SGD(lr=lr)
    G.compile(loss='binary_crossentropy', optimizer=opt)
    return G, G_out
G_in = Input(shape=[10])
G, G_out = get_generative(G_in)
G.summary()
Modified with GRU layers and some slightly different dimensions:
def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
    clear_session()
    x = GRU(dense_dim, activation='tanh',return_state=True)(G_in)
    G_out = GRU(out_dim, return_state=True)(x)
    G = Model(G_in, G_out)
    opt = SGD(lr=lr)
    G.compile(loss='binary_crossentropy', optimizer=opt)
    return G, G_out
G_in = Input(shape=(None,3))
G, G_out = get_generative(G_in)
G.summary()
The error that I am seeing with this code is:
ValueError: Tensor("gru_1/strided_slice:0", shape=(3, 10),
dtype=float32) must be from the same graph as
Tensor("strided_slice_1:0", shape=(?, 3), dtype=float32).
If I remove the None above, I get:
ValueError: Input 0 is incompatible with layer gru_1: expected ndim=3,
found ndim=2
Any explanation would be helpful here.
You get an error because you clear the session after creating the input tensor. That is why the input tensor is not coming from the same graph as the rest of your network. To fix this simply leave out the line clear_session().
Another problem with your code: the second GRU layer expects a sequence input, therefore you should use return_sequences=True inside the first GRU layer. You probably want to leave out the argument return_state=True since that makes the layer return a tuple of tensors (output and state) instead of just one output tensor.
To sum up, the following code should do it:
def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
    x = GRU(dense_dim, activation='tanh', return_sequences=True)(G_in)
    G_out = GRU(out_dim)(x)
    G = Model(G_in, G_out)
    opt = SGD(lr=lr)
    G.compile(loss='binary_crossentropy', optimizer=opt)
    return G, G_out
The problem here is that RNN layers expect a 3D tensor input of the form: [num samples, time steps, features].
So we can modify the code above as:
def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
    # return_sequences=True so the second GRU receives a 3D sequence;
    # return_state is dropped so each layer returns a single tensor
    x = GRU(dense_dim, activation='tanh', return_sequences=True)(G_in)
    G_out = GRU(out_dim)(x)
    G = Model(G_in, G_out)
    opt = SGD(lr=lr)
    G.compile(loss='binary_crossentropy', optimizer=opt)
    return G, G_out
G_in = Input(shape=(1,3))
G, G_out = get_generative(G_in)
G.summary()
So what we are saying is that we expect an input of an arbitrary number of samples, each of 1 time step with 3 features.
Anna is correct that clear_session() should not be inside the generator function.
Lastly, if you actually want to input data into the network, its shape should also match what we just discussed. You can do this by using numpy reshape:
X = np.reshape(X, (X.shape[0], 1, X.shape[1]))
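For a concrete picture of that reshape, here is a minimal NumPy sketch with a made-up array of 4 samples and 3 features. The second variant, where every feature becomes its own time step, is shown only for contrast and is an assumption about an alternative reading of the data.

```python
import numpy as np

X = np.arange(12, dtype=np.float32).reshape(4, 3)  # hypothetical: 4 samples, 3 features

# One time step per sample: (samples, features) -> (samples, 1, features)
X_one_step = np.reshape(X, (X.shape[0], 1, X.shape[1]))

# Alternative reading: each value is its own time step carrying a single feature.
X_many_steps = X[:, :, np.newaxis]  # (samples, time steps, 1)
```

The first form matches Input(shape=(1,3)); the second would need an input declared with 3 time steps and 1 feature.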
I am working with some custom layers and having problems with the shape of them, when I work with it separately it works, but when I load the model to use in another one as a layer, it doesn't work anymore. Here is my layers definition:
def signumTransform(x):
    """
    SIGNUM function
    if positive 1
    if negative -1
    """
    import keras.backend
    return keras.backend.sign(x)

def logical_or_layer(x):
    """Processing an OR operation"""
    import keras.backend
    #normalized to 0,1
    aux_array = keras.backend.sign(x)
    aux_array = keras.backend.relu(aux_array)
    # OR operation
    aux_array = keras.backend.any(aux_array)
    # casting back the True/False to 1,0
    aux_array = keras.backend.cast(aux_array, dtype='float32')
    return aux_array
#this is the input tensor
inputs = Input(shape=(inputSize,), name='input')
#this is the Neurule layer
x = Dense(neurulesQt, activation='softsign', name='neurules')(inputs)
#after each neuron layer, the outputs need to be put into SIGNUM (-1 or 1)
x = Lambda(signumTransform, output_shape=lambda x:x, name='signumAfterNeurules')(x)
#separating into 2 (2 possible outputs)
layer_split0 = Lambda( lambda x: x[:, :11], output_shape=[11], name='layer_split0')(x)
layer_split1 = Lambda( lambda x: x[:, 11:20], output_shape=[9], name='layer_split1')(x)
#this is the OR layer
y_0 = Lambda(logical_or_layer, output_shape=[1], name='or0')(layer_split0)
y_1 = Lambda(logical_or_layer, output_shape=[1], name='or1')(layer_split1)
y = Lambda(lambda x: K.stack([x[0], x[1]]),output_shape=[2], name="output")([y_0, y_1])
Until the layer_split everything works properly, but in my y_0 and y_1 I need to do an OR operation with keras.backend.any(), as a return I receive a boolean so I cast it back with keras.backend.cast().
If I use the Model as it is here described, it works...it compiles, can be validated and so on, but if I try to save it and load it, it simply loses one dimension, the batch dimension (None). The output in the summary is shown as (None, 2), but when used as a layer and concatenated with another one, it shows (2,) and an error is thrown:
InvalidArgumentError: Shape must be rank 2 but is rank 1 for 'merging_layer_10/concat' (op: 'ConcatV2') with input shapes: [?,16], [2], [].
How should I properly cast it in the logical_or_layer function? Should I change the output_shape in the Lambda Layer?
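One possible cause, offered as a guess rather than a confirmed diagnosis: keras.backend.any is called without an axis, so, like NumPy's any, it reduces over every axis, the batch axis included. keras.backend.any does accept axis and keepdims arguments. The NumPy sketch below only illustrates the shape behaviour; the input values are made up.

```python
import numpy as np

batch = np.array([[1.0, -1.0, -1.0],
                  [-1.0, -1.0, -1.0]])  # hypothetical signum outputs, shape (2, 3)

positive = np.maximum(np.sign(batch), 0.0)  # map {-1, 1} to {0, 1}, like sign + relu

# Reducing over every axis collapses the whole batch to a single scalar.
any_all_axes = np.any(positive)

# Reducing over the last axis only keeps one OR result per sample.
any_per_sample = np.any(positive, axis=-1, keepdims=True).astype(np.float32)
```

With a per-sample result of shape (batch, 1), the two OR outputs could then be joined along the last axis instead of being stacked as scalars, which would keep the batch dimension in the final (None, 2) output.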
I am trying to build a neural network using custom activation functions. I followed the solution given here, and it works when the input and output vectors have the same size, but not when using different sizes (like in a pooling function). Here is my problem so far:
I am trying to generalize this to the case when the input and the output have different sizes. In my code the input 'x' is of size (2,4), the output 'y' is of size (1,2), and the activation function MEX(.) does the mapping y = MEX(x). I have computed the gradient of MEX() as d_MEX(), where d_MEX(x) has the same size as 'x', that is (2,4). Nevertheless, I get this error
InvalidArgumentError (see above for traceback): Incompatible shapes: [1,2] vs. [2,4]
Shouldn't the gradient of MEX(x) be of the same size as x? Here is my complete code:
import tensorflow as tf
import numpy as np
# This is our target function
def MEX(x):
    '''
    :param x: is a row vector which is the concatenation of [input, beta]
    :return MEX_{beta}(x): scalar output
    '''
    # lenx = np.size(x) # Number of columns (ROW vector)
    lenx = x.shape[1]
    N = x.shape[0]
    out = np.zeros((1,N))
    for ii in range(N):
        c = x[ii,0:lenx-1]
        beta = x[ii,lenx-1]
        out[0,ii] = 1./beta * np.log( np.mean( np.exp(beta*c) ))
    return np.array(out)
# Now we should write its derivative.
def d_MEX(x):
    # lenx = np.size(x) # Number of
    lenx = x.shape[1]
    N = x.shape[0]
    out = np.zeros((N,lenx))
    for ii in range(N):
        c = x[ii,0:lenx-1]
        beta = x[ii,lenx-1]
        d_beta = np.array([0.])
        d_beta[0] = -1./beta*( MEX(np.array([x[ii,:]])) - np.mean( np.multiply( c, np.exp(beta*c)))/np.mean( np.exp(beta*c)) )
        d_c = 1./lenx*np.exp(beta*c) /np.mean( np.exp(beta*c))
        out[ii,:] = np.concatenate((d_c,d_beta), axis=0)
    return out
# The first step is making it into a numpy function, this is easy:
np_MEX = np.vectorize(MEX, excluded=['x']) # IMPORTANT!! Otherwise np.vectorize() doesnt work
np_d_MEX = np.vectorize(d_MEX, excluded=['x']) # IMPORTANT!! Otherwise np.vectorize() doesnt work
# Now we make a tensorflow function
'''
Making a numpy fct to a tensorflow fct: We will start by making np_d_MEX_32 into a tensorflow function.
There is a function in tensorflow, tf.py_func(func, inp, Tout, stateful=stateful, name=name),
which transforms any numpy function to a tensorflow function, so we can use it:
'''
np_d_MEX_32 = lambda x: np_d_MEX(x=x).astype(np.float32)
def tf_d_MEX(x,name=None):
    with tf.name_scope(name, "d_MEX", [x]) as name:
        y = tf.py_func(np_d_MEX_32,
                       [x],
                       [tf.float32],
                       name=name,
                       stateful=False)
        return y[0]
'''
tf.py_func acts on lists of tensors (and returns a list of tensors), that is why we have [x] (and return y[0]).
The stateful option tells tensorflow whether the function always gives the same output for the same input (stateful = False),
in which case tensorflow can simplify the tensorflow graph. This is our case and will probably be the case in most situations.
One thing to be careful of at this point is that numpy uses float64 but tensorflow uses float32, so you need to convert
your function to use float32 before you can convert it to a tensorflow function, otherwise tensorflow will complain.
This is why we need to make np_d_MEX_32 first.
What about the Gradients? The problem with only doing the above is that even though we now have tf_d_MEX, which is the
tensorflow version of np_d_MEX, we couldn't use it as an activation function if we wanted to, because tensorflow doesn't
know how to calculate the gradients of that function.
Hack to get Gradients: As explained in the sources mentioned above, there is a hack to define gradients of a function
using tf.RegisterGradient and tf.Graph.gradient_override_map. Copying the code from harpone, we can modify
the tf.py_func function to make it define the gradient at the same time:
'''
def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)
'''
Now we are almost done, the only thing is that the grad function we need to pass to the above py_func function needs to
take a special form. It needs to take in an operation, and the previous gradients before the operation and propagate
the gradients backward after the operation.
Gradient Function: So for our MEX activation function that is how we would do it:
'''
def MEXgrad(op, grad):
    x = op.inputs[0]
    # x = op
    n_gr = tf_d_MEX(x)
    return grad * n_gr
'''
The activation function has only one input, that is why x = op.inputs[0]. If the operation had many inputs, we would
need to return a tuple, one gradient for each input. For example, if the operation was a-b, the gradient with respect to a
is +1 and with respect to b is -1, so we would have return +1*grad,-1*grad. Notice that we need to return tensorflow
functions of the input, that is why we need tf_d_MEX; np_d_MEX would not have worked because it cannot act on
tensorflow tensors. Alternatively we could have written the derivative using tensorflow functions:
'''
# Combining it all together: Now that we have all the pieces, we can combine them all together:
np_MEX_32 = lambda x: np_MEX(x=x).astype(np.float32)
def tf_MEX(x, name=None):
    with tf.name_scope(name, "MEX",[x]) as name:
        y = py_func(np_MEX_32,
                    [x],
                    [tf.float32],
                    name=name,
                    grad=MEXgrad)  # <-- here's the call to the gradient
        return y[0]
with tf.Session() as sess:
    x = tf.constant([[0.2,0.7,1.2,1.7],[0.2,0.7,1.2,1.7]])
    y = tf_MEX(x)
    tf.global_variables_initializer().run()
    print(x.eval(), y.eval(), tf.gradients(y, [x])[0].eval())
In the console, I have checked that the variables have the "correct" shapes:
x.eval()
Out[9]:
array([[ 0.2 , 0.69999999, 1.20000005, 1.70000005],
[ 0.2 , 0.69999999, 1.20000005, 1.70000005]], dtype=float32)
y.eval()
Out[10]: array([[ 0.83393127, 0.83393127]], dtype=float32)
tf_d_MEX(x).eval()
Out[11]:
array([[ 0.0850958 , 0.19909413, 0.46581003, 0.07051659],
[ 0.0850958 , 0.19909413, 0.46581003, 0.07051659]], dtype=float32)
My bad, I just found the mistake.
It's in MEXgrad, which should return n_gr rather than grad * n_gr:
def MEXgrad(op, grad):
    x = op.inputs[0]
    # x = op
    n_gr = tf_d_MEX(x)
    return n_gr
I wonder if there is a typo here, since the same mistake appears there too.
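For what it's worth, the shape clash can be reproduced in plain NumPy. This is only a sketch under a few assumptions: mex and d_mex are my own vectorized rewrites, the upstream gradient values are made up, and d_c is normalized by the number of entries in c (lenx - 1) rather than by lenx as in the question, because that is what a finite-difference check agrees with. It also suggests that returning n_gr alone matches the full chain rule only when the upstream gradient is all ones; multiplying by the transposed gradient keeps it.

```python
import numpy as np

def mex(x):
    """Row-wise MEX: last column is beta, the rest is c. Returns shape (1, N)."""
    c, beta = x[:, :-1], x[:, -1]
    return (np.log(np.mean(np.exp(beta[:, None] * c), axis=1)) / beta)[None, :]

def d_mex(x):
    """Partial derivatives of each row's MEX value, same shape as x."""
    c, beta = x[:, :-1], x[:, -1:]
    w = np.exp(beta * c)
    w = w / w.sum(axis=1, keepdims=True)  # softmax weights = d mex / d c
    d_beta = -(mex(x).T - (w * c).sum(axis=1, keepdims=True)) / beta
    return np.concatenate([w, d_beta], axis=1)

x = np.array([[0.2, 0.7, 1.2, 1.7], [0.2, 0.7, 1.2, 1.7]])
grad = np.array([[2.0, -1.0]])  # hypothetical upstream gradient, shape (1, 2)
n_gr = d_mex(x)                 # shape (2, 4)

try:
    grad * n_gr                 # (1, 2) * (2, 4) cannot be broadcast
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# Chain rule per row: dL/dx[i, j] = grad[0, i] * n_gr[i, j]
dx = grad.T * n_gr              # (2, 1) * (2, 4) -> (2, 4)
```

In the TensorFlow version the analogous step would presumably be to transpose or reshape grad before multiplying, but I have not run that against the original graph code.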