I was hoping to implement a variation of PointNet (https://arxiv.org/pdf/1612.00593.pdf) in Keras, but I'm having trouble repeating the context vector (g) a variable number of times so that I can Concatenate it row-wise with a previous layer that lacks context (pre). I tried Repeat() and keras.backend.tile().
input = Input(shape=(None,3))
x = TimeDistributed(Dense(128, activation = 'relu'))(input)
pre = TimeDistributed(Dense(256, activation = 'relu'))(x)
g = GlobalMaxPooling1D()(pre)
x = Lambda(merge_on_single, output_shape=(None,512))([pre,g])
print(x.shape)
This is the lambda definition I came up with.
def merge_on_single(v):
    # v[0] is the variable-length tensor, v[1] is the single vector
    return Concatenate()([K.repeat(v[1],K.get_variable_shape(v[0])),v[0]])
However the following error occurs:
TypeError: Tensors in list passed to 'values' of 'Pack' Op have types [int32, <NOT CONVERTIBLE TO TENSOR>, int32] that don't all match.
UPDATE:
So I was able to get the layers to not give errors by doing the following:
input = Input(shape=(None,3))
num_point = K.placeholder(input.get_shape()[1].value, dtype=tf.int32)
#first global feature layer
x = TimeDistributed(Dense(512, activation = 'relu'))(input)
x = TimeDistributed(Dense(256, activation = 'relu'))(x)
g = GlobalMaxPooling1D()(x)
g = K.reshape(g,(-1,1,256))
g = K.tile(x, [1,num_point,1])
concat_feat = K.concatenate([x, g])
but now, I get the following error:
AttributeError: 'Tensor' object has no attribute '_keras_history'
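I suspect the culprit is K.get_variable_shape(v[0]). It returns the static shape of v[0], in which the variable-length dimension is None, so the repeat count handed to K.repeat cannot be converted to an int32 tensor. That would explain the type mismatch reported in your error: the values being packed must all have the same type.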
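For reference, here is a minimal NumPy sketch of the shape logic the Lambda is meant to implement: the pooled vector g is repeated once per point, with the number of points read from the actual array at run time rather than from a static shape, and is then concatenated with pre along the feature axis. The function name tile_and_concat and the toy sizes are assumptions for illustration only; in Keras the same idea would have to be expressed with the backend's dynamic shape inside a Lambda layer.

```python
import numpy as np

def tile_and_concat(pre, g):
    """Repeat g (batch, features) once per point and append it to pre (batch, points, features)."""
    num_points = pre.shape[1]            # known only once real data arrives
    g = g[:, np.newaxis, :]              # (batch, 1, features)
    g = np.tile(g, (1, num_points, 1))   # (batch, points, features)
    return np.concatenate([pre, g], axis=-1)

# Toy data: 2 clouds of 5 points, 256 features per point (sizes are illustrative).
pre = np.random.rand(2, 5, 256)
g = pre.max(axis=1)                      # global max pooling over the points axis
merged = tile_and_concat(pre, g)
print(merged.shape)                      # (2, 5, 512)
```

Every point ends up with its own 256 features followed by the same 256 global features.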
I'm trying to build an RNN in Keras. I don't quite understand the required input format. I can build dense networks no problem, but I think that the RNN layers expect input dimension x batch x time step? Can anyone verify this?
Here is the code I would like to update:
Original code:
def get_generative(G_in, dense_dim=200, out_dim=50, lr=1e-3):
    x = Dense(dense_dim)(G_in)
    x = Activation('tanh')(x)
    G_out = Dense(out_dim, activation='tanh')(x)
    G = Model(G_in, G_out)
    opt = SGD(lr=lr)
    G.compile(loss='binary_crossentropy', optimizer=opt)
    return G, G_out
G_in = Input(shape=[10])
G, G_out = get_generative(G_in)
G.summary()
Modified with GRU layers and some slightly different dimensions:
def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
    clear_session()
    x = GRU(dense_dim, activation='tanh',return_state=True)(G_in)
    G_out = GRU(out_dim, return_state=True)(x)
    G = Model(G_in, G_out)
    opt = SGD(lr=lr)
    G.compile(loss='binary_crossentropy', optimizer=opt)
    return G, G_out
G_in = Input(shape=(None,3))
G, G_out = get_generative(G_in)
G.summary()
The error that I am seeing with this code is:
ValueError: Tensor("gru_1/strided_slice:0", shape=(3, 10),
dtype=float32) must be from the same graph as
Tensor("strided_slice_1:0", shape=(?, 3), dtype=float32).
If I remove the None above, I get:
ValueError: Input 0 is incompatible with layer gru_1: expected ndim=3,
found ndim=2
Any explanation would be helpful here.
You get an error because you clear the session after creating the input tensor. That is why the input tensor does not come from the same graph as the rest of your network. To fix this, simply leave out the line clear_session().
Another problem with your code: the second GRU layer expects a sequence input, therefore you should use return_sequences=True inside the first GRU layer. You probably want to leave out the argument return_state=True since that makes the layer return a tuple of tensors (output and state) instead of just one output tensor.
To sum up, the following code should do it:
def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
    x = GRU(dense_dim, activation='tanh', return_sequences=True)(G_in)
    G_out = GRU(out_dim)(x)
    G = Model(G_in, G_out)
    opt = SGD(lr=lr)
    G.compile(loss='binary_crossentropy', optimizer=opt)
    return G, G_out
The problem here is that RNN layers expect a 3D tensor input of the form: [num samples, time steps, features].
So we can modify the code above as:
def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
    # return_sequences=True so that the second GRU receives a 3D sequence;
    # return_state is dropped because it makes the layer return a list of tensors
    x = GRU(dense_dim, activation='tanh', return_sequences=True)(G_in)
    G_out = GRU(out_dim)(x)
    G = Model(G_in, G_out)
    opt = SGD(lr=lr)
    G.compile(loss='binary_crossentropy', optimizer=opt)
    return G, G_out
G_in = Input(shape=(1,3))
G, G_out = get_generative(G_in)
G.summary()
So what we are saying is that we expect an input of an arbitrary number of samples, each of 1 time step with 3 features.
Anna is correct that clear_session() should not be inside the generator function.
Lastly, if you actually want to input data into the network, its shape should also match what we just discussed. You can do this by using numpy reshape:
X = np.reshape(X, (X.shape[0], 1, X.shape[1]))
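As a quick check of that reshape, the sketch below uses a made-up array of 4 samples with 3 features each and shows it becoming the 3D [num samples, time steps, features] layout described above. The array contents are placeholders, not data from the question.

```python
import numpy as np

# Hypothetical 2D data: 4 samples, 3 features each.
X = np.arange(12, dtype="float32").reshape(4, 3)

# Insert a time-step axis of length 1 between samples and features.
X_seq = np.reshape(X, (X.shape[0], 1, X.shape[1]))

print(X.shape)      # (4, 3)
print(X_seq.shape)  # (4, 1, 3)
```

The values are unchanged; only a singleton time-step axis is added so the array matches Input(shape=(1,3)).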
I am working with some custom layers and having problems with their shapes. When I use the model on its own it works, but when I load it to use as a layer in another model, it doesn't work anymore. Here is my layer definition:
def signumTransform(x):
    """
    SIGNUM function
    if positive 1
    if negative -1
    """
    import keras.backend
    return keras.backend.sign(x)

def logical_or_layer(x):
    """Processing an OR operation"""
    import keras.backend
    # normalized to 0,1
    aux_array = keras.backend.sign(x)
    aux_array = keras.backend.relu(aux_array)
    # OR operation
    aux_array = keras.backend.any(aux_array)
    # casting back the True/False to 1,0
    aux_array = keras.backend.cast(aux_array, dtype='float32')
    return aux_array
#this is the input tensor
inputs = Input(shape=(inputSize,), name='input')
#this is the Neurule layer
x = Dense(neurulesQt, activation='softsign', name='neurules')(inputs)
#after each neuron layer, the outputs need to be put into SIGNUM (-1 or 1)
x = Lambda(signumTransform, output_shape=lambda x:x, name='signumAfterNeurules')(x)
#separating into 2 (2 possible outputs)
layer_split0 = Lambda( lambda x: x[:, :11], output_shape=[11], name='layer_split0')(x)
layer_split1 = Lambda( lambda x: x[:, 11:20], output_shape=[9], name='layer_split1')(x)
#this is the OR layer
y_0 = Lambda(logical_or_layer, output_shape=[1], name='or0')(layer_split0)
y_1 = Lambda(logical_or_layer, output_shape=[1], name='or1')(layer_split1)
y = Lambda(lambda x: K.stack([x[0], x[1]]),output_shape=[2], name="output")([y_0, y_1])
Everything works properly up to the layer_split layers. For y_0 and y_1 I need an OR operation, which I do with keras.backend.any(); it returns a boolean, so I cast it back with keras.backend.cast().
If I use the model as described here, it works: it compiles, can be validated, and so on. But if I save it and load it, it loses one dimension, the batch dimension (None). The summary shows the output as (None, 2), but when the model is used as a layer and concatenated with another one, its shape is (2,) and an error is thrown:
InvalidArgumentError: Shape must be rank 2 but is rank 1 for 'merging_layer_10/concat' (op: 'ConcatV2') with input shapes: [?,16], [2], [].
How should I properly cast it in the logical_or_layer function? Should I change the output_shape in the Lambda Layer?
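One thing that may be worth checking here: a reduction such as any() called without an axis collapses every dimension, including the batch dimension, which would be consistent with the shape turning into (2,) after stacking. The NumPy sketch below only illustrates that shape behaviour with made-up data. Whether calling keras.backend.any(aux_array, axis=1, keepdims=True) resolves the loading problem is an assumption that has not been verified against the model in the question.

```python
import numpy as np

# Hypothetical batch of 4 samples with 11 signum outputs each (-1 or 1).
x = np.where(np.random.rand(4, 11) > 0.5, 1.0, -1.0)
positive = np.maximum(np.sign(x), 0.0)   # map to 0/1, like sign followed by relu

# No axis: every dimension is reduced, so the batch axis disappears.
or_all = np.asarray(np.any(positive), dtype="float32")
# axis=1 with keepdims=True: one OR result per sample.
or_per_sample = np.any(positive, axis=1, keepdims=True).astype("float32")

print(or_all.shape)         # ()
print(or_per_sample.shape)  # (4, 1)
```

If the per-sample form is used, the later stacking step would probably need to become a concatenation along the last axis to produce a (None, 2) output; that part is also untested here.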
I wanted to plot all Keras activation functions, but some of them are not working. For example, linear throws an error:
AttributeError: 'Series' object has no attribute 'eval'
which is weird. How can I plot the rest of my activation functions?
points = 100
zeros = np.zeros((points,1))
df = pd.DataFrame({"activation": np.linspace(-1.2,1.2,points)})
df["softmax"] = K.eval(activations.elu(df["activation"]))
#df["linear"] = K.eval(activations.linear(df["activation"]))
df["tanh"] = K.eval(activations.tanh(df["activation"]))
df["sigmoid"] = K.eval(activations.sigmoid(df["activation"]))
df["relu"] = K.eval(activations.relu(df["activation"]))
#df["hard_sigmoid"] = K.eval(activations.hard_sigmoid(df["activation"]))
#df["exponential"] = K.eval(activations.exponential(df["activation"]))
df["softsign"] = K.eval(activations.softsign(df["activation"]))
df["softplus"] = K.eval(activations.softplus(df["activation"]))
#df["selu"] = K.eval(activations.selu(df["activation"]))
df["elu"] = K.eval(activations.elu(df["activation"]))
df.plot(x="activation", figsize=(15,15))
That's because the linear activation returns the input without any modifications:
def linear(x):
    """Linear (i.e. identity) activation function.
    """
    return x
Since you are passing a Pandas Series as input, the same Pandas Series will be returned and therefore you don't need to use K.eval():
df["linear"] = activations.linear(df["activation"])
As for the selu activation, you need to reshape the input to (n_samples, n_output):
df["selu"] = K.eval(activations.selu(df["activation"].values.reshape(-1,1)))
And as for the hard_sigmoid activation, its input should be explicitly a Tensor which you can create using K.variable():
df["hard_sigmoid"] = K.eval(activations.hard_sigmoid(K.variable(df["activation"].values)))
Further, exponential activation works as you have written and there is no need for modifications.
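The behaviour of linear can be reproduced without Keras. In the sketch below, linear is a local stand-in with the same body as the function quoted above, and FakeSeries is a hypothetical placeholder for a pandas Series. The point is only that an identity function hands back the very object it was given, so nothing tensor-like is ever created for K.eval() to evaluate.

```python
def linear(x):
    """Local stand-in for the identity activation quoted above."""
    return x

class FakeSeries(list):
    """Hypothetical placeholder for a pandas Series (it has no eval method)."""

values = FakeSeries([-1.2, 0.0, 1.2])
result = linear(values)

print(result is values)         # True
print(hasattr(result, "eval"))  # False
```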
I am not sure how to pass a value for initial_state when the cell is an LSTMCell. I am using LSTMStateTuple, as shown in the following piece of code:
c_placeholder = tf.placeholder(tf.float32, [ None, config.state_dim], name='c_lstm')
h_placeholder = tf.placeholder(tf.float32, [ None, config.state_dim], name='h_lstm')
state_tuple = tf.nn.rnn_cell.LSTMStateTuple(c_placeholder, h_placeholder)
cell = tf.contrib.rnn.LSTMCell(num_units=config.state_dim, state_is_tuple=True, reuse=not is_training)
rnn_outs, states = tf.nn.dynamic_rnn(cell=cell, inputs=x,sequence_length=seqlen, initial_state=state_tuple, dtype= tf.float32)
However, the execution returns this error:
TypeError: 'Tensor' object is not iterable.
Here is the link of the documentation for dynamic_rnn
I've seen this same error before. I was using multiple layers of RNN cells built with tf.contrib.rnn.MultiRNNCell, and I needed to specify a tuple of LSTMStateTuples, one for each layer. Something like:
state = tuple(
    [tf.nn.rnn_cell.LSTMStateTuple(c_ph[i], h_ph[i])
     for i in range(nRecurrentLayers)]
)
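To make the expected nesting concrete, here is a small sketch that uses a plain namedtuple as a stand-in for LSTMStateTuple (StateTuple, the string placeholders, and n_layers are all hypothetical). A single cell takes one (c, h) pair, while a stack of cells takes a tuple with one pair per layer.

```python
from collections import namedtuple

# Stand-in for tf.nn.rnn_cell.LSTMStateTuple; not the real class.
StateTuple = namedtuple("StateTuple", ["c", "h"])

n_layers = 3  # hypothetical number of stacked recurrent layers

# Single cell: one pair holding the cell state and the hidden state.
single_state = StateTuple(c="c_0", h="h_0")

# Stacked cells: a tuple with one pair per layer.
stacked_state = tuple(
    StateTuple(c=f"c_{i}", h=f"h_{i}") for i in range(n_layers)
)

print(len(stacked_state))  # 3
print(stacked_state[1].h)  # h_1
```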
I create a tf.Variable() and build a simple function of that variable. I then flatten the original variable using tf.reshape() and take tf.gradients() of the function with respect to the flattened variable. Why does that return [None]?
var = tf.Variable(np.ones((5,5)), dtype = tf.float32)
f = tf.reduce_sum(tf.reduce_sum(tf.square(var)))
var_f = tf.reshape(var, [-1])
print(tf.gradients(f,var_f))
The above codeblock when executed returns [None]. Is this a bug? Please Help!
You are taking the derivative of f with respect to var_f, but f is not a function of var_f; it is a function of var. That's why you are getting [None]. Now if you change the code to:
var = tf.Variable(np.ones((5,5)), dtype = tf.float32)
var_f = tf.reshape(var, [-1])
f = tf.reduce_sum(tf.reduce_sum(tf.square(var_f)))
grad = tf.gradients(f,var_f)
print(grad)
your gradients will be defined:
[<tf.Tensor 'gradients_28/Square_32_grad/mul_1:0' shape=(25,) dtype=float32>]
The visualization of the graphs for the following code is given below:
var = tf.Variable(np.ones((5,5)), dtype = tf.float32, name='var')
f = tf.reduce_sum(tf.reduce_sum(tf.square(var)), name='f')
var_f = tf.reshape(var, [-1], name='var_f')
grad_1 = tf.gradients(f,var_f, name='grad_1')
grad_2 = tf.gradients(f,var, name='grad_2')
grad_1 is not defined, while grad_2 is. The back-propagation graphs (gradient graphs) of the two gradients are shown.
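The dependency argument can also be sketched with a toy graph in plain Python. This is not how TensorFlow implements tf.gradients; Node, square, identity, and gradient are hypothetical helpers written for this illustration, with identity playing the role of tf.reshape and a scalar standing in for the 5x5 variable. A gradient is only found when there is a path from the output back to the node being differentiated against.

```python
class Node:
    """Minimal graph node: a value plus (parent, local_gradient) pairs."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents

def square(x):
    return Node(x.value ** 2, parents=((x, 2 * x.value),))

def identity(x):
    # Plays the role of tf.reshape: a new node that depends on x.
    return Node(x.value, parents=((x, 1.0),))

def gradient(output, wrt):
    """Return d(output)/d(wrt), or None if output does not depend on wrt."""
    if output is wrt:
        return 1.0
    total = None
    for parent, local_grad in output.parents:
        upstream = gradient(parent, wrt)
        if upstream is not None:
            total = (total or 0.0) + local_grad * upstream
    return total

var = Node(3.0)
var_f = identity(var)        # side branch that f never uses
f = square(var)              # f is built from var directly

print(gradient(f, var))      # 6.0
print(gradient(f, var_f))    # None

f2 = square(var_f)           # now the function is built from var_f
print(gradient(f2, var_f))   # 6.0
```

In the first case var_f hangs off var as a separate branch, so walking back from f never reaches it; building the function from var_f puts it on the path.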