I've written a custom TensorFlow model. However, when I pass a single tensor to it, it treats each element of that tensor as a separate input and therefore returns a batch of outputs.
For example, my input tensor has shape [3,1] with values:
tf.Tensor(
[[0.7001484 ]
[0.2581525 ]
[0.04169908]], shape=(3, 1), dtype=float32)
For this input I expect a single vector of shape (3,), but what I get is a 3x3 tensor:
tf.Tensor(
[[0.31234854 0.3224371 0.36521438]
[0.32561225 0.3294511 0.3449366 ]
[0.33208787 0.33271718 0.33519495]], shape=(3, 3), dtype=float32)
My Model
class MAE_Model(tf.keras.Model):
    def __init__(self):
        super(MAE_Model, self).__init__()
        self.h_fin = EnsembleBlock()
        self.ipt = tf.keras.layers.InputLayer(input_shape=(3,1), batch_size=None)
        self.fc_1 = tf.keras.layers.Dense(16, activation='relu')
        self.fc_2 = tf.keras.layers.Dense(16, activation='relu')
        self.classifier = tf.keras.layers.Dense(3, activation='softmax')
    def call(self, inputs):
        x = self.h_fin(inputs)  # this returns a vector of shape [3], e.g. x = [1., 2., 3.]
        x = tf.reshape(x, (3,1))
        print(x)
        x = self.ipt(x)
        x = self.fc_1(x)
        x = self.fc_2(x)
        return self.classifier(x)
Here, print(x) prints a (3,1) tensor like the example above. Is there any way to solve this? I want the model to treat x as a single input (the whole tensor), not as a batch of inputs.
The output of your model makes sense. It is a 3x3 tensor because the last layer in your model outputs probabilities for 3 classes, and a Dense layer acts on the last axis while treating the leading axis as the batch. Given a batch of 3 inputs, the output therefore holds probabilities for 3 classes for each element of the batch. If you want the predicted class, take the class with the highest probability, which TensorFlow lets you do with:
tf.argmax(predictions, axis=-1)
The output of this operation has shape (None,), where None stands for the batch size.
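The batch behaviour described above can be reproduced with plain matrix arithmetic. The following is a minimal sketch that uses NumPy as a stand-in for a Dense layer with softmax; the helper dense_softmax and the random weights are assumptions made for illustration, not part of the original model. It shows that a (3,1) input yields one output row per element, while a (1,3) input yields a single row.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense_softmax(x, w, b):
    """Hypothetical stand-in for a Dense layer with softmax over the last axis."""
    logits = x @ w + b
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)


x = np.array([0.7001484, 0.2581525, 0.04169908], dtype=np.float32)

# Shape (3, 1): three samples with one feature each, so one output row per sample.
w_one_feature = rng.normal(size=(1, 3))
as_batch = dense_softmax(x.reshape(3, 1), w_one_feature, np.zeros(3))
print(as_batch.shape)

# Shape (1, 3): one sample with three features, so a single output row.
w_three_features = rng.normal(size=(3, 3))
as_single = dense_softmax(x.reshape(1, 3), w_three_features, np.zeros(3))
print(as_single.shape)

# Dropping the batch axis leaves the (3,) vector of class probabilities.
probs = as_single[0]
predicted_class = int(np.argmax(probs, axis=-1))
```

In the model from the question, the corresponding change would presumably be to reshape x to (1, 3) instead of (3, 1) before the Dense layers, and to drop the batch axis from the result if a (3,) vector is needed.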
I am trying to tidy up my code by moving from the Keras functional API to the subclassing API. The class I came up with so far is below:
class FeedForwardNN(Model):
    def __init__(self, params):
        super().__init__()
        self.params = params
        self.layout = params['layout']
        # Define layers
        self.dense = Dense(units=params['layout'][1],
                           activation=params['activation'],
                           kernel_initializer=params['initializer'])
        self.output_layer = Dense(units=params['layout'][-1],
                                  kernel_initializer=params['initializer'])
        self.dropout = Dropout(params['dropout'])
        self.batch_norm = BatchNormalization()
    def call(self, x):
        for layer in self.layout[1:-1]:
            x = self.dropout(self.dense(x))
            if self.params['batch_norm']:
                x = self.batch_norm(x)
        x = self.output_layer(x)
        return x
Here, layout is a list of the number of neurons in each layer (including the input and output layers).
However, when fitting the model, the following error is raised:
ValueError: Input 0 of layer "dense" is incompatible with the layer: expected axis -1 of input shape to have value 5, but received input with shape (None, 100)
Call arguments received:
• x=tf.Tensor(shape=(None, 5), dtype=float32)
which seems to occur on the line:
x = self.dropout(self.dense(x))
I checked the shape of the training data X that is passed to the fit() method, and it appears to have the right shape i.e. (number of observations, number of predictors).
Does anyone have an idea of where my mistake is?
The problem is that you are reusing the same self.dense layer on every iteration of your for loop:
for layer in self.layout[1:-1]:
    x = self.dropout(self.dense(x))
After the first iteration, x has shape (batch, 100). Then in the second iteration, instead of passing this x to a second Dense layer (which you don't seem to have created in the first place), you pass it to the first Dense layer again, which expects shape (batch, 5), causing the error.
You can create a list of Dense layers in __init__ as follows (note that the loop variable is the number of units itself, not an index into layout):
self.denses = [Dense(units=n_units,
                     activation=params['activation'],
                     kernel_initializer=params['initializer'])
               for n_units in self.layout[1:-1]]
and call them in sequence:
for dense_layer in self.denses:
    x = self.dropout(dense_layer(x))
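The shape mismatch behind this error can be illustrated without Keras. Below is a minimal sketch in NumPy, where weight matrices stand in for Dense layers; the layout values and the ReLU activation are assumptions chosen to match the shapes in the error message. Reusing one matrix fails on the second pass, while one matrix per hidden layer works.

```python
import numpy as np

rng = np.random.default_rng(0)
layout = [5, 100, 100, 1]             # hypothetical layout: input, two hidden, output
x = rng.normal(size=(8, layout[0]))   # batch of 8 samples with 5 predictors

# One shared weight matrix, analogous to the single self.dense layer.
shared_w = rng.normal(size=(layout[0], layout[1]))
h = x @ shared_w                      # first pass: (8, 5) @ (5, 100) -> (8, 100)
try:
    h = h @ shared_w                  # second pass: (8, 100) @ (5, 100) does not fit
    reuse_failed = False
except ValueError:
    reuse_failed = True
print(reuse_failed)

# One weight matrix per hidden layer, sized from consecutive layout entries.
hidden_ws = [rng.normal(size=(n_in, n_out))
             for n_in, n_out in zip(layout[:-2], layout[1:-1])]
h = x
for w in hidden_ws:
    h = np.maximum(h @ w, 0.0)        # ReLU stands in for params['activation']
print(h.shape)
```

Keras infers the input size of each Dense layer when it is first called, which is why the list in the answer only needs the number of units per layer.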
I am replicating a PyTorch model in Keras and am having trouble seeing where the extra dimension comes from.
This is how my code looks so far:
class Attention(tf.keras.Model):
    def __init__(self, input_shape):
        super(Attention, self).__init__()
        in_features = input_shape[-1]
        small_in_features = max(math.floor(in_features/10), 1)
        self.d_k = small_in_features
        query = tf.keras.models.Sequential()
        query.add(tf.keras.layers.Dense(in_features))
        query.add(tf.keras.layers.Dense(small_in_features, activation="tanh"))
        self.query = query
        self.key = tf.keras.layers.Dense(small_in_features)
    def call(self, inp):
        # inp.shape should be (B,N,C)
        q = self.query(inp)  # (B,N,C/10)
        k = self.key(inp)  # (B,N,C/10)
        k = tf.transpose(k)
        print(q)
        print(k)
        x = tf.linalg.matmul(q, k) / math.sqrt(self.d_k)  # (B,N,N)
        x = tf.nn.softmax(x)  # over rows
        x = tf.transpose(x)
        x = tf.linalg.matmul(x, inp)  # (B,N,C)
        return x
But when I add it to my Sequential model I get this error:
ValueError: Dimensions must be equal, but are 1 and 256 for '{{node attention_19/MatMul}} = BatchMatMulV2[T=DT_FLOAT, adj_x=false, adj_y=false](attention_19/sequential_36/Identity, attention_19/transpose)' with input shapes: [?,256,1], [1,256,?].
I have now printed my 'q' and 'k' and it prints out like following:
Tensor("attention_19/sequential_36/Identity:0", shape=(None, 256, 1), dtype=float32)
Tensor("attention_19/transpose:0", shape=(1, 256, None), dtype=float32)
So they are 3-dimensional, with one dimension left unfilled. I don't quite understand why this happens.
How can I "remove" the extra dimension or get this custom layer to work?
Note: The original code seems to use 3-dimensional input, but I want 2-dimensional input.
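One likely contributor to the shapes printed above is that tf.transpose, when called without a perm argument, reverses all axes, so (B, N, C) becomes (C, N, B) rather than (B, C, N). The sketch below reproduces this with NumPy, whose np.transpose has the same default; the batch size B = 4 is an assumption, while N = 256 and the feature size 1 are taken from the printed shapes.

```python
import numpy as np

B, N, d_k = 4, 256, 1                 # B is hypothetical; N and d_k match the printout
q = np.ones((B, N, d_k))
k = np.ones((B, N, d_k))

# Transposing without an explicit axis order reverses every axis.
k_all_reversed = np.transpose(k)
print(k_all_reversed.shape)

# Swapping only the last two axes keeps the batch axis in front.
k_t = np.transpose(k, (0, 2, 1))
scores = q @ k_t / np.sqrt(d_k)
print(scores.shape)
```

In TensorFlow the analogous options would be tf.transpose(k, perm=[0, 2, 1]) or tf.linalg.matmul(q, k, transpose_b=True). The second tf.transpose(x) in call has the same default behaviour and would probably need the same treatment.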
I'm trying to set up a simple tf.keras model in which a vector is fed in as input and the output is the result of a single matrix multiply.
The lines of code that create the model succeed, but calling it for a forward pass results in an error.
n_input_nodes = 2
n_output_nodes = 1
x = tf.keras.Input(shape=(n_input_nodes,))
W = tf.ones((n_input_nodes,n_output_nodes), dtype=tf.float32)
y = tf.matmul(x, W)
model = tf.keras.Model(inputs=x, outputs=y)
x_input = tf.constant([10,30.], shape=[1, 2])
output = model(x_input)
The final line (i.e. the forward pass) throws the following error:
ValueError: Argument must be a dense tensor: [array([[1.], [1.]], dtype=float32)] - got shape [1, 2, 1], but wanted [1].
The input has shape (1,2) and the weight matrix has shape (2,1). A matrix multiplication between the two should be valid and result in a [1,1] tensor; however, this is not what happens.
The operation requires a dense tensor, not a sparse tensor. Consider this shape instead:
W = tf.ones((n_input_nodes,), dtype=tf.float32)
It requires a tensor of shape (2,), which is dense.
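As a quick sanity check on the shapes discussed in this thread, the arithmetic can be sketched in NumPy outside of any Keras model. This only confirms what the multiplication should produce for the given values; it does not reproduce or explain the Keras error itself.

```python
import numpy as np

x_input = np.array([[10.0, 30.0]])    # shape (1, 2), as in the question

# Weight matrix of shape (2, 1): the product has shape (1, 1).
W = np.ones((2, 1))
y = x_input @ W
print(y.shape)

# Weight vector of shape (2,): the product collapses to shape (1,).
W_vec = np.ones((2,))
y_vec = x_input @ W_vec
print(y_vec.shape)
```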
I have two datasets that look like this:
input:
array([[[ 0.99309823],
...
[ 0. ]]])
shape : (1, 2501)
output:
array([[0, 0, 0, ..., 0, 0, 1],
...,
[0, 0, 0, ..., 0, 0, 0]])
shape : (2501, 9)
I processed them with TFLearn as follows:
input_layer = tflearn.input_data(shape=[None,2501])
hidden1 = tflearn.fully_connected(input_layer,1205,activation='ReLU', regularizer='L2', weight_decay=0.001)
dropout1 = tflearn.dropout(hidden1,0.8)
hidden2 = tflearn.fully_connected(dropout1,1205,activation='ReLU', regularizer='L2', weight_decay=0.001)
dropout2 = tflearn.dropout(hidden2,0.8)
softmax = tflearn.fully_connected(dropout2,9,activation='softmax')
# Regression with SGD
sgd = tflearn.SGD(learning_rate=0.1,lr_decay=0.96, decay_step=1000)
top_k=tflearn.metrics.Top_k(3)
net = tflearn.regression(softmax,optimizer=sgd,metric=top_k,loss='categorical_crossentropy')
model = tflearn.DNN(net)
model.fit(input,output,n_epoch=10,show_metric=True, run_id='dense_model')
It works, but not the way I want. It's a DNN model. When I enter 0.95, I want the model to give me the corresponding prediction, for example [0,0,0,0,0,0,0,0,1]. However, when I enter 0.95, it says:
ValueError: Cannot feed value of shape (1,) for Tensor 'InputData/X:0', which has shape '(?, 2501)'
When I tried to understand this, I realised that my incorrectly structured model needs data of shape (1,2501) to predict.
What I want is to predict, for every element in the input, the corresponding element in the output. As you can see in the example dataset, for [0.99309823] the corresponding output is [0,0,0,0,0,0,0,0,1]. I want tflearn to train itself like this.
I may have structured the data or the model incorrectly (probably the dataset). I have explained everything I can and would really appreciate some help.
Your input data should be Nx1 dimensional (N = number of samples) to achieve this transformation ([0.99309823] --> [0,0,0,0,0,0,0,0,1]). Judging by its shape, your input data more likely consists of 1 sample with 2501 dimensions.
ValueError: Cannot feed value of shape (1,) for Tensor 'InputData/X:0', which has shape '(?, 2501)'
This error means that TensorFlow expects you to provide a tensor of shape (?, 2501), but you are feeding the network a vector of shape (1,).
Example modified code with dummy data:
import numpy as np
import tflearn
#creating dummy data
input_data = np.random.rand(1, 2501)
input_data = np.transpose(input_data) # now shape is (2501,1)
output_data = np.random.randint(9, size=2501)  # class indices 0..8, one per sample
n_values = 9
output_data = np.eye(n_values)[output_data]
# checking the shapes
print(input_data.shape)  # (2501, 1)
print(output_data.shape)  # (2501, 9)
input_layer = tflearn.input_data(shape=[None,1]) # now network is expecting ( Nx1 )
hidden1 = tflearn.fully_connected(input_layer,1205,activation='ReLU', regularizer='L2', weight_decay=0.001)
dropout1 = tflearn.dropout(hidden1,0.8)
hidden2 = tflearn.fully_connected(dropout1,1205,activation='ReLU', regularizer='L2', weight_decay=0.001)
dropout2 = tflearn.dropout(hidden2,0.8)
softmax = tflearn.fully_connected(dropout2,9,activation='softmax')
# Regression with SGD
sgd = tflearn.SGD(learning_rate=0.1,lr_decay=0.96, decay_step=1000)
top_k=tflearn.metrics.Top_k(3)
net = tflearn.regression(softmax,optimizer=sgd,metric=top_k,loss='categorical_crossentropy')
model = tflearn.DNN(net)
model.fit(input_data, output_data, n_epoch=10,show_metric=True, run_id='dense_model')
A friend also warned me about the same thing as rcmalli. He says:
Reshape the input:
input = tf.reshape(input, (2501,1))
change
input_layer = tflearn.input_data(shape=[None,2501])
to
input_layer = tflearn.input_data(shape=[None, 1])
The variable dimension must be None. In your original setup, 2501 is the size of your dataset (the number of samples; I translated this from another language, but you get the idea), while 1 is the fixed size of each input.
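The reshaping advice in this thread can be checked on dummy data without TFLearn. The following is a minimal NumPy sketch; the random values and the variable names are assumptions for illustration. It turns one row of 2501 values into 2501 samples of one feature each, builds matching one-hot labels, and shows the shape a single value would need in order to match an input layer declared as [None, 1].

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes = 2501, 9

# One row of 2501 values, like the original (1, 2501) input.
raw = rng.random((1, n_samples))

# Each value becomes its own sample with a single feature.
features = raw.reshape(-1, 1)
print(features.shape)

# One-hot labels: row i of the identity matrix encodes class i.
class_ids = rng.integers(0, n_classes, size=n_samples)
labels = np.eye(n_classes)[class_ids]
print(labels.shape)

# A single value to predict on still needs a batch axis and a feature axis.
single = np.array([[0.95]])
print(single.shape)
```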
I am implementing an RNN and, contrary to the examples I have found, which minimize only the cost for the output in the last step, I want to compute a prediction (and the cost) for the output at every timestep:
x = tf.placeholder ("float", [features_dimension, None, n_timesteps])
y = tf.placeholder ("float", [labels_dimension, None, n_timesteps])
# Define weights
weights = {'out': tf.Variable (tf.random_normal ([N_HIDDEN, labels_dimension]))}
biases = {'out': tf.Variable (tf.random_normal ([labels_dimension]))}
def RNN (x, weights, biases):
    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (features_dimension, BATCH_SIZE, n_timesteps)
    # Required shape: `n_timesteps` tensors list of shape (BATCH_SIZE, features_dimension)
    # We make a division of the data to split it in individual vectors that
    # will be fed for each timestep
    # Permuting features_dimension and n_timesteps
    # Shape will be (n_timesteps, BATCH_SIZE, features_dimension)
    x = tf.transpose (x, [2, 1, 0])
    # Reshaping to (BATCH_SIZE*n_timesteps, features_dimension) (we are removing the depth dimension with this)
    x = tf.reshape(x, [BATCH_SIZE*n_timesteps, features_dimension])
    # Split the previous 2D tensor to get a list of `n_timesteps` tensors of
    # shape (batch_size, features_dimension).
    x = tf.split (x, n_timesteps, 0)
    # Define a lstm cell with tensorflow
    lstm_cell = rnn.BasicLSTMCell (N_HIDDEN, forget_bias=1.0)
    # Get lstm cell output
    outputs, states = rnn.static_rnn (lstm_cell, x, dtype=tf.float32)
    # Linear activation; outputs contains the array of outputs for all the
    # timesteps
    pred = tf.matmul (outputs, weights['out']) + biases['out']
However, the object outputs is a list of n_timesteps tensors, so the line pred = tf.matmul (outputs, weights['out']) + biases['out'] throws the error
ValueError: Shape must be rank 2 but is rank 3 for 'MatMul' (op: 'MatMul') with input shapes: [100,128,16], [16,1].
How can I do this multiplication?
The solution is to tf.stack the list of tensors into a 3d tensor and then use tf.map_fn to apply the multiplication operation on each 2d tensor along dimension 0:
# Transform the list into a 3D tensor with dimensions (n_timesteps, batch_size, N_HIDDEN)
outputs = tf.stack(outputs)
def pred_fn(current_output):
    return tf.matmul(current_output, weights['out']) + biases['out']
# Use tf.map_fn to apply pred_fn to each tensor in outputs, along dimension 0 (timestep dimension)
pred = tf.map_fn(pred_fn, outputs)
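The stack-then-map approach above can be sketched in NumPy to check the resulting shapes. The dimensions below are taken from the error message in the question (100 timesteps, batch of 128, 16 hidden units, 1 label dimension); the random data and the names w_out and b_out are assumptions. The sketch also compares the per-timestep loop with a single einsum contraction over the hidden axis, which is one possible alternative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_timesteps, batch_size, n_hidden, labels_dimension = 100, 128, 16, 1

# A list with one (batch_size, n_hidden) array per timestep, like `outputs`.
outputs = [rng.normal(size=(batch_size, n_hidden)) for _ in range(n_timesteps)]
w_out = rng.normal(size=(n_hidden, labels_dimension))
b_out = rng.normal(size=(labels_dimension,))

# Stack the list into one array of shape (n_timesteps, batch_size, n_hidden).
stacked = np.stack(outputs)

# Apply the same linear map to every timestep, one 2D slice at a time.
pred_per_step = np.stack([step @ w_out + b_out for step in stacked])

# The same result as a single contraction over the hidden axis.
pred_einsum = np.einsum('tbh,hl->tbl', stacked, w_out) + b_out
print(pred_per_step.shape)
```

TensorFlow provides tf.einsum with a similar subscript notation, so a comparable one-line contraction may be possible there as well, depending on the version in use.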