I am replicating a PyTorch model in Keras and have trouble seeing where the extra dimension comes from.
This is how my code looks so far:
class Attention(tf.keras.Model):
    def __init__(self, input_shape):
        super(Attention, self).__init__()
        in_features=input_shape[-1]
        small_in_features = max(math.floor(in_features/10), 1)
        self.d_k = small_in_features
        query = tf.keras.models.Sequential()
        query.add(tf.keras.layers.Dense(in_features))
        query.add(tf.keras.layers.Dense(small_in_features,activation="tanh"))
        self.query= query
        self.key = tf.keras.layers.Dense(small_in_features)

    def call(self, inp):
        # inp.shape should be (B,N,C)
        q = self.query(inp) # (B,N,C/10)
        k = self.key(inp) # B,N,C/10
        k = tf.transpose(k)
        print(q)
        print(k)
        x = tf.linalg.matmul(q, k) / math.sqrt(self.d_k) # B,N,N
        x = tf.nn.softmax(x) # over rows
        x = tf.transpose(x)
        x = tf.linalg.matmul(x, inp) # (B, N, C)
        return x
But when I add it to my Sequential model, I get this error:
ValueError: Dimensions must be equal, but are 1 and 256 for '{{node attention_19/MatMul}} = BatchMatMulV2[T=DT_FLOAT, adj_x=false, adj_y=false](attention_19/sequential_36/Identity, attention_19/transpose)' with input shapes: [?,256,1], [1,256,?].
I printed q and k, and the output looks like this:
Tensor("attention_19/sequential_36/Identity:0", shape=(None, 256, 1), dtype=float32)
Tensor("attention_19/transpose:0", shape=(1, 256, None), dtype=float32)
So they are 3-dimensional, with one dimension unfilled. I don't quite understand why this happens.
How can I "remove" the extra dimension, or otherwise get this custom layer to work?
Note: The original code seems to use 3-dimensional input, but I want 2-dimensional input.
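Judging from the printed shapes, one likely cause is that tf.transpose without a perm argument reverses all axes, so a (B, N, C/10) tensor becomes (C/10, N, B) rather than (B, C/10, N). The sketch below uses NumPy, whose np.transpose has the same default, to show the difference. The sizes B=4, N=256 and d=1 are assumptions chosen to mirror the error message. In TensorFlow the corresponding options would be tf.transpose(k, perm=[0, 2, 1]) or tf.linalg.matmul(q, k, transpose_b=True).

```python
import numpy as np

B, N, d = 4, 256, 1  # assumed batch size, sequence length, reduced feature size

q = np.ones((B, N, d))
k = np.ones((B, N, d))

# Default transpose reverses every axis: (B, N, d) -> (d, N, B)
k_reversed = np.transpose(k)

# Swapping only the last two axes keeps the batch axis in front: (B, d, N)
k_swapped = np.transpose(k, (0, 2, 1))

# Batched matrix product: one (N, N) score matrix per batch element
scores = np.matmul(q, k_swapped) / np.sqrt(d)

print(k_reversed.shape)  # (1, 256, 4)
print(k_swapped.shape)   # (4, 1, 256)
print(scores.shape)      # (4, 256, 256)
```

The same consideration applies to the later tf.transpose(x) on the (B, N, N) scores.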
Related
I am trying to tidy up my code by moving from the Keras functional API to the subclassing API. The class I came up with so far is below:
class FeedForwardNN(Model):
    def __init__(self, params):
        super().__init__()
        self.params = params
        self.layout = params['layout']
        # Define layers
        self.dense = Dense(units=params['layout'][1],
                           activation=params['activation'],
                           kernel_initializer=params['initializer'])
        self.output_layer = Dense(units=params['layout'][-1],
                                  kernel_initializer=params['initializer'])
        self.dropout = Dropout(params['dropout'])
        self.batch_norm = BatchNormalization()

    def call(self, x):
        for layer in self.layout[1:-1]:
            x = self.dropout(self.dense(x))
            if self.params['batch_norm']:
                x = self.batch_norm(x)
        x = self.output_layer(x)
        return x
Here layout is a list of the number of neurons in each layer (including the input and output layers).
However, when fitting the model, the following error is raised:
ValueError: Input 0 of layer "dense" is incompatible with the layer: expected axis -1 of input shape to have value 5, but received input with shape (None, 100)
Call arguments received:
• x=tf.Tensor(shape=(None, 5), dtype=float32)
which seems to occur on the line:
x = self.dropout(self.dense(x))
I checked the shape of the training data X that is passed to the fit() method, and it appears to have the right shape i.e. (number of observations, number of predictors).
Does anyone have an idea of where my mistake is?
The problem is that you are reusing the same self.dense layer on every iteration of your for loop:
for layer in self.layout[1:-1]:
    x = self.dropout(self.dense(x))
After the first iteration, x has shape (batch, 100). In the second iteration, instead of passing this x to a second Dense layer (which you never created), you pass it to the first Dense layer again. That layer was built for inputs of shape (batch, 5), which causes the error.
You can create a list of Dense layers in __init__ (after self.layout has been set), one per hidden layer size:
self.denses = [Dense(units=units,
                     activation=params['activation'],
                     kernel_initializer=params['initializer'])
               for units in self.layout[1:-1]]
and call them in sequence:
for dense_layer in self.denses:
    x = self.dropout(dense_layer(x))
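To see the shape argument concretely, here is a minimal NumPy sketch in which each Dense layer is stood in for by a plain weight matrix. The first two layer sizes (5 and 100) come from the error message; the rest of the layout and the batch size of 8 are assumptions for illustration.

```python
import numpy as np

layout = [5, 100, 50, 1]   # assumed: input, two hidden layers, output
rng = np.random.default_rng(0)
x = rng.normal(size=(8, layout[0]))   # batch of 8 samples with 5 features

# Reusing a single weight matrix works once and fails the second time
shared_w = rng.normal(size=(layout[0], layout[1]))   # (5, 100)
h = x @ shared_w                                      # (8, 100)
try:
    h @ shared_w                                      # (8, 100) @ (5, 100)
    reuse_failed = False
except ValueError:
    reuse_failed = True

# One weight matrix per consecutive pair of layer sizes
weights = [rng.normal(size=(n_in, n_out))
           for n_in, n_out in zip(layout[:-1], layout[1:])]
out = x
for w in weights:
    out = out @ w

print(reuse_failed)  # True
print(out.shape)     # (8, 1)
```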
I have input data for my 2D CNN model, say X_train with shape torch.Size([716, 50, 50]).
My model is:
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=4,stride=1,padding = 1)
        self.mp1 = nn.MaxPool2d(kernel_size=4,stride=2)
        self.conv2 = nn.Conv2d(32,64, kernel_size=4,stride =1)
        self.mp2 = nn.MaxPool2d(kernel_size=4,stride=2)
        self.fc1= nn.Linear(2304,256)
        self.dp1 = nn.Dropout(p=0.2)
        self.fc2 = nn.Linear(256,10)

    def forward(self, x):
        in_size = x.size(0)
        x = F.relu(self.mp1(self.conv1(x)))
        x = F.relu(self.mp2(self.conv2(x)))
        x = x.view(in_size,-1)
        x = F.relu(self.fc1(x))
        x = self.dp1(x)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
but when I run the model, I always get this error:
---> x = F.relu(self.mp1(self.conv1(x)))
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 1, 4, 4], but got 3-dimensional input of size [64, 50, 50] instead
I understand that my input to the model has size 64 (the batch size) by 50*50 (the size of each input, which in this case is a signal picture).
But I don't understand why it still requires 4-dimensional input when I have set in_channels for nn.Conv2d to 1.
How can I solve this input dimension problem, or change the dimension requirement of the model input?
Whether in_channels is 1 or 42 does not matter: it is still an added dimension. It is useful to read the documentation in this respect.
In- and output are of the form N, C, H, W
N: batch size
C: channels
H: height in pixels
W: width in pixels
So you need to add the dimension in your case:
# Add a dimension at index 1
x = x.unsqueeze(1)
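As a quick illustration of what adding the channel axis does to the shape, here is a NumPy sketch. np.expand_dims(x, 1) is used as a stand-in for the x.unsqueeze(1) call above, and the zero-filled batch of 64 inputs of size 50x50 is made up.

```python
import numpy as np

# Assumed batch: 64 single-channel inputs of 50x50, stored without a channel axis
x = np.zeros((64, 50, 50), dtype=np.float32)

# Insert a length-1 channel axis at index 1: (N, H, W) -> (N, C=1, H, W)
x_nchw = np.expand_dims(x, 1)

# Equivalent reshape that spells out the target shape explicitly
x_reshaped = x.reshape(-1, 1, 50, 50)

print(x_nchw.shape)      # (64, 1, 50, 50)
print(x_reshaped.shape)  # (64, 1, 50, 50)
```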
That's the problem: setting in_channels=1 does not mean the channel dimension doesn't exist.
Expanding the dimensions of your data to [64, 1, 50, 50] should solve your problem.
Use .view() on the input tensor, e.g. x.view(-1, 1, 50, 50).
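Once the input is 4-dimensional, it may be worth checking that the flattened size reaching fc1 matches its in_features. The sketch below applies the usual output-size formula for convolution and pooling layers without dilation, floor((size + 2*padding - kernel) / stride) + 1, to the layers in the question. The helper name out_size is hypothetical. For 50x50 inputs it gives 64 * 9 * 9 = 5184 features rather than 2304, so under these assumptions nn.Linear(2304, 256) would also need adjusting.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (no dilation, floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 50                                             # input height and width
size = out_size(size, kernel=4, stride=1, padding=1)  # conv1 -> 49
size = out_size(size, kernel=4, stride=2)             # mp1   -> 23
size = out_size(size, kernel=4, stride=1)             # conv2 -> 20
size = out_size(size, kernel=4, stride=2)             # mp2   -> 9

flat_features = 64 * size * size                      # 64 channels after conv2
print(size, flat_features)  # 9 5184
```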
I have written a generator class with Keras. Before returning X, y from __getitem__ I double-checked the shapes of the X's and y's and they are all right, but the generator produces dimension-mismatch warnings and errors.
(Colab Code to reproduce: https://colab.research.google.com/drive/1bSJm44MMDCWDU8IrG2GXKBvXNHCuY70G?usp=sharing)
My training and validation generators are pretty much the same:
class ValidGenerator(Sequence):
    def __init__(self, df, batch_size=64, num_classes=None, shuffle=True):
        self.batch_size = batch_size
        self.df = df
        self.indices = self.df.index.tolist()
        self.num_classes = num_classes
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        return int(len(self.indices) // self.batch_size)

    def __getitem__(self, index):
        index = self.index[index * self.batch_size:(index + 1) * self.batch_size]
        batch = [self.indices[k] for k in index]
        X, y = self.__get_data(batch)
        return X, y

    def on_epoch_end(self):
        self.index = np.arange(len(self.indices))
        if self.shuffle == True:
            np.random.shuffle(self.index)

    def __get_data(self, batch):
        # some logic is written here
        # that prepares 3 X features and 3 Y outputs
        X = [input_array_1,input_array_2,input_array_3]
        y = [out_1,out_2,out_3]
        #print(len(X))
        return X, y
I return a tuple X, y, where X holds 3 input features and y holds 3 output features, so the shape of X is (3,32,10,1).
I am using the functional API to build the model (I have things like concatenation and multiple inputs/outputs, which isn't possible with Sequential) with the following structure:
When I try to fit the model with the generator using the following code
train_datagen = TrainGenerator(df=train_df, batch_size=32, num_classes=None, shuffle=True)
valid_datagen = ValidGenerator(df=train_df, batch_size=32, num_classes=None, shuffle=True)
model.fit(train_datagen, epochs=2,verbose=1,callbacks=[checkpoint,es])
I get these warnings and errors, which don't go away:
Epoch 1/2
WARNING:tensorflow:Model was constructed with shape (None, 10) for input Tensor("input_1:0", shape=(None, 10), dtype=float32), but it was called on an input with incompatible shape (None, None, None).
WARNING:tensorflow:Model was constructed with shape (None, 10) for input Tensor("input_2:0", shape=(None, 10), dtype=float32), but it was called on an input with incompatible shape (None, None, None).
WARNING:tensorflow:Model was constructed with shape (None, 10) for input Tensor("input_3:0", shape=(None, 10), dtype=float32), but it was called on an input with incompatible shape (None, None, None).
...
...
call
return super(RNN, self).call(inputs, **kwargs)
/home/eduardo/.virtualenvs/kgpu3/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py:975
call
input_spec.assert_input_compatibility(self.input_spec, inputs,
/home/eduardo/.virtualenvs/kgpu3/lib/python3.8/site-packages/tensorflow/python/keras/engine/input_spec.py:176
assert_input_compatibility
raise ValueError('Input ' + str(input_index) + ' of layer ' +
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, None, None, 88]
I have rechecked the whole code and it isn't possible to have an input of shape (None, None, None) as in the warning or the error; my input dimension is (3,32,10,1).
Update
I have also tried writing a generator function in plain Python and got exactly the same error.
My generator function:
def generate_arrays_from_file(batchsize,df):
    #print(bat)
    inputs = []
    targets = []
    batchcount = 0
    while True:
        df3 = df.loc[np.arange(batchcount*batchsize,(batchcount*batchsize)+batchsize)]
        #Some pre processing
        X = [input_array_1,input_array_2,input_array_3]
        y = [out_1,out_2,out_3]
        yield X,y
        batchcount = batchcount +1
It seems like something is going wrong internally with Keras (maybe because I am using the functional API).
Update 2
I also tried returning tuples
X = (input1_X,input2_X,input3_X)
y = (output1_y,output2_y,output3_y)
and also named inputs/outputs, but that doesn't work either
X = {"input_1": input1_X, "input_2": input2_X,"input_3": input3_X}
y = {"output_1": output1_y, "output_2": output2_y,"output_3": output3_y}
Note about the problem formulation:
Changing the individual X features to shape (32,10) instead of (32,10,1) might get rid of this error, but that is not what I want: it changes my problem (I would no longer have 10 time steps with one feature each).
Keras uses None for dynamic dimensions.
As you can see in the model.summary() output, the model expects shape (None, 10) for each of your inputs, which is two-dimensional including the batch dimension. Stacked across the three inputs, that amounts to three-dimensional data of shape (3, batch, 10).
But you are feeding four-dimensional data of shape (3, 32, 10, 1).
I would guess that your model doesn't split your input list into three separate inputs. Try changing your inputs to a tuple:
X = (input_array_1,input_array_2,input_array_3)
In order to resolve this error:
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, None, None, 88]
TrainGenerator should be changed in the following way.
Current code:
input1_X = np.array(df3['input1_X'].to_list()).reshape(dlen,pad_len,1)
input2_X = np.array(df3['input2_X'].to_list()).reshape(dlen,pad_len,1)
input3_X = np.array(df3['input3_X'].to_list()).reshape(dlen,pad_len,1)
Should be changed to:
input1_X = np.array(df3['input1_X'].to_list()).reshape(dlen,pad_len)
input2_X = np.array(df3['input2_X'].to_list()).reshape(dlen,pad_len)
input3_X = np.array(df3['input3_X'].to_list()).reshape(dlen,pad_len)
The reason is that each of the 3 Inputs expects a 2-dimensional array, but the generator provides a 3-dimensional one. The expected shape is (batch_size, 10).
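The difference between the two reshapes can be checked with NumPy alone. In this sketch dlen=32 and pad_len=10 are assumed from the shapes quoted in the question, and the data is random.

```python
import numpy as np

dlen, pad_len = 32, 10   # assumed batch size and padded sequence length
rng = np.random.default_rng(0)
column = [rng.integers(0, 100, size=pad_len).tolist() for _ in range(dlen)]

# Shape produced by the current generator: a trailing axis of length 1
with_feature_axis = np.array(column).reshape(dlen, pad_len, 1)

# Shape matching an input declared as (None, 10), i.e. (batch_size, 10)
without_feature_axis = np.array(column).reshape(dlen, pad_len)

# One array per model input, collected in a tuple
X = (without_feature_axis, without_feature_axis.copy(), without_feature_axis.copy())

print(with_feature_axis.shape)     # (32, 10, 1)
print(without_feature_axis.shape)  # (32, 10)
print([a.shape for a in X])        # [(32, 10), (32, 10), (32, 10)]
```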
I had a similar issue with a custom generator that just had to pass a NumPy array of size 10 as input and a single output.
To solve this problem I had to transform the shape of the two arrays passed to the neural network like this:
def slides_generator(integer_list):
    # stuff happens
    x = np_ts[np_index:np_index+10] # numpy array
    y = np_ts[np_index+10] # numpy array
    yield tf.convert_to_tensor(x)[np.newaxis, ...], tf.convert_to_tensor(y)[np.newaxis, ...]

doge_gen = slides_generator(integer_list) #next(doge_gen)
Basically, you need to pass the two arrays with shape (None, size), so in my case they were (None,10) and (None,1). To achieve this I just passed two reshaped tensors.
You need the None dimension as the batch size.
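The effect of indexing with np.newaxis can be seen without TensorFlow. The sketch below uses a made-up series of 100 values and a window of 10; the function name windows is hypothetical. Note that a scalar target gains only one axis this way, so here it is reshaped explicitly to (1, 1).

```python
import numpy as np

def windows(series, size=10):
    """Yield (x, y) pairs that each carry a leading batch axis of length 1."""
    for start in range(len(series) - size):
        x = series[start:start + size]   # shape (size,)
        y = series[start + size]         # scalar target
        # np.newaxis prepends the batch axis: (size,) -> (1, size)
        yield x[np.newaxis, ...], np.reshape(y, (1, 1))

series = np.arange(100, dtype=np.float32)   # assumed toy time series
x0, y0 = next(windows(series))

print(x0.shape, y0.shape)  # (1, 10) (1, 1)
print(y0[0, 0])            # 10.0
```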
I am trying to build a CNN for a 1-D signal, but I am unable to understand the rank errors.
My program goes like this:
#Weights
def init_weights(shape):
    init_random_dist = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(init_random_dist)

#Bias
def init_bias(shape):
    init_bias = tf.constant(0.1,shape=shape)
    return tf.Variable(init_bias)

def conv1d(x,W):
    # x is the input acceleration data and W is the corresponding weight
    x = tf.cast(x, tf.float32)
    return tf.nn.conv1d(x,W,stride=1,padding='VALID')

def convolution_layer(input_x,shape):
    w = init_weights(shape)
    b = init_bias([shape[3]])
    return tf.nn.relu(conv1d(input_x,w)+b)
Now the placeholders:
x = tf.placeholder(tf.float32,shape=[1,1,200,1])
y_true = tf.placeholder(tf.float32,shape=[None,6])
While creating the first layer using con_layer_1 = convolution_layer(x,shape=[1,20,1,32]), I get a rank ValueError which I am unable to debug. The error statement is:
ValueError: Shape must be rank 4 but is rank 5 for 'conv1d_20/Conv2D' (op: 'Conv2D') with input shapes: [1,1,1,200,1], [1,1,20,1,32].
The input and weight shapes passed to tf.nn.conv1d are not right. The input to tf.nn.conv1d should have shape [batch_size, input_length, input_channels], and the weight tensor should have shape [filter_size, input_channels, output_channels]. So you need to change your code to:
def convolution_layer(input_x,shape):
    w = init_weights(shape)
    b = init_bias([shape[2]])
    return tf.nn.relu(conv1d(input_x,w)+b)

x = tf.placeholder(tf.float32,shape=[1,200,1])
y_true = tf.placeholder(tf.float32,shape=[None,6])
con_layer_1 = convolution_layer(x,shape=[20,1,32])
Note: You should try to use the tf.layers API, which takes care of creating the weights for you.
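To make the expected shapes concrete, here is a naive NumPy sketch of a 1-D 'VALID' convolution with stride 1 over an input of shape [batch, length, in_channels] and filters of shape [width, in_channels, out_channels]. It is meant for checking shapes only, not as a replacement for tf.nn.conv1d, and the function name conv1d_valid is hypothetical.

```python
import numpy as np

def conv1d_valid(x, w):
    """Naive 1-D cross-correlation, stride 1, no padding.

    x: [batch, length, in_channels]
    w: [width, in_channels, out_channels]
    returns: [batch, length - width + 1, out_channels]
    """
    batch, length, in_ch = x.shape
    width, w_in_ch, out_ch = w.shape
    assert in_ch == w_in_ch, "channel mismatch between input and filter"
    out_len = length - width + 1
    out = np.zeros((batch, out_len, out_ch), dtype=x.dtype)
    for t in range(out_len):
        # Contract the window and channel axes against the filter
        out[:, t, :] = np.einsum('bwc,wco->bo', x[:, t:t + width, :], w)
    return out

x = np.ones((1, 200, 1), dtype=np.float32)   # same shape as the placeholder
w = np.ones((20, 1, 32), dtype=np.float32)   # same shape as the weights
y = conv1d_valid(x, w)

print(y.shape)      # (1, 181, 32)
print(y[0, 0, 0])   # 20.0
```

The bias then needs one entry per output channel, which is why shape[2] is used for it in the corrected code.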
I got an error when trying to create a simple binary classifier for the XOR case using Theano. It reports a dimension mismatch, but I can't find out which variable causes it.
The strange part is that my program works when I change the number of neurons in the last layer. When I use 2 neurons in the last layer, change that layer to a softmax layer, and use the negative log likelihood (multiclass classification style), the program works fine.
This is my full code:
import numpy as np
import theano
import theano.tensor as T

class HiddenLayer(object):
    def __init__(self, input, nIn, nOut, is_last, W=None):
        self.input = input
        W_val = np.random.randn(nIn,nOut)*0.001
        b_val = np.zeros((nOut,))
        self.W = theano.shared(np.asarray(W_val,dtype=theano.config.floatX),
                               name='W',borrow=True)
        self.b = theano.shared(np.asarray(b_val,dtype=theano.config.floatX),
                               name='b',borrow=True)
        self.z = T.dot(input,self.W) + self.b
        if(is_last==0):
            self.output = T.switch(self.z < 0 , 0 ,self.z)
        else:
            self.output = T.nnet.sigmoid(self.z)
            self.y_pred = self.output > 0.5
        self.params = [self.W, self.b]

    def cost_function(self,y):
        return -T.mean(y*T.log(self.output)+(1-y)*T.log(1-self.output))

    def errors(self,y):
        return T.mean(T.neq(self.y_pred,y))

alfa = 1
epoch = 1000
neu = 5
inpx = np.array([[1,0],[1,1],[0,0],[0,1]])
inpy = np.array([1,0,0,1])
x = T.fmatrix('x')
y = T.ivector('y')
layer0 = HiddenLayer(
    input = x,
    nIn = 2,
    nOut = neu,
    is_last=0
)
layer1 = HiddenLayer(
    input = layer0.output,
    nIn = neu,
    nOut = 1,
    is_last=1
)
params = layer0.params + layer1.params
cost = layer1.cost_function(y)
grads = T.grad(cost, params)
updates = [(param_i, param_i - alfa * grad_i) for param_i, grad_i in zip(params, grads)]
eror = layer1.errors(y)
train_model = theano.function([x,y], [eror,cost],updates=updates,allow_input_downcast=True)
test_model = theano.function([x,y],[eror,layer1.y_pred],allow_input_downcast=True)
for i in xrange(epoch):
    etr,ctr = train_model(inpx, inpy)
    if i%(epoch/10)==0:
        print etr,ctr
et,pt = test_model(inpx,inpy)
print pt
and the error:
ValueError: Input dimension mis-match. (input[0].shape[1] = 1, input[1].shape[1] = 4)
Apply node that caused the error: Elemwise{neq,no_inplace}(sigmoid.0, DimShuffle{x,0}.0)
Toposort index: 41
Inputs types: [TensorType(float32, matrix), TensorType(int32, row)]
Inputs shapes: [(4L, 1L), (1L, 4L)]
Inputs strides: [(4L, 4L), (16L, 4L)]
Inputs values: [array([[ 0.94264328],
[ 0.99725735],
[ 0.5 ],
[ 0.95675617]], dtype=float32), array([[1, 0, 0, 1]])]
Outputs clients: [[Shape(Elemwise{neq,no_inplace}.0), Sum{acc_dtype=int64}(Elemwise{neq,no_inplace}.0)]]
Thank you in advance for any help.
Your problem is with your y and inpy variables. y is meant to be the expected output of the network. Your network is given a dataset with 4 elements, each having 2 features, so your input matrix has 4 rows and 2 columns. The network output is therefore a (4, 1) matrix, and y should have 4 rows as well. But you are using a vector, which Theano treats as a row vector of shape (1, 4) when combining it with a matrix, as the DimShuffle{x,0} in the error message shows. You need either to turn y into a column when computing the cost and errors (for example with y.dimshuffle(0, 'x')), or to define y as a matrix and pass inpy as a (4,1) matrix instead of a (4,) vector.
Hope this helps,
Best
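The shape mismatch can be reproduced outside Theano. In the NumPy sketch below, the predictions are taken from the error message above. NumPy silently broadcasts a (4, 1) array against a (4,) vector to (4, 4), whereas Theano raised an error, but the remedy is the same: give both operands the same shape.

```python
import numpy as np

# Network output from the error message: four rows, one column
output = np.array([[0.94264328], [0.99725735], [0.5], [0.95675617]])
y = np.array([1, 0, 0, 1])             # targets as a (4,) vector

y_pred = output > 0.5                  # shape (4, 1)

# Mismatched shapes: the vector is treated as a (1, 4) row
mismatched = np.not_equal(y_pred, y)   # broadcasts to (4, 4)

# Matching shapes: turn the targets into a (4, 1) column
y_col = y.reshape(-1, 1)
errors = np.not_equal(y_pred, y_col)   # shape (4, 1)

print(mismatched.shape)  # (4, 4)
print(errors.shape)      # (4, 1)
print(errors.mean())     # 0.25
```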