I want to train a binary classifier using Keras and my training data is of shape (2000,2,128) and labels of shape (2000,) as Numpy arrays.
The idea is to train such that embeddings together in a single array means they are either same or different, labelled using 0 or 1 respectively.
The training data looks like:
[[[0 1 2 ....128][129.....256]][[1 2 3 ...128][9 9 3 5...]].....]
and the labels looks like [1 1 0 0 1 1 0 0..].
Here is the code:
import keras
from keras.layers import Input, Dense
from keras.models import Model
frst_input = Input(shape=(128,), name='frst_input')
scnd_input = Input(shape=(128,),name='scnd_input')
x = keras.layers.concatenate([frst_input, scnd_input])
x = Dense(128, activation='relu')(x)
x=(Dense(1, activation='softmax'))(x)
model=Model(inputs=[frst_input, scnd_input], outputs=[x])
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
loss_weights=[ 0.2],metrics=['accuracy'])
I am getting the following error while running this code:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[[ 0.07124118, -0.02316936, -0.12737238, ..., 0.15822273,
0.00129827, -0.02457245],
[ 0.15869428, -0.0570458 , -0.10459555, ..., 0.0968155 ,
0.0183982 , -0.077924...
How can I resolve this issue? Is my code correct to train the classifier using two inputs to classify?
Well, you have two options here:
1) Reshape the training data to (2000, 128*2) and define only one input layer:
X_train = X_train.reshape(-1, 128*2)
inp = Input(shape=(128*2,))
x = Dense(128, activation='relu')(inp)
x = Dense(1, activation='sigmoid'))(x)
model=Model(inputs=[inp], outputs=[x])
2) Define two input layers, as you have already done, and pass a list of two input arrays when calling fit method:
# assuming X_train have a shape of `(2000, 2, 128)` as you suggested
model.fit([X_train[:,0], X_train[:,1]], y_train, ...)
Further, since you are doing binary classification here, you need to use sigmoid as the activation of last layer (i.e. using softmax in this case would always outputs 1, since softmax normalizes the outputs such that their sum equals to one).
Related
I am trying to build an RNN based model in Tensorflow that takes a sequence of categorical values as an input, and sequence of categorical values as the output.
For example, if I have sequence of 30 values, the first 25 would be the training data, and the last 5 would be the target. Imagine the data is something like a person pressing keys on a computer keyboard and recording their key presses over time.
I've tried to feed the training data and targets into this model in different shapes, and I always get an error that indicates the data is in the wrong shape.
I've included a code sample that should run and demonstrate what I'm trying to do and the failure I'm seeing.
In the code sample, I've used windows for batches. So if there are 90 values in the sequence, the first 25 values would be the training data for the first batch, and the next 5 values would be the target. This next batch would be the next 30 values (25 training values, 5 target values).
import numpy as np
import tensorflow as tf
from tensorflow import keras
num_categories = 20
data_sequence = np.random.choice(num_categories, 10000)
def create_target(batch):
X = tf.cast(batch[:,:-5][:,:,None], tf.float32)
Y = batch[:,-5:][:,:,None]
return X,Y
def add_windows(data):
data = tf.data.Dataset.from_tensor_slices(data)
return data.window(20, shift=1, drop_remainder=True)
dataset = tf.data.Dataset.from_tensor_slices(data_sequence)
dataset = dataset.window(30, drop_remainder=True)
dataset = dataset.flat_map(lambda x: x.batch(30))
dataset = dataset.batch(5)
dataset = dataset.map(create_target)
model = keras.models.Sequential([
keras.layers.SimpleRNN(20, return_sequences=True),
keras.layers.SimpleRNN(20, return_sequences=True),
keras.layers.TimeDistributed(keras.layers.Dense(num_categories, activation="softmax"))
])
optimizer = keras.optimizers.Adam()
model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer)
model.fit(dataset, epochs=1)
The error I get when I run the above code is
Node: 'sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits'
logits and labels must have the same first dimension, got logits shape [125,20] and labels shape [25]
I've also tried the following model, but the errors are similar.
model = keras.models.Sequential([
keras.layers.SimpleRNN(20, return_sequences=True),
keras.layers.SimpleRNN(20),
keras.layers.Dense(num_categories, activation="softmax"))
])
Does anybody have any recommendations about what I need to do to get this working?
Thanks.
I figured out the issue. The size of the time dimension needs to be the same for the training data and the target.
If you look at my original example code, the training data has these shapes
X.shape = (1, 25, 1)
Y.shape = (1, 5, 1)
To fix it, the time dimension should be the same.
X.shape = (1, 15, 1)
Y.shape = (1, 15, 1)
Here is the updated function that will let the model train. Note that all I did was update the array sizes so they are equally sized. The value of 15 is used because the original array length is 30.
def create_target(batch):
X = tf.cast(batch[:,:-15][:,:,None], tf.float32)
Y = batch[:,-15:][:,:,None]
return X,Y
Only the first output parameter is learned to be properly estimated during training of a multi regression output net. Second and subsequent parameters only seem to follow first parameter. It seems, that ground truth for second output parameter is not used during training. How do I shape tf.data.Dataset and input it into model.fit() function so second output parameter is trained?
import tensorflow as tf
import pandas as pd
from tensorflow import keras
from keras import layers
#create dataset from csv
file = pd.read_csv( 'minimalDataset.csv', skipinitialspace = True)
input = file["input"].values
output1 = file["output1"].values
output2 = file["output2"].values
dataset = tf.data.Dataset.from_tensor_slices((input, (output1, output2))).batch(4)
#create multi output regression net
input_layer = keras.Input(shape=(1,))
x = layers.Dense(20, activation="relu")(input_layer)
x = layers.Dense(60, activation="relu")(x)
output_layer = layers.Dense(2)(x)
model = keras.Model(input_layer, output_layer)
model.compile(optimizer="adam", loss="mean_squared_error")
#train model and make prediction (deliberately overfitting to illustrate problem)
model.fit(dataset, epochs=500)
prediction = model.predict(dataset)
minimalDataset.csv and predictions:
input
output1
output2
prediction_output1
prediction_output2
0
-1
1
-0.989956
-0.989964
1
2
0
1.834444
1.845085
2
0
2
0.640249
0.596099
3
1
-1
0.621426
0.646796
If I create two independent dense final layers the second parameter is learned accurately but I get two losses:
output_layer = (layers.Dense(1)(x), layers.Dense(1)(x))
Note: I want to use tf.data.Dataset because I build a 20k image/csv with it and do per-element transformations as preprocessing.
tf.data.Dataset.from_tensor_slices() slices along the first dimension. Because of this the input and output tensors need to be transposed:
dataset = tf.data.Dataset.from_tensor_slices((tf.transpose(input), (tf.transpose([output1, output2])))).batch(4)
I am trying to train a network on the Swiss Roll dataset with three features X = [x1, x2, x3] for the classification task. There are four classes with labels 1, 2, 3, 4, and the vector y contains the labels for all the data.
A row in the X matrix looks like this:
-5.2146470e+00 7.0879738e+00 6.7292474e+00
The shape of X is (100, 3), and the shape of y is (100,).
I want to use Radial Basis Functions to train this model. I have used the custom RBFLayer from this StackOverflow answer (also see this explanation) to build the RBFLayer. I want to use a couple of Keras Dense layers to build the network for classification.
What I have tried so far
I have used a Dense layer for the first layer, followed by the custom RBFLayer, and two other Dense layers. Here's the code:
model = Sequential()
model.add((Dense(100, input_dim=3)))
# number of units = 10, gamma = 0.05
model.add(RBFLayer(10,0.05))
model.add(Dense(15, activation='relu'))
model.add(Dense(1, activation='softmax'))
This model gives me zero accuracy. I think there is something wrong with the model architecture, but I can't figure out what is the issue.
Also, I thought the number of units in the last Dense layer should match the number of classes, which is 4 in this case. But when I set the number of units to 4 in the last layer, I get the following error:
ValueError: Shapes (None, 1) and (None, 4) are incompatible
Can you help me with this model architecture?
I faced the same issue while practicing with multi-class classification. Where I had 7 features and the model classifies into 7 classes. I tried encoding the labels and it fixed the issue.
First import LabelEncoder class from sklearn and import to_categorical from tensorflow
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical
Then, initialize an object to the LabelEncoder class and transform your labels before fitting and training the model.
encoder = LabelEncoder()
encoder.fit(y)
y = encoder.transform(y)
y = to_categorical(y)
Note that you have to use np.argmax for getting the actual predicted classification. in my case, the prediction is stored in variable called res
res = np.argmax(res, axis=None, out=None)
You can get your actual predicted class after this line. Looking forward to help you. Hope it solved your problem.
There are four classes with labels 1, 2, 3, 4, and the vector y contains the labels for all the data.
The simplest solution for input output matching is that you print the shape of the inputs and output for a single batch and then compare.
RBF layer should have no problem because output is taken from last dense layer rather then RBF layer.
With classification problem you must have last nodes equal to classes in regression the last node is 1 sometimes.
you should print
pseudo code
print(input.shape)
compare it with
print(model.input_shape)
then at output
print(output.shape)
then compare it with
print(model.predict(input).shape)
you can find the correct syntax at keras docs these are approx correct syntax / pseudo
I am getting an error relating to shapes whilst defining a very simple network using Tensorflow 2.
My code is:
import tensorflow as tf
import pandas as pd
data = pd.read_csv('data.csv')
target = data.pop('result')
target = tf.keras.utils.to_categorical(target.values, num_classes=3)
data_set = tf.data.Dataset.from_tensor_slices((data.values, target))
model = tf.keras.Sequential([
tf.keras.layers.Input(shape=data.shape[1:]),
tf.keras.layers.Dense(12, activation='relu'),
tf.keras.layers.Dense(3, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.fit(data_set, epochs=5)
The call to fit() throws the following error:
ValueError: Input 0 of layer sequential is incompatible with the layer: expected axis -1 of input shape to have value 12 but received input with shape [12, 1]
Walking through the code:
The input CSV file has thirteen columns - with the last being the label
This is converted to a 3 bit one-hot encoding
The Dataset is constructed of two Tensors - one of shape (12,) and the other of shape (3,)
The network Input layer defines it's expected shape as be the value data shape ignoring the first axis which is the batch size
I am stumped about why there is mismatch between the shape of the data and the expected data shape for the network - especially as the latter is defined by reference to the former.
Add .batch() at the end of the dataset:
data_set = tf.data.Dataset.from_tensor_slices((data.values, target)).batch(8)
I have a Keras LSTM model that contains multiple outputs.
The model is defined as follows:
outputs=[]
main_input = Input(shape= (seq_length,feature_cnt), name='main_input')
lstm = LSTM(32,return_sequences=True)(main_input)
for _ in range((output_branches)): #output_branches is the number of output branches of the model
prediction = LSTM(8,return_sequences=False)(lstm)
out = Dense(1)(prediction)
outputs.append(out)
model = Model(inputs=main_input, outputs=outputs)
model.compile(optimizer='rmsprop',loss='mse')
I have a problem when reshaping the output data.
The code for reshaping the output data is:
y=y.reshape((len(y),output_branches,1))
I got the following error:
ValueError: Error when checking model target: the list of Numpy arrays
that you are passing to your model is not the size the model expected.
Expected to see 5 array(s), but instead got the following list of 1
arrays: [array([[[0.29670931],
[0.16652206],
[0.25114482],
[0.36952324],
[0.09429612]],
[[0.16652206],
[0.25114482],
[0.36952324],
[0.09429612],...
How can I correctly reshape the output data?
It depends on how y is structured initially. Here I assume that y is a single-valued label for each sequence in batch.
When there are multiple inputs/outputs model.fit() expects a corresponding list of inputs/outputs to be given. np.split(y, output_branches, axis=-1) in a following fully reproducible example does exactly this - for each batch splits a single list of outputs into a list of separate outputs where each output (in this case) is 1-element list:
import tensorflow as tf
import numpy as np
tf.enable_eager_execution()
batch_size = 100
seq_length = 10
feature_cnt = 5
output_branches = 3
# Say we've got:
# - 100-element batch
# - of 10-element sequences
# - where each element of a sequence is a vector describing 5 features.
X = np.random.random_sample([batch_size, seq_length, feature_cnt])
# Every sequence of a batch is labelled with `output_branches` labels.
y = np.random.random_sample([batch_size, output_branches])
# Here y.shape() == (100, 3)
# Here we split the last axis of y (output_branches) into `output_branches` separate lists.
y = np.split(y, output_branches, axis=-1)
# Here y is not a numpy matrix anymore, but a list of matrices.
# E.g. y[0].shape() == (100, 1); y[1].shape() == (100, 1) etc...
outputs = []
main_input = tf.keras.layers.Input(shape=(seq_length, feature_cnt), name='main_input')
lstm = tf.keras.layers.LSTM(32, return_sequences=True)(main_input)
for _ in range(output_branches):
prediction = tf.keras.layers.LSTM(8, return_sequences=False)(lstm)
out = tf.keras.layers.Dense(1)(prediction)
outputs.append(out)
model = tf.keras.models.Model(inputs=main_input, outputs=outputs)
model.compile(optimizer='rmsprop', loss='mse')
model.fit(X, y)
You might need to play around with axes as you didn't specify how exactly your data look like.
EDIT:
As author is looking for an answer drawing from official sources, it's mentioned here (not explicitly though, it only mentions what the Dataset should yield, hence - what kind of input structure model.fit() expects):
When calling fit with a Dataset object, it should yield either a tuple of lists like ([title_data, body_data, tags_data], [priority_targets, dept_targets]) or a tuple of dictionaries like ({'title': title_data, 'body': body_data, 'tags': tags_data}, {'priority': priority_targets, 'department': dept_targets}).
Since you have an amount of outputs equal to output_branches, your output data must be a list with the same amount of arrays.
Basically, if the output data is in the middle dimension as your reshape suggests:
y = [ y[:,i] for i in range(output_branches)]