My understanding is that "sparse_categorical_crossentropy" fits my multi-class case without one-hot encoding. I also lowered the Adam learning rate in case it was overshooting the predictions.
I am not sure what I am misunderstanding or doing incorrectly.
My input data looks similar to this:
My output prediction results are labels: [1 2 3 4 5 6 7 8 9 10] (not one-hot encoded). Each number represents a class I want the network to end up choosing.
print(x_train.shape)
print(x_test.shape)
x_train = x_train.reshape(x_train.shape[0], round(x_train.shape[1]/5), 5)
x_test = x_test.reshape(x_test.shape[0], round(x_test.shape[1]/5), 5)
print(x_train.shape)
print(np.unique(y_train))
print(len(np.unique(y_train)))
input_shape = (x_train.shape[1], 5)
adam = keras.optimizers.Adam(learning_rate=0.0001)
model = Sequential()
model.add(Conv1D(512, 5, activation='relu', input_shape=input_shape))
model.add(Conv1D(512, 5, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(512, 5, activation='relu'))
model.add(Conv1D(512, 5, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='sparse_categorical_crossentropy', optimizer=adam, metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=25, validation_data=(x_test, y_test))
print(model.summary())
Here are the error results:
Model Layers (if it helps):
I see two main problems in your approach:
Your labels are from 1 to 10; they must start from 0 so that they fall in the range 0-9. This can be achieved simply by doing y_train - 1 and y_test - 1 (if y_train and y_test are numpy arrays).
The last layer of your network must be Dense(10, activation='softmax'), where 10 is the number of classes to predict and softmax produces a probability distribution over the classes in a multiclass problem.
Using sparse_categorical_crossentropy is fine because you have integer-encoded targets.
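A minimal sketch with both fixes applied (untested, reusing x_train, x_test, input_shape and adam exactly as defined in your snippet):
# shift labels from 1-10 down to 0-9
y_train = y_train - 1
y_test = y_test - 1
model = Sequential()
model.add(Conv1D(512, 5, activation='relu', input_shape=input_shape))
model.add(Conv1D(512, 5, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(512, 5, activation='relu'))
model.add(Conv1D(512, 5, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))  # one output unit per class
model.compile(loss='sparse_categorical_crossentropy', optimizer=adam, metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=25, validation_data=(x_test, y_test))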
Related
I'm trying to do binary classification by combining a CNN (Conv1D) with a GRU. My dataset looks like this:
X_train shape : (223461, 5)
y_train shape :(223461,)
The X_train looks like that, and the y_train contains the labels (0 and 1).
First I convert the train dataset:
dataset = X_train.values
dataset=dataset[1:]
dataset = dataset.astype('float32')
dataset
The same for y_train:
dataset_target = y_train.values
dataset_target=dataset_target[1:]
dataset_target = dataset_target.astype('float32')
dataset_target
Now the shapes are dataset.shape = (223460, 5) and dataset_target.shape = (223460,).
Then my model structure is:
verbose, epochs, batch_size = 0, 100, 64
n_timesteps, n_features, n_outputs = dataset.shape[0], dataset.shape[1], dataset_target.shape[0]
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape= (n_timesteps,n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(GRU(64))
model.add(Dropout(0.4))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(1, activation='sigmoid'))
opt = Adam(learning_rate=0.01)
model.compile(loss=tf.keras.losses.BinaryCrossentropy(), optimizer=opt , metrics=['accuracy'])
model.summary()
And when I try to fit the dataset to my model:
# fit network
model.fit(dataset, dataset_target, epochs=epochs, batch_size=batch_size, verbose=1)
# evaluate model
_, accuracy = model.evaluate(X_test, y_test, batch_size=batch_size, verbose=1)
#accuracy
I get an error: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 223460, 5), found shape=(64, 5)
Is the first axis of the dataset (223460 samples) actually time steps, and do you have 5 'channels' of data? In that case, it would help if you slice the dataset along the first axis and then assign a single label to each 'slice', for example the last value of the corresponding slice of y_train. In that case, n_timesteps would be the length of the slice, and the shape of the dataset would be something like (n_samples, n_timesteps, 5). Basically, Conv1D expects each training sample to be 2D, but in your case it's 1D, because the first dimension is just the number of samples.
I might have interpreted the dataset the wrong way. In that case, please clarify how it works so I can fix my suggestion.
Here's the example:
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.layers import Conv1D, MaxPooling1D, GRU, \
Dropout, Flatten, Dense
from tensorflow.keras import Sequential
from tensorflow.keras.optimizers import Adam
import numpy as np
X_train = np.random.normal(0, 1, (223461, 5))
y_train = np.random.randint(0, 2, 223461)
dataset = X_train[1:]
dataset_target = y_train[1:]
n_timesteps = 10
# Slice the data into non-overlapping windows of n_timesteps rows each,
# labelling every window with the last target value that falls inside it
n_samples = len(dataset) // n_timesteps
dataset = dataset[:n_samples * n_timesteps].reshape(n_samples, n_timesteps, dataset.shape[1])
dataset_target = dataset_target[:n_samples * n_timesteps].reshape(n_samples, n_timesteps)[:, -1]
Define and train the model:
def get_model(dataset, n_timesteps):
verbose, epochs, batch_size = 0, 100, 64
n_timesteps, n_features = dataset.shape[1], dataset.shape[2]
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape = (n_timesteps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(GRU(64))
model.add(Dropout(0.4))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(1, activation='sigmoid'))
opt = Adam(learning_rate=0.01)
model.compile(loss=tf.keras.losses.BinaryCrossentropy(), optimizer=opt , metrics=['accuracy'])
model.summary()
return model
verbose, epochs, batch_size = 0, 1, 64
model = get_model(dataset, n_timesteps)
model.fit(dataset, dataset_target, epochs=epochs, batch_size=batch_size, verbose=1)
Hope it helps!
I am trying to build a 1D CNN for a numerical dataset. My dataset has 520 rows and 13 features. Here is the code below.
It gives the following error:
"ValueError: Input 0 of layer sequential_21 is incompatible with the layer: : expected min_ndim=3, found ndim=2. Full shape received: (1, 13)"
How do I need to set input shape, or do I have to reshape X_train? Any help is highly appreciated.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)
n_features = 13
model = Sequential()
model.add(Conv1D(filters=1, kernel_size=1, activation='relu', input_shape=(1, n_features)))
model.add(Conv1D(filters=1, kernel_size=1, activation='relu'))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=1))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=20, batch_size=1)
yhat_classes = model.predict_classes(testX, verbose=0)
The problem is that batching your numpy dataset yields rows. You use batch size 1, so the generator yields 1 row, resulting in an array of shape (1, n_features), but you want shape (batch_size, 1, n_features).
Adding a dimension to the dataset before splitting it should fix the problem:
X = X.reshape(-1, 1, n_features)
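Concretely, a sketch of the flow (assuming X is the (520, 13) numpy array, y the labels, and the model from the question is already compiled):
n_features = 13
# add a length-1 "timesteps" axis: (520, 13) -> (520, 1, 13)
X = X.reshape(-1, 1, n_features)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)
# each batch now has shape (batch_size, 1, n_features), which Conv1D accepts
model.fit(X_train, y_train, epochs=20, batch_size=1)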
I'm trying to build a model to classify my text as hate (1) or not (0) using a neural network.
Some information about the data: it consists of tweets and a class label (hate (1) or not (0)):
sentences = df['comment']
y = df['isHate']
sentences_train, sentences_test, train_y, test_y = train_test_split(sentences, y, test_size=0.25, random_state=42)
The text goes through word embeddings, and I applied pad_sequences to the tweets and LabelEncoder to the labels.
The problem is that when I run it, I get this error:
ValueError: logits and labels must have the same shape ((None, 1) vs (None, 2))
The code of the model:
emb_dim = 16
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim= emb_dim, input_length=maxlen))
model.add(Flatten())
model.add(Dense(2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
model.summary()
The problem happens in this part:
history = model.fit(X_train, y_train,
batch_size=32,
epochs=15,
validation_data=(X_test, y_test))
Any help?
In your code:
model.add(Dense(1, activation='sigmoid'))
Your last dense layer has only 1 unit, but your labels are one-hot encoded and consist of 2 classes. So you need to change it to:
model.add(Dense(2, activation='softmax'))
You also need to change your loss function, because the labels are one-hot encoded:
loss='categorical_crossentropy'
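Putting both changes together, the model definition might look like this (a sketch reusing vocab_size, emb_dim and maxlen from the question):
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=emb_dim, input_length=maxlen))
model.add(Flatten())
model.add(Dense(2, activation='relu'))
model.add(Dense(2, activation='softmax'))  # 2 units to match the one-hot labels
model.compile(optimizer='adam',
              loss='categorical_crossentropy',  # for one-hot encoded targets
              metrics=['accuracy'])
Alternatively, if you keep the labels as plain 0/1 integers instead of one-hot vectors, you can leave Dense(1, activation='sigmoid') and binary_crossentropy as they are.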
I am trying to classify my input time-series data in 10 response classes. So I have 10 classes in my response feature.
My input data has 40 features and the response (y_train) has 1 feature with 10 classes.
train input shape (4320, 43), train_y shape (4320,)
My LSTM Network looks like following
model = Sequential()
model.add(LSTM(25, dropout=0.2, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_X, train_y, epochs=10, batch_size=36, validation_split =0.05)
And I get an error
Error when checking target: expected dense_21 to have shape (10,) but got array with shape (1,)
I think it is happening because I have 1 feature in my train_y, while the dense output layer expects 10 features. How can I run my multiclass time-series classification with the categorical_crossentropy loss function?
Also, as soon as I change the loss function to sparse_categorical_crossentropy, it runs smoothly.
model = Sequential()
model.add(LSTM(25, dropout=0.2, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(10, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_X, train_y, epochs=10, batch_size=36, validation_split =0.05)
Please help me understand the reason behind this. Also, which loss function should I use for multiclass time-series classification?
The initial error:
Error when checking target: expected dense_21 to have shape (10,) but got array with shape (1,)
is due to y_train not being converted into a categorical format. You need to clean the y_train data and convert it into your desired 10-class categorical array, for example through one-hot encoding.
In simple terms, categorical_crossentropy should only be used on data that is one-hot encoded.
[1, 0, 0, 0]
[0, 1, 0, 0]
[0, 0, 1, 0]
...
Otherwise, sparse_categorical_crossentropy deals with integers.
1
2
5
3
...
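For example, if you want to keep categorical_crossentropy, the conversion could look like this (a sketch assuming train_y already holds integer labels 0-9):
from tensorflow.keras.utils import to_categorical
# (4320,) integer labels -> (4320, 10) one-hot rows
train_y_onehot = to_categorical(train_y, num_classes=10)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_X, train_y_onehot, epochs=10, batch_size=36, validation_split=0.05)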
This model tries to predict two states based on an array of 400 numbers. During the first training epoch the model starts out with a normal loss on the first ±200 samples and then the loss becomes NaN. The accuracy stays around 50%, and when I print the predictions for the test set, it only predicts NaN. My X_train has a shape of (1934, 400, 1) and my y_train a shape of (1934,). I already checked for NaNs in the dataset, but there were none.
My model looks like this:
model = Sequential()
model.add(LSTM(128, input_shape=(400,1), activation='relu', return_sequences=True))
model.add(Dropout(0,2))
model.add(LSTM(128, activation='relu'))
model.add(Dropout(0,2))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0,2))
model.add(Dense(1, activation='sigmoid'))
opt = tf.keras.optimizers.Adam(lr=0.01)
# mean_squared_error = mse
model.compile(loss='binary_crossentropy',
optimizer=opt,
metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=3, validation_split = 0.1, shuffle=True, batch_size = 64)
Edit: solved by changing the activation functions to tanh. The output sigmoid stays sigmoid!
The problem was solved by changing the activation functions to "tanh". It also seems that the dropout rate should be 0.2 instead of 0,2, but this wasn't the cause of the problem.
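A sketch of the model with both changes applied (LSTM activations switched to tanh, the output sigmoid left as is, and Dropout(0.2) instead of Dropout(0, 2)):
model = Sequential()
model.add(LSTM(128, input_shape=(400, 1), activation='tanh', return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(128, activation='tanh'))
model.add(Dropout(0.2))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))  # output activation stays sigmoid
opt = tf.keras.optimizers.Adam(learning_rate=0.01)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
Note that tanh is the default activation for Keras LSTM layers, so simply removing the activation='relu' argument has the same effect.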