I am hoping to get some guidance on what steps I should take next in my attempt to model a certain system. It contains 3 independent variables, 24 dependent variables, and about 21,000 rows. In my attempts to model, I cannot get the accuracy above about 50% or the loss less than about 6500. I've been using variations on the following code:
EPOCHS = 30
#OPTIMIZER = 'adam'
#OPTIMIZER = 'adagrad'
BATCH_SIZE = 10
OUTPUT_UNITS = len(y.columns)
print(f'OUTPUT_UNITS: {OUTPUT_UNITS}')
model = Sequential()
model.add(Dense(8, activation='relu', input_dim=3)) # 3 X parameters, with eng_speed removed
#model.add(Dense(8, activation='relu', input_dim=4)) # 4 X parameters
model.add(Dense(32, activation='relu' ))
#model.add(Dense(64, activation='relu' ))
#model.add(Dense(12, activation='relu' ))
model.add(Dense(OUTPUT_UNITS)) # number of predicted (y) values (labels)
model.summary()
adadelta = optimizers.Adadelta()
adam = optimizers.Adam(lr=0.001)
model.compile(optimizer=adadelta, loss='mse', metrics=['accuracy'])
#model.compile(optimizer=opt, loss='mse', metrics=['accuracy'])
history = model.fit(x=X_train, y=y_train, epochs=EPOCHS, batch_size=BATCH_SIZE)
I've tried removing and adding layers, changing the size of them, different optimizers, learning rates, etc. The following two graphs are typical of what I've been seeing--they both flatten very quickly, then don't improve:
I'm obviously new at this and would appreciate it if someone pointed me in the right direction: an approach to try, something to read up on, whatever. Thanks in advance.
Since (according to your mse loss and your regression tag) you are in a regression setting, accuracy is meaningless (it is only used in classification settings); please see own answer in What function defines accuracy in Keras when the loss is mean squared error (MSE)?
Given that, there is in principle absolutely no reason to consider a loss of 6500 as "high", and hence needing improvement...
Related
I'm trying to solve the Titanic competition on Kaggle. But the modelaccuracy isn't going beyond 80%.
I tried to change a number of hidden nodes, a number of epochs, also tried to apply batch normalization, dropout, changing the weights initializations, but there's the same 80%. What am I doing wrong?
This is my code below:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(10, input_shape=(5,), kernel_initializer='he_normal', activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dense(20, kernel_initializer='he_normal', activation='relu'))
model.add(tf.keras.layers.Dropout(0.3))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dense(2, kernel_initializer=tf.keras.initializers.GlorotNormal(), activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
train_scores = model.fit(train_features, train_labels, epochs=200, batch_size=64, verbose=2)
And here's on the picture accuracy in some last epochs:model accuracy
How can I improve it?
You can try normalising the data, Generally while implementing Neural Networks we don't need to normalise our data (if the network is deep) but since here we are only working with 3 layers only I guess normalising the data might help.
I would suggest to split your training data again into training and validation set and use K-fold cross validation ( I am not sure about this one!! I too am new in this field).
But in general I have seen if the accuracy is constant then the best approach is to alter the training data ( I mean normalise it or try imputing NaN values with the mean (rather than setting the to 0)).
I am a newbie to Keras and machine learning in general. I’m trying to build a classification model using the Sequential model. After some experiments, I see that my validation accuracy behavior is very low and not increasing, although the training accuracy works well. I added regularization parameters to the layers and dropouts also in between the layers. Still, the behavior exists. Here’s my code.
from keras.regularizers import l2
model = keras.models.Sequential()
model.add(keras.layers.Conv1D(filters=32, kernel_size=1, strides=1, padding="SAME", activation="relu", input_shape=[512,1],kernel_regularizer=keras.regularizers.l2(l=0.1))) # 一定要加 input shape
keras.layers.Dropout=0.35
model.add(keras.layers.MaxPool1D(pool_size=1,activity_regularizer=l2(0.01)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(256, activation="softmax",activity_regularizer=l2(0.01)))
model.compile(loss="sparse_categorical_crossentropy",
optimizer="adam",
metrics=["accuracy"])
Ahistory = model.fit(train_x, trainy, epochs=300,
validation_split = 0.2,
batch_size = 16)
And here is the final results I got.
What is the reason behind this.? How do I fine-tune the model.?
I am trying to use GloVe embeddings to train a cnn model based on this article (also a rnn, which has this issue). The dataset is a labeled data: text (tweets) with labels (hate, offensive or neither).
The problem is that model performs well on train set but poorly on validation set.
here is the model:
kernel_size = 2
filters = 256
pool_size = 2
gru_node = 64
model = Sequential()
model.add(Embedding(len(word_index) + 1,
EMBEDDING_DIM,
weights=[embedding_matrix],
input_length=MAX_SEQUENCE_LENGTH,
trainable=True))
model.add(Dropout(0.25))
model.add(Conv1D(filters, kernel_size, activation='relu'))
model.add(MaxPooling1D(pool_size=pool_size))
model.add(Conv1D(filters, kernel_size, activation='softmax'))
model.add(MaxPooling1D(pool_size=pool_size))
model.add(LSTM(gru_node, return_sequences=True, recurrent_dropout=0.2))
model.add(LSTM(gru_node, return_sequences=True, recurrent_dropout=0.2))
model.add(LSTM(gru_node, return_sequences=True, recurrent_dropout=0.2))
model.add(LSTM(gru_node, recurrent_dropout=0.2))
model.add(Dense(1024,activation='relu'))
model.add(Dense(nclasses))
model.add(Activation('softmax'))
model.compile(loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
fitting the model:
X = df.tweet
y = df['classifi'] # classes 0,1,2
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, shuffle=False)
X_train_Glove,X_test_Glove, word_index,embeddings_index = loadData_Tokenizer(X_train,X_test)
model_RCNN = Build_Model_RCNN_Text(word_index,embeddings_index, 20)
model_RCNN.fit(X_train_Glove, y_train,validation_data=(X_test_Glove, y_test),
epochs=15,batch_size=128,verbose=2)
predicted = model_RCNN.predict(X_test_Glove)
predicted = np.argmax(predicted, axis=1)
print(metrics.classification_report(y_test, predicted))
this is what the distribution looks like (0:hate, 1:offensive, 2:neither)
model summary
Results:
classification report
is this the correct approach or am I missing something here
Generally speaking there are two sides that you can tackle overfitting:
Improving the data
More unique data
oversampling (to balance data)
Limiting the network structure
Dropout (You've implemented this)
Less parameters (You might want to benchmark against a much smaller network)
regularization (ex. L1 and L2)
I'd suggest trying with significantly fewer parameters (because this is quick) and oversampling (because your data seems lopsided).
Also, You can also try hyperparameter fitting. Making a large number of networks with different parameters than picking the best one.
Note: if you do hyper parameter fitting make sure to have an extra validation set because you can easily overfit your test set this way.
Side note: Sometimes when troubleshooting NN it is helpful to set the optimizer to a basic stochastic gradient descent. It slows the training down a bunch but makes the progression much clearer.
Good luck!
I try to create a neural network with keras (backened tensorflow).
I have 4 Input and 2 Output variables:
not available
I want to do predictions to a Testset not available.
This is my Code:
from keras import optimizers
from keras.models import Sequential
from keras.layers import Dense
import numpy
numpy.random.seed(7)
dataset = numpy.loadtxt("trainingsdata.csv", delimiter=";")
X = dataset[:,0:4]
Y = dataset[:,4:6]
model = Sequential()
model.add(Dense(4, input_dim=4, init='uniform', activation='sigmoid'))
model.add(Dense(3, init='uniform', activation='sigmoid'))
model.add(Dense(2, init='uniform', activation='linear'))
sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['accuracy'])
model.fit(X, Y, epochs=150, batch_size=10, verbose=2)
testset = numpy.loadtxt("testdata.csv", delimiter=";")
Z = testset[:,0:4]
predictions = model.predict(Z)
print(predictions)
When I run the script, the accuracy is 1.000 after every epoch and I get as result always the same output for every input pair:
[-5.83297 68.2967]
[-5.83297 68.2967]
[-5.83297 68.2967]
...
Has anybody an idea what the fault in my code is?
I suggest you normalize / standardize your data before feeding it to your model and then check if your model starts to learn.
Have a look at scikit-learn's StandardScaler.
And look into this SO thread to learn how to correctly fit_transform your training data and only transform your test data.
There is also this tutorial that makes use of scikit-learn's data preprocessing pipeline: http://machinelearningmastery.com/regression-tutorial-keras-deep-learning-library-python/
Neural networks have a tough time if the scale of the input variables is too different from each other. Having 10, 1000, 100000 as the same inputs causes the gradients to collapse towards whatever the large value is. The other values effectively don't provide any information.
One method is to simply rescale the input variables by a constant. You can simply divide the 206000 by 100000. Try getting all of the variables to be at around the same number of digits. Large numbers are a bit harder than small numbers, for networks.
I am trying to build a binary classification algorithm (output is 0 or 1) on a dataset that contains normal and malicious network packets.
The dataset shape (after converting IP #'s and hexa to decimal) is:
IP src, IP dest, ports, TTL, etc..
Note: The final column is the output.
And the Keras model is:
from keras.models import Sequential
from keras.layers import Dense
from sklearn import preprocessing
import numpy
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
seed = 4
numpy.random.seed(seed)
dataset = numpy.loadtxt("NetworkPackets.csv", delimiter=",")
X = dataset[:, 0:11].astype(float)
Y = dataset[:, 11]
model = Sequential()
model.add(Dense(12, input_dim=11, kernel_initializer='normal', activation='relu'))
model.add(Dense(12, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='relu'))
model.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.fit(X, Y, nb_epoch=100, batch_size=5)
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
However, I tried different optimizers, activation functions, number of layers, but the accuracy is reaching 0.5 at most:
Result
Even I tried Grid search for searching the best parameters, but the maximum is 0.5.
Does anyone knows why the output is always like that? and how can I enhance it.
Thanks in advance!
Your model isn't even outperforming a random chance model, so there must be something wrong in the data.
There may be two possibilities
1 - You don't feed enough training samples to your model for it to identify significant features as to distinguish between normal and malicious.
2 - The data itself is not informative enough to derive the decision you are looking for.