I am currently trying to train a regression network using Keras. To make sure the training is sound, I want to train using cross-validation.
The problem is that Keras doesn't seem to have any functions supporting cross-validation, or does it?
The only solution I have found is to use scikit-learn's train_test_split and run model.fit for each of the k folds manually. Isn't there already an integrated solution for this, rather than doing it manually?
Nope, that seems to be the solution (as far as I know).
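The manual loop described in the question can be sketched roughly as follows. This is a minimal, framework-free illustration and not Keras or scikit-learn API: `k_fold_indices`, `cross_validate` and `train_and_score` are hypothetical names, and `train_and_score` stands in for building a fresh model, calling `model.fit` on the training fold and evaluating on the held-out fold.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

def cross_validate(n_samples, k, train_and_score):
    """Call train_and_score(train_idx, val_idx) once per fold and collect the scores."""
    return [train_and_score(tr, va) for tr, va in k_fold_indices(n_samples, k)]

# Hypothetical stand-in for "build a new model, model.fit(...), model.evaluate(...)".
# Here it only returns the fraction of samples held out in this fold.
def train_and_score(train_idx, val_idx):
    return len(val_idx) / (len(train_idx) + len(val_idx))

scores = cross_validate(n_samples=10, k=5, train_and_score=train_and_score)
print(scores)
```

A new model should be built inside `train_and_score` for every fold, so that no trained weights carry over from one fold to the next.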
There is a scikit-learn wrapper for Keras that will help you do this easily: https://keras.io/scikit-learn-api/
I recommend reading Dr. Jason Brownlee's example: https://machinelearningmastery.com/regression-tutorial-keras-deep-learning-library-python/
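from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor  # Keras 2.x wrapper described at the link above
from sklearn.model_selection import KFold, cross_val_score

def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(13, input_dim=13, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

seed = 7  # any fixed value; only makes the fold shuffling reproducible
# X and Y are your feature matrix (13 columns here) and target vector
estimator = KerasRegressor(build_fn=baseline_model, epochs=100, batch_size=5, verbose=0)
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(estimator, X, Y, cv=kfold)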
Related
I'm working on a project where I want to train a neural network using another neural network as the discriminator. I already built a discriminator model that can distinguish between different sets of data, but now I'm stuck on how to train the new network with it. I'm not an expert in this field, so if anyone has any tips or knows of any techniques I should use to accomplish this, that would be great. Also, are there any specific libraries or frameworks that are recommended for this type of task? I'm just trying to figure out the best way to approach this. Thanks in advance for any help!
So far this is all I've come up with for the discriminator model:
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator
# Define image data generator for both real and fake images
datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
dataset_path = "../training/discriminator"
train_generator = datagen.flow_from_directory(dataset_path,
                                              target_size=(150, 150),
                                              class_mode='binary',
                                              subset='training')
validation_generator = datagen.flow_from_directory(dataset_path,
                                                   target_size=(150, 150),
                                                   class_mode='binary',
                                                   subset='validation')
# Create the neural network model
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(32, kernel_size=(3,3), activation='relu', input_shape=(150,150,3)))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2,2)))
model.add(tf.keras.layers.Conv2D(64, kernel_size=(3,3), activation='relu'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2,2)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Train the model
model.fit(train_generator,
          epochs=15,
          validation_data=validation_generator,
          steps_per_epoch=500,
          validation_steps=5000)
model.save("../model/discriminator")
However, I haven't gotten past building the discriminator. I have no idea what to do next, and I haven't had much success searching online. I need all the help I can get.
I created a neural network in Python that predicts my time series very well.
My issue is that I want to create a neural network that can predict multiple time series at the same time.
Is this possible, and how would I go about it?
This is the code that builds the NN for a single time series:
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping

nn_model = Sequential()
nn_model.add(Dense(12, input_dim=1, activation='relu'))
nn_model.add(Dense(1))
nn_model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mse', 'mae'])
early_stop = EarlyStopping(monitor='loss', patience=2, verbose=1)
history = nn_model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=1, callbacks=[early_stop], shuffle=False)
Any ideas about how to convert this to run for multiple time series?
I wrote simple code to learn Keras:
from tensorflow import keras

def main():
    (x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
    model = keras.Sequential()
    model.add(keras.layers.Conv2D(16, 3, padding='same', activation='relu'))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(10, activation='softmax'))
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=4)
    model.summary()

if __name__ == '__main__':
    main()
But it doesn't seem to learn anything. I don't expect it to learn much, but the loss should at least decrease and the accuracy increase a little. Instead, both stay the same every epoch.
I had the exact same model written in PyTorch and it achieved around 35% accuracy. This one, in TensorFlow + Keras, is stuck at 10%.
I'm using tensorflow-gpu v1.9.
What am I missing?
I think the default learning rate is too high for this problem. Try something like
opt = keras.optimizers.Adam(lr=1e-5)
model.compile(optimizer=opt, loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
I checked the default learning rate used by Adam in both Keras and PyTorch, and both use 1e-3. Therefore, the learning rate should not be the issue, assuming you use the defaults in both models.
Alternatively, I think this is related to the weight initialization, which is explicitly handled by each layer in Keras but not in PyTorch.
Simply change the training line to the following, which also scales the pixel values to the [0, 1] range and adds validation data,
model.fit(x_train/255., y_train, shuffle=True,
          validation_data=(x_test/255., y_test), epochs=4)
and you should observe both training and validation accuracy reach around 60%.
I am not familiar with PyTorch, but I suggest you initialize the weights in the Keras network with those used by the PyTorch network. That way, you will have a fair comparison.
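The `/255.` in the training line above rescales the raw 8-bit pixel values from [0, 255] to [0, 1]. Here is a minimal NumPy sketch of that step, using a small made-up array instead of the CIFAR-10 images (the array contents are an assumption for illustration only):

```python
import numpy as np

# Made-up stand-in for a batch of images; CIFAR-10 images are uint8 in [0, 255].
x_raw = np.array([[0, 64, 128, 255]], dtype=np.uint8)

# Convert to float and rescale to [0, 1], as x_train/255. does in the fit call.
x_scaled = x_raw.astype("float32") / 255.0

print(x_scaled.dtype, x_scaled.min(), x_scaled.max())
```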
I'm new to machine learning and neural networks, and I have a problem with text classification. I use an LSTM architecture with the Keras library.
My model reaches about 97% accuracy every time. I have a database of about 1 million records, of which 600k are positive and 400k are negative.
The two classes are labeled 0 (negative) and 1 (positive). The database is split into a training set and a test set in an 80:20 ratio. For the network input, I use Word2Vec trained on PubMed articles.
My network architecture:
model = Sequential()
model.add(emb_layer)
model.add(LSTM(64, dropout=0.5))
model.add(Dense(2))
model.add(Activation('softmax'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=50, batch_size=32)
How can I improve my model for this kind of text classification?
The problem we are dealing with here is called overfitting.
First of all, make sure your input data is properly cleaned. One of the principles of machine learning is "Garbage In, Garbage Out". Next, you should balance your data set, for example to 400k positive and 400k negative records. After that, the data set should be divided into training, test and validation sets (60%:20%:20%), for example using the scikit-learn library, as in the following example:
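from sklearn.model_selection import train_test_split
# Hold out 20% of the data as the test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# 0.25 of the remaining 80% is 20% of the total, which gives the 60:20:20 split.
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25)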
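The balancing step suggested above (for example, 400k positive and 400k negative records) could be done by randomly undersampling the larger class before splitting. Here is a minimal sketch in plain Python, where `undersample` is a hypothetical helper and the toy records stand in for the real data set:

```python
import random

def undersample(samples, labels, seed=0):
    """Randomly drop records from larger classes until all classes have equal size."""
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for label, group in by_class.items():
        balanced.extend((sample, label) for sample in rng.sample(group, target))
    rng.shuffle(balanced)  # mix the classes again before splitting
    xs, ys = zip(*balanced)
    return list(xs), list(ys)

# Toy data with the same 60:40 imbalance as in the question.
samples = [f"text {i}" for i in range(10)]
labels = [1] * 6 + [0] * 4
X_bal, y_bal = undersample(samples, labels)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

Undersampling discards records from the larger class, so it is only one possible way to balance the data.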
Then I would use a different neural network architecture and try to optimize the hyperparameters.
Personally, I would suggest a 2-layer LSTM network or a combination of a convolutional and a recurrent neural network (which is faster and, according to articles I have read, gives better results).
1) 2-layer LSTM:
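from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

model = Sequential()
model.add(emb_layer)  # embedding layer from the question
model.add(LSTM(64, dropout=0.5, recurrent_dropout=0.5, return_sequences=True))
model.add(LSTM(64, dropout=0.5, recurrent_dropout=0.5))
model.add(Dense(1))  # one output unit: probability of the positive class (label 1)
model.add(Activation('sigmoid'))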
You can try using 2 layers with 64 hidden units each and add the recurrent_dropout parameter.
The main reason we use the sigmoid function is that its output lies between 0 and 1. It is therefore especially suited to models that have to predict a probability as output. Since a probability can only take values between 0 and 1, sigmoid is the right choice.
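As a quick numeric illustration of the range claim above, here is a minimal sketch of the sigmoid function in plain Python. `sigmoid` here is a hand-written helper for illustration, not a Keras function.

```python
import math

def sigmoid(z):
    """Logistic function: maps real numbers into (0, 1), up to floating-point rounding."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form that avoids overflow in exp().
    e = math.exp(z)
    return e / (1.0 + e)

for z in (-5.0, 0.0, 5.0):
    print(z, sigmoid(z))
```

Large negative inputs give values close to 0, large positive inputs give values close to 1, and an input of 0 gives exactly 0.5.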
2) CNN + LSTM
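from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Dropout, LSTM, Dense, Activation

model = Sequential()
model.add(emb_layer)  # embedding layer from the question
model.add(Conv1D(32, 3, padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Dropout(0.5))
model.add(LSTM(32, dropout=0.5, recurrent_dropout=0.5, return_sequences=True))
model.add(LSTM(64, dropout=0.5, recurrent_dropout=0.5))
model.add(Dense(1))  # one output unit: probability of the positive class (label 1)
model.add(Activation('sigmoid'))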
You can also try a combination of a CNN and an RNN. With this architecture, the model trains faster (up to 5 times faster).
Then, in both cases, you need to choose an optimizer and a loss function.
A good optimizer for both cases is "Adam".
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
In the last step, we validate our network on the validation set.
In addition, we use the EarlyStopping callback, which stops training when the monitored quantity (val_loss by default, rather than accuracy) has not improved for, say, 3 consecutive epochs.
from keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(patience=3)
model.fit(X_train, y_train, epochs=100, batch_size=32, validation_data=(X_val, y_val), callbacks=[early_stopping])
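To make the role of `patience` concrete, here is a simplified sketch of the stopping rule in plain Python. It is not Keras's actual implementation: `stopping_epoch` is a hypothetical helper, the loss values are made up, and it assumes the monitored quantity is a loss that should decrease.

```python
def stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None if it never stops early."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best value: remember it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Made-up validation losses: the best value is first reached at epoch 3.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.60, 0.65]
print(stopping_epoch(losses, patience=3))
```

A larger `patience` tolerates longer plateaus before stopping, at the cost of more training epochs.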
We can also monitor overfitting using graphs of the training and validation curves. If you want to see how to do it, check here.
If you need further help, let me know in a comment.
I have an algorithm written in Python that performs time-series analysis using an LSTM. My professor asked me to show the details of the model that is created in the code. How do I inspect the "model" here? Does Keras provide some visualization of the model?
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

model = Sequential()
model.add(LSTM(50, input_shape=(trainX.shape[1], trainX.shape[2])))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam')
history = model.fit(trainX, trainY, epochs=50, batch_size=72, validation_data=(testX, testY), verbose=0, shuffle=False)
There is a visualization tool in Keras called plot_model. You can use it to save your model as an image where you can see the structure of your model including input and output dimensions.
from keras.utils import plot_model
plot_model(model, to_file='model.png')
You can read more about it here: Keras Visualization
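Besides the image, `model.summary()` prints a text table of the layers, output shapes and parameter counts, which is often enough for a report. The counts for the model in the question can also be checked by hand. The sketch below does this in plain Python using the standard LSTM and dense-layer parameter formulas; the helper names are hypothetical, and the number of input features (8) is an assumption, since `trainX.shape[2]` is not given in the question.

```python
def lstm_params(units, n_features):
    """4 gates, each with input weights, recurrent weights and one bias per unit."""
    return 4 * (units * (n_features + units) + units)

def dense_params(units, n_inputs):
    """One weight per input plus one bias, for every output unit."""
    return units * (n_inputs + 1)

n_features = 8  # assumption: replace with trainX.shape[2]
lstm = lstm_params(50, n_features)
dense = dense_params(1, 50)  # the Dropout layer in between has no parameters
print("LSTM(50):", lstm, "Dense(1):", dense, "total:", lstm + dense)
```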