A neural network trained on the iris dataset with [4, 4] hidden layers, built separately in TensorFlow and in Keras, gives different results.
While the TensorFlow model reaches 96.6% accuracy on the test set, the Keras model reaches only around 50%. The hyperparameters (learning rate, optimiser, mini-batch size, etc.) were the same in both cases.
Keras model
model = Sequential()
model.add(Dense(units = 4, activation = 'relu', input_dim = 4))
model.add(Dropout(0.25))
model.add(Dense(units = 4, activation = 'relu'))
model.add(Dropout(0.25))
model.add(Dense(units = 3, activation = 'softmax'))
# this Adam instance is never used: compile() below receives the string 'adagrad'
adam = Adam(epsilon = 10**(-6), lr = 0.01)
model.compile(optimizer = 'adagrad', loss = 'categorical_crossentropy', metrics = ['accuracy'])
one_hot_labels = keras.utils.to_categorical(y_train, num_classes = 3)
model.fit(X_train, one_hot_labels, epochs = 50, batch_size = 40)
Tensorflow model
feature_columns = [tf.feature_column.numeric_column(key = name,
                                                    shape = (1),
                                                    dtype = tf.float32) for name in list(X_train.columns)]
classifier = tf.estimator.DNNClassifier(hidden_units = [4, 4],
                                        feature_columns = feature_columns,
                                        n_classes = 3,
                                        dropout = 0.25,
                                        model_dir = './DNN_model')
train_input_fn = tf.estimator.inputs.pandas_input_fn(x = X_train,
                                                     y = y_train,
                                                     batch_size = 40,
                                                     num_epochs = 50,
                                                     shuffle = False)
classifier.train(input_fn = train_input_fn, steps = None)
For the Keras model, I tried changing the learning rate, increasing the number of epochs, using different optimisers, etc. Even so, the accuracy remained poor. Clearly, the two models are doing different things, but on the surface they seem identical to me in all the key aspects.
Any help is appreciated.
They have the same architecture, and that's all.
The difference in performance comes from one or more of these factors:
You have Dropout, so the networks behave differently on every run (check how Dropout works).
Weight initialization: which method are you using in Keras and in TensorFlow?
Check all parameters of the optimizer.
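To make the Dropout point concrete, here is a minimal NumPy sketch of inverted dropout. It is not the Keras or TensorFlow implementation; the `dropout` helper, the seeds and the all-ones activations are assumptions made up for illustration. It only shows that the mask depends on the random state, so two training runs see different sub-networks unless the seed is fixed.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

# Hypothetical output of a 4-unit hidden layer for a batch of 2 samples.
activations = np.ones((2, 4))

# Two runs with different seeds, then a repeat of the first seed.
run_a = dropout(activations, rate=0.25, rng=np.random.default_rng(0))
run_b = dropout(activations, rate=0.25, rng=np.random.default_rng(1))
run_a_again = dropout(activations, rate=0.25, rng=np.random.default_rng(0))

print(run_a)
print(run_b)
print(np.array_equal(run_a, run_a_again))  # same seed gives the same mask
```

With only 4 units per hidden layer, dropping even one unit removes a quarter of the layer, so run-to-run variation can be large on a small dataset like iris.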
Related
My neural network, written in Keras for a binary image classification problem, produces only zeros after I select hyperparameters with Keras Tuner.
import tensorflow as tf
import keras_tuner
from keras_tuner import BayesianOptimization
from keras_tuner import Objective
from tensorflow import keras
from tensorflow.keras.models import Model
from tensorflow.keras.applications import Xception
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Flatten
def build_model(hp):
    # create the base pre-trained model
    base_model = Xception(include_top=False, input_shape=(224, 224, 3))
    base_model.trainable = False
    x = base_model.output
    x = Flatten()(x)
    hp_units = hp.Int('units', min_value=32, max_value=4096, step=32)
    x = Dense(units = hp_units, activation="relu")(x)
    hp_rate = hp.Float('rate', min_value = 0.01, max_value=0.9, step=0.01)
    x = Dropout(rate = hp_rate)(x)
    predictions = Dense(1, activation='sigmoid')(x)
    # this is the model we will train
    model = Model(inputs=base_model.input, outputs=predictions)
    # hp_learning_rate is sampled here but never passed to the optimizer below
    hp_learning_rate = hp.Float('learning_rate', max_value = 1e-2, min_value = 1e-7, step = 0.0005)
    optimizer = hp.Choice('optimizer', ['adam', 'sgd', 'adagrad', 'rmsprop'])
    model.compile(optimizer,
                  loss=tf.keras.losses.BinaryCrossentropy(),
                  metrics=['accuracy'])
    return model
stop_early = keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, min_delta= 0.0001)
tuner = BayesianOptimization(
    hypermodel = build_model,
    objective = Objective(name="val_accuracy",direction="max"),
    max_trials = 10,
    directory='/content/best_model_s',
    overwrite=False
)
tuner.search(train_batches,
             validation_data = valid_batches,
             epochs = 100,
             callbacks=[stop_early]
)
best_hps=tuner.get_best_hyperparameters(num_trials=1)[0]
model = tuner.hypermodel.build(best_hps)
history = model.fit(train_batches, validation_data = valid_batches ,epochs=50)
val_acc_per_epoch = history.history['val_accuracy']
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print('Best epoch: %d' % (best_epoch,))
best_model = tuner.hypermodel.build(best_hps)
# Retrain the model
best_model.fit(train_batches, validation_data = valid_batches , epochs=best_epoch)
test_generator.reset()
predict = best_model.predict_generator(test_generator, steps = len(test_generator.filenames))
I'm guessing that the problem may be that the ImageDataGenerator feeds training 2 batches of 16 images each and testing 2 batches of 4 images each (every batch has an equal number of representatives of each class). I also noticed that with a small number of epochs the network produces values between 0 and 1, but the more epochs I train, the closer its outputs get to zero. As a remedy, I tried stopping training once the validation loss fails to decrease for 5 consecutive epochs. Still, it seems to me that the issue lies in the validation sample, which is very small.
Any advice?
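One detail that may be worth checking, independent of the batch sizes: the `steps` argument of `predict_generator` counts batches, not images, while `len(test_generator.filenames)` counts images. The sketch below is plain arithmetic using the figures quoted above (2 test batches of 4 images); `prediction_steps` is a hypothetical helper, not a Keras function. It does not by itself explain the all-zero outputs.

```python
import math

def prediction_steps(num_samples, batch_size):
    """Number of batches needed to cover every sample exactly once."""
    return math.ceil(num_samples / batch_size)

# Figures from the description above: the test generator yields 2 batches of 4 images.
num_test_images = 2 * 4
test_batch_size = 4

# Batches needed to see each test image once.
print(prediction_steps(num_test_images, test_batch_size))
# What len(test_generator.filenames) would pass as steps instead.
print(num_test_images)
```

If the step count is larger than the number of batches, the generator is asked for more batches than one pass over the test set contains, so the predictions may no longer line up one-to-one with the file list.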
I am used to working in PyTorch but now have to learn Tensorflow for my job. I am trying to get up to speed by creating a simple dense network and training it on the MNIST dataset, but I cannot get it to train. My super simple code:
import tensorflow as tf
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical
# Load mnist data from keras
(train_data, train_label), (test_data, test_label) = tf.keras.datasets.mnist.load_data(path="mnist.npz")
train_label, test_label = to_categorical(train_label), to_categorical(test_label)
train_data, train_label, test_data, test_label = Flatten()(train_data), Flatten()(train_label), Flatten()(test_data), Flatten()(test_label)
# Create generic SGD optimizer (no learning schedule)
optimizer = SGD(learning_rate = 0.01)
# Define function to build and compile model
def build_mnist_model(input_shape, batch_size = 30):
    input_img = Input(shape = input_shape, batch_size = batch_size)
    # Pass through dense layer
    x = Dense(200, activation = 'relu', use_bias = True)(input_img)
    x = Dense(400, activation = 'relu', use_bias = True)(x)
    scores = Dense(10, activation = 'softmax', use_bias = True)(x)
    # Create and compile tf model
    mnist_model = Model(input_img, scores)
    mnist_model.compile(optimizer = optimizer, loss = 'categorical_crossentropy')
    return mnist_model
# Build the model
mnist_model = build_mnist_model(train_data[0].shape)
# Train the model
mnist_model.fit(
    x = train_data,
    y = train_label,
    batch_size = 30,
    epochs = 20,
    verbose = 2,
    shuffle = True,
    # steps_per_epoch = 200
)
When I run this I get
ValueError: When using data tensors as input to a model, you should specify the `steps_per_epoch` argument.
This does not really make sense to me because my train_data and train_label are just regular tensors and per the Tensorflow documentation in this case it should default to the number of samples in the dataset divided by the batch size (which would be 200 in my case).
At any rate, I tried specifying steps_per_epoch = 200 when I call mnist_model.fit() but then I get a different error:
InvalidArgumentError: Incompatible shapes: [60000,10] vs. [30,1]
[[{{node training_4/SGD/gradients/gradients/loss_5/dense_17_loss/softmax_cross_entropy_with_logits_grad/mul}}]]
I can't seem to discern where a size mismatch would come from. In PyTorch, I am used to manually creating batches (by subindexing my data and label tensors) but in Tensorflow this seems to happen automatically. As such, this leaves me quite confused about what batch has the wrong size, how it got the wrong size, etc. I hope this simple model is way easier than I am making it and I just do not know the Tensorflow tricks yet.
Thanks for the help.
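A possible direction, offered as a sketch and not a confirmed fix: calling `Flatten()(...)` on the raw arrays returns tensors, not NumPy arrays, which would match the "data tensors" wording of the first error. Reshaping with NumPy keeps everything as ordinary arrays. The stand-in data below is random and much smaller than MNIST, and the pixel scaling is an extra assumption that is not in the original code.

```python
import numpy as np

# Stand-in arrays with the MNIST layout (assumption: real data comes from load_data()).
num_samples = 60  # small hypothetical sample count to keep the sketch fast
rng = np.random.default_rng(0)
train_data = rng.integers(0, 256, size=(num_samples, 28, 28), dtype=np.uint8)
train_label = np.arange(num_samples) % 10

# Flatten each 28x28 image to a 784-vector with NumPy, not with a Keras layer,
# so the result stays an ordinary array; scale pixels to [0, 1] (optional).
x_train = train_data.reshape(num_samples, -1).astype("float32") / 255.0

# One-hot encode the labels; the result is already 2-D and needs no flattening.
y_train = np.eye(10, dtype="float32")[train_label]

# Default number of batches per epoch: samples divided by batch size, rounded up.
batch_size = 30
steps_per_epoch = int(np.ceil(num_samples / batch_size))

print(x_train.shape, y_train.shape, steps_per_epoch)
```

For reference, the `[60000,10]` shape in the second error suggests 60000 training samples, for which batches of 30 would give 2000 steps per epoch, not 200.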
Hi, I'm learning about k-fold cross-validation. This first snippet of code builds a simple ANN:
def buildModel():
    # Fitting classifier to the Training set
    # Create your classifier here
    model = Sequential()
    model.add(Dense(units = 6, input_dim = X.shape[1], activation = 'relu'))
    model.add(Dense(units = 6, activation = 'relu'))
    model.add(Dense(units = 1, activation = 'sigmoid'))
    model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return model
I then used cross_val_score from sklearn to run the ANN.
Keras is also running on my GPU.
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
model = KerasClassifier(build_fn = buildModel, batch_size = 10, epochs =100)
accuracies = cross_val_score(estimator = model, X = X_train, y = y_train, cv = 10, n_jobs = -1)
But if I set n_jobs = -1 to try to use all cores, I get an error (P.S. I have 11 features):
Blas GEMM launch failed : a.shape=(10, 11), b.shape=(11, 6), m=10, n=6, k=11
[[node dense_1/MatMul (defined at C:\Users\Brandon Cardillo\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\framework\ops.py:1751) ]]
[Op:__inference_keras_scratch_graph_1030]
Function call stack:
keras_scratch_graph
P.S. I am also running in a Jupyter notebook.
Any help is very much appreciated.
Thank you.
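For readers less familiar with what cv = 10 does, here is a rough NumPy sketch of 10-fold splitting with the folds processed one after another, which corresponds to n_jobs = 1. It is not scikit-learn's implementation; `k_fold_indices`, the seed and the dataset size of 100 are hypothetical. A commonly suggested explanation for this kind of error, stated here as an assumption, is that n_jobs = -1 starts several worker processes that all try to use the same GPU at once.

```python
import numpy as np

def k_fold_indices(num_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k roughly equal folds."""
    shuffled = np.random.default_rng(seed).permutation(num_samples)
    folds = np.array_split(shuffled, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Hypothetical dataset size; 10 folds to mirror cv = 10 in the question.
splits = list(k_fold_indices(num_samples=100, k=10))
for fold, (train_idx, test_idx) in enumerate(splits):
    # In the real workflow a fresh model would be built and fitted here,
    # one fold after another (the equivalent of n_jobs = 1).
    print(fold, len(train_idx), len(test_idx))
```

Each sample lands in the test fold exactly once, so the ten accuracy values together cover the whole training set.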
When using this code from a tutorial, I get an error saying that the model is not configured to compute accuracy and that I should pass metrics=['accuracy']. The weird part is that I am already passing metrics = ['accuracy'].
I've searched a lot, and all the code I have seen works fine except mine.
Evaluating the ANN
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from tensorflow.python.keras.models import Sequential #Used to initialize the NN
from tensorflow.python.keras.layers import Dense #Used to create the layers in the ANN
def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu',input_dim = 11))
    classifier.add(Dense(units= 6, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
    classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics= ['accuracy'])
    return classifier
# Needs to be revised from the evaluating video in the course if needed
classifier = KerasClassifier(build_fn = build_classifier, batch_size = 10, nb_epoch = 100)
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)
I expect the output to be the accuracies vector; instead I got:
ValueError: The model is not configured to compute accuracy. You should pass metrics=["accuracy"] to the model.compile() method.
Changing the parameter from metrics=['accuracy'] to metrics=['acc'] works for me.
Regards,
Joseph
I am new to Keras and neural networks. I am trying to tune the hyperparameters of a simple neural network using GridSearchCV from scikit-learn with Keras in Python. Below is some example code for reference.
def base_model(input_layer_nodes = 150, optimizer = 'adam', kernel_initializer = 'normal', dropout_rate = 0.2):
    model = Sequential()
    model.add(Dense(units = input_layer_nodes, input_dim = 107, kernel_initializer = kernel_initializer, activation='relu'))
    # the layer must be added to the model; a bare Dropout(dropout_rate) has no effect
    model.add(Dropout(dropout_rate))
    model.add(Dense(units = 1, kernel_initializer = kernel_initializer, activation='sigmoid'))
    # Compile model
    model.compile(loss = 'binary_crossentropy', optimizer = optimizer, metrics = ['accuracy'])
    return model
# Defining parameters for performing GridSearch
# optimizer = ['sgd', 'rmsprop', 'adam']
# dropout_rate = [0.1, 0.2, 0.3, 0.4, 0.5]
# input_layer_nodes = [50, 107, 150, 200]
kernel_initializer = ['uniform', 'normal']
param_grid = dict(kernel_initializer = kernel_initializer)
model = KerasClassifier(build_fn = base_model, epochs = 10, batch_size = 128, verbose = 2)
grid = GridSearchCV(estimator = model, param_grid=param_grid, n_jobs = 1, cv = 5)
grid.fit(X_train, y_train)
# View hyperparameters of best neural network
print("\nBest Training Parameters: ", grid.best_params_)
print("Best Training Accuracy: ", grid.best_score_)
When I execute the above code, I get the below error.
ValueError: ('Some keys in session_kwargs are not supported at this time: %s', dict_keys(['kernel_initializer']))
I am able to tune some of the other parameters of the network, like dropout_rate, optimizer and epochs. If the same code works for those parameters, why does the kernel_initializer part not work? I am using Keras 2.2.2, tensorflow 1.9.0-gpu and Python 3.6.6. My OS is Windows 10 x64. Any help on this would be appreciated.
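As background on what the grid does, the sketch below expands a parameter grid into individual combinations using only the standard library. It is not GridSearchCV itself; `expand_grid` is a hypothetical helper, and the candidate values are the ones listed in the code above, including the three lists that are commented out there. As far as I understand, each combination ends up as keyword arguments for the build function (`base_model` here), which is why the dictionary keys have to match its argument names.

```python
from itertools import product

# Candidate values taken from the question (the first three are commented out there).
param_grid = {
    "optimizer": ["sgd", "rmsprop", "adam"],
    "dropout_rate": [0.1, 0.2, 0.3, 0.4, 0.5],
    "input_layer_nodes": [50, 107, 150, 200],
    "kernel_initializer": ["uniform", "normal"],
}

def expand_grid(grid):
    """Return every combination of the grid as a list of keyword dicts."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]

combinations = expand_grid(param_grid)
cv_folds = 5

print(len(combinations))             # 3 * 5 * 4 * 2 = 120 combinations
print(len(combinations) * cv_folds)  # 600 model fits with cv = 5
print(combinations[0])
```

Searching the full grid means 600 training runs of 10 epochs each, which is one practical reason to tune a single parameter at a time as the question does.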