I am using Keras to build a simple sequence classification neural network. While exploring the different modules, I found that there are two ways to create a sequential neural network.
The first way is to use the Sequential API. This is the most common way, and the one I found in most tutorials and documentation.
Here is the code:
# Sequential neural network using Sequential()
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

model = Sequential()
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu', input_shape=(27, 300)))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(100))
model.add(Dense(7, activation='softmax'))
model.summary()
The second way is to build the sequential neural network from "scratch" with the Model API. Here is the code:
# Sequential neural network using Model()
from keras.models import Model
from keras.layers import Input, Conv1D, MaxPooling1D, LSTM, Dense

inputs = Input(shape=(27, 300))
x = Conv1D(filters=32, kernel_size=3, padding='same', activation='relu')(inputs)
x = MaxPooling1D(pool_size=2)(x)
x = LSTM(100)(x)
predictions = Dense(7, activation='softmax')(x)
model = Model(inputs=inputs, outputs=predictions)
model.summary()
I trained both with a fixed seed (np.random.seed(1337)) and the same training data, and my outputs are different...
Note that the only difference between the summaries is the explicit input layer that appears with the Model API.
Does anyone know why these neural networks are different?
And if they are not, why did I get different results?
Thanks
You set up the random seed only in NumPy and not in TensorFlow (assuming TensorFlow is the backend of Keras in your case). Try adding this to your code:
from numpy.random import seed
seed(1337)
from tensorflow import set_random_seed
set_random_seed(1337)
import tensorflow as tf
seed_value = 1337  # any fixed seed
tf.keras.backend.clear_session()
tf.random.set_seed(seed_value)
You can use the above code block, run the loaded model for some iterations, and check whether the error still persists.
I was facing the same reproducibility issue, and this worked for me.
As mentioned by andrey, over and above these two seed setters, you also need to set the Python hash seed environment variable:
import os
os.environ['PYTHONHASHSEED']=str(seed_value)
You can also add one more block to force TensorFlow to use a single thread (if you are on a multicore machine).
Multiple threads are a potential source of non-reproducible results.
session_conf = tf.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1)
sess = tf.Session(config=session_conf)
# Tell Keras to use this single-threaded session (TF 1.x)
tf.keras.backend.set_session(sess)
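That snippet uses the TF 1.x sessions API; in TF 2.x the same single-thread setting is exposed through tf.config instead:

import tensorflow as tf

# TF 2.x equivalent: restrict TensorFlow to one thread per op pool
tf.config.threading.set_intra_op_parallelism_threads(1)
tf.config.threading.set_inter_op_parallelism_threads(1)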
I'm working on a neural network that's designed to identify patterns in a large dataset, but I'm running into an issue where the training process seems to be stuck in a local minimum. Despite trying a variety of different optimization algorithms and adjusting the learning rate, I can't seem to get the network to converge on a more optimal solution. Here's the code I'm using to train the network:
import numpy as np
import tensorflow as tf

# Load dataset
data = np.load('data.npy')

# Split dataset into training and validation sets
train_data = data[:5000]
val_data = data[5000:]

# Define neural network architecture
# (the last column holds the label, so the input has data.shape[1] - 1 features)
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(512, activation='relu', input_shape=(data.shape[1] - 1,)),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Compile model
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train model
history = model.fit(train_data[:, :-1], train_data[:, -1],
                    validation_data=(val_data[:, :-1], val_data[:, -1]),
                    batch_size=32,
                    epochs=100,
                    verbose=1)
I suspect that there might be an issue with the dataset itself, but I've tried normalizing and standardizing the data, as well as applying various preprocessing techniques, with no luck. Any insights or suggestions would be helpful.
Here is one source related to your problem: https://github.com/christianversloot/machine-learning-articles/blob/main/getting-out-of-loss-plateaus-by-adjusting-learning-rates.md. Have a look! It may give you some ideas.
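The core idea in that article, lowering the learning rate when the loss plateaus, is also available as a built-in tf.keras callback. A minimal sketch reusing the fit() call from the question (the factor/patience values are untuned guesses):

import tensorflow as tf

# Halve the learning rate whenever val_loss stalls for 5 epochs,
# which often helps training move off a plateau
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss', factor=0.5, patience=5, min_lr=1e-6)

history = model.fit(train_data[:, :-1], train_data[:, -1],
                    validation_data=(val_data[:, :-1], val_data[:, -1]),
                    batch_size=32,
                    epochs=100,
                    callbacks=[reduce_lr],
                    verbose=1)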
Have you tried using different train/val splits? It could be something specific to the split you happen to have selected.
I am trying to make the first 10 layers in this TFHub model non-trainable. I want to freeze those layers so that I can fine-tune the remaining ones. I could not find any example of how to do this. I have seen similar examples with Keras models such as ResNet50, where layer.trainable can be explicitly set to True or False, but I am not able to do this with TFHub models. Any pointer will be appreciated. Thanks
import tensorflow_hub as tfhub

model_loc = "https://tfhub.dev/google/imagenet/resnet_v1_50/classification/5"
model = tfhub.KerasLayer(
    model_loc,
    input_shape=(224, 224, 3),
    trainable=True)
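For comparison, this is the per-layer freezing pattern the question mentions, shown on tf.keras's built-in ResNet50. A hub.KerasLayer wraps the whole hub model as a single Keras layer, which is why this pattern does not transfer to it directly:

import tensorflow as tf

# Freeze the first 10 layers of a plain Keras model, fine-tune the rest
base = tf.keras.applications.ResNet50(weights='imagenet', include_top=False)
for layer in base.layers[:10]:
    layer.trainable = False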
Could anyone share an idea/blog/code snippet on how to convert a Keras stateful LSTM into a pure TensorFlow model, and then train it on batches?
TensorFlow doesn't support Keras stateful LSTMs on TPUs, and the devs have declined to fix it.
I have tons of TPU time reserved and no way to use it for now. Any help is appreciated.
Model example and code to train:
model = Sequential()
model.add(LSTM(neurons, batch_input_shape=(window_size, n_steps, inputs_n), stateful=True))
model.add(Dense(outputs_n, activation='sigmoid'))
…
H = model.train_on_batch(X, y)
GitHub issue: https://github.com/tensorflow/tensorflow/issues/28837
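No fix appeared in that issue, but one workaround people suggest (a sketch, not tested on TPU) is to drop stateful=True and carry the LSTM state across batches yourself via return_state/initial_state, with two models sharing the same layers for training and state tracking:

import numpy as np
import tensorflow as tf

# Placeholder sizes standing in for the variables in the snippet above
window_size, n_steps, inputs_n, neurons, outputs_n = 32, 10, 8, 100, 1

inp = tf.keras.Input(shape=(n_steps, inputs_n), batch_size=window_size)
h_in = tf.keras.Input(shape=(neurons,), batch_size=window_size)
c_in = tf.keras.Input(shape=(neurons,), batch_size=window_size)

lstm = tf.keras.layers.LSTM(neurons, return_state=True)
seq, h_out, c_out = lstm(inp, initial_state=[h_in, c_in])
pred = tf.keras.layers.Dense(outputs_n, activation='sigmoid')(seq)

# Same layers, two views: one to train on, one to read the updated state
train_model = tf.keras.Model([inp, h_in, c_in], pred)
state_model = tf.keras.Model([inp, h_in, c_in], [h_out, c_out])
train_model.compile(optimizer='adam', loss='binary_crossentropy')

h = np.zeros((window_size, neurons))
c = np.zeros((window_size, neurons))
for X, y in batches:  # `batches` is your own batch iterator
    H = train_model.train_on_batch([X, h, c], y)
    h, c = state_model.predict_on_batch([X, h, c])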
I am implementing a machine learning module that should run on a Raspberry Pi which, at the moment, is shared among different services.
My idea is to store on the device only the code in charge of retrieving the inputs of the ML module and performing the prediction, together with the file containing the neural network model already fitted with Keras.
In other words, I would like to avoid installing all the Keras/TensorFlow packages and dependencies if my purpose is only to run predictions on a trained model, not to train a new one.
Is there a way to do that? Are there any lightweight libraries that can load a neural network model (with all the weight and bias settings) and perform a prediction, given the inputs?
What I am able to do now is load onto the Raspberry Pi a ".h5" file containing the model, weights, and biases, but I still have to declare the model-building function through Keras:
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor
from tensorflow.keras.layers import Dense

def NN_model():
    '''
    Definition of the Neural Network model
    '''
    model = Sequential()
    model.add(Dense(7, input_dim=6, kernel_initializer='normal', activation='relu'))
    model.add(Dense(15, kernel_initializer='normal', activation='relu'))
    model.add(Dense(24, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

# Load the NN model and use it to predict the radiation values
# for the next 24 hours, hour by hour
regr = KerasRegressor(build_fn=NN_model, epochs=1000, batch_size=5, verbose=0)
regr.model = load_model('saved_model.h5')
pred = regr.predict(input_row)
Since a fitted neural network is just a matter of weights and biases (and activation functions), I would expect that, once these parameters are determined, I wouldn't need the whole TensorFlow and Keras environment to map my inputs to an output.
What I would like is just something like:
import lightweight_module as lm
regression_model = lm.load_model('saved_model.h5')
prediction=regression_model.predict(inputs)
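If you export the weights once with NumPy on the training machine, the forward pass of this particular model is small enough to reimplement by hand on the Pi with NumPy alone. A sketch under that assumption (the 'weights.npy' file name is mine, and the hard-coded three-Dense-layer forward pass matches NN_model() above):

# One-time export on the training machine, where Keras is installed:
#   np.save('weights.npy', np.array(model.get_weights(), dtype=object))
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def predict(x, weights):
    # get_weights() returns [kernel, bias] pairs, one per Dense layer
    W1, b1, W2, b2, W3, b3 = weights
    h = relu(x @ W1 + b1)
    h = relu(h @ W2 + b2)
    return h @ W3 + b3  # the last Dense layer in NN_model() has no activation

weights = np.load('weights.npy', allow_pickle=True)
prediction = predict(np.asarray(input_row, dtype=np.float32), weights)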
What you can do is prune your neural network while retaining the same accuracy. Pruning removes the unimportant connections between neurons that do not learn anything significant. It not only reduces the complexity of your NN, it also drastically reduces the storage space required and the inference time. In Keras I don't know of any such module (though I think people have made their own versions), but frameworks like PyTorch and Caffe have pruning implementations for AlexNet and VGGNet that can reduce the size of a model by up to 49x. You can find one such implementation here:
https://github.com/felzek/AlexNet-A-Practical-Implementation/blob/master/testModel.py
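For what it's worth, magnitude-based pruning is also available for Keras models through the separate tensorflow-model-optimization package. A minimal sketch, assuming that package is installed (on the training machine, not the Pi) and reusing the NN_model() builder from the question:

import tensorflow_model_optimization as tfmot

# Wrap the model so low-magnitude weights are progressively zeroed out
# while fine-tuning, ramping sparsity from 0% to 80% over 1000 steps
prune = tfmot.sparsity.keras.prune_low_magnitude
schedule = tfmot.sparsity.keras.PolynomialDecay(
    initial_sparsity=0.0, final_sparsity=0.8, begin_step=0, end_step=1000)

pruned_model = prune(NN_model(), pruning_schedule=schedule)
pruned_model.compile(loss='mean_squared_error', optimizer='adam')

# Fine-tune with the pruning callback, then strip the wrappers for export:
# pruned_model.fit(X, y, callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
# final_model = tfmot.sparsity.keras.strip_pruning(pruned_model)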
I have several neural networks built using Keras that I have used so far mostly in Jupyter. I often save models from scikit-learn with joblib and from Keras with json + hdf5, and use them in other notebooks without issue.
I made a Python Spark application that can make use of those serialized models in cluster mode. The joblib models work fine; however, I ran into an issue with Keras.
Here is the model used in the notebook and in pyspark:
from keras.models import Sequential
from keras.layers import Embedding, GRU, Dense

def build_gru_model():
    model = Sequential()
    model.add(Embedding(max_nb_words, 128, input_length=max_sequence_length, dropout=0.2))
    model.add(GRU(128, dropout_W=0.2, dropout_U=0.2))
    model.add(Dense(2, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
Both are called the same way:
preds = model.predict_proba(data, verbose=0)
However, only in Spark do I get this error:
MissingInputError: ("An input of the graph, used to compute DimShuffle{x,x,x,x}(keras_learning_phase), was not provided and not given a value.Use the Theano flag exception_verbosity='high',for more information on this error.", keras_learning_phase)
I've done the mandatory search and found: https://github.com/fchollet/keras/issues/2430 which points to https://keras.io/getting-started/faq/
If I indeed remove dropout from my model, it works. However, I fail to understand how to implement something that would let me keep dropout during the training phase, as described in the FAQ.
Based on the model code, how would one accomplish this?
You can try putting this before your prediction:
import keras.backend as K
K.set_learning_phase(0)
It should set the learning phase to 0 (test time).
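And if you want dropout to stay active during training and only be disabled at prediction time, the recipe in the linked FAQ builds a backend function that takes the learning phase as an explicit input. A sketch against the Keras 1.x / Theano-era API used in the question:

import keras.backend as K

# Build a function from model input to output that takes the learning
# phase as an extra input (the Keras 1.x-era recipe from the linked FAQ)
get_output = K.function([model.layers[0].input, K.learning_phase()],
                        [model.layers[-1].output])

preds_test = get_output([data, 0])[0]   # learning phase 0: dropout off
preds_train = get_output([data, 1])[0]  # learning phase 1: dropout on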