Could anyone share an idea/blog/code snippet on how to convert a Keras stateful LSTM into a pure Tensorflow model, and then train it on batches?
Tensorflow doesn't support Keras Stateful LSTM on TPUs. Their devs refused to fix it.
I have tons of TPU time reserved and no way to use it for now. Any help is appreciated.
Model example and code to train:
model = Sequential()
# For a stateful LSTM, batch_input_shape must be (batch_size, timesteps, features).
model.add(LSTM(neurons, batch_input_shape=(window_size, n_steps, inputs_n), stateful=True))
model.add(Dense(outputs_n, activation='sigmoid'))
…
H = model.train_on_batch(X, y)
GitHub issue: https://github.com/tensorflow/tensorflow/issues/28837
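For reference, here is a minimal sketch (untested, and not verified to run on TPUs) of what the stateful bookkeeping could look like in plain TensorFlow 1.x: the LSTM state is carried through explicit placeholders instead of Keras' stateful=True, and the training loop feeds the returned state back in. The variable names (window_size, n_steps, inputs_n, outputs_n, neurons) are taken from the snippet above.

import tensorflow as tf

# Shapes reuse the names from the Keras snippet above; window_size acts as the batch size.
x = tf.placeholder(tf.float32, [window_size, n_steps, inputs_n])
y = tf.placeholder(tf.float32, [window_size, outputs_n])

# Explicit state placeholders replace Keras' stateful=True bookkeeping.
c_in = tf.placeholder(tf.float32, [window_size, neurons])
h_in = tf.placeholder(tf.float32, [window_size, neurons])

cell = tf.nn.rnn_cell.LSTMCell(neurons)
rnn_out, final_state = tf.nn.dynamic_rnn(
    cell, x, initial_state=tf.nn.rnn_cell.LSTMStateTuple(c_in, h_in))

logits = tf.layers.dense(rnn_out[:, -1, :], outputs_n)
loss = tf.losses.sigmoid_cross_entropy(y, logits)
train_op = tf.train.AdamOptimizer().minimize(loss)

# Training loop: feed the state returned by the previous batch back into c_in/h_in,
# and feed zeros whenever you would have called model.reset_states().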
Related
I am implementing a Machine Learning module that should run on a Raspberry Pi which, at the moment, is shared among different services.
My idea is to store on the device only the code in charge of retrieving the inputs of the ML module and performing the prediction, together with the file containing the Neural Network model already fitted using Keras.
In other words, I would like to avoid installing all the Keras/Tensorflow packages and dependencies, since my purpose is only to perform predictions with a trained model, not to train a new one.
Is there a way to do that? Are there any lightweight libraries that allow loading a Neural Network model (with all its weights and biases) and performing a prediction, given the inputs?
What I am able to do now is to load a ".h5" file containing the model, weights and biases onto the Raspberry Pi, but I still have to declare the model-building function through Keras.
from tensorflow.keras.models import load_model
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def NN_model():
    '''
    Definition of the Neural Network model
    '''
    model = Sequential()
    model.add(Dense(7, input_dim=6, kernel_initializer='normal', activation='relu'))
    model.add(Dense(15, kernel_initializer='normal', activation='relu'))
    model.add(Dense(24, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

'''
Load NN model and use it to predict the radiation values
for the next 24 hours, hour by hour
'''
regr = KerasRegressor(build_fn=NN_model, epochs=1000, batch_size=5, verbose=0)
regr.model = load_model('saved_model.h5')
pred = regr.predict(input_row)
Since a fitted Neural Network is just a matter of weights and biases (and activation functions), I would expect that, once these parameters are determined, I wouldn't need the whole Tensorflow and Keras environment to map the inputs I give to the NN to an output.
What I would like to have is just something like:
import lightweight_module as lm
regression_model = lm.load_model('saved_model.h5')
prediction = regression_model.predict(inputs)
What you can do is prune your neural network while retaining the same accuracy. Pruning removes all the unwanted connections between neurons that do not learn anything significant. It not only reduces the complexity of your NN, it also drastically reduces the storage space required and the inference time. In Keras I don't know of any built-in module for this (though I think people have made their own versions), but frameworks like PyTorch and Caffe have pruned implementations of AlexNet and VGGNet that can reduce the size of the model by up to 49x. You can find one such implementation here.
https://github.com/felzek/AlexNet-A-Practical-Implementation/blob/master/testModel.py
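For what it's worth, here is a minimal sketch of magnitude-based pruning of a Keras model, assuming the tensorflow-model-optimization package is available (it is not mentioned in the answer above, and model, x_train and y_train stand for your own fitted model and data):

import tensorflow_model_optimization as tfmot

# Hypothetical schedule: ramp sparsity from 50% to 80% over the first 1000 steps.
pruning_schedule = tfmot.sparsity.keras.PolynomialDecay(
    initial_sparsity=0.5, final_sparsity=0.8, begin_step=0, end_step=1000)

# Wrap the already-built Keras model so its weights get pruned while fine-tuning.
pruned_model = tfmot.sparsity.keras.prune_low_magnitude(
    model, pruning_schedule=pruning_schedule)
pruned_model.compile(loss='mean_squared_error', optimizer='adam')

# UpdatePruningStep must be passed as a callback for the schedule to advance.
pruned_model.fit(x_train, y_train, epochs=10,
                 callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

# Remove the pruning wrappers before saving the smaller model.
final_model = tfmot.sparsity.keras.strip_pruning(pruned_model)
final_model.save('pruned_model.h5')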
I am wondering whether Keras model compile/training with the functional API will train variables defined by tf.get_variable. Can Keras training also incorporate Tensorflow operations?
So basically I am looking to define a Keras model with Tensorflow variables and operations, then use
model = tf.keras.Model(inputs=inputs, outputs=predictions)
model.compile(optimizer=optimizer, loss=loss)
model.fit(data, labels, batch_size=batch_size, epochs=epochs)
to train the model. The reason for this is that Google's TPUs require either the Keras or the TF.Estimator API, with Keras being the recommended one, so I am looking to see how easily I can convert my model.
Background
It looks like, since Tensorflow is the backend, there are ways to mix Keras and Tensorflow variables. This blog post shows how Keras variables are trained using a Tensorflow graph/session:
https://blog.keras.io/keras-as-a-simplified-interface-to-tensorflow-tutorial.html
import tensorflow as tf
from keras.layers import Dense, Dropout
from keras import backend as K
from keras.objectives import categorical_crossentropy
from keras.metrics import categorical_accuracy as accuracy
from tensorflow.examples.tutorials.mnist import input_data

sess = tf.Session()
K.set_session(sess)
mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)

img = tf.placeholder(tf.float32, shape=(None, 784))
labels = tf.placeholder(tf.float32, shape=(None, 10))

x = Dense(128, activation='relu')(img)
x = Dropout(0.5)(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
preds = Dense(10, activation='softmax')(x)

loss = tf.reduce_mean(categorical_crossentropy(labels, preds))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

sess.run(tf.global_variables_initializer())
with sess.as_default():
    for i in range(100):
        batch = mnist_data.train.next_batch(50)
        train_step.run(feed_dict={img: batch[0],
                                  labels: batch[1],
                                  K.learning_phase(): 1})

acc_value = accuracy(labels, preds)
with sess.as_default():
    print(acc_value.eval(feed_dict={img: mnist_data.test.images,
                                    labels: mnist_data.test.labels,
                                    K.learning_phase(): 0}))
And this question also shows that Tensorflow tensors can be used as input to a Keras model:
How to set the input of a Keras layer of a functional model, with a Tensorflow tensor?
tf_embedding_input = ... # pre-processing output tensor
# Keras model
model = Sequential()
model.add(Input(tensor=tf_embedding_input))
model.add(Embedding(max_features, 128, input_length=maxlen))
So I am wondering if Keras can train Tensorflow variables.
Example
I would like to train the embedding and softmax variables in the Tensorflow architecture below
embeddings = tf.get_variable('embeddings',
    initializer=tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))

softmax_weights = tf.get_variable('softmax_weights',
    initializer=tf.truncated_normal([vocabulary_size, embedding_size],
                                    stddev=1.0 / math.sqrt(embedding_size)))
softmax_biases = tf.get_variable('softmax_biases',
    initializer=tf.zeros([vocabulary_size]), trainable=False)

embed = tf.nn.embedding_lookup(embeddings, train_dataset)  # train data set is
embed_reshaped = tf.reshape(embed, [batch_size * num_inputs, embedding_size])

segments = np.arange(batch_size).repeat(num_inputs)
averaged_embeds = tf.segment_mean(embed_reshaped, segments, name=None)

loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(weights=softmax_weights, biases=softmax_biases,
                               inputs=averaged_embeds, labels=train_labels,
                               num_sampled=num_sampled, num_classes=vocabulary_size))
Since Tensorflow Keras uses a Tensorflow backend, I'm guessing it's somehow possible to use and train Tensorflow variables and use Tensorflow operations in training.
Why do I want to do this?
Google's TPUs require that your architecture be implemented via the Estimator API or the Keras API. Since the Keras API is the recommended one, there is probably interest in converting a regular Tensorflow graph/session to the Keras API with as few alterations to the code as possible.
Knowing how to incorporate Tensorflow operations and train Tensorflow variables using the Keras model compile/train would greatly help with this.
A little background:
As we know, Keras is a model-level library, providing high-level building blocks for developing deep learning models.
The most important thing: the Keras API does not handle tensor operations itself. It needs a well-optimized tensor manipulation library to do so, known as a "backend engine" for Keras.
At this time, Keras has three backend engines available: the TensorFlow backend (Google), the Theano backend, and the CNTK backend (MSFT).
Knowing how to incorporate Tensorflow operations and train Tensorflow variables using the Keras model compile/train would greatly help with this.
The only thing you should ask yourself is: what is the difference between a Keras variable and a regular Tensorflow variable?
It happens that Keras variables carry extra metadata, so in order to use TensorFlow variables in Keras you convert them.
Note: A TensorFlow variable scope will have no effect on a Keras layer or model.
Finally, variable sharing can be done by initializing the Keras layer (or model).
Would this solution help?
keras add external trainable variable to graph
You could feed your embeddings and softmax layers into the Keras model using
model.add()
and then define those variables as trainable using
model.layers[-1].trainable_weights.extend()
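As an alternative sketch (not the trainable_weights.extend() route suggested above), one could wrap the embedding lookup in a custom Keras layer whose weight is registered through add_weight, so Keras trains it like any other variable; vocabulary_size and embedding_size come from the question, and this simplified layer averages embeddings rather than reproducing the full sampled-softmax setup:

import tensorflow as tf
from tensorflow.keras.layers import Layer

class AverageEmbedding(Layer):
    """Owns the embedding matrix as a trainable Keras weight and averages the looked-up vectors."""
    def __init__(self, vocabulary_size, embedding_size, **kwargs):
        super(AverageEmbedding, self).__init__(**kwargs)
        self.vocabulary_size = vocabulary_size
        self.embedding_size = embedding_size

    def build(self, input_shape):
        # Registered via add_weight, so Keras tracks, trains and saves it.
        self.embeddings = self.add_weight(
            name='embeddings',
            shape=(self.vocabulary_size, self.embedding_size),
            initializer='uniform',
            trainable=True)
        super(AverageEmbedding, self).build(input_shape)

    def call(self, inputs):
        # A raw TensorFlow op is fine inside call().
        embedded = tf.nn.embedding_lookup(self.embeddings, inputs)
        # Average over the word dimension, mirroring tf.segment_mean in the question.
        return tf.reduce_mean(embedded, axis=1)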
I wanted to add a tanh layer on top of an embedding layer with the Keras functional API:
x = layers.Embedding(vocab_size, 8, input_length=max_length)(input)
output = keras.activations.tanh(x)
model = Model(inputs=input, outputs=output)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(data, labels)
but the system told me I must use Keras layers, not tensors. I searched a lot of Keras tutorials, and there seems to be only one way to solve this problem:
model.add(Activation('tanh'))
but that is for the Sequential model, which I don't want to use. Is there a way to solve this with the functional API?
With the functional API it's almost the same as with the Sequential model:
output = Activation('tanh')(x)
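Put together with the question's code (vocab_size, max_length, data and labels are the question's own names), a minimal sketch could look like this:

from keras.layers import Input, Embedding, Activation
from keras.models import Model

inputs = Input(shape=(max_length,))
x = Embedding(vocab_size, 8, input_length=max_length)(inputs)
output = Activation('tanh')(x)

model = Model(inputs=inputs, outputs=output)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(data, labels)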
I am working on a 2D RGB pixel-based image classification problem via convolution neural networks (CNN) in Keras. My full CNN model can be found here.
I do the following to train/fit the CNN model:
model = my_CNN_unet()
model_checkpoint = ModelCheckpoint('testweights_{epoch:02d}.hdf5')
model.fit(x_trn, y_trn, batch_size=50, epochs=3, verbose=1, shuffle=True,
callbacks=[model_checkpoint], validation_data=(x_val, y_val))
How can I change my code so that I use pre-trained weights (i.e., transfer learning) from well-known CNN architectures such as VGG and Inception?
As people have mentioned in the comments, keras.applications provides a way for you to access pretrained models. As an example:
import keras
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D

model_base = keras.applications.vgg16.VGG16(include_top=False, input_shape=(*IMG_SIZE, 3), weights='imagenet')
output = model_base.output
# The convolutional output is 4D, so pool (or flatten) it before the classifier head.
output = GlobalAveragePooling2D()(output)
# Add any other layers you want to `output` here...
output = Dense(len(categories), activation='softmax')(output)
model = Model(model_base.input, output)

# Freeze the pretrained convolutional base so only the new head is trained.
for layer in model_base.layers:
    layer.trainable = False

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
You can train this model in the same way you trained your previous CNN. Keras applications provides access to many models such as Inception, VGG16, VGG19, ResNet, and more; you can access them all in a similar way. I wrote a blog post walking through how to use transfer learning in Keras to build an image classifier here: http://innolitics.com/10x/pretrained-models-with-keras/. It's got a working code example that you can look at as well.
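As a usage sketch, the transfer model defined above can then be trained with the same call as in the question, assuming x_trn, y_trn, x_val and y_val are one-hot encoded to match categorical_crossentropy:

from keras.callbacks import ModelCheckpoint

# Same training call as in the question, now applied to the transfer-learning model.
model_checkpoint = ModelCheckpoint('testweights_{epoch:02d}.hdf5')
model.fit(x_trn, y_trn, batch_size=50, epochs=3, verbose=1, shuffle=True,
          callbacks=[model_checkpoint], validation_data=(x_val, y_val))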
I am using Keras to build a simple sequence classification neural network. While playing with the different modules, I found that there are two ways to create a sequential neural network.
The first way is to use the Sequential API. This is the most common way, which I found in a lot of tutorials/documentation.
Here is the code:
# Sequential Neural Network using Sequential()
model = Sequential()
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu', input_shape=(27, 300)))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(100))
model.add(Dense(7, activation='softmax'))
model.summary()
The second way is to build the sequential neural network from "scratch" with the Model API. Here is the code:
# Sequential neural network using Model()
inputs = Input(shape=(27, 300))
x = Conv1D(filters=32, kernel_size=3, padding='same', activation='relu')(inputs)
x = MaxPooling1D(pool_size=2)(x)
x = LSTM(100)(x)
predictions = Dense(7, activation='softmax')(x)
model = Model(inputs=inputs, outputs=predictions)
model.summary()
I trained both of them with a fixed seed (np.random.seed(1337)) and the same training data, and my outputs are different...
The only difference in the summaries is the extra input layer shown for the Model API.
Does anyone know why these neural networks are different?
And if they are not, why did I get different results?
Thanks
You set up the random seed only in NumPy and not in TensorFlow (which is presumably the Keras backend in your case). Try adding this to your code:
from numpy.random import seed
seed(1337)
from tensorflow import set_random_seed
set_random_seed(1337)
A detailed article about this topic can be found here.
tf.keras.backend.clear_session()
tf.random.set_seed(seed_value)
You can use the above code block, run the loaded model for some iterations, and check whether the error still persists.
I was facing the same reproducibility issue, and this worked for me.
As mentioned by andrey, over and above these two seed setters, you need to set up the Python hash seed environment variable:
import os
os.environ['PYTHONHASHSEED']=str(seed_value)
You can also add one more block to force TensorFlow to use a single thread (if you are on a multicore machine), since multiple threads are a potential source of non-reproducible results.
from keras import backend as K

session_conf = tf.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1)
sess = tf.Session(config=session_conf)
# Make Keras use this single-threaded session.
K.set_session(sess)
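As a side note, in TensorFlow 2.x (where ConfigProto and Session no longer exist) the equivalent single-threading can be requested via tf.config; this goes beyond the original answer:

import tensorflow as tf

# Force single-threaded op execution in TF 2.x for reproducibility.
tf.config.threading.set_intra_op_parallelism_threads(1)
tf.config.threading.set_inter_op_parallelism_threads(1)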