Essentially, I am looking for something like with tf.device('/device:GPU:0'): for Keras. I want to put my operations on different GPUs. I am using a Sequential model along the following lines:
...
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
...
model.fit(train_images, train_labels, epochs=5)
You can probably use with K.tf.device('/gpu:1'): as a context. Or, since the backend is TensorFlow, the way you assign a GPU in TF should work for Keras as well.
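A minimal sketch of that idea, assuming the TensorFlow backend and a Sequential model like the one in the question (the layer sizes are placeholders, not from the original post):

import tensorflow as tf
from tensorflow import keras

# Build and compile inside the device context so the ops are placed on the chosen GPU.
with tf.device('/device:GPU:0'):
    model = keras.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        keras.layers.Dense(128, activation='relu'),
        keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

# Other models or ops can be pinned to a different GPU the same way,
# e.g. with tf.device('/device:GPU:1'): ...
with tf.device('/device:GPU:0'):
    model.fit(train_images, train_labels, epochs=5)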
I am writing a neural network to train incrementally (not online). Here is a snippet of the code
output = create_model()
model = Model(inputs=values, outputs=output)
if start_epoch > 1:
    weights_list = load_model_from_pickle()
    model.set_weights(weights_list)
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(data, label, epochs=1, verbose=1, batch_size=1024, shuffle=False)
In essence, I want to load previously trained weights and train for a few more epochs. I read in some SO replies that calling compile changes the weights. Is there any other way to do it? Does it make sense to set the weights after calling compile? Will the answer change if I run my model in a multi-GPU setting?
You need to compile the model only once; after training, when you reload the model, you don't need to compile it again. Read more here.
The compile function defines the optimizer, loss function, and metrics you want. It does not change any weights. For more detailed information, read here.
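As a minimal sketch of the resume-training flow (reusing the names from the question, such as create_model and load_model_from_pickle), setting the weights after compile() works just as well as before it, because compile() only configures the optimizer, loss, and metrics:

output = create_model()
model = Model(inputs=values, outputs=output)
model.compile(loss='binary_crossentropy', optimizer='adam')

# Restoring the weights after compile() is fine: compile() does not touch them.
if start_epoch > 1:
    model.set_weights(load_model_from_pickle())

model.fit(data, label, epochs=1, verbose=1, batch_size=1024, shuffle=False)

Note that this restores only the layer weights; the optimizer state (e.g. Adam's moment estimates) starts fresh, which is usually acceptable for a few extra epochs.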
I have Keras with the TensorFlow backend, which runs on the GPU. However, since I am training an LSTM, I am training on the CPU instead.
with tf.device('/cpu:0'):
    model = Sequential()
    model.add(Bidirectional(LSTM(50, return_sequences=True), input_shape=(50, len(train_x[0][0]))))
    model.add(TimeDistributed(Dense(1, activation='sigmoid')))
    model.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['acc'])
The problem I have is that when I save and load the model, the predict function for the loaded model performs very slowly. After some timed tests, I believe what is happening is that the loaded model is running on the GPU rather than the CPU, so it is slow. I tried compiling the loaded model on the CPU, but this does not speed things up:
model.save('test_model.h5')
new_model = load_model('test_model.h5')
with tf.device('/cpu:0'):
    new_model.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['acc'])
Is there a way to achieve the same speeds with the loaded model as with the newly trained model? The newly trained model is almost five times faster. Thanks for your help.
Load the model with the device you want to use:
with tf.device('/cpu:0'):
    new_model = load_model('test_model.h5')
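As an optional sanity check (a sketch reusing the train_x array from the question), you can time predict on the reloaded model; with the graph built under /cpu:0 it should run about as fast as the freshly trained model:

import time

start = time.time()
new_model.predict(train_x[:100])
print('predict time: %.3fs' % (time.time() - start))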
I wrote some simple code to learn Keras:
from tensorflow import keras
def main():
    (x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
    model = keras.Sequential()
    model.add(keras.layers.Conv2D(16, 3, padding='same', activation='relu'))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(10, activation='softmax'))
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=4)
    model.summary()

if __name__ == '__main__':
    main()
But it seems not to learn anything. Not that it should learn much, but it should at least decrease the loss and increase accuracy a little. Instead, both are stuck at the same values every epoch.
I had the exact same model written in PyTorch and it achieved around 35% accuracy. This TensorFlow + Keras version is stuck at 10%.
tensorflow-gpu v1.9
What am I missing?
I think the default learning rate is too high for this problem. Try something like
opt = keras.optimizers.Adam(lr=1.e-5)
model.compile(optimizer=opt, loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
I checked the default learning rate used by Adam in both Keras and PyTorch, and both use 1e-3. Therefore, the learning rate should not be the issue, assuming you use the default in both models.
Alternatively, I think this is related to the weight initialization, which is explicitly handled by each layer in Keras but not in PyTorch.
By simply changing the training line to the following,
model.fit(x_train/255., y_train, shuffle=True,
          validation_data=(x_test/255., y_test), epochs=4)
you should observe both training and validation accuracy reach around 60%.
I am not familiar with PyTorch, but I suggest you initialize the weights in the Keras network with those used by the PyTorch network. That way, you will have a fair comparison.
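A minimal sketch of that suggestion (the glorot_uniform choice and fixed seed are illustrative, not from the original post): pin each layer's initializer so the Keras run starts from reproducible weights that you can match on the PyTorch side.

from tensorflow import keras

init = keras.initializers.glorot_uniform(seed=0)
model = keras.Sequential([
    keras.layers.Conv2D(16, 3, padding='same', activation='relu',
                        kernel_initializer=init, input_shape=(32, 32, 3)),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation='softmax', kernel_initializer=init),
])

# Or copy the PyTorch weights directly: pytorch_weights would be a hypothetical list
# of numpy arrays in Keras layer order (Conv2D kernels transposed to Keras'
# (h, w, in, out) layout).
# model.set_weights(pytorch_weights)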
I am working on a 2D RGB pixel-based image classification problem using convolutional neural networks (CNNs) in Keras. My full CNN model can be found here.
I do the following to train/fit the CNN model:
model = my_CNN_unet()
model_checkpoint = ModelCheckpoint('testweights_{epoch:02d}.hdf5')
model.fit(x_trn, y_trn, batch_size=50, epochs=3, verbose=1, shuffle=True,
          callbacks=[model_checkpoint], validation_data=(x_val, y_val))
How can I change my code so that I use pre-trained weights (i.e., transfer learning) from well-known CNN architectures such as VGG and Inception?
As people have mentioned in the comments, keras.applications provides a way for you to access pretrained models. As an example:
import keras
from keras.models import Model
from keras.layers import Dense, Flatten

model_base = keras.applications.vgg16.VGG16(include_top=False, input_shape=(*IMG_SIZE, 3), weights='imagenet')
output = model_base.output
# Add any other layers you want to `output` here...
# With include_top=False the base ends in a 4D feature map, so flatten it before the classifier.
output = Flatten()(output)
output = Dense(len(categories), activation='softmax')(output)
model = Model(model_base.input, output)

# Freeze the pretrained convolutional base so only the new layers are trained.
for layer in model_base.layers:
    layer.trainable = False

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
You can train this model in the same way you trained your previous CNN. Keras applications provides access to many models such as Inception, VGG16, VGG19, ResNet, and more; you can access them all in a similar way. I wrote a blog post walking through how to use transfer learning in Keras to build an image classifier here: http://innolitics.com/10x/pretrained-models-with-keras/. It has a working code example that you can look at as well.
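For completeness, a hypothetical usage sketch that reuses the training call from the question; VGG16 expects inputs preprocessed with its own preprocess_input function:

from keras.applications.vgg16 import preprocess_input

model.fit(preprocess_input(x_trn), y_trn, batch_size=50, epochs=3, verbose=1,
          shuffle=True, callbacks=[model_checkpoint],
          validation_data=(preprocess_input(x_val), y_val))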
I have several neural networks built using Keras that I have so far used mostly in Jupyter. I often save scikit-learn models with joblib and Keras models with json + hdf5, and use them in other notebooks without issue.
I made a Python Spark application that can make use of those serialized models in cluster mode. The joblib models work fine; however, I ran into an issue with Keras.
Here is the model used in both the notebook and PySpark:
def build_gru_model():
    model = Sequential()
    model.add(Embedding(max_nb_words, 128, input_length=max_sequence_length, dropout=0.2))
    model.add(GRU(128, dropout_W=0.2, dropout_U=0.2))
    model.add(Dense(2, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
Both are called the same way:
preds = model.predict_proba(data, verbose=0)
However, only in Spark do I get the error:
MissingInputError: ("An input of the graph, used to compute DimShuffle{x,x,x,x}(keras_learning_phase), was not provided and not given a value.Use the Theano flag exception_verbosity='high',for more information on this error.", keras_learning_phase)
I've done the mandatory search and found: https://github.com/fchollet/keras/issues/2430 which points to https://keras.io/getting-started/faq/
If I remove dropout from my model, it indeed works. However, I fail to understand how to implement something that would allow me to keep dropout during the training phase, as described in the FAQ.
Based on the model code, how would one accomplish this?
You can try putting the following before your prediction:
import keras.backend as K
K.set_learning_phase(0)
It should set your learning phase to 0 (test time).
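A minimal sketch of where that call would go (gru_weights.h5 is a hypothetical weights file): set the learning phase before the model graph is built or loaded, so the dropout ops are compiled in test mode and Theano no longer asks for the keras_learning_phase input.

import keras.backend as K

K.set_learning_phase(0)  # test mode: dropout becomes a no-op in the compiled graph

model = build_gru_model()
model.load_weights('gru_weights.h5')
preds = model.predict_proba(data, verbose=0)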