Migrating to TF 2.0, I'm trying to use the tf.keras approach for solving things.
In standard TF, I can use with tf.device(...) to control where ops are placed.
For example, I might have a model something like
model = tf.keras.Sequential([tf.keras.layers.Input(..),
tf.keras.layers.Embedding(...),
tf.keras.layers.LSTM(...),
...])
Assuming I want to have the network up to and including the Embedding layer on the CPU, and everything from there onwards on the GPU, how would I go about that?
(This is just an example, the layers could have nothing to do with embeddings)
If the solution involves subclassing tf.keras.Model, that is OK too; I don't mind not using Sequential.
You can use the Keras functional API:
inputs = tf.keras.layers.Input(..)

with tf.device("/CPU:0"):
    embedded = tf.keras.layers.Embedding(...)(inputs)

with tf.device("/GPU:0"):
    outputs = tf.keras.layers.LSTM(...)(embedded)

model = tf.keras.Model(inputs=inputs, outputs=outputs)
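For a concrete picture, here is a minimal runnable sketch of the same pattern with made-up layer sizes (vocab_size, embed_dim, lstm_units and num_classes are placeholders, and it assumes a device named /GPU:0 is visible):

import tensorflow as tf

vocab_size, embed_dim, lstm_units, num_classes = 10000, 64, 128, 5  # placeholder sizes

inputs = tf.keras.layers.Input(shape=(None,), dtype="int32")

with tf.device("/CPU:0"):
    # embedding lookup kept on the CPU
    x = tf.keras.layers.Embedding(vocab_size, embed_dim)(inputs)

with tf.device("/GPU:0"):
    # everything from here on requested on the GPU
    x = tf.keras.layers.LSTM(lstm_units)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.summary()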
Is there a method to apply a low-pass filter on the inputs of a Keras model? I have 4 inputs of noisy sensor data, and I'm curious whether I can build the filter into the model before I export it for inference with ONNX, or whether I need to filter outside the model.
I'm pretty new to ML, but currently my model works perfectly when I run the low-pass filter prior to the model. My goal is to limit user error by being able to attach the model directly to the sensor output.
Not sure if this helps, but since you're using Keras, you can include a Lambda layer between your Input layer and the rest of your model. The Lambda layer lets you run arbitrary TensorFlow code on the model inputs, so in theory you could implement your filter in this Lambda layer, provided it is something you can express with the TensorFlow primitives.
For instance, consider the following code:
input_layer = keras.layers.Input(shape=(signal_length, no_channels))
sig_diff = keras.layers.Lambda(lambda x: x[:, 1:] - x[:, :-1])(input_layer)
conv_1 = keras.layers.Conv1D(filters=5, kernel_size=3)(sig_diff)
#rest of model goes here
In this very trivial example, I'm differencing the input signal before passing it to some sort of convolutional model. You might be able to implement your low pass filter in the lambda layer as above, which should solve the problem.
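Following the same idea, here is a hedged sketch of what a concrete low-pass stand-in could look like inside the Lambda layer: a per-channel moving average implemented with tf.nn.avg_pool1d. The window, signal_length and no_channels values are placeholders, and a moving average is only one of many possible filters:

import tensorflow as tf
from tensorflow import keras

signal_length, no_channels = 256, 4   # placeholder shapes for the noisy sensor inputs
window = 5                            # placeholder smoothing window

input_layer = keras.layers.Input(shape=(signal_length, no_channels))

# moving average over a sliding window as a crude low-pass filter;
# 'SAME' padding keeps the sequence length unchanged
smoothed = keras.layers.Lambda(
    lambda x: tf.nn.avg_pool1d(x, ksize=window, strides=1, padding="SAME")
)(input_layer)

conv_1 = keras.layers.Conv1D(filters=5, kernel_size=3)(smoothed)
# rest of the model goes here

Because the filter is expressed as ordinary TensorFlow ops, it should be carried along into the exported graph, though it is worth checking that your ONNX exporter supports the pooling op.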
I am trying to implement a model that uses encodings from multiple pre-trained BERT models on different datasets and gets a combined representation using a fully-connected layer. In this setup, I want the BERT models to remain fixed and only the fully-connected layers to be trained. Is it possible to achieve this in huggingface-transformers? I don't see any flag which allows me to do that.
PS: I don't want to go by the way of dumping the encoding of inputs for each BERT model and use them as inputs.
A simple solution is to exclude the parameters belonging to the BERT models when building the optimizer.
# assuming `model` wraps the BERT encoders plus the fully-connected head
param_optimizer = list(model.named_parameters())
param_optimizer = [x for x in param_optimizer if 'bert' not in x[0]]   # drop BERT params
optimizer = AdamW([p for n, p in param_optimizer], lr=lr)
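An alternative (not part of the original answer) is to freeze the encoders explicitly by turning off requires_grad, so the optimizer never sees them at all. A rough sketch, with placeholder checkpoint names and head size:

import torch
from transformers import BertModel

# placeholder checkpoints standing in for the pre-trained BERT models
bert_a = BertModel.from_pretrained("bert-base-uncased")
bert_b = BertModel.from_pretrained("bert-base-cased")

# keep the pre-trained encoders fixed
for encoder in (bert_a, bert_b):
    for param in encoder.parameters():
        param.requires_grad = False

# fully-connected layer over the concatenated pooled outputs
head = torch.nn.Linear(bert_a.config.hidden_size + bert_b.config.hidden_size, 256)

# only the head's parameters are handed to the optimizer
optimizer = torch.optim.AdamW(head.parameters(), lr=2e-5)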
I am creating a model somewhat similar to the one mentioned below:
[diagram of the intended model]
I am using Keras to create such a model, but I have hit a dead end because I have not been able to find a way to add a softmax to the outputs of the LSTM units. So far, all the tutorials and help material I have found only cover outputting a single class, as in the image-captioning case provided in this link.
So, is it possible to apply a softmax to every timestep of the LSTM output (where return_sequences is true), or do I have to move to PyTorch?
The answer is: yes, it is possible to apply a softmax to each timestep of the LSTM output, and no, you do not have to move to PyTorch.
While in Keras 1.X you needed to explicitly wrap the Dense layer in a TimeDistributed layer, in Keras 2.X a Dense layer is applied to the last axis of a 3D input automatically, so you can just write:
model.add(LSTM(50, activation='relu', return_sequences=True))
model.add(Dense(number_of_classes, activation='softmax'))
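To make that concrete, here is a minimal self-contained sketch with placeholder dimensions; the model ends with a softmax over the classes at every timestep, and the commented TimeDistributed wrapper is the explicit Keras 1.X-style equivalent:

import tensorflow as tf
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

timesteps, features, number_of_classes = 20, 8, 5   # placeholder dimensions

model = Sequential([
    LSTM(50, activation='relu', return_sequences=True,
         input_shape=(timesteps, features)),
    # applied independently at every timestep; equivalent to
    # TimeDistributed(Dense(number_of_classes, activation='softmax'))
    Dense(number_of_classes, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.summary()   # final output shape: (None, timesteps, number_of_classes)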
Is there a (more or less) simple way to write a complicated NN model so that it is trainable in eager mode? Are there examples of such code?
For example, I want to use InceptionResNetV2. I have code created with tf.contrib.slim. According to this link, https://github.com/tensorflow/tensorflow/issues/16182 , slim is deprecated and I need to use Keras. And I really can't use the slim code for training with eager execution, because I can't take the list of variables and apply gradients (OK, I can try to wrap the model in a GradientTape, but I'm not sure what to do with the regularization loss).
Ok, let's try Keras.
In [30]: tf.__version__
Out[30]: '1.13.1'
In [31]: tf.enable_eager_execution()
In [32]: from keras.applications.inception_resnet_v2 import InceptionResNetV2
In [33]: model = InceptionResNetV2(weights=None)
...
/usr/local/lib/python3.6/dist-packages/keras_applications/inception_resnet_v2.py in InceptionResNetV2(include_top, weights, input_tensor, input_shape, pooling, classes, **kwargs)
246
247 if input_tensor is None:
--> 248 img_input = layers.Input(shape=input_shape)
249 else:
250 if not backend.is_keras_tensor(input_tensor):
...
RuntimeError: tf.placeholder() is not compatible with eager execution.
Doesn't work by default.
In this tutorial they say that I need to define my own model class and maintain the variables myself: https://www.tensorflow.org/tutorials/eager/custom_training#define_the_model. I'm not sure that I want to do that for Inception. There are too many variables to create and maintain. It would be like going back to the old versions of TF, from the days before slim even existed.
In this tutorial the networks are created with Keras, https://www.tensorflow.org/tutorials/eager/custom_training_walkthrough#create_a_model_using_keras , but I doubt that I can easily maintain a complicated structure that way, by only defining the model without wiring it up with Input. For example, in this article, if I understand correctly, the author initializes a Keras Input and propagates it through the model (which causes the RuntimeError when used with eager execution, as you saw earlier). I can make my own model by subclassing the model class as described here: https://www.tensorflow.org/api_docs/python/tf/keras/Model . Oops, this way I need to maintain layers instead of variables. That seems like almost the same problem to me.
There is an interesting mention of AutoGraph here: https://www.tensorflow.org/beta/guide/autograph#keras_and_autograph . They only override __call__, so it seems like I don't need to maintain variables in this case, but I haven't tested it yet.
So, is there any simple solution?
Wrap the slim model in a GradientTape? How would I then apply the regularization loss to the weights?
Track every variable myself? Sounds a little painful.
Use Keras? How do I use it with eager execution when the model has branches and a complicated structure?
Using Keras (the last option you list, and the one you already tried) is probably the most common approach. This error:
RuntimeError: tf.placeholder() is not compatible with eager execution.
is because one cannot use a tf.placeholder in eager mode. There is no concept of such a thing when executing eagerly.
You could use the tf.data API to build a dataset for your training data and feed that to the model. Something like this with the datasets replaced with your real data:
import tensorflow as tf
tf.enable_eager_execution()

BATCH_SIZE = 2

model = tf.keras.applications.inception_resnet_v2.InceptionResNetV2(weights=None)
# sparse loss because the labels below are integer class ids
model.compile(tf.keras.optimizers.Adam(),
              loss=tf.keras.losses.sparse_categorical_crossentropy)

### Replace with tf.data.Datasets for your actual training data!
train_x = tf.data.Dataset.from_tensor_slices(tf.random.normal((10, 299, 299, 3)))
train_y = tf.data.Dataset.from_tensor_slices(
    tf.random.uniform((10,), maxval=10, dtype=tf.int32))

training_data = tf.data.Dataset.zip((train_x, train_y)).batch(BATCH_SIZE)
model.fit(training_data)
This approach works as-is in TensorFlow 2.0 too, as mentioned in your title (in 2.0, eager execution is enabled by default, so the tf.enable_eager_execution() call can simply be dropped).
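On the sub-question about regularization losses with a GradientTape: Keras models collect their per-layer regularization terms in model.losses, so a custom eager training loop can just add them to the data loss. A rough sketch under that assumption (tf.train.AdamOptimizer is used because it already exposes apply_gradients in TF 1.13; the batch at the end is random placeholder data):

import tensorflow as tf
tf.enable_eager_execution()

model = tf.keras.applications.inception_resnet_v2.InceptionResNetV2(weights=None)
optimizer = tf.train.AdamOptimizer()

def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        data_loss = tf.reduce_mean(
            tf.keras.losses.sparse_categorical_crossentropy(labels, predictions))
        # any kernel/activity regularizers registered on the layers end up in
        # model.losses; add them to the objective here
        reg_loss = tf.add_n(model.losses) if model.losses else 0.0
        loss = data_loss + reg_loss
    grads = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    return loss

# placeholder batch, just to show the call
images = tf.random.normal((2, 299, 299, 3))
labels = tf.random.uniform((2,), maxval=10, dtype=tf.int32)
print(train_step(images, labels))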
I have been working with Keras for a week or so. I know that Keras can use either TensorFlow or Theano as a backend. In my case, I am using TensorFlow.
So I'm wondering: is there a way to write a NN in Keras, and then print out the equivalent version in TensorFlow?
MVE
For instance suppose I write
from keras.models import Sequential
from keras.layers import Dense

# create seq model
model = Sequential()
# add layers
model.add(Dense(100, input_shape=(10,), activation='relu'))
model.add(Dense(1, activation='linear'))
# compile model
model.compile(optimizer='adam', loss='mse')
# fit
model.fit(Xtrain, ytrain, epochs=100, batch_size=32)
# predict
ypred = model.predict(Xtest, batch_size=32)
# evaluate
result = model.evaluate(Xtest, ytest)
This code might be wrong, since I just started, but I think you get the idea.
What I want to do is write down this code, run it (or not even, maybe!) and then have a function or something that will produce the TensorFlow code that Keras has written to do all these calculations.
First, let's clarify some of the language in the question. TensorFlow (and Theano) use computational graphs to perform tensor computations. So, when you ask if there is a way to "print out the equivalent version" in Tensorflow, or "produce TensorFlow code," what you're really asking is, how do you export a TensorFlow graph from a Keras model?
As the Keras author states in this thread,
When you are using the TensorFlow backend, your Keras code is actually building a TF graph. You can just grab this graph.
Keras only uses one graph and one session.
However, he links to a tutorial whose details are now outdated, although the basic concept has not changed.
We just need to:
Get the TensorFlow session
Export the computation graph from the TensorFlow session
Do it with Keras
The keras_to_tensorflow repository contains a short example, in an iPython notebook, of how to export a model from Keras for use in TensorFlow. Under the hood it is basically just using TensorFlow. It isn't a clearly written example, but I'm throwing it out there as a resource.
Do it with TensorFlow
It turns out we can actually get the TensorFlow session that Keras is using from TensorFlow itself, using the tf.contrib.keras.backend.get_session() function. It's pretty simple to do - just import and call. This returns the TensorFlow session.
Once you have the TensorFlow session variable, you can use the SavedModelBuilder to save your computational graph (guide + example to using SavedModelBuilder in the TensorFlow docs). If you're wondering how the SavedModelBuilder works and what it actually gives you, the SavedModelBuilder Readme in the Github repo is a good guide.
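Putting those two steps together, here is a hedged sketch of what the export could look like for the Sequential model from the question (the export directory and signature names are placeholders, and it targets the TF 1.x SavedModelBuilder API the answer refers to):

import tensorflow as tf
from keras import backend as K          # or tf.contrib.keras.backend, as above

sess = K.get_session()                  # the session Keras built the graph in

builder = tf.saved_model.builder.SavedModelBuilder('./exported_model')
signature = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={'inputs': model.input},
    outputs={'outputs': model.output})

builder.add_meta_graph_and_variables(
    sess,
    tags=[tf.saved_model.tag_constants.SERVING],
    signature_def_map={'serving_default': signature})
builder.save()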
P.S. - If you are planning on heavy usage of TensorFlow + Keras in combination, have a look at the other modules available in tf.contrib.keras
So you want to use a different function than WX+b for your neurons. Well, in TensorFlow you explicitly calculate this product, so for example you write
y_ = tf.matmul(X, W)
You simply have to write your own formula there instead and let the network learn it. It should not be difficult to implement.
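As a minimal sketch (TF 1.x style to match the rest of the thread, with placeholder shapes), replacing the usual WX+b with your own per-neuron formula just means writing different ops; the quadratic term below is an arbitrary stand-in for whatever formula you have in mind:

import tensorflow as tf

X = tf.placeholder(tf.float32, shape=[None, 10])
W = tf.Variable(tf.random_normal([10, 5]))
b = tf.Variable(tf.zeros([5]))

y_linear = tf.matmul(X, W) + b              # the standard affine neuron
y_custom = tf.matmul(tf.square(X), W) + b   # your own formula goes here instead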
In addition, what you are trying to do (according to the paper you link) is called batch normalization and is relatively standard. The idea is that you normalize your intermediate activations (in the different layers). Check for example https://arxiv.org/abs/1502.03167 or https://bcourses.berkeley.edu/files/66022277/download?download_frd=1&verifier=oaU8pqXDDwZ1zidoDBTgLzR8CPSkWe6MCBKUYan7
Hope that helps,
Umberto