I'm trying to solve the 2D Darcy equation in a mixed formulation. Suppose I have a target vector and a source vector as follows:
u = [u1,u2,p]
x = [x,y].
grad(u,x) =
[du1/dx, du2/dx, dp/dx;
du1/dy, du2/dy, dp/dy]
I don't understand whether this is what I get if I call tf.gradients(u,x).
tf.gradients(u,x) doesn't return what you want. From https://www.tensorflow.org/api_docs/python/tf/gradients:
gradients() adds ops to the graph to output the derivatives of ys with
respect to xs. It returns a list of Tensor of length len(xs) where
each tensor is the sum(dy/dx) for y in ys and for x in xs.
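Here is how you can get the Jacobian with tf.GradientTape:
import tensorflow as tf

x = tf.constant([3.0, 4.0])
with tf.GradientTape() as tape:
    tape.watch(x)  # x is a constant, so it must be watched explicitly
    u1 = x[0]**2 + x[1]**2
    u2 = x[0]**2
    u3 = x[1]**3
    u = tf.stack([u1, u2, u3])
# J[i, j] = du_i/dx_j, so J has shape (3, 2)
J = tape.jacobian(u, x)
print(J)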
'''
tf.Tensor(
[[ 6. 8.]
[ 6. 0.]
[ 0. 48.]], shape=(3, 2), dtype=float32)
'''
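As a sanity check on the numbers above, here is a minimal framework-free sketch that approximates the same Jacobian with central differences. The helper name numerical_jacobian and the step size eps are my own choices for illustration, not part of TensorFlow. It also sums the Jacobian over the outputs, which is the per-input quantity the quoted documentation describes for tf.gradients.

```python
def u(x):
    """Same three components as in the answer above."""
    return [x[0]**2 + x[1]**2, x[0]**2, x[1]**3]


def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian; J[i][j] approximates df_i/dx_j."""
    n_out = len(f(x))
    J = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(n_out):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return J


J = numerical_jacobian(u, [3.0, 4.0])

# Summing over the outputs collapses the Jacobian to one value per input.
summed = [sum(row[j] for row in J) for j in range(2)]

for row in J:
    print(row)
print(summed)
```

The rows should agree with the tape.jacobian output up to rounding, and the summed vector has only two entries, which is why the sum-based result cannot be used as a Jacobian.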
I'm trying to implement a GAN in Keras, and I want to use the one-sided label smoothing trick, i.e. set the label of real images to 0.9 instead of 1. However, the built-in metric binary_crossentropy no longer does the correct thing: it's always 0 for real images.
So I tried to implement my own metric in Keras. I want to convert every 0.9 label to 1, but I'm new to Keras and I don't know how to do that. Here's what I intend:
# Just pseudocode
def custom_metrics(y_true, y_pred):
    if K.equal(y_true, [[0.9]]):
        y_true = y_true+0.1
    return metrics.binary_accuracy(y_true, y_pred)
How should I compare and change the y_true label? Thanks in advance!
EDIT:
The output of the following code is:
def custom_metrics(y_true, y_pred):
    print(K.shape(y_true))
    print(K.shape(y_pred))
    y_true = K.switch(K.equal(y_true, 0.9), K.ones_like(y_true), K.zeros_like(y_true))
    return metrics.binary_accuracy(y_true, y_pred)
Tensor("Shape:0", shape=(2,), dtype=int32)
Tensor("Shape_1:0", shape=(2,), dtype=int32)
ValueError: Shape must be rank 0 but is rank 2 for 'cond/Switch' (op: 'Switch') with input shapes: [?,?], [?,?].
You can use tf.where:
y_true = tf.where(K.equal(y_true, 0.9), tf.ones_like(y_true), tf.zeros_like(y_true))
Alternatively, you can use the keras.backend.switch function:
keras.backend.switch(condition, then_expression, else_expression)
Your custom metrics function would look something like below:
def custom_metrics(y_true, y_pred):
    y_true = K.switch(K.equal(y_true, 0.9),K.ones_like(y_true), K.zeros_like(y_true))
    return metrics.binary_accuracy(y_true, y_pred)
Test code:
import numpy as np
from keras import backend as K

def test_function(y_true):
    print(K.eval(y_true))
    y_true = K.switch(K.equal(y_true, 0.9),K.ones_like(y_true), K.zeros_like(y_true))
    print(K.eval(y_true))

y_true = K.variable(np.array([0, 0, 0, 0, 0, 0.9, 0.9, 0.9, 0.9, 0.9]))
test_function(y_true)
output:
[0. 0. 0. 0. 0. 0.9 0.9 0.9 0.9 0.9]
[0. 0. 0. 0. 0. 1. 1. 1. 1. 1.]
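Comparing floats for exact equality with 0.9 can be fragile, so one alternative is to threshold the labels instead. The sketch below is plain Python rather than Keras, and the function name smoothed_binary_accuracy and the 0.5 threshold are assumptions made for illustration. It only shows the arithmetic a metric of this kind would perform.

```python
def smoothed_binary_accuracy(y_true, y_pred, threshold=0.5):
    """Binary accuracy where smoothed labels (e.g. 0.9) count as class 1."""
    # Harden both labels and predictions with the same threshold.
    hard_true = [1.0 if t > threshold else 0.0 for t in y_true]
    hard_pred = [1.0 if p > threshold else 0.0 for p in y_pred]
    matches = sum(1 for t, p in zip(hard_true, hard_pred) if t == p)
    return matches / len(hard_true)


y_true = [0.0, 0.0, 0.9, 0.9]   # fake images, then smoothed real images
y_pred = [0.1, 0.7, 0.8, 0.3]   # discriminator outputs
acc = smoothed_binary_accuracy(y_true, y_pred)
print(acc)  # 2 of the 4 predictions match
```

In a Keras metric the same hardening step could presumably be written as K.cast(K.greater(y_true, 0.5), K.floatx()) before calling metrics.binary_accuracy, which avoids depending on the exact smoothing value.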
I have a variable that contains the 4x4 identity matrix.
I wish to assign some values to this matrix (these values are learned by the model).
When I use tf.assign() I get an error saying that strided slices do not have gradients.
My question is: how can I do this without using tf.assign()?
Here is sample code showing the desired behaviour (without the error, since the values are not learned here):
params = [[1.0, 2.0, 3.0]]
M = tf.Variable(tf.eye(4, batch_shape=[1]), dtype=tf.float32)
M = tf.assign(M[:, 0:3, 3], params)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
output_val = sess.run(M)
Note - the variable is created solely for the purpose of housing these parameters.
UPDATE: I am adding a minimal working example that reproduces the error. (Obviously, training like this won't result in anything good. It's just to illustrate the error, since my code is far too long to copy here.)
params = [[1.0, 2.0, 3.0]]
M_gt = np.eye(4)
M_gt[0:3, 3] = [4.0, 5.0, 6.0]
M = tf.Variable(tf.eye(4, batch_shape=[1]), dtype=tf.float32)
M = tf.assign(M[:, 0:3, 3], params)
loss = tf.nn.l2_loss(M - M_gt)
optimizer = tf.train.AdamOptimizer(0.001)
train_op = optimizer.minimize(loss)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
sess.run(train_op)
Here is an example of how you could do what (I think) you want:
import tensorflow as tf
import numpy as np
with tf.Graph().as_default(), tf.Session() as sess:
    params = [[1.0, 2.0, 3.0]]
    M_gt = np.eye(4)
    M_gt[0:3, 3] = [4.0, 5.0, 6.0]
    M = tf.Variable(tf.eye(4, batch_shape=[1]), dtype=tf.float32)
    params_t = tf.constant(params, dtype=tf.float32)
    shape_m = tf.shape(M)
    batch_size = shape_m[0]
    num_m = shape_m[1]
    num_params = tf.shape(params_t)[1]
    # Column holding the parameters, padded with zeros up to num_m rows
    last_column = tf.concat([tf.tile(tf.transpose(params_t)[tf.newaxis], (batch_size, 1, 1)),
                             tf.zeros((batch_size, num_m - num_params, 1), dtype=params_t.dtype)], axis=1)
    # Matrix that is zero everywhere except for the last column
    replace = tf.concat([tf.zeros((batch_size, num_m, num_m - 1), dtype=params_t.dtype), last_column], axis=2)
    r = tf.range(num_m)
    ii = r[tf.newaxis, :, tf.newaxis]
    jj = r[tf.newaxis, tf.newaxis, :]
    # True only at the positions to overwrite: first num_params rows of the last column
    mask = tf.tile((ii < num_params) & (tf.equal(jj, num_m - 1)), (batch_size, 1, 1))
    M_replaced = tf.where(mask, replace, M)
    loss = tf.nn.l2_loss(M_replaced - M_gt[np.newaxis])
    optimizer = tf.train.AdamOptimizer(0.001)
    train_op = optimizer.minimize(loss)
    # The session opened by the with statement is reused here
    init = tf.global_variables_initializer()
    sess.run(init)
    M_val, M_replaced_val = sess.run([M, M_replaced])
    print('M:')
    print(M_val)
    print('M_replaced:')
    print(M_replaced_val)
Output:
M:
[[[ 1. 0. 0. 0.]
[ 0. 1. 0. 0.]
[ 0. 0. 1. 0.]
[ 0. 0. 0. 1.]]]
M_replaced:
[[[ 1. 0. 0. 1.]
[ 0. 1. 0. 2.]
[ 0. 0. 1. 3.]
[ 0. 0. 0. 1.]]]
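The mask-and-select idea in the answer above can be hard to read through the tiling and concatenation. The following plain-Python sketch shows the same selection rule on a single 4x4 matrix. The function name replace_last_column is hypothetical, and the nested lists only stand in for tensors.

```python
def replace_last_column(matrix, params):
    """Return a copy of matrix whose last column starts with params.

    An entry is replaced only where the mask is true: row index below
    len(params) and column index equal to the last column.
    """
    n = len(matrix)
    result = []
    for i, row in enumerate(matrix):
        new_row = []
        for j, value in enumerate(row):
            mask = i < len(params) and j == n - 1
            new_row.append(params[i] if mask else value)
        result.append(new_row)
    return result


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
replaced = replace_last_column(identity, [1.0, 2.0, 3.0])
for row in replaced:
    print(row)
```

Because the result is built by selection rather than by assignment into the original matrix, the original stays untouched, which mirrors how tf.where produces a new tensor that gradients can flow through.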
Let's say I have a 360px by 240px image. Instead of cropping my (already small) image to 240x240, can I create a convolutional neural network that operates on the full rectangle? Specifically using the Convolution2D layer.
I ask because every paper I've read doing CNNs seems to have square input sizes, so I wonder if what I propose will be OK, and if so, what disadvantages I may run into. Are all the settings (like border_mode='same') going to work the same?
There are no issues with a rectangular image. Everything works the same as for square images.
Yes.
But why don't you give it a try?
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import numpy as np
np.random.seed(1234)
from keras.layers import Input
from keras.layers.convolutional import Convolution2D
from keras.models import Model
print("Building Model...")
inp = Input(shape=(1,None,None))
output = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(inp)
model_network = Model(input=inp, output=output)
w = np.asarray([
[[[
[0,0,0],
[0,2,0],
[0,0,0]
]]]
])
input_mat = np.asarray([
[[
[1.,2.,3.,10.],
[4.,5.,6.,11.],
[7.,8.,9.,12.]
]]
])
model_network.layers[1].set_weights(w)
print("Weights after change:")
print(model_network.layers[1].get_weights())
print("Input:")
print(input_mat)
print("Output:")
print(model_network.predict(input_mat))
Build a sample model:
inp = Input(shape=(1,None,None))
output = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(inp)
model_network = Model(input=inp, output=output)
Give it some weights, chosen so that you can predict the output, say:
w = np.asarray([
[[[
[0,0,0],
[0,2,0],
[0,0,0]
]]]
])
model_network.layers[1].set_weights(w)
So that the convolution would simply double your input.
Give it your rectangular image:
input_mat = np.asarray([
[[
[1.,2.,3.,10.],
[4.,5.,6.,11.],
[7.,8.,9.,12.]
]]
])
And check the output to see if it works:
print("Output:")
print(model_network.predict(input_mat))
Sample output:
Using Theano backend.
Building Model...
Weights after change:
[array([[[[ 0., 0., 0.],
[ 0., 2., 0.],
[ 0., 0., 0.]]]], dtype=float32)]
Input:
[[[[ 1. 2. 3. 10.]
[ 4. 5. 6. 11.]
[ 7. 8. 9. 12.]]]]
Output:
[[[[ 2. 4. 6. 20.]
[ 8. 10. 12. 22.]
[ 14. 16. 18. 24.]]]]
original post with some changes
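To see why a rectangular input is unproblematic, here is a small framework-free sketch of a zero-padded 'same' convolution on a single channel. The function conv2d_same is a hypothetical helper written for this illustration, not a Keras API. Nothing in the loops assumes that the number of rows equals the number of columns.

```python
def conv2d_same(image, kernel):
    """Zero-padded 'same' 2D cross-correlation for a single channel.

    For the centre-only kernel used here, correlation and convolution
    give the same result, so kernel flipping is not needed.
    """
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    pad_r, pad_c = k_rows // 2, k_cols // 2
    output = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    rr, cc = r + i - pad_r, c + j - pad_c
                    # Positions outside the image count as zeros.
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc] * kernel[i][j]
            output[r][c] = total
    return output


kernel = [[0.0, 0.0, 0.0],
          [0.0, 2.0, 0.0],
          [0.0, 0.0, 0.0]]
image = [[1.0, 2.0, 3.0, 10.0],
         [4.0, 5.0, 6.0, 11.0],
         [7.0, 8.0, 9.0, 12.0]]
doubled = conv2d_same(image, kernel)
for row in doubled:
    print(row)
```

The output keeps the 3x4 shape of the input and every value is doubled, matching the sample output from the Keras model above.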
I'd like to reset (randomize) the weights of all layers in my Keras (deep learning) model. The reason is that I want to be able to train the model several times with different data splits without having to do the (slow) model recompilation every time.
Inspired by this discussion, I'm trying the following code:
# Reset weights
for layer in KModel.layers:
    if hasattr(layer,'init'):
        input_dim = layer.input_shape[1]
        new_weights = layer.init((input_dim, layer.output_dim),name='{}_W'.format(layer.name))
        layer.trainable_weights[0].set_value(new_weights.get_value())
However, it only partly works.
Partly, because I've inspected some layer.get_weights() values, and they seem to change. But when I restart the training, the cost values are much lower than the initial cost values on the first run. It's almost as if I've succeeded in resetting some of the weights, but not all of them.
Save the initial weights right after compiling the model but before training it:
model.save_weights('model.h5')
and then after training, "reset" the model by reloading the initial weights:
model.load_weights('model.h5')
This gives you an apples to apples model to compare different data sets and should be quicker than recompiling the entire model.
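The same snapshot-and-restore pattern can also be kept in memory with get_weights() and set_weights(), which Keras models provide. The sketch below is only an illustration of the control flow: TinyModel, run_splits and fake_train are hypothetical stand-ins so that the example runs without Keras.

```python
import copy


class TinyModel:
    """Hypothetical stand-in exposing get_weights/set_weights like a Keras model."""

    def __init__(self, weights):
        self._weights = copy.deepcopy(weights)

    def get_weights(self):
        return copy.deepcopy(self._weights)

    def set_weights(self, weights):
        self._weights = copy.deepcopy(weights)


def run_splits(model, splits, train_fn):
    """Train once per split, restoring the initial weights before each run."""
    initial = model.get_weights()  # in-memory snapshot taken before any training
    results = []
    for split in splits:
        model.set_weights(initial)
        results.append(train_fn(model, split))
    return results


def fake_train(model, split):
    """Pretend training: shifts the first weight by the split value."""
    weights = model.get_weights()
    weights[0][0] += split
    model.set_weights(weights)
    return weights[0][0]


model = TinyModel([[0.5, -0.5]])
results = run_splits(model, [1.0, 2.0, 3.0], fake_train)
print(results)
```

Each run starts from 0.5 again, so the results do not accumulate across splits. Without the restore step the second and third runs would start from already trained weights.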
Reset all layers by checking for initializers:
def reset_weights(model):
    import keras.backend as K
    session = K.get_session()
    for layer in model.layers:
        if hasattr(layer, 'kernel_initializer'):
            layer.kernel.initializer.run(session=session)
        if hasattr(layer, 'bias_initializer'):
            layer.bias.initializer.run(session=session)
Update: kernel_initializer is kernel.initializer now.
If you want to truly re-randomize the weights, and not merely restore the initial weights, you can do the following. The code is slightly different depending on whether you're using TensorFlow or Theano.
from keras.initializers import glorot_uniform # Or your initializer of choice
import keras.backend as K
initial_weights = model.get_weights()
backend_name = K.backend()
if backend_name == 'tensorflow':
    k_eval = lambda placeholder: placeholder.eval(session=K.get_session())
elif backend_name == 'theano':
    k_eval = lambda placeholder: placeholder.eval()
else:
    raise ValueError("Unsupported backend")
new_weights = [k_eval(glorot_uniform()(w.shape)) for w in initial_weights]
model.set_weights(new_weights)
I have found the clone_model function that creates a cloned network with the same architecture but new model weights.
Example of use:
model_cloned = tensorflow.keras.models.clone_model(model_base)
Comparing the weights:
original_weights = model_base.get_weights()
print("Original weights", original_weights)
print("========================================================")
print("========================================================")
print("========================================================")
model_cloned = tensorflow.keras.models.clone_model(model_base)
new_weights = model_cloned.get_weights()
print("New weights", new_weights)
If you execute this code several times, you will notice that the cloned model receives new weights each time.
Tensorflow 2 answer:
for ix, layer in enumerate(model.layers):
    if hasattr(model.layers[ix], 'kernel_initializer') and \
            hasattr(model.layers[ix], 'bias_initializer'):
        weight_initializer = model.layers[ix].kernel_initializer
        bias_initializer = model.layers[ix].bias_initializer
        old_weights, old_biases = model.layers[ix].get_weights()
        model.layers[ix].set_weights([
            weight_initializer(shape=old_weights.shape),
            bias_initializer(shape=old_biases.shape)])
Original weights:
model.layers[1].get_weights()[0][0]
array([ 0.4450057 , -0.13564804, 0.35884023, 0.41411972, 0.24866664,
0.07641453, 0.45726687, -0.04410008, 0.33194816, -0.1965386 ,
-0.38438258, -0.13263905, -0.23807487, 0.40130925, -0.07339832,
0.20535922], dtype=float32)
New weights:
model.layers[1].get_weights()[0][0]
array([-0.4607593 , -0.13104361, -0.0372932 , -0.34242013, 0.12066692,
-0.39146423, 0.3247317 , 0.2635846 , -0.10496247, -0.40134245,
0.19276887, 0.2652442 , -0.18802321, -0.18488845, 0.0826562 ,
-0.23322225], dtype=float32)
K.get_session().close()
K.set_session(tf.Session())
K.get_session().run(tf.global_variables_initializer())
Try set_weights.
For example:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import numpy as np
np.random.seed(1234)
from keras.layers import Input
from keras.layers.convolutional import Convolution2D
from keras.models import Model
print("Building Model...")
inp = Input(shape=(1,None,None))
x = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(inp)
output = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(x)
model_network = Model(input=inp, output=output)
w = np.asarray([
[[[
[0,0,0],
[0,2,0],
[0,0,0]
]]]
])
for layer_i in range(len(model_network.layers)):
    print (model_network.layers[layer_i])
for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w)
input_mat = np.asarray([
[[
[1.,2.,3.,10.],
[4.,5.,6.,11.],
[7.,8.,9.,12.]
]]
])
print("Input:")
print(input_mat)
print("Output:")
print(model_network.predict(input_mat))
w2 = np.asarray([
[[[
[0,0,0],
[0,3,0],
[0,0,0]
]]]
])
for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w2)
print("Output:")
print(model_network.predict(input_mat))
Build a model with, say, two convolutional layers:
print("Building Model...")
inp = Input(shape=(1,None,None))
x = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(inp)
output = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(x)
model_network = Model(input=inp, output=output)
Then define your weights (I'm using a simple w, but you could use np.random.uniform or anything like that if you want):
w = np.asarray([
[[[
[0,0,0],
[0,2,0],
[0,0,0]
]]]
])
Take a peek at the layers inside the model:
for layer_i in range(len(model_network.layers)):
    print (model_network.layers[layer_i])
Set the weights of each convolutional layer (you'll see that the first layer is actually the input layer, which you don't want to change; that's why the range starts from 1, not 0):
for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w)
Generate some input for your test and predict the output from your model
input_mat = np.asarray([
[[
[1.,2.,3.,10.],
[4.,5.,6.,11.],
[7.,8.,9.,12.]
]]
])
print("Output:")
print(model_network.predict(input_mat))
You could change it again if you want and check again for the output:
w2 = np.asarray([
[[[
[0,0,0],
[0,3,0],
[0,0,0]
]]]
])
for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w2)
print("Output:")
print(model_network.predict(input_mat))
Sample output:
Using Theano backend.
Building Model...
<keras.engine.topology.InputLayer object at 0x7fc0c619fd50>
<keras.layers.convolutional.Convolution2D object at 0x7fc0c6166250>
<keras.layers.convolutional.Convolution2D object at 0x7fc0c6150a10>
Weights after change:
[array([[[[ 0., 0., 0.],
[ 0., 2., 0.],
[ 0., 0., 0.]]]], dtype=float32)]
Input:
[[[[ 1. 2. 3. 10.]
[ 4. 5. 6. 11.]
[ 7. 8. 9. 12.]]]]
Output:
[[[[ 4. 8. 12. 40.]
[ 16. 20. 24. 44.]
[ 28. 32. 36. 48.]]]]
Output:
[[[[ 9. 18. 27. 90.]
[ 36. 45. 54. 99.]
[ 63. 72. 81. 108.]]]]
From your peek at .layers you can see that the first layer is the input and the others are your convolutional layers.
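The two outputs in the sample above follow from simple arithmetic: a 3x3 kernel that is zero except for its centre just multiplies every pixel by the centre value, so two stacked layers multiply by that value twice. A minimal sketch of that reasoning, with apply_center_kernel as a hypothetical helper standing in for one such layer:

```python
def apply_center_kernel(image, center):
    """Effect of a 'same' convolution whose kernel is zero except for its centre."""
    return [[center * value for value in row] for row in image]


image = [[1.0, 2.0, 3.0, 10.0],
         [4.0, 5.0, 6.0, 11.0],
         [7.0, 8.0, 9.0, 12.0]]

# Two layers with centre weight 2 scale the input by 2 * 2 = 4.
out_w = apply_center_kernel(apply_center_kernel(image, 2.0), 2.0)
# Two layers with centre weight 3 scale the input by 3 * 3 = 9.
out_w2 = apply_center_kernel(apply_center_kernel(image, 3.0), 3.0)

print(out_w[0])
print(out_w2[0])
```

This is a quick way to confirm that set_weights reached both convolutional layers: if only one layer had been updated, the factors would not be 4 and 9.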
For tf2 the simplest way to actually reset weights would be:
tf_model.set_weights(
    clone_model(tf_model).get_weights()
)
clone_model(), as mentioned by @danielsaromo, returns a new model with trainable parameters initialized from scratch. We use its weights to reinitialize our model, so no model compilation (knowledge about its loss or optimizer) is needed.
There are two caveats, though. The first is mentioned in clone_model()'s documentation:
clone_model will not preserve the uniqueness of shared objects within the model (e.g. a single variable attached to two distinct layers will be restored as two separate variables).
The other caveat is that for large models cloning might fail due to memory limits.
To randomly re-initialize the weights of a compiled, untrained model in TF 2.0 (tf.keras):
weights = [glorot_uniform(seed=random.randint(0, 1000))(w.shape) if w.ndim > 1 else w for w in model.get_weights()]
Note the "if w.ndim > 1 else w". You don't want to re-initialize the biases (they stay 0 or 1).
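For readers who want to see what this kind of re-initialization does numerically, here is a framework-free sketch. It assumes the standard Glorot uniform rule, where values are drawn from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), and it only handles 2D (dense) kernels. The function names and the nested-list weight format are hypothetical simplifications.

```python
import math
import random


def glorot_uniform_2d(fan_in, fan_out, rng):
    """Draw a fan_in x fan_out kernel from the Glorot uniform range."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def reinitialize(weights, rng):
    """Re-draw 2D kernels and leave 1D arrays (biases) unchanged."""
    new_weights = []
    for w in weights:
        if isinstance(w[0], list):   # 2D kernel, analogous to w.ndim > 1
            new_weights.append(glorot_uniform_2d(len(w), len(w[0]), rng))
        else:                        # 1D bias vector is kept as it is
            new_weights.append(list(w))
    return new_weights


rng = random.Random(0)
weights = [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], [0.0, 0.0, 0.0]]
new_weights = reinitialize(weights, rng)
limit = math.sqrt(6.0 / (2 + 3))
print(new_weights[0])
print(new_weights[1])
```

With a real model, the resulting list would still have to be passed to model.set_weights() for the new values to take effect.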
Use keras.backend.clear_session().