Can Convolution2D work on rectangular images? - python

Let's say I have a 360px by 240px image. Instead of cropping my (already small) image to 240x240, can I create a convolutional neural network that operates on the full rectangle? Specifically using the Convolution2D layer.
I ask because every paper I've read doing CNNs seems to have square input sizes, so I wonder if what I propose will be OK, and if so, what disadvantages I may run into. Are all the settings (like border_mode='same') going to work the same?

No issues with a rectangle image... Everything will work properly as for square images.

Yes.
But why don't you give it a try
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import numpy as np
np.random.seed(1234)
from keras.layers import Input
from keras.layers.convolutional import Convolution2D
from keras.models import Model
print("Building Model...")
inp = Input(shape=(1,None,None))
output = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(inp)
model_network = Model(input=inp, output=output)
w = np.asarray([
[[[
[0,0,0],
[0,2,0],
[0,0,0]
]]]
])
input_mat = np.asarray([
[[
[1.,2.,3.,10.],
[4.,5.,6.,11.],
[7.,8.,9.,12.]
]]
])
model_network.layers[1].set_weights(w)
print("Weights after change:")
print(model_network.layers[1].get_weights())
print("Input:")
print(input_mat)
print("Output:")
print(model_network.predict(input_mat))
Build a sample model
inp = Input(shape=(1,None,None))
output = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(inp)
model_network = Model(input=inp, output=output)
Give it some weights and set them so you could predit the output, say:
w = np.asarray([
[[[
[0,0,0],
[0,2,0],
[0,0,0]
]]]
])
model_network.layers[1].set_weights(w)
So that the convolution would simply double your input.
Give it your rectangular image:
input_mat = np.asarray([
[[
[1.,2.,3.,10.],
[4.,5.,6.,11.],
[7.,8.,9.,12.]
]]
])
And check the output to see if it works
print("Output:")
print(model_network.predict(input_mat))
Sample output:
Using Theano backend.
Building Model...
Weights after change:
[array([[[[ 0., 0., 0.],
[ 0., 2., 0.],
[ 0., 0., 0.]]]], dtype=float32)]
Input:
[[[[ 1. 2. 3. 10.]
[ 4. 5. 6. 11.]
[ 7. 8. 9. 12.]]]]
Output:
[[[[ 2. 4. 6. 20.]
[ 8. 10. 12. 22.]
[ 14. 16. 18. 24.]]]]
original post with some changes

Related

Trying to achieve same result with Pytorch and Tensorflow MultiheadAttention

I'm trying to recreate a transformer written in Pytorch and implement it in Tensorflow. The problem is that despite both the documentation for the Pytorch version and Tensorflow version, they still come out pretty differently.
I wrote a little code snippet to show the issue:
import torch
import tensorflow as tf
import numpy as np
class TransformerLayer(tf.Module):
def __init__(self, d_model, nhead, dropout=0):
super(TransformerLayer, self).__init__()
self.self_attn = torch.nn.MultiheadAttention(d_model, nhead, dropout=dropout)
batch_size = 2
seq_length = 5
d_model = 10
src = np.random.uniform(size=(batch_size, seq_length, d_model))
srcTF = tf.convert_to_tensor(src)
srcPT = torch.Tensor(src.reshape((seq_length, batch_size, d_model)))
self_attnTF = tf.keras.layers.MultiHeadAttention(key_dim=10, num_heads=5, dropout=0)
transformer_encoder = TransformerLayer(d_model=10, nhead=5, dropout=0.0)
output, scores = self_attnTF(srcTF, srcTF, srcTF, return_attention_scores=True)
print("Tensorflow Attendtion outputs:", output)
print("Tensorflow (averaged) weights:", tf.math.reduce_mean(scores, 1))
print("Torch Attendtion outputs:", transformer_encoder.self_attn(srcPT,srcPT,srcPT)[0])
print("Torch attention output weights:", transformer_encoder.self_attn(srcPT,srcPT,srcPT)[1])
and the result is:
Tensorflow Attendtion outputs: tf.Tensor(
[[[ 0.02602757 -0.14134401 0.00855263 0.4735083 -0.01851891
-0.20382246 -0.18152176 -0.21076852 0.08623976 -0.33548725]
[ 0.02607442 -0.1403394 0.00814065 0.47415024 -0.01882939
-0.20353754 -0.18291879 -0.21234266 0.08595885 -0.33613583]
[ 0.02524654 -0.14096384 0.00870436 0.47411725 -0.01800703
-0.20486829 -0.18163288 -0.21082559 0.08571021 -0.3362339 ]
[ 0.02518575 -0.14039244 0.0090138 0.47431853 -0.01775141
-0.20391947 -0.18138805 -0.2118245 0.08432849 -0.33521986]
[ 0.02556361 -0.14039293 0.00876258 0.4746476 -0.01891363
-0.20398234 -0.18229616 -0.21147579 0.08555281 -0.33639923]]
[[ 0.07844199 -0.1614371 0.01649148 0.5287745 0.05126739
-0.13851154 -0.09829871 -0.1621251 0.01922669 -0.2428589 ]
[ 0.07844222 -0.16024739 0.01805423 0.52941847 0.04975721
-0.13537636 -0.09829231 -0.16129729 0.01979005 -0.24491176]
[ 0.07800542 -0.160701 0.01677295 0.52902794 0.05082911
-0.13843337 -0.09805533 -0.16165744 0.01928401 -0.24327613]
[ 0.07815789 -0.1600025 0.01757433 0.5291927 0.05032986
-0.1368022 -0.09849522 -0.16172451 0.01929555 -0.24438493]
[ 0.0781548 -0.16028519 0.01764914 0.52846324 0.04941286
-0.13746066 -0.09787872 -0.16141161 0.01994199 -0.2440269 ]]], shape=(2, 5, 10), dtype=float32)
Tensorflow (averaged) weights: tf.Tensor(
[[[0.199085 0.20275716 0.20086522 0.19873264 0.19856 ]
[0.2015336 0.19960018 0.20218948 0.19891861 0.19775811]
[0.19906266 0.20318432 0.20190334 0.19812575 0.19772394]
[0.20074987 0.20104568 0.20269363 0.19744729 0.19806348]
[0.19953248 0.20176074 0.20314851 0.19782843 0.19772986]]
[[0.2010009 0.20053487 0.20004745 0.20092985 0.19748697]
[0.20034568 0.20035927 0.19955876 0.20062163 0.19911464]
[0.19967113 0.2006859 0.20012529 0.20047483 0.19904283]
[0.20132652 0.19996871 0.20019794 0.20008174 0.19842513]
[0.2006393 0.20000939 0.19938737 0.20054278 0.19942114]]], shape=(2, 5, 5), dtype=float32)
Torch Attendtion outputs: tensor([[[ 0.1097, -0.4467, -0.0719, -0.1779, -0.0766, -0.1247, 0.1557,
0.0051, -0.3932, -0.1323],
[ 0.1264, -0.3822, 0.0759, -0.0335, -0.1084, -0.1539, 0.1475,
-0.0272, -0.4235, -0.1744]],
[[ 0.1122, -0.4502, -0.0747, -0.1796, -0.0756, -0.1271, 0.1581,
0.0049, -0.3964, -0.1340],
[ 0.1274, -0.3823, 0.0754, -0.0356, -0.1091, -0.1547, 0.1477,
-0.0272, -0.4252, -0.1752]],
[[ 0.1089, -0.4427, -0.0728, -0.1746, -0.0756, -0.1202, 0.1501,
0.0031, -0.3894, -0.1242],
[ 0.1263, -0.3820, 0.0718, -0.0374, -0.1063, -0.1562, 0.1485,
-0.0271, -0.4233, -0.1761]],
[[ 0.1061, -0.4369, -0.0685, -0.1696, -0.0772, -0.1173, 0.1454,
0.0012, -0.3860, -0.1201],
[ 0.1265, -0.3820, 0.0762, -0.0325, -0.1082, -0.1560, 0.1501,
-0.0271, -0.4249, -0.1779]],
[[ 0.1043, -0.4402, -0.0705, -0.1719, -0.0791, -0.1205, 0.1508,
0.0018, -0.3895, -0.1262],
[ 0.1260, -0.3805, 0.0775, -0.0298, -0.1083, -0.1547, 0.1494,
-0.0276, -0.4242, -0.1768]]], grad_fn=<AddBackward0>)
Torch attention output weights: tensor([[[0.2082, 0.2054, 0.1877, 0.1956, 0.2031],
[0.2100, 0.2079, 0.1841, 0.1943, 0.2037],
[0.2007, 0.1995, 0.1929, 0.1999, 0.2070],
[0.1995, 0.1950, 0.1976, 0.2002, 0.2077],
[0.1989, 0.1969, 0.1970, 0.2024, 0.2048]],
[[0.2095, 0.1902, 0.1987, 0.2027, 0.1989],
[0.2090, 0.1956, 0.1997, 0.2004, 0.1952],
[0.2047, 0.1869, 0.2006, 0.2121, 0.1957],
[0.2073, 0.1953, 0.1982, 0.2014, 0.1978],
[0.2089, 0.2003, 0.1953, 0.1957, 0.1998]]], grad_fn=<DivBackward0>)
The output weights look similar but the base attention outputs are way off. Is there any way to make the Tensorflow model come out more like the Pytorch one? Any help would be greatly appreciated!
In MultiHeadAttention there is also a projection layer, like
Q = W_q # input_query + b_q
K = W_k # input_keys + b_k
V = W_v # input_values + b_v
Matrices W_q, W_k and W_v and biases b_q, b_k, b_v are initialized randomly, so difference in outputs should be expected (even between outputs of two distinct layers in pytorch on same input). After self-attention operation there is one more projection and it's also initialized randomly. Weights can be set manually in tensorflow by calling method set_weights of self_attnTF.
Correspondence between weights in tf.keras.layers.MultiHeadAttention and nn.MultiheadAttention not so clear, as an example: torch shares weights between heads, while tf keeps them unique. So if you are using weights of pretrained model from pytorch and try to put them in tensorflow model (for whatever reason) it'll certainly take more than five minutes.
Results should be the same if after initializing pytorch model and tensorflow model you step through their parameters and assign them identical values.

Why there are so many array of weights?

I am working on python with keras. I learned in my theory study that in a neural network the weights are only between the input layer and a hidden layer or between hidden layers.
I wrote this code, where I added two layers:
NN.add(Dense(4, input_shape=array_input.shape, activation='relu', name="Layer", kernel_constraint=changeWeight()))
NN.add(Dense(4, activation='relu', name="Output"))
NN.compile(loss='mean_squared_error', optimizer=Adam(learning_rate=0.3), metrics=['accuracy'])
print(NN.summary())
a = NN.fit(array_input, array_input, epochs=100)
for lay in NN.layers:
print(lay.name)
print(lay.get_weights())
I think that one is the hidden layer (the one renamed "Layer") and the other is the output layer. The problem is that if i printed "lay.get_weights()" there are two arrays of weights, one for each layer. Like this:
[array([[-1.5516974 , -1.600516 , -0. , 0. ],
[-0. , -2.1766946 , 0.32734624, -0. ],
[-0. , -0. , 0.32156652, -0.812184 ],
[-0. , -0. , -0. , -0.7288372 ]],
dtype=float32), array([-1.8015273, -1.801546 , -0.1462403, 0. ], dtype=float32)]
Output
[array([[-1.5045888 , -0.14155084, -0.29977936, -0.0492779 ],
[-1.2379107 , -0.44411597, -0.41499865, -0.2560569 ],
[ 1.2397875 , -0.3541401 , 1.2223543 , 1.5617256 ],
[ 0.18388063, 0.44298917, -0.2201969 , -0.1165269 ]],
dtype=float32), array([-0.82720596, 0. , 1.1942271 , 1.7084894 ], dtype=float32)]
Can someone explain to me where is the problem. I don't understand keras API, do I?
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Layer#get_weights
get_weights() returns the weight and value of the bias in an array.
Each of your inputs is connected to the first layers. So the weight matrix has a shape of (input.shape, number of neurons in the current layer) and the bias vector has a shape of (number of neurons in the current layer, ).
Therefore, without knowing what your input array contains, I know that this array has a shape of (4,).
For the second layer, the same process is repeated
weight : (number of neurons of the last layer, number of neurons of the current layer)
bias (number of neurone of the current layer,)
Try this example:
NN = Sequential()
NN.add(Dense(2, input_shape=(3,), activation='relu', name="Layer"))
NN.add(Dense(4, activation='relu', name="Output"))
for lay in NN.layers:
print(lay.name)
print(lay.get_weights())
Output:
Layer
[array([[-0.674668 , -0.34347552],
[ 0.63090587, 0.8558588 ],
[-0.5063792 , -0.23311883]], dtype=float32), array([0., 0.], dtype=float32)]
Output
[array([[-0.07787323, 0.22444701, 0.52729607, 0.07616615],
[-0.5380094 , -0.3146367 , -0.73177123, -0.9248886 ]],
dtype=float32), array([0., 0., 0., 0.], dtype=float32)]
Graphical representation :

TF Gradient Tape has issues with cross products?

I'm trying to use TF gradient tape as an autograd tool for root finding via Newton's method. But when I'm trying to compute the Jacobian matrix, it seems that tf.GradientTape.jacobian can't handle cross products:
x = tf.convert_to_tensor(np.array([1., 2., 3.]))
Wx = np.ones((3))
with tf.GradientTape() as tape:
tape.watch(x)
y = tf.linalg.cross(x, Wx)
print(tape.jacobian(y, x))
gives below error:
StagingError: in converted code:
relative to /Users/xinzhang/anaconda3/lib/python3.7/site-packages:
tensorflow_core/python/ops/parallel_for/control_flow_ops.py:184 f *
return _pfor_impl(loop_fn, iters, parallel_iterations=parallel_iterations)
tensorflow_core/python/ops/parallel_for/control_flow_ops.py:257 _pfor_impl
outputs.append(converter.convert(loop_fn_output))
tensorflow_core/python/ops/parallel_for/pfor.py:1231 convert
output = self._convert_helper(y)
tensorflow_core/python/ops/parallel_for/pfor.py:1395 _convert_helper
if flags.FLAGS.op_conversion_fallback_to_while_loop:
tensorflow_core/python/platform/flags.py:84 __getattr__
wrapped(_sys.argv)
absl/flags/_flagvalues.py:633 __call__
name, value, suggestions=suggestions)
UnrecognizedFlagError: Unknown command line flag 'f'
Whereas if I switch out the call to jacobian to a simple gradient:
x = tf.convert_to_tensor(np.array([1., 2., 3.]))
Wx = np.ones((3))
with tf.GradientTape() as tape:
tape.watch(x)
y = tf.linalg.cross(x, Wx)
print(tape.gradient(y, x))
gives the expected result:
tf.Tensor([0. 0. 0.], shape=(3,), dtype=float64)
Is this a bug?? Or am I doing something wrong with the tape.jacobian method?
p.s. python version 3.7.4; tf version 2.0.0 Everything installed with conda.
This might be a bug in Tensorflow Version 2.0 but it is fixed in Tensorflow Version 2.1.
So, please upgrade your Tensorflow Version to either 2.1 or 2.2 and the issue will be resolved.
Working code is mentioned below:
!pip install tensorflow==2.2
import tensorflow as tf
import numpy as np
print(tf.__version__)
x = tf.convert_to_tensor(np.array([1., 2., 3.]))
Wx = np.ones((3))
with tf.GradientTape() as tape:
tape.watch(x)
y = tf.linalg.cross(x, Wx)
print(tape.jacobian(y, x))
Output is shown below:
2.2.0
tf.Tensor(
[[ 0. 1. -1.]
[-1. 0. 1.]
[ 1. -1. 0.]], shape=(3, 3), dtype=float64)

Keras:How to get weights, make weights become 1D array, then make weights shape become initial shape?

I want to do some experiment. and i need to get Keras model weights, make it 1D array , and make the shape like initial shape
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
model = tf.keras.Sequential()
# Adds a densely-connected layer with 64 units to the model:
model.add(layers.Dense( 4, input_dim = 5 ,activation='relu'))
# Add another:
model.add(layers.Dense(3, activation='relu'))
# Add an output layer with 10 output units:
model.add(layers.Dense(2))
# Configure a model for mean-squared error regression.
model.compile(optimizer=tf.keras.optimizers.Adam(0.01),
loss='mse', # mean squared error
metrics=['mae']) # mean absolute error
weights = (model.get_weights())
#make weight become 1D array
#maka 1D array become like inital shape
model.set_weights(weights)
why iwant to do this ?
because i want to do some mutation using other module, that's necessary to pass 1D array
how to do this ?
as we know the shape of Keras model weights look like this
[array([[-0.24053234, 0.4722855 , 0.29863954, 0.22805429],
[ 0.45101106, -0.00229341, -0.6142864 , -0.2751704 ],
[ 0.159172 , 0.43983865, 0.61577237, 0.24255097],
[ 0.24160242, 0.422235 , 0.8066592 , -0.2711717 ],
[-0.30763668, -0.4841219 , 0.767977 , 0.23558974]],
dtype=float32), array([0., 0., 0., 0.], dtype=float32), array([[ 0.24129152, -0.4890638 , 0.18787515],
[ 0.8663894 , -0.09163451, -0.86416066],
[-0.01754427, 0.32654428, -0.78837514],
[ 0.589849 , 0.5886531 , 0.27824092]], dtype=float32), array([0., 0., 0.], dtype=float32), array([[ 0.8456359 , -0.26292562],
[-1.0447757 , -0.43539298],
[ 1.0835328 , -0.43536085]], dtype=float32), array([0., 0.], dtype=float32)]

Reset weights in Keras layer

I'd like to reset (randomize) the weights of all layers in my Keras (deep learning) model. The reason is that I want to be able to train the model several times with different data splits without having to do the (slow) model recompilation every time.
Inspired by this discussion, I'm trying the following code:
# Reset weights
for layer in KModel.layers:
if hasattr(layer,'init'):
input_dim = layer.input_shape[1]
new_weights = layer.init((input_dim, layer.output_dim),name='{}_W'.format(layer.name))
layer.trainable_weights[0].set_value(new_weights.get_value())
However, it only partly works.
Partly, becuase I've inspected some layer.get_weights() values, and they seem to change. But when I restart the training, the cost values are much lower than the initial cost values on the first run. It's almost like I've succeeded resetting some of the weights, but not all of them.
Save the initial weights right after compiling the model but before training it:
model.save_weights('model.h5')
and then after training, "reset" the model by reloading the initial weights:
model.load_weights('model.h5')
This gives you an apples to apples model to compare different data sets and should be quicker than recompiling the entire model.
Reset all layers by checking for initializers:
def reset_weights(model):
import keras.backend as K
session = K.get_session()
for layer in model.layers:
if hasattr(layer, 'kernel_initializer'):
layer.kernel.initializer.run(session=session)
if hasattr(layer, 'bias_initializer'):
layer.bias.initializer.run(session=session)
Update: kernel_initializer is kernel.initializer now.
If you want to truly re-randomize the weights, and not merely restore the initial weights, you can do the following. The code is slightly different depending on whether you're using TensorFlow or Theano.
from keras.initializers import glorot_uniform # Or your initializer of choice
import keras.backend as K
initial_weights = model.get_weights()
backend_name = K.backend()
if backend_name == 'tensorflow':
k_eval = lambda placeholder: placeholder.eval(session=K.get_session())
elif backend_name == 'theano':
k_eval = lambda placeholder: placeholder.eval()
else:
raise ValueError("Unsupported backend")
new_weights = [k_eval(glorot_uniform()(w.shape)) for w in initial_weights]
model.set_weights(new_weights)
I have found the clone_model function that creates a cloned network with the same architecture but new model weights.
Example of use:
model_cloned = tensorflow.keras.models.clone_model(model_base)
Comparing the weights:
original_weights = model_base.get_weights()
print("Original weights", original_weights)
print("========================================================")
print("========================================================")
print("========================================================")
model_cloned = tensorflow.keras.models.clone_model(model_base)
new_weights = model_cloned.get_weights()
print("New weights", new_weights)
If you execute this code several times, you will notice that the cloned model receives new weights each time.
Tensorflow 2 answer:
for ix, layer in enumerate(model.layers):
if hasattr(model.layers[ix], 'kernel_initializer') and \
hasattr(model.layers[ix], 'bias_initializer'):
weight_initializer = model.layers[ix].kernel_initializer
bias_initializer = model.layers[ix].bias_initializer
old_weights, old_biases = model.layers[ix].get_weights()
model.layers[ix].set_weights([
weight_initializer(shape=old_weights.shape),
bias_initializer(shape=old_biases.shape)])
Original weights:
model.layers[1].get_weights()[0][0]
array([ 0.4450057 , -0.13564804, 0.35884023, 0.41411972, 0.24866664,
0.07641453, 0.45726687, -0.04410008, 0.33194816, -0.1965386 ,
-0.38438258, -0.13263905, -0.23807487, 0.40130925, -0.07339832,
0.20535922], dtype=float32)
New weights:
model.layers[1].get_weights()[0][0]
array([-0.4607593 , -0.13104361, -0.0372932 , -0.34242013, 0.12066692,
-0.39146423, 0.3247317 , 0.2635846 , -0.10496247, -0.40134245,
0.19276887, 0.2652442 , -0.18802321, -0.18488845, 0.0826562 ,
-0.23322225], dtype=float32)
K.get_session().close()
K.set_session(tf.Session())
K.get_session().run(tf.global_variables_initializer())
Try set_weights.
for example:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import numpy as np
np.random.seed(1234)
from keras.layers import Input
from keras.layers.convolutional import Convolution2D
from keras.models import Model
print("Building Model...")
inp = Input(shape=(1,None,None))
x = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(inp)
output = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(x)
model_network = Model(input=inp, output=output)
w = np.asarray([
[[[
[0,0,0],
[0,2,0],
[0,0,0]
]]]
])
for layer_i in range(len(model_network.layers)):
print (model_network.layers[layer_i])
for layer_i in range(1,len(model_network.layers)):
model_network.layers[layer_i].set_weights(w)
input_mat = np.asarray([
[[
[1.,2.,3.,10.],
[4.,5.,6.,11.],
[7.,8.,9.,12.]
]]
])
print("Input:")
print(input_mat)
print("Output:")
print(model_network.predict(input_mat))
w2 = np.asarray([
[[[
[0,0,0],
[0,3,0],
[0,0,0]
]]]
])
for layer_i in range(1,len(model_network.layers)):
model_network.layers[layer_i].set_weights(w2)
print("Output:")
print(model_network.predict(input_mat))
build a model with say, two convolutional layers
print("Building Model...")
inp = Input(shape=(1,None,None))
x = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(inp)
output = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(x)
model_network = Model(input=inp, output=output)
then define your weights (i'm using a simple w, but you could use np.random.uniform or anything like that if you want)
w = np.asarray([
[[[
[0,0,0],
[0,2,0],
[0,0,0]
]]]
])
Take a peek at what are the layers inside a model
for layer_i in range(len(model_network.layers)):
print (model_network.layers[layer_i])
Set each weight for each convolutional layer (you'll see that the first layer is actually input and you don't want to change that, that's why the range starts from 1 not zero).
for layer_i in range(1,len(model_network.layers)):
model_network.layers[layer_i].set_weights(w)
Generate some input for your test and predict the output from your model
input_mat = np.asarray([
[[
[1.,2.,3.,10.],
[4.,5.,6.,11.],
[7.,8.,9.,12.]
]]
])
print("Output:")
print(model_network.predict(input_mat))
You could change it again if you want and check again for the output:
w2 = np.asarray([
[[[
[0,0,0],
[0,3,0],
[0,0,0]
]]]
])
for layer_i in range(1,len(model_network.layers)):
model_network.layers[layer_i].set_weights(w2)
print("Output:")
print(model_network.predict(input_mat))
Sample output:
Using Theano backend.
Building Model...
<keras.engine.topology.InputLayer object at 0x7fc0c619fd50>
<keras.layers.convolutional.Convolution2D object at 0x7fc0c6166250>
<keras.layers.convolutional.Convolution2D object at 0x7fc0c6150a10>
Weights after change:
[array([[[[ 0., 0., 0.],
[ 0., 2., 0.],
[ 0., 0., 0.]]]], dtype=float32)]
Input:
[[[[ 1. 2. 3. 10.]
[ 4. 5. 6. 11.]
[ 7. 8. 9. 12.]]]]
Output:
[[[[ 4. 8. 12. 40.]
[ 16. 20. 24. 44.]
[ 28. 32. 36. 48.]]]]
Output:
[[[[ 9. 18. 27. 90.]
[ 36. 45. 54. 99.]
[ 63. 72. 81. 108.]]]]
From your peek at .layers you can see that the first layer is input and the others your convolutional layers.
For tf2 the simplest way to actually reset weights would be:
tf_model.set_weights(
clone_model(tf_model).get_weights()
)
clone_model() as mentioned by #danielsaromo returns new model with trainable params initialized from scratch, we use its weights to reinitialize our model thus no model compilation (knowledge about its loss or optimizer) is needed.
There are two caveats though, first is mentioned in clone_model()'s documentation:
clone_model will not preserve the uniqueness of shared objects within the model (e.g. a single variable attached to two distinct layers will be restored as two separate variables).
Another caveat is that for large models cloning might fail due to memory limit.
To "random" re-initialize weights of a compiled untrained model in TF 2.0 (tf.keras):
weights = [glorot_uniform(seed=random.randint(0, 1000))(w.shape) if w.ndim > 1 else w for w in model.get_weights()]
Note the "if wdim > 1 else w". You don't want to re-initialize the biases (they stay 0 or 1).
use keras.backend.clear_session()

Categories

Resources