Why does Dropout not change my input tensor?

Please see the following code and its output:
import torch
import torch.nn as nn
inputTensor = torch.tensor([1.0, 2.0, 3, 4, 5])
outplace_dropout = nn.Dropout(p=0.4)
print(inputTensor)
output_afterDropout = outplace_dropout(inputTensor)
print(output_afterDropout)
print(inputTensor)
The output is:
tensor([1., 2., 3., 4., 5.])
tensor([1.6667, 3.3333, 0.0000, 6.6667, 0.0000])
tensor([1., 2., 3., 4., 5.])
Could you please explain why the input tensor values are still unchanged?

From the documentation of torch.nn.Dropout, you can see that the inplace argument defaults to False. If you wish to change the input tensor in place, change the initialization to:
outplace_dropout = nn.Dropout(p=0.4, inplace=True)
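For reference, a minimal sketch of the in-place variant (note that during training Dropout also rescales the surviving elements by 1/(1-p), so the exact values are random):
import torch
import torch.nn as nn
inputTensor = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0])
inplace_dropout = nn.Dropout(p=0.4, inplace=True)
output = inplace_dropout(inputTensor)
print(inputTensor)            # now modified: some entries zeroed, the rest scaled by 1/(1 - 0.4)
print(output is inputTensor)  # True: the in-place op returns the input tensor itself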

Related

Tensor slicing loses the shape information in TensorFlow

I'm trying to dynamically slice a tensor to automatically adjust its shape for the next iteration. However, I realized that when the tensor is sliced in graph mode its shape information is lost, so I can no longer apply certain operations that require knowing the shape of a given tensor. Below I attach an example; in my specific case the opt_with_slicing function is inside a vectorized_map, which is defined in a larger function that takes care of auto differentiation. Since the original function is too large to include here, I simplified it accordingly:
import numpy as np
import tensorflow as tf

a = tf.constant(np.linspace(0., 10., 11, endpoint=True)[::-1])
b = tf.ones((2, 10))

def opt_with_slicing(x, some_cutoff: float):
    a, b = x
    new_size = tf.math.count_nonzero(
        tf.cast(a >= some_cutoff, dtype=tf.int32), dtype=tf.int32
    )
    tf.print(f"new size {new_size}, initial size {a.get_shape()}")
    test1 = b[:, :new_size]
    test2 = tf.slice(b, [0, 0], [b.get_shape()[0], new_size])
    tf.print(f"test1 shape {test1.get_shape()}, test2 shape {test2.get_shape()}")
    return test1, test2

tf.function(opt_with_slicing)([a, b], 5.)
# Output:
# new size Tensor("count_nonzero/Cast_1:0", shape=(), dtype=int32), initial size (11,)
# test1 shape (2, None), test2 shape (2, None)
# (<tf.Tensor: shape=(2, 6), dtype=float32, numpy=
#  array([[1., 1., 1., 1., 1., 1.],
#         [1., 1., 1., 1., 1., 1.]], dtype=float32)>,
#  <tf.Tensor: shape=(2, 6), dtype=float32, numpy=
#  array([[1., 1., 1., 1., 1., 1.],
#         [1., 1., 1., 1., 1., 1.]], dtype=float32)>)
As you can see from the printout, the shape information of test1 and test2 is lost, and since this is a dynamic operation I have no way to know new_size prior to execution. Is there a way to reinstate the shape information without breaking graph mode?
PS: I tried the same with boolean_mask as well:
mask = tf.greater_equal(a, some_cutoff)
masked_shape = tf.boolean_mask(a, mask).get_shape()[0]
but masked_shape turns out to be None as well.
System info:
Tensorflow v2.5.0
Python v3.8.2

Understanding torch.nn.Flatten

I understand that Flatten removes all of the dimensions except for one. For example, I understand flatten():
> t = torch.ones(4, 3)
> t
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
> flatten(t)
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
However, I don't get Flatten, and in particular I don't get the meaning of this snippet from the docs:
>>> input = torch.randn(32, 1, 5, 5)
>>> m = nn.Sequential(
>>>     nn.Conv2d(1, 32, 5, 1, 1),
>>>     nn.Flatten()
>>> )
>>> output = m(input)
>>> output.size()
torch.Size([32, 288])
I felt the output should have size [160], because 32*5=160.
Q1. So why is the output of size [32, 288]?
Q2. I also don't get the meaning of the shape information given in the doc:
Q3. And also the meaning of the parameters:
It is a difference in the default behaviour. torch.flatten flattens all dimensions by default, while torch.nn.Flatten flattens all dimensions starting from the second dimension (index 1) by default.
You can see this behaviour in the default values of the start_dim and end_dim arguments. The start_dim argument denotes the first dimension to be flattened (zero-indexed), and the end_dim argument denotes the last dimension to be flattened. So, when start_dim=1, which is the default for torch.nn.Flatten, the first dimension (index 0) is not flattened, but it is included when start_dim=0, which is the default for torch.flatten.
The reason behind this difference is probably because torch.nn.Flatten is intended to be used with torch.nn.Sequential, where typically a series of operations are performed on a batch of inputs, where each input is treated independently of the others. For example, if you have a batch of images and you call torch.nn.Flatten, the typical use case would be to flatten each image separately, and not flatten the whole batch.
If you do want to flatten all dimensions using torch.nn.Flatten, you can simply create the object as torch.nn.Flatten(start_dim=0).
Finally, the shape information in the docs just covers how the shape of the tensor will be affected, illustrating that the first (index 0) dimension is left as it is. So, if you have an input tensor of shape (N, *dims), where *dims is an arbitrary sequence of dimensions, the output tensor will have the shape (N, product of *dims), since all dimensions except the batch dimension are flattened. For example, an input of shape (3,10,10) will have an output of shape (3, 10 x 10) = (3, 100).
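As for the [32, 288] in the docs snippet: Conv2d(1, 32, 5, 1, 1) maps the (32, 1, 5, 5) input to shape (32, 32, 3, 3) (32 output channels, spatial size (5 - 5 + 2*1)/1 + 1 = 3), and Flatten then collapses everything except the batch dimension into 32 * 3 * 3 = 288. For illustration, a small sketch of the two defaults (the input shape is arbitrary):
import torch
import torch.nn as nn
x = torch.randn(32, 1, 5, 5)
print(torch.flatten(x).shape)             # torch.Size([800]) -- all dims flattened (start_dim=0)
print(nn.Flatten()(x).shape)              # torch.Size([32, 25]) -- batch dim kept (start_dim=1)
print(nn.Flatten(start_dim=0)(x).shape)   # torch.Size([800]) -- same as torch.flatten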

Input an integer with placeholder in tensorflow?

I want to feed a batch_size integer as a placeholder in Tensorflow. But it does not act as an integer. Consider the following example:
import tensorflow as tf
max_length = 5
batch_size = 3
batch_size_placeholder = tf.placeholder(dtype=tf.int32)
mask_0 = tf.one_hot(indices=[0]*batch_size_placeholder, depth=max_length, on_value=0., off_value=1.)
mask_1 = tf.one_hot(indices=[0]*batch_size, depth=max_length, on_value=0., off_value=1.)
# new session
with tf.Session() as sess:
    feed = {batch_size_placeholder: 3}
    batch, mask0, mask1 = sess.run([
        batch_size_placeholder, mask_0, mask_1
    ], feed_dict=feed)
When I print the values of batch, mask0 and mask1 I have the following:
print(batch)
>>> array(3, dtype=int32)
print(mask0)
>>> array([[0., 1., 1., 1., 1.]], dtype=float32)
print(mask1)
>>> array([[0., 1., 1., 1., 1.],
           [0., 1., 1., 1., 1.],
           [0., 1., 1., 1., 1.]], dtype=float32)
Indeed I thought mask0 and mask1 must be the same, but it seems that Tensorflow does not treat batch_size_placeholder as an integer. I believe it becomes a tensor, but is there any way I can use it as an integer in my computations?
Is there any way I can fix this problem? Just FYI, I used tf.one_hot only as an example; I want to run train/validation during training in my code, where I will need a lot of other computations with different values of batch_size in the training and validation steps.
Any help would be appreciated.
In pure Python, [0]*3 evaluates to [0, 0, 0]. However, batch_size_placeholder is a placeholder; during graph execution it is a tensor, so [0]*tensor is treated as tensor multiplication, and in your case the result is a 1-D tensor containing a single 0. To use batch_size_placeholder correctly, you should create a tensor whose length equals batch_size_placeholder:
mask_0 = tf.one_hot(tf.zeros(batch_size_placeholder, dtype=tf.int32), depth=max_length, on_value=0., off_value=1.)
It will have the same result as mask_1.
A simple example to show the difference:
batch_size_placeholder = tf.placeholder(dtype=tf.int32)
a = [0]*batch_size_placeholder
b = tf.zeros(batch_size_placeholder, dtype=tf.int32)
with tf.Session() as sess:
    print(sess.run([a, b], feed_dict={batch_size_placeholder: 3}))
# [array([0], dtype=int32), array([0, 0, 0], dtype=int32)]
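Along the same lines, a minimal check of the corrected mask (assuming TF 1.x, as in the question) shows that it now matches mask_1:
import tensorflow as tf
max_length = 5
batch_size_placeholder = tf.placeholder(dtype=tf.int32)
# a row of zeros whose length is given by the placeholder, one-hot encoded row by row
mask_0 = tf.one_hot(tf.zeros(batch_size_placeholder, dtype=tf.int32),
                    depth=max_length, on_value=0., off_value=1.)
with tf.Session() as sess:
    print(sess.run(mask_0, feed_dict={batch_size_placeholder: 3}))
    # three identical rows: [0., 1., 1., 1., 1.]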

How to use reset_states(states) function in Keras?

I'm trying to set the LSTM internal state before training each batch.
I'm sharing my test code and findings, hoping to find an answer and help others that are addressing similar problems.
In particular, for each data point I have a feature X (which doesn't change over time) and a sequence P = p1, p2, p3, ..., p30.
The goal is: given X and p1, p2, p3, predict p4, p5, ..., p30.
To this aim, I want to initialize the hidden state of an LSTM with X, as done in several works (e.g., neuraltalk); the LSTM then has to be fit with p1, p2, p3 to predict p4, ..., p30.
This initialization is needed before each batch (batch_size=1), therefore I need control over how the LSTM states are initialized.
Considering this question, Initializing LSTM hidden state Tensorflow/Keras, I've tested the following code.
First of all, I've added some print statements to the reset_states() function defined in recurrent.py, in order to understand exactly what happens.
def reset_states(self, states=None):
    if not self.stateful:
        raise AttributeError('Layer must be stateful.')
    batch_size = self.input_spec[0].shape[0]
    if not batch_size:
        raise ValueError('If a RNN is stateful, it needs to know '
                         'its batch size. Specify the batch size '
                         'of your input tensors: \n'
                         '- If using a Sequential model, '
                         'specify the batch size by passing '
                         'a `batch_input_shape` '
                         'argument to your first layer.\n'
                         '- If using the functional API, specify '
                         'the time dimension by passing a '
                         '`batch_shape` argument to your Input layer.')
    # initialize state if None
    if self.states[0] is None:
        self.states = [K.zeros((batch_size, self.units))
                       for _ in self.states]
        print "reset states A (all zeros)"
    elif states is None:
        for state in self.states:
            K.set_value(state, np.zeros((batch_size, self.units)))
        print "reset states B (all zeros)"
    else:
        if not isinstance(states, (list, tuple)):
            states = [states]
            print "reset states C (list or tuple copying)"
        if len(states) != len(self.states):
            raise ValueError('Layer ' + self.name + ' expects ' +
                             str(len(self.states)) + ' states, '
                             'but it received ' + str(len(states)) +
                             ' state values. Input received: ' +
                             str(states))
        for index, (value, state) in enumerate(zip(states, self.states)):
            if value.shape != (batch_size, self.units):
                raise ValueError('State ' + str(index) +
                                 ' is incompatible with layer ' +
                                 self.name + ': expected shape=' +
                                 str((batch_size, self.units)) +
                                 ', found shape=' + str(value.shape))
            K.set_value(state, value)
            print "reset states D (set values)"
            print value
            print "\n"
Here is the test code:
import tensorflow as tf
from keras.layers import LSTM
from keras.layers import Input
from keras.models import Model
import numpy as np
import keras.backend as K
input = Input(batch_shape=(1,3,1))
lstm_layer = LSTM(10,stateful=True)(input)
>>> reset states A (all zeros)
As you can see, the first print is executed when the LSTM layer is created.
model = Model(input,lstm_layer)
model.compile(optimizer="adam", loss="mse")
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    h = sess.run(model.layers[1].states[0])
    c = sess.run(model.layers[1].states[1])
print h
>>> [[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]
print c
>>> [[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]
The internal states have been set to all zeros.
As an alternative, the function reset_states() can be used:
model.layers[1].reset_states()
>>> reset states B (all zeros)
The second message has been printed in this case. Everything seems to work correctly.
Now I want to set the states with arbitrary values.
new_h = K.variable(value=np.ones((1, 10)))
new_c = K.variable(value=np.ones((1, 10))+1)
model.layers[1].states[0] = new_h
model.layers[1].states[1] = new_c
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    h = sess.run(model.layers[1].states[0])
    c = sess.run(model.layers[1].states[1])
print h
>>> [[ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]]
print c
>>> [[ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]]
Ok, I've successfully set both hidden states to my vectors of all ones and all twos.
However, it is worth using the class method reset_states(), which takes the states as input.
This function uses K.set_value(x, value), which expects value to be a numpy array.
new_h_5 = np.zeros((1,10))+5
new_c_24 = np.zeros((1,10))+24
model.layers[1].reset_states([new_h_5,new_c_24])
It seems to work, indeed the output is:
>>> reset states D (set values)
>>> [[ 5. 5. 5. 5. 5. 5. 5. 5. 5. 5.]]
>>>
>>>
>>>
>>>
>>> reset states D (set values)
>>> [[ 24. 24. 24. 24. 24. 24. 24. 24. 24. 24.]]
However, if I want to check whether the states have been initialized, I find the previous initialization values (all ones, all twos).
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    hh = sess.run(model.layers[1].states[0])
    cc = sess.run(model.layers[1].states[1])
print hh
>>> [[ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]]
print cc
>>> [[ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]]
What exactly is happening here? Why does the function seem to work according to the prints, yet not change the values of the internal states?
As you may read here, the value parameter sets the value with which a variable is initialized. So when you call tf.global_variables_initializer().run(), your states are re-initialized with the values defined here:
new_h = K.variable(value=np.ones((1, 10)))
new_c = K.variable(value=np.ones((1, 10))+1)
Edit:
It seemed obvious to me, but once again I will explain why reset_states appears not to work.
Variable definition: when you define your inner states to be variables initialized with a certain value, then this value will be set every time you call the variable initializer.
Reset states: this updates the current value of the variable, but it does not change the default value used by its initializer. In order to do that, you would need to reassign the states to yet another variable that has the desired values as its default.
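To make the distinction concrete, a minimal sketch (assuming the same model as above and that the default Keras session is used): reading the state variables directly shows the values written by reset_states, while re-running the global variable initializer restores each variable's initial value.
import numpy as np
import keras.backend as K
# reset_states writes into the existing state variables via K.set_value
model.layers[1].reset_states([np.zeros((1, 10)) + 5, np.zeros((1, 10)) + 24])
# reading the current values (without re-initializing) shows the 5s and 24s
print(K.get_value(model.layers[1].states[0]))
print(K.get_value(model.layers[1].states[1]))
# running tf.global_variables_initializer() again would overwrite them with the
# initializer defaults (the all-ones and all-twos variables assigned earlier)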

How to get integer labels from MNIST examples after specifying one_hot=True?

I have been attempting this tutorial on Youtube (explanation of .cls and .labels at 1m31s), which is just a simple MNIST classifier model. But I was unable to complete it due to an apparently missing function in Tensorflow.
>>>from tensorflow.examples.tutorials.mnist import input_data
>>>data = input_data.read_data_sets("data/MNIST", one_hot=True)
>>>one_hot_labels = data.test.labels #mat shape=(num_images X num_classes)
>>>cls_labels = data.test.cls #mat shape=(num_images X 1)
Traceback (most recent call last):
File "/home/file.py", line 5, in <module>
cls_labels = data.test.cls
AttributeError: 'DataSet' object has no attribute 'cls'
After searching on Google for a ".cls" reference in TF, I was unable to find any information about it.
A dirty example that made things work:
>>>data = input_data.read_data_sets("data/MNIST", one_hot=True)
>>>data2 = input_data.read_data_sets("data/MNIST")
>>>one_hot_labels = data.test.labels #mat shape=(num_images X num_classes)
>>>cls_labels = data2.test.labels #mat shape=(num_images X 1)
I am using Tensorflow 0.10.0 on Linux and am wondering if the .cls option has been removed?
If so, is there an alternative method for encoding an array of classifier names from an array of one_hot vectors?
Thanks
Your labels are in this type of array (one-hot), for example:
array([[ 0., 0., 0., ..., 1., 0., 0.],
       [ 0., 0., 1., ..., 0., 0., 0.],
       [ 0., 1., 0., ..., 0., 0., 0.],
The position of the 1. in each row of the array tells you which label it is.
To get an integer label from this data, you have to get the index with:
data.test.cls = np.argmax(data.test.labels, axis=1)
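For instance, a small illustration of what this argmax does (the array values here are made up):
import numpy as np
one_hot = np.array([[0., 0., 1.],
                    [1., 0., 0.],
                    [0., 1., 0.]])
print(np.argmax(one_hot, axis=1))  # [2 0 1] -- the index of the 1 in each row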
Currently we use the attribute images for image data and labels for classes (labels). For example:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("data/MNIST", one_hot=True)
# data
images = mnist.test.images
# label
labels = mnist.test.labels
# without one-hot
mnist = input_data.read_data_sets("data/MNIST", one_hot=False)
# original data
images = mnist.test.images.reshape([-1, 28, 28])
print(images.shape)
# label
labels = mnist.test.labels
print(labels)
