I am trying to understand what tf.layers.dense does to an array and am using the code below. However, I get an error while running the code.
I tried debugging and it seems that there might be some issue while computing the rank of the tensor. However, sess.run(tf.rank(a)) successfully returns 3. So I suppose there is some other issue with the tensor itself.
import numpy as np
import tensorflow as tf
a = np.array([[[1, 0, 0], [1, 1, 0]], [[0, 0, 0], [0, 1, 1]]])
hidden_layer = tf.layers.dense(a, 5, activation=tf.nn.relu)
sess = tf.Session()
print(sess.run(hidden_layer))
The above code throws the error AttributeError: 'tuple' object has no attribute 'ndims', but I expect that a fully connected layer with weights and biases should be created.
What am I doing wrong?
Also, it would be really helpful if someone can maybe show a Python/NumPy equivalent of this implementation (without using tensorflow's dense), so that it's intuitive to follow.
There are two problems with the code: The Dense layer expects a Tensor as input instead of a Numpy array and the weights of the hidden layer need to be explicitly initialized. Here is the corrected code with some comments:
import numpy as np
import tensorflow as tf
# Make sure Dense layer is always initialized with the same values.
tf.set_random_seed(0)
# Dense layer will need float input
a = np.array([[[1, 0, 0], [1, 1, 0]], [[0, 0, 0], [0, 1, 1]]],
dtype=np.float32)
# Convert numpy array to Tensor
t = tf.convert_to_tensor(a)
# Create hidden layer with the array as input and random initializer that
# draws from a uniform distribution.
hidden_layer = tf.layers.dense(t, 5, activation=tf.nn.relu)
sess = tf.Session()
# Initialize Dense layer with the random initializer
sess.run(tf.global_variables_initializer())
# Print result of running the array through the Dense layer
print(sess.run(hidden_layer))
Btw as long as you're experimenting, you might benefit from using TensorFlow in eager mode or using PyTorch which has a friendlier interface.
First, you should reshape a. Then input the a to hidden layer. And you should initialize the parameter.
import numpy as np
import tensorflow as tf
a = np.array([[[1, 0, 0], [1, 1, 0]], [[0, 0, 0], [0, 1, 1]]], dtype=np.int32)
a = tf.reshape(a, [-1, 4*3])
hidden_layer = tf.layers.dense(a, 5, activation=tf.nn.relu)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
print(sess.run(hidden_layer))
Related
I am learning python by printing the output line by line from this tutorial https://towardsdatascience.com/convolution-neural-network-for-image-processing-using-keras-dc3429056306 and find out what does each line do. At the last line of code I facing error weight not define but it seem the code is running fine without the need to define the weight as inside the tutorial link. What I did wrongly in the code and how to fix it?
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as fn
filter_vals = np.array([[-1, -1, 1, 2], [-1, -1, 1, 0], [-1, -1, 1, 1], [-1, -1, 1,
1]])
print('Filter shape: ', filter_vals.shape)# Neural network with one convolutional layer
and four filters
# Neural network with one convolutional layer and four filters
class Net(nn.Module):
def __init__(self, weight): super(Net, self).__init__()
k_height, k_width = weight.shape[2:]
The error is due to an indentation issue. The last line needs to be executed inside the constructor init for it to recognize the weight argument. The code should look like this:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as fn
filter_vals = np.array([[-1, -1, 1, 2],
[-1, -1, 1, 0],
[-1, -1, 1, 1],
[-1, -1, 1, 1]])
print('Filter shape: ', filter_vals.shape) # Neural network with one convolutional layer and four filters
# Neural network with one convolutional layer and four filters
class Net(nn.Module):
def __init__(self, weight):
super(Net, self).__init__()
k_height, k_width = weight.shape[2:]
I have a tensorflow model with my truth data in the shape (N, 32, 32, 5) ie. 32x32 images with 5 channels.
Inside the loss function I would like to calculate, for each pixel, the sum of the values of the neighboring pixels for each channel, generating a new (N, 32, 32, 5) tensor.
The tf.nn.pool function does something similar but not exactly what I need. I was trying to see if tf.nn.conv2d could get me there but I'm not sure what I'd need to use as the filter parameter in this case.
Is there a specific function for this? Or can I use conv2d somehow?
You can do that with tf.nn.separable_conv2d like this
import tensorflow as tf
input = tf.placeholder(tf.float32, [None, 32, 32, 5])
# Depthwise filter adds the neighborhood of each pixel per channel
depthwise_filter = tf.ones([3, 3, 5, 1], input.dtype)
# Pointwise filter does not do anything
pointwise_filter = tf.eye(5, batch_shape=[1, 1], dtype=input.dtype)
output = tf.nn.separable_conv2d(input, depthwise_filter, pointwise_filter,
strides=[1, 1, 1, 1], padding='SAME')
print(output.shape)
# (?, 32, 32, 5)
The following method using tf.nn.conv2d is also equivalent:
import tensorflow as tf
input = tf.placeholder(tf.float32, [None, 32, 32, 5])
# Each filter adds the neighborhood for a different channel
filter = tf.eye(5, batch_shape=[3, 3], dtype=input.dtype)
output = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
A new convolutional layer with the filter size of 3x3 and filters initialized to 1 will do the job. Just be careful to declare this special filter as an untrainable variable otherwise your optimizer would change its contents. Additionally, set padding to "SAME" to get the same size output from that convolutional layer. The pixels at the edges will have zero neigbors in that case.
A model that I am working on should be predicting quite a lot of variables simultaneously (>1000). Therefore I would like to have a small neural network at the end of the network for each output.
In order to do this compactly, I would like to find a way to create a sparse trainable connection between two layers in the neural network within the Tensorflow framework.
Only a small portion of the connection matrix should be trainable: It is only the parameters that are part of the block-diagonal.
For example:
The connection matrix is the following:
The trainable parameters should be in the place of the 1's.
I have written exactly such a layer:
https://github.com/ArnovanHilten/GenNet/blob/master/GenNet_utils/LocallyDirectedConnected_tf2.py
It takes a sparse matrix as an input and lets you decide how to connect between layers. The layer uses sparse tensors and matrix multiplications.
edit
so the comment was Is this a trainable object though?
The answer: No. You cannot use sparse matrix currently and make it trainable. Instead you can use a mask matrix (see at the end)
But if you need to use sparse matrix, you just have to use tf.sparse.sparse_dense_matmul() or tf.sparse_tensor_to_dense() where your sparse interacts with a dense matrix. I have taken a simple XOR example from here and replaced dense with a sparse matrix:
#Declaring necessary modules
import tensorflow as tf
import numpy as np
"""
A simple numpy implementation of a XOR gate to understand the backpropagation
algorithm
"""
x = tf.placeholder(tf.float32,shape = [4,2],name = "x")
#declaring a place holder for input x
y = tf.placeholder(tf.float32,shape = [4,1],name = "y")
#declaring a place holder for desired output y
m = np.shape(x)[0]#number of training examples
n = np.shape(x)[1]#number of features
hidden_s = 2 #number of nodes in the hidden layer
l_r = 1#learning rate initialization
theta1 = tf.SparseTensor(indices=[[0, 0],[0, 1], [1, 1]], values=[0.1, 0.2, 0.1], dense_shape=[3, 2])
#theta1 = tf.cast(tf.Variable(tf.random_normal([3,hidden_s]),name = "theta1"),tf.float64)
theta2 = tf.cast(tf.Variable(tf.random_normal([hidden_s+1,1]),name = "theta2"),tf.float32)
#conducting forward propagation
a1 = tf.concat([np.c_[np.ones(x.shape[0])],x],1)
#the weights of the first layer are multiplied by the input of the first layer
#z1 = tf.sparse_tensor_dense_matmul(theta1, a1)
z1 = tf.matmul(a1,tf.sparse_tensor_to_dense(theta1))
#the input of the second layer is the output of the first layer, passed through the
a2 = tf.concat([np.c_[np.ones(x.shape[0])],tf.sigmoid(z1)],1)
#the input of the second layer is multiplied by the weights
z3 = tf.matmul(a2,theta2)
#the output is passed through the activation function to obtain the final probability
h3 = tf.sigmoid(z3)
cost_func = -tf.reduce_sum(y*tf.log(h3)+(1-y)*tf.log(1-h3),axis = 1)
#built in tensorflow optimizer that conducts gradient descent using specified
optimiser = tf.train.GradientDescentOptimizer(learning_rate = l_r).minimize(cost_func)
#setting required X and Y values to perform XOR operation
X = [[0,0],[0,1],[1,0],[1,1]]
Y = [[0],[1],[1],[0]]
#initializing all variables, creating a session and running a tensorflow session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
#running gradient descent for each iterati
for i in range(200):
sess.run(optimiser, feed_dict = {x:X,y:Y})#setting place holder values using feed_dict
if i%100==0:
print("Epoch:",i)
print(sess.run(theta1))
and the output is:
Epoch: 0
SparseTensorValue(indices=array([[0, 0],
[0, 1],
[1, 1]]), values=array([0.1, 0.2, 0.1], dtype=float32), dense_shape=array([3, 2]))
Epoch: 100
SparseTensorValue(indices=array([[0, 0],
[0, 1],
[1, 1]]), values=array([0.1, 0.2, 0.1], dtype=float32), dense_shape=array([3, 2]))
So the only way is to use a mask matrix. You can use it by multiplication or tf.where
1) Multiplication: You can create mask matrix of the desired shape and multiply it with your weight matrix:
mask = tf.Variable([[1,0,0],[0,1,0],[0,0,1]],name ='mask', trainable=False)
weight = tf.cast(tf.Variable(tf.random_normal([3,3])),tf.float32)
desired_tensor = tf.matmul(weight, mask)
2) tf.where
mask = tf.Variable([[1,0,0],[0,1,0],[0,0,1]],name ='mask', trainable=False)
weight = tf.cast(tf.Variable(tf.random_normal([3,3])),tf.float32)
desired_tensor = tf.where(mask > 0, tf.ones_like(weight), weight)
Hope it helps
You can do that by using sparse tensors like so:
SparseTensor(indices=[[0, 0], [1, 2]], values=[1, 2], dense_shape=[3, 4])
and the output is:
[[1, 0, 0, 0]
[0, 0, 2, 0]
[0, 0, 0, 0]]
you can look up more on the documentation of sparse tensor here:
https://www.tensorflow.org/api_docs/python/tf/sparse/SparseTensor
Hope it helps!
I am trying to import weights saved from a Tensorflow model to PyTorch. So far the results have been very similar. I ran into a snag when the model calls for conv2d with stride=2.
To verify the mismatch, I set up a very simple comparison between TF and PyTorch. First, I compare conv2d with stride=1.
import tensorflow as tf
import numpy as np
import torch
import torch.nn.functional as F
np.random.seed(0)
sess = tf.Session()
# Create random weights and input
weights = torch.empty(3, 3, 3, 8)
torch.nn.init.constant_(weights, 5e-2)
x = np.random.randn(1, 3, 10, 10)
weights_tf = tf.convert_to_tensor(weights.numpy(), dtype=tf.float32)
# PyTorch adopts [outputC, inputC, kH, kW]
weights_torch = torch.Tensor(weights.permute((3, 2, 0, 1)))
# Tensorflow defaults to NHWC
x_tf = tf.convert_to_tensor(x.transpose((0, 2, 3, 1)), dtype=tf.float32)
x_torch = torch.Tensor(x)
# TF Conv2D
tf_conv2d = tf.nn.conv2d(x_tf,
weights_tf,
strides=[1, 1, 1, 1],
padding="SAME")
# PyTorch Conv2D
torch_conv2d = F.conv2d(x_torch, weights_torch, padding=1, stride=1)
sess.run(tf.global_variables_initializer())
tf_result = sess.run(tf_conv2d)
diff = np.mean(np.abs(tf_result.transpose((0, 3, 1, 2)) - torch_conv2d.detach().numpy()))
print('Mean of Abs Diff: {0}'.format(diff))
The result of this execution is:
Mean of Abs Diff: 2.0443112092038973e-08
When I change stride to 2, the results start to vary.
# TF Conv2D
tf_conv2d = tf.nn.conv2d(x_tf,
weights_tf,
strides=[1, 2, 2, 1],
padding="SAME")
# PyTorch Conv2D
torch_conv2d = F.conv2d(x_torch, weights_torch, padding=1, stride=2)
The result of this execution is:
Mean of Abs Diff: 0.2104552686214447
According to PyTorch documentation, conv2d uses zero-padding defined by the padding argument. Thus, zeros are added to the left, top, right, and bottom of the input in my example.
If PyTorch simply adds padding on both sides based on the input parameter, it should be easy to replicate in Tensorflow.
# Manually add padding - consistent with PyTorch
paddings = tf.constant([[0, 0], [1, 1], [1, 1], [0, 0]])
x_tf = tf.convert_to_tensor(x.transpose((0, 2, 3, 1)), dtype=tf.float32)
x_tf = tf.pad(x_tf, paddings, "CONSTANT")
# TF Conv2D
tf_conv2d = tf.nn.conv2d(x_tf,
weights_tf,
strides=[1, 2, 2, 1],
padding="VALID")
The result of this comparison is:
Mean of Abs Diff: 1.6035047067930464e-08
What this tells me is that if I am somehow able to replicate the default padding behavior from Tensorflow into PyTorch, then my results will be similar.
This question inspected the behavior of padding in Tensorflow. TF documentation explains how padding is added for "SAME" convolutions. I discovered these links while writing this question.
Now that I know the padding strategy of Tensorflow, I can implement it in PyTorch.
To replicate the behavior, padding sizes are calculated as described in the Tensorflow documentation. Here, I test the padding behavior by setting stride=2 and padding the PyTorch input.
import tensorflow as tf
import numpy as np
import torch
import torch.nn.functional as F
np.random.seed(0)
sess = tf.Session()
# Create random weights and input
weights = torch.empty(3, 3, 3, 8)
torch.nn.init.constant_(weights, 5e-2)
x = np.random.randn(1, 3, 10, 10)
weights_tf = tf.convert_to_tensor(weights.numpy(), dtype=tf.float32)
weights_torch = torch.Tensor(weights.permute((3, 2, 0, 1)))
# Tensorflow padding behavior. Assuming that kH == kW to keep this simple.
stride = 2
if x.shape[2] % stride == 0:
pad = max(weights.shape[0] - stride, 0)
else:
pad = max(weights.shape[0] - (x.shape[2] % stride), 0)
if pad % 2 == 0:
pad_val = pad // 2
padding = (pad_val, pad_val, pad_val, pad_val)
else:
pad_val_start = pad // 2
pad_val_end = pad - pad_val_start
padding = (pad_val_start, pad_val_end, pad_val_start, pad_val_end)
x_tf = tf.convert_to_tensor(x.transpose((0, 2, 3, 1)), dtype=tf.float32)
x_torch = torch.Tensor(x)
x_torch = F.pad(x_torch, padding, "constant", 0)
# TF Conv2D
tf_conv2d = tf.nn.conv2d(x_tf,
weights_tf,
strides=[1, stride, stride, 1],
padding="SAME")
# PyTorch Conv2D
torch_conv2d = F.conv2d(x_torch, weights_torch, padding=0, stride=stride)
sess.run(tf.global_variables_initializer())
tf_result = sess.run(tf_conv2d)
diff = np.mean(np.abs(tf_result.transpose((0, 3, 1, 2)) - torch_conv2d.detach().numpy()))
print('Mean of Abs Diff: {0}'.format(diff))
The output is:
Mean of Abs Diff: 2.2477470551507395e-08
I wasn't quite sure why this was happening when I started writing this question, but a bit of reading clarified this very quickly. I hope this example can help others.
Now i'm trying lstm tutorial, look some one's book. But it didn't work. What's the problem? :
import tensorflow as tf
import numpy as np
from tensorflow.contrib import rnn
import pprint
pp = pprint.PrettyPrinter(indent=4)
sess = tf.InteractiveSession()
a = [1, 0, 0, 0]
b = [0, 1, 0, 0]
c = [0, 0, 1, 0]
d = [0, 0, 0, 1]
init=tf.global_variables_initializer()
with tf.variable_scope('one_cell') as scope:
hidden_size = 2
cell = tf.contrib.rnn.BasicRNNCell(num_units=hidden_size)
print(cell.output_size, cell.state_size)
x_data = np.array([[a]], dtype=np.float32)
pp.pprint(x_data)
outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
sess.run(init)
pp.pprint(outputs.eval())
Error message is like that. Please solve this problem.
Attempting to use uninitialized value one_cell/rnn/basic_rnn_cell/weights
[[Node: one_cell/rnn/basic_rnn_cell/weights/read = Identity[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](one_cell/rnn/basic_rnn_cell/weights)]]
You haven't initialized some graph variables, as the error mentioned. Shift your code to this and it will work.
outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
init=tf.global_variables_initializer()
sess.run(init)
Best practice is to have init right at the end of your graph and before sess.run.
EDIT: Refer to What does tf.global_variables_initializer() do under the hood? for more insights.
You define the operation init before creating your variables. Thus this operation will be performed only on the variables defined at that time, even if you run it after creating your variables.
So just move the definition of init and you will be fine.