PyTorch classifier output dimension - python

I am trying to pass a tensor of size (1, 12, 512, 512) into a PyTorch classifier; the layers of the classifier do not really matter. The output should have size (2), something like [0.5124, 0.7557]. The classifier does not reduce the tensor to that size, although I did add some projection layers. What is the best way to project a tensor of size (1, 12, 512, 512) down to size (2) and pass it through the classifier? I really appreciate the help.
This is what I tried, but it did not work:
class cls(nn.Module):
    def __init__(self, in_dim, hid_dim, out_dim, dropout):
        super(cls, self).__init__()
        layers = [
            weight_norm(nn.Linear(in_dim, hid_dim), dim=None),
            nn.ReLU(),
            nn.Dropout(dropout, inplace=True),
            weight_norm(nn.Linear(hid_dim, out_dim), dim=None),
            nn.Sigmoid()
        ]
        self.main = nn.Sequential(*layers)
    def forward(self, x):
        logits = self.main(x)
        return logits
in_dim=512, hid_dim=512, out_dim=2
I thought adding some more linear projections would help, but it did not work: I get an error saying the matrix multiplication is invalid because of a size mismatch. I am not sure what I should do.
The expected behavior is the following for example:
tensor of size(1,12,512,512) ---> output:tensor([0.5124,0.7557])

Here is an example that illustrates my solution:
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm
class cls(nn.Module):
    def __init__(self, in_dim, hid_dim, out_dim, dropout):
        super(cls, self).__init__()
        layers = [
            weight_norm(nn.Linear(in_dim, hid_dim)),
            nn.ReLU(),
            nn.Dropout(dropout, inplace=True),
            weight_norm(nn.Linear(hid_dim, out_dim)),
            nn.Sigmoid()
        ]
        self.main = nn.Sequential(*layers)
    def forward(self, x):
        logits = self.main(x)
        return logits
x = torch.rand(512) # <-- the first layer input size is 512.
model = cls(512,512,2,0.2) # input=512, hidden=512, output=2 as in your question.
model(x)
#output
tensor([0.5249, 0.5026], grad_fn=<SigmoidBackward0>)
Second case:
x = torch.rand(512, 512) # No. batches = (512), input size = 512
#output the size is (512,2)
tensor( [[0.4779, 0.4676],
[0.4630, 0.5059],
[0.4675, 0.5203],
...,
[0.4641, 0.5179],
[0.4877, 0.4733],
[0.4103, 0.4845],
...
[0.4442, 0.4981],
[0.4616, 0.4990],
[0.4673, 0.5278]], grad_fn=<SigmoidBackward0>)
Third case:
x = torch.rand(12, 512, 512) # No. batches = (12, 512), input size = 512
#output the size is (12, 512,2)
tensor([[[0.4779, 0.4676],
[0.4630, 0.5059],
[0.4675, 0.5203],
...,
[0.4641, 0.5179],
[0.4877, 0.4733],
[0.4103, 0.4845]],
...
[0.4442, 0.4981],
[0.4616, 0.4990],
[0.4673, 0.5278]]], grad_fn=<SigmoidBackward0>)
Fourth case:
x = torch.rand(1, 12, 512, 512) # No. batches = (1, 12, 512), input size = 512
#output the size is (1,12, 512,2)
tensor([[[[0.4779, 0.4676],
[0.4630, 0.5059],
[0.4675, 0.5203],
...,
[0.4641, 0.5179],
[0.4877, 0.4733],
[0.4103, 0.4845]],
...
[0.4442, 0.4981],
[0.4616, 0.4990],
[0.4673, 0.5278]]]], grad_fn=<SigmoidBackward0>)
In conclusion, the last dimension of your input x must match the input size (in_features) of the first nn.Linear layer in your model. If x has additional leading dimensions, PyTorch treats them as batch dimensions and applies the layer to each slice independently, which is why the output keeps those leading dimensions.
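The four cases above can be checked in a few lines, and the same snippet sketches one way to end up with an output of size (2). The mean-pooling step is an assumption made purely for illustration (it is not part of the answer above); flattening or a learned pooling layer would be equally valid choices depending on the task.

```python
import torch
import torch.nn as nn

linear = nn.Linear(512, 2)  # in_features=512, out_features=2

# nn.Linear only acts on the last dimension; leading dims are kept as-is.
shapes = {}
for shape in [(512,), (512, 512), (12, 512, 512), (1, 12, 512, 512)]:
    shapes[shape] = tuple(linear(torch.rand(*shape)).shape)
    print(shape, "->", shapes[shape])

# Hypothetical reduction (assumption): average over every leading dimension
# so that a single 512-dim vector reaches the classifier.
x = torch.rand(1, 12, 512, 512)
pooled = x.mean(dim=(0, 1, 2))        # shape (512,)
out = torch.sigmoid(linear(pooled))   # shape (2,)
print(out.shape)
```

Whether averaging is appropriate depends on what the 12 and the first 512 represent; if they carry information the classifier needs, a reduction that preserves it (for example flattening into a larger first layer) may be a better fit.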

Related

input must have 3 dimensions, got 2 Error in create LSTM Classifier

The structure of the network must be as follows:
(lstm): LSTM(1, 64, batch_first=True)
(fc1): Linear(in_features=64, out_features=32, bias=True)
(relu): ReLU()
(fc2): Linear(in_features=32, out_features=5, bias=True)
I wrote this code:
class LSTMClassifier(nn.Module):
    def __init__(self):
        super(LSTMClassifier, self).__init__()
        self.lstm = nn.LSTM(1, 64, batch_first=True)
        self.fc1 = nn.Linear(in_features=64, out_features=32, bias=True)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(in_features=32, out_features=5, bias=True)
    def forward(self, x):
        x = torch.tanh(self.lstm(x)[0])
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
This is for test:
(batch_data, batch_label) = next (iter (train_loader))
model = LSTMClassifier().to(device)
output = model (batch_data.to(device)).cpu()
assert output.shape == (batch_size, 5)
print ("passed")
The error is:
----> 3 output = model (batch_data.to(device)).cpu()
5 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/rnn.py in check_input(self, input, batch_sizes)
201 raise RuntimeError(
202 'input must have {} dimensions, got {}'.format(
--> 203 expected_input_dim, input.dim()))
204 if self.input_size != input.size(-1):
205 raise RuntimeError(
RuntimeError: input must have 3 dimensions, got 2
What is my problem?
LSTMs expect 3-dimensional input (sample, time-step, features). You need to transform your input from 2D to 3D. To do so, you can:
Use the reshape function
First, get the shape of your 2D input using batch_data.shape. Let's assume the shape of your 2D input is (15, 4).
Since batch_data is a PyTorch tensor, reshape it from 2D to 3D with the tensor's own reshape method, batch_data.reshape(new_shape). (np.reshape(data, new_shape) is the NumPy counterpart and is intended for NumPy arrays.)
(batch_data, batch_label) = next (iter (train_loader))
batch_data = batch_data.reshape(15, 4, 1) # line to add
model = LSTMClassifier().to(device)
output = model (batch_data.to(device)).cpu()
assert output.shape == (batch_size, 5)
print ("passed")
Later on, you will also need to reshape your test data from 2D to 3D.
Add a RepeatVector layer
This layer is implemented in Keras. PyTorch has no built-in layer of that name, but the same effect can be obtained with tensor operations such as unsqueeze followed by repeat or expand.
This layer adds an extra dimension to your data (it repeats the input n times). For example, you can convert a 2D input (batch_size, input_size) to a 3D input (batch_size, sequence_length, input_size).
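A minimal sketch of the reshape approach, assuming each sample is a sequence of 4 scalar readings (so the feature size is 1, matching LSTM(1, 64)). Two things here are assumptions rather than part of the original code: the forward pass returns its result, and it keeps only the last time step so that the output has shape (batch_size, 5) as the assertion in the question expects. The tanh from the question's forward is left out for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTMClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(1, 64, batch_first=True)
        self.fc1 = nn.Linear(64, 32)
        self.fc2 = nn.Linear(32, 5)

    def forward(self, x):
        if x.dim() == 2:
            # (batch, seq_len) -> (batch, seq_len, 1): add the feature dim
            x = x.unsqueeze(-1)
        out, _ = self.lstm(x)          # (batch, seq_len, 64)
        last = out[:, -1, :]           # assumption: use the last time step
        return self.fc2(F.relu(self.fc1(last)))

batch_data = torch.rand(15, 4)         # hypothetical 2D batch
output = LSTMClassifier()(batch_data)
print(output.shape)                    # torch.Size([15, 5])
```

unsqueeze(-1) is equivalent to reshape(15, 4, 1) here but does not hard-code the batch size, which helps when the last batch of a DataLoader is smaller.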

Error in transformation of EMNIST data through Pytorch

I was trying to train my model for prediction of EMNIST by using Pytorch.
Edit:- Here's the link of colab notebook for the problem.
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(28, 64, (5, 5), padding=2)
        self.conv1_bn = nn.BatchNorm2d(64)
        self.conv2 = nn.Conv2d(64, 128, 2, padding=2)
        self.fc1 = nn.Linear(2048, 1024)
        self.dropout = nn.Dropout(0.3)
        self.fc2 = nn.Linear(1024, 512)
        self.bn = nn.BatchNorm1d(1)
        self.fc3 = nn.Linear(512, 128)
        self.fc4 = nn.Linear(128, 47)
    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = self.conv1_bn(x)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 2048)
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)
        x = x.view(-1, 1, 512)
        x = self.bn(x)
        x = x.view(-1, 512)
        x = self.fc3(x)
        x = self.fc4(x)
        return F.log_softmax(x, dim=1)
        return x
I get the error shown below whenever I train my model.
<ipython-input-11-07c68cf1cac2> in forward(self, x)
24 def forward(self, x):
25 x = F.relu(self.conv1(x))
---> 26 x = F.max_pool2d(x, 2, 2)
27 x = self.conv1_bn(x)
RuntimeError: Given input size: (64x28x1). Calculated output size: (64x14x0). Output size is too small
I searched for solutions and found that I should transform the data first. So I tried the most common suggestion:
transform_valid = transforms.Compose(
[
transforms.ToTensor(),
])
But then again I am getting the error mentioned below. Maybe the problem lies here in the transformation part.
/opt/conda/lib/python3.7/site-packages/torchvision/datasets/mnist.py:469: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1595629403081/work/torch/csrc/utils/tensor_numpy.cpp:141.)
return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
I wanted to make that particular numpy array writable by using "ndarray.setflags(write=None, align=None, uic=None)" but I'm not able to figure out from where and what type of array should I make writable, as I'm directly loading the dataset using ->
"datasets.EMNIST(root, split="balanced", train=False, download=True, transform=transform_valid)"
Welcome to Stack Overflow!
Your problem is not related to the ToTensor transform. The error comes from the dimensions of the tensor you feed into your max pool: it clearly states that you are trying to max-pool a tensor in which one of the dimensions is 1 (64, 28, 1), so the output would have a dimension of 0 (64, 14, 0), which makes no sense.
You need to check the dimensions of the tensors you input into your model. They are definitely too small. Maybe you made a mistake with a view somewhere (hard to tell without a minimal reproducible example).
If I can try to guess, you start with a tensor of size 28x28x1 (typical MNIST), and you put it into a convolution that expects a tensor of dims BxCxHxW (batch_size, channels, height, width), i.e. something like (B, 1, 28, 28), but you confused the width (28) with the input channels (nn.Conv2d(->28<-, 64, (5, 5), padding=2)).
I believe you want your first layer to be nn.Conv2d(1, 64, (5, 5), padding=2), and you need to reshape your tensors to (B, 1, 28, 28) (the value of B is up to you) before giving them to the network.
Side note: the warning about writable NumPy arrays is completely unrelated. It just means that PyTorch could possibly overwrite the "non-writable" data of your NumPy array. If you don't care about this NumPy array being modified, you can ignore the warning.
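To make the suggested shapes concrete, here is a small sketch that pushes a hypothetical batch through the two convolution and pooling stages with the first layer changed to one input channel. The batch size of 8 and the random input are placeholders. Under these layer settings the flattened size comes out as 8192 per sample, so the 2048 used in fc1 and in the view call would probably need revisiting as well; that observation goes beyond what the answer above states.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(1, 64, (5, 5), padding=2)   # 1 input channel (grayscale)
conv2 = nn.Conv2d(64, 128, 2, padding=2)

x = torch.rand(8, 28, 28)     # hypothetical batch lacking a channel dim
x = x.unsqueeze(1)            # (8, 1, 28, 28) = (B, C, H, W)

x = F.max_pool2d(F.relu(conv1(x)), 2, 2)
print(x.shape)                # torch.Size([8, 64, 14, 14])
x = F.max_pool2d(F.relu(conv2(x)), 2, 2)
print(x.shape)                # torch.Size([8, 128, 8, 8])

flat = torch.flatten(x, 1)    # keep the batch dim, flatten the rest
print(flat.shape)             # torch.Size([8, 8192])
```

When the dataset is loaded through torchvision with transforms.ToTensor(), each image already has shape (1, 28, 28), so a DataLoader batch arrives as (B, 1, 28, 28) and the unsqueeze step is not needed.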

Classifier Loss function dimension out of range error

import torch.nn as nn
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=5, stride=2, padding=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=5, stride=2)
        self.fc = nn.Linear(884736, 1000)
        self.fc1 = nn.Linear(1000, 600)
        self.fc2 = nn.Linear(600, 200)
        self.fc3 = nn.Linear(200, 6)
        self.pooling = nn.MaxPool2d(2, 2)
    def forward(self, x):
        x = self.conv1(x)
        x = nn.functional.relu(x)
        x = self.pooling(x)
        x = self.conv2(x)
        x = torch.flatten(nn.functional.relu(x))
        x = self.fc(x)
        x = nn.functional.relu(x)
        # import pdb; pdb.set_trace()
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        # x = torch.softmax(x)
        return x
# model = torch.nn.Sequential(
# )
model = MyModel()
# Training
dataiter = iter(trainloader)
total_epochs = 5
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())
for epoch in tqdm(range(total_epochs)):
    # initialize batch
    gc.collect()
    input_, label_ = dataiter.next()
    # forward
    out = model.forward(input_)
    # backward
    print (out,out.shape,)
    print (label_, label_.shape)
    # out = out.unsqueeze(dim=0)
    # label_ =label_.type_as(out)
    loss = criterion(out, label_)
    loss.backward()
    optimizer.zero_grad()
    optimizer.step()
    print('batch_loss:', str(loss.item()))
    print('Epochs completed:', epoch+1,'\n')
    print('epoch_loss:' + loop_loss/float(batch_size))
I have a dataset of different dog breeds (120 classes)
http://vision.stanford.edu/aditya86/ImageNetDogs/images.tar
The labels are int values ranging from 1 to 120
I need to make a classifier
Getting an error at loss computation
Dimension out of range (expected to be in range of [-1, 0], but got 1)
What could be wrong?
The output of the model has only a single dimension: it has the size [6], but nn.CrossEntropyLoss expects a size of [batch_size, num_classes].
In the model you flatten the output of the convolutions. You have to preserve the batch dimension, because the samples in a batch are independent of each other and flattening completely would merge them into a single vector. torch.flatten accepts a start_dim argument (second argument), which decides from which dimension it starts to flatten. By setting it to 1, it will start with the second dimension, leaving the first dimension (batch dimension) unchanged.
# Flatten everything but the first dimension
# From: [batch_size, channels, height, width] (4D)
# To: [batch_size, channels * height * width] (2D)
x = torch.flatten(nn.functional.relu(x), 1)
The output of the model must also have the same number of classes as your dataset. Since you have 120 classes, the output of the last linear layer must be 120.
self.fc3 = nn.Linear(200, 120)
Also, the labels need to be in range [0, 119], because they are the indices of the classes and like every indexing in Python, it is zero-based. If your labels are in range [1, 120], you can simply subtract one from them.
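The three fixes (flatten from dimension 1, 120 outputs, zero-based labels) can be sketched together as follows. The feature tensor is a random stand-in for the convolution output and its size is hypothetical. The order of the optimizer calls (zero_grad, then backward, then step) is an additional observation not made in the answer above: calling zero_grad between backward and step would clear the gradients before they are used.

```python
import torch
import torch.nn as nn

batch_size, num_classes = 4, 120

# Stand-in for the conv output: [batch_size, channels, height, width]
features = torch.rand(batch_size, 64, 5, 5)
flat = torch.flatten(features, 1)            # [4, 64 * 5 * 5] = [4, 1600]

head = nn.Linear(flat.shape[1], num_classes)
logits = head(flat)                          # [4, 120]

# Labels stored as 1..120 are shifted to the 0..119 range the loss expects.
labels_1_based = torch.randint(1, num_classes + 1, (batch_size,))
labels = labels_1_based - 1

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(head.parameters())

optimizer.zero_grad()                        # clear old gradients first
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
print(logits.shape, loss.item())
```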

RuntimeError: size mismatch, m1: [4 x 784], m2: [4 x 784] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:136

I have executed the following code
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
from torch.utils import data as t_data
import torchvision.datasets as datasets
from torchvision import transforms
data_transforms = transforms.Compose([transforms.ToTensor()])
mnist_trainset = datasets.MNIST(root='./data', train=True,
                                download=True, transform=data_transforms)
batch_size=4
dataloader_mnist_train = t_data.DataLoader(mnist_trainset,
                                           batch_size=batch_size,
                                           shuffle=True
                                           )
def make_some_noise():
    return torch.rand(batch_size,100)
class generator(nn.Module):
    def __init__(self, inp, out):
        super(generator, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(inp,784),
            nn.ReLU(inplace=True),
            nn.Linear(784,1000),
            nn.ReLU(inplace=True),
            nn.Linear(1000,800),
            nn.ReLU(inplace=True),
            nn.Linear(800,out)
        )
    def forward(self, x):
        x = self.net(x)
        return x
class discriminator(nn.Module):
    def __init__(self, inp, out):
        super(discriminator, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(inp,784),
            nn.ReLU(inplace=True),
            nn.Linear(784,784),
            nn.ReLU(inplace=True),
            nn.Linear(784,200),
            nn.ReLU(inplace=True),
            nn.Linear(200,out),
            nn.Sigmoid()
        )
    def forward(self, x):
        x = self.net(x)
        return x
def plot_img(array,number=None):
    array = array.detach()
    array = array.reshape(28,28)
    plt.imshow(array,cmap='binary')
    plt.xticks([])
    plt.yticks([])
    if number:
        plt.xlabel(number,fontsize='x-large')
    plt.show()
d_steps = 100
g_steps = 100
gen=generator(4,4)
dis=discriminator(4,4)
criteriond1 = nn.BCELoss()
optimizerd1 = optim.SGD(dis.parameters(), lr=0.001, momentum=0.9)
criteriond2 = nn.BCELoss()
optimizerd2 = optim.SGD(gen.parameters(), lr=0.001, momentum=0.9)
printing_steps = 20
epochs = 5
for epoch in range(epochs):
    print (epoch)
    # training discriminator
    for d_step in range(d_steps):
        dis.zero_grad()
        # training discriminator on real data
        for inp_real,_ in dataloader_mnist_train:
            inp_real_x = inp_real
            break
        inp_real_x = inp_real_x.reshape(batch_size,784)
        dis_real_out = dis(inp_real_x)
        dis_real_loss = criteriond1(dis_real_out,
                                    Variable(torch.ones(batch_size,1)))
        dis_real_loss.backward()
        # training discriminator on data produced by generator
        inp_fake_x_gen = make_some_noise()
        #output from generator is generated
        dis_inp_fake_x = gen(inp_fake_x_gen).detach()
        dis_fake_out = dis(dis_inp_fake_x)
        dis_fake_loss = criteriond1(dis_fake_out,
                                    Variable(torch.zeros(batch_size,1)))
        dis_fake_loss.backward()
        optimizerd1.step()
    # training generator
    for g_step in range(g_steps):
        gen.zero_grad()
        #generating data for input for generator
        gen_inp = make_some_noise()
        gen_out = gen(gen_inp)
        dis_out_gen_training = dis(gen_out)
        gen_loss = criteriond2(dis_out_gen_training,
                               Variable(torch.ones(batch_size,1)))
        gen_loss.backward()
        optimizerd2.step()
    if epoch%printing_steps==0:
        plot_img(gen_out[0])
        plot_img(gen_out[1])
        plot_img(gen_out[2])
        plot_img(gen_out[3])
        print("\n\n")
On running the code, the following error is shown:
File "mygan.py", line 105, in <module>
dis_real_out = dis(inp_real_x)
RuntimeError: size mismatch, m1: [4 x 784], m2: [4 x 784] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:136
How can I resolve this?
I got the code from https://blog.usejournal.com/train-your-first-gan-model-from-scratch-using-pytorch-9b72987fd2c0
The error hints that the tensor you fed into the discriminator has incorrect shape. Now let's try to find out what the shape of the tensor is, and what shape is expected.
The tensor itself has a shape of [batch_size x 784] because of the reshape operation above. The discriminator network, on the other hand, expects a tensor with a last dimension of 4. This is because the first layer in the discriminator network is nn.Linear(inp, 784), where inp = 4.
A linear layer nn.Linear(input_size, output_size), expects the final dimension of the input tensor to be equal to input_size, and generates output with the final dimension projected to output_size. In this case, it expects an input tensor of shape [batch_size x 4], and outputs a tensor of shape [batch_size x 784].
And now to the real issue: the generator and discriminator that you defined have incorrect sizes. You seem to have changed the 300 dimension size from the blog post to 784, which I assume is the size of your image (28 x 28 for MNIST). However, the 300 is not the input size, but rather a "hidden state size": the model uses a 300-dimensional vector to encode your input image.
What you should do here is to set the input size to 784, and the output size to 1, because the discriminator makes a binary judgment of fake (0) or real (1). For the generator, the input size should be equal to the "input noise" that you randomly generate, in this case 100. The output size should also be 784, because its output is the generated image, which should be the same size as the real data.
So, you only need to make the following changes to your code, and it should run smoothly:
gen = generator(100, 784)
dis = discriminator(784, 1)
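The resulting shapes can be checked with a stripped-down pair of networks. This is only a sketch under assumed layer sizes: the single hidden layer of 300 units follows the "hidden state size" mentioned above, and the two Sequential models are simplified stand-ins for the generator and discriminator classes in the question, not a reproduction of them.

```python
import torch
import torch.nn as nn

batch_size, noise_dim, image_dim = 4, 100, 28 * 28

# Simplified stand-ins: noise (100) -> image (784), image (784) -> score (1)
gen = nn.Sequential(nn.Linear(noise_dim, 300), nn.ReLU(),
                    nn.Linear(300, image_dim))
dis = nn.Sequential(nn.Linear(image_dim, 300), nn.ReLU(),
                    nn.Linear(300, 1), nn.Sigmoid())

noise = torch.rand(batch_size, noise_dim)
fake = gen(noise)                 # [4, 784], same size as a flattened image
score = dis(fake)                 # [4, 1], one real/fake probability each

# The target shape [batch_size, 1] now matches the discriminator output.
loss = nn.BCELoss()(score, torch.ones(batch_size, 1))
print(fake.shape, score.shape, loss.item())
```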

RuntimeError: expected stride to be a single integer value

I am new to PyTorch, so sorry for the basic question. The model gives me a dimension mismatch error. How can I solve this?
There may be more than one problem in it.
Any help would be appreciated.
Thanks
class PR(nn.Module):
    def __init__(self):
        super(PR, self).__init__()
        self.conv1 = nn.Conv2d(3,6,kernel_size=5)
        self.conv2 = nn.Conv2d(6,1,kernel_size=2)
        self.dens1 = nn.Linear(300, 256)
        self.dens2 = nn.Linear(256, 256)
        self.dens3 = nn.Linear(512, 24)
        self.drop = nn.Dropout()
    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(x)
        out = self.dens1(x)
        out = self.dens2(x)
        out = self.dens3(x)
        return out
model = PR()
input = torch.rand(28,28,3)
output = model(input)
Please have a look at the corrected code. I numbered the lines where I did corrections and described them below.
class PR(torch.nn.Module):
    def __init__(self):
        super(PR, self).__init__()
        self.conv1 = torch.nn.Conv2d(3,6, kernel_size=5) # (2a) in 3x28x28 out 6x24x24
        self.conv2 = torch.nn.Conv2d(6,1, kernel_size=2) # (2b) in 6x24x24 out 1x23x23 (6)
        self.dens1 = torch.nn.Linear(529, 256) # (3a)
        self.dens2 = torch.nn.Linear(256, 256)
        self.dens3 = torch.nn.Linear(256, 24) # (4)
        self.drop = torch.nn.Dropout()
    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out) # (5)
        out = out.view(-1, 529) # (3b)
        out = self.dens1(out)
        out = self.dens2(out)
        out = self.dens3(out)
        return out
model = PR()
ins = torch.rand(1, 3, 28, 28) # (1)
output = model(ins)
(1) First of all, PyTorch handles image tensors (you perform a 2D convolution, so I assume this is an image input) as follows: [batch_size x image_depth x height x width]
(2) It is important to understand how convolution works with kernel size, padding and stride. In your case kernel_size is 5 and you have no padding (and stride 1). This means that the dimensions of the feature map get reduced. In your case the first conv layer takes a 3x28x28 tensor and produces a 6x24x24 tensor; the second one takes 6x24x24 and produces 1x23x23. I find it very useful to put comments with the input and output tensor dimensions next to the definition of the conv layers (see the code above).
(3) Here you need to "flatten" the [batch_size x depth x height x width] tensor to [batch_size x fully connected input]. This can be done via tensor.view().
(4) The input size of the linear layer was wrong.
(5) Each operation in the forward pass took the input value x; I think you want to pass the result of each layer to the next one instead.
(6) Although this code is now runnable, that does not mean it makes perfect sense. The most important things (for neural networks in general, I would say) are activation functions. These are missing completely.
For getting started with neural networks in pytorch I can highly recommend the great pytorch tutorials: https://pytorch.org/tutorials/ (I would start with the 60min blitz tutorial)
Hope this helps!
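The 28 -> 24 -> 23 reduction and the 529 input size of dens1 follow from the standard output-size formula for convolutions, out = floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1. Below is a small sketch that applies it and cross-checks the result against PyTorch; the helper name conv_out is made up for this example.

```python
import torch
import torch.nn as nn

def conv_out(size, kernel_size, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

h1 = conv_out(28, kernel_size=5)   # 24
h2 = conv_out(h1, kernel_size=2)   # 23
flat_features = 1 * h2 * h2        # 1 channel * 23 * 23 = 529
print(h1, h2, flat_features)

# Cross-check against the actual layers
conv1 = nn.Conv2d(3, 6, kernel_size=5)
conv2 = nn.Conv2d(6, 1, kernel_size=2)
out = conv2(conv1(torch.rand(1, 3, 28, 28)))
print(out.shape)                   # torch.Size([1, 1, 23, 23])
```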
There are a few problems with your code. I've reviewed and corrected it below:
class PR(nn.Module):
    def __init__(self):
        super(PR, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5)
        self.conv2 = nn.Conv2d(6, 1, kernel_size=2)
        # 300 does not match the shape of the previous layer's output,
        # for the specified input, the output of conv2 is [1, 1, 23, 23]
        # this output should be flattened before feeding it to the dense layers
        # the shape then becomes [1, 529], which should match the input shape of dens1
        # self.dens1 = nn.Linear(300, 256)
        self.dens1 = nn.Linear(529, 256)
        self.dens2 = nn.Linear(256, 256)
        # The input should match the output of the previous layer, which is 256
        # self.dens3 = nn.Linear(512, 24)
        self.dens3 = nn.Linear(256, 24)
        self.drop = nn.Dropout()
    def forward(self, x):
        # The output of each layer should be fed to the next layer
        x = self.conv1(x)
        x = self.conv2(x)
        # The output should be flattened before feeding it to the dense layers
        x = x.view(x.size(0), -1)
        x = self.dens1(x)
        x = self.dens2(x)
        x = self.dens3(x)
        return x
model = PR()
# The input shape should be (N,Cin,H,W)
# where N is the batch size, Cin is input channels, H and W are height and width respectively
# so the input should be torch.rand(1,3,28,28)
# input = torch.rand(28,28,3)
input = torch.rand(1, 3, 28, 28)
output = model(input)
Let me know if you have any follow-up questions.
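Both corrected versions run, but neither applies an activation function, so the stack of layers is still a purely linear map. The variant below is one possible way to add non-linearities and to use the dropout layer that the original model defines but never calls. The choice of ReLU, the placement of dropout, and the class name PRWithActivations are assumptions made for illustration, not something either answer prescribes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PRWithActivations(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5)   # 3x28x28 -> 6x24x24
        self.conv2 = nn.Conv2d(6, 1, kernel_size=2)   # 6x24x24 -> 1x23x23
        self.dens1 = nn.Linear(529, 256)
        self.dens2 = nn.Linear(256, 256)
        self.dens3 = nn.Linear(256, 24)
        self.drop = nn.Dropout()

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = torch.flatten(x, 1)                # keep the batch dimension
        x = self.drop(F.relu(self.dens1(x)))
        x = self.drop(F.relu(self.dens2(x)))
        return self.dens3(x)                   # raw scores, no activation

model = PRWithActivations()
output = model(torch.rand(2, 3, 28, 28))
print(output.shape)                            # torch.Size([2, 24])
```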
