Access lower dimensional encoded data of autoencoder - python

Here is an autoencoder I’m working on, from this tutorial: https://debuggercafe.com/implementing-deep-autoencoder-in-pytorch/
I’m just learning about autoencoders, and I’ve modified the source to encode a small custom dataset consisting of:
[0,1,0,1,0,1,0,1,0], [0,1,1,0,0,1,0,1,0], [0,1,1,0,0,1,0,1,0], [0,1,1,0,0,1,0,1,0]
It seems to work OK, but I’m unsure how to access the lower-dimensional embedding values of dimension 2 (set by the out_features parameter).
I’ve added a method to the Autoencoder class to return the embedding; is this the recommended way of accessing the embeddings?
Code:
import os
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.utils import save_image
# utility functions
def get_device():
    if torch.cuda.is_available():
        device = 'cuda:0'
    else:
        device = 'cpu'
    return device
device = get_device()
features = torch.tensor(np.array([ [0,1,0,1,0,1,0,1,0],[0,1,1,0,0,1,0,1,0],[0,1,1,0,0,1,0,1,0],[0,1,1,0,0,1,0,1,0] ])).float()
tic_tac_toe_data_loader = torch.utils.data.DataLoader(features, batch_size=1, shuffle=True)
class Encoder(nn.Module):
    def __init__(self):
        super(Encoder, self).__init__()
        self.fc1 = nn.Linear(in_features=9, out_features=2)

    def forward(self, x):
        return F.sigmoid(self.fc1(x))

class Decoder(nn.Module):
    def __init__(self):
        super(Decoder, self).__init__()
        self.fc1 = nn.Linear(in_features=2, out_features=9)

    def forward(self, x):
        return F.sigmoid(self.fc1(x))

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.fc1 = Encoder()
        self.fc2 = Decoder()

    def forward(self, x):
        return self.fc2(self.fc1(x))
net = Autoencoder()
net.to(device)
NUM_EPOCHS = 50
LEARNING_RATE = 1e-3
criterion = nn.MSELoss()
optimizer = optim.Adam(net.parameters(), lr=LEARNING_RATE)
# image transformations
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
outputs = None
def train(net, trainloader, NUM_EPOCHS):
    train_loss = []
    for epoch in range(NUM_EPOCHS):
        running_loss = 0.0
        for data in trainloader:
            img = data
            img = img.to(device)
            img = img.view(img.size(0), -1)
            # print('img.shape', img.shape)
            optimizer.zero_grad()
            outputs = net(img)
            loss = criterion(outputs, img)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        loss = running_loss / len(trainloader)
        train_loss.append(loss)
    return train_loss
# train the network
train_loss = train(net, tic_tac_toe_data_loader, NUM_EPOCHS)
I can access the lower dimensional embedding using
print(Encoder().forward( torch.tensor(np.array([0,1,0,1,0,1,0,1,0])).float()))
But is this using the trained weight values for the embedding? If I call Encoder multiple times with the same values:
print(Encoder().forward( torch.tensor(np.array([0,1,0,1,0,1,0,1,0])).float()))
print(Encoder().forward( torch.tensor(np.array([0,1,0,1,0,1,0,1,0])).float()))
different results are returned:
tensor([0.5083, 0.5020], grad_fn=<SigmoidBackward>)
tensor([0.4929, 0.6940], grad_fn=<SigmoidBackward>)
Why is this the case? Is an extra training step being invoked as a result of calling Encoder?

By calling Encoder() you are creating a new instance of the encoder every time, and its weights are randomly initialized each time.
Generally, you make one instance, train it, save the weights, and then infer with it.
Also, in PyTorch you need not call .forward(); call the instance directly. forward is invoked implicitly, along with any registered hooks:
enc = Encoder()
input = torch.from_numpy(np.asarray([0,1,0,1,0,1,0,1,0])).float()
print(enc(input))
print(enc(input))
The training pass happens when you pass the instance to the train function; calling Encoder() only creates a new object.
Since each object has its own weights, and the weights are initialized randomly (see Xavier and Kaiming initialization), you get different outputs. Even after moving to a single object, you still have to explicitly train it with the train function.
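For completeness, a minimal sketch of the save-then-infer workflow mentioned above (the file name autoencoder.pth is just an example):
# save the trained weights, then reload them into a fresh instance for inference
torch.save(net.state_dict(), 'autoencoder.pth')
net2 = Autoencoder()
net2.load_state_dict(torch.load('autoencoder.pth'))
net2.eval()  # switch to inference mode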

As the other responder pointed out, when you call Encoder() you generate a new instance with randomly initialized weights. Because you are interested in the lower-dimensional embedding produced by your encoder, you need to access the trained encoder inside your trained net:
trained_encoder = net.fc1
Now that you have your encoder with trained weights, the following lines should produce the same result:
print(trained_encoder.forward( torch.tensor(np.array([0,1,0,1,0,1,0,1,0])).float()))
print(trained_encoder.forward( torch.tensor(np.array([0,1,0,1,0,1,0,1,0])).float()))
As pointed out by others, you can further simplify by passing the input directly:
test_input = torch.tensor(np.array([0,1,0,1,0,1,0,1,0])).float()
print(trained_encoder(test_input))
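If you only need the embedding values for inspection, you can additionally disable gradient tracking; a small optional sketch:
with torch.no_grad():  # the result then carries no grad_fn
    embedding = trained_encoder(test_input)
print(embedding)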

Related

Simple Neural Network in Pytorch with 3 inputs (Numerical Values)

I'm having a hard time setting up a neural network; most of the examples are for images. My problem has 3 inputs, each of size N x M, where N is the number of samples and M is the number of features. I have a separate file (CSV) with a 1 x N binary target (0, 1).
The network I'm trying to configure should have two hidden layers with 100 and 50 neurons, respectively, a sigmoid activation function, and cross-entropy to check performance. The result should just be a single probability output.
Please help?
EDIT:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.autograd as autograd
import torch.nn.functional as F
#from torch.autograd import Variable
import pandas as pd
# Import Data
Input1 = pd.read_csv(r'...')
Input2 = pd.read_csv(r'...')
Input3 = pd.read_csv(r'...')
Target = pd.read_csv(r'...' )
# Convert to Tensor
Input1_tensor = torch.tensor(Input1.to_numpy()).float()
Input2_tensor = torch.tensor(Input2.to_numpy()).float()
Input3_tensor = torch.tensor(Input3.to_numpy()).float()
Target_tensor = torch.tensor(Target.to_numpy()).float()
# Transpose to have signal as columns instead of rows
input1 = Input1_tensor
input2 = Input2_tensor
input3 = Input3_tensor
y = Target_tensor
# Define the model
class Net(nn.Module):
    def __init__(self, num_inputs, hidden1_size, hidden2_size, num_classes):
        # Initialize super class
        super(Net, self).__init__()
        #self.criterion = nn.CrossEntropyLoss()
        # Add hidden layer
        self.layer1 = nn.Linear(num_inputs, hidden1_size)
        # Activation
        self.sigmoid = torch.nn.Sigmoid()
        # Add output layer
        self.layer2 = nn.Linear(hidden1_size, hidden2_size)
        # Activation
        self.sigmoid2 = torch.nn.Sigmoid()
        self.layer3 = nn.Linear(hidden2_size, num_classes)

    def forward(self, x1, x2, x3):
        # implement the forward pass
        in1 = self.layer1(x1)
        in2 = self.layer1(x2)
        in3 = self.layer1(x3)
        xyz = torch.cat((in1, in2, in3), 1)
        return xyz
# Define loss function
loss_function = nn.CrossEntropyLoss()
# Define optimizer
optimizer = optim.SGD(model.parameters(), lr=1e-4)
for t in range(num_epochs):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(input1, input2, input3)
    # Compute and print loss
    loss = loss_function(y_pred, y)
    print(t, loss.item())
    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    # Calculate gradient using backward pass
    loss.backward()
    # Update model parameters (weights)
    optimizer.step()
Here I am getting the error
RuntimeError: 0D or 1D target tensor expected, multi-target not supported
for the line loss = loss_function(y_pred, y), where y_pred is [20000, 375] and y is [20000, 1].
You can refer to PyTorch, a Python library for deep learning and neural networks, and you can use the code below, which defines the network:
from torch import nn
import torch.nn.functional as F

class network(nn.Module):
    def __init__(self, M):
        # M is the dimension of the input feature
        super(network, self).__init__()
        self.layer1 = nn.Linear(M, 100)
        self.layer2 = nn.Linear(100, 50)
        self.out = nn.Linear(50, 1)

    def forward(self, x):
        return F.sigmoid(self.out(self.layer2(self.layer1(x))))
You can then refer to the PyTorch documentation to finish the rest of the training code.
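For reference, a minimal training-loop sketch with dummy placeholder data (N, M, the learning rate, and the epoch count below are arbitrary assumptions); since the network above already ends in a sigmoid, binary cross-entropy (nn.BCELoss) is the matching criterion for a single probability output:
import torch
from torch import nn

# dummy placeholder data: N samples, M features, 0/1 float targets
N, M = 1000, 128
x = torch.randn(N, M)
y = torch.randint(0, 2, (N, 1)).float()

net = network(M)  # the class defined above
criterion = nn.BCELoss()  # binary cross-entropy on sigmoid outputs
optimizer = torch.optim.SGD(net.parameters(), lr=1e-3)
for epoch in range(100):
    optimizer.zero_grad()
    pred = net(x)  # probabilities of shape [N, 1]
    loss = criterion(pred, y)
    loss.backward()
    optimizer.step()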
Edit:
As for the RuntimeError, you can squeeze the target tensor with y.squeeze(). This removes the redundant dimension in your tensor, e.g. [20000, 1] -> [20000].
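A concrete sketch of that fix (the additional cast to long is my assumption, since nn.CrossEntropyLoss expects integer class indices as targets):
loss = loss_function(y_pred, y.squeeze().long())  # y: [20000, 1] -> [20000]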

Pytorch GRU Trained on one class to Predict Unlabelled Data

I am creating a GRU to predict whether data derived from traffic packets from a device is safe or anomalous. I plan to do this by training a model only on safe/normal operating data and then having it classify new, unseen traffic (testing). I wish to train on only the safe data (one class) because an attack could take many forms, and I don't want to train the model on labeled attack data and then have it miss an attack type I didn't train it on (basically I want to overfit on the normal operating data). As such, I need it to check whether incoming unlabeled data matches the one class it has already trained on (i.e. whether the incoming data matches the normal operation of the device) or is anomalous.
The issue I am having is that, because the model is trained on only one class, it has trouble differentiating anomalous unseen data from normal data and considers virtually all data it sees as normal (the same as the class it trained on).
I would appreciate it if anyone has any ideas or could point out flaws in the way I have implemented my model.
# Imports
import pandas as pd
import numpy as np
import torch
import torchvision # torch package for vision related things
import torch.nn.functional as F # Parameterless functions, like (some) activation functions
import torchvision.datasets as datasets # Standard datasets
import torchvision.transforms as transforms # Transformations we can perform on our dataset for augmentation
from torch import optim # For optimizers like SGD, Adam, etc.
from torch import nn # All neural network modules
from torch.utils.data import Dataset, DataLoader # Gives easier dataset management by creating mini batches etc.
from tqdm import tqdm # For a nice progress bar
from sklearn.preprocessing import StandardScaler
# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Hyperparameters
input_size = 24
hidden_size = 128
num_layers = 1
num_classes = 2
sequence_length = 1
learning_rate = 0.005
batch_size = 8
num_epochs = 5
# Recurrent neural network with GRU (many-to-one)
class RNN_GRU(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(RNN_GRU, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.gru = nn.GRU(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size * sequence_length, num_classes)

    def forward(self, x):
        # Set initial hidden and cell states
        x = x.unsqueeze(1)
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(device)
        # Forward propagate GRU
        out, _ = self.gru(x, h0)
        out = out[:, -1, :]
        # Decode the hidden state of the last time step
        out = self.fc(out)
        return out
class MyDataset(Dataset):
    def __init__(self, file_name):
        stats_df = pd.read_csv(file_name)
        x = stats_df.iloc[:, 0:24].values
        y = stats_df.iloc[:, 24].values
        self.x_train = torch.tensor(x, dtype=torch.float32)
        self.y_train = torch.tensor(y, dtype=torch.float32)

    def __len__(self):
        return len(self.y_train)

    def __getitem__(self, idx):
        return self.x_train[idx], self.y_train[idx]
nomDs=MyDataset("nomStats.csv")
atkDs=MyDataset("atkStats.csv")
train_loader=DataLoader(dataset=nomDs,batch_size=batch_size)
test_loader=DataLoader(dataset=atkDs,batch_size=batch_size)
# Initialize network
model = RNN_GRU(input_size, hidden_size, num_layers, num_classes).to(device)
# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
# Train Network
for epoch in range(num_epochs):
    for batch_idx, (data, targets) in enumerate(tqdm(train_loader)):
        # Get data to cuda if possible
        data = data.to(device=device).squeeze(1)
        targets = targets.to(device=device)
        targets = targets.to(dtype=torch.long)
        # forward
        scores = model(data)
        loss = criterion(scores, targets)
        # backward
        optimizer.zero_grad()
        loss.backward()
        # gradient descent update step/adam step
        optimizer.step()
# Check accuracy on training & test to see how good our model
def check_accuracy(loader, model):
    num_correct = 0
    num_samples = 0
    # Set model to eval
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            x = x.to(device=device).squeeze(1)
            y = y.to(device=device)
            scores = model(x)
            _, predictions = scores.max(1)
            num_correct += (predictions == y).sum()
            num_samples += predictions.size(0)
    # Toggle model back to train
    model.train()
    return num_correct / num_samples
print(f"Accuracy on training set: {check_accuracy(train_loader, model)*100:.2f}%")
print(f"Accuracy on test set: {check_accuracy(test_loader, model)*100:.2f}%")

Issue running backward function after calculating loss

import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import transforms, datasets
#neural network class
class Net(nn.Module):
    #initialize class
    def __init__(self):
        super().__init__()
        #feedforward neural network passes data from input layer to output layer
        #fully connected layer with input shape of 28*28 pixels (flatten image to one row) and output feature of size 64. Linear means flat layer
        self.fc1 = nn.Linear(28*28, 64)
        #feed in data from fc1 to fc2
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 64)
        #output layer has input 64 and output of size 10 to represent the 10 classes in MNIST
        self.fc4 = nn.Linear(64, 10)

    #forward pass through the data
    def forward(self, x):
        #relu is the activation function and operates on the input data
        #input and output dimension of relu are the same
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        #softmax gets a probability distribution over the classes, summing to 1 for the output layer
        x = F.log_softmax(self.fc4(x), dim=1)
        return x
#declare a model
net = Net()
#print(net)
#passing in random data
x = torch.rand(28,28)
#resize to represent input shape (batch size, input x, input y)
x = x.view(-1,28,28)
#print(x)
#optimizer adjusts neural network based on error calculation
import torch.optim as optim
#net.parameters() means all the adjustable parts of the neural network, learning rate is amount of change (we don't want model to swerve based on one train case)
optimizer = optim.Adam(net.parameters(),lr=0.001)
#get datasets using torchvision.datasets; transforms are operations applied to the data (here, conversion to tensors)
train = torchvision.datasets.MNIST("", train=True, download=True,transform=transforms.Compose([transforms.ToTensor()]))
test = torchvision.datasets.MNIST("", train=False, download=True,transform=transforms.Compose([transforms.ToTensor()]))
#store in data loader, batch size is how many samples are passed through the model at once (in GPU memory), best batch size is between 8-64
#shuffling avoids feeding too much of one kind of image and leads to more generalization
trainset = torch.utils.data.DataLoader(train,batch_size=10,shuffle=True)
testset = torch.utils.data.DataLoader(test,batch_size=10,shuffle=True)
#full pass through data is epoch
EPOCHS = 3
for epoch in range(EPOCHS):
    #data is a batch of data in the training set
    for data in trainset:
        #split into features and labels
        features, labels = data
        #print(features, labels)
        #reset the gradient for next passes to avoid convoluting the results of multiple backpropagations
        net.zero_grad()
        #pass data into network (make sure input shape matches)
        output = net(features.view(-1, 28*28))
        #compute error (output, expected)
        loss = F.nll_loss(output, labels)
        print("loss is the ", loss)
        #backpropagate loss through trainable parameters of the model
        loss.backward()
        #adjust neural network
        optimizer.step()
I am using PyTorch on Google Colab. The error message says that the gradient isn't there.
I am unsure where the error stems from. I used this tutorial: https://www.youtube.com/watch?v=9j-_dOze4IM&list=PLQVvvaa0QuDdeMyHEYc0gxFpYwHY2Qfdh&index=4
[screenshot: error message from Google Colab]

How to get grads in pytorch after matrix multiplication?

I want to get the product of a matrix multiplication in the latent space and optimize the weight matrix via the optimizer. I have tried several ways to do this, but the value of pi_ in the code below never changes. What should I do?
I've tried different functions to compute the product, like torch.mm(), torch.matmul(), and the @ operator. The weight matrix pi_ never changed.
import torch
from torch.utils.data import DataLoader
from torch.utils.data import TensorDataset
#from torchvision import transforms
from torchvision.datasets import MNIST
def get_mnist(data_dir='./data/mnist/', batch_size=128):
    train = MNIST(root=data_dir, train=True, download=True)
    test = MNIST(root=data_dir, train=False, download=True)
    X = torch.cat([train.data.float().view(-1, 784)/255., test.data.float().view(-1, 784)/255.], 0)
    Y = torch.cat([train.targets, test.targets], 0)
    dataset = dict()
    dataset['X'] = X
    dataset['Y'] = Y
    dataloader = DataLoader(TensorDataset(X, Y), batch_size=batch_size, shuffle=True)
    return dataloader

class tests(torch.nn.Module):
    def __init__(self):
        super(tests, self).__init__()
        self.pi_ = torch.nn.Parameter(torch.FloatTensor(10, 1).fill_(1), requires_grad=True)
        self.linear0 = torch.nn.Linear(784, 10)
        self.linear1 = torch.nn.Linear(1, 784)

    def forward(self, data):
        data = torch.nn.functional.relu(self.linear0(data))
        # data = data.mm(self.pi_)
        # data = torch.mm(data, self.pi_)
        # data = data @ self.pi_
        data = torch.matmul(data, self.pi_)
        data = torch.nn.functional.relu(self.linear1(data))
        return data

if __name__ == '__main__':
    DL = get_mnist()
    t = tests().cuda()
    optimizer = torch.optim.Adam(t.parameters(), lr=2e-3)
    for i in range(100):
        for inputs, classes in DL:
            inputs = inputs.cuda()
            res = t(inputs)
            loss = torch.nn.functional.mse_loss(res, inputs)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print("Epoch:", i, "pi:", t.pi_)
TL;DR
You have too many parameters in your neural network; some of them become useless, and therefore they are no longer being updated. Change your network architecture to reduce the number of useless parameters.
Full explanation:
The weight matrix pi_ does change. You initialize pi_ as all ones; after running the first epoch, the weight matrix pi_ becomes
output >>>
tensor([[0.9879],
[0.9874],
[0.9878],
[0.9880],
[0.9876],
[0.9878],
[0.9878],
[0.9873],
[0.9877],
[0.9871]], device='cuda:0', requires_grad=True)
So, it has changed once. The true reason behind it involves some mathematics, but to put it in non-mathematical terms: this layer doesn't contribute much to the loss, so the network decided not to update it. That is, the existence of pi_ in this network is redundant.
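As a quick diagnostic sketch (not part of the fix), you can confirm this by printing the gradient reaching pi_ right after the backward pass in the training loop; near-zero values mean the parameter barely affects the loss:
loss.backward()
print("pi_ grad:", t.pi_.grad)  # near-zero gradient => Adam barely updates pi_
optimizer.step()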
If you want to observe changes in pi_, you should modify the neural network so that pi_ is no longer redundant.
One possible modification is to change your reconstruction problem into a classification problem:
import torch
from torch.utils.data import DataLoader
from torch.utils.data import TensorDataset
#from torchvision import transforms
from torchvision.datasets import MNIST

def get_mnist(data_dir='./data/mnist/', batch_size=128):
    train = MNIST(root=data_dir, train=True, download=True)
    test = MNIST(root=data_dir, train=False, download=True)
    X = torch.cat([train.data.float().view(-1, 784)/255., test.data.float().view(-1, 784)/255.], 0)
    Y = torch.cat([train.targets, test.targets], 0)
    dataset = dict()
    dataset['X'] = X
    dataset['Y'] = Y
    dataloader = DataLoader(TensorDataset(X, Y), batch_size=batch_size, shuffle=True)
    return dataloader

class tests(torch.nn.Module):
    def __init__(self):
        super(tests, self).__init__()
        # self.pi_ = torch.nn.Parameter(torch.randn((10, 1), requires_grad=True))
        self.pi_ = torch.nn.Parameter(torch.FloatTensor(10, 1).fill_(1), requires_grad=True)
        self.linear0 = torch.nn.Linear(784, 10)
        # self.linear1 = torch.nn.Linear(1, 784)

    def forward(self, data):
        data = torch.nn.functional.relu(self.linear0(data))
        # data = data.mm(self.pi_)
        # data = torch.mm(data, self.pi_)
        # data = data @ self.pi_
        data = torch.matmul(data, self.pi_)
        # data = torch.nn.functional.relu(self.linear1(data))
        return data

if __name__ == '__main__':
    DL = get_mnist()
    t = tests().cuda()
    optimizer = torch.optim.Adam(t.parameters(), lr=2e-3)
    for i in range(100):
        for inputs, classes in DL:
            inputs = inputs.cuda()
            classes = classes.cuda().float()
            output = t(inputs)
            loss = torch.nn.functional.mse_loss(output.view(-1), classes)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # print("Epoch:", i, "pi_grad", t.pi_.grad)
        print("Epoch:", i, "pi:", t.pi_)
Now, pi_ changes every single epoch.
output >>>
Epoch: 0 pi: Parameter containing:
tensor([[1.3429],
[1.0644],
[0.9817],
[0.9767],
[0.9715],
[1.1110],
[1.1139],
[0.9759],
[1.2424],
[1.2632]], device='cuda:0', requires_grad=True)
Epoch: 1 pi: Parameter containing:
tensor([[1.4413],
[1.1977],
[0.9588],
[1.0325],
[0.9241],
[1.1988],
[1.1690],
[0.9248],
[1.2892],
[1.3427]], device='cuda:0', requires_grad=True)
Epoch: 2 pi: Parameter containing:
tensor([[1.4653],
[1.2351],
[0.9539],
[1.1588],
[0.8670],
[1.2739],
[1.2058],
[0.8648],
[1.2848],
[1.3891]], device='cuda:0', requires_grad=True)
Epoch: 3 pi: Parameter containing:
tensor([[1.4375],
[1.2256],
[0.9580],
[1.2293],
[0.8174],
[1.3471],
[1.2035],
[0.8102],
[1.2505],
[1.4201]], device='cuda:0', requires_grad=True)

PyTorch naive single label classification with embedding layer fails at random

I am new to PyTorch and I am trying out the Embedding Layer.
I wrote a naive classification task, where all the inputs are equal and all the labels are set to 1.0. I hence expect the model to learn quickly to predict 1.0.
The input is always 0, which is fed into an nn.Embedding(1, 32) layer, followed by nn.Linear(32, 1) and nn.ReLU().
However, unexpected and undesired behavior occurs: the training outcome differs between runs of the code.
For example,
setting the random seed to 10, the model converges: the loss decreases and the model always predicts 1.0
setting the random seed to 1111, the model doesn't converge: the loss doesn't decrease and the model always predicts 0.5. In those cases the parameters are not updated
Here is the minimal, replicable code:
from torch.nn import BCEWithLogitsLoss
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torch.autograd import Variable
from torch.utils.data import Dataset
import torch

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.vgg_fc = nn.Linear(32, 1)
        self.relu = nn.ReLU()
        self.embeddings = nn.Embedding(1, 32)

    def forward(self, data):
        emb = self.embeddings(data['index'])
        return self.relu(self.vgg_fc(emb))

class MyDataset(Dataset):
    def __init__(self):
        pass

    def __len__(self):
        return 1000

    def __getitem__(self, idx):
        return {'label': 1.0, 'index': 0}

def train():
    model = MyModel()
    db = MyDataset()
    dataloader = DataLoader(db, batch_size=256, shuffle=True, num_workers=16)
    loss_function = BCEWithLogitsLoss()
    optimizer_rel = optim.SGD(model.parameters(), lr=0.1)
    for epoch in range(50):
        for i_batch, sample_batched in enumerate(dataloader):
            model.zero_grad()
            out = model({'index': Variable(sample_batched['index'])})
            labels = Variable(sample_batched['label'].type(torch.FloatTensor).view(sample_batched['label'].shape[0], 1))
            loss = loss_function(out, labels)
            loss.backward()
            optimizer_rel.step()
            print('Epoch:', epoch, 'batch', i_batch, 'Tr_Loss:', loss.item())
    return model

if __name__ == '__main__':
    # please, try seed 10 (converges) and seed 1111 (fails)
    torch.manual_seed(10)
    train()
Without specifying the random seed, different runs have different outcomes.
Why is the model, in those cases, unable to learn such an easy task?
Is there any mistake in the way I use nn.Embedding layer?
Thank you
I found the problem was the final ReLU layer, before the sigmoid.
As stated here, that layer will:
throw away information without adding any additional benefit
Removing the layer, the network learned as expected with any seed.
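Concretely, the fix (a sketch of my understanding of it) is a one-line change to the forward pass: since BCEWithLogitsLoss applies the sigmoid internally, the model should return the raw logit:
def forward(self, data):
    emb = self.embeddings(data['index'])
    # return the raw logit: BCEWithLogitsLoss applies the sigmoid itself,
    # and without the ReLU negative logits are no longer clamped to 0 (i.e. stuck at sigmoid(0) = 0.5)
    return self.vgg_fc(emb)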
