How can I apply an inverse sigmoid function to the last layer of my convolutional neural network?
I am building the network in PyTorch, and I want to take the output of the last convolutional layer and apply the inverse sigmoid function to it.
I have read that the logit function is the inverse of the sigmoid function, and I tried implementing it, but it's not working.
I used the logit function from the scipy library inside the following method:
from scipy.special import logit

def InverseSigmoid(self, x):
    x = logit(x)
    return x
The sigmoid is just 1 / (1 + e**-x), so its inverse is -ln((1 / x) - 1). For numerical stability, you can use -ln((1 / (x + 1e-8)) - 1) instead, which avoids dividing by zero when x is 0. The implementation is straightforward.
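As a minimal sketch of that formula in PyTorch: the function name inverse_sigmoid and the eps value are assumptions made for illustration, not part of the question. It uses the equivalent form ln(x) - ln(1 - x) and clamps the input so neither logarithm receives zero. Because it is built only from torch operations, it stays inside autograd, which scipy's NumPy-based logit does not; that is likely part of why the original attempt failed.

```python
import torch

def inverse_sigmoid(y, eps=1e-6):
    """Return x such that sigmoid(x) is approximately y, for y in (0, 1)."""
    # Clamp away from 0 and 1 so neither log receives zero (eps is an assumed tolerance).
    y = y.clamp(min=eps, max=1 - eps)
    # ln(y) - ln(1 - y) is algebraically the same as -ln(1/y - 1).
    return torch.log(y) - torch.log1p(-y)

x = torch.tensor([-3.0, 0.0, 2.5])
recovered = inverse_sigmoid(torch.sigmoid(x))  # round trip should give back x
```

PyTorch also provides torch.logit, which takes an optional eps argument for the same clamping, so the hand-written version is mainly useful for seeing what happens.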
Related
I am using the binary cross-entropy loss function to train a neural network. With the PyTorch library, I am calling nn.BCELoss(). I am trying to compute the Armijo step length. The condition for checking the Armijo step length is:
f(x + lr*v) <= f(x) + c*ß*func_gradient, where lr is the learning rate and v is the descent direction
I have all the parameters, but how do I evaluate f(x) or f(x + something) when the loss function is binary cross-entropy?
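One way to read f(x) here is as the BCE loss on a batch, with the network's parameters playing the role of x. The sketch below works under that assumption. The toy model, the data, and the constants c and shrink are hypothetical, and it uses the standard form of the Armijo condition, in which the step size multiplies the directional derivative grad . v. To evaluate f(x + lr*v), it temporarily writes the shifted parameters into the model and calls the loss again.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy data and model; any module ending in a sigmoid would do.
X = torch.randn(64, 3)
y = (X[:, :1] > 0).float()
model = nn.Sequential(nn.Linear(3, 1), nn.Sigmoid())
criterion = nn.BCELoss()

def f():
    """BCE loss at the model's current parameters, without tracking gradients."""
    with torch.no_grad():
        return criterion(model(X), y).item()

# f(x) and its gradient at the current parameters x.
params = list(model.parameters())
loss = criterion(model(X), y)
grads = torch.autograd.grad(loss, params)
f_x = loss.item()

# Steepest-descent direction v = -grad, so grad . v = -||grad||^2 (negative).
directions = [-g for g in grads]
slope = sum((g * v).sum() for g, v in zip(grads, directions)).item()

lr, c, shrink = 1.0, 1e-4, 0.5              # assumed constants
originals = [p.detach().clone() for p in params]
for _ in range(30):
    # Move the parameters to x + lr * v, then evaluate f there.
    with torch.no_grad():
        for p, p0, v in zip(params, originals, directions):
            p.copy_(p0 + lr * v)
    if f() <= f_x + c * lr * slope:          # Armijo sufficient-decrease test
        break
    lr *= shrink                             # backtrack and try a smaller step
```

When the loop exits through the break, the model holds the parameters at the accepted step and lr is the step length that satisfied the condition.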
I am trying to implement a neural network in PyTorch to solve an ordinary differential equation (ODE). The network architecture is straightforward: a feed-forward neural network with n inputs and outputs and k layers.
import torch

class PINN(torch.nn.Module):
    def __init__(self,n):
        super().__init__()
        # Layers
        self.L1=torch.nn.Linear(1,n)
        self.L2=torch.nn.Linear(n,n)
        self.L3=torch.nn.Linear(n,1)
        # Activation functions
        self.t=torch.nn.Tanh()
        self.r=torch.nn.ReLU()
    def forward(self,x):
        a1=self.r(self.L1(x))
        a2=self.r(self.L2(a1))
        a3=self.r(self.L3(a2))+x
        return a3
I want to minimize the loss between the gradient of the output of my neural network and the right-hand side of the ODE. I have chosen to work with a mean-squared error loss. I know that PyTorch includes a built-in MSE loss. However, I defined my own loss function since I have to pass in the gradient of a tensor.
def ODELoss(x,y,x0):
    # Number of collocation points to sample
    n=len(x)
    # Initialize the loss to zero
    loss=torch.tensor(0.,requires_grad=True)
    # Loop over the "data". Technically, this is an unsupervised problem.
    # The "data" are points sampled on the domain which are then evaluated
    # according to the ODE.
    for (xx,yy) in zip(x,y):
        xx=torch.tensor([[xx]],requires_grad=True)
        yy=torch.tensor([[yy]])
        g(xx,x0).backward()
        dg=xx.grad.clone().requires_grad_(True)
        loss=loss+(dg-yy)**2
    loss=loss/n
    return loss
Here, g(x) is called the universal predictor. It is used in the literature to account for the initial condition(s).
def g(x,x0):
    return x*model(x)+x0
This doesn't seem to work because it seems like I am not passing in the gradient of the output correctly. Can anyone give me some guidance on how to do this?
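One likely cause is that xx.grad.clone() produces a tensor that is no longer connected to the model's weights, so the loss cannot be backpropagated to them. A possible alternative, sketched below under stated assumptions, is torch.autograd.grad with create_graph=True, which keeps the derivative dg/dx differentiable with respect to the weights. The small Tanh network, the name ode_loss, and the cosine right-hand side are stand-ins chosen for illustration, not the setup from the question.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the PINN model in the question.
model = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)

def g(x, x0):
    # Trial solution from the question: g(0) = x0 by construction.
    return x * model(x) + x0

def ode_loss(x, y, x0):
    """Mean squared error between dg/dx and the ODE right-hand side y."""
    x = x.clone().requires_grad_(True)
    out = g(x, x0)
    # create_graph=True keeps dg/dx differentiable w.r.t. the model weights.
    (dg,) = torch.autograd.grad(
        out, x, grad_outputs=torch.ones_like(out), create_graph=True
    )
    return ((dg - y) ** 2).mean()

x = torch.linspace(0.0, 1.0, 20).reshape(-1, 1)  # collocation points
y = torch.cos(x)                                 # assumed right-hand side, for illustration
loss = ode_loss(x, y, x0=0.0)
loss.backward()                                  # gradients now reach the model weights
```

Each output row depends only on its own input row, so a single batched call with grad_outputs set to ones gives the per-point derivative and replaces the Python loop.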
Suppose I have the following layer:
x = Conv2D(x, activation='linear')
Is this layer trainable? The derivative of a linear function is constant, so will the weights ever get updated in this case? The same question applies to situations like:
tf.keras.activations.linear(x)  # no changes
tf.keras.activations.relu(x)  # will change
The layer is trainable.
Training looks for the function that best approximates your data. If you don't use an activation, your data will be approximated by a linear function.
E.g. if your layer is Dense(1), your data will be approximated by a line. If your data is 2D, you can plot the points, run training, and see that they are approximated by the line dense.w * x + dense.b.
The function should be differentiable (for backpropagation). A linear function is differentiable, so that is fine. The weights are updated using the gradient of the loss with respect to the weights, which depends on the layer's inputs and on the error, so it is not zero just because the activation's derivative is constant.
The loss function cannot be linear, because it needs to have a minimum, but that is not required of a layer.
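The point can be checked with a small experiment. The sketch below uses PyTorch rather than Keras, to stay in one framework, and the same reasoning should carry over. The toy data on the line y = 3x + 2 and the hyperparameters are chosen only for illustration. A single linear layer with no activation is trained with MSE, and its weight and bias move to the values that generated the data.

```python
import torch

torch.manual_seed(0)

# Toy data on the line y = 3x + 2 (values chosen only for illustration).
x = torch.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y = 3.0 * x + 2.0

layer = torch.nn.Linear(1, 1)            # no activation, i.e. a "linear" activation
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
w_before = layer.weight.item()

for _ in range(500):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(layer(x), y)
    loss.backward()                      # d(loss)/dw depends on x and on the error
    optimizer.step()

w_after, b_after = layer.weight.item(), layer.bias.item()
```

After training, w_after and b_after should be close to 3 and 2, which shows that the weights were updated even though the activation is the identity.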
I'm trying to implement a PyTorch version of Creative Adversarial Networks, a GAN with a modified/custom loss function.
Here are the formulae for the loss function. I'm using PyTorch's nn.CrossEntropyLoss for the discriminator's modified loss function, and it seems to be working, as its loss decreases over epochs. However, I don't think nn.CrossEntropyLoss is suitable for the generator: it seems to expect Long and not Float tensors, and the paper's loss function, particularly the generator's loss, seems to me like it would require floats.
This is my current (initial) thinking for the generator's custom loss:
y_dim is the number of classes
disc_class_layer = FC layer that outputs a style/class given an input image
The for loop attempts to be the equivalent of:
sum over k = 1..K of [ (1/K) * log(Dc(ck|G(z))) + (1 - 1/K) * log(1 - Dc(ck|G(z))) ]
class CanGLoss(nn.Module):
    def __init__(self,y_dim,labels,disc_class_layer):
        super(CanGLoss,self).__init__()
    def forward(self,inp):
        style_loss = 0
        for i in range(1,y_dim+1):
            style_loss += (1/i)*torch.log(disc_class_layer(inp)) + (1 - (1/i))*torch.log(1-disc_class_layer(inp))
        return style_loss*-1
Is this on the right track? I am new to custom loss functions and PyTorch, and I'm not sure this is the way to go.
Any help would be great!
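For comparison, here is a minimal vectorized sketch of the sum as written in the question. It is not the paper's reference implementation. The function name can_style_loss, the softmax over class logits, and the eps clamp are assumptions. It uses the constant weight 1/K for every class, whereas the loop above uses 1/i, which changes with the loop index.

```python
import torch
import torch.nn.functional as F

def can_style_loss(class_logits, eps=1e-7):
    """Style-ambiguity term, following the formula as written in the question.

    class_logits: (batch, K) raw outputs of the discriminator's class head for G(z).
    """
    K = class_logits.shape[1]
    # Assumption: class probabilities Dc(ck | G(z)) come from a softmax over the logits.
    p = F.softmax(class_logits, dim=1).clamp(eps, 1 - eps)
    per_class = (1.0 / K) * torch.log(p) + (1.0 - 1.0 / K) * torch.log(1 - p)
    # Sum over the K classes, average over the batch, and negate, so that
    # minimizing the result pushes every class probability toward 1/K.
    return -per_class.sum(dim=1).mean()

logits = torch.zeros(2, 4, requires_grad=True)   # uniform prediction over 4 classes
loss = can_style_loss(logits)
loss.backward()
```

Written this way, the term is a binary cross-entropy against a constant float target of 1/K per class, so float inputs are expected, unlike the class-index targets typically passed to nn.CrossEntropyLoss.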
My activation function right now is the logistic function f(x) = 1/(1+e^-x). But the values of x range from 10,000 to 100,000, so I don't think it's feasible. Is there another way?
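A short sketch of why such inputs are a problem, and of one common workaround. The sample values are made up, and standardizing the inputs is an assumed preprocessing step, not something stated in the question. In float32 the logistic function rounds to exactly 1 for inputs this large, so rescaling before the activation restores a usable range.

```python
import torch

x = torch.tensor([10_000.0, 40_000.0, 100_000.0])

saturated = torch.sigmoid(x)          # every entry rounds to exactly 1.0 in float32

# Assumed preprocessing: standardize to zero mean and unit variance first.
x_scaled = (x - x.mean()) / x.std()
usable = torch.sigmoid(x_scaled)      # values now spread out inside (0, 1)
```

The ordering of the inputs is preserved after scaling, so the activation can still tell small values from large ones.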