I am looking to design a custom loss function for a Keras model. The model itself is a neural network that accepts a set of images and runs a regression to produce a single output value. Due to the physical conditions of the problem, I need to add a regularization term to the standard MSE, calculated as $\cos(y_{pred}) \cdot f(X_i)$, where $y_{pred}$ is the output of the neural network, $X_i$ is the training example used to compute $y_{pred}$, and $f$ is some function that calculates a value based on the image.
My problem is how to get $X_i$ from the model. A Keras loss function is supposed to accept just two inputs, $y_{true}$ and $y_{pred}$, which are tensors.
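One direction I have been looking at is model.add_loss, which I believe can reference any tensor in the graph, including the input. A minimal sketch of what I mean, assuming the TF2 functional API; f below is a hypothetical stand-in for my image-based function, and the shapes are illustrative:

import tensorflow as tf
from tensorflow import keras

def f(x):
    # Hypothetical stand-in for the physics-based per-image value
    return tf.reduce_mean(x, axis=[1, 2, 3])

inputs = keras.Input(shape=(64, 64, 1))
h = keras.layers.Conv2D(16, 3, activation="relu")(inputs)
h = keras.layers.GlobalAveragePooling2D()(h)
y_pred = keras.layers.Dense(1)(h)

model = keras.Model(inputs, y_pred)

# add_loss can reference the input tensor directly, so the regularization
# term cos(y_pred) * f(X) is added on top of the MSE supplied to compile().
reg_term = tf.cos(tf.squeeze(y_pred, axis=-1)) * f(inputs)
model.add_loss(tf.reduce_mean(reg_term))
model.compile(optimizer="adam", loss="mse")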
So I want to evaluate the best values of:
Learning rate
Batch size
Number of epochs
Activation function
Optimizer
Loss function
Number of hidden layers
Number of nodes in the hidden layers
to solve a regression problem.
I have done this so far using randomized search, but I have had to do it individually for each parameter: fix the Keras model to a standard configuration, test one parameter, then repeat; a sketch of this is below.
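For concreteness, my setup looks roughly like this (a sketch, assuming the scikeras wrapper and placeholder data X, y; for each run I put just one parameter in the search space):

from scikeras.wrappers import KerasRegressor
from sklearn.model_selection import RandomizedSearchCV
from tensorflow import keras

def build_model(hidden_layers=1, nodes=32, activation="relu"):
    model = keras.Sequential()
    for _ in range(hidden_layers):
        model.add(keras.layers.Dense(nodes, activation=activation))
    model.add(keras.layers.Dense(1))  # single regression output
    return model

reg = KerasRegressor(model=build_model, loss="mse", optimizer="adam", verbose=0)

# One parameter at a time; other keys could be added to search jointly,
# e.g. "model__hidden_layers", "model__activation", "batch_size", "epochs".
param_distributions = {"optimizer__learning_rate": [1e-4, 1e-3, 1e-2]}

search = RandomizedSearchCV(reg, param_distributions, n_iter=3, cv=3)
# search.fit(X, y)  # X, y: the preprocessed, scaled data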
Can anyone advise why my loss is >50,000 despite using the best network, and why my mean squared error is >40,000,000,000 (4×10^10)? This is a network fit using the optimal parameters from the randomized search, but it isn't performing well at all. The data has been pre-processed and scaled. I added L2 regularisation but it doesn't improve things much.
So this is a tough question to ask, as I don't have example code.
I have a task where the same neural network predicts a time-series characteristic (say, a single sigmoid activation) one time period ahead. So you have several areas with information at time t, say A(t), B(t), C(t), and this same neural network takes in some past information and gives you various outputs Y(A;t), Y(B;t), Y(C;t).
You want to take the time-based outputs of this neural network for various inputs and, at each output, apply a time-varying but known linear transformation. At this point you have a collection of outputs depending on the network parameters, the time-dependent inputs, and the linear transformations.
After this, you basically want to collect a number of these outputs over a period of a month. You then compute a value X based on all of these outputs (after their linear transformations) for that month.
I now want to use Adam to optimize the weights of this neural network (the same network is used each time). So I have something that depends on the outputs of a lot of different fixed neural network inputs and a lot of different fixed linear transformations.
I am not sure how to use TensorFlow's Adam to change the weights here, or whether you can simply use model.output or predict inside a custom version of the optimizer and have it understood as a function of the weights.
Sorry if this is hard to understand; I am finding it hard to explain, and I don't have code for it, as that would make the question redundant.
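To make the structure concrete, though, here is a rough sketch of the computation I mean, assuming TF2 eager mode; the model, inputs, transformations, and the final aggregate are all hypothetical placeholders. What I don't know is whether building the monthly value inside a GradientTape like this is the right way to make it a function of the weights:

import tensorflow as tf

# Hypothetical stand-ins: a small shared network, a month of inputs,
# and known time-varying scalar linear transformations (a_t, b_t).
model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation="sigmoid")])
inputs_by_time = [tf.random.normal((3, 16)) for _ in range(30)]
transforms = [(tf.random.normal(()), tf.random.normal(())) for _ in range(30)]
optimizer = tf.keras.optimizers.Adam()

def monthly_value(model, inputs_by_time, transforms):
    transformed = []
    for x_t, (a_t, b_t) in zip(inputs_by_time, transforms):
        y_t = model(x_t)                    # same network reused each time step
        transformed.append(a_t * y_t + b_t)  # known time-varying linear transform
    return tf.reduce_sum(tf.concat(transformed, axis=0))  # X: placeholder aggregate

with tf.GradientTape() as tape:
    loss = monthly_value(model, inputs_by_time, transforms)

# Calling model(x) (not model.predict) keeps everything on the tape,
# so these gradients flow through all the monthly outputs.
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))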
I'd like to implement an active learning algorithm (modAL) with Keras. But I'd like to know whether initiating multiple training instances (i.e., running .fit() more than once) builds on previous training, or whether the weights are reset. In other words, is training additive or iterative?
In case training starts from scratch each time, is there a way to have the model build on previous training?
For a given iteration, the inputs are provided to the network with the network weights set to a certain value. For each applied input, backpropagation is used to calculate a gradient. For a batch size of 100, 100 inputs are applied and the resulting 100 gradients are averaged to determine a new value for the network weights. These weights are then used to process the next batch of 100 inputs. So the process is iterative, not additive: successive calls to .fit() pick up from the current weights rather than reinitializing them. There are many possible explanations for why a network appears not to be learning.
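A quick way to convince yourself of this (a sketch with hypothetical placeholder data):

import numpy as np
from tensorflow import keras

model = keras.Sequential([keras.layers.Dense(1, input_shape=(5,))])
model.compile(optimizer="adam", loss="mse")

X1, y1 = np.random.rand(100, 5), np.random.rand(100, 1)
X2, y2 = np.random.rand(100, 5), np.random.rand(100, 1)

model.fit(X1, y1, epochs=5, verbose=0)
w_after_first = [w.copy() for w in model.get_weights()]

# The second fit() starts from w_after_first, not from a fresh
# initialization, so training across calls is cumulative.
model.fit(X2, y2, epochs=5, verbose=0)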
I have a model that is essentially an Auxiliary Conditional GAN; the first part of the model is the Generator, the last part is the Discriminator. The Discriminator makes multiclass (k=10) predictions.
Following the work of http://arxiv.org/abs/1912.07768 (p. 3 for a helpful diagram, but note I ignore network-structure modifications for the purposes of this question), I train the entire model for T=32 iterations by generating synthetic inputs and class labels (the 'inner loop'). I can predict on real data and labels using just the Discriminator (the Learner) to get losses. However, I need to back-propagate the Discriminator's error all the way back through the inner loop to the Generator.
How can I achieve this with Keras? Is it possible to do loop unrolling in Keras? How can I provide an arbitrary loss and backprop it down the unrolled layers?
Update: There's now one implementation, in PyTorch, which uses Facebook's 'higher' library. This appears to mean that the updates made during the inner loop must be 'unwrapped' so that the final meta-loss can be applied throughout the entire network. Is there a Keras way of achieving this? https://github.com/GoodAI/GTN
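For what it's worth, the closest I can sketch in plain TF2 (rather than Keras's built-in training loops) keeps the Learner's weights as ordinary tensors and updates them with plain arithmetic, so the meta-gradient can flow through the T unrolled steps back to the Generator. Everything here is hypothetical and untested against the paper:

import tensorflow as tf

# Hypothetical stand-ins for the Generator and a linear Learner.
generator = tf.keras.Sequential([tf.keras.layers.Dense(784 + 10)])
meta_opt = tf.keras.optimizers.Adam()

def learner_forward(weights, x):
    w, b = weights
    return tf.nn.softmax(tf.matmul(x, w) + b)

def inner_update(weights, x_syn, y_syn, inner_lr=0.01):
    # Differentiable SGD step: plain tensor arithmetic (no .assign) keeps
    # the update on the meta-tape, analogous to what 'higher' does in PyTorch.
    with tf.GradientTape() as inner_tape:
        inner_tape.watch(weights)
        preds = learner_forward(weights, x_syn)
        loss = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(y_syn, preds))
    grads = inner_tape.gradient(loss, weights)
    return [w - inner_lr * g for w, g in zip(weights, grads)]

x_real = tf.random.normal((32, 784))  # placeholder real data and labels
y_real = tf.one_hot(tf.random.uniform((32,), 0, 10, tf.int32), 10)

with tf.GradientTape() as meta_tape:
    weights = [tf.random.normal((784, 10)) * 0.01, tf.zeros((10,))]  # fresh Learner
    for _ in range(32):  # T unrolled inner steps
        syn = generator(tf.random.normal((64, 100)))
        x_syn, y_syn = syn[:, :784], tf.nn.softmax(syn[:, 784:])
        weights = inner_update(weights, x_syn, y_syn)
    meta_loss = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(
        y_real, learner_forward(weights, x_real)))

meta_grads = meta_tape.gradient(meta_loss, generator.trainable_variables)
meta_opt.apply_gradients(zip(meta_grads, generator.trainable_variables))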
This seems to be a Generative Adversarial Network (GAN), in which both models learn: one learns to classify and the other to generate.
To summarize, the output of the Generator is fed as input to the Discriminator, and the Discriminator's output is then fed back to drive the Generator's training.
There is also a discussion and implementation of this in TensorFlow Keras, using the MNIST digit dataset, in the TensorFlow GAN/DCGAN documentation.
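A sketch of that wiring in Keras, with hypothetical toy Generator and Discriminator models; freezing the Discriminator in the combined model is what routes its error signal back to the Generator:

from tensorflow import keras

# Toy stand-ins; real ACGAN/GTN networks would be convolutional.
generator = keras.Sequential(
    [keras.layers.Dense(784, activation="tanh", input_shape=(100,))])
discriminator = keras.Sequential(
    [keras.layers.Dense(10, activation="softmax", input_shape=(784,))])  # k=10 classes
discriminator.compile(optimizer="adam", loss="categorical_crossentropy")

# Generator output feeds the Discriminator; training the combined model
# with the Discriminator frozen updates only the Generator's weights.
discriminator.trainable = False
z = keras.Input(shape=(100,))
combined = keras.Model(z, discriminator(generator(z)))
combined.compile(optimizer="adam", loss="categorical_crossentropy")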
I'm currently working on recurrent neural nets using Lasagne / Theano.
While training, updates are calculated using Theano's symbolic gradient.
grads = theano.grad(loss_or_grads, params)
While the gradient expression is perfectly fine in general, I'm also interested in the gradient values in order to monitor training.
My question now is whether there is a built-in method to also get the gradient values, which I haven't found so far, or whether I'll have to do it myself.
Thanks in advance
I'm not aware of any Lasagne function to evaluate the gradient, but you can get it yourself with a simple Theano function.
Say we have the following theano variables:
inputs = Inputs to the network
targets = Target outputs of the network
loss = Value of the loss function, defined as a function of network outputs and targets
l_hid = Recurrent layer of the network, type lasagne.layers.RecurrentLayer
Say we're interested in the gradient of the loss function w.r.t. the recurrent weights:
grad = theano.grad(loss, l_hid.W_hid_to_hid)
Define a Theano function to get a numerical value for the gradient:
get_grad = theano.function([inputs, targets], grad)
Now, just call get_grad for any value of the inputs and targets (e.g. the current minibatch). get_grad() doesn't need to be passed the value of the weights because they're stored as a theano shared variable.
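Putting the pieces together, a minimal end-to-end version (a sketch with toy shapes and a made-up network, just to show the plumbing):

import numpy as np
import theano
import theano.tensor as T
import lasagne

inputs = T.tensor3('inputs')    # (batch, time, features)
targets = T.matrix('targets')

l_in = lasagne.layers.InputLayer((None, None, 8), input_var=inputs)
l_hid = lasagne.layers.RecurrentLayer(l_in, num_units=16, only_return_final=True)
l_out = lasagne.layers.DenseLayer(l_hid, num_units=4)

outputs = lasagne.layers.get_output(l_out)
loss = lasagne.objectives.squared_error(outputs, targets).mean()

# Gradient of the loss w.r.t. the recurrent weights, compiled into a
# callable function for monitoring during training.
grad = theano.grad(loss, l_hid.W_hid_to_hid)
get_grad = theano.function([inputs, targets], grad)

x = np.random.randn(2, 5, 8).astype(theano.config.floatX)
y = np.random.randn(2, 4).astype(theano.config.floatX)
print(get_grad(x, y).shape)  # (16, 16); monitor e.g. its norm during training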