How to add regularization in a CNN autoencoder model based on Keras - python

I am new to Keras and deep learning, and I am not sure of the right way to add regularization. I wrote a CNN autoencoder using the functional Model API, and right now I add a regularizer in each of the Conv2D layers. I am not sure whether this is the right place to add regularization; could anyone please give me some suggestions?
(I ran the training and checked the reconstructed test images; the results are OK but not very good. I test on MNIST, and the strokes of the reconstructed digits are thicker than in the originals.)
In my problem, the input image is an impaired one, and the original good image is used as the training label. The output of the CNN is compared with the label image using mean absolute error, which I use both as the loss and as the metric.
I first defined three functions: a downsampling function (the one below), an upsampling function, and a function that squeezes the third dimension of a matrix to produce a two-dimensional output.
My code is too long, so to help illustrate the problem only part of it is shown below.
With those three functions defined, I defined the model as follows (not in detail, just the part that helps explain my problem), loaded all the necessary parameters into the model, then defined the optimizer parameters and compiled the model.
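For reference, here is a minimal sketch (illustrative, not the asker's actual code) of per-layer regularization in a small Conv2D autoencoder, with mean absolute error as both the loss and the metric, as described above:

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

# Illustrative sketch: one standard place to add regularization is
# per layer, via the kernel_regularizer argument of Conv2D.
inp = layers.Input(shape=(28, 28, 1))
x = layers.Conv2D(32, 3, padding="same", activation="relu",
                  kernel_regularizer=regularizers.l2(1e-4))(inp)
x = layers.MaxPooling2D(2)(x)   # downsampling
x = layers.Conv2D(32, 3, padding="same", activation="relu",
                  kernel_regularizer=regularizers.l2(1e-4))(x)
x = layers.UpSampling2D(2)(x)   # upsampling
out = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)

model = tf.keras.Model(inp, out)
# Mean absolute error as both loss and metric, as in the question.
model.compile(optimizer="adam", loss="mae", metrics=["mae"])
```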

Related

Improve the result of EfficientNet

Hello there, I'm still struggling with Python.
I'm using the EfficientNet model to detect the ripeness of palm oil.
I'm using 5852 training pictures divided into 4 classes (1463 per class), with 132 test pictures (33 per class).
After training for 200 epochs, the results are far from good.
Is there anything I can do to improve the results?
Here are the plots of my model accuracy and model loss.
Here's my code:
https://colab.research.google.com/drive/18AtIP7aOycHPDR84PuQ7iS8aYUdclZIe?usp=sharing
Your help means a lot to me.
You have rescaling in your generators, and it may be the root of the problem.
The TensorFlow implementation of EfficientNet already contains a rescaling layer, so you must not rescale images in your ImageDataGenerator. You can check this via the .summary() method.
Official documentation says:
Note: each Keras Application expects a specific kind of input preprocessing. For EfficientNet, input preprocessing is included as part of the model (as a Rescaling layer), and thus tf.keras.applications.efficientnet.preprocess_input is actually a pass-through function. EfficientNet models expect their inputs to be float tensors of pixels with values in the [0-255] range
ResNets, for example, don't have this layer, and you should rescale images before feeding them to the model. It's tricky to remember these details for every single network in tf.keras.applications, so I suggest checking each model before using it.
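For instance, a minimal sketch of the difference (the ResNet preprocessing call is from tf.keras.applications; the generator names are illustrative):

```python
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# EfficientNet includes a Rescaling layer, so feed raw [0-255] pixels:
effnet_gen = ImageDataGenerator()  # no rescale argument

# ResNet has no built-in rescaling; preprocess explicitly instead:
resnet_gen = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.resnet50.preprocess_input
)

# Confirm whether a model rescales internally by inspecting its layers:
model = tf.keras.applications.EfficientNetB0(weights=None)
model.summary()  # look for a "rescaling" layer near the input
```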

Keras: Manually backprop from Discriminator to Generator

I have a model that is essentially an Auxiliary Conditional GAN; the first part of the model is the Generator, the last part is the Discriminator. The Discriminator makes multiclass (k=10) predictions.
Following the work of http://arxiv.org/abs/1912.07768 (p. 3 has a helpful diagram, but note that I ignore the network-structure modifications for the purposes of this question), I train the entire model for T=32 iterations by generating synthetic inputs and class labels (the 'inner loop'). I can predict on real data and labels using just the Discriminator (Learner) to get losses. However, I need to back-propagate the Discriminator's error all the way back through the inner loop to the Generator.
How can I achieve this with Keras? Is it possible to do loop unrolling in Keras? How can I provide an arbitrary loss and backprop it down the unrolled layers?
Update: there is now one implementation, in PyTorch, which uses Facebook's 'higher' library. This appears to mean that the updates made during the inner loop must be 'unwrapped' in order for the final meta-loss to be applied throughout the entire network. Is there a Keras way of achieving this? https://github.com/GoodAI/GTN
This appears to be a Generative Adversarial Network (GAN), in which two models learn together: one classifies and the other generates.
To summarize, the output of the Generator is fed as input to the Discriminator, and the Discriminator's output in turn drives the updates to the Generator.
There is also a discussion and an implementation of this in TensorFlow Keras, using the MNIST digit dataset, in the TensorFlow GAN/DCGAN documentation.
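As a starting point, here is a minimal sketch following the DCGAN tutorial pattern (a simple binary real/fake generator loss; `generator`, `discriminator`, and `gen_opt` are assumed to be existing Keras models and an optimizer). It shows how tf.GradientTape lets a loss computed at the Discriminator's output back-propagate into the Generator's weights:

```python
import tensorflow as tf

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

@tf.function
def generator_train_step(generator, discriminator, gen_opt,
                         batch_size=32, noise_dim=100):
    noise = tf.random.normal([batch_size, noise_dim])
    with tf.GradientTape() as tape:
        fake_images = generator(noise, training=True)
        fake_logits = discriminator(fake_images, training=True)
        # Generator loss: make the discriminator call the fakes "real".
        gen_loss = cross_entropy(tf.ones_like(fake_logits), fake_logits)
    # Because the generator and discriminator are chained inside one tape,
    # the gradient of the discriminator-side loss flows back through the
    # discriminator into the generator's weights.
    grads = tape.gradient(gen_loss, generator.trainable_variables)
    gen_opt.apply_gradients(zip(grads, generator.trainable_variables))
    return gen_loss
```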

How do I fine tune a model on a dataset with different classes?

In this paper the author fine-tunes a COCO-pretrained model on the PoseTrack dataset. I want to do something similar, but I am not sure how, since the numbers of classes (keypoints) are not the same.
I thought about "disabling" certain classes but I am unsure how.
You will need to replace the last layer of the network to match the number of classes in your problem, and then retrain on your dataset (assuming it has the same input shape). You can keep the pretrained weights of the earlier layers, so convergence will be faster and you will need less data.
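A minimal sketch of this idea for a generic Keras classification backbone (the backbone choice and class count are illustrative; a pose-estimation model would need its task-specific head swapped analogously):

```python
import tensorflow as tf

NUM_NEW_CLASSES = 15  # hypothetical class/keypoint count for the new dataset

# Pretrained backbone without its original classification head.
base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, pooling="avg"
)
base.trainable = False  # freeze pretrained layers for faster convergence

# New head sized to the new problem.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(NUM_NEW_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```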

Can CNN autoencoders have different input and output dimensions?

I am working on a problem that requires me to build a deep learning model that, given an input image, must output another image. It is worth noting that these two images are conceptually related, but they do not have the same dimensions.
At first I thought that a classical CNN with a final dense layer, sized to the product of the output image's height and width, would suit this case, but during training it produced strange figures, such as an accuracy of 0.
While looking for answers on the Internet, I discovered the concept of CNN autoencoders, and I was wondering whether this approach could help me solve my problem. In all the examples I saw, the input and output of the autoencoder had the same size and dimensions.
At this point I wanted to ask whether there is a type of CNN autoencoder that produces an output image with dimensions different from those of the input image.
An autoencoder (AE) is an architecture that encodes your image into a lower-dimensional representation while simultaneously learning to reconstruct the data from that representation. AEs therefore work with unsupervised (unlabeled) data: the same data is used both as the input and as the target in the loss.
You can try a U-Net-based architecture for your use case. A U-Net forwards intermediate data representations to later layers of the network, which should help with faster learning/mapping of the inputs into a new domain.
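As a minimal demonstration that a convolutional encoder-decoder's output dimensions can differ from its input dimensions, here is a plain encoder-decoder sketch (no U-Net skip connections; all sizes are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

# 64x64 grayscale in, 32x32 grayscale out: the decoder simply upsamples
# less than the encoder downsampled.
inp = layers.Input(shape=(64, 64, 1))
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inp)  # 32x32
x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)    # 16x16
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same",
                           activation="relu")(x)                             # 32x32
out = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)           # 32x32x1

model = tf.keras.Model(inp, out)
model.compile(optimizer="adam", loss="mae")
model.summary()  # input (64, 64, 1) -> output (32, 32, 1)
```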
You can also experiment with a simple architecture containing a few ResNet blocks without any downsampling layers, which may or may not be enough for your use case.
If you want to dig a little deeper, you can look into DiscoGAN and related methods. They explicitly try to map an image into a new domain while maintaining image information.

Right metrics and loss for averaging label probabilities across an RNN

I am trying to reproduce this passage in Keras:
LRCN is trained to predict the video's activity class at each time step. To produce a single label prediction for an entire video clip, we average the label probabilities—the outputs of the network's softmax layer—across all frames and choose the most probable label.
But I am quite new to LSTMs and am not sure which metrics and loss function to use to replicate the method described in the text above.
So far I have an LSTM RNN that returns sequences, and I feed its outputs into a time-distributed dense layer with 3 classes.
A "frame" corresponds to a timestep of the RNN, and return_sequences=True lets me return a prediction per frame.
Could you please tell me which metrics and loss I need and if I need custom ones also?
As you can see from the paper, the work is old: May 2016. Consider some more recent work with more specifics. The paper gives no clues about the LSTM beyond the variable number of units used, so you may want to look for models whose metrics are explained.
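That said, here is a minimal sketch of one way to implement the quoted scheme, assuming per-timestep one-hot labels (all sizes are illustrative): categorical cross-entropy is applied at every timestep during training, and the per-frame probabilities are averaged at inference:

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES, TIMESTEPS, FEATURES = 3, 16, 128  # illustrative sizes

# Per-frame classifier: LSTM returning sequences, then a time-distributed
# softmax head, trained with cross-entropy at each timestep.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(TIMESTEPS, FEATURES)),
    layers.LSTM(64, return_sequences=True),
    layers.TimeDistributed(layers.Dense(NUM_CLASSES, activation="softmax")),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",  # applied per timestep
              metrics=["accuracy"])

# At inference, average the per-frame probabilities over all frames and
# take the argmax, as the quoted passage describes.
frame_probs = model.predict(tf.random.normal((1, TIMESTEPS, FEATURES)))
clip_probs = tf.reduce_mean(frame_probs, axis=1)   # average over frames
clip_label = tf.argmax(clip_probs, axis=-1)
```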
