What would be the best way to train a neural network in CNTK using progressive growing? I'm referring to the method described in "Progressive Growing of GANs for Improved Quality, Stability, and Variation".
The network is first trained at low resolution. After a while, new convolutional layers that operate at higher resolutions are added to the network. The already trained parameters of the lower-resolution part must be preserved after the new layers are added.
Is there an easy way to add the layers and transfer the already learned parameters?
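One framework-agnostic way to picture the parameter transfer is to treat each network as a dictionary of named parameter arrays and copy every array whose name and shape match into the grown network. This is a minimal NumPy sketch, not CNTK code: the helper names `init_params` and `grow`, the layer names, and the shapes are all hypothetical.

```python
import numpy as np

def init_params(layer_shapes, rng):
    """Create randomly initialized parameters for every named layer."""
    return {name: rng.normal(0.0, 0.02, size=shape)
            for name, shape in layer_shapes.items()}

def grow(old_params, new_layer_shapes, rng):
    """Build parameters for the larger network, reusing trained ones by name."""
    new_params = init_params(new_layer_shapes, rng)
    for name, value in old_params.items():
        # Copy only when the layer still exists with the same shape.
        if name in new_params and new_params[name].shape == value.shape:
            new_params[name] = value.copy()
    return new_params

rng = np.random.default_rng(0)

# Hypothetical low-resolution network with a single conv layer.
small = init_params({"conv_4x4": (3, 3, 8, 8)}, rng)

# Grown network: the old layer plus a new higher-resolution layer.
grown = grow(small, {"conv_4x4": (3, 3, 8, 8), "conv_8x8": (3, 3, 8, 4)}, rng)
```

After the call, `grown["conv_4x4"]` holds the trained values and `grown["conv_8x8"]` is freshly initialized, so training can continue at the higher resolution.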
Related
My problem is that after creating and training a neural net with TensorFlow (version 2.1.0) I need to extract all the basic parameters: net architecture, functions used and weight values found through training.
These parameters will then be read by a library that will generate the VHDL code to port the neural network created in Python to an FPGA.
So I wanted to ask if there are one or more methods to get all this information in a non-binary format. Of all these values, the most important is the weights found at the end of training.
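One possible text format is JSON. Keras layers expose `name`, `get_config()` and `get_weights()`, so a small helper can walk over `model.layers` and dump everything as plain text. The sketch below is an assumption about what the downstream VHDL library could read. `export_layers` and `StandInLayer` are hypothetical names, and `StandInLayer` only exists so that the snippet runs without TensorFlow installed; with a real model you would pass `model.layers`.

```python
import json
import numpy as np

def export_layers(layers):
    """Serialize each layer's name, config and weights to a JSON string."""
    exported = []
    for layer in layers:
        exported.append({
            "name": layer.name,
            "config": layer.get_config(),
            # get_weights() returns NumPy arrays; convert them to nested lists.
            "weights": [np.asarray(w).tolist() for w in layer.get_weights()],
        })
    return json.dumps(exported, indent=2)

class StandInLayer:
    """Hypothetical stand-in mimicking the three Keras layer members used above."""
    def __init__(self, name, config, weights):
        self.name = name
        self._config = config
        self._weights = weights

    def get_config(self):
        return dict(self._config)

    def get_weights(self):
        return list(self._weights)

layer = StandInLayer(
    "dense_1",
    {"units": 2, "activation": "relu"},
    [np.array([[1.0, 2.0]]), np.array([0.5, -0.5])],  # kernel, bias
)
text = export_layers([layer])
```

The resulting string can be written to a file and parsed by any tool that reads JSON.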
I have trouble understanding weight transfer in transfer-learning-like tasks.
I trained two networks and saved the weights using Keras with the TensorFlow backend (the two networks are in the same model). I would like to take half of the layers from one network and half from the other and join them into a new network, discarding the remaining layers. Since half of the layers are top layers, I couldn't do this with .pop(), so I decided to transfer the weights instead.
I tried this by setting corresponding weights from each layer (the ones that I needed) in the old model to corresponding layers in my new model like:
new_model.layers[i].set_weights(model.layers[i].get_weights())
This loads the weights, but the result does not behave as I expect.
Then I tried get_layer:
new_model.layers[i] = model.get_layer('name').output
This also seems to do a meaningless weight transfer.
What should I transfer from my old network to the new network so that I am effectively taking half of the whole network?
Do the weights (and biases) alone carry all the information? What else should I assign to get theoretically identical layers?
What does get_layer return?
Do get_weights/set_weights do the same thing as load_weights?
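On the question of whether weights and biases carry all the information: for a plain dense layer, the output is determined by the kernel, the bias and the layer configuration (here, the activation). The following is a minimal NumPy sketch of that idea, not Keras code; `dense` and `relu` are hypothetical helpers and the data is random.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def dense(x, kernel, bias, activation):
    """Forward pass of a fully connected layer."""
    return activation(x @ kernel + bias)

rng = np.random.default_rng(0)
kernel = rng.normal(size=(4, 3))
bias = rng.normal(size=3)
x = rng.normal(size=(5, 4))

original = dense(x, kernel, bias, relu)

# Same weights and same activation: the copied layer behaves identically.
copied = dense(x, kernel.copy(), bias.copy(), relu)

# Same weights but a different activation: the outputs no longer match.
mismatched = dense(x, kernel.copy(), bias.copy(), np.tanh)

same = np.allclose(original, copied)
differs = not np.allclose(original, mismatched)
```

So copying weights reproduces a layer only when the receiving layer has the same configuration and receives the same input as the original.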
I used GridSearchCV to determine which hyperparameters of the MLPClassifier can increase the accuracy of my neural network. I figured out that the number of layers and nodes makes a difference, but I'm trying to figure out which other settings can make a difference in accuracy (F1 score, actually). From my experience it looks like parameters such as "activation", "learning_rate" and "solver" don't really change anything.
I need to research which other hyperparameters can make a difference in the accuracy of the neural network's predictions.
Does someone have tips or ideas on which parameters, other than the number of layers and nodes, can make a difference in the accuracy of my neural network's predictions?
It all depends on your dataset. Neural networks are not magical tools that can learn everything, and they need a lot of data compared to traditional machine learning models. With an MLP, adding a lot of layers is not a good idea: it makes the model slower to train and can lead to overfitting. The learning rate is an important factor because it controls how large a step the optimizer takes each time the model corrects its mistakes. If the learning rate is too small, the model takes a long time to reach a good solution; if it is too high, the model may overshoot that solution. The choice of activation function again depends on the use case and the data, but for simpler datasets it will not make a huge difference.
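The effect of the learning rate can be seen on a toy problem. This is a minimal sketch that runs plain gradient descent on f(x) = x^2, whose minimum is at 0; the function name `gradient_descent`, the start value and the three rates are arbitrary choices for illustration, not MLPClassifier settings.

```python
def gradient_descent(learning_rate, start=10.0, steps=50):
    """Minimize f(x) = x**2 with fixed-step gradient descent."""
    x = start
    for _ in range(steps):
        grad = 2.0 * x              # derivative of x**2
        x -= learning_rate * grad
    return x

too_small = gradient_descent(0.001)   # barely moves away from the start
reasonable = gradient_descent(0.1)    # ends up very close to the minimum
too_large = gradient_descent(1.1)     # overshoots further on every step

for name, value in [("too small", too_small),
                    ("reasonable", reasonable),
                    ("too large", too_large)]:
    print(f"{name}: x after 50 steps = {value:.4g}")
```

With the smallest rate the value is still far from 0 after 50 steps, and with the largest rate it moves away from 0 instead of towards it.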
In deep learning models more generally, a neural network is built up of several layers, which are not always dense. All the layers in an MLP are dense, i.e. fully connected. To improve your model, you can try combining dense layers with CNN, RNN, LSTM, GRU or other layers. Which layer to use depends completely on your dataset. If you are using a very simple dataset for a school project, experiment with traditional machine learning methods such as random forest, as you might get better results.
If you want to stick to neural nets, read about other types of layers, dropout, regularization, pooling, etc.
I'm working on a project that requires recognizing only people in a video or a live stream from a camera. I'm currently using the TensorFlow Object Detection API with Python, and I've tried different pre-trained models and frozen inference graphs. I want to recognize only people and maybe cars, so I don't need my neural network to recognize all 90 classes that come with the frozen inference graphs based on MobileNet or R-CNN; this seems to slow down the process, and 89 of these 90 classes are not needed in my project. Do I have to train my own model, or is there a way to modify the inference graphs and the existing models? This is probably a noob question for some of you, but bear in mind that I've worked with TensorFlow and machine learning for just one month.
Thanks in advance
Shrinking the last layer to output one or two classes is not likely to yield large speed-ups, because most of the computation is in the intermediate layers. You could shrink the intermediate layers, but this would result in poorer accuracy.
Yes, you have to train your own model. Here are a few ways to do it, in short.
OPTION 1. To reuse as much of the learned knowledge as possible, freeze the CNN layers. Then change the number of detected classes by resizing the classifier (the dense layers), which is the last part of the CNN architecture, and retrain only the classifier.
OPTION 2. To reuse only the first CNN layers (for example, freeze the first 2-3 CNN layers), change the number of detected classes by resizing the classifier, then retrain the remaining CNN layers together with the classifier.
OPTION 3. To retrain everything, change the number of detected classes by resizing the classifier, then retrain the whole CNN together with the classifier.
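Option 1 can be sketched without any deep learning framework. In this minimal NumPy sketch, `frozen_kernel` is a hypothetical stand-in for the pretrained CNN layers, the `head_*` arrays stand in for the resized classifier, and the data is synthetic. Only the head is updated during training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for pretrained layers: these weights are never updated.
frozen_kernel = rng.normal(size=(2, 8))
frozen_before = frozen_kernel.copy()

# New classifier head for a single output class, trained from scratch.
head_kernel = np.zeros((8, 1))
head_bias = np.zeros(1)

def features(x):
    """Forward pass through the frozen part of the network."""
    return np.tanh(x @ frozen_kernel)

# Synthetic binary task: the label is 1 when the first input is positive.
x = rng.normal(size=(200, 2))
y = (x[:, :1] > 0).astype(float)

learning_rate = 0.5
for _ in range(500):
    h = features(x)
    p = 1.0 / (1.0 + np.exp(-(h @ head_kernel + head_bias)))
    grad_logits = (p - y) / len(x)   # gradient of the mean cross-entropy
    # Gradient steps are applied to the head only.
    head_kernel -= learning_rate * (h.T @ grad_logits)
    head_bias -= learning_rate * grad_logits.sum(axis=0)

accuracy = float(((p > 0.5) == (y > 0.5)).mean())
print(f"training accuracy with a frozen feature extractor: {accuracy:.2f}")
```

Options 2 and 3 follow the same pattern, with more of the earlier weights included in the update step.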
Generally, the TensorFlow Object Detection API is a good start for beginners! You can see more detail about the whole process here and extra explanation here.
I am trying to replicate a neural network for depth estimation. The original authors took a pre-trained network and added a 'Superpixel Pooling Layer' between the convolutional layer and the fully connected layer. In this layer, the convolutional feature maps are upsampled and the features are averaged per superpixel.
My problem is that in order to successfully achieve this, I need to calculate the superpixels per image. How can I access the data being used by keras/tensorflow during batch processing to perform SLIC oversegmentation?
I considered splitting the task into pieces, i.e. feeding the images into the convolutional network, processing the outputs separately and then feeding them into a fully connected layer. However, this makes further training of the network impossible.
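For reference, the averaging step of the pooling layer described above can be written in a few lines of NumPy once a superpixel label map is available. This is a minimal sketch under assumptions: `superpixel_mean_pool` is a hypothetical helper, the label map is assumed to be an integer array of shape (H, W) such as the one a SLIC implementation would produce, and the feature map is assumed to be already upsampled to the same height and width. It runs outside the Keras graph, so it does not by itself solve the training problem.

```python
import numpy as np

def superpixel_mean_pool(features, labels):
    """Average a (H, W, C) feature map over each superpixel of a (H, W) label map.

    Returns an array of shape (num_superpixels, C).
    """
    channels = features.shape[-1]
    flat_features = features.reshape(-1, channels)
    flat_labels = labels.reshape(-1)
    num_superpixels = int(flat_labels.max()) + 1

    # Sum the feature vectors that fall into each superpixel.
    sums = np.zeros((num_superpixels, channels), dtype=np.float64)
    np.add.at(sums, flat_labels, flat_features)

    # Count the pixels per superpixel, guarding against unused labels.
    counts = np.bincount(flat_labels, minlength=num_superpixels).astype(np.float64)
    counts[counts == 0] = 1.0
    return sums / counts[:, None]

# Tiny example: a 2x2 map with one channel and two superpixels (rows).
features = np.array([[[1.0], [2.0]],
                     [[3.0], [4.0]]])
labels = np.array([[0, 0],
                   [1, 1]])
pooled = superpixel_mean_pool(features, labels)
```

Here `pooled` has one row per superpixel, holding the mean feature vector of its pixels.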
At present it seems to be impossible to actually access the data within the symbolic tensor. It also seems unlikely that such functionality will be added in the future, since the TensorFlow documentation says:
A Tensor object is a symbolic handle to the result of an operation, but
does not actually hold the values of the operation's output.
Keras allows the creation of custom layers. However, these are limited to the available backend operations. As such, it is simply not possible to access the batch data.