I am implementing a multitask speech emotion recognition model in PyTorch. Many researchers use a single-input, multi-output model for speech emotion recognition, but I want to build a multi-input, multi-output model. I have seen some examples of this kind of model: they use different datasets as input, with speech emotion recognition and gender recognition as the output tasks. Although these are called multi-input multi-output models, they concatenate the inputs in the following layers, which I think can be regarded as single-input.
I want to implement a model that takes several datasets as input, where the features of each dataset are used to predict its labels independently: each output task has its own input features, and the input features of each dataset correspond to exactly one output task. How can I build this in PyTorch or Keras? Maybe it would work if I simply do not concatenate the features from the different datasets, but I am not sure whether that is right.
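To make the idea concrete, here is a rough PyTorch sketch of what I mean (the feature dimensions, hidden sizes, and the two tasks are just placeholders): each dataset feeds its own branch and its own output head, and nothing is concatenated.
import torch
import torch.nn as nn

class MultiInputMultiOutput(nn.Module):
    """Each dataset has its own branch and its own output head; nothing is concatenated."""
    def __init__(self, emotion_in_dim=128, gender_in_dim=64, n_emotions=7, n_genders=2):
        super().__init__()
        # Branch 1: emotion features -> emotion labels
        self.emotion_branch = nn.Sequential(
            nn.Linear(emotion_in_dim, 256), nn.ReLU(),
            nn.Linear(256, n_emotions),
        )
        # Branch 2: gender features -> gender labels
        self.gender_branch = nn.Sequential(
            nn.Linear(gender_in_dim, 64), nn.ReLU(),
            nn.Linear(64, n_genders),
        )

    def forward(self, emotion_x, gender_x):
        return self.emotion_branch(emotion_x), self.gender_branch(gender_x)

model = MultiInputMultiOutput()
emotion_logits, gender_logits = model(torch.randn(8, 128), torch.randn(8, 64))
# The two losses are simply summed for a joint backward pass
loss = nn.CrossEntropyLoss()(emotion_logits, torch.randint(0, 7, (8,))) \
     + nn.CrossEntropyLoss()(gender_logits, torch.randint(0, 2, (8,)))
loss.backward()
Note that without any shared layers, this is effectively two independent models trained in one loop; whether that is what I want, or whether some sharing is needed for it to count as multitask learning, is part of my question.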
I am looking for an unsupervised image recommendation approach. My data is very domain-specific, and pre-trained models such as VGG, ResNet, DenseNet, and AlexNet are not giving good results. The end task is to return several similar images, ranked in decreasing order of their resemblance to the query image. One option is to fine-tune or create a CNN model, but the challenge is that there are no class or target labels. The other option is to generate custom image embeddings; I believe I could then rank images directly with a similarity metric instead of relying on a pre-trained model. The challenge, again, is how to generate custom image embeddings without any targets or classes. It would be nice if someone could point me towards unsupervised image recommendation models.
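For the ranking step, assuming the embeddings are already available from some encoder (random placeholders below), I have something like the following cosine-similarity ranking in mind:
import numpy as np

# Placeholder embeddings: in practice these would come from whatever encoder produces them
gallery = np.random.randn(1000, 512)   # one 512-d embedding per catalogue image
query = np.random.randn(512)           # embedding of the query image

# Cosine similarity between the query and every gallery image
gallery_norm = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
query_norm = query / np.linalg.norm(query)
similarities = gallery_norm @ query_norm

# Indices of the most similar images, in decreasing order of resemblance
ranking = np.argsort(-similarities)
print(ranking[:10], similarities[ranking[:10]])
So the open question is really only how to obtain good embeddings for this gallery without labels.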
I am new to NLP and neural networks. I want to do topic analysis on a dataset of reviews of our product. I tried the universal-sentence-encoder together with top2vec, and they do a good job. However, the similarity score is low most of the time because the combinations of words are unique. I want to retrain the model so that it captures the similarity within my dataset (about 40k reviews).
Is there a way to do so? I think this is called unsupervised fine-tuning.
I am aware of Keras' API for importing the transformer as a layer, but I do not know how to continue from there.
Question 1: What layers or losses could I add to capture the similarity between my reviews and output a vector embedding of the text?
I also read about siamese networks. If I understood correctly, I can add the universal encoder twice as the shared network and then add a similarity layer such as keras.layers.Dot.
Question 2: If I trained the model on every possible pair of my reviews, could I then use the output of the first embedding layer (with the new weights) as the embedding of a text?
Question 3: If the answer to Question 2 is yes, both embeddings in the siamese network should be almost identical for the same text, right?
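For reference, this is roughly the siamese setup I am imagining (I am not sure the hub URL, the trainable flag, or the placeholder loss are the right choices):
import tensorflow as tf
import tensorflow_hub as hub

# Shared encoder: the same universal-sentence-encoder layer is applied to both inputs
encoder = hub.KerasLayer("https://tfhub.dev/google/universal-sentence-encoder/4",
                         trainable=True)  # trainable=True so fine-tuning updates the encoder

text_a = tf.keras.Input(shape=(), dtype=tf.string, name="review_a")
text_b = tf.keras.Input(shape=(), dtype=tf.string, name="review_b")

emb_a = encoder(text_a)
emb_b = encoder(text_b)

# Cosine similarity between the two embeddings
similarity = tf.keras.layers.Dot(axes=1, normalize=True)([emb_a, emb_b])

siamese = tf.keras.Model(inputs=[text_a, text_b], outputs=similarity)
siamese.compile(optimizer="adam", loss="mse")  # placeholder loss; similarity targets would be needed

# After training, the shared encoder alone can embed a single review
embedding_model = tf.keras.Model(inputs=text_a, outputs=emb_a)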
I'm doing sentiment analysis of Spanish tweets.
After reviewing some of the recent literature, I've seen that there has been a recent effort to train a RoBERTa model exclusively on Spanish text (roberta-base-bne). It seems to perform better than BETO, the current state-of-the-art model for Spanish language modeling.
The RoBERTa model has been trained for a variety of tasks, which do not include text classification.
I want to take this RoBERTa model and fine-tune it for text classification, more specifically, sentiment analysis.
I've done all the preprocessing and created the dataset objects, and want to natively train the model.
Code
# Training with native TensorFlow
import tensorflow as tf
from transformers import TFRobertaForSequenceClassification

model = TFRobertaForSequenceClassification.from_pretrained("BSC-TeMU/roberta-base-bne")
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)
model.compile(optimizer=optimizer, loss=model.compute_loss)  # can also use any keras loss fn
model.fit(train_dataset.shuffle(1000).batch(16), epochs=3)  # dataset is already batched, so no batch_size argument
Question
My question is regarding TFRobertaForSequenceClassification:
Is it correct to use this class, even though it is not specified in the model card, which instead specifies AutoModelForMaskedLM?
By simply using TFRobertaForSequenceClassification, do we imply that the pretrained knowledge will automatically be applied to the new task, namely text classification?
The model class named in the model card essentially reflects what the model has been trained on. If you are familiar with the architectural choices for different modeling tasks (e.g., token classification vs. sequence classification), it should become clear that these models have slightly different layouts, specifically in the layers after the Transformer output layer. For token classification, this is (generally speaking) Dropout plus an additional linear layer mapping from the model's hidden_size to the number of output classes. See here for an example with BERT.
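As an illustrative sketch (the variable names and dropout probability are made up, but this mirrors the kind of task-specific head that gets added on top of the Transformer output):
import torch.nn as nn

hidden_size = 768   # size of the Transformer's pooled / last-hidden output
num_labels = 3      # e.g. negative / neutral / positive

# The task-specific head: these weights do NOT exist in the pre-trained checkpoint
# and are trained from scratch during fine-tuning.
classification_head = nn.Sequential(
    nn.Dropout(p=0.1),
    nn.Linear(hidden_size, num_labels),
)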
This means that a model checkpoint pre-trained with a different learning objective will not have weights for this final layer; instead, you train these (comparatively few) parameters during your fine-tuning. In fact, for PyTorch models you will generally get a warning when loading a checkpoint whose available weights differ slightly:
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: [...]
This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). [...]
This is exactly what you are doing, so as long as you have a decent number of fine-tuning examples (depending on the number of classes, I would suggest 10e3-10e4 as a rule of thumb), this will not affect your training by much.
I want to point out, however, that you may need to specify the number of labels that your classification head has. You can do this by specifying it when loading your model:
from transformers import TFRobertaForSequenceClassification
roberta = TFRobertaForSequenceClassification.from_pretrained("BSC-TeMU/roberta-base-bne",
num_labels=<your_value>)
I'm working on a project that requires recognizing only people in a video or a live stream from a camera. I'm currently using the TensorFlow Object Detection API with Python, and I've tried different pre-trained models and frozen inference graphs. I want to recognize only people, and maybe cars, so I don't need my neural network to recognize all 90 classes that come with the frozen inference graphs based on MobileNet or R-CNN; this seems to slow down the process, and 89 of these 90 classes are not needed in my project. Do I have to train my own model, or is there a way to modify the inference graphs and the existing models? This is probably a noob question for some of you, but keep in mind that I've only worked with TensorFlow and machine learning for one month.
Thanks in advance
Shrinking the last layer to output one or two classes is not likely to yield large speed-ups, because most of the computation is in the intermediate layers. You could shrink the intermediate layers, but this would result in poorer accuracy.
Yes, you have to train your own model. Let's look briefly at a few ways to do that.
OPTION 1. If you want to transfer as much knowledge as possible, you can freeze the CNN layers. Then you change the number of detected classes by changing the dimension of the classifier (the dense layers), which is the last part of the CNN architecture. Now you retrain only the classifier (see the sketch after these options).
OPTION 2. Suppose you want to transfer knowledge only for the first layers of the CNN (for example, freeze the first 2-3 CNN layers) and retrain the rest. Again, you change the number of detected classes via the dimension of the classifier, and then retrain the remaining CNN layers together with the classifier.
OPTION 3. Suppose you want to retrain the whole CNN together with the classifier. You change the number of detected classes via the dimension of the classifier, and then retrain the whole CNN and the classifier.
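As a minimal sketch of OPTION 1 (using a plain image classifier for brevity rather than a full detection pipeline; MobileNetV2, the input size, and the two classes are placeholders):
import tensorflow as tf

# Pre-trained backbone with its original classifier removed
backbone = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3),
                                             include_top=False,
                                             weights="imagenet",
                                             pooling="avg")
backbone.trainable = False  # OPTION 1: freeze all CNN layers

# New classifier sized for the classes you actually need (e.g. person / car)
num_classes = 2
model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, epochs=...)  # only the new Dense layer is updated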
In general, the TensorFlow Object Detection API is a good start for beginners! For how to proceed with your problem, you can find more detail about the whole process here and extra explanation here.
I am trying to solve a time series prediction problem. I tried ANNs and LSTMs and played around a lot with the various parameters, but all I could get was 8% better than the persistence forecast.
So I was wondering: since you can save models in Keras, are there any pre-trained models (LSTM, RNN, or any other ANN) for time series prediction? If so, how do I get them? Are there any in Keras?
I mean, it would be super useful if there were a website containing pre-trained models, so that people wouldn't have to spend so much time training them.
Similarly, another question:
Is it possible to do the following?
1. Suppose I have a dataset now and I use it to train my model. Suppose that in a month I will have access to another dataset (corresponding to the same or similar data, possibly from the future, but not exclusively). Will it be possible to continue training the model then? It is not the same as training in batches: when you train in batches, you have all the data at once.
Is it possible? And how?
I'll answer your last questions first.
Will it be possible to continue training the model then? It is not the same as training in batches: when you train in batches, you have all the data at once. Is it possible? And how?
Yes, it is possible. In general, it's called transfer learning. But keep in mind that if two datasets represent very different populations, the network will soon "forget" what it learned on the first run and will optimize to the second one. To do this, you simply start training from a loaded state instead of random initialization and save the model afterwards. It is also recommended to use a smaller learning rate on the second run in order to adapt it gradually to the new data.
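A minimal sketch of that workflow in Keras (the toy model, the random data, the file names, and the learning rates are all placeholders):
import numpy as np
import tensorflow as tf

# Toy model and data, just to make the workflow concrete
model = tf.keras.Sequential([tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
                             tf.keras.layers.Dense(1)])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mse")

# First run: train on the data available today and save the state
model.fit(np.random.rand(100, 4), np.random.rand(100, 1), epochs=5, verbose=0)
model.save("model_v1.h5")

# Later: load the saved state instead of starting from random initialization
model = tf.keras.models.load_model("model_v1.h5")

# Smaller learning rate for the second run, so the model adapts gradually to the new data
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4), loss="mse")
model.fit(np.random.rand(100, 4), np.random.rand(100, 1), epochs=5, verbose=0)
model.save("model_v2.h5")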
are there any pre-trained models (LSTM, RNN, or any other ANN) for time series prediction? If so, how do I get them? Are there any in Keras?
I haven't found exactly a pre-trained model, but a quick search gave me several active GitHub projects that you can just run and get a result for yourself: Time Series Prediction with Machine Learning (LSTM, GRU implementation in tensorflow), LSTM Neural Network for Time Series Prediction (keras and tensorflow), Time series predictions with Keras (keras and theano), Neural-Network-with-Financial-Time-Series-Data (keras and tensorflow). See also this post.
Nowadays you can also use BERT or related variants; you can find all the pre-trained models here: https://huggingface.co/transformers/pretrained_models.html
It is also possible to pre-train and fine-tune an RNN; see this paper: TimeNet: Pre-trained deep recurrent neural network for time series classification.