I have trained a spacy textcat model but then I realized that there were some incorrect training data: data from one category happened to be labeled with another category. My question is: is it possible to remove these training examples from the model without retraining it? Something like nlp.update() but in reverse? Would appreciate any help!
You mean to revert specific cases? As far as I know, that's not currently possible in spaCy.
I would suggest to either retrain from scratch with the corrected annotations, or continue training with the updated annotations. If you continue training, make sure that you keep feeding a representative set to your model, so that it doesn't "forget" cases it was already predicting correctly before.
Related
I am trying to train a model with the following data
The image is the input and 14 features must be predicted.
Could I please know how to go about training such a model?
Thank you.
These are not really features as far as I am concerned. These are classes and if I got it correctly, your images sometimes belong to more than one classes.
This is a very broad question but I think here might be a good start to learn more about multi-label image classification.
Note that your model should not be much different than an image classification model that is used for cifar10 challenge, for example. But you need to structure your data and choose your loss function accordingly.
I have the code to classify the images as Nude or Non nude. It is implemented on deep learning with tensor flow python. The code can be founded in Tensorflow Implementation of Yahoo's Open NSFW Model
I want to add some more images in to the dataset on order to do fine tuning. How can I do fine tuning in this implementation by using another dataset.
Just load their model and initialize its weights with the ones they provide, similar to how they do it here. Assuming that you are familiar with tensorflow, you should then proceed to train that model on your images.
Besides this blog post I'm not aware of any other publications the team has made on their work. This is a bit of an issue as they don't state their training parameters (choice of optimizer, learning rate, etc.). If you want to fine-tune this model you will have to experiment a bit in this regard.
Do they give you the original dataset that the provided model is trained off of? If so, you can easily just add your own dataset to their dataset, and train a completely new model based on the combined dataset.
I wrote more about this "combined" dataset, where you can add more or less data, here.
Good Luck!
I'm new to keras seq2seq LSTM models. I have a working machine translation model and English-to-Arabic training data. I just trained the model using google colab tool and made some predictions. As you can see in the image, when I test the model on a text from the training data, it predicts well, but when I change ONE word, the prediction goes completely wrong!
I want my model to UNDERSTAND the full meaning of the text even when adding/deleting one word. How can I solve this problem?
LSTM wrong predictions when adding/deleting one word
In the image, the first test of each section is the text from the training data, which predicts well. The second test is the same but with adding/deleting one word.
UPDATE: Whenever I add validation split, the val_loss is always increasing and the model isn't learning too much! What's going worng?
This is the classical overtraining problem. Your model only learn to translate your training data by remembering each sample instead of understanding the concept behind it.
For this reason always split your training data in training data and validation data. The validation data must not be in the training data set! This way you can check if your model is actually learning something.
There are two main solution for this:
Like m33n said more training data (there is no data like more data)
Implement more regularization techniques like Dropout
Also the problem seems very ambigous. Translating sentences is not an easy task at all and copanies like google or deepl created very complex models trained with lots and lots of data occupied over years. Are you sure you have the necessary resources to accomplish this?
I'm learning about machine learning, and I've often come across people separating their data into a 'training set' and a 'validation set.' I could never figure out why people never just used all of the data for training and then just used it again for validation. Is there a reason for this that I'm missing?
Think of it like this, you are going to take an exam and you are practicing hard with your practice materials. You don't know what you are going to be asked in the exam right ?
On the other hand if you practice with the exam itself, when you take the exam you will know all the answers, so you don't even have to bother studying.
That's the case for your model, if you train your model on both the train set and test set, your model will know all the answers beforehand. You need to give him something he does not know so that he can deduce some answers to you.
Basically, you would want the model to be trained using the train dataset, in order to test if the hyper-parameter tuning is done right you would like to test it out with a portion of the dataset.
If this was done on the test data directly then chances of over-fitting is high. To avoid this, you use the validation data set and measure your model's performance against the test dataset.
I am new to tensorflow, so please pardon my ignorance.
I have a tensorflow demo model "from an online tutorial" that should predict stockmarket prices for S&P. When I run the code I get inconsistent results everytime I run it. Training data does not change, I suppressed block shuffling , ...
But, When I run the prediction 2 times in the same run I get consistent results "i.e. use Only one training , run prediction twice".
My questions are:
Why am I getting inconsistent results?
If you are going to release such code to production , would you
just take the last time you ran this model training results? if not, then what would you do?
Does it make sense to force the model to produce consistent predictions? how would
you do that?
Here is my code location github repo
In training a neural network there is more randomness involved than just the batch shuffling. The initial weights of the layers are also randomly initialized.
Typically you would use the best model you have trained so far. To determine which model is the best you usually use some test dataset you did not use during training.
It is probably not a good sign if your performance fluctuates for different training runs. This means your result depends a lot on the random initialization. But I personally don't know about any general techniques to make learning more stable. But there probably are some.