Tensorflow: Combining Loss Functions in LSTM Model for Domain Adaptation

Tensorflow: Combining Loss Functions in LSTM Model for Domain Adaptation - python

Can any one please help me out?
I am working on my thesis work. Its about Predicting Parkinson disease, Since i want to build an LSTM model to adapt independent of patients. Currently i have implemented it using TensorFlow with my own loss function.
Since i am planning to introduce both labeled train and unlabeled train data in every batch of data to train the model. I want to apply my own loss function on this both labeled and unlabeled train data and also want to apply cross entropy loss only on labeled train data. Can i do this in tensorflow?
So my question is, Can i have combination of loss functions in a single model training on different set of train data?

From an implementation perspective, the short answer would be yes. However, I believe your question could be more specific, maybe what you mean is whether you could do it with tf.estimator?

Related

Validation loss way greater than 1

I'm using Keras to classify EEG signals, which are processed with different feature extraction methods. I constantly run into an issue, where the Validation loss is way greater than 1 regardless of the method used. I am using Sparse Categorical Crossentropy as a loss function, and the ANN is a traditional resolution neural network with 50 layers (ResNet50). The validation set is made by choosing random samples from the training set. Hope someone has an idea and I really hope it is caused by a silly mistake.

Train and validation data structure

What will happen if I use the same training data and validation data for my machine learning classifier?

If the train data and the validation data are the same, the trained classifier will have a high accuracy, because it has already seen the data. That is why we use train-test splits. We take 60-70% of the training data to train the classifier, and then run the classifier against 30-40% of the data, the validation data which the classifier has not seen yet. This helps measure the accuracy of the classifier and its behavior, such as over fitting or under fitting, against a real test set with no labels.

We create multiple models and then use the validation to see which model performed the best. We also use the validation data to reduce the complexity of our model to the correct level. If you use train data as your validation data, you will achieve incredibly high levels of success (your misclassification rate or average square error will be tiny), but when you apply the model to real data that isn't from your train data, your model will do very poorly. This is called OVERFITTING to the train data.

Basically nothing happens. You are just trying to validate your model's performance on the same data it was trained on, which practically doesn't yield anything different or useful. It is like teaching someone to recognize an apple and asking them to recognize just the same apple and see how they performed.
Why a validation set is used then? To answer this in short, the train and validation sets are assumed to be generated from the same distribution and thus the model trained on training set should perform almost equally well on the examples from validation set that it has not seen before.

Generally, we divide the data to validation and training to prevent overfitting. To explain it, we can think a model that classifies that it is human or not and you have dataset contains 1000 human images. If you train your model with all your images in that dataset , and again validate it with again same data set your accuracy will be 99%. However, when you put another image from different dataset to be classified by the your model ,your accuracy will be much more lower than the first. Therefore, generalization of the model for this example is a training a model looking for a stickman to define basically it is human or not instead of looking for specific handsome blonde man. Therefore, we divide dataset into validation and training to generalize the model and prevent overfitting.

TLDR;
If you use the same dataset for training and validation then:
training_accuracy = testing_accuracy
Your testing_accuracy will be the same as training_accuracy if you use the training dataset as the validation dataset. Therefore you will NOT be able to tell if your model has underfit or not.
Let's talk about datasets and evaluation metrics. Here is some terminology (reference) -
Datasets:
Training dataset: The data used to fit the model.
Validation dataset: the data used to validate the generalization ability of the model or for early stopping, during the training process. In most cases, this is the same as the test dataset
Evaluations:
Training accuracy: The accuracy you achieve when comparing predictions and actuals from the training data itself.
Testing accuracy: The accuracy you achieve when comparing predictions and actuals from the testing/validation data.
With the training_accuracy, you can get a sense of how well a model fits your data and the testing_accuracy tells you how well that model is generalizable. If train_accuracy is low, then your model has underfitted and you may need a better model (better features, different architecture, etc) for modeling the given problem. If training_accuracy is high but testing_accuracy is low, this means your model fits the data well, but it's not generalizable on unseen data. This is overfitting.
Note: In practice, it is better to have a overfit model and regularize it heavily rather than work with an underfit model.
Another important thing you need to understand that training a model (fit) and inference from a model (predict / score) are 2 separate tasks. Therefore, when you use the validation dataset as the training dataset, you are basically still training the model on the same training dataset but while inference, you are using the training dataset which will give you the same accuracy as the training_accuracy.
You will therefore not come to know if at all you overfit BUT that doesn't mean you will get 99% accuracy like the other answer to suggest! You may still underfit and get an extremely low model accuracy

Shall I update my training data in real-time?

I tried image classification using trained model and its working well but some images could not find perfectly in that time have to get that image and label from users so my doubt is..Is it possible to add new data into already trained model?

No, during inference time you use the weights of the trained model for predictions. Which basically means that at the time your model is deployed the capabilities of your image classifier are fixed by the weights. If you wish to improve your model, you would have to retrain your model with the new - data. However, there is another paradigm of learning called "Online Learning" where the model is continuously learning and modifying the weights. In this case your weights are not fixed and your model is continuously updating its weights with each training input. However afaik this is not usually recommended for CNNs, because the backward pass of gradients is computationally intensive and your inference will be slow because of this.

No model can predict with 100% accuracy if it does it's an ideal model. And if you want to add more data to your train model you have to retrain the model with the new data. Having more data is always a good idea. It allows the “data to tell for itself,” instead of relying on assumptions and weak correlations. Presence of more data results in better and accurate models. So if you want to get better accuracy you have to train your model with more data. Without retraining, you can't add data to your trained model.

excluding bias for prediction in NN

I have a continuous dataset, for which I built a neural network. Before running the NN, I have scaled the training data and added bias after each layer. I am pretty happy with training result, however, I am facing a problem, when trying to build a predict method.
As I have said, the training data was scaled. The problem is, if I run un-scaled testing data with training weights, whole model will act differently and testing score will of course produce different result.
My question is, how to exclude bias from prediction model? Do I need to remove every weight out manually or are there and good practises around?

Python Word2Vec: understand the trained model itself in detail

Let me make my question clearer:
I am using python gensim.models.Word2Vec to train a word embedding model. Based on my understanding, the model training is in essence a machine learning issue---to train a neural network via a prediction task. For example, if I select parameters to train a skip-gram model, then the model is trained by predicting context words from target word. Once the model is well-trained, word vectors are just obtained from the model.
If my understanding is correct, so since in fact it is a machine learning process and the training goal is to perform well in the prediction task, there should be a loss function during training and the model is supposed to make the loss as low as possible. So, how to know the model loss value for a given set of parameters? Or is there any other metrics that we can know to understand the model itself?
Hope I have made my question clear. In a word, I don't want to evaluate the model by its outputs as in the Google test set http://word2vec.googlecode.com/svn/trunk/questions-words.txt, but I want to understand the model itself as a simple machine learning problem during its training process. Would this be possible?

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.