excluding bias for prediction in NN - python

I have a continuous dataset for which I built a neural network. Before running the NN, I scaled the training data and added a bias after each layer. I am pretty happy with the training result; however, I am facing a problem when trying to build a predict method.
As I said, the training data was scaled. The problem is that if I run un-scaled testing data through the trained weights, the whole model acts differently and the testing score will of course be different.
My question is: how do I exclude the bias from the prediction model? Do I need to remove every weight manually, or are there any good practices around this?
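A common fix is not to remove the bias weights at all, but to apply the same scaling to the test data that was fitted on the training data. Here is a minimal sketch, assuming scikit-learn's StandardScaler was used for the scaling and that model is the trained network (both names are assumptions, not from the original post):

# Minimal sketch: reuse the scaler fitted on the training data at prediction
# time. X_train, X_test and `model` are hypothetical names.
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit on training data only

# ... train the network on X_train_scaled ...

X_test_scaled = scaler.transform(X_test)  # same mean/std as training data
predictions = model.predict(X_test_scaled)

The key point is that transform() reuses the statistics computed by fit_transform() on the training set, so the test data enters the network in the same scale the weights were trained on.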

Related

In Keras, after you train a stateful LSTM model, do you have to re-train the model as you predict values?

I've been trying to create a stateful LSTM model with Keras, and I have pretty much figured out the training part, but I don't get the predicting part.
So, let's imagine that we had 10000 time-series datapoints. We use the first 9000 for training and the other 1000 for testing. As we train, we set the window length to 2 and slide the window forward, setting the input (X) to the first datapoint and the output (y) to the second datapoint.
And as we train, the model converges because of its stateful nature. Finally we finish training.
Now we are left with a model and some test data. The problem begins here. We test the first datapoint.
It returns a guessed value. Nice.
We test the second datapoint of the test set.
We get an output. But the problem is that because we were using a stateful model and only fed in one value as input, the only way the model can figure out the next value is from its memory of the previous time-series.
But since we didn't train on the first datapoint of the test set, the time-series is broken, and the model will treat the second datapoint of the test set as if it were the first!
So, my question is:
does Keras take care of this and automatically train the network as it predicts?
or do I have to train the net as I am predicting?
or is there some other reason that lets me just keep predicting without training the model further?
A stateful LSTM will retain information in its cells as you predict. If you were to take any random point in the train or test dataset and repeatedly predict on it, your answer would change each time, because the model keeps seeing this data and uses it every time it predicts. The only way to get a repeatable answer is to call reset_states().
You should be calling reset_states() after each training epoch, and when you save the model, those cells should be empty. Then if you want to start predicting on the test set, you can predict on the last n training points (without saving the values anywhere), then start saving values once you get to your first test point.
It is often good practice to seed the model before prediction. If I want to evaluate on test_set[10:20,:], I can let the model predict on test_set[:10,:] first to seed it, then start saving my predicted values once I get to the range I am interested in (see the sketch below).
To address the further-training question: you do not need to train the model further in order to predict. Training is only for tuning the model's weights. Look into this blog for more information on stateful vs. stateless LSTMs.
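A sketch of the seeding idea described above. The names here are assumptions: model is a trained stateful Keras LSTM built with batch size 1, and test_set is a NumPy array shaped (timesteps, features):

# Reset the cell state, warm it up on the points before the range of
# interest, then keep only the predictions for the target range.
import numpy as np

model.reset_states()  # start from an empty cell state

# Seed the state on test_set[:10]; the outputs are discarded.
for t in range(10):
    model.predict(test_set[t].reshape(1, 1, -1))  # (batch=1, steps=1, features)

# The cell state now reflects test_set[:10]; save these predictions.
preds = []
for t in range(10, 20):
    preds.append(model.predict(test_set[t].reshape(1, 1, -1)))
preds = np.concatenate(preds)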

Why am I getting 100% accuracy using feed-forward neural networks for separate training, validation, and testing datasets in Keras?

Today I was working on a classifier to detect whether or not a mushroom is poisonous given its features. The data was in a .csv file (read into a pandas DataFrame), and the link to the data can be found at the end.
I used scikit-learn's train_test_split function to split the data into training and testing sets.
I then removed the column that specified whether or not the mushroom was poisonous from the training and testing data, and assigned it to the yTrain and yTest variables.
I then applied one-hot encoding (using pd.get_dummies()) to the data, since the parameters were categorical.
After this, I normalized the training and testing input data.
Essentially, the training and testing input data were distinct lists of one-hot-encoded parameters, and the output data was a list of ones and zeroes representing the output (one meant poisonous, zero meant edible).
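A rough reconstruction of the preprocessing described above, for reference. The column name 'class' and the 'p'/'e' label values are assumptions based on the linked Kaggle dataset; the exact split order in the original post differs slightly:

# Load, encode the label, one-hot encode the categorical features, and split.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv('mushrooms.csv')
y = (df['class'] == 'p').astype(int)            # 1 = poisonous, 0 = edible
X = pd.get_dummies(df.drop(columns='class'))    # one-hot encode categoricals

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)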
I used Keras and a simple feed-forward network for this project. The network comprises three layers: a Dense layer (a Linear layer, for PyTorch users) with 300 neurons, a Dense layer with 100 neurons, and a Dense layer with two neurons, each representing the probability that the given parameters of the mushroom signify it is poisonous or edible. Adam was the optimizer I used, and sparse categorical crossentropy was my loss function.
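A sketch of this architecture as described. The layer sizes, optimizer, and loss come from the text; the hidden-layer activations are assumptions, since the post does not state them:

# Three Dense layers: 300 -> 100 -> 2, Adam, sparse categorical crossentropy.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(300, activation='relu'),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(2, activation='softmax'),  # poisonous vs. edible
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])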
I trained my network for 60 epochs. After about 5 epochs the loss was basically zero, and my accuracy was 1. After training, I was worried that my network had overfitted, so I tried it on my distinct testing data. The results were the same as for the training and validation data: the accuracy was 100% and the loss was negligible.
My validation loss at the end of 50 epochs is 2.258996e-07, and my training loss is 1.998715e-07. My testing loss was 4.732502e-09. I am really confused by this; is the loss supposed to be this low? I don't think I am overfitting, and my validation loss is only a bit higher than my training loss, so I don't think I am underfitting either.
Do any of you know the answer to this question? I am sorry if I messed up in some silly way.
Link to dataset: https://www.kaggle.com/uciml/mushroom-classification
It seems that Kaggle dataset is solvable, in the sense that you can create a model which gives the correct answer 100% of the time (if these results are to be believed). If you look at those results, you can see that the author was actually able to find models which give 100% accuracy using several methods, including decision trees.

Shall I update my training data in real-time?

I tried image classification using a trained model and it works well, but some images are not classified correctly. In those cases I have to get the image and its label from users. So my question is: is it possible to add new data to an already-trained model?
No, during inference time you use the weights of the trained model for predictions, which basically means that at the time your model is deployed, the capabilities of your image classifier are fixed by the weights. If you wish to improve your model, you would have to retrain it with the new data. However, there is another paradigm of learning called "online learning", where the model is continuously learning and modifying its weights. In this case your weights are not fixed, and your model continuously updates them with each training input. However, as far as I know, this is not usually recommended for CNNs, because the backward pass of gradients is computationally intensive and your inference will be slow because of this.
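For illustration, a minimal sketch of the online-learning idea in Keras, assuming model is the deployed classifier and new_image / new_label come from a user correction (all hypothetical names):

# One gradient step on a single user-labeled example.
import numpy as np

x = np.expand_dims(new_image, axis=0)   # make a batch of one
y = np.array([new_label])               # integer class label
model.train_on_batch(x, y)              # updates the weights in place

In practice you would usually collect such corrections and retrain offline instead, for the reasons given above.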
No model can predict with 100% accuracy; if it does, it's an ideal model. And if you want to add more data to your trained model, you have to retrain the model with the new data. Having more data is always a good idea: it allows the data to speak for itself, instead of relying on assumptions and weak correlations, and more data results in better, more accurate models. So if you want better accuracy, you have to train your model with more data. Without retraining, you can't add data to your trained model.

Tensorflow: Combining Loss Functions in LSTM Model for Domain Adaptation

Can anyone please help me out?
I am working on my thesis. It's about predicting Parkinson's disease, and I want to build an LSTM model that adapts independently of the patient. Currently I have implemented it using TensorFlow with my own loss function.
I am planning to introduce both labeled and unlabeled training data in every batch used to train the model. I want to apply my own loss function to both the labeled and unlabeled training data, and also apply a cross-entropy loss only to the labeled training data. Can I do this in TensorFlow?
So my question is: can I have a combination of loss functions in a single model, training on different sets of training data?
From an implementation perspective, the short answer is yes. However, I believe your question could be more specific; maybe what you mean is whether you could do it with tf.estimator?
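One way to sketch this in graph-mode TensorFlow, matching the question's setup: compute the cross-entropy per example, mask it so it only counts labeled examples, and add it to the custom loss. Here custom_loss, logits, labels, and is_labeled are all assumed tensors, not names from the original post:

# is_labeled: float mask over the batch, 1.0 for labeled examples, else 0.0.
import tensorflow as tf

ce = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels)
ce_labeled = tf.reduce_sum(is_labeled * ce) / tf.maximum(
    tf.reduce_sum(is_labeled), 1.0)            # mean CE over labeled examples

total_loss = custom_loss + ce_labeled          # custom loss on all data + CE on labeled
train_op = tf.train.AdamOptimizer().minimize(total_loss)

Since both terms are ordinary tensors, the optimizer treats the sum as a single loss, which is what "a combination of loss functions in a single model" amounts to.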

Convolutional Neural Network using TensorFlow

I'm building a CNN model using TensorFlow, without the use of any frontend APIs such as Keras. I'm creating a VGG-16 model using pre-trained weights, and I want to fine-tune the last layers to serve my purpose.
Following the tutorial here: http://cv-tricks.com/tensorflow-tutorial/training-convolutional-neural-network-for-image-classification/
I re-created the training script and modified it as per my requirements. However, training does not progress: the training accuracy is stuck at 50.00%, and the validation accuracy forms a repeating pattern of numbers.
Attached is the screenshot of the same.
I have been stuck on this for days now and can't seem to find the error. Any help is appreciated.
The code is pretty long, hence here is the gist file for the same.
Your cross-entropy is wrong: you are comparing your logits with the softmax of your logits.
This:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,
                                                        labels=y_pred)
Should be:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,
                                                        labels=y_true)
Some things to note: I would not train on a data point and then evaluate on that same data point; your training accuracy is probably going to be biased by doing so. Another point to note is that tf.argmax(tf.nn.softmax(logits)) is the same as tf.argmax(logits).
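The last remark follows because softmax is a monotonic transformation, so it does not change which index is largest. A one-line illustration, reusing layer_fc2 from the code above:

# Equivalent predicted classes; the softmax call is redundant under argmax.
pred_class = tf.argmax(layer_fc2, axis=1)
# pred_class = tf.argmax(tf.nn.softmax(layer_fc2), axis=1)  # same result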
