Running Tensorflow Predictions code twice does *not* result same outcome

Running Tensorflow Predictions code twice does *not* result same outcome - python

I am new to tensorflow, so please pardon my ignorance.
I have a tensorflow demo model "from an online tutorial" that should predict stockmarket prices for S&P. When I run the code I get inconsistent results everytime I run it. Training data does not change, I suppressed block shuffling , ...
But, When I run the prediction 2 times in the same run I get consistent results "i.e. use Only one training , run prediction twice".
My questions are:
Why am I getting inconsistent results?
If you are going to release such code to production , would you
just take the last time you ran this model training results? if not, then what would you do?
Does it make sense to force the model to produce consistent predictions? how would
you do that?
Here is my code location github repo

In training a neural network there is more randomness involved than just the batch shuffling. The initial weights of the layers are also randomly initialized.
Typically you would use the best model you have trained so far. To determine which model is the best you usually use some test dataset you did not use during training.
It is probably not a good sign if your performance fluctuates for different training runs. This means your result depends a lot on the random initialization. But I personally don't know about any general techniques to make learning more stable. But there probably are some.

Related

In tensorflow why for a same dropout value of 0.8 when run with adam optimiser with 50epochs give different accuracy each time i run it?

I am building ANN as below:-
model=Sequential()
model.add(Flatten(input_shape=(25,)))
model.add(Dense(25,activation='relu'))
model.add(Dropout(0.8))
model.add(Dense(16,activation='relu'))
model.add(Dropout(0.8))
model.add(Dense(5,activation='relu'))
model.add(Dense(1,activation='sigmoid'))
model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
model.fit(xtraindata,ytraindata,epochs=50)
test_loss,test_acc=model.evaluate(xtestdata,ytestdata)
print(test_acc)
I am adding different features into the model and checking whether the newly added feature decreases or increases the accuracy but the problem is that each time I run this code with the same values I get different accuracy, sometimes it gets as low as 0.50 and so, I have few doubts and kindly answer them:-
Is the model giving different accuracy each time because in dropout reg. there are random dropouts in nodes and each time I run diff. nodes get silenced so thereby giving different accuracies i.e sometimes low and sometimes high?
How can I trust the accuracy of the model if each time it gives different accuracies? How can I know that the feature I have added has resulted in a decrement or increment of the accuracy?
If I get high accuracy and wanted to reproduce these results how do I save the parameters that the model has used?

Great questions. Answers:
I think your theory is right; it's the dropout. That's the only layer with an element of randomness each run, so it's likely the culprit. Try removing that layer, leaving everything else fixed, and run multiple times. Check if the accuracy is the same.
Cross validation. This article explains how it works, but the gist is that it is a statistical technique that trains and checks the accuracy of multiple runs of your model, all with different slices of data. The average accuracy of all runs is used. So highs and lows will be averaged to a true(ish) accuracy. That being said, if your model has inconsistent results by just varying dropout, it's an indicator that when you move the model to production and use real data, it will perform poorly.
Keras api has a method model.save("model_name") to save models. You can use keras.models.load_models("model_name") to get it back. As I said in point 2 though; if your model is so finicky that some trainings drastically affect accuracy, then even if you train and get good accuracy, it probably won't be useful on new data. So when you say "If I get high accuracy and wanted to reproduce these results", really you shouldn't be thinking along these lines. Instead, try to get consistently high training accuracy.

Keras model.predict function not giving similar results as model.evaluate

I have trained an image classification model using Keras. The model after training has 95% accuracy on training data and using model.evaluate on an untouched validation data, I get ~92.8% accuracy.
But when I use model.predict function instead to get the prediction probabilities and get the predicted class with maximum probability, I get ~80% accuracy.
The complete code is available as a colab notebook on the following link - https://colab.research.google.com/drive/1RQ2KnT2sVsdCAWfpsDj_kcMZiqiwJrpc?usp=sharing
You should be able to run everything and see the difference in accuracy. The problem lies in the code blocks as shown below

To make both the accuracies from predict_generator and evaluate_generator same, you have to set the following 3 things in your functions as parameters:
shuffle = False
pickle_safe = True
workers = 1
Your program might be running on different threads and these settings make it run on the main thread.

The solution I could find so far after having posted the issue here and keras official github (without any answer for weeks) is that instead of using Keras, I used tf.keras. Most of the implementation stayed the same. And the "Shuffle" option is definitely messing up the accuracy. The lower accuracy with "Shuffle = False" is a bug in the keras implementation probably. The tf.keras implementation gives the same result in the "evaluate_generator" function. And the predict and evaluate function outputs with respect to accuracy match. I hope if other people encounter this error, they don't waste as much time as I did on the issue.

Tensorflow trained model speed

I'm a Tensorflow newby and I'm trying to train a 1 class model for object detection. In particular I'm trying to recognize an arrow like the following:
I need a very fast recognition so I started wondering if a pre-trained model can contain such kind of shape.
Unfortunately didn't find anything similar and therefor I started with my own training of the arrow using as model the faster_rcnn_inception_v2_coco_2018_01_28.
I'm using his pipeline config, and I'm using his fine_tune_checkpoint as well, is this right considering that I have to train a completely different object?
The result is a training with a very good accuracy but very low speed. I need to increase the framerate and I didn't understand yet if the less is the "training loss" the more is the "object recognition speed", or not.
Any suggestion on how could I speedup the detection?

I'm using his pipeline config, and I'm using his fine_tune_checkpoint
as well, is this right considering that I have to train a completely
different object?
Yes! Every time you want to change the output of a deep NN, you should take a pretrained model. Training a model from scratch can take several weeks and you will never be able to generate enough data on your own. Taking a pretrained model and fine-tuning it is a way to go.
I didn't understand yet if the
less is the "training loss" the more is the "object recognition
speed", or not.
No. Training loss just tells you how good your model performs with respect to the training set.
The issue you are having is a classic speed vs. accuracy trade-off. I encourage you to take a look at this table and find a model which is fast enough for you (i.e. lowest run-time) but have decent accuracy. I would first check SSD here.

The result is a training with a very good accuracy but very low speed.
How much FPS does your algorithm perform?
Since you already have prepared dataset, I would suggest using Tiny-Yolo which performs 244 FPS on COCO dataset https://pjreddie.com/darknet/yolo/
Preparing training dataset for Tiny-Yolo is very easy if you use this repository
And
I didn't understand yet if the less is the "training loss" the more is the "object recognition speed"
Training lost has nothing to do with speed.

Why does more epochs make my model worse?

Most of my code is based on this article and the issue I'm asking about is evident there, but also in my own testing. It is a sequential model with LSTM layers.
Here is a plotted prediction over real data from a model that was trained with around 20 small data sets for one epoch.
Here is another plot but this time with a model trained on more data for 10 epochs.
What causes this and how can I fix it? Also that first link I sent shows the same result at the bottom - 1 epoch does great and 3500 epochs is terrible.
Furthermore, when I run a training session for the higher data count but with only 1 epoch, I get identical results to the second plot.
What could be causing this issue?

A few questions:
Is this graph for training data or validation data?
Do you consider it better because:
The graph seems cool?
You actually have a better "loss" value?
If so, was it training loss?
Or validation loss?
Cool graph
The early graph seems interesting, indeed, but take a close look at it:
I clearly see huge predicted valleys where the expected data should be a peak
Is this really better? It sounds like a random wave that is completely out of phase, meaning that a straight line would indeed represent a better loss than this.
Take a look a the "training loss", this is what can surely tell you if your model is better or not.
If this is the case and your model isn't reaching the desired output, then you should probably make a more capable model (more layers, more units, a different method, etc.). But be aware that many datasets are simply too random to be learned, no matter how good the model.
Overfitting - Training loss gets better, but validation loss gets worse
In case you actually have a better training loss. Ok, so your model is indeed getting better.
Are you plotting training data? - Then this straight line is actually better than a wave out of phase
Are you plotting validation data?
What is happening with the validation loss? Better or worse?
If your "validation" loss is getting worse, your model is overfitting. It's memorizing the training data instead of learning generally. You need a less capable model, or a lot of "dropout".
Often, there is an optimal point where the validation loss stops going down, while the training loss keeps going down. This is the point to stop training if you're overfitting. Read about the EarlyStopping callback in keras documentation.
Bad learning rate - Training loss is going up indefinitely
If your training loss is going up, then you've got a real problem there, either a bug, a badly prepared calculation somewhere if you're using custom layers, or simply a learning rate that is too big.
Reduce the learning rate (divide it by 10, or 100), create and compile a "new" model and restart training.
Another problem?
Then you need to detail your question properly.

How to study the effect of each data on a deep neural network model?

I'm working on a training a neural network model using Python and Keras library.
My model test accuracy is very low (60.0%) and I tried a lot to rise it, but I couldn't. I'm using DEAP dataset (total 32 participants) to train the model. The splitting technique that I'm using is a fixed one. It was as the followings:28 participants for training, 2 for validation and 2 for testing.
For the model I'm using is as follows.
sequential model
Optimizer = Adam
With L2_regularizer, Gaussian noise, dropout, and Batch normalization
Number of hidden layers = 3
Activation = relu
Compile loss = categorical_crossentropy
initializer = he_normal
Now, I'm using train-test technique (fixed one also) to split the data and I got better results. However, I figured out that some of the participants are affecting the training accuracy in a negative way. Thus, I want to know if there is a way to study the effect of the each data (participant) on the accuracy (performance) of a model?
Best Regards,

From my Starting deep learning hands-on: image classification on CIFAR-10 tutorial, in which I insist on keeping track of both:
global metrics (log-loss, accuracy),
examples (correctly and incorrectly classifies cases).
The later may help us telling which kinds of patterns are problematic, and on numerous occasions helped me with changing the network (or supplementing training data, if it was the case).
And example how does it work (here with Neptune, though you can do it manually in Jupyter Notebook, or using TensorBoard image channel):
And then looking at particular examples, along with the predicted probabilities:
Full disclaimer: I collaborate with deepsense.ai, the creators or Neptune - Machine Learning Lab.

This is, perhaps, more broad an answer than you may like, but I hope it'll be useful nevertheless.
Neural networks are great. I like them. But the vast majority of top-performance, hyper-tuned models are ensembles; use a combination of stats-on-crack techniques, neural networks among them. One of the main reasons for this is that some techniques handle some situations better. In your case, you've run into a situation for which I'd recommend exploring alternative techniques.
In the case of outliers, rigorous value analyses are the first line of defense. You might also consider using principle component analysis or linear discriminant analysis. You could also try to chase them out with density estimation or nearest neighbors. There are many other techniques for handling outliers, and hopefully you'll find the tools I've pointed to easily implemented (with help from their docs); sklearn tends to readily accept data prepared for Keras.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.