How does TF know what object you are fine-tuning for - python

I am trying to improve mobilenet_v2's detection of boats with about 400 images I have annotated myself, but I keep getting an underfitted model when I freeze the graphs (detections are random; it does not actually seem to be detecting anything, rather just placing an inference at random). I trained for 20,000 steps and ended with a loss of 2.3.
I was wondering how TF knows that what I am training it on with my custom label map,
ID: 1
Name: 'boat'
is the same as what it regards as a boat (with an ID of 9) in the mscoco label map.
Or whether, by using an ID of 1, I am instead training the model's idea of what a person looks like to be a boat.
Thank you in advance for any advice.

The model works with the category labels (numbers) you give it. The string "boat" is only a translation for human convenience in reading the output.
If you have a model that has learned to identify a set of 40 images as class 9, then giving it a very similar image that you insist is class 1 will confuse it. Doing so prompts the model to elevate the importance of whatever differences it can find between the class-9 boats and the new class-1 boats. If there are no significant differences, the change in weights will latch onto unintended features that you don't care about.
The result is a model that is much less effective.
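To make the point concrete, here is a minimal illustration in Python; the dictionaries are hand-written stand-ins for the real label map files:

# The network only ever sees integer class IDs; the label map is just a
# human-readable lookup table applied to its output.
mscoco_labels = {1: "person", 9: "boat"}  # excerpt from the COCO label map
custom_labels = {1: "boat"}               # your single-class label map

# During fine-tuning the classification head is retrained from your
# annotations, so class 1 in your dataset has no built-in relationship to
# class 1 ("person") in COCO; it becomes whatever your boxes say it is.
predicted_class = 1                       # raw model output: an integer
print(custom_labels[predicted_class])     # "boat", a translation for humans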

So I managed to figure out the issue.
We created the annotation tool from scratch, and the problem that was causing underfitting whenever we trained, regardless of the number of steps or the various fixes I tried, was that when creating bounding boxes there was no check that the xmin and ymin coordinates were actually less than xmax and ymax. I did not realize this would be such a large issue, but after adding a very simple check to ensure the coordinates are correct, training ran smoothly.
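A minimal sketch of such a check; the function names and the image-bounds test are my additions:

# Reject boxes whose corners are in the wrong order or out of bounds.
def is_valid_box(xmin, ymin, xmax, ymax, width, height):
    return 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height

# Or silently repair corners recorded in the wrong order:
def normalize_box(x1, y1, x2, y2):
    return min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)

print(is_valid_box(10, 20, 5, 60, 640, 480))  # False: xmin > xmax
print(normalize_box(10, 20, 5, 60))           # (5, 20, 10, 60)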

Related

Custom trained TensorFlow EfficientDet model is not detecting anything

I followed this tutorial by Gilbert Tanner
https://gilberttanner.com/blog/tensorflow-object-detection-with-tensorflow-2-creating-a-custom-model/
to build a custom trained TensorFlow object detection model. There were just a few incompatibilities I had to fix, but overall everything worked out. But now, when I try to test the trained model with the testing script mentioned in the tutorial (I copied it to my local machine as a script because I don't like working with the notebook; I just hope this is not causing the problem), it simply doesn't work. There are no errors and no warnings; it just doesn't detect anything in any picture.
The model was supposed to identify members of my family in old pictures. It was trained with roughly 300 pictures, each containing on average 5 family members; in total, 17 different family members were labeled. The model trained until the learning rate dropped to 0, which took about 60 hours. I had to reduce the batch_size argument in the model's config file to 1, because it kept overflowing my memory otherwise.
I suppose the given information is not enough to solve the problem. However, this is the first time I am working with custom trained models, so I don't really know what information is needed. Feel free to ask for additional outputs or other information.
Thanks in advance to anyone who can help me.

Methods for increasing accuracy of a CNN for image classification

I'm currently working on an image classification task involving a large dataset of grayscale images of cartoons, which my CNN needs to classify. At the moment my model has a test accuracy of about 88%, but I know a higher accuracy is possible.
I've tried:
improving / changing the actual model / architecture
using different meta parameters
different loss functions from the pytorch libraries
a bunch of different transforms
different optimizers from torch.optim
I've also tried a bunch of the standard models included in torchvision.models and am still getting sub 90% accuracy on the test set.
Do I just need to keep trying the above to squeeze out better accuracy, or are there other avenues I can try? I would really appreciate any suggestions; the only other thing I can think of is writing a custom loss function specific to the dataset, but I'm not sure how much that would help.
From what you've described, it sounds like it might be worth spending some time on data preparation. Here is a good article on how to do that for images. Some ideas you could try are:
Resizing all your images to a fixed size
Subtracting mean pixel values, i.e. normalizing the dataset
I don't really know the context of what you're doing, but I would also consider adding additional features that may be relevant and seeing if that helps.
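A minimal sketch of the first two ideas using torchvision; the target size and the normalization statistics are illustrative assumptions (for grayscale images a single mean/std value is enough):

from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.Resize((128, 128)),                # fixed size for every image
    transforms.ToTensor(),                        # [0, 255] -> [0.0, 1.0]
    transforms.Normalize(mean=[0.5], std=[0.5]),  # subtract mean, divide by std
])

Ideally the mean and std would be computed from your training set rather than hard-coded.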

Rebuilding and training a new deep learning Python model after feature importance and feature selection, to reduce the number of features?

I'm learning deep learning concepts with Python, and I've come quite far with my project.
The purpose of this open project is to detect liver cancer so that patients can avoid a biopsy and be treated sooner than usual.
I have a dataset of 427 patients with 2687 genetic markers (columns), for each of which a methylation rate between 0 and 1 has been determined (0 = not methylated, 1 = fully methylated).
I used xgboost and got a node graph with features renamed by xgboost. (So my first problem is that I don't know which markers are really represented by the labels in this xgboost graph; apparently, with 3 tests (6 yes/no decisions in the tree, fig. a), xgboost can determine whether a patient has liver cancer or not.)
So, considering that I'm not very experienced and not a native English speaker, I'd like some advice to finish this as well as my skills allow:
2: Is there a simple way to turn the labels xgboost chose back into the real marker names, so I can test my model with only these 3? Unless I misunderstood what this graph shows?
3: I got this feature importance graph (fig. b) and, again, I'd like to find a way to build the model with only the "important" markers (features), so instead of having 2680+ columns (markers) for each patient, I would have far fewer features for the same accuracy. (My model is currently 99.5% accurate.)
fig. a: node decision tree produced by xgboost
fig. b (feature importance), linked because you need to zoom in: https://cdn.discordapp.com/attachments/314114979332882432/579000210760531980/features_importances.png
I have my whole notebook, but I don't know how to show you the interesting parts of the code (because you would have to import the dataset, etc.).
Even the code that worked a day ago to get the shape of the feature importances (which should simply return 2687) isn't working for me anymore; I get "'Booster' object has no attribute 'feature_importances_'" when I execute the cells. I don't know why.
For reference, when I run

cv_results = xgb.cv(dtrain=data_dmatrix, params=params, nfold=100,
                    num_boost_round=100, early_stopping_rounds=10,
                    metrics="error", as_pandas=True, seed=123)
cv_results

I get 0.0346 for the train error mean, 0.00937 for the train error std, and 0.135 for the test error std.
At the moment I don't really have an error; I just don't know how to take those xgboost labels and translate them back to the corresponding features. xgboost returns nodes named something like fl1754 or f93, while the features in my dataset look like "cg000001052" (they are CpG markers, fig. c).
fig. c: dataset format, showing how CpG marker names (columns) are displayed in the dataset
Then I'll build another model with only these (supposedly) important features, to see if it's still insanely accurate with thousands fewer markers.
If you really need some parts of the code, I can provide them; at the moment I'm just lost in my searches and can't find the type of answer I want, even though the basic idea is simple.
As a newbie, I'd say I've noticed that f93 in the node graph is the second most important feature in the feature selection (I displayed it in descending order, quite a feat for me, honestly the hardest part of the project).
Now I feel close to the end; the purpose was to reduce the number of markers needed, and I feel really close with such results, but then I'm lost.
Any help is very welcome!
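A minimal sketch of the mapping being asked about, assuming the DMatrix is built from a pandas DataFrame whose columns are the CpG marker names (the random data below is a stand-in for the real dataset):

import numpy as np
import pandas as pd
import xgboost as xgb

# Stand-in for the real methylation data: 427 patients x 2687 markers.
rng = np.random.default_rng(123)
X = pd.DataFrame(rng.random((427, 2687)),
                 columns=[f"cg{i:09d}" for i in range(2687)])
y = rng.integers(0, 2, size=427)

# Building the DMatrix from a DataFrame gives it real feature names, so
# trees and importance scores show "cg..." instead of "f93".
dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"objective": "binary:logistic"}, dtrain,
                    num_boost_round=20)

# A Booster has no feature_importances_ attribute; that belongs to the
# sklearn wrapper (xgb.XGBClassifier). Use get_score() instead.
importance = booster.get_score(importance_type="gain")

# If the model was trained without feature names, "f93" is simply a
# positional index into the original columns:
print(X.columns[93])

# Rebuild a smaller model from the top-k markers by importance:
top = sorted(importance, key=importance.get, reverse=True)[:500]
dtrain_small = xgb.DMatrix(X[top], label=y)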
OK, so basically I tried to rebuild the dataset with only the selected markers.
It turned out not to work; I'm missing some technique, and I kept getting weird errors when trying to set up the new model that way.
I ended up deciding this is the end of the project.
The solution is: there is no reachable solution.
So, in conclusion: it detects cancer with 99.5% accuracy but needs all 2683 CpG markers. Too bad I couldn't reduce it to 500, as I almost succeeded in doing; it was 99% complete.
Thank you anyway for your precious knowledge and help.
Regards

Where does a machine learning algorithm store the result?

I think this is kind of "blasphemy" for someone who comes from the AI world, but since I come from the world where we program and get a result, and where there is the concept of storing something in memory, here is my question:
Machine learning works by iterations: the more iterations there are, the better our algorithm becomes. But after those iterations, is there a result stored somewhere? Because if I think as a programmer, when I re-run the program I must store previous results somewhere, or they will be overwritten; or I need to use an array, for example, to store my results.
For example, if I train my image recognition algorithm on a dataset of cat pictures, what variables do I need to add to my algorithm so that, when I use it on an image library, it succeeds every time it finds a cat? But what will I use, since nothing is saved for my next step?
All the videos and tutorials I have seen only draw a graph visually for decision making, without producing something to use in a future program.
For example, in this example, kNN is used to teach how to detect a handwritten digit, but where is the explicit value to use?
https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/2_BasicModels/nearest_neighbor.py
NB: to people voting to close or downvoting, please at least give a reason.
the more iterations there are, the better our algorithm becomes, but after those iterations, is there a result stored somewhere
What you're alluding to here is the optimization part.
However, to optimize a model, we first have to represent it.
For example, if I'm creating a very simple linear model to predict house prices from their surface area in square meters, I might go for this model:
price = a * surface + b
That's the representation.
Now that you have represented the model, you want to optimize it, i.e. find the parameters a and b that minimize the prediction error.
is there a result stored somewhere?
In the above, we say that we have learned the parameters (or weights) a and b.
That's what you keep: the weights, which come from optimization (also called training), and of course the model itself.
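A minimal sketch of what "keeping the weights" looks like in code, using scikit-learn and pickle (the numbers are made up):

import pickle
import numpy as np
from sklearn.linear_model import LinearRegression

surface = np.array([[30.0], [50.0], [80.0], [120.0]])  # m^2
price = np.array([90_000, 150_000, 240_000, 360_000])  # made-up prices

model = LinearRegression().fit(surface, price)
print(model.coef_, model.intercept_)  # the learned a and b

# Persist the learned parameters so a future program can reuse them
# without retraining:
with open("house_price_model.pkl", "wb") as f:
    pickle.dump(model, f)

with open("house_price_model.pkl", "rb") as f:
    reloaded = pickle.load(f)
print(reloaded.predict(np.array([[100.0]])))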
I think there is some confusion. Let's clear it up.
Machine Learning models usually have parameters, and these parameters are trainable. This means a training algorithm finds the "right" values of these parameters so that the model works properly for a given task.
This is the learning part. The actual parameter values are "inferred" from training data.
What you would call the result of the training process is a model. The model is represented by formulas with parameters, and these parameters must be stored. Typically, when you use an ML/DL framework (like scikit-learn or Keras), the parameters are stored alongside some information about the type of model, so it can be reconstructed at runtime.
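For instance, in recent versions of Keras the whole model (architecture plus learned weights) can be written to a single file; a minimal sketch:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")
# ... model.fit(...) would go here ...
model.save("my_model.keras")  # stores architecture + weights together

reloaded = tf.keras.models.load_model("my_model.keras")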

Can a neural network recognize a screen and replicate a finite set of actions?

I learned that neural networks can replicate any function.
Normally the neural network is fed a set of descriptors at its input neurons and then gives out a certain score at its output neuron. I want my neural network to recognize certain behaviours on a screen. Objects on the screen are already preprocessed and clearly visible, so recognition should not be a problem.
Is it possible to use the neural network to recognize a pixelated picture of the screen and make decisions on that basis? The amount of training data would be huge, of course. Is there a way to teach the ANN by online supervised learning?
Edit:
Because a commenter said the programming problem would be too general:
I would like to implement this in Python first, to see if it works. If anyone could point me to a resource where I could do this online-learning thing with Python, I would be grateful.
I would suggest
http://www.neuroforge.co.uk/index.php/getting-started-with-python-a-opencv
http://docs.opencv.org/doc/tutorials/ml/table_of_content_ml/table_of_content_ml.html
http://blog.damiles.com/2008/11/the-basic-patter-recognition-and-classification-with-opencv/
https://github.com/bytefish/machinelearning-opencv
OpenCV is basically an image processing library but also has some amazing helper classes that you can use for almost any task. Its machine learning module is pretty easy to use, and you can go through the source to see explanations and background theory for each function.
You could also use a pure python machine learning library like:
http://scikit-learn.org/stable/
But before you feed the data from your screen (I'm assuming it's in pixels?) into your ANN, SVM, or whatever ML algorithm you choose, you need to perform "feature extraction" on your data (that is, on the objects on the screen).
Feature extraction can be thought of as representing the same data from the screen with fewer numbers, so there are fewer numbers to give to the ANN. You need to experiment with different features before you find a combination that works well for your particular scenario. A sample feature vector could look something like this:
[x1,y1,x2,y2...,col]
This is basically a list of edge points that delimit the area your object is in, a sort of ROI (region of interest), on which you perform edge detection and color detection and extract any other relevant characteristics. The important thing is that all your objects and their shape/color information are now represented by a number of these lists, one per detected object.
This is the data that can be provided as input to the neural network, but you'll have to define some meaningful output parameters for your specific problem statement before you can train/test your system, of course.
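A minimal sketch of that idea with OpenCV and NumPy (OpenCV 4 API; the file name, Canny thresholds, and feature choices are illustrative assumptions):

import cv2
import numpy as np

def extract_features(image_bgr):
    """Summarize the largest object in the image as a short list of numbers."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    mean_col = image_bgr[y:y + h, x:x + w].mean(axis=(0, 1))  # average B, G, R
    # [x1, y1, x2, y2, ..., col] style vector, as described above
    return [x, y, x + w, y + h, *mean_col]

frame = cv2.imread("screenshot.png")  # hypothetical screen capture
if frame is not None:
    print(extract_features(frame))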
Hope this helps.
This is not entirely correct.
A 3-layer feedforward MLP can theoretically replicate any CONTINUOUS function.
If there are discontinuities, then you need a 4th layer.
Since you are dealing with pixelated screens and the like, you would probably need to consider a fourth layer.
Finally, if you are looking at circular shapes, etc., then a radial basis function (RBF) network may be more suitable.
