I followed the following tutorial by Gilbert Tanner
https://gilberttanner.com/blog/tensorflow-object-detection-with-tensorflow-2-creating-a-custom-model/
to make a custom trained Tensorflow object detection model. There were just a few incompatibilities I had to fix, but overall everything worked out. But now, when I am trying to test the trained model with the testing script mentioned in the tutorial (I copied it to my local machine as a script because I don't like working with the Notebook. I just hope this is not causing the problem), it just doesn't work. There are no errors and no warnings, it's just that it doesn't detect anything on any picture.
The model was supposed to identify members of my family on old pictures, it was trained with roughly 300 pictures, they on average contained 5 members of my family. In total, 17 different family members were labeled. The model trained until the learning rate dropped to 0, which took about 60 hours. I had to reduce the batch_size argument in the config file of the model to 1, because it kept overflowing my memory otherwise.
I suppose, the given information is not enough to solve the problem. However, this is the first time for me working with custom trained models, so I don't really know what Information is needed. So feel free to just ask for additional outputs or other information.
Thanks in advance to anyone who can help me.
Related
I've trained an Image Classification model via Google Cloud Platform's Vertex AI framework and liked the results. Due to that I then proceeded to export it in Tensorflow SavedModel format (shows up as 'Container' export) for custom prediction because I don't like neither the slowness of Vertex's batch prediction nor the high cost of using a Vertex endpoint.
In my python code I used
model = tensorflow.saved_model.load(model_path)
infer = model.signatures["serving_default"]
When I tried to inspect what infer requires I saw that its input is two parameters: image_bytes and key. Both are string-type tensors.
This question can be broken off into several sub-questions that then make a whole:
Isn't inference done on multiple data instances? If so, why is it image_bytes and not images_bytes?
Is image_bytes just the output of open("img.jpg", "rb").read()? If so, don't I have to resize it first? To what size? How do I check that?
What is key? I have absolutely no clue or guess regarding this one's meaning.
The documentation for GCP is paid only and so I have decided to ask for help here. I tried to search for an answer on google for multiple days but found no relevant article.
Thank you for reading and your help would be greatly appreciated and maybe even useful to future readers.
I've been looking to train my own ELMo model for the past week and came across these two implementations allenai/bilm-tf & allenai/allennlp. I've been facing a few roadblocks for a few techniques I've tried and would like to clarify my findings, so that I can get a clearer direction.
As my project revolves around healthcare, I would like to train the embeddings from scratch for better results. The dataset I am working on is MIMIC-III and the entire dataset is stored in one .csv, unlike 1 Billion Word Language Model Benchmark (data used in tutorials) where files are stored in separate .txt files.
I was following this "Using ELMo as a PyTorch Module to train a new model" tutorial but I figured out that one of the requirements is a .hdf5 weights_file.
(Question) Does this mean that I will have to train a bilm model first to get .hdf5 weights to input? Can I train an ELMo model from scratch using allennlp.modules.elmo.Elmo? Is there any other way where I can train a model this way with an empty .hdf5 as I was able to run this successfully with tutorial data.
(Question) What will be the best method for me to train my embeddings? (PS: some methods I've tried are documented below). In my case where I will probably need a custom DatasetReader, rather than converting the csv to txt files, wasting memory.
Here, let me go into the details of other methods I have tried so far. Serves as a backstory to the main question of what may be the best technique. Please let me know if you know of any other methods to train my own ELMo model, or if one of the following methods are preferred over the others.
I've tried training a model using the allennlp train ... command by following this tutorial. However, I was unable to run with tutorial data due to the following error which I am still unable to solve.
allennlp.common.checks.ConfigurationError: Experiment specified GPU device 1 but there are only 1 devices available.
Secondly, this is a technique that I found but have not tried. Similar to the technique above it uses the allennlp train ... command but instead I use allenai/allennlp-template-config-files as a template and modify the Model and DatasetReader.
Lastly, I tried using the TensorFlow implementation allenai/bilm-tf following tutorials like this. However, I would like to avoid this method as TF1 is quite outdated. Besides receiving tons of warnings, I faced an error for CUDA as well.
2021-09-14 17:31:36.222624: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 18.45M (19346432 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
I'm new to this topic, so forgive me my lack of knowledge. There is a very good model called inception resnet v2 that basically works like this, the input is an image and outputs a list of predictions with their positions and bounded rectangles. I find this very useful, and I thought of using the already worked model in order to recognize things that it now can't (for example if a human is wearing a mask or not). Yes, I wanted to add a new recognition class to the model.
import tensorflow as tf
import tensorflow_hub as hub
mod = hub.load("https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1")
mod is an object of type
tensorflow.python.training.tracking.tracking.AutoTrackable, reading the documentation (that was only available on the source code was a bit hard to understand without context)
and I tried to inspect some of it's properties in order to see if I could figure it out by myself.
And well, I didn't. How can I see the network, the layers, the weights? the fit methods, Is it's all abstracted away?. Can I convert it to keras? I want to experiment with it, see if I can modify it, and see if I could export the model to another representation, for example pytorch.
I wanted to do this because I thought it'd be better to modify an already working model instead of creating one from scratch. Also because I'm not good at training models myself.
I've run into this issue too. Tensorflow hub guide says:
This error frequently arises when loading models in TF1 Hub format with the hub.load() API in TF2. Adding the correct signature should fix this problem.
mod = hub.load(handle).signatures['default']
As an example, you can see this notebook.
You can dir the loaded model asset to see what's defined on it
m = hub.load(handle)
dir(model)
As mentioned in the other answer, you can also look at the signatures with print(m.signatures)
Hub models are SavedModel assets and do not have a keras .fit method on them. If you want to train the model from scratch, you'll need to go to the source code.
Some models have more extensive exported interfaces including access to individual layers, but this model does not.
I want to use tensorflow for detecting cars in an embedded system, so I tried ssd_mobilenet_v2 and it actually did pretty well for me, except for some specific car types which are not very common and I think that is why the model does not recognize them. I have a dataset of these cases and I want to improve the model by fine-tuning it. I should also note that I need a .tflite file because I'm using tflite_runtime in python.
I followed these instructions https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10 and I could train the model and reached a reasonable loss value. I then used export_tflite_ssd_graph.py in the object detection API to build inference_graph from the trained model. Afterwards I used toco tool to build a .tflite file out of it.
But here is the problem, after I've done all that; not only the model did not improve, but now it does not detect any cars. I got confused and do not know what is the problem, I searched a lot and did not find any tutorial about doing what I need to do. They just added a new object to a model and then exported it, which I tried and I was successful doing that. I also tried to build a .tflite file without training the model and directly from the Tensorflow detection model zoo and it worked fine. So I think the problem has something to do with the training process. Maybe I am missing something there.
Another thing that I did not find in documents is that whether is it possible to "add" a class to the current classes of an object detection model. For example, let's assume the mobilenet ssd v2 detects 90 different object classes, I would like to add another class so that the model detects 91 different classes instead of 90 classes. As far as I understand and tested after doing transfer learning using object detection API, I could only detect the objects that I had in my dataset and the old classes will be gone. So how do I do what I explained?
I found out that there is no way to 'add' a class to the previously trained classes but with providing a little amount of data of that class you can have your model detect it. The reason is that the last layer of the model changes when transfer learning is applied. In my case I labeled around 3k frames containing about 12k objects because my frames would be complicated. But for simpler tasks as I saw in tutorials 200-300 annotated images would be enough.
And for the part that the model did not detect anything it has something to do with the convert command that I used. I should have used tflite_convert instead of toco. I explained more here.
I am trying to improve mobilenet_v2's detection of boats with about 400 images I have annotated myself, but keep on getting an underfitted model when I freeze the graphs, (detections are random does not actually seem to be detecting rather just randomly placing an inference). I performed 20,000 steps and had a loss of 2.3.
I was wondering how TF knows that what I am training it on with my custom label map
ID:1
Name: 'boat'
Is the same as what it regards as a boat ( with an ID of 9) in the mscoco label map.
Or whether, by using an ID of 1, I am training the models' idea of what a person looks like to be a boat?
Thank you in advance for any advice.
The model works with the category labels (numbers) you give it. The string "boat" is only a translation for human convenience in reading the output.
If you have a model that has learned to identify a set of 40 images as class 9, then giving it a very similar image that you insist is class 1 will confuse it. Doing so prompts the model to elevate the importance of differences between the 9 boats and the new 1 boats. If there are no significant differences, then the change in weights will find unintended features that you don't care about.
The result is a model that is much less effective.
so I managed to figure out the issue.
We created the annotation tool from scratch and the issue that was causing underfitting whenever we trained regardless of the number of steps or various fixes I tried to implement was that When creating bounding boxes there was no check to identify whether the xmin and ymin coordinates were less than the xmax and ymax I did not realize this would be such a large issue but after creating a very simple check to ensure the coordinates are correct training ran smoothly.