I have finished training a model. What I want to do next is hand it over to my colleagues, who know nothing about deep learning. I just want to give them a function they can run without installing TensorFlow or Python on their machines, or maybe with just Python (ideally I would love it to run in Matlab). Is this doable? How can I abstract everything away or package it all up for them?
I read about model deployment, but it's all about servers and such, and this is not what I want.
PS: assume a TF model for now.
Basically, there's ONNX, which aims to assist in deploying trained models.
I do not have much experience with it, but I know it's not always a straightforward procedure.
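As a rough sketch of what that hand-off could look like, assuming the TF model can be loaded as a Keras model and that the tf2onnx and onnxruntime packages (not mentioned above) are acceptable: you convert the model to ONNX once on your machine, and your colleagues then only need numpy and onnxruntime, not TensorFlow. As far as I know, MATLAB's Deep Learning Toolbox can also import ONNX files.
# One-time conversion on your machine (paths and input shape below are placeholders)
import tensorflow as tf
import tf2onnx
model = tf.keras.models.load_model("my_model")
tf2onnx.convert.from_keras(model, output_path="model.onnx")

# On your colleagues' machine: only numpy and onnxruntime are needed
import numpy as np
import onnxruntime as ort
sess = ort.InferenceSession("model.onnx")
input_name = sess.get_inputs()[0].name
dummy_input = np.random.rand(1, 224, 224, 3).astype(np.float32)  # placeholder input
prediction = sess.run(None, {input_name: dummy_input})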
I've been looking to train my own ELMo model for the past week and came across these two implementations: allenai/bilm-tf & allenai/allennlp. I've hit a few roadblocks with the techniques I've tried and would like to clarify my findings so that I can get a clearer direction.
As my project revolves around healthcare, I would like to train the embeddings from scratch for better results. The dataset I am working on is MIMIC-III, and the entire dataset is stored in one .csv, unlike the 1 Billion Word Language Model Benchmark (the data used in the tutorials), where the data is stored in separate .txt files.
I was following this "Using ELMo as a PyTorch Module to train a new model" tutorial but I figured out that one of the requirements is a .hdf5 weights_file.
(Question) Does this mean that I will have to train a bilm model first to get the .hdf5 weights to pass in? Can I train an ELMo model from scratch using allennlp.modules.elmo.Elmo? Or is there some other way to train a model like this with an empty .hdf5, since I was able to run this successfully with the tutorial data?
(Question) What would be the best method for me to train my embeddings? (PS: some methods I've tried are documented below.) In my case I will probably need a custom DatasetReader, rather than converting the .csv to .txt files and wasting memory.
Here, let me go into the details of the other methods I have tried so far; they serve as backstory to the main question of what the best technique may be. Please let me know if you know of any other methods to train my own ELMo model, or if one of the following methods is preferred over the others.
I've tried training a model using the allennlp train ... command by following this tutorial. However, I was unable to run it with the tutorial data due to the following error, which I am still unable to solve.
allennlp.common.checks.ConfigurationError: Experiment specified GPU device 1 but there are only 1 devices available.
Secondly, this is a technique I found but have not tried. Similar to the technique above, it uses the allennlp train ... command, but instead I would use allenai/allennlp-template-config-files as a template and modify the Model and DatasetReader.
Lastly, I tried using the TensorFlow implementation allenai/bilm-tf, following tutorials like this one. However, I would like to avoid this method, as TF1 is quite outdated. Besides receiving tons of warnings, I ran into a CUDA error as well.
2021-09-14 17:31:36.222624: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 18.45M (19346432 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
I'm interested in training a YOLOv5 model. Currently, I'm using Roboflow to annotate and export the data into YOLOv5 format. I'm also using Roboflow's Colab Notebook for YOLOv5.
However, I'm not familiar with many of the commands used in the Roboflow Colab Notebook. I found on here that there appears to be a much more "Pythonic" way of using and manipulating the YOLOv5 model, which I would be much more familiar with.
My questions regarding this are as follows:
Is there an online resource that can show me how to train YOLOv5 and extract results after importing the model from PyTorch, using the "Pythonic" version (perhaps a snippet of code right here on StackOverflow would help)? The official documentation that I could find (here) also uses the "non-Pythonic" method for the model.
Is there any important functionality I would lose if I were to switch to this "Pythonic" method of using YOLOv5?
I found nothing in the documentation that suggests otherwise, but would I need to export my data in a different format from Roboflow for the data to be able to train the "Pythonic" model?
Similar to question 1), is there anything that can guide me on how to use the trained model on test images? Do I simply do prediction = model(my_image.jpg)? What if I want predictions on multiple images at once?
Any guidance would be appreciated. Thanks!
You can use the ultralytics GitHub repository to do what you want; if you want to understand the process, check out the train.py file. There isn't a straightforward explanation; you just have to learn it by yourself.
For training: if you want to write the code yourself, it will require a lot of ML knowledge; that's why train.py exists, and the same goes for test.py and export.py.
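For the inference side of the question, here is a minimal sketch of the "Pythonic" usage through torch.hub as exposed by the ultralytics repository; the weight file and image paths are placeholders, and training is still typically driven through the repository's train.py.
import torch

# Pretrained weights, or point to your own weights with the 'custom' entry point
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
# model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')

# Accepts a single image or a list (paths, PIL images, or numpy arrays)
results = model(['my_image1.jpg', 'my_image2.jpg'])

results.print()                          # summary of detections per image
detections = results.pandas().xyxy[0]    # detections for the first image as a DataFrame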
I'm new to this topic, so forgive my lack of knowledge. There is a very good model called Inception ResNet v2 that basically works like this: the input is an image, and the output is a list of predictions with their positions and bounding rectangles. I find this very useful, and I thought of using this already-trained model to recognize things it currently can't (for example, whether a human is wearing a mask or not). In other words, I wanted to add a new recognition class to the model.
import tensorflow as tf
import tensorflow_hub as hub
mod = hub.load("https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1")
mod is an object of type tensorflow.python.training.tracking.tracking.AutoTrackable. Reading the documentation (which was only available in the source code) was a bit hard to follow without context, so I tried to inspect some of its properties to see if I could figure it out by myself.
And well, I couldn't. How can I see the network, the layers, the weights, the fit methods? Is it all abstracted away? Can I convert it to Keras? I want to experiment with it, see if I can modify it, and see if I can export the model to another representation, for example PyTorch.
I wanted to do this because I thought it would be better to modify an already-working model instead of creating one from scratch, and also because I'm not good at training models myself.
I've run into this issue too. The TensorFlow Hub guide says:
This error frequently arises when loading models in TF1 Hub format with the hub.load() API in TF2. Adding the correct signature should fix this problem.
mod = hub.load(handle).signatures['default']
As an example, you can see this notebook.
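As a rough illustration of what calling that default signature looks like for this particular detector; the preprocessing and output keys follow TF Hub's object-detection examples and may differ for other models, and the image path is a placeholder:
import tensorflow as tf
import tensorflow_hub as hub

detector = hub.load("https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1").signatures['default']

# The signature expects a float32 image batch with values in [0, 1]
img = tf.io.decode_jpeg(tf.io.read_file("some_image.jpg"), channels=3)
img = tf.image.convert_image_dtype(img, tf.float32)[tf.newaxis, ...]

result = detector(img)
print(result['detection_class_entities'], result['detection_scores'])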
You can dir() the loaded model asset to see what's defined on it:
m = hub.load(handle)
dir(m)
As mentioned in the other answer, you can also look at the signatures with print(m.signatures)
Hub models are SavedModel assets and do not have a Keras .fit method on them. If you want to train the model from scratch, you'll need to go to the source code.
Some models have more extensive exported interfaces including access to individual layers, but this model does not.
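If you want to see exactly what the exported interface accepts and returns, you can also inspect the concrete function behind the signature (these attributes come from TensorFlow's ConcreteFunction API):
sig = m.signatures['default']
print(sig.structured_input_signature)   # inputs the exported function expects
print(sig.structured_outputs)           # outputs it produces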
I want to use TensorFlow to detect cars in an embedded system, so I tried ssd_mobilenet_v2, and it actually did pretty well for me, except for some specific car types that are not very common, which I think is why the model does not recognize them. I have a dataset of these cases and want to improve the model by fine-tuning it. I should also note that I need a .tflite file because I'm using tflite_runtime in Python.
I followed these instructions https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10 and was able to train the model and reach a reasonable loss value. I then used export_tflite_ssd_graph.py in the Object Detection API to build an inference graph from the trained model. Afterwards, I used the toco tool to build a .tflite file out of it.
But here is the problem: after doing all that, not only did the model not improve, it now does not detect any cars at all. I'm confused and do not know what the problem is. I searched a lot and did not find any tutorial about doing what I need to do; they just add a new object to a model and then export it, which I tried and succeeded at. I also tried building a .tflite file without training the model, directly from the TensorFlow detection model zoo, and it worked fine. So I think the problem has something to do with the training process; maybe I am missing something there.
Another thing I did not find in the documentation is whether it is possible to "add" a class to the current classes of an object detection model. For example, assuming mobilenet ssd v2 detects 90 different object classes, I would like to add another class so that the model detects 91 classes instead of 90. As far as I understand and have tested, after doing transfer learning with the Object Detection API I can only detect the objects that were in my dataset, and the old classes are gone. So how do I do what I described?
I found out that there is no way to 'add' a class to the previously trained classes, but by providing a small amount of data for that class you can have your model detect it. The reason is that the last layer of the model changes when transfer learning is applied. In my case I labeled around 3k frames containing about 12k objects, because my frames were complicated; but for simpler tasks, as I saw in tutorials, 200-300 annotated images would be enough.
As for the model not detecting anything, it had to do with the convert command I used: I should have used tflite_convert instead of toco. I explained more here.
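For reference, a sketch of that conversion step using the TF 1.x Python converter API (the Python equivalent of the tflite_convert command line); the tensor names and input shape below are assumptions based on the graphs that export_tflite_ssd_graph.py produces, so adjust them for your own graph:
import tensorflow as tf

# Convert the frozen graph produced by export_tflite_ssd_graph.py into a .tflite file
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file='tflite_graph.pb',
    input_arrays=['normalized_input_image_tensor'],
    output_arrays=['TFLite_Detection_PostProcess',
                   'TFLite_Detection_PostProcess:1',
                   'TFLite_Detection_PostProcess:2',
                   'TFLite_Detection_PostProcess:3'],
    input_shapes={'normalized_input_image_tensor': [1, 300, 300, 3]})
converter.allow_custom_ops = True  # the detection post-processing op is a custom TFLite op
with open('detect.tflite', 'wb') as f:
    f.write(converter.convert())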
I am playing around with TensorFlow, and today I noticed that Google has also open-sourced a Python SDK for their Dataflow.
Currently, when I need to train and evaluate several networks in parallel, I either use Luigi and run one model training after another, or I use Spark and perform each model training within the map step.
All of this data processing is just one part of the pipeline.
I am wondering whether there is, or whether there is planned, something like a TensorFlow model training step inside a Dataflow pipeline.
Is there currently some best practice around this?
Or do I have to run each model setting within the map step?
I went through the documentation, and for now it seems really vague, so I'm asking here in case someone has experience with this.
There is nothing planned at this time.
If you can run the TensorFlow training on a single machine (it sounds like this is what you were doing with Spark), then it should be possible to do the training within a DoFn of a Dataflow pipeline.
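A minimal sketch of what that could look like with the Beam Python SDK; the train_and_evaluate function and the hyperparameter settings are placeholders for your own single-machine training routine:
import apache_beam as beam

class TrainModelFn(beam.DoFn):
    def process(self, config):
        # 'config' is one model setting; train_and_evaluate is a placeholder
        # for your single-machine TensorFlow training and evaluation code.
        metrics = train_and_evaluate(config)
        yield {'config': config, 'metrics': metrics}

with beam.Pipeline() as p:
    (p
     | beam.Create([{'lr': 0.01}, {'lr': 0.001}, {'lr': 0.0001}])  # one element per model setting
     | beam.ParDo(TrainModelFn()))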