I am currently working on my master's thesis and want to extend the ResNet50 model so that it also takes tabular data as input. Does anyone have experience with a similar task? I use an iterative DataLoader, which may cause problems. More generally, I would like to ask whether it is a good idea to build a network with mixed data types (image + tabular) and whether this is the right approach. Thanks in advance!
This might be an old question, but I think this blog post answers it very well:
Markus Rosenfelder's blog
In summary, it explains how to combine a CNN (like your ResNet50) and tabular input into one model with a combined output. It uses PyTorch and PyTorch Lightning, but the tutorial is so well done that you can easily adapt the technique to whatever you are using. The tutorial covers the whole process, so your problems with the DataLoader might be addressed as well.
Hope this helps!
Related
I am comparing different explanation techniques for black-box prediction problems.
I read the paper about the rule-based explanation technique called LORE (https://arxiv.org/abs/1805.10820) and found the official repository (https://github.com/riccotti/LORE). The authors apply it to a classification problem, but I need to use it on a regression problem with a neural network I have created.
Since there is no documentation and the comments in the code are sparse, I am having difficulty understanding how to adapt the code to my case. Has anyone had the same problem and, if so, how did you solve it?
I'm interested in training a YOLOv5 model. Currently, I'm using Roboflow to annotate and export the data into YOLOv5 format. I'm also using Roboflow's Colab Notebook for YOLOv5.
However, I'm not familiar with many of the commands used in the Roboflow Colab Notebook. I found on this site that there appears to be a much more "Pythonic" way of using and manipulating the YOLOv5 model, which I would be much more comfortable with.
My questions regarding this are as follows:
1) Is there an online resource that can show me how to train YOLOv5 and extract results after importing the model from PyTorch with the "Pythonic" version (perhaps a snippet of code right here on StackOverflow would help)? The official documentation that I could find (here) also uses the "non-Pythonic" method for the model.
2) Is there any important functionality I would lose if I were to switch to this "Pythonic" method of using YOLOv5?
3) I found nothing in the documentation that suggests otherwise, but would I need to export my data from Roboflow in a different format to be able to train the "Pythonic" model?
4) Similar to question 1), is there anywhere that can guide me on how to use the trained model on test images? Do I simply do prediction=model(my_image.jpg)? What if I want predictions on multiple images at once?
Any guidance would be appreciated. Thanks!
You can use the Ultralytics GitHub repository to do what you want; if you want to understand the process, read the train.py file. There isn't a straightforward explanation, so you will have to work through it yourself.
For the training: writing the code yourself requires a lot of ML knowledge, which is why train.py exists. The same goes for test.py and export.py.
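For the inference part of the question, the YOLOv5 repository also documents a PyTorch Hub entry point, which is probably the "Pythonic" interface meant above. The sketch below assumes that entry point and an internet connection on first use. The helper names load_yolov5 and detect and the file paths in the comments are hypothetical. Training itself still goes through train.py.

```python
def load_yolov5(weights_path=None):
    """Load YOLOv5 through PyTorch Hub (fetches the repository on first use)."""
    import torch  # imported here so defining the helpers does not require torch

    if weights_path is None:
        # Small model with the pretrained COCO weights.
        return torch.hub.load("ultralytics/yolov5", "yolov5s")
    # Custom weights, for example the best.pt file written by train.py.
    return torch.hub.load("ultralytics/yolov5", "custom", path=weights_path)


def detect(model, image_paths):
    """Run one batched forward pass and return one table of boxes per image."""
    results = model(list(image_paths))  # a list of paths is processed as one batch
    return results.pandas().xyxy        # list of DataFrames, one per input image


# Example usage (hypothetical paths):
#   model = load_yolov5("runs/train/exp/weights/best.pt")
#   tables = detect(model, ["test1.jpg", "test2.jpg"])
#   print(tables[0])  # columns: xmin, ymin, xmax, ymax, confidence, class, name
```

So the model is called with a file path given as a string, or with a list of paths for several images at once.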
I have millions of images to run inference on. I know how to write my own code to create batches and forward them to a trained network using the MXNet Module API to get the predictions. However, creating the batches involves a lot of data manipulation that is not especially optimized.
Before doing any optimisation myself, I would like to know if there are recommended approaches for batch prediction/inference. More specifically, since this is a common use case, I was wondering if there is an interface/API that can do the usual image pre-processing, batch creation, and inference given a trained model (i.e. symbol file and epoch checkpoint)?
If you are using a standard pretrained model, I would highly recommend taking a look at the GluonCV project, a toolkit for computer vision based on Apache MXNet.
They have really nice implementations of state-of-the-art models, sometimes even beating the original results published in scientific papers. What is cool is that they also provide the data preprocessing code, which, as far as I understand, is what you are looking for (see the gluoncv.data.transforms.presets package).
I don't know which kind of inference you want to do (image classification, segmentation, etc.), but take a look at the list of tutorials and you will most probably find the one you need.
Other than that, optimizing for wall-clock time requires making sure that your GPU is 100% utilized. You may find it useful to watch this video to learn more tips and tricks for optimizing performance. It discusses training, but the same techniques apply to inference.
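As a framework-agnostic illustration of the batching step, here is a minimal sketch using only the Python standard library. It is not an MXNet or GluonCV API: iter_batches and predict_in_batches are hypothetical helpers, and load_fn and predict_fn stand in for your own image loading/preprocessing and model forward pass. Each batch is loaded in parallel threads before a single predict call, which is one simple way to keep data loading from starving the GPU.

```python
from concurrent.futures import ThreadPoolExecutor


def iter_batches(items, batch_size):
    """Yield consecutive lists of at most batch_size items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # last, possibly smaller, batch
        yield batch


def predict_in_batches(paths, load_fn, predict_fn, batch_size=64, workers=8):
    """Load each batch with a thread pool, then run one predict call per batch."""
    predictions = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch_paths in iter_batches(paths, batch_size):
            inputs = list(pool.map(load_fn, batch_paths))  # input order is preserved
            predictions.extend(predict_fn(inputs))
    return predictions


# Toy stand-ins: "loading" returns the string length, "predicting" doubles it.
print(predict_in_batches(["a", "bb", "ccc"], len, lambda xs: [2 * x for x in xs], batch_size=2))
# [2, 4, 6]
```

In MXNet itself, gluon.data.DataLoader with num_workers set plays the same role and also handles the batching for you.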
Long story short:
How do I prepare data for retraining the LSTM object detection implementation in the TensorFlow models GitHub repository (master branch)?
Long story:
Hi all,
I recently found an implementation of an LSTM object detection algorithm based on this paper:
http://openaccess.thecvf.com/content_cvpr_2018/papers/Liu_Mobile_Video_Object_CVPR_2018_paper.pdf
in the TensorFlow models repository on GitHub (https://github.com/tensorflow/models/tree/master/research/lstm_object_detection)
I would like to retrain this implementation on my own dataset to evaluate how much the LSTM improves on other algorithms such as SSD, but I keep struggling with how to prepare the data for training. I have tried the authors' config file, tried preparing the data as for the Object Detection API, and tried following the procedure used in inputs/seq_dataset_builder_test.py and inputs/tf_sequence_example_decoder_test.py. Sadly, the GitHub README does not provide any information. Someone else opened an issue with a similar question on the GitHub repo (https://github.com/tensorflow/models/issues/5869), but the authors have not provided a helpful answer yet. I tried to contact the authors by email a month ago but did not get a response. I have also searched the internet without finding a solution, which is why I am asking here.
Is there anybody out there who can explain how to prepare the data for retraining and how to actually run the retraining?
Thank you for reading, any help is really appreciated!
I have a model saved as a graph (.pb file). The model is currently inaccurate and I would like to improve it. I have pictures to use as additional training data, but I don't know whether this is possible or, if it is, how to do it. The result must be a .pb graph updated with the new data.
It's a good question, and it would be nice if someone could explain how to do this. What I can add is that training only on the new data would lead to "catastrophic forgetting", so it would not work out. You would have to train on all of your data again.
Still, I would also like to know how to do this, especially for SSD, just for testing purposes.
The mozilla/DeepSpeech community has contributed a way to initialize training from a frozen graph (.pb). It does not restore the optimizer parameters, so adjusting the learning rate is necessary.
You can find the code at:
https://github.com/mozilla/DeepSpeech/blob/master/DeepSpeech.py#L1562
Hope this helps!