Long story short:
How do I prepare data for retraining the LSTM object detection implementation from the TensorFlow models GitHub repository (master branch)?
Long story:
Hi all,
I recently found an implementation of an LSTM object detection algorithm based on this paper:
http://openaccess.thecvf.com/content_cvpr_2018/papers/Liu_Mobile_Video_Object_CVPR_2018_paper.pdf
in the TensorFlow models GitHub repository, master branch (https://github.com/tensorflow/models/tree/master/research/lstm_object_detection).
I would like to retrain this implementation on my own dataset to evaluate how much the LSTM improves on other algorithms like SSD. But I keep struggling with how to prepare the data for training. I've tried the authors' config file, tried preparing the data the same way as for the object-detection-api, and tried following the procedure used in inputs/seq_dataset_builder_test.py and inputs/tf_sequence_example_decoder_test.py. Sadly, the GitHub README does not provide any information. Someone else opened an issue with a similar question on the GitHub repo (https://github.com/tensorflow/models/issues/5869), but the authors have not provided a helpful answer yet. I tried to contact the authors via email a month ago but didn't get a response. I've also searched the internet but found no solution, so I'm turning to you in desperation!
Can anybody explain how to prepare the data for retraining and how to actually run the retraining?
Thank you for reading, any help is really appreciated!
I've trained an image classification model with Google Cloud Platform's Vertex AI framework and liked the results. I then exported it in TensorFlow SavedModel format (it shows up as the 'Container' export) for custom prediction, because I like neither the slowness of Vertex's batch prediction nor the high cost of using a Vertex endpoint.
In my Python code I used:
model = tensorflow.saved_model.load(model_path)
infer = model.signatures["serving_default"]
When I inspected what infer requires, I saw that it takes two inputs: image_bytes and key. Both are string-type tensors.
My question breaks down into several sub-questions:
Isn't inference done on multiple data instances? If so, why is it image_bytes and not images_bytes?
Is image_bytes just the output of open("img.jpg", "rb").read()? If so, don't I have to resize it first? To what size? How do I check that?
What is key? I have absolutely no clue or guess regarding this one's meaning.
The documentation for GCP is paid only, so I decided to ask for help here. I searched Google for an answer for several days but found no relevant article.
Thank you for reading and your help would be greatly appreciated and maybe even useful to future readers.
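Not an authoritative answer, but here is a minimal sketch of how the two inputs could be fed. It rests on assumptions that are not confirmed above: that image_bytes takes a batch (a 1-D string tensor) of encoded image files, exactly as read from disk, and that key is a free-form identifier passed through so each prediction can be matched to its input. The helper names build_inputs and predict are made up for illustration.

```python
from pathlib import Path


def build_inputs(image_paths):
    """Read each image as raw encoded bytes and pair it with a key.

    Assumption: `key` is only an identifier echoed back with the
    prediction, so the file name is used here.
    """
    paths = [Path(p) for p in image_paths]
    return {
        "image_bytes": [p.read_bytes() for p in paths],
        "key": [p.name for p in paths],
    }


def predict(infer, image_paths):
    """Call a loaded serving signature on a batch of image files."""
    import tensorflow as tf  # lazy import: build_inputs works without TF

    inputs = build_inputs(image_paths)
    return infer(
        # Assumed shape: one encoded image per batch element.
        image_bytes=tf.constant(inputs["image_bytes"], dtype=tf.string),
        # Assumed shape: one identifier per batch element.
        key=tf.constant(inputs["key"], dtype=tf.string),
    )
```

Before relying on this, it is worth printing infer.structured_input_signature and infer.structured_outputs to check the actual shapes and dtypes. Since the signature takes encoded bytes rather than a pixel tensor, I would expect decoding and resizing to happen inside the exported graph, but that is also an assumption to verify on your model.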
I am currently following this GitHub repo: https://github.com/Tianxiaomo/pytorch-YOLOv4 to implement a PyTorch YOLOv4 model. However, this repo does not provide a test.py/val.py. We know that YOLOv5 does provide val.py, whose purpose is to let us validate our trained model on the validation and test datasets.
So I want to write a test.py/val.py for this purpose, but I really have no idea how to write one. If anyone has experience with this, could you please share some ideas on how to write it?
You can take a look at the official Scaled-YOLOv4 repo; it uses the official pycocotools API for evaluation.
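If you only need a quick sanity check before wiring up pycocotools, the core of a val.py is matching predicted boxes to ground-truth boxes by IoU. Below is a minimal sketch of that step in plain Python. It is a simplified stand-in, not COCO mAP and not code from either repo; the function names and the (x1, y1, x2, y2) box format are my own assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate_image(predictions, ground_truths, iou_threshold=0.5):
    """Greedily match predictions to ground truth for one image.

    predictions: list of (box, score, class_id)
    ground_truths: list of (box, class_id)
    Returns (true_positives, false_positives, false_negatives).
    """
    matched = set()
    tp = fp = 0
    # Highest-confidence predictions get first pick of the ground truth.
    for box, _score, cls in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, (gt_box, gt_cls) in enumerate(ground_truths):
            if idx in matched or gt_cls != cls:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp += 1
        else:
            fp += 1
    fn = len(ground_truths) - len(matched)
    return tp, fp, fn


def precision_recall(tp, fp, fn):
    """Precision and recall from accumulated counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In a val.py you would run the model over the validation loader, convert its outputs to this box format, sum tp/fp/fn over all images, and then call precision_recall once at the end.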
I am currently working on my master's thesis, and I want to extend the ResNet50 model to take tabular data as well. Does anyone have experience with a similar task? I use an iterative DataLoader, and it may cause problems. In general, I would like to ask whether it's a good idea to create a network with mixed data types (image + tabular) and whether this is the right approach. Thanks in advance!
This might be an old question, but I think this blog post answers it very well:
Markus Rosenfelder's blog
In summary, it explains how to combine a CNN (like your ResNet50) and tabular input into one model with a combined output (using PyTorch and PyTorch Lightning, but I feel the tutorial is so well done that you can easily adapt the technique to whatever you are using). The tutorial covers the whole process, so your problems with the DataLoader might be addressed as well.
Hope this helps!
I'm interested in training a YOLOv5 model. Currently, I'm using Roboflow to annotate and export the data into YOLOv5 format. I'm also using Roboflow's Colab Notebook for YOLOv5.
However, I'm not familiar with many of the commands used in the Roboflow Colab Notebook. I found on here that there appears to be a much more "Pythonic" way of using and manipulating the YOLOv5 model, which I would be much more familiar with.
My questions regarding this are as follows:
1. Is there an online resource that can show me how to train the YOLOv5 and extract results after importing the model from PyTorch with the "Pythonic" version (perhaps a snippet of code right here on StackOverflow would help)? The official documentation that I could find (here) also uses the "non-Pythonic" method for the model.
2. Is there any important functionality I would lose if I were to switch to this "Pythonic" method of using YOLOv5?
3. I found nothing in the documentation that suggests otherwise, but would I need to export my data in a different format from Roboflow for the data to be able to train the "Pythonic" model?
4. Similar to question 1), is there anywhere that can guide me how to use the trained model on test images? Do I simply do prediction=model(my_image.jpg)? What if I want predictions on multiple images at once?
Any guidance would be appreciated. Thanks!
You can use the ultralytics GitHub repository to do what you want; if you want to understand the process, read through the train.py file. There isn't a straightforward explanation; you just have to learn it by yourself.
For the training: writing the code yourself requires a lot of ML knowledge; that's why train.py exists, and the same goes for test.py and export.py.
I ran into a strange problem while trying to train a neural network using code from GitHub; it is the Hugging Face conversational model.
What happens: even when I use my own dataset for training, the result remains the same as with the original dataset. My hypothesis is that it is somehow a cache problem: the old dataset keeps getting loaded from the cache and replaces mine.
Then, when I launch an actual interactive session with the neural network, it works, but without my data, even if I pass the model checkpoint.
Why I suspect the cache: in this repo the author automatically downloads and caches the neural network model in /home/joo/.cache/torch/pytorch_transformers/ if no parameter is specified in the terminal.
I have created an issue on GitHub, but I am not sure whether the problem is specific to this repo or a common problem with retraining neural networks that I am facing for the first time.
https://github.com/huggingface/transfer-learning-conv-ai/issues/36
Some text copied from the issue:
I am still curious; I was not able to pass in my dataset:
I added my personality to the original 200 MB JSON
trained once more with --dataset_path ./my.json
invoked interact.py with the new checkpoint and path: python ./interact.py --model_checkpoint ./runs/Oct08_18-22-53_joo-tf_openai-gpt/ --dataset_path ./my.json
and it reports Gathered 18878 personalities (not 18879, which would include my own).
I changed the code in interact.py to choose the first personality this way:
was: personality = random.choice(personalities)
now: personality = personalities[0]
and this first personality is not mine.
Solved: the issue is specific to this repo; the dataset path was simply hardcoded.
But there is still no answer as to why it doesn't load the first time.
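For anyone debugging the same thing, a quick way to rule out the file itself is to count the personalities directly in the JSON before training. This is a minimal sketch, not code from the repo. It assumes the dataset is a dict of splits (for example "train" and "valid"), each a list of dialogs with a "personality" list of sentences; the function names are made up for illustration.

```python
import json


def gather_personalities(dataset_path):
    """Collect the personality of every dialog across all splits.

    Assumed layout: {"train": [{"personality": [...], ...}, ...],
                     "valid": [...]}
    """
    with open(dataset_path, "r", encoding="utf-8") as f:
        dataset = json.load(f)
    return [dialog["personality"] for split in dataset.values() for dialog in split]


def has_personality(dataset_path, personality):
    """True if the exact list of sentences appears in the file."""
    return list(personality) in gather_personalities(dataset_path)
```

If len(gather_personalities("./my.json")) already includes your entry but interact.py reports a smaller number, the script is reading a different file or a cached copy rather than the path you passed.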