I have 31,216 labeled images for object detection. I used the LabelImg program to label the images, and the annotations are in Pascal VOC format. I want to create a TFLite model for my Kotlin project. However, I have serious problems.
First, in my local environment, I tried to install the tflite-model-maker library in PyCharm using pip install tflite-model-maker. It downloaded ~30 GB and Python still says "unresolved reference". Then I tried to add the library from here, but that also didn't work. I could not manage to import the library.
Second, I used Google Colab, following this tutorial from TensorFlow. I mounted my Google Drive in Colab and edited all the code for my dataset path. Finally I ran the line model.export(export_dir='.', tflite_filename='AslModel.tflite') and it created the model file in the Colab directory. I continued with the next line, model.evaluate_tflite('AslModel.tflite', val_data), which showed a 16-hour ETA; after 14 hours the Google Colab runtime raised an error and everything was reset. So now I have a .tflite file and I tested it, but since there was no evaluation step, it makes bad predictions. I started all over again and Google Colab gave an error again. I guess ~7 hours of training plus ~16 hours of evaluation is impossible with Google Colab because of the 24-hour limit. Thus, my question is: how can I run the evaluation step only?
The model is defined in this line, which takes 7 hours: model = object_detector.create(train_data, model_spec=spec, batch_size=4, train_whole_model=True, epochs=20, validation_data=val_data). Instead of this line, I want to initialize a model from my .tflite file, something like model = LoadModel(PATH_OF_MY_TFLITE). I couldn't find any such load method, so I'm stuck there.
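The closest thing I found is loading the file with the plain TFLite interpreter, which at least runs inference, although it is not the Model Maker model object that evaluate_tflite expects. A minimal sketch, with the file path as a placeholder:

import numpy as np
import tensorflow as tf

# Load the exported .tflite file (placeholder path)
interpreter = tf.lite.Interpreter(model_path='AslModel.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input with the model's expected shape and dtype,
# just to confirm the file loads and inference runs
dummy = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], dummy)
interpreter.invoke()

# Output ordering/meaning depends on the exported detection model
first_output = interpreter.get_tensor(output_details[0]['index'])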
To sum up, the objective is to train on the Pascal VOC formatted dataset. I couldn't import the libraries in Python, and with Google Colab I have a raw .tflite model, but it needs evaluation and I can't rerun the previous steps due to the time limit. Lastly, I bought Colab Pro, but I spent all my compute units, and I don't even know what a compute unit is for. I'm waiting for suggestions. Thank you.
Related
I'm new to TensorFlow and started training my model in Google Colaboratory. After spending a few hours training, I was finally able to download the .tflite file, and it's working great! The only issue I have is its speed. I've looked into post-training quantization, but it seems I still need the Keras model to do that, and all I have left is the .tflite file itself, as the notebook has since been closed and all its data lost. Is there any way I can quantize the file itself?
Thank you in advance for replies.
I tried using tf.lite.Interpreter to load the model into a Keras optimizer, but that didn't work.
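For reference, the post-training quantization flow I found starts from the original model rather than from a .tflite file, which is exactly my problem. A minimal sketch, assuming the Keras model were still available (paths are placeholders):

import tensorflow as tf

# This assumes the original model is still available,
# which is exactly what I no longer have.
model = tf.keras.models.load_model('path/to/saved_model')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # dynamic-range quantization
tflite_quant_model = converter.convert()
with open('model_quant.tflite', 'wb') as f:
    f.write(tflite_quant_model)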
I am completely and unpleasantly surprised. The same program was perfect yesterday; I just slept and opened Google Colab today to run it. This is my first ever deep learning program. It ran perfectly yesterday, but when I run it today, it gives a weird error. Need help. Why is it giving such an error? How do I solve it?
Google Colab screenshot:
Code:
# Imports needed for the snippet below to run
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions

#Step3: test_img_path: Location of the image we want the model to predict
test_img_path = 'path/to/test_image.jpg'  # placeholder; set to your image
test_img = image.load_img(test_img_path, target_size=(224,224))
#Step4: Deep learning models expect a batch of images represented by an array
# At this stage we have a processed image of size 224x224x3.
# Convert it to a batch of images denoted by nx224x224x3, where n is the total number of images
# In this case, n=1
test_img_array = image.img_to_array(test_img)
# Convert the array to a batch
test_img_batch = np.expand_dims(test_img_array, axis=0)
#Step5: At the data level, an image is stored in terms of its pixels.
# Now, normalize the image
nor_testimg = preprocess_input(test_img_batch)
#Step6: Import the model and feed it our test image
# "Model" here means it was already trained by someone else and I don't have to do it again
# Moreover, they made their hard work (the trained model) freely available to everyone on Keras; we just download it
model = tf.keras.applications.resnet50.ResNet50()
#Step7: Let's see how and what the model predicts
predict_testimg = model.predict(nor_testimg)
# Decode the predictions
print(decode_predictions(predict_testimg, top=3)[0])
In the above code, tf.keras.applications.resnet50.ResNet50() is the line causing the problem when I run it today; the same program ran successfully yesterday. If I remove the parentheses, i.e. tf.keras.applications.resnet50.ResNet50, that line runs, but the next line of code raises an error (since the model was never instantiated).
The issue is not with your code; it lies in Keras, which is trying to decode a string with UTF-8. If you can post a bit more of the traceback, I might be able to help further.
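One thing worth trying, purely an assumption on my part since the full traceback isn't visible: Keras caches downloaded weights under ~/.keras/models, and a corrupted or partial download there can surface as a decode error when the model is created again. Clearing the cache forces a fresh download:

import os
import shutil

# Assumption: a corrupted cached download of the ResNet50 weights.
# Deleting the Keras cache directory forces a clean re-download.
shutil.rmtree(os.path.expanduser('~/.keras/models'), ignore_errors=True)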
I am using the TensorFlow 2.x Object Detection API. I have trained a deep learning model from the model zoo on my dataset, using Google Colab. Now that training is done, I want to evaluate my model using the COCO detection metrics. I used the following script:
!python3 model_main_tf2.py \
--model_dir=path/to/model_directory \
--pipeline_config_path=path/to/pipeline_config_file \
--checkpoint_dir=path/to/checkpoint_directory
After running the above code, I get the mean average precision (mAP) and average recall (AR) for the latest checkpoint on my test set. But for academic purposes, I want these metrics for all the checkpoints, to get a graph of how my model improved over time. Is there a way to do that? Or is it possible to train and evaluate at the same time in the TensorFlow 2 Object Detection API? I am a beginner in this field, so kindly help me out with this issue. Thank you.
I am facing the same problem, so I had an idea: we can run the model_main_tf2.py script you mentioned to evaluate the model, but change the checkpoint to evaluate in the first line of the checkpoint file:
model_checkpoint_path: "ckpt-1"
then
model_checkpoint_path: "ckpt-2"
then
model_checkpoint_path: "ckpt-3"
...
For each checkpoint you will get a .tfevent file; then you open TensorBoard pointing at the directory that contains all the .tfevent files, and you can see how the model improves over time.
I only saved the last 3 checkpoints on my computer, so I can't see the progress from the beginning (my fault), but if you have all the checkpoints, try what I suggest.
See my graph evaluating the last 3 checkpoints.
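A rough way to automate the loop (paths and the number of checkpoints are placeholders; the --eval_timeout flag makes each run exit shortly after evaluating instead of waiting around for new checkpoints):

import os
import subprocess

CKPT_DIR = 'path/to/checkpoint_directory'  # placeholder
NUM_CHECKPOINTS = 20                       # placeholder

for i in range(1, NUM_CHECKPOINTS + 1):
    # Point the first line of the checkpoint file at the i-th checkpoint
    with open(os.path.join(CKPT_DIR, 'checkpoint'), 'w') as f:
        f.write('model_checkpoint_path: "ckpt-%d"\n' % i)
    # Each run writes a .tfevent file for that checkpoint
    subprocess.run([
        'python3', 'model_main_tf2.py',
        '--model_dir=path/to/model_directory',
        '--pipeline_config_path=path/to/pipeline_config_file',
        '--checkpoint_dir=' + CKPT_DIR,
        '--eval_timeout=60',
    ], check=True)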
You should have an eval directory including an events.out.tfevents file under your model directory. You can run !tensorboard --logdir=path/to/eval/directory to access the graphs.
You can run training with the same snippet you have, just without the --checkpoint_dir flag, and open another terminal to run evaluation like you're currently doing.
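That is, in one terminal (paths are placeholders):

!python3 model_main_tf2.py \
--model_dir=path/to/model_directory \
--pipeline_config_path=path/to/pipeline_config_file

and in a second terminal the same command with --checkpoint_dir=path/to/checkpoint_directory added; with that flag the script switches to evaluation mode and picks up new checkpoints as training writes them.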
I ran into a strange problem trying to train a neural network using code from GitHub; it is the Hugging Face conversational model.
What happens: even when I use my own dataset for training, the result remains the same as with the original dataset. My hypothesis is that it is somehow a cache problem: the old dataset continuously gets loaded from the cache and replaces mine.
Then, when I launch an actual interactive session with the neural network, it works, but without my data, even if I pass the model checkpoint.
Why I suspect the cache: in this repo the author automatically downloads and caches the neural network model in /home/joo/.cache/torch/pytorch_transformers/ if no parameter is specified on the command line.
I have created an issue on GitHub, but I am not sure whether this is a problem specific to this repo or a common problem with retraining neural networks that I have hit for the first time.
https://github.com/huggingface/transfer-learning-conv-ai/issues/36
Some copy-paste from the issue:
I am still curious why I was not able to pass my dataset:
I added my personality to the original 200 MB JSON
trained once more with --dataset_path ./my.json
invoked interact.py with the new checkpoint and path: python ./interact.py --model_checkpoint ./runs/Oct08_18-22-53_joo-tf_openai-gpt/ --dataset_path ./my.json
and it reports Gathered 18878 personalities (but not 18879, i.e. not including my own).
I changed the code in interact.py to pick the first personality this way:
was: personality = random.choice(personalities)
became: personality = personalities[0]
and this first personality is not mine.
Solved: it is an issue specific to this repo; the dataset path was simply hardcoded.
But I still have no answer as to why my data didn't load the first time.
I intend to use PoseNet in Python rather than in the browser; for that I need the model as a frozen graph to run inference on. Is there a way to do that?
I ported Google's tfjs PoseNet to Python over the holidays. The demo apps in the repository automatically download the weights, freeze a graph, and save it to a model file. You can grab this model and use it in any TF variant.
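For reference, loading a frozen graph for inference looks roughly like this (the file name is a placeholder, and the input/output tensor names depend on the model):

import tensorflow as tf

# Read the frozen graph saved by the demo apps (placeholder file name)
with tf.io.gfile.GFile('posenet_frozen_graph.pb', 'rb') as f:
    graph_def = tf.compat.v1.GraphDef()
    graph_def.ParseFromString(f.read())

# Import it into a fresh graph and run it with a TF1-style session
graph = tf.Graph()
with graph.as_default():
    tf.compat.v1.import_graph_def(graph_def, name='')
sess = tf.compat.v1.Session(graph=graph)
# outputs = sess.run('output:0', feed_dict={'image:0': image_batch})  # hypothetical tensor names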
I wrote a Python version of the multi-person post-processing code that uses vectorized scipy/numpy ops to speed up a few parts. I have not done exhaustive testing of that part, but after a number of spot checks on various test images against the reference, and using it on some other sources, it seems reasonably close to the original, and faster :)
Python + TF at https://github.com/rwightman/posenet-python
I also did a PyTorch conversion at https://github.com/rwightman/posenet-pytorch
And if you happen to be looking for a CoreML port at some point, I started off with the weight conversion code from this project https://github.com/infocom-tpo/PoseNet-CoreML
We currently do not have the frozen graph for inference publicly available; however, you could download the assets and run them in a Node.js environment.