I have a trained TF-Lite model (model.tflite) for image classification with several labels. The output of the model provides an array of probabilities, but I don't know the order of the labels.
Can I extract the labels from the TF model?
I think this might extract the metadata
pip install tflite_support
import os
from tflite_support import metadata as _metadata
from tflite_support import metadata_schema_py_generated as _metadata_fb
model_file = "<model_path>"  # path to your .tflite file
displayer = _metadata.MetadataDisplayer.with_model_file(model_file)
# Write the JSON next to the model, with the same base name
export_json_file = os.path.splitext(model_file)[0] + ".json"
json_file = displayer.get_metadata_json()
with open(export_json_file, "w") as f:
    f.write(json_file)
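Once the metadata has been dumped to JSON, the name of the label file can be looked up in it. The sketch below is a minimal example, assuming the JSON follows the TF Lite metadata schema layout (subgraph_metadata, output_tensor_metadata, associated_files). Those key names, the helper name find_label_files and the sample document are assumptions for illustration, not output from a real model.

```python
import json


def find_label_files(metadata_json):
    """Return the names of files associated with the model's output tensors."""
    meta = json.loads(metadata_json)
    names = []
    for subgraph in meta.get("subgraph_metadata", []):
        for tensor in subgraph.get("output_tensor_metadata", []):
            for assoc in tensor.get("associated_files", []):
                names.append(assoc["name"])
    return names


# Hypothetical metadata, shaped like the JSON the displayer is assumed to emit.
sample = json.dumps({
    "subgraph_metadata": [{
        "output_tensor_metadata": [{
            "name": "probability",
            "associated_files": [{"name": "labels.txt"}],
        }]
    }]
})

print(find_label_files(sample))  # ['labels.txt']
```

If the list comes back empty, the model probably has no label file attached, and the label order has to come from wherever the model was trained.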
The simplest thing to do is to dump the labels file from the TF Lite model file. That file is a zip archive, so just do this:
unzip mobilenet_v1_0.75_160_quantized_1_metadata_1.tflite
Archive: mobilenet_v1_0.75_160_quantized_1_metadata_1.tflite
extracting: labels.txt
The "labels.txt" file (or something similarly named) contains the list of labels for the model.
Reference (and more info on how to read TF Lite model metadata): https://www.tensorflow.org/lite/models/convert/metadata#read_the_associated_files_from_models
Note: A TF Lite model is not guaranteed to contain a labels file like this, but most publicly published models, such as ones on tfhub.dev, should have this metadata included.
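The same extraction can be done from Python with the standard zipfile module, which tolerates data placed in front of the archive. This is only a sketch: read_labels and top_label are hypothetical helper names, and it assumes the packed label file has "label" in its name and that line i of that file corresponds to index i of the output array.

```python
import zipfile


def read_labels(model_path, hint="label"):
    """Return the lines of the first packed file whose name contains `hint`."""
    # Raises zipfile.BadZipFile if the model has no packed files at all.
    with zipfile.ZipFile(model_path) as archive:
        for name in archive.namelist():
            if hint in name.lower():
                text = archive.read(name).decode("utf-8")
                # Keep every line so that list indices stay aligned with the output
                return text.splitlines()
    return []


def top_label(labels, probabilities):
    """Return the label whose index matches the highest probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best_index]
```

With these helpers, something like top_label(read_labels("model.tflite"), probabilities) would give the predicted class name, provided the assumptions above hold for your model.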
I am running into this error: I can't unpickle a file in my Jupyter notebook:
import os
import pickle
import joblib
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
# Open the pickle file and close it automatically after loading
with open("loan_model3.pkl", "rb") as f:
    mdl = pickle.load(f)
and it always shows the error message below, even though I've upgraded all my libraries.
Error Message:
FileNotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ram://89506590-ec42-44a9-b67c-3ee4cc8e884e/variables/variables You may be trying to load on a different device from the computational device. Consider setting the experimental_io_device option in tf.saved_model.LoadOptions to the io_device such as '/job:localhost'.
I tried upgrading my libraries, but it still didn't work.
I got the same error when I tried to store my Sequential model in a .pkl file. A Sequential model is a TensorFlow Keras model, so it should be saved to an .h5 file instead. Keras can store the weights and the model configuration together in that single file.
Code:
from keras.models import load_model
model.save('model.h5')
model_final = load_model('model.h5')
I don't know if you are still here, but I found the solution. Basically, you should not save the TensorFlow model to a pickle file. Save it to an .h5 file instead:
from tensorflow import keras
## save model (assumes `model` is an already-built Keras model)
save_path = './model.h5'
model.save(save_path)
## load tensorflow model
model = keras.models.load_model(save_path)
This worked for me. Hope this helps you too.
This worked for me:
import tensorflow as tf
path = './model.h5'
model.save(path)
loaded_model = tf.keras.models.load_model(path)
I faced the same issue, and saving the model as an .h5 file worked for me. Now I'm able to load the .h5 model.
I am using transformers.BertForMaskedLM to further pre-train the BERT model on my custom dataset. I first serialize all the text to a .txt file by separating the words by a whitespace. Then, I am using transformers.TextDataset to load the serialized data with a BERT tokenizer given as tokenizer argument. Then, I am using BertForMaskedLM.from_pretrained() to load the pre-trained model (which is what transformers library presents). Then, I am using transformers.Trainer to further pre-train the model on my custom dataset, i.e., domain adaptation, for 3 epochs. I save the model with trainer.save_model(). Then, I want to load the further pre-trained model to get the embeddings of the words in my custom dataset. To load the model, I am using AutoModel.from_pretrained() but this pops up a warning.
Some weights of the model checkpoint at {path to my further pre-trained model} were not used when initializing BertModel
So, I know why this pops up. Because I further pre-trained using transformers.BertForMaskedLM but when I load with transformers.AutoModel, it loads it as transformers.BertModel. What I do not understand is if this is a problem or not. I just want to get the embeddings, e.g., embedding vector with a size of 768.
You saved a BERT model with the LM head attached. You are now loading the serialized file into a standalone BERT structure without any extra elements, so the warning is issued. This is perfectly normal, and it is not a fatal error. You can check the list of parameters that were not loaded like this:
from transformers import BertConfig, BertLMHeadModel, BertModel
# The original snippet used `config` without defining it; loading the default config is an assumption
config = BertConfig.from_pretrained('bert-base-cased')
lmbert = BertLMHeadModel.from_pretrained('bert-base-cased', config=config)
lmbert.save_pretrained('you_desired_path/BertLMHeadModel')
# Parameter names of the model with the LM head attached
lmbert_params = [name for name, _ in lmbert.named_parameters()]
bert = BertModel.from_pretrained('you_desired_path/BertLMHeadModel')
# Parameter names of the bare BERT model
bert_params = [name for name, _ in bert.named_parameters()]
# Names that exist only in the LM-head model
params_related_to_lm_head = [param_name for param_name in lmbert_params if param_name.replace('bert.', '') not in bert_params]
params_related_to_lm_head
output:
['cls.predictions.bias',
'cls.predictions.transform.dense.weight',
'cls.predictions.transform.dense.bias',
'cls.predictions.transform.LayerNorm.weight',
'cls.predictions.transform.LayerNorm.bias']
I have a project that was developed on TensorFlow v1 I think. It works in Python 3.8 like this:
...
saver = tf.train.Saver(var_list=vars)
...
saver.restore(self.sess, tf.train.latest_checkpoint(checkpoint_dir))
...
The checkpoint files reside in the "checkpoint_dir"
I would like to use this with TFjs but I can't figure out how to transform the checkpoint files to something that can be loaded with TFjs.
What should I do?
thanks,
John
Ok, I figured it out. Hope this helps other beginners like me too.
The checkpoint files do not contain the model, they only contain the values (weights, etc) of the model.
The model is actually built in the code. So, here are the steps to convert the Tensorflow v1 checkpoint files to TensorflowJS loadable model:
1. First, I saved the checkpoint again because a file was missing (the .meta file), which stores the graph definition and other meta information that goes with the values in the checkpoint. To save the checkpoint with the .meta file, I added this right after the saver.restore(...) call:
...
saver.save(self.sess,save_path='./newcheckpoint/')
...
2. Save the model as a frozen model file like this:
import os
import tensorflow.compat.v1 as tf
meta_path = './newcheckpoint/.meta' # Your .meta file
output_node_names = ['name_of_the_output_node'] # Output nodes
os.makedirs('./freeze', exist_ok=True) # Make sure the output directory exists
with tf.Session() as sess:
    # Restore the graph
    saver = tf.train.import_meta_graph(meta_path)
    # Load weights
    saver.restore(sess, tf.train.latest_checkpoint('./newcheckpoint/'))
    # Freeze the graph
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess,
        sess.graph_def,
        output_node_names)
    # Save the frozen graph
    with open('./freeze/output_graph.pb', 'wb') as f:
        f.write(frozen_graph_def.SerializeToString())
This will save the model to ./freeze/output_graph.pb
3. Use tensorflowjs_converter to convert the frozen model to a web model like this:
tensorflowjs_converter --input_format=tf_frozen_model --output_node_names='final_add' --skip_op_check ./freeze/output_graph.pb ./web_model/
I had to use --skip_op_check because of some missing-op errors/warnings during conversion.
As a result of step 3, the ./web_model/ folder will contain the JSON and binary files required by the TensorflowJS library.
Here's how I load the model using tfjs 2.x:
model=await tf.loadGraphModel('web_model/model.json');
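Before loading the converted model in the browser, it can help to confirm that every weight file referenced by model.json was actually written. The sketch below assumes model.json has a top-level "weightsManifest" list whose entries carry a "paths" list of shard file names. Treat that layout and the helper name missing_weight_files as assumptions, and inspect your own model.json if the result looks wrong.

```python
import json
import os


def missing_weight_files(model_dir, model_json="model.json"):
    """Return the weight shard paths listed in model.json that are not on disk."""
    with open(os.path.join(model_dir, model_json), encoding="utf-8") as f:
        manifest = json.load(f).get("weightsManifest", [])
    missing = []
    for group in manifest:
        for rel_path in group.get("paths", []):
            # Shard paths are resolved relative to the folder holding model.json
            if not os.path.exists(os.path.join(model_dir, rel_path)):
                missing.append(rel_path)
    return missing
```

An empty list from missing_weight_files("./web_model") suggests the folder is complete. Any names it returns are shards that the loader would fail to fetch.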
I'm trying to convert these three files of a pre-trained model:
semantic_model.data-00000-of-00001
semantic_model.index
semantic_model.meta
into a Saved Model format, so that I can later convert it into TFLite format for Inference.
Searching StackOverflow, I'd come across this code, which properly generates the Saved_model.pb, however as noted in some comments, doing it in this way doesn't keep the Meta Graph Definitions, which causes an error when I later try to convert it into TFlite format or freeze it.
import os
import tensorflow.compat.v1 as tf
tf.compat.v1.disable_eager_execution()
export_dir = '/tf-end-to-end/export_dir'
#trained_checkpoint_prefix = 'Models/semantic_model' \tf-end-to-end\Models
trained_checkpoint_prefix = 'PATH TO MODEL DIRECTORY'
tf.reset_default_graph()
graph = tf.Graph()
loader = tf.train.import_meta_graph(trained_checkpoint_prefix + ".meta" )
sess = tf.Session()
loader.restore(sess,trained_checkpoint_prefix)
builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
builder.add_meta_graph_and_variables(sess, [tf.saved_model.tag_constants.TRAINING, tf.saved_model.tag_constants.SERVING], strip_default_attrs=True)
builder.save()
This is the error I get when trying to use the saved_model:
RuntimeError: MetaGraphDef associated with tags {'serve'} could not be found in SavedModel
Running saved_model_cli show with --all doesn't display anything under signature definitions for the created saved_model.
My question is, how do I maintain the data and convert this to saved_model, for later conversion into TFLite format?
Model Structure and creation details can be seen here, including the checkpoint files mentioned: https://github.com/OMR-Research/tf-end-to-end
Refer to these steps for converting checkpoints to a TFLite model: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/python_api.md#convert-checkpoints-
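One possible explanation for the 'serve' error in the question, offered as an assumption rather than something confirmed in this thread: the SavedModel loader appears to look for a MetaGraphDef whose tag set equals the requested one, so a graph exported with both the TRAINING ("train") and SERVING ("serve") tags would not match a request for {'serve'} alone. The toy function below only models that matching rule in plain Python. It is not the TensorFlow loader, and find_meta_graph is a hypothetical name.

```python
def find_meta_graph(saved_tag_sets, requested_tags):
    """Return the index of the saved graph whose tags equal the request, else None."""
    requested = set(requested_tags)
    for index, tags in enumerate(saved_tag_sets):
        # Exact set equality: a superset of the requested tags does not count
        if set(tags) == requested:
            return index
    return None


# One graph saved with both tags, as in the question's export code
saved = [["train", "serve"]]

print(find_meta_graph(saved, ["serve"]))           # None
print(find_meta_graph(saved, ["train", "serve"]))  # 0
```

If this is indeed the cause, exporting with only tf.saved_model.tag_constants.SERVING in the tag list may be worth trying before converting to TFLite.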
I've only seen a few questions that ask this, and none of them have an answer yet, so I thought I might as well try. I've been using gensim's word2vec model to create some vectors. I exported them into text, and tried importing it on tensorflow's live model of the embedding projector. One problem. It didn't work. It told me that the tensors were improperly formatted. So, being a beginner, I thought I would ask some people with more experience about possible solutions.
Equivalent to my code:
import gensim
corpus = [["words","in","sentence","one"],["words","in","sentence","two"]]
model = gensim.models.Word2Vec(iter = 5,size = 64)
model.build_vocab(corpus)
# save memory
vectors = model.wv
del model
vectors.save_word2vec_format("vect.txt",binary = False)
That creates the model, saves the vectors, and then prints the results out nice and pretty in a tab delimited file with values for all of the dimensions. I understand how to do what I'm doing, I just can't figure out what's wrong with the way I put it in tensorflow, as the documentation regarding that is pretty scarce as far as I can tell.
One idea that has been presented to me is implementing the appropriate tensorflow code, but I don’t know how to code that, just import files in the live demo.
Edit: I have a new problem now. The object I have my vectors in is non-iterable because gensim apparently decided to make its own data structures that are non-compatible with what I'm trying to do.
Ok. Done with that too! Thanks for your help!
What you are describing is possible. What you have to keep in mind is that Tensorboard reads from saved tensorflow binaries which represent your variables on disk.
More information on saving and restoring tensorflow graph and variables here
The main task is therefore to get the embeddings as saved tf variables.
Assumptions:
in the following code embeddings is a python dict {word:np.array (np.shape==[embedding_size])}
python version is 3.5+
used libraries are numpy as np, tensorflow as tf
the directory to store the tf variables is model_dir/
Step 1: Stack the embeddings to get a single np.array
embeddings_vectors = np.stack(list(embeddings.values()), axis=0)
# shape [n_words, embedding_size]
Step 2: Save the tf.Variable on disk
# Create some variables.
emb = tf.Variable(embeddings_vectors, name='word_embeddings')
# Add an op to initialize the variable.
init_op = tf.global_variables_initializer()
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, initialize the variables and save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    # Save the variables to disk.
    save_path = saver.save(sess, "model_dir/model.ckpt")
    print("Model saved in path: %s" % save_path)
model_dir should contain the files checkpoint, model.ckpt.data-00000-of-00001, model.ckpt.index and model.ckpt.meta
Step 3: Generate a metadata.tsv
To have a beautiful labeled cloud of embeddings, you can provide tensorboard with metadata as Tab-Separated Values (tsv) (cf. here).
import os
# One word per line, in the same order as the stacked vectors
words = '\n'.join(list(embeddings.keys()))
with open(os.path.join('model_dir', 'metadata.tsv'), 'w') as f:
    f.write(words)
# .tsv file written in model_dir/metadata.tsv
Step 4: Visualize
Run tensorboard --logdir model_dir and open the Projector tab.
To load metadata, the magic happens here:
As a reminder, some word2vec embedding projections are also available on http://projector.tensorflow.org/
Gensim actually has the official way to do this.
Documentation about it
The above answers didn't work for me. What I found pretty useful was this script (to be added to gensim in the future). Source
To transform the data to metadata:
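# gensim 3.x API assumed; model_path, tensorsfp and metadatafp are paths you define
model = gensim.models.KeyedVectors.load_word2vec_format(model_path, binary=True)
with open(tensorsfp, 'w+', encoding='utf-8') as tensors:
    with open(metadatafp, 'w+', encoding='utf-8') as metadata:
        for word in model.index2word:
            # One word per line in the metadata file
            metadata.write(word + '\n')
            # One tab-separated vector per line in the tensor file
            vector_row = '\t'.join(map(str, model[word]))
            tensors.write(vector_row + '\n')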
Or follow this gist
Gensim provides a script that converts a word2vec model into TF projector files:
python -m gensim.scripts.word2vec2tensor -i ~w2v_model_file -o output_folder
Then open the projector website and upload the generated files, including the metadata.
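If the vectors were already exported with save_word2vec_format(..., binary=False), as in the question, the text file can also be converted without gensim. The word2vec text format starts with a header line (vocabulary size and dimensions) and then has one space-separated row per word, with the word first. The projector files written above are a tab-separated vector file and a separate one-word-per-line metadata file. The sketch below bridges the two; word2vec_text_to_tsv and the file names are hypothetical, and it assumes the words themselves contain no tabs.

```python
def word2vec_text_to_tsv(src_path, vectors_path, metadata_path):
    """Split a word2vec text file into projector-style vector and metadata files."""
    written = 0
    with open(src_path, encoding="utf-8") as src, \
            open(vectors_path, "w", encoding="utf-8") as vec_out, \
            open(metadata_path, "w", encoding="utf-8") as meta_out:
        # First line is the header: "<vocab_size> <dimensions>"
        _, dim = (int(x) for x in src.readline().split())
        for line in src:
            # Split from the right so the last `dim` fields are the vector
            parts = line.rstrip().rsplit(" ", dim)
            if len(parts) != dim + 1:
                continue  # skip blank or malformed rows
            meta_out.write(parts[0] + "\n")
            vec_out.write("\t".join(parts[1:]) + "\n")
            written += 1
    return written
```

Calling word2vec_text_to_tsv("vect.txt", "vectors.tsv", "metadata.tsv") would then give two files to upload. Because both files are written in the same loop, row i of the vectors always matches line i of the metadata.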