I'm making a mobile app that uses TensorFlow Lite to perform text classification on tweets. I've done this successfully with the TensorFlow sample model, but that model is trained on IMDB movie reviews, and I want a custom on-device model trained on tweets to increase accuracy. I have training and test sets for this domain and am trying to create a custom model following this tutorial: https://www.tensorflow.org/lite/tutorials/model_maker_text_classification
I'm running into a Python KeyError, though, and can't figure out why. Here is a screenshot.
You can see on the right a sample of my CSV: I have a label column and a Sentence column, and I'm using the TextClassifierDataLoader. I don't understand why this KeyError is happening. I'm interpreting it as the loader not finding a column named "Sentence", but clearly it's there.
Any ideas?
It seems to me that is_training is not an available argument of the class, so just remove that argument if you want the code to run (I haven't looked into what it was supposed to do in the first place).
https://github.com/tensorflow/examples/blob/1dc6978e2141e7a5efebcf6971b3afa9cb055679/tensorflow_examples/lite/model_maker/core/data_util/text_dataloader.py#L90
The issue was that the column name " Sentence" had a leading space.
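For anyone hitting the same KeyError, here is a minimal sketch of how a leading space in a header can be detected and cleaned before the file goes to the data loader. It uses only Python's standard csv module. The helper name strip_header_whitespace and the sample rows are made up for illustration and are not part of Model Maker.

```python
import csv
import io


def strip_header_whitespace(src, dst):
    """Copy a CSV from src to dst, trimming whitespace around column names."""
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)
    writer.writerow([name.strip() for name in header])
    writer.writerows(reader)  # data rows are copied unchanged


# Hypothetical sample with the same problem: note the space after the comma.
raw_text = "label, Sentence\n1, great game tonight\n0, worst service ever\n"

# Before cleaning, the key is " Sentence", so row["Sentence"] raises KeyError.
before = list(csv.DictReader(io.StringIO(raw_text)))

cleaned = io.StringIO()
strip_header_whitespace(io.StringIO(raw_text), cleaned)
cleaned.seek(0)
after = list(csv.DictReader(cleaned))
```

Printing the header with repr(), for example print(repr(list(before[0].keys()))), makes stray whitespace like this visible at a glance.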
Related
I have one CSV with around 10k rows and around 370 columns, mostly numerical (int or float), plus a unique ID column. I know the target column (an integer column), which needs to be used for inference in the What-If Tool in TensorBoard. I don't have much experience with TensorFlow, and I could not find documentation that fits my purpose.
Initially, I built my model using this documentation:
https://www.tensorflow.org/tutorials/load_data/pandas_dataframe
To serve the model I went through this documentation:
https://www.tensorflow.org/tensorboard/what_if_tool
Where it said in the requirements:
The model(s) you wish to explore must be served using TensorFlow Serving using the classify, regress, or predict API.
This leads to this link:
https://github.com/tensorflow/serving
I was able to build the saved_model.pb file and serve it successfully using Docker, but when I use it in the TensorBoard What-If Tool I get an error saying "Expected one input Tensor".
I then went through these links to change the model for serving by adding inputs and outputs:
https://www.tensorflow.org/tfx/tutorials/serving/rest_simple
https://www.tensorflow.org/guide/saved_model
But I still can't understand what to give as input and output, as the only thing I know from my CSV is the integer target column. Nor do I understand how to add signatures properly for all three APIs.
I checked the UCI Census demo model: after loading it, I could see classification, regression, and similar entries in its signatures, all of them pruned concrete functions, which I know nothing about.
My client requires me to load the CSV with model understanding and predict features enabled with both Classification and Regression.
I want to use tensorflow for detecting cars in an embedded system, so I tried ssd_mobilenet_v2 and it actually did pretty well for me, except for some specific car types which are not very common and I think that is why the model does not recognize them. I have a dataset of these cases and I want to improve the model by fine-tuning it. I should also note that I need a .tflite file because I'm using tflite_runtime in python.
I followed these instructions https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10 and I could train the model and reached a reasonable loss value. I then used export_tflite_ssd_graph.py in the object detection API to build inference_graph from the trained model. Afterwards I used toco tool to build a .tflite file out of it.
But here is the problem: after all that, not only did the model not improve, it now does not detect any cars at all. I'm confused and do not know what the problem is. I searched a lot and did not find any tutorial that covers what I need to do. The tutorials I found just add a new object to a model and then export it, which I tried successfully. I also built a .tflite file directly from a TensorFlow detection model zoo checkpoint, without training, and it worked fine. So I think the problem has something to do with the training process. Maybe I am missing something there.
Another thing I did not find in the documentation is whether it is possible to "add" a class to the existing classes of an object detection model. For example, assuming the MobileNet SSD v2 model detects 90 different object classes, I would like to add another class so that the model detects 91 classes instead of 90. As far as I understand, and from what I tested after doing transfer learning with the Object Detection API, I could only detect the objects that were in my dataset, and the old classes were gone. So how do I do what I described?
I found out that there is no way to "add" a class to the previously trained classes, but if you provide a small amount of data for that class, you can have your model detect it. The reason is that the last layer of the model changes when transfer learning is applied. In my case I labeled around 3k frames containing about 12k objects, because my frames are complicated. For simpler tasks, as I saw in tutorials, 200-300 annotated images would be enough.
As for the model not detecting anything, that has to do with the convert command I used. I should have used tflite_convert instead of toco. I explained more here.
I have trained a spaCy textcat model, but then I realized that some of the training data was incorrect: data from one category happened to be labeled with another category. My question is: is it possible to remove these training examples from the model without retraining it? Something like nlp.update() but in reverse? I would appreciate any help!
You mean to revert specific cases? As far as I know, that's not currently possible in spaCy.
I would suggest either retraining from scratch with the corrected annotations or continuing training with the updated annotations. If you continue training, make sure that you keep feeding a representative set to your model, so that it doesn't "forget" cases it was already predicting correctly before.
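As a rough illustration of the "representative set" advice, the sketch below mixes the corrected examples with a random sample of examples the model already handled well before further training. It is plain Python and does not call spaCy. The function name build_update_set, the example texts, and the category names are all hypothetical.

```python
import random


def build_update_set(corrected, previously_correct, rehearsal_size, seed=0):
    """Combine corrected examples with a random sample of older, correct ones."""
    rng = random.Random(seed)
    k = min(rehearsal_size, len(previously_correct))
    rehearsal = rng.sample(previously_correct, k)
    mixed = list(corrected) + rehearsal
    rng.shuffle(mixed)  # avoid feeding all corrected examples in one run
    return mixed


# Hypothetical (text, category) pairs.
corrected = [("refund never arrived", "billing"), ("app crashes on login", "bug")]
previously_correct = [(f"example text {i}", "other") for i in range(100)]

update_set = build_update_set(corrected, previously_correct, rehearsal_size=10)
```

The resulting list would then be batched and passed to the training loop you already use. How large the rehearsal sample should be is a judgment call that depends on your data.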
Main question: How do I create a neural network that can classify text data along with numerical features?
It sounds simple, but I must not be understanding something correctly.
Background
I'm trying to build a text classifier (for the first time) using TensorFlow 2/Keras to look through app store reviews and classify them into the following categories: happy, pricingIssue, techIssue, productIssue, miscIssue
I have a data set that contains: star_rating, review_text and the associated labels.
Problem
My understanding from this TensorFlow tutorial is that I need to use the TensorFlow Hub layer to embed the sentences as a fixed-shape output.
embedding = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1"
hub_layer = hub.KerasLayer(embedding, input_shape=[], dtype=tf.string, trainable=True)
And then I would build the model using that as my input layer.
model = tf.keras.Sequential()
model.add(hub_layer)
model.add(tf.keras.layers.Dense(16, activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
So my question is: where do I insert the numerical rating into the model?
Potential Solutions?
Use two input layers and merge them somehow? I would think that I would want to use the hub layer to embed the data, another input layer for numerical data, and then pipe them both into the next layer?
Do I embed the string first and then append the rating to that? I could also see creating a function that preprocesses the data into the array, and appends the rating onto the end of the embedded string, and just use the whole thing as the input object.
I'm stumped and any guidance is helpful!!
After consulting with an expert, I learned that both of the above solutions can work, but they have different trade-offs:
Using two input layers: You can do this, but not with a Sequential model, since the layers are no longer in a single sequence. The model becomes a more general graph of layers.
Embed the string first and append the rating: Because the embedding layer is pre-trained, the embedding doesn't need to happen inside the model. The text can be embedded ahead of time and then combined into one tensor with the numerical rating.
Since I'm most familiar with TensorFlow 2 and Keras, I opted for the second choice so that I can continue to use a Sequential model.
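For reference, the first option (two inputs merged into one network) might look roughly like this with the Keras functional API. This is a sketch under assumptions, not the code I ended up using: the text input is assumed to be a 20-dimensional embedding computed ahead of time (matching gnews-swivel-20dim), the output layer has 5 units for the five review categories, and the input names and random data are made up for illustration.

```python
import numpy as np
import tensorflow as tf

EMBEDDING_DIM = 20  # assumed size of the precomputed sentence embedding
NUM_CLASSES = 5     # happy, pricingIssue, techIssue, productIssue, miscIssue

# Two separate inputs: the embedded review text and the numeric star rating.
text_vec = tf.keras.Input(shape=(EMBEDDING_DIM,), name="text_embedding")
rating = tf.keras.Input(shape=(1,), name="star_rating")

# Merge both inputs into one feature vector, then classify.
merged = tf.keras.layers.Concatenate()([text_vec, rating])
hidden = tf.keras.layers.Dense(16, activation="relu")(merged)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(hidden)

model = tf.keras.Model(inputs=[text_vec, rating], outputs=outputs)
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Random stand-in data, only to show the expected input shapes.
fake_embeddings = np.random.rand(4, EMBEDDING_DIM).astype("float32")
fake_ratings = np.array([[5.0], [1.0], [3.0], [4.0]], dtype="float32")
probs = model.predict([fake_embeddings, fake_ratings], verbose=0)
```

The same Concatenate step is effectively what the second option does outside the model: stack the embedding and the rating into one array and feed that to a Sequential model.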
There’s another option for adding non-text data to text models: make the data textual. The exact way you do this depends on the tokenizer you are using and on how your model handles words it hasn’t seen before (out-of-vocabulary, or OOV, words). But, similar to how you might see special tokens like __EOS__ that tell the model one sentence has ended and the next is beginning, you could prepend a text version of the rating to the review string: review_string = "_5_stars_ " + review_string.
This sounds like such a hack it can’t possibly work, but I’ve talked to someone at AWS using it in production to pass metadata to a text model.
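A minimal sketch of that idea, assuming the rating is an integer and that your tokenizer keeps a token such as _5_stars_ in one piece (many tokenizers split on punctuation, so check this first). The helper name add_rating_token and the sample review are made up for illustration.

```python
def add_rating_token(review_text, star_rating):
    """Prepend a pseudo-word encoding the star rating to the review text."""
    return f"_{star_rating}_stars_ {review_text}"


# Hypothetical review with a 1-star rating.
tagged = add_rating_token("Crashes every time I open it", 1)
```

The same transformation has to be applied at inference time, otherwise the model sees inputs that look different from its training data.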
I'm new to Keras seq2seq LSTM models. I have a working machine translation model and English-to-Arabic training data. I just trained the model using Google Colab and made some predictions. As you can see in the image, when I test the model on a text from the training data, it predicts well, but when I change ONE word, the prediction goes completely wrong!
I want my model to UNDERSTAND the full meaning of the text even when one word is added or deleted. How can I solve this problem?
LSTM wrong predictions when adding/deleting one word
In the image, the first test of each section is the text from the training data, which predicts well. The second test is the same but with adding/deleting one word.
UPDATE: Whenever I add a validation split, the val_loss keeps increasing and the model isn't learning much! What's going wrong?
This is the classic overfitting (overtraining) problem. Your model only learns to translate your training data by memorizing each sample instead of learning the concepts behind it.
For this reason, always split your data into training data and validation data. The validation data must not be part of the training set! This way you can check whether your model is actually learning something.
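A minimal sketch of such a split in plain Python, assuming the data is a list of (source, target) sentence pairs. The function name train_validation_split and the placeholder pairs are made up for illustration.

```python
import random


def train_validation_split(pairs, validation_fraction=0.2, seed=42):
    """Shuffle the pairs, then hold out a fraction that is never trained on."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical placeholder data standing in for English/Arabic sentence pairs.
pairs = [(f"sentence {i}", f"translation {i}") for i in range(10)]
train_pairs, val_pairs = train_validation_split(pairs)
```

Shuffling first matters if the file is ordered in some way. Keras' validation_split argument, for comparison, takes the last fraction of the data as provided, before any shuffling.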
There are two main solutions for this:
As m33n said, use more training data (there is no data like more data)
Add more regularization, such as Dropout
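To illustrate the Dropout suggestion, here is a small sketch of where dropout can go in a Keras LSTM model. It is deliberately not a full seq2seq translator: it is a single LSTM that predicts one token, and the vocabulary size, layer sizes, and dropout rates are arbitrary assumptions, not tuned values.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 1000  # assumed vocabulary size
SEQ_LEN = 10       # assumed input length in tokens

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=VOCAB_SIZE, output_dim=32),
    # dropout acts on the layer inputs, recurrent_dropout on the recurrent state
    tf.keras.layers.LSTM(64, dropout=0.2, recurrent_dropout=0.2),
    # extra dropout between the LSTM output and the classifier
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(VOCAB_SIZE, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Random token ids, only to show the expected shapes.
tokens = np.random.randint(0, VOCAB_SIZE, size=(2, SEQ_LEN))
probs = model.predict(tokens, verbose=0)
```

Dropout is only active during training, so predictions made with model.predict are not affected by it.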
Also, the problem seems very ambitious. Translating sentences is not an easy task at all, and companies like Google or DeepL have created very complex models trained on huge amounts of data collected over years. Are you sure you have the necessary resources to accomplish this?