Train a network with the Caffe Python wrapper? - python

I would like to train a Caffe network through the Python interface.
The main reason is that my input is multi-dimensional and amounts to a few TB of data, and I don't want to convert all of it to LMDB before training.
I have found one answer on Stack Overflow.
But it loads the complete data set at once and starts from initialized weights.
I would like to load the data into a numpy array and then pass it to Caffe,
and save the weights to a .caffemodel file once every 1000 iterations.
The print_network(), get_accuracy() and load_data() functions are very useful and give me good insight.

Besides using a Python layer, one thing you can do is use a MemoryData layer and feed in one batch of data at a time by calling solver.net.set_input_arrays(your_data) after however many iterations are needed to go through the current batch.
Remember, you can always restore the training state from the .solverstate file in your snapshots.
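The MemoryData approach above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: train_in_chunks, chunks, batch_size and snapshot_every are hypothetical names introduced here, and it assumes the solver object exposes net.set_input_arrays(data, labels), step(n) and snapshot() the way pycaffe solvers do. As far as I recall, pycaffe expects the arrays to be float32 and 4-D, and the number of samples to be a multiple of the MemoryData batch size.

```python
def train_in_chunks(solver, chunks, batch_size, snapshot_every=1000):
    """Feed in-memory chunks to a MemoryData layer, one chunk at a time.

    `solver` is assumed to behave like a pycaffe solver (assumption).
    `chunks` is any iterable of (data, labels) pairs, e.g. a generator
    that loads one numpy array from disk per iteration (hypothetical).
    Returns the number of solver iterations that were run.
    """
    iterations_done = 0
    for data, labels in chunks:
        num_samples = len(data)
        if num_samples != len(labels):
            raise ValueError("data and labels must have the same length")
        if num_samples % batch_size != 0:
            raise ValueError("chunk size must be a multiple of batch_size")

        # Hand the current chunk to the MemoryData layer.
        solver.net.set_input_arrays(data, labels)

        # One solver step consumes one batch, so this walks the whole chunk.
        for _ in range(num_samples // batch_size):
            solver.step(1)
            iterations_done += 1
            # Write weights and solver state at a fixed interval.
            if iterations_done % snapshot_every == 0:
                solver.snapshot()
    return iterations_done
```

With a real solver this would be called with something like caffe.SGDSolver("solver.prototxt") and a generator yielding numpy chunks (both placeholders here). If the solver prototxt already sets a snapshot interval of 1000, the explicit snapshot() call is probably redundant.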

Related

TFLite model maker custom object detector training using tfrecord

I am trying to train a custom object detector using TFLite Model Maker (https://www.tensorflow.org/lite/tutorials/model_maker_object_detection). I want to deploy the trained TFLite model to a Coral Edge TPU. I want to use multiple TensorFlow TFRecord files as training input, as in the Object Detection API. I tried with
tflite_model_maker.object_detector.DataLoader(
    tfrecord_file_patten, size, label_map, annotations_json_file=None
)
but I am not able to get it to work. I have the following questions:
Is it possible to use TFRecords for training as described above?
Is it also possible to pass multiple CSV files for training?
For multiple CSV files, you could probably just append one file to the other. Then you'd just have to pass one csv file.
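Appending one CSV file to another could look roughly like this. It is only a sketch using the standard library: merge_csv_files and its has_header flag are names made up for this example, and whether your annotation CSVs carry a header row is an assumption you need to check yourself.

```python
import csv


def merge_csv_files(input_paths, output_path, has_header=False):
    """Concatenate several CSV files into one, row by row.

    If has_header is True, the header of the first file is kept and the
    headers of the remaining files are dropped. Blank lines are skipped.
    Returns the number of data rows written (the header is not counted).
    """
    rows_written = 0
    with open(output_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for index, path in enumerate(input_paths):
            with open(path, newline="") as in_file:
                reader = csv.reader(in_file)
                if has_header:
                    header = next(reader, None)
                    # Only the first file contributes its header.
                    if index == 0 and header is not None:
                        writer.writerow(header)
                for row in reader:
                    if row:  # skip empty lines
                        writer.writerow(row)
                        rows_written += 1
    return rows_written
```

The merged file can then be passed as the single CSV input. The file names you pass in are whatever your own data uses.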
As for passing a TFRecord instead, this should be possible. I'm also attempting to do this, so if I get it working I'll update my post. Looking at the source, it seems from_cache is the function used internally. Following that structure, you should be able to create a DataLoader object similarly:
train_data = DataLoader(tfrecord_file_patten, meta_data['size'],
                        meta_data['label_map'], ann_json_file)
In this case, tfrecord_file_patten should be a TFRecord of your training data. You can construct the validation and test data the same way. This will work provided you're constructing your TFRecords correctly. There appears to be some inconsistency in how it's done in different places, so make sure you follow the same structure in creating the TFRecords as found in the Model Maker source. This worked for me. One specific thing to watch out for is to use an integer for the 'image/source_id' feature in your TFExamples. If you use a string, it'll throw an error.

Resource Exhausted in Tensorflow with any architecture

I tried to train an image classifier using TensorFlow. I used the data API to load the dataset, and I used dataset caching to speed up the training process. While trying to train the model, I got stuck on a Resource Exhausted error. I tried different batch sizes such as 32, 64 and 128, but I could not overcome this problem.
I have also tried removing some layers, but that did not fix the error.
Check your batch_size and decrease it. It seems to be overwhelming the available memory.
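To get a feeling for why batch_size matters, here is some back-of-the-envelope arithmetic. It is only a sketch under stated assumptions: the 224x224x3 float32 input is a made-up example (the question does not give the image size), the helper names are hypothetical, and the estimate covers the input tensor only. Layer activations, which also scale with the batch size, usually take far more.

```python
def input_batch_bytes(batch_size, height, width, channels, bytes_per_value=4):
    """Bytes needed to hold one dense input batch (float32 by default)."""
    return batch_size * height * width * channels * bytes_per_value


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / (1024 ** 2)


if __name__ == "__main__":
    # Assumed image shape; replace with the shape of your own inputs.
    for batch_size in (32, 64, 128):
        size = input_batch_bytes(batch_size, 224, 224, 3)
        print(f"batch_size={batch_size}: {to_mib(size):.1f} MiB for the input tensor alone")
```

The cost grows linearly with the batch size, so halving the batch roughly halves this part. If the error shows up at every batch size, as described in the question, something else may be the bigger consumer, for example a dataset cache held in memory, but that is a guess.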

Saving custom variables in Keras .h5 file

I'm developing an RNN for a project, and I need to train it on one computer and be able to predict on another. The solution I found is to save the model to a .h5 file using the code below:
... # Train the data etc....
model.save("model.h5")
My problem is that I need to store some metadata from my training dataset and preprocessing and be able to load it together with the model (e.g. name of the dataset file, size of the dataset file, number of characters, etc.).
I don't want to store this information in a second file (e.g. a .txt file) because I would have to use two files. I don't want to use any additional library or framework for this task.
I was thinking (brainstorming) a code like this:
model.save("model.h5", metaData={'myVariableName': myVariable})
And to load would be:
myVariable = model.load("model.h5").getMetaData('myVariableName')
I know this is not possible in the current version, and I have already read the Keras docs, but I couldn't find any efficient method to do that. Notice that what I'm asking is different from custom_object, because I want to save and load my own variables.
Is there a smarter approach to solve this problem?

tensorflow save and restore autoencoder

I used tf.layers.dense to build a fully connected autoencoder, and I want to save it and restore only the encoder to get the embedding output.
How can I use tf.train.Saver to restore only the encoder? I want to use a different batch size with the restored model so that I can feed a single example into it.
I have seen many tutorials, but none of them covers this.
Is there a standard solution for this?
Thank you very much.
If you don't care about memory space, the easiest way is to save the whole graph (encoder and decoder) and, when using it for prediction, pass the last layer of the encoder as the fetch argument. TensorFlow will only compute up to that point, so there is no computational difference compared to saving only the encoder.
Otherwise you can create two graphs (one for the encoder, one for the decoder) and train them together. But this is a bit more complex.

Tensorflow: run time test metrics and data queues

I want to compute and display accuracy on the test set while the network is training.
In the MNIST tutorial that uses feeds, one can see that it can be done easily by feeding test data rather than train data. Simple solution to a simple problem.
However I am not able to find such an easy example when using queues for batching. AFAICS, the documentation proposes two solutions:
Offline testing with saved states. I don't want offline.
Making a second 'test' network that share weights with the network being trained. That doesn't sound simple and I have not seen an example of that.
Is there a third, easy way to compute test metrics at run time? Or is there an example somewhere of the second, test network with shared weights that proves me wrong by being super simple to implement?
If I understand your question correctly, you want to validate your model while training with queue inputs rather than feed_dict?
See my program that does this.
Here is a short explanation:
First, you need to convert your data into train and validation files like 'train.tfrecords' and 'valid.tfrecords'.
Second, in your training program, start two queues that parse these two files,
and use variable sharing to get the two logits for train and valid.
In my program this is done by
with tf.variable_scope("inference") as scope:
    logits = mnist.inference(images)
    scope.reuse_variables()
    validation_logits = mnist.inference(validation_images)
Then use logits to get the train loss and minimize it, and use validation_logits to get the validation accuracy.
