I was following this Tensorflow tutorial on creating a Convolutional Neural Network.
I'm at the step where the training and test data is read:
def main(unused_argv):
    mnist = learn.datasets.load_dataset("mnist")
    train_data = mnist.train.images  # Returns np.array
    train_labels = np.asarray(mnist.train.labels, dtype=np.int32)
    eval_data = mnist.test.images  # Returns np.array
    eval_labels = np.asarray(mnist.test.labels, dtype=np.int32)
Up to here, everything is fine.
But then suddenly an estimator is created:
mnist_classifier = learn.Estimator(
    model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model")
My questions are:
What is an Estimator?
The previous code doesn't save anything under "/tmp/mnist_convnet_model". How come there is a model saved under that directory?
How did it get there?
EDIT:
When I run the code, I get:
Couldn't find trained model at ../tmp/mnist_convnet_model.
This is because the model isn't found under that directory structure.
How can I put the model there? And why does it have to live on disk at all, instead of just being kept in memory for the duration of the script?
The first question is answered right there in the tutorial: an Estimator is "a TensorFlow class for performing high-level model training, evaluation, and inference".
The answer to the second question is that no, nothing is saved to that directory yet. The estimator object will use this directory to store training checkpoints, logs, etc. The first time you run this code it will not load anything, but once you have trained the model it will load the saved state from there.
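For reference, nothing appears in that directory until the classifier is actually trained; it is the training step later in the tutorial that writes the checkpoints. A minimal sketch of the idea, using the old tf.contrib.learn API the tutorial is based on (the batch size and step count are just the tutorial's illustrative values):
# Training writes checkpoints and event files into model_dir
# ("/tmp/mnist_convnet_model"); before this call the directory is empty.
mnist_classifier.fit(
    x=train_data,
    y=train_labels,
    batch_size=100,
    steps=20000)

# Evaluation (and later inference) restores the latest checkpoint from model_dir.
eval_results = mnist_classifier.evaluate(x=eval_data, y=eval_labels)
print(eval_results)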
Related
I have trained and saved a model on my video dataset for seq2seq sign language recognition (SLR), using code from GitHub (https://github.com/0aqz0/SLR).
I created a PyTorch model to classify images. I saved it both via its state_dict and as the entire model, like this:
torch.save(model.state_dict(), "model1_statedict")
torch.save(model, "model1_complete")
And I loaded my model:
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
and it works just fine, but I don't know how to get predictions out of it with my video input.
I don't know much about torch, so I don't know how to get output from my trained model; the authors did not show this in their publicly available code.
So now I need help on how to get the output from my model.
PS: My model takes a video input and outputs sentence-level text (e.g. "Hello, how are you?").
Define your model class first (in your case it is already defined), then load back what you saved:
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
See very detailed explanations here.
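Once the model is loaded and put in eval mode, getting a prediction is just a forward pass. A minimal sketch of the idea; the preprocessing helper, input shape, and the index-to-word mapping are assumptions here, since they depend on how the SLR repo prepares video and builds its vocabulary:
import torch

# Hypothetical helper: turn your video into a tensor with the same shape the
# model was trained on (you need to mirror the repo's preprocessing here).
video_tensor = preprocess_video("my_video.mp4")

model.eval()
with torch.no_grad():                        # no gradients needed for inference
    outputs = model(video_tensor)            # raw scores/logits from the network
    predicted_ids = outputs.argmax(dim=-1)   # most likely token at each step

# Map the predicted token ids back to words with the vocabulary used in training
# (index_to_word is hypothetical and must come from the repo's preprocessing).
sentence = " ".join(index_to_word[i] for i in predicted_ids.squeeze().tolist())
print(sentence)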
I have a model trained in SageMaker saved as a file, and I can load and ultimately score it locally like so:
import tarfile
import xgboost as xgb

local_model_path = "model.tar.gz"
with tarfile.open(local_model_path) as tar:
    tar.extractall()

model = xgb.XGBRegressor()
model.load_model("xgboost-model")
I wonder how I can determine the hyperparameters that were used to fit the saved model. I do not think these lines of code work (i.e. they do not show the hyperparameters the model was trained with):
booster = model.get_booster()
print(booster.save_config())
print(model.get_xgb_params())
How can I check the hyperparameters that were actually used? Any help would be very much appreciated. Thanks.
OK, forget the other answer, which I deleted.
This works for me; I don't see why get_xgb_params() would not work:
model = xgb.XGBRegressor()
model.load_model("xgboost-model")
model.get_xgb_params()
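If get_xgb_params() still does not show everything you expect, another option worth trying (not SageMaker-specific, just plain XGBoost) is to parse the booster's own configuration, which save_config() returns as a JSON string:
import json

booster = model.get_booster()
config = json.loads(booster.save_config())   # full internal configuration as a dict

# The training-time parameters live under the "learner" section, for example:
print(json.dumps(config["learner"], indent=2))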
I'm using TensorFlow to create a sentiment analysis model. Since I'm new to machine learning, I followed a guide in the official TensorFlow documentation to train and test a model with the imdb_reviews dataset. It works pretty well, but I wish I could train it with another dataset.
So I've downloaded this dataset: "movie_review.csv". It contains various columns and I want to access text and tag (where the tag is a positive or negative value and text is the text of the review).
What I want to do is to prepare the CSV as a dataset, access text and tag, vectorize them, and feed them to the network. There is no division between test and train, so I have to divide the file too.
So, I want to know how to:
0- Access the file I've downloaded and transform it into a dataset.
1- Access text and tag in the file, ideally without using pandas. If pandas is recommended and there is a simple way to access the file and pass it to a network using TensorFlow, I'll be okay with that answer.
2- Split the file into a test set and a train set (I've actually already found a pandas solution for this).
3- Vectorize my text and tag to feed my network.
If you have an entire guide on how to do this, that's fine too; it just has to use TensorFlow.
Questions 0 to 3 have been answered
OK, so I have used the posted file to load a dataset of short sentences to train the model on, but I'm having trouble with the training.
When I followed the guide to build the model for text classification I came out with this code:
dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True, as_supervised=True)
train_dataset, test_dataset = dataset['train'], dataset['test']
encoder = info.features['text'].encoder
BUFFER_SIZE = 10000
BATCH_SIZE = 64
padded_shapes = ([None], ())
train_dataset = train_dataset.shuffle(BUFFER_SIZE).padded_batch(BATCH_SIZE, padded_shapes = padded_shapes)
test_dataset = test_dataset.padded_batch(BATCH_SIZE, padded_shapes = padded_shapes)
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(encoder.vocab_size, 64),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')])

model.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.Adam(1e-4),
              metrics=['accuracy'])

history = model.fit(train_dataset, epochs=1, validation_data=test_dataset,
                    validation_steps=30, callbacks=[cp_callback])
So, I trained my model this way (some parts are missing; I have included all the fundamental ones). After this, I wanted to train the model with another dataset, and thanks to Andrew I have accessed a dataset I created myself, like this:
csv_dataset = tf.data.experimental.CsvDataset(filepath, default_values, header=header)
def reshape_dataset(txt, tag):
    txt = tf.reshape(txt, shape=(1,))
    tag = tf.reshape(tag, shape=(1,))
    return txt, tag

csv_dataset = csv_dataset.map(reshape_dataset)
training = csv_dataset.take(10)
testing = csv_dataset.skip(10)
My problem is adapting this dataset to the model I already have. I have tried various solutions, but I keep getting errors about the shapes.
Could somebody be so kind as to explain how to do this? The solution for step 3 has already been posted by Andrew in his file, but I'd like to use my own model with the weights I saved during training.
This sounds like a great place to use TensorFlow's Dataset API. Here's a notebook/tutorial that covers basic data input and preprocessing, right from TensorFlow's website!
I have also made a notebook with a quick example, answering each of your questions with implementations. You can find that here.
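To tie this back to the shape errors above: the raw CSV rows need to go through the same encoding and padded batching as the original imdb_reviews pipeline before they will fit the existing model. A rough sketch of that idea; the column dtypes and the reuse of the subwords8k encoder are assumptions about your setup:
# Encode raw text with the same subword encoder the model was trained with,
# then pad-batch to the shapes the model expects.
def encode_text(txt, tag):
    # txt and tag have shape (1,) after reshape_dataset
    return encoder.encode(txt.numpy()[0].decode('utf-8')), tag

def encode_map_fn(txt, tag):
    encoded, tag = tf.py_function(encode_text, inp=[txt, tag],
                                  Tout=(tf.int64, tf.int32))
    encoded.set_shape([None])   # variable-length token sequence
    tag.set_shape([1])          # single label per example
    return encoded, tag

training = training.map(encode_map_fn).padded_batch(
    BATCH_SIZE, padded_shapes=([None], [1]))
testing = testing.map(encode_map_fn).padded_batch(
    BATCH_SIZE, padded_shapes=([None], [1]))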
I'm trying to restore the trained generator of a Generative Adversarial Network from a saved TensorFlow model (the metagraph and the checkpoint).
I'm new to TensorFlow and Python, so I'm not sure if what I'm doing makes sense. I have already tried importing the metagraph from the .meta file and restoring the variables from the checkpoint, but I'm not sure what to do next. My goal is to restore the trained generator from the last checkpoint and then use it to generate new data from noise input.
Here's a link to a Drive folder containing the model files:
https://drive.google.com/drive/folders/1MaELMC4aOroSQlMJ32J3_ff3wxiBT_Fq?usp=sharing
So far I have tried the following and it seems to be loading the graph:
# import the graph from the file
imported_graph = tf.train.import_meta_graph("../../models/model-9.meta")

# list all the tensors in the graph
for tensor in tf.get_default_graph().get_operations():
    print(tensor.name)

# run the session
with tf.Session() as sess:
    # restore the saved variables
    imported_graph.restore(sess, "../../models/model-9")
However, I'm not sure what to do next. Is it possible to run only the trained generator using these files? How can I access it?
In the TensorFlow 2 docs, they save both the generator and the discriminator. However, they do not explain how to restore only the generator.
checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
checkpoint = tf.train.Checkpoint(generator_optimizer=generator_optimizer,
                                 discriminator_optimizer=discriminator_optimizer,
                                 generator=generator,
                                 discriminator=discriminator)
And then restore with
checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))
From https://www.tensorflow.org/tutorials/generative/dcgan#save_checkpoints
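If you only need the generator, the restoring checkpoint does not have to list every object that was saved. A hedged sketch in the same TF2 style; it assumes you can rebuild the generator architecture with the same make_generator_model() used in the tutorial, and it uses expect_partial() because the discriminator and optimizer variables are deliberately left unrestored:
# Rebuild the generator with the same architecture used during training.
generator = make_generator_model()

# Map only the generator; the other saved objects are simply ignored.
checkpoint = tf.train.Checkpoint(generator=generator)
status = checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))
status.expect_partial()              # we knowingly restore only part of the checkpoint

# Generate new data from noise input (a latent dimension of 100 is an assumption).
noise = tf.random.normal([16, 100])
generated_images = generator(noise, training=False)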
I am looking for some suggestions on how to continue training in Theano. For example, I have the following:
classifier = my_classifier()
cost = ()
updates = []
train_model = theano.function(...)
eval_model = theano.function(...)

best_accuracy = 0
while (epoch < n_epochs):
    train_model()
    current_accuracy = eval_model()
    if current_accuracy > best_accuracy:
        # save classifier, or save theano functions?
        best_accuracy = current_accuracy
    else:
        # load the saved classifier (or the saved theano functions?)
        # if we saved the classifier previously, do we need to redefine
        # the train_model and eval_model functions?
    epoch += 1

# training is finished
# save classifier
I want to save the current trained model if it has higher accuracy than previously trained models, and load the saved model later if the current trained model accuracy is lower than the best accuracy.
My questions are:
When saving, should I save the classifier, or theano functions?
If the classifier is what needs to be saved, do I need to redefine the theano functions when loading it, since the classifier has changed?
Thanks,
When pickling models, it is always better to save just the parameter values; when loading, re-create the shared variables from those values and rebuild the graph from them. This allows you to swap the device between CPU and GPU.
But you can pickle Theano functions. If you do that, pickle all associated functions at the same time; otherwise each of them will end up with its own copy of the shared variables. Each call to load() creates new shared variables if they were pickled. This is a limitation of pickle.
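As a hedged sketch of the first option, assuming your classifier exposes its shared variables as a classifier.params list (the usual convention in the Theano tutorials; the attribute name may differ in your code):
import pickle

def save_params(classifier, path):
    # Save only the numeric values of the shared variables, not the graph.
    values = [p.get_value() for p in classifier.params]
    with open(path, 'wb') as f:
        pickle.dump(values, f)

def load_params(classifier, path):
    # Copy the saved values back into the existing shared variables;
    # the graph itself is left untouched.
    with open(path, 'rb') as f:
        values = pickle.load(f)
    for p, v in zip(classifier.params, values):
        p.set_value(v)

With this approach you do not need to redefine train_model and eval_model after loading: the theano functions still reference the same shared variables, only their values change.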