I am trying to train a ResNet model for CIFAR10 in TensorFlow using the following repo: https://github.com/stanford-futuredata/dawn-bench-models/tree/master/tensorflow/CIFAR10/resnet. The README mentions TensorFlow 1.2, but installing it failed with "Could not find a version that satisfies the requirement tensorflow==1.2", so I am using TensorFlow 1.15 instead. I am on a Mac with Python 3.7.6. When I try to run the training script resnet_main.py:
python3 resnet/resnet_main.py --train_data_path=cifar10/data_batch* \
--log_root=/tmp/resnet_model \
--train_dir=/tmp/resnet_model/train \
--dataset='cifar10'
I get the following command line error: zsh: no matches found: --train_data_path=cifar10/data_batch*. I imagine it has to do with the *, though I'm not sure, and I don't know what the workaround is. Thanks!
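The answer is as simple as adding single quotes around all the file paths, such as --train_data_path='cifar10/data_batch*'. Without the quotes, zsh tries to expand the * itself and aborts when no file name matches the whole word; with them, the pattern is passed to the script unchanged.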
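A minimal sketch of what happens on the script's side once the quoted pattern arrives unexpanded. It assumes the training script expands the pattern with something equivalent to Python's glob module; the temporary directory and the helper name expand_pattern are made up for illustration and are not part of the repo.

```python
import glob
import os
import tempfile


def expand_pattern(pattern):
    """Expand a glob pattern the way a program can after receiving it quoted."""
    return sorted(glob.glob(pattern))


# Throwaway directory that mimics cifar10/data_batch_1 ... data_batch_5.
demo_dir = tempfile.mkdtemp()
for i in range(1, 6):
    open(os.path.join(demo_dir, "data_batch_%d" % i), "w").close()

matches = expand_pattern(os.path.join(demo_dir, "data_batch*"))
print(len(matches), "files matched")
```

Unlike zsh, glob.glob returns an empty list when nothing matches, so a wrong path shows up later as "no input files" rather than as a shell error.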
I tried running this command, but I get errors saying that I don't have TensorFlow 2.2 or higher. I checked, and I do have the correct version of TensorFlow. I also ran the pip3 install keras command.
I know for a fact that all of the code is correct, because it worked for my teacher the other day and nothing has changed. I just need to run his commands, but I keep running into problems.
I am doing this course by following everything he does in a recorded video, so there should be no issue there, but for some reason it just doesn't work.
Just install TensorFlow as requested in the last line of the error message: pip install tensorflow. Keras needs it as a backend.
Also, since Keras is now part of TensorFlow, I recommend writing imports as from tensorflow.keras.[submodule name] import instead of from keras.[submodule name] import
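When pip reports a package as installed but the import still fails, one common cause is that pip3 and python point at different interpreters. The sketch below is one way to check what the running interpreter can actually see. The helper name describe_package is hypothetical, and importlib.metadata requires Python 3.8 or newer.

```python
import importlib.util
import sys
from importlib import metadata


def describe_package(name):
    """Return (is_importable, installed_version_or_None) for this interpreter."""
    importable = importlib.util.find_spec(name) is not None
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        version = None
    return importable, version


print("interpreter:", sys.executable)
for package in ("tensorflow", "keras"):
    print(package, describe_package(package))
```

Running it with the same command used to start the course code (python or python3) shows whether that interpreter, rather than some other one, has TensorFlow available.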
Let me start by saying that I am a beginner in deep learning and am trying to find my way by following the TensorFlow tutorial, which mainly applies the Inception v3 model to the flowers data set:
https://www.tensorflow.org/tutorials/image_retraining
It includes the following:
cd ~
curl -O (flower data link) -- runs fine
tar xzf flower_photos.tgz --runs fine
bazel build tensorflow/examples/image_retraining:retrain --error: no bazel command found
In order to follow this tutorial, I also completed the TensorFlow installation tutorial and modified the wheel URL (replacing 35 with 36) to get a Python 3.6 compatible whl, as follows: pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow-1.2.1-cp36-cp36m-win_amd64.whl
Now back to the main question: after downloading the flower data set and installing the Bazel package and cygwin64, I went into the Bazel folder and ran the configure file as suggested in the forums, as well as touch WORKSPACE and bagel build. When I run the command "bazel build tensorflow/examples/image_retraining:retrain" I still get the error: "Bazel command not found"
I looked at similar questions on Stack Overflow before opening my own, such as: questions- 41791171/bazel-build-for-tensorflow-inception-model, and git clone'd the entire TensorFlow folder as instructed, but this resulted in the error :bagel: command not found
To summarize, how can I run the Tensorflow Flowerset tutorial and overcome the errors of :bagel: command not found and :bazel: command not found?
It's not mandatory to use Bazel for the TensorFlow Image Retraining tutorial.
You can also run the retrain.py located in the \tensorflow\examples\image_retraining\ folder cloned from the TensorFlow GitHub repo to retrain the Inception v3 model or Mobilenet model.
https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/examples/image_retraining/retrain.py
Put the flowers dataset folder (flower_photos) under the image_retraining folder and run retrain.py as below:
python retrain.py --image_dir flower_photos
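The same invocation can be assembled from Python, which also makes it easy to check first whether bazel is on PATH at all. This is only a sketch: the helper names bazel_available and retrain_command are hypothetical, and only the --image_dir flag shown above is used.

```python
import shutil
import sys


def bazel_available():
    """Return True if a `bazel` executable can be found on PATH."""
    return shutil.which("bazel") is not None


def retrain_command(image_dir, script="retrain.py"):
    """Build the argument list for running retrain.py with the current interpreter."""
    return [sys.executable, script, "--image_dir", image_dir]


if not bazel_available():
    print("bazel not on PATH; plain Python fallback:")
print(" ".join(retrain_command("flower_photos")))
```

The returned list can be passed to subprocess.run as-is. Using sys.executable keeps the script on the same interpreter that has TensorFlow installed.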
You should see the script download the Inception v3 model.
The image retraining is then in progress.
After the retraining is completed, you should see the below:
Copy output_graph.pb and output_labels.txt, the retraining outputs written to the C:\tmp folder, into the image_retraining folder.
To verify the retrained model, you can run the label_image.py as below.
It should show the top 5 predictions.
python label_image.py --image=flower_photos\daisy\21652746_cc379e0eea_m.jpg --graph=output_graph.pb --labels=output_labels.txt
The expected output should be as below:
I have retrained Tensorflow's Inception V3's last layer on a flower dataset. This was done using:
bazel-bin/tensorflow/examples/image_retraining/retrain --image_dir ~/flower_photos
The training was successful, and then I ran:
bazel build tensorflow/examples/label_image:label_image
This too ran fine and next I ran:
bazel-bin/tensorflow/examples/label_image/label_image \
--graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt \
--output_layer=final_result \
--image=$HOME/flower_photos/daisy/21652746_cc379e0eea_m.jpg
When I run this, I receive an error that says
E tensorflow/examples/label_image/main.cc:285] Not found: Failed to load compute graph at '/tmp/output_graph.pb'
Any help is greatly appreciated, thank you.
I ran this with the following command instead of bazel, and I found it easier.
python /path_to_file/label_image.py /path_to_image/image.jpeg
First make sure that the graph was created after you ran retrain.py and that it is at the correct location (the default is inside /tmp/). If you want, you can get the script here and change the file locations as needed.
Note: If the graph file is not created, you may want to check this regarding running retrain.py
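A quick way to confirm that the retraining outputs exist before running label_image, sketched under the assumption that the default file names output_graph.pb and output_labels.txt from the question are in use. The helper name missing_outputs is hypothetical.

```python
import os


def missing_outputs(directory, names=("output_graph.pb", "output_labels.txt")):
    """Return the expected retrain outputs that are absent from `directory`."""
    return [name for name in names
            if not os.path.isfile(os.path.join(directory, name))]


missing = missing_outputs("/tmp")
if missing:
    print("Not found in /tmp:", ", ".join(missing))
else:
    print("Both retrain outputs are present in /tmp.")
```

If output_graph.pb is reported missing, the "Failed to load compute graph" error is expected, and the fix is to rerun retraining or point --graph at the directory where the file was actually written.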
When following the Readme to fine-tune Google's Inception-v3 image classification model, I get the error:
File "/Path/to/Model/bazel-bin/inception/flowers_train.runfiles/inception/inception/slim/ops.py", line 88, in batch_norm
initializer=tf.zeros_initializer(),
TypeError: zeros_initializer() takes at least 1 argument (0 given)
This occurs after running the final command:
bazel-bin/inception/flowers_train \
--train_dir="${TRAIN_DIR}" \
--data_dir="${FLOWERS_DATA_DIR}" \
--pretrained_model_checkpoint_path="${MODEL_PATH}" \
--fine_tune=True \
--initial_learning_rate=0.001 \
--input_queue_memory_factor=1
I have no idea what's going on here, as this error is thrown from a Python file written by the TF team. Additionally, being a TF newbie, I do not know my way around well enough to attempt a deep debugging session. Just by looking at the path in the error, might there be an issue with the script running TF-Slim code?
Anyhow, I am running macOS Sierra with Python 3.6 and the TensorFlow Python API r0.12.
So it turns out this error is thrown when the current installation of TensorFlow does not have the most recent tensorflow-slim code. Install directions here.
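One plausible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: as far as I recall from the TensorFlow 1.0 release notes, tf.zeros_initializer changed from a function that takes a shape into a factory called with no shape argument, so model code written for the newer API fails on r0.12 with exactly this kind of TypeError. A small sketch for checking whether an installed version predates that change (the helper names are hypothetical, and plain numeric major and minor parts are assumed):

```python
def parse_major_minor(version):
    """Turn a version string such as '0.12.1' into a (major, minor) tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def predates(version, minimum=(1, 0)):
    """Return True if `version` is older than the `minimum` (major, minor) pair."""
    return parse_major_minor(version) < minimum


for candidate in ("0.12.1", "1.15.0"):
    print(candidate, "predates 1.0:", predates(candidate))
```

In a live session the string to test would come from tf.__version__.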
When I follow the tutorial "How to Retrain Inception's Final Layer for New Categories", I need to build the retrainer like this:
bazel build tensorflow/examples/image_retraining:retrain
However, my TensorFlow installation on Windows does not have such a directory. I am wondering why, and how I can solve the problem.
Thank you in advance
In my case tensorflow version is 1.2 and corresponding retrain.py is here.
Download and extract flowers images from here.
Now run the retrain.py file as
python retrain.py --image_dir=path\to\dir\where\flowers\images\where\extracted --output_labels=retrained_labels.txt --output_graph=retrained_graph.pb
Note: the last two arguments in the above command are optional.
Now to test the retrained model:
Go to the master branch and download the label_image.py code as shown below
Then run python label_image.py --image=image/path/to/test/classification --graph=retrained_graph.pb --labels=retrained_labels.txt
The result will be like
From the screenshot, it appears that you have installed the TensorFlow PIP package, whereas the instructions in the image retraining tutorial assume that you have cloned the Git repository (and can use bazel to build TensorFlow).
However, fortunately the script (retrain.py) for image retraining is a simple Python script, which you can download and run without building anything. Simply download the copy of retrain.py from the branch of the TensorFlow repository that matches your installed package (e.g. if you've installed TensorFlow 0.12, you can download this version), and you should be able to run it by typing python retrain.py at the Command Prompt.
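A sketch of how the matching download URL could be derived from an installed version string. It assumes the repository's release branches are named r<major>.<minor> (for example r0.12 or r1.2) and that retrain.py lives at the same path as in the master URL quoted earlier, which may not hold for every release. The helper names are hypothetical.

```python
RAW_TEMPLATE = (
    "https://raw.githubusercontent.com/tensorflow/tensorflow/"
    "{branch}/tensorflow/examples/image_retraining/retrain.py"
)


def release_branch(version):
    """Map a version such as '0.12.1' to an assumed branch name such as 'r0.12'."""
    major, minor = version.split(".")[:2]
    return "r%s.%s" % (major, minor)


def retrain_url(version):
    """Build the raw-file URL for retrain.py on the assumed release branch."""
    return RAW_TEMPLATE.format(branch=release_branch(version))


print(retrain_url("0.12.1"))
```

The version string can be read from tf.__version__. If the URL returns a 404, the file was probably moved or removed in that release, and the branch has to be browsed by hand.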
I had the same problem on Windows. My Windows machine could not find script.retrain. I downloaded the retrain.py file from the TensorFlow website here. Then I copied the file into the tensorflow folder and ran the retrain script using the python command.