I cloned the TensorFlow object detection model from GitHub:
github link
I want to train this model on my own data (331 images of Samoyed dogs), following this blog tutorial: click here
My steps:
Created a PASCAL VOC format dataset;
Downloaded the pretrained model (ssd_mobilenet_v1_coco_11_06_2017.tar.gz);
Changed the config file (ssd_mobilenet_v1_pets.config);
Started the training process with this command:
python object_detection/train.py \
--logtostderr \
--pipeline_config_path=./samoyed_test_and_train/training/ssd_mobilenet_v1_pets.config \
--train_dir=./samoyed_test_and_train/data/train.record
But I get errors. My OS is macOS, and I also tried on AWS, where the same problem occurs. Can you figure out my mistake? Errors:
INFO:tensorflow:Summary name Learning Rate is illegal; using Learning_Rate instead.
WARNING:tensorflow:From /Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/object_detection/meta_architectures/ssd_meta_arch.py:579: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Please use tf.global_variables instead.
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
2017-08-01 10:34:42.992224: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 10:34:42.992254: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 10:35:00.359032: I tensorflow/core/common_runtime/simple_placer.cc:675] Ignoring device specification /device:GPU:0 for node 'prefetch_queue_Dequeue' because the input edge from 'prefetch_queue' is a reference connection and already has a device field set to /device:CPU:0
INFO:tensorflow:Restoring parameters from /Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/samoyed_test_and_train/training/model.ckpt
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.FailedPreconditionError'>, ./samoyed_test_and_train/data/train.record/graph.pbtxt.tmpf4587d1958df43cbaa9a0d7a04199f6f
2017-08-01 10:35:29.556458: E tensorflow/core/util/events_writer.cc:62] Could not open events file: ./samoyed_test_and_train/data/train.record/events.out.tfevents.1501554929.MacBook-Pro.local: Failed precondition: ./samoyed_test_and_train/data/train.record/events.out.tfevents.1501554929.MacBook-Pro.local
2017-08-01 10:35:29.556480: E tensorflow/core/util/events_writer.cc:95] Write failed because file could not be opened.
Traceback (most recent call last):
File "object_detection/train.py", line 198, in <module>
tf.app.run()
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 194, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/object_detection/trainer.py", line 290, in train
saver=saver)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 732, in train
master, start_standard_services=False, config=session_config) as sess:
File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 964, in managed_session
self.stop(close_summary_writer=close_summary_writer)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 792, in stop
stop_grace_period_secs=self._stop_grace_secs)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 953, in managed_session
start_standard_services=start_standard_services)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 709, in prepare_or_wait_for_session
self._write_graph()
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 612, in _write_graph
self._logdir, "graph.pbtxt")
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/framework/graph_io.py", line 67, in write_graph
file_io.atomic_write_string_to_file(path, str(graph_def))
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 418, in atomic_write_string_to_file
write_string_to_file(temp_pathname, contents)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 305, in write_string_to_file
f.write(file_content)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 101, in write
self._prewrite_check()
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 87, in _prewrite_check
compat.as_bytes(self.__name), compat.as_bytes(self.__mode), status)
File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: ./samoyed_test_and_train/data/train.record/graph.pbtxt.tmpf4587d1958df43cbaa9a0d7a04199f6f
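The train_dir flag is meant to point at some (typically empty) directory where your training logs and checkpoints will be written during training. For example, it could be something like train_dir=/tmp/training_directory. It looks like you are pointing it at your dataset (./samoyed_test_and_train/data/train.record), which the config file should already be pointing at. Since train.record is presumably a file rather than a directory, TensorFlow cannot create graph.pbtxt or the events file underneath it, which is what the FailedPreconditionError reports.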
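As a minimal sketch of that advice, the helper below checks a candidate train_dir before training starts. The function name prepare_train_dir and the example paths are hypothetical and not part of the object detection API; it only uses os calls that work on both Python 2.7 and Python 3.

```python
import os


def prepare_train_dir(path):
    """Return `path` once it is known to be a directory usable as --train_dir.

    Hypothetical helper: it refuses paths that are existing files (such as a
    TFRecord dataset) and creates the directory when it does not exist yet.
    """
    if os.path.isfile(path):
        raise ValueError(
            "train_dir must be a directory, but %r is an existing file" % path)
    if not os.path.isdir(path):
        os.makedirs(path)
    return path
```

Calling prepare_train_dir("./samoyed_test_and_train/data/train.record") would raise if train.record is the dataset file, while a fresh path such as "./samoyed_test_and_train/training_output" (an assumed name) would be created and returned for use with --train_dir.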
I need some clear instructions on how to execute some code.
Context:
This is a python machine learning peptide binding script, but you don't need to know biology to help me.
I am trying to recreate this scientific paper to test its validity and if I can use it. I work in the biotech industry and am only somewhat familiar with C# and python.
The paper is linked to a GitHub page, and the GitHub page has some instructions on how to execute the code. But every time I try to execute the code as instructed, it gives me an error. I already installed its requirements (the latest versions of pytorch, numpy and scikit-learn) and I also switched between GPU and CPU, but neither worked. I don't know what to do at this point.
Paper Title:
"Prediction of Specific TCR-Peptide Binding From Large Dictionaries of TCR-Peptide Pairs" by Ido Springer, Hanan Besser, et al.
Paper's GitHub (found in the paper's abstract):
https://github.com/louzounlab/ERGO
These are the example codes I input in the terminal. The example code was found in a comment at the end of ERGO.py
GPU ver:
python ERGO.py train lstm mcpas specific cuda:0 --model_file=model.pt --train_data_file=train_data --test_data_file=test_data
GPU code results:
Traceback (most recent call last):
File "D:\D Download\ERGO-master\ERGO.py", line 437, in <module>
main(args)
File "D:\D Download\ERGO-master\ERGO.py", line 141, in main
model, best_auc, best_roc = lstm.train_model(train_batches, test_batches, args.device, arg, params)
File "D:\D Download\ERGO-master\lstm_utils.py", line 163, in train_model
model.to(device)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 927, in to
return self._apply(convert)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
module._apply(fn)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 602, in _apply
param_applied = fn(param)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 925, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda\__init__.py", line 211, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
CPU code ver (only replaced specific cuda:0 with specific cpu):
python ERGO.py train lstm mcpas specific cpu --model_file=model.pt --train_data_file=train_data --test_data_file=test_data
CPU code results:
epoch: 1
C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\functional.py:1960: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Traceback (most recent call last):
File "D:\D Download\ERGO-master\ERGO.py", line 437, in <module>
main(args)
File "D:\D Download\ERGO-master\ERGO.py", line 141, in main
model, best_auc, best_roc = lstm.train_model(train_batches, test_batches, args.device, arg, params)
File "D:\D Download\ERGO-master\lstm_utils.py", line 173, in train_model
loss = train_epoch(batches, model, loss_function, optimizer, device)
File "D:\D Download\ERGO-master\lstm_utils.py", line 137, in train_epoch
loss = loss_function(probs, batch_signs)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\loss.py", line 613, in forward
return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\functional.py", line 3074, in binary_cross_entropy
raise ValueError(
ValueError: Using a target size (torch.Size([50])) that is different to the input size (torch.Size([50, 1])) is deprecated. Please ensure they have the same size.
Looking at the ValueError, it seems that what you're trying to do is deprecated in PyTorch, so you probably have a more recent version of the package than the one the code was developed with. I suggest you try
pip install torch==1.4.0
in the command line.
I'm not familiar with PyTorch, but managing tensor shapes in TensorFlow is the biggest pain for me. The actual problem looks to be that the input has one more dimension than the target (torch.Size([50, 1]) versus torch.Size([50])), so you would have to manually reshape one of them.
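A minimal sketch of that reshape, assuming the loss is nn.BCELoss (the traceback goes through F.binary_cross_entropy) and reusing the names probs and batch_signs from the traceback. The tensors here are random stand-ins with the shapes from the error message, not ERGO's real data.

```python
import torch
import torch.nn as nn

loss_function = nn.BCELoss()

# Stand-in tensors with the shapes reported in the ValueError
probs = torch.sigmoid(torch.randn(50, 1))         # model output, shape [50, 1]
batch_signs = torch.randint(0, 2, (50,)).float()  # targets, shape [50]

# Flatten the predictions so both tensors have shape [50]
loss = loss_function(probs.view(-1), batch_signs)

# Equivalent alternative: give the targets a trailing dimension instead
loss_alt = loss_function(probs, batch_signs.unsqueeze(1))
```

Either form could be applied around the loss_function(probs, batch_signs) call that the traceback points to in lstm_utils.py; which one fits better depends on how the rest of that code uses probs.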
I am a university student working on my graduation project with the EfficientDet model.
I am fine-tuning EfficientDet on my custom dataset.
python = 3.7
tensorflow = 2.5
tensorflow-gpu = 1.15
cuda = 11.0
These are the versions of the packages I installed.
The error message is:
=====> Starting training, epoch: 1.
WARNING:tensorflow:From /home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0527 01:25:16.531095 140153861850944 deprecation.py:323] From /home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
Traceback (most recent call last):
File "main.py", line 407, in <module>
app.run(main)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "main.py", line 400, in main
run_train_and_eval(e)
File "main.py", line 384, in run_train_and_eval
max_steps=e * FLAGS.num_examples_per_epoch // FLAGS.train_batch_size)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 370, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1161, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1188, in _train_model_default
input_fn, ModeKeys.TRAIN))
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1025, in _get_features_and_labels_from_input_fn
self._call_input_fn(input_fn, mode))
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1116, in _call_input_fn
return input_fn(**kwargs)
File "/home/ais-public/ms_ys/Liberty/automl/efficientdet/dataloader.py", line 431, in __call__
_prefetch_dataset, num_parallel_calls=tf.data.AUTOTUNE)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_core/python/util/module_wrapper.py", line 193, in __getattr__
attr = getattr(self._tfmw_wrapped_module, name)
AttributeError: module 'tensorflow._api.v1.data' has no attribute 'AUTOTUNE'
I can't find the 'tensorflow/_api/v1/data' folder and I don't know why this error occurs.
Please tell me how to solve this error. Thanks.
Try to find this file and line:
File "/home/ais-public/ms_ys/Liberty/automl/efficientdet/dataloader.py", line 431, in __call__
_prefetch_dataset, num_parallel_calls=tf.data.AUTOTUNE)
and change tf.data.AUTOTUNE to tf.data.experimental.AUTOTUNE.
You have an older tensorflow-gpu version (1.15), and the traceback shows that it is the one being imported (tensorflow._api.v1). AUTOTUNE is no longer experimental in newer releases, as far as I know.
Or try to use just tensorflow 2.5, to avoid version conflicts.
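If the same dataloader has to run under both old and new installs, one option is to look the constant up defensively. This is only a sketch: resolve_autotune is a hypothetical helper, and the SimpleNamespace objects below are stand-ins for tf.data so the snippet runs without TensorFlow installed (the value -1 is just a placeholder).

```python
from types import SimpleNamespace


def resolve_autotune(data_module):
    """Return AUTOTUNE from a tf.data-like module.

    Prefers data_module.AUTOTUNE and falls back to
    data_module.experimental.AUTOTUNE when the first is missing.
    """
    autotune = getattr(data_module, "AUTOTUNE", None)
    if autotune is None:
        autotune = data_module.experimental.AUTOTUNE
    return autotune


# Stand-ins for tf.data in an older and a newer install (placeholder values)
old_style_data = SimpleNamespace(experimental=SimpleNamespace(AUTOTUNE=-1))
new_style_data = SimpleNamespace(
    AUTOTUNE=-1, experimental=SimpleNamespace(AUTOTUNE=-1))
```

In dataloader.py this would be used as num_parallel_calls=resolve_autotune(tf.data), assuming tf is the imported tensorflow module.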
I am new to StackOverflow. I am trying to export a model using export_inference_graph.py.
I trained my model locally using faster_rcnn_inception_v2. I am following this tutorial.
When I type the following in the command prompt:
python export_inference_graph.py --input_type image_tensor --pipeline_config_path CAPTCHA_training/faster_rcnn_inception_v2_coco.config --trained_checkpoint_prefix "CAPTCHA_training_dir/model.ckpt-51272" --output_directory CAPTCHA_inference_graph
with all paths correct, I get the following error:
File "export_inference_graph.py", line 206, in <module>
tf.app.run()
File "C:\Users\Jatin\anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "C:\Users\Jatin\anaconda3\lib\site-packages\absl\app.py", line 303, in run
_run_main(main, args)
File "C:\Users\Jatin\anaconda3\lib\site-packages\absl\app.py", line 251, in _run_main
sys.exit(main(argv))
File "export_inference_graph.py", line 194, in main
exporter.export_inference_graph(
File "C:\Users\Jatin\anaconda3\lib\site-packages\object_detection\exporter.py", line 604, in export_inference_graph
detection_model = model_builder.build(pipeline_config.model,
File "C:\Users\Jatin\anaconda3\lib\site-packages\object_detection\builders\model_builder.py", line 1116, in build
return build_func(getattr(model_config, meta_architecture), is_training,
File "C:\Users\Jatin\anaconda3\lib\site-packages\object_detection\builders\model_builder.py", line 583, in _build_faster_rcnn_model
_check_feature_extractor_exists(frcnn_config.feature_extractor.type)
File "C:\Users\Jatin\anaconda3\lib\site-packages\object_detection\builders\model_builder.py", line 249, in _check_feature_extractor_exists
raise ValueError('{} is not supported. See `model_builder.py` for features '
ValueError: faster_rcnn_inception_v2 is not supported. See `model_builder.py` for features extractors compatible with different versions of Tensorflow
I am using Python 3.8.5 and tensorflow version 2.4.1
Thanks in advance
According to the error, this looks like a TensorFlow version problem. Looking into the source, faster_rcnn_inception_v2 is only registered under if tf_version.is_tf1():. Try using TF 1.
https://github.com/tensorflow/models/blob/5a89897396aa8ecc7b3ef8919f987e96fc8d74db/research/object_detection/builders/model_builder.py#L70
https://github.com/tensorflow/models/blob/5a89897396aa8ecc7b3ef8919f987e96fc8d74db/research/object_detection/models/faster_rcnn_inception_resnet_v2_feature_extractor.py#L33
faster_rcnn_inception_v2_coco exists here in Tensorflow 1 Detection model zoo.
https://github.com/tensorflow/models/blob/5a89897396aa8ecc7b3ef8919f987e96fc8d74db/research/object_detection/g3doc/tf1_detection_zoo.md#coco-trained-models
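To make the version gate concrete, here is a simplified sketch of that kind of check. It is not the real code from model_builder.py: TF1_ONLY_EXTRACTORS is an illustrative one-element set and check_feature_extractor is a hypothetical helper that only mirrors the idea of registering some feature extractors for TF 1 alone.

```python
# Illustrative subset only; the real registry lives in model_builder.py
TF1_ONLY_EXTRACTORS = {"faster_rcnn_inception_v2"}


def major_version(version_string):
    """Return the major version number from a string such as '2.4.1'."""
    return int(version_string.split(".")[0])


def check_feature_extractor(name, tf_version_string):
    """Raise ValueError when a TF1-only extractor is requested under TF 2."""
    if name in TF1_ONLY_EXTRACTORS and major_version(tf_version_string) != 1:
        raise ValueError(
            "{} is not supported under TensorFlow {}".format(
                name, tf_version_string))
    return True
```

With the versions from the question, check_feature_extractor("faster_rcnn_inception_v2", "2.4.1") raises, while passing a 1.x version string such as "1.15.0" returns True.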
I am trying to quantize my model (specifically the faster_rcnn_inception_v2 model pretrained on COCO, downloaded from the model zoo), in the hope of speeding up inference.
I use the following code from here:
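import tensorflow as tf
# Placeholder path: the directory that is expected to contain saved_model.pb
saved_model_dir = "path/to/saved_model_dir"
converter = tf.lite.TocoConverter.from_saved_model(saved_model_dir)
converter.post_training_quantize = True
tflite_quantized_model = converter.convert()
# Write the converted model to disk and close the file afterwards
with open("quantized_model.tflite", "wb") as f:
    f.write(tflite_quantized_model)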
The model's directory didn't have a saved_model.pb file, so I renamed frozen_inference_graph.pb to saved_model.pb.
Running the code above produces the following runtime error:
Traceback (most recent call last):
File "/home/juggernaut/pycharm-community-2018.2.4/helpers/pydev/pydevd.py", line 1664, in <module>
main()
File "/home/juggernaut/pycharm-community-2018.2.4/helpers/pydev/pydevd.py", line 1658, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/juggernaut/pycharm-community-2018.2.4/helpers/pydev/pydevd.py", line 1068, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/hdd/motorola/motorola_heads/tensorflow_face_detection/quantize.py", line 5, in <module>
converter = tf.lite.TocoConverter.from_saved_model(saved_model_dir)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 318, in new_func
return func(*args, **kwargs)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/lite/python/lite.py", line 587, in from_saved_model
tag_set, signature_key)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/lite/python/lite.py", line 376, in from_saved_model
output_arrays, tag_set, signature_key)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/lite/python/convert_saved_model.py", line 254, in freeze_saved_model
meta_graph = get_meta_graph_def(saved_model_dir, tag_set)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/lite/python/convert_saved_model.py", line 61, in get_meta_graph_def
return loader.load(sess, tag_set, saved_model_dir)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 318, in new_func
return func(*args, **kwargs)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/saved_model/loader_impl.py", line 269, in load
return loader.load(sess, tags, import_scope, **saver_kwargs)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/saved_model/loader_impl.py", line 420, in load
**saver_kwargs)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/saved_model/loader_impl.py", line 347, in load_graph
meta_graph_def = self.get_meta_graph_def_from_tags(tags)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/saved_model/loader_impl.py", line 323, in get_meta_graph_def_from_tags
" could not be found in SavedModel. To inspect available tag-sets in"
RuntimeError: MetaGraphDef associated with tags set(['serve']) could not be found in SavedModel. To inspect available tag-sets in the SavedModel, please use the SavedModel CLI: `saved_model_cli`
What does it mean and what should I do?
Please refer to this issue. They seem to have the same issue as you.
This may be fixed in a more recent version of Tensorflow (perhaps the tag has switched from 'serve' to 'serving' in the meantime).
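You should use tf.saved_model.simple_save to export the model as a SavedModel. Renaming frozen_inference_graph.pb to saved_model.pb does not turn a frozen graph into a SavedModel, which is why no MetaGraphDef with the 'serve' tag can be found.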
While following the tutorial "How to Retrain Inception's Final Layer for New Categories", I am running python retrain.py on Windows. I have not made any changes to retrain.py. I get the following error after nearly 7300 bottleneck files are created:
Creating bottleneck at /tmp/bottleneck\daisy\9204730092_a7f2182347.jpg.txt
Creating bottleneck at /tmp/bottleneck\daisy\99306615_739eb94b9e_m.jpg.txt
7300 bottleneck files created.
Traceback (most recent call last):
File "retrain.py", line 930, in <module>
tf.app.run()
File "C:\Users\student\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "retrain.py", line 846, in main
bottleneck_tensor)
File "retrain.py", line 755, in add_final_training_ops
variable_summaries(layer_weights, layer_name + '/weights')
File "retrain.py", line 711, in variable_summaries
tf.scalar_summary('mean/' + name, mean)
AttributeError: module 'tensorflow' has no attribute 'scalar_summary'
You have to use tf.summary.scalar() instead of tf.scalar_summary(); the summary functions were renamed and moved under tf.summary.
You can find a list of all such updated summary functions here.
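For a script with many such calls, the renames can be applied mechanically. The sketch below is an assumption-laden helper, not an official migration tool: update_summary_calls is a hypothetical name, and the mapping lists the summary renames from the TensorFlow 1.0 release as I understand them, so it is worth checking against the linked list before relying on it.

```python
import re

# Old name -> new name for the summary functions renamed in TensorFlow 1.0
SUMMARY_RENAMES = {
    "tf.scalar_summary": "tf.summary.scalar",
    "tf.histogram_summary": "tf.summary.histogram",
    "tf.image_summary": "tf.summary.image",
    "tf.audio_summary": "tf.summary.audio",
    "tf.merge_summary": "tf.summary.merge",
    "tf.merge_all_summaries": "tf.summary.merge_all",
    "tf.train.SummaryWriter": "tf.summary.FileWriter",
}

# Match whole names only, so already-updated calls are left untouched
_PATTERN = re.compile(
    "|".join(r"\b" + re.escape(old) + r"\b"
             for old in sorted(SUMMARY_RENAMES, key=len, reverse=True)))


def update_summary_calls(source):
    """Return `source` with the old summary names replaced by the new ones."""
    return _PATTERN.sub(lambda match: SUMMARY_RENAMES[match.group(0)], source)
```

Applied to the failing line from retrain.py, update_summary_calls("tf.scalar_summary('mean/' + name, mean)") yields the tf.summary.scalar form of the same call.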