TensorFlow object detection with my own data, can you help me? - python

I cloned the TensorFlow object detection model from GitHub:
github link
I want to train this model on my own data (331 images of Samoyed dogs), following this blog tutorial: click here
My steps:
1. Created a PASCAL VOC format dataset.
2. Downloaded the pretrained model (ssd_mobilenet_v1_coco_11_06_2017.tar.gz).
3. Changed the config file (ssd_mobilenet_v1_pets.config).
4. Started the training process with this command:
python object_detection/train.py \
--logtostderr \
--pipeline_config_path=./samoyed_test_and_train/training/ssd_mobilenet_v1_pets.config \
--train_dir=./samoyed_test_and_train/data/train.record
But I get errors. My OS is macOS, and I also tried on AWS, where the same problem occurs. Can you figure out my mistake? The errors:
INFO:tensorflow:Summary name Learning Rate is illegal; using Learning_Rate instead.
WARNING:tensorflow:From /Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/object_detection/meta_architectures/ssd_meta_arch.py:579: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Please use tf.global_variables instead.
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
2017-08-01 10:34:42.992224: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 10:34:42.992254: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 10:35:00.359032: I tensorflow/core/common_runtime/simple_placer.cc:675] Ignoring device specification /device:GPU:0 for node 'prefetch_queue_Dequeue' because the input edge from 'prefetch_queue' is a reference connection and already has a device field set to /device:CPU:0
INFO:tensorflow:Restoring parameters from /Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/samoyed_test_and_train/training/model.ckpt
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.FailedPreconditionError'>, ./samoyed_test_and_train/data/train.record/graph.pbtxt.tmpf4587d1958df43cbaa9a0d7a04199f6f
2017-08-01 10:35:29.556458: E tensorflow/core/util/events_writer.cc:62] Could not open events file: ./samoyed_test_and_train/data/train.record/events.out.tfevents.1501554929.MacBook-Pro.local: Failed precondition: ./samoyed_test_and_train/data/train.record/events.out.tfevents.1501554929.MacBook-Pro.local
2017-08-01 10:35:29.556480: E tensorflow/core/util/events_writer.cc:95] Write failed because file could not be opened.
Traceback (most recent call last):
File "object_detection/train.py", line 198, in <module>
tf.app.run()
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 194, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/object_detection/trainer.py", line 290, in train
saver=saver)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 732, in train
master, start_standard_services=False, config=session_config) as sess:
File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 964, in managed_session
self.stop(close_summary_writer=close_summary_writer)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 792, in stop
stop_grace_period_secs=self._stop_grace_secs)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 953, in managed_session
start_standard_services=start_standard_services)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 709, in prepare_or_wait_for_session
self._write_graph()
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 612, in _write_graph
self._logdir, "graph.pbtxt")
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/framework/graph_io.py", line 67, in write_graph
file_io.atomic_write_string_to_file(path, str(graph_def))
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 418, in atomic_write_string_to_file
write_string_to_file(temp_pathname, contents)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 305, in write_string_to_file
f.write(file_content)
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 101, in write
self._prewrite_check()
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 87, in _prewrite_check
compat.as_bytes(self.__name), compat.as_bytes(self.__mode), status)
File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: ./samoyed_test_and_train/data/train.record/graph.pbtxt.tmpf4587d1958df43cbaa9a0d7a04199f6f

The train_dir flag is meant to point at a (typically empty) directory where your training logs and checkpoints will be written during training. For example, it could be something like --train_dir=/tmp/training_directory. It looks like you are pointing it at your dataset file (train.record), which the config file should already be pointing at. That is why the traceback shows TensorFlow failing to write graph.pbtxt and the events file "inside" train.record: the path is a file, not a directory.
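A quick pre-flight check can catch this kind of mix-up before training starts. The sketch below is only an illustration: check_train_dir is a hypothetical helper, not part of the Object Detection API, and it assumes Python 3 (for os.makedirs with exist_ok).

```python
import os


def check_train_dir(train_dir):
    """Return an absolute, existing output directory for logs and checkpoints.

    Hypothetical helper for illustration only. Raises ValueError when the
    path is an existing file, such as a TFRecord dataset.
    """
    if os.path.isfile(train_dir):
        raise ValueError(
            "train_dir points at a file (%s); pass a directory for logs "
            "and checkpoints instead" % train_dir
        )
    # Create the directory if it is missing so event files can be written.
    os.makedirs(train_dir, exist_ok=True)
    return os.path.abspath(train_dir)
```

Called with ./samoyed_test_and_train/data/train.record it would raise, while a path such as /tmp/training_directory would be created if needed and returned.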

Related

Problem with executing python machine learning code I found on GitHub

I need some clear instructions on how to execute some code.
Context:
This is a python machine learning peptide binding script, but you don't need to know biology to help me.
I am trying to reproduce this scientific paper to test its validity and see whether I can use it. I work in the biotech industry and am only somewhat familiar with C# and Python.
The paper is linked to a GitHub page, which has some instructions on how to execute the code. But every time I try to execute the code as instructed, it gives me an error. I already installed its requirements (the latest versions of PyTorch, NumPy, and scikit-learn) and switched between GPU and CPU, but neither worked. I don't know what to do at this point.
Paper Title:
"Prediction of Specific TCR-Peptide Binding From Large Dictionaries of TCR-Peptide Pairs" by Ido Springer, Hanan Besser, et al.
Paper's GitHub (found in the paper's abstract):
https://github.com/louzounlab/ERGO
These are the example commands I entered in the terminal. They come from a comment at the end of ERGO.py.
GPU ver:
python ERGO.py train lstm mcpas specific cuda:0 --model_file=model.pt --train_data_file=train_data --test_data_file=test_data
GPU code results:
Traceback (most recent call last):
File "D:\D Download\ERGO-master\ERGO.py", line 437, in <module>
main(args)
File "D:\D Download\ERGO-master\ERGO.py", line 141, in main
model, best_auc, best_roc = lstm.train_model(train_batches, test_batches, args.device, arg, params)
File "D:\D Download\ERGO-master\lstm_utils.py", line 163, in train_model
model.to(device)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 927, in to
return self._apply(convert)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
module._apply(fn)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 602, in _apply
param_applied = fn(param)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 925, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda\__init__.py", line 211, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
CPU code ver (only replaced specific cuda:0 with specific cpu):
python ERGO.py train lstm mcpas specific cpu --model_file=model.pt --train_data_file=train_data --test_data_file=test_data
CPU code results:
epoch: 1
C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\functional.py:1960: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Traceback (most recent call last):
File "D:\D Download\ERGO-master\ERGO.py", line 437, in <module>
main(args)
File "D:\D Download\ERGO-master\ERGO.py", line 141, in main
model, best_auc, best_roc = lstm.train_model(train_batches, test_batches, args.device, arg, params)
File "D:\D Download\ERGO-master\lstm_utils.py", line 173, in train_model
loss = train_epoch(batches, model, loss_function, optimizer, device)
File "D:\D Download\ERGO-master\lstm_utils.py", line 137, in train_epoch
loss = loss_function(probs, batch_signs)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\loss.py", line 613, in forward
return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\functional.py", line 3074, in binary_cross_entropy
raise ValueError(
ValueError: Using a target size (torch.Size([50])) that is different to the input size (torch.Size([50, 1])) is deprecated. Please ensure they have the same size.
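Looking at the ValueError, it seems that what you're trying to do is deprecated in PyTorch, so you have a more recent version of the package than the one the code was developed with. I suggest you try
pip install torch==1.4.0
on the command line. (The pip package is named torch, not pytorch. A release that old may not be installable on Python 3.10, so you might need an older Python environment as well.)
I'm not familiar with PyTorch, but managing tensor shapes in TensorFlow is the biggest pain for me. The actual problem appears to be that the input has one more dimension than the target (torch.Size([50, 1]) versus torch.Size([50])), so you would have to reshape one of them manually before the loss call in train_epoch, for example with something like probs.squeeze(1) or batch_signs.unsqueeze(1).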

AttributeError: module 'tensorflow._api.v1.data' has no attribute 'AUTOTUNE'

I am a university student working on my graduation project with the EfficientDet model.
I am fine-tuning EfficientDet on my custom dataset.
python = 3.7
tensorflow = 2.5
tensorflow-gpu = 1.15
cuda = 11.0
These are the versions of the packages I installed.
The error message is:
=====> Starting training, epoch: 1.
WARNING:tensorflow:From /home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0527 01:25:16.531095 140153861850944 deprecation.py:323] From /home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
Traceback (most recent call last):
File "main.py", line 407, in <module>
app.run(main)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "main.py", line 400, in main
run_train_and_eval(e)
File "main.py", line 384, in run_train_and_eval
max_steps=e * FLAGS.num_examples_per_epoch // FLAGS.train_batch_size)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 370, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1161, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1188, in _train_model_default
input_fn, ModeKeys.TRAIN))
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1025, in _get_features_and_labels_from_input_fn
self._call_input_fn(input_fn, mode))
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1116, in _call_input_fn
return input_fn(**kwargs)
File "/home/ais-public/ms_ys/Liberty/automl/efficientdet/dataloader.py", line 431, in __call__
_prefetch_dataset, num_parallel_calls=tf.data.AUTOTUNE)
File "/home/ais-public/anaconda3/envs/peace/lib/python3.7/site-packages/tensorflow_core/python/util/module_wrapper.py", line 193, in __getattr__
attr = getattr(self._tfmw_wrapped_module, name)
AttributeError: module 'tensorflow._api.v1.data' has no attribute 'AUTOTUNE'
I can't find the 'tensorflow/_api/v1/data' folder, and I don't know why this error occurs.
Please tell me how to solve this error. Thanks.
Try to find this file and line:
File "/home/ais-public/ms_ys/Liberty/automl/efficientdet/dataloader.py", line 431, in __call__
_prefetch_dataset, num_parallel_calls=tf.data.AUTOTUNE)
and change tf.data.AUTOTUNE to tf.data.experimental.AUTOTUNE.
The code is running against your older tensorflow-gpu 1.15 install (note tensorflow._api.v1 in the traceback), which only has the experimental name. In newer releases AUTOTUNE is no longer experimental, as far as I know.
Alternatively, use only tensorflow 2.5 to avoid the version conflict.
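If the same code has to run on both older and newer installs, the lookup can be made tolerant of either location. This is a minimal sketch, not something taken from the EfficientDet repository: resolve_autotune is a hypothetical helper, and the SimpleNamespace objects are stand-ins for tf.data so that the example runs without TensorFlow installed (the value -1 is only a placeholder).

```python
from types import SimpleNamespace


def resolve_autotune(data_module):
    """Return AUTOTUNE from a tf.data-like module, trying both locations."""
    try:
        # Newer releases expose the constant directly on tf.data.
        return data_module.AUTOTUNE
    except AttributeError:
        # Older releases only have it under tf.data.experimental.
        return data_module.experimental.AUTOTUNE


# Stand-ins mimicking the two module layouts (placeholders, not real TensorFlow).
old_style = SimpleNamespace(experimental=SimpleNamespace(AUTOTUNE=-1))
new_style = SimpleNamespace(AUTOTUNE=-1, experimental=SimpleNamespace(AUTOTUNE=-1))

print(resolve_autotune(old_style), resolve_autotune(new_style))
```

In dataloader.py this would presumably be used as AUTOTUNE = resolve_autotune(tf.data), followed by num_parallel_calls=AUTOTUNE.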

Running export_inference_graph.py throws value error

I am new to Stack Overflow. I am trying to export a model using export_inference_graph.py.
I trained my model locally using faster_rcnn_inception_v2. I am following this tutorial.
When I type the following in the command prompt:
python export_inference_graph.py --input_type image_tensor --pipeline_config_path CAPTCHA_training/faster_rcnn_inception_v2_coco.config --trained_checkpoint_prefix "CAPTCHA_training_dir/model.ckpt-51272" --output_directory CAPTCHA_inference_graph
with all paths correct, I get the following error:
File "export_inference_graph.py", line 206, in <module>
tf.app.run()
File "C:\Users\Jatin\anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "C:\Users\Jatin\anaconda3\lib\site-packages\absl\app.py", line 303, in run
_run_main(main, args)
File "C:\Users\Jatin\anaconda3\lib\site-packages\absl\app.py", line 251, in _run_main
sys.exit(main(argv))
File "export_inference_graph.py", line 194, in main
exporter.export_inference_graph(
File "C:\Users\Jatin\anaconda3\lib\site-packages\object_detection\exporter.py", line 604, in export_inference_graph
detection_model = model_builder.build(pipeline_config.model,
File "C:\Users\Jatin\anaconda3\lib\site-packages\object_detection\builders\model_builder.py", line 1116, in build
return build_func(getattr(model_config, meta_architecture), is_training,
File "C:\Users\Jatin\anaconda3\lib\site-packages\object_detection\builders\model_builder.py", line 583, in _build_faster_rcnn_model
_check_feature_extractor_exists(frcnn_config.feature_extractor.type)
File "C:\Users\Jatin\anaconda3\lib\site-packages\object_detection\builders\model_builder.py", line 249, in _check_feature_extractor_exists
raise ValueError('{} is not supported. See `model_builder.py` for features '
ValueError: faster_rcnn_inception_v2 is not supported. See `model_builder.py` for features extractors compatible with different versions of Tensorflow
I am using Python 3.8.5 and tensorflow version 2.4.1
Thanks in advance
According to the error, this looks like a TensorFlow version problem. In the source, faster_rcnn_inception_v2 is only registered under if tf_version.is_tf1():, so it is not available when running TensorFlow 2. Try using TF 1.
https://github.com/tensorflow/models/blob/5a89897396aa8ecc7b3ef8919f987e96fc8d74db/research/object_detection/builders/model_builder.py#L70
https://github.com/tensorflow/models/blob/5a89897396aa8ecc7b3ef8919f987e96fc8d74db/research/object_detection/models/faster_rcnn_inception_resnet_v2_feature_extractor.py#L33
faster_rcnn_inception_v2_coco is listed here, in the TensorFlow 1 Detection Model Zoo:
https://github.com/tensorflow/models/blob/5a89897396aa8ecc7b3ef8919f987e96fc8d74db/research/object_detection/g3doc/tf1_detection_zoo.md#coco-trained-models

Post-training quantization with tflite cause runtime error

I am trying to quantize my model (specifically a faster_rcnn_inception_v2 pretrained on COCO, downloaded from the model zoo), in the hope of speeding up inference.
I use the following code from here:
import tensorflow as tf
converter = tf.lite.TocoConverter.from_saved_model(saved_model_dir)
converter.post_training_quantize = True
tflite_quantized_model = converter.convert()
open("quantized_model.tflite", "wb").write(tflite_quantized_model)
The model's directory didn't have a saved_model.pb file, so I renamed frozen_inference_graph.pb to saved_model.pb.
Running the code above produces the following runtime error:
Traceback (most recent call last):
File "/home/juggernaut/pycharm-community-2018.2.4/helpers/pydev/pydevd.py", line 1664, in <module>
main()
File "/home/juggernaut/pycharm-community-2018.2.4/helpers/pydev/pydevd.py", line 1658, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/juggernaut/pycharm-community-2018.2.4/helpers/pydev/pydevd.py", line 1068, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/hdd/motorola/motorola_heads/tensorflow_face_detection/quantize.py", line 5, in <module>
converter = tf.lite.TocoConverter.from_saved_model(saved_model_dir)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 318, in new_func
return func(*args, **kwargs)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/lite/python/lite.py", line 587, in from_saved_model
tag_set, signature_key)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/lite/python/lite.py", line 376, in from_saved_model
output_arrays, tag_set, signature_key)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/lite/python/convert_saved_model.py", line 254, in freeze_saved_model
meta_graph = get_meta_graph_def(saved_model_dir, tag_set)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/lite/python/convert_saved_model.py", line 61, in get_meta_graph_def
return loader.load(sess, tag_set, saved_model_dir)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 318, in new_func
return func(*args, **kwargs)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/saved_model/loader_impl.py", line 269, in load
return loader.load(sess, tags, import_scope, **saver_kwargs)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/saved_model/loader_impl.py", line 420, in load
**saver_kwargs)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/saved_model/loader_impl.py", line 347, in load_graph
meta_graph_def = self.get_meta_graph_def_from_tags(tags)
File "/hdd/motorola/venv_py27_tf1.10/local/lib/python2.7/site-packages/tensorflow/python/saved_model/loader_impl.py", line 323, in get_meta_graph_def_from_tags
" could not be found in SavedModel. To inspect available tag-sets in"
RuntimeError: MetaGraphDef associated with tags set(['serve']) could not be found in SavedModel. To inspect available tag-sets in the SavedModel, please use the SavedModel CLI: `saved_model_cli`
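What does it mean, and what should I do?
Please refer to this issue. They seem to have the same problem as you.
This may be fixed in a more recent version of TensorFlow (perhaps the tag has switched from 'serve' to 'serving' in the meantime).
A frozen_inference_graph.pb renamed to saved_model.pb is not a SavedModel, so the loader finds no MetaGraphDef tagged 'serve'. You should export a real SavedModel instead, for example with tf.saved_model.simple_save.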

Tensorflow for Poets (Retraining Inception) on Windows

While following the tutorial "How to Retrain Inception's Final Layer for New Categories", I ran python retrain.py on Windows without making any changes to retrain.py. I get the following error after nearly 7300 bottleneck files are created:
Creating bottleneck at /tmp/bottleneck\daisy\9204730092_a7f2182347.jpg.txt
Creating bottleneck at /tmp/bottleneck\daisy\99306615_739eb94b9e_m.jpg.txt
7300 bottleneck files created.
Traceback (most recent call last):
File "retrain.py", line 930, in <module>
tf.app.run()
File "C:\Users\student\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "retrain.py", line 846, in main
bottleneck_tensor)
File "retrain.py", line 755, in add_final_training_ops
variable_summaries(layer_weights, layer_name + '/weights')
File "retrain.py", line 711, in variable_summaries
tf.scalar_summary('mean/' + name, mean)
AttributeError: module 'tensorflow' has no attribute 'scalar_summary'
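You have to use tf.summary.scalar() instead of tf.scalar_summary(). The old top-level summary functions were renamed and moved under tf.summary in TensorFlow 1.0.
You can find a list of all such renamed summary functions here.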
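If retrain.py (or any other old script) uses several of the old names, a small rewrite pass can update them in one go. The sketch below is an assumption-laden illustration rather than an official migration tool: update_summary_calls is a hypothetical helper, and the table lists the TensorFlow 1.0 summary renames as I understand them. It only swaps names, so keyword arguments that changed along with a rename would still need a manual check.

```python
import re

# Old name -> new name for the summary functions renamed in TensorFlow 1.0.
SUMMARY_RENAMES = {
    "tf.scalar_summary": "tf.summary.scalar",
    "tf.histogram_summary": "tf.summary.histogram",
    "tf.image_summary": "tf.summary.image",
    "tf.audio_summary": "tf.summary.audio",
    "tf.merge_summary": "tf.summary.merge",
    "tf.merge_all_summaries": "tf.summary.merge_all",
    "tf.train.SummaryWriter": "tf.summary.FileWriter",
}

# Match an old name only when it is not part of a longer identifier.
_PATTERN = re.compile(
    r"(?<![\w.])(" + "|".join(re.escape(name) for name in SUMMARY_RENAMES) + r")\b"
)


def update_summary_calls(source):
    """Return source text with the old summary names replaced by the new ones."""
    return _PATTERN.sub(lambda match: SUMMARY_RENAMES[match.group(1)], source)


print(update_summary_calls("tf.scalar_summary('mean/' + name, mean)"))
```

Running it over the text of retrain.py and writing the result back would update line 711 along with any similar calls.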
