I would like to use YOLOv7 for segmentation on my custom dataset with custom classes.
I am already able to run the 'normal' YOLO version with my data, using the yolov7.pt weights.
But when I use the yolov7-mask.pt weights, I get the following error:
Traceback (most recent call last):
  File "train.py", line 616, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 71, in train
    run_id = torch.load(weights, map_location=device).get('wandb_id') if weights.endswith('.pt') and os.path.isfile(weights) else None
  File "/usr/local/lib/python3.8/dist-packages/torch/serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.8/dist-packages/torch/serialization.py", line 1131, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.8/dist-packages/torch/serialization.py", line 1124, in find_class
    return super().find_class(mod_name, name)
AttributeError: Can't get attribute 'Merge' on <module 'models.common' from '/content/yolov7/models/common.py'>
I also saw that this error is not specific to me, but I have not found a solution.
Also, this tutorial does not use pre-trained weights, and it does not explain why.
When I do not use pretrained weights the code runs, but I have not yet checked how well it performs (I assume training will take much longer).
Any advice will be appreciated.
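One way to see which custom classes the checkpoint references, without unpickling it (and therefore without needing those classes to exist locally), is to scan the pickle opcodes inside the .pt archive. This is a minimal sketch, assuming the checkpoint uses the zip-based format that recent PyTorch versions save and the default pickle protocol:

import pickletools
import zipfile

with zipfile.ZipFile("yolov7-mask.pt") as zf:
    # Zip-format torch checkpoints store the pickle under <archive>/data.pkl
    pkl_name = next(n for n in zf.namelist() if n.endswith("data.pkl"))
    data = zf.read(pkl_name)

# GLOBAL opcodes name the classes the unpickler will try to import,
# e.g. "models.common Merge"; every class printed here must be
# importable for torch.load() to succeed.
for opcode, arg, pos in pickletools.genops(data):
    if opcode.name == "GLOBAL":
        print(arg)

If models.common.Merge is listed but your local models/common.py does not define Merge, the weights were saved from a different branch or fork of the repository than the one you are running, which would explain the AttributeError.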
I need some clear instructions on how to execute some code.
Context:
This is a Python machine learning peptide-binding script, but you don't need to know biology to help me.
I am trying to recreate this scientific paper to test its validity and see whether I can use it. I work in the biotech industry and am only somewhat familiar with C# and Python.
The paper is linked to a GitHub page, and the GitHub page has some instructions on how to execute the code. But every time I execute the code as instructed, it gives me an error. I already installed its requirements: the latest PyTorch, NumPy, and scikit-learn. I also switched between GPU and CPU, but neither worked. I don't know what to do at this point.
Paper Title:
"Prediction of Specific TCR-Peptide Binding From Large Dictionaries of TCR-Peptide Pairs" by Ido Springer, Hanan Besser. etc.
Paper's Github8 (found in the paper's abstract):
https://github.com/louzounlab/ERGO
These are the example commands I entered in the terminal. They were found in a comment at the end of ERGO.py.
GPU version:
python ERGO.py train lstm mcpas specific cuda:0 --model_file=model.pt --train_data_file=train_data --test_data_file=test_data
GPU code results:
Traceback (most recent call last):
  File "D:\D Download\ERGO-master\ERGO.py", line 437, in <module>
    main(args)
  File "D:\D Download\ERGO-master\ERGO.py", line 141, in main
    model, best_auc, best_roc = lstm.train_model(train_batches, test_batches, args.device, arg, params)
  File "D:\D Download\ERGO-master\lstm_utils.py", line 163, in train_model
    model.to(device)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 927, in to
    return self._apply(convert)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
    module._apply(fn)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 602, in _apply
    param_applied = fn(param)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 925, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda\__init__.py", line 211, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
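The assertion itself means the installed torch wheel was built without CUDA support. A quick sanity check, using only standard PyTorch calls:

import torch

print(torch.__version__)          # a "+cpu" suffix indicates a CPU-only wheel
print(torch.cuda.is_available())  # False here is what triggers the AssertionError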
CPU version (the same command, with specific cuda:0 replaced by specific cpu):
python ERGO.py train lstm mcpas specific cpu --model_file=model.pt --train_data_file=train_data --test_data_file=test_data
CPU code results:
epoch: 1
C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\functional.py:1960: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Traceback (most recent call last):
  File "D:\D Download\ERGO-master\ERGO.py", line 437, in <module>
    main(args)
  File "D:\D Download\ERGO-master\ERGO.py", line 141, in main
    model, best_auc, best_roc = lstm.train_model(train_batches, test_batches, args.device, arg, params)
  File "D:\D Download\ERGO-master\lstm_utils.py", line 173, in train_model
    loss = train_epoch(batches, model, loss_function, optimizer, device)
  File "D:\D Download\ERGO-master\lstm_utils.py", line 137, in train_epoch
    loss = loss_function(probs, batch_signs)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\loss.py", line 613, in forward
    return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\functional.py", line 3074, in binary_cross_entropy
    raise ValueError(
ValueError: Using a target size (torch.Size([50])) that is different to the input size (torch.Size([50, 1])) is deprecated. Please ensure they have the same size.
Looking at the ValueError, it seems that what you're trying to do is deprecated in newer PyTorch, so you have a more recent version of the package than the one the code was developed with. I suggest you try
pip install torch==1.4.0
in the command line.
I'm not familiar with PyTorch, but managing tensor shapes in TensorFlow is the biggest pain in the a** for me. What actually looks like the problem is that the input has an extra dimension that it shouldn't, so you would have to manually reshape it.
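To illustrate the reshape with made-up tensors (the names probs and batch_signs mirror the traceback, but this is a sketch, not the actual ERGO code): newer PyTorch versions require BCELoss input and target to have exactly the same shape, so a (50, 1) output against a (50,) target can be fixed by squeezing the output or unsqueezing the target.

import torch
import torch.nn as nn

loss_function = nn.BCELoss()
probs = torch.sigmoid(torch.randn(50, 1))         # model output, shape (50, 1)
batch_signs = torch.randint(0, 2, (50,)).float()  # labels, shape (50,)

# Squeeze the trailing dimension off the output so both are (50,)...
loss = loss_function(probs.squeeze(-1), batch_signs)
# ...or instead unsqueeze the target: loss_function(probs, batch_signs.unsqueeze(-1))
print(loss.item())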
My friend saved a decision tree model using joblib.dump(). But when I tried to predict on some data using the saved model, I got the following error. Can anyone advise why this is happening? My friend and I have the same versions of all the required libraries.
Traceback (most recent call last):
  File "testing.py", line 12, in <module>
    classifier = joblib.load('saved_model.pkl')
  File "C:\Users\naiks\AppData\Local\Programs\Python\Python38-32\lib\site-packages\joblib\numpy_pickle.py", line 585, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "C:\Users\naiks\AppData\Local\Programs\Python\Python38-32\lib\site-packages\joblib\numpy_pickle.py", line 504, in _unpickle
    obj = unpickler.load()
  File "C:\Users\naiks\AppData\Local\Programs\Python\Python38-32\lib\pickle.py", line 1210, in load
    dispatch[key[0]](self)
  File "C:\Users\naiks\AppData\Local\Programs\Python\Python38-32\lib\pickle.py", line 1587, in load_reduce
    stack[-1] = func(*args)
  File "sklearn\tree\_tree.pyx", line 607, in sklearn.tree._tree.Tree.__cinit__
ValueError: Buffer dtype mismatch, expected 'SIZE_t' but got 'long long'
I was using a 32-bit version of Python, whereas the machine on which the model was trained was using 64-bit Python. Hence the buffer size was different, resulting in the error.
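For anyone comparing machines, a quick way to check the interpreter's bitness, since the pointer size is what determines the SIZE_t width that scikit-learn's tree arrays are pickled with:

import struct
import sys

print(struct.calcsize("P") * 8, "bit")  # pointer width: 32 or 64
print(sys.maxsize > 2**32)              # True only on a 64-bit build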
So, I trained an object detection model and now I want to export an inference graph from the .ckpt files.
When I try to export:
python export_inference_graph.py --input_type image_tensor --pipeline_config_path training/faster_rcnn_inception_v2_pets.config --trained_checkpoint_prefix training3/model.ckpt-47816 --output_directory inference_graph
I get this:
Traceback (most recent call last):
  File "export_inference_graph.py", line 147, in <module>
    tf.app.run()
  File "/home/ubuntu/anaconda3/envs/tensorflow1/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "export_inference_graph.py", line 143, in main
    FLAGS.output_directory, input_shape)
  File "/home/ubuntu/tensorflow1/models/research/object_detection/exporter.py", line 454, in export_inference_graph
    is_training=False)
  File "/home/ubuntu/anaconda3/envs/tensorflow1/lib/python3.6/site-packages/object_detection-0.1-py3.6.egg/object_detection/builders/model_builder.py", line 101, in build
    add_summaries)
  File "/home/ubuntu/anaconda3/envs/tensorflow1/lib/python3.6/site-packages/object_detection-0.1-py3.6.egg/object_detection/builders/model_builder.py", line 274, in _build_faster_rcnn_model
    image_resizer_fn = image_resizer_builder.build(frcnn_config.image_resizer)
  File "/home/ubuntu/anaconda3/envs/tensorflow1/lib/python3.6/site-packages/object_detection-0.1-py3.6.egg/object_detection/builders/image_resizer_builder.py", line 83, in build
    if keep_aspect_ratio_config.per_channel_pad_value:
AttributeError: 'KeepAspectRatioResizer' object has no attribute 'per_channel_pad_value'
It seems that everybody else has this working fine and has no problems with it.
Could anyone please tell me what is going on here?
I know this is a few months later, but I just encountered this issue too!
It seems the image_resizer.proto is missing the per_channel_pad_value attribute.
Update the proto file to include the attribute, from here:
https://github.com/tensorflow/models/blob/master/research/object_detection/protos/image_resizer.proto
then recompile it and try again.
Should work this time.
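For reference, the recompile step is the standard protoc invocation, run from the models/research directory (assuming protoc is installed and on your PATH):

protoc object_detection/protos/*.proto --python_out=.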
I am trying to run the code from here, which is an implementation of Generative Adversarial Networks using Keras in Python. I followed the instructions and installed all the requirements. Then I tried to run the code for DCGAN. However, it seems there is a compatibility issue between the libraries. I receive the following message when I run the code:
AttributeError: 'module' object has no attribute 'leaky_relu'
File "main.py", line 176, in <module>
dcgan = DCGAN()
File "main.py", line 25, in __init__
self.discriminator = self.build_discriminator()
File "main.py", line 84, in build_discriminator
model.add(LeakyReLU(alpha=0.2))
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/models.py", line 492, in add
output_tensor = layer(self.outputs[0])
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 617, in __call__
output = self.call(inputs, **kwargs)
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/layers/advanced_activations.py", line 46, in call
return K.relu(inputs, alpha=self.alpha)
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2918, in relu
x = tf.nn.leaky_relu(x, alpha)
I am using Keras version 2.1.3, TensorFlow version 1.2.1, and Theano version 1.0.1+40.g757b4d5.
Any idea why I am getting this error?
EDIT:
The error is located at line 84 in the build_discriminator function: `model.add(LeakyReLU(alpha=0.2))`
According to this answer, leaky_relu was added to TensorFlow in version 1.4, so you might want to check that your TensorFlow installation is at least version 1.4.
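A quick way to verify both the version and the op itself:

import tensorflow as tf

print(tf.__version__)                # needs to be at least 1.4
print(hasattr(tf.nn, "leaky_relu"))  # False on 1.2.x, True from 1.4 onward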
I'm trying to go through the tutorial on convolutional neural nets using CIFAR-10. The CNN itself builds fine (cifar10.py), but when I try to run cifar10_train.py I get the following error:
Traceback (most recent call last):
  File "cifar10_train.py", line 115, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "cifar10_train.py", line 111, in main
    train()
  File "cifar10_train.py", line 58, in train
    images, labels = cifar10.distorted_inputs()
  File "/home/brennus/workspace/python/cifar/cifar10.py", line 141, in distorted_inputs
    batch_size=FLAGS.batch_size)
  File "/home/brennus/workspace/python/cifar/cifar10_input.py", line 177, in distorted_inputs
    float_image = tf.image.per_image_standardization(distorted_image)
AttributeError: 'module' object has no attribute 'per_image_standardization'
According to https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/image.md, there is indeed a per_image_standardization attribute, but it looks like my TensorFlow doesn't have it. I'm not sure which version I have or where to find that out, but I built it from source from the repository, so I imagine it's the current one.
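For reference, a quick check of the installed version and of whether the op exists (older TensorFlow releases shipped this op under a different name, per_image_whitening, so its absence typically just indicates an outdated build):

import tensorflow as tf

print(tf.__version__)
print(hasattr(tf.image, "per_image_standardization"))  # False on older builds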
I can't find anyone else who is having this problem, so I'm stymied. Maybe I have to write my own?
I reinstalled tensorflow and solved the problem. Thanks, all!