I'm having an issue with spaCy v2.0.12 that I've traced into thinc. pip list shows me:
msgpack (0.5.6)
msgpack-numpy (0.4.3.1)
murmurhash (0.28.0)
regex (2017.4.5)
scikit-learn (0.19.2)
scipy (1.1.0)
spacy (2.0.12)
thinc (6.10.3)
I have code that works fine on my Mac, but fails in production. The stack trace goes into spacy and then into thinc -- and then django literally crashes. This all worked when I used an earlier version of spacy -- this has only come about since I'm attempting to upgrade to v2.0.12.
My requirements.txt file has these lines:
regex==2017.4.5
spacy==2.0.12
scikit-learn==0.19.2
scipy==1.1.0
https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz
The last line pulls the en_core_web_sm down during deployment. I'm doing this so I can get those models loaded on Heroku during deployment.
I then load the parser like this:
import en_core_web_sm
en_core_web_sm.load()
Then the stack trace shows the problem here in thinc:
File "spacy/language.py", line 352, in __call__
doc = proc(doc)
File "pipeline.pyx", line 426, in spacy.pipeline.Tagger.__call__
File "pipeline.pyx", line 438, in spacy.pipeline.Tagger.predict
File "thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "thinc/api.py", line 55, in predict
X = layer(X)
File "thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "thinc/api.py", line 293, in predict
X = layer(layer.ops.flatten(seqs_in, pad=pad))
File "thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "thinc/api.py", line 55, in predict
X = layer(X)
File "thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "thinc/neural/_classes/model.py", line 125, in predict
y, _ = self.begin_update(X)
File "thinc/api.py", line 374, in uniqued_fwd
Y_uniq, bp_Y_uniq = layer.begin_update(X_uniq, drop=drop)
File "thinc/api.py", line 61, in begin_update
X, inc_layer_grad = layer.begin_update(X, drop=drop)
File "thinc/neural/_classes/layernorm.py", line 51, in begin_update
X, backprop_child = self.child.begin_update(X, drop=0.)
File "thinc/neural/_classes/maxout.py", line 69, in begin_update
output__boc = self.ops.batch_dot(X__bi, W)
File "gunicorn/workers/base.py", line 192, in handle_abort
sys.exit(1)
Again -- this all works on my laptop.
Is there something wrong with how I'm loading? Or is my version of thinc out of date? If so, what should my requirements.txt file look like?
I resolved this issue, but am leaving the answer in case someone else needs it.
The problem was that my thread was taking too long to respond because of how and when I was building and training my sklearn models. As a result, Heroku aborted the thread, which is why the stack trace ends in handle_abort.
The fix was to change how and when I was loading the ML models so that this particular operation didn't time out.
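A minimal sketch of one way to keep slow model setup out of the request path, assuming the models can be built once per process and then reused. `build_models`, `get_models` and `handle_request` are hypothetical names for illustration, not part of the original code.

```python
import functools
import time


def build_models():
    """Hypothetical stand-in for the slow step (training sklearn models,
    loading the spaCy pipeline). Replace with the real setup code."""
    time.sleep(0.01)  # simulate expensive work
    return {"tagger": "loaded"}


@functools.lru_cache(maxsize=None)
def get_models():
    """Build the models on first use, then return the cached result."""
    return build_models()


def handle_request(text):
    """Request handler that reuses the cached models instead of rebuilding."""
    models = get_models()
    return models["tagger"], text
```

With this shape, only the first call pays the setup cost. Calling `get_models()` once at worker startup, rather than inside the first request, would move that cost out of request handling altogether.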
I've been doing the pytorch tutorial (https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html) and have been getting this error that I don't know how to fix. The full error is below:
Traceback (most recent call last):
File "main.py", line 146, in <module>
main()
File "main.py", line 138, in main
train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=10)
File "/engine.py", line 26, in train_one_epoch
for images, targets in metric_logger.log_every(data_loader, print_freq, header):
File "/utils.py", line 180, in log_every
for obj in iterable:
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
data.reraise()
File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 425, in reraise
raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataset.py", line 311, in __getitem__
return self.dataset[self.indices[idx]]
File "main.py", line 64, in __getitem__
img, target = self.transforms(img, target)
File "/transforms.py", line 26, in __call__
image, target = t(image, target)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/transforms.py", line 50, in forward
image = F.to_tensor(image)
File "/usr/local/lib/python3.6/dist-packages/torchvision/transforms/functional.py", line 129, in to_tensor
np.array(pic, mode_to_nptype.get(pic.mode, np.uint8), copy=True)
TypeError: __array__() takes 1 positional argument but 2 were given
I believe it means that somewhere an array is being called with 2 arguments, which isn't allowed, but I don't really know whereabouts that is happening. Perhaps in one of their pre-written libraries?
I can share the code in full if desired, but I thought it's a bit unwieldy. Does anyone know what might be causing this error?
PyTorch has already considered this issue. It does not seem to be a PyTorch problem.
As xwang233 mentioned in the issue, we can fix it by downgrading pillow:
pip install pillow==8.2.0
This issue could be fixed as well by upgrading Pillow from version 8.3.0 to 8.3.1. I had the same issue with
torch==1.9.0+cu111
torchvision==0.10.0+cu111
Pillow==8.3.0
After Pillow was upgraded to 8.3.1 (with no change to torch and torchvision) as below, the issue is gone:
pip install --upgrade pillow
Thanks to DRTorresRuiz for providing the clue about Pillow.
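As a quick way to check whether an environment has the release mentioned above, here is a small sketch. It only encodes what this thread reports (8.3.0 failing, 8.2.0 and 8.3.1 working); the function names are made up for illustration.

```python
def is_known_bad_pillow(version):
    """Return True for the Pillow release reported in this thread as failing."""
    return version.strip() == "8.3.0"


def installed_pillow_version():
    """Return the installed Pillow version string, or None if Pillow is absent."""
    try:
        import PIL
    except ImportError:
        return None
    return getattr(PIL, "__version__", None)


if __name__ == "__main__":
    found = installed_pillow_version()
    if found is not None and is_known_bad_pillow(found):
        print("Pillow %s is installed; consider 8.2.0 or 8.3.1 instead" % found)
```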
I had the same error when using:
torch==1.9.0
torchvision==0.10.0
In my requirements.txt file I downgraded the torch library, which forced me to downgrade torchvision, and that fixed the error for me. The library versions I ended up using that did not raise the error were:
torch==1.8.1
torchvision==0.9.1
Change this line in your code:
np.array(pic ,np.float32)
to:
np.array(pic).astype('float32')
Scanning for .csv files in folder `Dataset`...
===> File to be labeled: SampleDataset.csv
-------> Generating reports based on the trained models...
---------------> Generating report based on classifier `Davidson` trained on dataset `Davidson`.
Traceback (most recent call last):
File "Classifiers/Davidson/DavidsonClassifier.py", line 334, in <module>
test(args)
File "Classifiers/Davidson/DavidsonClassifier.py", line 316, in test
y_preds = loaded_model.predict(X)
File "/home/root1/Desktop/btp/OnlineHarms-Metatool/newenv/lib/python3.8/site-packages/sklearn/utils/metaestimators.py", line 120, in <lambda>
out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
File "/home/root1/Desktop/btp/OnlineHarms-Metatool/newenv/lib/python3.8/site-packages/sklearn/pipeline.py", line 418, in predict
Xt = transform.transform(Xt)
File "/home/root1/Desktop/btp/OnlineHarms-Metatool/newenv/lib/python3.8/site-packages/sklearn/feature_selection/_base.py", line 88, in transform
mask = self.get_support()
File "/home/root1/Desktop/btp/OnlineHarms-Metatool/newenv/lib/python3.8/site-packages/sklearn/feature_selection/_base.py", line 52, in get_support
mask = self._get_support_mask()
File "/home/root1/Desktop/btp/OnlineHarms-Metatool/newenv/lib/python3.8/site-packages/sklearn/feature_selection/_from_model.py", line 189, in _get_support_mask
estimator=estimator, getter=self.importance_getter,
AttributeError: 'SelectFromModel' object has no attribute 'importance_getter'
---------------> Generating report based on classifier `Davidson` trained on dataset `Founta`.
Is there some package that is missing or is it a version error?
Here is the GitHub repo for the ML model: https://github.com/yashnatani28/OnlineHarms-Metatool
I had this error as well, and in your case too it's most likely due to a version mismatch. Try reinstalling scikit-learn with pip install scikit-learn --upgrade (or pin the version you want, as long as it is 0.24 or later; importance_getter is an attribute introduced in version 0.24).
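One way a mismatch like this can arise is a model pickled under one library version and loaded under another. The sketch below assumes the trained model is stored with pickle, which may not match the repo's actual setup. It records the version next to the model so a mismatch is reported at load time rather than later in predict. `dump_with_version` and `load_checked` are hypothetical helpers, not scikit-learn functions.

```python
import pickle


def dump_with_version(model, path, library_version):
    """Pickle the model together with the library version that produced it."""
    with open(path, "wb") as fh:
        pickle.dump({"library_version": library_version, "model": model}, fh)


def load_checked(path, installed_version):
    """Load the model, refusing to continue if the versions differ."""
    with open(path, "rb") as fh:
        payload = pickle.load(fh)
    saved = payload["library_version"]
    if saved != installed_version:
        raise RuntimeError(
            "model was saved with version %s but %s is installed"
            % (saved, installed_version)
        )
    return payload["model"]
```

In real use the version string would come from `sklearn.__version__` both when saving and when loading.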
I am running a PyTorch solution for wireframe detection. I am receiving a "RuntimeError: error in LoadLibraryA" when the solution executes "forward return torch.cat(outputs, 1)"
I am not able to provide a minimal reproducible example. Therefore the question: Is it possible to produce this type of error in a Microsoft library through Python programming errors, or is this most likely a version problem (Python, PyTorch, CUDA, ...) or a bug in my installation?
I am using Windows 10, Python 3.8.1 and PyTorch 1.4.0.
File "main.py", line 144, in <module>
main()
File "main.py", line 137, in main
trainer.train(train_loader, val_loader=None)
File "D:\Dev\Python\Projects\wireframe\wireframe\junc\trainer\balance_junction_trainer.py", line 75, in train
self.step(epoch, train_loader)
File "D:\Dev\Python\Projects\wireframe\wireframe\junc\trainer\balance_junction_trainer.py", line 176, in step
) = self.model(input_var, junc_conf, junc_res, bin_conf, bin_res)
File "D:\Dev\Python\Environment\Environments\pytorch\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "D:\Dev\Python\Projects\wireframe\wireframe\junc\model\inception.py", line 41, in forward
base_feat = self.base_net(im_data)
File "D:\Dev\Python\Environment\Environments\pytorch\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "D:\Dev\Python\Projects\wireframe\wireframe\junc\model\networks\inception_v2.py", line 63, in forward
x = self.Mixed_3b(x)
File "D:\Dev\Python\Environment\Environments\pytorch\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "D:\Dev\Python\Projects\wireframe\wireframe\junc\model\networks\inception_v2.py", line 97, in forward
return torch.cat(outputs, 1)
RuntimeError: error in LoadLibraryA
Try this workaround: run the following code after import torch (this should be fixed in 1.5):
import ctypes
ctypes.cdll.LoadLibrary('caffe2_nvrtc.dll')
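A slightly more defensive variant of the same workaround, as a sketch. It wraps the load so that a missing library (for example on a non-Windows machine or a CPU-only install) does not raise. `preload_library` is a hypothetical helper name.

```python
import ctypes


def preload_library(name):
    """Try to load a shared library by name; return True on success."""
    try:
        ctypes.cdll.LoadLibrary(name)
    except OSError:
        # Raised when the library cannot be found or loaded.
        return False
    return True
```

It would be called as `preload_library('caffe2_nvrtc.dll')` right after `import torch`.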
It was possible to avoid this error by downgrading to Python 3.7.6.
Remark: Unfortunately, the first step of the overall processing (run time 3 days on my GPU) creates intermediate results in pickle protocol 5, which is new in Python 3.8. Therefore, I either have to re-run the first step for 3 days or find another solution, because the files with the intermediate results cannot be read with Python 3.7.6.
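One possible way around re-running the first step, sketched under several assumptions: the intermediate files are plain pickle files, a Python 3.8 interpreter is still available to read them, and the objects inside can also be unpickled on 3.7. The idea is to re-save each file with protocol 4, which Python 3.4 and later can read. `repickle` is a hypothetical helper.

```python
import pickle


def repickle(src_path, dst_path, protocol=4):
    """Load a pickle file and write it back out with an older protocol.

    Run this under the newer interpreter (the one that can read src_path).
    """
    with open(src_path, "rb") as src:
        obj = pickle.load(src)
    with open(dst_path, "wb") as dst:
        pickle.dump(obj, dst, protocol=protocol)
```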
I'm trying to convert the KITTI dataset into a TensorFlow .record file. After I typed the command:
python object_detection/dataset_tools/create_kitti_tf_record.py
--lable_map_path=object_detection/data/kitti_label_map.pbtxt --data_dir=/Users/zhenglyu/Graduate/research/DataSet/kitti/data_object_image_2/testing/image_2
--output_path=/Users/zhenglyu/Graduate/research/DataSet/kitti2tf/train.record
validation_set_size=1000
I got this error:
Traceback (most recent call last):
File "object_detection/dataset_tools/create_kitti_tf_record.py", line 310, in <module>
tf.app.run()
File "/Users/zhenglyu/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "object_detection/dataset_tools/create_kitti_tf_record.py", line 307, in main
validation_set_size=FLAGS.validation_set_size)
File "object_detection/dataset_tools/create_kitti_tf_record.py", line 94, in convert_kitti_to_tfrecords
label_map_dict = label_map_util.get_label_map_dict(label_map_path)
File "/Users/zhenglyu/Graduate/research/TensorFlow/model/research/object_detection/utils/label_map_util.py", line 152, in get_label_map_dict
label_map = load_labelmap(label_map_path)
File "/Users/zhenglyu/Graduate/research/TensorFlow/model/research/object_detection/utils/label_map_util.py", line 132, in load_labelmap
label_map_string = fid.read()
File "/Users/zhenglyu/tensorflow/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 120, in read
self._preread_check()
File "/Users/zhenglyu/tensorflow/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 80, in _preread_check
compat.as_bytes(self.name), 1024 * 512, status)
File "/Users/zhenglyu/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: data/kitti_label_map.pbtxt; No such file or directory
The file definitely exists. I also don't understand why the path still has the default setting (data/kitti_label_map.pbtxt) even though I set label_map_path to a different value (object_detection/data/kitti_label_map.pbtxt).
I know there are a lot of related questions, but none of the solutions I found works for me. I installed TensorFlow with Virtualenv and am using Python 3.6. Could these be the problem? Thanks!
I don't have a definitive explanation for this, but here is what resolved it.
First, I copied kitti_label_map.pbtxt into the data_dir. Then I also copied create_kitti_tf_record.py into the data_dir. Finally (this is what made it run in the end), I copied the absolute path of kitti_label_map.pbtxt, including the file name, and passed it as label_map_path.
I have no idea why, but it worked.
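A relative path is resolved against the current working directory, so the same string can point to different places depending on where the script is started. Here is a small sketch for turning a path into its absolute form and failing early with a clear message. `absolute_existing_path` is a hypothetical helper, not part of the object detection API.

```python
import os


def absolute_existing_path(path):
    """Return the absolute form of path, or raise if nothing exists there."""
    full = os.path.abspath(path)
    if not os.path.exists(full):
        raise FileNotFoundError(
            "%s not found (resolved from working directory %s)"
            % (full, os.getcwd())
        )
    return full
```

Printing the result before passing it on the command line shows exactly which file the script would try to open.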
I am trying to run the code from here, which is an implementation of Generative Adversarial Networks using Keras in Python. I followed the instructions and installed all the requirements. Then I tried to run the code for DCGAN. However, there seems to be a compatibility issue between the libraries. I receive the following message when I run the code:
AttributeError: 'module' object has no attribute 'leaky_relu'
File "main.py", line 176, in <module>
dcgan = DCGAN()
File "main.py", line 25, in __init__
self.discriminator = self.build_discriminator()
File "main.py", line 84, in build_discriminator
model.add(LeakyReLU(alpha=0.2))
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/models.py", line 492, in add
output_tensor = layer(self.outputs[0])
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 617, in __call__
output = self.call(inputs, **kwargs)
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/layers/advanced_activations.py", line 46, in call
return K.relu(inputs, alpha=self.alpha)
File "/opt/libraries/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2918, in relu
x = tf.nn.leaky_relu(x, alpha)
I am using Keras 2.1.3, TensorFlow 1.2.1 and Theano 1.0.1+40.g757b4d5.
Any idea why I am getting this error?
EDIT:
The error is located at line 84, in the build_discriminator function: `model.add(LeakyReLU(alpha=0.2))`
According to this answer, leaky_relu was added to TensorFlow in version 1.4, so you might want to check that your TensorFlow installation is at least version 1.4.
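Besides comparing version numbers, the attribute itself can be checked directly. This is a minimal sketch: `supports_leaky_relu` is a hypothetical helper, and the two `SimpleNamespace` objects are stand-ins for `tf.nn` in an older and a newer TensorFlow, not real modules.

```python
import types


def supports_leaky_relu(nn_module):
    """Return True if the given module exposes a callable leaky_relu."""
    return callable(getattr(nn_module, "leaky_relu", None))


# Stand-ins for tf.nn before and after leaky_relu was added (illustration only).
old_nn = types.SimpleNamespace()
new_nn = types.SimpleNamespace(leaky_relu=lambda x, alpha=0.2: max(alpha * x, x))
```

With TensorFlow installed, the real check would be `supports_leaky_relu(tf.nn)` after `import tensorflow as tf`.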