I am using logistic regression for MNIST digit classification with the statsmodels.api library to fit the parameters, but Logit.fit() still throws an overflow warning. Below is the error I get on Windows 10, Python 2.7, using the library downloaded from http://www.lfd.uci.edu/~gohlke/pythonlibs/.
C:\Python27\lib\site-packages\statsmodels\discrete\discrete_model.py:1213: RuntimeWarning: overflow encountered in exp
  return 1/(1+np.exp(-X))
C:\Python27\lib\site-packages\statsmodels\discrete\discrete_model.py:1263: RuntimeWarning: divide by zero encountered in log
  return np.sum(np.log(self.cdf(q*np.dot(X,params))))
Warning: Maximum number of iterations has been exceeded.
         Current function value: inf
         Iterations: 35
Traceback (most recent call last):
  File "code.py", line 44, in <module>
    result1 = logit1.fit()
  File "C:\Python27\lib\site-packages\statsmodels\discrete\discrete_model.py", line 1376, in fit
    disp=disp, callback=callback, **kwargs)
  File "C:\Python27\lib\site-packages\statsmodels\discrete\discrete_model.py", line 203, in fit
    disp=disp, callback=callback, **kwargs)
  File "C:\Python27\lib\site-packages\statsmodels\base\model.py", line 434, in fit
    Hinv = np.linalg.inv(-retvals['Hessian']) / nobs
  File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 526, in inv
    ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
  File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 90, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.linalg.LinAlgError: Singular matrix
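A minimal sketch of a common mitigation for both symptoms, assuming flattened MNIST pixel features (train_images and train_labels are hypothetical names; only logit1 and result1 come from the traceback): rescale the 0-255 pixels so np.exp() cannot overflow, drop the constant all-zero border columns that make the Hessian singular, and use a regularized fit.

import numpy as np
import statsmodels.api as sm

# Hypothetical mitigation sketch, not the asker's code:
X = train_images.reshape(len(train_images), -1) / 255.0  # scale pixels to [0, 1]
X = X[:, X.std(axis=0) > 0]         # drop constant (all-zero) pixel columns
X = sm.add_constant(X)
logit1 = sm.Logit(train_labels, X)  # train_labels must be a binary 0/1 target
result1 = logit1.fit_regularized(alpha=1.0, disp=0)  # L1 penalty keeps params finite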
I ran detect.py in YOLOv5-5.0 and it showed me this error. How can I fix it?
Here is the problem:
Traceback (most recent call last):
  File "F:\python_yolov5\yolov5-5.0\detect.py", line 178, in <module>
    detect()
  File "F:\python_yolov5\yolov5-5.0\detect.py", line 61, in detect
    model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
  File "F:\anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "F:\python_yolov5\yolov5-5.0\models\yolo.py", line 123, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "F:\python_yolov5\yolov5-5.0\models\yolo.py", line 139, in forward_once
    x = m(x)  # run
  File "F:\anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "F:\python_yolov5\yolov5-5.0\models\yolo.py", line 55, in forward
    y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
RuntimeError: The size of tensor a (80) must match the size of tensor b (56) at non-singleton dimension 3
Download the file from this link:
https://github.com/ultralytics/yolov5/releases/download/v5.0/yolov5s.pt
Replace the default downloaded (v6.1) yolov5s.pt with it, then run detect.py again and it will work. The v6.1 weights are not compatible with the v5.0 code, which is why the anchor-grid multiplication above fails with mismatched tensor sizes.
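Assuming the stock v5.0 repository layout, the rerun would then look something like this (the --source path is just the repo default; point it at your own images):

python detect.py --weights yolov5s.pt --source data/images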
I am trying to debug my TensorFlow code, which suddenly produces a NaN loss after about 30 epochs. You may find my specific problem and the things I tried in this SO question.
I monitored the weights of all layers for each mini-batch during training and found that the weights suddenly jump to NaN although all weight values were less than 1 during the previous iteration (I have set kernel_constraint max_norm to 1). This makes it very hard to figure out which operation is the culprit.
PyTorch has a handy debugging tool, torch.autograd.detect_anomaly, that raises an error at any backward computation that produces a NaN value and shows the traceback. This makes it easy to debug the code.
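For reference, a minimal sketch of how that PyTorch tool is used (model, inputs, and the loss here are hypothetical stand-ins):

import torch

# Any backward pass that produces a NaN inside this context raises a
# RuntimeError whose traceback points at the forward op responsible.
with torch.autograd.detect_anomaly():
    loss = model(inputs).sum()
    loss.backward()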
Is there something similar in TensorFlow? If not, can you suggest a method to debug this?
There is indeed a similar debugging tool in TensorFlow; see tf.debugging.check_numerics.
This can be used to track the tensors that produce inf or nan values during training. As soon as such a value is found, TensorFlow raises an InvalidArgumentError.
tf.debugging.check_numerics(LayerN, "LayerN is producing nans!")
If the tensor LayerN contains nans, you get an error like this:
Traceback (most recent call last):
  File "trainer.py", line 506, in <module>
    worker.train_model()
  File "trainer.py", line 211, in train_model
    l, tmae = train_step(*batch)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 855, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 2943, in __call__
    filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1919, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 560, in call
    ctx=ctx)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: LayerN is producing nans! : Tensor had NaN values
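If you do not want to wrap individual tensors by hand, TensorFlow 2.x also has a global switch, which is, as far as I know, the closest analogue to detect_anomaly:

import tensorflow as tf

# Instruments every op in the program; the first op that outputs an inf or
# NaN raises an InvalidArgumentError naming the op and its creation stack.
tf.debugging.enable_check_numerics()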
I'm trying to train a convolutional neural network using Keras/TensorFlow. My model compiles correctly, but as soon as training begins the following error is returned:
Using TensorFlow backend.
Epoch 1/3
Traceback (most recent call last):
  File "./main.py", line 17, in <module>
    history = CNN.fit(TrainImages, TrainMasks, epochs = 3)
  File "/home/tomhalmos/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1239, in fit
    validation_freq=validation_freq)
  File "/home/tomhalmos/.local/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 196, in fit_loop
    outs = fit_function(ins_batch)
  File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/backend.py", line 3727, in __call__
    outputs = self._graph_fn(*converted_inputs)
  File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1551, in __call__
    return self._call_impl(args, kwargs)
  File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1591, in _call_impl
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1692, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 545, in call
    ctx=ctx)
  File "/home/tomhalmos/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: BiasGrad requires tensor size <= int32 max
  [[node gradients/conv2d_22/BiasAdd_grad/BiasAddGrad (defined at /home/tomhalmos/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3009) ]] [Op:__inference_keras_scratch_graph_5496]
Function call stack:
keras_scratch_graph
Happy to provide any further details if the above is not sufficient.
The bounds check is on the number of elements in a tensor; the size is limited to 2,147,483,647 values (the int32 maximum).
Take your image size (h × w) times the sample batch size, and multiply that by the number of channels in your operation (such as Conv2D). Wherever that count exceeds roughly 2.1e9 is the guilty operation. I see no solution other than reducing one of those numbers.
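As a quick sanity check you can do the arithmetic yourself; the numbers below are made up purely for illustration:

# Hypothetical Conv2D activation: batch 16, 2048 x 2048 feature map, 64 channels
batch, h, w, channels = 16, 2048, 2048, 64
elements = batch * h * w * channels
print(elements)               # 4294967296
print(elements > 2**31 - 1)   # True -> this op would trip the int32 limit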
I changed my job to run on the GPU and it works well.
I have modified an existing graph
https://github.com/TropComplique/FaceBoxes-tensorflow/blob/master/src/detector.py#L70
by adding tf.identity ops to give some tensors readable names, so that I can find them in the graph afterwards:
with tf.name_scope('postprocessing'):
    boxes = batch_decode(self.box_encodings, self.anchors)
    # if the images were padded we need to rescale predicted boxes:
    boxes = boxes / self.box_scaler
    boxes = tf.clip_by_value(boxes, 0.0, 1.0)
    # it has shape [batch_size, num_anchors, 4]
    scores = tf.nn.softmax(self.class_predictions_with_background, axis=2)[:, :, 1]
    # it has shape [batch_size, num_anchors]
    boxes = tf.identity(input=boxes, name="my_boxes")
    scores = tf.identity(input=scores, name="my_scores")
Then, having an existing checkpoint, I converted it to a .pb file using this modified graph, as described here:
https://github.com/TropComplique/FaceBoxes-tensorflow/issues/6
But when I then try to import the .pb into TensorBoard via https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/import_pb_to_tensorboard.py and view it, I get:
python import_pb_to_tensorboard.py --model_dir model_v3.pb --log_dir model_v3_log_dir
2019-03-14 16:28:43.572017: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 418, in import_graph_def
    graph._c_graph, serialized, options)  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Node 'nms/map/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3' expects to be colocated with unknown node 'postprocessing/my_boxes'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "import_pb_to_tensorboard.py", line 86, in <module>
    app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "import_pb_to_tensorboard.py", line 68, in main
    import_to_tensorboard(FLAGS.model_dir, FLAGS.log_dir)
  File "import_pb_to_tensorboard.py", line 59, in import_to_tensorboard
    importer.import_graph_def(graph_def)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 422, in import_graph_def
    raise ValueError(str(e))
ValueError: Node 'nms/map/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3' expects to be colocated with unknown node 'postprocessing/my_boxes'
So, is it possible to give readable names to some ops of an existing graph?
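One workaround I have seen for this class of colocation error (a sketch using the TF 1.x API to match the question, not a fix from the FaceBoxes repo itself) is to strip the stale colocation constraints, i.e. the "_class" attributes carrying "loc:@..." references, from the GraphDef before importing it:

import tensorflow as tf

# Sketch: drop colocation ("_class") attrs that point at nodes which no
# longer exist under their old names after the graph modification.
graph_def = tf.GraphDef()
with open("model_v3.pb", "rb") as f:
    graph_def.ParseFromString(f.read())
for node in graph_def.node:
    if "_class" in node.attr:
        del node.attr["_class"]
tf.import_graph_def(graph_def, name="")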
I am working on sentiment analysis of around 30,000 tweets. The Python version is 2.7 on Linux. In the training phase I am using nltk as a wrapper for the sklearn library to apply different classifiers such as naive Bayes, LinearSVC, logistic regression, etc.
It works fine when the number of tweets is around 10,000, but with 30,000 tweets I now get an error when classifying bigrams with multinomial naive Bayes in sklearn. Here is part of the implementation code, after pre-processing and dividing into train and test sets:
import nltk
from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.naive_bayes import MultinomialNB
training_set = nltk.classify.util.apply_features(extractFeatures, trainTweets)
testing_set = nltk.classify.util.apply_features(extractFeatures, testTweets)
MNB_classifier = SklearnClassifier(MultinomialNB())
MNB_classifier.train(training_set)
MNBAccuracy = nltk.classify.accuracy(MNB_classifier, testing_set)*100
print "-------- MultinomialNB --------"
print "RESULT : Matches " + str(int((testSize*MNBAccuracy)/100)) + ":"+ str(testSize)
print "MNB accuracy percentage:" + str(MNBAccuracy)
print ""
Here is the error:
Traceback (most recent call last):
  File "/home/sb402747/Desktop/Sentiment/sentiment140API/analysing/Classifier.py", line 83, in <module>
    MNB_classifier.train(training_set)
  File "/home/sb402747/.local/lib/python2.7/site-packages/nltk/classify/scikitlearn.py", line 115, in train
    X = self._vectorizer.fit_transform(X)
  File "/home/sb402747/.local/lib/python2.7/site-packages/sklearn/feature_extraction/dict_vectorizer.py", line 226, in fit_transform
    return self._transform(X, fitting=True)
  File "/home/sb402747/.local/lib/python2.7/site-packages/sklearn/feature_extraction/dict_vectorizer.py", line 176, in _transform
    indptr.append(len(indices))
OverflowError: signed integer is greater than maximum
I guess the reason is that the number of indices in the array exceeds the maximum allowed in dict_vectorizer.py. I even tried to change the type of the indices in dict_vectorizer.py from 'i' to 'l', but that didn't solve the problem; I received this error instead:
Traceback (most recent call last):
  File "/home/sb402747/Desktop/Sentiment/ServerBackup26-02-2016/analysing/Classifier.py", line 84, in <module>
    MNB_classifier.train(training_set)
  File "/home/sb402747/.local/lib/python2.7/site-packages/nltk/classify/scikitlearn.py", line 115, in train
    X = self._vectorizer.fit_transform(X)
  File "/home/sb402747/.local/lib/python2.7/site-packages/sklearn/feature_extraction/dict_vectorizer.py", line 226, in fit_transform
    return self._transform(X, fitting=True)
  File "/home/sb402747/.local/lib/python2.7/site-packages/sklearn/feature_extraction/dict_vectorizer.py", line 186, in _transform
    shape=shape, dtype=dtype)
  File "/rwthfs/rz/SW/UTIL.common/Python/2.7.9/x86_64/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 88, in __init__
    self.check_format(full_check=False)
  File "/rwthfs/rz/SW/UTIL.common/Python/2.7.9/x86_64/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 167, in check_format
    raise ValueError("indices and data should have the same size")
ValueError: indices and data should have the same size
So I discarded that change and reverted the type to 'i'. How can I solve this problem?
Hmm, looks like the problem is here:
  File "/home/sb402747/.local/lib/python2.7/site-packages/nltk/classify/scikitlearn.py", line 115, in train
    X = self._vectorizer.fit_transform(X)
nltk requests too big a matrix as a result.
Maybe you can change that somehow, for example by minimizing the number of features (words) in your text, or by computing the result in two passes? A sketch of the feature-capping idea follows the link below.
Also, are you doing this on the latest stable releases of numpy/scipy/scikit-learn?
Read this too: https://sourceforge.net/p/scikit-learn/mailman/message/31340515/
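To make the "minimize the number of features" suggestion concrete, here is a sketch using sklearn's FeatureHasher to cap the matrix width. The names extractFeatures and trainTweets follow the question and are assumed to behave as described there; depending on your sklearn version the keyword is non_negative=True or alternate_sign=False (MultinomialNB needs non-negative values either way).

from sklearn.feature_extraction import FeatureHasher
from sklearn.naive_bayes import MultinomialNB

# Sketch: assumes trainTweets is a list of (tokens, label) pairs and
# extractFeatures(tokens) returns a {feature: value} dict, as in the question.
hasher = FeatureHasher(n_features=2**18, non_negative=True)  # alternate_sign=False on newer sklearn
X_train = hasher.transform(extractFeatures(tokens) for tokens, label in trainTweets)
y_train = [label for tokens, label in trainTweets]
MNB_classifier = MultinomialNB().fit(X_train, y_train)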