ValueError: Found input variables with inconsistent numbers of samples: [75, 1] - python

Can somebody tell me what kind of error this is and how to solve it?
Here is the error message:
Traceback (most recent call last):
File "ep5pipeline.py", line 57, in <module>
print (accuracy_score(y_test,predictions, normalize = False))
File "//anaconda/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 172, in accuracy_score
y_type, y_true, y_pred = _check_targets(y_true, y_pred)
File "//anaconda/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 72, in _check_targets
check_consistent_length(y_true, y_pred)
File "//anaconda/lib/python3.6/site-packages/sklearn/utils/validation.py", line 181, in check_consistent_length
" samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [75, 1]

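In your predict() function, the "return predictions" statement should be outside the "for" loop. As written, the function returns after the first iteration, so predictions holds only 1 value while y_test holds 75.
Also, as Neil said, it is better to post your source code.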
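The asker's predict() is not shown, so the following is only a minimal sketch of the likely pattern, assuming a hand-written classifier whose predict() loops over the test rows. The names classify, predict_buggy and predict_fixed are hypothetical stand-ins, not the original code.

```python
def classify(row):
    """Hypothetical stand-in for the real per-row prediction logic."""
    return 0


def predict_buggy(X_test):
    predictions = []
    for row in X_test:
        predictions.append(classify(row))
        return predictions  # inside the loop: exits after the first row


def predict_fixed(X_test):
    predictions = []
    for row in X_test:
        predictions.append(classify(row))
    return predictions  # outside the loop: one prediction per row


X_test = [[0.0, 0.0]] * 75  # 75 placeholder rows, matching the error message
buggy = predict_buggy(X_test)
fixed = predict_fixed(X_test)
print(len(buggy), len(fixed))  # prints: 1 75
```

With the return moved out of the loop, predictions has the same length as y_test, so accuracy_score no longer sees mismatched sample counts.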

Related

Unknown is not supported - f1 score

I want to compute the F1 score for 32 predicted mask images and 32 true mask images. My data has these features:
predicted.shape [32,512,512]
true.shape [32,512,512]
type_of_target(predicted) Unknown
type_of_target(true) Unknown
type_of_target(predicted[0]) Continuous-multioutput
type_of_target(true[0]) Continuous-multioutput
When I run f1_score(true, predicted, average='macro'), I get this error:
f1_score(true, predicted, average='macro')
Traceback (most recent call last):
File "<ipython-input-75-7198c91642b6>", line 1, in <module>
f1_score(true, predicted, average='macro')
File "C:\Anaconda3\lib\site-packages\sklearn\metrics\_classification.py", line 1099, in f1_score
zero_division=zero_division)
File "C:\Anaconda3\lib\site-packages\sklearn\metrics\_classification.py", line 1226, in fbeta_score
zero_division=zero_division)
File "C:\Anaconda3\lib\site-packages\sklearn\metrics\_classification.py", line 1484, in precision_recall_fscore_support
pos_label)
File "C:\Anaconda3\lib\site-packages\sklearn\metrics\_classification.py", line 1301, in _check_set_wise_labels
y_type, y_true, y_pred = _check_targets(y_true, y_pred)
File "C:\Anaconda3\lib\site-packages\sklearn\metrics\_classification.py", line 97, in _check_targets
raise ValueError("{0} is not supported".format(y_type))
ValueError: unknown is not supported
The F1 score is the harmonic mean of precision and recall. Precision and recall are defined for categorical predictions, not for continuous outputs. You need to convert the predictions to categorical labels (for example by rounding or thresholding) and then flatten the arrays, since for binary or multiclass targets f1_score expects 1D arrays of labels.
I think the F1 input should be a 1D array of labels.
Make sure that is the case.
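The two steps described above (convert to labels, then flatten) could look roughly like this. It is a sketch under assumptions: the masks are taken to hold values in [0, 1], the 0.5 threshold is an arbitrary choice, and random arrays of a smaller shape stand in for the real 32 x 512 x 512 data.

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
# Stand-ins for the real masks (assumed to hold continuous values in [0, 1]).
true = rng.random((32, 16, 16))
predicted = rng.random((32, 16, 16))

threshold = 0.5  # assumed cut-off; choose one that suits your masks

# Binarize each pixel, then flatten every image into one long 1D label vector.
true_labels = (true >= threshold).astype(int).ravel()
pred_labels = (predicted >= threshold).astype(int).ravel()

score = f1_score(true_labels, pred_labels, average="macro")
print(true_labels.shape, score)
```

Flattening treats every pixel as one sample, so the score is computed over all pixels of all 32 images at once.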

I am trying to recreate a simple helloworld machine learning example from a google youtube video, but I am getting errors that I do not understand

I am trying to follow this machine learning hello world video: https://www.youtube.com/watch?v=cKxRvEZd3Mw
but I am getting errors that I do not understand. The errors seem to be within the library functions themselves and not with what I wrote directly. I know that the video is from 2016 so something must have changed in that time. I am just beginning with machine learning and need help!
I checked that my syntax matches the video, and I don't know what else to try.
from sklearn import tree
features = [[140, 1], [130, 1], [150, 0], [[170, 0]]]
labels = [0, 0, 1, 1]
clf = tree.DecisionTreeClassifier()
clf = clf.fit(features, labels)
print(clf.predict([[150, 0]]))
The output in the video is "[1]", but I am getting these errors instead:
C:\Users\offic\PycharmProjects\helloWorld\venv\Scripts\python.exe "C:/Users/offic/PycharmProjects/helloWorld/Hello World.py"
Traceback (most recent call last):
File "C:/Users/offic/PycharmProjects/helloWorld/Hello World.py", line 5, in <module>
clf = clf.fit(features, labels)
File "C:\Users\offic\PycharmProjects\helloWorld\venv\lib\site-packages\sklearn\tree\tree.py", line 801, in fit
X_idx_sorted=X_idx_sorted)
File "C:\Users\offic\PycharmProjects\helloWorld\venv\lib\site-packages\sklearn\tree\tree.py", line 116, in fit
X = check_array(X, dtype=DTYPE, accept_sparse="csc")
File "C:\Users\offic\PycharmProjects\helloWorld\venv\lib\site-packages\sklearn\utils\validation.py", line 527, in check_array
array = np.asarray(array, dtype=dtype, order=order)
File "C:\Users\offic\PycharmProjects\helloWorld\venv\lib\site-packages\numpy\core\numeric.py", line 538, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
Process finished with exit code 1
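One thing that stands out in the code above is that the last entry of features is written as [[170, 0]], one bracket level deeper than the other rows, so the list cannot be converted into a regular 2D array. Below is a minimal sketch of a quick check and of the flattened version, assuming the extra brackets are a typo. The names bad_features and row_lengths are introduced here only for illustration.

```python
from sklearn import tree

# The list as posted: the last row is wrapped in an extra pair of brackets.
bad_features = [[140, 1], [130, 1], [150, 0], [[170, 0]]]
row_lengths = [len(row) for row in bad_features]
print(row_lengths)  # [2, 2, 2, 1] -> the rows are not all the same shape

# Same data with every row written as a flat [weight, texture] pair.
features = [[140, 1], [130, 1], [150, 0], [170, 0]]
labels = [0, 0, 1, 1]

clf = tree.DecisionTreeClassifier()
clf = clf.fit(features, labels)
prediction = clf.predict([[150, 0]])
print(prediction)  # [1]
```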

Multiple labels with tensorflow regression

I'm trying to get a multilabel model going in tensorflow. I saw a related question here: Multiple labels with tensorflow, but couldn't get the solution working.
The code is from a tensorflow tutorial. https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/input_fn/boston.py
FEATURES = ["crim", "zn", "indus", "nox", "rm",
            "dis", "tax", "ptratio"]
LABELS = ["medv", "age"]
def get_input_fn(data_set, num_epochs=None, shuffle=True):
    return tf.estimator.inputs.pandas_input_fn(
        x=pd.DataFrame({k: data_set[k].values for k in FEATURES}),
        # y=pd.Series(data_set[LABEL].values),
        y=list(map(lambda label: data_set[label].values, LABELS)),
        num_epochs=num_epochs,
        shuffle=shuffle)
In my regression I set the label dimension to 2.
regressor = tf.estimator.DNNRegressor(feature_columns=feature_cols,
                                      label_dimension=2,
                                      hidden_units=[10, 10],
                                      model_dir="/tmp/boston_model")
With this attempt I get:
Traceback (most recent call last):
File "./boston.py", line 85, in <module>
tf.app.run()
File "/home/jillian/.eb/software/machine-learning/1.00/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "./boston.py", line 67, in main
regressor.train(input_fn=get_input_fn(training_set), steps=5000)
File "./boston.py", line 43, in get_input_fn
shuffle=shuffle)
File "/home/jillian/.eb/software/machine-learning/1.00/lib/python3.6/site-packages/tensorflow/python/estimator/inputs/pandas_io.py", line 87, in pandas_input_fn
'Index for y: %s\n' % (x.index, y.index))
ValueError: Index for x and y are mismatched.
Index for x: RangeIndex(start=0, stop=400, step=1)
Index for y: <built-in method index of list object at 0x7f6f64a5bb48>
I also tried setting y to a numpy array instead of a list.

issues with machine learning scikit learn in python

I'm trying to reproduce a tutorial seen here.
Everything works perfectly until I call the .fit method with my training set.
Here is a sample of my code:
# TRAINING PART
train_dir = 'pdf/learning_set'
dictionary = make_dic(train_dir)
train_labels = np.zeros(20)
train_labels[17:20] = 1
train_matrix = extract_features(train_dir)
model1 = MultinomialNB()
model1.fit(train_matrix, train_labels)
# TESTING PART
test_dir = 'pdf/testing_set'
test_matrix = extract_features(test_dir)
test_labels = np.zeros(8)
test_labels[4:7] = 1
result1 = model1.predict(test_matrix)
print(confusion_matrix(test_labels, result1))
Here is my Traceback:
Traceback (most recent call last):
File "ML.py", line 65, in <module>
model1.fit(train_matrix, train_labels)
File "/usr/local/lib/python3.6/site-packages/sklearn/naive_bayes.py", line 579, in fit
X, y = check_X_y(X, y, 'csr')
File "/usr/local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 552, in check_X_y
check_consistent_length(X, y)
File "/usr/local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 173, in check_consistent_length
" samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [23, 20]
I would like to know how I can solve this issue.
I'm working on Ubuntu 16.04, with Python 3.6.
ValueError: Found input variables with inconsistent numbers of samples: [23, 20]
This means you have 23 training vectors (train_matrix has 23 rows) but only 20 training labels (train_labels is an array of 20 values).
Change train_labels = np.zeros(20) to train_labels = np.zeros(23), check that the entries you set to 1 still line up with the right documents, and it should work.
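To avoid hard-coding the count, the label array can be sized from the feature matrix itself. The sketch below uses a random count matrix as a stand-in for the output of extract_features (which is not shown in the question), and keeping the 17:20 slice as the positive class is only an assumption about how the 23 documents are ordered.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.default_rng(0)
# Stand-in for extract_features(train_dir): 23 documents, 10 word counts each.
train_matrix = rng.integers(0, 5, size=(23, 10))

# Size the labels from the matrix so the two can never disagree.
n_samples = train_matrix.shape[0]
train_labels = np.zeros(n_samples)
train_labels[17:20] = 1  # assumption: these rows are still the positive class

# Fail early with a readable message if the shapes ever drift apart.
if train_matrix.shape[0] != train_labels.shape[0]:
    raise ValueError("feature rows and labels differ in length")

model1 = MultinomialNB()
model1.fit(train_matrix, train_labels)
predicted = model1.predict(train_matrix)
print(train_matrix.shape, train_labels.shape, predicted.shape)
```

The same check applies to the test set: test_labels must have as many entries as test_matrix has rows before calling confusion_matrix.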

statsmodel.api fit() throws overflow error

I am using logistic regression for MNIST digit classification and am using the statsmodels.api library to fit the parameters, but Logit.fit() still throws an overflow warning. Below is the error I am getting on Windows 10 with Python 2.7, using the library downloaded from http://www.lfd.uci.edu/~gohlke/pythonlibs/.
C:\Python27\lib\site-packages\statsmodels\discrete\discrete_model.py:1213: RuntimeWarning: overflow encountered in exp
return 1/(1+np.exp(-X))
C:\Python27\lib\site-packages\statsmodels\discrete\discrete_model.py:1263: RuntimeWarning: divide by zero encountered in log
return np.sum(np.log(self.cdf(q*np.dot(X,params))))
Warning: Maximum number of iterations has been exceeded.
Current function value: inf
Iterations: 35
Traceback (most recent call last):
File "code.py", line 44, in <module>
result1 = logit1.fit()
File "C:\Python27\lib\site-packages\statsmodels\discrete\discrete_model.py", line 1376, in fit
disp=disp, callback=callback, **kwargs)
File "C:\Python27\lib\site-packages\statsmodels\discrete\discrete_model.py", line 203, in fit
disp=disp, callback=callback, **kwargs)
File "C:\Python27\lib\site-packages\statsmodels\base\model.py", line 434, in fit
Hinv = np.linalg.inv(-retvals['Hessian']) / nobs
File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 526, in inv
ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 90, in _raise_linalgerror_singular
raise LinAlgError("Singular matrix")
numpy.linalg.linalg.LinAlgError: Singular matrix
