I am using sklearn's metrics module to compute the accuracy of this model:
# Model Accuracy, how often is the classifier correct?
a = y_test
b = df_train['Interest_Rate']
k = a.reshape(1, -1)
y = b.reshape(1, -1)
print("Accuracy:",metrics.accuracy_score(k, y))
but when I run it, I get this error:
raise ValueError(
ValueError: Expected 2D array, got 1D array instead:
array=[1 3 3 ... 1 3 2].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
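For reference, accuracy_score compares two 1D label arrays of the same length, so no reshape is normally needed. A minimal sketch of the usual pattern (assuming a classifier clf already fitted on the training split, and that the goal is to compare the test labels against predictions on X_test rather than against a training-set column):
from sklearn import metrics
# predict on the held-out features, then compare against the held-out labels;
# both arguments are 1D arrays of the same length, so no reshape is required
y_pred = clf.predict(X_test)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))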
I am testing an .ipynb file with code to perform data augmentation using autoencoders. You can find the file in the following link: Autoencoder Data Augmentation Example
Dataset used: wine.csv
The first error I encountered was a variable that did not exist, which I fixed by renaming it:
# D_in = data_set.x.shape[1]
D_in = traindata_set.x.shape[1]
H = 50
H2 = 12
model = Autoencoder(D_in, H, H2).to(device)
optimizer = optim.Adam(model.parameters(), lr=1e-3)
Later, I get another error derived from the following code:
scaler = trainloader.dataset.standardizer
recon_row = scaler.inverse_transform(recon_batch[0].cpu().numpy())
real_row = scaler.inverse_transform(testloader.dataset.x[0].cpu().numpy())
I get the following error that I can't solve:
ValueError: Expected 2D array, got 1D array instead:
array=[-1.1050762 0.59396696 -0.40257156 0.5084665 -0.3387986 0.5908352
0.6442218 0.7660801 -0.36749032 0.2818777 -0.06692128 0.49236417
0.7825899 0.8493577 ].
Reshape your data either using array.reshape(-1, 1)
if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
I hope you can help me figure this out.
Seems like inverse_transform is expecting a matrix. In your code you are passing a single sample (row), so that doesn't work.
If you follow the error message tip and reshape your input the cell runs.
recon_row = scaler.inverse_transform(recon_batch[0].cpu().numpy().reshape(1, -1))
real_row = scaler.inverse_transform(testloader.dataset.x[0].cpu().numpy().reshape(1, -1))
Now that you have two matrices of shape (1, 14), you also need to change the next cell so that it uses only the first (and only) sample in each matrix:
df = pd.DataFrame(np.stack((recon_row[0], real_row[0])), columns = cols)
I have 3 classifiers that run over 288 samples. All of them are sklearn.neural_network.MLPClassifier instances. Here is the code I am using:
list_of_clfs = [MLPClassifier(...), MLPClassifier(...), MLPClassifier(...)]
probas_list = []
for clf in list_of_clfs:
    probas_list.append(clf.predict_proba(X_test))
Each predict_proba(X_test) call returns a 2D array with shape (n_samples, n_classes). Then, I am creating a 3D array that will contain all the predict_proba() outputs in one place:
proba = np.array(probas_list) #this should return a (3, n_samples, n_classes) array
This should work fine, but I get an error:
ValueError: could not broadcast input array from shape (288,4) into shape (288)
I don't know why, but this works with dummy examples but not with my dataset.
Update: it seems like one of the predict_proba() calls is returning an array of shape (288, 2), but my problem has 4 classes. All classifiers are being tested on the same dataset, so I don't know where this comes from.
Thanks in advance
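A likely explanation (a guess, since the training labels are not shown): MLPClassifier sizes its output layer from the classes it actually saw during fit, so a classifier whose training labels contained only 2 of the 4 classes returns predict_proba with 2 columns. A small sketch of how to check this and pad each probability matrix back to all 4 classes, where all_classes is a hypothetical array holding the full label set:
import numpy as np

all_classes = np.array([0, 1, 2, 3])  # hypothetical: the 4 labels of the full problem
aligned_list = []
for clf in list_of_clfs:
    proba = clf.predict_proba(X_test)  # shape (288, len(clf.classes_))
    full = np.zeros((proba.shape[0], len(all_classes)))
    # copy each column into the slot of the class this classifier actually learned
    for i, c in enumerate(clf.classes_):
        full[:, np.searchsorted(all_classes, c)] = proba[:, i]
    aligned_list.append(full)
proba = np.array(aligned_list)  # now reliably (3, 288, 4)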
I have a UNet model. I'm treating it as a regression model since, in my output, I have a different floating-point value for each pixel. In order to check the r2score, I tried to put the code below in the model class, in training_step, validation_step, and test_step.
from pytorch_lightning.metrics.functional import r2score
r2 = r2score(pred, y)
self.log('r2:',r2)
But it gives the following error:
ValueError: Expected both prediction and target to be 1D or 2D tensors, but recevied tensors with dimension torch.Size([50, 1, 32, 32])
How can I check my model fit?
The issue is that the function accepts 1D or 2D tensors, but your tensor is 4D (B x C x H x W). So to use the function you should reshape it:
r2 = r2score(pred.view(pred.shape[0], -1), y.view(y.shape[0], -1))
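If a single pixel-wise R² over the whole batch is enough, flattening both tensors to 1D also satisfies the shape requirement (a sketch, assuming pred and y share the shape (B, 1, H, W)):
# treat every pixel in the batch as one observation of a single output
r2 = r2score(pred.reshape(-1), y.reshape(-1))
self.log('r2', r2)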
I am working on a numpy array X that has multiple features (or columns).
I am trying to standardize the first feature (column) of the dataset:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X[:,0] = sc.fit_transform(X[:,0].reshape(-1,1))
and I get this error:
ValueError: could not broadcast input array from shape (26000,1) into shape (26000)
If I remove reshape(-1,1), then I get this error:
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
How can I tackle this problem?
Thanks in advance!
I will split this answer into two parts: first, understanding the input to StandardScaler's fit_transform function, and second, understanding its output.
If you follow the StandardScaler documentation on the fit_transform function, it says:
Parameters: X : array-like of shape (n_samples, n_features). Input samples.
Returns: X_new : ndarray of shape (n_samples, n_features_new).
Understanding the Input:
Here, when you do X[:,0], you are getting your entire column, but as a flat 1D array. Here's an example on a random 3x2 array:
import numpy as np
X = np.random.random_sample((3,2))*10
print(X)
print(X[:,0])
print(X[:,0].shape)
gives
[[3.5782437 6.12481959]
[9.2333248 8.49628361]
[8.56447626 5.24588392]]
[3.5782437 9.2333248 8.56447626]
(3,)
Here lies our first issue. The StandardScaler expects a 2D array where each row is a sample and each column is a feature, in this case 1 feature. Therefore, we need a (3, 1) 2D array, but we have a (3,) 1D array. This causes the error you get without reshaping. To convert our 1D array to the shape the function expects, we use reshape, which takes the target shape as its parameter. We want a (3, 1) shape, so we use reshape(-1, 1) (the -1 tells NumPy to infer that dimension's size from the length of the array).
To confirm this:
print(X[:,0].reshape(-1,1))
gives
[[9.31648164]
[6.74048286]
[7.57667118]]
Now, we are ready to input this to StandardScaler's fit_transform.
Second, the output of fit_transform:
Let's look at the shape of its output:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
print(sc.fit_transform(X[:,0].reshape(-1,1)))
print(sc.fit_transform(X[:,0].reshape(-1,1)).shape)
gives
[[ 1.34073239]
[-1.06001668]
[-0.28071571]]
(3, 1)
We are getting a (3, 1) 2D array, while X[:,0] is actually a (3,) 1D array. We want to flatten this array back to 1D. There are multiple ways to do this. We could use reshape again, passing a single value to indicate a 1D array, where -1 means all elements:
temp = sc.fit_transform(X[:,0].reshape(-1,1)).reshape(-1)
print(temp)
print(temp.shape)
gives
[ 1.34073239 -1.06001668 -0.28071571]
(3,)
This means that X[:,0] = sc.fit_transform(X[:,0].reshape(-1,1)).reshape(-1) will work.
Using ravel or flatten works as well.
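For example, either of these produces the same 1D result as reshape(-1):
X[:,0] = sc.fit_transform(X[:,0].reshape(-1,1)).ravel()    # flattened view where possible
X[:,0] = sc.fit_transform(X[:,0].reshape(-1,1)).flatten()  # always returns a copy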
sc.fit_transform(X[:,0].reshape(-1,1)) returns a 2D array with shape (26000, 1), and you cannot assign it to the 1D slice X[:,0], which has shape (26000,).
Try flattening the result when assigning it back:
X[:,0] = sc.fit_transform(X[:,0].reshape(-1,1)).ravel()
I am working with the Pima Indians dataset. After reading in the file via read_csv, the shape looks good at (768, 9). But subsetting on the first 8 columns results in the column dimension of the shape getting lost:
pima = pd.read_csv('/git/uni/data/pima-indians-diabetes.csv', header=0)
X = pima.iloc[:,-1]
Y = pima.iloc[:,:-1]
gnb = GaussianNB()
y_pred = gnb.fit(X,Y) # .predict(pima)
The surprise here is the X.shape: Why would the shape be lost?
pima.shape=(768, 9), X.shape=(768,), Y.shape=(768, 8)
So then we get bitten on the fit:
ValueError: Expected 2D array, got 1D array instead:
array=[1 0 1 ..., 0 1 0].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Update: I had also tried X = pima.iloc[:,-1].values, but it gives the same result/behavior.
You defined your X and Y backwards: X should be the feature columns (:-1), while y, the target, is your last column (-1). Also, the standard convention is to use X (capital) and y (lowercase).
Therefore, your code should look like this:
X = pima.iloc[:,:-1]
y = pima.iloc[:,-1]
gnb = GaussianNB()
y_pred = gnb.fit(X, y)
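Note that fit returns the fitted estimator itself, so the y_pred above holds the model rather than predictions. A short follow-up sketch (assuming you simply want to evaluate on the same data you fitted on) to get actual predictions and an accuracy number:
from sklearn.metrics import accuracy_score

y_pred = gnb.fit(X, y).predict(X)  # fit returns the estimator, so predict can be chained
print("Accuracy:", accuracy_score(y, y_pred))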
You are confusing the names; your X should be Y and vice versa:
Y = pima.iloc[:,-1]
X = pima.iloc[:,:-1]