Can't transform list of arrays into 3D array - python

I have 3 classifiers that run over 288 samples. All of them are sklearn.neural_network.MLPClassifier instances. Here is the code I am using:
list_of_clfs = [MLPClassifier(...), MLPClassifier(...), MLPClassifier(...)]
probas_list = []
for clf in list_of_clfs:
    probas_list.append(clf.predict_proba(X_test))
Each predict_proba(X_test) call returns a 2D array with shape (n_samples, n_classes). Then I am creating a 3D array that will hold all of the predict_proba() outputs in one place:
proba = np.array(probas_list) #this should return a (3, n_samples, n_classes) array
This should work fine, but I get an error:
ValueError: could not broadcast input array from shape (288,4) into shape (288)
I don't know why, but this works with dummy examples and not with my dataset.
Update: it seems that one of the predict_proba() calls returns an array of shape (288, 2), but my problem has 4 classes. All classifiers are tested on the same dataset, so I don't know where this comes from.
Thanks in advance
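Since the update points to one classifier producing a (288, 2) output, a useful check is clf.classes_, which lists the labels that classifier saw during fit; predict_proba returns one column per fitted class. Below is a minimal sketch (not from the original post) that aligns each classifier's columns to the full label set before stacking; list_of_clfs, X_test, and the label set all_classes are assumed names.
import numpy as np

all_classes = np.array([0, 1, 2, 3])   # assumption: the 4 labels of the problem

aligned = []
for clf in list_of_clfs:
    proba = clf.predict_proba(X_test)                    # (n_samples, len(clf.classes_))
    full = np.zeros((proba.shape[0], len(all_classes)))  # (n_samples, 4)
    for j, label in enumerate(clf.classes_):
        # place each column at the position of that label in the full label set
        full[:, np.where(all_classes == label)[0][0]] = proba[:, j]
    aligned.append(full)

probas = np.stack(aligned)   # (3, n_samples, 4); stacking now succeeds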

Related

Confusion matrix for 4D array of multiclass image segmentation

The results from my multi-class image segmentation give a 4D array like (25, 512, 512, 4). What would be the best way to create a confusion matrix against the actual class labels (which have the same array dimensions)?
I thought about taking argmax over the class axis to return the index (i.e. the class label) and flattening, to get a 1D array of predicted class labels. But this seems really inefficient, so I'm hoping someone has a better idea.
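For what it's worth, a vectorized sketch of that argmax approach, with no Python-level loops; y_pred and y_true are assumed names, with y_pred of shape (25, 512, 512, 4) and y_true holding integer labels of shape (25, 512, 512) (if the ground truth is one-hot encoded as well, argmax it the same way):
import numpy as np
from sklearn.metrics import confusion_matrix

pred_labels = np.argmax(y_pred, axis=-1)      # (25, 512, 512): class index per pixel
cm = confusion_matrix(y_true.ravel(),         # flatten both to 1D before comparing
                      pred_labels.ravel(),
                      labels=[0, 1, 2, 3])
print(cm)                                     # 4 x 4 confusion matrix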

ValueError: could not broadcast input array from shape (26000,1) into shape (26000) for sklearn preprocessing StandardScaler

I am working on a numpy array X that has multiple features (or columns).
I am trying to standardize the first feature (column) of the dataset:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X[:,0] = sc.fit_transform(X[:,0].reshape(-1,1))
and I get this error:
ValueError: could not broadcast input array from shape (26000,1) into shape (26000)
if I remove reshape(-1,1) then I get this error:
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
How can I tackle this problem?
Thanks in advance!
I will split this answer into two parts: first, understanding the input to StandardScaler's fit_transform function, and second, understanding its output.
If you follow the StandardScaler documentation on the fit_transform function, it says:
Parameters: X : array-like of shape (n_samples, n_features). Input samples.
Returns: X_new : ndarray of shape (n_samples, n_features_new).
Understanding the Input:
Here, when you do X[:,0], you get the entire column, but as a flat 1D array. Here's an example on a random 3x2 array:
import numpy as np
X = np.random.random_sample((3,2))*10
print(X)
print(X[:,0])
print(X[:,0].shape)
gives
[[3.5782437 6.12481959]
[9.2333248 8.49628361]
[8.56447626 5.24588392]]
[3.5782437 9.2333248 8.56447626]
(3,)
Here lies our first issue. StandardScaler expects a 2D array where the rows are the samples and the columns are the features, in this case 1 feature. Therefore we need a (3, 1) 2D array, but we have a (3,) 1D array. This is what causes the error you get without reshaping. To convert our 1D array to the shape the function expects, we use reshape, which takes the new shape as a parameter. We want a (3, 1) shape, so we use reshape(-1, 1) (the -1 tells numpy to infer the size of that dimension from the number of elements).
To confirm this:
print(X[:,0].reshape(-1,1))
gives
[[9.31648164]
[6.74048286]
[7.57667118]]
Now, we are ready to input this to StandardScaler's fit_transform.
Second, the output of fit_transform:
Let's look at the shape of its output:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
print(sc.fit_transform(X[:,0].reshape(-1,1)))
print(sc.fit_transform(X[:,0].reshape(-1,1)).shape)
gives
[[ 1.34073239]
[-1.06001668]
[-0.28071571]]
(3, 1)
We get a (3, 1) 2D array, while X[:,0] is actually a (3,) 1D array, so we want to flatten the result back to a 1D array. There are multiple ways to do this. We could use reshape again, passing a single value for a 1D shape; -1 means all elements:
temp = sc.fit_transform(X[:,0].reshape(-1,1)).reshape(-1)
print(temp)
print(temp.shape)
gives
[ 1.34073239 -1.06001668 -0.28071571]
(3,)
This means that X[:,0] = sc.fit_transform(X[:,0].reshape(-1,1)).reshape(-1) will work.
Using ravel or flatten works as well.
sc.fit_transform(X[:,0].reshape(-1,1)) is a 2D array with shape (26000, 1), and you cannot assign it to the 1D slice X[:,0], which has shape (26000,).
Try flattening the result when assigning it back:
X[:,0] = sc.fit_transform(X[:,0].reshape(-1,1)).ravel()
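As a hedged alternative sketch: slicing with a list of column indices keeps the slice two dimensional, so the scaler's 2D output can be assigned back without any reshape or ravel. X here is a stand-in for the question's array.
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.random.random_sample((26000, 3)) * 10   # stand-in for the question's X
sc = StandardScaler()
# X[:, [0]] is a 2D slice of shape (26000, 1), matching fit_transform's output,
# so it can be assigned back directly.
X[:, [0]] = sc.fit_transform(X[:, [0]])
print(X[:, 0].mean(), X[:, 0].std())           # approximately 0.0 and 1.0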

Different Numpy reshaping to 3D array syntaxes

I'm looking at LSTM neural networks. I saw code like this below:
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
This code is meant to change a 2D array into a 3D array, but the syntax looks off to me, or at least I don't understand it. For example, I would have expected 3D reshape syntax to look like this:
np.reshape(rows, columns, dimensions)
Could someone explain what this syntax is and what it is trying to do?
The function numpy.reshape gives a new shape to an array without changing its data. First of all, it needs to know what to reshape, which is the first argument (in your case, X_train).
Then it needs to know the shape of the new array. This argument must be a tuple: for a 2D reshape you pass (W, H), for three dimensions (W, H, D), for four dimensions (W, H, D, T), and so on.
You can also call reshape as a method of the array itself, as in X_train.reshape((W, H, D)). In that case, since reshape is a method of the X_train object, you do not pass the array, only the new shape.
It is also worth mentioning that the total number of elements implied by the new shape must match the original array. For example, your 2D X_train has X_train.shape[0] x X_train.shape[1] elements; this must be equal to W x H x D.
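A small sketch (the array here is illustrative, not from the question) showing both call styles and the element-count constraint described above:
import numpy as np

X_train = np.arange(12).reshape(3, 4)     # 2D array with 3 x 4 = 12 elements

a = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))   # function form
b = X_train.reshape((3, 4, 1))                                     # method form

print(a.shape, b.shape)    # (3, 4, 1) (3, 4, 1): same data, new shape
# 3 * 4 * 1 == 12 matches the original element count;
# X_train.reshape((3, 5, 1)) would raise a ValueError.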

Keras predict getting incorrect shape?

I'm new to Keras and am trying to test out a model I've just trained.
I'm using Tensorflow backend and Python 3.
However, the shape my input has and the shape Keras says it has in an error are completely different. Here's my code:
testnote = np.zeros((3,))
testnote[0] = 70
testnote[1] = 70
print(testnote.shape)
pred = model.predict(testnote)
print(pred)
My consistent output is "(3,)" for the shape of testnote and then an error for my predict line: "ValueError: Error when checking input: expected dense_1_input to have shape (3,) but got array with shape (1,)"
How is it that Keras reads testnote as having shape (1,) when I've just confirmed that the shape is (3,)? Is it using some sort of different standard for what "shape" means? I've tried reshaping and adding brackets and a bunch of other things, but I don't really know what the problem is.
For additional context, the model takes in an array with 3 scalar inputs (representing pitch, velocity, and instrument class) and outputs an array with 1025 scalar outputs. I am carefully not using the word "dimension" since I think this is where I'm getting confused, and technically both are only 1-dimensional. I'm sure there are many problems with my model which I will have to fix after this. However, I'd like to just get this prediction function working so I can understand what my output looks like.
Thanks in advance for any help.
A Keras Model implicitly expects that your data (passed as a np array) has a dimension for the batch size. Currently, your model is interpreting testnote as being 3 examples of shape 1. Try adding the batch dimension to 'testnote' as follows:
testnote = testnote.reshape(1,-1)
This will reshape testnote to shape (1, 3), so that you explicitly define the batch size to be 1.
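A short usage sketch of that fix, assuming the trained model from the question; np.expand_dims(testnote, axis=0) is an equivalent way to add the batch dimension:
import numpy as np

testnote = np.zeros((3,))
testnote[0] = 70
testnote[1] = 70

batched = testnote.reshape(1, -1)   # shape (1, 3): a batch containing one sample
pred = model.predict(batched)       # expected output shape: (1, 1025)
print(pred.shape)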

Call model.fit in Keras for inputs of different shapes?

I created a CNN with Python and Keras which compresses 2D input of various lengths into a single output. All images have a height of 80 pixels but different lengths, e.g. shape (80, length_of_image_i, 2), where 2 is the number of color channels.
I have 5000 images; the shape of the training data array X in numpy is (5000, 1), and the array has dtype object. This is because storing content of different shapes in a single numpy array is not possible. Each object in the array has shape (80, length_of_image_i, 2).
With this said, when I call the model.fit(X,y) function of the sequential model, I get the following error:
ValueError: Error when checking input: expected conv2d_1_input to have 4
dimensions, but got array with shape (5000, 1)
Converting the numpy array to a Python list of numpy arrays also doesn't work:
AttributeError: 'list' object has no attribute 'ndim'
Zero padding or transformations of my data to get all of my images to the same shape is not an option.
My question now is: how can I call the model.fit(X,y) function when my data does not have a fixed shape?
Thank you in advance!
Edit: Note that I do not have a problem with the architecture of my network (since I am not using dense layers). My problem is that I cannot call the fit function, due to problems with the shape of the numpy array.
My model is a replicate of this network: http://machine-listening.eecs.qmul.ac.uk/wp-content/uploads/sites/26/2017/01/sparrow.pdf
You need to pass numpy arrays of dtype float to fit; that is the only possibility.
So, you will probably have to group batches of images with the same length, or train each sample individually:
for image, output in zip(images, outputs):
    model.train_on_batch(image.reshape((1, 80, -1, 2)), output.reshape((1,) + output.shape))
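And a hedged sketch of the first suggestion (grouping images of the same length); model, images, and outputs are assumed to be the trained model and the per-sample arrays and targets from the question:
import numpy as np
from collections import defaultdict

# Bucket the samples by their variable width so each bucket can be stacked
# into one regular float array and trained as a batch.
buckets = defaultdict(list)
for image, output in zip(images, outputs):
    buckets[image.shape[1]].append((image, output))

for width, pairs in buckets.items():
    batch_x = np.stack([img for img, _ in pairs]).astype("float32")   # (n, 80, width, 2)
    batch_y = np.stack([out for _, out in pairs]).astype("float32")
    model.train_on_batch(batch_x, batch_y)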
