I'm looking at LSTM neural networks. I saw code like this below:
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
This code is meant to turn a 2D array into a 3D array, but the syntax looks off to me, or at least I don't understand it. For example, I would have expected the 3D syntax to look like this:
np.reshape(rows , columns, dimensions)
Could someone explain what the syntax means and what the code is trying to do?
The function numpy.reshape gives a new shape to an array without changing its data. Its first argument is the array to reshape (in your case, X_train).
Its second argument is the new shape, passed as a tuple. For a 2D result you pass (W,H), for 3D you pass (W,H,D), for 4D you pass (W,H,D,T), and so on. That is why the call reads np.reshape(X_train, (rows, columns, 1)) rather than np.reshape(rows, columns, dimensions): the array comes first and the whole shape is a single argument.
You can also call reshape as a method of the array: X_train.reshape((W,H,D)). Because reshape is then a method of the X_train object, you do not pass the array itself, only the new shape.
Note that the total number of elements must stay the same. Your 2D X_train has X_train.shape[0] x X_train.shape[1] elements, so W x H x D must equal that value. Here it does, because the third dimension is 1.
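As a small illustration of the points above, the sketch below reshapes a made-up 2D array into a 3D one in both the function form and the method form. The array contents and sizes are assumptions chosen for the example, not values from the question.

```python
import numpy as np

# Hypothetical data: 4 rows, 3 columns (sizes chosen for illustration).
X_train = np.arange(12).reshape(4, 3)

# Function form: the array comes first, the new shape is a single tuple.
X_3d = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))

# Method form: the array is implicit, so only the shape is passed.
X_3d_method = X_train.reshape((4, 3, 1))

print(X_3d.shape)                  # (4, 3, 1)
print(X_train.size == X_3d.size)   # True: the element count is unchanged

# A shape whose product differs from X_train.size is rejected.
try:
    X_train.reshape((4, 3, 2))
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```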
I have 3 classifiers that run over 288 samples. All of them are sklearn.neural_network.MLPClassifier instances. Here is the code I am using:
list_of_clfs = [MLPClassifier(...), MLPClassifier(...), MLPClassifier(...)]
probas_list = []
for clf in list_of_clfs:
probas_list.append(clf.predict_proba(X_test))
Each predict_proba(X_test) call returns a 2D array with shape (n_samples, n_classes). I then create a 3D array that holds all the predict_proba() results in one place:
proba = np.array(probas_list) #this should return a (3, n_samples, n_classes) array
This should work fine, but I get an error:
ValueError: could not broadcast input array from shape (288,4) into shape (288)
I don't know why, but this works with dummy examples and not with my dataset.
Update: it seems that one of the predict_proba() calls is returning an array of shape (288, 2), but my problem has 4 classes. All classifiers are tested on the same dataset, so I don't know where this comes from.
Thanks in advance
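One way to locate the mismatched classifier is to inspect each probability array before stacking. The sketch below uses hand-made arrays in place of real predict_proba() output (an assumption, so that it runs without a trained model). With fitted scikit-learn classifiers, comparing each clf.classes_ attribute could show whether one model saw fewer classes during training.

```python
import numpy as np

# Stand-ins for predict_proba() results: two arrays with 4 class columns
# and one with only 2, mirroring the shapes reported in the question.
probas_list = [np.full((288, 4), 0.25),
               np.full((288, 4), 0.25),
               np.full((288, 2), 0.5)]

# Report every shape so the odd one out is visible.
shapes = [p.shape for p in probas_list]
print(shapes)

# np.stack requires identical shapes and raises ValueError otherwise.
try:
    proba = np.stack(probas_list)
except ValueError as err:
    proba = None
    print("cannot stack:", err)

# Stacking only the arrays that share the expected shape succeeds.
expected = (288, 4)
consistent = [p for p in probas_list if p.shape == expected]
proba_ok = np.stack(consistent)
print(proba_ok.shape)
```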
I am working on a numpy array X that has multiple features (or columns).
I am trying to standardize the first feature (column) of the dataset:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X[:,0] = sc.fit_transform(X[:,0].reshape(-1,1))
and I get this error:
ValueError: could not broadcast input array from shape (26000,1) into shape (26000)
If I remove reshape(-1,1), I get this error instead:
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
How can I tackle this problem?
Thanks in advance!
I will split this answer into two parts: first, understanding the input to StandardScaler's fit_transform function, and second, understanding its output.
If you follow the StandardScaler documentation on the fit_transform function, it says:
Parameters: X, array-like of shape (n_samples, n_features). Input samples.
Returns: X_new, ndarray array of shape (n_samples, n_features_new).
Understanding the Input:
When you take X[:,0], you get the entire column, but as a flat 1D array rather than as a column. Here's an example on a random 3x2 array:
import numpy as np
X = np.random.random_sample((3,2))*10
print(X)
print(X[:,0])
print(X[:,0].shape)
gives
[[3.5782437 6.12481959]
[9.2333248 8.49628361]
[8.56447626 5.24588392]]
[3.5782437 9.2333248 8.56447626]
(3,)
Here lies our first issue. StandardScaler expects a 2D array in which each row is a sample and each column is a feature, in this case one feature. We therefore need a (3, 1) 2D array, but we have a (3,) 1D array, which causes the error you get without reshaping. To convert the 1D array to the expected shape, we use reshape, which takes the target shape as its argument. We want shape (3, 1), so we call reshape(-1,1): the -1 tells numpy to infer that dimension's length from the number of elements, and the 1 fixes a single column.
To confirm this (the values differ from the printout above, presumably because the random array was regenerated):
print(X[:,0].reshape(-1,1))
gives
[[9.31648164]
[6.74048286]
[7.57667118]]
Now, we are ready to input this to StandardScaler's fit_transform.
Second, the output of fit_transform:
Let's look at the shape of its output:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
print(sc.fit_transform(X[:,0].reshape(-1,1)))
print(sc.fit_transform(X[:,0].reshape(-1,1)).shape)
gives
[[ 1.34073239]
[-1.06001668]
[-0.28071571]]
(3, 1)
The result is a (3, 1) 2D array, whereas X[:,0] is a (3,) 1D array, so the result has to be flattened back to 1D before it can be assigned. There are multiple ways to do this. One is to use reshape again with a single value, -1, which requests a 1D array holding all the values:
temp = sc.fit_transform(X[:,0].reshape(-1,1)).reshape(-1)
print(temp)
print(temp.shape)
gives
[ 1.34073239 -1.06001668 -0.28071571]
(3,)
This means that X[:,0] = sc.fit_transform(X[:,0].reshape(-1,1)).reshape(-1) will work.
Using ravel or flatten works as well.
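The following sketch compares the three flattening options and also shows an alternative that avoids the round trip by keeping the column 2D on both sides of the assignment. To stay free of extra dependencies it uses a hypothetical helper, standardize, as a stand-in for StandardScaler.fit_transform; that helper and the sample values are assumptions of this example, not part of scikit-learn or the question.

```python
import numpy as np

def standardize(col_2d):
    """Hypothetical stand-in for StandardScaler.fit_transform:
    zero mean and unit (population) standard deviation per column."""
    return (col_2d - col_2d.mean(axis=0)) / col_2d.std(axis=0)

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

scaled = standardize(X[:, 0].reshape(-1, 1))   # shape (3, 1)

# All three calls produce the same 1D result of shape (3,).
a = scaled.reshape(-1)
b = scaled.ravel()
c = scaled.flatten()
print(a.shape, b.shape, c.shape)

# Alternative: index with a list so both sides stay 2D, shape (3, 1).
X[:, [0]] = standardize(X[:, [0]])
print(X[:, 0])
```

Indexing with X[:, [0]] (or the slice X[:, 0:1]) keeps the second axis, so no reshape or ravel is needed in that variant.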
sc.fit_transform(X[:,0].reshape(-1,1)) is a 2D array with shape (26000,1), and you cannot assign it to the 1D slice X[:,0], which has shape (26000,).
Try flattening the result when assigning it back:
X[:,0] = sc.fit_transform(X[:,0].reshape(-1,1)).ravel()
I am getting this error while analyzing data with an LSTM model. Any help will be much appreciated :)
The "tuple index out of range" error happens when you try to read the second dimension of X_test.shape (I am guessing that X_test is 1-dimensional).
For a 2D array the shape is a tuple (m, n), but for a 1-dimensional numpy array the shape is (m,). There isn't a second dimension, so X_test.shape[1] fails.
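A minimal reproduction, using a made-up array in place of X_test, shows the failure and one possible fix. Treating each value as a sample with a single feature is an assumption here; the right target shape depends on what the data represent.

```python
import numpy as np

# Hypothetical 1D data standing in for X_test.
X_test = np.array([0.1, 0.2, 0.3, 0.4])
print(X_test.shape)          # (4,): only one entry in the tuple

try:
    n_cols = X_test.shape[1]
except IndexError as err:
    n_cols = None
    print("IndexError:", err)

# Make the array 2D first; shape[1] then exists.
X_test_2d = X_test.reshape(-1, 1)
print(X_test_2d.shape)       # (4, 1)

# The 3D reshape commonly used for LSTM input now works.
X_test_3d = X_test_2d.reshape(X_test_2d.shape[0], X_test_2d.shape[1], 1)
print(X_test_3d.shape)       # (4, 1, 1)
```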
I am learning TensorFlow, and my goal is to implement a multilayer perceptron for my needs. I checked the MNIST tutorial with a multilayer perceptron implementation, and everything was clear to me except this:
_, c = sess.run([optimizer, cost], feed_dict={x: batch_x,
y: batch_y})
I guess x is the image itself (28*28 pixels, so the input is 784 neurons) and y is the label, which is a 1x10 array:
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
They feed whole batches (which are packs of data points and labels)! How does tensorflow interpret this "batch" input? And how does it update the weights: simultaneously after each element in a batch, or after running through the whole batch?
And, if I need to input one number (input_shape = [1,1]) and output four numbers (output_shape = [1,4]), how should I change the tf.placeholders and in which form should I feed them into session?
When I ask how tensorflow interprets it, I want to know how tensorflow splits the batch into single elements. For example, a batch is a 2-D array, right? In which direction does it split the array? Or does it use matrix operations and not split anything?
When I ask how I should feed my data, I want to know whether it should be a 2-D array with samples in its rows and features in its columns, or whether it could be a 2-D list.
When I feed my float numpy array X_train to x, which is:
x = tf.placeholder("float", [1, n_input])
I receive an error:
ValueError: Cannot feed value of shape (1, 18) for Tensor 'Placeholder_10:0', which has shape '(1, 1)'
It appears that I have to create my data as a Tensor too?
When I tried [18x1]:
Cannot feed value of shape (18, 1) for Tensor 'Placeholder_12:0', which has shape '(1, 1)'
They feed whole batches (which are packs of data points and labels)!
Yes, this is how neural networks are usually trained. Mini-batches have some nice mathematical properties that give the best of both worlds: a better gradient approximation than single-sample SGD on one hand, and much faster convergence than full-batch gradient descent on the other.
How does tensorflow interpret this "batch" input?
It "interprets" it according to the operations in your graph. You probably have a tf.reduce_mean somewhere in your graph, which averages over the batch, and that is what gives the batch its "interpretation".
And how does it update the weights: 1. simultaneously after each element in a batch? 2. After running through the whole batch?
As in the previous answer, there is nothing "magical" about a batch. It is just another dimension, and each internal operation of the neural net is well defined for a batch of data, so there is still a single update at the end. Since you use a reduce-mean operation (or maybe a reduce-sum?), you are updating according to the mean of the "small" per-sample gradients (or their sum, if the graph uses a reduce-sum instead). Again, you can control this, but only the way the gradients are aggregated: you cannot force a per-sample update unless you introduce a while loop into the graph.
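To make the "single update per batch" point concrete, here is a NumPy-only sketch for a linear model with a mean-squared-error loss. The model, data, and variable names are assumptions for illustration and do not come from the MNIST tutorial; the point is only that the gradient of a mean loss equals the mean of the per-sample gradients.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))      # batch of 8 samples, 3 features
y = rng.normal(size=(8, 1))      # one target per sample
W = rng.normal(size=(3, 1))      # weights of a linear model

# Gradient of mean((X @ W - y)**2) with respect to W, for the whole batch.
residual = X @ W - y                         # shape (8, 1)
grad_batch = 2.0 * X.T @ residual / len(X)   # shape (3, 1)

# The same quantity built from one gradient per sample, then averaged.
per_sample = [2.0 * X[i:i + 1].T @ residual[i:i + 1] for i in range(len(X))]
grad_mean = np.mean(per_sample, axis=0)

print(np.allclose(grad_batch, grad_mean))    # True

# One weight update for the whole batch (the learning rate is arbitrary).
W_new = W - 0.1 * grad_batch
```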
And, if I need to input one number (input_shape = [1,1]) and output four numbers (output_shape = [1,4]), how should I change the tf.placeholders and in which form should I feed them into the session? THANKS!!
Just set n_input=1 and n_classes=4 and feed your data as before, as [batch, n_input] and [batch, n_classes] arrays. In your case batch=1 if by "1x1" you mean "one sample of dimension 1", although your edit suggests that you actually do have a batch and that by 1x1 you meant a 1D input.
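The shape errors quoted in the question come from a mismatch between the fed array and the placeholder's declared shape. The sketch below mimics that check in plain NumPy with a hypothetical helper, is_compatible, where None plays the role of the placeholder's free batch dimension. It illustrates the rule and is not TensorFlow's actual implementation. The count of 18 is taken from the error message; reading it as 18 samples of one number each is an assumption.

```python
import numpy as np

def is_compatible(value_shape, placeholder_shape):
    """Hypothetical check: shapes match when ranks are equal and every
    non-None placeholder dimension equals the value's dimension."""
    if len(value_shape) != len(placeholder_shape):
        return False
    return all(p is None or p == v
               for v, p in zip(value_shape, placeholder_shape))

n_input, n_classes = 1, 4

# 18 samples of one number each, arranged as rows: shape (18, 1).
X_train = np.arange(18, dtype=np.float32).reshape(-1, n_input)
# Matching dummy targets: shape (18, 4).
Y_train = np.zeros((18, n_classes), dtype=np.float32)

print(is_compatible(X_train.shape, [None, n_input]))    # True
print(is_compatible(Y_train.shape, [None, n_classes]))  # True
print(is_compatible((1, 18), [1, 1]))                   # False, as in the error
print(is_compatible((18, 1), [1, 1]))                   # False, batch fixed at 1
```

Declaring the first placeholder dimension as None, as in the tutorial's [None, n_input], is what lets the same graph accept any batch size.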
EDIT: 1. When I ask how tensorflow interprets it, I want to know how tensorflow splits the batch into single elements. For example, a batch is a 2-D array, right? In which direction does it split the array? Or does it use matrix operations and not split anything? 2. When I ask how I should feed my data, I want to know whether it should be a 2-D array with samples in its rows and features in its columns, or whether it could be a 2-D list.
It does not split anything. It is just a matrix, and each operation is perfectly well defined for matrices as well. Usually you put examples in rows, that is, in the first dimension, and this is exactly what [batch, n_inputs] says: you have batch rows, each with n_inputs columns. But again, there is nothing special about it, and you could also create a graph that accepts column-wise batches if you really needed to.
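A NumPy-only sketch of a single dense layer shows why no splitting is needed: one matrix product handles every row of the batch at once and gives the same numbers as processing the rows one by one. The layer sizes and random values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
batch, n_input, n_classes = 5, 784, 10

X = rng.normal(size=(batch, n_input))        # one example per row
W = rng.normal(size=(n_input, n_classes))
b = rng.normal(size=(n_classes,))

# Whole batch in one operation: (5, 784) @ (784, 10) -> (5, 10).
logits_batch = X @ W + b

# Same computation done row by row, then stacked back together.
logits_rows = np.stack([X[i] @ W + b for i in range(batch)])

print(logits_batch.shape)                        # (5, 10)
print(np.allclose(logits_batch, logits_rows))    # True

# A column-wise layout works too if the computation is written for it.
logits_cols = W.T @ X.T + b[:, None]             # shape (10, 5)
print(np.allclose(logits_cols.T, logits_batch))  # True
```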
I have data in an ndarray that contains features and targets, which have different dimensions. This seems to give tensorflow problems.
If I have a function:
def cost(self, data):
    return self.sess.run(self.cost_function, feed_dict={self.input: data[:,0], self.targets: data[:,1]})
This results in ValueError: setting an array element with a sequence.
This seems to happen because feed_dict is not recognizing my inputs as numpy arrays. I think this is because features and targets have different dimensions and ndarray has problems with that: if I have data with 100 pairs, its shape is (100,2). If I then slice it, data[:,0].shape=(100,) and data[:,1].shape=(100,), so the length of the feature/target vectors is not recognized even after slicing.
I've got around the problem by splitting data up beforehand into feats and targs, whose shapes are returned correctly.
My question is, is this normal - is this supposed to be like this? Or am I just doing something wrong? It would be nicer to just work with data instead of passing two variables around all the time.
edit:
self.input = tf.placeholder("float",shape=[None,39])
self.targets = tf.placeholder("float",shape=[None,949])
The dimensions of data should be self-explanatory.
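One possible way to keep a single data array while still feeding well-shaped inputs is to stack each object column into a regular 2D array just before feeding. The sketch below builds a small made-up dataset (5 pairs instead of 100, an assumption to keep it short) with the feature and target lengths from the placeholders above.

```python
import numpy as np

n_pairs, n_feat, n_targ = 5, 39, 949

# An object array holding (features, targets) pairs of different lengths.
data = np.empty((n_pairs, 2), dtype=object)
for i in range(n_pairs):
    data[i, 0] = np.zeros(n_feat, dtype=np.float32)
    data[i, 1] = np.ones(n_targ, dtype=np.float32)

# Slicing gives 1D object arrays; the inner lengths are invisible here.
print(data[:, 0].shape, data[:, 0].dtype)

# np.stack turns each column into a regular numeric 2D array.
feats = np.stack(list(data[:, 0]))   # shape (5, 39)
targs = np.stack(list(data[:, 1]))   # shape (5, 949)
print(feats.shape, targs.shape)
```

Arrays shaped like feats and targs match the [None, 39] and [None, 949] placeholders, so they could be passed as feed_dict={self.input: feats, self.targets: targs}.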