I am struggling once again with Python, NumPy and arrays while trying to do some calculations between matrices.
The part of the code that is likely not working properly is as follows:
train, test, cv = np.array_split(data, 3, axis = 0)
train_inputs = train[:,: -1]
test_inputs = test[:,: -1]
cv_inputs = cv[:,: -1]
train_outputs = train[:, -1]
test_outputs = test[:, -1]
cv_outputs = cv[:, -1]
When I print information about those matrices (np.ndim, np.shape and dtype respectively), this is what I get:
2
1
2
1
2
1
(94936, 30)
(94936,)
(94936, 30)
(94936,)
(94935, 30)
(94935,)
float64
float64
float64
float64
float64
float64
I believe one dimension is missing in all of the *_outputs arrays.
The other matrix I need is created by this command:
newMatrix = neuronLayer(30, 94936)
In which neuronLayer is a class defined as:
class neuronLayer():
    def __init__(self, neurons, neuron_inputs):
        self.weights = 2 * np.random.random((neuron_inputs, neurons)) - 1
Here's the final output:
outputLayer1 = self.__sigmoid(np.dot(inputs, self.layer1.weights))
ValueError: shapes (94936,30) and (94936,30) not aligned: 30 (dim 1) != 94936 (dim 0)
Python is clearly telling me the matrix shapes do not line up, but I do not understand where the problem is.
Any tips?
PS: The full code is pasted here.
layer1 = neuronLayer(30, 94936) # 30 neurons with 94936 inputs each
layer2 = neuronLayer(1, 30) # 1 neuron with the previous 30 outputs as inputs
where neuronLayer creates
self.weights = 2 * np.random.random((neuron_inputs, neurons)) - 1
The two weight arrays are (94936,30) and (30,1) in size.
This line does not make any sense. I'm surprised it doesn't give an error:
layer1error = layer2delta.dot(self.layer2.weights.np.transpose)
I suspect you want np.transpose(self.layer2.weights) or self.layer2.weights.T.
But maybe it doesn't get that far. train first calls think with a (94936,30) inputs array:
outputLayer1 = self.__sigmoid(np.dot(inputs, self.layer1.weights))
outputLayer2 = self.__sigmoid(np.dot(outputLayer1, self.layer2.weights))
So it tries to do an np.dot with two (94936,30) arrays. They aren't compatible for a dot product. You could transpose one or the other, producing either a (94936,94936) array or a (30,30) array. The first looks too big. The (30,30) result is compatible with the weights for the 2nd layer.
np.dot(inputs.T, self.layer1.weights)
has a chance of working right.
np.dot(outputLayer1, self.layer2.weights)
(30,30) with (30,1) => (30,1)
But then you do
train_outputs - outputLayer2
That will have problems regardless of whether train_outputs is (94936,) or (94936,1), because outputLayer2 is (30,1).
You need to make sure that array shapes flow correctly through the calculation. Don't just check them at the start. Check them internally as well. And make sure you understand what shapes they should have at each step.
It would be a whole lot easier to develop and test this code with much smaller inputs and layers, something like 10 samples and 3 features. That way you can look at the values as well as the shapes.
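As a rough sketch of that advice, here is a tiny two-layer forward pass with a shape check after every step. The sizes (10 samples, 3 features, 4 hidden neurons), the variable names and the weight layout of (inputs per neuron, neurons) are assumptions made for illustration, not the asker's actual network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_samples, n_features, n_hidden = 10, 3, 4      # hypothetical small sizes

inputs = rng.random((n_samples, n_features))     # (10, 3)
targets = rng.random(n_samples).reshape(-1, 1)   # (10,) reshaped to (10, 1)

# Assumption: each weight array is shaped (inputs per neuron, neurons),
# so the first axis matches the number of features, not the number of samples.
w1 = 2 * rng.random((n_features, n_hidden)) - 1  # (3, 4)
w2 = 2 * rng.random((n_hidden, 1)) - 1           # (4, 1)

out1 = sigmoid(inputs @ w1)
assert out1.shape == (n_samples, n_hidden)       # (10, 4)

out2 = sigmoid(out1 @ w2)
assert out2.shape == (n_samples, 1)              # (10, 1)

error = targets - out2
assert error.shape == (n_samples, 1)             # (10, 1)

# Without the reshape, (10,) - (10, 1) silently broadcasts to (10, 10).
wrong = targets.ravel() - out2
print(error.shape, wrong.shape)
```

With arrays this small you can print the values as well as the shapes, and a silent broadcast such as the last line becomes easy to spot.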
np.dot performs matrix multiplication when its arguments are 2-D arrays. It looks like your code is trying to multiply two non-square matrices with the same dimensions, which doesn't work. Perhaps you meant to transpose one of the matrices? NumPy arrays have a T attribute that returns the transpose, so you could try:
self.__sigmoid(np.dot(inputs.T, self.layer1.weights))
mat = np.squeeze( np.mean(vec[i0-5:i0+5, :], axis=0))
This gives me a shape of (145,) and I need it to be (145,1). I know this is a simple fix, but I can't figure out where to define it.
Your question is a bit vague on how your variables look. I'm making the following assumptions:
vec = np.random.uniform(size=(145,145))
i0 = 10
To make the array 2-dimensional, reshape it:
new_mat = np.reshape(mat, (-1, 1))
Here -1 infers the dimension from your mat array.
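A minimal sketch of this, using the same assumed vec and i0 as above. The np.newaxis variant is an additional, equivalent option that is not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
vec = rng.uniform(size=(145, 145))   # assumed input, as in the answer
i0 = 10

mat = np.mean(vec[i0 - 5:i0 + 5, :], axis=0)   # 1-D result, shape (145,)

col_a = np.reshape(mat, (-1, 1))   # -1 lets NumPy infer the 145
col_b = mat[:, np.newaxis]         # equivalent: add a trailing axis

print(mat.shape, col_a.shape, col_b.shape)   # (145,) (145, 1) (145, 1)
```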
In the neural network code that I've written, I cannot get a result because of an alignment problem.
I wrote a neural network (based on someone else's code). I tried to build the input and output in the right way. Although I believe I defined the class and operations correctly, I get this error: shapes (127,3) and (1,4) not aligned: 3 (dim 1) != 1 (dim 0)
Datafile = pd.read_excel(r"C:\\Users\Hasan\Desktop\ANN\x.xlsx") is 127x3
Target = pd.read_excel(r"C:\\Users\Hasan\Desktop\ANN\y.xlsx") is 127x1
class Neural_Network(object):
def __init__(self):
self.inputlayer = 1
self.w1 = np.random.randn(self.inputlayer, self.hiddenlayer)
self.z = np.dot(Datafile, self.w1)
I think it's because of the dimensions of the two matrices, but even when I changed the dimensions it did not work.
All help will be appreciated.
For matrix multiplication (dot product), the number of columns of the first matrix must equal the number of rows of the second matrix.
In your case, Datafile has 3 columns while w1 has 1 row, which is why you get an error about incorrect dimensions.
As an example, assume random matrices:
Datafile = np.random.rand(127, 3)
w1 = np.random.rand(3, 127)
z = np.dot(Datafile, w1)
print(z.shape)
Output: (127, 127)
In this example, Datafile has 3 columns and w1 has 3 rows so in this case, dot-product will be successful.
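Applied to the network in the question, the same rule suggests sizing the first axis of w1 from the number of input columns. The sketch below is only an illustration: the random Datafile stands in for the 127x3 spreadsheet, and hidden_size = 4 is inferred from the (1,4) in the error message rather than from the asker's code.

```python
import numpy as np

rng = np.random.default_rng(0)
Datafile = rng.random((127, 3))     # stand-in for the 127x3 spreadsheet

input_size = Datafile.shape[1]      # 3 columns, so w1 needs 3 rows
hidden_size = 4                     # assumed from the (1, 4) in the error message

w1 = rng.standard_normal((input_size, hidden_size))   # (3, 4)
z = np.dot(Datafile, w1)

print(z.shape)   # (127, 4): one row per sample, one column per hidden unit
```

Reading the input size from the data instead of hard-coding it keeps the weight shape in step with the spreadsheet if a column is added later.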
Hello I'm new with TensorFlow and I'd like to concatenate a 2D tensor to a 3D one. I don't know how to do it by exploiting TensorFlow functions.
tensor_3d = [[[1,2], [3,4]], [[5,6], [7,8]]] # shape (2, 2, 2)
tensor_2d = [[10,11], [12,13]] # shape (2, 2)
out: [[[1,2,10,11], [3,4,10,11]], [[5,6,12,13], [7,8,12,13]]] # shape (2, 2, 4)
I would make it work by using loops and new numpy arrays, but in that way I wouldn't use TensorFlow transformations. Any suggestions on how to make this possible? I don't see how transformations like: tf.expand_dims or tf.reshape may help here...
Thanks for sharing your knowledge.
This should do the trick:
import tensorflow as tf
a = tf.constant([[[1,2], [3,4]], [[5,6], [7,8]]])
b = tf.constant([[10,11], [12,13]])
c = tf.expand_dims(b, axis=1) # Add dimension
d = tf.tile(c, multiples=[1,2,1]) # Duplicate in this dimension
e = tf.concat([a,d], axis=-1) # Concatenate on innermost dimension
with tf.Session() as sess:
    print(e.eval())
Gives:
[[[ 1 2 10 11]
[ 3 4 10 11]]
[[ 5 6 12 13]
[ 7 8 12 13]]]
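To check the shapes at each step without a session, the same three steps can be mirrored in NumPy. This is a sketch for verification only, not a replacement for the TensorFlow version.

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])   # shape (2, 2, 2)
b = np.array([[10, 11], [12, 13]])                   # shape (2, 2)

c = np.expand_dims(b, axis=1)          # add a middle axis: (2, 1, 2)
d = np.tile(c, (1, a.shape[1], 1))     # repeat along that axis: (2, 2, 2)
e = np.concatenate([a, d], axis=-1)    # join on the innermost axis: (2, 2, 4)

print(c.shape, d.shape, e.shape)
print(e)
```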
There is actually a different trick, that is used from time to time in code bases such as OpenAI's baselines.
Suppose you have two tensors for your Gaussian policy, mu and std. The standard deviation has the same shape as mu for batch size 1, but because you use the same parameterized standard deviation for all actions, the two shapes differ when the batch size is larger than 1:
mu : Size<batch_size, feat_n>
std: Size<1, feat_n>
In this case a simple thing to do (as the OpenAI baselines code does) is:
params = tf.concat([mu, mu * 0 + std], axis=1)
Multiplying mu by zero and adding std broadcasts std to the same shape as mu.
Enjoy, and good luck training!
PS: NumPy's and TensorFlow's concat operators do not automatically apply broadcasting because, according to the maintainers, a shape mismatch between two tensors is usually the result of a programming error. This is not a big deal in NumPy because computations are evaluated eagerly. With TensorFlow, however, it means you have to explicitly broadcast the lower-rank tensor (or the one that has shape [1, *_]) by hand using the tf.shape operator.
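The idea can be sketched in NumPy, where it is easy to inspect. The sizes and values below are made up for illustration, and np.broadcast_to is used here as an explicit alternative to the multiply-by-zero trick.

```python
import numpy as np

batch_size, feat_n = 4, 3                      # hypothetical sizes
mu = np.arange(batch_size * feat_n, dtype=float).reshape(batch_size, feat_n)
std = np.full((1, feat_n), 0.5)                # one shared row, shape (1, 3)

# Concatenation does not broadcast: (4, 3) and (1, 3) cannot be joined on axis 1.
try:
    np.concatenate([mu, std], axis=1)
    raised = False
except ValueError:
    raised = True

std_trick = mu * 0 + std                       # arithmetic broadcasts to (4, 3)
std_explicit = np.broadcast_to(std, mu.shape)  # same values, stated explicitly

params = np.concatenate([mu, std_explicit], axis=1)
print(raised, params.shape)
```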
When I try to run this simple snippet of code
a= 2
G = np.random.rand(25,1)
H = np.zeros((25,a))
for i in range(a):
    H[:,i] = .5 * G
I receive this error:
ValueError: could not broadcast input array from shape (25,1) into shape (25).
Can anyone point me to a solution to this problem?
I know it happens quite a bit in image processing, but I don't know how to get around this one.
Cheers.
To fix this you use the first column of G:
for i in range(a):
    H[:,i] = .5 * G[:, 0]
NumPy broadcasting matches array dimensions starting with the last dimension and moving toward the first. In this case the second dimension of G (1) gets broadcast to 25 (the first and only dimension of H[:, i]). The first dimension of G (25) then has nothing left to match against. You can read more about this in the NumPy broadcasting rules.
Note: you really don't need that for loop. H is just G column repeated twice. You can accomplish that in various ways (e.g. np.tile, np.hstack, etc.)
H = np.tile(G / 2, 2)
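A short sketch comparing the loop fix with two loop-free versions. The broadcast assignment H[:] = ... is an extra option not mentioned above; it works because a (25,1) array broadcasts across a (25,a) target.

```python
import numpy as np

a = 2
rng = np.random.default_rng(0)
G = rng.random((25, 1))

# Loop version: take the single column of G as a 1-D array of shape (25,).
H_loop = np.zeros((25, a))
for i in range(a):
    H_loop[:, i] = 0.5 * G[:, 0]

# Tile version: repeat the (25, 1) column a times along the last axis.
H_tile = np.tile(G / 2, a)

# Broadcast version: (25, 1) stretches across the (25, a) target.
H_bcast = np.zeros((25, a))
H_bcast[:] = 0.5 * G

print(H_loop.shape, H_tile.shape, H_bcast.shape)
```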
I have a tensor probs with probs.shape = (max_time, num_batches, num_labels).
And I have a tensor targets with targets.shape = (max_seq_len, num_batches) where the values are label indices, i.e. for the third dimension in probs.
Now I want to get a tensor probs_y with probs_y.shape = (max_time, num_batches, max_seq_len) where the third dimension is the index in targets. Basically
probs_y[:,i,:] = probs[:,i,targets[:,i]]
for all 0 <= i < num_batches.
How can I achieve this?
A similar problem with solution was posted here.
The solution there, if I understand correctly, would be:
probs_y = probs[:,T.arange(targets.shape[1])[None,:],targets]
But that doesn't seem to work. I get:
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices.
Also, isn't the creation of the temporary T.arange a bit costly? Especially when I try to work around the error by really making it a full dense integer array. There should be a better way.
Maybe theano.map? But as far as I understand, that doesn't parallelize the code, so this is also not a solution.
This works for me:
import numpy as np
import theano
import theano.tensor as T
max_time, num_batches, num_labels = 3, 4, 6
max_seq_len = 5
probs_ = np.arange(max_time * num_batches * num_labels).reshape(
    max_time, num_batches, num_labels)
targets_ = np.arange(num_batches * max_seq_len).reshape(
    max_seq_len, num_batches) % (num_batches - 1)  # mix stuff up
probs, targets = map(theano.shared, (probs_, targets_))
print(probs_)
print(targets_)
probs_y = probs[:, T.arange(targets.shape[1])[:, np.newaxis], targets.T]
print(probs_y.eval())
The code above uses a transposed version of your indices. Your exact proposal also works:
probs_y2 = probs[:, T.arange(targets.shape[1])[np.newaxis, :], targets]
print(probs_y2.eval())
print((probs_y2.dimshuffle(0, 2, 1) - probs_y).eval())
So maybe your problem is somewhere else.
As for speed, I am at a loss as to what could be faster than this. map, which is a specialization of scan, almost certainly is not. I do not know to what extent the arange is actually built rather than simply iterated over.
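The indexing expression can be checked against the per-batch definition from the question using plain NumPy, which follows the same advanced-indexing rules. This is a sketch on the toy sizes used above, not a statement about Theano's performance.

```python
import numpy as np

max_time, num_batches, num_labels = 3, 4, 6
max_seq_len = 5
probs = np.arange(max_time * num_batches * num_labels).reshape(
    max_time, num_batches, num_labels)
targets = np.arange(num_batches * max_seq_len).reshape(
    max_seq_len, num_batches) % (num_batches - 1)

# The (4, 1) batch index and the (4, 5) transposed targets broadcast together,
# so result[t, i, j] == probs[t, i, targets[j, i]].
batch_idx = np.arange(num_batches)[:, np.newaxis]
probs_y = probs[:, batch_idx, targets.T]

# Reference: the loop form given in the question.
expected = np.empty((max_time, num_batches, max_seq_len), dtype=probs.dtype)
for i in range(num_batches):
    expected[:, i, :] = probs[:, i, targets[:, i]]

print(probs_y.shape, np.array_equal(probs_y, expected))
```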