This line of code is giving me the wrong dimensions - python

mat = np.squeeze( np.mean(vec[i0-5:i0+5, :], axis=0))
This yields dimensions of (145,) and I need it to be (145, 1). I know this is a simple fix, but I can't figure out where to define it.

Your question is a bit vague on how your variables look. I'm making the following assumptions:
vec = np.random.uniform(size=(145,145))
i0 = 10
To make the array two-dimensional, reshape it:
new_mat = np.reshape(mat, (-1, 1))
Here -1 tells NumPy to infer that dimension from the size of mat.
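For instance, a minimal sketch under those assumptions (the random vec and i0 = 10 above are placeholders):
import numpy as np

vec = np.random.uniform(size=(145, 145))   # assumed input, as above
i0 = 10

mat = np.squeeze(np.mean(vec[i0-5:i0+5, :], axis=0))
print(mat.shape)                    # (145,)

new_mat = np.reshape(mat, (-1, 1))  # -1 is inferred as 145
print(new_mat.shape)                # (145, 1)
mat[:, np.newaxis] would give the same (145, 1) result.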

Related

Python error, "NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 2 dimensions"

I'm kinda new to Python, currently working on a project and getting this error with these lines of code:
g1_coll[obstacle==0]=tau*(g1+g2-g3+g4)
g2_coll[obstacle==0]=tau*(g1+g2+g3-g4)
g3_coll[obstacle==0]=tau*(-g1+g2+g3+g4)
g4_coll[obstacle==0]=tau*(g1-g2+g3+g4)
Can anyone help me understand this?
I assume the error you are getting is because all of your arrays are 2-dimensional. I suggest you try numpy.putmask(matrix, mask, new_matrix_values).
For instance
mask = (obstacle == 0)
numpy.putmask(g1_coll, mask, tau*(g1+g2-g3+g4))
numpy.putmask(g2_coll, mask, tau*(g1+g2+g3-g4))
numpy.putmask(g3_coll, mask, tau*(-g1+g2+g3+g4))
numpy.putmask(g4_coll, mask, tau*(g1-g2+g3+g4))
The following should probably work as well
mask = (obstacle == 0)
new_array = tau*(g1+g2-g3+g4)
g1_coll[mask]= new_array[mask]
Note the trailing [mask] on the right-hand side.
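For a concrete (hypothetical) illustration, here is a tiny sketch with made-up 2x2 arrays standing in for g1..g4, tau and obstacle, showing that both variants produce the same result:
import numpy as np

tau = 0.5
obstacle = np.array([[0, 1],
                     [1, 0]])
g1 = g2 = g3 = g4 = np.ones((2, 2))    # placeholders; the real arrays are not shown in the question
mask = (obstacle == 0)
new_array = tau * (g1 + g2 - g3 + g4)  # 2-D, same shape as the *_coll arrays

# Variant 1: putmask writes new_array's values wherever mask is True
g1_coll = np.zeros((2, 2))
np.putmask(g1_coll, mask, new_array)
print(g1_coll)
# [[1. 0.]
#  [0. 1.]]

# Variant 2: indexing both sides with the mask assigns a 1-D selection
# to a 1-D selection, so the dimension check passes
g1_coll_b = np.zeros((2, 2))
g1_coll_b[mask] = new_array[mask]
print(np.array_equal(g1_coll, g1_coll_b))  # True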
I think the other answers point to the source of the problem (e.g., melqkiades' answer). I tried to reproduce the problem in another way, inspired by this post.
Such a mistake happens if we use np.matrix (which is always 2D; the class is no longer recommended, even for linear algebra, use regular arrays instead, as it may be removed in the future) instead of np.array. I tried to explain this in the following code:
import numpy as np

# Using np.array instead of np.matrix would give a 1D array with shape (4,), and a 1D res would be expected.
g_coll = np.matrix([0.94140464, 0.96913727, 0.43559733, 0.45494222])  # shape (1, 4), so a 2D res may wrongly be expected
# [[0.94140464 0.96913727 0.43559733 0.45494222]]
my_boolean_array = g_coll < 0.5  # shape (1, 4)
# [[False False  True  True]]
# g_coll[my_boolean_array]       # shape (1, 2)
# [[0.43559733 0.45494222]]
# The catch: g_coll[my_boolean_array] is 2D, so one might expect res to be 2D as well,
# but boolean assignment requires a 0- or 1-dimensional input.
res = np.array([[0, 0]])        # shape (1, 2) --> raises the ValueError
g_coll[my_boolean_array] = res  # res must be 1D: np.array([0, 0])
# The correct version would be:
# res = np.array([0, 0])        # 1D, shape (2,)
# g_coll[my_boolean_array] = res
# # [[0.94140464 0.96913727 0.         0.        ]]
The problem is what you are assigning using the mask. Not knowing what's inside g1, g2, g3 and g4, it's quite difficult to understand what you are doing, but probably
tau*(g1+g2-g3+g4)
is a two-dimensional array. Instead you need to assign a single value (or a one-dimensional array). For example, if you change your assignment in this way, it will probably work:
g1_coll[obstacle==0]=(tau*(g1+g2-g3+g4))[0]
g2_coll[obstacle==0]=(tau*(g1+g2+g3-g4))[0]
g3_coll[obstacle==0]=(tau*(-g1+g2+g3+g4))[0]
g4_coll[obstacle==0]=(tau*(g1-g2+g3+g4))[0]
or, if it is not working:
g1_coll[obstacle==0]=(tau*(g1+g2-g3+g4))[0][0]
g2_coll[obstacle==0]=(tau*(g1+g2+g3-g4))[0][0]
g3_coll[obstacle==0]=(tau*(-g1+g2+g3+g4))[0][0]
g4_coll[obstacle==0]=(tau*(g1-g2+g3+g4))[0][0]
But before doing anything you should understand what's inside your input (tau*(g1+g2-g3+g4)).
My guess is that g1, g2, g3, and g4 are probably two-dimensional arrays.
With this example I can reproduce your error:
import numpy as np

my_matrix = np.random.rand(4)
print(my_matrix)
my_boolean_array = my_matrix < 0.5
print(my_boolean_array)
my_matrix[my_boolean_array] = [[0, 0]]  # a two-dimensional array, not a single value: this will not work
print(my_matrix)
Try printing the value you are assigning:
print(tau*(g1+g2-g3+g4))

bootstrap numpy 2D array

I am trying to sample with replacement a base 2D numpy array with shape of (4,2) by rows, say 10 times. The final output should be a 3D numpy array.
I have tried the code below and it works, but is there a way to do it without the for loop?
base=np.array([[20,30],[50,60],[70,80],[10,30]])
print(np.shape(base))
nsample=10
tmp=np.zeros((np.shape(base)[0],np.shape(base)[1],10))
for i in range(nsample):
    id_pick = np.random.choice(np.shape(base)[0], size=(np.shape(base)[0]))
    print(id_pick)
    boot1 = base[id_pick, :]
    tmp[:, :, i] = boot1
print(tmp)
Here's one vectorized approach -
m,n = base.shape
idx = np.random.randint(0,m,(m,nsample))
out = base[idx].swapaxes(1,2)
The basic idea is that we generate all the possible indices with np.random.randint as idx. That would be an array of shape (m, nsample). We use this array to index into the input array along the first axis, so it selects random rows of base. To get the final output with shape (m, n, nsample), we need to swap the last two axes.
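Put together with the question's base and nsample, a quick runnable sketch of this approach:
import numpy as np

base = np.array([[20, 30], [50, 60], [70, 80], [10, 30]])
nsample = 10

m, n = base.shape                            # (4, 2)
idx = np.random.randint(0, m, (m, nsample))  # (4, 10) row indices
out = base[idx].swapaxes(1, 2)               # (4, 10, 2) -> (4, 2, 10)
print(out.shape)                             # (4, 2, 10), same as tmp in the loop version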
You can use the stack function from numpy. Your code would then look like:
base=np.array([[20,30],[50,60],[70,80],[10,30]])
print(np.shape(base))
nsample=10
tmp = []
for i in range(nsample):
    id_pick = np.random.choice(np.shape(base)[0], size=(np.shape(base)[0]))
    print(id_pick)
    boot1 = base[id_pick, :]
    tmp.append(boot1)
tmp = np.stack(tmp, axis=-1)
print(tmp)
Based on @Divakar's answer, if you already know the shape of this 2D array, you can treat it as a flat (8,) 1D array while bootstrapping, and then reshape it:
m, n = base.shape
flatbase = np.reshape(base, (m*n,))
idxs = np.random.choice(m*n, (nsample, m*n))   # m*n == 8 for this base
bootflats = flatbase[idxs]
boots = np.reshape(bootflats, (nsample, m, n))

ValueError: could not broadcast input array from shape (25,1) into shape (25)

When I am trying to run this simple snippet of code
a= 2
G = np.random.rand(25,1)
H = np.zeros((25,a))
for i in range(a):
    H[:, i] = .5 * G
I receive the
ValueError: could not broadcast input array from shape (25,1) into shape (25).
I wonder if anyone can point at a solution to this problem?
I know it happens quite a bit in image processing, but this one I don't know how to circumvent.
Cheers.
To fix this you use the first column of G:
for i in range(a):
    H[:, i] = .5 * G[:, 0]
NumPy broadcasting basically attempts to match the dimensions of arrays by starting with the last dimension and moving to the first. In this case the second dimension of G (of size 1) gets broadcast against 25 (the first and only dimension of H[:, i]), and the first dimension of G is left with nothing to match. You can read more about NumPy broadcasting rules here.
Note: you really don't need that for loop. H is just the column of G repeated twice. You can accomplish that in various ways (e.g. np.tile, np.hstack, etc.):
H = np.tile(G / 2, 2)
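Putting the two together, a quick sketch (the random G is just a placeholder) showing that the loop fix and the np.tile one-liner agree:
import numpy as np

a = 2
G = np.random.rand(25, 1)
H = np.zeros((25, a))
for i in range(a):
    H[:, i] = .5 * G[:, 0]   # G[:, 0] has shape (25,), matching H[:, i]

H_tile = np.tile(G / 2, 2)   # (25, 1) tiled along the last axis -> (25, 2)
print(np.array_equal(H, H_tile))  # True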

Python arrays dimension issues

I am struggling once again with Python, NumPy and arrays to compute some calculations between matrices.
The code part that is likely not working properly is as follows:
train, test, cv = np.array_split(data, 3, axis = 0)
train_inputs = train[:,: -1]
test_inputs = test[:,: -1]
cv_inputs = cv[:,: -1]
train_outputs = train[:, -1]
test_outputs = test[:, -1]
cv_outputs = cv[:, -1]
When printing information about those matrices (np.ndim, np.shape and dtype, respectively), this is what you get:
ndim:  2, 1, 2, 1, 2, 1
shape: (94936, 30), (94936,), (94936, 30), (94936,), (94935, 30), (94935,)
dtype: float64 (all six)
I believe one dimension is missing in all the *_outputs arrays.
The other matrix I need is created by this command:
newMatrix = neuronLayer(30, 94936)
In which neuronLayer is a class defined as:
class neuronLayer():
    def __init__(self, neurons, neuron_inputs):
        self.weights = 2 * np.random.random((neuron_inputs, neurons)) - 1
Here's the final output:
outputLayer1 = self.__sigmoid(np.dot(inputs, self.layer1.weights))
ValueError: shapes (94936,30) and (94936,30) not aligned: 30 (dim 1) != 94936 (dim 0)
Python is clearly telling me the matrices do not line up, but I don't understand where the problem is.
Any tips?
PS: The full code is pasted here.
layer1 = neuronLayer(30, 94936) # 29 neurons with 227908 inputs
layer2 = neuronLayer(1, 30) # 1 Neuron with the previous 29 inputs
where neuronLayer creates
self.weights = 2 * np.random.random((neuron_inputs, neurons)) - 1
so the two weight arrays are (94936, 30) and (30, 1) in size.
This line does not make any sense; I'm surprised it doesn't give an error:
layer1error = layer2delta.dot(self.layer2.weights.np.transpose)
I suspect you want np.transpose(self.layer2.weights) or self.layer2.weights.T.
But maybe it doesn't get there. train first calls think with a (94936, 30) inputs array:
outputLayer1 = self.__sigmoid(np.dot(inputs, self.layer1.weights))
outputLayer2 = self.__sigmoid(np.dot(outputLayer1, self.layer2.weights))
So it tries to do an np.dot with two (94936, 30) arrays. They aren't compatible for a dot. You could transpose one or the other, producing either a (94936, 94936) array or a (30, 30) one. The first looks too big; the (30, 30) result is compatible with the weights for the 2nd layer.
np.dot(inputs.T, self.layer1.weights)
has a chance of working right.
np.dot(outputLayer1, self.layer2.weights)
(30,30) with (30,1) => (30,1)
But then you do
train_outputs - outputLayer2
That will have problems regardless of whether train_outputs is (94936,) or (94936,1)
You need to make sure that array shapes flow correctly through the calculation. Don't just check them at the start; check them internally as well. And make sure you understand what shapes they should have at each step.
It would be a whole lot easier to develop and test this code with much smaller inputs and layers, something like 10 samples and 3 features. That way you can look at the values as well as the shapes.
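For example, a minimal sketch of that kind of down-scaled shape check; the sizes, the sigmoid helper and the (features, neurons) weight layout here are illustrative assumptions, not the poster's actual code:
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

n_samples, n_features, n_hidden = 10, 3, 4             # small, easy-to-inspect sizes

inputs = np.random.random((n_samples, n_features))     # (10, 3)
train_outputs = np.random.random((n_samples, 1))       # (10, 1), kept 2-D on purpose

w1 = 2 * np.random.random((n_features, n_hidden)) - 1  # (3, 4)
w2 = 2 * np.random.random((n_hidden, 1)) - 1           # (4, 1)

out1 = sigmoid(np.dot(inputs, w1))   # (10, 3) @ (3, 4) -> (10, 4)
out2 = sigmoid(np.dot(out1, w2))     # (10, 4) @ (4, 1) -> (10, 1)
error = train_outputs - out2         # (10, 1), shapes line up
print(out1.shape, out2.shape, error.shape)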
np.dot performs matrix multiplication when its arguments are matrices. It looks like your code is trying to multiply together two non-square matrices of the same dimensions, which doesn't work. Perhaps you meant to transpose one of the matrices? NumPy matrices have a T property that returns the transpose; you could try:
self.__sigmoid(np.dot(inputs.T, self.layer1.weights))

numpy broadcast from first dimension

In NumPy, is there an easy way to broadcast two arrays of dimensions e.g. (x,y) and (x,y,z)? NumPy broadcasting typically matches dimensions from the last dimension, so usual broadcasting will not work (it would require the first array to have dimension (y,z)).
Background: I'm working with images, some of which are RGB (shape (h,w,3)) and some of which are grayscale (shape (h,w)). I generate alpha masks of shape (h,w), and I want to apply the mask to the image via mask * im. This doesn't work because of the above-mentioned problem, so I end up having to do e.g.
mask = mask.reshape(mask.shape + (1,) * (len(im.shape) - len(mask.shape)))
which is ugly. Other parts of the code do operations with vectors and matrices, which also run into the same issue: it fails trying to execute m + v where m has shape (x,y) and v has shape (x,). It's possible to use e.g. atleast_3d, but then I have to remember how many dimensions I actually wanted.
How about using transpose:
(a.T + c.T).T
NumPy functions often have blocks of code that check dimensions and reshape arrays into compatible shapes before getting down to the core business of adding or multiplying, and they may reshape the output to match the inputs. So there is nothing wrong with rolling your own functions that do similar manipulations.
Don't dismiss offhand the idea of rotating the variable third dimension to the start of the dimensions. Doing so takes advantage of the fact that NumPy automatically adds dimensions at the start when broadcasting.
For element by element multiplication, einsum is quite powerful.
np.einsum('ij...,ij...->ij...',im,mask)
will handle cases where im and mask are any mix of 2 or 3 dimensions (assuming the first two dimensions are always compatible). Unfortunately this does not generalize to addition or other operations.
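For instance, a small sketch (the shapes are arbitrary) of that einsum call handling both a grayscale and an RGB image with the same (h, w) mask:
import numpy as np

h, w = 4, 5
rgb = np.random.random((h, w, 3))   # colour image
gray = np.random.random((h, w))     # grayscale image
mask = np.random.random((h, w))     # alpha mask, always (h, w)

masked_rgb = np.einsum('ij...,ij...->ij...', rgb, mask)    # (4, 5, 3)
masked_gray = np.einsum('ij...,ij...->ij...', gray, mask)  # (4, 5)
print(np.allclose(masked_rgb, rgb * mask[:, :, None]))     # True
print(np.allclose(masked_gray, gray * mask))               # True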
A while back I simulated einsum with a pure Python version. For that I used np.lib.stride_tricks.as_strided and np.nditer. Look into those functions if you want more power in mixing and matching dimensions.
As another angle: if you encounter this pattern frequently, it may be useful to create a utility function that enforces right-broadcasting:
def right_broadcasting(arr, target):
    return arr.reshape(arr.shape + (1,) * (target.ndim - arr.ndim))
Although if there are only two types of input (already having 3 dims or having only 2), I'd say a single if statement is preferable.
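A hypothetical usage sketch of that helper, assuming the two image layouts from the question:
import numpy as np

def right_broadcasting(arr, target):   # helper from above
    return arr.reshape(arr.shape + (1,) * (target.ndim - arr.ndim))

im_rgb = np.random.random((4, 5, 3))   # colour image
im_gray = np.random.random((4, 5))     # grayscale image
mask = np.ones((4, 5))                 # alpha mask, always 2-D

out_rgb = im_rgb * right_broadcasting(mask, im_rgb)     # mask reshaped to (4, 5, 1)
out_gray = im_gray * right_broadcasting(mask, im_gray)  # mask kept as (4, 5)
print(out_rgb.shape, out_gray.shape)                    # (4, 5, 3) (4, 5)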
Indexing with np.newaxis creates a new axis in that place, i.e.
xyz = ...  # some 3D array
xy = ...   # some 2D array
xyz_sum = xyz + xy[:,:,np.newaxis]
or
xyz_sum = xyz + xy[:,:,None]
Indexing in this way creates an axis with shape 1 and stride 0 in this location.
Why not just decorate-process-undecorate:
def flipflop(func):
    def wrapper(a, mask):
        if len(a.shape) == 3:
            mask = mask[..., None]
        b = func(a, mask)
        return np.squeeze(b)
    return wrapper

@flipflop
def f(x, mask):
    return x * mask
Then
>>> N = 12
>>> gs = np.random.random((N, N))
>>> rgb = np.random.random((N, N, 3))
>>>
>>> mask = np.ones((N, N))
>>>
>>> f(gs, mask).shape
(12, 12)
>>> f(rgb, mask).shape
(12, 12, 3)
Easy, you just add a singleton dimension at the end of the smaller array. For example, if xyz_array has shape (x,y,z) and xy_array has shape (x,y), you can do
xyz_array + np.expand_dims(xy_array, xy_array.ndim)
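A quick sketch with concrete (arbitrary) shapes:
import numpy as np

x, y, z = 4, 5, 3
xyz_array = np.random.random((x, y, z))
xy_array = np.random.random((x, y))

expanded = np.expand_dims(xy_array, xy_array.ndim)  # (4, 5) -> (4, 5, 1)
total = xyz_array + expanded                        # broadcasts to (4, 5, 3)
print(expanded.shape, total.shape)                  # (4, 5, 1) (4, 5, 3)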
