My numpy array always ends in zero? - python

I think I missed something somewhere. I filled a numpy array using two for loops (x and y) and a function based on the x,y position. The only problem is that the last row and column of the array are always zero, regardless of the size of the array.
thetamap = numpy.zeros(36, dtype=float)
thetamap.shape = (6, 6)
for y in range(0,5):
    for x in range(0,5):
        thetamap[x][y] = x+y
print(thetamap)

range(0, 5) produces 0, 1, 2, 3, 4. The endpoint is always omitted. You want simply range(6).
Better yet, use the awesome power of NumPy to make the array in one line:
thetamap = np.arange(6) + np.arange(6)[:,None]
This makes a row vector and a column vector, then adds them together using NumPy broadcasting to make a matrix.
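As a quick check of the broadcasting one-liner, the sketch below builds the same 6x6 array both ways and compares them. The names row, col and looped are illustrative and not from the original post.

```python
import numpy as np

n = 6
row = np.arange(n)            # shape (6,)
col = np.arange(n)[:, None]   # shape (6, 1)
thetamap = row + col          # broadcasts to shape (6, 6)

# Loop version for comparison; range(n) visits 0..n-1, so no index is skipped.
looped = np.zeros((n, n))
for y in range(n):
    for x in range(n):
        looped[x][y] = x + y

print(np.array_equal(thetamap, looped))  # True
print(thetamap[-1, -1])                  # 10
```

The last entry is 5 + 5 = 10 rather than 0, which confirms that the final row and column are now filled.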

Related

Alternative to loop for boolean / nonzero indexing of numpy array

I need to select only the non-zero 3d portions of a 3d binary array (or alternatively the true values of a boolean array). Currently I am able to do so with a series of 'for' loops that use np.any. This works but seems awkward and slow, so I am investigating a more direct way to accomplish the task.
I am rather new to numpy, so the approaches that I have tried include a) using np.nonzero, which returns indices that I am at a loss to know what to do with for my purposes, b) boolean array indexing, and c) boolean masks. I can generally understand each of those approaches for simple 2d arrays, but am struggling to understand the differences between the approaches, and cannot get them to return the right values for a 3d array.
Here is my current function that returns a 3D array with nonzero values:
def real_size(arr3):
    true_0 = []
    true_1 = []
    true_2 = []
    print(f'The input array shape is: {arr3.shape}')
    for zero_ in range(0, arr3.shape[0]):
        if arr3[zero_].any()==True:
            true_0.append(zero_)
    for one_ in range(0, arr3.shape[1]):
        if arr3[:,one_,:].any()==True:
            true_1.append(one_)
    for two_ in range(0, arr3.shape[2]):
        if arr3[:,:,two_].any()==True:
            true_2.append(two_)
    arr4 = arr3[min(true_0):max(true_0) + 1, min(true_1):max(true_1) + 1, min(true_2):max(true_2) + 1]
    print(f'The nonzero area is: {arr4.shape}')
    return arr4
# Then use it on a small test array:
test_array = np.zeros([2, 3, 4], dtype = int)
test_array[0:2, 0:2, 0:2] = 1
#The function call works and prints out as expected:
non_zero = real_size(test_array)
>> The input array shape is: (2, 3, 4)
>> The nonzero area is: (2, 2, 2)
# So, the array is correct, but likely not the best way to get there:
non_zero
>> array([[[1, 1],
           [1, 1]],
          [[1, 1],
           [1, 1]]])
The code works appropriately, but I am using this on much larger and more complex arrays, and don't think this is an appropriate approach. Any thoughts on a more direct method to make this work would be greatly appreciated. I am also concerned about errors and the results if the input array has for example two separate non-zero 3d areas within the original array.
To clarify the problem, I need to return one or more 3D portions as one or more 3d arrays beginning with an original larger array. The returned arrays should not include extraneous zeros (or false values) in any given exterior plane in three dimensional space. Just getting the indices of the nonzero values (or vice versa) doesn't by itself solve the problem.
Assuming you want to eliminate all rows, columns, etc. that contain only zeros, you could do the following:
nz = (test_array != 0)
non_zero = test_array[nz.any(axis=(1, 2))][:, nz.any(axis=(0, 2))][:, :, nz.any(axis=(0, 1))]
An alternative solution using np.nonzero:
i = [np.unique(_) for _ in np.nonzero(test_array)]
non_zero = test_array[i[0]][:, i[1]][:, :, i[2]]
This can also be generalized to arbitrary dimensions, but requires a bit more work (only showing the first approach here):
def real_size(arr):
    nz = (arr != 0)
    result = arr
    axes = np.arange(arr.ndim)
    for axis in range(arr.ndim):
        zeros = nz.any(axis=tuple(np.delete(axes, axis)))
        result = result[(slice(None),)*axis + (zeros,)]
    return result
non_zero = real_size(test_array)
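The masking approaches above drop every all-zero plane, including planes that lie between two separate nonzero regions. The loop-based function in the question instead keeps the full bounding box. A minimal sketch of that bounding-box behaviour without explicit loops, using np.argwhere, might look like this. The function name bounding_box and the array gap are hypothetical, and returning an empty array for an all-zero input is an assumption about the desired behaviour.

```python
import numpy as np

def bounding_box(arr):
    """Crop arr to the smallest box that contains all nonzero entries."""
    coords = np.argwhere(arr != 0)      # one row of indices per nonzero element
    if coords.size == 0:                # assumption: all-zero input gives an empty array
        return arr[(slice(0, 0),) * arr.ndim]
    lo = coords.min(axis=0)             # first nonzero index along each axis
    hi = coords.max(axis=0) + 1         # one past the last nonzero index
    return arr[tuple(slice(a, b) for a, b in zip(lo, hi))]

test_array = np.zeros((2, 3, 4), dtype=int)
test_array[0:2, 0:2, 0:2] = 1
print(bounding_box(test_array).shape)   # (2, 2, 2)

# Two separate nonzero regions: the box spans both, including the gap.
gap = np.zeros((5, 1, 1), dtype=int)
gap[0] = 1
gap[4] = 1
print(bounding_box(gap).shape)          # (5, 1, 1)
```

Neither this sketch nor the masking approach returns separate arrays for separate regions. That would need a connected-component labelling step, which is outside plain NumPy.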

Append numpy one dimensional arrays does not lead to a matrix

I am trying to get a 2d array by randomly generating its rows and appending them:
import numpy as np
my_nums = np.array([])
for i in range(100):
    x = np.random.rand(2, 1)
    my_nums = np.append(my_nums, np.array(x))
But instead of the 2d array I want, I get a 1d array.
What is wrong?
Transposing x did not help either.
You could do this by using np.append with axis=0, or np.vstack. This however requires the appended rows to have the same length as the rows already in the array.
You cannot use the same code both to append a row with two values to an empty array and to append a row to an existing 2D array: numpy will throw a ValueError: all the input arrays must have same number of dimensions.
You could initialize my_nums to work around this:
my_nums = np.random.rand(1, 2)
for i in range(99):
    x = np.random.rand(1, 2)
    my_nums = np.append(my_nums, x, axis=0)
Note the decrease in the range by one due to the initialization row. Also note that I changed the dimensions to (1, 2) to get actual row vectors.
Of course, it is much easier to create the array in its final shape directly than to append row by row:
my_nums = np.random.rand(100, 2)
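If the rows really do have to be produced one at a time, a common pattern is to collect them in a Python list and stack once at the end with np.vstack, which the answer mentions. This is a minimal sketch. The list name rows is illustrative.

```python
import numpy as np

rows = []
for i in range(100):
    x = np.random.rand(2)   # one row with two values
    rows.append(x)          # list append does not copy the rows collected so far

my_nums = np.vstack(rows)   # stack all rows once at the end
print(my_nums.shape)        # (100, 2)
```

This avoids the special-case first row, and it avoids np.append copying the whole array on every iteration.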

Reshaping array of matrices in Python

I have a Numpy array X of n 2x2 matrices, arranged so that X.shape = (2,2,n), that is, to get the first matrix I call X[:,:,0]. I would like to reshape X into an array Y such that I can get the first matrix by calling Y[0] etc., but performing X.reshape(n,2,2) messes up the matrices. How can I get it to preserve the matrices while reshaping the array?
I am essentially trying to do this:
import numpy as np
Y = np.zeros([n,2,2])
for i in range(n):
    Y[i] = X[:,:,i]
but without using the for loop. How can I do this with reshape or a similar function?
(To get an example array X, try X = np.concatenate([np.identity(2)[:,:,None]] * n, axis=2) for some n.)
numpy.moveaxis can be used to take a view of an array with one axis moved to a different position in the shape:
numpy.moveaxis(X, 2, 0)
numpy.moveaxis(a, source, destination) takes a view of array a where the axis originally at position source ends up at position destination, so numpy.moveaxis(X, 2, 0) makes the original axis 2 the new axis 0 in the view.
There's also numpy.transpose, which can be used to perform arbitrary rearrangements of an array's axes in one go if you pass it the optional second argument, and numpy.rollaxis, an older version of moveaxis with a more confusing calling convention.
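Use swapaxes. Note that swapping axes 0 and 2 also transposes each 2x2 matrix (Y[i] equals X[:,:,i].T), so this matches the loop above only when the matrices are symmetric: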
Y = X.swapaxes(0,2)
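To see how these options differ, the sketch below compares each one against the explicit loop. It uses non-symmetric test matrices, because the identity matrices suggested in the question would hide any transposition. The test array and the Y_* names are illustrative.

```python
import numpy as np

n = 3
X = np.arange(4 * n).reshape(2, 2, n)   # non-symmetric 2x2 matrices along axis 2

# Reference result from the explicit loop.
Y_loop = np.zeros((n, 2, 2))
for i in range(n):
    Y_loop[i] = X[:, :, i]

Y_move = np.moveaxis(X, 2, 0)     # axis 2 becomes axis 0, the others keep their order
Y_tr = X.transpose(2, 0, 1)       # the same rearrangement, spelled out
Y_swap = X.swapaxes(0, 2)         # exchanges axes 0 and 2, so each matrix is transposed
Y_reshape = X.reshape(n, 2, 2)    # reinterprets memory order and scrambles the matrices

print(np.array_equal(Y_move, Y_loop))                      # True
print(np.array_equal(Y_tr, Y_loop))                        # True
print(np.array_equal(Y_swap, Y_loop))                      # False
print(np.array_equal(Y_swap.transpose(0, 2, 1), Y_loop))   # True
print(np.array_equal(Y_reshape, Y_loop))                   # False
```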

numpy meshgrid of dynamic shape

I am trying to use numpy meshgrid to generate some arrays. I have an nd array, call it data, which can have an arbitrary shape, and I am trying to generate some index arrays as follows:
shape = data.shape
x = np.meshgrid[1,x-1 for x in shape]
I know the syntax looks crazy but sometimes I try things like these in python and it works! Anyway, is there a way to do this dynamic meshgrid in python? This comes back with invalid syntax error:
x = np.meshgrid[1,x-1 for x in shape]
^
SyntaxError: invalid syntax
EDIT:
I would like basically to create an array of indices. For example, I can do the following when the index always begins with 0
import numpy as np
array = np.random.rand(5, 5, 5)
shape = array.shape
indices = np.indices(x-1 for x in shape)
This creates an ndarray with indices starting from 0 to (n-1) along each of the axes of my input array. Now, I wanted to have the indexing begin from 1 and could not find a good way to do this.
EDIT:
For example, a call for an array with shape (4, 5, 6) could be something like:
x = np.meshgrid(np.arange(1,4), np.arange(1,5), np.arange(1, 6))
Going off your last example, you can do something like this:
x = np.meshgrid(*[np.arange(1, x) for x in shape])
You need to explicitly create a list of the values you want to pass to meshgrid. If you want each one to start at 1, you need to put the 1 in each call to arange. You can't do something like [1, arange(x)] and have it "distribute" the 1 through all the calls.
The * then unpacks the list into separate positional arguments.
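A small sketch of the unpacking approach for the shape (4, 5, 6) from the question follows. Passing indexing="ij" is an assumption that goes beyond the answer: it keeps the output axes in the same order as shape, whereas the default "xy" swaps the first two. The np.indices line is a hedged alternative in case the goal is indices running from 1 to n along each axis.

```python
import numpy as np

shape = (4, 5, 6)

# One arange per axis, unpacked into separate arguments.
# Assumption: indexing="ij" so the grid axes follow the order of `shape`.
grids = np.meshgrid(*[np.arange(1, n) for n in shape], indexing="ij")
print(len(grids), grids[0].shape)   # 3 (3, 4, 5)

# Assumption: if indices 1..n are wanted instead of 1..n-1, shift np.indices by one.
one_based = np.indices(shape) + 1
print(one_based.shape)              # (3, 4, 5, 6)
```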

Index the middle of a numpy array?

To index the middle points of a numpy array, you can do this:
x = np.arange(10)
middle = x[len(x)//4:len(x)*3//4]
Is there a shorthand for indexing the middle of the array, e.g., the n or 2n elements closest to len(x)/2? Is there a nice n-dimensional version of this?
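As cge said, the simplest way is to turn it into a lambda function, like so (using // so that the slice bounds stay integers on Python 3):
x = np.arange(10)
middle = lambda x: x[len(x)//4:len(x)*3//4]
or the n-dimensional way is (with a tuple of integer slices; recent NumPy versions reject a list of slices or float bounds):
middle = lambda x: x[tuple(slice(int(np.floor(d/4.)), int(np.ceil(3*d/4.))) for d in x.shape)]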
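The n-dimensional idea can be written as a small named function. This is a minimal sketch using integer arithmetic only. The rounding convention (floor for the start, ceiling for the stop) follows the lambda in the answer, and the expression -(-3 * d // 4) is a standard way to take the ceiling of 3*d/4 without floats.

```python
import numpy as np

def middle(arr):
    """Return the central half of arr along every axis (as a view)."""
    # d // 4 is floor(d / 4); -(-3 * d // 4) is ceil(3 * d / 4).
    return arr[tuple(slice(d // 4, -(-3 * d // 4)) for d in arr.shape)]

x = np.arange(10)
print(middle(x))                        # [2 3 4 5 6 7]
print(middle(np.zeros((8, 8))).shape)   # (4, 4)
```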
Late, but for everyone else running into this issue:
A much smoother way is to use numpy's take or put.
np.put writes to an n-dimensional array through a single flat index, and np.take reads a value the same way, so either can address the middle of an array.
Assuming your array has an odd number of elements, the middle of the array will be at half of its size. Using integer division (// instead of /) keeps the index an integer.
import numpy as np
arr = np.array([[0, 1, 2],
                [3, 4, 5],
                [6, 7, 8]])
# put a value to the center
np.put(arr, arr.size // 2, 999)
print(arr)
# take a value from the center
center = np.take(arr, arr.size // 2)
print(center)
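For arrays with more than two dimensions, the flat index arr.size // 2 and the per-axis index d // 2 point at the same element as long as every dimension is odd. A brief sketch of that check follows. The 3x3x3 array and the name center_index are illustrative.

```python
import numpy as np

arr = np.arange(27).reshape(3, 3, 3)

center_index = tuple(d // 2 for d in arr.shape)   # (1, 1, 1)
print(arr[center_index])                           # 13
print(np.take(arr, arr.size // 2))                 # 13

# The flat index converts back to the same multi-dimensional index.
print(np.unravel_index(arr.size // 2, arr.shape) == center_index)  # True
```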
