Say I generate a sequence of values, tile it over a given number of rows, increment each row by its row ID, and then mask the values that fall outside a desired range, like below:
>>> import numpy as np
>>> import numpy.ma as ma
>>> range = 5  # note: this shadows the built-in range()
>>> matrix = np.arange(-5, 10, 1)
>>> matrix = np.tile(matrix, (range, 1))
>>> matrix = np.add(matrix, np.arange(0, range)[:, None])
>>> matrix = ma.masked_outside(matrix, 0, 10)
>>> print(matrix)
[[-- -- -- -- -- 0 1 2 3 4 5 6 7 8 9]
[-- -- -- -- 0 1 2 3 4 5 6 7 8 9 10]
[-- -- -- 0 1 2 3 4 5 6 7 8 9 10 --]
[-- -- 0 1 2 3 4 5 6 7 8 9 10 -- --]
[-- 0 1 2 3 4 5 6 7 8 9 10 -- -- --]]
How would you best convert the above output to a matrix of the format [non-masked value, row-id], i.e.:
[0, 0], [1, 0], [2, 0] ... [10, 4]
Also, is the original code too wasteful to achieve the final desired step?
Playing around with your matrix I produced this:
In [50]: np.stack((matrix.compressed(), np.where(~matrix.mask)[0]),1)
Out[50]:
array([[ 0, 0],
[ 1, 0],
[ 2, 0],
[ 3, 0],
[ 4, 0],
[ 5, 0],
[ 6, 0],
[ 7, 0],
[ 8, 0],
[ 9, 0],
....
We could probably skip the masked-array step and create the mask directly. compressed(), for example, is equivalent to matrix.data[~matrix.mask].
In [52]: mask = ~matrix.mask
In [53]: data = matrix.data
In [54]: np.stack((data[mask], np.where(mask)[0]), 1)
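Putting that together, the whole pipeline could be written without numpy.ma at all. A minimal sketch, assuming the same bounds of 0 and 10 (the variable names here are mine):

import numpy as np

n_rows = 5
data = np.arange(-5, 10, 1) + np.arange(n_rows)[:, None]    # tile + per-row increment via broadcasting
keep = (data >= 0) & (data <= 10)                           # True where the value is in range
pairs = np.stack((data[keep], np.nonzero(keep)[0]), axis=1) # [[value, row_id], ...]

The boolean mask plays the same role as ~matrix.mask, and np.nonzero(keep)[0] supplies the row ID of each kept value in the same order that data[keep] returns them.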
Related
I have a function f(a) that takes one entry from a testarray and returns an array with 5 values:
f(testarray[0])
#Output: array([[0, 1, 5, 3, 2]])
Since f(testarray[0]) is the result of an experiment, I want to run this function f for each entry of testarray and store each result in a new NumPy array. I always thought this would be quite simple: just take an empty NumPy array with the length of testarray and save the results the following way:
N = 1000 #Number of entries of the testarray
test_result = np.zeros([N, 5], dtype=int)
for i in testarray:
    test_result[i] = f(i)
When I run this, I don't get any error message, but the results are nonsense (half of test_result is empty while the rest is filled with implausible values). Since f() works perfectly for a single entry of testarray, I suppose something about the way I save the results into test_result is wrong. What am I missing here?
(I know that I could collect the results by appending to a list, but that method is too slow for the large number of times I want to run the function.)
Since you don't seem to understand indexing, stick with this approach
alist = [f(i) for i in testarray]
arr = np.array(alist)
I could show how to use row indices and testarray values together, but that requires more explanation.
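For completeness, here is a minimal sketch of what using row indices and testarray values together could look like, assuming f(value) returns a length-5 row as in the question:

test_result = np.zeros([len(testarray), 5], dtype=int)
for row, value in enumerate(testarray):   # row is the position to write, value is what f gets called with
    test_result[row] = f(value)

The key point is that the loop variable chosen as the write index must be a position (0 to N-1), not the entry itself.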
Your problem can be reproduced by the following small example:
testarray = np.array([5, 6, 7, 3, 1])

def f(x):
    return np.array([x * i for i in np.arange(1, 6)])

f(testarray[0])
# [ 5 10 15 20 25]

test_result = np.zeros([len(testarray), 5], dtype=int)  # len(testarray) or testarray.shape[0]
So, as hpaulj mentioned in the comments, you must be careful with how you use indexing:
for i in range(len(testarray)):
    test_result[i] = f(testarray[i])
# [[ 5 10 15 20 25]
# [ 6 12 18 24 30]
# [ 7 14 21 28 35]
# [ 3 6 9 12 15]
# [ 1 2 3 4 5]]
There is another case, where testarray is itself an index array containing shuffled integers from 0 to N-1 that are used to fill the zero array test_result. For this case we can create a reproducible example:
testarray = np.array([4, 3, 0, 1, 2])

def f(x):
    return np.array([x * i for i in np.arange(1, 6)])

f(testarray[0])
# [ 4  8 12 16 20]

test_result = np.zeros([len(testarray), 5], dtype=int)
So, using your loop gives the following result:
for i in testarray:
    test_result[i] = f(i)
# [[ 0 0 0 0 0]
# [ 1 2 3 4 5]
# [ 2 4 6 8 10]
# [ 3 6 9 12 15]
# [ 4 8 12 16 20]]
As can be seen from this loop, if the index array does not contain every index from 0 to N-1, some rows of the zero array will be left zero (unchanged):
testarray = np.array([4, 2, 4, 1, 2])
test_result = np.zeros([len(testarray), 5], dtype=int)  # reset before re-filling

for i in testarray:
    test_result[i] = f(i)
# [[ 0 0 0 0 0] # <--
# [ 1 2 3 4 5]
# [ 2 4 6 8 10]
# [ 0 0 0 0 0] # <--
# [ 4 8 12 16 20]]
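As a side note, for an f as simple as the one used in these examples (elementwise multiplication by 1 to 5), the loop can be dropped entirely in favour of broadcasting. This is only a sketch for this particular f, not for an arbitrary experiment:

testarray = np.array([5, 6, 7, 3, 1])
test_result = testarray[:, None] * np.arange(1, 6)   # row i is testarray[i] * [1 2 3 4 5]
# [[ 5 10 15 20 25]
#  [ 6 12 18 24 30]
#  [ 7 14 21 28 35]
#  [ 3  6  9 12 15]
#  [ 1  2  3  4  5]]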
I have a 3D array and I need to set its right part to zero. For each 2D slice (n, :, :) of the array, the index of the column should be taken from vector b. This index defines the separating point between the left and right parts, as shown below.
a_before = [[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]
[13 14 15 16]]
[[17 18 19 20]
[21 22 23 24]
[25 26 27 28]
[29 30 31 32]]
[[33 34 35 36]
[37 38 39 40]
[41 42 43 44]
[45 46 47 48]]]
a_before.shape = (3, 4, 4)
b = (2, 3, 1)
a_after_1 = [[[ 1 2 0 0]
[ 5 6 0 0]
[ 9 10 0 0]
[13 14 0 0]]
[[17 18 19 0]
[21 22 23 0]
[25 26 27 0]
[29 30 31 0]]
[[33 0 0 0]
[37 0 0 0]
[41 0 0 0]
[45 0 0 0]]]
After this, for each 2D slice (n, :, :), I have to take a column index from vector c and multiply that column by the corresponding value from vector d.
c = (1, 2, 0)
d = (50, 100, 150)
a_after_2 = [[[ 1 100 0 0]
[ 5 300 0 0]
[ 9 500 0 0]
[13 700 0 0]]
[[17 18 1900 0]
[21 22 2300 0]
[25 26 2700 0]
[29 30 3100 0]]
[[4950 0 0 0]
[5550 0 0 0]
[6150 0 0 0]
[6750 0 0 0]]]
I did it but my version looks ugly. Maybe someone can help me.
P.S. I would like to avoid for loops and use only numpy methods.
Thank You.
Here's a version without loops.
In [232]: A = np.arange(1,49).reshape(3,4,4)
In [233]: b = np.array([2,3,1])
In [234]: d = np.array([50,100,150])
In [235]: I,J = np.nonzero(b[:,None]<=np.arange(4))
In [236]: A[I,:,J]=0
In [237]: A[np.arange(3),:,b-1] *= d[:,None]
In [238]: A
Out[238]:
array([[[ 1, 100, 0, 0],
[ 5, 300, 0, 0],
[ 9, 500, 0, 0],
[ 13, 700, 0, 0]],
[[ 17, 18, 1900, 0],
[ 21, 22, 2300, 0],
[ 25, 26, 2700, 0],
[ 29, 30, 3100, 0]],
[[4950, 0, 0, 0],
[5550, 0, 0, 0],
[6150, 0, 0, 0],
[6750, 0, 0, 0]]])
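To unpack In [235]: the comparison b[:,None] <= np.arange(4) builds a (3, 4) boolean array marking, for each slice, the columns at or beyond the cut point, and np.nonzero turns that into paired (slice, column) indices. A quick check of just that step (my own verification, not part of the original session):

b = np.array([2, 3, 1])
mask = b[:, None] <= np.arange(4)   # (3, 4): True where a column should be zeroed
I, J = np.nonzero(mask)
# I -> [0 0 1 2 2 2]  (which slice)
# J -> [2 3 3 1 2 3]  (which column)
# A[I, :, J] = 0 then zeroes each of those columns across the whole middle axis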
Before I developed this, I wrote an iterative version. It helped me visualize the problem.
In [240]: Ac = np.arange(1,49).reshape(3,4,4)
In [241]:
In [241]: for i,v in enumerate(b):
...: Ac[i,:,v:]=0
...:
In [242]: for i,(bi,di) in enumerate(zip(b,d)):
...: Ac[i,:,bi-1]*=di
It may be easier to understand, and in that sense, less ugly!
The fact that your A has a middle dimension that is just "going along for the ride" complicates "vectorizing" the problem.
With a (3,4) 2d array, the solution is just:
In [251]: Ab = Ac[:,0,:]
In [252]: Ab[b[:,None]<=np.arange(4)]=0
In [253]: Ab[np.arange(3),b-1]*=d
Here it is:
import numpy as np

a = np.arange(1,49).reshape(3,4,4)
b = np.array([2,3,1])
c = np.array([1,2,0])
d = np.array([50,100,150])

for i in range(len(b)):
    a[i,:,b[i]:] = 0

for i,j in enumerate(c):
    a[i,:,j] = a[i,:,j] * d[i]

print(a)
# Output:
[[[ 1 100 0 0]
[ 5 300 0 0]
[ 9 500 0 0]
[ 13 700 0 0]]
[[ 17 18 1900 0]
[ 21 22 2300 0]
[ 25 26 2700 0]
[ 29 30 3100 0]]
[[4950 0 0 0]
[5550 0 0 0]
[6150 0 0 0]
[6750 0 0 0]]]
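Another option, if you prefer to avoid the explicit index arrays altogether, is to rely on broadcasting with np.where. This is just a sketch of the same two operations (it uses c directly, which in this example equals b - 1):

import numpy as np

a = np.arange(1, 49).reshape(3, 4, 4)
b = np.array([2, 3, 1])
c = np.array([1, 2, 0])
d = np.array([50, 100, 150])

cols = np.arange(a.shape[2])                                     # [0 1 2 3]
a = np.where(b[:, None, None] <= cols, 0, a)                     # zero columns >= b[n] in slice n
a = np.where(cols == c[:, None, None], a * d[:, None, None], a)  # scale column c[n] by d[n]

np.where broadcasts the (3, 1, 4) condition against the (3, 4, 4) array, so the middle axis tags along for free.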
I'm using skimage.morphology to do an erosion on a two-dimensional array. I also need to ascertain the distance of each cell to the minimum value identified in the erosion.
Example:
np.reshape(np.arange(1,126,step=5),[5,5])
array([[ 1, 6, 11, 16, 21],
[ 26, 31, 36, 41, 46],
[ 51, 56, 61, 66, 71],
[ 76, 81, 86, 91, 96],
[101, 106, 111, 116, 121]])
erosion(np.reshape(np.arange(1,126,step=5),[5,5]),selem=disk(3))
array([[ 1, 1, 1, 1, 6],
[ 1, 1, 1, 6, 11],
[ 1, 1, 1, 6, 11],
[ 1, 6, 11, 16, 21],
[26, 31, 36, 41, 46]])
Now what I want to do is also return an array that gives me the distance to the minimum like this:
array([[ 0, 1, 2, 3, 3],
[ 1, 1, 2, 3, 3],
[ 2, 2, 3, 3, 3],
[ 3, 3, 3, 3, 3],
[ 3, 3, 3, 3, 3]])
Is there a scikit tool that can do this? If not, any tips on how to efficiently achieve this result?
You can find the distances from the centre of your footprint using scipy.ndimage.distance_transform_cdt, then use SciPy's ndimage.generic_filter to return those values:
import numpy as np
from skimage.morphology import erosion, disk
from scipy import ndimage as ndi

input_arr = np.reshape(np.arange(1,126,step=5),[5,5])
footprint = disk(3)

def distance_from_min(values, distance_values):
    d = np.inf
    min_val = np.inf
    for i in range(len(values)):
        if values[i] <= min_val:
            min_val = values[i]
            d = distance_values[i]
    return d

full_footprint = np.ones_like(footprint, dtype=float)
full_footprint[tuple(i//2 for i in footprint.shape)] = 0
# use `ndi.distance_transform_edt` instead for the euclidean distance
distance_footprint = ndi.distance_transform_cdt(
    full_footprint, metric='taxicab'
)

# set values outside footprint to 0 for pretty-printing
distance_footprint[~footprint.astype(bool)] = 0
# then, extract it into values matching the values in generic_filter
distance_values = distance_footprint[footprint.astype(bool)]

output = ndi.generic_filter(
    input_arr.astype(float),
    distance_from_min,
    footprint=footprint,
    mode='constant',
    cval=np.inf,
    extra_arguments=(distance_values,),
)
print('input:\n', input_arr)
print('footprint:\n', footprint)
print('distance_footprint:\n', distance_footprint)
print('output:\n', output)
Which gives:
input:
[[ 1 6 11 16 21]
[ 26 31 36 41 46]
[ 51 56 61 66 71]
[ 76 81 86 91 96]
[101 106 111 116 121]]
footprint:
[[0 0 0 1 0 0 0]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[1 1 1 1 1 1 1]
[0 1 1 1 1 1 0]
[0 1 1 1 1 1 0]
[0 0 0 1 0 0 0]]
distance_footprint:
[[0 0 0 3 0 0 0]
[0 4 3 2 3 4 0]
[0 3 2 1 2 3 0]
[3 2 1 0 1 2 3]
[0 3 2 1 2 3 0]
[0 4 3 2 3 4 0]
[0 0 0 3 0 0 0]]
output:
[[0. 1. 2. 3. 3.]
[1. 2. 3. 3. 3.]
[2. 3. 4. 4. 4.]
[3. 3. 3. 3. 3.]
[3. 3. 3. 3. 3.]]
This function will be very slow, however. If you want to make it faster, you will need (a) a solution like Numba or Cython for the filter function, in conjunction with SciPy LowLevelCallables, and (b) to hardcode the distance array into the distance function, because it is more difficult to pass extra arguments to LowLevelCallables. Here is a full example with llc-tools, which you can install with pip install numba llc-tools.
import numpy as np
from scipy import ndimage as ndi
from skimage.morphology import erosion, disk
import llc

def filter_func_from_footprint(footprint):
    # first, create a footprint where the values are the distance from the
    # center
    full_footprint = np.ones_like(footprint, dtype=float)
    full_footprint[tuple(i//2 for i in footprint.shape)] = 0
    # use `ndi.distance_transform_edt` instead for the euclidean distance
    distance_footprint = ndi.distance_transform_cdt(
        full_footprint, metric='taxicab'
    )
    # then, extract it into values matching the values in generic_filter
    distance_footprint[~footprint.astype(bool)] = 0
    distance_values = distance_footprint[footprint.astype(bool)]

    # finally, create a filter function with the values hardcoded
    @llc.jit_filter_function
    def distance_from_min(values):
        d = np.inf
        min_val = np.inf
        for i in range(len(values)):
            if values[i] <= min_val:
                min_val = values[i]
                d = distance_values[i]
        return d

    return distance_from_min

if __name__ == '__main__':
    input_arr = np.reshape(np.arange(1,126,step=5),[5,5])
    footprint = disk(3)
    eroded = erosion(input_arr, selem=footprint)
    filter_func = filter_func_from_footprint(footprint)
    result = ndi.generic_filter(
        # use input_arr.astype(float) when using euclidean dist
        input_arr,
        filter_func,
        footprint=disk(3),
        mode='constant',
        cval=np.inf,
    )
    print('input:\n', input_arr)
    print('output:\n', result)
Which gives:
input:
[[ 1 6 11 16 21]
[ 26 31 36 41 46]
[ 51 56 61 66 71]
[ 76 81 86 91 96]
[101 106 111 116 121]]
output:
[[0 1 2 3 3]
[1 2 3 3 3]
[2 3 4 4 4]
[3 3 3 3 3]
[3 3 3 3 3]]
For more reading on low-level callables and llc-tools, in addition to the LowLevelCallable documentation on the SciPy site (linked above, plus links therein), you can read these two blog posts I wrote a few years ago:
SciPy's new LowLevelCallable is a game-changer
Prettier LowLevelCallables with Numba JIT and decorators
Is there such a function or easy method?
The only functions I have found so far are mesh.vertexCoords and mesh.faceVertexIDs, but I couldn't quite figure out whether they might help me.
As the comments suggest, the vertex to cell data shouldn't usually be required in a finite volume scheme. However, the following is a solution for finding the vertex to cell IDs given the cell to vertex IDs. The cell to vertex data is available in FiPy with the mesh._cellVertexIDs array.
The following uses sparse matrices to represent the cell to vertex link and then a transpose to find the vertex to cell links.
from fipy import Grid2D
import numpy as np
from scipy.sparse import coo_matrix
import itertools

def lists_to_numpy(x):
    """List of lists of different length to Numpy array. See
    https://stackoverflow.com/questions/38619143/convert-python-sequence-to-numpy-array-filling-missing-values

    >>> print(lists_to_numpy([[0], [0, 1], [0, 1, 2]]))
    [[ 0 -1 -1]
     [ 0  1 -1]
     [ 0  1  2]]

    """
    return np.array(list(itertools.zip_longest(*x, fillvalue=-1))).T

def invert_sparse_bool(x, mshape):
    """Invert a sparse bool matrix represented by a 2D array and return as
    inverted 2D array.

    >>> a = np.array([[0, 2], [1, 3], [0, 3], [3, 4]])
    >>> print(invert_sparse_bool(a, (4, 5)))
    [[ 0  2 -1]
     [ 1 -1 -1]
     [ 0 -1 -1]
     [ 1  2  3]
     [ 3 -1 -1]]

    """
    arr1 = np.indices(x.shape)[0]
    arr2 = np.stack((arr1, x), axis=-1)
    arr3 = np.reshape(arr2, (-1, 2))
    lists = coo_matrix(
        (np.ones(len(arr3), dtype=int),
         (arr3[:, 0], arr3[:, 1])),
        shape=mshape
    ).tolil().T.rows
    return lists_to_numpy(lists)

m = Grid2D(nx=3, ny=3)

cellVertexIDs = m._cellVertexIDs.swapaxes(0, 1)
vertexCellIDs = invert_sparse_bool(
    cellVertexIDs,
    (m.numberOfCells, m.numberOfVertices)
)

print('cellVertexIDs:', m._cellVertexIDs)
print('vertexCellIDs:', vertexCellIDs)
Note that m._cellVertexIDs has shape (maxNumberOfVerticesPerCell, numberOfCells), but it's a little easier to work with after the axes are swapped. The new vertexCellIDs array has shape (numberOfVertices, maxNumberOfCellsPerVertex). vertexCellIDs does need a fill value, as each vertex isn't connected to the same number of cells.
The output from this is
cellVertexIDs: [[ 1 5 4 0]
[ 2 6 5 1]
[ 3 7 6 2]
[ 5 9 8 4]
[ 6 10 9 5]
[ 7 11 10 6]
[ 9 13 12 8]
[10 14 13 9]
[11 15 14 10]]
vertexCellIDs: [[ 0 -1 -1 -1]
[ 0 1 -1 -1]
[ 1 2 -1 -1]
[ 2 -1 -1 -1]
[ 0 3 -1 -1]
[ 0 1 3 4]
[ 1 2 4 5]
[ 2 5 -1 -1]
[ 3 6 -1 -1]
[ 3 4 6 7]
[ 4 5 7 8]
[ 5 8 -1 -1]
[ 6 -1 -1 -1]
[ 6 7 -1 -1]
[ 7 8 -1 -1]
[ 8 -1 -1 -1]]
which makes sense to me for a 3x3 mesh with 9 cells and 16 vertices and an ordered numbering system for both cells and vertices (left to right, bottom to top).
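If the -1 fill values get in the way of later computations, for example averaging a cell-valued field onto the vertices, one option is to wrap the result in a masked array. This is only a sketch of that idea (it is not part of the original answer, and cellValues here is a stand-in for real cell data), continuing from the code above:

import numpy.ma as ma

vertexCellIDs_ma = ma.masked_equal(vertexCellIDs, -1)    # hide the fill entries

cellValues = np.arange(m.numberOfCells, dtype=float)     # stand-in per-cell data
gathered = cellValues[vertexCellIDs_ma.filled(0)]        # gather, using 0 as a dummy index
vertexAverages = ma.array(gathered, mask=vertexCellIDs_ma.mask).mean(axis=1)

print('vertexAverages:', vertexAverages)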
Couldn't transpose np.array
import numpy as np
arr = np.arange(16).reshape((2, 2, 4))
print(arr)
arr.transpose(1, 0, 2)
print('------------')
print(arr)
output:
[[[ 0 1 2 3]
[ 4 5 6 7]]
[[ 8 9 10 11]
[12 13 14 15]]]
------------
[[[ 0 1 2 3]
[ 4 5 6 7]]
[[ 8 9 10 11]
[12 13 14 15]]]
I think that's weird. I have seen the same example elsewhere, and there it works. I'm on numpy==1.17.2. What could be wrong?
Try 'arr = arr.transpose(1, 0, 2)' in place of 'arr.transpose(1, 0, 2)'. You can also try 'print(arr.transpose(1, 0, 2))' in place of the second 'print(arr)'. transpose() does not modify the array in place; it returns a transposed view, so you have to assign (or print) the returned array to see the result.
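A minimal check of that fix; the reassignment is the only change from the snippet in the question:

import numpy as np

arr = np.arange(16).reshape((2, 2, 4))
arr = arr.transpose(1, 0, 2)   # transpose returns a transposed view; keep it by reassigning
print(arr)
# [[[ 0  1  2  3]
#   [ 8  9 10 11]]
#
#  [[ 4  5  6  7]
#   [12 13 14 15]]]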