How to divide tensor elements by the diagonal values at their indices - python

I have a tensor like this:
out = [[ 3, 6, 5, 4],
[ 6, 5, 10, 13],
[ 5, 10, 6, 22],
[ 4, 13, 22, 9]]
This is a symmetric matrix. What I want to do is divide each off-diagonal element by the diagonal values at its row index and its column index. The diagonal values in this matrix are:
index0 = 3
index1 = 5
index2 = 6
index3 = 9
The result will look like this:
[[3 , 6/(3*5) , 5/(3*6) , 4/(3*9) ]
[6/(3*5), 5 , 10/(5*6), 13/(5*9)]
[5/(3*6), 10/(5*6), 6 , 22/(6*9)]
[4/(3*9), 13/(5*9), 22/(6*9), 9 ]]
Let me walk through the first row:
3 is the value on the diagonal, so we skip it.
6/(3*5): 6 is the value at indices 0 and 1, so I divide it by the diagonal values at index 0 and index 1.
5/(3*6): 5 is the value at indices 0 and 2, so I divide it by the diagonal values at index 0 and index 2.
4/(3*9): 4 is the value at indices 0 and 3, so I divide it by the diagonal values at index 0 and index 3.

It can be done as follows in TensorFlow (or NumPy):
1. Take the original matrix and zero out the diagonal.
2. Divide the resulting matrix by the diagonal vector.
3. Transpose the result of step 2 and divide it again by the diagonal vector.
4. Add back the diagonal that was zeroed out in step 1.
import tensorflow as tf
out = [[ 3, 6, 5, 4],
[ 6, 5, 10, 13],
[ 5, 10, 6, 22],
[ 4, 13, 22, 9]]
tensor = tf.constant(out, dtype=tf.float32)
diag_indices = tf.tile(tf.range(tf.shape(tensor)[0])[..., None], [1, 2])
diag = tf.gather_nd(tensor, diag_indices) # [3. 5. 6. 9.]
diag_matrix = tf.linalg.tensor_diag(diag)
zero_diag_matrix = tensor - diag_matrix
res = tf.transpose(zero_diag_matrix / diag) / diag + diag_matrix
with tf.Session() as sess:  # TensorFlow 1.x API
    print(res.eval())
# [[3. 0.4 0.27777776 0.14814815]
# [0.4 5. 0.33333334 0.28888887]
# [0.27777776 0.3333333 6. 0.4074074 ]
# [0.14814815 0.28888887 0.4074074 9. ]]
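As a compact cross-check of the four steps above, here is a minimal NumPy sketch. It assumes a square matrix with a nonzero diagonal, and it swaps the zero-out and transpose steps for a single division by the outer product of the diagonal with itself, which yields the same off-diagonal values. The name `res` is only illustrative.

```python
import numpy as np

out = np.array([[3, 6, 5, 4],
                [6, 5, 10, 13],
                [5, 10, 6, 22],
                [4, 13, 22, 9]], dtype=float)

d = np.diagonal(out).copy()   # diagonal values: [3. 5. 6. 9.]
res = out / np.outer(d, d)    # element (i, j) is divided by d[i] * d[j]
np.fill_diagonal(res, d)      # put the original diagonal back
print(res)
```

Because `np.outer(d, d)` is symmetric, a symmetric input stays symmetric in the result.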

Using numpy, you could do as follows:
import numpy as np
out = np.array(out, dtype=float)  # float copy of the matrix from the question
# diagonal elements in out
d = np.copy(np.diagonal(out))
# indices of the lower triangular matrix
tril_ix = np.tril_indices_from(out, k=-1)
# cumulative sum of the diagonal values
# over the first axis on a square matrix
dx = np.cumsum(np.diag(d), 1)
# replicate over the lower triangle
dx[tril_ix] += np.rot90(dx, k=1)[::-1][tril_ix]
# same but accumulating the diagonal elements
# upwards on the y axis
dy = np.cumsum(np.diag(d)[::-1],0)[::-1]
# replicate over the lower triangle
dy[tril_ix] += np.rot90(dy, k=1)[::-1][tril_ix]
# mask where to apply the product
m = dy!=0
# perform div and mult
out[m] = out[m]/(dx[m]*dy[m])
np.fill_diagonal(out, d)
print(out)
array([[3. , 0.4 , 0.27777778, 0.14814815],
[0.4 , 5. , 0.33333333, 0.28888889],
[0.27777778, 0.33333333, 6. , 0.40740741],
[0.14814815, 0.28888889, 0.40740741, 9. ]])

Here's a tensorflow version.
import tensorflow as tf
import numpy as np
out = tf.Variable([[ 3, 6, 5, 4],
[ 6, 5, 10, 13],
[ 5, 10, 6, 22],
[ 4, 13, 22, 9]], dtype=tf.float32)
# this solution only works for square matrices
assert out.shape[-2] == out.shape[-1]
out_diag = tf.linalg.diag_part(out)
res = tf.Variable(tf.zeros(out.shape, dtype=tf.float32))
for i in tf.range(out.shape[0]):
    _ = res[..., (i+1):, i].assign(out[..., (i+1):, i] / out_diag[..., (i+1):] / out_diag[..., i])
    _ = res[..., i, (i+1):].assign(out[..., i, (i+1):] / out_diag[..., (i+1):] / out_diag[..., i])
print(res)

Related

Sorting 2 single dimensional arrays into a 1 dimensional array

I am trying to write code that picks values alternately from two arrays a and b. The result should be a 2-dimensional array in which the first column is 0 or 1 (0 representing a, 1 representing b) and the second column is the value taken from that array, for example [[0 7][1 13]]. The output must alternate between the arrays: a,b,a,b,a,... or the other way around, b,a,b,a,b,... Which array comes first is decided by comparing their first elements: the first element of b is 0 and the first element of a is 7, and since 0 < 7 the output starts with b, giving [[1 0]]. The next value then comes from a, which is 7, giving [[1 0],[0 7]]. This continues until the end of a and b is reached. How can I get the expected output below?
import numpy as np
a = np.array([ 7, 9, 12, 15, 17, 22])
b = np.array([ 0, 13, 17, 18])
Expected Output:
[[ 1 0]
[ 0 7]
[ 1 13]
[ 0 15]
[ 1 17]
[ 0 17]
[ 1 18]
[ 0 22]]
You can combine the two arrays and sort the values while preserving the origin of each value (using 2N and 2N+1 offsetting).
Then filter out the consecutive odd/even values to only retain values with alternating origin indicator (1 or 0)
Finally, build the resulting array of [origin,value] pairs by reversing the 2N and 2N+1 tagging.
import numpy as np
a = np.array([ 7, 9, 12, 15, 17, 22])
b = np.array([ 0, 13, 17, 18])
p = 1 if a[0] > b[0] else 0 # determine first entry
c = np.sort(np.concatenate((a*2+p,b*2+1-p))) # combine/sort tagged values
c = np.concatenate((c[:1],c[1:][c[:-1]%2 != c[1:]%2])) # filter out same-array repeats
c = np.concatenate(((c[:,None]+p)%2,c[:,None]//2),axis=1) # build result
print(c)
[[ 1 0]
[ 0 7]
[ 1 13]
[ 0 15]
[ 1 17]
[ 0 17]
[ 1 18]
[ 0 22]]
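To make the parity tagging easier to follow, the same computation can be unrolled into named intermediate arrays. This is only an illustrative restatement of the code above; the names `tagged`, `keep`, `alternating`, `origin`, `value`, and `result` are not part of the original answer.

```python
import numpy as np

a = np.array([7, 9, 12, 15, 17, 22])
b = np.array([0, 13, 17, 18])

p = 1 if a[0] > b[0] else 0   # p == 1 here: b supplies the first entry
tagged = np.sort(np.concatenate((a * 2 + p, b * 2 + 1 - p)))
# With p == 1, values from a become odd and values from b become even:
# [ 0 15 19 25 26 31 34 35 36 45]

# Keep the first element, then every element whose parity differs from
# the element immediately before it in the sorted array.
keep = np.concatenate(([True], tagged[:-1] % 2 != tagged[1:] % 2))
alternating = tagged[keep]    # [ 0 15 26 31 34 35 36 45]

origin = (alternating + p) % 2   # 0 -> came from a, 1 -> came from b
value = alternating // 2         # undo the doubling
result = np.column_stack((origin, value))
print(result)
```

Doubling leaves room for one tag bit, so a tie such as 17 in both arrays sorts the even-tagged copy (here the one from b) first.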
This isn't a Numpy solution, but may work if you are okay processing these as lists. You can make iterators out of the lists, then alternate between them using itertools.dropwhile to proceed through the elements until you get the next in line. It might look something like:
from itertools import dropwhile
def pairs(a, b):
    index = 0 if a[0] <= b[0] else 1
    iters = [iter(a), iter(b)]
    while True:
        try:
            current = next(iters[index])
            yield [index,current]
            index = int(not index)
        except StopIteration:
            break
        iters[index] = dropwhile(lambda n: n < current, iters[index])
list(pairs(a, b))
Which results in:
[[1, 0], [0, 7], [1, 13], [0, 15], [1, 17], [0, 17], [1, 18], [0, 22]]
You can tag each value with the array it came from, sort by value, and then handle ties with a group-wise split and flip:
c = np.hstack([np.vstack([a, np.zeros(len(a))]), np.vstack([b, np.ones(len(b))])]).T
c = c[c[:, 0].argsort()]
# Group-wise split and flip array - 2nd possibility
d = np.vstack(np.apply_along_axis(np.flip, 0, np.split(c, np.unique(c[:,0], return_index = True)[1])))[::-1]
res1 = np.vstack([d[0], d[1:][d[:,1][:-1]!=d[:,1][1:]]])
res2 = np.vstack([c[0], c[1:][c[:,1][:-1]!=c[:,1][1:]]])
if res1.shape[0]>res2.shape[0]:
    print(res1)
else:
    print(res2)
Out:
[[ 0. 1.]
[ 7. 0.]
[13. 1.]
[15. 0.]
[17. 1.]
[17. 0.]
[18. 1.]
[22. 0.]]

How to break a Numpy ndarray into blocks [duplicate]

Is there a way to slice a 2d array in numpy into smaller 2d arrays?
Example
[[1,2,3,4], -> [[1,2] [3,4]
[5,6,7,8]] [5,6] [7,8]]
So I basically want to cut down a 2x4 array into 2 2x2 arrays. Looking for a generic solution to be used on images.
There was another question a couple of months ago which clued me in to the idea of using reshape and swapaxes. The h//nrows makes sense since this keeps the first block's rows together. It also makes sense that you'll need nrows and ncols to be part of the shape. -1 tells reshape to fill in whatever number is necessary to make the reshape valid. Armed with the form of the solution, I just tried things until I found the formula that works.
You should be able to break your array into "blocks" using some combination of reshape and swapaxes:
def blockshaped(arr, nrows, ncols):
    """
    Return an array of shape (n, nrows, ncols) where
    n * nrows * ncols = arr.size
    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    """
    h, w = arr.shape
    assert h % nrows == 0, f"{h} rows is not evenly divisible by {nrows}"
    assert w % ncols == 0, f"{w} cols is not evenly divisible by {ncols}"
    return (arr.reshape(h//nrows, nrows, -1, ncols)
               .swapaxes(1,2)
               .reshape(-1, nrows, ncols))
turns c
np.random.seed(365)
c = np.arange(24).reshape((4, 6))
print(c)
[out]:
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]]
into
print(blockshaped(c, 2, 3))
[out]:
[[[ 0 1 2]
[ 6 7 8]]
[[ 3 4 5]
[ 9 10 11]]
[[12 13 14]
[18 19 20]]
[[15 16 17]
[21 22 23]]]
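To see why reshape followed by swapaxes works, it may help to look at the shape after each step. The sketch below repeats the same three operations on the 4x6 example with illustrative intermediate names (`step1`, `step2`, `blocks`) and labels what each axis means.

```python
import numpy as np

c = np.arange(24).reshape(4, 6)
nrows, ncols = 2, 3
h, w = c.shape

# Axes: (block row, row inside block, block column, column inside block)
step1 = c.reshape(h // nrows, nrows, w // ncols, ncols)
print(step1.shape)   # (2, 2, 2, 3)

# Axes: (block row, block column, row inside block, column inside block)
step2 = step1.swapaxes(1, 2)
print(step2.shape)   # (2, 2, 2, 3)

# Merge the two block axes into a single "block number" axis
blocks = step2.reshape(-1, nrows, ncols)
print(blocks.shape)  # (4, 2, 3)
print(blocks[1])     # rows 0-1, columns 3-5 of c
```

After the swap, `step2[i, j]` is the block in block row `i` and block column `j`, so the final reshape simply numbers the blocks in row-major order.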
I've posted an inverse function, unblockshaped, here, and an N-dimensional generalization here. The generalization gives a little more insight into the reasoning behind this algorithm.
Note that there is also superbatfish's blockwise_view. It arranges the blocks in a different format (using more axes) but it has the advantage of (1) always returning a view and (2) being capable of handling arrays of any dimension.
It seems to me that this is a task for numpy.split or some variant.
e.g.
a = np.arange(30).reshape([5,6]) #a.shape = (5,6)
a1 = np.split(a,3,axis=1)
# a1 is a list of 3 arrays of shape (5,2)
a2 = np.split(a, [2,4])
# a2 is a list of three arrays of shape (2,6), (2,6), (1,6)
If you have an NxN image you can create, e.g., a list of 2 Nx(N/2) subimages, and then divide them along the other axis.
numpy.hsplit and numpy.vsplit are also available.
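A minimal sketch of the two-stage split described above, assuming a small 4x4 array as a stand-in for an NxN image (the names `img`, `top`, `bottom`, and `quadrants` are illustrative):

```python
import numpy as np

img = np.arange(16).reshape(4, 4)   # stand-in for an N x N image, N = 4

# First split along the rows, then split each half along the columns.
top, bottom = np.split(img, 2, axis=0)             # two arrays of shape (2, 4)
quadrants = [q for half in (top, bottom)
             for q in np.split(half, 2, axis=1)]   # four arrays of shape (2, 2)

for q in quadrants:
    print(q)
```

np.split raises an error when the axis cannot be divided into equal parts; np.array_split accepts uneven divisions.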
There are some other answers that already seem well suited to your specific case, but your question piqued my interest in a memory-efficient solution usable up to the maximum number of dimensions that numpy supports, and I ended up spending most of the afternoon coming up with a possible method. (The method itself is relatively simple; I still haven't used most of the really fancy features that numpy supports, so most of the time was spent researching what numpy had available and how much it could do so that I didn't have to do it.)
def blockgen(array, bpa):
    """Creates a generator that yields multidimensional blocks from the given
    array(_like); bpa is an array_like consisting of the number of blocks per axis
    (minimum of 1, must be a divisor of the corresponding axis size of array). As
    the blocks are selected using normal numpy slicing, they will be views rather
    than copies; this is good for very large multidimensional arrays that are being
    blocked, and for very large blocks, but it also means that the result must be
    copied if it is to be modified (unless modifying the original data as well is
    intended)."""
    bpa = np.asarray(bpa) # in case bpa wasn't already an ndarray
    # parameter checking
    if array.ndim != bpa.size:         # bpa doesn't match array dimensionality
        raise ValueError("Size of bpa must be equal to the array dimensionality.")
    # np.issubdtype replaces the np.int comparison, which newer NumPy versions no longer support
    if (not np.issubdtype(bpa.dtype, np.integer)   # bpa must be all integers
        or (bpa < 1).any()                         # all values in bpa must be >= 1
        or (np.asarray(array.shape) % bpa).any()): # % != 0 means not evenly divisible
        raise ValueError("bpa ({0}) must consist of nonzero positive integers "
                         "that evenly divide the corresponding array axis "
                         "size".format(bpa))
    # generate block edge indices
    rgen = (np.r_[:array.shape[i]+1:array.shape[i]//blk_n]
            for i, blk_n in enumerate(bpa))
    # build slice sequences for each axis (unfortunately broadcasting
    # can't be used to make the items easy to operate over)
    c = [[np.s_[i:j] for i, j in zip(r[:-1], r[1:])] for r in rgen]
    # Now to get the blocks; this is slightly less efficient than it could be
    # because numpy doesn't like jagged arrays and I didn't feel like writing
    # a ufunc for it.
    for idxs in np.ndindex(*bpa):
        blockbounds = tuple(c[j][idxs[j]] for j in range(bpa.size))
        yield array[blockbounds]
Your question is practically the same as this one. You can use the one-liner with np.ndindex() and reshape():
def cutter(a, r, c):
    lenr = a.shape[0] // r  # integer division: number of blocks along the rows
    lenc = a.shape[1] // c  # number of blocks along the columns
    return np.array([a[i*r:(i+1)*r,j*c:(j+1)*c] for (i,j) in np.ndindex(lenr,lenc)]).reshape(lenr,lenc,r,c)
To create the result you want:
a = np.arange(1,9).reshape(2,4)
#array([[1, 2, 3, 4],
# [5, 6, 7, 8]])
cutter( a, 1, 2 )
#array([[[[1, 2]],
# [[3, 4]]],
# [[[5, 6]],
# [[7, 8]]]])
A minor enhancement to TheMeaningfulEngineer's answer that handles the case where the big 2D array cannot be perfectly sliced into equally sized subarrays:
def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  #image row size
    n = a.shape[1]  #image column size
    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of block rows
    bpc = ((n-1)//q + 1) #number of block columns
    M = p * bpr
    N = q * bpc
    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a
    block_list = []
    previous_row = 0
    for row_block in range(bpr):  # loop over block rows
        previous_row = row_block * p
        previous_column = 0
        for column_block in range(bpc):  # loop over block columns
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]
            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]
            ## append
            if block.size:
                block_list.append(block)
    return block_list
Examples:
a = np.arange(25)
a = a.reshape((5,5))
out = blockfy(a, 2, 3)
a->
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
out[0] ->
array([[0., 1., 2.],
[5., 6., 7.]])
out[1]->
array([[3., 4.],
[8., 9.]])
out[-1]->
array([[23., 24.]])
For now it only works when the big 2D array can be perfectly sliced into equally sized subarrays.
The code below slices
a ->array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]])
into this
block_array->
array([[[ 0, 1, 2],
[ 6, 7, 8]],
[[ 3, 4, 5],
[ 9, 10, 11]],
[[12, 13, 14],
[18, 19, 20]],
[[15, 16, 17],
[21, 22, 23]]])
p and q determine the block size
Code
import numpy as np
a = np.arange(24)
a = a.reshape((4,6))
m = a.shape[0]  #image row size
n = a.shape[1]  #image column size
p = 2     #block row size
q = 3     #block column size
blocks_per_row = m // p     #number of blocks along the row axis
blocks_per_column = n // q  #number of blocks along the column axis
block_array = []
previous_row = 0
for row_block in range(blocks_per_row):
    previous_row = row_block * p
    previous_column = 0
    for column_block in range(blocks_per_column):
        previous_column = column_block * q
        block = a[previous_row:previous_row+p,previous_column:previous_column+q]
        block_array.append(block)
block_array = np.array(block_array)
If you want a solution that also handles the cases when the matrix is
not equally divided, you can use this:
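from functools import reduce  # reduce is not a builtin in Python 3
from operator import add
half_split = np.array_split(input, 2)  # input is the 2D array to split
res = map(lambda x: np.array_split(x, 2, axis=1), half_split)
res = reduce(add, res)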
Here is a solution based on unutbu's answer that handles the case where the matrix cannot be equally divided. In this case, it resizes the matrix first, using some interpolation. You need OpenCV for this. Note that I had to swap ncols and nrows to make it work; I have not figured out why.
import numpy as np
import cv2
import math
def blockshaped(arr, r_nbrs, c_nbrs, interp=cv2.INTER_LINEAR):
    """
    arr      a 2D array, typically an image
    r_nbrs   number of rows
    c_nbrs   number of cols
    """
    arr_h, arr_w = arr.shape
    size_w = int( math.floor(arr_w // c_nbrs) * c_nbrs )
    size_h = int( math.floor(arr_h // r_nbrs) * r_nbrs )
    if size_w != arr_w or size_h != arr_h:
        arr = cv2.resize(arr, (size_w, size_h), interpolation=interp)
    nrows = int(size_w // r_nbrs)
    ncols = int(size_h // c_nbrs)
    return (arr.reshape(r_nbrs, ncols, -1, nrows)
               .swapaxes(1,2)
               .reshape(-1, ncols, nrows))
a = np.random.randint(1, 9, size=(9,9))
out = [np.hsplit(x, 3) for x in np.vsplit(a,3)]
print(a)
print(out)
yields
[[7 6 2 4 4 2 5 2 3]
[2 3 7 6 8 8 2 6 2]
[4 1 3 1 3 8 1 3 7]
[6 1 1 5 7 2 1 5 8]
[8 8 7 6 6 1 8 8 4]
[6 1 8 2 1 4 5 1 8]
[7 3 4 2 5 6 1 2 7]
[4 6 7 5 8 2 8 2 8]
[6 6 5 5 6 1 2 6 4]]
[[array([[7, 6, 2],
[2, 3, 7],
[4, 1, 3]]), array([[4, 4, 2],
[6, 8, 8],
[1, 3, 8]]), array([[5, 2, 3],
[2, 6, 2],
[1, 3, 7]])], [array([[6, 1, 1],
[8, 8, 7],
[6, 1, 8]]), array([[5, 7, 2],
[6, 6, 1],
[2, 1, 4]]), array([[1, 5, 8],
[8, 8, 4],
[5, 1, 8]])], [array([[7, 3, 4],
[4, 6, 7],
[6, 6, 5]]), array([[2, 5, 6],
[5, 8, 2],
[5, 6, 1]]), array([[1, 2, 7],
[8, 2, 8],
[2, 6, 4]])]]
Here is my solution. Note that this code doesn't actually create copies of the original array, so it works well with big data. Moreover, it doesn't crash if the array cannot be divided evenly (you can easily add a check for that by removing ceil and checking whether v_slices and h_slices come out as whole numbers).
import numpy as np
from math import ceil
a = np.arange(9).reshape(3, 3)
p, q = 2, 2
height, width = a.shape       # a.shape is (rows, columns)
v_slices = ceil(height / p)   # number of block rows
h_slices = ceil(width / q)    # number of block columns
for v in range(v_slices):
    for h in range(h_slices):
        block = a[v * p : v * p + p, h * q : h * q + q]
        # do something with a block
This code changes (or, more precisely, gives you direct access to part of an array) this:
[[0 1 2]
[3 4 5]
[6 7 8]]
Into this:
[[0 1]
[3 4]]
[[2]
[5]]
[[6 7]]
[[8]]
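The "direct access" point can be checked with a short sketch. It relies only on standard NumPy behaviour (basic slicing returns a view, `.copy()` returns an independent array); the names `block` and `independent` are illustrative.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

block = a[0:2, 0:2]                  # basic slicing returns a view
print(np.shares_memory(block, a))    # True

block[0, 0] = 100                    # writes through to a
print(a[0, 0])                       # 100

independent = a[0:2, 0:2].copy()     # an actual copy
independent[0, 0] = -1               # a is left unchanged
print(a[0, 0])                       # 100
```

So a block obtained by slicing costs no extra memory, but it must be copied before it is modified if the original array should stay intact.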
If you need actual copies, Aenaon's code is what you are looking for.
If you are sure that the big array can be divided evenly, you can use the NumPy splitting tools.
To add to @Aenaon's answer and his blockfy function: if you are working with color images (3D arrays), here is my pipeline to create 224 x 224 crops for 3-channel input.
then extended above to
import os
import numpy as np
from skimage import io   # assumed source of io.imread
from PIL import Image    # assumed source of Image.fromarray
for file in os.listdir(path_to_crop):                    ### list files in your folder
    img = io.imread(path_to_crop + file, as_gray=False)  ### open image
    r = blockfy(img[:,:,0],224,224)  ### crop blocks of 224 x 224 for red channel
    g = blockfy(img[:,:,1],224,224)  ### crop blocks of 224 x 224 for green channel
    b = blockfy(img[:,:,2],224,224)  ### crop blocks of 224 x 224 for blue channel
    for x in range(0,len(r)):
        img = np.array((r[x],g[x],b[x]))      ### combine each channel into one patch by patch
        img = img.astype(np.uint8)            ### cast back to proper integers
        img_swap = img.swapaxes(0, 2)         ### need to swap axes due to the way things were processed
        img_swap_2 = img_swap.swapaxes(0, 1)  ### do it again
        Image.fromarray(img_swap_2).save(path_save_crop+str(x)+"bounding" + file,
                                         format = 'jpeg',
                                         subsampling=0,
                                         quality=100)  ### save patch with new name etc

Fast way to take average of every N rows in a .npy array

I have a very large masked NumPy array (originalArray) with many rows and two columns. I want to take the average of every two rows in originalArray and build a newArray in which each row is the average of two rows in originalArray (so newArray has half as many rows as originalArray). This should be a simple thing to do, but the script below is EXTREMELY slow. Any advice from the community would be greatly appreciated.
newList = []
for i in range(0, originalArray.shape[0], 2):
    r = originalArray[i:i+2,:].mean(axis=0)
    newList.append(r)
newArray = np.asarray(newList)
There must be a more elegant way of doing this. Many thanks!
The mean of two values a and b is 0.5*(a+b)
Therefore you can do it like this:
newArray = 0.5*(originalArray[0::2] + originalArray[1::2])
This sums each pair of consecutive rows and then multiplies every element by 0.5. It requires an even number of rows, so that both slices have the same length.
Since in the title you are asking for avg over N rows, here is a more general solution:
def groupedAvg(myArray, N=2):
    result = np.cumsum(myArray, 0)[N-1::N]/float(N)
    result[1:] = result[1:] - result[:-1]
    return result
The general form of the average over n elements is sum([x1,x2,...,xn])/n.
The sum of elements m to m+n in vector v is the same as subtracting the m-1th element from the m+nth element of cumsum(v). Unless m is 0, in that case you don't subtract anything (result[0]).
That is what we take advantage of here. Also since everything is linear, it is not important where we divide by N, so we do it right at the beginning, but that is just a matter of taste.
If the last group has fewer than N elements, it will be ignored completely.
If you don't want to ignore it, you have to treat the last group specially:
def avg(myArray, N=2):
    cum = np.cumsum(myArray,0)
    result = cum[N-1::N]/float(N)
    result[1:] = result[1:] - result[:-1]
    remainder = myArray.shape[0] % N
    if remainder != 0:
        if remainder < myArray.shape[0]:
            lastAvg = (cum[-1]-cum[-1-remainder])/float(remainder)
        else:
            lastAvg = cum[-1]/float(remainder)
        result = np.vstack([result, lastAvg])
    return result
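As a quick sanity check of the cumsum trick described above, here is a minimal sketch. The name grouped_avg, the sample array, and the reshape-based reference are illustrative assumptions, not part of the answer itself.

```python
import numpy as np

def grouped_avg(arr, n=2):
    """Average every n consecutive rows; a trailing partial group is dropped."""
    # Cumulative sums sampled at the last row of each complete group, scaled by 1/n.
    ends = np.cumsum(arr, axis=0)[n - 1::n] / float(n)
    # Subtracting the previous group's value leaves the mean of each group.
    ends[1:] = ends[1:] - ends[:-1]
    return ends

# Hypothetical sample data: 5 rows, so the last row has no partner for n=2.
a = np.arange(10, dtype=float).reshape(5, 2)
result = grouped_avg(a, 2)

# Reference: keep the first 4 rows, view them as (groups, n, cols), average over n.
reference = a[:4].reshape(2, 2, 2).mean(axis=1)
print(np.allclose(result, reference))
```

The comparison should agree whenever the trimmed row count is a multiple of n; the fifth row is simply ignored, matching the behaviour described for incomplete groups.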
Your problem (average of every two rows with two columns):
>>> a = np.reshape(np.arange(12),(6,2))
>>> a
array([[ 0, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7],
[ 8, 9],
[10, 11]])
>>> a.transpose().reshape(-1,2).mean(1).reshape(2,-1).transpose()
array([[ 1., 2.],
[ 5., 6.],
[ 9., 10.]])
Other dimensions (average of every four rows with three columns):
>>> a = np.reshape(np.arange(24),(8,3))
>>> a
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23]])
>>> a.transpose().reshape(-1,4).mean(1).reshape(3,-1).transpose()
array([[ 4.5, 5.5, 6.5],
[ 16.5, 17.5, 18.5]])
General formula for taking the average of r rows for a 2D array a with c columns:
a.transpose().reshape(-1,r).mean(1).reshape(c,-1).transpose()
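The general formula can be wrapped in a small helper, sketched below. The function name average_rows is an assumption made for illustration, and the helper only works when the number of rows is a multiple of r. The second computation is an alternative formulation (not from the answer above) included only to cross-check the result.

```python
import numpy as np

def average_rows(a, r):
    """Average every r rows of a 2D array whose row count is a multiple of r."""
    c = a.shape[1]  # number of columns
    return a.transpose().reshape(-1, r).mean(1).reshape(c, -1).transpose()

a = np.arange(24).reshape(8, 3)
via_transpose = average_rows(a, 4)

# Cross-check: group rows into blocks of shape (r, c) and average within each block.
via_reshape = a.reshape(-1, 4, a.shape[1]).mean(axis=1)
print(np.allclose(via_transpose, via_reshape))
```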
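import numpy as np
def av(array):
    # Group consecutive pairs of rows along a new axis, then average each pair.
    # The row count must be even; integer division keeps the shape an int.
    return array.reshape(array.shape[0] // 2, 2, array.shape[1]).mean(axis=1)
a = np.array([[1,1],[2,2],[3,3],[4,4]])
print(av(a))
# [[1.5 1.5]
#  [3.5 3.5]]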

How to make numpy.cumsum start after the first value

I have:
import numpy as np
position = np.array([4, 4.34, 4.69, 5.02, 5.3, 5.7, ..., 4])
x = (B/position**2)*dt
A = np.cumsum(x)
assert A[0] == 0 # I want this to be true.
Where B and dt are scalar constants. This is for a numerical integration problem with initial condition of A[0] = 0. Is there a way to set A[0] = 0 and then do a cumsum for everything else?
I don't understand what exactly your problem is, but here are some things you can do to have A[0] = 0.
You can create A to be longer by one index to have the zero as the first entry:
# initialize example data
import numpy as np
B = 1
dt = 1
position = np.array([4, 4.34, 4.69, 5.02, 5.3, 5.7])
# do calculation
A = np.zeros(len(position) + 1)
A[1:] = np.cumsum((B/position**2)*dt)
Result:
A = [ 0. 0.0625 0.11559096 0.16105356 0.20073547 0.23633533 0.26711403]
len(A) == len(position) + 1
Alternatively, you can adjust the calculation to subtract the first entry from the result:
# initialize example data
import numpy as np
B = 1
dt = 1
position = np.array([4, 4.34, 4.69, 5.02, 5.3, 5.7])
# do calculation
A = np.cumsum((B/position**2)*dt)
A = A - A[0]
Result:
[ 0. 0.05309096 0.09855356 0.13823547 0.17383533 0.20461403]
len(A) == len(position)
As you see, the results have different lengths. Is one of them what you expect?
1D cumsum
A wrapper around np.cumsum that sets first element to 0:
def cumsum(pmf):
    cdf = np.empty(len(pmf) + 1, dtype=pmf.dtype)
    cdf[0] = 0
    np.cumsum(pmf, out=cdf[1:])
    return cdf
Example usage:
>>> np.arange(1, 11)
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> cumsum(np.arange(1, 11))
array([ 0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55])
N-D cumsum
A wrapper around np.cumsum that sets first element to 0, and works with N-D arrays:
def cumsum(pmf, axis=None, dtype=None):
    if axis is None:
        pmf = pmf.reshape(-1)
        axis = 0
    if dtype is None:
        dtype = pmf.dtype
    idx = [slice(None)] * pmf.ndim
    # Create array with extra element along cumsummed axis.
    shape = list(pmf.shape)
    shape[axis] += 1
    cdf = np.empty(shape, dtype)
    # Set first element to 0.
    idx[axis] = 0
    cdf[tuple(idx)] = 0
    # Perform cumsum on remaining elements.
    idx[axis] = slice(1, None)
    np.cumsum(pmf, axis=axis, dtype=dtype, out=cdf[tuple(idx)])
    return cdf
Example usage:
>>> np.arange(1, 11).reshape(2, 5)
array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10]])
>>> cumsum(np.arange(1, 11).reshape(2, 5), axis=-1)
array([[ 0, 1, 3, 6, 10, 15],
[ 0, 6, 13, 21, 30, 40]])
I totally understand your pain; I wonder why NumPy doesn't allow this with np.cumsum. Anyway, though I'm really late and there's already another good answer, I prefer this one a bit more:
np.cumsum(np.pad(array, (1, 0), "constant"))
where array in your case is (B/position**2)*dt. You can change the order of np.pad and np.cumsum as well. I'm just adding a zero to the start of the array and calling np.cumsum.
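To illustrate that the order of the two calls does not matter, here is a small sketch using the example values for B, dt, and position from the earlier answer (those values are sample data, not the asker's real inputs). The variable names are chosen for illustration.

```python
import numpy as np

B = 1
dt = 1
position = np.array([4, 4.34, 4.69, 5.02, 5.3, 5.7])
x = (B / position**2) * dt

# Prepend one zero, then accumulate.
pad_then_sum = np.cumsum(np.pad(x, (1, 0), "constant"))
# Accumulate, then prepend one zero.
sum_then_pad = np.pad(np.cumsum(x), (1, 0), "constant")

print(np.allclose(pad_then_sum, sum_then_pad))
```

Either way the result is one element longer than position, with a zero in the first slot.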
You can use np.roll on the cumsum result (shift right by 1) and then set the first entry to zero.
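A minimal sketch of the roll approach, again assuming the example values for B, dt, and position used in the earlier answer. Note that, unlike the padding approach, the result keeps the same length as position, so the final cumulative total is discarded.

```python
import numpy as np

B = 1
dt = 1
position = np.array([4, 4.34, 4.69, 5.02, 5.3, 5.7])
x = (B / position**2) * dt

A = np.roll(np.cumsum(x), 1)  # shift every cumulative sum one slot to the right
A[0] = 0                      # overwrite the last total, which wrapped around to the front

print(A[0] == 0, len(A) == len(position))
```

Whether dropping the last total is acceptable depends on how the integration grid is defined, so this is one option rather than a drop-in replacement.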
