How to get the blocks back from a scipy sparse block matrix? - python

After some vectorized calculations, I get a sparse block matrix with all my results stacked in blocks of same size.
>>> A = [[1, 1],
... [1, 1]]
>>> B = [[2, 2],
... [2, 2]]
>>> C = [[3, 3],
... [3, 3]]
>>> results = scipy.sparse.block_diag((A, B, C))
>>> print(results.toarray())
[[1 1 0 0 0 0]
[1 1 0 0 0 0]
[0 0 2 2 0 0]
[0 0 2 2 0 0]
[0 0 0 0 3 3]
[0 0 0 0 3 3]]
How can I get back these arrays A, B, C in an efficient way, if necessary by providing their shape (2,2)?

In [177]: A = [[1, 1],
     ...:      [1, 1]]
     ...: B = [[2, 2],
     ...:      [2, 2]]
     ...: C = [[3, 3],
     ...:      [3, 3]]
     ...: results = sparse.block_diag([A, B, C])
     ...:
In [178]: results
Out[178]:
<6x6 sparse matrix of type '<class 'numpy.int64'>'
with 12 stored elements in COOrdinate format>
block_diag does not preserve the inputs; rather, it creates a coo format matrix that represents the whole matrix, not the pieces.
In [194]: results.data
Out[194]: array([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3], dtype=int64)
In [195]: results.row
Out[195]: array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=int32)
In [196]: results.col
Out[196]: array([0, 1, 0, 1, 2, 3, 2, 3, 4, 5, 4, 5], dtype=int32)
In [179]: results.A
Out[179]:
array([[1, 1, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0],
[0, 0, 2, 2, 0, 0],
[0, 0, 2, 2, 0, 0],
[0, 0, 0, 0, 3, 3],
[0, 0, 0, 0, 3, 3]], dtype=int64)
block_diag passes the arrays to sparse.bmat. That in turn makes a coo matrix from each, and then merges the coo attributes into 3 arrays, which are the inputs to the global sparse matrix.
There is another sparse format, bsr, that may preserve the blocks (until conversion to csr for calculation), but I'll have to experiment to see if that's the case.
Let's make a bsr from that results coo:
In [186]: bresults = sparse.bsr_matrix(results)
In [187]: bresults
Out[187]:
<6x6 sparse matrix of type '<class 'numpy.int64'>'
with 12 stored elements (blocksize = 2x2) in Block Sparse Row format>
In [188]: bresults.blocksize
Out[188]: (2, 2)
In [189]: bresults.data
Out[189]:
array([[[1, 1],
[1, 1]],
[[2, 2],
[2, 2]],
[[3, 3],
[3, 3]]], dtype=int64)
So it deduces that there are blocks, just as you desired.
In [191]: bresults.indices
Out[191]: array([0, 1, 2], dtype=int32)
In [192]: bresults.indptr
Out[192]: array([0, 1, 2, 3], dtype=int32)
So it's a csr like storage, but with the data arranged in blocks.
It may be possible to construct this from your A,B,C without the block_diag intermediary, but I'd have to look at the docs more.
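As a hedged sketch of that idea (the variable names are illustrative, not from the question), the blocks can be passed to bsr_matrix directly as a (data, indices, indptr) triple, and the same format can be used to read the blocks back out of a block_diag result. This assumes equally sized blocks with one block per block row; a block that is entirely zero would not be stored, so it would be missing from .data.

```python
import numpy as np
from scipy import sparse

A = np.array([[1, 1], [1, 1]])
B = np.array([[2, 2], [2, 2]])
C = np.array([[3, 3], [3, 3]])

blocks = np.stack([A, B, C])            # shape (n_blocks, rows, cols)
n, r, c = blocks.shape

# Block-diagonal layout: block row i holds exactly one block, in block column i.
indices = np.arange(n)                  # block-column index of each stored block
indptr = np.arange(n + 1)               # block row i owns data[indptr[i]:indptr[i+1]]
direct = sparse.bsr_matrix((blocks, indices, indptr), shape=(n * r, n * c))

# Going back from a block_diag result: convert to BSR with an explicit blocksize.
results = sparse.block_diag([A, B, C])
recovered = sparse.bsr_matrix(results, blocksize=(r, c)).data
```

Passing blocksize explicitly avoids relying on the block size that bsr_matrix would otherwise estimate from the sparsity pattern.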

That's a funny little problem.
I don't think there is a function that solves this in one line, but there is a way to do it programmatically.
Check what res.data prints out; I use it here.
This works when all the blocks have the same shape.
from scipy.sparse import block_diag
a = [[1, 2, 4],
     [3, 4, 4]]
b = [[2, 2, 1],
     [2, 2, 1]]
c = [[3, 3, 6],
     [3, 3, 6]]
res = block_diag((a, b, c))
def goBack(res, shape):
    # number of stored values per block (assumes the blocks contain no zeros)
    s = shape[0]*shape[1]
    num = len(res.data)//s
    for i in range(num):
        # each consecutive run of s stored values is one block
        mat = res.data[i*s:(i+1)*s].reshape(shape)
        print(mat)
goBack(res, [2,3])
Output:
[[1 2 4]
[3 4 4]]
[[2 2 1]
[2 2 1]]
[[3 3 6]
[3 3 6]]
Edit:
Okay, this does not work when any element of the provided matrices is zero, because zeros are not stored and so do not appear in res.data.
Also, forget it, the link provided by cleb should help you.
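One possible way around the zero problem, sketched here under the assumption that all blocks share one known shape and sit on the block diagonal, is to slice the matrix by position instead of relying on the order of res.data. The helper name diagonal_blocks is hypothetical.

```python
import numpy as np
from scipy.sparse import block_diag

a = [[1, 0, 4],
     [3, 4, 0]]
b = [[0, 0, 0],
     [2, 2, 1]]
c = [[3, 3, 6],
     [0, 3, 6]]
res = block_diag((a, b, c))

def diagonal_blocks(mat, shape):
    """Hypothetical helper: return the dense diagonal blocks of `mat`."""
    nrows, ncols = shape
    csr = mat.tocsr()                   # COO matrices do not support slicing
    num = csr.shape[0] // nrows
    return [csr[i * nrows:(i + 1) * nrows, i * ncols:(i + 1) * ncols].toarray()
            for i in range(num)]

blocks = diagonal_blocks(res, (2, 3))
for block in blocks:
    print(block)
```

Because the slices are taken by row and column position, zeros inside a block come back as zeros rather than shifting the remaining values.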

Related

How to generate values from a diagonal to fill matrix

I have the following diagonal matrix
a = array([[1, 0, 0, 0],
[0, 2, 0, 0],
[0, 0, 3, 0],
[0, 0, 0, 4]])
And the desired outcome is the following
array([[1, 3, 4, 5],
[3, 2, 5, 6],
[4, 5, 3, 7],
[5, 6, 7, 4]])
Each element is the sum of the corresponding diagonals.
Thanks a lot
Try:
>>> np.diag(a) + np.diag(a)[:, None] - a
array([[1, 3, 4, 5],
[3, 2, 5, 6],
[4, 5, 3, 7],
[5, 6, 7, 4]])
Addendum
What if a is a DataFrame?
Then: np.diag(a) + np.diag(a)[:, None] - a is also a DataFrame (with same index and columns as a).
What if a is a numpy array, but I want a DataFrame result?
Then use: pd.DataFrame(...) instead.
You can use:
# get diagonal
diag = np.diag(a)
# outer sum
out = diag+diag[:,None]
# or
# out = np.add.outer(diag, diag)
# reset diagonal
np.fill_diagonal(out, diag)
print(out)
output:
[[1 3 4 5]
[3 2 5 6]
[4 5 3 7]
[5 6 7 4]]
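As a small check of what both answers compute, the sketch below spells out the rule element by element (off-diagonal entry (i, j) is d[i] + d[j], diagonal entry is d[i]) and compares it with the two vectorized forms. The names expected, via_subtract and via_fill are illustrative only; the subtraction form relies on a being zero off the diagonal.

```python
import numpy as np

a = np.diag([1, 2, 3, 4])
d = np.diag(a)                          # the diagonal as a 1-D array
n = len(d)

# Explicit definition of the desired result.
expected = np.empty((n, n), dtype=a.dtype)
for i in range(n):
    for j in range(n):
        expected[i, j] = d[i] if i == j else d[i] + d[j]

# Broadcasting sum, then subtract a to undo the doubled diagonal.
via_subtract = d + d[:, None] - a

# Outer sum, then overwrite the diagonal.
via_fill = np.add.outer(d, d)
np.fill_diagonal(via_fill, d)

print(expected)
```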

multiplying large 2D numpy arrays

I would like to multiply the following matrices (using numpy) in the most efficient way.
This is the code for the matrices:
a = np.array([[1, 5], [2, 6], [3, 7], [4, 8]])
m = np.array([[1, 0, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [0, 1, 1, 1]])
Here are the matrices printed out:
a:
[[1 5]
[2 6]
[3 7]
[4 8]]
m:
[[1 0 0 1]
[1 0 1 0]
[0 1 0 1]
[0 1 1 1]]
I want to multiply a by (the first column of matrix m), like this
   a       m[:,0]      x0
[[1 5]      [[1]     [[1 5]
 [2 6]   *   [1]   =  [2 6]
 [3 7]       [0]      [0 0]
 [4 8]]      [0]]     [0 0]]
And then I want to multiply a by (the second column of matrix m), like this
a * m[:,1] = x1
And then 3rd and 4th column
a * m[:,2] = x2
a * m[:,3] = x3
And finally, I want to put the resulting matrices x0,x1,x2,x3 all in one matrix.
X = [x0, x1, x2, x3]
The size X in this example is 4 x 8.
The final result in this example is:
X =
[[1 5 0 0 0 0 1 5]
[2 6 0 0 2 6 0 0]
[0 0 3 7 0 0 3 7]
[0 0 4 8 4 8 4 8]]
I would like to know how to do this with built-in numpy functions and generators instead of two for loops, if that is possible.
This is just an example. In reality the matrixes have large dimensions and it is important that the multiplications are done as fast as possible.
Thank you
You may achieve it with broadcast and reshape
arr_out = (a[:,None] * m[...,None]).reshape(4,8)
Out[176]:
array([[1, 5, 0, 0, 0, 0, 1, 5],
[2, 6, 0, 0, 2, 6, 0, 0],
[0, 0, 3, 7, 0, 0, 3, 7],
[0, 0, 4, 8, 4, 8, 4, 8]])
You could transpose and expand the dimensions of m, to get the wanted result:
m.T[...,None] * a
array([[[1, 5],
[2, 6],
[0, 0],
[0, 0]],
[[0, 0],
[0, 0],
[3, 7],
[4, 8]],
...
If you want to stack the arrays horizontally to end up with a 2D array, use np.hstack:
np.hstack(m.T[...,None] * a)
array([[1, 5, 0, 0, 0, 0, 1, 5],
[2, 6, 0, 0, 2, 6, 0, 0],
[0, 0, 3, 7, 0, 0, 3, 7],
[0, 0, 4, 8, 4, 8, 4, 8]])
Or reshaping as:
(a[:,None] * m[...,None]).reshape(m.shape[0], -1)
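To see that these expressions agree with the column-by-column description in the question, here is a short sketch that builds the result the slow way and compares it with both vectorized forms. The names looped, broadcast and stacked are illustrative only.

```python
import numpy as np

a = np.array([[1, 5], [2, 6], [3, 7], [4, 8]])
m = np.array([[1, 0, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [0, 1, 1, 1]])

# Reference: scale the rows of a by each column of m, then place the pieces side by side.
looped = np.hstack([a * m[:, [j]] for j in range(m.shape[1])])

# Broadcasting: (4, 1, 2) * (4, 4, 1) -> (4, 4, 2), then flatten the last two axes.
broadcast = (a[:, None] * m[..., None]).reshape(m.shape[0], -1)

# Transposed variant: (4, 4, 1) * (4, 2) -> (4, 4, 2), stacked along axis 1.
stacked = np.hstack(m.T[..., None] * a)

print(broadcast)
```

The reshape variant avoids the Python-level list and the hstack call, which may matter for large inputs; timing on the real data would be needed to confirm that.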
This is the answer that I was looking for. Thank you Yatu and hpaulj.
X = m.T[...,None] * a
for i in range(4):
    reshaped = np.hstack(X[i,:,:])
reshaped_simpler = np.hstack(X)
print(reshaped_simpler)
I got the rest of the answer from the following link:
reshape numpy 3D array to 2D
I rearranged the for loop because I got a warning that the use of generators is going to be deprecated in future versions of NumPy.

How to re-index 3d numpy array

Let's say we have a 3D array like:
array = np.arange(8).reshape(2,2, 2)
new_array = np.zeros((2, 2, 2))
and let's assume we have some new random x,y,z indices for our array
x,y,z = np.meshgrid(array, array, array)
What is the fastest way to re-index our array?
A simple solution given here:
for x in range(0, 3):
    for y in range(0, 3):
        for z in range(0, 3):
            new_x = x_coord[x,y,z]
            new_y = y_coord[x,y,z]
            new_z = z_coord[x,y,z]
            new_array[x,y,z] = array[new_x, new_y, new_z]
Is there a one-liner for this that I am not aware of?
EDIT
Yes, there is... very easy:
vol = np.arange(8).reshape(2,2, 2)
arr = np.arange(2)
x,y,z = np.meshgrid(arr, arr, arr)
print(vol)
print(vol[y, x, z]) ### ---> You have to swap the axes here tho. Does anyone know why?
[[[0 1]
[2 3]]
[[4 5]
[6 7]]]
[[[0 1]
[2 3]]
[[4 5]
[6 7]]]
Also, it is very slow. Any ideas how to improve the performance?
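On the axis swap: np.meshgrid defaults to indexing='xy' (Cartesian), which orders the first two output dimensions as (y, x), so the first two index arrays come out transposed relative to matrix-style indexing. A minimal sketch, reusing the names from the snippet above, shows that passing indexing='ij' removes the need to swap:

```python
import numpy as np

vol = np.arange(8).reshape(2, 2, 2)
arr = np.arange(2)

# Default Cartesian indexing: x varies along axis 1, y along axis 0.
x, y, z = np.meshgrid(arr, arr, arr)

# Matrix indexing: the i-th output varies along axis i.
xi, yi, zi = np.meshgrid(arr, arr, arr, indexing='ij')

swapped = vol[x, y, z]          # first two axes exchanged
identity_xy = vol[y, x, z]      # the manual swap from the question
identity_ij = vol[xi, yi, zi]   # no swap needed with indexing='ij'

print(identity_ij)
```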
Setup:
In [54]: arr = np.arange(9).reshape(3,3)
In [55]: x = np.random.randint(0,3,(3,3))
In [56]: y = np.random.randint(0,3,(3,3))
In [57]: x
Out[57]:
array([[2, 0, 1],
[0, 2, 1],
[0, 0, 1]])
In [58]: y
Out[58]:
array([[0, 0, 0],
[0, 1, 1],
[0, 1, 0]])
The simplest application of these indexing arrays:
In [59]: arr[x,y]
Out[59]:
array([[6, 0, 3],
[0, 7, 4],
[0, 1, 3]])
The iterative equivalent:
In [60]: out = np.empty_like(arr)
In [61]: for i in range(3):
    ...:     for j in range(3):
    ...:         out[i,j] = arr[x[i,j], y[i,j]]
...:
In [62]: out
Out[62]:
array([[6, 0, 3],
[0, 7, 4],
[0, 1, 3]])
Your code isn't the same, because it is modifying the source array as it iterates:
In [63]: arr1 = arr.copy()
In [64]: for i in range(3):
    ...:     for j in range(3):
    ...:         arr1[i,j] = arr1[x[i,j], y[i,j]]
...:
In [65]: arr1
Out[65]:
array([[6, 6, 3],
[6, 7, 7],
[6, 6, 6]])
There isn't a simple equivalent.
You can index with arr[x_coord,y_coord,z_coord] as long as the indexing arrays broadcast together. When they all have the same shape, that is trivial.
In [68]: x1 = np.random.randint(0,3,(2,4))
In [69]: x1
Out[69]:
array([[2, 0, 2, 0],
[0, 0, 0, 2]])
In [70]: arr[x1,x1]
Out[70]:
array([[8, 0, 8, 0],
[0, 0, 0, 8]])
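The same idea carries over to the 3D case from the question. The following is a sketch with randomly generated index arrays (x_coord, y_coord and z_coord are stand-ins for the question's undefined coordinates, and the seed is arbitrary), comparing the triple loop with a single indexing expression:

```python
import numpy as np

rng = np.random.default_rng(0)
array = np.arange(27).reshape(3, 3, 3)

# Stand-in index arrays, one per axis, all with the shape of the result.
x_coord = rng.integers(0, 3, size=(3, 3, 3))
y_coord = rng.integers(0, 3, size=(3, 3, 3))
z_coord = rng.integers(0, 3, size=(3, 3, 3))

# Loop version: read from the untouched source, write into a separate output.
looped = np.empty_like(array)
for i in range(3):
    for j in range(3):
        for k in range(3):
            looped[i, j, k] = array[x_coord[i, j, k], y_coord[i, j, k], z_coord[i, j, k]]

# Integer-array indexing does the same lookup in one step.
new_array = array[x_coord, y_coord, z_coord]
```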
A simpler way of picking random values from an array is to create random row and column selectors, and use ix_ to create arrays that broadcast together:
In [71]: x1 = np.random.randint(0,3,(3))
In [72]: y1 = np.random.randint(0,3,(3))
In [75]: np.ix_(x1,y1)
Out[75]:
(array([[2],
[1],
[1]]), array([[2, 2, 1]]))
In [76]: arr[np.ix_(x1,y1)]
Out[76]:
array([[8, 8, 7],
[5, 5, 4],
[5, 5, 4]])
Almost sounds like you just want to shuffle the values of the array, like:
In [95]: arr
Out[95]:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
In [96]: np.random.shuffle(arr.ravel())
In [97]: arr
Out[97]:
array([[0, 1, 2],
[7, 4, 3],
[6, 5, 8]])

element-wise count along axis of values in numpy array

How can I get an element-wise count of each element's number of occurrences in a numpy array, along a given axis? By "element-wise," I mean each value of the array should be converted to the number of times it appears.
Simple 2D input:
[[1, 1, 1],
[2, 2, 2],
[3, 4, 5]]
Should output:
[[3, 3, 3],
[3, 3, 3],
[1, 1, 1]]
The solution also needs to work relative to a given axis. For example, if my input array a has shape (4, 2, 3, 3), which I think of as "a 4x2 matrix of 3x3 matrices," running solution(a) should spit out a (4, 2, 3, 3) solution of the form above, where each 3x3 "submatrix" contains counts of the corresponding elements relative to that submatrix alone, rather than the entire numpy array at large.
More complex example: suppose I take the example input above a and call skimage.util.shape.view_as_windows(a, (2, 2)). This gives me array b of shape (2, 2, 2, 2):
[[[[1 1]
[2 2]]
[[1 1]
[2 2]]]
[[[2 2]
[3 4]]
[[2 2]
[4 5]]]]
Then solution(b) should output:
[[[[2 2]
[2 2]]
[[2 2]
[2 2]]]
[[[2 2]
[1 1]]
[[2 2]
[1 1]]]]
So even though the value 1 occurs 3 times in a and 4 times in b, it only occurs twice in each 2x2 window.
Starting off approach
We can use np.unique to get the counts of occurrences and also tag each element from 0 onwards, letting us index into those counts with the tags for the desired output, like so -
In [43]: a
Out[43]:
array([[1, 1, 1],
[2, 2, 2],
[3, 4, 5]])
In [44]: _,ids,c = np.unique(a, return_counts=True, return_inverse=True)
In [45]: c[ids].reshape(a.shape)
Out[45]:
array([[3, 3, 3],
[3, 3, 3],
[1, 1, 1]])
For non-negative integers in the input array, we can also use np.bincount -
In [73]: c = np.bincount(a.ravel())
In [74]: c[a]
Out[74]:
array([[3, 3, 3],
[3, 3, 3],
[1, 1, 1]])
For arrays that contain negative integers, simply offset by the minimum of the array.
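A minimal sketch of that offset, using a made-up input with negative values (the names shifted, counts and result are illustrative):

```python
import numpy as np

a = np.array([[-2, -2, 0],
              [ 3,  0, 0],
              [-2,  3, 7]])

# Shift so the smallest value maps to bin 0; np.bincount rejects negative input.
shifted = a - a.min()
counts = np.bincount(shifted.ravel())

# Look up each element's count through its shifted value.
result = counts[shifted]
print(result)
```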
Extending to generic n-dims
Let's use bincount for this -
In [107]: ar
Out[107]:
array([[[1, 1, 1],
[2, 2, 2],
[3, 4, 5]],
[[2, 3, 5],
[4, 3, 4],
[3, 1, 2]]])
In [104]: ar2D = ar.reshape(-1,ar.shape[-2]*ar.shape[-1])
# bincount2D_vectorized from https://stackoverflow.com/a/46256361/ #Divakar
In [105]: c = bincount2D_vectorized(ar2D)
In [106]: c[np.arange(ar2D.shape[0])[:,None], ar2D].reshape(ar.shape)
Out[106]:
array([[[3, 3, 3],
[3, 3, 3],
[1, 1, 1]],
[[2, 3, 1],
[2, 3, 2],
[3, 1, 2]]])
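The helper bincount2D_vectorized comes from the linked answer and is not reproduced here. As a sketch of a helper with the same intent (the name per_row_bincount is hypothetical, and non-negative integer input is assumed), each row can be shifted into its own range of bins so that a single np.bincount call counts every row separately:

```python
import numpy as np

def per_row_bincount(a2d):
    """Hypothetical helper: bincount of each row of a 2-D non-negative int array."""
    n_bins = a2d.max() + 1
    # Row r is shifted by r * n_bins so that rows never share a bin.
    offset = a2d + np.arange(a2d.shape[0])[:, None] * n_bins
    flat = np.bincount(offset.ravel(), minlength=a2d.shape[0] * n_bins)
    return flat.reshape(-1, n_bins)

ar = np.array([[[1, 1, 1],
                [2, 2, 2],
                [3, 4, 5]],
               [[2, 3, 5],
                [4, 3, 4],
                [3, 1, 2]]])

# Flatten each trailing 3x3 block into one row, count per row, then index back.
ar2D = ar.reshape(-1, ar.shape[-2] * ar.shape[-1])
c = per_row_bincount(ar2D)
out = c[np.arange(ar2D.shape[0])[:, None], ar2D].reshape(ar.shape)
print(out)
```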

Is there a way to loop through the return value of np.where?

Is there a way to loop-through this tuple(?) where the left array are positions in an array and the right array is the value I would like to insert into the given positions:
(array([ 0, 4, 6, ..., 9992, 9996, 9997]), array([3, 3, 3, ..., 3, 3, 3]))
The output above is generated from the following piece of code:
np.where(h2 == h2[i,:].max())[1]
I would like the result to be like this:
array[0] = 3
array[4] = 3
...
array[9997] = 3
Just use a simple indexing:
indices, values = my_tuple
array[indices] = values
If you don't have the final array yet, you can create it with a suitable function such as np.zeros or np.ones, with a length of at least the maximum index plus one.
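A small sketch of that assignment, with short made-up index and value arrays standing in for the tuple from the question:

```python
import numpy as np

# Stand-ins for the (positions, values) pair shown in the question.
indices = np.array([0, 4, 6])
values = np.array([3, 3, 3])

# The target must be long enough to hold the largest index.
array = np.zeros(indices.max() + 1, dtype=values.dtype)

# One vectorized assignment replaces the element-by-element loop.
array[indices] = values
print(array)
```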
I think you want the transpose of the where tuple:
In [204]: x=np.arange(1,13).reshape(3,4)
In [205]: x
Out[205]:
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
In [206]: idx=np.where(x)
In [207]: idx
Out[207]:
(array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2], dtype=int32),
array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3], dtype=int32))
In [208]: ij=np.transpose(idx)
In [209]: ij
Out[209]:
array([[0, 0],
[0, 1],
[0, 2],
[0, 3],
[1, 0],
[1, 1],
[1, 2],
[1, 3],
[2, 0],
[2, 1],
[2, 2],
[2, 3]], dtype=int32)
In fact there's a function that does just that:
np.argwhere(x)
Iterating on ij, I can print:
In [213]: for i,j in ij:
     ...:     print('array[{}]={}'.format(i,j))
     ...:
array[0]=0
array[0]=1
array[0]=2
zip(*) is a list version of transpose:
for i,j in zip(*idx):
    print(i,j)
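As a quick sketch tying the three forms together (the name pairs is illustrative), the transposed where tuple, np.argwhere and zip(*idx) all yield the same (row, column) pairs:

```python
import numpy as np

x = np.arange(1, 13).reshape(3, 4)

idx = np.where(x)               # tuple of index arrays, one per axis
ij = np.transpose(idx)          # one (row, col) pair per nonzero element
pairs = [(int(i), int(j)) for i, j in zip(*idx)]

print(ij[:3])
print(pairs[:3])
```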
