Can someone tell me how to join two arrays with different numbers of columns, one sparse and one dense? I tried using hstack/vstack but keep getting a dimensionality error.
from scipy import sparse
from scipy.sparse import coo_matrix
m = coo_matrix(X_new)  # a (7395, 50000) sparse array
a = other  # a (7395, 20) dense array
new_tr = sparse.hstack((m, a))  # only `sparse` is imported, so call it via that name
Please post more context to make your problem reproducible. As posted, your code works for me:
import numpy as np
X_new = np.zeros((10,5), int)
X_new[(np.random.randint(0,10,5),np.random.randint(0,5,5))] = np.random.randint(0,10,5)
X_new
#array([[ 0, 0, 7, 0, 0],
# [ 0, 0, 0, 0, 0],
# [ 0, 0, 8, 3, 0],
# [ 0, 0, 0, 0, 0],
# [ 0, 0, 0, 0, 0],
# [ 0, 0, 0, 0, 0],
# [ 0, 0, 8, 0, 0],
# [ 1, 0, 0, 0, 0],
# [ 0, 0, 0, 0, 0],
# [ 0, 0, 0, 0, 0]])
m = coo_matrix(X_new)
m
#<10x5 sparse matrix of type '<type 'numpy.int64'>'
# with 5 stored elements in COOrdinate format>
a = np.matrix(np.random.randint(0,10,(10,2)))
a
#matrix([[2, 1],
# [5, 2],
# [4, 1],
# [1, 4],
# [5, 2],
# [7, 2],
# [6, 3],
# [8, 4],
# [5, 5],
# [7, 4]])
new_tr = sparse.hstack([m,a])
new_tr
#<10x7 sparse matrix of type '<type 'numpy.int64'>'
# with 25 stored elements in COOrdinate format>
new_tr.todense()
#matrix([[ 0, 0, 7, 0, 0, 2, 1],
# [ 0, 0, 0, 0, 0, 5, 2],
# [ 0, 0, 8, 3, 0, 4, 1],
# [ 0, 0, 0, 0, 0, 1, 4],
# [ 0, 0, 0, 0, 0, 5, 2],
# [ 0, 0, 0, 0, 0, 7, 2],
# [ 0, 0, 8, 0, 0, 6, 3],
# [ 1, 0, 0, 0, 0, 8, 4],
# [ 0, 0, 0, 0, 0, 5, 5],
# [ 0, 0, 0, 0, 0, 7, 4]])
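Since the posted code works on 2-D inputs with matching row counts, a likely (but unconfirmed) cause of the error is a dense block that is 1-D or has a different number of rows. The following is a minimal sketch on small stand-in data, not the original (7395, 50000) matrices; wrapping the dense block in csr_matrix before stacking is my own choice to make the shapes explicit, not something the answer above requires.

```python
import numpy as np
from scipy import sparse

# Stand-in data: a sparse (10, 5) block and a dense (10, 2) block.
dense_left = np.zeros((10, 5), dtype=int)
dense_left[[0, 2, 7], [2, 3, 0]] = [7, 3, 1]
m = sparse.coo_matrix(dense_left)
a = np.arange(20).reshape(10, 2)

# hstack needs the same number of rows in every block.
assert m.shape[0] == a.shape[0]

# Convert the dense block explicitly, then stack column-wise.
stacked = sparse.hstack([m, sparse.csr_matrix(a)], format="csr")
print(stacked.shape)

# A 1-D feature vector has to become a single column before stacking.
v = np.arange(10)
stacked_1d = sparse.hstack([m, sparse.csr_matrix(v.reshape(-1, 1))], format="csr")
print(stacked_1d.shape)
```

Checking `m.shape` and `a.shape` just before the call is usually the quickest way to see which block is off.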
When I append a list inside a for loop, each value printed inside the loop is what I expect,
but when I print the result outside the loop, its values have changed.
arr=[]
b=[1,2,3,4,5,6,7]
for i in range(0,len(b)):
    b[i]=0
    arr.append(b)
    print(arr[i])
Here the output is
[0, 2, 3, 4, 5, 6, 7]
[0, 0, 3, 4, 5, 6, 7]
[0, 0, 0, 4, 5, 6, 7]
[0, 0, 0, 0, 5, 6, 7]
[0, 0, 0, 0, 0, 6, 7]
[0, 0, 0, 0, 0, 0, 7]
[0, 0, 0, 0, 0, 0, 0]
And here
arr=[]
b=[1,2,3,4,5,6,7]
for i in range(0,len(b)):
    b[i]=0
    arr.append(b)
print(arr)
the output is
[[0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0]]
On each iteration, you are adding a reference to the same list b to your arr, which means that when you later set new values to zero, you are modifying all of the lists inside arr simultaneously. To avoid this, you can append a copy of b to arr instead by using list(b), i.e.:
arr = []
b = [1, 2, 3, 4, 5, 6, 7]
for i in range(len(b)):
    b[i] = 0
    arr.append(list(b))
print(arr)
This outputs:
[[0, 2, 3, 4, 5, 6, 7],
[0, 0, 3, 4, 5, 6, 7],
[0, 0, 0, 4, 5, 6, 7],
[0, 0, 0, 0, 5, 6, 7],
[0, 0, 0, 0, 0, 6, 7],
[0, 0, 0, 0, 0, 0, 7],
[0, 0, 0, 0, 0, 0, 0]]
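To make the aliasing visible, here is a small sketch that appends both the list itself and a copy on each iteration, then checks object identity. The names `aliases` and `snapshots` are illustrative only. It also shows that `list(b)` and `b.copy()` are shallow copies, so nested lists would need `copy.deepcopy`.

```python
import copy

b = [1, 2, 3]
aliases = []    # every entry is the same list object as b
snapshots = []  # every entry is an independent shallow copy

for i in range(len(b)):
    b[i] = 0
    aliases.append(b)
    snapshots.append(b.copy())

# All aliases point at one object, so they all show its final state.
print(all(item is b for item in aliases))
print(aliases)
print(snapshots)

# A shallow copy still shares inner lists; deepcopy does not.
nested = [[1, 2], [3, 4]]
shallow = list(nested)
deep = copy.deepcopy(nested)
nested[0][0] = 99
print(shallow[0][0], deep[0][0])
```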
I have several matrices, let's say [M1,M2,M3,M4]. Each matrix has a different shape. How do I compose these matrices into one big matrix diagonally like:
[[M1, 0, 0, 0]
[0, M2, 0, 0]
[0, 0, M3, 0]
[0, 0, 0, M4]]
Example:
M1 = [[1,2],[2,1]]
M2 = [[1,2]]
M3 = [[3]]
M4 = [[3,4,5],[4,5,6]]
To compose this big matrix:
[[1, 2, 0, 0, 0, 0, 0, 0]
[2, 1, 0, 0, 0, 0, 0, 0]
[0, 0, 1, 2, 0, 0, 0, 0]
[0, 0, 0, 0, 3, 0, 0, 0]
[0, 0, 0, 0, 0, 3, 4, 5]
[0, 0, 0, 0, 0, 4, 5, 6]]
Here is how to do it using SciPy:
from scipy.sparse import block_diag
block_diag((M1, M2, M3, M4))
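Note that scipy.sparse.block_diag returns a sparse matrix. As a sketch using the example matrices from the question, the result can be converted with .toarray(), or a dense array can be built directly with scipy.linalg.block_diag, which takes the blocks as separate arguments rather than as one sequence.

```python
import numpy as np
from scipy import linalg, sparse

M1 = [[1, 2], [2, 1]]
M2 = [[1, 2]]
M3 = [[3]]
M4 = [[3, 4, 5], [4, 5, 6]]

# Sparse result (COO format by default); convert only if a dense array is needed.
sparse_result = sparse.block_diag((M1, M2, M3, M4))
dense_from_sparse = sparse_result.toarray()

# Dense result in one step; blocks are passed as positional arguments.
dense_result = linalg.block_diag(M1, M2, M3, M4)

print(dense_result.shape)
print(dense_result)
```

The sparse version is the better fit when the blocks are large, since the off-diagonal zeros are never stored.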
Use PyTorch's torch.block_diag() (the inputs must be tensors):
>>> torch.block_diag(M1,M2,M3,M4)
tensor([[1, 2, 0, 0, 0, 0, 0, 0],
[2, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 2, 0, 0, 0, 0],
[0, 0, 0, 0, 3, 0, 0, 0],
[0, 0, 0, 0, 0, 3, 4, 5],
[0, 0, 0, 0, 0, 4, 5, 6]])
How can I sample some of the rows of a scipy sparse matrix and form a new scipy sparse matrix from these sampled rows?
For example, if I have a scipy sparse matrix A with 10 rows and I want to make a new scipy sparse matrix B from rows 1, 3, and 4 of A, how do I do that?
Left-multiply with an appropriate indicator matrix. The indicator matrix can be built using scipy.sparse.block_diag or directly, using csr format, as shown below.
>>> import numpy as np
>>> from scipy import sparse
>>>
# create example
>>> m, n = 10, 8
>>> subset = [1,3,4]
>>> A = sparse.csr_matrix(np.random.randint(-10, 5, (m, n)).clip(0, None))
>>> A.A
array([[3, 2, 4, 0, 0, 0, 2, 0],
[0, 0, 2, 0, 0, 0, 0, 0],
[4, 0, 0, 0, 0, 2, 0, 0],
[0, 0, 0, 0, 0, 0, 4, 0],
[3, 0, 0, 0, 1, 4, 0, 0],
[0, 0, 0, 0, 0, 0, 2, 0],
[0, 0, 0, 4, 0, 4, 4, 0],
[0, 2, 0, 0, 0, 3, 0, 0],
[4, 0, 3, 3, 0, 0, 0, 2],
[4, 0, 0, 0, 0, 2, 0, 1]], dtype=int64)
>>>
# build indicator matrix
# either using block_diag ...
>>> split_points = np.arange(len(subset)+1).repeat(np.diff(np.concatenate([[0], subset, [m-1]])))
>>> indicator = sparse.block_diag(np.split(np.ones(len(subset), int), split_points)).T
>>> indicator.A
array([[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0]], dtype=int64)
>>>
# ... or manually---this also works for non sorted non unique subset,
# and is therefore to be preferred over block_diag
>>> indicator = sparse.csr_matrix((np.ones(len(subset), int), subset, np.arange(len(subset)+1)), (len(subset), m))
>>> indicator.A
array([[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0]])
>>>
# apply
>>> result = indicator @ A
>>> result.A
array([[0, 0, 2, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 4, 0],
[3, 0, 0, 0, 1, 4, 0, 0]], dtype=int64)
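As an alternative sketch, not part of the answer above: CSR matrices also accept a list of row indices directly, which gives the same rows without building an indicator matrix. The example data here is a deterministic stand-in for the random matrix used above.

```python
import numpy as np
from scipy import sparse

# Deterministic stand-in for the random example matrix.
dense = np.arange(80).reshape(10, 8) % 7
A = sparse.csr_matrix(dense)
subset = [1, 3, 4]

# Direct row selection; the result is still a sparse matrix.
B = A[subset]

# Indicator-matrix approach from the answer, for comparison.
indicator = sparse.csr_matrix(
    (np.ones(len(subset), int), subset, np.arange(len(subset) + 1)),
    shape=(len(subset), A.shape[0]),
)
B_via_indicator = indicator @ A

print(B.shape)
print(B.toarray())
```

Row indexing is efficient in CSR format; for a COO or CSC matrix, converting with .tocsr() first is the usual approach.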
What is the fastest way of creating a new matrix that is a result of a "look-up" of some numpy matrix X (using an array of indices to be looked up in matrix X)? Example of what I want to achieve:
indices = np.array([[[1,1],[1,1],[3,3]],[[1,1],[5,8],[6,9]]]) #[i,j]
new_matrix = lookup(X, use=indices)
Output will be something like:
new_matrix = np.array([[3,3,7],[3,4,9]])
where for example X[1,1] was 3. I'm using python 2.
Use sliced columns for indexing into X -
X[indices[...,0], indices[...,1]]
Or with tuple -
X[tuple(indices.T)].T # or X[tuple(indices.transpose(2,0,1))]
Sample run -
In [142]: X
Out[142]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 3, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 7, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 4, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 9]])
In [143]: indices
Out[143]:
array([[[1, 1],
[1, 1],
[3, 3]],
[[1, 1],
[5, 8],
[6, 9]]])
In [144]: X[indices[...,0], indices[...,1]]
Out[144]:
array([[3, 3, 7],
[3, 4, 9]])
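The indexing expression can be wrapped in a small helper. This is a sketch only: the name `lookup` comes from the question, but the signature below is hypothetical, and X is rebuilt here to match the sample run.

```python
import numpy as np

def lookup(X, indices):
    """Return X[i, j] for every [i, j] pair stored along the last axis of indices."""
    indices = np.asarray(indices)
    # Split the last axis into row and column index arrays of equal shape.
    return X[indices[..., 0], indices[..., 1]]

# Rebuild the X from the sample run.
X = np.zeros((7, 10), dtype=int)
X[1, 1], X[3, 3], X[5, 8], X[6, 9] = 3, 7, 4, 9

indices = np.array([[[1, 1], [1, 1], [3, 3]],
                    [[1, 1], [5, 8], [6, 9]]])

new_matrix = lookup(X, indices)
print(new_matrix)
```

The result has the shape of indices without its last axis, here (2, 3).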
I am trying to make a diagonal numpy array from:
[1,2,3,4,5,6,7,8,9]
Expected result:
[[ 0, 0, 1, 0, 0],
[ 0, 0, 0, 2, 0],
[ 0, 0, 0, 0, 3],
[ 4, 0, 0, 0, 0],
[ 0, 5, 0, 0, 0],
[ 0, 0, 6, 0, 0],
[ 0, 0, 0, 7, 0],
[ 0, 0, 0, 0, 8],
[ 9, 0, 0, 0, 0]]
What would be an efficient way of doing this?
You can use integer array indexing to set the specified elements of the output:
>>> import numpy as np
>>> a = [1,2,3,4,5,6,7,8,9]
>>> arr = np.zeros((9, 5), dtype=int) # create array of zeros
>>> arr[np.arange(9), np.arange(2,11) % 5] = a # insert a
>>> arr
array([[0, 0, 1, 0, 0],
[0, 0, 0, 2, 0],
[0, 0, 0, 0, 3],
[4, 0, 0, 0, 0],
[0, 5, 0, 0, 0],
[0, 0, 6, 0, 0],
[0, 0, 0, 7, 0],
[0, 0, 0, 0, 8],
[9, 0, 0, 0, 0]])
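The same idea generalizes to any length, width, and starting column. The helper below is a sketch; the function name and its parameters are my own and do not come from NumPy.

```python
import numpy as np

def wrapped_diagonal(values, ncols, offset=0):
    """Place values[k] at row k, column (k + offset) % ncols (hypothetical helper)."""
    values = np.asarray(values)
    rows = np.arange(len(values))
    cols = (rows + offset) % ncols  # wrap around at the right edge
    out = np.zeros((len(values), ncols), dtype=values.dtype)
    out[rows, cols] = values
    return out

arr = wrapped_diagonal([1, 2, 3, 4, 5, 6, 7, 8, 9], ncols=5, offset=2)
print(arr)
```

Each row receives exactly one value, so no row is skipped when the diagonal wraps.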
Inspired by np.fill_diagonal which can wrap, but not offset:
In [308]: arr=np.zeros((9,5),int)
In [309]: arr.flat[2:45:6]=np.arange(1,10)
In [310]: arr
Out[310]:
array([[0, 0, 1, 0, 0],
[0, 0, 0, 2, 0],
[0, 0, 0, 0, 3],
[0, 0, 0, 0, 0],
[4, 0, 0, 0, 0],
[0, 5, 0, 0, 0],
[0, 0, 6, 0, 0],
[0, 0, 0, 7, 0],
[0, 0, 0, 0, 8]])
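(Note that this does not match the expected result: row 3 is left all zeros. With a step of 6, the row length plus one, the flat index jumps from 14 at the end of row 2 straight to 20 at the start of row 4, and the ninth value is never written.) For reference, the relevant part of np.fill_diagonal: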
def fill_diagonal(a, val, wrap=False):
    ...
    step = a.shape[1] + 1
    # Write the value out into the diagonal.
    a.flat[:end:step] = val