Adding values to non zero elements in a Sparse Matrix - python

I have a sparse matrix in which I want to increment all the values of non-zero elements by one. However, I cannot figure it out. Is there a way to do it using standard packages in python? Any help will be appreciated.

I cannot comment on its performance, but you can do this (SciPy 1.1.0):
>>> from scipy.sparse import csr_matrix
>>> a = csr_matrix([[0, 2, 0], [1, 0, 0]])
>>> print(a)
(0, 1) 2
(1, 0) 1
>>> a[a.nonzero()] = a[a.nonzero()] + 1
>>> print(a)
(0, 1) 3
(1, 0) 2

If your matrix has 2 dimensions and is stored as a plain nested list, you can do the following:
sparse_matrix = [[element if element == 0 else element + 1 for element in row] for row in sparse_matrix]
It iterates over every element of the matrix, returning the element unchanged if it equals zero and adding 1 to it otherwise.
More about conditionals in list comprehensions in the answer to this question.

You can use the NumPy package, which has efficient functions for dealing with n-dimensional arrays. What you need is:
array[array != 0] += 1
where array is the NumPy array of your matrix (array > 0 would skip negative entries). Example:
import numpy as np
my_matrix = [[2,0,0,0,7],[0,0,0,4,0]]
array = np.array(my_matrix)
print("Matrix before incrementing values: \n", array)
array[array != 0] += 1
print("Matrix after incrementing values: \n", array)
Outputs:
Matrix before incrementing values:
[[2 0 0 0 7]
[0 0 0 4 0]]
Matrix after incrementing values:
[[3 0 0 0 8]
[0 0 0 5 0]]
Hope this helps!

When you have a SciPy sparse matrix (scipy.sparse), you can increment the stored values directly:
import scipy.sparse as sp
my_matrix = [[2,0,0,0,7],[0,0,0,4,0]]
my_matrix = sp.csc_matrix(my_matrix)
my_matrix.data += 1
my_matrix.todense()
Returns:
[[3, 0, 0, 0, 8], [0, 0, 0, 5, 0]]
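One caveat with the .data approach: it increments every stored entry, including any explicitly stored zeros. A minimal sketch, assuming a CSR matrix built from made-up row/column/value arrays, that drops stored zeros with eliminate_zeros() before incrementing:

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical example data: the entry at (1, 1) is an explicitly stored zero.
rows = np.array([0, 0, 1, 1])
cols = np.array([0, 4, 3, 1])
vals = np.array([2, 7, 4, 0])
m = sp.csr_matrix((vals, (rows, cols)), shape=(2, 5))

# Remove explicitly stored zeros so that only true non-zeros remain in .data.
m.eliminate_zeros()

# Increment all stored (now strictly non-zero) values in place.
m.data += 1

dense = m.toarray()
print(dense)
```

Without the eliminate_zeros() call, a stored zero would silently become 1, which may or may not be what you want.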

Related

replacing values with zeros

I have a NumPy array and I want to replace all values with zeros, except those in some range of indices.
1
2
3
4
5
I tried
import numpy as np
data=np.loadtxt('data.txt')
print(data)
expected output
0
0
3
0
0
You can traverse the array with a for loop and check if the traversed element is in a list of desired selected values:
import numpy as np
a = np.array([1, 2, 3, 4, 5])
nums = [3]
for i in range(len(a)):
    if a[i] not in nums:
        a[i] = 0
print(a)
Output:
[0 0 3 0 0]
As you're working with a NumPy array, use vectorized methods.
Here, np.isin forms a boolean mask for the replacement:
data = np.array([1, 2, 3, 4, 5])
l = [3]
data[~np.isin(data, l)] = 0
data
# array([0, 0, 3, 0, 0])
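Since the question asks about keeping a range of indices rather than particular values, a positional mask may be closer to what is wanted. A minimal sketch, assuming the range to keep is the half-open index interval [start, stop) (both names are illustrative, not from the question):

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])
start, stop = 2, 3  # hypothetical index range to keep: 2 <= i < 3

# Build a boolean mask that is True only inside the index range.
keep = np.zeros(data.shape, dtype=bool)
keep[start:stop] = True

# Zero out everything outside the range without modifying the original array.
result = np.where(keep, data, 0)
print(result)
```

This differs from the np.isin version when values repeat: it keeps positions, not values.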

Efficiently permute array in row wise using Numpy

Given a 2D array, I would like to permute this array row-wise.
Currently, I use a for loop to permute the 2D array row by row, as below:
for i in range(npart):
    pr=np.random.permutation(range(m))
    # arr_rand3 is the same as arr, but with each row permuted
    arr_rand3[i,:]=arr[i,pr]
But I wonder whether there is some function within NumPy that can do this in a single line (without the for loop).
The full code is
import numpy as np
arr=np.array([[0,0,0,0,0],[0,4,1,1,1],[0,1,1,2,2],[0,3,2,2,2]])
npart=len(arr[:,0])
m=len(arr[0,:])
# Permuted version of arr
arr_rand3=np.zeros(shape=np.shape(arr),dtype=int)  # np.int was removed in NumPy 1.24; use int
# Nodal association matrix for C
X=np.zeros(shape=(m,m),dtype=np.double)
# Random nodal association matrix for C_rand3
X_rand3=np.zeros(shape=(m,m),dtype=np.double)
for i in range(npart):
    pr=np.random.permutation(range(m))
    # arr_rand3 is the same as arr, but with each row permuted
    arr_rand3[i,:]=arr[i,pr]
In Numpy 1.19+ you should be able to do:
import numpy as np
arr = np.array([[0, 0, 0, 0, 0], [0, 4, 1, 1, 1], [0, 1, 1, 2, 2], [0, 3, 2, 2, 2]])
rng = np.random.default_rng()
arr_rand3 = rng.permutation(arr, axis=1)
print(arr_rand3)
Output
[[0 0 0 0 0]
[4 0 1 1 1]
[1 0 1 2 2]
[3 0 2 2 2]]
According to the documentation, the method random.Generator.permutation receives a new parameter axis:
axis : int, optional
The axis which x is shuffled along. Default is 0.
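One thing to check before relying on this: with axis=1, Generator.permutation reorders whole columns, so every row receives the same permutation (in the output above, the second column moved to the first position in every row). If each row should be shuffled independently, as in the original loop, Generator.permuted (available from NumPy 1.20) is one option. A minimal sketch; the seed is an arbitrary assumption for reproducibility:

```python
import numpy as np

arr = np.array([[0, 0, 0, 0, 0],
                [0, 4, 1, 1, 1],
                [0, 1, 1, 2, 2],
                [0, 3, 2, 2, 2]])

rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch

# permuted() shuffles each slice along the given axis independently,
# so every row gets its own random permutation.
arr_rand3 = rng.permuted(arr, axis=1)
print(arr_rand3)

# Each row still contains the same values as before, only reordered.
rows_preserved = np.array_equal(np.sort(arr_rand3, axis=1), np.sort(arr, axis=1))
print(rows_preserved)
```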

Making a matrix with numpy.array

I tried to create a matrix using numpy.array with the following code
def matrix_input(3):
    matrix = []
    for i in range(N):
        a = nd.array(input().split(),int)
        matrix.append(a)
    print(matrix)
But I'm getting the following output:
[array([1, 1, 1]), array([1, 1]), array([1, 1, 1])]
For the input:
1 1 1
1 1
1 1 1
I don't want the matrix to have the word array in it... How do I remove it?
Convert each row to a list on the 4th line of your code. Also, correct the function definition as shown below. Defining a function and calling it are two different things, and so are the parameter named in the definition and the argument you pass in the call.
import numpy as np
def matrix_input(N):  # The definition needs a parameter name (N), not a literal value like 3.
    matrix = []
    for i in range(N):
        a = np.array(input().split(), int).tolist()  # Convert to a plain list of Python ints here
        matrix.append(a)
    print(matrix)
output:
matrix_input(3)
1 1 1
1 1
1 1 1
[[1, 1, 1], [1, 1], [1, 1, 1]]
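An alternative method for creating a proper matrix (note that NumPy's documentation no longer recommends np.matrix; np.array with the same nested list prints the same way):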
import numpy as np
matrix_1 = np.matrix([[1,1,1],[1,1,0],[1,1,1]])
print(matrix_1)
Output:
[[1 1 1]
[1 1 0]
[1 1 1]]
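If the goal is a true 2D array rather than a list of rows, every row needs the same length, which the input in the question (1 1 1 / 1 1 / 1 1 1) does not satisfy. A minimal sketch, assuming the rows arrive as strings (the helper name parse_matrix is hypothetical), that validates the row lengths before building the array:

```python
import numpy as np

def parse_matrix(lines):
    """Parse whitespace-separated integer rows into a 2D NumPy array."""
    rows = [[int(tok) for tok in line.split()] for line in lines]
    # A 2D array requires all rows to have the same number of columns.
    lengths = {len(row) for row in rows}
    if len(lengths) > 1:
        raise ValueError(f"rows have differing lengths: {sorted(lengths)}")
    return np.array(rows)

matrix = parse_matrix(["1 1 1", "1 1 0", "1 1 1"])
print(matrix)
print(matrix.tolist())  # plain nested list, printed without the word "array"
```

In an interactive script, the list of strings could be collected with input() in a loop, as in the question.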

how to make efficiently large sparse matrix in python?

1. I am trying to make a NumPy array with shape (6962341, 268148) and dtype np.uint8.
2. My data consists of lists like [x1,x2,x3,x4], [x2,x1], [x4,x5,x3], ...
3. I want to assign array[x1,x2] += 1, array[x1,x3] += 1, array[x1,x4] += 1, array[x2,x3] += 1, ...
4. So I have tried a function with the following structure.
import numpy as np
from itertools import combinations
base_array = np.zeros((row_size, col_size), dtype=np.uint8)
for each_list in data:
    for (x,y) in list(combinations(each_list,2)):
        if x>y:
            base_array[y,x] += 1
        else:
            base_array[x,y] += 1
It basically computes the upper triangle of a matrix, and I will use the upper-triangle values. You can also think of this as building the base matrix A for a co-occurrence matrix. But this function is too slow, and I think it should be possible to make it faster.
What should I do?
Assuming your data are integers (since they represent rows and columns), or that you can map your data x1, x2, ... to integers 1, 2, ..., here is a fast solution:
#list of pairwise combinations in your data
comb_list = []
for each_list in data:
    comb_list += list(combinations(each_list,2))
#convert combination int to index (numpy is 0 based indexing)
comb_list = np.array(comb_list) - 1
#make array with flat indices
flat = np.ravel_multi_index((comb_list[:,0],comb_list[:,1]),(row_size,col_size))
#count number of duplicates for each index using np.bincount
base_array = np.bincount(flat,None,row_size*col_size).reshape((row_size,col_size)).astype(np.uint8)
sample data:
[[1, 2, 3, 4], [2, 1], [4, 5, 3, 4]]
Corresponding output:
[[0 1 1 1 0]
[1 0 1 1 0]
[0 0 0 2 0]
[0 0 1 1 1]
[0 0 1 1 0]]
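A dense array of shape (6962341, 268148) would need roughly 1.9 TB even at one byte per entry, so for the sizes in the question a scipy.sparse matrix is probably the more practical container. A minimal sketch, assuming 0-based integer indices and small made-up sizes (data, row_size and col_size here are illustrative), that collects the index pairs and lets the COO to CSR conversion sum duplicate entries:

```python
from itertools import combinations

import numpy as np
import scipy.sparse as sp

# Hypothetical small example; indices are assumed to be 0-based integers.
data = [[0, 1, 2, 3], [1, 0], [3, 4, 2]]
row_size, col_size = 5, 5

rows, cols = [], []
for each_list in data:
    for x, y in combinations(each_list, 2):
        # Store every pair in the upper triangle (row index <= column index).
        rows.append(min(x, y))
        cols.append(max(x, y))

ones = np.ones(len(rows), dtype=np.int64)
coo = sp.coo_matrix((ones, (rows, cols)), shape=(row_size, col_size))

# Converting COO to CSR sums entries that share the same (row, col).
counts = coo.tocsr()
print(counts.toarray())
```

The sketch uses int64 counts rather than np.uint8, since a uint8 counter would wrap around once a pair occurs more than 255 times.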
EDIT: corresponding to explanation in comments:
data=[[1, 2, 3, 4], [2, 1], [4, 5, 3, 4]]
base_array = np.zeros((len(data), max(max(each_list) for each_list in data)), dtype=np.uint8)  # plain max works on ragged lists
for i, each_list in enumerate(data):
    for j in each_list:
        base_array[i, j-1] = 1
Output:
[[1 1 1 1 0]
[1 1 0 0 0]
[0 0 1 1 1]]

numpy/scipy build adjacency matrix from weighted edgelist

I'm reading a weighted edgelist / numpy array like:
0 1 1
0 2 1
1 2 1
1 0 1
2 1 4
where the columns are 'User1', 'User2', 'Weight'. I'd like to run a DFS algorithm with scipy.sparse.csgraph.depth_first_tree, which requires an N x N matrix as input. How can I convert the previous list into a square matrix such as:
0 1 1
1 0 1
0 4 0
within numpy or scipy?
Thanks for your help.
EDIT:
I've been working with a huge (150 million nodes) network, so I'm looking for a memory efficient way to do that.
You could use a memory-efficient scipy.sparse matrix:
import numpy as np
import scipy.sparse as sparse
arr = np.array([[0, 1, 1],
                [0, 2, 1],
                [1, 2, 1],
                [1, 0, 1],
                [2, 1, 4]])
shape = tuple(arr.max(axis=0)[:2]+1)
coo = sparse.coo_matrix((arr[:, 2], (arr[:, 0], arr[:, 1])), shape=shape,
                        dtype=arr.dtype)
print(repr(coo))
# <3x3 sparse matrix of type '<type 'numpy.int64'>'
# with 5 stored elements in COOrdinate format>
To convert the sparse matrix to a dense matrix, you could use todense (or toarray if you want a plain NumPy array):
print(coo.todense())
# [[0 1 1]
# [1 0 1]
# [0 4 0]]
Try something like the following:
import numpy as np
import scipy.sparse as sps
A = np.array([[0, 1, 1],[0, 2, 1],[1, 2, 1],[1, 0, 1],[2, 1, 4]])
i, j, weight = A[:,0], A[:,1], A[:,2]
# find the dimension of the square matrix (largest node index + 1)
dim = int(max(i.max(), j.max())) + 1
B = sps.lil_matrix((dim, dim))
for row, col, w in zip(i, j, weight):
    B[row, col] = w
print(B.todense())
Output:
[[ 0. 1. 1.]
[ 1. 0. 1.]
[ 0. 4. 0.]]
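To connect this back to the question, the sparse matrix can be passed to scipy.sparse.csgraph.depth_first_tree directly, without densifying it. A minimal sketch, assuming node 0 as the start node (an arbitrary choice) and taking the matrix size from the largest node index in either column so that the result is square:

```python
import numpy as np
import scipy.sparse as sparse
from scipy.sparse.csgraph import depth_first_tree

# Edge list from the question: columns are source, target, weight.
arr = np.array([[0, 1, 1],
                [0, 2, 1],
                [1, 2, 1],
                [1, 0, 1],
                [2, 1, 4]])

# Use the largest node index over both columns so the matrix is N x N.
n = int(arr[:, :2].max()) + 1
graph = sparse.coo_matrix((arr[:, 2], (arr[:, 0], arr[:, 1])), shape=(n, n)).tocsr()

# Depth-first spanning tree starting from node 0 (assumed start node).
tree = depth_first_tree(graph, 0, directed=True)
print(tree.toarray())
```

The returned tree is itself a sparse matrix, so memory use stays proportional to the number of edges.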
