replacing values with zeros - python

I have a NumPy array and want to replace all values with zeros, except those in a certain range of indices.
1
2
3
4
5
I tried:
import numpy as np
data = np.loadtxt('data.txt')
print(data)
Expected output:
0
0
3
0
0

You can traverse the array with a for loop and check if the traversed element is in a list of desired selected values:
import numpy as np
a = np.array([1, 2, 3, 4, 5])
nums = [3]
for i in range(len(a)):
    # zero out every element that is not one of the selected values
    if a[i] not in nums:
        a[i] = 0
print(a)
Output:
[0 0 3 0 0]

Since you're working with a NumPy array, use vectorized methods.
Here, np.isin builds a boolean mask for the replacement:
import numpy as np
data = np.array([1, 2, 3, 4, 5])
l = [3]
data[~np.isin(data, l)] = 0  # zero every element whose value is not in l
data
# array([0, 0, 3, 0, 0])
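Both answers above select by value, while the question mentions a range of indices. If the positions to keep are known instead, a minimal sketch could build the mask from positions rather than values. The names start, stop, and keep are hypothetical and not from the original post, and a half-open range like a Python slice is assumed:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

# Hypothetical index range to keep (half-open, like a Python slice).
start, stop = 2, 3

# Boolean mask that is True only at the positions to keep.
keep = np.zeros(data.shape, dtype=bool)
keep[start:stop] = True

# Keep the masked positions and replace everything else with zero.
result = np.where(keep, data, 0)
print(result)
```

np.where returns a new array here, so data itself is left untouched; assigning data[~keep] = 0 would modify it in place instead.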

Related

Making a matrix with numpy.array

I tried to create a matrix using numpy.array with the following code
def matrix_input(3):
    matrix = []
    for i in range(N):
        a = nd.array(input().split(),int)
        matrix.append(a)
    print(matrix)
But I'm getting the following output:
[array([1, 1, 1]), array([1, 1]), array([1, 1, 1])]
For the input:
1 1 1
1 1
1 1 1
I don't want the matrix to have the word array in it. How do I remove it?
Convert each row to a list on the 4th line of your code. Also fix the function definition as shown in the code below: defining a function and calling it are two different things, and so are the parameter named in the definition and the argument you pass in the call.
import numpy as np
def matrix_input(N):  # The parameter in the definition must be a name (N), not the value 3.
    matrix = []
    for i in range(N):
        # Convert the row to a plain list of Python ints here
        a = np.array(input().split(), int).tolist()
        matrix.append(a)
    print(matrix)
Example call and output:
matrix_input(3)
1 1 1
1 1
1 1 1
[[1, 1, 1], [1, 1], [1, 1, 1]]
An alternative way to create a proper matrix:
import numpy as np
matrix_1 = np.matrix([[1,1,1],[1,1,0],[1,1,1]])
print(matrix_1)
Output:
[[1 1 1]
 [1 1 0]
 [1 1 1]]
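NumPy's documentation no longer recommends the np.matrix class and suggests regular arrays instead. As a hedged sketch, assuming every row has the same number of entries, the rows can be collected as plain lists and converted once at the end. The helper name rows_to_array and the hard-coded input lines are made up for illustration; in the original setting the lines would come from input():

```python
import numpy as np

def rows_to_array(lines):
    """Parse whitespace-separated integer rows into a 2-D array."""
    rows = [[int(tok) for tok in line.split()] for line in lines]
    # A regular 2-D array needs rows of equal length.
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all rows must have the same length")
    return np.array(rows)

matrix = rows_to_array(["1 1 1", "1 1 0", "1 1 1"])
print(matrix)           # 2-D NumPy array
print(matrix.tolist())  # nested Python lists, without the word "array"
```

Ragged input such as the "1 1" row from the question cannot form a regular 2-D array, which is why the sketch rejects it rather than guessing how to pad it.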

How to efficiently build a large sparse matrix in Python?

1. I am trying to create a NumPy array with shape (6962341, 268148) and dtype np.uint8.
2. My data consists of lists such as [x1,x2,x3,x4], [x2,x1], [x4,x5,x3], ...
3. I want to do array[x1,x2] += 1, array[x1,x3] += 1, array[x1,x4] += 1, array[x2,x3] += 1, ...
4. So I tried code with the following structure:
import numpy as np
from itertools import combinations
base_array = np.zeros((row_size, col_size), dtype=np.uint8)
for each_list in data:
    for (x, y) in combinations(each_list, 2):
        # always write into the upper triangle
        if x > y:
            base_array[y, x] += 1
        else:
            base_array[x, y] += 1
It basically computes the upper triangle of a matrix, and I will only use the upper-triangle values. You can think of it as building the base matrix A for a co-occurrence matrix. However, this code is too slow, and I think it can be made faster.
What should I do?
Assuming your data is integers (since they represent rows and columns) or you can hash your data x1, x2, ... into 1, 2, ... integers, here is a fast solution:
# list of pairwise combinations in your data
comb_list = []
for each_list in data:
    comb_list += list(combinations(each_list, 2))
# convert the combinations to indices (NumPy uses 0-based indexing)
comb_list = np.array(comb_list) - 1
# convert the (row, column) pairs to flat indices
flat = np.ravel_multi_index((comb_list[:, 0], comb_list[:, 1]), (row_size, col_size))
# count the number of duplicates of each index using np.bincount
base_array = np.bincount(flat, None, row_size * col_size).reshape((row_size, col_size)).astype(np.uint8)
Sample data:
[[1, 2, 3, 4], [2, 1], [4, 5, 3, 4]]
Corresponding output:
[[0 1 1 1 0]
 [1 0 1 1 0]
 [0 0 0 2 0]
 [0 0 1 1 1]
 [0 0 1 1 0]]
EDIT: in response to the explanation in the comments:
data = [[1, 2, 3, 4], [2, 1], [4, 5, 3, 4]]
# the lists have different lengths, so take the maximum with plain Python
n_cols = max(max(each_list) for each_list in data)
base_array = np.zeros((len(data), n_cols), dtype=np.uint8)
for i, each_list in enumerate(data):
    for j in each_list:
        base_array[i, j-1] = 1
Output:
[[1 1 1 1 0]
 [1 1 0 0 0]
 [0 0 1 1 1]]
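The np.bincount approach above still materializes a dense row_size x col_size result, and at the shape given in the question (6962341 x 268148) that is well over a terabyte even at one byte per cell. A possible alternative, sketched here under the assumption that most cells stay zero, is to collect the index pairs and let scipy.sparse sum the duplicates. The toy data reuses the sample above; the names rows, cols, and counts, the int64 dtype, and the 5 x 5 shape are illustrative choices, not from the original answer:

```python
from itertools import combinations

import numpy as np
from scipy.sparse import coo_matrix

# Toy data with 1-based integer ids, as in the sample above.
data = [[1, 2, 3, 4], [2, 1], [4, 5, 3, 4]]
row_size, col_size = 5, 5

rows, cols = [], []
for each_list in data:
    for x, y in combinations(each_list, 2):
        # Store each pair in the upper triangle, as the question's loop does.
        lo, hi = (x, y) if x <= y else (y, x)
        rows.append(lo - 1)  # convert to 0-based indices
        cols.append(hi - 1)

ones = np.ones(len(rows), dtype=np.int64)
# Duplicate (row, col) entries are summed when the COO matrix is converted to CSR.
counts = coo_matrix((ones, (rows, cols)), shape=(row_size, col_size)).tocsr()
print(counts.toarray())
```

Only the cells that actually occur are stored, so memory scales with the number of distinct pairs rather than with the full matrix shape. A wider integer dtype than np.uint8 also avoids wrap-around once a pair occurs more than 255 times.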

Adding values to non zero elements in a Sparse Matrix

I have a sparse matrix in which I want to increment all the values of non-zero elements by one. However, I cannot figure it out. Is there a way to do it using standard packages in python? Any help will be appreciated.
I cannot comment on its performance, but you can do the following (SciPy 1.1.0):
>>> from scipy.sparse import csr_matrix
>>> a = csr_matrix([[0, 2, 0], [1, 0, 0]])
>>> print(a)
(0, 1) 2
(1, 0) 1
>>> a[a.nonzero()] = a[a.nonzero()] + 1
>>> print(a)
(0, 1) 3
(1, 0) 2
If your matrix has two dimensions (a list of lists), you can do the following:
sparse_matrix = [[element if element == 0 else element + 1 for element in row] for row in sparse_matrix]
This iterates over every element of the matrix, keeps it unchanged if it equals zero, and otherwise adds 1 to it.
More about conditionals in list comprehension in the answer for this question.
You can use the NumPy package, which has efficient functions for dealing with n-dimensional arrays. What you need is:
array[array>0] += 1
where array is the NumPy array of your matrix (use array != 0 instead if the matrix can contain negative values). Example:
import numpy as np
my_matrix = [[2,0,0,0,7],[0,0,0,4,0]]
array = np.array(my_matrix)
print("Matrix before incrementing values: \n", array)
array[array>0] += 1
print("Matrix after incrementing values: \n", array)
Outputs:
Matrix before incrementing values:
[[2 0 0 0 7]
[0 0 0 4 0]]
Matrix after incrementing values:
[[3 0 0 0 8]
[0 0 0 5 0]]
Hope this helps!
If you have a SciPy sparse matrix (scipy.sparse), you can increment its stored values directly:
import scipy.sparse as sp
my_matrix = [[2,0,0,0,7],[0,0,0,4,0]]
my_matrix = sp.csc_matrix(my_matrix)
my_matrix.data += 1
my_matrix.todense()
Returns:
[[3, 0, 0, 0, 8], [0, 0, 0, 5, 0]]
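One caveat with my_matrix.data += 1: it touches every stored entry, and a sparse matrix can hold explicitly stored zeros. A small sketch, assuming such zeros may be present, calls eliminate_zeros() first. The example matrix and the variable names are made up for illustration:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Made-up example: the second stored value is an explicit zero.
values = np.array([2, 0, 1])
rows = np.array([0, 0, 1])
cols = np.array([1, 2, 0])
a = csr_matrix((values, (rows, cols)), shape=(2, 3))

a.eliminate_zeros()  # drop explicitly stored zeros in place
a.data += 1          # only genuinely non-zero entries are incremented now
print(a.toarray())
```

If the matrix is known to contain no stored zeros, the extra call is unnecessary and the plain .data += 1 from the answer above is enough.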

How many times a number appears in a numpy array

I need to find a way to count how many times each number from 0 to 9 appears in a random matrix created using np.random.randint()
import numpy as np
p = int(input("Length of matrix: "))
m = np.random.randint(0,9,(p,p))
print(m)
For example if length of matrix = 4
[[3 4 6 5] [3 4 4 3] [4 2 4 8] [6 8 2 7]]
How many times does the number 4 appear? It should return 5.
You should be able to get this pretty simply:
list(m.flatten()).count(x)
Another option, which is probably faster, is to use NumPy's built-in count_nonzero():
np.count_nonzero(m == x)
Hooray builtin functions.
You can use the sum function:
In [52]: m = np.random.randint(0,9,(4,4))
In [53]: m
Out[53]:
array([[8, 8, 2, 1],
       [2, 7, 1, 2],
       [8, 6, 8, 7],
       [5, 2, 5, 2]])
In [56]: np.sum(m == 8)
Out[56]: 4
m == 8 returns a boolean array that contains True for each 8. Since Python treats True as 1, summing the array gives the number of matching elements.
If you want to get the frequency from all matrix elements, here's a simple solution using numpy.ndarray.flatten and collections.Counter:
import numpy as np
import collections
p = int(input("Length of matrix: "))
m = np.random.randint(0, 9, (p, p))
print(m)
print(collections.Counter(m.flatten()))
For example, when p=3 you'd get something like this:
[[8 4 8]
 [5 1 1]
 [1 1 1]]
Counter({1: 5, 8: 2, 4: 1, 5: 1})
You can flatten the matrix into a list and then use the list count() method:
import numpy as np
p = int(input("Length of matrix: "))
m = np.random.randint(0,9,(p,p))
print(m)
flat = [item for sublist in m for item in sublist]
print(flat.count(4))
I would try NumPy's unique function with the argument return_counts=True (see: https://numpy.org/doc/stable/reference/generated/numpy.unique.html).
import numpy as np
p = int(input("Length of matrix: "))
m = np.random.randint(0,9,(p,p))
# print(m)
un, nm = np.unique(m, return_counts=True)
# if the number you are looking for is 1, then:
print(nm[un == 1])
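To get the counts for every digit from 0 to 9 in one call, np.bincount is another option. The sketch below uses the fixed example matrix from the question instead of random input so the result is reproducible. Note also that np.random.randint(0, 9, ...) excludes the upper bound, so it only produces 0 to 8; an upper bound of 10 would be needed to include 9.

```python
import numpy as np

# Fixed example matrix (the one from the question) so the result is reproducible.
m = np.array([[3, 4, 6, 5],
              [3, 4, 4, 3],
              [4, 2, 4, 8],
              [6, 8, 2, 7]])

# counts[k] is the number of times k appears; minlength=10 covers the digits 0..9
# even when some of them never occur.
counts = np.bincount(m.ravel(), minlength=10)

for digit, n in enumerate(counts):
    print(digit, n)
```

np.bincount only accepts non-negative integers, which fits this case; for arbitrary values, np.unique with return_counts=True remains the more general tool.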

Torch7 Tensor Non-Contiguous Index (Similar to NumPy)

I am new to torch7, and I can't find a way to index a tensor at non-contiguous positions given by another tensor. In NumPy, I do the following:
array = np.zeros(5) # array = [0 0 0 0 0]
indices = np.array([0, 2, 4])
array[indices] = np.array([1, 2, 3]) # array = [1 0 2 0 3]
Is there a way to do something similar in torch7? Something like:
array = torch.zeros(5) -- array = [0 0 0 0 0]
indices = torch.Tensor({1, 3, 5})
array[indices] = torch.Tensor({1, 2, 3}) -- array = [1 0 2 0 3]
Thanks!
OK, after looking around I couldn't find an exact solution, but I found an approximation of what I wanted to do. I'm sharing it in case someone else finds it useful:
array = torch.zeros(5) -- array = [0 0 0 0 0]
indices = torch.LongTensor({1, 3, 5}) -- It is important that this is a LongTensor
array:indexAdd(1, indices, torch.Tensor({1, 2, 3})) -- array = [1 0 2 0 3]
indexCopy does exactly what you need:
array:indexCopy(1, indices, torch.Tensor({1, 2, 3}))
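The difference between the two answers matters once the tensor is not all zeros: indexAdd accumulates into the existing values, while indexCopy overwrites them. As a rough NumPy analogy only (this is not Torch7 code, and the variable names are illustrative), np.add.at plays the role of the former and plain index assignment the role of the latter:

```python
import numpy as np

indices = np.array([0, 2, 4])      # 0-based, unlike Lua's 1-based indices
values = np.array([1.0, 2.0, 3.0])

added = np.ones(5)
np.add.at(added, indices, values)  # accumulate: existing values are kept

copied = np.ones(5)
copied[indices] = values           # overwrite: existing values are replaced

print(added)
print(copied)
```

Starting from zeros, as in the question, both operations give the same result, which is why the indexAdd workaround appeared to work.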
If you are a Python user, you could also try https://github.com/imodpasteur/lutorpy. For example, you can process your array in Python and then convert it to a Torch tensor. Converting back to a NumPy array is instant because only the pointer is passed, so the two objects in Python and Lua share the same memory.
import lutorpy as lua
import numpy as np
array = np.zeros(5) # array = [0 0 0 0 0]
indices = np.array([0, 2, 4])
array[indices] = np.array([1, 2, 3]) # array = [1 0 2 0 3]
require("torch")
# convert the NumPy array to a Torch tensor
tensor = torch.fromNumpyArray(array)
# put your Torch code here
# convert back to a NumPy array
array = tensor.asNumpyArray()
# now array shares memory with tensor
