numpy: find symmetric values in 2d arrays - python

I have to analyze a square 2D numpy array LL for values that are symmetric (LL[i,j] == LL[j,i]) and not zero.
Is there a faster, more "array-like" way to do this without loops?
Is there an easy way to store the indices of those values for later use, without creating an array and appending the index tuple in every iteration?
Here is my classical looping approach to store the indices:
IdxArray = np.empty((0, 2), dtype=int)  # Array to store the indices
for i in range(len(LL)):
    for j in range(i+1, len(LL)):
        if LL[i,j] != 0.0:
            if LL[i,j] == LL[j,i]:
                IdxArray = np.vstack((IdxArray, [i,j]))
Later I use the indices:
for idx in IdxArray:
    P = LL[idx[0], idx[1]]*(TT[idx[0]]-TT[idx[1]])
    ...

>>> a = numpy.matrix('5 2; 5 4')
>>> b = numpy.matrix('1 2; 3 4')
>>> a.T == b.T
matrix([[False, False],
        [ True,  True]], dtype=bool)
>>> a == a.T
matrix([[ True, False],
        [False,  True]], dtype=bool)
>>> numpy.nonzero(a == a.T)
(matrix([[0, 1]]), matrix([[0, 1]]))

How about this:
a = np.array([[1,0,3,4],[0,5,4,6],[7,4,4,5],[3,4,5,6]])
np.fill_diagonal(a, 0) # changes original array, must be careful
overlap = (a == a.T) * a
indices = np.argwhere(overlap != 0)
Result:
>>> a
array([[0, 0, 3, 4],
       [0, 0, 4, 6],
       [7, 4, 0, 5],
       [3, 4, 5, 0]])
>>> overlap
array([[0, 0, 0, 0],
       [0, 0, 4, 0],
       [0, 4, 0, 5],
       [0, 0, 5, 0]])
>>> indices
array([[1, 2],
       [2, 1],
       [2, 3],
       [3, 2]])
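If modifying the input and getting every pair twice is a concern, the same idea can be written with a boolean mask restricted to the strict upper triangle. This is only a sketch: the function name symmetric_nonzero_pairs and the sample TT vector are made up for illustration, and it uses exact equality just like the loop in the question.

```python
import numpy as np

def symmetric_nonzero_pairs(LL):
    """Return an (n, 2) array of index pairs (i, j) with i < j where
    LL[i, j] == LL[j, i] and the shared value is nonzero."""
    LL = np.asarray(LL)
    # True where an entry equals its mirror entry and is not zero.
    mask = (LL == LL.T) & (LL != 0)
    # k=1 keeps only the strict upper triangle, so the diagonal is
    # dropped and each symmetric pair is reported once.
    mask = np.triu(mask, k=1)
    return np.argwhere(mask)

a = np.array([[1, 0, 3, 4],
              [0, 5, 4, 6],
              [7, 4, 4, 5],
              [3, 4, 5, 6]])
pairs = symmetric_nonzero_pairs(a)   # a itself is left untouched
print(pairs)

# Later use without a Python loop (TT is a hypothetical vector here).
TT = np.array([10.0, 20.0, 30.0, 40.0])
i, j = pairs.T
P = a[i, j] * (TT[i] - TT[j])
print(P)
```

For floating-point data, replacing LL == LL.T with np.isclose(LL, LL.T) may be the safer comparison.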

Related

How to delete an element from a 2D Numpy array without knowing its position

I have a 2D array:
[[0,0], [0,1], [1,0], [1,1]]
I want to delete the [0,1] element without knowing its position within the array (as the elements may be shuffled).
Result should be:
[[0,0], [1,0], [1,1]]
I've tried using numpy.delete but keep getting back a flattened array:
>>> arr = np.array([[0,0], [0,1], [1,0], [1,1]])
>>> arr
array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])
>>> np.delete(arr, [0,1])
array([0, 1, 1, 0, 1, 1])
Specifying the axis removes the rows at indices 0 and 1 rather than searching for the element (which makes sense):
>>> np.delete(arr, [0,1], axis=0)
array([[1, 0],
       [1, 1]])
And trying to find the location (as has been suggested) seems equally problematic:
>>> np.where(arr==[0,1])
(array([0, 1, 1, 3]), array([0, 0, 1, 1]))
(Where did that 3 come from?!?)
Here we find all of the rows that match the candidate [0, 1]
>>> (arr == [0, 1]).all(axis=1)
array([False, True, False, False])
Or alternatively, the rows that do not match the candidate
>>> ~(arr == [0, 1]).all(axis=1)
array([ True, False, True, True])
So, to select all those rows that do not match [0, 1]
>>> arr[~(arr == [0, 1]).all(axis=1)]
array([[0, 0],
       [1, 0],
       [1, 1]])
Note that this will create a new array.
mask = (arr==np.array([0,1])).all(axis=1)
arr1 = arr[~mask,:]
Look at mask: it should be [False, True, ...].
From the documentation:
numpy.delete(arr, obj, axis=None)
axis : int, optional
    The axis along which to delete the subarray defined by obj. If axis
    is None, obj is applied to the flattened array
If you don't specify the axis (i.e. leave it as None), the array is flattened automatically; you just need to specify the axis parameter, in your case np.delete(arr, [0,1], axis=0).
However, as in the example above, [0,1] is then treated as a list of indices; you must provide the indices/location of the row to delete (you can find them with np.where(condition), for example).
Here you have a working example:
my_array = np.array([[0, 1],
                     [1, 0],
                     [1, 1],
                     [0, 0]])
row_index, = np.where(np.all(my_array == [0, 1], axis=1))
my_array = np.delete(my_array, row_index, axis=0)
print(my_array)
#Output is below
[[1 0]
 [1 1]
 [0 0]]
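The answers above share one pattern: compare every row against the candidate, reduce with all(axis=1), and keep the rows that do not match. Wrapped in a small helper it might look like this (remove_matching_rows is a hypothetical name, not a numpy function):

```python
import numpy as np

def remove_matching_rows(arr, row):
    """Return a new array without the rows of `arr` equal to `row`."""
    arr = np.asarray(arr)
    # Broadcasting compares `row` against every row; all(axis=1) is True
    # only where every column matches.
    matches = (arr == np.asarray(row)).all(axis=1)
    return arr[~matches]

arr = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0, 1]])
result = remove_matching_rows(arr, [0, 1])
print(result)
```

Note that every matching row is removed, so a duplicated [0, 1] disappears as well.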

Locate asymmetries in a matrix

I have generated a matrix of pairwise distances between list items, but something went wrong and it is not symmetric.
In this case the matrix looks like this:
array = np.array([
    [0, 3, 4],
    [3, 0, 2],
    [1, 2, 0]
])
How can I locate the actual asymmetries? In this case, the indices of 4 and 1.
I have confirmed the asymmetry by trying to condense the matrix with scipy's squareform function, and then by using
def check_symmetric(a, rtol=1e-05, atol=1e-08):
    return np.allclose(a, a.T, rtol=rtol, atol=atol)
Quite late, but here is an alternative, the numpy way:
import numpy as np
m = np.array([[0, 3, 4],
              [3, 0, 2],
              [1, 2, 0]])
def check_symmetric(a):
    boolmatrix = np.isclose(a, a.T)  # play around with your tolerances here...
    output = np.argwhere(~boolmatrix)  # indices where a and a.T differ
    return output
output:
>>> check_symmetric(m)
array([[0, 2],
       [2, 0]])
You can simply use the negation of np.isclose():
mask = ~np.isclose(array, array.T)
mask
# array([[False, False,  True],
#        [False, False, False],
#        [ True, False, False]])
Use that value as an index to get the values:
array[mask]
# array([4, 1])
And use np.where() if you want the indices instead:
np.where(mask)
# (array([0, 2]), array([2, 0]))
The following is quick to write but slow to run; if the aim is just to debug, it will probably do.
a # nearly symmetric array.
Out:
array([[8, 1, 6, 5, 3],
       [1, 9, 4, 4, 4],
       [6, 4, 3, 7, 1],
       [5, 4, 7, 5, 2],
       [3, 4, 1, 3, 7]])
Define function to find and print the differences.
ERROR_LIMIT = 0.00001
def find_asymmetries( a ):
    """ Prints the row and column indices with the difference
    where abs(a[r,c] - a[c,r]) > ERROR_LIMIT """
    res = a-a.T
    for r, row in enumerate(res):
        for c, cell in enumerate(row):
            if abs(cell) > ERROR_LIMIT : print( r, c, cell )
find_asymmetries( a )
3 4 -1
4 3 1
This version halves the volume of results.
def find_asymmetries( a ):
    res = a-a.T
    for r, row in enumerate(res):
        for c, cell in enumerate(row):
            if c == r: break # Stop column search once c == r
            if abs(cell) > ERROR_LIMIT : print( r, c, cell )
find_asymmetries( a )
4 3 1 # Row number always greater than column number
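The halved loop above can also be expressed without Python-level iteration over the cells. The following is a sketch under the same convention (row index greater than column index); asymmetric_pairs and its tol parameter are illustrative names, not part of numpy.

```python
import numpy as np

def asymmetric_pairs(a, tol=1e-5):
    """Return (row, col, a[row, col] - a[col, row]) for every pair with
    row > col whose difference exceeds tol in absolute value."""
    a = np.asarray(a)
    diff = a - a.T
    # k=-1 keeps the strict lower triangle, so each pair appears once.
    rows, cols = np.nonzero(np.tril(np.abs(diff) > tol, k=-1))
    return [(int(r), int(c), diff[r, c].item()) for r, c in zip(rows, cols)]

a = np.array([[8, 1, 6, 5, 3],
              [1, 9, 4, 4, 4],
              [6, 4, 3, 7, 1],
              [5, 4, 7, 5, 2],
              [3, 4, 1, 3, 7]])
found = asymmetric_pairs(a)
print(found)
```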

Masking two square matrices

I'm new to Python and there is something I'm not sure how to do. I have the following matrices:
A=[[1,1,1],[1,1,1],[1,1,1]]
B=[[False,True,False],[True,False,True],[False,True,False]]
I would like to use B to transform A into the following Matrix:
A=[[0,1,0],[1,0,1],[0,1,0]]
I'm sure it is quite simple but, as said, I'm new to python so if you could tell me how to do that I'd appreciate it.
Many thanks
Your best bet for this is to use numpy:
import numpy as np
data = np.array([[1, 2, 3,],
                 [4, 5, 6,],
                 [7, 8, 9,],])
mask = np.array([[False, True, False,],
                 [True, False, True,],
                 [False, True, False,],])
filtered_data = data * mask
which results in filtered_data of:
array([[0, 2, 0],
       [4, 0, 6],
       [0, 8, 0]])
Without numpy you can do it with a nested list comprehension, but I'm sure you'll agree the numpy solution is much clearer if it's an option:
data = [[1, 2, 3,],
        [4, 5, 6,],
        [7, 8, 9,],]
mask = [[False, True, False,],
        [True, False, True,],
        [False, True, False,],]
filtered_data = [[data_elem if mask_elem else 0
                  for data_elem, mask_elem in zip(data_row, mask_row)]
                 for data_row, mask_row in zip(data, mask)]
which gives you filtered_data equal to
[[0, 2, 0], [4, 0, 6], [0, 8, 0]]
Using enumerate
Ex:
A=[[1,1,1],[1,1,1],[1,1,1]]
B=[[False,True,False],[True,False,True],[False,True,False]]
for ind, val in enumerate(B):
    for sub_ind, sub_val in enumerate(val):
        A[ind][sub_ind] = int(sub_val)
print(A)
Output:
[[0, 1, 0], [1, 0, 1], [0, 1, 0]]
You could just do
[ [int(y) for y in x] for x in B ]
Doing int() on a Boolean.
int(False) --> 0
int(True) --> 1
With numpy.multiply you'll get what you want:
import numpy as np
A=[[1,1,1],[1,1,1],[1,1,1]]
B=[[False,True,False],[True,False,True],[False,True,False]]
np.multiply(A, B)
#array([[0, 1, 0],
#       [1, 0, 1],
#       [0, 1, 0]])
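Two more numpy spellings of the same masking may be worth knowing: np.where builds a new array, while boolean-index assignment changes the array in place. A minimal sketch, using the 1 to 9 data from the earlier answer so the effect is visible:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
B = np.array([[False, True, False],
              [True, False, True],
              [False, True, False]])

# New array: take A where B is True, 0 elsewhere. A is not modified.
masked_copy = np.where(B, A, 0)
print(masked_copy)

# In place: zero out every position where B is False.
A[~B] = 0
print(A)
```

The in-place form matches the question's wording (transforming A itself); the np.where form is handy when the original values are still needed.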
Since you asked for A to be modified, here's a solution that doesn't create a new list but modifies A in place. It uses zip and enumerate:
A=[[1,1,1],[1,1,1],[1,1,1]]
B=[[False,True,False],[True,False,True],[False,True,False]]
for x,y in zip(A,B):
    for x1,y1 in zip(enumerate(x),y):
        x[x1[0]] = int(y1)
print(A)
Output:
[[0, 1, 0], [1, 0, 1], [0, 1, 0]]
If you want to transform A using the flags in B, you can do it like this; note that it builds a new list C, and that the equality test works only because every element of A is 1:
A = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
B = [[False, True, False], [True, False, True], [False, True, False]]
C = [[int(A_el == B_el) for A_el, B_el in zip(A_ar, B_ar)] for A_ar, B_ar in zip(A, B)]
Output:
[[0, 1, 0], [1, 0, 1], [0, 1, 0]]
You can also iterate using indexes:
C = [[int(A[i][j] == B[i][j]) for j in range(len(A[0]))] for i in range(len(A))]
try this
A=[[1,1,1],[1,1,1],[1,1,1]]
B=[[False,True,False],[True,False,True],[False,True,False]]
X = [[x and y for x,y in zip(a,b)] for a,b in zip(A,B)]
C = [ [int(x) for x in c] for c in X ]
print(C)
output
[[0, 1, 0], [1, 0, 1], [0, 1, 0]]
Basic double for loop:
for i in range(len(A)):
    for j in range(len(A[0])):
        A[i][j] = int(B[i][j])*A[i][j]
print(A)
output:
[[0, 1, 0], [1, 0, 1], [0, 1, 0]]
example:
A=[[1,1,1],[1,1,1],[0,0,0]]
B=[[False,True,False],[True,False,True],[False,True,False]]
for i in range(len(A)):
    for j in range(len(A[0])):
        A[i][j] = int(B[i][j])*A[i][j]
print(A)
output:
[[0, 1, 0], [1, 0, 1], [0, 0, 0]]

Python numpy array integer indexed flat slice assignment

Was experimenting with numpy and found this strange behavior.
This code works ok:
>>> a = np.array([[1, 2, 3], [4, 5, 6]])
>>> a[:, 1].flat[:] = np.array([-1, -1])
>>> a
array([[ 1, -1,  3],
       [ 4, -1,  6]])
But why doesn't this code change the elements of columns 0 and 2 to -1?
>>> a[:, [0, 2]].flat[:] = np.array([-1, -1])
>>> a
array([[ 1, -1,  3],
       [ 4, -1,  6]])
And how should I write the code so that the elements of columns 0 and 2 are changed to -1?
UPD: the use of flat or something similar is necessary in my example
UPD2: I based the example in the question on this code:
img = imread(img_name)
xor_mask = np.zeros_like(img, dtype=bool)
# msg_bits looks like array([ True, False, False, ..., False, False, True], dtype=bool)
xor_mask[:, :, channel].flat[:len(msg_bits)] = np.ones_like(msg_bits, dtype=bool)
After the assignment to xor_mask, the code works fine with channel == 0, 1 or 2, but if channel == [1,2] or something like it, the assignment does not happen.
In the first example, a[:, 1] is a basic slice, so numpy returns a view rather than a new object, and assigning to the flattened slice is like assigning to the slice itself. In the second example, a[:, [0, 2]] indexes with a list (advanced indexing), so numpy makes a copy, and the assignment goes into that temporary copy instead of a.
Also, you don't need to flatten your slice to add to it:
In [5]: a[:, [0, 2]] += 100
In [6]: a
Out[6]:
array([[101,   2, 103],
       [104,   5, 106]])
As others have pointed out, indexing with a list creates a copy of the original data, so any updates made through its .flat are lost. But flattening a 1D slice is fine, so you can use a for loop to update multiple indexes.
import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6]])
a[:, 1].flat = np.array([-1, -1])
print(a)
# Use for loop to avoid copies
for idx in [0, 2]:
    a[:, idx].flat = np.array([-1, -1])
print(a)
Note that you don't need to use flat[:]: just flat is enough (and probably more efficient).
You could just remove the flat[:] from a[:, [0, 2]].flat[:] += 100:
>>> import numpy as np
>>> a = np.array([[1, 2, 3], [4, 5, 6]])
>>> a[:, 1].flat[:] += 100
>>> a
array([[  1, 102,   3],
       [  4, 105,   6]])
>>> a[:, [0, 2]] += 100
>>> a
array([[101, 102, 103],
       [104, 105, 106]])
But you say it is necessary... Can't you just reshape whatever you are trying to add to the initial array instead of using flat?
The second index call makes a copy of the array while the first returns a reference to it:
>>> import numpy as np
>>> a = np.array([[1, 2, 3], [4, 5, 6]])
>>> b = a[:,1].flat
>>> b[0] += 100
>>> a
array([[  1, 102,   3],
       [  4,   5,   6]])
>>> b =a[:,[0,2]].flat
>>> b[0]
1
>>> b[0] += 100
>>> a
array([[  1, 102,   3],
       [  4,   5,   6]])
>>> b[:]
array([101,   3,   4,   6])
It appears that when the elements you wish to iterate over in a flat manner are not adjacent, numpy makes an iterator over a copy of the array.
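The view-versus-copy distinction described in these answers can be checked with np.shares_memory. When a partial flat-style fill is really needed on several columns, one workaround is to modify the copy and write it back. This is only a sketch; cols, tmp and the fill length 3 are illustrative choices.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# A basic slice is a view on a; a list index produces an independent copy.
slice_is_view = np.shares_memory(a, a[:, 1])
fancy_is_view = np.shares_memory(a, a[:, [0, 2]])
print(slice_is_view, fancy_is_view)

# Copy out, fill the first 3 flattened elements, then assign back.
cols = [0, 2]
tmp = a[:, cols]        # copy with shape (2, 2)
tmp.flat[:3] = -1       # modifies only the copy
a[:, cols] = tmp        # index assignment writes into a itself
print(a)
```

Assignment through the index expression (a[:, cols] = ...) always writes into a; only the chained .flat on the temporary copy is lost.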

Python Scipy How to traverse upper/lower triangular portion non-zeros from csr_matrix

I have a very sparse matrix (a similarity matrix) with dimensions 300k * 300k. To find the relatively larger similarities between users, I only need the upper or lower triangular portion of the matrix. How can I efficiently get the coordinates of the user pairs whose value is larger than a threshold?
Thanks.
How about
sparse.triu(M)
If M is
In [819]: M.A
Out[819]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]], dtype=int32)
In [820]: sparse.triu(M).A
Out[820]:
array([[0, 1, 2],
       [0, 4, 5],
       [0, 0, 8]], dtype=int32)
You may need to construct a new sparse matrix, with just nonzeros above the threshold.
In [826]: sparse.triu(M>2).A
Out[826]:
array([[False, False, False],
       [False,  True,  True],
       [False, False,  True]], dtype=bool)
In [827]: sparse.triu(M>2).nonzero()
Out[827]: (array([1, 1, 2], dtype=int32), array([1, 2, 2], dtype=int32))
Here's the code for triu:
def triu(A, k=0, format=None):
    A = coo_matrix(A, copy=False)
    mask = A.row + k <= A.col
    row = A.row[mask]
    col = A.col[mask]
    data = A.data[mask]
    return coo_matrix((data,(row,col)), shape=A.shape).asformat(format)
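Following the same masking idea as the triu source above, the triangle test and the threshold test can be combined in one pass over the COO arrays, so that only stored entries are touched. This is a sketch: the threshold value is made up, and it assumes the strict upper triangle is wanted (diagonal self-similarities excluded).

```python
import numpy as np
from scipy import sparse

M = sparse.csr_matrix(np.array([[0, 1, 2],
                                [3, 4, 5],
                                [6, 7, 8]]))
threshold = 1  # hypothetical cut-off

coo = M.tocoo()  # exposes .row, .col and .data for the stored entries
# Strict upper triangle (row < col) and value above the threshold.
keep = (coo.row < coo.col) & (coo.data > threshold)
rows, cols, vals = coo.row[keep], coo.col[keep], coo.data[keep]

for r, c, v in zip(rows, cols, vals):
    print(r, c, v)
```

Using coo.row <= coo.col instead would include the diagonal, and coo.row > coo.col selects the lower triangle.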
