Represent a 2-dim matrix for column-wise operation - python

I created a list to represent a 2-dim matrix:
mylist = []
while (some condition):
    x1 = ...
    x2 = ...
    mylist.append([x1, x2])
I would like to test whether each entry in the second column of the matrix is greater than 0.45, but I ran into some difficulty:
>>> mylist
[[1, 2], [1, -3], [-1, -2], [-1, 2], [0, 0], [0, 1], [0, -1]]
>>> mylist[][1] > 0.4
File "<stdin>", line 1
mylist[][1] > 0.4
^
SyntaxError: invalid syntax
>>> mylist[:,1] > 0.4
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not tuple
Given that mylist is a list of sublists, how can I specify all the second components of all its sublists?
Is a list a good choice for representing the 2-dim matrix? I chose it only because the size of the matrix is determined dynamically. What would you recommend?
Thanks!

Use all() like this:
>>> lst = [[1, 2], [1, -3], [-1, -2], [-1, 2], [0, 0], [0, 1], [0, -1]]
>>> all(x > 0.45 for _, x in lst)
False
If you need a list of booleans then use a list comprehension:
>>> [x > 0.45 for _, x in lst]
[True, False, False, True, False, True, False]
mylist[][1] is invalid syntax, but if you can use NumPy you can do something like:
In [1]: arr = np.array([[1, 2], [1, -3], [-1, -2], [-1, 2], [0, 0], [0, 1], [0, -1]])
In [2]: all(arr[:,1] > 0.45)
Out[2]: False
In [4]: arr[:,1] > .45
Out[4]: array([ True, False, False, True, False, True, False], dtype=bool)
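For the literal question of how to address every second component of a list of sublists, here is a minimal sketch using only built-ins. The names second_column, columns and flags are illustrative and not from the original post:

```python
mylist = [[1, 2], [1, -3], [-1, -2], [-1, 2], [0, 0], [0, 1], [0, -1]]

# Pick the element at index 1 from every row.
second_column = [row[1] for row in mylist]

# zip(*mylist) transposes the rows into columns (each column is a tuple).
columns = list(zip(*mylist))

# Element-wise test on the extracted column.
flags = [value > 0.45 for value in second_column]
print(second_column)
print(flags)
```

The zip(*mylist) form is handy when several columns are needed at once; for a single column the comprehension is usually easier to read.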

@Aशwini चhaudhary's solution is fantastic if you continue to use lists.
I would suggest using NumPy, though, as its vectorised functions can provide significant speed increases, especially when working with larger datasets.
import numpy as np
mylist = [[1, 2], [1, -3], [-1, -2], [-1, 2], [0, 0], [0, 1], [0, -1]]
myarray = np.array(mylist)
# Look at all "rows" (chosen by :) and the 2nd "column" (given by 1).
print(myarray[:,1]>0.45)
# [ True False False True False True False]
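On the "size is determined dynamically" concern: one common pattern, sketched here under the assumption that the rows arrive one at a time, is to keep appending to a plain list and convert to an array once at the end. The loop data below is made up and only stands in for the question's "while (some condition)" loop:

```python
import numpy as np

rows = []
# Stand-in for the question's loop; these values are invented for the example.
for x1, x2 in [(1, 2), (1, -3), (-1, -2), (-1, 2)]:
    rows.append([x1, x2])

matrix = np.array(rows)              # convert once, after the final size is known
second_col_ok = matrix[:, 1] > 0.45  # vectorised test on the second column
print(matrix.shape)
print(second_col_ok)
```

Appending to a list is cheap, whereas growing a NumPy array row by row copies the whole array each time, so converting once at the end is usually the better trade-off.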

Related

How can I calculate distance between points in each row of an array

I have an array like this and I have to find the distance between each pair of points. How could I do this in Python with NumPy?
array([[ 8139, 112607],
[ 8139, 115665],
[ 8132, 126563],
[ 8193, 113938],
[ 8193, 123714],
[ 8156, 120291],
[ 8373, 125253],
[ 8400, 131442],
[ 8400, 136354],
[ 8401, 129352],
[ 8439, 129909],
[ 8430, 135706],
[ 8430, 146359],
[ 8429, 139089],
[ 8429, 133243]])
Let's reduce this problem to 4 points:
points = np.array([[8139, 115665], [8132, 126563], [8193, 113938], [8193, 123714]])
In general, you need 2 steps:
Build the indices of the pairs of points you want to take
Apply np.hypot to these pairs.
TL;DR
Building the indices of points
There are many ways to create pairs of indices for the pairs of points you'd like to take. But where do they come from? In every case it's a good idea to start by building them from an adjacency matrix.
Case 1
The most common way is to start by building it like so:
adjacency = np.ones(shape=(len(points), len(points)), dtype=bool)
>>> adjacency
[[ True True True True]
[ True True True True]
[ True True True True]
[ True True True True]]
It corresponds to the indices you need to take, like so:
adjacency_idx_view = np.transpose(np.nonzero(adjacency))
>>> for n in adjacency_idx_view.reshape(len(points), len(points), 2):
...     print(n.tolist())
[[0, 0], [0, 1], [0, 2], [0, 3]]
[[1, 0], [1, 1], [1, 2], [1, 3]]
[[2, 0], [2, 1], [2, 2], [2, 3]]
[[3, 0], [3, 1], [3, 2], [3, 3]]
And this is how you collect them:
x, y = np.nonzero(adjacency)
>>> np.transpose([x, y])
array([[0, 0],
[0, 1],
[0, 2],
[0, 3],
[1, 0],
[1, 1],
[1, 2],
[1, 3],
[2, 0],
[2, 1],
[2, 2],
[2, 3],
[3, 0],
[3, 1],
[3, 2],
[3, 3]], dtype=int64)
It could also be done manually, as in @Corralien's answer:
x = np.repeat(np.arange(len(points)), len(points))
y = np.tile(np.arange(len(points)), len(points))
Case 2
In the previous case every pair of points appears twice, once as (i, j) and once as (j, i). There are also pairs that match a point with itself. A better option is to omit this redundant data and take only the pairs in which the index of the first point is less than the index of the second:
adjacency = np.less.outer(np.arange(len(points)), np.arange(len(points)))
>>> print(adjacency)
[[False True True True]
[False False True True]
[False False False True]
[False False False False]]
x, y = np.nonzero(adjacency)
This form is not widely used, although it is what happens under the hood of np.triu_indices. Hence, as an alternative, we could use:
x, y = np.triu_indices(len(points), 1)
And this results in:
>>> np.transpose([x, y])
array([[0, 1],
[0, 2],
[0, 3],
[1, 2],
[1, 3],
[2, 3]])
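As a quick sanity check, the mask-based construction and np.triu_indices can be compared directly. This is only a sketch for the 4-point case; the variable names are mine:

```python
import numpy as np

n = 4  # number of points in the reduced example

# Mask-based construction: True where row index < column index.
adjacency = np.less.outer(np.arange(n), np.arange(n))
x_mask, y_mask = np.nonzero(adjacency)

# Direct construction of the strict upper-triangle indices.
x_triu, y_triu = np.triu_indices(n, 1)

pairs = np.transpose([x_triu, y_triu])
print(pairs.tolist())
print(np.array_equal(x_mask, x_triu) and np.array_equal(y_mask, y_triu))
```

For n points this yields n * (n - 1) / 2 pairs, i.e. 6 pairs for 4 points.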
Case 3
You could also omit only the pairs that match a point with itself and keep the pairs with swapped points. As in Case 1, this costs twice the memory and time, so I'll include it for demonstration purposes only:
adjacency = ~np.identity(len(points), dtype=bool)
>>> adjacency
array([[False, True, True, True],
[ True, False, True, True],
[ True, True, False, True],
[ True, True, True, False]])
x, y = np.nonzero(adjacency)
>>> np.transpose([x, y])
array([[0, 1],
[0, 2],
[0, 3],
[1, 0],
[1, 2],
[1, 3],
[2, 0],
[2, 1],
[2, 3],
[3, 0],
[3, 1],
[3, 2]], dtype=int64)
I'll leave making x and y manually (without masking) as an exercise for the others.
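One possible way to do that exercise for Case 3, offered as a sketch rather than the answer the author had in mind: repeat each first index n - 1 times, then shift the second index past the diagonal. The helper variable k is my own naming:

```python
import numpy as np

n = 4  # number of points

x = np.repeat(np.arange(n), n - 1)   # each first index appears n - 1 times
k = np.tile(np.arange(n - 1), n)     # 0 .. n-2, repeated for every first index
y = k + (k >= x)                     # add 1 from the diagonal onwards, skipping j == i

# Reference result from the masking approach.
x_ref, y_ref = np.nonzero(~np.identity(n, dtype=bool))
print(np.array_equal(x, x_ref) and np.array_equal(y, y_ref))
```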
Apply np.hypot
Instead of np.sqrt(np.sum((a - b) ** 2, axis=1)) you could do np.hypot(*np.transpose(a - b)). I'll take Case 2 as my index generator:
def distance(points):
    x, y = np.triu_indices(len(points), 1)
    x_coord, y_coord = np.transpose(points[x] - points[y])
    return np.hypot(x_coord, y_coord)
>>> distance(points)
array([10898.00224812, 1727.84403231, 8049.18113848, 12625.14736548,
2849.65296133, 9776. ])
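If a full square distance matrix is more convenient than the condensed list of pairs, broadcasting is another option. Note that this is a different technique from the index-based approach above; it is shown here as a sketch, with dist_matrix and condensed being names I made up:

```python
import numpy as np

points = np.array([[8139, 115665], [8132, 126563], [8193, 113938], [8193, 123714]])

# diff[i, j] is the coordinate difference points[i] - points[j] (shape n x n x 2).
diff = points[:, None, :] - points[None, :, :]
dist_matrix = np.hypot(diff[..., 0], diff[..., 1])

# The condensed Case 2 result is the strict upper triangle of this matrix.
i, j = np.triu_indices(len(points), 1)
condensed = dist_matrix[i, j]
print(dist_matrix.shape)
print(condensed)
```

The square matrix stores every distance twice plus a zero diagonal, so it trades memory for the convenience of looking up dist_matrix[i, j] directly.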
You can use np.repeat and np.tile to create all combinations then compute the euclidean distance:
xy = np.array([[8139, 115665], [8132, 126563], [8193, 113938], [8193, 123714],
[8156, 120291], [8373, 125253], [8400, 131442], [8400, 136354],
[8401, 129352], [8439, 129909], [8430, 135706], [8430, 146359],
[8429, 139089], [8429, 133243]])
a = np.repeat(xy, len(xy), axis=0)
b = np.tile(xy, [len(xy), 1])
d = np.sqrt(np.sum((a - b) ** 2, axis=1))
The shape of d is (196,), which is 14 x 14 (one distance for every ordered pair of points).
Update
but I have to do it in a function.
def distance(xy):
    a = np.repeat(xy, len(xy), axis=0)
    b = np.tile(xy, [len(xy), 1])
    return np.sqrt(np.sum((a - b) ** 2, axis=1))
d = distance(xy)
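Because np.repeat varies the first point slowly and np.tile varies the second point quickly, the flat result can be reshaped into a square matrix. A small sketch using only the first three points from the question (d_matrix is my own name):

```python
import numpy as np

def distance(xy):
    a = np.repeat(xy, len(xy), axis=0)   # point i repeated len(xy) times
    b = np.tile(xy, [len(xy), 1])        # all points, cycled len(xy) times
    return np.sqrt(np.sum((a - b) ** 2, axis=1))

xy = np.array([[8139, 115665], [8132, 126563], [8193, 113938]])  # first three points only
d = distance(xy)

# d[i * n + j] is the distance between point i and point j, so reshape to n x n.
d_matrix = d.reshape(len(xy), len(xy))
print(d_matrix)
```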

How to delete an element from a 2D Numpy array without knowing its position

I have a 2D array:
[[0,0], [0,1], [1,0], [1,1]]
I want to delete the [0,1] element without knowing its position within the array (as the elements may be shuffled).
Result should be:
[[0,0], [1,0], [1,1]]
I've tried using numpy.delete but keep getting back a flattened array:
>>> arr = np.array([[0,0], [0,1], [1,0], [1,1]])
>>> arr
array([[0, 0],
[0, 1],
[1, 0],
[1, 1]])
>>> np.delete(arr, [0,1])
array([0, 1, 1, 0, 1, 1])
Specifying the axis removes the 0, 1 elements rather than searching for the element (which makes sense):
>>> np.delete(arr, [0,1], axis=0)
array([[1, 0],
[1, 1]])
And trying to find the location (as has been suggested) seems equally problematic:
>>> np.where(arr==[0,1])
(array([0, 1, 1, 3]), array([0, 0, 1, 1]))
(Where did that 3 come from?!?)
Here we find all of the rows that match the candidate [0, 1]
>>> (arr == [0, 1]).all(axis=1)
array([False, True, False, False])
Or alternatively, the rows that do not match the candidate
>>> ~(arr == [0, 1]).all(axis=1)
array([ True, False, True, True])
So, to select all those rows that do not match [0, 1]
>>> arr[~(arr == [0, 1]).all(axis=1)]
array([[0, 0],
[1, 0],
[1, 1]])
Note that this will create a new array.
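As for where the 3 in the np.where output came from: the comparison is element-wise, so any row that shares even one position with [0, 1] contributes an index. A short sketch that makes the intermediate mask visible (the variable names are mine):

```python
import numpy as np

arr = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

elementwise = arr == [0, 1]          # [0, 1] is broadcast against every row
print(elementwise)

rows, cols = np.where(elementwise)   # one entry per True element, not per row
row_match = elementwise.all(axis=1)  # True only where the whole row matches
print(rows, cols)
print(row_match)
```

Row 3 is [1, 1], whose second element equals 1, so it shows up in the element-wise result even though the row as a whole does not match.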
mask = (arr==np.array([0,1])).all(axis=1)
arr1 = arr[~mask,:]
Look at mask. It should be [False, True, ...].
From the documentation:
numpy.delete(arr, obj, axis=None)
axis : int, optional
The axis along which to delete the subarray defined by obj. If axis
is None, obj is applied to the flattened array
If you don't specify the axis (i.e. leave it as None), the array is flattened automatically; you need to specify the axis parameter, in your case np.delete(arr, [0,1], axis=0).
However, just like in the example above, [0,1] is then treated as a list of indices; you must provide the indices (locations) of the rows to delete, which you can find with np.where(condition), for example.
Here you have a working example:
my_array = np.array([[0, 1],
[1, 0],
[1, 1],
[0, 0]])
row_index, = np.where(np.all(my_array == [0, 1], axis=1))
my_array = np.delete(my_array, row_index,axis=0)
print(my_array)
#Output is below
[[1 0]
[1 1]
[0 0]]
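If this is needed in several places, the mask can be wrapped in a small function. remove_row below is a hypothetical helper, not a NumPy function; the sketch assumes a 2D array and a row of matching length:

```python
import numpy as np

def remove_row(arr, row):
    """Return a copy of arr without the rows equal to row (hypothetical helper)."""
    keep = ~(arr == np.asarray(row)).all(axis=1)
    return arr[keep]

# Shuffle the rows so the position of [0, 1] is not known in advance.
rng = np.random.default_rng(0)
shuffled = rng.permutation(np.array([[0, 0], [0, 1], [1, 0], [1, 1]]))

result = remove_row(shuffled, [0, 1])
print(result)
```

If the row occurs more than once, every occurrence is removed; if it does not occur at all, the array comes back unchanged.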

Masking two square matrices

I'm new to Python and there is something I am not sure how to do. I have the following matrices:
A=[[1,1,1],[1,1,1],[1,1,1]]
B=[[False,True,False],[True,False,True],[False,True,False]]
I would like to use B to transform A into the following Matrix:
A=[[0,1,0],[1,0,1],[0,1,0]]
I'm sure it is quite simple but, as said, I'm new to python so if you could tell me how to do that I'd appreciate it.
Many thanks
Your best bet for this is to use numpy:
import numpy as np
data = np.array([[1, 2, 3,],
[4, 5, 6,],
[7, 8, 9,],])
mask = np.array([[False, True, False,],
[True, False, True,],
[False, True, False,],])
filtered_data = data * mask
which results in filtered_data of:
array([[0, 2, 0],
[4, 0, 6],
[0, 8, 0]])
Without numpy you can do it with a nested list comprehension, but I'm sure you'll agree the numpy solution is much clearer if it's an option:
data = [[1, 2, 3,],
[4, 5, 6,],
[7, 8, 9,],]
mask = [[False, True, False,],
[True, False, True,],
[False, True, False,],]
filtered_data = [[data_elem if mask_elem else 0
for data_elem, mask_elem in zip(data_row, mask_row)]
for data_row, mask_row in zip(data, mask)]
which gives you filtered_data equal to
[[0, 2, 0], [4, 0, 6], [0, 8, 0]]
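Another way to express the same masking in NumPy is np.where, which reads as "take data where the mask is True, otherwise 0". A minimal sketch with the same example values:

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
mask = np.array([[False, True, False], [True, False, True], [False, True, False]])

# Keep data where mask is True, use 0 everywhere else.
filtered = np.where(mask, data, 0)
print(filtered)
```

Unlike the multiplication trick, this form also works when the fill value should be something other than 0.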
Using enumerate
Ex:
A=[[1,1,1],[1,1,1],[1,1,1]]
B=[[False,True,False],[True,False,True],[False,True,False]]
for ind, val in enumerate(B):
    for sub_ind, sub_val in enumerate(val):
        A[ind][sub_ind] = int(sub_val)
print(A)
Output:
[[0, 1, 0], [1, 0, 1], [0, 1, 0]]
You could just do
[ [int(y) for y in x] for x in B ]
Doing int() on a Boolean.
int(False) --> 0
int(True) --> 1
With numpy.multiply you'll get what you want:
import numpy as np
A=[[1,1,1],[1,1,1],[1,1,1]]
B=[[False,True,False],[True,False,True],[False,True,False]]
np.multiply(A, B)
#array([[0, 1, 0],
# [1, 0, 1],
# [0, 1, 0]])
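np.multiply returns a new array. If A itself should change, boolean-mask assignment is one option, assuming A and B are first converted to NumPy arrays (a sketch, not part of the original answer):

```python
import numpy as np

A = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
B = np.array([[False, True, False], [True, False, True], [False, True, False]])

# Zero out every position where B is False, modifying A in place.
A[~B] = 0
print(A)
```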
Since you asked for A to be modified, here's a solution that doesn't create a new list but modifies A. It uses zip and enumerate:
A=[[1,1,1],[1,1,1],[1,1,1]]
B=[[False,True,False],[True,False,True],[False,True,False]]
for x,y in zip(A,B):
    for x1,y1 in zip(enumerate(x),y):
        x[x1[0]] = int(y1)
print(A)
Output:
[[0, 1, 0], [1, 0, 1], [0, 1, 0]]
If you want to modify A using flags in B, you can do it like that:
A = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
B = [[False, True, False], [True, False, True], [False, True, False]]
C = [[int(A_el == B_el) for A_el, B_el in zip(A_ar, B_ar)] for A_ar, B_ar in zip(A, B)]
Output:
[[0, 1, 0], [1, 0, 1], [0, 1, 0]]
You can also iterate using indexes:
C = [[int(A[i][j] == B[i][j]) for j in range(len(A[0]))] for i in range(len(A))]
try this
A=[[1,1,1],[1,1,1],[1,1,1]]
B=[[False,True,False],[True,False,True],[False,True,False]]
X = [[x and y for x,y in zip(a,b)] for a,b in zip(A,B)]
C = [ [int(x) for x in c] for c in X ]
print(C)
output
[[0, 1, 0], [1, 0, 1], [0, 1, 0]]
Basic double for loop:
for i in range(len(A)):
    for j in range(len(A[0])):
        A[i][j]= int(B[i][j])*A[i][j]
print (A)
output:
[[0, 1, 0], [1, 0, 1], [0, 1, 0]]
example:
A=[[1,1,1],[1,1,1],[0,0,0]]
B=[[False,True,False],[True,False,True],[False,True,False]]
for i in range(len(A)):
    for j in range(len(A[0])):
        A[i][j]= int(B[i][j])*A[i][j]
print (A)
output:
[[0, 1, 0], [1, 0, 1], [0, 0, 0]]

numpy: find symmetric values in 2d arrays

I have to analyze a square 2D numpy array LL for values which are symmetric (LL[i,j] == LL[j,i]) and not zero.
Is there a faster, more "array-like" way to do this without loops?
Is there an easy way to store the indices of the values for later use without creating an array and appending the tuple of indices in every loop iteration?
Here is my classical looping approach to store the indices:
IdxArray = np.empty((0, 2), dtype=int) # Array to store the indices
for i in range(len(LL)):
    for j in range(i+1,len(LL)):
        if LL[i,j] != 0.0:
            if LL[i,j] == LL[j,i]:
                IdxArray = np.vstack((IdxArray,[i,j]))
later use the indices:
for idx in IdxArray:
    P = LL[idx[0], idx[1]]*(TT[idx[0]]-TT[idx[1]])
    ...
>>> a = numpy.matrix('5 2; 5 4')
>>> b = numpy.matrix('1 2; 3 4')
>>> a.T == b.T
matrix([[False, False],
[ True, True]], dtype=bool)
>>> a == a.T
matrix([[ True, False],
[False, True]], dtype=bool)
>>> numpy.nonzero(a == a.T)
(matrix([[0, 1]]), matrix([[0, 1]]))
How about this:
a = np.array([[1,0,3,4],[0,5,4,6],[7,4,4,5],[3,4,5,6]])
np.fill_diagonal(a, 0) # changes original array, must be careful
overlap = (a == a.T) * a
indices = np.argwhere(overlap != 0)
Result:
>>> a
array([[0, 0, 3, 4],
[0, 0, 4, 6],
[7, 4, 0, 5],
[3, 4, 5, 0]])
>>> overlap
array([[0, 0, 0, 0],
[0, 0, 4, 0],
[0, 4, 0, 5],
[0, 0, 5, 0]])
>>> indices
array([[1, 2],
[2, 1],
[2, 3],
[3, 2]])
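If the original array must stay untouched and each symmetric pair should be reported only once (i < j, as in the question's loop), a variant along these lines might work. It is a sketch: the TT values are invented to stand in for the question's TT, and symmetric_nonzero is my own name:

```python
import numpy as np

LL = np.array([[1, 0, 3, 4], [0, 5, 4, 6], [7, 4, 4, 5], [3, 4, 5, 6]], dtype=float)
TT = np.array([10.0, 20.0, 30.0, 40.0])   # made-up values standing in for the question's TT

# True where the entry equals its mirror image and is not zero.
symmetric_nonzero = (LL == LL.T) & (LL != 0)

# Keep only the strict upper triangle, so the diagonal is dropped and each pair appears once.
i, j = np.nonzero(np.triu(symmetric_nonzero, k=1))

# Vectorised version of the "later use" loop from the question.
P = LL[i, j] * (TT[i] - TT[j])
print(i, j)
print(P)
```

The index arrays i and j can be stored and reused, which avoids growing an array with np.vstack inside a loop.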

Python/Numpy index of array in array

I am just playing around with a particle simulator, I want to use matplotlib with python and numpy to make as realistic a simulator as possible as efficiently as possible (this is purely an exercise in fun with python) and I have a problem trying to calculate the inverse of distances.
I have an array containing positions of particles (x,y) like so:
x = random.randint(0,3,10).reshape(5,2)
>>> x
array([[1, 1],
[2, 1],
[2, 2],
[1, 2],
[0, 1]])
This is 5 particles with positions (x,y) in [0,3]. Now if I want to calculate the distance between one particle (say particle with position (0,1)) and the rest I would do something like
>>>x - [0,1]
array([[1, 0],
[2, 0],
[2, 1],
[1, 1],
[0, 0]])
The problem is I do NOT want to take the distance of the particle to itself: (0,0). This has length 0, so its inverse is infinite, which is not defined for, say, gravity or the Coulomb force.
So I tried:
>>>where(x==[0,1])
(array([0, 1, 4, 4]), array([1, 1, 0, 1]))
Which is not the position of the (0,1) particle in the x array. So how do I pick out the position of [0,1] from an array like x? The where() above checks where x is equal to 0 OR 1, not where x is equal to [0,1]. How do I do this "numpylike" without looping?
Ps: How the frack do you copy-paste code into stackoverflow? I mean bad forums have a [code]..[/code] option while here I spend 15 minutes properly indenting code (since tab in chromium on ubuntu simply hops out of the window instead of indenting with 4 whitespaces....) This is VERY annoying.
Edit: Seeing the first answer I tried:
x
array([[0, 2],
[2, 2],
[1, 0],
[2, 2],
[1, 1]])
>>> all(x==[1,1],axis=1)
array([False, False, False, False, True], dtype=bool)
>>> all(x!=[1,1], axis=1)
array([ True, True, False, True, False], dtype=bool)
Which is not what I was hoping for, the != should return the array WITHOUT [1,1]. But alas, it misses one (1,0):
>>>x[all(x!=[1,1], axis=1)]
array([[0, 2],
[2, 2],
[2, 2]])
Edit2: any did the trick, it makes more logical sense than all I suppose, thank you!
>>> import numpy as np
>>> x=np.array([[1, 1],
... [2, 1],
... [2, 2],
... [1, 2],
... [0, 1]])
>>> np.all(x==[0,1], axis=1)
array([False, False, False, False, True], dtype=bool)
>>> np.where(np.all(x==[0,1], axis=1))
(array([4]),)
>>> np.where(np.any(x!=[0,1], axis=1))
(array([0, 1, 2, 3]),)
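Putting this together for the original goal, the inverse distances from one particle to all the others could be computed roughly like this. It is a sketch that assumes no other particle sits at exactly the same position as the target; the names target, others and inverse_distance are mine:

```python
import numpy as np

x = np.array([[1, 1], [2, 1], [2, 2], [1, 2], [0, 1]])
target = np.array([0, 1])

# Keep every row that differs from the target in at least one coordinate.
others = x[np.any(x != target, axis=1)]

delta = others - target
inverse_distance = 1.0 / np.hypot(delta[:, 0], delta[:, 1])
print(others)
print(inverse_distance)
```

Note that filtering by position also drops any other particle at the same coordinates; filtering by row index instead would keep it, but then its distance would be zero.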
