Python, neighbors on a regular grid - python

Let's suppose I have a set of 2D coordinates that represent the centers of cells of a 2D regular mesh. I would like to find, for each cell in the grid, the two closest neighbors in each direction.
The problem is quite straightforward if one assigns to each cell an index defined as follows:
idx_cell = idx+N*idy
where N is the number of cells along the x-axis (i.e. per row of the grid), idx=x/dx and idy=y/dx, with x and y being the x-coordinate and the y-coordinate of a cell and dx its size.
For example, the neighboring cells for a cell with idx_cell=5 are the cells with idx_cell equal to 4,6 (for the x-axis) and 5+N,5-N (for the y-axis).
The problem that I have is that my implementation of the algorithm is quite slow for large (N>1e6) data sets.
For instance, to get the neighbors of the x-axis I do
[x[(idx_cell==idx_cell[i]-1)|(idx_cell==idx_cell[i]+1)] for i in cells]
Do you think there's a faster way to implement this algorithm?

You are basically reinventing the indexing scheme of a multidimensional array. It is relatively easy to code, but you can use the two functions unravel_index and ravel_multi_index to your advantage here.
If your grid is of M rows and N columns, to get the idx and idy of a single item you could do:
>>> import numpy as np
>>> M, N = 12, 10
>>> np.unravel_index(4, shape=(M, N))
(0, 4)
This also works if, instead of a single index, you provide an array of indices:
>>> np.unravel_index([15, 28, 32, 97], shape=(M, N))
(array([1, 2, 3, 9], dtype=int64), array([5, 8, 2, 7], dtype=int64))
So if cells has the indices of several cells you want to find neighbors to:
>>> cells = np.array([15, 28, 32, 44, 87])
You can get their neighbors as:
>>> idy, idx = np.unravel_index(cells, shape=(M, N))
>>> neigh_idx = np.vstack((idx-1, idx+1, idx, idx))
>>> neigh_idy = np.vstack((idy, idy, idy-1, idy+1))
>>> np.ravel_multi_index((neigh_idy, neigh_idx), dims=(M,N))
array([[14, 27, 31, 43, 86],
[16, 29, 33, 45, 88],
[ 5, 18, 22, 34, 77],
[25, 38, 42, 54, 97]], dtype=int64)
Or, if you prefer it like that:
>>> np.ravel_multi_index((neigh_idy, neigh_idx), dims=(M,N)).T
array([[14, 16, 5, 25],
[27, 29, 18, 38],
[31, 33, 22, 42],
[43, 45, 34, 54],
[86, 88, 77, 97]], dtype=int64)
The nicest thing about going this way is that ravel_multi_index has a mode keyword argument you can use to handle items on the edges of your lattice, see the docs.
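To make the edge handling concrete, here is a minimal sketch of the two non-default modes, applied to the four corner cells of the same 12 x 10 grid. The corner cells are chosen purely for illustration; which mode is appropriate depends on whether your grid is periodic.

```python
import numpy as np

M, N = 12, 10
cells = np.array([0, 9, 110, 119])  # the four corners of the grid

idy, idx = np.unravel_index(cells, (M, N))
neigh_idx = np.vstack((idx - 1, idx + 1, idx, idx))
neigh_idy = np.vstack((idy, idy, idy - 1, idy + 1))

# mode='wrap' treats the grid as periodic: stepping off one edge
# lands on the opposite edge.
wrapped = np.ravel_multi_index((neigh_idy, neigh_idx), (M, N), mode='wrap')

# mode='clip' clamps out-of-range coordinates to the nearest valid one,
# so an edge cell is reported as its own neighbour in that direction.
clipped = np.ravel_multi_index((neigh_idy, neigh_idx), (M, N), mode='clip')

print(wrapped)
print(clipped)
```

With the default mode='raise', the same call raises a ValueError for these cells, which is useful if out-of-range neighbours should be treated as a bug.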

Related

reducing colors with numpy

I'm writing a script to reduce the number of colors in a list by finding clusters. The problem I seem to run into is that the clusters will have different dimensions. Here is my jumping-off point after the original list of 6 colors has already been separated into 3 clusters:
import numpy
a = numpy.array([
[12, 44, 52],
[27, 0, 71],
[81, 99, 92]
])
b = numpy.array([
[ 12, 13, 93],
[128, 128, 128]
])
c = numpy.array([
[ 57, 14, 255]
])
clusters = numpy.array([a,b,c])
print(numpy.min(clusters, axis=1))
However now the function numpy.min() starts to throw an error - I suspect it's because of the differently sized arrays.
The cluster arrays will always have the shape (x, 3) (x number of colors, 3 components). I want to get an array with the minimums of all components of the colors in one cluster (n, 3) (n is number of clusters) - so array([12, 0, 52], [12, 13, 93], [57, 14, 255]) in this case.
Is there a way to do this? As I mentioned it works as long as all clusters have multiple values.
Since your arrays a, b and c don't have an equal shape, you can't put them in the same array (at least if you don't pad with some value). You could calculate the minimum first and then generate an array from these minima:
numpy.array([arr.min(axis=0) for arr in (a, b, c)])
Which gives you:
array([[ 12, 0, 52],
[ 12, 13, 93],
[ 57, 14, 255]])
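If there are many clusters, the Python-level loop can be avoided. The following is only a sketch of one alternative, not part of the answer above: it stacks all colors into one (total, 3) array and uses np.minimum.reduceat with the starting row of each cluster. It assumes that no cluster is empty; the names stacked, sizes and starts are illustrative.

```python
import numpy as np

a = np.array([[12, 44, 52], [27, 0, 71], [81, 99, 92]])
b = np.array([[12, 13, 93], [128, 128, 128]])
c = np.array([[57, 14, 255]])

clusters = [a, b, c]
stacked = np.concatenate(clusters, axis=0)             # shape (6, 3)
sizes = np.array([len(arr) for arr in clusters])       # [3, 2, 1]
starts = np.concatenate(([0], np.cumsum(sizes)[:-1]))  # [0, 3, 5]

# reduceat takes the minimum over rows starts[k]:starts[k+1] for each k
# (the last segment runs to the end of the array).
minima = np.minimum.reduceat(stacked, starts, axis=0)
print(minima)
```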

Calculate sum of all directly surrounding elements to some element in matrix

I need to calculate the sum of all the elements directly surrounding a given element in a matrix.
[ [1, 2, 3],
[4, 5, 6],
[7, 8, 9] ]
so that sum_neighbours(matrix[0][0]) == 11 and sum_neighbours(matrix[1][1]) == 40.
The problem is just that I'm a beginner and I don't know how to make sum_neighbours calculate how many neighbours a certain number has.
I figured that I could write an if-elif-else statement and hard-code the number of neighbours that each value in the matrix has, but surely there must be a more efficient way to do this?
Otherwise it'll only be able to calculate the sum of the neighbours for matrices that are 3 x 3.
A nice approach is to use numpy and a convolution:
import numpy as np
from scipy.signal import convolve2d
a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
convolve2d(a, [[1,1,1],[1,0,1],[1,1,1]], mode='same')
# top center bottom
output:
array([[11, 19, 13],
[23, 40, 27],
[17, 31, 19]])
Alternatively:
convolve2d(a, np.ones((3,3)), mode='same')-a
# this sums the neighbours + the center
# so we need to subtract the initial array
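Example on a larger array, this time leaving one neighbour out of the sum. Note that convolve2d flips the kernel, so the 0 in the top-left corner of the kernel below excludes the bottom-right neighbour (use scipy.signal.correlate2d if you want the kernel applied unflipped).
This is just to show you how easy it is to perform similar operations when using convolutions: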
a = np.arange(5*6).reshape((5,6))
# array([[ 0, 1, 2, 3, 4, 5],
# [ 6, 7, 8, 9, 10, 11],
# [12, 13, 14, 15, 16, 17],
# [18, 19, 20, 21, 22, 23],
# [24, 25, 26, 27, 28, 29]])
convolve2d(a, [[0,1,1],[1,0,1],[1,1,1]], mode='same')
array([[ 7, 15, 19, 23, 27, 25],
[ 20, 42, 49, 56, 63, 52],
[ 44, 84, 91, 98, 105, 82],
[ 68, 126, 133, 140, 147, 112],
[ 62, 107, 112, 117, 122, 73]])
If you would like to achieve this without any imports (the underlying assumption is that you have already checked that you have a well formed list of lists/matrix i.e. all the rows have the same length):
# Pass the matrix and the (i, j) coordinates of the element of interest.
# select() returns the sub-"matrix" around (i, j), flooring indices at 0 and
# capping them at the number of elements in the list. This handles the
# elements on the edge of the matrix.
def select(m, i, j):
    def s(x, y):
        return x[max(0, y-1):min(len(x), y+1) + 1]
    return [s(x, j) for x in s(m, i)]

def sum_around(m, i, j, excluded=True):
    # Sum the elements within each row and compute the grand total,
    # then subtract the element at (i, j) if excluded is True
    # (which is the default behaviour and what you want here).
    return sum([sum(x) for x in select(m, i, j)]) - (m[i][j] if excluded else 0)

m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(sum_around(m, 0, 0))  # prints 11
print(sum_around(m, 1, 1))  # prints 40
You can also pad the matrix with an extra row and column of zeros on every boundary.
Then you can add up the neighbouring elements of any cell without special-casing the boundaries.
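The zero-padding idea can be sketched as follows. This is one possible reading of the suggestion, using np.pad and shifted views; the function name neighbour_sums is made up for the illustration.

```python
import numpy as np

def neighbour_sums(matrix):
    """Return an array holding, for each cell, the sum of its 8 neighbours."""
    a = np.asarray(matrix)
    rows, cols = a.shape
    # One ring of zeros around the matrix, so edge cells also have 8 "neighbours".
    p = np.pad(a, 1, mode='constant', constant_values=0)
    total = np.zeros_like(a)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # skip the cell itself
            # View of the padded array shifted by (di, dj), same shape as a.
            total += p[1 + di:1 + di + rows, 1 + dj:1 + dj + cols]
    return total

result = neighbour_sums([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(result)
```

For the 3 x 3 example this reproduces the sums 11 for the top-left element and 40 for the centre, and it works unchanged for any rectangular matrix.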

what is the meaning of X[i,] in python [duplicate]

I was wondering what the use of the comma was when slicing Python arrays - I have an example that appears to work, but the line that looks weird to me is
p = 20*numpy.log10(numpy.abs(numpy.fft.rfft(data[:2048, 0])))
Now, I know that when slicing an array, the first number is start, the next is end, and the last is step, but what does the comma after the end number designate? Thanks.
It is being used to extract a specific column from a 2D array.
So your example would extract column 0 (the first column) from the first 2048 rows (0 to 2047). Note however that this syntax will only work for numpy arrays and not general python lists.
Empirically - create an array using numpy
m = np.fromfunction(lambda i, j: (i +1)* 10 + j + 1, (9, 4), dtype=int)
which assigns an array like below to m
array(
[[11, 12, 13, 14],
[21, 22, 23, 24],
[31, 32, 33, 34],
[41, 42, 43, 44],
[51, 52, 53, 54],
[61, 62, 63, 64],
[71, 72, 73, 74],
[81, 82, 83, 84],
[91, 92, 93, 94]])
Now for the slice
m[:,0]
giving us
array([11, 21, 31, 41, 51, 61, 71, 81, 91])
I may have misinterpreted Khan Academy (so take with grain of salt):
In linear algebra terms, m[:,n] is taking the nth column vector of
the matrix m
See Abhranil's note how this specific interpretation only applies to numpy
It slices with a tuple. What exactly the tuple means depends on the object being sliced. In NumPy arrays, it performs an m-dimensional slice on an n-dimensional array.
>>> class C(object):
...     def __getitem__(self, val):
...         print(val)
...
>>> c = C()
>>> c[1:2,3:4]
(slice(1, 2, None), slice(3, 4, None))
>>> c[5:6,7]
(slice(5, 6, None), 7)
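A small sketch of the practical consequence: a plain list rejects a tuple index, while a NumPy array interprets it as one index per axis. The variable names are illustrative only.

```python
import numpy as np

data = [[1, 2], [3, 4], [5, 6]]

# A plain list does not know what to do with a tuple index.
try:
    data[:2, 0]
    list_result = "worked"
except TypeError:
    list_result = "TypeError"

arr = np.array(data)
column = arr[:2, 0]                     # first column of the first two rows
same_column = arr[(slice(None, 2), 0)]  # the same index, written out as a tuple

print(list_result, column, same_column)
```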

Search a number in a sorted 2D array

I'm trying to find a given number in a 2D list. However, the list has to be sorted before searching.
Searching for a number in the 2D array seems to work fine. The problem is sorting the 2D array in a way that keeps the search working. Let's assume I want to sort a 3x3 2D array such as:
[[8, 27, 6],
[1, 0, 11],
[10, 9, 3]]
Then I will look for a number in the sorted 2D array using binary search. My mid value will be the middle element of the range being searched.
This is just an example; what I actually want is to fill the array with random numbers and then sort across rows and columns. I'm using the random.randint() function from Python to generate the numbers. Then I try to sort my 2D array, but it isn't really sorted before the search continues.
import random

n = 5
m = 5

def findnum_arr(array, num):
    low = 0
    high = n * m - 1
    while (high >= low):
        mid = (low + high) // 2
        i = mid // m
        j = mid % m
        if (num == array[i][j]):
            return True
        if (num < array[i][j]):
            high = mid - 1
        else:
            low = mid + 1
    return False

if __name__ == '__main__':
    multi_array = [[random.randint(0, 20) for x in range(n)] for y in range(m)]
    sorted(multi_array)
Sorted:
[[0, 1, 3],
[6, 8, 9],
[10, 11, 27]]
Should be the sorted 2D array. Is it possible that both the row and column are sorted respectively with the sorted function?
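Calling sorted on a nested list only reorders the rows, comparing them by their first element (and by later elements on ties); it does not sort the values inside the rows. It also returns a new list rather than sorting in place, so the result has to be assigned.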
Example:
arr = [[8, 27, 6],[1, 0, 11],[10, 15, 3], [16, 12, 14], [4, 9, 13]]
is going to return
[[1, 0, 11], [4, 9, 13], [8, 27, 6], [10, 15, 3], [16, 12, 14]]
To get the ordering you want, you have to flatten the list, sort it, and then reshape it.
To do this, I would try introducing numpy.
import numpy as np
a = np.array(sorted(sum(arr, [])))
#sorted(sum(arr, [])) flattens the list
b = np.reshape(a, (-1,3)).tolist()
EDITED FOR CLARITY: You can use your m and n as the target shape in np.reshape. The first value (m) sets the number of rows, while the second (n) sets the number of elements in each row.
The use of -1 in either parameter means that the reshaped array will be fit to return the requirements of the other parameter.
b would return
[[0, 1, 3], [4, 6, 8], [9, 10, 11], [12, 13, 14], [15, 16, 27]]
I finally found a proper solution without numpy and without the sum() function.
if __name__ == '__main__':
    x = 7
    multi_array = [[random.randint(0, 200) for _ in range(n)] for _ in range(m)]
    # Another way, if you are using itertools:
    # one_array = sorted(itertools.chain.from_iterable(multi_array))
    one_array = sorted([value for row in multi_array for value in row])
    # Rows of length m, matching the mid // m and mid % m indexing in findnum_arr
    sorted_2d = [one_array[i:i+m] for i in range(0, len(one_array), m)]
    print("multi_array list is: \n{0}\n".format(multi_array))
    print("sorted 2D array: \n{0}\n".format(sorted_2d))
    if not findnum_arr(sorted_2d, x):
        print("Not Found")
    else:
        print("Found")
output:
multi_array list is:
[[40, 107, 23, 27, 42], [150, 84, 108, 191, 172], [154, 22, 161, 26, 31], [18, 150, 197, 77, 191], [96, 124, 81, 125, 186]]
sorted 2D array:
[[18, 22, 23, 26, 27], [31, 40, 42, 77, 81], [84, 96, 107, 108, 124], [125, 150, 150, 154, 161], [172, 186, 191, 191, 197]]
Not Found
I wanted to find a standard library module that would let me flatten the 2D array into a 1D list and sort it. Then I use a list comprehension to rebuild the sorted 1D list as a 2D array. This sounds like a lot of work but seems to work fine. Let me know if there is a better and faster way to do it without numpy :)
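As one possible standard-library variant (a sketch, not the poster's code): since the data is flattened and sorted anyway, the bisect module can do the binary search on the flat list directly, and the 2D form is rebuilt only for display. The helper names sort_2d and contains are made up for this example.

```python
from bisect import bisect_left
from itertools import chain

def sort_2d(grid, width):
    """Flatten grid, sort all values, and rebuild rows of the given width."""
    flat = sorted(chain.from_iterable(grid))
    return [flat[i:i + width] for i in range(0, len(flat), width)]

def contains(flat_sorted, num):
    """Binary search for num in a sorted flat list."""
    pos = bisect_left(flat_sorted, num)
    return pos < len(flat_sorted) and flat_sorted[pos] == num

grid = [[8, 27, 6], [1, 0, 11], [10, 9, 3]]
sorted_grid = sort_2d(grid, 3)
flat = list(chain.from_iterable(sorted_grid))

print(sorted_grid)
print(contains(flat, 9), contains(flat, 7))
```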

New array of smaller size excluding one value from each column

In Python 2.7, using numpy or any other means, if I had an array of any size and wanted to exclude certain values and output the new array, how would I do that? Here is what I would like:
[(1,2,3),
(4,5,6), then exclude [4,2,9] to make the array[(1,5,3),
(7,8,9)] (7,8,6)]
I would always be excluding data of the same length as the row length, and always exactly one entry per column. [(1,5,3)] would be another example of data I would want to exclude. So every time I loop the function, it reduces the number of rows in the array by one. I would imagine I have to use a masked array, or convert my mask to a masked array and subtract the two, then maybe condense the output, but I have no idea how. Thanks for your time.
You can do it very efficiently if you transform your 2-D array in an unraveled 1-D array. Then you repeat the array with the elements to be excluded, called e in order to do an element-wise comparison:
import numpy as np
a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
e = [1, 5, 3]
ar = a.T.ravel()
er = np.repeat(e, a.shape[0])
ans = ar[er != ar].reshape(a.shape[1], a.shape[0]-1).T
Note that this only works if each value in e occurs exactly once in its column of a.
EDIT:
As suggested by @Jaime, you can avoid the ravel() and get the same result directly:
ans = a.T[(a != e).T].reshape(a.shape[1], a.shape[0]-1).T
To exclude vector e from matrix a:
import numpy as np
a = np.array([(1,2,3), (4,5,6), (7,8,9)])
e = [4,2,9]
print(np.array([[i for i in a.transpose()[j] if i != e[j]]
                for j in range(len(e))]).transpose())
This would take some work to generalize, but here's something that can handle 2-d cases of the kind you describe. If passed unexpected input, this won't notice and will generate strange results, but it's at least a starting point:
import numpy

def columnwise_compress(a, values):
    a_shape = a.shape
    a_trans_flat = a.transpose().reshape(-1)
    # numpy.isin replaces the older numpy.in1d
    compressed = a_trans_flat[~numpy.isin(a_trans_flat, values)]
    return compressed.reshape(a_shape[:-1] + ((a_shape[0] - 1),)).transpose()
Tested:
>>> columnwise_compress(numpy.arange(9).reshape(3, 3) + 1, [4, 2, 9])
array([[1, 5, 3],
[7, 8, 6]])
>>> columnwise_compress(numpy.arange(9).reshape(3, 3) + 1, [1, 5, 3])
array([[4, 2, 6],
[7, 8, 9]])
The difficulty is that you're asking for "compression" of a kind that numpy.compress doesn't do (removing different values for each column or row) and you're asking for compression along columns instead of rows. Compressing along rows is easier because it moves along the natural order of the values in memory; you might consider working with transposed arrays for that reason. If you want to do that, things become a bit simpler:
>>> a = numpy.array([[1, 4, 7],
...                  [2, 5, 8],
...                  [3, 6, 9]])
>>> a[~numpy.isin(a, [4, 2, 9])].reshape(3, 2)
array([[1, 7],
[5, 8],
[3, 6]])
You'll still need to handle shape parameters intelligently if you do it this way, but it will still be simpler. Also, this assumes there are no duplicates in the original array; if there are, this could generate wrong results. Saullo's excellent answer partially avoids the problem, but any value-based approach isn't guaranteed to work unless you're certain that there aren't duplicate values in the columns.
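If you know which row to drop in each column, rather than which value, the duplicate problem does not arise. The sketch below assumes you can supply those row indices; the function name delete_one_per_column and the row_idx argument are hypothetical and not part of the question.

```python
import numpy as np

def delete_one_per_column(a, row_idx):
    """Drop the element at row row_idx[j] from column j of a 2-D array."""
    a = np.asarray(a)
    rows, cols = a.shape
    keep = np.ones((rows, cols), dtype=bool)
    keep[row_idx, np.arange(cols)] = False  # exactly one False per column
    # Work on the transpose so that each column's survivors stay together.
    return a.T[keep.T].reshape(cols, rows - 1).T

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
out = delete_one_per_column(a, [1, 0, 2])  # removes 4, 2 and 9

dup = np.array([[1, 1],
                [1, 2],
                [3, 1]])
out_dup = delete_one_per_column(dup, [1, 2])  # duplicates are not a problem

print(out)
print(out_dup)
```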
In the spirit of @SaulloCastro's answer, but handling multiple occurrences of items, you can remove the first occurrence in each column as follows:
def delete_skew_row(a, b):
    rows, cols = a.shape
    row_to_remove = np.argmax(a == b, axis=0)
    items_to_remove = np.ravel_multi_index((row_to_remove,
                                            np.arange(cols)),
                                           a.shape, order='F')
    ret = np.delete(a.T, items_to_remove)
    return np.ascontiguousarray(ret.reshape(cols, rows-1).T)
rows, cols = 5, 10
a = np.random.randint(100, size=(rows, cols))
b = np.random.randint(rows, size=(cols,))
b = a[b, np.arange(cols)]
>>> a
array([[50, 46, 85, 82, 27, 41, 45, 27, 17, 26],
[92, 35, 14, 34, 48, 27, 63, 58, 14, 18],
[90, 91, 39, 19, 90, 29, 67, 52, 68, 69],
[10, 99, 33, 58, 46, 71, 43, 23, 58, 49],
[92, 81, 64, 77, 61, 99, 40, 49, 49, 87]])
>>> b
array([92, 81, 14, 82, 46, 29, 67, 58, 14, 69])
>>> delete_skew_row(a, b)
array([[50, 46, 85, 34, 27, 41, 45, 27, 17, 26],
[90, 35, 39, 19, 48, 27, 63, 52, 68, 18],
[10, 91, 33, 58, 90, 71, 43, 23, 58, 49],
[92, 99, 64, 77, 61, 99, 40, 49, 49, 87]])
