I have two labelled 2D numpy arrays a and b with identical shapes. I would like to re-label the array b by something similar to a GIS geometric union of the two arrays, such that each unique combination of values in a and b is assigned a new unique ID.
I'm not concerned with the specific numbering of the regions in the output, so long as the values are all unique. I have attached sample arrays and desired outputs below: my real datasets are much larger, with both arrays having integer labels which range from "1" to "200000". So far I've experimented with concatenating the array IDs to form unique combinations of values, but ideally I would like to output a simple set of new IDs in the form of 1, 2, 3..., etc.
import numpy as np
import matplotlib.pyplot as plt
# Example labelled arrays a and b
input_a = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
[0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
[0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
[0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
[0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
[0, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 0],
[0, 0, 3, 3, 3, 3, 2, 2, 2, 2, 0, 0],
[0, 0, 3, 3, 3, 3, 2, 2, 2, 2, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
input_b = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 3, 3, 3, 3, 3, 0, 0],
[0, 0, 1, 1, 1, 3, 3, 3, 3, 3, 0, 0],
[0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
[0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
[0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
[0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
# Plot inputs
# Note: the "spectral" colormap was renamed "nipy_spectral" in newer Matplotlib releases
plt.imshow(input_a, cmap="nipy_spectral", interpolation='nearest')
plt.imshow(input_b, cmap="nipy_spectral", interpolation='nearest')
# Desired output, union of a and b
output = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 0, 0],
[0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 0, 0],
[0, 0, 1, 1, 1, 4, 7, 7, 7, 7, 0, 0],
[0, 0, 5, 5, 5, 6, 7, 7, 7, 7, 0, 0],
[0, 0, 5, 5, 5, 6, 7, 7, 7, 7, 0, 0],
[0, 0, 5, 5, 5, 6, 7, 7, 7, 7, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
# Plot desired output
plt.imshow(output, cmap="nipy_spectral", interpolation='nearest')
If I understand correctly, you want a unique tag for each unique pairing of values from a and b. So, 1 from a and 1 from b would get one unique tag in the output; 1 from a and 3 from b would get another. Also, judging by the desired output in the question, there seems to be an additional condition: wherever b is zero, the output should be zero as well, irrespective of the pairing.
The following implementation tries to solve all of that -
c = a*(b.max()+1) + b
c[b==0] = 0
_,idx = np.unique(c,return_inverse= True)
out = idx.reshape(b.shape)
Sample run -
In [21]: a
Out[21]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
[0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
[0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
[0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
[0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0],
[0, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 0],
[0, 0, 3, 3, 3, 3, 2, 2, 2, 2, 0, 0],
[0, 0, 3, 3, 3, 3, 2, 2, 2, 2, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
In [22]: b
Out[22]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 3, 3, 3, 3, 3, 0, 0],
[0, 0, 1, 1, 1, 3, 3, 3, 3, 3, 0, 0],
[0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
[0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
[0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
[0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
In [23]: out
Out[23]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 3, 5, 5, 5, 5, 0, 0],
[0, 0, 1, 1, 1, 3, 5, 5, 5, 5, 0, 0],
[0, 0, 1, 1, 1, 2, 4, 4, 4, 4, 0, 0],
[0, 0, 6, 6, 6, 7, 4, 4, 4, 4, 0, 0],
[0, 0, 6, 6, 6, 7, 4, 4, 4, 4, 0, 0],
[0, 0, 6, 6, 6, 7, 4, 4, 4, 4, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
Sample plot -
# Plot inputs
plt.figure()
plt.imshow(a, cmap="nipy_spectral", interpolation='nearest')
plt.figure()
plt.imshow(b, cmap="nipy_spectral", interpolation='nearest')
# Plot output
plt.figure()
plt.imshow(out, cmap="nipy_spectral", interpolation='nearest')
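One caveat about the encoding step c = a*(b.max()+1) + b used above: the question mentions labels up to 200000, and 200000 * 200001 does not fit in a 32-bit integer. The sketch below is the same idea with an explicit cast to int64 first; the function name relabel_pairs and the tiny test arrays are only illustrative and are not part of the original answer. Note also that label 0 is only reserved for "b is zero" if at least one such cell exists.

```python
import numpy as np

def relabel_pairs(a, b):
    """Hypothetical helper: one new ID per (a, b) pair, 0 wherever b is 0."""
    # Cast up front so a * (b.max() + 1) cannot overflow a 32-bit dtype
    a64 = np.asarray(a, dtype=np.int64)
    b64 = np.asarray(b, dtype=np.int64)
    c = a64 * (b64.max() + 1) + b64   # unique code per (a, b) pair
    c[b64 == 0] = 0                   # force the output to 0 where b is 0
    _, idx = np.unique(c, return_inverse=True)
    return idx.reshape(b64.shape)     # consecutive IDs 0, 1, 2, ...

a = np.array([[1, 1], [2, 2]], dtype=np.int32)
b = np.array([[0, 1], [1, 2]], dtype=np.int32)
out = relabel_pairs(a, b)
print(out)
```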
Here is a way to do it conceptually in terms of a set union, though not a GIS geometric union, since that was only mentioned after I answered.
Make a list of all possible unique 2-tuples of values, with the first taken from a and the second from b. Map each tuple in that list to its index in the list. Create the union array using that map.
For example, say a and b are arrays each containing values in range(4), and assume for simplicity that they have the same shape. Then:
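import numpy as np
from itertools import permutations
v = range(4)
# Note: permutations omits equal pairs such as (1, 1); if a and b can hold the
# same value in the same cell, use itertools.product(v, repeat=2) instead.
p = list(permutations(v, 2))
m = {}
for i, x in enumerate(p):
    m[x] = i
union = np.empty_like(a)
for i, x in np.ndenumerate(a):
    union[i] = m[(x, b[i])]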
For demonstration, generating a and b with
np.random.randint(4, size=(3, 3))
produced:
a = array([[3, 0, 3],
[1, 3, 2],
[0, 0, 3]])
b = array([[1, 3, 1],
[0, 0, 1],
[2, 3, 0]])
m = {(0, 1): 0,
(0, 2): 1,
(0, 3): 2,
(1, 0): 3,
(1, 2): 4,
(1, 3): 5,
(2, 0): 6,
(2, 1): 7,
(2, 3): 8,
(3, 0): 9,
(3, 1): 10,
(3, 2): 11}
union = array([[10, 2, 10],
[ 3, 9, 7],
[ 1, 2, 9]])
In this case the property that a union should be bigger than or equal to its components is reflected in increased numerical values rather than in an increased number of elements.
An issue with using itertools permutations is that the number of permutations could be much larger than needed. It would be much larger if the number of overlaps per area is much smaller than the number of areas.
The question uses Union but the picture shows an Intersection. Divakar's answer replicates the pictured Intersection, and is more elegant than my solution below, which produces the Union.
One could make a dictionary of only the actual overlaps, and then work from that. Flattening the input arrays first makes this easier for me to see, though I'm not sure whether that is feasible for you:
shp = np.shape(input_a)
a = input_a.flatten()
b = input_b.flatten()
s = set((i, j) for i, j in zip(a, b))  # unique pairings
d = {p: i for i, p in enumerate(sorted(s))}  # dict {pair: index}
output_c = np.array([d[i, j] for i, j in zip(a, b)]).reshape(shp)
array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 1, 1, 1, 1, 5, 5, 5, 5, 5, 0],
[ 0, 1, 1, 1, 1, 1, 5, 5, 5, 5, 5, 0],
[ 0, 1, 2, 2, 2, 4, 7, 7, 7, 7, 5, 0],
[ 0, 1, 2, 2, 2, 4, 7, 7, 7, 7, 5, 0],
[ 0, 1, 2, 2, 2, 3, 6, 6, 6, 6, 5, 0],
[ 0, 8, 9, 9, 9, 10, 6, 6, 6, 6, 5, 0],
[ 0, 0, 9, 9, 9, 10, 6, 6, 6, 6, 0, 0],
[ 0, 0, 9, 9, 9, 10, 6, 6, 6, 6, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
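For arrays as large as those described in the question, the Python-level set and dictionary may become slow. A minimal vectorized sketch of the same union labelling follows, assuming np.unique with axis=0 is available in your NumPy version; the name union_labels and the small test arrays are made up for illustration. Because unique rows come back sorted lexicographically, the IDs should match the sorted-dictionary numbering above.

```python
import numpy as np

def union_labels(a, b):
    """Hypothetical helper: consecutive IDs for every distinct (a, b) pair."""
    # One row per cell, holding that cell's (a, b) pair
    pairs = np.stack([np.ravel(a), np.ravel(b)], axis=1)
    # Unique rows are sorted; inverse gives each cell's row index
    _, inverse = np.unique(pairs, axis=0, return_inverse=True)
    return inverse.reshape(np.shape(a))

a = np.array([[0, 1, 1], [3, 3, 2]])
b = np.array([[0, 0, 1], [1, 2, 2]])
out = union_labels(a, b)
print(out)
```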
Related
I have a few lists as below-
Clusters=['Cluster1', 'Cluster2', 'Cluster3', 'Cluster4', 'Cluster5', 'Cluster6', 'Cluster7']
clusterpoints= [[0, 2, 0, 5, 1, 0, 0, 0, 6, 0],
[0, 0, 5, 0, 0, 5, 1, 0, 1, 0],
[3, 0, 0, 1, 0, 6, 2, 0, 0, 0],
[1, 4, 0, 1, 0, 0, 0, 1, 2, 1],
[0, 2, 0, 5, 1, 0, 0, 0, 6, 0],
[0, 0, 5, 0, 0, 5, 1, 0, 1, 0],
[3, 0, 0, 1, 0, 6, 2, 0, 0, 0]]
xaxispoints=['V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10']
How can I plot a graph as shown in the example below?
The code below produces an image similar to the one you're asking:
import matplotlib.pyplot as plt
import numpy as np
cluster_names = ['Cluster1', 'Cluster2', 'Cluster3', 'Cluster4', 'Cluster5', 'Cluster6', 'Cluster7']
cluster_points = [[0, 2, 0, 5, 1, 0, 0, 0, 6, 0],
[0, 0, 5, 0, 0, 5, 1, 0, 1, 0],
[3, 0, 0, 1, 0, 6, 2, 0, 0, 0],
[1, 4, 0, 1, 0, 0, 0, 1, 2, 1],
[0, 2, 0, 5, 1, 0, 0, 0, 6, 0],
[0, 0, 5, 0, 0, 5, 1, 0, 1, 0],
[3, 0, 0, 1, 0, 6, 2, 0, 0, 0]]
xaxispoints = ["V1", "V2", "V3" ,"V4" ,"V5" , "V6", "V7", "V8" , "V9" ,"V10" ]
width = 0.45
fig, ax = plt.subplots()
cluster_point_sum = np.zeros(len(xaxispoints))
for (cluster_point, cluster_name) in zip(cluster_points, cluster_names):
    ax.bar(xaxispoints, cluster_point, width, label=cluster_name, bottom=cluster_point_sum)
    cluster_point_sum += cluster_point
ax.set_ylabel('Number of points')
ax.legend()
plt.show()
Here is an approach using pandas:
import pandas as pd
import numpy as np
Clusters = ['Cluster1', 'Cluster2', 'Cluster3', 'Cluster4', 'Cluster5', 'Cluster6', 'Cluster7']
clusterpoints = [[0, 2, 0, 5, 1, 0, 0, 0, 6, 0],
[0, 0, 5, 0, 0, 5, 1, 0, 1, 0],
[3, 0, 0, 1, 0, 6, 2, 0, 0, 0],
[1, 4, 0, 1, 0, 0, 0, 1, 2, 1],
[0, 2, 0, 5, 1, 0, 0, 0, 6, 0],
[0, 0, 5, 0, 0, 5, 1, 0, 1, 0],
[3, 0, 0, 1, 0, 6, 2, 0, 0, 0]]
xaxispoints = ['V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10']
df = pd.DataFrame(data=np.transpose(clusterpoints), columns=Clusters, index=xaxispoints)
df.plot.bar(stacked=True, rot=0)
I have the following 2D numpy array M
M = np.array([[1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,1,1],
[1,1,1,0,0,0,0,0,0,1,1],
[0,0,0,0,0,1,1,1,0,0,0],
[0,0,0,0,0,1,1,1,0,0,0],
[1,1,1,0,1,1,1,1,0,0,0],
[1,1,1,0,0,1,1,1,0,0,0],
[1,1,1,0,0,1,1,1,0,0,0]])
in which I want to identify the spots (pixels with value 1 that are connected to each other).
Thanks to the function label from scipy.ndimage, I can identify all of the spots in the matrix. The output looks like this:
from scipy.ndimage import label
Output, Nbr = label(M)
#Output= array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
# [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
# [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0]])
I only want to keep spots with exactly 9 elements, which means the first and fourth spots.
Using a for loop like this works fine:
for i in range(Nbr+1):
    Spot = np.argwhere(Output[:,:] == i)
    if len(Spot) != 9:
        M[Spot[:, 0], Spot[:, 1]] = 0
#M= array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]])
The problem is that when there are more than 4 spots, my code gets slower.
Is there any faster alternative that can do the job of the for loop?
Thanks.
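One possible loop-free alternative, sketched here under the assumption that the labels come from scipy.ndimage.label as above: count the pixels of every label at once with np.bincount, then use the label array itself as an index into a per-label "keep" table. The names sizes, keep and filtered are illustrative only.

```python
import numpy as np
from scipy.ndimage import label

M = np.array([[1,1,1,0,0,0,0,0,0,0,0],
              [1,1,1,0,0,0,0,0,0,1,1],
              [1,1,1,0,0,0,0,0,0,1,1],
              [0,0,0,0,0,1,1,1,0,0,0],
              [0,0,0,0,0,1,1,1,0,0,0],
              [1,1,1,0,1,1,1,1,0,0,0],
              [1,1,1,0,0,1,1,1,0,0,0],
              [1,1,1,0,0,1,1,1,0,0,0]])

labels, n_spots = label(M)
sizes = np.bincount(labels.ravel())  # sizes[k] = number of pixels with label k
keep = sizes == 9                    # one boolean per label
keep[0] = False                      # label 0 is the background, never a spot
filtered = np.where(keep[labels], M, 0)  # look up each pixel's label in keep
print(filtered)
```

This does a fixed number of passes over the array regardless of how many spots there are, instead of one np.argwhere scan per label.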
Folks this is driving me crazy.
I have the following functions defined below
I get the expected result from maskArray() when I pass it anchor='top' or anchor='left', but it returns an all-zeros numpy array for 'bottom' and 'right'. I thought I had got the slicing wrong, so I experimented with statements like mask[-y:,:] = somevalue outside the function, and they work, so I believe the syntax is right. I'm not sure what is going on here.
Here are examples of function calls results
In [5]: x = np.round(np.random.rand(10,10) * 10).astype(np.uint8)
In [6]: x
Out[6]:
array([[ 3, 2, 1, 10, 4, 7, 7, 9, 6, 5],
[ 1, 6, 3, 0, 9, 3, 7, 6, 0, 4],
[ 4, 2, 5, 3, 4, 7, 6, 2, 0, 3],
[ 1, 4, 10, 2, 8, 1, 9, 10, 4, 8],
[ 9, 8, 3, 5, 3, 0, 10, 5, 2, 3],
[ 1, 9, 8, 6, 1, 3, 7, 4, 9, 3],
[ 8, 8, 4, 6, 9, 1, 10, 6, 9, 7],
[ 6, 2, 4, 8, 2, 9, 2, 4, 7, 4],
[ 7, 9, 2, 6, 9, 2, 6, 8, 7, 8],
[ 4, 6, 3, 5, 7, 5, 3, 3, 5, 5]], dtype=uint8)
In [7]: maskArray(x,0.3333,'top')
Out[7]:
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)
In [8]: maskArray(x,0.3333,'left')
Out[8]:
array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)
In [9]: maskArray(x,0.3333,'bottom')
Out[9]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)
Can any of you see something that I'm not seeing ?
My other question is: is there a way to make the slicing statement generic for any dimension of np.array, instead of having an if statement for each expected array.ndim (i.e. [:x,:] and [:x,:,:])?
Cheers
import numpy as np

def getChannels(srx):
    try:
        if srx.ndim == 2:
            return 0
        elif srx.ndim == 3:
            return srx.shape[2]
        else:
            return None
    except TypeError:
        print("srx is not a numpy.array")

def maskArray(dsx, fraction, anchor):
    if anchor == 'top':
        y = np.round(dsx.shape[0] * fraction).astype(np.uint8)
        mask = np.zeros_like(dsx)
        if getChannels(dsx) == 0:
            mask[:y,:] = 1
            return mask
        elif getChannels(dsx) == 3:
            mask[:y,:,:] = 1
            return mask
        else:
            return None
    elif anchor == 'bottom':
        y = np.round(dsx.shape[0] * fraction).astype(np.uint8)
        mask = np.zeros_like(dsx)
        if getChannels(dsx) == 0:
            mask[-y:,:] = 1
            return mask
        elif getChannels(dsx) == 3:
            mask[-y:,:,:] = 1
            return mask
        else:
            return None
    elif anchor == 'left':
        x = np.round(dsx.shape[1] * fraction).astype(np.uint8)
        mask = np.zeros_like(dsx)
        if getChannels(dsx) == 0:
            mask[:,:x] = 1
            return mask
        elif getChannels(dsx) == 3:
            mask[:,:x,:] = 1
            return mask
        else:
            return None
    elif anchor == 'right':
        x = np.round(dsx.shape[1] * fraction).astype(np.uint8)
        mask = np.zeros_like(dsx)
        if getChannels(dsx) == 0:
            mask[:,-x:] = 1
            return mask
        elif getChannels(dsx) == 3:
            mask[:,-x:,:] = 1
            return mask
        else:
            return None
When you negate a uint8 value, the result wraps around, because this type cannot represent negative values. So -y becomes 253, and mask[253:,:] selects no rows of a 10-row array:
>>> -np.round(10 * 0.3333).astype('uint8')
253
Use a signed integer type and it will work as expected:
>>> -np.round(10 * 0.3333).astype('int')
-3
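As for the second question (slicing that works for any ndim), one option is to build the index as a tuple of slice objects. The following is only a sketch: mask_array is a hypothetical rewrite, not the original maskArray, and it assumes axis 0 is vertical and axis 1 is horizontal as in the question. It also converts the count to a plain Python int, which avoids the uint8 wrap-around.

```python
import numpy as np

def mask_array(arr, fraction, anchor):
    """Hypothetical generic version: works for 2D, 3D, or higher-dimensional arrays."""
    if anchor not in ('top', 'bottom', 'left', 'right'):
        raise ValueError("anchor must be 'top', 'bottom', 'left' or 'right'")
    mask = np.zeros_like(arr)
    axis = 0 if anchor in ('top', 'bottom') else 1
    n = int(round(arr.shape[axis] * fraction))  # signed Python int, so -n is negative
    if n == 0:
        return mask                             # slice(-0, None) would select everything
    index = [slice(None)] * arr.ndim            # equivalent to [:, :, ...]
    if anchor in ('top', 'left'):
        index[axis] = slice(None, n)            # [:n]
    else:
        index[axis] = slice(-n, None)           # [-n:]
    mask[tuple(index)] = 1
    return mask

x2d = np.zeros((10, 10), dtype=np.uint8)
x3d = np.zeros((10, 10, 3), dtype=np.uint8)
bottom = mask_array(x2d, 0.3333, 'bottom')
right = mask_array(x3d, 0.3333, 'right')
print(bottom)
```

For the simple 2D/3D case, trailing axes can also just be left out: mask[-n:] and mask[:, -n:] behave the same for either number of dimensions.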
I am having difficulty in vectorizing the below operation:
# x.shape = (a,)
# y.shape = (a, b)
# x and y are ordered over a.
# Want to combine x, y into z.shape(num_unique_x, b)
# Below works and illustrates intent but is iterative
z = np.zeros((num_unique_x, b))
for i in range(a):
    z[x[i], y[i, :]] += 1
Your use of num_unique_x, and the size of z, suggest that this is a case where x and y have repeats, and that some elements of z will be larger than 1. In that case we need to use np.add.at. But to set that up I'd have to review its documentation, and possibly test some alternatives.
But first a no-repeats case
In [522]: x=np.arange(6)
In [523]: y=np.arange(3)+x[:,None]
In [524]: y
Out[524]:
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6],
[5, 6, 7]])
This is why I ask for a concrete example: I'm guessing at possible values. I have to make a z with more than 3 columns.
In [529]: z=np.zeros((6,8),dtype=int)
In [530]: for i in range(6):
     ...:     z[x[i],y[i,:]]+=1
In [531]: z
Out[531]:
array([[1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 1, 1, 1]])
The vectorized equivalent
In [532]: z[x[:,None],y]
Out[532]:
array([[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
In [533]: z[x[:,None],y] += 1
In [534]: z
Out[534]:
array([[2, 2, 2, 0, 0, 0, 0, 0],
[0, 2, 2, 2, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 0, 0, 0],
[0, 0, 0, 2, 2, 2, 0, 0],
[0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 2, 2, 2]])
The corresponding add.at expression is
In [538]: np.add.at(z,(x[:,None],y),1)
In [539]: z
Out[539]:
array([[3, 3, 3, 0, 0, 0, 0, 0],
[0, 3, 3, 3, 0, 0, 0, 0],
[0, 0, 3, 3, 3, 0, 0, 0],
[0, 0, 0, 3, 3, 3, 0, 0],
[0, 0, 0, 0, 3, 3, 3, 0],
[0, 0, 0, 0, 0, 3, 3, 3]])
So that works for this no-repeats case.
For repeats in x:
In [542]: x1=np.array([0,1,1,2,3,5])
In [543]: z1=np.zeros((6,8),dtype=int)
In [544]: np.add.at(z1,(x1[:,None],y),1)
In [545]: z1
Out[545]:
array([[1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 2, 2, 1, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1]])
Without add.at we miss the 2s.
In [546]: z2=np.zeros((6,8),dtype=int)
In [547]: z2[x1[:,None],y] += 1
In [548]: z2
Out[548]:
array([[1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1]])
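To check the repeats case end to end, here is a small self-contained sketch that compares the original loop with np.add.at, using the same x1 and y values as above. The np.bincount variant on flattened indices is an extra alternative I have not benchmarked here; the names z_loop, z_at and z_bincount are illustrative.

```python
import numpy as np

x1 = np.array([0, 1, 1, 2, 3, 5])          # row indices, with a repeated 1
y = np.arange(3) + np.arange(6)[:, None]   # column indices, shape (6, 3)
shape = (6, 8)

# Reference: the iterative version from the question
z_loop = np.zeros(shape, dtype=int)
for i in range(len(x1)):
    z_loop[x1[i], y[i, :]] += 1

# Unbuffered in-place add: repeated (row, col) pairs are all counted
z_at = np.zeros(shape, dtype=int)
np.add.at(z_at, (x1[:, None], y), 1)

# Alternative: count flattened indices, then reshape back
flat = x1[:, None] * shape[1] + y
z_bincount = np.bincount(flat.ravel(), minlength=shape[0] * shape[1]).reshape(shape)

print(np.array_equal(z_loop, z_at), np.array_equal(z_loop, z_bincount))
```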
I've got an array of objects labeled with scipy.ndimage.measurements.label, called Labels. I've got another array, Data, containing values related to Labels. How can I make a third array, Neighbourhoods, that maps each position x,y to its nearest label L?
Given Labels and Data, how can I use python/numpy/scipy to get Neighbourhoods?
Labels = array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]] )
Data = array([[1, 1, 1, 1, 1, 1, 2, 3, 4, 5],
[1, 0, 0, 0, 0, 1, 2, 3, 4, 5],
[1, 0, 0, 0, 0, 1, 2, 3, 4, 4],
[1, 0, 0, 0, 0, 1, 2, 3, 3, 3],
[1, 0, 0, 0, 0, 1, 2, 2, 2, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 1, 0, 0, 0, 1],
[3, 3, 3, 3, 2, 1, 0, 0, 0, 1],
[4, 4, 4, 3, 2, 1, 0, 0, 0, 1],
[5, 5, 4, 3, 2, 1, 1, 1, 1, 1]] )
Neighbourhoods = array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 0, 0, 0, 0, 1, 1, 1, 1, 1],
[1, 0, 0, 0, 0, 1, 1, 1, 0, 2],
[1, 0, 0, 0, 0, 1, 1, 0, 2, 2],
[1, 0, 0, 0, 0, 1, 0, 2, 2, 2],
[1, 1, 1, 1, 1, 0, 2, 2, 2, 2],
[1, 1, 1, 1, 0, 2, 0, 0, 0, 2],
[1, 1, 1, 0, 2, 2, 0, 0, 0, 2],
[1, 1, 0, 2, 2, 2, 0, 0, 0, 2],
[1, 1, 2, 2, 2, 2, 2, 2, 2, 2]] )
Note: I'm not sure what should happen with ties, so used zeros in the above Neighbourhoods
As suggested by David Zaslavsky, this is a job for a Voronoi diagram. Here is a numpy implementation: http://blancosilva.wordpress.com/2010/12/15/image-processing-with-numpy-scipy-and-matplotlibs-in-sage/
The relevant function is scipy.ndimage.distance_transform_edt. It has a return_indices option that can be exploited to do what you need (as well as calculate the raw distances (data in your example)).
As an example:
import numpy as np
from scipy.ndimage import distance_transform_edt
labels = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 2, 2, 2, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]] )
i, j = distance_transform_edt(labels == 0, return_distances=False,
return_indices=True)
neighborhoods = labels[i,j]
print(neighborhoods)
This yields:
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 2, 2],
[1, 1, 1, 1, 1, 1, 1, 2, 2, 2],
[1, 1, 1, 1, 1, 1, 2, 2, 2, 2],
[1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
[1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
[1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
[1, 1, 2, 2, 2, 2, 2, 2, 2, 2]])
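Note that, as the output above shows, equidistant cells are assigned to one of the candidate labels rather than being marked with 0. If the raw distances are wanted as well, both can be requested in a single call. A minimal sketch on a smaller, made-up label array (the 5x5 labels below are not from the question):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

labels = np.array([[0, 0, 0, 0, 0],
                   [0, 1, 0, 0, 0],
                   [0, 0, 0, 0, 0],
                   [0, 0, 0, 2, 0],
                   [0, 0, 0, 0, 0]])

# For every background cell: distance to, and indices of, the nearest labelled cell
dist, (i, j) = distance_transform_edt(labels == 0,
                                      return_distances=True,
                                      return_indices=True)
neighborhoods = labels[i, j]   # label found at each nearest labelled cell
print(neighborhoods)
print(dist.round(2))
```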