I built a Delaunay triangulation in Python.
It has 8 points (black) and generates 14 edges (gray).
How can I compute the lengths of the edges connected to each point?
The result I want lists, for each point, the lengths of the edges connected to it, such as
[[P1, E1_length, E2_length, ...], [P2, E6_length, E7_length, ...], ...]
import numpy as np
points = np.array([[0, 0], [0, 1.1], [1, 0], [1, 1],[1.5, 0.6],[1.2, 0.5],[1.7, 0.9],[1.1, 0.1],])
from scipy.spatial import Delaunay
tri = Delaunay(points)
import matplotlib.pyplot as plt
plt.triplot(points[:, 0], points[:, 1], tri.simplices.copy(), color='0.7')
plt.plot(points[:, 0], points[:, 1], 'o', color='0.3')
New answer
Here's an approach which will give you a dictionary of points and edge lengths associated with each point:
simplices = points[tri.simplices]
edge_lengths = {}
for point in points:
    key = tuple(point)
    vertex_edges = edge_lengths.get(key, [])
    # Select the simplices that contain this point
    adjacency_mask = np.isin(simplices, point).all(axis=2).any(axis=1)
    for simplex in simplices[adjacency_mask]:
        # Separate the point itself from the other vertices of the simplex
        self_mask = np.isin(simplex, point).all(axis=1)
        for other in simplex[~self_mask]:
            dist = np.linalg.norm(point - other)
            if dist not in vertex_edges:
                vertex_edges.append(dist)
    edge_lengths[key] = vertex_edges
{(0.0, 0.0): [1.4142135623730951, 1.1, 1.3, 1.0],
(0.0, 1.1): [1.004987562112089, 1.3416407864998738, 1.4866068747318506],
(1.0, 0.0): [1.4866068747318506, 0.5385164807134504, 0.7810249675906654, 1.140175425099138, 0.14142135623730956],
(1.0, 1.0): [1.004987562112089, 1.4142135623730951, 0.5385164807134504, 0.6403124237432849, 0.7071067811865475],
(1.5, 0.6): [0.6403124237432849, 0.36055512754639896, 0.31622776601683794, 0.6403124237432848],
(1.2, 0.5): [0.5385164807134504, 1.3, 0.31622776601683794, 0.41231056256176607],
(1.7, 0.9): [0.7071067811865475, 0.36055512754639896],
(1.1, 0.1): [0.14142135623730956, 0.41231056256176607, 0.6403124237432848]}
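One caveat with matching by coordinates: np.isin compares individual coordinate values, so a vertex such as (0, 0) also matches the point (1, 0), because both of its coordinates occur in that point. A possible variation, not part of the answer above, is to work with the integer indices in tri.simplices instead. The sketch below assumes 2-D triangles; the helper name edge_lengths_per_point and the small square example are made up for illustration, and in practice you would pass tri.simplices as the second argument.

```python
import numpy as np

def edge_lengths_per_point(points, simplices):
    """Map each point index to the lengths of the edges that touch it."""
    # Every triangle (i, j, k) contributes the edges (i, j), (j, k), (k, i).
    pairs = np.concatenate([simplices[:, [0, 1]],
                            simplices[:, [1, 2]],
                            simplices[:, [2, 0]]])
    # Sort each pair so (i, j) and (j, i) collapse, then drop duplicates.
    pairs = np.unique(np.sort(pairs, axis=1), axis=0)
    lengths = np.linalg.norm(points[pairs[:, 0]] - points[pairs[:, 1]], axis=1)

    result = {i: [] for i in range(len(points))}
    for (i, j), length in zip(pairs, lengths):
        result[int(i)].append(float(length))
        result[int(j)].append(float(length))
    return result

# Hypothetical stand-in for tri.simplices: a unit square split into two triangles.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
square_simplices = np.array([[0, 1, 2], [0, 2, 3]])
per_point = edge_lengths_per_point(square, square_simplices)
```

Because edges are keyed by index pairs, two different edges that happen to have the same length are both kept, and each shared edge is counted only once per endpoint.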
Old answer before requirements changed
The Delaunay object has a simplices attribute that holds the indices of the points making up each simplex. Using scipy.spatial.distance.pdist() and advanced indexing, you can get all the edge lengths like so:
>>> from scipy.spatial.distance import pdist
>>> edge_lengths = np.array([pdist(x) for x in points[tri.simplices]])
>>> edge_lengths
array([[1.00498756, 1.41421356, 1.1 ],
[0.53851648, 1.3 , 1.41421356],
[0.53851648, 1. , 1.3 ],
[0.64031242, 0.70710678, 0.36055513],
[0.64031242, 0.31622777, 0.53851648],
[0.14142136, 0.53851648, 0.41231056],
[0.64031242, 0.41231056, 0.31622777]])
Note however, that edge lengths are duplicated here, since every simplex shares at least one edge with another simplex.
The tri.simplices attribute gives the indices in points for each vertex in each simplex in the Delaunay object:
>>> tri.simplices
array([[3, 1, 0],
       [5, 3, 0],
       [2, 5, 0],
       [3, 4, 6],
       [4, 3, 5],
       [2, 7, 5],
       [7, 4, 5]], dtype=int32)
Using advanced indexing, we can get all the points which make up the simplices:
>>> points[tri.simplices]
array([[[1. , 1. ],
[0. , 1.1],
[0. , 0. ]],
[[1.2, 0.5],
[1. , 1. ],
[0. , 0. ]],
[[1. , 0. ],
[1.2, 0.5],
[0. , 0. ]],
[[1. , 1. ],
[1.5, 0.6],
[1.7, 0.9]],
[[1.5, 0.6],
[1. , 1. ],
[1.2, 0.5]],
[[1. , 0. ],
[1.1, 0.1],
[1.2, 0.5]],
[[1.1, 0.1],
[1.5, 0.6],
[1.2, 0.5]]])
Finally, each subarray here represents a simplex and the three points which form it, and by using scipy.spatial.distance.pdist(), we can get the pairwise distances of each point in each simplex by iterating over the simplices:
>>> np.array([pdist(x) for x in points[tri.simplices]])
array([[1.00498756, 1.41421356, 1.1 ],
[0.53851648, 1.3 , 1.41421356],
[0.53851648, 1. , 1.3 ],
[0.64031242, 0.70710678, 0.36055513],
[0.64031242, 0.31622777, 0.53851648],
[0.14142136, 0.53851648, 0.41231056],
[0.64031242, 0.41231056, 0.31622777]])
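If the list comprehension over simplices becomes a bottleneck, the same per-simplex distances can be computed in one vectorized step. This is only a sketch: pts and simplices below are small hypothetical stand-ins for points and tri.simplices, and the pair ordering is chosen to mirror what pdist returns for three points.

```python
import numpy as np

# Hypothetical stand-ins for `points` and `tri.simplices`.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
simplices = np.array([[0, 1, 2], [0, 2, 3]])

corners = pts[simplices]                # shape (n_simplices, 3, 2)
# pdist orders the pairs as (0, 1), (0, 2), (1, 2); mirror that here.
first = corners[:, [0, 0, 1]]
second = corners[:, [1, 2, 2]]
edge_lengths = np.linalg.norm(first - second, axis=2)   # shape (n_simplices, 3)
```

As with the pdist version, edges shared by two simplices still appear twice.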
I have a feature matrix that I want to normalize row by row.
This is what I tried, based on min-max scaling, and I am getting an error. Can anyone help me with it?
a = np.random.randint(10, size=(4, 5))
s = a.max(axis=1) - a.min(axis=1)
print(s)
(a - a.min(axis=1)) / (a.max(axis=1) - a.min(axis=1))

[7 6 4 5]
      4 print(s)
----> 6 (a - a.min(axis=1))/(a.max(axis=1) - a.min(axis=1))
ValueError: operands could not be broadcast together with shapes (4,5) (4,)
Try working with the transposed matrix, so that the per-row minima and maxima line up with the last axis and broadcast correctly:
b = a.T
m = (b - b.min(axis=0)) / (b.max(axis=0) - b.min(axis=0))
m = m.T
>>> a
array([[2, 3, 2, 8, 3], # min=2 -> 0, max=8 -> 1
[3, 3, 9, 2, 1], # min=1 -> 0, max=9 -> 1
[1, 9, 8, 4, 7], # min=1 -> 0, max=9 -> 1
[6, 8, 7, 9, 4]]) # min=4 -> 0, max=9 -> 1
>>> m
array([[0. , 0.16666667, 0. , 1. , 0.16666667],
[0.25 , 0.25 , 1. , 0.125 , 0. ],
[0. , 1. , 0.875 , 0.375 , 0.75 ],
[0.4 , 0.8 , 0.6 , 1. , 0. ]])
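An alternative to transposing, sketched here on the same example array, is to pass keepdims=True so the row minima and maxima keep shape (4, 1) and broadcast against the (4, 5) matrix directly. Note that a row whose values are all equal would give a zero denominator; this sketch does not handle that case.

```python
import numpy as np

a = np.array([[2, 3, 2, 8, 3],
              [3, 3, 9, 2, 1],
              [1, 9, 8, 4, 7],
              [6, 8, 7, 9, 4]])

# keepdims=True keeps the reduced axis, giving shape (4, 1) instead of (4,).
row_min = a.min(axis=1, keepdims=True)
row_max = a.max(axis=1, keepdims=True)
m = (a - row_min) / (row_max - row_min)
```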
I have an alternative solution, but I am not sure whether it is correct. It would be great if someone could comment on it.
def row_normalize(mf):
    row_sums = np.array(mf.sum(1))
    new_matrix = mf / row_sums[:, np.newaxis]
    return new_matrix
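For what it's worth, this function runs without the broadcasting error, but it performs a different normalization: it divides each row by its sum, so every row adds up to 1, whereas min-max scaling maps each row's minimum to 0 and maximum to 1. A small check, using a made-up two-row array:

```python
import numpy as np

def row_normalize(mf):
    # Divide each row by its own sum; keepdims keeps shape (n_rows, 1).
    row_sums = mf.sum(axis=1, keepdims=True)
    return mf / row_sums

a = np.array([[2, 3, 2, 8, 3],
              [3, 3, 9, 2, 1]])
n = row_normalize(a)
row_totals = n.sum(axis=1)   # each entry is 1 up to rounding
```

Which of the two is "correct" depends on whether you want rows that sum to 1 or rows rescaled to the [0, 1] range.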
As an example, I have an array of branches and probabilities that looks like this:
paths = np.array([
[1, 0, 1.0],
[2, 0, 0.4],
[2, 1, 0.6],
[3, 1, 1.0],
[5, 1, 0.25],
[5, 2, 0.5],
[5, 4, 0.25],
[6, 0, 0.7],
[6, 5, 0.2],
[6, 2, 0.1]])
The columns are upper node, lower node, probability.
Here's a visual of the nodes:
/ | \
5 0 2
/ | \ / \
1 2 4 0 1
| /\ |
0 0 1 0
I want to be able to pick a starting node and output an array of the branches and cumulative probabilities, including all the duplicate branches. For example:
start_node = 5 should return
array([[5, 1, 0.25],
       [5, 2, 0.5],
       [5, 4, 0.25],
       [1, 0, 0.25],
       [2, 0, 0.2],
       [2, 1, 0.3],
       [1, 0, 0.3]])
Notice the [1, 0, x] branch is included twice, as it's fed by both the [5, 1, 0.25] branch and the [2, 1, 0.3] branch.
Here's some code I got working but it's far too slow for my application (millions of branches):
def branch(start_node, paths):
    output = paths[paths[:,0]==start_node]
    next_nodes = output
    while True:
        can_go_lower = np.isin(next_nodes[:,1], paths[:,0])
        if ~np.any(can_go_lower): break
        next_nodes_checked = next_nodes[can_go_lower]
        next_nodes = np.empty([0,3])
        for nodes in next_nodes_checked:
            to_append = paths[paths[:,0]==nodes[1]]
            to_append[:,2] *= nodes[2]
            next_nodes = np.append(next_nodes, to_append, axis=0)
        output = np.append(output, next_nodes, axis=0)
    return output
The branches always go from a higher node to a lower one, so getting caught in cycles isn't a concern. I think the best optimization would be a way to vectorize the for loop and avoid the appends.
Instead of storing the graph in a NumPy array, let's store it in a dict:
tree = {k: paths[paths[:, 0] == k] for k in np.unique(paths[:, 0])}
Make a set of the non-leaf nodes:
non_leaf_nodes = set(np.unique(paths[:, 0]))
Now, to find the branches and cumulative probabilities:
def branch(start_node, tree, non_leaf_nodes):
    curr_nodes = [[start_node, start_node, 1.0]]  # (prev_node, current_node, cumulative_probability)
    output = []
    while True:
        next_nodes = []
        for _, node, prob in curr_nodes:
            if node not in non_leaf_nodes: continue
            subtree = tree[node]
            to_append = subtree.copy()
            to_append[:, 2] *= prob
            to_append = to_append.tolist()
            output += to_append
            next_nodes += to_append
        curr_nodes = next_nodes
        if len(curr_nodes) == 0:
            return np.array(output)
>>> branch(5, tree, non_leaf_nodes)
array([[5.  , 1.  , 0.25],
       [5.  , 2.  , 0.5 ],
       [5.  , 4.  , 0.25],
       [1.  , 0.  , 0.25],
       [2.  , 0.  , 0.2 ],
       [2.  , 1.  , 0.3 ],
       [1.  , 0.  , 0.3 ]])
I expect this to be faster. Let me know how it performs.
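If the per-node Python loop is still too slow, another option is to expand a whole level of the tree at once. The following is only a sketch under the question's assumptions (branches always go from higher to lower nodes, so there are no cycles); the function name branch_levelwise is made up here, and I have not benchmarked it on millions of branches. It sorts paths by upper node once, then uses np.searchsorted to find each frontier node's run of branches.

```python
import numpy as np

paths = np.array([
    [1, 0, 1.0], [2, 0, 0.4], [2, 1, 0.6], [3, 1, 1.0],
    [5, 1, 0.25], [5, 2, 0.5], [5, 4, 0.25],
    [6, 0, 0.7], [6, 5, 0.2], [6, 2, 0.1]])

def branch_levelwise(start_node, paths):
    # Sort once by upper node so each node's branches form one contiguous run.
    order = np.argsort(paths[:, 0], kind="stable")
    sorted_paths = paths[order]
    uppers = sorted_paths[:, 0]

    frontier_nodes = np.array([start_node], dtype=float)
    frontier_probs = np.array([1.0])
    levels = []
    while frontier_nodes.size:
        start = np.searchsorted(uppers, frontier_nodes, side="left")
        stop = np.searchsorted(uppers, frontier_nodes, side="right")
        counts = stop - start            # number of branches below each frontier node
        total = counts.sum()
        if total == 0:
            break
        # Expand every [start, stop) run into explicit row indices.
        offsets = np.arange(total) - np.repeat(np.cumsum(counts) - counts, counts)
        rows = np.repeat(start, counts) + offsets
        level = sorted_paths[rows]       # fancy indexing returns a copy
        level[:, 2] *= np.repeat(frontier_probs, counts)
        levels.append(level)
        frontier_nodes, frontier_probs = level[:, 1], level[:, 2]
    return np.concatenate(levels) if levels else np.empty((0, 3))

result = branch_levelwise(5, paths)
```

The loop now runs once per tree level rather than once per node, and the only concatenation happens at the end.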
Consider the following code:
import numpy as np
index_info = np.matrix([[1, 1], [1, 2]])
value = np.matrix([[0.5, 0.5]])
initial = np.zeros((3, 3))
How can I produce a matrix, final, which has the shape of initial, with the elements of value placed at the locations given by index_info, WITHOUT a for loop? For this toy example, the expected result is shown below.
final = np.matrix([[0, 0, 0], [0, 0.5, 0.5], [0, 0, 0]])
With a for loop, you can easily iterate over the indices in index_info and the entries of value, and use them to populate initial and form final. But is there a way to do so with vectorization (no for loop)?
Convert index_info to a tuple and use it to assign:
>>> initial[(*index_info,)]=value
>>> initial
array([[0. , 0. , 0. ],
[0. , 0.5, 0.5],
[0. , 0. , 0. ]])
Please note that use of the matrix class is discouraged. Use ndarray instead.
You can do this with NumPy's array indexing:
>>> initial = np.zeros((3, 3))
>>> row = np.array([1, 1])
>>> col = np.array([1, 2])
>>> final = np.zeros_like(initial)
>>> final[row, col] = [0.5, 0.5]
>>> final
array([[0. , 0. , 0. ],
[0. , 0.5, 0.5],
[0. , 0. , 0. ]])
This is similar to @PaulPanzer's answer, which unpacks row and col from index_info in one step. In other words:
row, col = (*index_info,)
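Putting both answers together with plain ndarrays instead of np.matrix, a minimal sketch could look like this. It assumes index_info holds the row indices in its first row and the column indices in its second, as in the question.

```python
import numpy as np

index_info = np.array([[1, 1],    # row indices
                       [1, 2]])   # column indices
value = np.array([0.5, 0.5])

final = np.zeros((3, 3))
# tuple(index_info) is (rows, cols), so this assigns final[1, 1] and final[1, 2].
final[tuple(index_info)] = value
```

If the same location can appear more than once, plain assignment keeps only one of the values, so accumulating duplicates would need a different approach.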
I would like to calculate the sum of each pair of adjacent columns in a matrix (columns 0 and 1, columns 2 and 3, and so on).
I tried nested "for" loops, but I never get the right result.
For example:
c = np.array([[0,0,0.25,0.5],[0,0.5,0.25,0],[0.5,0,0,0]],float)
freq=np.zeros(6,float).reshape((3, 2))
# I calculate the sum of the first and second columns, and of the third and fourth columns
for i in range(0,4,2):
for j in range(1,4,2):
for p in range(0,2):
But the result is:
print freq
array([[ 0.75, 0.75],
[ 0.25, 0.25],
[ 0. , 0. ]])
Normally the correct result should be (0., 0.5, 0.5) and (0.75, 0.25, 0), so I think the problem is in the nested "for" loops.
Does anyone know how I can calculate the sum of every two columns? My real matrix has 400 columns.
You can reshape to split the last axis into two axes, the last one of length 2, and then sum along it, like so -
freq = c.reshape(c.shape[0],-1,2).sum(2).T
Reshaping a contiguous array only creates a view, so the only real work is the sum itself, which makes this efficient.
Sample run -
In [17]: c
array([[ 0. , 0. , 0.25, 0.5 ],
[ 0. , 0.5 , 0.25, 0. ],
[ 0.5 , 0. , 0. , 0. ]])
In [18]: c.reshape(c.shape[0],-1,2).sum(2).T
array([[ 0. , 0.5 , 0.5 ],
[ 0.75, 0.25, 0. ]])
Add the slices c[:, ::2] and c[:, 1::2]:
In [62]: c
array([[ 0. , 0. , 0.25, 0.5 ],
[ 0. , 0.5 , 0.25, 0. ],
[ 0.5 , 0. , 0. , 0. ]])
In [63]: c[:, ::2] + c[:, 1::2]
array([[ 0. , 0.75],
[ 0.5 , 0.25],
[ 0.5 , 0. ]])
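As a quick sanity check, the reshape approach (before its final transpose) and the slice approach should agree, since both add column 2k to column 2k + 1. A small sketch on the array from the question, assuming an even number of columns:

```python
import numpy as np

c = np.array([[0, 0, 0.25, 0.5],
              [0, 0.5, 0.25, 0],
              [0.5, 0, 0, 0]], float)

# One row per original row, one column per pair of adjacent columns.
by_reshape = c.reshape(c.shape[0], -1, 2).sum(axis=2)
by_slices = c[:, ::2] + c[:, 1::2]
```

Transposing either result gives the (2, 3) layout asked for in the question.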
Here is one way using np.split():
In [36]: np.array(np.split(c, np.arange(2, c.shape[1], 2), axis=1)).sum(axis=-1)
array([[ 0. , 0.5 , 0.5 ],
[ 0.75, 0.25, 0. ]])
Or, more generally, a version that also works when the number of columns is odd:
In [87]: def vertical_adder(array):
    ...:     return np.column_stack([np.sum(arr, axis=1) for arr in np.array_split(array, np.arange(2, array.shape[1], 2), axis=1)])
In [88]: vertical_adder(c)
array([[ 0. , 0.75],
[ 0.5 , 0.25],
[ 0.5 , 0. ]])
In [94]: a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
In [95]: vertical_adder(a)
array([[ 1, 5, 4],
[11, 15, 9],
[21, 25, 14]])
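Another possibility for the general case, not taken from the answers above, is np.add.reduceat, which sums the slices starting at each given column index without building a Python list. A sketch on the same 3x5 example:

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# Start a new group at columns 0, 2, 4; the last group runs to the end of the row.
starts = np.arange(0, a.shape[1], 2)
pair_sums = np.add.reduceat(a, starts, axis=1)
```

With an odd number of columns, the last column forms a group of its own, as in vertical_adder.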