generate random directed fully-accessible adjacency probability matrix - python

Given V nodes and E connections as parameters, how do I generate a random directed fully-connected adjacency probability matrix, where all the connection weights fanning out of a node sum to 1?
The idea is that after I pick a random starting node, I do a random walk according to the probabilities, thus generating similar randomly-structured sequences.
Although I prefer an adjacency matrix, a graph is OK too.
Of course the fan-out connections can be one or many.
Loops are OK, just not self-loops.
I can do the walk using np.random.choice(nodes, p=prob).
Now that Jerome mentions it, it seems I was mistaken .. I don't want fully-connected, BUT a closed loop where there are no islands of sub-graphs, i.e. all nodes are accessible via others.
Sorry, I don't know what this type of graph is called?
here is my complex solution ;(
def gen_adjmx(self):
    # assumes: from numpy.random import randint
    passx = 1
    c = 0  # connections so far
    # until enough conns are generated
    while c < self.nconns:
        # loop the rows
        for sym in range(self.nsyms):
            if c >= self.nconns: break
            if passx == 1:  # guarantees at least one connection per row
                self.adj[sym, randint(self.nsyms)] = randint(100)
            else:
                if randint(2) == 1:  # maybe a conn ?
                    col = randint(self.nsyms)
                    # already exists
                    if self.adj[sym, col] > 0: continue
                    self.adj[sym, col] = randint(100)
                    c += 1
        passx += 1
    # normalize the rows, so each node's fan-out weights sum to 1
    self.adj /= self.adj.sum(axis=1, keepdims=True)

You can simply create a random matrix and normalize the rows so that each sums to 1:

import numpy as np

v = np.random.rand(n, n)
v /= v.sum(axis=1, keepdims=True)  # keepdims makes the division broadcast per row
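For reference, a minimal sketch of the walk the asker describes, assuming the matrix has already been normalized row-wise as above (the random_walk name and signature are mine, not from the thread):

import numpy as np

def random_walk(P, steps, start=0, rng=None):
    # P: row-stochastic matrix; P[i] holds the fan-out probabilities of node i
    rng = np.random.default_rng() if rng is None else rng
    node = start
    seq = [node]
    for _ in range(steps):
        node = rng.choice(len(P), p=P[node])  # pick the next node by its weight
        seq.append(node)
    return seq

seq = random_walk(v, 10)  # e.g. a 10-step walk over the matrix v from above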

You mentioned that you want a graph which doesn't have any islands. I guess what you mean is that the adjacency matrix should be irreducible, i.e. the associated graph doesn't have any disconnected components.
One way to generate a random graph with the required property is to generate a random graph and then see if it has the property; throw it out and try again if it doesn't, otherwise keep it.
Here's a sketch of a solution with that in mind.
(1) generate a matrix n_vertices by n_vertices, which contains n_edges elements which are 1, and the rest are 0. This is a random adjacency matrix.
(2) test the adjacency matrix to see if it's irreducible. If so, keep it, otherwise go back to step 1.
I'm sure you can implement that in Python. I tried a proof of concept in Maxima (https://maxima.sourceforge.io), since it's convenient in some ways. There are probably ways to go about it which directly construct an irreducible matrix.
I implemented the irreducibility test for a matrix A as checking whether sum(A^^k, k, 0, n) has any 0 elements, according to: https://math.stackexchange.com/a/1703650 That test becomes more and more expensive as the number of vertices grows, and as the ratio of edges to vertices decreases, the probability increases that you'll have to repeat steps 1 and 2. Whether that's tolerable for you depends on the typical number of vertices and edges you're working with.
random_irreducible (n_vertices, n_edges) :=
  block ([A, n: 1],
         while not irreducible (A: random_adjacency (n_vertices, n_edges))
           do n: n + 1,
         [A, n]);

random_adjacency (n_vertices, n_edges) :=
  block ([list_01, list_01_permuted, get_element],
         list_01: append (makelist (1, n_edges), makelist (0, n_vertices^2 - n_edges)),
         list_01_permuted: random_permutation (list_01),
         get_element: lambda ([i, j], list_01_permuted[1 + (i - 1) + (j - 1)*n_vertices]),
         genmatrix (get_element, n_vertices, n_vertices));

irreducible (A) :=
  is (member (0, flatten (args (sum (A^^k, k, 0, length(A))))) = false);
A couple of things: one is that I left out the part about normalizing the edge weights so they sum to 1; I guess you'll have to put that in to get a transition matrix and not just an adjacency matrix. The other is that I didn't prevent elements on the diagonal, i.e., you can stay on a vertex instead of always going to another one. If that's important, you'll have to deal with that too.
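For what it's worth, a rough Python translation of that sketch (my code, not the answerer's; it mirrors the Maxima names and the same rejection-sampling idea, assuming numpy):

import numpy as np

def random_adjacency(n_vertices, n_edges, rng):
    # n_edges ones scattered uniformly over the n_vertices^2 cells
    flat = np.zeros(n_vertices * n_vertices, dtype=int)
    ones = rng.choice(n_vertices * n_vertices, size=n_edges, replace=False)
    flat[ones] = 1
    return flat.reshape(n_vertices, n_vertices)

def irreducible(A):
    # A is irreducible iff sum(A^k, k=0..n) has no zero entries
    n = len(A)
    total = np.eye(n, dtype=int)
    power = np.eye(n, dtype=int)
    for _ in range(n):
        power = power @ A
        total += power
    return bool((total > 0).all())

def random_irreducible(n_vertices, n_edges, seed=None):
    rng = np.random.default_rng(seed)
    while True:
        A = random_adjacency(n_vertices, n_edges, rng)
        if irreducible(A):
            return A

Like the Maxima version, this allows diagonal entries and leaves the row normalization to you.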

Related

Find a cycle of 3 (triangle) from adjacency matrix

I have code which counts the number of triangles in an undirected graph using the matrix multiplication method. Now I would like it to also print these triangles, preferably to print their vertices. It could be done with third-party libraries, e.g. numpy or networkx, but it has to be done with matrix multiplication, as I already know how I could do it with the naive version.
To make it simpler I will use the easiest adjacency matrix:
[[0, 1, 0, 0],
[1, 0, 1, 1],
[0, 1, 0, 1],
[0, 1, 1, 0]]
it has edges:
x,y
0,1
1,2
1,3
2,3
So the triangle exists between vertices 1, 2 and 3, and this is what I would like the program to ALSO print to the console.
Now the code, which just prints how many triangles are in this graph:
# num of vertexes
V = 4

# graph from adjacency matrix
graph = [[0, 1, 0, 0],
         [1, 0, 1, 1],
         [0, 1, 0, 1],
         [0, 1, 1, 0]]

# get the vertexes in a dict
vertexes = {}
for i in range(len(graph)):
    vertexes[i] = i
print(vertexes)
## >> {0: 0, 1: 1, 2: 2, 3: 3}

# matrix multiplication
def multiply(A, B, C):
    global V
    for i in range(V):
        for j in range(V):
            C[i][j] = 0
            for k in range(V):
                C[i][j] += A[i][k] * B[k][j]

# Utility function to calculate
# trace of a matrix (sum of
# diagonal elements)
def getTrace(graph):
    global V
    trace = 0
    for i in range(V):
        trace += graph[i][i]
    return trace

# Utility function for calculating
# number of triangles in graph
def triangleInGraph(graph):
    global V
    # To Store graph^2
    aux2 = [[None] * V for _ in range(V)]
    # To Store graph^3
    aux3 = [[None] * V for i in range(V)]
    # Initialising aux
    # matrices with 0
    for i in range(V):
        for j in range(V):
            aux2[i][j] = aux3[i][j] = 0
    # aux2 is graph^2 now  printMatrix(aux2)
    multiply(graph, graph, aux2)
    # after this multiplication aux3 is
    # graph^3  printMatrix(aux3)
    multiply(graph, aux2, aux3)
    trace = getTrace(aux3)
    return trace // 6

print("Total number of Triangle in Graph :",
      triangleInGraph(graph))
## >> Total number of Triangle in Graph : 1
The thing is, the information of the triangle (more generally speaking, information about the paths between a vertex i and a vertex j) is lost during that matrix multiplication process. All that is stored is that a path exists.
For the adjacency matrix itself, whose numbers are the number of length-1 paths between i and j, the answer is obvious, because if a path exists, then it has to be the edge (i,j). But even in M², when you see the number 2 at row i, column j of M², all you know is that there are 2 length-2 paths connecting i to j. So, there exist 2 different indices k₁ and k₂ such that (i,k₁) and (k₁,j) are edges, and so are (i,k₂) and (k₂,j).
That is exactly why matrix multiplication works (and that is a virtue of coding it as explicitly as you did: I don't need to remind you that element M²ᵢⱼ = Σₖ Mᵢₖ×Mₖⱼ).
So it is exactly that: 1 for each intermediate vertex k such that (i,k) and (k,j) are both edges, that is, 1 for each length-2 path (i,k),(k,j) from i to j.
But as you can see, that Σ is just a sum. In a sum, we lose the detail of what contributed to it.
In other words, there is nothing to recover from what you computed. You've just computed the number of length-3 paths from i to j, for all i and j, and in particular what you are interested in, the number of length-3 paths from i to i for all i.
So the only solution you have is to write another algorithm that does a completely different computation (but makes yours useless: why compute the number of paths, when you have, or will compute, the list of paths?).
That computation is a rather classic one: you are just looking for paths from one node to another. Only, here those two nodes are the same.
Nevertheless, the most classical algorithms (Dijkstra, Ford, ...) are not really useful here (you are not searching for the shortest path, and you want all paths, not just one).
One method I can think of is to start nevertheless ("nevertheless" because I said earlier that your computation of the number of paths is redundant) from your code. Not that it is the easiest way, but your code is already here; besides, I always try to stay as close as possible to the original code.
Compute a matrix of paths
As I've said earlier, the formula ΣAᵢₖBₖⱼ makes sense: it is computing the number of cases where we have some paths (Aᵢₖ) from i to k and some other paths (Bₖⱼ) from k to j.
You just have to do the same thing, but instead of summing a number, sum a list of paths.
For the sake of simplicity, I'll use lists to store paths here. So the path i,k,j is stored in the list [i,k,j], and each cell of our matrix holds a list of paths, i.e. a list of lists (and since our matrix is itself implemented as a list of lists, that makes the path matrix a list of lists of lists of lists).
The path matrix (I made up the name just now, but I am pretty sure it already has an official name, since the idea can't be new; and that official name is probably "path matrix") for the initial matrix is very simple: each element is either [] (no path) where Mᵢⱼ is 0, or [[i,j]] (1 path, i→j) where Mᵢⱼ is 1.
So, let's build it
def adjacencyToPath(M):
    P = [[[] for _ in range(len(M))] for _ in range(len(M))]
    for i in range(len(M)):
        for j in range(len(M)):
            if M[i][j] == 1:
                P[i][j] = [[i, j]]
            else:
                P[i][j] = []
    return P
Now that you have that, we just have to follow the same idea as in the matrix multiplication. For example (to use the most complete example, even if it is out of your scope, since you don't compute more than M³), when you compute M²×M³ and say M⁵ᵢⱼ = Σₖ M²ᵢₖ M³ₖⱼ, that means that if M²ᵢₖ is 3 and M³ₖⱼ is 2, then you have 6 length-5 paths between i and j whose 3rd step is at node k: all 6 possible combinations of the 3 ways to go from i to k in 3 steps and the 2 ways to go from k to j in 2 steps.
So, let's also do that for the path matrix.
# Args = 2 lists of paths.
# Returns 1 list of paths
# Ex, if p1=[[1,2,3], [1,4,3]] and p2=[[3,2,4,2], [3,4,5,2]]
# then returns [[1,2,3,2,4,2], [1,2,3,4,5,2], [1,4,3,2,4,2], [1,4,3,4,5,2]]
def combineListPath(lp1, lp2):
    res = []
    for p1 in lp1:
        for p2 in lp2:
            res.append(p1 + p2[1:])  # p2[0] is redundant with p1[-1]
    return res
And the path matrix multiplication therefore goes like this
def pathMult(P1, P2):
    res = [[[] for _ in range(len(P1))] for _ in range(len(P1))]
    for i in range(len(P1)):
        for j in range(len(P1)):
            for k in range(len(P1)):
                res[i][j] += combineListPath(P1[i][k], P2[k][j])
    return res
So, all we have to do now is to use this pathMult function as we used the matrix multiplication. As you computed aux2, let's compute pm2:
pm=adjacencyToPath(graph)
pm2=pathMult(pm, pm)
and as you computed aux3, let's compute pm3
pm3=pathMult(pm, pm2)
And now you have, in each cell pm3[i][j] of pm3, the list of length-3 paths from i to j. And in particular, in each pm3[i][i] you have the list of triangles.
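Concretely, for the example graph this prints the triangles found at each vertex (note that each triangle shows up several times: once per starting vertex and per direction):

for i in range(len(pm3)):
    print(i, pm3[i][i])
## >> 0 []
## >> 1 [[1, 2, 3, 1], [1, 3, 2, 1]]
## >> 2 [[2, 1, 3, 2], [2, 3, 1, 2]]
## >> 3 [[3, 1, 2, 3], [3, 2, 1, 3]]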
Now, the advantage of this method is that it mimics exactly your way of computing the number of paths: we do the exact same thing, but instead of retaining the number of paths, we retain the list of them.
Faster way
Obviously there are more efficient ways. For example, you could just search for pairs (i,j) of connected nodes such that there is a third node k connected to both i and j (with an edge (j,k) and an edge (k,i), making no assumption about whether your graph is oriented or not).
def listTriangle(M):
    res = []
    for i in range(len(M)):
        for j in range(i, len(M)):
            if M[i][j] == 0: continue
            # So, at this point, we know i->j is an edge
            for k in range(i, len(M)):
                if M[j][k] > 0 and M[k][i] > 0:
                    res.append((i, j, k))
    return res
We assume j≥i and k≥i, because the triangles (i,j,k), (j,k,i) and (k,i,j) are the same, and either all exist or none do.
It could be optimized further if we assume that we are always in a non-oriented (or at least symmetric) graph, as your example suggests. In that case, we can assume i≤j≤k, for example (since the triangles (i,j,k) and (i,k,j) are also the same), turning the third for loop from for k in range(i, len(M)) into for k in range(j, len(M)). And if we also exclude loops (either because there are none, as in your example, or because we don't want to count them as part of a triangle), then we can assume i<j<k, which turns the last two loops into for j in range(i+1, len(M)) and for k in range(j+1, len(M)), as in the sketch below.
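A possible version of listTriangle with those assumptions applied (the listTriangleSym name is mine):

def listTriangleSym(M):
    # assumes a symmetric adjacency matrix with an all-zero diagonal
    res = []
    for i in range(len(M)):
        for j in range(i + 1, len(M)):
            if M[i][j] == 0: continue
            for k in range(j + 1, len(M)):
                if M[j][k] > 0 and M[k][i] > 0:
                    res.append((i, j, k))
    return res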
Optimisation
One last thing I didn't want to introduce until now, to stay as close as possible to your code: it is worth mentioning that Python already has matrix manipulation routines, through numpy and the @ operator. So it is better to take advantage of them (even though I took advantage of the fact that you reinvented the wheel of matrix multiplication to explain my path multiplication).
Your code, for example, becomes
import numpy as np

graph = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 1],
                  [0, 1, 0, 1],
                  [0, 1, 1, 0]])

# Utility function for calculating
# number of triangles in graph
# That is the core of your code
def triangleInGraph(graph):
    return (graph @ graph @ graph).trace() // 6  # numpy magic
    # shorter than your version, isn't it?

print("Total number of Triangle in Graph :",
      triangleInGraph(graph))
## >> Total number of Triangle in Graph : 1
Mine is harder to optimize that way, but it can be done. We just have to define a new type, PathList, and define what multiplication and addition of path lists are.
class PathList:
    def __init__(self, pl):
        self.l = pl

    def __mul__(self, b):  # That's my previous pathMult
        res = []
        for p1 in self.l:
            for p2 in b.l:
                res.append(p1 + p2[1:])
        return PathList(res)

    def __add__(self, b):  # Just concatenation of the 2 lists
        return PathList(self.l + b.l)

    # For fun, a compact way to print it
    def __repr__(self):
        res = ''
        for n in self.l:
            one = ''
            for o in n:
                one = one + '→' + str(o)
            res = res + ',' + one[1:]
        return '<' + res[1:] + '>'
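As a quick check, multiplying the two example path lists from combineListPath's docstring reproduces the same four length-5 paths, printed with the compact repr:

p1 = PathList([[1, 2, 3], [1, 4, 3]])
p2 = PathList([[3, 2, 4, 2], [3, 4, 5, 2]])
print(p1 * p2)
## >> <1→2→3→2→4→2,1→2→3→4→5→2,1→4→3→2→4→2,1→4→3→4→5→2>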
Using PathList (which wraps the same list of lists as before, but with add and mul operators), we can now redefine our adjacencyToPath:
def adjacencyToPath(M):
    P = [[[] for _ in range(len(M))] for _ in range(len(M))]
    for i in range(len(M)):
        for j in range(len(M)):
            if M[i][j] == 1:
                P[i][j] = PathList([[i, j]])
            else:
                P[i][j] = PathList([])
    return P
And now, a bit of numpy magic
pm = np.array(adjacencyToPath(graph))
pm3 = pm @ pm @ pm
triangles = [pm3[i, i] for i in range(len(pm3))]
pm3 is the matrix of all length-3 paths from i to j. So pm3[i, i] are the triangles.
Last remark
Some Python remarks on your code.
It is better to compute V from your data than to assume the coder was consistent when choosing V=4 and a 4x4 graph. So V = len(graph) is better.
You don't need global V if you don't intend to overwrite V, and it is better to avoid as many global keywords as possible. I am not repeating a dogma here; I've nothing against a global variable from time to time, if we know what we are doing. Besides, in Python there is already a sort of local structure even for global variables (they are still local to the module), so it is not as in some languages where global variables carry a high risk of collision with library symbols. But, well, no need to take the risk of overwriting V.
There is also no need for the allocate-then-write way you do your matrix multiplication: you allocate the matrices first, then call multiply(source1, source2, dest). You can just return a new matrix; you have a garbage collector now. Well, sometimes it is still a good idea to spare some work for the allocator/garbage collector, especially if you intend to "recycle" some variables (like in mult(A,A,B); mult(A,B,C); mult(A,C,B) where B is "recycled").
Since a triangle is defined by a sequence of vertices i,j,k such that (i,j), (j,k) and (k,i) are all edges, we can define the following function:
def find_triangles(adj, n=None):
    if n is None:
        n = len(adj)
    triangles = []
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                if adj[i][j] and adj[j][k] and adj[k][i]:
                    triangles.append([i, j, k])
    return triangles
print("The triangles are: ", find_triangles(graph, V))
## >> The triangles are: [[1, 2, 3]]

Variation of max path sum problem using more directions

I don't know how to approach this question.
We're given an N*N grid listing the costs to get from location a to location b.
Each row in the grid tells us the cost of getting from one location to another (each location corresponds to a row in the costs array; we say that location a is bigger than location b if row a appears after row b, so the index of every row is a location). We may choose to start from any given location, and we must visit every location exactly once. At every location p that we visit, we must have already visited all locations less than p, or no locations less than p.
costs[a][b] gives us the cost to move from location a to location b.
costs[a][b] is not necessarily the same as costs[b][a].
costs[a][a] = 0 for every index a (diagonals in the costs array are always 0).
Our task is to find the maximum-sum cost of a valid path.
If the costs array is:
[[0, 9, 1],
[5, 0, 2],
[4, 6, 0]]
The max cost consequently will be 13, as the most expensive valid path is location 2 -> location 0 -> location 1.
The first row tells us how much it will cost to get from location 0 to location 0 (remain in the same location, costs us 0), 0 to location 1 (costs us 9) and 0 to location 2 (costs us 1). The second and third rows follow the same pattern.
The requirements on which locations you can visit mean that after you start at some location i, you're forced to move to a lower location repeatedly until you're at location 0. At that point, you have to ascend consecutively through all the locations that are unvisited. The dynamic programming solution is not obvious, but with a fairly complex implementation you can get an O(n^3) DP algorithm with standard techniques.
It turns out there's an O(n^2) solution as well, which is optimal. It also uses O(n) extra space, which is maybe also optimal. The solution comes from thinking about the structure of our visits: there's a downward sequence of indices (possibly with gaps) ending at 0, and then an upward sequence starting at 0 that contains all other indices. There's 2^n possible subsequences though, so we'll have to think more to speed this up.
Two Sequences
Suppose we have i locations, 0, 1, ... i-1, and we've partitioned these into two ordered subsequences (except 0, which is at the start of both). We'll call these two sequences U and D, for up and down. Exactly one of them has to end on i-1. Without loss of generality, assume U ends with i-1 and D ends with j >= 0.
What happens when we add a location i? We either add it to the end of U so our sequences end on i and j, or we add it to the end of D so our sequences end on i-1 and i. If we add it to U, the path-sum of U (which we define as the sum of cost[u][v] for all adjacent indices u,v in U) increases by cost[i-1][i]. If we add the location to the end of D, the path-sum of D increases by cost[i][j] (since it's a downward sequence, we've flipped the indices relative to U).
It turns out that we only need to track the endpoints of our subsequences as we grow them, as well as the maximum combined path-sum for any pair of subsequences with those endpoints. If we let (i, j) denote the state where U ends with i and D ends with j, we can think about how we could have arrived here.
For example, at (8,5), our previous state must have had a subsequence containing 7, so our previous state must have been (7,5). Therefore max-value(8,5) = max-value(7,5) + cost[7][8]. We always have exactly one predecessor state when the two endpoints differ by more than one.
Now consider the state (8,7). We can't have come from (7,7), since the only number allowed to be in both sequences is 0. So we could have come from any of (0,7), (1,7), ... (6,7): we can choose whichever will maximize our path sum.
from typing import List

def solve(costs: List[List[int]]) -> int:
    n = len(costs)
    # Deal with edge cases
    if n == 1:
        return 0
    if n == 2:
        return max(costs[0][1], costs[1][0])
    ups = [costs[0][1]]
    downs = [costs[1][0]]
    # After iteration i, ups[j] denotes the max-value of state (i, j)
    # and downs[j] denotes the max-value of state (j, i)
    for i in range(2, n):
        ups.append(max(downs[j] + costs[j][i] for j in range(i - 1)))
        downs.append(max(ups[j] + costs[i][j] for j in range(i - 1)))
        up_gain = costs[i - 1][i]
        down_gain = costs[i][i - 1]
        for j in range(i - 1):
            ups[j] += up_gain
            downs[j] += down_gain
    return max(max(ups), max(downs))
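Checking it against the example from the question:

print(solve([[0, 9, 1],
             [5, 0, 2],
             [4, 6, 0]]))
## >> 13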

Fastest way to sample most numbers with minimum difference larger than a value from a Python list

Given a list of 20 float numbers, I want to find the largest subset in which any two of the chosen numbers differ from each other by more than mindiff = 1. Right now I am using a brute-force method, searching from the largest to the smallest subsets using itertools.combinations. As shown below, the code finds a subset after 4 s for a list of 20 numbers.
from itertools import combinations
import random
from time import time

mindiff = 1.
length = 20
random.seed(99)
lst = [random.uniform(1., 10.) for _ in range(length)]
t0 = time()
n = len(lst)
sample = []
found = False
while not found:
    # get all subsets with size n
    subsets = list(combinations(lst, n))
    # shuffle to ensure randomness
    random.shuffle(subsets)
    for subset in subsets:
        # sort the subset numbers
        ss = sorted(subset)
        # calculate the differences between every two adjacent numbers
        diffs = [j - i for i, j in zip(ss[:-1], ss[1:])]
        if min(diffs) > mindiff:
            sample = set(subset)
            found = True
            break
    # check subsets with size n-1
    n -= 1
print(sample)
print(time() - t0)
Output:
{2.3704888087015568, 4.365818049020534, 5.403474619948962, 6.518944556233767, 7.8388969285727015, 9.117993839791751}
4.182451486587524
However, in reality I have a list of 200 numbers, which is infeasible for a brute-force enumeration. I want a fast algorithm to sample just one random largest subset with a minimum difference larger than 1. Note that I want each sample to have randomness and maximum size. Any suggestions?
My previous answer assumed you simply wanted a single optimal solution, not a uniform random sample of all solutions. This answer assumes you want one that samples uniformly from all such optimal solutions.
1. Construct a directed acyclic graph G where there is one node for each point, and nodes a and b are connected when b - a > mindist. Also add two virtual nodes, s and t, where s -> x for all x and x -> t for all x.
2. Calculate for each node in G how many paths of length k exist to t. You can do this efficiently in O(n^2 k) time using dynamic programming with a table P[x][k], filled initially with P[x][0] = 0 except P[t][0] = 1, and then P[x][k] = sum(P[y][k-1] for y in neighbors(x)). Keep doing this until you reach the maximum k - you now know the size of the optimal subset.
3. Uniformly sample a path of length k from s to t, using P to weight your choices. This is done by starting at s: we look at each neighbor of s and choose one randomly with a weighting dictated by P[s][k]. This gives us our first element of the optimal set. We then repeatedly perform this step: when we are at x, we look at the neighbors of x and pick one randomly using the weights P[x][k-i], where i is the step we're at.
4. Use the nodes you sampled in step 3 as your random subset.
An implementation of the above in pure Python:
import random

def sample_mindist_subset(xs, mindist):
    # Construct directed graph G.
    n = len(xs)
    s = n; t = n + 1  # Two virtual nodes, source and sink.
    neighbors = {
        i: [t] + [j for j in range(n) if xs[j] - xs[i] > mindist]
        for i in range(n)}
    neighbors[s] = [t] + list(range(n))
    neighbors[t] = []

    # Compute number of paths P[x][k] from x to t of length k.
    P = [[0 for _ in range(n+2)] for _ in range(n+2)]
    P[t][0] = 1
    for k in range(1, n+2):
        for x in range(n+2):
            P[x][k] = sum(P[y][k-1] for y in neighbors[x])

    # Sample maximum length path uniformly at random.
    maxk = max(k for k in range(n+2) if P[s][k] > 0)
    path = [s]
    while path[-1] != t:
        candidates = neighbors[path[-1]]
        weights = [P[cn][maxk - len(path)] for cn in candidates]
        path.append(random.choices(candidates, weights)[0])

    return [xs[i] for i in path[1:-1]]
Note that if you want to sample from the same set of numbers many times, you don't have to recompute P every single time and can re-use it.
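For example, applied to the same list the question builds (output omitted here, since the sampled subset depends on the random state):

random.seed(99)
lst = [random.uniform(1., 10.) for _ in range(20)]
print(sorted(sample_mindist_subset(lst, 1.)))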
I probably don't fully understand the question, because right now the solution is quite trivial. EDIT: yes, I misunderstood after all; the OP does not just want an optimal solution, but wishes to randomly sample from the set of optimal solutions. This answer is not incorrect, but it is an answer to a different question than the one the OP is interested in.
Simply sort the numbers and greedily construct the subset:
def mindist_subset(xs, mindist):
    result = []
    for x in sorted(xs):
        if not result or x - result[-1] > mindist:
            result.append(x)
    return result
Sketch of proof of correctness.
Suppose we have a solution S, given input array A, that is of optimal size. If it does not contain min(A), note that we could remove min(S) from S and add min(A) instead, since this would only increase the gap between the smallest and the second smallest number in S. Conclusion: we can without loss of generality assume that min(A) is part of an optimal solution.
Now we can apply this argument recursively: we add min(A) to the solution and remove all elements too close to min(A), leaving the remaining elements A'. We're then left with a subproblem where exactly the same argument applies: we can choose min(A') as the next element of the solution, and so on.

how can i optimize this pseudo code

I am a new Python user... I know this question is very basic, but my project has lots of sets and I need effective, fast code.
I want to generate a matrix with an if condition.
For example:
M = Matrix(m[i,j] if Condition1 and Condition2 and ...)
How can I optimize the following pseudo code?
import networkx as nx
import numpy as np

#G = nx.Graph()
#G.neighbors(node)

def seidel_matrix(G):
    n = nx.number_of_nodes(G)
    x = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                x[i][j] = 0
            elif i in G.neighbors(j):
                x[i][j] = -1
            else:
                x[i][j] = 1
    return x
There are probably multiple ways to do this. Right now, you're looping over every possible edge. If there are lots of non-edges this is a poor choice. It would be faster to just loop over every edge that actually exists.
x = np.ones((n, n))  # default entry is 1.
for u, v in G.edges():  # get edges right
    x[u][v] = -1
    x[v][u] = -1  # assuming undirected network
for u in G.nodes():  # get diagonal right.
    x[u][u] = 0
Note that this assumes that the nodes are labeled 0, 1, ..., n-1
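If the node labels are arbitrary (strings, non-contiguous integers, ...), one workaround is to map labels to indices first; this sketch is my addition, not part of the original answer:

idx = {node: i for i, node in enumerate(G.nodes())}  # arbitrary label -> 0..n-1
x = np.ones((n, n))
for u, v in G.edges():
    x[idx[u]][idx[v]] = -1
    x[idx[v]][idx[u]] = -1
for u in G.nodes():
    x[idx[u]][idx[u]] = 0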

Find the maximum amount of non-intersecting 2-cycles in an undirected graph

I have an adjacency matrix and an adjacency list (I can use either) that both represent a graph.
Basically, how can I pair off connected vertices in the graph so that I am left with the least unpaired (and disconnected) vertices?
I have tried this brute-force strategy:
def max_pairs(adj_matrix):
    if len(adj_matrix) % 2:
        # If there is an odd number of vertices, add a disconnected vertex
        adj_matrix = [adj + [0] for adj in adj_matrix] + [[0] * (len(adj_matrix) + 1)]
    return max(all_pairs(adj_matrix))

def all_pairs(adj_matrix):
    # Adapted from http://stackoverflow.com/a/5360442/5754656
    if len(adj_matrix) < 2:
        yield 0
        return
    a = adj_matrix[0]
    for i in range(1, len(adj_matrix)):
        # Recursively get the next pairs from the reduced matrix
        for rest in all_pairs([
                adj[1:i] + adj[i+1:] for adj in adj_matrix[1:i] + adj_matrix[i+1:]]):
            yield a[i] + rest  # If vertex 0 and vertex i are adjacent, add 1 to the total pairs
Which is alright for the smaller graphs, but the graphs I am working with have up to 100 vertices.
Is there a way to optimise this so that it can handle that large of a graph?
And is this equivalent to another problem that has algorithms for it? I searched for "Most non-intersecting k-cycles" and variations of that, but could not find an algorithm for it.
There is a polynomial-time solution (it works in O(|V|^2 * |E|)): it's known as the Blossom algorithm. The idea is to do something like matching in a bipartite graph, but also shrink odd-length cycles into a single vertex.
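In Python, pairing off connected vertices is a maximum matching in a general graph, and networkx ships a blossom-based implementation; a minimal sketch using it (the max_matching_pairs wrapper is mine), assuming vertices labeled 0..n-1 as in the question:

import networkx as nx

def max_matching_pairs(adj_matrix):
    n = len(adj_matrix)
    G = nx.Graph()
    G.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if adj_matrix[i][j]:
                G.add_edge(i, j)
    # maxcardinality=True requests a maximum-cardinality matching
    return nx.max_weight_matching(G, maxcardinality=True)

Each returned pair is one of the 2-cycles; vertices not covered by the matching are the leftover unpaired ones.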
