Split two lists of lists into subgraphs using Python

I have a network as a list of lists, where the first list is the origin nodes and the second list is the destination nodes, and then the two lists combined tell you which origins have an edge to which destinations.
So essentially I have this:
edge_index = [[0,1,2,3,5,6,5,9,10,11,12,12,13],[1,2,3,4,6,7,8,10,11,10,13,12,9]]
And I want to split this list structure into:
[[0,1,2,3,5,6,5],[9,10,11,12,12,13]]
[[1,2,3,4,6,7,8],[10,11,10,13,12,9]]
i.e. there is no link between 8 and 9, so it's a new subgraph.
I cannot use networkx because it does not seem to give me the right number of subgraphs (I know how many networks there should be in advance). So I wanted to subgraph the list using a different method, and then see if I get the same number as NetworkX or not.
I wrote this code:
edge_index = [[0,1,2,3,5,6,5],[1,2,3,4,6,7,8]]
origins_split = edge_index[0]
dest_split = edge_index[1]
master_list_of_all_graph_nodes = [0,1,2,3,4,5,6,7,8] ##for testing
list_of_graph_nodes = []
list_of_origin_edges = []
list_of_dest_edges = []
graph_nodes = []
graph_edge_origin = []
graph_edge_dest = []
targets_list = []
for o,d in zip(origins_split,dest_split): #change
    if o not in master_list_of_all_graph_nodes:
        if d not in master_list_of_all_graph_nodes:
            nodes = [o,d]
            origin = [o]
            dest = [d]
            graph_nodes.append(nodes)
            graph_edge_origin.append(origin)
            graph_edge_dest.append(dest)
        elif d in master_list_of_all_graph_nodes:
            for index,graph_node_list in enumerate(graph_nodes):
                if d in graph_node_list:
                    origin_list = graph_edge_origin[index]
                    origin_list.append(o)
                    dest_list.append(d)
                    master_list_of_all_graph_nodes.append(o)
    if d not in master_list_of_all_graph_nodes:
        if o in master_list_of_all_graph_nodes:
            for index,graph_node_list in enumerate(graph_nodes):
                if o in graph_node_list:
                    origin_list = graph_edge_origin[index]
                    origin_list.append(o)
                    dest_list.append(d)
                    master_list_of_all_graph_nodes.append(d)
    if o in master_list_of_all_graph_nodes:
        if d in master_list_of_all_graph_nodes:
            o_index = ''
            d_index = ''
            for index,graph_node_list in enumerate(graph_nodes):
                if d in graph_node_list:
                    d_index = index
                if o in graph_node_list:
                    o_index = index
            if o_index == d_index:
                graph_edge_origin[o_index].append(o)
                graph_edge_dest[d_index].append(d)
                master_list_of_all_graph_nodes.append(o)
                master_list_of_all_graph_nodes.append(d)
            else:
                o_list = graph_edge_origin[o_index]
                d_list = graph_edge_dest[d_index]
                node_o_list = node_list[o_index]
                node_d_list = node_list[d_index]
                new_node_list = node_o_list + node_d_list
                node_list.remove(node_o_list)
                node_list.remove(node_d_list)
                graph_edge_origin.remove(o_list)
                graph_edge_dest.remove(d_list)
                new_origin_list = o_list.append(o)
                new_dest_list = d_list.append(d)
                graph_nodes.append(new_node_list)
                graph_edge_dest.append(new_dest_list)
                graph_edge_origin.append(new_origin_list)
                master_list_of_all_graph_nodes.append(o)
                master_list_of_all_graph_nodes.append(d)
print(graph_nodes)
print(graph_edge_dest)
print(graph_edge_origin)
And I get the error:
graph_edge_origin[o_index].append(o)
TypeError: list indices must be integers or slices, not str
I was wondering if someone could demonstrate where I'm going wrong, but also I feel like I'm doing this really inefficiently so if someone could demonstrate a better method I'd appreciate it. I can see other questions like this, but not one I can specifically figure out how to apply here.

In this line:
graph_edge_origin[o_index].append(o)
o_index is a string. It still holds the empty string assigned by o_index = '', because graph_nodes is empty at that point, so the body of the for-loop that would set it to an integer never runs.
In general, either set a breakpoint on the failing line and inspect the variables in your debugger, or print the variables just before the failing line.
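For the "better method" part of the question, one possible approach is a union-find (disjoint-set) pass over the edges, followed by grouping each edge under the root of its origin node. The following is only a sketch under that assumption, not the asker's or the answerer's code; split_edge_index and find are names made up for this example. Note that under strict connectivity the sample data splits into three groups rather than two, because no edge joins node 4 to node 5.

```python
from collections import defaultdict

def split_edge_index(edge_index):
    """Group the edges of an [origins, destinations] pair by connected component."""
    origins, dests = edge_index
    parent = {}

    def find(node):
        # Walk up to the root, halving the path on the way.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Union the two endpoints of every edge (direction is ignored).
    for o, d in zip(origins, dests):
        root_o, root_d = find(o), find(d)
        if root_o != root_d:
            parent[root_d] = root_o

    # Collect the edges of each component, keeping the original edge order.
    groups = defaultdict(lambda: ([], []))
    for o, d in zip(origins, dests):
        comp_origins, comp_dests = groups[find(o)]
        comp_origins.append(o)
        comp_dests.append(d)
    return [list(pair) for pair in groups.values()]

edge_index = [[0,1,2,3,5,6,5,9,10,11,12,12,13],[1,2,3,4,6,7,8,10,11,10,13,12,9]]
subgraphs = split_edge_index(edge_index)
for sub in subgraphs:
    print(sub)
# [[0, 1, 2, 3], [1, 2, 3, 4]]
# [[5, 6, 5], [6, 7, 8]]
# [[9, 10, 11, 12, 12, 13], [10, 11, 10, 13, 12, 9]]
```

If the count from this kind of pass agrees with NetworkX but not with the expected number, the expectation (for example, treating consecutive node numbers as linked) is worth re-checking against the edge list.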

Related

How to create coordinates for nodes of graph

I have a list of strings. Each string holds the navigation steps/actions of the same procedure as performed by a different user. I want to create coordinates for these steps/actions and store them for drawing a graph. Each unique step/action will have one coordinate. My idea is to start with the string that has the most steps and assign its steps coordinates from (1,0) to (n,0); 'y' is 0 for this first string, meaning all of its actions sit in one layer. When I check the steps/actions in the second string, any that are missing get coordinates (1,1) to (n,1), and so on. Care has to be taken that if the first steps/actions of one string fall in the middle of another, longer string, their coordinates come after that point.
This sounds confusing, but in simple terms, I want to create coordinates for the user flow of a website.
Assume list,
A = ['A___O___B___C___D___E___F___G___H___I___J___K___L___M___N',
'A___O___B___C___E___D___F___G___H___I___J___K___L___M___N',
'A___B___C___D___E___F___G___H___I___J___K___L___M___N',
'A___B___C___E___D___F___G___H___I___J___K___L___M___N',
'A___Q___C___D___E___F___G___H___I___J___K___L___M___N',
'E___P___F___G___H___I___J___K___L___M___N']
I started below code, but it is getting complicated. Any help is appreciated.
A1 = [i.split('___') for i in A]
# A1.sort(key=len, reverse=True)
A1 = sorted(A1, reverse=True)
if len(A1)>1:
    Actions = {}
    horizontalVal = {}
    verticalVal = {}
    restActions = []
    for i in A1:
        for j in i[1:]:
            restActions.append(j)
    for i in range (len(A1)):
        if A1[i][0] not in restActions and A1[i][0] not in Actions.keys():
            Actions[A1[i][0]] = [i,0]
            horizontalVal[A1[i][0]] = i
            verticalVal[A1[i][0]] = 0
    unmarkedActions = []
    for i in range(len(sortedLen)):
        currLen = sortedLen[i]
        for j in range(len(A1)):
            if len(A1[j]) == currLen:
                if j == 0:
                    for k in range(len(A1[j])):
                        currK = A1[j][k]
                        if currK not in Actions.keys():
                            Actions[currK] = [k,0]
                            horizontalVal[currK] = k
                            verticalVal[currK] = 0
                else:
                    currHori = []
                    print(A1[j])
                    for k in range(len(A1[j])):
                        currK = A1[j][k]
.
. to be continued

Python: copied lists keep changing elements in all other lists, although list(), copy() etc. are used

I have a bug in my code and I have tried to fix it using different approaches, still it does not work. I have scaled down my original code to the essential part of it below. I use a textfile as the input and it contains the number of vertices (first line), number of edges (second line), number of colors (third line) and the remaining lines consist of two numbers (separated by a blank space) representing the edges. What is important are the edges.
INPUT
6
5
3
6 2
2 3
3 4
4 6
6 2
CODE
# An instance of m-Coloring Graph problem (NP-hard) Karp-reduced to an
# instance of the Casting problem.
#! /usr/bin/python3
def subgraph(v,aux1,aux2):
    print(nhoods)
    sg = list(aux2[v-1])
    aux1.remove(sg)
    sg.remove(v)
    for i, nhood in enumerate(aux1):
        try:
            aux1[i].remove(v)
            aux2[i].remove(v)
        except ValueError:
            pass # do nothing!
    for vertex in sg:
        sg.extend(subgraph(vertex,aux1,aux2))
    return sg
line = 0
edges = []
inputs = "testfile.txt"
f = open(inputs,"r")
for i in f.readlines():
    line += 1
    if line == 1:
        V = int(i)
    elif line == 2:
        E = int(i)
    elif line == 3:
        m = int(i)
    else:
        edge = [int(n) for n in i.split()]
        if edge in edges:
            pass # Removes double edges
        else:
            edges.append(edge)
conv = [] # Connected vertices
for edge in edges:
    for vend in edge:
        if vend in conv:
            pass
        else:
            conv.append(vend) # Stores non-isolated vertices
# Create lists of neighbors/neighborhoods for each vertex
nhoods = []
for v in conv:
    nhood = []
    for edge in edges:
        if v == edge[0]:
            nhood.append(edge[1])
        elif v == edge[1]:
            nhood.append(edge[0])
    nhood.append(v)
    nhoods.append(nhood)
# Create list of connected subgraphs
aux1 = list(nhoods)
aux2 = list(nhoods)
#for nhood in nhoods:
#    aux1.append(nhood)
#    aux2.append(nhood)
SG = [] # List of subgraphs
while aux1 != []:
    v = aux1[0][0]
    SG.append(subgraph(v,aux1,aux2))
Now, when I run the code, what I want it to do is create copies of the nhoods list called aux1 and aux2 (at line 62 in the code). (I later use these for the purpose of finding connected subgraphs in the input graph.) However, when I modify one of the copied lists aux1 or aux2, nhoods changes too! But this should not happen when I am using the list() function, right? I have tried using the copy() function and a for-loop, with no better results. To me it seems that the lists refer to the same spot in memory, but why? Is it that the elements of the lists (which are themselves lists) are referring to the same memory spot? How do I solve this?
I hope I did not miss anything, otherwise just ask, thanks in advance!
Best regards//
The issue you are facing comes from the mutability of lists, and from the difference between a shallow copy and a deep copy. Everything you have tried (list(), copy(), a for-loop) makes a shallow copy: the outer list is new, but its elements are still the very same inner list objects that nhoods holds. Since you have mutable elements inside a mutable object, a deep copy is required. One approach is the copy.deepcopy function.
import copy
...
aux1 = copy.deepcopy(nhoods)
aux2 = copy.deepcopy(nhoods)
Now aux1 and aux2 each hold their own copies of the inner lists, independent of those in nhoods.
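A small sketch of the difference, using a made-up two-element nhoods list rather than the one built from the input file:

```python
import copy

nhoods = [[2, 4, 6], [6, 3, 2]]   # illustrative data, not from testfile.txt

shallow = list(nhoods)            # new outer list, same inner list objects
deep = copy.deepcopy(nhoods)      # new outer list and new inner lists

shallow[0].remove(6)              # also changes nhoods[0]: it is the same object
deep[1].remove(6)                 # only changes the deep copy

print(nhoods)                     # [[2, 4], [6, 3, 2]]
print(shallow[0] is nhoods[0])    # True
print(deep[0] is nhoods[0])       # False

# For a list of flat lists, copying one level down is also enough:
one_level = [list(nhood) for nhood in nhoods]
print(one_level[0] is nhoods[0])  # False
```

copy.deepcopy is the more general choice; the list comprehension only works because the inner lists contain plain integers.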

Calculating the number of graphs created and the number of vertices in each graph from a list of edges

Given a list of edges such as edges = [[1,2],[2,3],[3,1],[4,5]],
I need to find how many graphs are created, by which I mean how many connected components these edges form, and then get the number of vertices in each component.
However, I am required to handle 10^5 edges, and I am currently having trouble completing the task for a large number of edges.
My algorithm currently takes the list edges = [[1,2],[2,3],[3,1],[4,5]] and merges the lists as sets whenever they intersect. This outputs a new list containing the components, such as graphs = [[1,2,3],[4,5]].
There are two connected components: [1,2,3] are connected, and [4,5] are connected as well.
I would like to know if there is a much better way of doing this task.
def mergeList(edges):
    sets = [set(x) for x in edges if x]
    m = 1
    while m:
        m = 0
        res = []
        while sets:
            common, r = sets[0], sets[1:]
            sets = []
            for x in r:
                if x.isdisjoint(common):
                    sets.append(x)
                else:
                    m = 1
                    common |= x
            res.append(common)
        sets = res
    return sets
I would like to try doing this with a dictionary or something more efficient, because this is too slow.
A basic iterative graph traversal in Python isn't too bad.
import collections
def connected_components(edges):
    # build the graph
    neighbors = collections.defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    # traverse the graph
    sizes = []
    visited = set()
    for u in neighbors.keys():
        if u in visited:
            continue
        # visit the component that includes u
        size = 0
        agenda = {u}
        while agenda:
            v = agenda.pop()
            visited.add(v)
            size += 1
            agenda.update(neighbors[v] - visited)
        sizes.append(size)
    return sizes
Do you need to write your own algorithm? networkx already has algorithms for this.
To get the length of each component try
import networkx as nx
G = nx.Graph()
G.add_edges_from([[1,2],[2,3],[3,1],[4,5]])
components = []
for graph in nx.connected_components(G):
    components.append([graph, len(graph)])
components
# [[set([1, 2, 3]), 3], [set([4, 5]), 2]]
You could use a disjoint-set data structure:
edges = [[1,2],[2,3],[3,1],[4,5]]
parents = {}
size = {}
def get_ancestor(parents, item):
    # Returns ancestor for a given item and compresses path
    # Recursion would be easier but might blow stack
    stack = []
    while True:
        parent = parents.setdefault(item, item)
        if parent == item:
            break
        stack.append(item)
        item = parent
    for item in stack:
        parents[item] = parent
    return parent
for x, y in edges:
    x = get_ancestor(parents, x)
    y = get_ancestor(parents, y)
    if x == y:
        continue  # already in the same component; merging would double its size
    size_x = size.setdefault(x, 1)
    size_y = size.setdefault(y, 1)
    if size_x < size_y:
        parents[x] = y
        size[y] += size_x
    else:
        parents[y] = x
        size[x] += size_y
print(sum(1 for k, v in parents.items() if k == v)) # 2
In the code above, parents is a dict where vertices are keys and ancestors are values. If a given vertex doesn't have a parent, then the value is the vertex itself. For every edge in the list, the ancestors of both vertices are set to be the same. Note that when the current ancestor is queried the path is compressed, so subsequent queries take close to constant time (amortized). Combined with merging the smaller component into the larger one, this makes the whole algorithm run in close to linear time: O(n α(n)), where α is the very slowly growing inverse Ackermann function.
Update
In case the components themselves are required instead of just their number, the resulting dict can be iterated to produce them:
from collections import defaultdict
components = defaultdict(list)
for k in list(parents):
    # Group by root rather than by direct parent, which may not be the root
    components[get_ancestor(parents, k)].append(k)
print(components)
Output:
defaultdict(<class 'list'>, {1: [1, 2, 3], 4: [4, 5]})

Using Dijkstra to work out shortest route between destinations, dictionary help. (python)

How can I add values to a dictionary like they have done here: How to implement Dijkstra algorithm with Python (solved with all explanations)?
I'm trying to create a program that will work out the shortest route between destinations using Dijkstra's algorithm; see the code below.
It currently takes in the postcodes and returns the longitude and latitude in a list called "geocodes". I still need to somehow work out the difference between each of the geocodes and then add it to the dictionary. From there I will be able to just implement the code from the link.
If anyone could help me it would be much appreciated.
import urllib.request
userinput = ""
postcodes = []
geocodes = []
while userinput != ("q"):
    print ("Enter postcode, Or q to finish")
    postcode = input()
    if postcode == "q" :
        break
    postcodes.append(postcode)
print (postcodes)
for each in postcodes:
    geocode = []
    core_string = 'http://uk-postcodes.com/postcode/' + each + '.xml'
    response = urllib.request.urlopen(core_string)
    html = response.read()
    ##print(type(html))
    ##print(html)
    raw_html = str(html)
    ##print(raw_html)
    ##print(raw_html.find("lat"))
    latStartPoint = raw_html.find("<lat>")+5
    latEndPoint = raw_html.find("</lat>")-1
    lonStartPoint = raw_html.find("<lng>")+5
    lonEndPoint = raw_html.find("</lng>")-1
    lat = (raw_html[latStartPoint:latEndPoint])
    lon = (raw_html[lonStartPoint:lonEndPoint])
    geocode.append(lat)
    geocode.append(lon)
    geocodes.append(geocode)
So you've built two "parallel" lists, postcodes and geocodes (where each of the latter's items is a 2-item list of strings), and you want a dict of dicts with the distances as the ultimate values and postcodes as the keys at each level.
Is that correct?
Then, alas, there really isn't any shortcut... it's a quadratic loop.
result = {p: {} for p in postcodes}
joint_list = list(zip(postcodes, geocodes))
for i, (p, g) in enumerate(joint_list):
    g = [float(x) for x in g]
    for j in range(i+1, len(joint_list)):
        op, og = joint_list[j]
        og = [float(x) for x in og]
        dist = distance(g, og)
        result[p][op] = result[op][p] = dist
where of course
import math
def distance(g1, g2):
    return math.hypot(g1[0]-g2[0], g1[1]-g2[1])
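The hypot version treats latitude and longitude as flat x/y coordinates, so its result is in degrees rather than a distance on the ground. If kilometres are wanted for the Dijkstra weights, one common alternative is the haversine great-circle formula. The sketch below swaps that in; haversine_km and EARTH_RADIUS_KM are names introduced here, and the mean Earth radius of 6371 km is an assumed constant.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def haversine_km(g1, g2):
    """Great-circle distance between two [lat, lon] pairs given in degrees."""
    lat1, lon1 = (math.radians(x) for x in g1)
    lat2, lon2 = (math.radians(x) for x in g2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Two points on the equator, half a world apart: half the Earth's circumference.
half_way = haversine_km([0.0, 0.0], [0.0, 180.0])
```

It takes the same [lat, lon] float pairs as distance above, so it could stand in for that function in the quadratic loop without other changes.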

Problems with the zip function: lists that seem not iterable

I'm having some troubles trying to use four lists with the zip function.
In particular, I'm getting the following error at line 36:
TypeError: zip argument #3 must support iteration
I've already read that this happens with non-iterable objects, but I'm using it on lists! And if I use zip on only the first two lists it works perfectly: I have problems only with the last two.
Does anyone have ideas on how to solve that? Many thanks!
import numpy
#setting initial values
R = 330
C = 0.1
f_T = 1/(2*numpy.pi*R*C)
w_T = 2*numpy.pi*f_T
n = 10
T = 1
w = (2*numpy.pi)/T
t = numpy.linspace(-2, 2, 100)
#making the lists c_k, w_k, a_k, phi_k
c_karray = []
w_karray = []
A_karray = []
phi_karray = []
#populating the lists
for k in range(1, n, 2):
    c_k = 2/(k*numpy.pi)
    w_k = k*w
    A_k = 1/(numpy.sqrt(1+(w_k)**2))
    phi_k = numpy.arctan(-w_k)
    c_karray.append(c_k)
    w_karray.append(w_k)
    A_karray.append(A_k)
    phi_karray.append(phi_k)
#making the function w(t)
w = []
#doing the sum for each t and populate w(t)
for i in t:
    w_i = ([(A_k*c_k*numpy.sin(w_k*i+phi_k)) for c_k, w_k, A_k, phi_k in zip(c_karray, w_karray, A_k, phi_k)])
    w.append(sum(w_i))
Probably you mistyped the last 2 elements in zip. They should be A_karray and phi_karray, because phi_k and A_k are single values.
My result for w is:
[-0.11741034896740517,
-0.099189027720991918,
-0.073206290274556718,
...
-0.089754003567358978,
-0.10828235682188027,
-0.1174103489674052]
HTH,
Germán.
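The error can be reproduced in isolation. In this minimal sketch the numbers are made up for illustration; only the names A_karray and A_k mirror the question:

```python
A_karray = [0.5, 0.25, 0.125]   # a list: iterable
A_k = A_karray[-1]              # a single float left over from the loop: not iterable

try:
    list(zip(A_karray, A_k))    # same mistake as in the question
    raised = False
except TypeError:
    raised = True               # zip refuses any argument it cannot iterate over

# Passing the lists themselves works as intended.
pairs = list(zip(A_karray, A_karray))
print(raised, pairs)
```

The exact wording of the TypeError message differs between Python versions, so the sketch only checks that the exception is raised.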
I believe you want zip(c_karray, w_karray, A_karray, phi_karray). Additionally, you should produce this once, not on each iteration of the for loop.
Furthermore, you are not really making use of numpy. Try this instead of your loops.
d = numpy.arange(1, n, 2)
c_karray = 2/(d*numpy.pi)
w_karray = d*w
A_karray = 1/(numpy.sqrt(1+(w_karray)**2))
phi_karray = numpy.arctan(-w_karray)
w = (A_karray*c_karray*numpy.sin(w_karray*t[:,None]+phi_karray)).sum(axis=-1)
