Context
This is my first time working with NetworkX, so either I am misreading the documentation or I am simply not using the right vocabulary.
Problem
I am working with a DiGraph, and I want to get a list of every node reachable from a specified node.
I thought of making a sub-graph containing the nodes I just described, so that I would simply have to iterate over that sub-graph. Unfortunately, I didn't find a way to automatically create a sub-graph with the condition I mentioned.
It feels like an obvious feature. What am I missing?
You are looking for the nx.descendants function:
descendants(G, source)
Return all nodes reachable from source in G.
Parameters: G : NetworkX DiGraph
source : node in G
Returns: des : set()
The descendants of source in G
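As a minimal sketch of how this could be used for the sub-graph idea from the question (the edge list below is made up for illustration): nx.descendants returns the reachable nodes without the source itself, so the source is added back before building the sub-graph.

```python
import networkx as nx

# Hypothetical example graph: a -> b -> c, and d -> a
G = nx.DiGraph()
G.add_edges_from([("a", "b"), ("b", "c"), ("d", "a")])

# Every node reachable from "a" (the source itself is not included)
reachable = nx.descendants(G, "a")

# Sub-graph view restricted to the source and everything reachable from it
sub = G.subgraph(reachable | {"a"})

for node in sub.nodes:
    print(node)
```

Node "d" is left out because it can reach "a" but cannot be reached from it.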
Related
I have a NetworkX graph where every node has an attribute.
I need to extract nodes based on a numerical attribute in the range [0, inf] to create edges.
I tried using random.choice(G.nodes(), p) with p = attribute / (sum of the attributes in the graph).
The problem is that every time I extract a node to create an edge, its attribute changes (for example, let's say attribute += 1), so I need to update all the probabilities because the sum also increases by 1.
For example I could have a graph with G.nodes(data=True)={1:{att=10},2:{att=5}, 3:{att=2}}
So p=[10/17, 5/17, 2/17].
If I extract for example 1 at the first extraction my graph will be G.nodes(data=True)={1:{att=11},2:{att=5}, 3:{att=2}} and p=[11/18, 5/18, 2/18].
Now, because I have more than a thousand graphs, and for each of them I need to run a for loop of 50,000 iterations that creates edges, it is not computationally feasible to update all the probabilities every time I create an edge.
Is there a way to just use the node's attribute, or to avoid recalculating my probabilities every time?
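Using a NumPy array, I have done this:
import networkx as nx
import numpy as np

G = nx.Graph()
G.add_nodes_from([1, 2, 3])
G.nodes[1]["att"] = 10
G.nodes[2]["att"] = 5
G.nodes[3]["att"] = 2
# Collect the current attribute value of every node (avoid shadowing the built-in dict)
att = {}
for i in G.nodes():
    att[i] = G.nodes[i]["att"]
# Normalise the attributes into probabilities and draw one node
weights = np.fromiter(att.values(), dtype="float")
extracted = np.random.choice(list(G.nodes()), p=weights / weights.sum())
After a node is extracted (for example node 1), run G.nodes[1]["att"] += 1; nothing else needs to be stored, because the probabilities are rebuilt from the attributes at the next draw.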
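A possible alternative, sketched here under the assumption that the standard library is acceptable: random.choices takes relative weights, so the attribute values can be passed directly and no division by the sum is needed in user code. The parallel weights list and the index_of lookup are names introduced for this illustration only. Each draw still costs time linear in the number of nodes internally.

```python
import random
import networkx as nx

G = nx.Graph()
G.add_nodes_from([(1, {"att": 10}), (2, {"att": 5}), (3, {"att": 2})])

# Keep the weights in a list parallel to the node list
nodes = list(G.nodes)
weights = [G.nodes[n]["att"] for n in nodes]
index_of = {n: i for i, n in enumerate(nodes)}

rng = random.Random(0)  # seeded only to make the sketch repeatable
for _ in range(5):
    # Relative weights are enough; they do not have to sum to 1
    picked = rng.choices(nodes, weights=weights, k=1)[0]
    # Update only the entry that changed
    G.nodes[picked]["att"] += 1
    weights[index_of[picked]] += 1
```

Only one list entry is touched per extraction, which avoids rebuilding the whole probability vector.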
I am looking for an elegant way to find all nodes with a given attribute value. E.g. let's say I create a new network with two nodes
G.add_node('A', attr1='alpha')
G.add_node('B', attr1='beta')
Now, I would like to have a function that returns all nodes where the attribute "attr1" matches "beta", something like
THX
Lazloo
Try
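L = [node for node, data in G.nodes(data=True) if data.get('attr1') == 'beta']
to create a list (look at list comprehensions). Using data.get avoids a KeyError for nodes that do not define attr1; the older G.node[...] accessor was removed in recent NetworkX releases. You can also create other data types that contain all of these nodes.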
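Since the question asks for a function, the filter could be wrapped in a small helper. This is only a sketch; the name nodes_with_attribute is hypothetical and not part of NetworkX.

```python
import networkx as nx

def nodes_with_attribute(G, name, value):
    """Return the nodes of G whose attribute `name` equals `value`."""
    # data.get() returns None for nodes that lack the attribute,
    # so such nodes are skipped instead of raising KeyError
    return [node for node, data in G.nodes(data=True) if data.get(name) == value]

G = nx.Graph()
G.add_node('A', attr1='alpha')
G.add_node('B', attr1='beta')
G.add_node('C')  # no attr1 at all

matches = nodes_with_attribute(G, 'attr1', 'beta')
print(matches)  # ['B']
```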
I keep seeing pseudocode for Depth First Search that is completely confusing to me in how it relates to my specific problem. I'm trying to determine whether or not a 'directed graph' is strongly connected.
If I have a dict with 2 strings (the first represents the source, the second represents the destination) and an optional number that represents edge weight:
{'Austin': {'Houston': 300}, 'SanFrancisco': {'Albany': 1000}, 'NewYorkCity': { 'SanDiego': True }}
How can I implement some of the elements of the DFS? I know I can start at a vertex 'Austin' and that 'Houston' is another vertex. But I don't see how any of this works in Python code
Like I have this pseudocode:
function graph_DFS(start):
    # Input: start vertex
    S = new Stack()
    # Mark start as visited
    S.push(start)
    while S is not empty:
        node = S.pop()
        # Do something? (e.g. print)
        for neighbor in node’s adjacent nodes:
            if neighbor not visited:
                # Mark neighbor as visited
                S.push(neighbor)
I can see that I could pass 'Austin' as my start. But how in the world do I set 'Austin' to visited, and how do I see what nodes are adjacent to 'Austin'?
Plus how can I even use this algorithm to return true or false if the graph is strongly connected?
I am just having a really hard time seeing this transfer from pseudocode to code. Appreciate any help.
I can see that I could pass 'Austin' as my start. But how in the world
do I set 'Austin' to visited, and how do I see what nodes are adjacent
to 'Austin'?
You can see in the code that 'Austin' is popped off the stack, so we will not be looking back at it. In your example data, every vertex has only one outgoing edge, so no vertex ever has more than one neighbor.
Plus how can I even use this algorithm to return true or false if the graph is strongly connected?
This is just a utility DFS function; the algorithm for deciding whether a graph is strongly connected requires running DFS twice. Basically, the hint is that you want to know whether you get a tree or a forest when running DFS.
In my opinion, you should update your data structure such that the value of every key is a list (vertices to which it has an edge). You can store weights as well in the list in case you need them later.
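To make the pseudocode concrete, here is a sketch in Python. It assumes that "running DFS twice" means one pass over the graph and one pass over the graph with every edge reversed, which is a common way to test strong connectivity; the function names are made up for this example. A set plays the role of the "visited" marks, and the adjacency dict gives the neighbors of a vertex.

```python
def reachable(adj, start):
    """Iterative DFS: return the set of vertices reachable from start."""
    visited = {start}          # "mark start as visited"
    stack = [start]
    while stack:
        node = stack.pop()
        # adj.get handles vertices that have no outgoing edges
        for neighbor in adj.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                stack.append(neighbor)
    return visited

def is_strongly_connected(adj):
    """True if every vertex can reach every other vertex."""
    vertices = set(adj)
    for targets in adj.values():
        vertices.update(targets)
    if not vertices:
        return True
    # Build the reversed graph: an edge u -> v becomes v -> u
    reverse = {v: [] for v in vertices}
    for src, targets in adj.items():
        for dst in targets:
            reverse[dst].append(src)
    start = next(iter(vertices))
    return (reachable(adj, start) == vertices
            and reachable(reverse, start) == vertices)

graph = {'Austin': {'Houston': 300},
         'SanFrancisco': {'Albany': 1000},
         'NewYorkCity': {'SanDiego': True}}
print(is_strongly_connected(graph))  # False
```

Iterating over a dict yields its keys, so the same code works whether the neighbors are stored as a nested dict (as in the question) or as a list.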
So I'm using a networkX graph to represent some information. This information is represented by different object types (for example, ColorNode and ShapeNode).
Some of the processing that is done on this graph requires me to extract out a specific type of node. Every time I need to do this, I do something along the lines of the code below.
colornodes = []
for node in graph.nodes():
    if isinstance(node, ColorNode):
        colornodes.append(node)
While this works, I feel like this is a situation that would arise often when working with graphs and I am re-inventing the wheel there.
Essentially, I would like to know if there is a nicer way of doing this.
Instead of defining your own type and always checking with isinstance (which is painfully slow), I suggest another approach.
You can have a look at this answer for the classic node/edge filtering.
However I found another trick which may come handy for your specific case.
If you define an attribute that represents the node type, you can query nodes having that specific attribute using
the builtin get_node_attributes function. The trick is that it only returns nodes that really define the attribute:
import networkx as nx
G = nx.complete_graph(10)
# G.nodes replaces the older G.node accessor, which was removed in recent NetworkX releases
G.nodes[0]['ColorNode'] = True  # right-hand side value is irrelevant for the lookup
G.nodes[1]['ColorNode'] = True
G.nodes[2]['ShapeNode'] = True
G.nodes[3]['ShapeNode'] = True
# list() turns the dict keys into plain lists so they print as shown below
shape_nodes = list(nx.get_node_attributes(G, 'ShapeNode'))
color_nodes = list(nx.get_node_attributes(G, 'ColorNode'))
print('Shape node ids: {}'.format(shape_nodes))
print('Color node ids: {}'.format(color_nodes))
Output:
Shape node ids: [2, 3]
Color node ids: [0, 1]
Of course if your graph is big or static, you should keep the id lists for fast querying!
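One way to keep such id lists around is to build a lookup table once. The sketch below assumes a single node attribute holding the type; the attribute name kind and its values are hypothetical.

```python
from collections import defaultdict
import networkx as nx

G = nx.complete_graph(4)
# Hypothetical "kind" attribute that records the node type
nx.set_node_attributes(
    G, {0: "color", 1: "color", 2: "shape", 3: "shape"}, name="kind"
)

# Build the lookup once; reuse it for every later query
nodes_by_kind = defaultdict(list)
for node, kind in G.nodes(data="kind"):
    nodes_by_kind[kind].append(node)

print(nodes_by_kind["shape"])  # [2, 3]
```

The table has to be rebuilt or updated whenever nodes are added, removed, or change type.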
instead of graph.nodes(),
you can use xpath.
I'd like to know the best way to read a disconnected undirected graph using igraph for Python. For instance, take the simple graph in which 0 is linked to 1 and 2 is a node not connected to any other. I couldn't get igraph to read it from an edge list format (Graph.Read_Edgelist(...)), because every line must be an edge, so the following is not allowed:
0 1
2
I've been wondering whether an adjacency matrix is my only/best option in this case (I could get it to work with this representation). I'd rather use a format in which I can understand the data by looking at it (something really hard when it comes to the matrix format).
Thanks in advance!
There's the LGL format which allows isolated vertices (see Graph.Read_Lgl). The format looks like this:
# nodeID
nodeID2
nodeID3
# nodeID2
nodeID4
nodeID5
nodeID
# isolatedNode
# nodeID5
I think you get the basic idea; lines starting with a hash mark indicate that a new node is being defined. After this, the lines specify the neighbors of the node that has just been defined. If you need an isolated node, you just specify the node ID prepended by a hash mark in the line, then continue with the next node.
More information about the LGL format is to be found here.
Another fairly readable format that you might want to examine is the GML format which igraph also supports.
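To illustrate how the LGL layout described above is interpreted, here is a small pure-Python reader. It is only a sketch of the format as explained in this answer (hash line starts a vertex, following lines are its neighbors); it is not igraph's parser, ignores optional edge weights, and the name parse_lgl is made up.

```python
def parse_lgl(text):
    """Return (vertices, edges) from LGL-style text."""
    vertices, seen, edges = [], set(), []
    current = None

    def add_vertex(name):
        if name not in seen:
            seen.add(name)
            vertices.append(name)

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A hash mark defines a new vertex; with no neighbor
            # lines after it, the vertex stays isolated
            current = line[1:].strip()
            add_vertex(current)
        else:
            if current is None:
                raise ValueError("neighbor line before any '# vertex' line")
            neighbor = line.split()[0]  # drop an optional weight column
            add_vertex(neighbor)
            edges.append((current, neighbor))
    return vertices, edges

# The graph from the question: 0 linked to 1, and 2 isolated
vertices, edges = parse_lgl("# 0\n1\n# 2\n")
print(vertices)  # ['0', '1', '2']
print(edges)     # [('0', '1')]
```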