Can you specify a bidirectional edge in a NetworkX digraph? - python

I'd like to be able to draw a NetworkX graph connecting characters from the movie "Love, Actually" (because it's that time of the year in this country), and specifying how each character "relates" to the other in the story.
Certain relationships between characters are unidirectional - e.g. Mark is in love with Juliet, but not the reverse. However, Mark is best friends with Peter, and Peter is best friends with Mark - this is a bidirectional relationship. Ditto Peter and Juliet being married to each other.
I'd like to specify both kinds of relationships. Using a NetworkX digraph in Python, I seem to have a problem: to specify a bidirectional edge between two nodes, I apparently have to provide the same link twice, which will subsequently create two arrows between two nodes.
What I'd really like is a single arrow connecting two nodes, with heads pointing both ways. I'm using NetworkX to create the graph, and pyvis.Network to render it in HTML.
Here is the code so far, which loads a CSV specifying the nodes and edges to create in the graph.
import networkx as nx
import csv
from pyvis.network import Network
dg = nx.DiGraph()
with open("rels.txt", "r") as fh:
    reader = csv.reader(fh)
    for row in reader:
        if len(row) != 3:
            continue  # Quick check for malformed csv input
        dg.add_edge(row[0], row[1], label=row[2])
nt = Network('500px', '800px', directed=True)
nt.from_nx(dg)
nt.show('nx.html', True)
Here is the CSV, which can be read as "Node1", "Node2", "Edge label":
Mark,Juliet,in love with
Mark,Peter,best friends
Peter,Mark,best friends
Juliet,Peter,married
Peter,Juliet,married
And the resulting image:
Whereas what I'd really like the graph to look like is this:
(Thank you to this site for the wonderful graph tool for the above visualisation)
Is there a way to achieve the above visualisation using NetworkX and Pyvis? I wasn't able to find any documentation on ways to create bidirectional edges in a directed graph.

Read the CSV into pandas, create a DiGraph from the edge list, and plot it with matplotlib. The NetworkX documentation covers plotting in some detail. Here is what I came up with:
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
# Same edges as the CSV in the question
df = pd.DataFrame({
    'Source': ['Mark', 'Mark', 'Peter', 'Juliet', 'Peter'],
    'Target': ['Juliet', 'Peter', 'Mark', 'Peter', 'Juliet'],
    'Status': ['in love with', 'best friends', 'best friends', 'married', 'married'],
})
# Create a directed graph, keeping the Status column as an edge attribute
g = nx.from_pandas_edgelist(df, 'Source', 'Target', ['Status'], create_using=nx.DiGraph())
pos = nx.spring_layout(g)
nx.draw(g, pos, with_labels=True)
# Map each (source, target) pair to its label
edge_labels = {(n1, n2): d['Status'] for n1, n2, d in g.edges(data=True)}
nx.draw_networkx_edge_labels(g,
                             pos, edge_labels=edge_labels,
                             label_pos=0.5,
                             font_color='red',
                             font_size=7,
                             font_weight='bold',
                             verticalalignment='bottom')
plt.show()
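The plot above still draws each reciprocal pair as two separate edges. One possible way to get a single double-headed arrow is to collapse reciprocal rows that share a label before building the graph. The following is a minimal sketch, not a tested pyvis recipe: merge_reciprocal is a hypothetical helper, and the arrows values ("to" and "to, from") assume the vis.js edge option that pyvis forwards when extra keyword arguments are given to Network.add_edge.

```python
def merge_reciprocal(edges):
    """Collapse A->B and B->A rows that share a label into one record.

    `edges` is an iterable of (source, target, label) tuples.
    Returns a list of dicts with keys source, target, label, arrows.
    """
    edges = list(edges)        # allow generators; we iterate twice
    seen = set(edges)          # fast lookup of the reverse row
    done = set()               # rows already emitted or absorbed
    merged = []
    for src, dst, label in edges:
        if (src, dst, label) in done:
            continue
        reverse = (dst, src, label)
        if reverse in seen and src != dst:
            # Reciprocal pair: emit once, with arrowheads at both ends.
            merged.append({"source": src, "target": dst,
                           "label": label, "arrows": "to, from"})
            done.add(reverse)
        else:
            # One-way relationship: a single arrowhead at the target.
            merged.append({"source": src, "target": dst,
                           "label": label, "arrows": "to"})
        done.add((src, dst, label))
    return merged


# The rows from the CSV in the question
rows = [
    ("Mark", "Juliet", "in love with"),
    ("Mark", "Peter", "best friends"),
    ("Peter", "Mark", "best friends"),
    ("Juliet", "Peter", "married"),
    ("Peter", "Juliet", "married"),
]
for edge in merge_reciprocal(rows):
    print(edge)
```

Each record could then be passed to pyvis with something like nt.add_edge(e["source"], e["target"], label=e["label"], arrows=e["arrows"]) once the nodes have been added. Pairs whose labels differ in the two directions are deliberately left as two separate arrows.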

Related

Graph Visualization in NetworkX. Is this loop under node 10 ok?

I am making a graph visualization with NetworkX, and I found a self-loop on node 10. Of all the graph visualizations I have seen, I have never come across such a thing.
I don't know if this is wrong or right, but Mr Stark, I don't feel good about this. Can somebody help me out?
I tried modifying the dataframe that I used to make this graph, but I can't figure it out yet.
import dgl as dgl
import networkx as nx
curr = coun_df['Currencies'].to_numpy()
loca = coun_df['Location'].to_numpy()
g = dgl.graph((loca, curr))
print(g)
nxgraph = g.to_networkx().to_undirected()
pos = nx.spring_layout(nxgraph)
nx.draw(nxgraph, pos, with_labels=True, node_color=[[.7, .7, .7]])
This is what the dataframe looks like (both Currencies and Location are actual categorical variables, one-hot encoded to look like this):
Dataframe
This is the image of the graph.
I can see currency 10 being mapped to location 10, but I wonder if self-loops are possible in graphs.
Yes, a network with loops in it may validly be called a "graph". As the wiki page indicates, there are multiple conventions followed as to what exactly constitutes a "graph" that can typically be inferred from context.
If you want to remove the loops, the easiest way is to drop the rows where the Currencies and Location entries match in the dataframe from which the data is pulled. For instance,
import dgl as dgl
import networkx as nx
noloop_df = coun_df[coun_df['Currencies']!=coun_df['Location']]
curr = noloop_df['Currencies'].to_numpy()
loca = noloop_df['Location'].to_numpy()
g = dgl.graph((loca, curr))
print(g)
nxgraph = g.to_networkx().to_undirected()
pos = nx.spring_layout(nxgraph)
nx.draw(nxgraph, pos, with_labels=True, node_color=[[.7, .7, .7]])
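The same filtering idea can be checked without DGL or pandas. This is a small sketch on made-up (location, currency) index pairs, which are assumptions for illustration and not the asker's data; a self-loop is simply a pair whose two entries are equal.

```python
# Hypothetical (location, currency) index pairs; (10, 10) and (2, 2) are self-loops.
pairs = [(0, 3), (10, 10), (4, 7), (2, 2), (5, 10)]

# A self-loop is an edge whose source and target are the same node.
self_loops = [(u, v) for u, v in pairs if u == v]
no_loops = [(u, v) for u, v in pairs if u != v]

print(self_loops)
print(no_loops)
```

If the graph is already a networkx object, G.remove_edges_from(list(nx.selfloop_edges(G))) should have the same effect without going back to the dataframe.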

Network graph CircosPlot function: couldn't get the node labeling and node size variation to work

I would appreciate any help I can get here. I am using the CircosPlot function with networkx in Python to generate a network graph. The graph looks OK, but I am having trouble labeling the nodes and varying the node size. My spreadsheet has the following columns: "Model", "Node_Size", "Category", "Factors", "Edge_width". Please find the Python code below, thanks.
import networkx as nx
import matplotlib.pyplot as plt
from networkx.algorithms import bipartite
# Assumed import: CircosPlot as provided by the nxviz package (pre-0.7 API), not by networkx itself
from nxviz import CircosPlot
G = nx.from_pandas_edgelist(df,
    source="Model",
    target="Category",
    edge_attr=["edge_size"],
    create_using=nx.MultiGraph(),
)
bottom_nodes, top_nodes = bipartite.sets(G)
# Two-colour the nodes; use a new name so the bipartite module is not shadowed
node_colors = bipartite.color(G)
nx.set_node_attributes(G, node_colors, 'Model')
c = CircosPlot(G, node_order='Model', node_grouping='Model', node_color='Model')
plt.show()

I can't form a graph with networkx based on three criteria

I'm new to Python. Please help me solve a problem with graph construction. I have a database with the attributes "Source", "Interlocutor" and "Frequency".
An example of three lines:
I need to build a graph based on the Source-Interlocutor, but the frequency is also taken into account.
Like this:
My code:
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
dic_values = {"Source": [24120.0, 24120.0, 24120.0], "Interlocutor": [34, 34, 34], "Frequency": [446625000, 442475000, 445300000]}
session_graph = pd.DataFrame(dic_values)
friquency = session_graph['Frequency'].unique()
plt.figure(figsize=(10, 10))
# Build and draw one graph per distinct frequency
for i in range(len(friquency)):
    df_friq = session_graph[session_graph['Frequency'] == friquency[i]]
    G_frique = nx.from_pandas_edgelist(df_friq, source='Source', target='Interlocutor')
    pos = nx.spring_layout(G_frique)
    nx.draw_networkx_nodes(G_frique, pos, cmap=plt.get_cmap('jet'), node_size=20)
    nx.draw_networkx_edges(G_frique, pos, arrows=True)
    nx.draw_networkx_labels(G_frique, pos)
plt.show()
And I have like this:
Your problem requires a MultiGraph:
import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd
import pydot
from IPython.display import Image
dic_values = {"Source": [24120.0, 24120.0, 24120.0], "Interlocutor": [34, 34, 34],
              "Frequency": [446625000, 442475000, 445300000]}
session_graph = pd.DataFrame(dic_values)
sources = session_graph['Source'].unique()
targets = session_graph['Interlocutor'].unique()
# Create a MultiDiGraph and add the unique nodes
G = nx.MultiDiGraph()
for n in [sources, targets]:
    G.add_node(n[0])
# Add edges; multiple connections between the same pair of nodes are fine.
# A multigraph tells parallel edges apart by an edge key.
# itertuples() is a faster way to iterate through a pandas DataFrame; add one edge per row
for row in session_graph.itertuples():
    # print(row[1], row[2], row[3])
    G.add_edge(row[1], row[2], label=row[3])
# Now, render it to a file...
p = nx.drawing.nx_pydot.to_pydot(G)
p.write_png('multi.png')
Image(filename='multi.png')  # optional
This will produce the following:
Please note that node layouts are trickier when you use Graphviz/Pydot.
For example, check this SO answer. I hope this helps you move forward. And welcome to SO.
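To see why a plain graph collapses these three rows while a multigraph keeps them, here is a rough, library-free sketch. It mimics the idea of keyed parallel edges with dictionaries; it is an illustration only, not how networkx is implemented internally.

```python
from collections import defaultdict

# The three rows from the question: same Source and Interlocutor, different Frequency.
rows = [
    (24120.0, 34, 446625000),
    (24120.0, 34, 442475000),
    (24120.0, 34, 445300000),
]

# Simple graph: one slot per (source, target), so later rows overwrite earlier ones.
simple_edges = {}
for src, dst, freq in rows:
    simple_edges[(src, dst)] = {"label": freq}

# Multigraph-style storage: each (source, target) pair holds several edges,
# told apart by an integer key 0, 1, 2, ...
multi_edges = defaultdict(dict)
for src, dst, freq in rows:
    key = len(multi_edges[(src, dst)])
    multi_edges[(src, dst)][key] = {"label": freq}

print(len(simple_edges))
print(multi_edges[(24120.0, 34)])
```

Only the multigraph-style storage still holds all three frequencies, which is why the answer switches to a MultiDiGraph.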

Clipping a networkx graph according to georeferenced polygon

I am running a loop that computes a networkx.classes.multidigraph.MultiDiGraph for each row (neighbourhood) of a list of GeoDataFrames (cities). It then computes some statistics for each row and writes the file out to disk. The problem is that the loop takes extremely long to run because the graph is computed for each row.
The way I want to speed up the loop is by computing the graph for the whole GeoDataFrame and then clipping the graph to each row (each row has a polygon). You can do this for a GeoSeries with geopandas.clip. It seems, however, that no equivalent to geopandas.clip exists for networkx graphs.
Does anyone know of a way to clip a networkx graph?
Alternatively, what other methods exist to speed up my loop?
Note: clipping would work if I could convert the networkx graph to a pandas object. Unfortunately, I think it is not possible to keep the properties which osmnx acts on when the graph is converted to a pandas object. If I'm wrong, please say so.
Here is my initial code:
import osmnx as ox
import pandas as pd
import geopandas as gpd
import os
path="C:/folder/"
files=[os.path.join(path, f) for f in os.listdir(path)]
merged=[]
for i in range(0,2):
    city=gpd.read_file(files[i])
    circ=[]
    for j in range(0,181):
        graph_for_row=ox.graph_from_polygon(city.geometry[j])
        #above is the long command
        stat = ox.basic_stats(graph_for_row)
        circ.append(stat['circuity_avg'])
    circ=pd.Series(circ)
    merged.append(pd.concat([city, circ], axis=1))
#geofiles (the list of output paths) is not defined in the original post
for i in (range(0,len(merged))):
    with open(geofiles[i], 'w') as f:
        f.write(merged[i].to_json())
Here is the new loop I'm aiming for:
clipped_graph=[]
for i in range(0,2):
    city=gpd.read_file(files[i])
    whole_city=city.unary_union
    graph=ox.graph_from_polygon(whole_city)
    clipped_graph.append(gpd.clip(graph, city.geometry))#this line
    #does not work since 'graph' is a networkx object, not
    #a GeoDataFrame or GeoSeries
    circ=[]
    for j in range(0,181):
        stat = ox.basic_stats(clipped_graph[j])
        circ.append(stat['circuity_avg'])
    circ=pd.Series(circ)
    merged.append(pd.concat([city, circ], axis=1))
for i in (range(0,len(merged))):
    with open(geofiles[i], 'w') as f:
        f.write(merged[i].to_json())
You can use your individual polygons to (spatially) intersect the graph nodes, then use those nodes to induce a subgraph. A minimal working example:
import osmnx as ox
ox.config(use_cache=True, log_console=True)
# load a shapefile of polygons as a GeoDataFrame using geopandas
# here I just get 3 cities from OSM to make the example reproducible without a shapefile
places = ['Cudahy, CA, USA', 'Bell, CA, USA', 'Maywood, CA, USA']
# note: newer osmnx releases name this function geocode_to_gdf
gdf = ox.gdf_from_places(places)
# get a graph of the union of their boundaries, then extract nodes as a GeoDataFrame
G = ox.graph_from_polygon(gdf.unary_union, network_type='drive')
nodes = ox.graph_to_gdfs(G, edges=False)
# for each city polygon, find intersecting nodes then induce a subgraph
for polygon in gdf['geometry']:
    intersecting_nodes = nodes[nodes.intersects(polygon)].index
    G_sub = G.subgraph(intersecting_nodes)
    fig, ax = ox.plot_graph(G_sub)
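The two steps in this answer, selecting the nodes that fall inside a region and then keeping only the edges between them, can be sketched without osmnx. The coordinates, the edge list, and the in_box helper below are all made up for illustration, and an axis-aligned box stands in for a real polygon test.

```python
# Hypothetical node coordinates (x, y) and directed edges between node ids.
coords = {1: (0.0, 0.0), 2: (1.0, 1.0), 3: (2.0, 2.0), 4: (5.0, 5.0)}
edges = [(1, 2), (2, 3), (3, 4), (4, 1)]


def in_box(point, box):
    """Return True if point lies inside box = (xmin, ymin, xmax, ymax)."""
    x, y = point
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax


def induced_subgraph(coords, edges, box):
    """Keep nodes inside the box and the edges with both endpoints kept."""
    kept = {n for n, p in coords.items() if in_box(p, box)}
    kept_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, kept_edges


nodes_sub, edges_sub = induced_subgraph(coords, edges, (0.0, 0.0, 2.5, 2.5))
print(sorted(nodes_sub))
print(edges_sub)
```

Edges that cross the boundary are dropped, which is also what a node-induced subgraph from G.subgraph does, so streets cut by a neighbourhood boundary will not appear in either piece.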

Node size dependent on the node degree on NetworkX

I imported my Facebook data onto my computer in the form of a .json file. The data is in the format:
{"nodes":[{"name":"Alan"},{"name":"Bob"}],"links":[{"source":0,"target":1}]}
Then, I use this function:
import json
import networkx as nx
def parse_graph(filename):
    """
    Returns networkx graph object of facebook
    social network in json format
    """
    G = nx.Graph()
    # Use a context manager so the file is closed even if parsing fails
    with open(filename) as json_data:
        data = json.load(json_data)
    # The nodes represent the names of the respective people
    # See networkx documentation for information on add_* functions
    G.add_nodes_from([n['name'] for n in data['nodes']])
    # Links store node indices, so look up each endpoint's name
    G.add_edges_from([(data['nodes'][e['source']]['name'], data['nodes'][e['target']]['name']) for e in data['links']])
    return G
to enable this .json file to be used as a graph in NetworkX. If I find the degree of the nodes, the only method I know how to use is:
degree = nx.degree(p)
Where p is the graph of all my friends. Now, I want to plot the graph such that the size of the node is the same as the degree of that node. How do I do this?
Using:
nx.draw(G,node_size=degree)
didn't work and I can't think of another method.
Update for those using networkx 2.x
The API has changed from v1.x to v2.x. networkx.degree no longer returns a dict but a DegreeView Object as per the documentation.
There is a guide for migrating from 1.x to 2.x here.
In this case it basically boils down to using dict(g.degree) instead of d = nx.degree(g).
The updated code looks like this:
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
g.add_edges_from([(1,2), (2,3), (2,4), (3,4)])
d = dict(g.degree)
nx.draw(g, nodelist=d.keys(), node_size=[v * 100 for v in d.values()])
plt.show()
In networkx 1.x, nx.degree(p) returns a dict, while the node_size keyword argument needs a scalar or an array of sizes. You can use the dict that nx.degree returns like this:
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
g.add_edges_from([(1,2), (2,3), (2,4), (3,4)])
d = nx.degree(g)
nx.draw(g, nodelist=d.keys(), node_size=[v * 100 for v in d.values()])
plt.show()
@miles82 provided a great answer. However, if you've already added the nodes to your graph using something like G.add_nodes_from(nodes), then I found that d = nx.degree(G) may not return the degrees in the same order as your nodes.
Building off the previous answer, you can modify the solution slightly to ensure the degrees are in the correct order:
d = nx.degree(G)
d = [(d[node]+1) * 20 for node in G.nodes()]
Note the d[node]+1, which ensures that nodes of degree zero still get a nonzero size and appear in the chart.
Another method, if you still get the error 'DiDegreeView' object has no attribute 'keys':
1) First get the degree of each node as a list of tuples.
2) Build a node list from the first value of each tuple and a degree list from the second value.
3) Finally, draw the network with the node list and the degree list you've created.
Here's the code:
list_degree = list(G.degree())  # a list of tuples, each tuple is (node, deg)
nodes, degree = map(list, zip(*list_degree))  # build a node list and the corresponding degree list
plt.figure(figsize=(20, 10))
nx.draw(G, nodelist=nodes, node_size=[(v * 5) + 1 for v in degree])
plt.show()  # plot the graph
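The ordering concern raised in these answers can be illustrated with plain Python. This sketch computes degrees from a made-up edge list and builds the size list by walking the node list itself, so sizes and nodes cannot fall out of step; the scale factor and the + 1 offset are arbitrary choices borrowed from the answers above.

```python
from collections import Counter

# Hypothetical data: node 5 is isolated, so it never appears in an edge.
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (2, 4), (3, 4)]

# Count how many edge endpoints touch each node (undirected degree).
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# Build sizes in the same order as `nodes`; a Counter returns 0 for missing keys.
sizes = [(degree[n] + 1) * 20 for n in nodes]
print(sizes)
```

With networkx, the equivalent would be to pass nodelist=nodes and node_size=sizes to nx.draw so that both lists are read in the same order.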
