Use leidenalg and igraph to find clusters and then output a GML file - python

import leidenalg as la
import igraph as ig
G = ig.Graph.Famous('Zachary')
partition = la.find_partition(G, la.ModularityVertexPartition)
ig.plot(partition,vertex_size = 30)
ig.save(G,'ttt.gml')
Everything works fine, but the file written by ig.save contains only nodes and edges, with no cluster information.
I need to add the cluster information to the nodes in the ttt.gml file.

The graph itself doesn't contain any information about the partition. You should add this information to the graph before saving it, by doing G.vs['cluster'] = partition.membership.

Related

Pick data from NodeClustering object

I am using cdlib to detect communities in my data, but I need the actual clusters of nodes. How can I get that data out of this object?
from cdlib import algorithms
import networkx as nx
coms = algorithms.louvain( G )
print(coms.data)
This is the result of print(coms):
<cdlib.classes.node_clustering.NodeClustering object at 0x7fe46ec70ac0>
I found a way. I'm not sure it's good practice, but it worked.
We can save the communities using
from cdlib import readwrite
readwrite.write_community_csv(coms, "communities.csv", ",")
and then read the communities.csv file back in and use the cluster data.
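Reading the file back can be sketched with the standard csv module. This assumes the layout that cdlib documents for write_community_csv, one community per line with node ids separated by the delimiter; the sample rows below are made up for illustration. If the round trip through a file is not needed, the NodeClustering object also exposes the same list of lists as coms.communities.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for the file written by write_community_csv:
# one community per line, node ids separated by commas.
path = os.path.join(tempfile.mkdtemp(), "communities.csv")
with open(path, "w", newline="") as fh:
    fh.write("0,1,2,3\n4,5\n6\n")

# Each row becomes one community (node ids are read back as strings)
with open(path, newline="") as fh:
    communities = [row for row in csv.reader(fh, delimiter=",") if row]

# Invert to a node -> community index lookup
membership = {node: i for i, com in enumerate(communities) for node in com}

print(communities)
print(membership["4"])
```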

social network analysis using python

I have two CSV files. names.csv contains each person's name and the corresponding node, and nodelinks.csv contains the link weights between nodes (persons): the weight column records how many times one person called another.
I want to create a network that is divided into sub-networks according to the leaders, followers, marginals, outliers and bridges in the network.
I searched the internet and found the networkx library for Python. I tried it and it drew the whole network, but the output is very cluttered, i.e. nodes are drawn on top of each other. I'd like a drawing of the network that is easy to read, and I also want to find the sub-networks, leaders, followers, marginals, outliers and bridges in that network.
What I've tried so far
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
df = pd.read_csv('Nodelinks.csv')
df.columns = ['Source', 'Destination', 'Link']
df.head()
graph = nx.from_pandas_edgelist(df, source='Source', target='Destination',
                                edge_attr='Link', create_using=nx.DiGraph())
plt.figure(figsize=(10, 9))
# dpi is a figure setting, not an nx.draw argument; pass it to plt.figure if needed
nx.draw(graph, node_size=1200, node_color='lightblue', linewidths=0.25,
        font_size=10, font_weight='bold', with_labels=True)
plt.show()
The networkx library can be installed with pip or conda.
Installing with pip gave me an error; installing with conda worked.
The dataset and Jupyter notebook are uploaded on Mega.
I don't know how to proceed from here to get the output I want. Is there another (preferably easier) way to approach this?
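One possible starting point, sketched on a small made-up edge list in place of Nodelinks.csv. Treating the most-called person as a "leader" and the node with the highest betweenness as a "bridge" is an assumption made here for illustration, not a standard definition; followers, marginals and outliers would need similar choices of measure.

```python
import networkx as nx

# Made-up call records (caller, callee, number of calls) standing in for Nodelinks.csv
calls = [
    ("a", "b", 5), ("c", "b", 4), ("b", "d", 1), ("g", "d", 1),
    ("d", "e", 2), ("d", "f", 2), ("x", "y", 2),
]
graph = nx.DiGraph()
graph.add_weighted_edges_from(calls, weight="Link")

# Assumption: the person receiving the most calls is treated as a "leader"
in_deg = dict(graph.in_degree(weight="Link"))
leader = max(in_deg, key=in_deg.get)

# Assumption: the node that most shortest paths pass through acts as a "bridge"
betweenness = nx.betweenness_centrality(graph)
bridge_node = max(betweenness, key=betweenness.get)

# Sub-networks: groups of nodes that are connected when direction is ignored
groups = sorted(sorted(c) for c in nx.weakly_connected_components(graph))

# A larger k pushes nodes apart in the spring layout, which reduces overlap
pos = nx.spring_layout(graph, k=1.5, seed=42)

print(leader, bridge_node)
print(groups)
```

Passing pos=pos to nx.draw then uses the spread-out layout instead of the default one.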

Write nested network in file (e.g. gml, graphml or nnf) using networkx

I'm currently trying to build a block model using the python package networkx. I found that the function networkx.quotient_graph can be used for this job:
g_block = nx.quotient_graph(G=g, partition=node_list, relabel=True)
In the next step, I want to export the generated block graph "g_block" to a file to import it afterwards in a visualization tool that supports for example graphml-files.
nx.write_graphml(g_block, 'test_block.graphml')
However, this leads to the error:
{KeyError}class 'networkx.classes.graphviews.SubDiGraph'
Can someone help?
Currently networkx (version 2.2) doesn't support nested graphs in a way you can easily export and visualize. Consider using graphviz for handling your nested graph and export it to a dot format.
To work with a networkx version of the graph, you can convert the pygraphviz graph to a networkx graph and vice versa by keeping a 'graph' property on the nodes (semantically a subgraph), similar to the result of quotient_graph.
Here is an example of transforming a small networkx graph to pygraphviz with subgraphs, and exporting it as a dot file:
import networkx as nx
import pygraphviz as pgv
G = nx.erdos_renyi_graph(6, 0.5, directed=False)
node_list = [set([0, 1, 2, 3]), set([4, 5])]
pgv_G = pgv.AGraph(directed=True)
pgv_G.add_edges_from(G.edges())
for i, sub_graph in enumerate(node_list):
    pgv_G.add_subgraph(sub_graph, name=str(i))
print(pgv_G)
pgv_G.write("test_pgv.dot")
Note that networkx can also write and read the 'dot' format (see example); however, since there is no built-in support for nested graphs, it's not much help for this purpose.
The reason you can't write the quotient_graph is twofold:
In a quotient_graph, each node has a 'graph' property, which is a SubDiGraph (or a SubGraph, if the original graph is undirected). A SubDiGraph is a ReadOnlyGraph, which means it cannot be written using the standard networkx.readwrite utilities.
Even if we convert the SubDiGraph to a DiGraph, not every graph file format can encode a 'graph' property. For example, the graphml format supports only primitive properties such as booleans, integers etc. Read more here.
One solution that works is to solve the first issue by overriding the 'graph' property with a DiGraph copy of the original SubDiGraph. The second issue can be solved simply by using another file format (e.g., the pickle format can work). Read about all supported formats here.
Following is a working example:
g_block = nx.quotient_graph(G=G, partition=node_list, relabel=True)
def subdigraph_to_digraph(subdigraph):
    # Copy the read-only view into an independent DiGraph
    G = nx.DiGraph()
    G.add_nodes_from(subdigraph.nodes())
    G.add_edges_from(subdigraph.edges())
    return G

for node in g_block:
    g_block.nodes[node]['graph'] = subdigraph_to_digraph(g_block.nodes[node]['graph'])
# write_gpickle is a networkx 2.x function
nx.write_gpickle(g_block, "test_block.pickle")
This allows writing and loading the nested graph for use with networkx; however, it is not much help for using the exported file in a visualization tool.
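nx.write_gpickle was removed in networkx 3.0. On newer releases the same round trip can be sketched with the standard pickle module; the small path graph and the file location below are illustrative only.

```python
import os
import pickle
import tempfile

import networkx as nx

G = nx.path_graph(6)                      # 0-1-2-3-4-5
node_list = [{0, 1, 2, 3}, {4, 5}]
g_block = nx.quotient_graph(G, node_list, relabel=True)

# Replace each read-only subgraph view with an independent copy
for node in g_block:
    g_block.nodes[node]['graph'] = nx.Graph(g_block.nodes[node]['graph'])

path = os.path.join(tempfile.mkdtemp(), "test_block.pickle")
with open(path, "wb") as fh:
    pickle.dump(g_block, fh)

with open(path, "rb") as fh:
    loaded = pickle.load(fh)

print(sorted(loaded.nodes[0]['graph'].nodes()))
print(list(loaded.edges()))
```

As with the version above, the pickle file is only readable from Python, so it does not help with external visualization tools.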

Set node positions using Graphviz in Jupyter Python

I want to make a graph with random node positions but it seems that the "pos" attribute for nodes does nothing. Here is a minimal example:
import graphviz
import pylab
from graphviz import Digraph
g = Digraph('G', filename='ex.gv',format='pdf')
g.attr(size='7')
g.node('1',pos='1,2')
g.node('2',pos='2,3')
g.node('3',pos='0,0')
g.edge('1','2')
g.edge('1','3')
graphviz.Source(g)
Any ideas on how to achieve that?
Thanks in advance.
Although not 100% clear in the docs, I think pos is not supported in the dot engine on input. The fdp or neato engines do support pos on input for setting the initial position, and if you end the coordinate specification with '!', the coordinates will not change and thus become the final node position.
Play with a live example at https://beta.observablehq.com/#magjac/placing-graphviz-nodes-in-fixed-positions
This standalone python script generates a pdf with the expected node positions:
#!/usr/bin/python
import graphviz
from graphviz import Digraph
g = Digraph('G', engine="neato", filename='ex.gv',format='pdf')
g.attr(size='7')
g.node('1',pos='1,2!')
g.node('2',pos='2,3!')
g.node('3',pos='0,0!')
g.edge('1','2')
g.edge('1','3')
g.render()
(Images omitted: the PNG output of the same code with format='png'; the output without the exclamation marks; and the output without any pos attributes, which is similar but not identical to the output without exclamation marks.)
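For the random placement asked about in the question, the same idea can be sketched as follows. The coordinate range and the seed are arbitrary choices for illustration. Only the DOT source is printed here, so the Graphviz binaries are not needed until g.render() is called.

```python
import random

from graphviz import Digraph

random.seed(1)  # fixed seed only so the example is repeatable

g = Digraph('G', engine='neato', format='pdf')
for name in ['1', '2', '3']:
    x = random.uniform(0, 5)
    y = random.uniform(0, 5)
    # The trailing '!' pins the node so neato keeps the given position
    g.node(name, pos=f'{x:.2f},{y:.2f}!')
g.edge('1', '2')
g.edge('1', '3')

print(g.source)  # call g.render() instead to write the PDF
```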

Plot tree graph from Pandas dataset in Python

I am trying to plot a tree from a Pandas dataframe, but I don't know which data structure to use or how to go about it.
The dataset has 4 columns: source, destination, application and timemark.
For example, a row in the dataset could be:
192.168.1.1 | 192.168.1.200 | ping | 10:00AM
I would like to plot a tree graph with a node for each source, linked to the destinations it has communicated with; each destination linked to the applications it has used with that source; and each (source, destination, application) leaf linked to the timemarks of the sessions in which that application was used between that source and destination.
Could you please tell me how to do this in Python?
Thanks a lot!
You should look at NetworkX:
"NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks."
You can feed your dataset to populate a graph and then plot the graph.
For example:
import networkx as nx
import matplotlib.pyplot as plt
G=nx.Graph()
G.add_node('192.168.1.1')
G.add_node('192.168.1.100')
G.add_node('192.168.1.200')
G.add_edge('192.168.1.1', '192.168.1.100', object='10')
G.add_edge('192.168.1.1', '192.168.1.200', object='11')
nx.draw_networkx(G, pos = nx.shell_layout(G))
nx.draw_networkx_edge_labels(G, pos = nx.shell_layout(G))
plt.show()
would give you a drawing of the three nodes and the two labelled edges (image omitted).
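For the four-level tree described in the question, one way to populate the graph from a dataframe is sketched below. The sample rows are invented, and using the path from the root as the node key (so that the same destination under two sources stays two separate nodes) is an assumption about the desired layout.

```python
import networkx as nx
import pandas as pd

# Invented sample rows with the four columns from the question
df = pd.DataFrame(
    [
        ("192.168.1.1", "192.168.1.200", "ping", "10:00AM"),
        ("192.168.1.1", "192.168.1.200", "ping", "10:05AM"),
        ("192.168.1.1", "192.168.1.100", "ssh", "11:00AM"),
        ("192.168.1.2", "192.168.1.200", "ping", "09:30AM"),
    ],
    columns=["source", "destination", "application", "timemark"],
)

tree = nx.DiGraph()
for row in df.itertuples(index=False):
    # Node keys are path prefixes: (source,), (source, destination), ...
    for depth in range(1, len(row)):
        parent, child = tuple(row[:depth]), tuple(row[:depth + 1])
        tree.add_edge(parent, child)

# Label each node with its last path component for drawing
labels = {node: node[-1] for node in tree}

print(tree.number_of_nodes(), tree.number_of_edges())
print(nx.is_forest(tree))
```

The result can be drawn with nx.draw_networkx(tree, labels=labels), in the same way as the example above.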
