Generate a bipartite structure in gephi - python

I have built a bipartite network in Python with networkx like this:
import networkx as nx
from random import choice
from string import ascii_lowercase, digits
# Define the characters to choose from
chars = ascii_lowercase + digits
# Create two separate lists of 100 random strings each
lst = [''.join(choice(chars) for _ in range(12)) for _ in range(100)]
lst1 = [''.join(choice(chars) for _ in range(12)) for _ in range(100)]
# Create node labels for each list
List1 = [city for city in lst]
List2 = [city for city in lst1]
# Create the graph object
G = nx.Graph()
# Add nodes to the graph with different bipartite indices
G.add_nodes_from(List1, bipartite=0)
G.add_nodes_from(List2, bipartite=1)
# Add edges connecting nodes from the two lists
for i in range(len(lst)):
    G.add_edge(List1[i], List2[i])
# Save the graph to a file
nx.write_gexf(G, "bipartite_network.gexf")
and I want to export this to Gephi, which results in the following database:
This does not give me a bipartite structure (i.e. two separate lists of nodes connected via edges, namely the list under id connected to the list under Label). What is the right input to give Gephi in order to obtain the desired outcome?
Thank you

Related

construct a tree out of list of strings

I have 400 lists that look like that:
[A ,B, C,D,E]
[A, C, G, B, E]
[A,Z,B,D,E]
...
[A,B,R,D,E]
Each has 5 items and starts with A.
I would like to construct a tree or directed acyclic graph (with counts as weights) where each level is the index of the item, i.e. A has edges to all possible items at the first index, those items have edges to their children at the second index, and so on.
Is there an easy way to build it in networkx? What I thought to do is to create all the tuples for each level, i.e. for level 0: (A,B), (A,C), (A,Z) etc., but I'm not sure how to proceed from there.
If I understood you correctly, you can set each list as a path using nx.add_path of a directed graph.
l = [['A' ,'B', 'C','D','E'],
['A', 'C','G', 'B', 'E'],
['A','Z','B','D','E'],
['A','B','R','D','E']]
Though since you have nodes across multiple levels, you should probably rename them according to their level, since you cannot have multiple nodes with the same name. So one way could be:
l = [[f'{j}_level{lev}' for lev,j in enumerate(i, 1)] for i in l]
#[['A_level1', 'B_level2', 'C_level3', 'D_level4', 'E_level5'],
# ['A_level1', 'C_level2', 'G_level3', 'B_level4', 'E_level5'],
# ['A_level1', 'Z_level2', 'B_level3', 'D_level4', 'E_level5'],
# ['A_level1', 'B_level2', 'R_level3', 'D_level4', 'E_level5']]
And now construct the graph with:
G = nx.DiGraph()
for path in l:
    nx.add_path(G, path)
Then you could create a tree-like structure using a graphviz's dot layout:
from networkx.drawing.nx_agraph import graphviz_layout
pos=graphviz_layout(G, prog='dot')
nx.draw(G, pos=pos,
        node_color='lightgreen',
        node_size=1500,
        with_labels=True,
        arrows=True)
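The question also asks for counts as edge weights, which nx.add_path alone does not record. A minimal sketch of one way to get them, assuming the weight should be the number of lists that contain each consecutive pair (the names paths, labelled and edge_counts are illustrative, not from the answer above):

```python
from collections import Counter

import networkx as nx

paths = [['A', 'B', 'C', 'D', 'E'],
         ['A', 'C', 'G', 'B', 'E'],
         ['A', 'Z', 'B', 'D', 'E'],
         ['A', 'B', 'R', 'D', 'E']]

# Tag each item with its level so the same letter at different depths stays distinct.
labelled = [[f'{item}_level{lev}' for lev, item in enumerate(path, 1)]
            for path in paths]

# Count how often each consecutive (parent, child) pair occurs across all lists.
edge_counts = Counter(edge for path in labelled for edge in zip(path, path[1:]))

# Store the counts in the 'weight' edge attribute (an assumed attribute name).
G = nx.DiGraph()
G.add_weighted_edges_from((u, v, n) for (u, v), n in edge_counts.items())

print(G['A_level1']['B_level2']['weight'])  # 2
```

Because nodes at the same level are shared between lists, the result is a directed acyclic graph rather than a strict tree.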

Update edge attributes of a graph with networkx

I have the following graph with the edge attributes:
import networkx as nx
import random
G=nx.DiGraph()
G.add_edge('x','a', dependency=0.4)
G.add_edge('x','b', dependency=0.6)
G.add_edge('a','c', dependency=1)
G.add_edge('b','c', dependency=0.3)
G.add_edge('b','d', dependency=0.7)
G.add_edge('d','e', dependency=1)
G.add_edge('c','y', dependency=1)
G.add_edge('e','y', dependency=1)
After setting the structure of my graph, I will sample three different edge attributes and multiply them with a random number as follows:
for i in range(3):
    # G.edges is a set-like view, so convert it to a list before sampling
    sampled_edge = random.sample(list(G.edges), 1)
    print(sampled_edge)
    sampled_edge_with_random_number = G.edges[sampled_edge[0]]['dependency'] * random.uniform(0,1)
    print(sampled_edge_with_random_number)
Now I want to update the initial graph attribute with the new sampled graph attribute so it would look something like this. The algorithm should look for the same edge attribute in the structure and update the dependency value:
for i in G.edges:
    if i == sampled_edge:
        i['dependency'] = sampled_edge_with_random_number
Can someone help me with this?
You can just access the attribute you want to update and change it:
>>> G=nx.DiGraph()
>>> G.add_edge('x','a', dependency=0.4)
>>> G['x']['a']
{'dependency': 0.4}
>>> G['x']['a']['dependency'] = 10
>>> G['x']['a']
{'dependency': 10}
Another approach is nx.set_edge_attributes:
>>> sampled_edge = ('x', 'a')
>>> new_val = 42
>>> nx.set_edge_attributes(G, {sampled_edge:{'dependency':new_val}})
>>> G['x']['a']['dependency']
42
where ('x','a') is your sampled_edge.
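Putting the sampling and the update together, a minimal sketch could look like the following. It assumes the goal is to overwrite dependency in place on each sampled edge; the seeded random.Random instance and the original snapshot are only there to make the example reproducible and checkable.

```python
import random

import networkx as nx

G = nx.DiGraph()
G.add_edge('x', 'a', dependency=0.4)
G.add_edge('x', 'b', dependency=0.6)
G.add_edge('a', 'c', dependency=1)

# Snapshot of the starting values, kept only for comparison afterwards.
original = {edge: G.edges[edge]['dependency'] for edge in G.edges}

rng = random.Random(0)  # seeded so repeated runs give the same result
for _ in range(3):
    # Pick one edge at random; the edge view is converted to a list first.
    u, v = rng.choice(list(G.edges))
    # Scale the stored attribute in place by a random factor in [0, 1].
    G.edges[u, v]['dependency'] *= rng.uniform(0, 1)

for edge in G.edges:
    print(edge, original[edge], G.edges[edge]['dependency'])
```

No search over G.edges is needed, because the sampled (u, v) tuple already identifies the edge to update.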

iGraph: selecting vertices connected to

Suppose I have the following graph:
import igraph as ig
g = ig.Graph([(0,1), (0,2), (2,3), (3,4), (4,2), (2,5), (5,0), (6,3), (5,6)], directed=False)
g.vs["name"] = ["Alice", "Bob", "Claire", "Dennis", "Esther", "Frank", "George"]
and I wish to see who Bob is connected to. Bob is only connected to one person, Alice. However, if I try to find the edge:
g.es.select(_source=1)
>>> <igraph.EdgeSeq at 0x7f15ece78050>
I simply get the above response. How do I infer what the vertex index is from the above? Or, if that isn't possible, how do I find the vertices connected to Bob?
This seems to work. The keyword arguments consist of a property, e.g. _source or _target, and an operator, e.g. eq (=). It also seems you need to check both the source and the target of the edges (even though it's an undirected graph). After filtering the edges, you can use a list comprehension to loop through them and extract the source or target:
connected_from_bob = [edge.target for edge in g.es.select(_source_eq=1)]
connected_to_bob = [edge.source for edge in g.es.select(_target_eq=1)]
connected_from_bob
# []
connected_to_bob
# [0]
The vertices connected with Bob are then the union of the two lists:
connected_with_bob = connected_from_bob + connected_to_bob
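The same "check both ends of every edge" logic can be sketched without the library, which makes it easier to see why both filters are needed. The helper name neighbours is hypothetical; the edge list and names are the ones from the question.

```python
edges = [(0, 1), (0, 2), (2, 3), (3, 4), (4, 2), (2, 5), (5, 0), (6, 3), (5, 6)]
names = ["Alice", "Bob", "Claire", "Dennis", "Esther", "Frank", "George"]

def neighbours(vertex, edges):
    """Return the sorted indices of all vertices sharing an edge with `vertex`."""
    found = set()
    for source, target in edges:
        if source == vertex:      # vertex is stored as the source of this edge
            found.add(target)
        elif target == vertex:    # vertex is stored as the target of this edge
            found.add(source)
    return sorted(found)

bob = names.index("Bob")
print([names[i] for i in neighbours(bob, edges)])  # ['Alice']
```

python-igraph also has a Graph.neighbors method that should return the adjacent vertex indices directly; it is worth checking against the documentation for your version.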

Python - find longest path

The function will take in a dictionary as input, and I want to find the length of a longest path in a dictionary. Basically, if in a dictionary, key2 matches value1, and key3 matches value2, and so forth, this counts as a path. For example:
{'a':'b', 'b':'c', 'c':'d'}
In the case above, the length should be three. How would I achieve this? Or more specifically how would I compare keys to values? (it could be anything, strings, numbers, etc., not only numbers)
Many thanks in advance!
I would treat the dictionary as a list of edges in a directed acyclic graph (DAG) and use the networkx module to find the longest path in the graph:
import networkx as nx
data = {'a':'b', 'b':'c', 'c':'d'}
G = nx.DiGraph()
G.add_edges_from(data.items())
try:
    path = nx.dag_longest_path(G)
    print(path)
    # ['a', 'b', 'c', 'd']
    print(len(path) - 1)
    # 3
except nx.exception.NetworkXUnfeasible: # There's a loop!
    print("The graph has a cycle")
If you're insisting on not importing anything you could do something like:
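def find_longest_path(data):
    missing = object()  # sentinel, so falsy keys such as 0 or '' still work
    longest = 0
    for key in data:  # iterkeys() only exists in Python 2
        seen = set()
        length = -1
        # Follow key -> value -> value ... until the chain ends
        while key is not missing:
            if key in seen:
                raise RuntimeError('Graph has loop')
            seen.add(key)
            key = data.get(key, missing)
            length += 1
        if length > longest:
            longest = length
    return longest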
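The loop above restarts the walk from every key, so long chains are traversed repeatedly. A minimal sketch of a memoised variant that visits each key once is shown below; it assumes every value is hashable, and the name longest_chain is illustrative.

```python
def longest_chain(data):
    """Length (in edges) of the longest key -> value chain in `data`."""
    cache = {}  # node -> number of edges on the chain starting at that node

    for start in data:
        # Walk forward until we reach a chain end or an already-solved node.
        path = []
        on_path = set()
        node = start
        while node in data and node not in cache:
            if node in on_path:
                raise ValueError("mapping contains a cycle")
            path.append(node)
            on_path.add(node)
            node = data[node]
        # `node` is now either a chain end (0 edges left) or already cached.
        length = cache.get(node, 0)
        # Fill in the lengths while walking back towards `start`.
        for visited in reversed(path):
            length += 1
            cache[visited] = length
    return max(cache.values(), default=0)

print(longest_chain({'a': 'b', 'b': 'c', 'c': 'd'}))  # 3
```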

Python: How to evaluate a part of each item in a list, and append matching results?

Problem:
Trying to evaluate first 4 characters of each item in list.
If the first 4 chars match another first 4 chars in the list, then append the last three digits to the first four. See example below.
Notes:
The list values are not hard coded.
The list always has this structure "####.###".
Only need to match first 4 chars in each item of list.
Order is not essential.
Code:
Grid = ["094G.016", "094G.019", "194P.005", "194P.015", "093T.021", "093T.102", "094G.032"]
Desired Output:
Grid = ["094G.016\019\032", "194P.005\015", "093T.021\102"]
Research:
I know that sets can find duplicates. Could I use a set to evaluate only the first 4 chars, or would I run into a problem since sets cannot be indexed?
Would it be better to split the list items into 2 parts: the four characters before the period ("094G") and a separate list of the three digits after the period ("016"), compare them, then join them in a new list?
Is there a better way of doing this altogether that I'm not realizing?
Here is one straightforward way to do it.
from collections import defaultdict
grid = ['094G.016', '094G.019', '194P.005', '194P.015', '093T.021', '093T.102', '094G.032']
d = defaultdict(list)
for item in grid:
    k,v = item.split('.')
    d[k].append(v)
result = ['%s.%s' % (k, '/'.join(v)) for k, v in d.items()]
On Python 3.7+ the groups come out in first-seen order; on older versions the order is arbitrary, for example:
['093T.021/102', '194P.005/015', '094G.016/019/032']
What you'll most likely want is a dictionary mapping the first part of each code to a list of second parts. You can build the dictionary like so:
mappings = {} #Empty dictionary
for code in Grid: #Loop over each code
    first, second = code.split('.') #Separate the code into first.second
    if first in mappings: #if the first was already found
        mappings[first].append(second) #add the second to those already computed
    else:
        mappings[first] = [second] #otherwise, put it in a new list
Once you have the dictionary, it will be quite simple to loop over it and combine the second parts together (ideally, using '\\'.join)
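The final joining step described above might look like this. It is a sketch only: dict.setdefault is used here as a shorthand for the if/else when building the mappings dictionary, and the lower-case names grid and result are illustrative.

```python
grid = ["094G.016", "094G.019", "194P.005", "194P.015",
        "093T.021", "093T.102", "094G.032"]

# Map each four-character prefix to the list of suffixes seen for it.
mappings = {}
for code in grid:
    first, second = code.split('.')
    mappings.setdefault(first, []).append(second)

# Join the suffixes with a literal backslash, as in the desired output.
result = [first + '.' + '\\'.join(seconds) for first, seconds in mappings.items()]

for item in result:
    print(item)
```

The backslash has to be written as '\\' because a single backslash starts an escape sequence inside a Python string literal.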
Sounds like a job for defaultdict.
from collections import defaultdict
grid = ["094G.016", "094G.019", "194P.005", "194P.015", "093T.021", "093T.102"]
d = defaultdict(set)
for item in grid:
    prefix, suffix = item.split(".")
    d[prefix].add(suffix)
# Sets are unordered, so sort the suffixes to get a stable result
output = [ "%s.%s" % (prefix, "/".join(sorted(d[prefix])), ) for prefix in d ]
>>> from itertools import groupby
>>> Grid = ["094G.016", "094G.019", "194P.005", "194P.015", "093T.021", "093T.102", "094G.032"]
>>> Grid = sorted(Grid, key=lambda x:x.split(".")[0])
>>> gen = ((k, g) for k, g in groupby(Grid, key=lambda x:x.split(".")[0]))
>>> gen = ((k,[x.split(".") for x in g]) for k, g in gen)
>>> gen = list((k + '.' + '/'.join(x[1] for x in g) for k, g in gen))
>>> for x in gen:
...     print(x)
...
093T.021/102
094G.016/019/032
194P.005/015
