How to identify possible chain pattern from list of tuples - python

Edited
I am looking for an easy or optimized way of implementing the problem below. It seems this can be achieved quite easily with "networkx" (thanks to BENY in the comment section).
input_list = [('A','B'),('D','C'),('C','B'),('E','D'),('I','J'),('L','K'),('J','K')] # path map
def get_chain_list(sp, d):
    global result
    result.append(sp)
    if sp in d : get_chain_list(d[sp], d)
    return tuple(result)
d = dict(input_list)
s1 = set(d.keys())
s2 = set(d.values())
sps = s1 - s2
master_chain = []
for sp in sps :
    result = []
    master_chain.append(get_chain_list(sp, d))
output_list = sorted(master_chain, key=len, reverse=True)
print(output_list)
[('E', 'D', 'C', 'B'), ('I', 'J', 'K'), ('A', 'B'), ('L', 'K')] # Chains in input list
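The same chains can also be built without the global `result` list. The following is a minimal sketch, not the asker's code: the name `find_chains` is hypothetical, and it assumes each node has at most one outgoing pair and that the pairs contain no cycles.

```python
def find_chains(pairs):
    """Follow each start node through a successor map until the chain ends."""
    # Assumption: every node appears at most once as the first item of a pair.
    successor = dict(pairs)
    # Start nodes are those that never appear as a successor.
    starts = sorted(set(successor) - set(successor.values()))
    chains = []
    for node in starts:
        chain = [node]
        # Assumption: no cycles, otherwise this loop would not terminate.
        while chain[-1] in successor:
            chain.append(successor[chain[-1]])
        chains.append(tuple(chain))
    # Longest chains first; sorted() is stable, so ties stay alphabetical.
    return sorted(chains, key=len, reverse=True)


input_list = [('A', 'B'), ('D', 'C'), ('C', 'B'), ('E', 'D'),
              ('I', 'J'), ('L', 'K'), ('J', 'K')]
print(find_chains(input_list))
# [('E', 'D', 'C', 'B'), ('I', 'J', 'K'), ('A', 'B'), ('L', 'K')]
```

Sorting the start nodes makes the result deterministic, which iterating over a plain set does not guarantee.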

This is more like a networkx problem
import networkx as nx
G = nx.Graph()
G.add_edges_from(input_list)
l = [*nx.connected_components(G)]
Out[6]: [{'A', 'B', 'C', 'D', 'E'}, {'I', 'J', 'K', 'L'}]

Use
output_list = set(input_list)
then form the required chain pattern using tuples like so:
from string import ascii_uppercase
input_list = [('A','B'),('D','C'),('C','B'),
('E','D'),('I','J'),('L','K'),('J','K')]
src=sorted({e for t in input_list for e in t})
ss=""
tgt=[]
for c in src:
    if ss+c in ascii_uppercase:
        ss+=c
    else:
        tgt.append(tuple(ss))
        ss=c
else:
    tgt.append(tuple(ss))
>>> tgt
[('A', 'B', 'C', 'D', 'E'), ('I', 'J', 'K', 'L')]

Related

sorting values between open and closed parenthesis iteratively in python

I am trying to sort values alphabetically between opening and closing parentheses, like the following:
(m,b,l,a(d,g,c(l,e)),f(k,g,h(i)),j)
and I would like the output in sorted order:
(a(c(e,l),d,g),b,f(g,h(i),k),j,l,m)
I have done some coding and it is going wacky. I was able to sort the innermost parentheses but was not able to continue.
indexofparen = []
for i, c in enumerate(s):
    if c == "(":
        indexofparen.append(i)
##print(indexofparen)
closeparen = []
for i, c in enumerate(s):
    if c == ")":
        closeparen.append(i)
##print(closeparen)
parenindex=s.rindex("(")
##print(parenindex)
matchparen=s.index(")")-1
##print(matchparen)
list_val = s[s.rindex("(")+1:s.index(")")].split(",")
##print("Before:", list_val)
for passnum in range(len(list_val)-1, 0, -1):
    for i in range(passnum):
        if list_val[i] > list_val[i+1]:
            list_val[i], list_val[i+1] = list_val[i+1], list_val[i]
##print(list_val)
s1=s[:parenindex] + "(" + ','.join(str(e) for e in list_val) + ")" + s[matchparen:]
##print(s1)
s2 = s[indexofparen[1]+1:closeparen[2]]
After this I am stuck. Any help in refining this approach to the sorting problem is appreciated. Thank you very much for the time spent in helping me.
This was fun:
In [55]: def sort_key(t):
    ...:     if isinstance(t, tuple):
    ...:         return t
    ...:     return (t,)
    ...:
In [56]: def recursive_sort(x):
    ...:     gen = (recursive_sort(t) if isinstance(t, tuple) else t for t in x)
    ...:     return tuple(sorted(gen, key=sort_key))
    ...:
In [57]: print(x)
('m', 'b', 'l', 'a', ('d', 'g', 'c', ('l', 'e')), 'f', ('k', 'g', 'h', ('i',)), 'j')
In [58]: print(recursive_sort(x))
('a', 'b', ('c', 'd', ('e', 'l'), 'g'), 'f', ('g', 'h', ('i',), 'k'), 'j', 'l', 'm')
The "key" is in the key argument, which makes sure you are always comparing tuples and takes care of the lexicographic aspect of the sort. After that, it is simply a matter of recursively sorting every element that is itself a tuple.
EDIT
Whoops! Just realized that you wanted to sort a string. Well, since I've grown attached to my solution, here's an ugly hack to salvage it:
In [60]: import re
In [61]: import ast
In [62]: def to_tuple(s):
    ...:     s = re.sub(r"([a-zA-Z])([\(\)])",r"\1,\2",s)
    ...:     s = re.sub(r"([a-zA-Z])",r"'\1'",s)
    ...:     return ast.literal_eval(s)
    ...:
In [63]: def to_string(t):
    ...:     return str(t).replace("',)", "')").replace("'",'')
    ...:
In [64]: s = "(m,b,l,a(d,g,c(l,e)),f(k,g,h(i)),j)"
And finally, with t = to_tuple(s):
In [65]: print(t)
('m', 'b', 'l', 'a', ('d', 'g', 'c', ('l', 'e')), 'f', ('k', 'g', 'h', ('i',)), 'j')
In [66]: to_string(recursive_sort(t))
Out[66]: '(a, b, (c, d, (e, l), g), f, (g, h, (i), k), j, l, m)'
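The tuple-based result above detaches each label from the group that follows it, so it does not reproduce the exact string requested in the question. One way to keep them together is a small recursive-descent parser. This is only a sketch: the name `sort_nested` is hypothetical, and it assumes well-formed input with no whitespace, where every item is a label optionally followed by a parenthesised group.

```python
def sort_nested(s):
    """Sort the items at every nesting level, keeping each label attached
    to the parenthesised group that directly follows it."""
    pos = 0

    def parse_group():
        nonlocal pos
        pos += 1  # skip the opening "("
        items = []
        while True:
            start = pos
            while s[pos] not in "(),":
                pos += 1
            name = s[start:pos]
            # A "(" right after a label starts that label's child group.
            children = parse_group() if s[pos] == "(" else None
            items.append((name, children))
            if s[pos] == ")":
                pos += 1  # skip the closing ")"
                return sorted(items, key=lambda item: item[0])
            pos += 1  # skip the "," between items

    def render(items):
        parts = (name + (render(children) if children is not None else "")
                 for name, children in items)
        return "(" + ",".join(parts) + ")"

    return render(parse_group())


print(sort_nested("(m,b,l,a(d,g,c(l,e)),f(k,g,h(i)),j)"))
# (a(c(e,l),d,g),b,f(g,h(i),k),j,l,m)
```

Sorting on the label alone avoids comparing a child list with `None` when two labels are equal.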

Handle Transitivity in Python

I have a set of pairwise relationships, something like this:
col_combi = [('a','b'), ('b','c'), ('d','e'), ('l','j'), ('c','g'),
('e','m'), ('m','z'), ('z','p'), ('t','k'), ('k', 'n'),
('j','k')]
The number of such relationships is too big to check them individually. Each tuple indicates that both values are the same. I would like to apply transitivity and find the common groups. The output would be like the following:
[('a','b','c','g'), ('d','e','m','z','p'), ('t','k','n','l','j')]
I tried the following code, but it has a bug:
common_cols = []
common_group_count = 0
for (c1, c2) in col_combi:
    found = False
    for i in range(len(common_cols)):
        if (c1 in common_cols[i]):
            common_cols[i].append(c2)
            found = True
            break
        elif (c2 in common_cols[i]):
            common_cols[i].append(c1)
            found = True
            break
    if not found:
        common_cols.append([c1,c2])
The output of the above code is the following:
[['a', 'b', 'c', 'g'], ['d', 'e', 'm', 'z', 'p'], ['l', 'j', 'k'], ['t', 'k', 'n']]
I know why this code is not working, so I would like to know how I can perform this task.
Thanks in advance
You can approach this as a graph problem using the NetworkX library:
import networkx
col_combi = [('a','b'), ('b','c'), ('d','e'), ('l','j'), ('c','g'),
('e','m'), ('m','z'), ('z','p'), ('t','k'), ('k', 'n'),
('j','k')]
g = networkx.Graph(col_combi)
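# connected_component_subgraphs() was removed in NetworkX 2.4;
# connected_components() yields each group as a set of nodes.
for component in networkx.connected_components(g):
    print(sorted(component))
Output:
['a', 'b', 'c', 'g']
['d', 'e', 'm', 'p', 'z']
['j', 'k', 'l', 'n', 't']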
You can implement a solution using sets and union/intersection operations.
col_combi = [('a','b'), ('b','c'), ('d','e'), ('l','j'), ('c','g'),
('e','m'), ('m','z'), ('z','p'), ('t','k'), ('k', 'n'),
('j','k')]
from itertools import combinations
sets = [set(x) for x in col_combi]
stable = False
while not stable: # loop until no further reduction is found
    stable = True
    # iterate over pairs of distinct sets
    for s,t in combinations(sets, 2):
        if s & t: # do the sets intersect ?
            s |= t # move items from t to s
            t ^= t # empty t
            stable = False
    # remove empty sets
    sets = list(filter(None, sets)) # added list() for python 3
print(sets)
Output (the order of elements inside each set may vary):
[{'a', 'b', 'c', 'g'}, {'d', 'e', 'm', 'p', 'z'}, {'j', 'k', 'l', 'n', 't'}]
Note: doc for itertools.combinations
Here is a solution with itertools that you can take a look at.
lst =[]
import itertools
for a, b in itertools.combinations(col_combi, 2):
    for i in a:
        if i in b:
            lst.append(set(a+b))
for indi,i in enumerate(lst):
    for j in lst:
        if i == j:
            continue
        if i & j:
            lst[indi] = i|j
            lst.remove(j)
print(lst)
Output of this is (the order of elements inside each set may vary):
[{'a', 'b', 'c', 'g'}, {'j', 'k', 'l', 'n'}, {'d', 'e', 'm', 'p', 'z'}]
Note that 't' is missing from the second group: the loop modifies lst while iterating over it, so some merges are lost. Of course this can also be made more efficient. I will try to update soon.
From the code after elif, you assume the relationship is symmetric.
Your algorithm fails if the pairs are not provided in a specific order, namely when a pair links two groups that have already been created.
Example:
(a, b) (c, d) (b, c)
will end up with two sets
a, b, c
and
c, d
The problem is about partitioning a set using an equivalence relation. Understanding the set theory background helps in identifying a library that can solve the problem. See https://en.m.wikipedia.org/wiki/Equivalence_relation .
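Partitioning by an equivalence relation is commonly done with a union-find (disjoint-set) structure. The following is a minimal sketch using only the standard library. The name `group_equivalent` is hypothetical and not taken from any of the answers.

```python
def group_equivalent(pairs):
    """Partition items into groups, treating each pair as 'these are the same'."""
    parent = {}

    def find(item):
        # Return the representative of item's group, shortening the path on the way.
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), []).append(item)
    return [tuple(sorted(members)) for members in groups.values()]


col_combi = [('a', 'b'), ('b', 'c'), ('d', 'e'), ('l', 'j'), ('c', 'g'),
             ('e', 'm'), ('m', 'z'), ('z', 'p'), ('t', 'k'), ('k', 'n'),
             ('j', 'k')]
print(group_equivalent(col_combi))
# [('a', 'b', 'c', 'g'), ('d', 'e', 'm', 'p', 'z'), ('j', 'k', 'l', 'n', 't')]
```

Because groups are merged through their representatives, the result does not depend on the order in which the pairs arrive.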

Networkx: how can I import a graph from a file?

With networkx, I'm trying to import a graph from a txt file.
The graph format is this (for example):
a b
a c
b d
c e
That means: a--b a--c b--d c--e
I suppose this is an edge list, so I tried to use the appropriate command:
G=nx.read_edgelist("path\file.txt")
but it doesn't work. Any ideas?
Are you looking for something like this?
import networkx as nx
with open('a.txt') as f:
    lines = f.readlines()
myList = [line.strip().split() for line in lines]
# [['a', 'b'], ['a', 'c'], ['b', 'd'], ['c', 'e']]
g = nx.Graph()
g.add_edges_from(myList)
print(list(g.nodes()))
# ['a', 'b', 'c', 'd', 'e']
print(list(g.edges()))
# [('a', 'b'), ('a', 'c'), ('b', 'd'), ('c', 'e')]
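One possible reason the `nx.read_edgelist("path\file.txt")` call in the question fails is the path string itself. This is an assumption, since the question shows no error message. In an ordinary string literal, `\f` is the escape sequence for a form-feed character, so the string no longer contains the text `\file`. The sketch below uses the placeholder path from the question to show the difference.

```python
plain = "path\file.txt"   # "\f" is read as a single form-feed character
raw = r"path\file.txt"    # raw string: the backslash is kept literally

print(repr(plain))           # 'path\x0cile.txt'
print(repr(raw))             # 'path\\file.txt'
print(len(plain), len(raw))  # 12 13
```

With a raw string or forward slashes in the path, `nx.read_edgelist` should be able to read whitespace-separated pairs such as the ones shown in the question.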

Get a list of N items with K selections for each element?

For example if I have a selection set K
K = ['a','b','c']
and a length N
N = 4
I want to return all possible:
['a','a','a','a']
['a','a','a','b']
['a','a','a','c']
['a','a','b','a']
...
['c','c','c','c']
I can do it with recursion but it is not interesting. Is there a more Pythonic way?
That can be done with itertools.
>>> K = ['a','b','c']
>>> import itertools
>>> N = 4
>>> i = itertools.product(K,repeat = N)
>>> l = [a for a in i]
>>> l[:3]
[('a', 'a', 'a', 'a'), ('a', 'a', 'a', 'b'), ('a', 'a', 'a', 'c')]
EDIT: I realized you actually want product, not combinations_with_replacement. Updated code.
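Since the question asks for lists rather than tuples, the result of `product` can be converted in a comprehension. This is a small sketch, and the variable name `rows` is only illustrative.

```python
from itertools import product

K = ['a', 'b', 'c']
N = 4

# product(K, repeat=N) yields tuples; convert each one to a list.
rows = [list(combo) for combo in product(K, repeat=N)]

print(len(rows))  # 81, i.e. len(K) ** N
print(rows[0])    # ['a', 'a', 'a', 'a']
print(rows[-1])   # ['c', 'c', 'c', 'c']
```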

Append all possibilities from sequences with numbers

I have a question that is consuming my brain. Suppose the variable I stores a sequence, the variable II another one, and the variable III another one too. The first variable represents the number 1, the next 2 and the next 3. I also have a key variable, x, holding characters taken from these three sequences, so I can easily translate the characters of the key variable into the corresponding numbers. In the example, x = 'afh', which is the same as saying x = '123', because a OR b OR c = 1, and so on.
Now comes the complicated part:
When the key variable x is translated into numbers, each character individually, I can also return characters randomly from the result. For example: x = '123', then I can return a list like ['a','e','h'], or ['b','d','i'], especially if I use random.choice(). What I haven't figured out how to do yet is:
How can I append to a list ALL THE POSSIBLE VARIATIONS from the variables I, II, III? For example:
['adg','beh','cfi','aei','ceg',...]
I know how to print random combinations endlessly, but in that case I get repetitions, and I don't want them. I want to append to a list exactly all the possible variations between I, II and III, because when they're translated into numbers, I can return any character from the corresponding sequence. Well, I hope my example is self-explanatory. Thank you all very much for your attention!
I = 'abc' # 1
II = 'def' # 2
III = 'ghi' # 3
x = 'afh' # Random possibility: It could be an input.
L = []
LL = []
for i in range(len(x)):
    if x[i] in I:
        L.append(1)
    if x[i] in II:
        L.append(2)
    if x[i] in III:
        L.append(3)
for i in range(len(L)): # Here lies the mystery...
    if L[i] == 1:
        LL.append(I)
    if L[i] == 2:
        LL.append(II)
    if L[i] == 3:
        LL.append(III)
print(L)
print(LL)
The output is:
[1, 2, 3]
['abc', 'def', 'ghi']
Here's how I would rewrite your code. Lengthy if statements like yours are a big code smell. I put the sequences into a tuple and used a single loop. I also replaced the second loop with a list comprehension.
By the way, you could also simplify the indexing if you used zero based indexing like a sensible person.
I = 'abc' # 1
II = 'def' # 2
III = 'ghi' # 3
x = 'afh' # Random possibility: It could be an input.
L = []
LL = []
lists = I, II, III
for c in x:
    for i, seq in enumerate(lists):
        if c in seq:
            L.append(i+1)
LL = [lists[i-1] for i in L]
print(L)
print(LL)
Also, be sure to check out the itertools module, and in particular the product function. It's not clear exactly what you mean, but product gives you all combinations of an item from each of a list of sequences.
Thank you very much Antimony! The answer is exactly product() from itertools. The code with it is far simpler:
from itertools import product
I = 'abc' # 1
II = 'def' # 2
III = 'ghi' # 3
IV = product(I,II,III)
for i in IV:
    print(i)
And the output is exactly what I wanted, every possible combination:
('a', 'd', 'g')
('a', 'd', 'h')
('a', 'd', 'i')
('a', 'e', 'g')
('a', 'e', 'h')
('a', 'e', 'i')
('a', 'f', 'g')
('a', 'f', 'h')
('a', 'f', 'i')
('b', 'd', 'g')
('b', 'd', 'h')
('b', 'd', 'i')
('b', 'e', 'g')
('b', 'e', 'h')
('b', 'e', 'i')
('b', 'f', 'g')
('b', 'f', 'h')
('b', 'f', 'i')
('c', 'd', 'g')
('c', 'd', 'h')
('c', 'd', 'i')
('c', 'e', 'g')
('c', 'e', 'h')
('c', 'e', 'i')
('c', 'f', 'g')
('c', 'f', 'h')
('c', 'f', 'i')
With Python 3.2, a plain list comprehension gives the same tuples:
[(i,v,c) for i in I for v in II for c in III]
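The question asks for strings such as 'adg' rather than tuples. Joining each tuple from `product` gives that form. This is a minimal sketch, and the name `variations` is only illustrative.

```python
from itertools import product

I = 'abc'    # 1
II = 'def'   # 2
III = 'ghi'  # 3

# One character from each sequence, joined into a single string.
variations = [''.join(chars) for chars in product(I, II, III)]

print(len(variations))  # 27
print(variations[:4])   # ['adg', 'adh', 'adi', 'aeg']
```

Each combination appears exactly once, so no de-duplication step is needed.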
