Sorting out strands in python

I have a list of pairs of numbers, sorted by the number on the right, e.g.:
[(7, 1),
 (6, 2),
 (5, 3),
 (8, 5),
 (9, 7),
 (4, 9)]
and I want to get out the strands that are linked. A strand is defined as:
x->y->z
where tuples exist:
(y, x)
(z, y)
The strands in the above example are:
1->7->9->4
2->6
3->5->8
I cannot think of any sensible code, as simple iteration with a counting variable will cause significant repeats. Please give me some pointers.

There's an easier way to do this than a real linked list. Since there's no real need for traversal, you can simply build regular lists as you go.
ts = [(7, 1),
      (6, 2),
      (5, 3),
      (8, 5),
      (9, 7),
      (4, 9)]
def get_strands(tuples):
    '''builds a list of lists of connected x,y tuples
    get_strands([(2,1), (3,2), (4,3)]) -> [[1,2,3,4]]
    Note that this will not handle forked or merging lists intelligently
    '''
    lst = []
    for end, start in tuples:
        strand = next((strand for strand in lst if strand[-1]==start), None)
        # give me the sublist that ends with `start`, or None
        if strand is None:
            lst.append([start, end]) # start a new strand
        else:
            strand.append(end)
    return lst
Demo:
In [21]: get_strands(ts)
Out[21]: [[1, 7, 9, 4], [2, 6], [3, 5, 8]]

I think the most complete solution is to create a graph from your data and then perform a topological sort on it. It will provide your expected result as long as your graph doesn't have any cycles.
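The graph idea can be sketched roughly as follows. This is not a full topological sort: it is a simpler walk over a successor mapping, and it assumes every number has at most one successor and one predecessor (no forks or merges). The name strands_from_pairs is made up for this illustration.

```python
def strands_from_pairs(pairs):
    """Follow start -> end links; assumes no forks or merges."""
    # Each pair is (end, start), so map every start to its end.
    successor = {start: end for end, start in pairs}
    ends = set(successor.values())
    # A strand begins at a number that is never the end of another pair.
    heads = [start for start in successor if start not in ends]
    strands = []
    for head in heads:
        strand = [head]
        seen = {head}
        while strand[-1] in successor:
            nxt = successor[strand[-1]]
            if nxt in seen:  # guard against cycles in malformed input
                break
            strand.append(nxt)
            seen.add(nxt)
        strands.append(strand)
    return strands


ts = [(7, 1), (6, 2), (5, 3), (8, 5), (9, 7), (4, 9)]
print(strands_from_pairs(ts))  # [[1, 7, 9, 4], [2, 6], [3, 5, 8]]
```

Because each link is looked up in a dict, this avoids rescanning the list of strands for every pair.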

Related

Sort list of iterables by nth element which might not be present in all of them

Given a list of iterables:
li = [(1,2), (3,4,8), (3,4,7), (9,)]
I want to sort by the third element if present, otherwise leave the order unchanged. So here the desired output would be:
[(1,2), (3,4,7), (3,4,8), (9,)]
Using li.sort(key=lambda x:x[2]) raises an IndexError. I tried a custom function:
def safefetch(li, idx):
    try:
        return li[idx]
    except IndexError:
        return  # i.e. return None
li.sort(key=lambda x: safefetch(x, 2))
But None in sorting yields a TypeError.
Broader context: I first want to sort by the first element, then the second, then the third, and so on up to the length of the longest element, i.e. I want to run several sorts of decreasing priority (as in SQL's ORDER BY COL1, COL2), while preserving order among those elements that aren't relevant. So: first sort everything by the first element; then, among the ties on el_1, sort on el_2, and so on until el_n. My feeling is that calling a sort function on the whole list is probably the wrong approach.
(Note that this was an "XY question": for my actual question, just using sorted on tuples is simplest, as Patrick Artner pointed out in the comments. But the question as posed is trickier.)

We can first get the indices for distinct lengths of elements in the list via a defaultdict and then sort each sublist with numpy's fancy indexing:
from collections import defaultdict
import numpy as np
# {length -> inds} mapping
d = defaultdict(list)
# collect indices per length
for j, tup in enumerate(li):
    d[len(tup)].append(j)
# sort
li = np.array(li, dtype=object)
for inds in d.values():
    li[inds] = sorted(li[inds])
# convert back to list if desired
li = li.tolist()
to get li at the end as
[(1, 2), (3, 4, 7), (3, 4, 8), (9,)]
For some other samples:
In [134]: the_sorter([(12,), (3,4,8), (3,4,7), (9,)])
Out[134]: [(9,), (3, 4, 7), (3, 4, 8), (12,)]
In [135]: the_sorter([(12,), (3,4,8,9), (3,4,7), (11, 9), (9, 11), (2, 4, 4, 4)])
Out[135]: [(12,), (2, 4, 4, 4), (3, 4, 7), (9, 11), (11, 9), (3, 4, 8, 9)]
where the_sorter is the above procedure wrapped in a function (name lacks imagination...)
def the_sorter(li):
    # {length -> inds} mapping
    d = defaultdict(list)
    # collect indices per length
    for j, tup in enumerate(li):
        d[len(tup)].append(j)
    # sort (dtype=object is needed because the tuples have different lengths)
    li = np.array(li, dtype=object)
    for inds in d.values():
        li[inds] = sorted(li[inds])
    return li.tolist()
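The same group-by-length idea can probably be written without NumPy. The following is only a sketch in plain Python. The function name sort_within_length_groups is invented here, and it assumes the elements of equal length are mutually comparable.

```python
from collections import defaultdict


def sort_within_length_groups(items):
    """Sort items of equal length among themselves, keeping their slots."""
    # {length -> positions of the items with that length}
    positions = defaultdict(list)
    for index, item in enumerate(items):
        positions[len(item)].append(index)

    result = list(items)
    for indices in positions.values():
        # Sort one length group and write it back into the same positions.
        ordered = sorted(items[i] for i in indices)
        for index, value in zip(indices, ordered):
            result[index] = value
    return result


print(sort_within_length_groups([(1, 2), (3, 4, 8), (3, 4, 7), (9,)]))
# [(1, 2), (3, 4, 7), (3, 4, 8), (9,)]
```

This returns a new list and leaves the input unchanged, which sidesteps the object-array conversion entirely.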

Whatever you return as fallback value must be comparable to the other key values that might be returned. In your example that would require a numerical value.
import sys
def safefetch(li, idx):
    try:
        return li[idx]
    except IndexError:
        return sys.maxsize  # very large int; sorts after any realistic key
This would put all the short ones at the back of the sort order, but maintain a stable order among them.
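As a rough illustration of that behaviour, here is a variation that uses math.inf as the fallback instead of sys.maxsize. That swap, and the default parameter, are my own assumptions rather than part of the answer above.

```python
import math


def safefetch(seq, idx, default=math.inf):
    """Return seq[idx], or `default` when the sequence is too short."""
    try:
        return seq[idx]
    except IndexError:
        return default


li = [(1, 2), (3, 4, 8), (3, 4, 7), (9,)]
# Tuples without a third element get the key inf and move to the back;
# the sort is stable, so they keep their relative order.
li.sort(key=lambda x: safefetch(x, 2))
print(li)  # [(3, 4, 7), (3, 4, 8), (1, 2), (9,)]
```

Note that this differs from the output the question asked for, because the short tuples do not stay in their original positions.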

Inspired by @Mustafa Aydın, here is a solution in Pandas. I would prefer one without the memory overhead of a dataframe, but this might be good enough.
import pandas as pd
li = [(1,2), (3,4,8), (3,4,7), (9,)]
tmp = pd.DataFrame(li)
[tuple(int(el) for el in t if not pd.isna(el)) for t in tmp.sort_values(by=tmp.columns.tolist()).values]
> [(1, 2), (3, 4, 7), (3, 4, 8), (9,)]

How to access lists inside a tuple inside a list in python

I am new to Python and am trying to learn how to access lists inside a tuple inside a list. My list is:
holidays = [(0,),
(1, [2, 16]),
(2, [20]),
(4, [14]),
(5, [29]),
(7, [4]),
(9, [4]),
(11, [23, 24]),
(12, [25])]
I would like to know the most efficient way to access each tuple and its list. I tried using:
for i, tuples in enumerate(holidays):
    for list in tuples:
        print list
But I get the following error:
    for list in tuples:
TypeError: 'int' object is not iterable
Help would be much appreciated.

You need to remove the i in the first for loop:
for tuples in enumerate(holidays):
    for list in tuples:
        print(list)

Short version:
[y for x in holidays if isinstance(x, tuple) for y in x if isinstance(y, list)]
You can't run a for ... in loop over an integer; that's why the program crashes.

Well, your holidays list is not uniform: the first entry is an integer (0), the others are tuples.
holidays = [0, # <- integer
(1, [2, 16]),
(2, [20]),
(4, [14]),
(5, [29]),
(7, [4]),
(9, [4]),
(11, [23, 24]),
(12, [25])]
Here is a possible loop:
for entry in holidays:
    if entry == 0:
        continue  # don't know what to do with zero
    month, days = entry
    print(month, days)
We use unpacking to extract the month and the days.
See Tuples and Sequences in the Python tutorial.
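A slightly more defensive sketch of the same loop skips any entry that is not a (month, days) pair, whether it is a bare 0 or a one-element tuple such as (0,). The generator name iter_holiday_dates is invented for this example.

```python
def iter_holiday_dates(entries):
    """Yield (month, day) for every well-formed (month, [days]) entry."""
    for entry in entries:
        # Skip anything that is not a two-element tuple, e.g. 0 or (0,).
        if not isinstance(entry, tuple) or len(entry) != 2:
            continue
        month, days = entry  # tuple unpacking
        for day in days:
            yield month, day


holidays = [(0,), (1, [2, 16]), (2, [20]), (11, [23, 24]), (12, [25])]
for month, day in iter_holiday_dates(holidays):
    print(month, day)
```

Checking the shape of each entry keeps the unpacking from failing on the irregular first element.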

Change your first element from 0 to ([0]). Also, remove 'i' from your for loop, as Stavros suggested, and it will work.
holidays = [([0]),
            (1, [2, 16]),
            (2, [20]),
            (4, [14]),
            (5, [29]),
            (7, [4]),
            (9, [4]),
            (11, [23, 24]),
            (12, [25])]
for tuples in enumerate(holidays):
    for list in tuples:
        print(list)

Merging overlapping items in a list

My goal is to merge overlapping tuples in the example list below.
If an item falls within the range of the next, the two tuples will have to be merged. The resulting tuple is one that covers the range of the two items (minimum to maximum values). For instance, [(1,6),(2,5)] will result in [(1,6)], as (2,5) falls within the range of (1,6).
mylist=[(1, 1), (1, 6), (2, 5), (4, 4), (9, 10)]
My attempt:
c=[]
t2=[]
for i, x in enumerate(mylist):
    w=x,mylist[i-1]
    if x[0]-mylist[i-1][1]<=1:
        d=min([x[0] for x in w]),max([x[1] for x in w])
        c.append(d)
for i, x in enumerate(set(c)):
    t=x,c[i-1]
    if x[0]-c[i-1][1]<=1:
        t1=min([x[0] for x in t]),max([x[1] for x in t])
        t2.append(t1)
print sorted(set(t2))
Derived Output:
[(1, 6), (1, 10)]
Desired output:
[(1, 6), (9, 10)]
Any suggestions on how to get the desired output (in fewer lines if possible)? Thanks.

Based on the answer from @Valera, a Python implementation:
mylist=[(1, 6), (2, 5), (1, 1), (3, 7), (4, 4), (9, 10)]
result = []
for item in sorted(mylist):
    result = result or [item]
    if item[0] > result[-1][1]:
        result.append(item)
    else:
        old = result[-1]
        result[-1] = (old[0], max(old[1], item[1]))
print(result) # [(1, 7), (9, 10)]

You can solve this problem in O(n log n).
First, sort your intervals by their starting points. Then create a new stack and, for each interval, do the following:
if the stack is empty, just push the current interval
if it's not, check whether the interval on top of the stack overlaps with your current interval. If it does, pop it, merge it with your current interval, and push the result back. If it doesn't, just push your current interval. After you have checked all the intervals, your stack will contain all the merged intervals.
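The stack procedure described above might look like this when applied to the list from the question. It is a sketch only: the name merge_intervals is made up, and it assumes two intervals overlap when the next start is less than or equal to the previous end (the question's own attempt also merged intervals one unit apart, which this does not do).

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) tuples using an explicit stack."""
    stack = []
    for start, end in sorted(intervals):  # sort by starting point
        if stack and start <= stack[-1][1]:
            # Overlaps the interval on top of the stack: pop, merge, push back.
            top_start, top_end = stack.pop()
            stack.append((top_start, max(top_end, end)))
        else:
            # Empty stack or no overlap: push the current interval.
            stack.append((start, end))
    return stack


mylist = [(1, 1), (1, 6), (2, 5), (4, 4), (9, 10)]
print(merge_intervals(mylist))  # [(1, 6), (9, 10)]
```

The sort dominates the running time, which gives the O(n log n) bound; the single pass over the sorted intervals is linear.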

Converting list of tuples in functional way

I have the following list in Python:
[('1','2','3'),('5','6','7')]
I need to convert the tuples inside the list into integers ([(1,2,3),(5,6,7)]) in a functional way.
I can do this for a list using this simple code: map(lambda x:int(x),['1','2','3'])
But how should I apply the same concept to a list of tuples?
(I know the imperative way of doing this.)

tl = [('1','2','3'),('5','6','7')]
[tuple(int(x) for x in t) for t in tl]
# [(1, 2, 3), (5, 6, 7)]
If you really want the map syntax (in Python 3, map returns an iterator, so wrap it in list to see the result):
list(map(lambda t: tuple(map(int, t)), tl))
# [(1, 2, 3), (5, 6, 7)]

How about the following:
[tuple([int(str_int) for str_int in tup]) for tup in list_of_string_tuples]

This hybrid works:
>>> [tuple(map(int,t)) for t in [('1','2','3'),('5','6','7')]]
[(1, 2, 3), (5, 6, 7)]

How flatten a list of lists one step

I have a list of lists of tuples
A= [ [(1,2,3),(4,5,6)], [(7,8,9),(8,7,6),(5,4,3)],[(2,1,0),(1,3,5)] ]
The outer list can have any number of inner lists, the inner lists can have any number of tuples, a tuple always has 3 integers.
I want to generate all combination of tuples, one from each list:
[(1,2,3),(7,8,9),(2,1,0)]
[(1,2,3),(7,8,9),(1,3,5)]
[(1,2,3),(8,7,6),(2,1,0)]
...
[(4,5,6),(5,4,3),(1,3,5)]
A simple way to do it is to use a function similar to itertools.product()
but it must be called like this
itertools.product([(1,2,3),(4,5,6)], [(7,8,9),(8,7,6),(5,4,3)],[(2,1,0),(1,3,5)])
i.e. the outer list is removed, and I don't know how to do that. Is there a better way to generate all combinations of tuples?

itertools.product(*A)
For more details check the python tutorial

This works for your example, if there is only one level of nested lists (no lists of lists of lists):
itertools.product(*A)

you can probably call itertools.product like so:
itertools.product(*A) # where A is your list of lists of tuples
This way it expands your list's elements into arguments for the function you are calling.
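To make the argument unpacking concrete, here is a small sketch using the list from the question. Only the variable name combos is introduced here.

```python
import itertools

A = [[(1, 2, 3), (4, 5, 6)],
     [(7, 8, 9), (8, 7, 6), (5, 4, 3)],
     [(2, 1, 0), (1, 3, 5)]]

# *A unpacks the outer list, so each inner list becomes one argument.
combos = list(itertools.product(*A))

print(len(combos))   # 12 combinations (2 * 3 * 2)
print(combos[0])     # ((1, 2, 3), (7, 8, 9), (2, 1, 0))
print(combos[-1])    # ((4, 5, 6), (5, 4, 3), (1, 3, 5))
```

itertools.product returns a lazy iterator of tuples, so the list() call is only needed if you want all combinations in memory at once.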

Late to the party but ...
I'm new to python and come from a lisp background. This is what I came up with (check out the var names for lulz):
def flatten(lst):
    if lst:
        car,*cdr=lst
        if isinstance(car,(list)):
            if cdr: return flatten(car) + flatten(cdr)
            return flatten(car)
        if cdr: return [car] + flatten(cdr)
        return [car]
    return []  # empty input flattens to an empty list
Seems to work. Test:
A = [ [(1,2,3),(4,5,6)], [(7,8,9),(8,7,6),(5,4,3)],[(2,1,0),(1,3,5)] ]
flatten(A)
Result:
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (8, 7, 6), (5, 4, 3), (2, 1, 0), (1, 3, 5)]
Note: the line car,*cdr=lst only works in Python 3

This is not exactly one step, but this would do what you want if for some reason you don't want to use the itertools solution:
def crossprod(listoflists):
    if not listoflists:
        # base case: the product of no lists is one empty combination
        return [[]]
    result = []
    # combinations built from all lists after the first one
    remaining_product = crossprod(listoflists[1:])
    for outertuple in listoflists[0]:
        for innercombo in remaining_product:
            # prepend each tuple of the first list to every remaining combo
            result.append([outertuple] + innercombo)
    return result

def flatten(A):
    ans = []
    for i in A:
        if isinstance(i, list):
            ans.extend(i)  # unpack one level of nesting
        else:
            ans.append(i)
    return ans

This may also be achieved using list comprehension.
In [62]: A = [ [(1,2,3),(4,5,6)], [(7,8,9),(8,7,6),(5,4,3)],[(2,1,0),(1,3,5)] ]
In [63]: improved_list = [num for elem in A for num in elem]
In [64]: improved_list
Out[64]: [(1, 2, 3), (4, 5, 6), (7, 8, 9), (8, 7, 6), (5, 4, 3), (2, 1, 0), (1, 3, 5)]
