Combine all elements of n lists in python [closed]

Combine all elements of n lists in python [closed] - python

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 7 years ago.
Improve this question
there are a lot of questions and answers about combining and merging lists in python but I have not found a way to create a full combination of all elements.
If I had a list of lists like the following:
data_small = [ ['a','b','c'], ['d','e','f'] ]
data_big = [ ['a','b','c'], ['d','e','f'], ['u','v','w'], ['x','y','z'] ]
How can I get a list of lists with all combinations?
For data_small this should be:
[ [a,b,c], [d,b,c], [a,b,f], [a,e,c],
[d,e,c], [d,b,f], [a,e,f], [d,e,f], ... ]
This should also work for an arbitrary number of lists of the same length like data_big.
I am pretty sure there is a fancy itertools solution for this, right?

I think I deciphered the question:
def so_called_combs(data):
for sublist in data:
for sbl in data:
if sbl==sublist:
yield sbl
continue
for i in range(len(sublist)):
c = sublist[:]
c[i] = sbl[i]
yield c
This returns the required list, if I understood it correctly:
For every list in the data, every element is replaced (but only one at a time) with the corresponding element (same position) in each of the other lists.
For data_big, this returns:
[['a', 'b', 'c'], ['d', 'b', 'c'], ['a', 'e', 'c'], ['a', 'b', 'f'],
['u', 'b', 'c'], ['a', 'v', 'c'], ['a', 'b', 'w'], ['x', 'b', 'c'],
['a', 'y', 'c'], ['a', 'b', 'z'], ['a', 'e', 'f'], ['d', 'b', 'f'],
['d', 'e', 'c'], ['d', 'e', 'f'], ['u', 'e', 'f'], ['d', 'v', 'f'],
['d', 'e', 'w'], ['x', 'e', 'f'], ['d', 'y', 'f'], ['d', 'e', 'z'],
['a', 'v', 'w'], ['u', 'b', 'w'], ['u', 'v', 'c'], ['d', 'v', 'w'],
['u', 'e', 'w'], ['u', 'v', 'f'], ['u', 'v', 'w'], ['x', 'v', 'w'],
['u', 'y', 'w'], ['u', 'v', 'z'], ['a', 'y', 'z'], ['x', 'b', 'z'],
['x', 'y', 'c'], ['d', 'y', 'z'], ['x', 'e', 'z'], ['x', 'y', 'f'],
['u', 'y', 'z'], ['x', 'v', 'z'], ['x', 'y', 'w'], ['x', 'y', 'z']]

Here is another way to do using itertools permutations and chain function. You also need to check if the indexes line up and are all the same length, and whether there is more than one element being replaced
from itertools import *
data_small = [ ['a','b','c'], ['d','e','f'] ]
data_big = [ ['a','b','c'], ['d','e','f'], ['u','v','w'], ['x','y','z'] ]
def check(data, sub):
check_for_mul_repl = []
for i in data:
if len(i) != len(data[0]):
return False
for j in i:
if j in sub:
if i.index(j) != sub.index(j):
return False
else:
if i not in check_for_mul_repl:
check_for_mul_repl.append(i)
if len(check_for_mul_repl) <= 2:
return True
print [x for x in list(permutations(chain(*data_big), 3)) if check(data_big, x)]
['a', 'b', 'c'], ['a', 'b', 'f'], ['a', 'b', 'w'], ['a', 'b', 'z'],
['a', 'e', 'c'], ['a', 'e', 'f'], ['a', 'v', 'c'], ['a', 'v', 'w'],
['a', 'y', 'c'], ['a', 'y', 'z'], ['d', 'b', 'c'], ['d', 'b', 'f'],
['d', 'e', 'c'], ['d', 'e', 'f'], ['d', 'e', 'w'], ['d', 'e', 'z'],
['d', 'v', 'f'], ['d', 'v', 'w'], ['d', 'y', 'f'], ['d', 'y', 'z'],
['u', 'b', 'c'], ['u', 'b', 'w'], ['u', 'e', 'f'], ['u', 'e', 'w'],
['u', 'v', 'c'], ['u', 'v', 'f'], ['u', 'v', 'w'], ['u', 'v', 'z'],
['u', 'y', 'w'], ['u', 'y', 'z'], ['x', 'b', 'c'], ['x', 'b', 'z'],
['x', 'e', 'f'], ['x', 'e', 'z'], ['x', 'v', 'w'], ['x', 'v', 'z'],
['x', 'y', 'c'], ['x', 'y', 'f'], ['x', 'y', 'w'], ['x', 'y', 'z']
This doesn't care if there is more than one element being replaced
from itertools import permutations, chain
data_small = [ ['a','b','c'], ['d','e','f'] ]
data_big = [ ['a','b','c'], ['d','e','f'], ['u','v','w'], ['x','y','z'] ]
def check(data, sub):
for i in data:
if len(i) != len(data[0]):
return False
for j in i:
if j in sub:
if i.index(j) != sub.index(j):
return False
return True
#If you really want lists just change the first x to list(x)
print [x for x in list(permutations(chain(*data_big), 3)) if check(data_big, x)]
['a', 'b', 'c'], ['a', 'b', 'f'], ['a', 'b', 'w'], 61 more...
The reason I use permutations instead of combinations is because ('d','b','c') is equal to ('c','b','d') in terms of combinations and not in permutations
If you just want combinations then that's a lot easier. You can just do
def check(data) #Check if all sub lists are same length
for i in data:
if len(i) != len(data[0]):
return False
return True
if check(data_small):
print list(combinations(chain(*data_small), 3))
[('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'b', 'e'), ('a', 'b', 'f'),
('a', 'c', 'd'), ('a', 'c', 'e'), ('a', 'c', 'f'), ('a', 'd', 'e'),
('a', 'd', 'f'), ('a', 'e', 'f'), ('b', 'c', 'd'), ('b', 'c', 'e'),
('b', 'c', 'f'), ('b', 'd', 'e'), ('b', 'd', 'f'), ('b', 'e', 'f'),
('c', 'd', 'e'), ('c', 'd', 'f'), ('c', 'e', 'f'), ('d', 'e', 'f')]

Sorry for being late to the party, but here is the fancy "one--liner" (split over multiple lines for readability) using itertools and the extremely useful new Python 3.5 unpacking generalizations (which, by the way, is a significantly faster and more readable way of converting between iterable types than, say, calling list explicitly) --- and assuming unique elements:
>>> from itertools import permutations, repeat, chain
>>> next([*map(lambda m: [m[i][i] for i in range(a)],
{*permutations((*chain(*map(
repeat, map(tuple, l), repeat(a - 1))),), a)})]
for l in ([['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']],)
for a in (len(l[0]),))
[['g', 'h', 'f'], ['g', 'b', 'i'], ['g', 'b', 'f'],
['d', 'b', 'f'], ['d', 'h', 'f'], ['d', 'h', 'i'],
['a', 'e', 'c'], ['g', 'e', 'i'], ['a', 'h', 'i'],
['a', 'e', 'f'], ['g', 'e', 'c'], ['a', 'b', 'i'],
['g', 'b', 'c'], ['g', 'h', 'c'], ['d', 'h', 'c'],
['d', 'b', 'c'], ['d', 'e', 'c'], ['a', 'b', 'f'],
['d', 'b', 'i'], ['a', 'h', 'c'], ['g', 'e', 'f'],
['a', 'e', 'i'], ['d', 'e', 'i'], ['a', 'h', 'f']]
The use of next on the generator and the last two lines are, of course, just unnecessary exploitations of syntax to put the expression into one line, and I hope people will not use this as an example of good coding practice.
EDIT
I just realised that maybe I should give a short explanation. So, the inner part creates a - 1 copies of each sublist (converted to tuples for hashability and uniqueness testing) and chains them together to allow permutations to do its magic, which is to create all permutations of sublists of a sublists length. These are then converted to a set which gets rid of all duplicates which are bound to occur, and then a map pulls out the ith element of the ith sublist within each unique permutation. Finally, a is the length of the first sublist, since all sublists are assumed to have identical lengths.

Related

Find biggest repetitions of size n and less in list

I have a list of elements like so:
['x', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'g', 'h', 'i', 'i', 'i', 'i']
I would like to find all the "biggest" repetitions of n elements and below and the number of times each sequence is repeated. For example, if n=3:
>>> [(['a', 'b', 'c'], 3), (['g', 'h'], 2), (['i'], 4)]
I also don't want to return (['i', 'i'], 2) since there is a longer sequence involving the element 'i'.
Here is a second condition:
['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'g', 'h', 'i', 'i', 'i', 'i']
>>> [(['a', 'b', 'c'], 3), (['b', 'c'], 2), (['g', 'h'], 2), (['i'], 4)]
Overlapping of elements belonging to 2 different repetitions are accepted.
I was thinking about a solution based on sliding windows of size n and decreasing, keeping track of the already used indices but I doesn't fulfill the first condition.
Is there an efficient way of doing so?

You can create a function:
import re
def counting(x):
d = re.sub(r"(?<=(\w))(?=\1)","\n","\n".join(re.findall(r"(\w+)(?=\1)",''.join(x)))).split()
return [(list(i),d.count(i)+1)for i in set(d)]
Now you can run this function on your data:
m = ['x', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'g', 'h', 'i', 'i', 'i', 'i']
counting(m)
[(['g', 'h'], 2), (['i'], 4), (['a', 'b', 'c'], 3)]
n = ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'g', 'h', 'i', 'i', 'i', 'i']
counting(n)
[(['g', 'h'], 2), (['i'], 4), (['a', 'b', 'c'], 3), (['b', 'c'], 2)]

You can use a regex:
>>> li=['x', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'g', 'h', 'i', 'i', 'i', 'i']
>>> [(t[0],''.join(t).count(t[0])) for t in re.findall(r'(\w+)(\1+)', ''.join(li))]
[('abc', 3), ('gh', 2), ('ii', 2)]
Or,
>>> [(list(t[0]),''.join(t).count(t[0])) for t in re.findall(r'(\w+)(\1+)', ''.join(li))
[(['a', 'b', 'c'], 3), (['g', 'h'], 2), (['i', 'i'], 2)]

Find all combinations of a list with different elements removed each time

I need a way to find all combinations from two different lists where each element may or may not be null. For example with the two lists I'd like to call a function that returns a list of all these combinations:
a = ['A', 'S', 'B']
b = ['A', 'B']
find_combinations(a, b)
Should return:
[['A', 'S', 'B'], ['A', 'S'], ['S', 'B'], ['S']]
I can do this trivial example but if the list is more complicated say ['A', 'B', 'S', 'A', 'A'] then the possible options are much more complicated:
[['A', 'B', 'S', 'A', 'A'],
['A', 'B', 'S', 'A'],
['A', 'B', 'S', 'A'],
['B', 'S', 'A', 'A']
... etc.

I will assume the elements of b are hashable. In that case you can use the following code:
def without(a,i,bi,l):
if i >= len(a):
yield tuple(l)
else:
l.append(a[i])
for result in without(a,i+1,bi,l):
yield result
l.pop()
if a[i] in bi:
for result in without(a,i+1,bi,l):
yield result
def find_combinations(a, b):
for result in without(a,0,set(b),[]):
yield result
Here we first convert the b into a set to boost performance. This is strictly speaking not necessary. Then we use a recursive algorithm where for each element a[i] in a that is in b, we have a decision point whether to or not to include it in the result (that's why we perform the recursion again when that element is popped). When we reach the end of the list, we convert our running list l into a tuple(..). You can also use list(..) to convert it into a list.
We use a running list to boost performance a bit since concatenating two lists is done in O(n) whereas the running list can append(..) and pop(..) in O(1) amortized cost.
This will produce a generator of tuples. You can materialize the outcome of each generator with list(..) like:
>>> list(find_combinations(['A','S','B'],['A','B']))
[('A', 'S', 'B'), ('A', 'S'), ('S', 'B'), ('S',)]
>>> list(find_combinations(['A', 'B', 'S', 'A', 'A'],['A','B']))
[('A', 'B', 'S', 'A', 'A'), ('A', 'B', 'S', 'A'), ('A', 'B', 'S', 'A'), ('A', 'B', 'S'), ('A', 'S', 'A', 'A'), ('A', 'S', 'A'), ('A', 'S', 'A'), ('A', 'S'), ('B', 'S', 'A', 'A'), ('B', 'S', 'A'), ('B', 'S', 'A'), ('B', 'S'), ('S', 'A', 'A'), ('S', 'A'), ('S', 'A'), ('S',)]
In case lists are required, you can use map(list,..) to convert them to lists, like:
>>> list(map(list,find_combinations(['A','S','B'],['A','B'])))
[['A', 'S', 'B'], ['A', 'S'], ['S', 'B'], ['S']]
>>> list(map(list,find_combinations(['A', 'B', 'S', 'A', 'A'],['A','B'])))
[['A', 'B', 'S', 'A', 'A'], ['A', 'B', 'S', 'A'], ['A', 'B', 'S', 'A'], ['A', 'B', 'S'], ['A', 'S', 'A', 'A'], ['A', 'S', 'A'], ['A', 'S', 'A'], ['A', 'S'], ['B', 'S', 'A', 'A'], ['B', 'S', 'A'], ['B', 'S', 'A'], ['B', 'S'], ['S', 'A', 'A'], ['S', 'A'], ['S', 'A'], ['S']]

After playing around I found an alternative way to Willem Van Onsem's answer. Not quite as clean but also works.
def empty_derivations(rule, empty_list):
returned = []
returned.append(rule)
for element in returned:
for num, char in enumerate(element):
temp = element[:]
if char in empty_list:
del temp[num]
returned.append(temp)
return_list = []
for element in returned:
if element not in return_list:
return_list.append(element)
return return_list
When called gives:
>>> a = empty_derivations(['A', 'B', 'S', 'A', 'A'], ['A', 'B'])
>>> print(a)
[['A', 'B', 'S', 'A', 'A'],
['B', 'S', 'A', 'A'],
['A', 'S', 'A', 'A'],
['A', 'B', 'S', 'A'],
['S', 'A', 'A'],
['B', 'S', 'A'],
['A', 'S', 'A'],
['A', 'B', 'S'],
['S', 'A'],
['B', 'S'],
['A', 'S'],
['S']]

Merging nested lists with overlap

I have to merge nested lists which have an overlap. I keep thinking that there has to be an intelligent solution using list comprehensions and probably difflib, but I can't figure out how it should work.
My lists look like this:
[['C', 'x', 'F'], ['A', 'D', 'E']]
and
[['x', 'F', 'G', 'x'], ['D', 'E', 'H', 'J']].
They are above another, like rows in a matrix. Therefore, they have overlap (in the form of
[['x', 'F'], ['D', 'E']]).
A merge should yield:
[['C', 'x', 'F', 'G', 'x'], ['A', 'D', 'E', 'H', 'J']].
How can I achieve this?

You can try something like this.
list1 = [['C', 'x', 'F'], ['A', 'D', 'E']]
list2 = [['x', 'F', 'G', 'x'], ['D', 'E', 'H', 'J']]
for x in range(len(list1)):
for element in list2[x]:
if element not in list1[x]:
list1[x].append(element)
print list1[x]
Output:
['C', 'x', 'F', 'G']
['A', 'D', 'E', 'H', 'J']
I hope this helps you.

Python 2d list sort by colum header

I have a 2d list:
['C', 'A', 'D', 'B'], #header row
['F', 'C', 'F', 'E'], #row 1
['F', 'E', 'F', 'F'], #row 2
['B', 'A', 'F', 'A'],
['A', 'F', 'C', 'E'],
['F', 'F', 'E', 'E'],
['C', 'E', 'F', 'C']
The first row in the list is the header row: C, A, D, B.
How can I change this so that the columns are in order so it will appear:
['A', 'B', 'C', 'D'], #header row
['C', 'E', 'F', 'F'], #row 1
['E', 'F', 'F', 'F'], #row 2
['A', 'A', 'B', 'F'],
['F', 'E', 'A', 'C'],
['F', 'E', 'F', 'E'],
['E', 'C', 'C', 'F']
I want the headers to be in order A,B,C,D but also move the columns underneath with the header

You can use sorted and zip, first create a list of your column with zip(*a) then sort it (based on first index) with sorted, and again convert to first state with zip and convert the indices to list with map :
>>> map(list,zip(*sorted(zip(*a))))
[['A', 'B', 'C', 'D'],
['C', 'E', 'F', 'F'],
['E', 'F', 'F', 'F'],
['A', 'A', 'B', 'F'],
['F', 'E', 'A', 'C'],
['F', 'E', 'F', 'E'],
['E', 'C', 'C', 'F']]

You could use numpy.
>>> import numpy as np
>>> data = np.array([['C', 'A', 'D', 'B'], #header row
... ['F', 'C', 'F', 'E'], #row 1
... ['F', 'E', 'F', 'F'], #row 2
... ['B', 'A', 'F', 'A'],
... ['A', 'F', 'C', 'E'],
... ['F', 'F', 'E', 'E'],
... ['C', 'E', 'F', 'C']])
>>> data[:,np.argsort(data[0])] # select sorted indices of first row as column indices.
array([['A', 'B', 'C', 'D'],
['C', 'E', 'F', 'F'],
['E', 'F', 'F', 'F'],
['A', 'A', 'B', 'F'],
['F', 'E', 'A', 'C'],
['F', 'E', 'F', 'E'],
['E', 'C', 'C', 'F']],
dtype='|S1')

python compare values of the sublists of a list and add up values

I have a list with an unknown number of sublists.
I want to compare always one sublist with ALL other sublists.
If the values of the first sublist at position 0, 3 and 5 are equal to any other sublist, I want to to add together the values at position 7 in all matching lists.
And then add this first sublist with the (newly added up value) at position 7 to a new list.
list = [['A', 'a', 'b', 'B', 'c', 'd', 'C', '1'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '3'],
['D', 'r', 's', 'E', 't', 'u', 'F', '2'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '2'],
['D', 'r', 's', 'E', 't', 'u', 'F', '2'],.....]
wanted Output:
new_list = [['A', 'a', 'b', 'B', 'c', 'd', 'C', '6'],
['D', 'r', 's', 'E', 't', 'u', 'F', 4],...]
I wrote this code
def Inter(list):
a = 0
b = 3
c = 5
d = 0
x = []
for i in range(len(list)):
for y in range(len(list)):
if list[i][a] == list[y][a] and list[i][b] == list[y][b] and list[i][c] == list[y][c]:
IntSumtemp = []
IntSumtemp.append(str(float(list[i][7]) + float(list[y][7])))
x.append(list[i] + IntSumtemp)
del (x[d][7])
d +=1
else: None
return x
new_list= Inter(list)
but it gave this output:
new_list= [['A', 'a', 'b', 'B', 'c', 'd', 'C', '2.0'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '4.0'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '3.0'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '4.0'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '6.0'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '5.0'],
['D', 'r', 's', 'E', 't', 'u', 'F', '4.0'],
['D', 'r', 's', 'E', 't', 'u', 'F', '4.0'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '3.0'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '5.0'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '4.0'],
['D', 'r', 's', 'E', 't', 'u', 'F', '4.0'],
['D', 'r', 's', 'E', 't', 'u', 'F', '4.0']]
Can someone help me with this, please?
(Sorry, I am a absolute beginner, so if anything is unclear, plz ask...)

This approach is O(N) as opposed to yours which was O(N^2)
from collections import OrderedDict
from operator import itemgetter
items = [['A', 'a', 'b', 'B', 'c', 'd', 'C', '1'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '3'],
['D', 'r', 's', 'E', 't', 'u', 'F', '2'],
['A', 'a', 'b', 'B', 'c', 'd', 'C', '2'],
['D', 'r', 's', 'E', 't', 'u', 'F', '2']]
key = itemgetter(0, 3, 5)
d = OrderedDict()
for x in items:
d.setdefault(key(x), x[:7] + [0])[7] += int(x[7])
print d.values()
[['A', 'a', 'b', 'B', 'c', 'd', 'C', 6], ['D', 'r', 's', 'E', 't', 'u', 'F', 4]]

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Combine all elements of n lists in python [closed] - python

Related

Find biggest repetitions of size n and less in list

Find all combinations of a list with different elements removed each time

Merging nested lists with overlap

Python 2d list sort by colum header

python compare values of the sublists of a list and add up values

Categories

Resources