creating tuple with repeating elements - python

I am trying to create tuples of the following kind:
('a', 0), ('b', 0), ('a', 1), ('b', 1), ('a', 2), ('b', 2), ('a', 3), ('b', 3)
from the list
A = ['a','b'] and the numbers 0 through 3.
What is a good Pythonic way to write this, as I am ending with a real for loop here?

Use itertools.product.
from itertools import product
tuples = list(product(['a', 'b'], [0, 1, 2, 3]))
print(tuples) # [('a', 0), ('a', 1), ..., ('b', 0), ('b', 1), ...]
If you need them in the exact order you originally specified, then:
tuples = [(let, n) for n, let in product([0, 1, 2, 3], ['a', 'b'])]
If your comment that "I am ending with a real for loop here" means you ultimately just want to iterate over these elements, then:
for n, let in product([0, 1, 2, 3], ['a', 'b']):
    tup = (let, n) # possibly unnecessary, depending on what you're doing
    ''' your code here '''

You could opt for itertools.product to get the Cartesian product you're looking for. If the element order isn't of significance, then we have
>>> from itertools import product
>>> list(product(A, range(4)))
[('a', 0),
('a', 1),
('a', 2),
('a', 3),
('b', 0),
('b', 1),
('b', 2),
('b', 3)]
If you need that particular order,
>>> list(tuple(reversed(x)) for x in product(range(4), A))
[('a', 0),
('b', 0),
('a', 1),
('b', 1),
('a', 2),
('b', 2),
('a', 3),
('b', 3)]

L = range(0, 4)
K = ['a', 'b']
L3 = [(i, j) for i in K for j in L]
print(L3)
OUTPUT
[('a', 0), ('a', 1), ('a', 2), ('a', 3), ('b', 0), ('b', 1), ('b', 2), ('b', 3)]
If you wish to use list comprehension... other answers are correct as well

Use list comprehension
>>> [(a,n) for a in list1 for n in range(4)]
[('a', 0), ('a', 1), ('a', 2), ('a', 3), ('b', 0), ('b', 1), ('b', 2), ('b', 3)]
If order matters:
>>> [(a,n) for n in range(4) for a in list1]
[('a', 0), ('b', 0), ('a', 1), ('b', 1), ('a', 2), ('b', 2), ('a', 3), ('b', 3)]
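As a quick cross-check, here is a minimal sketch confirming that the comprehension and the swapped product call agree on the requested order. The names letters and numbers are placeholders standing in for A / list1 and the range used above.

```python
from itertools import product

letters = ['a', 'b']   # placeholder for A / list1 in the answers above
numbers = range(4)

# Comprehension: the outer loop (numbers) varies slowest.
by_comprehension = [(let, n) for n in numbers for let in letters]

# product: the first argument varies slowest, so pass numbers first and swap.
by_product = [(let, n) for n, let in product(numbers, letters)]

print(by_comprehension == by_product)  # True
print(by_product)
```

In both forms, whichever sequence is iterated outermost changes least often, which is why the numbers have to come first to get ('a', 0), ('b', 0), ('a', 1), ...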

Related

Remove elements from tuple array that have same value in first index position of each element

Let's say I have a list:
t = [('a', 1), ('a', 6), ('b', 2), ('c', 3), ('c', 5), ('d', 4)]
There are two tuples with 'a' as the first element, and two tuples with 'c' as the first element. I want to only keep the first instance of each, so I end up with:
t = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
How can I achieve that?
You can use a dictionary to help you filter the duplicate keys:
>>> t = [('a', 1), ('a', 6), ('b', 2), ('c', 3), ('c', 5), ('d', 4)]
>>> d = {}
>>> for x, y in t:
...     if x not in d:
...         d[x] = y
...
>>> d
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> t = list(d.items())
>>> t
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]
@MrGeek's answer is good, but if you do not want to use a dictionary, you could simply do something like this:
>>> t = [('a', 1), ('a', 6), ('b', 2), ('c', 3), ('c', 5), ('d', 4)]
>>> already_seen = []
>>> for e in list(t):  # loop over a copy: removing from t while iterating over it skips elements
...     if e[0] not in already_seen:
...         already_seen.append(e[0])
...     else:
...         t.remove(e)
...
>>> t
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]
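@gold_cy's comment is the easiest way:
You can use itertools.groupby to group your data. The key parameter groups by the first element of each tuple. Note that groupby only merges adjacent items, so tuples with the same first element must already sit next to each other, as they do in t.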
import itertools as it
t = [list(my_iterator)[0] for g, my_iterator in it.groupby(t, key=lambda x: x[0])]
Output:
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]
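If equal first elements are not adjacent, groupby treats each run as a separate group. The sketch below contrasts that with a dictionary pass that keeps the first occurrence regardless of position. The helper name first_by_key and the list unsorted_t are made up for illustration, and the approach assumes Python 3.7+, where dicts preserve insertion order.

```python
from itertools import groupby

def first_by_key(pairs):
    """Keep the first tuple seen for each first element, preserving order."""
    seen = {}
    for pair in pairs:
        seen.setdefault(pair[0], pair)  # stores only the first occurrence
    return list(seen.values())

unsorted_t = [('a', 1), ('b', 2), ('a', 6), ('c', 3)]

# groupby sees two separate runs of 'a', so the duplicate survives.
via_groupby = [next(group) for _, group in groupby(unsorted_t, key=lambda x: x[0])]
print(via_groupby)               # [('a', 1), ('b', 2), ('a', 6), ('c', 3)]
print(first_by_key(unsorted_t))  # [('a', 1), ('b', 2), ('c', 3)]
```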

How to readably and efficiently create a dictionary of iterable combinations with a tuple of their indices as keys?

I have some code that works, but it is hard to read at best, and it feels inefficient because it uses two list comprehensions where a single one should suffice.
What I need is to create a dictionary of all n combinations of the letters in alpha, with the key to the dictionary for each item being a tuple of the indices in alpha for the elements in the combination. This should work for any n:
n=2
from itertools import combinations
alpha = "abcde"
n = 2
D = {tuple([c_i[0] for c_i in comb]): tuple([c_i[1] for c_i in comb])
     for comb in combinations(enumerate(alpha), n)}
>>>{(0, 1): ('a', 'b'),
(0, 2): ('a', 'c'),
(0, 3): ('a', 'd'),
(0, 4): ('a', 'e'),
(1, 2): ('b', 'c'),
(1, 3): ('b', 'd'),
(1, 4): ('b', 'e'),
(2, 3): ('c', 'd'),
(2, 4): ('c', 'e'),
(3, 4): ('d', 'e')}
n=3
from itertools import combinations
alpha = "abcde"
n = 3
D = {tuple([c_i[0] for c_i in comb]): tuple([c_i[1] for c_i in comb])
     for comb in combinations(enumerate(alpha), n)}
>>>{(0, 1, 2): ('a', 'b', 'c'),
(0, 1, 3): ('a', 'b', 'd'),
(0, 1, 4): ('a', 'b', 'e'),
(0, 2, 3): ('a', 'c', 'd'),
(0, 2, 4): ('a', 'c', 'e'),
(0, 3, 4): ('a', 'd', 'e'),
(1, 2, 3): ('b', 'c', 'd'),
(1, 2, 4): ('b', 'c', 'e'),
(1, 3, 4): ('b', 'd', 'e'),
(2, 3, 4): ('c', 'd', 'e')}
This is working as desired, but I want to know if there is a more readable implementation, or one where I don't need a separate comprehension for [c_i[0] for c_i in comb] and [c_i[1] for c_i in comb] as this feels inefficient.
Note: this is a minimal case representation of a more complex problem where the elements of alpha are arguments to an expensive function and I want to store the output of f(alpha[i], alpha[j], alpha[k]) in a dictionary for ease of lookup without recomputation: ans = D[(i, j, k)]
Try this: (I feel it's a lot less complicated than the other answer, but that one works well too)
from itertools import combinations
alpha = "abcde"
n = 2
print({key: tuple([alpha[i] for i in key]) for key in combinations(range(len(alpha)), n)})
Output:
{(0, 1): ('a', 'b'), (0, 2): ('a', 'c'), (0, 3): ('a', 'd'), (0, 4): ('a', 'e'), (1, 2): ('b', 'c'), (1, 3): ('b', 'd'), (1, 4): ('b', 'e'), (2, 3): ('c', 'd'), (2, 4): ('c', 'e'), (3, 4): ('d', 'e')}
One way to avoid the seemingly redundant tuple key-value formation is to use zip with an assignment expression (this requires Python 3.8 or later):
from itertools import combinations
alpha = "abcde"
n = 2
D = {(k:=list(zip(*comb)))[0]:k[1] for comb in combinations(enumerate(alpha), n)}
Output:
{(0, 1): ('a', 'b'), (0, 2): ('a', 'c'), (0, 3): ('a', 'd'), (0, 4): ('a', 'e'), (1, 2): ('b', 'c'), (1, 3): ('b', 'd'), (1, 4): ('b', 'e'), (2, 3): ('c', 'd'), (2, 4): ('c', 'e'), (3, 4): ('d', 'e')}
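For the use case described in the note (caching the result of an expensive function under its index tuple), the index-based comprehension extends naturally. This is only a sketch: f here is a hypothetical stand-in for the expensive function, not anything from the question.

```python
from itertools import combinations

def f(*args):
    """Hypothetical stand-in for the expensive function."""
    return "".join(args).upper()

alpha = "abcde"
n = 3

# Key: tuple of indices; value: f applied to the elements at those indices.
D = {idx: f(*(alpha[i] for i in idx))
     for idx in combinations(range(len(alpha)), n)}

i, j, k = 0, 2, 4
ans = D[(i, j, k)]
print(ans)  # ACE
```

Because combinations yields index tuples in sorted order, lookups need i < j < k; sorting the indices before the lookup covers other orderings when f is symmetric in its arguments.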

Processing combinations but some elements cannot go together [duplicate]

Let's say I have the following elements: ['A', 'B', 1, 2]
My idea is to get the following combinations:
('A', 1)
('A', 2)
('B', 1)
('B', 2)
But these are not all the combinations of the above sequence, e.g. I'm (on purpose) not considering ('A', 'B') or (1, 2).
Using itertools.combinations, of course, gets me all the combinations:
from itertools import combinations
list(combinations(['A', 'B', 1, 2], 2))
# [('A', 'B'), ('A', 1), ('A', 2), ('B', 1), ('B', 2), (1, 2)]
It's possible for me to internally group the elements that cannot go together:
elems = [('A', 'B'), (1, 2)]
However, combinations does not expect iterables inside other iterables, so the outcome is not really unexpected: [(('A', 'B'), (1, 2))]. Not what I want, nonetheless.
What's the best way to achieve this?
You can use itertools.product to get the cartesian product of two lists:
from itertools import product
elems = [('A', 'B'), (1, 2)]
list(product(*elems))
# [('A', 1), ('A', 2), ('B', 1), ('B', 2)]
You can use itertools.product after forming new input with values grouped by type:
from itertools import product as prd, groupby as gb
d = ['A', 'B', 1, 2]
result = list(prd(*[list(b) for _, b in gb(sorted(d, key=lambda x:str(type(x)), reverse=True), key=type)]))
Output:
[('A', 1), ('A', 2), ('B', 1), ('B', 2)]
This solution will create new sublists grouped by data type, enabling robustness for future input and/or flexibility in element ordering in d:
d = ['A', 1, 'B', 2, (1, 2), 'C', 3, (3, 4), (4, 5)]
result = list(prd(*[list(b) for _, b in gb(sorted(d, key=lambda x:str(type(x)), reverse=True), key=type)]))
Output:
[((1, 2), 'A', 1), ((1, 2), 'A', 2), ((1, 2), 'A', 3), ((1, 2), 'B', 1), ((1, 2), 'B', 2), ((1, 2), 'B', 3), ((1, 2), 'C', 1), ((1, 2), 'C', 2), ((1, 2), 'C', 3), ((3, 4), 'A', 1), ((3, 4), 'A', 2), ((3, 4), 'A', 3), ((3, 4), 'B', 1), ((3, 4), 'B', 2), ((3, 4), 'B', 3), ((3, 4), 'C', 1), ((3, 4), 'C', 2), ((3, 4), 'C', 3), ((4, 5), 'A', 1), ((4, 5), 'A', 2), ((4, 5), 'A', 3), ((4, 5), 'B', 1), ((4, 5), 'B', 2), ((4, 5), 'B', 3), ((4, 5), 'C', 1), ((4, 5), 'C', 2), ((4, 5), 'C', 3)]
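A possible alternative that avoids the sort is to collect the elements into a dictionary keyed by type and take the product of the resulting groups. This is a sketch only: product_by_type is a made-up helper name, and the groups appear in first-seen order rather than in the sorted order used above.

```python
from itertools import product

def product_by_type(items):
    """Group items by their exact type, then take the Cartesian product of the groups."""
    groups = {}
    for item in items:
        groups.setdefault(type(item), []).append(item)
    # dicts keep insertion order (Python 3.7+), so groups follow first appearance
    return list(product(*groups.values()))

print(product_by_type(['A', 'B', 1, 2]))
# [('A', 1), ('A', 2), ('B', 1), ('B', 2)]
print(product_by_type([1, 'A', 2, 'B']))
# [(1, 'A'), (1, 'B'), (2, 'A'), (2, 'B')]
```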

How to sort list of tuples by several keys

I am doing an exercise on Python and lists with one problem:
I have a list of tuples sorted by second key:
[('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
I need to sort it by the second value numerically and by the first value alphabetically, so that it looks like:
[('a', 3), ('d', 3), ('f', 3), ('b', 2), ('c', 2)]
When I used the sorted function I got:
[('a', 3), ('b', 2), ('c', 2), ('d', 3), ('f', 3)]
It sorted by first element (and I lost arrangement of second). I also tried to use key:
def getKey(item):
    return item[0]
a = sorted(lis, key=getKey)
And it didn't help me either.
When you have a list of tuples, the key parameter of the sort method lets you specify which element of each tuple pair to sort by. With a single-element key you sort either in alphabetical order or in numerical order.
If you want to sort by increasing numerical order:
alist = [('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
alist.sort(key=lambda pair: pair[1])
If you want to sort by decreasing numerical order:
alist = [('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
alist.sort(key = lambda pair: pair[1], reverse=True)
If you want to sort by alphabetical order:
alist = [('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
alist.sort()
Reverse alphabetical order:
alist = [('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
alist.sort(reverse = True)
The key parameter lets you specify which element of the tuple pair to sort by.
A key that returns a single element cannot sort by both alphabetical and numerical order at once; for that, the key has to return a tuple.
l = [('f', 4), ('b', 4), ('a', 4), ('c', 3), ('k', 1)]
l.sort(key=lambda x:(-x[1],x[0]))
print(l)
[('a', 4), ('b', 4), ('f', 4), ('c', 3), ('k', 1)]
The key returns a tuple of two values. The first, -x[1], uses the negative sign to sort the numbers from highest to lowest; ties are then broken by x[0], which sorts from lowest to highest, i.e. a-z.
Correct answer:
l = [('f', 4), ('b', 4), ('a', 4), ('c', 3), ('k', 1)]
l.sort(key=lambda x:(-x[1],x[0]))
print(l)
Result:
[('a', 4), ('b', 4), ('f', 4), ('c', 3), ('k', 1)]
def getKey(item):
    return item[0]
This returns the first element of the tuple, so the list is sorted by the first tuple element only. You want to sort by the second element in descending order, then by the first, so the key has to return both, with the number negated. Your key function would then need to be:
def getKey(item):
    return -item[1], item[0]
Making your final call:
>>> sorted(lis, key=getKey)
[('a', 3), ('d', 3), ('f', 3), ('b', 2), ('c', 2)]
The sort() method is stable. Call it twice, first for the secondary key (alphabetically), then for the primary key (the number):
>>> lst = [('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
>>> lst.sort()
>>> lst.sort(key=lambda kv: kv[1], reverse=True)
>>> lst
[('a', 3), ('d', 3), ('f', 3), ('b', 2), ('c', 2)]
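Negating a key only works for numbers, whereas the two-pass stable sort also handles a descending key that cannot be negated, such as a string. The sketch below wraps the two passes in a helper (sort_desc_then_asc is a made-up name) and checks it against the tuple-key approach; the words list is invented sample data.

```python
from operator import itemgetter

def sort_desc_then_asc(pairs):
    """Sort by second element descending, then first element ascending."""
    result = sorted(pairs, key=itemgetter(0))     # secondary key first
    result.sort(key=itemgetter(1), reverse=True)  # then primary key; stability keeps ties in order
    return result

nums = [('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
print(sort_desc_then_asc(nums) == sorted(nums, key=lambda x: (-x[1], x[0])))  # True

words = [('f', 'high'), ('a', 'low'), ('d', 'high'), ('b', 'low')]
print(sort_desc_then_asc(words))
# [('a', 'low'), ('b', 'low'), ('d', 'high'), ('f', 'high')]
```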

Python list sort by size of group

I have a group of items that are labeled like item_labels = [('a', 3), ('b', 2), ('c', 1), ('d', 3), ('e', 2), ('f', 3)]
I want to sort them by the size of group. e.g., label 3 has size 3 and label 2 has size 2 in the above example.
I tried using a combination of groupby and sorted but didn't work.
In [162]: sil = sorted(item_labels, key=op.itemgetter(1))
In [163]: sil
Out[163]: [('c', 1), ('b', 2), ('e', 2), ('a', 3), ('d', 3), ('f', 3)]
In [164]: g = itt.groupby(sil, key=op.itemgetter(1))
In [165]: for k, v in g:
.....:     print k, list(v)
.....:
.....:
1 [('c', 1)]
2 [('b', 2), ('e', 2)]
3 [('a', 3), ('d', 3), ('f', 3)]
In [166]: sg = sorted(g, key=lambda x: len(list(x[1])))
In [167]: sg
Out[167]: [] # not sure why I got an empty list here
I can always write some tedious for-loop to do this, but I would rather find something more elegant. Any suggestions? If there are libraries that are useful (e.g., pandas, scipy), I would be happy to use them.
In python2.7 and above, use Counter:
from collections import Counter
c = Counter(y for _, y in item_labels)
item_labels.sort(key=lambda t : c[t[1]])
In python2.6, for our purpose, this Counter constructor can be implemented using defaultdict (as suggested by @perreal) this way:
from collections import defaultdict
def Counter(x):
    d = defaultdict(int)
    for v in x:
        d[v] += 1
    return d
Since we are working with numbers only, and assuming the numbers are as low as those in your example, we can actually use a list (which will be compatible with even older versions of Python):
def Counter(x):
    lst = list(x)
    d = [0] * (max(lst) + 1)
    for v in lst:
        d[v] += 1
    return d
Without counter, you can simply do this:
item_labels.sort(key=lambda t : len([x[1] for x in item_labels if x[1]==t[1] ]))
It is slower, but reasonable over short lists.
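One caveat with sorting on the group size alone: when two different labels have the same size, their items keep their original relative order and can end up interleaved. A possible refinement, shown here as a sketch with invented sample data (labels 2 and 5 both have size 2), is to add the label itself as a tie-breaker.

```python
from collections import Counter

# Sample data made up for illustration: labels 2 and 5 both occur twice.
item_labels = [('a', 3), ('b', 2), ('g', 5), ('c', 1),
               ('d', 3), ('e', 2), ('h', 5), ('f', 3)]

sizes = Counter(label for _, label in item_labels)

by_size_only = sorted(item_labels, key=lambda t: sizes[t[1]])
by_size_then_label = sorted(item_labels, key=lambda t: (sizes[t[1]], t[1]))

print(by_size_only)
# [('c', 1), ('b', 2), ('g', 5), ('e', 2), ('h', 5), ('a', 3), ('d', 3), ('f', 3)]
print(by_size_then_label)
# [('c', 1), ('b', 2), ('e', 2), ('g', 5), ('h', 5), ('a', 3), ('d', 3), ('f', 3)]
```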
The reason you've got an empty list is that g is a generator. You can only iterate over it once.
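from collections import defaultdict
import operator
l = [('c', 1), ('b', 2), ('e', 2), ('a', 3), ('d', 3), ('f', 3)]
d = defaultdict(int)
for p in l:
    d[p[1]] += 1  # count how many items carry each label
# i is a (label, count) pair: order the labels by count, then emit the items with that label
print([p for i in sorted(d.items(), key=operator.itemgetter(1))
       for p in l if p[1] == i[0]])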
itertools.groupby returns an iterator, so this for loop: for k, v in g: actually consumed that iterator.
>>> it = iter([1,2,3])
>>> for x in it:pass
>>> list(it) #iterator already consumed by the for-loop
[]
code:
>>> lis = [('a', 3), ('b', 2), ('c', 1), ('d', 3), ('e', 2), ('f', 3)]
>>> from operator import itemgetter
>>> from itertools import groupby
>>> lis.sort(key = itemgetter(1) )
>>> new_lis = [list(v) for k,v in groupby(lis, key = itemgetter(1) )]
>>> new_lis.sort(key = len)
>>> new_lis
[[('c', 1)], [('b', 2), ('e', 2)], [('a', 3), ('d', 3), ('f', 3)]]
To get a flattened list use itertools.chain:
>>> from itertools import chain
>>> list( chain.from_iterable(new_lis))
[('c', 1), ('b', 2), ('e', 2), ('a', 3), ('d', 3), ('f', 3)]
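To make the empty-list result from the question concrete, the following sketch first exhausts a groupby object with a loop, as the printing loop did, and then shows a version that stores each group as a list before sorting. The variable names follow the question; the descending sort at the end is just one way to order the groups.

```python
from itertools import groupby
from operator import itemgetter

sil = [('c', 1), ('b', 2), ('e', 2), ('a', 3), ('d', 3), ('f', 3)]

g = groupby(sil, key=itemgetter(1))
for _ in g:   # a first pass, like the printing loop in the question
    pass
leftover = sorted(g, key=lambda kv: len(list(kv[1])))
print(leftover)  # [] because g is already exhausted

# Materialize each group as a list while iterating, then sort the lists.
groups = [(k, list(v)) for k, v in groupby(sil, key=itemgetter(1))]
groups.sort(key=lambda kv: len(kv[1]), reverse=True)
print([k for k, _ in groups])  # [3, 2, 1]
```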
Same as @perreal's and @Elazar's answers, but with better names:
from collections import defaultdict
size = defaultdict(int)
for _, group_id in item_labels:
    size[group_id] += 1
item_labels.sort(key=lambda item: size[item[1]])  # item is a (name, group_id) pair
print(item_labels)
# -> [('c', 1), ('b', 2), ('e', 2), ('a', 3), ('d', 3), ('f', 3)]
Here is another way:
example = [('a', 3), ('b', 2), ('c', 1), ('d', 3), ('e', 2), ('f', 3)]
out = {}
for t in example:
    out.setdefault(t[1], []).append(t)  # collect the tuples under their label
print(sorted(out.values(), key=len))
Prints:
[[('c', 1)], [('b', 2), ('e', 2)], [('a', 3), ('d', 3), ('f', 3)]]
If you want a flat list:
print([l for s in sorted(out.values(), key=len) for l in s])
[('c', 1), ('b', 2), ('e', 2), ('a', 3), ('d', 3), ('f', 3)]
