List duplicates manipulation

List duplicates manipulation - python

I am very new to python, and I am really lost right now. If anyone can help me, I will appreciate it a lot.
I have a list:
list1 = [((a, b), 2), ((a, b), 5), ((c, d), 1)), ((e, f), 2), ((e, f), 4)]
The output I am looking for is:
output = [((a, b), 7), ((c, d), 1), ((e, f), 6)]
I tried to put it in a dictionary
new_dict = {i: j for i, j in list1}
But it throws me an error
Maybe there are other ways?

Find the explanation in the code comments
list1 = [(('a', 'b'), 2), (('a', 'b'), 5), (('c', 'd'), 1), (('e', 'f'), 2), (('e', 'f'), 4)]
# let's create an empty dictionary
output = {}
# ('a', 'b') is a tuple and tuple is hashable so we can use it as dictionary key
# iterate over the list1
for i in list1:
# for each item check if i[0] exist in output
if i[0] in output:
# if yes just add i[1]
output[i[0]] += i[1]
else:
# create new key
output[i[0]] = i[1]
# finally print the dictionary
final_output = list(output.items())
print(final_output)
[(('a', 'b'), 7), (('c', 'd'), 1), (('e', 'f'), 6)]

You can use {}.get in this fashion:
list1 = [(('a', 'b'), 2), (('a', 'b'), 5), (('c', 'd'), 1), (('e', 'f'), 2), (('e', 'f'), 4)]
di={}
for t in list1:
di[t[0]]=di.get(t[0],0)+t[1]
>>> di
{('a', 'b'): 7, ('c', 'd'): 1, ('e', 'f'): 6}
You can also use a Counter:
from collections import Counter
c=Counter({t[0]:t[1] for t in list1})
>>> c
Counter({('a', 'b'): 5, ('e', 'f'): 4, ('c', 'd'): 1})
Then to turn either of those into a list of tuples (as you have) you use list and {}.items():
>>> list(c.items())
[(('a', 'b'), 5), (('c', 'd'), 1), (('e', 'f'), 4)]

list1 = [(('a', 'b'), 2), (('a', 'b'), 5), (('c', 'd'), 1), (('e', 'f'), 2), (('e', 'f'), 4)]
sorted_dict = {}
for ele in list1:
if ele[0] in sorted_dict:
sorted_dict[ele[0]] += ele[1]
else:
sorted_dict[ele[0]] = ele[1]
print(sorted_dict)

Related

Remove elements from tuple array that have same value in first index position of each element

Lets say I have a list:
t = [('a', 1), ('a', 6), ('b', 2), ('c', 3), ('c', 5), ('d', 4)]
There are two tuples with 'a' as the first element, and two tuples with 'c' as the first element. I want to only keep the first instance of each, so I end up with:
t = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
How can I achieve that?

You can use a dictionary to help you filter the duplicate keys:
>>> t = [('a', 1), ('a', 6), ('b', 2), ('c', 3), ('c', 5), ('d', 4)]
>>> d = {}
>>> for x, y in t:
... if x not in d:
... d[x] = y
...
>>> d
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> t = list(d.items())
>>> t
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]

#MrGeek's answer is good, but if you do not want to use a dictionary, you could do something simply like this:
>>> t = [('a', 1), ('a', 6), ('b', 2), ('c', 3), ('c', 5), ('d', 4)]
>>> already_seen = []
>>> for e in t:
... if e[0] not in already_seen:
... already_seen.append(e[0])
... else:
... t.remove(e)
...
>>> t
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]

#gold_cy's Comment is the easiest way:
You can use itertools.groupby in order to group your data. We use key param to group by the first element of each tuple.
import itertools as it
t = [list(my_iterator)[0] for g, my_iterator in it.groupby(t, key=lambda x: x[0])]
Output:
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]

How to readably and efficeintly create a dictionary of iterable combinations with a tuple of their indices as keys?

So i have some code that works, but it is at best hard to read, and I feel inefficient as it uses two list comprehensions where a single one should suffice.
What I need is to create a dictionary of all n combinations of the letters in alpha, with the key to the dictionary for each item being a tuple of the indices in alpha for the elements in the combination. This should work for any n:
n=2
from itertools import combinations
alpha = "abcde"
n = 2
D = {tuple([c_i[0] for c_i in comb]): tuple([c_i[1] for c_i in comb])
for comb in combinations(enumerate(alpha), n)}
>>>{(0, 1): ('a', 'b'),
(0, 2): ('a', 'c'),
(0, 3): ('a', 'd'),
(0, 4): ('a', 'e'),
(1, 2): ('b', 'c'),
(1, 3): ('b', 'd'),
(1, 4): ('b', 'e'),
(2, 3): ('c', 'd'),
(2, 4): ('c', 'e'),
(3, 4): ('d', 'e')}
n=3
from itertools import combinations
alpha = "abcde"
n = 3
D = {tuple([c_i[0] for c_i in comb]): tuple([c_i[1] for c_i in comb])
for comb in combinations(enumerate(alpha), n)}
>>>{(0, 1, 2): ('a', 'b', 'c'),
(0, 1, 3): ('a', 'b', 'd'),
(0, 1, 4): ('a', 'b', 'e'),
(0, 2, 3): ('a', 'c', 'd'),
(0, 2, 4): ('a', 'c', 'e'),
(0, 3, 4): ('a', 'd', 'e'),
(1, 2, 3): ('b', 'c', 'd'),
(1, 2, 4): ('b', 'c', 'e'),
(1, 3, 4): ('b', 'd', 'e'),
(2, 3, 4): ('c', 'd', 'e')}
This is working as desired, but I want to know if there is a more readable implementation, or one where I don't need a separate comprehension for [c_i[0] for c_i in comb] and [c_i[1] for c_i in comb] as this feels inefficient.
Note: this is a minimal case representation of a more complex problem where the elements of alpha are arguments to an expensive function and I want to store the output of f(alpha[i], alpha[j], alpha[k]) in a dictionary for ease of lookup without recomputation: ans = D[(i, j, k)]

Try this: (I feel it's a lot less complicated than the other answer, but that one works well too)
from itertools import combinations
alpha = "abcde"
n = 2
print({key: tuple([alpha[i] for i in key]) for key in combinations(range(len(alpha)), n)})
Output:
{(0, 1): ('a', 'b'), (0, 2): ('a', 'c'), (0, 3): ('a', 'd'), (0, 4): ('a', 'e'), (1, 2): ('b', 'c'), (1, 3): ('b', 'd'), (1, 4): ('b', 'e'), (2, 3): ('c', 'd'), (2, 4): ('c', 'e'), (3, 4): ('d', 'e')}

One way to avoid the seemingly redundant tuple key-value formation is to use zip with an assignment expression:
from itertools import combinations
alpha = "abcde"
n = 2
D = {(k:=list(zip(*comb)))[0]:k[1] for comb in combinations(enumerate(alpha), n)}
Output:
{(0, 1): ('a', 'b'), (0, 2): ('a', 'c'), (0, 3): ('a', 'd'), (0, 4): ('a', 'e'), (1, 2): ('b', 'c'), (1, 3): ('b', 'd'), (1, 4): ('b', 'e'), (2, 3): ('c', 'd'), (2, 4): ('c', 'e'), (3, 4): ('d', 'e')}

How to remove list items depending on predecessor in python

Given a Python list, I want to remove consecutive 'duplicates'. The duplicate value however is a attribute of the list item (In this example, the tuple's first element).
Input:
[(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')]
Desired Output:
[(1, 'a'), (2, 'b'), (3, 'd'), (2, 'e')]
Cannot use set or dict, because order is important.
Cannot use list comprehension [x for x in somelist if not determine(x)], because the check depends on predecessor.
What I want is something like:
mylist = [...]
for i in range(len(mylist)):
if mylist[i-1].attr == mylist[i].attr:
mylist.remove(i)
What is the preferred way to solve this in Python?

You can use itertools.groupby (demonstration with more data):
from itertools import groupby
from operator import itemgetter
data = [(1, 'a'), (2, 'a'), (2, 'b'), (3, 'a'), (4, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (3, 'a')]
[next(group) for key, group in groupby(data, key=itemgetter(0))]
Output:
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (2, 'a'), (3, 'a')]
For completeness, an iterative approach based on other answers:
result = []
for first, second in zip(data, data[1:]):
if first[0] != second[0]:
result.append(first)
result
Output:
[(1, 'a'), (2, 'b'), (3, 'a'), (4, 'a'), (2, 'a')]
Note that this keeps the last duplicate, instead of the first.

In order to remove consecutive duplicates, you could use itertools.groupby:
l = [(1, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
from itertools import groupby
[tuple(k) for k, _ in groupby(l)]
# [(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]

If I am not mistaken, you only need to lookup the last value.
test = [(1, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (4, 'a'),(3, 'a'),(4,"a"),(4,"a")]
result = []
for i in test:
if result and i[0] == result[-1][0]: #edited since OP considers (1,"a") and (1,"b") as duplicate
#if result and i == result[-1]:
continue
else:
result.append(i)
print (result)
Output:
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (3, 'a'), (4, 'a')]

If you just want to stick to list comprehension, you can use something like this:
>>> li = [(1, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (2, 'a')]
>>> [li[i] for i in range(len(li)) if not i or li[i] != li[i-1]]
[(1, 'a'), (2, 'a'), (3, 'a'), (2, 'a')]
Please not that not i is the pythonic way of writing i == 0.

You could also use enumerate and a list comprehension:
>>> data = [(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')]
>>> [v for ix, v in enumerate(data) if not ix or v[0] != data[ix-1][0]]
[(1, 'a'), (2, 'b'), (3, 'd'), (2, 'e')]

I'd change Henry Yik's proposal a little bit, making it a bit simpler. Not sure if I am missing something.
inputList = [(1, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (2, 'a')]
outputList = []
lastItem = None
for item in inputList:
if not item == lastItem:
outputList.append(item)
lastItem = item
print(outputList)

You can easily zip the list with itself. Every element, except the first one, is zipped with its predecessor:
>>> L = [(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')]
>>> list(zip(L[1:], L))
[((2, 'b'), (1, 'a')), ((2, 'b'), (2, 'b')), ((2, 'c'), (2, 'b')), ((3, 'd'), (2, 'c')), ((2, 'e'), (3, 'd'))]
The first element is always part of the result, and then you filter the pairs on the condition and return the first element:
>>> [L[0]]+[e for e, f in zip(L[1:], L) if e[0]!=f[0]]
[(1, 'a'), (2, 'b'), (3, 'd'), (2, 'e')]

It's somewhat overkill but you can use 'reduce',too:
from functools import reduce
data=[(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')]
reduce(lambda rslt,t: rslt if rslt[-1][0]==t[0] else rslt+[t], data, [data[0]])
[(1, 'a'), (2, 'b'), (3, 'd'), (2, 'e')]

creating tuple with repeating elements

I am trying to create tuple of following kind:
('a', 0), ('b', 0), ('a', 1), ('b', 1), ('a', 2), ('b', 2), ('a', 3), ('b', 3)
from arrays:
A = ['a','b'] and numbers 0 through 3.
What is good pythonic representation as I am ending with a real for loop here.

Use itertools.product.
from itertools import product
tuples = list(product(['a', 'b'], [0, 1, 2, 3]))
print(tuples) # [('a', 0), ('a', 1), ..., ('b', 0), ('b', 1), ...]
If you need them in the exact order you originally specified, then:
tuples = [(let, n) for n, let in product([0, 1, 2, 3], ['a', 'b'])]
If your comment that "I am ending with a real for loop here" means you ultimately just want to iterate over these elements, then:
for n, let in product([0, 1, 2, 3], ['a', 'b']):
tup = (let, n) # possibly unnecessary, depending on what you're doing
''' your code here '''

You could opt for itertools.product to get the Cartesian product you're looking for. If the element order isn't of significance, then we have
>>> from itertools import product
>>> list(product(A, range(4)))
[('a', 0),
('a', 1),
('a', 2),
('a', 3),
('b', 0),
('b', 1),
('b', 2),
('b', 3)]
If you need that particular order,
>>> list(tuple(reversed(x)) for x in product(range(4), A))
[('a', 0),
('b', 0),
('a', 1),
('b', 1),
('a', 2),
('b', 2),
('a', 3),
('b', 3)]

L = range(0, 4)
K = ['a', 'b']
L3 = [(i, j) for i in K for j in L]
print(L3)
OUTPUT
[('a', 0), ('a', 1), ('a', 2), ('a', 3), ('b', 0), ('b', 1), ('b', 2), ('b', 3)]
If you wish to use list comprehension... other answers are correct as well

Use list comprehension
>>> [(a,n) for a in list1 for n in range(4)]
[('a', 0), ('a', 1), ('a', 2), ('a', 3), ('b', 0), ('b', 1), ('b', 2), ('b', 3)]
If order matters:
>>> [(a,n) for n in range(4) for a in list1]
[('a', 0), ('b', 0), ('a', 1), ('b', 1), ('a', 2), ('b', 2), ('a', 3), ('b', 3)]

How to sum values of tuples that have same name in Python

I have the following list containing tuples that have to values:
mylist=[(3, 'a'), (2, 'b'), (4, 'a'), (5, 'c'), (2, 'a'), (1, 'b')]
Is there a way to sum all values that share the same name?
Something like:
(9, 'a'), (3, 'b'), (5, 'c')
I tried iterating tuples with for loop but can't get what i want.
Thank you

Use a dict (or defaultdict) to aggregate over your tuples:
from collections import defaultdict
mylist=[(3, 'a'), (2, 'b'), (4, 'a'), (5, 'c'), (2, 'a'), (1, 'b')]
sums = defaultdict(int)
for i, k in mylist:
sums[k] += i
print sums.items()
# output: [('a', 9), ('c', 5), ('b', 3)]
# Or if you want the key/value order reversed and sorted by key
print [(v, k) for (k, v) in sorted(sums.items())]
# output: [(9, 'a'), (3, 'b'), (5, 'c')]

You can use itertools.groupby (after sorting by the second value of each tuple) to create groups. Then for each group, sum the first element in each tuple, then create a tuple per group in a list comprehension.
>>> import itertools
>>> [(sum(i[0] for i in group), key) for key, group in itertools.groupby(sorted(mylist, key = lambda i: i[1]), lambda i: i[1])]
[(9, 'a'), (3, 'b'), (5, 'c')]

A simple approach without any dependencies:
mylist = [(3, 'a'), (2, 'b'), (4, 'a'), (5, 'c'), (2, 'a'), (1, 'b')]
sums = {}
for a, b in mylist:
if b not in sums: sums[b] = a
else: sums[b] += a
print list(sums.items())

You can create a dict which has sum of individual keys.
>>> my_dict = {}
>>> for item in mylist:
... my_dict[item[1]] = my_dict.get(item[1], 0) + item[0]
>>> [(my_dict[k], k) for k in my_dict]
[(9, 'a'), (5, 'c'), (3, 'b')]

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

List duplicates manipulation - python

list1 = [(('a', 'b'), 2), (('a', 'b'), 5), (('c', 'd'), 1), (('e', 'f'), 2), (('e', 'f'), 4)] sorted_dict = {} for ele in list1: if ele[0] in sorted_dict: sorted_dict[ele[0]] += ele[1] else: sorted_dict[ele[0]] = ele[1] print(sorted_dict)

Related

Remove elements from tuple array that have same value in first index position of each element

How to readably and efficeintly create a dictionary of iterable combinations with a tuple of their indices as keys?

How to remove list items depending on predecessor in python

creating tuple with repeating elements

How to sum values of tuples that have same name in Python

Categories

Resources