Let's say I have a list:
t = [('a', 1), ('a', 6), ('b', 2), ('c', 3), ('c', 5), ('d', 4)]
There are two tuples with 'a' as the first element, and two tuples with 'c' as the first element. I want to only keep the first instance of each, so I end up with:
t = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
How can I achieve that?
You can use a dictionary to help you filter the duplicate keys:
>>> t = [('a', 1), ('a', 6), ('b', 2), ('c', 3), ('c', 5), ('d', 4)]
>>> d = {}
>>> for x, y in t:
...     if x not in d:
...         d[x] = y
...
>>> d
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> t = list(d.items())
>>> t
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]
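The same idea can be wrapped in a small function. This is only a sketch: `keep_first` is a hypothetical name, and it relies on dicts preserving insertion order, which is guaranteed from Python 3.7 onward.

```python
def keep_first(pairs):
    """Keep only the first (key, value) tuple seen for each key."""
    first_by_key = {}
    for key, value in pairs:
        # setdefault stores the value only when the key is not present yet
        first_by_key.setdefault(key, value)
    return list(first_by_key.items())


t = [('a', 1), ('a', 6), ('b', 2), ('c', 3), ('c', 5), ('d', 4)]
print(keep_first(t))  # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
```

Unlike the groupby approach, this also works when equal keys are not next to each other.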
@MrGeek's answer is good, but if you do not want to use a dictionary, you could do something simple like this. It builds a new list rather than removing items from t while looping over it, because removing during iteration skips elements:
>>> t = [('a', 1), ('a', 6), ('b', 2), ('c', 3), ('c', 5), ('d', 4)]
>>> already_seen = []
>>> result = []
>>> for e in t:
...     if e[0] not in already_seen:
...         already_seen.append(e[0])
...         result.append(e)
...
>>> result
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]
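@gold_cy's comment is the easiest way:
You can use itertools.groupby to group your data, with the key parameter grouping by the first element of each tuple. Note that groupby only groups consecutive items, so this works here because equal keys are already adjacent.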
import itertools as it
t = [list(my_iterator)[0] for g, my_iterator in it.groupby(t, key=lambda x: x[0])]
Output:
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]
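One caveat with groupby is that it starts a new group every time the key changes, so it only merges adjacent items. The sketch below uses a made-up list (`unsorted_t`) in which the two 'a' tuples are separated, and shows the effect of sorting by key first.

```python
from itertools import groupby
from operator import itemgetter

first = itemgetter(0)
unsorted_t = [('a', 1), ('b', 2), ('a', 6), ('c', 3)]

# Without sorting, the second 'a' forms its own group and survives
adjacent_only = [next(group) for _, group in groupby(unsorted_t, key=first)]
print(adjacent_only)  # [('a', 1), ('b', 2), ('a', 6), ('c', 3)]

# sorted() is stable, so within each key the first occurrence stays first
deduped = [next(group) for _, group in groupby(sorted(unsorted_t, key=first), key=first)]
print(deduped)  # [('a', 1), ('b', 2), ('c', 3)]
```

Sorting first means the result comes back in key order rather than in the original order of appearance.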
I am very new to Python, and I am really lost right now. If anyone can help me, I will appreciate it a lot.
I have a list:
list1 = [((a, b), 2), ((a, b), 5), ((c, d), 1)), ((e, f), 2), ((e, f), 4)]
The output I am looking for is:
output = [((a, b), 7), ((c, d), 1), ((e, f), 6)]
I tried to put it in a dictionary
new_dict = {i: j for i, j in list1}
But it throws me an error
Maybe there are other ways?
Find the explanation in the code comments
list1 = [(('a', 'b'), 2), (('a', 'b'), 5), (('c', 'd'), 1), (('e', 'f'), 2), (('e', 'f'), 4)]
# let's create an empty dictionary
output = {}
# ('a', 'b') is a tuple and tuple is hashable so we can use it as dictionary key
# iterate over the list1
for i in list1:
    # for each item, check whether i[0] already exists in output
    if i[0] in output:
        # if yes, just add i[1] to the running total
        output[i[0]] += i[1]
    else:
        # otherwise create a new key
        output[i[0]] = i[1]
# finally, convert the dictionary back to a list of tuples and print it
final_output = list(output.items())
print(final_output)
[(('a', 'b'), 7), (('c', 'd'), 1), (('e', 'f'), 6)]
You can use {}.get in this fashion:
list1 = [(('a', 'b'), 2), (('a', 'b'), 5), (('c', 'd'), 1), (('e', 'f'), 2), (('e', 'f'), 4)]
di = {}
for t in list1:
    di[t[0]] = di.get(t[0], 0) + t[1]
>>> di
{('a', 'b'): 7, ('c', 'd'): 1, ('e', 'f'): 6}
You can also use a Counter. Add the values in a loop; building it from a dict comprehension would overwrite repeated keys instead of summing them:
from collections import Counter
c = Counter()
for key, value in list1:
    c[key] += value
>>> c
Counter({('a', 'b'): 7, ('e', 'f'): 6, ('c', 'd'): 1})
Then to turn either of those into a list of tuples (as you have) you use list and {}.items():
>>> list(c.items())
[(('a', 'b'), 7), (('c', 'd'), 1), (('e', 'f'), 6)]
list1 = [(('a', 'b'), 2), (('a', 'b'), 5), (('c', 'd'), 1), (('e', 'f'), 2), (('e', 'f'), 4)]
sorted_dict = {}
for ele in list1:
    if ele[0] in sorted_dict:
        sorted_dict[ele[0]] += ele[1]
    else:
        sorted_dict[ele[0]] = ele[1]
print(sorted_dict)
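The if/else in the loops above can be avoided with collections.defaultdict, which supplies a starting value of 0 for keys that have not been seen yet. A minimal sketch, with `totals` as an illustrative name:

```python
from collections import defaultdict

list1 = [(('a', 'b'), 2), (('a', 'b'), 5), (('c', 'd'), 1), (('e', 'f'), 2), (('e', 'f'), 4)]

totals = defaultdict(int)  # missing keys start at int() == 0
for key, value in list1:
    totals[key] += value

output = list(totals.items())
print(output)  # [(('a', 'b'), 7), (('c', 'd'), 1), (('e', 'f'), 6)]
```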
Given a Python list, I want to remove consecutive 'duplicates'. The duplicate value, however, is an attribute of the list item (in this example, the tuple's first element).
Input:
[(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')]
Desired Output:
[(1, 'a'), (2, 'b'), (3, 'd'), (2, 'e')]
Cannot use set or dict, because order is important.
Cannot use list comprehension [x for x in somelist if not determine(x)], because the check depends on predecessor.
What I want is something like:
mylist = [...]
for i in range(len(mylist)):
    if mylist[i-1].attr == mylist[i].attr:
        mylist.remove(i)
What is the preferred way to solve this in Python?
You can use itertools.groupby (demonstration with more data):
from itertools import groupby
from operator import itemgetter
data = [(1, 'a'), (2, 'a'), (2, 'b'), (3, 'a'), (4, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (3, 'a')]
[next(group) for key, group in groupby(data, key=itemgetter(0))]
Output:
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (2, 'a'), (3, 'a')]
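For completeness, an iterative approach based on other answers. The last item is never the first half of a pair, so it is added after the loop:
result = []
for first, second in zip(data, data[1:]):
    if first[0] != second[0]:
        result.append(first)
result.extend(data[-1:])  # the final item always closes its group
result
Output:
[(1, 'a'), (2, 'b'), (3, 'a'), (4, 'a'), (2, 'a'), (3, 'a')]
Note that this keeps the last duplicate, instead of the first.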
In order to remove consecutive duplicates where the whole tuples are equal, you could use itertools.groupby:
l = [(1, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
from itertools import groupby
[tuple(k) for k, _ in groupby(l)]
# [(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
If I am not mistaken, you only need to look up the last value.
test = [(1, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (4, 'a'),(3, 'a'),(4,"a"),(4,"a")]
result = []
for i in test:
    if result and i[0] == result[-1][0]:  # edited since OP considers (1,"a") and (1,"b") as duplicate
    # if result and i == result[-1]:
        continue
    else:
        result.append(i)
print (result)
Output:
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (3, 'a'), (4, 'a')]
If you just want to stick to list comprehension, you can use something like this:
>>> li = [(1, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (2, 'a')]
>>> [li[i] for i in range(len(li)) if not i or li[i] != li[i-1]]
[(1, 'a'), (2, 'a'), (3, 'a'), (2, 'a')]
Please note that not i is a short way of writing i == 0, so the first element is always kept.
You could also use enumerate and a list comprehension:
>>> data = [(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')]
>>> [v for ix, v in enumerate(data) if not ix or v[0] != data[ix-1][0]]
[(1, 'a'), (2, 'b'), (3, 'd'), (2, 'e')]
I'd change Henry Yik's proposal a little bit, making it a bit simpler. Not sure if I am missing something.
inputList = [(1, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (2, 'a')]
outputList = []
lastItem = None
for item in inputList:
    if item != lastItem:
        outputList.append(item)
        lastItem = item
print(outputList)
You can easily zip the list with itself. Every element, except the first one, is zipped with its predecessor:
>>> L = [(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')]
>>> list(zip(L[1:], L))
[((2, 'b'), (1, 'a')), ((2, 'b'), (2, 'b')), ((2, 'c'), (2, 'b')), ((3, 'd'), (2, 'c')), ((2, 'e'), (3, 'd'))]
The first element is always part of the result, and then you filter the pairs on the condition and return the first element:
>>> [L[0]]+[e for e, f in zip(L[1:], L) if e[0]!=f[0]]
[(1, 'a'), (2, 'b'), (3, 'd'), (2, 'e')]
It's somewhat overkill, but you can use reduce, too:
from functools import reduce
data=[(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')]
reduce(lambda rslt,t: rslt if rslt[-1][0]==t[0] else rslt+[t], data, [data[0]])
[(1, 'a'), (2, 'b'), (3, 'd'), (2, 'e')]
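The answers above all compare each item with its predecessor. A minimal generator sketch of that idea follows; `drop_consecutive` and its `key` parameter are hypothetical names chosen for illustration, not part of any library.

```python
from operator import itemgetter


def drop_consecutive(items, key=lambda item: item):
    """Yield each item whose key differs from the key of the item before it."""
    sentinel = object()  # marks "no previous item yet"
    previous = sentinel
    for item in items:
        current = key(item)
        if previous is sentinel or current != previous:
            yield item
        previous = current


data = [(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')]
print(list(drop_consecutive(data, key=itemgetter(0))))
# [(1, 'a'), (2, 'b'), (3, 'd'), (2, 'e')]
```

Because it is a generator, it works on any iterable and does not need slicing or indexing.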
I have the following lists of tuples:
mylist=[('a', 3), ('b', 2), ('c', 8)]
mylist2=[('a', 3), ('b', 5), ('c', 20), ('d', 5)]
Is there a way I can sum all values that share the same name and sort them in Python? Something like:
[('c', 28), ('b', 7), ('a', 6), ('d', 5)]
If I were you, I would have done it like:
>>> mylist=[('a', 3), ('b', 2), ('c', 8)]
>>> mylist2=[('a', 3), ('b', 5), ('c', 20), ('d', 5)]
# Step 1: Convert the list of tuples to `dict`
>>> dict_1, dict_2 = dict(mylist), dict(mylist2)
# Step 2: get set of all keys
>>> all_keys = set(dict_1) | set(dict_2)  # in Python 3, keys() views cannot be added with +
# Step 3: Get `sum` of value for each key
>>> sum_list = [(k, dict_1.get(k, 0) + dict_2.get(k, 0)) for k in all_keys]
And then sort the list as:
>>> from operator import itemgetter
# Step 4: Sort in descending order based on value at index 1
>>> sorted(sum_list, key=itemgetter(1), reverse=True)
[('c', 28), ('b', 7), ('a', 6), ('d', 5)]
Note: This assumes that the keys (the element at index 0 of each tuple) are unique within each list.
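If a name can appear more than once within a list, converting to dict first would silently drop values. A sketch that avoids this by accumulating into a collections.Counter (the variable name `totals` is illustrative):

```python
from collections import Counter

mylist = [('a', 3), ('b', 2), ('c', 8)]
mylist2 = [('a', 3), ('b', 5), ('c', 20), ('d', 5)]

totals = Counter()
for name, value in mylist + mylist2:
    totals[name] += value

# most_common() returns (key, total) pairs ordered from highest to lowest total
print(totals.most_common())  # [('c', 28), ('b', 7), ('a', 6), ('d', 5)]
```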
I am doing an exercise on Python and lists with one problem:
I have a list of tuples sorted by second key:
[('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
I need to sort it by the second value numerically (highest first) and then by the first value alphabetically. It must look like:
[('a', 3), ('d', 3), ('f', 3), ('b', 2), ('c', 2)]
When I used the sorted function I got:
[('a', 3), ('b', 2), ('c', 2), ('d', 3), ('f', 3)]
It sorted by first element (and I lost arrangement of second). I also tried to use key:
def getKey(item):
    return item[0]
a = (sorted(lis, key=getKey))
And it didn't help me either.
The key parameter of the sort method lets you specify which element of each tuple to sort by. Each of the calls here sorts on a single criterion, either alphabetical or numerical order.
If you want to sort by increasing numerical order:
alist = [('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
alist.sort(key=lambda pair: pair[1])
If you want to sort by decreasing numerical order:
alist = [('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
alist.sort(key = lambda pair: pair[1], reverse=True)
If you want to sort by alphabetical order:
alist = [('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
alist.sort()
Reverse alphabetical order:
alist = [('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
alist.sort(reverse = True)
The key parameter lets you specify which element of the tuple pair to sort by.
Each call above sorts on one criterion only; to sort on both number and letter at once, the key function has to return a tuple.
l = [('f', 4), ('b', 4), ('a', 4), ('c', 3), ('k', 1)]
l.sort(key=lambda x:(-x[1],x[0]))
print(l)
[('a', 4), ('b', 4), ('f', 4), ('c', 3), ('k', 1)]
The key function returns a tuple. -x[1] negates the number, so the sort runs from highest to lowest; ties are then broken by x[0], which sorts from lowest to highest, i.e. a-z.
def getKey(item):
    return item[0]
This returns the first element of the tuple, so the list will be sorted by the first tuple element. You want to sort by the second element (highest first), then by the first, so the key should return the negated second element followed by the first. Your key function would then need to be:
def getKey(item):
    return -item[1], item[0]
Making your final call:
>>> sorted(lis, key=getKey)
[('a', 3), ('d', 3), ('f', 3), ('b', 2), ('c', 2)]
The sort() method is stable. Call it twice, first for the secondary key (alphabetically), then for the primary key (the number):
>>> lst = [('f', 3), ('a', 3), ('d', 3), ('b', 2), ('c', 2)]
>>> lst.sort()
>>> lst.sort(key=lambda kv: kv[1], reverse=True)
>>> lst
[('a', 3), ('d', 3), ('f', 3), ('b', 2), ('c', 2)]
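The two-pass approach is also useful when the primary key cannot be negated, for example a string that should sort in descending order. A small sketch with made-up data (`records` is not from the question):

```python
records = [('bob', 3), ('alice', 2), ('bob', 1), ('alice', 5)]

# Pass 1: secondary key, numbers ascending
records.sort(key=lambda pair: pair[1])
# Pass 2: primary key, names descending; stability keeps each name's numbers ascending
records.sort(key=lambda pair: pair[0], reverse=True)

print(records)  # [('bob', 1), ('bob', 3), ('alice', 2), ('alice', 5)]
```

Passing reverse=True still preserves the original relative order of items with equal keys, which is why the numbers from the first pass stay ascending.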