Merge tuples with the same key - python

How can I merge tuples that share the same key? For example, given
list_1 = [("AAA", [123]), ("AAA", [456]), ("AAW", [147]), ("AAW", [124])]
I want to turn it into
list_2 = [("AAA", [123, 456]), ("AAW", [147, 124])]

An efficient approach (a single O(n) pass) is to use a collections.defaultdict to accumulate the values for each key in a growing list, then convert back to a list of tuples if needed:
import collections
list_1 = [("AAA", [123]), ("AAA", [456]), ("AAW", [147]), ("AAW", [124])]
c = collections.defaultdict(list)
for a,b in list_1:
    c[a].extend(b) # add to existing list or create a new one
list_2 = list(c.items())
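Result (as printed on a Python version older than 3.7, where dict order is not guaranteed):
[('AAW', [147, 124]), ('AAA', [123, 456])]
Note that the merged data is probably better left as a dictionary. Converting it back to a list loses the fast lookup by key that the dictionary provides.
On the other hand, if you want to retain the order of the keys from the original list of tuples, a plain dict (and therefore defaultdict) is enough on Python 3.7+, where insertion order is guaranteed (in CPython 3.6 it is an implementation detail). On older versions you'd have to build a list of the original keys (ordered, unique) and then rebuild the list from the dictionary, or use an OrderedDict, in which case you cannot use defaultdict directly (unless you use a recipe that combines the two).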
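The OrderedDict option mentioned above can be sketched with dict.setdefault standing in for defaultdict. This is only a minimal illustration; the name merged is an assumption made for the example, and on Python 3.7+ a plain dict would behave the same way.

```python
from collections import OrderedDict

list_1 = [("AAA", [123]), ("AAA", [456]), ("AAW", [147]), ("AAW", [124])]

merged = OrderedDict()  # keeps keys in first-seen order on any Python version
for key, values in list_1:
    # setdefault creates an empty list the first time a key is seen
    merged.setdefault(key, []).extend(values)

list_2 = list(merged.items())
print(list_2)
```

Because each merged list starts out as a fresh empty list, the lists inside list_1 are not modified.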

You can use a dict to keep track of the indices of each key to keep the time complexity O(n):
list_1 = [("AAA", [123]), ("AAA", [456]), ("AAW", [147]), ("AAW", [124])]
list_2 = []
i = {}
for k, s in list_1:
    if k not in i:
        list_2.append((k, s))
        i[k] = len(i)
    else:
        list_2[i[k]][1].extend(s)
list_2 would become:
[('AAA', [123, 456]), ('AAW', [147, 124])]
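One caveat with the snippet above: it appends the list s itself, so list_2 shares list objects with list_1 and the later extend call also changes list_1. If that matters, a variant that copies each list on first sight could look like the sketch below. The function name merge_preserving_order is hypothetical, chosen only for this illustration.

```python
def merge_preserving_order(pairs):
    """Merge (key, list) pairs by key, keeping first-seen key order."""
    merged = []
    index = {}  # key -> position of that key's tuple in merged
    for key, values in pairs:
        if key not in index:
            index[key] = len(merged)
            merged.append((key, list(values)))  # copy, so the input stays untouched
        else:
            merged[index[key]][1].extend(values)
    return merged

list_1 = [("AAA", [123]), ("AAA", [456]), ("AAW", [147]), ("AAW", [124])]
list_2 = merge_preserving_order(list_1)
print(list_2)
print(list_1)  # still the original four tuples
```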

You can create a dictionary and loop through the list. If the key is already present in the dictionary, extend its existing list with the new values; otherwise assign a copy of the values to the key.
dict_1 = {}
for item in list_1:
    if item[0] in dict_1:
        dict_1[item[0]].extend(item[1])  # add every value, not only the first one
    else:
        dict_1[item[0]] = list(item[1])  # copy so that list_1 is not modified
list_2 = list(dict_1.items())

As in the other answers, you can use a dictionary to associate each key with a list of values. This is implemented in the function merge_by_key in the code snippet below.
import pprint
list_1 = [("AAA", [123]), ("AAA", [456]), ("AAW", [147]), ("AAW", [124])]
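def merge_by_key(ts):
    d = {}
    for t in ts:
        key = t[0]
        values = t[1]
        if key not in d:
            d[key] = values[:]  # copy so the input lists are not modified
        else:
            d[key].extend(values)
    return list(d.items())  # list of (key, values) tuples rather than a dict view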
result = merge_by_key(list_1)
pprint.pprint(result)

Related

Comparing nested list with dictionary keys, create compound key 'a+b' with sum of values

I have a nested list and I want to compare its items with the keys of a dictionary. If a match is found, the corresponding dictionary values should be summed and added to the same dictionary as a new key-value pair.
l1 = [['a','b'], ['c','d']]
dict1 = {'a':10, 'e':20, 'c':30, 'b':40}
Expected result:
dict1 = {'a':10, 'e':20, 'c':30, 'b':40, 'a+b':50, 'a+c':40, 'b+c':70}
What I have done so far:
for x in range(len(l1)):
    for y in range(len(l1[x])):
        for k in dict1.keys():
            if k == l1[x][y]:
                dict1.append(dict1[k])
Is there any way to do this without using nested for loops?
PS: code is not complete yet.
Presuming the nesting of your lists is not important (e.g. l1 can be changed to ["a", "b", "c", "d"]), you can use itertools here.
First flatten l1 with itertools.chain
import itertools
l2 = itertools.chain(*l1)
(or l2 = itertools.chain.from_iterable(l1)).
Then loop through all combinations of two elements
for i, j in itertools.combinations(l2, 2):
    if i in dict1 and j in dict1:
        dict1[f"{i}+{j}"] = dict1[i] + dict1[j]
All together
import itertools
l1 = [['a','b'], ['c','d']]
dict1 = {'a':10, 'e':20, 'c':30, 'b':40}
for i, j in itertools.combinations(itertools.chain(*l1), 2):
    if i in dict1 and j in dict1:
        dict1[f"{i}+{j}"] = dict1[i] + dict1[j]
dict1 will now equal
{'a': 10, 'e': 20, 'c': 30, 'b': 40, 'a+b': 50, 'a+c': 40, 'b+c': 70}
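If you would rather not modify dict1 inside the loop, one possible variation is to filter the keys first and build the compound entries in a separate dictionary. This is only a sketch; the names known, pairs and combined are assumptions introduced for the example.

```python
from itertools import chain, combinations

l1 = [['a', 'b'], ['c', 'd']]
dict1 = {'a': 10, 'e': 20, 'c': 30, 'b': 40}

# keep only the list items that are actually keys of dict1 ('d' is dropped)
known = [k for k in chain.from_iterable(l1) if k in dict1]

# build the compound entries separately instead of writing into dict1
pairs = {f"{i}+{j}": dict1[i] + dict1[j] for i, j in combinations(known, 2)}

combined = {**dict1, **pairs}  # original entries first, compound keys after
print(combined)
```

Filtering up front means the membership test runs once per key rather than once per pair.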
You can try the code below, which adds each compound key explicitly:
dict1 = {'a':10, 'e':20, 'c':30, 'b':40}
dict2 = {}
dict2['a+b'] = dict1['a'] + dict1['b']
dict2['a+c'] = dict1['a'] + dict1['c']
dict2['b+c'] = dict1['b'] + dict1['c']
dict1.update(dict2)
print(dict1)
Something like the following, using generators:
NOTE: OP if you don't clarify the question I'll delete this
d = {'a':10, 'e':20, 'c':30, 'b':40}
l1 = [['a','b'], ['c','d']]
def _gen_compound_keys(d, kk):
    """Generator helper function for the single and compound keys from an input sequence of keys"""
    # Keys missing from d (such as 'd' in this example) are skipped to avoid a KeyError
    present = [k for k in kk if k in d]
    for k in present:
        yield k, d[k]
    if len(present) > 1:
        yield '+'.join(present), sum(d[k] for k in present)
def gen_compound_keys(d, kk):
    """Collect the single and compound keys into a dictionary, as desired"""
    return dict(_gen_compound_keys(d, kk))
result = {}
result.update(gen_compound_keys(d, l1[0]))
result.update(gen_compound_keys(d, l1[1]))
result.update(d)

How to create a list of dictionaries from another list of dictionaries by taking only some keys : python

How can I convert a list of dictionaries such as [{"a":1, "b":2, "c":3}, {"a":4, "b":5, "c":4}] to [{"a":1}, {"a":4}]? I wrote a function that loops through the list and pops the keys that are not required.
def pop_keys(dictionary, keys_to_pop):
    for item in dictionary:
        for key in keys_to_pop:
            item.pop(key, None)
    return dictionary
Is there a better and faster way to achieve the same result?
You could try
l1 = [{"a":1, "b":2, "c":3}, {"a":4, "b":5, "c":4}]
keys_to_pop = {"b", "c"} # use a set for fast lookup
l2 = [{k:v for k,v in d.items() if k not in keys_to_pop} for d in l1] # dict comprehension inside a list comprehension; l1 itself is left unchanged
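Since the goal is to take only some keys, the same comprehension can also be written around the keys to keep rather than the keys to drop. The sketch below assumes a hypothetical keys_to_keep tuple and skips keys that a dictionary does not have.

```python
l1 = [{"a": 1, "b": 2, "c": 3}, {"a": 4, "b": 5, "c": 4}]
keys_to_keep = ("a",)  # hypothetical name: the keys that should survive

# look up only the wanted keys instead of scanning every item of each dict
l2 = [{k: d[k] for k in keys_to_keep if k in d} for d in l1]
print(l2)
```

When the dictionaries are large and only a few keys are wanted, this does less work per dictionary than filtering all of its items.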

Return dictionary keys based on list integer value

I have the following:
my_list = ["7777777", "888888", "99999"]
my_dict = {21058199: '500', 7777777: '500', 21058199: '500'}
I am trying to create a new dictionary which will include the dictionary value (from the original dictionary) that matches the list entry to the dictionary key (in the original dictionary)
for k in my_dict.keys():
    if k in my_list:
        new_dict.append(k)
print(new_dict)
should return
7777777: '500'
But I'm returning an empty set. I'm not sure what I'm doing wrong here
A dictionary comprehension would provide you what you need.
You need to make sure the types agree (int vs. str)
Unless the list is significantly longer than the dict, it will be much more efficient to iterate over the list and check that key is in the dict than the other way around.
E.g.:
In []:
new_dict = {k: my_dict[k] for k in map(int, my_list) if k in my_dict}
print(new_dict)
Out[]:
{7777777: '500'}
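Another way to express the same lookup is a set intersection, since dictionary key views support set operations. This is a sketch that assumes every entry in my_list is a numeric string; the names wanted and common are introduced only for the example.

```python
my_list = ["7777777", "888888", "99999"]
my_dict = {21058199: '500', 7777777: '500'}

wanted = set(map(int, my_list))   # convert once so the types agree with the dict keys
common = my_dict.keys() & wanted  # key views behave like sets
new_dict = {k: my_dict[k] for k in common}
print(new_dict)
```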
Try it like this:
my_list = ["7777777", "888888", "99999"]
my_dict = {21058199: '500', 7777777: '500', 21058199: '500'}
new_dict = {k:my_dict[k] for k in my_dict.keys() if str(k) in my_list}
print(new_dict)
# {7777777: '500'}
Update:
You can also do this with the project function from the funcy library (a third-party package).
from funcy import project
new_dict = project(my_dict, map(int, my_list))
print(new_dict)
You have to convert each list item with int() before looking it up.
my_list = ["7777777", "888888", "99999"]
my_dict = {21058199: '500', 7777777: '500', 21058199: '500'}
for l_i in my_list:
    if int(l_i) in my_dict:  # compare as int, since the dictionary keys are ints
        print(my_dict[int(l_i)])
Your dictionary keys are of type int, while your list items are strings. You need to convert one into the other.
For example:
new_dict = {}
for k in my_dict.keys():
    if str(k) in my_list:
        new_dict[k] = my_dict[k]
Note that you cannot use .append() to add key-value pairs to a dictionary.

trying to round robin a set of devices against set of criteria

I have 2 lists, neither of which has a fixed length.
list1 = ["John", "bruce", "William"]
list2 = ["lindt", "reese", "snickers", "chocolate", "Milkyway", "Cadbury", "Candy"]
I want to distribute the candy among the members of list1 so that the end result looks something like
John: "lindt", "chocolate", "Candy"
bruce: "reese", "Milkyway"
William: "snickers", "Cadbury"
I tried using cycle and zip from itertools, but all I am getting is a sequence of tuples like this:
from itertools import cycle
list1 = ["John","bruce","William"]
list2 = ["lindt","reese","snickers","chocolate","Milkyway","Cadbury","Candy"]
for i in zip(list2,cycle(list1)):
    print(i)
Output
('lindt', 'John')
('reese', 'bruce')
('snickers', 'William')
('chocolate', 'John')
('Milkyway', 'bruce')
('Cadbury', 'William')
('Candy', 'John')
Let's use a dictionary:
kids = {}
Every member of list1 becomes a key with an empty list in the dictionary:
for i in list1: kids[i] = []
Then use zip and cycle (from itertools import cycle) to assign each kid their candy:
for i in zip(list2,cycle(list1)): kids[i[1]].append(i[0])
# i[1] is the kid's name, since it is at index 1 in the tuple
# i[0] is the candy assigned to that kid
Result:
>>> for i in kids: print(i,':', kids[i])
John : ['lindt', 'chocolate', 'Candy']
bruce : ['reese', 'Milkyway']
William : ['snickers', 'Cadbury']
You were pretty close, but you need a data structure to store the results. Instead of first creating a dictionary with the keys and empty lists, you can also use dict.setdefault:
from itertools import cycle
d = dict()
for name, item in zip(cycle(list1), list2):
    d.setdefault(name, []).append(item)
print(d)
# {'John': ['lindt', 'chocolate', 'Candy'],
# 'William': ['snickers', 'Cadbury'],
# 'bruce': ['reese', 'Milkyway']}
Instead of setdefault you can also use collections.defaultdict:
from itertools import cycle
from collections import defaultdict
d = defaultdict(list)
for name, item in zip(cycle(list1), list2):
    d[name].append(item)
print(d)
# defaultdict(list,
# {'John': ['lindt', 'chocolate', 'Candy'],
# 'William': ['snickers', 'Cadbury'],
# 'bruce': ['reese', 'Milkyway']})
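The same round-robin split can also be sketched without itertools by using extended slicing, where list2[i::n] takes every n-th item starting at position i. This is an alternative to the cycle-based answers above, not a restatement of them; it assumes the names in list1 are unique, and the names n and shares are introduced only for the example.

```python
list1 = ["John", "bruce", "William"]
list2 = ["lindt", "reese", "snickers", "chocolate", "Milkyway", "Cadbury", "Candy"]

n = len(list1)
# person i receives items i, i + n, i + 2n, ... of list2
shares = {name: list2[i::n] for i, name in enumerate(list1)}

for name, items in shares.items():
    print(name, ':', items)
```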

How can I only parse/split this list with multiple colons in each element? Create dictionary

I have the following Python list:
list1 = ['EW:G:B<<LADHFSSFAFFF', 'CB:E:OWTOWTW', 'PP:E:A,A<F<AF', 'GR:A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7', 'SX:F:-111', 'DS:f:115.5', 'MW:AA:0', 'MA:A:0XT:i:0', 'EY:EE:KJERWEWERKJWE']
I would like to take the entries of this list and create a dictionary of key-values pairs that looks like
dictionary_list1 = {'EW':'G:B<<LADHFSSFAFFF', 'CB':'E:OWTOWTW', 'PP':'E:A,A<F<AF', 'GR':'A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7', 'SX':'F:-111', 'DS':'f:115.5', 'MW':'AA:0', 'MA':'A:0XT:i:0', 'EW':'EE:KJERWEWERKJWE'}
How does one parse/split list1 above to do this? My first instinct was to try try1 = list1.split(":"), but I think it would then be impossible to retrieve the "key" of each entry, as there are multiple colons in each element.
What is the most pythonic way to do this?
You can specify a maximum number of times to split with the second argument to split.
list1 = ['EW:G:B<<LADHFSSFAFFF', 'CB:E:OWTOWTW', 'PP:E:A,A<F<AF', 'GR:A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7', 'SX:F:-111', 'DS:f:115.5', 'MW:AA:0', 'MA:A:0XT:i:0', 'EW:EE:KJERWEWERKJWE']
d = dict(item.split(':', 1) for item in list1)
Result:
>>> import pprint
>>> pprint.pprint(d)
{'CB': 'E:OWTOWTW',
'DS': 'f:115.5',
'EW': 'EE:KJERWEWERKJWE',
'GR': 'A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7',
'MA': 'A:0XT:i:0',
'MW': 'AA:0',
'PP': 'E:A,A<F<AF',
'SX': 'F:-111'}
If you'd like to keep track of values for non-unique keys, like 'EW:G:B<<LADHFSSFAFFF' and 'EW:EE:KJERWEWERKJWE', you could add keys to a collections.defaultdict:
import collections
d = collections.defaultdict(list)
for item in list1:
    k,v = item.split(':', 1)
    d[k].append(v)
Result:
>>> pprint.pprint(d)
{'CB': ['E:OWTOWTW'],
'DS': ['f:115.5'],
'EW': ['G:B<<LADHFSSFAFFF', 'EE:KJERWEWERKJWE'],
'GR': ['A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7'],
'MA': ['A:0XT:i:0'],
'MW': ['AA:0'],
'PP': ['E:A,A<F<AF'],
'SX': ['F:-111']}
You can also use str.partition
list1 = ['EW:G:B<<LADHFSSFAFFF', 'CB:E:OWTOWTW', 'PP:E:A,A<F<AF', 'GR:A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7', 'SX:F:-111', 'DS:f:115.5', 'MW:AA:0', 'MA:A:0XT:i:0', 'EW:EE:KJERWEWERKJWE']
d = dict([t for t in x.partition(':') if t!=':'] for x in list1)
# or more simply as TigerhawkT3 mentioned in the comment
d = dict(x.partition(':')[::2] for x in list1)
for k, v in d.items():
    print('{}: {}'.format(k, v))
Output:
MW: AA:0
CB: E:OWTOWTW
GR: A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7
PP: E:A,A<F<AF
EW: EE:KJERWEWERKJWE
SX: F:-111
DS: f:115.5
MA: A:0XT:i:0
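One practical difference between the two approaches shows up when an entry has no colon at all. The sketch below uses a made-up entry, 'NOCOLON', which is not part of the original list, to compare what split and partition return in that case.

```python
with_colon = 'GR:A:OUO-1-XXX-EGD'
without_colon = 'NOCOLON'  # hypothetical entry with no separator

print(with_colon.split(':', 1))      # ['GR', 'A:OUO-1-XXX-EGD']
print(with_colon.partition(':'))     # ('GR', ':', 'A:OUO-1-XXX-EGD')
print(without_colon.split(':', 1))   # ['NOCOLON']
print(without_colon.partition(':'))  # ('NOCOLON', '', '')

# partition(...)[::2] always yields a (key, value) pair, so dict() accepts it
d = dict(x.partition(':')[::2] for x in [with_colon, without_colon])
print(d)  # {'GR': 'A:OUO-1-XXX-EGD', 'NOCOLON': ''}
```

With split, the one-element list produced for 'NOCOLON' would make dict() raise a ValueError, so partition is the more forgiving choice if some entries might lack a separator.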
