Removing inverse duplicates in dictionary python - python

I have a python dictionary containing some example keys and values:
{'a': ['b'],
'c': ['d'],
'x': ['y'],
'y': ['x'],
'i': ['j','k'],
'j': ['i','k'],
'k': ['i','j']}
It does not matter which letter is the key and which letters are the values, as long as they are shown to be related. I need to remove any 'duplicate' key-value combination so that my dictionary ends up as follows:
{'a': ['b'],
'c': ['d'],
'x': ['y'],
'i': ['j','k']}

You can turn each entry (key plus values) into a sorted tuple and track the tuples already seen in a set, which takes O(n) time over the entries.
d = {'a': ['b'],
'c': ['d'],
'x': ['y'],
'y': ['x'],
'i': ['j','k'],
'j': ['i','k'],
'k': ['i','j']}
seen = set()
to_remove = []
for key, val in d.items():
    # The key plus its values, sorted, identify the group regardless of which member is the key
    entry = tuple(sorted(val + [key]))
    if entry in seen:
        to_remove.append(key)
    else:
        seen.add(entry)
for key in to_remove:
    del d[key]
print(d)
Output:
{'a': ['b'], 'c': ['d'], 'x': ['y'], 'i': ['j', 'k']}
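The same idea can be wrapped in a function that leaves the input untouched. This is only a minimal sketch, not the answerer's code: the name drop_inverse_duplicates is made up for illustration, and it uses a frozenset instead of a sorted tuple, which assumes the keys and values are hashable and that repeats within one value list do not matter.

```python
def drop_inverse_duplicates(d):
    """Return a new dict keeping only the first entry of each group.

    Hypothetical helper: a group is identified by the set made of the key
    and all of its values, so {'x': ['y']} and {'y': ['x']} collide.
    """
    seen = set()
    result = {}
    for key, values in d.items():
        group = frozenset([key, *values])
        if group not in seen:
            seen.add(group)
            result[key] = values
    return result


d = {'a': ['b'], 'c': ['d'], 'x': ['y'], 'y': ['x'],
     'i': ['j', 'k'], 'j': ['i', 'k'], 'k': ['i', 'j']}
cleaned = drop_inverse_duplicates(d)
print(cleaned)
```

Building a new dict avoids the second pass that deletes keys, at the cost of holding both dicts in memory.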

Here is a solution that does it in a single loop, as a one-line dict comprehension:
{k: v for i, (k, v) in enumerate(d.items()) if not set(list(d.keys())[:i]).intersection(v)}
And if you want to have it really fast:
s = set()
dmod = {}
for k, v in d.items():
    s.add(k)
    if not s.intersection(v):
        dmod[k] = v
Both approaches assume your dict is named d.
Result:
# {'a': ['b'], 'c': ['d'], 'x': ['y'], 'i': ['j', 'k']}
However, I have to point out that your text description does not match the expected output in your example. It would be nice if you could update it.
Besides that: are you aware that the algorithm you are asking for is completely order dependent? No solution that returns the result you want will work reliably before Python 3.7 (dicts keep insertion order in CPython 3.6, but this only became a language guarantee in 3.7) without explicitly using ordered dicts.
I don't know your use case, but is it OK that applying the same algorithm to, e.g., a dict in reverse order produces a different result?
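The order dependence is easy to see by running the loop from this answer on the same data in reverse order. The sketch below wraps that loop in a function; the name keep_first_seen is made up for illustration.

```python
def keep_first_seen(d):
    # Same logic as the loop in this answer: skip an entry when one of
    # its values has already appeared as a key.
    seen_keys = set()
    kept = {}
    for k, v in d.items():
        seen_keys.add(k)
        if not seen_keys.intersection(v):
            kept[k] = v
    return kept


d = {'a': ['b'], 'c': ['d'], 'x': ['y'], 'y': ['x'],
     'i': ['j', 'k'], 'j': ['i', 'k'], 'k': ['i', 'j']}

forward = keep_first_seen(d)
# Rebuild the dict with its items in reverse insertion order
backward = keep_first_seen(dict(reversed(list(d.items()))))

print(forward)   # keeps 'x' and 'i'
print(backward)  # keeps 'y' and 'k' instead
```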

Another one liner:
>>> d = {'a': ['b'], 'c': ['d'], 'x': ['y'], 'y': ['x'], 'i': ['j','k'], 'j': ['i','k'], 'k': ['i','j']}
>>> dict({tuple(sorted((k, *v))):(k, v) for k, v in d.items()}.values())
{'a': ['b'], 'c': ['d'], 'y': ['x'], 'k': ['i', 'j']}
The inner dict is built with the sorted tuples (key, value1, value2, ...) as keys and the (key, [value1, value2, ...]) pairs as values. For every sorted tuple, you therefore keep the last (key, [value]) pair, because later pairs overwrite earlier ones (this matters only if the dict keys are ordered, Python >= 3.6). Then a dict is built from these (key, [value]) pairs.
If you want to keep the first key-value pair instead (Python >= 3.6), just reverse the order of iteration over the original dict. Here sorted(..., reverse=True) does that because the keys of d are already in alphabetical order:
>>> dict({tuple(sorted((k, *v))):(k, v) for k, v in sorted(d.items(), reverse=True)}.values())
{'x': ['y'], 'i': ['j', 'k'], 'c': ['d'], 'a': ['b']}
If that's not clear, here's a simpler example. I want to keep the first list of each length in a list of lists:
>>> L = [[1], [2], [1,2], [2,3,4], [3], [5,2], [7,8,9]]
>>> {len(v): v for v in reversed(L)}
{3: [2, 3, 4], 2: [1, 2], 1: [1]}
We see that only the first list of each length is kept (marked with asterisks):
[*[1]*, [2], *[1,2]*, *[2,3,4]*, [3], [5,2], [7,8,9]]
This is because, when iterating in reverse, the first list is the last one to be added to the dict, so it overwrites every later list of the same length.
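An alternative to reversing the iteration, shown here only as a sketch and not as part of the answer above, is dict.setdefault, which stores a value only when the key is not present yet. That keeps the first pair of each group while iterating in the normal order.

```python
d = {'a': ['b'], 'c': ['d'], 'x': ['y'], 'y': ['x'],
     'i': ['j', 'k'], 'j': ['i', 'k'], 'k': ['i', 'j']}

first_by_group = {}
for k, v in d.items():
    # setdefault only stores the pair if this group has no entry yet
    first_by_group.setdefault(tuple(sorted((k, *v))), (k, v))

result = dict(first_by_group.values())
print(result)
```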

Related

How to sum dict elements inside dicts

In Python I have a list of dictionaries, each containing a nested dictionary.
list = [{a: {b:1, c:2}}, {d: {b:3, c:4}}, {a: {b:2, c:3}}]
I want one final list of dictionaries in which the nested dictionaries that share the same outer key are summed value by value, i.e. the result will be:
result = [{a: {b:3, c:5}}, {d: {b:3, c:4}}]
N.B.: every nested dictionary in the list will contain the same number of key-value pairs.
Code:
lst = [{'a': {'b':1, 'c':2}}, {'d': {'b':3, 'c':4}}, {'a': {'b':2, 'c':3}}]
p={}
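for entry in lst:
    for key, val in entry.items():
        if key in p:
            # Sum value by value; comparing val != p[key] here would skip
            # the sum whenever two inner dicts happen to be identical
            p.update({key: {k: p[key].get(k, 0) + val.get(k, 0) for k in set(p[key])}})
        else:
            p.update(entry)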
Output:
{'a': {'c': 5, 'b': 3}, 'd': {'b': 3, 'c': 4}}
I don't know if this is the best way to do it, but here it goes:
First I made a loop to collect all of the primary and secondary keys:
# list containing the data
lista = [{'a': {'b':1, 'c':2}}, {'d': {'b':3, 'c':4}}, {'a': {'b':2, 'c':3}}]
# empty list with keys
primary_keys = []
secondary_keys = []
# for each dict in the list it appends the primary key and the secondary key
for dic in lista:
    for key in dic.keys():
        primary_keys.append(key)
        for key2 in dic[key].keys():
            secondary_keys.append(key2)
# prints all the keys
print('Primary keys:',primary_keys)
print('Secondary keys:',secondary_keys)
Results:
Primary keys: ['a', 'd', 'a']
Secondary keys: ['b', 'c', 'b', 'c', 'b', 'c']
Then I've made a final dict with all the combinations:
# creates the final dict from the list
dict_final = dict.fromkeys(primary_keys)
# for all primary keys creates a secondary key
for pkey in dict_final.keys():
    dict_final[pkey] = dict.fromkeys(secondary_keys)
    # for all secondary keys puts a zero
    for skey in dict_final[pkey].keys():
        dict_final[pkey][skey] = 0
# prints the dict
print(dict_final)
Results:
{'a': {'b': 0, 'c': 0}, 'd': {'b': 0, 'c': 0}}
And later I've made a loop through each dictionary item and added to the corresponding keys in the final dict
# for each primary and secondary keys in the dic list sums into the final dict
for dic in lista:
    for pkey in dict_final.keys():
        for skey in dict_final[pkey].keys():
            try:
                dict_final[pkey][skey] += dic[pkey][skey]
            except KeyError:
                # this dict does not contain the primary or secondary key
                pass
# prints the final dict
print(dict_final)
Results:
{'a': {'b': 3, 'c': 5}, 'd': {'b': 3, 'c': 4}}
By using defaultdict, we can simplify the logic a bit:
# Using default dict
from collections import defaultdict
original_list = [{'a': {'b':1, 'c':2}}, {'d': {'b':3, 'c':4}}, {'a': {'b':2, 'c':3}}]
out = defaultdict(lambda: defaultdict(int))
for outer_dict in original_list:
    for outer_key, inner_dict in outer_dict.items():
        for inner_key, inner_value in inner_dict.items():
            out[outer_key][inner_key] += inner_value
print(out)
out_list = [{key: dict(value) for key, value in out.items()}]
print(out_list)
Output:
defaultdict(<function <lambda> at 0x103c18a60>, {'a': defaultdict(<class 'int'>, {'b': 3, 'c': 5}), 'd': defaultdict(<class 'int'>, {'b': 3, 'c': 4})})
[{'a': {'b': 3, 'c': 5}, 'd': {'b': 3, 'c': 4}}]
Notes
out is a nested defaultdict: its values are themselves defaultdicts, whose values are integers.
After the 3 nested loops, we convert the defaultdict out to a list of dictionaries to conform to the output requirement.
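Another option, offered here as a sketch rather than as one of the answers above, is collections.Counter, whose update method adds values instead of replacing them. This version also emits one single-key dictionary per outer key, which is the shape shown in the question. It assumes the inner values are numbers.

```python
from collections import Counter

data = [{'a': {'b': 1, 'c': 2}}, {'d': {'b': 3, 'c': 4}}, {'a': {'b': 2, 'c': 3}}]

totals = {}
for outer in data:
    for outer_key, inner in outer.items():
        # Counter.update adds the incoming values to the existing counts
        totals.setdefault(outer_key, Counter()).update(inner)

# One single-key dict per outer key, as in the question's expected result
result = [{key: dict(counter)} for key, counter in totals.items()]
print(result)
```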

Comparing 2 lists of dictionaries and adding a key,value when a common key is identified

I have two lists of dictionaries. The first list will contain significantly more dictionaries than the second list. There could be up to 200-300 dictionaries in list1 and no more than 10-15 dictionaries in list2.
For example, any dictionary in list1 that has the same 'g': h key/value pair as a dictionary in list2 needs to have the key/value pair 'j': k added to it.
list1 = [{'a': b, 'c': d, 'e': f, 'g': h},
{'a': b, 'c': d, 'e': f, 'g': h},
{'a': b, 'c': d, 'e': f, 'g': h},
{'a': b, 'c': d, 'e': f, 'g': h}
]
list2 = [{'g': h, 'j': k}]
I'm struggling on finding any previous examples of this type and cannot figure out a function of my own.
A trivial implementation could be:
for d1 in list1:
    for d2 in list2:
        if any(pair in d1.items() for pair in d2.items()):
            d1.update(d2)
The value of list1 after this transformation:
[{'a': 'b', 'c': 'd', 'e': 'f', 'g': 'h', 'j': 'k'},
{'a': 'b', 'c': 'd', 'e': 'f', 'g': 'h', 'j': 'k'},
{'a': 'b', 'c': 'd', 'e': 'f', 'g': 'h', 'j': 'k'},
{'a': 'b', 'c': 'd', 'e': 'f', 'g': 'h', 'j': 'k'}]
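Note that the any() check fires as soon as any single pair of d2 is present in d1. If the match should depend on one specific key only, a lookup table built from list2 avoids the nested loop. The following is a sketch under assumptions: the constant MATCH_KEY and the sample data are made up for illustration, and the values stored under that key must be hashable.

```python
list1 = [{'a': 'b', 'g': 'h'}, {'a': 'b', 'g': 'z'}, {'a': 'b'}]
list2 = [{'g': 'h', 'j': 'k'}]

MATCH_KEY = 'g'  # assumption: the key both lists are compared on

# Map each value of MATCH_KEY in list2 to the dictionary that carries it
lookup = {d2[MATCH_KEY]: d2 for d2 in list2 if MATCH_KEY in d2}

for d1 in list1:
    # Only update dictionaries whose MATCH_KEY value also occurs in list2
    if MATCH_KEY in d1 and d1[MATCH_KEY] in lookup:
        d1.update(lookup[d1[MATCH_KEY]])

print(list1)
```

With lists of the sizes mentioned in the question either approach is fast enough; the lookup table mainly makes the matching rule explicit.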

python list of dicts - convert each key-value as a individual dict

I have a list like the one below, containing one or more dicts:
l = [{'b': 'h', 'c': (1,2)}, {'d': [0, 1], 'e': {'f': 2, 'g': 'i'} } ]
I need to extract each key-value pair as an individual dict.
Expected output
[{'b': 'h'}, {'c': (1,2)}, {'d': [0, 1]}, {'e': {'f': 2, 'g': 'i'} } ]
I have been trying to do this via list comprehension - the outer comprehension could be something like [ {k,v} for k, v in ?? - need some help in getting the inner comprehension.
I believe this is what you're looking for. The order of the elements might be different on Python versions before 3.7, where dictionaries were not guaranteed to keep insertion order:
lst = [{'b': 'h', 'c': (1,2)}, {'d': [0, 1], 'e': {'f': 2, 'g': 'i'}}]
[{k: v} for d in lst for k, v in d.items()]
=> [{'c': (1, 2)}, {'b': 'h'}, {'e': {'g': 'i', 'f': 2}}, {'d': [0, 1]}]
This should work:
[{k: v} for i in l for k, v in i.items()]

Comparing dicts, updating NOT overwriting values [duplicate]

This question already has answers here:
Combining 2 dictionaries with common key
(5 answers)
Closed 3 years ago.
I am not looking for something like this:
How do I merge two dictionaries in a single expression?
Generic way of updating python dictionary without overwriting the subdictionaries
Python: Dictionary merge by updating but not overwriting if value exists
I am looking for something like this:
input:
d1 = {'a': 'a', 'b': 'b'}
d2 = {'b': 'c', 'c': 'd'}
output:
new_dict = {'a': ['a'], 'b': ['b', 'c'], 'c': ['d']}
I have the following code which works but I am wondering if there is a more efficient method:
First, I create a list "unique_vals", where all the values that are present in both dicts are stored.
From this, a new dictionary is created which stores all the values present in both dictionaries
unique_vals = []
new_dict = {}
for key in list(d1.keys())+list(d2.keys()) :
    unique_vals = []
    try:
        for val in d1[key]:
            try:
                for val1 in d2[key]:
                    if(val1 == val) and (val1 not in unique_vals):
                        unique_vals.append(val)
            except:
                continue
    except:
        new_dict[key] = unique_vals
    new_dict[key] = unique_vals
Then, for every value in both dictionaries that are not listed in this new dictionary, these values are appended to the new dictionary.
for key in d1.keys():
    for val in d1[key]:
        if val not in new_dict[key]:
            new_dict[key].append(val)
for key in d2.keys():
    for val in d2[key]:
        if val not in new_dict[key]:
            new_dict[key].append(val)
Maybe with a defaultdict?
>>> d1 = {'a': 'a', 'b': 'b'}
>>> d2 = {'b': 'c', 'c': 'd'}
>>> from collections import defaultdict
>>>
>>> merged = defaultdict(list)
>>> dicts = [d1, d2]
>>> for d in dicts:
...     for key, value in d.items():
...         merged[key].append(value)
...
>>> merged
defaultdict(list, {'a': ['a'], 'b': ['b', 'c'], 'c': ['d']})
This works with any number of dictionaries in the dicts list.
As a function:
def merge_dicts(dicts):
    merged = defaultdict(list)
    for d in dicts:
        for key, value in d.items():
            merged[key].append(value)
    return merged
Here is a far simpler version:
d1 = {'a': 'a', 'b': 'b'}
d2 = {'b': 'c', 'c': 'd'}
new_dict = {key: [value] for key, value in d1.items()}
for key, value in d2.items():
    try:
        new_dict[key].append(value)
    except KeyError:
        # key only exists in d2, so start a new list for it
        new_dict[key] = [value]
Output:
{'a': ['a'], 'b': ['b', 'c'], 'c': ['d']}
EDIT: Solution below is for the original question, see other answers or duplicate for updated question.
A one line solution:
def merge_dicts(*dcts):
    return {k: [d[k] for d in dcts if k in d] for k in {k for d in dcts for k in d.keys()}}
d1 = {'a': 'a', 'b': 'b'}
d2 = {'b': 'c', 'c': 'd'}
print(merge_dicts(d1, d2))
# {'c': ['d'], 'a': ['a'], 'b': ['b', 'c']}
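The question's own code also avoids storing the same value twice under a key, which the answers above do not do. A minimal sketch of that variant follows; the function name merge_unique and the third dictionary d3 are made up for illustration.

```python
def merge_unique(*dicts):
    """Hypothetical helper: collect the values per key, skipping repeats."""
    merged = {}
    for d in dicts:
        for key, value in d.items():
            bucket = merged.setdefault(key, [])
            # Only append values this key has not collected yet
            if value not in bucket:
                bucket.append(value)
    return merged


d1 = {'a': 'a', 'b': 'b'}
d2 = {'b': 'c', 'c': 'd'}
d3 = {'b': 'b', 'c': 'e'}  # 'b': 'b' repeats a value already seen in d1
merged = merge_unique(d1, d2, d3)
print(merged)
```

The membership test on a list is linear, so for very long value lists a set per key would be faster, at the cost of losing the insertion order of the values.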

Problems while zipping two lists as a dictionary?

I have the following lists:
a = ['A', 'B', 'C', 'C']
b = ['2', '3', 2, 3]
I am zipping them as follows in order to get a dict:
a_dict = dict(zip(a,b))
However, since the final object is a dict, I can't have repeated keys:
{'A': '2', 'B': '3', 'C': 3}
Which alternatives can I have in order to have something like this? (*):
{'A': '2', 'B': '3', 'C':2, 'C': 3}
I tried to convert everything as tuples, however I am using a pop to replace some keys and values from the dictionary:
data['A'] = data.pop('B')
So I can't use a tuple format. Given the two lists above, how can I get (*)?
The most common way to resolve key conflicts while still maintaining most of the benefit of the quick indexing of a dict is to turn the values into lists:
d = {}
for k, v in zip(a, b):
    d.setdefault(k, []).append(v)
so that d becomes:
{'A': ['2'], 'B': ['3'], 'C': [2, 3]}
Your desired output is not achievable with dicts. You could either resolve the key conflict as in @blhsing's answer, or use a set of tuples to get somewhat close to your desired result. Since you tried using tuples, I suspect you want to check whether a combination already exists in a data structure.
c = set(zip(a, b))
so c becomes:
{('B', '3'), ('C', 3), ('A', '2'), ('C', 2)}
Or use defaultdict from collections:
from collections import defaultdict
d = defaultdict(list)
for k, v in zip(a, b):
    d[k].append(v)
And now:
print(dict(d))
Output:
{'A': ['2'], 'B': ['3'], 'C': [2, 3]}
If you would rather not wrap single values in a list:
print({k: (v if len(v) > 1 else v[0]) for k, v in d.items()})
Output:
{'A': '2', 'B': '3', 'C': [2, 3]}
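If the repeated keys themselves have to survive, a plain list of (key, value) pairs keeps every one of them, at the cost of linear-time lookups. The sketch below is an illustration rather than one of the answers above; it also shows one way to do the renaming that the question performs with pop.

```python
a = ['A', 'B', 'C', 'C']
b = ['2', '3', 2, 3]

# A list of pairs keeps duplicates, unlike a dict
pairs = list(zip(a, b))

# Equivalent of data['A'] = data.pop('B'): relabel every 'B' pair as 'A'
renamed = [('A' if k == 'B' else k, v) for k, v in pairs]

# All values stored under a given key
c_values = [v for k, v in pairs if k == 'C']

print(pairs)
print(renamed)
print(c_values)
```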
