Dictionary difference similar to set difference - python

I have a dictionary and a list:
dictionary = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6}
remove = ['b', 'c', 'e']
I need to split "dictionary" into two dictionaries using "remove". The idea is to remove keys in "remove" from "dictionary" but instead of discarding them, I want to keep them in a new dictionary. The outcome I want is
old_dictionary = {'a':1, 'd':4, 'f':6}
new_dictionary = {'b':2, 'c':3, 'e':5}
Getting "new_dictionary" is fairly easy.
new_dictionary = {}
for key, value in dictionary.items():
    if key in remove:
        new_dictionary[key] = value
How do I find the difference between "dictionary" and "new_dictionary" to get "old_dictionary"? I guess I could loop again only with not in remove... but is there a nice trick for dictionaries similar to set difference?

One way is to use dict.pop in a loop. The dict.pop method removes a key and returns its value, so each iteration removes one key listed in remove from dictionary and adds it, with its value, to new_dict. When the loop finishes, dictionary no longer contains any of the keys in remove. Note that this mutates dictionary, and that pop raises KeyError if a key in remove is not present.
new_dict = {k: dictionary.pop(k) for k in remove}
old_dict = dictionary.copy()
Output:
>>> new_dict
{'b': 2, 'c': 3, 'e': 5}
>>> old_dict
{'a': 1, 'd': 4, 'f': 6}
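If mutating the original dictionary is not acceptable, the same split can be sketched without pop. This is only an illustration: the helper name split_dict is hypothetical, and the extra key 'z' in remove is added here just to show that missing keys are ignored rather than raising KeyError.

```python
def split_dict(source, keys_to_move):
    """Return (kept, moved) without modifying source."""
    keys_to_move = set(keys_to_move)  # set gives O(1) membership tests
    moved = {k: v for k, v in source.items() if k in keys_to_move}
    kept = {k: v for k, v in source.items() if k not in keys_to_move}
    return kept, moved

dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}
remove = ['b', 'c', 'e', 'z']  # 'z' is not a key (illustrative) and is ignored

old_dict, new_dict = split_dict(dictionary, remove)
print(old_dict)  # {'a': 1, 'd': 4, 'f': 6}
print(new_dict)  # {'b': 2, 'c': 3, 'e': 5}
```

The trade-off is two passes over the dictionary instead of one pop per removed key, in exchange for leaving dictionary untouched.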

Just add else
new_dictionary = {}
old_dictionary = {}
for key, value in dictionary.items():
    if key in remove:
        new_dictionary[key] = value
    else:
        old_dictionary[key] = value

Use else: to put it in the other dictionary.
new_dictionary = {}
old_dictionary = {}
for key, value in dictionary.items():
    if key in remove:
        new_dictionary[key] = value
    else:
        old_dictionary[key] = value

The views returned by dict.keys() and dict.items() support set operations (items() only when all values are hashable), and the other operand may be any iterable:
>>> dictionary = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6}
>>> remove = list('bce')
>>> new_dict = {key: dictionary[key] for key in remove}
>>> new_dict
{'b': 2, 'c': 3, 'e': 5}
>>> dict(dictionary.items() - new_dict.items())
{'d': 4, 'f': 6, 'a': 1}
However, in terms of performance, this method is not as good as the answer with the highest score.
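As a variation on the idea above, the set operations can be applied to the key view and the remove list directly, which avoids requiring hashable values. This is a sketch only; the names kept_keys and moved_keys are introduced here for illustration.

```python
dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}
remove = ['b', 'c', 'e']

# Key views accept any iterable as the other operand and return a set.
kept_keys = dictionary.keys() - remove    # keys not listed in remove
moved_keys = dictionary.keys() & remove   # keys listed in remove

old_dict = {k: dictionary[k] for k in kept_keys}
new_dict = {k: dictionary[k] for k in moved_keys}

print(old_dict == {'a': 1, 'd': 4, 'f': 6})  # True
print(new_dict == {'b': 2, 'c': 3, 'e': 5})  # True
```

Because the intermediate results are sets, the key order of old_dict and new_dict is not guaranteed; iterate over sorted(kept_keys) if a stable order matters.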

Related

How to get the differences details of two different dictionaries?

How can I perform the operations below on two different dictionaries?
dict2 can be very large, and I believe set operations are somewhat slower.
1. Get the values for dict1's keys from dict2, whether the values are the same or different:
dict1 = {'a': 1, 'b': 2}
dict2 = {'a': 3, 'b': 4, 'd': 5}
# Output: {'a': 3, 'b': 4}
dict1 = {'a': 1, 'b': 2}
dict2 = {'a': 1, 'b': 2, 'd': 5}
# Output: {'a': 1, 'b': 2}
2. Get the keys that are present in dict1 but not in dict2, with output as shown below:
dict1 = {'a': 1, 'b': 2, 'c': 6}
dict2 = {'a': 3, 'b': 4, 'd': 5}
# Output: {'c': 6}
3. When all the keys of dict1 exist in dict2, get the values for dict1's keys from dict2:
dict1 = {'a': 1, 'b': 2, 'c': 6}
dict2 = {'a': 3, 'b': 4, 'c': 6, 'd': 5}
# Output: {'a': 3, 'b': 4, 'c': 6}
I tried the methods below:
def keys_in_dict1_but_not_in_dict2(dict1, dict2):
    d1 = {}
    for key in dict1.keys():
        if key not in dict2:
            d1[key] = dict1[key]
    return d1
def keys_in_dict1_and_dict2(dict1, dict2):
    d1 = {}
    for key in dict1.keys():
        if key in dict2:
            d1[key] = dict2[key]
    return d1
I could have used sets, but I believe that is slower as the dictionaries grow. The conventional loops above may also take longer as the dictionaries grow.
What would be the most efficient way to handle this?
Is this the right approach, or is there a better way to handle these scenarios?
The problem is as follows: iterate through the keys of one dictionary, decide if they are in another, and extract the appropriate values. You can either do both steps in one loop, or separate the key checking from the value extraction. The choice will depend on the number of keys and the number of values you want to copy.
To do the operations together, you would use the loops you show, possibly optimized as comprehensions. To separate out the operations, you would use the fact that dict.keys() returns a set-like view backed by the dictionary itself. This allows you to do selection much faster than converting to a set.
1. Keys of dict1 that appear in dict2 are dict1.keys() & dict2.keys(). Your two options are therefore
{key: dict2[key] for key in dict1.keys() & dict2.keys()}
and
{key: dict2[key] for key in dict1 if key in dict2}
2. Keys of dict1 that don't appear in dict2 are dict1.keys() - dict2.keys(). Your two options are therefore
{key: dict1[key] for key in dict1.keys() - dict2.keys()}
and
{key: dict1[key] for key in dict1 if key not in dict2}
3. In this case, no optimization is possible: all the keys need to be iterated over:
{key: dict2[key] for key in dict1}
To find out which option works fastest for 1. and 2., you will have to run a benchmark specific to your data size and ratio of overlap. For example, if len(dict1.keys() & dict2.keys()) < len(dict1), you will save a lot of overhead in __getitem__ by using the first approach for 1. But if most keys are present, then the overhead of doing a set operation on keys() may overtake the computation.
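A benchmark along those lines could be sketched with the standard timeit module. The dictionary sizes and the amount of overlap below are hypothetical, chosen only for illustration; substitute data that resembles your own before drawing conclusions.

```python
import timeit

# Hypothetical data: dict1 is small, dict2 is large, half of dict1 overlaps.
dict1 = {i: i for i in range(1_000)}
dict2 = {i: -i for i in range(500, 100_000)}

def with_key_views():
    # Select the shared keys first, then look up the values.
    return {key: dict2[key] for key in dict1.keys() & dict2.keys()}

def with_membership_test():
    # Check membership and look up the value in a single loop.
    return {key: dict2[key] for key in dict1 if key in dict2}

# Both approaches must agree before it makes sense to time them.
assert with_key_views() == with_membership_test()

for func in (with_key_views, with_membership_test):
    seconds = timeit.timeit(func, number=200)
    print(f"{func.__name__}: {seconds:.4f} s for 200 runs")
```

The timings depend on the interpreter version and the data, so no expected numbers are shown here.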
Try using dictionary comprehensions.
1. Get the values for dict1's keys from dict2:
part1 = {key: dict2[key] for key in dict1 if key in dict2}
2. Get the keys that are present in dict1 but not in dict2:
part2 = {key: value for key, value in dict1.items() if key not in dict2}
3. Get the values for dict1's keys from dict2 (when all keys of dict1 exist in dict2):
part3 = {key: dict2[key] for key in dict1}

Removing duplicate values in Python dictionaries? [duplicate]

More a concept question than a direct coding one. But say I had a dictionary akin to this one.
Dict = {'A':1, 'B':3, 'C':3, 'D':4, 'E':1}
Dict2 = {}
And I wanted to take all instances where two keys had the same value and put them in a different dictionary. What sort of process would be the most efficient? I've tried measures like
for value in Dict.items():
    for a in value:
        if a != b:
            continue
        else:
            Dict2.append(a)
            continue
But to no luck.
You can do something like this:
Dict = {'A':1, 'B':3, 'C':3, 'D':4, 'E':1}
result = {}
for k, v in Dict.items():
    result.setdefault(v, set()).add(k)
print("Original: ")
print(Dict)
print("------------")
print("Result: ")
print(result)
Output (the order of elements within each set may vary):
Original:
{'A': 1, 'B': 3, 'C': 3, 'D': 4, 'E': 1}
------------
Result:
{1: {'A', 'E'}, 3: {'B', 'C'}, 4: {'D'}}
Old school, with regular loops. Could maybe be done with list or dict comprehensions, but this is easy and obvious:
original = {'A':1, 'B':3, 'C':3, 'D':4, 'E':1}  # avoid shadowing the built-in dict
# Make a reverse dictionary that maps values to lists of keys
# that have that value in the original dict
reverse_dict = {}
for k, v in original.items():
    reverse_dict.setdefault(v, []).append(k)
# Show the reverse dict
# print(reverse_dict)
# Move entries for keys that had duplicates from the original
# dict to a new dict
dups_dict = {}
for value, keys in reverse_dict.items():
    if len(keys) > 1:  # if there was more than one key with this value
        for key in keys:  # for each of those keys
            dups_dict[key] = value  # copy it to the new dict
            del original[key]  # and remove it from the original dict
# Show the original dict and the new one
print(original)
print(dups_dict)
Result:
{'D': 4}
{'A': 1, 'E': 1, 'B': 3, 'C': 3}
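The same split can also be sketched without mutating the input by counting values first with collections.Counter. This is an alternative to the reverse-dictionary approach above, not a restatement of it; the names data, duplicates, and uniques are illustrative, and the values are assumed to be hashable.

```python
from collections import Counter

data = {'A': 1, 'B': 3, 'C': 3, 'D': 4, 'E': 1}

# Count how many keys share each value (values must be hashable).
value_counts = Counter(data.values())

# Entries whose value occurs more than once, and entries whose value is unique.
duplicates = {k: v for k, v in data.items() if value_counts[v] > 1}
uniques = {k: v for k, v in data.items() if value_counts[v] == 1}

print(duplicates)  # {'A': 1, 'B': 3, 'C': 3, 'E': 1}
print(uniques)     # {'D': 4}
```

Both result dictionaries keep the insertion order of data, which is why the duplicates appear as A, B, C, E rather than grouped by value.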

How to get dictionary keys and values if both keys are in two separated dictionaries?

I would like to get a new dictionary with keys only if both dictionaries have those keys in them, and then get the values of the second one.
# example:
Dict1 = {'A':3, 'B':5, 'C':2, 'D':5}
Dict2 = {'B':3, 'C':1, 'K':5}
# result--> {'B':3, 'C':1}
As a dictionary comprehension:
>>> {k:v for k, v in Dict2.items() if k in Dict1}
{'B': 3, 'C': 1}
Or use filter:
>>> dict(filter(lambda x: x[0] in Dict1, Dict2.items()))
{'B': 3, 'C': 1}
Just another solution that doesn't use a comprehension. This function loops through the keys k in Dict1 and tries to add Dict2[k] to a new dict, which is returned at the end. I think the try-except approach is "pythonic".
def shared_keys(a, b):
    """
    returns dict of all KVs in b which are also in a
    """
    shared = {}
    for k in a.keys():
        try:
            shared[k] = b[k]
        except KeyError:  # k is not in b; catch only the expected error
            pass
    return shared
Dict1 = {'A':3, 'B':5, 'C':2, 'D':5}
Dict2 = {'B':3, 'C':1, 'K':5}
print(shared_keys(Dict1, Dict2))
# {'B': 3, 'C': 1}

Python - Comparing the values of keys [closed]

Say we have the following two dictionaries in Python:
dict1 = {'a':1, 'b':2, 'c':3, 'd': 4}
dict2 = {'c':2, 'd':1, 'b':2, 'a':1}
Now, I'm assuming that the values in dict1 are the correct values. How can I compare dict2 to dict1, such that if the value of the key in dict2 is similar to that in dict1 the program returns True, and if it is different, it returns False?
Thanks.
If by similar you mean equal, then you can just directly compare them:
def compare_dictionaries(correct_dictionary, dictionary_to_check):
    for key, correct_value in correct_dictionary.items():
        if dictionary_to_check[key] != correct_value:
            return False
    return True
The above will raise a KeyError if the dictionary being checked is missing a key, and it does nothing about extra keys that the dictionary being checked may contain.
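One way to cover both gaps is sketched below: missing keys count as a mismatch instead of raising, and extra keys are reported separately. The function name compare_to_reference and the sentinel _MISSING are hypothetical names introduced for this illustration.

```python
_MISSING = object()  # sentinel, so a stored None is not mistaken for "absent"

def compare_to_reference(reference, candidate):
    """Map each key of reference to True if candidate holds the same value."""
    return {key: candidate.get(key, _MISSING) == value
            for key, value in reference.items()}

dict1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4}  # reference (correct) values
dict2 = {'c': 2, 'd': 1, 'b': 2, 'a': 1}  # values to check

per_key = compare_to_reference(dict1, dict2)
extra_keys = dict2.keys() - dict1.keys()  # keys only the checked dict has

print(per_key)                # {'a': True, 'b': True, 'c': False, 'd': False}
print(all(per_key.values()))  # False
print(extra_keys)             # set()
```

Returning a per-key result rather than a single boolean makes it possible to see which keys differ; all(per_key.values()) recovers the single True/False answer.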
I am guessing that dict1 key: values are what everything is being compared to.
dict2 = {'c':2, 'd':1, 'b':2, 'a':1}
def compareDict(d2):
    dict1 = {'a':1, 'b':2, 'c':3, 'd': 4}
    for key in d2:
        if d2[key] == dict1[key]:
            print("{} key has similar value of {}; It is True".format(key, d2[key]))
        else:
            # Adjacent string literals are joined before .format is applied
            print("{} key does not have similar value as dict1 key, it has a "
                  "value of {}; It is False".format(key, d2[key]))
compareDict(dict2)
First check that the keys are the same using the XOR (symmetric difference) operation, and then compare the values for each key of the dictionary. Hope this will help.
dict1 = {'a':2, 'b':2, 'c':3, 'd': 4}
dict2 = {'c':2, 'd':1, 'b':2, 'a':1}
# XOR operator for checking that all keys are the same.
check_keys = set(dict1.keys()) ^ set(dict2.keys())
keys = set(dict2.keys())
check = False
# a == 0 if all keys are the same.
a = len(check_keys)
# If all keys are the same, then check the value for each key.
if not a:
    check = True
    for key in keys:
        if dict1[key] == dict2[key]:
            pass
        else:
            check = False
            break
print(check)
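For comparison, here is a shorter sketch of the same check. It relies on two facts about Python 3 dictionaries: key views support ^ directly, so the set() conversions are optional, and == between two dicts is already True only when both the keys and the values match. The names same_keys and differing are illustrative.

```python
dict1 = {'a': 2, 'b': 2, 'c': 3, 'd': 4}
dict2 = {'c': 2, 'd': 1, 'b': 2, 'a': 1}

# An empty symmetric difference means both dicts have exactly the same keys.
same_keys = not (dict1.keys() ^ dict2.keys())

# Shared keys whose values differ between the two dicts.
differing = {k for k in dict1.keys() & dict2.keys() if dict1[k] != dict2[k]}

print(same_keys)          # True
print(sorted(differing))  # ['a', 'c', 'd']
print(dict1 == dict2)     # False
```

The explicit version is still useful when you need to know which keys differ; a plain dict1 == dict2 only gives the final answer.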
Compare dictionaries by size, by keys, and (optionally, controlled by a boolean flag) by the value for each key.
def compareDictionaries(source, target, compareValues):
    ## compare if they are same
    if (source == target):
        return True
    ## covers scenario where target has more keys than source
    if (len(target) > len(source)):
        print("target has more keys than source")
        return False
    ## iterate over target keys to check if they exists in
    ## source and depending on compareValues flag compare key's
    ## values
    for key, values in target.items():
        if (key in source.keys()):
            if (compareValues):
                if(not (source[key] == target[key])):
                    print ("key(" + key + ") don't have same values!")
                    return False
        else:
            print ("key(" + key + ") not found in source!")
            return False
    return True
dict1 = {'a':1, 'b':2, 'c':3, 'd': 4}
dict2 = {'c':5, 'd':4, 'b':2, 'a':1}
if (compareDictionaries(dict1, dict2, False)):
    print("Pass")
else:
    print("Fail")
First implementation
# Both dictionaries can have different keys
dict1 = {'a':1, 'b':2, 'c':3, 'd': 4} # correct dict
dict2 = {'c':2, 'd':1, 'b':2, 'a':1} # test dict
if all(map(lambda x: dict2.get(x, "dict2") == dict1.get(x, "dict1"), dict2)):
    print("same")
else:
    print("different")
Test case #1 gives "same"
dict1 = {'a':1, 'b':2, 'c':3, 'd': 4, 'e':5} # correct dict
dict2 = {'c':3, 'd':4, 'b':2, 'a':1} # test dict
Test case #2 gives "different"
dict1 = {'a':1, 'b':2, 'c':3, 'd': 4 } # correct dict
dict2 = {'c':3, 'd':4, 'b':2, 'a':1, 'e':5} # test dict
Second implementation (different from first, see test results)
# Both dictionaries can have different keys
dict1 = {'a':1, 'b':2, 'c':3, 'd': 4} # correct dict
dict2 = {'c':2, 'd':1, 'b':2, 'a':1} # test dict
if set(dict1.keys()) == set(dict2.keys()) and set(dict1.values()) == set(dict2.values()):
    print("same")
else:
    print("different")
Test case #1 gives "different"
dict1 = {'a':1, 'b':2, 'c':3, 'd': 4, 'e':5} # correct dict
dict2 = {'c':3, 'd':4, 'b':2, 'a':1} # test dict
Test case #2 gives "different"
dict1 = {'a':1, 'b':2, 'c':3, 'd': 4 } # correct dict
dict2 = {'c':3, 'd':4, 'b':2, 'a':1, 'e':5} # test dict

concatenate dictionaries over a key

I have a list of dictionaries (with some data fetched from an API) assume:
alist = [{'a':1, 'b':2, 'c':3}, {'a':1, 'b':2, 'c':35}, {'a':1, 'b':2, 'c':87}..]
Several dictionaries in alist are repeated, except that one key has a different value in each of them. So the question is:
What's the easiest way to combine those dictionaries by keeping separate values in a list?
like:
alist = [{'a':1, 'b':2, 'c':[3, 35, 87]}...]
Update: I have a list that specifies the repeated keys, like:
repeated_keys = ['c',...]
Use defaultdict (it is faster) and generate a dictionary from it; you can also easily convert that dictionary into a list. You can modify the loop over i.keys() to filter keys.
from collections import defaultdict as df
d = df(list)
alist = [{'a':1, 'b':2, 'c':3}, {'a':1, 'b':2, 'c':35}, {'a':1, 'b':2, 'c':87}]
for i in alist:
    for j in i.keys():
        d[j].append(i[j])
print(dict(d.items()))
Output:
{'a': [1, 1, 1], 'b': [2, 2, 2], 'c': [3, 35, 87]}
If you want to get rid of repeated elements, use a dict comprehension and set (the order of elements within each list may then vary, because sets are unordered):
>>> {k: list(set(v)) for k, v in d.items()}
{'a': [1], 'b': [2], 'c': [35, 3, 87]}
You could use a list comprehension:
result = [alist[0].copy()]
result[0]['c'] = [d['c'] for d in alist]
Note that there is little point in making this a list again; you combined everything into one dictionary, after all:
result = dict(alist[0], c=[d['c'] for d in alist])
If you have multiple repeated keys, you have two options:
1. Loop and get each key out:
result = alist[0].copy()
for key in repeated_keys:
    result[key] = [d[key] for d in alist]
2. Make all keys lists, that way you don't have to keep consulting your list of repeated keys:
result = {}
for key in alist[0]:
    result[key] = [d[key] for d in alist]
The latter option is alternatively implemented by iterating over alist just once:
result = {}
for d in alist:
    for key, value in d.items():
        result.setdefault(key, []).append(value)
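The options above assume that every dictionary in alist belongs to the same group. If alist mixes several groups, one possible sketch is to group by the non-repeated key/value pairs first. The function name merge_on_repeated_keys and the fourth sample dictionary are hypothetical additions for illustration, and the sketch assumes the non-repeated values are hashable.

```python
def merge_on_repeated_keys(dicts, repeated_keys):
    """Group dicts by their non-repeated pairs; collect repeated-key values in lists."""
    merged = {}  # maps the fixed (key, value) pairs to the merged dict
    for d in dicts:
        # Hashable identity of the group: every pair whose key is not repeated.
        group_id = tuple(sorted((k, v) for k, v in d.items()
                                if k not in repeated_keys))
        target = merged.setdefault(group_id, dict(group_id))
        for key in repeated_keys:
            if key in d:
                target.setdefault(key, []).append(d[key])
    return list(merged.values())

alist = [{'a': 1, 'b': 2, 'c': 3}, {'a': 1, 'b': 2, 'c': 35},
         {'a': 1, 'b': 2, 'c': 87},
         {'a': 9, 'b': 2, 'c': 4}]  # last entry is a hypothetical second group
repeated_keys = ['c']

print(merge_on_repeated_keys(alist, repeated_keys))
# [{'a': 1, 'b': 2, 'c': [3, 35, 87]}, {'a': 9, 'b': 2, 'c': [4]}]
```

Sorting the pairs makes the group identity independent of the key order inside each input dictionary.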
from collections import defaultdict
con_dict = defaultdict(list)
alist = [{'a':1, 'b':2, 'c':3}, {'a':1, 'b':2, 'c':35}, {'a':1, 'b':2, 'c':87}]
for curr_dict in alist:
    for k, v in curr_dict.items():
        con_dict[k].append(v)
con_dict = dict(con_dict)
We create a defaultdict of type list, then iterate over the items and append each value under the right key.
It is possible to get your result. You have to test whether an item has different values, in which case you create a list, or otherwise keep the value as is.
repeated_keys is used to store repeated keys and count how many times they are repeated.
alist = [{'a':1, 'b':2, 'c':3}, {'a':1, 'b':2, 'c':35}, {'a':1, 'b':2, 'c':87}]
z = {}
repeated_keys = {}
for d in alist:
    for key in d:
        if key in z:
            if isinstance(z[key], list):
                if d[key] not in z[key]:
                    repeated_keys[key] += 1
                    z[key].append(d[key])
            else:
                if z[key] != d[key]:
                    repeated_keys[key] = 1
                    z[key] = [z[key], d[key]]
        else:
            z[key] = d[key]
print('dict:', z)
print('Repeated keys:', repeated_keys)
output:
dict: {'a': 1, 'b': 2, 'c': [3, 35, 87]}
Repeated keys: {'c': 2}
if:
alist = [{'a':1, 'b':2, 'c':3}, {'a':1, 'b':2, 'c':35}, {'a':1, 'b':2, 'c':87}, {'a':3,'b':2}]
output should be:
dict: {'a': [1, 3], 'b': 2, 'c': [3, 35, 87]}
Repeated keys: {'c': 2, 'a': 1}
