How to get the differences details of two different dictionaries?

How can I perform the operations below on two different dictionaries?
dict2 can be very large, and I believe set operations are somewhat slower.
1. Get the values for the keys of dict1 from dict2, whether the values are the same or different:
dict1 = {'a': 1, 'b': 2}
dict2 = {'a': 3, 'b': 4, 'd': 5}
# Output: {'a': 3, 'b': 4}
dict1 = {'a': 1, 'b': 2}
dict2 = {'a': 1, 'b': 2, 'd': 5}
# Output: {'a': 1, 'b': 2}
2. Get the keys that are present in dict1 but not in dict2, with output as shown below:
dict1 = {'a': 1, 'b': 2, 'c': 6}
dict2 = {'a': 3, 'b': 4, 'd': 5}
# Output: {'c': 6}
3. When all the keys of dict1 exist in dict2, get the values for the keys of dict1 from dict2:
dict1 = {'a': 1, 'b': 2, 'c': 6}
dict2 = {'a': 3, 'b': 4, 'c': 6, 'd': 5}
# Output: {'a': 3, 'b': 4, 'c': 6}
I tried the methods below:
def keys_in_dict1_but_not_in_dict2(dict1, dict2):
    d1 = {}
    for key in dict1:
        if key not in dict2:
            d1[key] = dict1[key]
    return d1

def keys_in_dict1_and_dict2(dict1, dict2):
    d1 = {}
    for key in dict1:
        if key in dict2:
            d1[key] = dict2[key]
    return d1
I could have used sets, but I believe that gets slower as the dictionaries grow. The conventional loops above may also take longer as the dictionaries grow.
What would be the most efficient way to handle this?
Is this the right approach, or is there a better one for these scenarios?

The problem is as follows: iterate through the keys of one dictionary, decide if they are in another, and extract the appropriate values. You can either do both steps in one loop, or separate the key checking from the value extraction. The choice will depend on the number of keys and the number of values you want to copy.
To do the operations together, you would use the loops you show, possibly optimized as comprehensions. To separate out the operations, you would use the fact that dict.keys() returns a set-like view backed by the dictionary itself. This allows you to do selection much faster than converting to a set.
1. Keys of dict1 that appear in dict2 are dict1.keys() & dict2.keys(). Your two options are therefore
{key: dict2[key] for key in dict1.keys() & dict2.keys()}
and
{key: dict2[key] for key in dict1 if key in dict2}
2. Keys of dict1 that don't appear in dict2 are dict1.keys() - dict2.keys(). Your two options are therefore
{key: dict1[key] for key in dict1.keys() - dict2.keys()}
and
{key: dict1[key] for key in dict1 if key not in dict2}
3. When every key of dict1 exists in dict2, no optimization is possible, because all the keys need to be iterated over:
{key: dict2[key] for key in dict1}
To find out which option works fastest for 1. and 2., you will have to run a benchmark specific to your data size and ratio of overlap. For example, if len(dict1.keys() & dict2.keys()) < len(dict1), you will save a lot of overhead in __getitem__ by using the first approach for 1. But if most keys are present, the overhead of doing a set operation on keys() may outweigh that saving.
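One way to run such a benchmark is sketched below with the standard timeit module. The dictionary sizes and the amount of overlap are made-up assumptions, and via_view and via_loop are hypothetical helper names; substitute data shaped like your own.

```python
import timeit

# Hypothetical test data: dict1 is small, dict2 is large, half of dict1 overlaps.
dict1 = {i: i for i in range(0, 2_000)}
dict2 = {i: -i for i in range(1_000, 100_000)}


def via_view(d1, d2):
    # Intersect the key views first, then look up only the shared keys.
    return {key: d2[key] for key in d1.keys() & d2.keys()}


def via_loop(d1, d2):
    # Test membership and look up the value in a single pass over d1.
    return {key: d2[key] for key in d1 if key in d2}


# Both approaches must agree before their timings are worth comparing.
assert via_view(dict1, dict2) == via_loop(dict1, dict2)

for func in (via_view, via_loop):
    seconds = timeit.timeit(lambda: func(dict1, dict2), number=100)
    print(f"{func.__name__}: {seconds:.4f} s for 100 runs")
```

The timings depend on the Python version and on the data, so treat the numbers as a comparison for one specific workload rather than a general ranking.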

Try using dictionary comprehensions.
Get the values for the keys of dict1 from dict2:
part1 = {key: dict2[key] for key in dict1 if key in dict2}
Get the keys that are present in dict1 but not in dict2:
part2 = {key: value for key, value in dict1.items() if key not in dict2}
Get the values for the keys of dict1 from dict2 when all of them exist in dict2:
part3 = {key: dict2[key] for key in dict1}

Related

Dictionary difference similar to set difference

I have a dictionary and a list:
dictionary = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6}
remove = ['b', 'c', 'e']
I need to split "dictionary" into two dictionaries using "remove". The idea is to remove keys in "remove" from "dictionary" but instead of discarding them, I want to keep them in a new dictionary. The outcome I want is
old_dictionary = {'a':1, 'd':4, 'f':6}
new_dictionary = {'b':2, 'c':3, 'e':5}
Getting "new_dictionary" is fairly easy.
new_dictionary = {}
for key, value in dictionary.items():
    if key in remove:
        new_dictionary[key] = value
How do I find the difference between "dictionary" and "new_dictionary" to get "old_dictionary"? I guess I could loop again only with not in remove... but is there a nice trick for dictionaries similar to set difference?

One way is to use dict.pop in a loop. The dict.pop method removes the key and returns its value. So in each iteration, we remove a key listed in remove from dictionary and add this key along with its value to new_dict. At the end of the iteration, dictionary will have all keys in remove removed from it.
new_dict = {k: dictionary.pop(k) for k in remove}
old_dict = dictionary.copy()
Output:
>>> new_dict
{'b': 2, 'c': 3, 'e': 5}
>>> old_dict
{'a': 1, 'd': 4, 'f': 6}
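Note that dict.pop with a single argument raises KeyError when a key in remove is missing, and that the comprehension above removes those keys from the original dictionary. If either is a concern, a non-mutating sketch might look like this; split_dict is a hypothetical helper name, not something from the answer above.

```python
def split_dict(source, keys_to_move):
    """Return (kept, moved) without modifying source."""
    kept = dict(source)  # shallow copy, so source stays untouched
    moved = {}
    for key in keys_to_move:
        if key in kept:  # skip absent keys instead of raising KeyError
            moved[key] = kept.pop(key)
    return kept, moved


dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}
remove = ['b', 'c', 'e', 'z']  # 'z' is deliberately absent from dictionary

old_dict, new_dict = split_dict(dictionary, remove)
print(old_dict)  # {'a': 1, 'd': 4, 'f': 6}
print(new_dict)  # {'b': 2, 'c': 3, 'e': 5}
```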

Just add an else branch:
new_dictionary = {}
old_dictionary = {}
for key, value in dictionary.items():
    if key in remove:
        new_dictionary[key] = value
    else:
        old_dictionary[key] = value

Use else: to put the remaining items in the other dictionary.
new_dictionary = {}
old_dictionary = {}
for key, value in dictionary.items():
    if key in remove:
        new_dictionary[key] = value
    else:
        old_dictionary[key] = value

The views returned by dict.keys() and dict.items() support set operations, including with other iterables:
>>> dictionary = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6}
>>> remove = list('bce')
>>> new_dict = {key: dictionary[key] for key in remove}
>>> new_dict
{'b': 2, 'c': 3, 'e': 5}
>>> dict(dictionary.items() - new_dict.items())
{'d': 4, 'f': 6, 'a': 1}
However, in terms of performance, this method is not as good as the answer with the highest score.
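One caveat worth knowing: set operations on dict.items() hash every (key, value) pair, so they raise TypeError when a value is unhashable (a list, for example). Key views do not have that restriction. The following is a small sketch of the key-based variant, using made-up data:

```python
dictionary = {'a': [1], 'b': [2], 'c': [3], 'd': [4]}  # list values are unhashable
remove = ['b', 'c']

new_dict = {key: dictionary[key] for key in remove}

items_diff_failed = False
try:
    dict(dictionary.items() - new_dict.items())
except TypeError as exc:
    items_diff_failed = True
    print("items() difference failed:", exc)

# Only the keys are hashed here, so unhashable values are fine.
# Note that a set difference does not preserve the original key order.
old_dict = {key: dictionary[key] for key in dictionary.keys() - set(remove)}
print(old_dict == {'a': [1], 'd': [4]})  # True
```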

How can I merge multiple dictionaries and add the values of the same key? (Python) [duplicate]

Suppose I have the following dictionaries:
dict1 = {'a': 10, 'b': 8, 'c':3}
dict2 = {'c': 4}
dict3 = {'e':9, 'a':3}
I'm trying to merge them in a way such that the new (combinational) dictionary contains all the keys and all the values of the same key are added together. For instance, in this case, my desired output looks like:
dict = {'a': 13, 'b': 8, 'c':7, 'e':9}
It looks like the update() method doesn't work, since some values are overwritten. I also tried ChainMap and encountered the same issue. How can I merge multiple dictionaries and add the values of the same key? Thanks a lot :)

Here's a dictionary comprehension that achieves this using itertools.chain.from_iterable() and sum(). I create a set of keys from all three dicts, then iterate over this set inside the dictionary comprehension to get the sum of values per key.
>>> from itertools import chain
>>> dict1 = {'a': 10, 'b': 8, 'c':3}
>>> dict2 = {'c': 4}
>>> dict3 = {'e':9, 'a':3}
>>> my_dicts = dict1, dict2, dict3
>>> {k: sum(dd.get(k, 0) for dd in my_dicts) for k in set(chain.from_iterable(d.keys() for d in my_dicts))}
{'a': 13, 'e': 9, 'b': 8, 'c': 7}

The code below should do the trick:
dict1 = {'a': 10, 'b': 8, 'c':3}
dict2 = {'c': 4}
dict3 = {'e':9, 'a':3}
multiple_dict = [dict1, dict2, dict3]
final_dict = {}
for d in multiple_dict:  # named d rather than dict, which would shadow the built-in
    for key, value in d.items():
        if key in final_dict:
            final_dict[key] += value
        else:
            final_dict[key] = value
print(final_dict)
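The same loop can be shortened with collections.defaultdict, which removes the explicit membership test. This is a minimal sketch; sum_dicts is a hypothetical helper name.

```python
from collections import defaultdict


def sum_dicts(*dicts):
    """Add up the values of matching keys across any number of dictionaries."""
    totals = defaultdict(int)  # missing keys start at 0
    for d in dicts:
        for key, value in d.items():
            totals[key] += value
    return dict(totals)  # hand back a plain dict


dict1 = {'a': 10, 'b': 8, 'c': 3}
dict2 = {'c': 4}
dict3 = {'e': 9, 'a': 3}

print(sum_dicts(dict1, dict2, dict3))  # {'a': 13, 'b': 8, 'c': 7, 'e': 9}
```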

Filter list of dictionaries with duplicate values at a certain key

I have a list of dictionaries as follows:
dic=[{'a':1,'b':2,'c':3},{'a':9,'b':2,'c':2},{'a':5,'b':1,'c':2}]
I would like to filter out those dictionaries with recurring values for certain keys. In this case, the key 'b' has duplicate values in the first and second dictionaries in the list, so I would like to remove the second entry.
Quite simply, I would like my filtered list to look as follows:
filt_dic=[{'a':1,'b':2,'c':3},{'a':5,'b':1,'c':2}]
Is there a pythonic way to do this?

Use another dictionary (or defaultdict) to keep track of which values you have already seen for which keys. This dictionary will hold one set (for fast lookup) for each key of the original dicts.
from collections import defaultdict

dic=[{'a':1,'b':2,'c':3},{'a':9,'b':2,'c':2},{'a':5,'b':1,'c':2}]
seen = defaultdict(set)
filt_dic = []
for d in dic:
    # keep d only if none of its values has been seen under the same key
    if not any(d[k] in seen[k] for k in d):
        filt_dic.append(d)
        for k in d:
            seen[k].add(d[k])
print(filt_dic)
Afterwards, filt_dic is [{'a': 1, 'b': 2, 'c': 3}, {'a': 5, 'b': 1, 'c': 2}] and seen holds {'a': {1, 5}, 'b': {1, 2}, 'c': {2, 3}}.
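The code above rejects a dictionary when any of its keys repeats an earlier value. If only one key should be checked, as with 'b' in the question, a narrower sketch could look like this; unique_by is a hypothetical helper name, and it assumes every dictionary contains the chosen key.

```python
def unique_by(dicts, key):
    """Keep only the first dictionary seen for each value of `key`."""
    seen = set()
    result = []
    for d in dicts:
        value = d[key]  # assumes every dictionary has this key
        if value not in seen:
            seen.add(value)
            result.append(d)
    return result


dic = [{'a': 1, 'b': 2, 'c': 3}, {'a': 9, 'b': 2, 'c': 2}, {'a': 5, 'b': 1, 'c': 2}]
print(unique_by(dic, 'b'))
# [{'a': 1, 'b': 2, 'c': 3}, {'a': 5, 'b': 1, 'c': 2}]
```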

Create new dictionary out of two existing dictionaries

Consider two dictionaries:
dict1 = {'a': 35, 'b': 39, 'c': 20} # (with the values as integers)
dict2 = {'a': 23, 'c': 12}
I want to obtain the following:
dict_new = {'a': 0.657, 'c': 0.6} # (with the values as floats, as values of dict2/dict1)

You can get the common keys using dict2.keys() & dict1 and then just do the division:
dict1 = {'a':35, 'b': 39, 'c':20} #(with the values as integers)
dict2 = {'a':23, 'c':12}
d3 = {k: dict2[k] / dict1[k] for k in dict2.keys() & dict1}
If you want the values rounded to three decimal places, use round(dict2[k] / dict1[k], 3). If the keys from dict2 are always present in dict1, you can simply iterate over the items of dict2:
d = {k:v / dict1[k] for k,v in dict2.items()}
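Neither expression guards against a zero in dict1, which would raise ZeroDivisionError. A hedged sketch that combines the key intersection, the rounding, and a zero check follows; ratio_dict and its ndigits parameter are hypothetical names, and skipping zero denominators is just one possible policy.

```python
def ratio_dict(numerators, denominators, ndigits=3):
    """Divide the values of shared keys, skipping zero denominators."""
    return {
        key: round(numerators[key] / denominators[key], ndigits)
        for key in numerators.keys() & denominators.keys()
        if denominators[key] != 0
    }


dict1 = {'a': 35, 'b': 39, 'c': 20}
dict2 = {'a': 23, 'c': 12}

result = ratio_dict(dict2, dict1)
print(result == {'a': 0.657, 'c': 0.6})  # True
```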

dic_new = {}
for key in dict2:
    dic_new[key] = float(dict2[key]) / dict1[key]

Updating a dictionary

I have created three dictionaries: dict1, dict2, and dict3. I want to update dict1 with dict2 first, and then update the resulting dictionary with dict3. I am not sure why the counts are not adding up.
def wordcount_directory(directory):
    dict = {}
    filelist=[os.path.join(directory,f) for f in os.listdir(directory)]
    dicts=[wordcount_file(file) for file in filelist]
    dict1=dicts[0]
    dict2=dicts[1]
    dict3=dicts[2]
    for k,v in dict1.iteritems():
        if k in dict2.keys():
            dict1[k]+=1
        else:
            dict1[k]=v
    for k1,v1 in dict1.iteritems():
        if k1 in dict3.keys():
            dict1[k1]+=1
        else:
            dict1[k1]=v1
    return dict1
print wordcount_directory("C:\\Users\\Phil2040\\Desktop\\Word_count")

Maybe I am not understanding your question correctly, but are you trying to add all the values from each of the dictionaries together into one final dictionary? If so:
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 5, 'c': 1, 'd': 9}
dict3 = {'d': 1, 'e': 7}
def add_dict(to_dict, from_dict):
    for key, value in from_dict.iteritems():
        to_dict[key] = to_dict.get(key, 0) + value
result = dict(dict1)
add_dict(result, dict2)
add_dict(result, dict3)
print result
This yields: {'a': 1, 'c': 4, 'b': 7, 'e': 7, 'd': 10}
It would be really helpful to post what the expected outcome should be for your question.
EDIT:
For an arbitrary number of dictionaries:
result = dict(dicts[0])
for dict_sum in dicts[1:]:
    add_dict(result, dict_sum)
print(result)
If you really want to fix the code from your original question in its current form:
1. You are using dict1[k]+=1 when you should be performing dict1[k]+=dict2.get(k, 0). The introduction of get removes the need to check for the key's existence with an if statement.
2. You need to iterate through dict2 and dict3 to introduce new keys from them into dict1.
3. (Not really a problem, but worth mentioning.) In the if statement that checks whether the key is in the dictionary, it is recommended to simplify the operation to if k in dict2:.
With the standard-library class pointed out by @DisplacedAussie, the answer can be simplified even further:
from collections import Counter
print(Counter(dict1) + Counter(dict2) + Counter(dict3))
The result yields: Counter({'d': 10, 'b': 7, 'e': 7, 'c': 4, 'a': 1})
The Counter object is a sub-class of dict, so it can be used in the same way as a standard dict.
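For a list of dictionaries of any length, the same idea can be written with sum(). This is a Python 3 sketch using the sample data from this answer; note the caveat in the final lines, which matters if totals can reach zero or go negative.

```python
from collections import Counter

dicts = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 5, 'c': 1, 'd': 9},
    {'d': 1, 'e': 7},
]

# sum() needs an empty Counter as the start value; the default start of 0 would fail.
total = sum((Counter(d) for d in dicts), Counter())
print(dict(total))  # {'a': 1, 'b': 7, 'c': 4, 'd': 10, 'e': 7}

# Caveat: Counter addition drops keys whose total is zero or negative.
cancelled = Counter({'x': 1}) + Counter({'x': -1})
print(cancelled)  # Counter()
```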

Here is a simple function that might help:
def dictsum(dict1, dict2):
    '''Modify dict1 to accumulate new sums from dict2
    '''
    k1 = set(dict1.keys())
    k2 = set(dict2.keys())
    for i in k1 & k2:
        dict1[i] += dict2[i]
    for i in k2 - k1:
        dict1[i] = dict2[i]
    return None
... for the intersection, update each value by adding the second value to the existing one; then, for the difference, add those key/value pairs.
With that defined, you'd simply call:
dictsum(dict1, dict2)
dictsum(dict1, dict3)
... and be happy.
(I will note that functions that modify the contents of dictionaries in this fashion are not all that common. I'm returning None explicitly to follow the convention established by the list.sort() method: functions that modify the contents of a container in Python do not normally return copies of the container.)
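If mutating dict1 is not desirable, a non-mutating counterpart can return a new dictionary instead. The sketch below is one possible shape for it; dictsum_copy is a hypothetical name, not part of the answer above.

```python
def dictsum_copy(dict1, dict2):
    """Return a new dict with the summed values; neither input is modified."""
    result = dict(dict1)  # shallow copy of the first dictionary
    for key, value in dict2.items():
        result[key] = result.get(key, 0) + value
    return result


a = {'x': 1, 'y': 2}
b = {'y': 10, 'z': 5}
print(dictsum_copy(a, b))  # {'x': 1, 'y': 12, 'z': 5}
print(a)                   # {'x': 1, 'y': 2}
```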

If I understand your question correctly, you are iterating on the wrong dictionary. You want to iterate over dict2 and update dict1 with matching keys or add non-matching keys to dict1.
If so, here's how you need to update the for loops:
for k,v in dict2.iteritems(): # Iterate over dict2
    if k in dict1.keys():
        dict1[k]+=1 # Update dict1 for matching keys
    else:
        dict1[k]=v # Add non-matching keys to dict1
for k1,v1 in dict3.iteritems(): # Iterate over dict3
    if k1 in dict1.keys():
        dict1[k1]+=1 # Update dict1 for matching keys
    else:
        dict1[k1]=v1 # Add non-matching keys to dict1

I assume that wordcount_file(file) returns a dict of the words found in file, with each key being a word and the associated value being the count for that word. If so, your updating algorithm is wrong. You should do something like this:
keys1 = dict1.keys()
for k,v in dict2.iteritems():
    if k in keys1:
        dict1[k] += v
    else:
        dict1[k] = v
If there's a lot of data in these dicts, you can make the key lookup faster by storing the keys in a set (in Python 2, dict.keys() returns a list, so each membership test on it is a linear scan):
keys1 = set(dict1.keys())
You should probably put that code into a function, so you don't need to duplicate the code when you want to update dict1 with the data in dict3.
You should take a look at collections.Counter, a subclass of dict that supports counting; using Counters would simplify this task considerably. But if this is an assignment (or you're using Python 2.6 or older) you may not be able to use Counters.
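A Counter-based version of the whole task might look like the Python 3 sketch below. The question does not show wordcount_file, so the version here is a stand-in written under the assumption that words are whitespace-separated and case-sensitive; only wordcount_directory mirrors the asker's function name.

```python
import os
from collections import Counter


def wordcount_file(path):
    # Stand-in for the asker's function, which is not shown in the question.
    # Assumption: words are whitespace-separated and case-sensitive.
    with open(path, encoding="utf-8") as handle:
        return Counter(handle.read().split())


def wordcount_directory(directory):
    """Sum the word counts of every regular file in directory."""
    total = Counter()
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):  # ignore subdirectories
            total.update(wordcount_file(path))  # update() adds counts
    return total
```

Because Counter.update adds counts instead of replacing them, this works for any number of files without special handling for the first one.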
