So I realized that
dict1.update(dict2)
replaces values of dict2 with dict1 if the key exists in both the dictionaries. Is there any way to add the values of dict2 to dict1 directly if the key is present instead of looping around the key,value pairs
You say you want to add the values, but not what type they are. If they are numeric, you may be able to use collections.Counter instead of dict
>>> from collections import Counter
>>> a = Counter({'a':1, 'b':2})
>>> b = Counter({'a':5.4, 'c':6})
>>> a + b
Counter({'a': 6.4, 'c': 6, 'b': 2})
Related
I have two or more dictionary, I like to merge it as one with retaining multiple values of the same key as list. I would not able to share the original code, so please help me with the following example.
Input:
a= {'a':1, 'b': 2}
b= {'aa':4, 'b': 6}
c= {'aa':3, 'c': 8}
Output:
c= {'a':1,'aa':[3,4],'b': [2,6], 'c': 8}
I suggest you read up on the defaultdict: it lets you provide a factory method that initializes missing keys, i.e. if a key is looked up but not found, it creates a value by calling factory_method(missing_key). See this example, it might make things clearer:
from collections import defaultdict
a = {'a': 1, 'b': 2}
b = {'aa': 4, 'b': 6}
c = {'aa': 3, 'c': 8}
stuff = [a, b, c]
# our factory method is the list-constructor `list`,
# so whenever we look up a value that doesn't exist, a list is created;
# we can always be sure that we have list-values
store = defaultdict(list)
for s in stuff:
for k, v in s.items():
# since we know that our value is always a list, we can safely append
store[k].append(v)
print(store)
This has the "downside" of creating one-element lists for single occurences of values, but maybe you are able to work around that.
Please find below to resolve your issue. I hope this would work for you.
from collections import defaultdict
a = {'a':1, 'b': 2}
b = {'aa':4, 'b': 6}
c={'aa':3, 'c': 8}
dd = defaultdict(list)
for d in (a,b,c):
for key, value in d.items():
dd[key].append(value)
print(dd)
Use defaultdict to automatically create a dictionary entry with an empty list.
To process all source dictionaries in a single loop, use itertools.chain.
The main loop just adds a value from the current item, to the list under
the current key.
As you wrote, for cases when under some key there is only one item,
you have to generate a work dictionary (using dictonary comprehension),
limited to items with value (list) containing only one item.
The value of such item shoud contain only the first (and only) number
from the source list.
Then use this dictionary to update d.
So the whole script can be surprisingly short, as below:
from collections import defaultdict
from itertools import chain
a = {'a':1, 'b': 2}
b = {'aa':4, 'b': 6}
c = {'aa':3, 'c': 8}
d = defaultdict(list)
for k, v in chain(a.items(), b.items(), c.items()):
d[k].append(v)
d.update({ k: v[0] for k, v in d.items() if len(v) == 1 })
As you can see, the actual processing code is contained in only 4 (last) lines.
If you print d, the result is:
defaultdict(list, {'a': 1, 'b': [2, 6], 'aa': [4, 3], 'c': 8})
I explain better my problem:
I have a dictionary made by:
d={'name':(values), (values), values), 'name2':(values),(values), ...ecc}
so values are tuples.
I want to check if some tuples associated to a value are the same.
>>> d={'a':3 , 'b':5, 'c':1, 'a':3, 'b':5}
>>> d
{'a': 3, 'c': 1, 'b': 5}
>>>
Dictionaries can not have duplicate keys.
This is not a valid Python dictionary, you can't have duplicate keys to begin with...
I have created three dictionaries-dict1, dict2, and dict2. I want to update dict1 with dict2 first, and resulting dictionary with dict3. I am not sure why they are not adding up.
def wordcount_directory(directory):
dict = {}
filelist=[os.path.join(directory,f) for f in os.listdir(directory)]
dicts=[wordcount_file(file) for file in filelist]
dict1=dicts[0]
dict2=dicts[1]
dict3=dicts[2]
for k,v in dict1.iteritems():
if k in dict2.keys():
dict1[k]+=1
else:
dict1[k]=v
for k1,v1 in dict1.iteritems():
if k1 in dict3.keys():
dict1[k1]+=1
else:
dict1[k1]=v1
return dict1
print wordcount_directory("C:\\Users\\Phil2040\\Desktop\\Word_count")
Maybe I am not understanding you question right, but are you trying to add all the values from each of the dictionaries together into one final dictionary? If so:
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 5, 'c': 1, 'd': 9}
dict3 = {'d': 1, 'e': 7}
def add_dict(to_dict, from_dict):
for key, value in from_dict.iteritems():
to_dict[key] = to_dict.get(key, 0) + value
result = dict(dict1)
add_dict(result, dict2)
add_dict(result, dict3)
print result
This yields: {'a': 1, 'c': 4, 'b': 7, 'e': 7, 'd': 10}
It would be really helpful to post what the expected outcome should be for your question.
EDIT:
For an arbitrary amount of dictionaries:
result = dict(dicts[0])
for dict_sum in dicts[1:]:
add_dict(result, dict_sum)
print(result)
If you really want to fix the code from your original question in the format it is in:
You are using dict1[k]+=1 when you should be performing dict1[k]+=dict2.get(k, 0).
The introduction of get removes the need to check for its existence with an if statement.
You need to iterate though dict2 and dict3 to introduce new keys from them into dict1
(not really a problem, but worth mentioning) In the if statement to check if the key is in the dictionary, it is recommended to simply the operation to if k in dict2: (see this post for more details)
With the amazing built-in library found by #DisplacedAussie, the answer can be simplified even further:
from collections import Counter
print(Counter(dict1) + Counter(dict2) + Counter(dict3))
The result yields: Counter({'d': 10, 'b': 7, 'e': 7, 'c': 4, 'a': 1})
The Counter object is a sub-class of dict, so it can be used in the same way as a standard dict.
Hmmm, here a simple function that might help:
def dictsum(dict1, dict2):
'''Modify dict1 to accumulate new sums from dict2
'''
k1 = set(dict1.keys())
k2 = set(dict2.keys())
for i in k1 & k2:
dict1[i] += dict2[i]
for i in k2 - k1:
dict1[i] = dict2[i]
return None
... for the intersection update each by adding the second value to the existing one; then for the difference add those key/value pairs.
With that defined you'd simple call:
dictsum(dict1, dict2)
dictsum(dict1, dict3)
... and be happy.
(I will note that functions modify the contents of dictionaries in this fashion are not all that common. I'm returning None explicitly to follow the convention established by the list.sort() method ... functions which modify the contents of a container, in Python, do not normally return copies of the container).
If I understand your question correctly, you are iterating on the wrong dictionary. You want to iterate over dict2 and update dict1 with matching keys or add non-matching keys to dict1.
If so, here's how you need to update the for loops:
for k,v in dict2.iteritems(): # Iterate over dict2
if k in dict1.keys():
dict1[k]+=1 # Update dict1 for matching keys
else:
dict1[k]=v # Add non-matching keys to dict1
for k1,v1 in dict3.iteritems(): # Iterate over dict3
if k1 in dict1.keys():
dict1[k1]+=1 # Update dict1 for matching keys
else:
dict1[k1]=v1 # Add non-matching keys to dict1
I assume that wordcount_file(file) returns a dict of the words found in file, with each key being a word and the associated value being the count for that word. If so, your updating algorithm is wrong. You should do something like this:
keys1 = dict1.keys()
for k,v in dict2.iteritems():
if k in keys1:
dict1[k] += v
else:
dict1[k] = v
If there's a lot of data in these dicts you can make the key lookup faster by storing the keys in a set:
keys1 = set(dict1.keys())
You should probably put that code into a function, so you don't need to duplicate the code when you want to update dict1 with the data in dict3.
You should take a look at collections.Counter, a subclass of dict that supports counting; using Counters would simplify this task considerably. But if this is an assignment (or you're using Python 2.6 or older) you may not be able to use Counters.
I'm working on an assignment. Is there anyway a dictionary can have duplicate keys and hold the same or different values. Here is an example of what i'm trying to do:
dict = {
'Key1' : 'Helo', 'World'
'Key1' : 'Helo'
'Key1' : 'Helo', 'World'
}
I tried doing this but when I associate any value to key1, it gets added to the same key1.
Is this possible with a dictionary? If not what other data structure I can use to implement this process?
Use dictionaries of lists to hold multiple values.
One way to have multiple values to a key is to use a dictionary of lists.
x = { 'Key1' : ['Hello', 'World'],
'Key2' : ['Howdy', 'Neighbor'],
'Key3' : ['Hey', 'Dude']
}
To get the list you want (or make a new one), I recommend using setdefault.
my_list = x.setdefault(key, [])
Example:
>>> x = {}
>>> x['abc'] = [1,2,3,4]
>>> x
{'abc': [1, 2, 3, 4]}
>>> x.setdefault('xyz', [])
[]
>>> x.setdefault('abc', [])
[1, 2, 3, 4]
>>> x
{'xyz': [], 'abc': [1, 2, 3, 4]}
Using defaultdict for the same functionality
To make this even easier, the collections module has a defaultdict object that simplifies this. Just pass it a constructor/factory.
from collections import defaultdict
x = defaultdict(list)
x['key1'].append(12)
x['key1'].append(13)
You can also use dictionaries of dictionaries or even dictionaries of sets.
>>> from collections import defaultdict
>>> dd = defaultdict(dict)
>>> dd
defaultdict(<type 'dict'>, {})
>>> dd['x']['a'] = 23
>>> dd
defaultdict(<type 'dict'>, {'x': {'a': 23}})
>>> dd['x']['b'] = 46
>>> dd['y']['a'] = 12
>>> dd
defaultdict(<type 'dict'>, {'y': {'a': 12}, 'x': {'a': 23, 'b': 46}})
I think you want collections.defaultdict:
from collections import defaultdict
d = defaultdict(list)
list_of_values = [['Hello', 'World'], 'Hello', ['Hello', 'World']]
for v in list_of_values:
d['Key1'].append(v)
print d
This will deal with duplicate keys, and instead of overwriting the key, it will append something to that list of values.
Keys are unique to the data. Consider using some other value for a key or consider using a different data structure to hold this data.
for example:
don't use a persons address as a unique key because several people might live there.
a person's social security number or a drivers license is a much better unique id of a person.
you can create your own id to force it to be unique.
I have a python dictionary dict1 with more than 20,000 keys and I want to update it with another dictionary dict2. The dictionaries look like this:
dict1
key11=>[value11]
key12=>[value12]
...
...
keyxyz=>[value1x] //common key
...... so on
dict2
key21=>[value21]
key22=>[value22]
...
...
keyxyz=>[value2x] // common key
........ so on
If I use
dict1.update(dict2)
then the keys of dict1 which are similar to keys of dict2 will have their values overwritten by values of dict2. What I want is if a key is already present in dict1 then the value of that key in dict2 should be appended to value of dict1. So
dict1.conditionalUpdate(dict2)
should result in
dict1
key11=>[value11]
key12=>[value12]
key21=>[value21]
key22=>[value22]
...
...
keyxyz=>[value1x,value2x]
A naive method would be iterating over keys of dict2 for each key of dict1 and insert or update keys. Is there a better method? Does python support a built in data structure that supports this kind of functionality?
Use defaultdict from the collections module.
>>> from collections import defaultdict
>>> dict1 = {1:'a',2:'b',3:'c'}
>>> dict2 = {1:'hello', 4:'four', 5:'five'}
>>> my_dict = defaultdict(list)
>>> for k in dict1:
... my_dict[k].append(dict1[k])
...
>>> for k in dict2:
... my_dict[k].append(dict2[k])
...
>>> my_dict[1]
['a', 'hello']
This is actually pretty simple to do using a dict comprehension and itertools.groupby():
dict1 = {1: 1, 2: 2, 3: 3, 4: 4}
dict2 = {5: 6, 7: 8, 1: 1, 2: 2}
from itertools import groupby, chain
from operator import itemgetter
sorted_items = sorted(chain(dict1.items(), dict2.items()))
print({key: [value[1] for value in values] for key, values in groupby(sorted_items, itemgetter(0))})
Gives us:
{1: [1, 1], 2: [2, 2], 3: [3], 4: [4], 5: [6], 7: [8]}
Naturally, this creates a new dict, but if you need to update the first dict, you can do that trivially by updating with the new one. If your values are already lists, this may need some minor modification (but I presume you were doing that for the sake of the operation, in which case, there is no need).
Naturally, if you are using Python 2.x, then you will want to use dict.viewitems() or dict.iteritems() over dict.items(). If you are using a version of Python prior to dict comprehensions, then you could use dict((key , value) for ...) instead.
Another method without importing anything, just with the regular Python dictionary:
>>> dict1 = {1:'a',2:'b',3:'c'}
>>> dict2 = {1:'hello', 4:'four', 5:'five'}
>>> for k in dict2:
... dict1[k] = dict1.get(k,"") + dict2.get(k)
...
>>> dict1
{1: 'ahello', 2: 'b', 3: 'c', 4: 'four', 5: 'five'}
>>>
dict1.get(k,"") returns the value associated to k if it exists or an empty string otherwise, and then append the content of dict2.