I'm looking for the most efficient and pythonic (mainly efficient) way to update a dictionary but keep the old values if an existing key is present. For example...
myDict1 = {'1': ('3', '2'), '3': ('2', '1'), '2': ('3', '1')}
myDict2 = {'4': ('5', '2'), '5': ('2', '4'), '2': ('5', '4')}
myDict1.update(myDict2) gives me the following....
{'1': ('3', '2'), '3': ('2', '1'), '2': ('5', '4'), '5': ('2', '4'), '4': ('5', '2')}
Notice how the key '2' exists in both dictionaries and used to have the values ('3', '1'), but now it has the values from its key in myDict2, ('5', '4')?
Is there a way to update the dictionary efficiently so that the key '2' ends up with the values ('3', '1', '5', '4')? #in no particular order
Thanks in advance
I think the most effective way to do it would be something like this:
for k, v in myDict2.iteritems():  # on Python 3, use myDict2.items()
    myDict1[k] = myDict1.get(k, ()) + v
But there isn't an update equivalent for what you're looking to do, unfortunately.
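For readers on Python 3, the loop above can be wrapped in a small helper. This is only a minimal sketch, assuming every value is a tuple; the name merge_concat is made up for illustration and is not part of any library.

```python
def merge_concat(target, other):
    """Update target in place, concatenating tuples for keys present in both."""
    for key, value in other.items():
        # get() falls back to an empty tuple, so new keys are simply copied over
        target[key] = target.get(key, ()) + value
    return target


myDict1 = {'1': ('3', '2'), '3': ('2', '1'), '2': ('3', '1')}
myDict2 = {'4': ('5', '2'), '5': ('2', '4'), '2': ('5', '4')}

merge_concat(myDict1, myDict2)
print(myDict1['2'])  # ('3', '1', '5', '4')
```

Only one pass over myDict2 is needed, and myDict2 itself is left unchanged.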
What is wrong with 2 in-place update operations?
myDict2.update(myDict1)
myDict1.update(myDict2)
Explanation:
The first update will overwrite the already existing keys with the values from myDict1, and insert all key value pairs in myDict2 which don't exist.
The second update will overwrite the already existing keys in myDict1 with values from myDict2, which are actually the values from myDict1 itself due to the 1st operation. Any new key value pairs inserted will be from the original myDict2.
This of course assumes that you don't care about preserving myDict2.
Update: With Python 3.5+, you can do this without having to touch myDict2:
myDict1 = {**myDict1, **myDict2, **myDict1}
which would actually be same as
myDict1 = {**myDict2, **myDict1}
Output
{'1': ('3', '2'), '3': ('2', '1'), '2': ('3', '1'), '4': ('5', '2'), '5': ('2', '4')}
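To make the behaviour concrete, here is a small check (Python 3.5+) that the unpacking form keeps myDict1's value for the shared key and leaves myDict2 untouched. Note that it keeps the old value rather than combining the two tuples. The name merged is just for illustration.

```python
myDict1 = {'1': ('3', '2'), '3': ('2', '1'), '2': ('3', '1')}
myDict2 = {'4': ('5', '2'), '5': ('2', '4'), '2': ('5', '4')}

# Later unpackings win, so putting myDict1 last preserves its values
merged = {**myDict2, **myDict1}

print(merged['2'])   # ('3', '1')
print(myDict2['2'])  # ('5', '4')
```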
The fastest way to merge large dictionaries is to introduce an intermediate object that behaves as though the dicts are merged without actually merging them (see @Raymond Hettinger's answer):
from collections import ChainMap
class MergedMap(ChainMap):
    def __getitem__(self, key):
        result = []
        found = False
        for mapping in self.maps:
            try:
                result.extend(mapping[key])
                found = True
            except KeyError:
                pass
        return result if found else self.__missing__(key)
merged = MergedMap(myDict1, myDict2)
Whether it is applicable depends on how you want to use the combined dict later.
It uses collections.ChainMap from Python 3.3+ for convenience to provide the full MutableMapping interface; you could implement only parts that you use on older Python versions.
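As a rough illustration of how such a merged view behaves, the sketch below repeats the class so that it runs on its own (Python 3.3+) and looks up a few keys. The extra key '6' is invented for the example.

```python
from collections import ChainMap


class MergedMap(ChainMap):
    def __getitem__(self, key):
        result = []
        found = False
        # Collect the values for this key from every underlying dict
        for mapping in self.maps:
            try:
                result.extend(mapping[key])
                found = True
            except KeyError:
                pass
        return result if found else self.__missing__(key)


myDict1 = {'1': ('3', '2'), '3': ('2', '1'), '2': ('3', '1')}
myDict2 = {'4': ('5', '2'), '5': ('2', '4'), '2': ('5', '4')}
merged = MergedMap(myDict1, myDict2)

print(merged['2'])  # ['3', '1', '5', '4']
print(merged['4'])  # ['5', '2']

# Nothing is copied, so later changes to the source dicts are visible
myDict2['6'] = ('7',)
print(merged['6'])  # ['7']
```

The lookup does the combining work on every access, so this trades cheap construction for slightly more expensive reads.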
Perhaps a defaultdict would help
from collections import defaultdict
myDict0 = {'1': ('3', '2'), '3': ('2', '1'), '2': ('3', '1')}
myDict2 = {'4': ('5', '2'), '5': ('2', '4'), '2': ('5', '4')}
myDict1 = defaultdict(list)
for (key, value) in myDict0.iteritems():  # Python 2; use .items() on Python 3
    myDict1[key].extend(value)
for (key, value) in myDict2.iteritems():
    myDict1[key].extend(value)
print myDict1
defaultdict(<type 'list'>, {'1': ['3', '2'], '3': ['2', '1'], '2': ['3', '1', '5', '4'], '5': ['2', '4'], '4': ['5', '2']})
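The same idea in Python 3 might look like the sketch below. The name combined is an assumption made for the example; the values end up as lists rather than tuples, as in the output above.

```python
from collections import defaultdict

myDict0 = {'1': ('3', '2'), '3': ('2', '1'), '2': ('3', '1')}
myDict2 = {'4': ('5', '2'), '5': ('2', '4'), '2': ('5', '4')}

combined = defaultdict(list)
for source in (myDict0, myDict2):
    for key, value in source.items():
        # Missing keys start out as an empty list automatically
        combined[key].extend(value)

print(combined['2'])  # ['3', '1', '5', '4']
```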
No, there's no easy way to do it, I'm afraid.
The best way is probably iterating and merging. Something like:
for key in myDict1.iterkeys():
    # Thanks to user2246674 and Nolen Royalty for helping me optimise this in their comments
    if key in myDict2:
        myDict2[key] = myDict2[key] + myDict1[key]
    else:
        myDict2[key] = myDict1[key]
I'm currently trying to print the first entry of alphabet_numbers.items(), i.e. ('a', '1').
alphabet_numbers = {'a': '1',
                    'b': '2',
                    'c': '3',
                    'd': '4',
                    'e': '5'
                    }
tuple1 = alphabet_numbers.items()
print(tuple1[0,0])
How can I print only the first entry of tuple1?
If I put print(tuple1) I get:
dict_items([('a', '1'), ('b', '2'), ('c', '3'), ('d', '4'), ('e', '5')])
It can be explicitly converted to a list:
alphabet_numbers = {'a': '1',
                    'b': '2',
                    'c': '3',
                    'd': '4',
                    'e': '5'
                    }
tuple1 = list(alphabet_numbers.items())
print(tuple1[0][0])
print(list(alphabet_numbers.keys())[0])
The last line uses the keys method instead, which is enough if you are only interested in the first key.
The items method does not return a list, but a view-like object.
The type checker mypy reveals the type by using reveal_type(tuple1) of the original example:
typing.ItemsView[builtins.str*, builtins.str*]
first_entry = list(alphabet_numbers.items())[0]
You can print the first entry and it will output a tuple.
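If only the first entry is needed, building a whole list can be avoided by taking the first item from an iterator over the view. This is a small sketch, assuming Python 3.7+ so that the dictionary keeps its insertion order; the names first_key and first_value are just for illustration.

```python
alphabet_numbers = {'a': '1',
                    'b': '2',
                    'c': '3',
                    'd': '4',
                    'e': '5'}

# iter() walks the view lazily; next() pulls just the first (key, value) pair
first_key, first_value = next(iter(alphabet_numbers.items()))
print(first_key, first_value)  # a 1
```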
So I have this tricky dictionary of tuples which I want to filter based on the first occurrence of the informative flag in the value elements. If the flag (which is the element occupying the first position of the tuple) is observed in other keys, I want to retain only the first key-value pair in which it occurs; subsequent key-value pairs which contain the flag should be skipped.
old_dict = {'abc':[('abc', '1', '5'), ('def', '1', '5'), ('abcd', '2', '5')],
'def':[('abc', '2', '5'), ('def', '1', '5'), ('abcd', '1', '5')],
'ghi':[('ghi', '1', '5'), ('jkl', '1', '4'), ('mno', '2', '4')]}
I have struggled with a lot of attempts and this latest attempt does not produce anything meaningful.
flgset = set()
new_dict = {}
for elem, tp in old_dict.items():
    for flg in tp:
        flgset.add(flg[0])
counter = 0
for elem, tp in old_dict.items():
    for (item1, item2, item3) in tp:
        for flg in flgset:
            if flg == item1:
                counter = 1
                new_dict[elem] = [(item1, item2, item3)]
                break
Expected results should be:
new_dict = {'abc':[('abc', '1', '5'), ('def', '1', '5'), ('abcd', '2', '5')],
'ghi':[('ghi', '1', '5'), ('jkl', '1', '4'), ('mno', '2', '4')]}
Thanks in advance.
If I understand you correctly, the following should do what you want:
flgset = set()
new_dict = {}
for k, tuple_list in old_dict.items():
    # if the key is not in flgset, just keep the k, tuple_list pair
    if k not in flgset:
        new_dict[k] = tuple_list
    # update the elements into flgset
    # item in this case is ('abc', '2', '5'),
    # since you only want to add the first element, use item[0]
    for item in tuple_list:
        flgset.add(item[0])
Output as such:
new_dict = {'abc': [('abc', '1', '5'), ('def', '1', '5'), ('abcd', '2', '5')],
'ghi': [('ghi', '1', '5'), ('jkl', '1', '4'), ('mno', '2', '4')]}
flgset = {'abc', 'abcd', 'def', 'ghi', 'jkl', 'mno'}
Others may have more efficient ways to do this, but here's one solution that incorporates your intuitions that you need to loop over old_dict items and use a set:
for key, val in old_dict.items():
    if val[0][0] not in set([v[0][0] for v in new_dict.values()]):
        new_dict.update({key: val})
Here's a brief explanation of what's going on: First, val[0][0] is the "informative flag" from your dictionary entry (i.e. the first item of the first tuple in the entry list). set([v[0][0] for v in new_dict.values()]) will give you the unique values of that flag in your new dictionary. The inner part is a list comprehension to get all the "flags" and then set will give a unique list. The last line just uses the update method to append to it.
REVISED ANSWER
@VinayPai raises two important issues below in the comments. First, this code is inefficient because it reconstructs the test set each time. Here's the more efficient way he suggests:
flag_list = set()
for key, val in old_dict.items():
    if val[0][0] not in flag_list:
        new_dict.update({key: val})
        flag_list.add(val[0][0])
The second issue is that this will produce inconsistent results because dictionaries are not ordered. One possible solution is to use an OrderedDict. But as @SyntaxVoid suggests, this is only necessary if you're using Python 3.5 or earlier (here is a great answer discussing the change). If you can create your data in this fashion, it would solve the problem:
from collections import OrderedDict
old_dict = OrderedDict([('abc', [('abc', '1', '5'), ('def', '1', '5'), ('abcd', '2', '5')]),
                        ('def', [('abc', '2', '5'), ('def', '1', '5'), ('abcd', '1', '5')]),
                        ('ghi', [('ghi', '1', '5'), ('jkl', '1', '4'), ('mno', '2', '4')])])
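Putting the revised approach together, a self-contained sketch could look like the following. It assumes the flag is the first element of the first tuple in each entry, as in the answer above; the names seen_flags and flag are made up for illustration. The OrderedDict is built from a list of pairs so that the order is fixed on any Python version.

```python
from collections import OrderedDict

old_dict = OrderedDict([
    ('abc', [('abc', '1', '5'), ('def', '1', '5'), ('abcd', '2', '5')]),
    ('def', [('abc', '2', '5'), ('def', '1', '5'), ('abcd', '1', '5')]),
    ('ghi', [('ghi', '1', '5'), ('jkl', '1', '4'), ('mno', '2', '4')]),
])

seen_flags = set()
new_dict = OrderedDict()
for key, val in old_dict.items():
    flag = val[0][0]  # first element of the first tuple
    if flag not in seen_flags:
        # Keep only the first entry in which this flag appears
        new_dict[key] = val
        seen_flags.add(flag)

print(list(new_dict))  # ['abc', 'ghi']
```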
I searched for ways to sort a Python dictionary by value and found various answers on the internet. I tried a few of them and finally used the sorted function.
I have simplified the example to make it clear.
I have a dictionary, say:
temp_dict = {'1': '40', '0': '109', '3': '37', '2': '42', '5': '26', '4': '45', '7': '109', '6': '42'}
Now, to sort it by value, I did the following (using the operator module):
import operator
sorted_temp_dict = sorted(temp_dict.items(), key=operator.itemgetter(1))
The result I'm getting is (the result is a list of tuples, which is fine for me):
[('0', '109'), ('7', '109'), ('5', '26'), ('3', '37'), ('1', '40'), ('2', '42'), ('6', '42'), ('4', '45')]
The issue is, as you can see, that the first two elements of the result are not sorted. The rest of the elements are sorted perfectly by value.
I'm not able to find the mistake here. Any help would be great. Thanks
Those are sorted. They are strings, and are sorted lexicographically: '1' is before '2', etc.
If you want to sort by numeric value, you'll need to convert to ints in the key function. For example:
sorted(temp_dict.items(), key=lambda x: int(x[1]))
They are sorted; the issue is that the elements are strings, hence:
'109' < '26' # this is true, as they are strings
Try converting them to int in the key argument; you can use a lambda such as:
>>> sorted_temp_dict = sorted(temp_dict.items(), key=lambda x: int(x[1]))
>>> sorted_temp_dict
[('5', '26'), ('3', '37'), ('1', '40'), ('6', '42'), ('2', '42'), ('4', '45'), ('7', '109'), ('0', '109')]
The problem is trying to sort with values that are str, and not int. If you first convert the values into int and then sort, it will work.
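The difference between the two orderings can be seen in a short sketch. The name by_number is just for illustration; the stored values stay strings, and only the sort key converts them.

```python
temp_dict = {'1': '40', '0': '109', '3': '37', '2': '42',
             '5': '26', '4': '45', '7': '109', '6': '42'}

# Strings compare character by character, so '1...' sorts before '2...'
print('109' < '26')  # True

# Convert inside the key function so the comparison is numeric
by_number = sorted(temp_dict.items(), key=lambda item: int(item[1]))
print([value for _, value in by_number])
# ['26', '37', '40', '42', '42', '45', '109', '109']
```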
I am trying to iterate over a Python 2D list. As the algorithm iterates over the list, it will add the key to a new list until a new value is detected. An operation is then applied to the list and then the list is emptied so that it can be used again as follows:
original_list = [('4', 'a'), ('3', 'a'), ('2', 'a'), ('1', 'b'), ('6', 'b')]
When the original_list is read by the algorithm it should evaluate the second value of each object and decipher if it is different from the previous value; if not, add it to a temporary list.
Here is the pseudocode:
temp_list = []
new_value = original_list[0][1] #find the first value
for key, value in original_list:
    if value != new_value:
        temp_list.append(new_value)
Should output
temp_list = ['4', '3', '2']
temp_list = []
prev_value = original_list[0][1]
for key, value in original_list:
    if value == prev_value:
        temp_list.append(key)
    else:
        do_something(temp_list)
        print temp_list
        temp_list = [key]
        prev_value = value
do_something(temp_list)
print temp_list
# prints ['4', '3', '2']
# prints ['1', '6']
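The same loop can also be written as a Python 3 generator, which avoids repeating the final do_something call after the loop. This is only a sketch; the function name runs_of_keys is hypothetical, and the caller decides what operation to apply to each temporary list.

```python
def runs_of_keys(pairs):
    """Yield a list of keys for each run of consecutive equal values."""
    run = []
    prev_value = None
    for key, value in pairs:
        # A change in value closes the current run and starts a new one
        if run and value != prev_value:
            yield run
            run = []
        run.append(key)
        prev_value = value
    if run:
        yield run  # the last run is not followed by a value change


original_list = [('4', 'a'), ('3', 'a'), ('2', 'a'), ('1', 'b'), ('6', 'b')]
for temp_list in runs_of_keys(original_list):
    print(temp_list)
# ['4', '3', '2']
# ['1', '6']
```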
Not entirely sure what you are asking, but I think itertools.groupby could help:
>>> from itertools import groupby
>>> original_list = [('4', 'a'), ('3', 'a'), ('2', 'a'), ('1', 'b'), ('6', 'b')]
>>> [(zip(*group)[0], k) for k, group in groupby(original_list, key=lambda x: x[1])]
[(('4', '3', '2'), 'a'), (('1', '6'), 'b')]
What this does: It groups the items in the list by their value with key=lambda x: x[1] and gets tuples of keys corresponding to one value with (zip(*group)[0], k).
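The expression zip(*group)[0] relies on zip returning a list, which is only the case on Python 2. On Python 3 a comprehension over each group gives the same grouping; the sketch below returns lists of keys instead of tuples, and the name grouped is just for illustration.

```python
from itertools import groupby
from operator import itemgetter

original_list = [('4', 'a'), ('3', 'a'), ('2', 'a'), ('1', 'b'), ('6', 'b')]

# groupby only merges consecutive items that share the same key
grouped = [
    ([key for key, _ in group], value)
    for value, group in groupby(original_list, key=itemgetter(1))
]
print(grouped)  # [(['4', '3', '2'], 'a'), (['1', '6'], 'b')]
```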
In case your "keys" do not repeat themselves, you could just use a defaultdict to "sort" the values based on keys, then extract what you need
from collections import defaultdict
ddict = defaultdict(list)
for v1, v2 in original_list:
    ddict[v2].append(v1)
ddict values are now all temp_list:
>>> ddict["a"]
['4', '3', '2']
Why do dictionaries in Python appear reversed?
>>> a = {'one': '1', 'two': '2', 'three': '3', 'four': '4'}
>>> a
{'four': '4', 'three': '3', 'two': '2', 'one': '1'}
How can I fix this?
Dictionaries in Python (and hash tables in general) are unordered. In Python you can call sorted() on the keys to get them in sorted order.
Dictionaries have no intrinsic order. You'll have to either roll your own ordered dict implementation, use an ordered list of tuples or use an existing ordered dict implementation.
Python 3.1 has an OrderedDict:
>>> from collections import OrderedDict
>>> o=OrderedDict([('one', '1'), ('two', '2'), ('three', '3'), ('four', '4')])
>>> o
OrderedDict([('one', '1'), ('two', '2'), ('three', '3'), ('four', '4')])
>>> for k,v in o.items():
...     print (k,v)
...
one 1
two 2
three 3
four 4
Now that you know dicts are unordered, here is how to convert them to a list that you can order
>>> a = {'one': '1', 'two': '2', 'three': '3', 'four': '4'}
>>> a
{'four': '4', 'three': '3', 'two': '2', 'one': '1'}
sorted by key
>>> sorted(a.items())
[('four', '4'), ('one', '1'), ('three', '3'), ('two', '2')]
sorted by value
>>> from operator import itemgetter
>>> sorted(a.items(),key=itemgetter(1))
[('one', '1'), ('two', '2'), ('three', '3'), ('four', '4')]
>>>
And what is the "standard order" you would be expecting? It is very much application dependent. A python dictionary doesn't guarantee key ordering anyways.
In any case, you can iterate over a dictionary keys() the way you want.
From the Python Tutorial:
It is best to think of a dictionary as an unordered set of key: value pairs
And from the Python Standard Library (about dict.items):
CPython implementation detail: Keys and values are listed in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions.
So if you need to process the dict in a certain order, sort the keys or values, e.g.:
>>> sorted(a.keys())
['four', 'one', 'three', 'two']
>>> sorted(a.values())
['1', '2', '3', '4']
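If the sorted result should still be a dictionary, one option on Python 3.7+, where dict preserves insertion order, is to rebuild it from the sorted items. This is a sketch under that version assumption; the names by_value and by_key are made up for the example.

```python
a = {'one': '1', 'two': '2', 'three': '3', 'four': '4'}

# Sort the (key, value) pairs, then rebuild a dict in that order
by_value = dict(sorted(a.items(), key=lambda item: int(item[1])))
print(list(by_value))  # ['one', 'two', 'three', 'four']

by_key = dict(sorted(a.items()))
print(list(by_key))  # ['four', 'one', 'three', 'two']
```

On older versions, passing the sorted pairs to collections.OrderedDict gives the same effect.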