Related
Is there any built-in function that would do the following?
dictionary = {‘a’:1, ‘b’:2, ‘c’:3}
dictionary.update(c=10)
# what happens
dictionary ---- {‘a’:1, ‘b’:2, ‘c’:10}
# what I want to happen:
dictionary ---- {‘a’:1, ‘b’:2, ‘c’:(3, 10)}
By default if keys are the same, later key would override earlier one.
If the key is already present in dict, the value of the new key: value pair would be added to already existing value in a form of container, like tuple, or list or set.
I can write a helper function to do so but I believe it should be something built-in for this matter.
You can do this
from collections import defaultdict
d = defaultdict(list)
d["a"].append(1)
d["b"].append(2)
d["c"].append(3)
d["c"].append(10)
print(d)
Result
defaultdict(list, {'a': [1], 'b': [2], 'c': [3, 10]})
Your desired solution is not very elegant, so I am going to propose an alternative one.
Tuples are immutable. Let's use lists instead, because we can easily append to them.
The data type of the values should be consistent. Use lists in any case, even for single values.
Let's use a defaultdict such that we don't have to initialize lists manually.
Putting it together:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for v, k in enumerate('abc', 1):
... d[k].append(v)
...
>>> d
defaultdict(<class 'list'>, {'a': [1], 'b': [2], 'c': [3]})
>>> d['c'].append(10)
>>> d
defaultdict(<class 'list'>, {'a': [1], 'b': [2], 'c': [3, 10]})
You could rewrite the update function by creating a new class:
In Python bulitins.py:
def update(self, E=None, **F): # known special case of dict.update
"""
D.update([E, ]**F) -> None. Update D from dict/iterable E and F.
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
"""
pass
So I write this(Inherit from UserDict, suggested by #timgeb):
from collections import UserDict
class CustomDict(UserDict):
def __init__(self):
super().__init__()
def update(self, E=None, **F) -> None:
if E:
if isinstance(E, dict):
for k in E:
self[k] = E[k]
else:
for k, v in E:
self[k] = v
else:
if isinstance(F, dict):
for key in F:
if isinstance(self[key], list):
self[key].append(F[key])
else:
self[key] = [self[key], F[key]]
dictionary = CustomDict()
dictionary.update({'a': 1, 'b': 2, 'c': 3})
print(dictionary)
dictionary.update(a=3)
print(dictionary)
dictionary.update(a=4)
print(dictionary)
Result:
{'a': 1, 'b': 2, 'c': 3}
{'a': [1, 3], 'b': 2, 'c': 3}
{'a': [1, 3, 4], 'b': 2, 'c': 3}
Maybe there are some logic errors in my code,but welcome to point out.
Perhaps you could use something like:
dictionary = {'a':1, 'b':2, 'c':3}
dictionary.update({'c': 10 if not dictionary.get('c') else tuple([dictionary['c'],] + [10,])})
# {'a': 1, 'b': 2, 'c': (3, 10)}
But it should probably be wrapped into a function to make things clean. The general pattern would be (I suppose, based on your question):
dict = {...}
if 'a' not in dict:
do_this() # just add it to the dict?
else:
do_that() # build a tuple or list?
In your above question you're mixing types -- I'm not sure if you want that, a more pythonic approach might be to have all the values as list and use a defaultdict.
I have a dictionary of lists in which some of the values are empty:
d = {'a': [1], 'b': [1, 2], 'c': [], 'd':[]}
At the end of creating these lists, I want to remove these empty lists before returning my dictionary. I tried doing it like this:
for i in d:
if not d[i]:
d.pop(i)
but I got a RuntimeError. I am aware that you cannot add/remove elements in a dictionary while iterating through it...what would be a way around this then?
See Modifying a Python dict while iterating over it for citations that this can cause problems, and why.
In Python 3.x and 2.x you can use use list to force a copy of the keys to be made:
for i in list(d):
In Python 2.x calling keys made a copy of the keys that you could iterate over while modifying the dict:
for i in d.keys():
But note that in Python 3.x this second method doesn't help with your error because keys returns an a view object instead of copying the keys into a list.
You only need to use copy:
This way you iterate over the original dictionary fields and on the fly can change the desired dict d.
It works on each Python version, so it's more clear.
In [1]: d = {'a': [1], 'b': [1, 2], 'c': [], 'd':[]}
In [2]: for i in d.copy():
...: if not d[i]:
...: d.pop(i)
...:
In [3]: d
Out[3]: {'a': [1], 'b': [1, 2]}
(BTW - Generally to iterate over copy of your data structure, instead of using .copy for dictionaries or slicing [:] for lists, you can use import copy -> copy.copy (for shallow copy which is equivalent to copy that is supported by dictionaries or slicing [:] that is supported by lists) or copy.deepcopy on your data structure.)
Just use dictionary comprehension to copy the relevant items into a new dict:
>>> d
{'a': [1], 'c': [], 'b': [1, 2], 'd': []}
>>> d = {k: v for k, v in d.items() if v}
>>> d
{'a': [1], 'b': [1, 2]}
For this in Python 2:
>>> d
{'a': [1], 'c': [], 'b': [1, 2], 'd': []}
>>> d = {k: v for k, v in d.iteritems() if v}
>>> d
{'a': [1], 'b': [1, 2]}
This worked for me:
d = {1: 'a', 2: '', 3: 'b', 4: '', 5: '', 6: 'c'}
for key, value in list(d.items()):
if value == '':
del d[key]
print(d)
# {1: 'a', 3: 'b', 6: 'c'}
Casting the dictionary items to list creates a list of its items, so you can iterate over it and avoid the RuntimeError.
I would try to avoid inserting empty lists in the first place, but, would generally use:
d = {k: v for k,v in d.iteritems() if v} # re-bind to non-empty
If prior to 2.7:
d = dict( (k, v) for k,v in d.iteritems() if v )
or just:
empty_key_vals = list(k for k in k,v in d.iteritems() if v)
for k in empty_key_vals:
del[k]
To avoid "dictionary changed size during iteration error".
For example: "when you try to delete some key",
Just use 'list' with '.items()'. Here is a simple example:
my_dict = {
'k1':1,
'k2':2,
'k3':3,
'k4':4
}
print(my_dict)
for key, val in list(my_dict.items()):
if val == 2 or val == 4:
my_dict.pop(key)
print(my_dict)
Output:
{'k1': 1, 'k2': 2, 'k3': 3, 'k4': 4}
{'k1': 1, 'k3': 3}
This is just an example. Change it based on your case/requirements.
For Python 3:
{k:v for k,v in d.items() if v}
You cannot iterate through a dictionary while it’s changing during a for loop. Make a casting to list and iterate over that list. It works for me.
for key in list(d):
if not d[key]:
d.pop(key)
Python 3 does not allow deletion while iterating (using the for loop above) a dictionary. There are various alternatives to do it; one simple way is to change the line
for i in x.keys():
with
for i in list(x)
The reason for the runtime error is that you cannot iterate through a data structure while its structure is changing during iteration.
One way to achieve what you are looking for is to use a list to append the keys you want to remove and then use the pop function on dictionary to remove the identified key while iterating through the list.
d = {'a': [1], 'b': [1, 2], 'c': [], 'd':[]}
pop_list = []
for i in d:
if not d[i]:
pop_list.append(i)
for x in pop_list:
d.pop(x)
print (d)
For situations like this, I like to make a deep copy and loop through that copy while modifying the original dict.
If the lookup field is within a list, you can enumerate in the for loop of the list and then specify the position as the index to access the field in the original dict.
Nested null values
Let's say we have a dictionary with nested keys, some of which are null values:
dicti = {
"k0_l0":{
"k0_l1": {
"k0_l2": {
"k0_0":None,
"k1_1":1,
"k2_2":2.2
}
},
"k1_l1":None,
"k2_l1":"not none",
"k3_l1":[]
},
"k1_l0":"l0"
}
Then we can remove the null values using this function:
def pop_nested_nulls(dicti):
for k in list(dicti):
if isinstance(dicti[k], dict):
dicti[k] = pop_nested_nulls(dicti[k])
elif not dicti[k]:
dicti.pop(k)
return dicti
Output for pop_nested_nulls(dicti)
{'k0_l0': {'k0_l1': {'k0_l2': {'k1_1': 1,
'k2_2': 2.2}},
'k2_l1': 'not '
'none'},
'k1_l0': 'l0'}
The Python "RuntimeError: dictionary changed size during iteration" occurs when we change the size of a dictionary when iterating over it.
To solve the error, use the copy() method to create a shallow copy of the dictionary that you can iterate over, e.g., my_dict.copy().
my_dict = {'a': 1, 'b': 2, 'c': 3}
for key in my_dict.copy():
print(key)
if key == 'b':
del my_dict[key]
print(my_dict) # 👉️ {'a': 1, 'c': 3}
You can also convert the keys of the dictionary to a list and iterate over the list of keys.
my_dict = {'a': 1, 'b': 2, 'c': 3}
for key in list(my_dict.keys()):
print(key)
if key == 'b':
del my_dict[key]
print(my_dict) # 👉️ {'a': 1, 'c': 3}
If the values in the dictionary were unique too, then I used this solution:
keyToBeDeleted = None
for k, v in mydict.items():
if(v == match):
keyToBeDeleted = k
break
mydict.pop(keyToBeDeleted, None)
So my input values are as follows:
temp_dict1 = {'A': [1,2,3,4], 'B':[5,5,5], 'C':[6,6,7,8]}
temp_dict2 = {}
val = [5]
The list val may contain more values, but for now, it only contains one. My desired outcome is:
>>>temp_dict2
{'B':[5]}
The final dictionary needs to only have the keys for the lists that contain the item in the list val, and only unique instances of that value in the list. I've tried iterating through the two objects as follows:
for i in temp_dict1:
for j in temp_dict1[i]:
for k in val:
if k in j:
temp_dict2.setdefault(i, []).append(k)
But that just returns an argument of type 'int' is not iterable error message. Any ideas?
Changed your dictionary to cover some more cases:
temp_dict1 = {'A': [1,2,3,4], 'B':[5,5,6], 'C':[6,6,7,8]}
temp_dict2 = {}
val = [5, 6]
for item in val:
for key, val in temp_dict1.items():
if item in val:
temp_dict2.setdefault(key, []).append(item)
print(temp_dict2)
# {'B': [5, 6], 'C': [6]}
Or, the same using list comprehension (looks a bit hard to understand, not recommended).
temp_dict2 = {}
[temp_dict2.setdefault(key, []).append(item) for item in val for key, val in temp_dict1.items() if item in val]
For comparison with #KeyurPotdar's solution, this can also be achieved via collections.defaultdict:
from collections import defaultdict
temp_dict1 = {'A': [1,2,3,4], 'B':[5,5,6], 'C':[6,6,7,8]}
temp_dict2 = defaultdict(list)
val = [5, 6]
for i in val:
for k, v in temp_dict1.items():
if i in v:
temp_dict2[k].append(i)
# defaultdict(list, {'B': [5, 6], 'C': [6]})
You can try this approach:
temp_dict1 = {'A': [1,2,3,4,5,6], 'B':[5,5,5], 'C':[6,6,7,8]}
val = [5,6]
def values(dict_,val_):
default_dict={}
for i in val_:
for k,m in dict_.items():
if i in m:
if k not in default_dict:
default_dict[k]=[i]
else:
default_dict[k].append(i)
return default_dict
print(values(temp_dict1,val))
output:
{'B': [5], 'C': [6], 'A': [5, 6]}
I know how to remove an entry, 'key' from my dictionary d, safely. You do:
if d.has_key('key'):
del d['key']
However, I need to remove multiple entries from a dictionary safely. I was thinking of defining the entries in a tuple as I will need to do this more than once.
entities_to_remove = ('a', 'b', 'c')
for x in entities_to_remove:
if x in d:
del d[x]
However, I was wondering if there is a smarter way to do this?
Using dict.pop:
d = {'some': 'data'}
entries_to_remove = ('any', 'iterable')
for k in entries_to_remove:
d.pop(k, None)
Using Dict Comprehensions
final_dict = {key: value for key, value in d if key not in [key1, key2]}
where key1 and key2 are to be removed.
In the example below, keys "b" and "c" are to be removed & it's kept in a keys list.
>>> a
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
>>> keys = ["b", "c"]
>>> print {key: a[key] for key in a if key not in keys}
{'a': 1, 'd': 4}
>>>
Why not like this:
entries = ('a', 'b', 'c')
the_dict = {'b': 'foo'}
def entries_to_remove(entries, the_dict):
for key in entries:
if key in the_dict:
del the_dict[key]
A more compact version was provided by mattbornski using dict.pop()
a solution is using map and filter functions
python 2
d={"a":1,"b":2,"c":3}
l=("a","b","d")
map(d.__delitem__, filter(d.__contains__,l))
print(d)
python 3
d={"a":1,"b":2,"c":3}
l=("a","b","d")
list(map(d.__delitem__, filter(d.__contains__,l)))
print(d)
you get:
{'c': 3}
If you also need to retrieve the values for the keys you are removing, this would be a pretty good way to do it:
values_removed = [d.pop(k, None) for k in entities_to_remove]
You could of course still do this just for the removal of the keys from d, but you would be unnecessarily creating the list of values with the list comprehension. It is also a little unclear to use a list comprehension just for the function's side effect.
Found a solution with pop and map
d = {'a': 'valueA', 'b': 'valueB', 'c': 'valueC', 'd': 'valueD'}
keys = ['a', 'b', 'c']
list(map(d.pop, keys))
print(d)
The output of this:
{'d': 'valueD'}
I have answered this question so late just because I think it will help in the future if anyone searches the same. And this might help.
Update
The above code will throw an error if a key does not exist in the dict.
DICTIONARY = {'a': 'valueA', 'b': 'valueB', 'c': 'valueC', 'd': 'valueD'}
keys = ['a', 'l', 'c']
def remove_key(key):
DICTIONARY.pop(key, None)
list(map(remove_key, keys))
print(DICTIONARY)
output:
DICTIONARY = {'b': 'valueB', 'd': 'valueD'}
Some timing tests for cpython 3 shows that a simple for loop is the fastest way, and it's quite readable. Adding in a function doesn't cause much overhead either:
timeit results (10k iterations):
all(x.pop(v) for v in r) # 0.85
all(map(x.pop, r)) # 0.60
list(map(x.pop, r)) # 0.70
all(map(x.__delitem__, r)) # 0.44
del_all(x, r) # 0.40
<inline for loop>(x, r) # 0.35
def del_all(mapping, to_remove):
"""Remove list of elements from mapping."""
for key in to_remove:
del mapping[key]
For small iterations, doing that 'inline' was a bit faster, because of the overhead of the function call. But del_all is lint-safe, reusable, and faster than all the python comprehension and mapping constructs.
I have no problem with any of the existing answers, but I was surprised to not find this solution:
keys_to_remove = ['a', 'b', 'c']
my_dict = {k: v for k, v in zip("a b c d e f g".split(' '), [0, 1, 2, 3, 4, 5, 6])}
for k in keys_to_remove:
try:
del my_dict[k]
except KeyError:
pass
assert my_dict == {'d': 3, 'e': 4, 'f': 5, 'g': 6}
Note: I stumbled across this question coming from here. And my answer is related to this answer.
I have tested the performance of three methods:
# Method 1: `del`
for key in remove_keys:
if key in d:
del d[key]
# Method 2: `pop()`
for key in remove_keys:
d.pop(key, None)
# Method 3: comprehension
{key: v for key, v in d.items() if key not in remove_keys}
Here are the results of 1M iterations:
del: 2.03s 2.0 ns/iter (100%)
pop(): 2.38s 2.4 ns/iter (117%)
comprehension: 4.11s 4.1 ns/iter (202%)
So both del and pop() are the fastest. Comprehensions are 2x slower.
But anyway, we speak nanoseconds here :) Dicts in Python are ridiculously fast.
Why not:
entriestoremove = (2,5,1)
for e in entriestoremove:
if d.has_key(e):
del d[e]
I don't know what you mean by "smarter way". Surely there are other ways, maybe with dictionary comprehensions:
entriestoremove = (2,5,1)
newdict = {x for x in d if x not in entriestoremove}
inline
import functools
#: not key(c) in d
d = {"a": "avalue", "b": "bvalue", "d": "dvalue"}
entitiesToREmove = ('a', 'b', 'c')
#: python2
map(lambda x: functools.partial(d.pop, x, None)(), entitiesToREmove)
#: python3
list(map(lambda x: functools.partial(d.pop, x, None)(), entitiesToREmove))
print(d)
# output: {'d': 'dvalue'}
I think using the fact that the keys can be treated as a set is the nicest way if you're on python 3:
def remove_keys(d, keys):
to_remove = set(keys)
filtered_keys = d.keys() - to_remove
filtered_values = map(d.get, filtered_keys)
return dict(zip(filtered_keys, filtered_values))
Example:
>>> remove_keys({'k1': 1, 'k3': 3}, ['k1', 'k2'])
{'k3': 3}
It would be nice to have full support for set methods for dictionaries (and not the unholy mess we're getting with Python 3.9) so that you could simply "remove" a set of keys. However, as long as that's not the case, and you have a large dictionary with potentially a large number of keys to remove, you might want to know about the performance. So, I've created some code that creates something large enough for meaningful comparisons: a 100,000 x 1000 matrix, so 10,000,00 items in total.
from itertools import product
from time import perf_counter
# make a complete worksheet 100000 * 1000
start = perf_counter()
prod = product(range(1, 100000), range(1, 1000))
cells = {(x,y):x for x,y in prod}
print(len(cells))
print(f"Create time {perf_counter()-start:.2f}s")
clock = perf_counter()
# remove everything above row 50,000
keys = product(range(50000, 100000), range(1, 100))
# for x,y in keys:
# del cells[x, y]
for n in map(cells.pop, keys):
pass
print(len(cells))
stop = perf_counter()
print(f"Removal time {stop-clock:.2f}s")
10 million items or more is not unusual in some settings. Comparing the two methods on my local machine I see a slight improvement when using map and pop, presumably because of fewer function calls, but both take around 2.5s on my machine. But this pales in comparison to the time required to create the dictionary in the first place (55s), or including checks within the loop. If this is likely then its best to create a set that is a intersection of the dictionary keys and your filter:
keys = cells.keys() & keys
In summary: del is already heavily optimised, so don't worry about using it.
Another map() way to remove list of keys from dictionary
and avoid raising KeyError exception
dic = {
'key1': 1,
'key2': 2,
'key3': 3,
'key4': 4,
'key5': 5,
}
keys_to_remove = ['key_not_exist', 'key1', 'key2', 'key3']
k = list(map(dic.pop, keys_to_remove, keys_to_remove))
print('k=', k)
print('dic after = \n', dic)
**this will produce output**
k= ['key_not_exist', 1, 2, 3]
dic after = {'key4': 4, 'key5': 5}
Duplicate keys_to_remove is artificial, it needs to supply defaults values for dict.pop() function.
You can add here any array with len_ = len(key_to_remove)
For example
dic = {
'key1': 1,
'key2': 2,
'key3': 3,
'key4': 4,
'key5': 5,
}
keys_to_remove = ['key_not_exist', 'key1', 'key2', 'key3']
k = list(map(dic.pop, keys_to_remove, np.zeros(len(keys_to_remove))))
print('k=', k)
print('dic after = ', dic)
** will produce output **
k= [0.0, 1, 2, 3]
dic after = {'key4': 4, 'key5': 5}
def delete_keys_from_dict(dictionary, keys):
"""
Deletes the unwanted keys in the dictionary
:param dictionary: dict
:param keys: list of keys
:return: dict (modified)
"""
from collections.abc import MutableMapping
keys_set = set(keys)
modified_dict = {}
for key, value in dictionary.items():
if key not in keys_set:
if isinstance(value, list):
modified_dict[key] = list()
for x in value:
if isinstance(x, MutableMapping):
modified_dict[key].append(delete_keys_from_dict(x, keys_set))
else:
modified_dict[key].append(x)
elif isinstance(value, MutableMapping):
modified_dict[key] = delete_keys_from_dict(value, keys_set)
else:
modified_dict[key] = value
return modified_dict
_d = {'a': 1245, 'b': 1234325, 'c': {'a': 1245, 'b': 1234325}, 'd': 98765,
'e': [{'a': 1245, 'b': 1234325},
{'a': 1245, 'b': 1234325},
{'t': 767}]}
_output = delete_keys_from_dict(_d, ['a', 'b'])
_expected = {'c': {}, 'd': 98765, 'e': [{}, {}, {'t': 767}]}
print(_expected)
print(_output)
I'm late to this discussion but for anyone else. A solution may be to create a list of keys as such.
k = ['a','b','c','d']
Then use pop() in a list comprehension, or for loop, to iterate over the keys and pop one at a time as such.
new_dictionary = [dictionary.pop(x, 'n/a') for x in k]
The 'n/a' is in case the key does not exist, a default value needs to be returned.
I know how to remove an entry, 'key' from my dictionary d, safely. You do:
if d.has_key('key'):
del d['key']
However, I need to remove multiple entries from a dictionary safely. I was thinking of defining the entries in a tuple as I will need to do this more than once.
entities_to_remove = ('a', 'b', 'c')
for x in entities_to_remove:
if x in d:
del d[x]
However, I was wondering if there is a smarter way to do this?
Using dict.pop:
d = {'some': 'data'}
entries_to_remove = ('any', 'iterable')
for k in entries_to_remove:
d.pop(k, None)
Using Dict Comprehensions
final_dict = {key: value for key, value in d if key not in [key1, key2]}
where key1 and key2 are to be removed.
In the example below, keys "b" and "c" are to be removed & it's kept in a keys list.
>>> a
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
>>> keys = ["b", "c"]
>>> print {key: a[key] for key in a if key not in keys}
{'a': 1, 'd': 4}
>>>
Why not like this:
entries = ('a', 'b', 'c')
the_dict = {'b': 'foo'}
def entries_to_remove(entries, the_dict):
for key in entries:
if key in the_dict:
del the_dict[key]
A more compact version was provided by mattbornski using dict.pop()
a solution is using map and filter functions
python 2
d={"a":1,"b":2,"c":3}
l=("a","b","d")
map(d.__delitem__, filter(d.__contains__,l))
print(d)
python 3
d={"a":1,"b":2,"c":3}
l=("a","b","d")
list(map(d.__delitem__, filter(d.__contains__,l)))
print(d)
you get:
{'c': 3}
If you also need to retrieve the values for the keys you are removing, this would be a pretty good way to do it:
values_removed = [d.pop(k, None) for k in entities_to_remove]
You could of course still do this just for the removal of the keys from d, but you would be unnecessarily creating the list of values with the list comprehension. It is also a little unclear to use a list comprehension just for the function's side effect.
Found a solution with pop and map
d = {'a': 'valueA', 'b': 'valueB', 'c': 'valueC', 'd': 'valueD'}
keys = ['a', 'b', 'c']
list(map(d.pop, keys))
print(d)
The output of this:
{'d': 'valueD'}
I have answered this question so late just because I think it will help in the future if anyone searches the same. And this might help.
Update
The above code will throw an error if a key does not exist in the dict.
DICTIONARY = {'a': 'valueA', 'b': 'valueB', 'c': 'valueC', 'd': 'valueD'}
keys = ['a', 'l', 'c']
def remove_key(key):
DICTIONARY.pop(key, None)
list(map(remove_key, keys))
print(DICTIONARY)
output:
DICTIONARY = {'b': 'valueB', 'd': 'valueD'}
Some timing tests for cpython 3 shows that a simple for loop is the fastest way, and it's quite readable. Adding in a function doesn't cause much overhead either:
timeit results (10k iterations):
all(x.pop(v) for v in r) # 0.85
all(map(x.pop, r)) # 0.60
list(map(x.pop, r)) # 0.70
all(map(x.__delitem__, r)) # 0.44
del_all(x, r) # 0.40
<inline for loop>(x, r) # 0.35
def del_all(mapping, to_remove):
"""Remove list of elements from mapping."""
for key in to_remove:
del mapping[key]
For small iterations, doing that 'inline' was a bit faster, because of the overhead of the function call. But del_all is lint-safe, reusable, and faster than all the python comprehension and mapping constructs.
I have no problem with any of the existing answers, but I was surprised to not find this solution:
keys_to_remove = ['a', 'b', 'c']
my_dict = {k: v for k, v in zip("a b c d e f g".split(' '), [0, 1, 2, 3, 4, 5, 6])}
for k in keys_to_remove:
try:
del my_dict[k]
except KeyError:
pass
assert my_dict == {'d': 3, 'e': 4, 'f': 5, 'g': 6}
Note: I stumbled across this question coming from here. And my answer is related to this answer.
I have tested the performance of three methods:
# Method 1: `del`
for key in remove_keys:
if key in d:
del d[key]
# Method 2: `pop()`
for key in remove_keys:
d.pop(key, None)
# Method 3: comprehension
{key: v for key, v in d.items() if key not in remove_keys}
Here are the results of 1M iterations:
del: 2.03s 2.0 ns/iter (100%)
pop(): 2.38s 2.4 ns/iter (117%)
comprehension: 4.11s 4.1 ns/iter (202%)
So both del and pop() are the fastest. Comprehensions are 2x slower.
But anyway, we speak nanoseconds here :) Dicts in Python are ridiculously fast.
Why not:
entriestoremove = (2,5,1)
for e in entriestoremove:
if d.has_key(e):
del d[e]
I don't know what you mean by "smarter way". Surely there are other ways, maybe with dictionary comprehensions:
entriestoremove = (2,5,1)
newdict = {x for x in d if x not in entriestoremove}
inline
import functools
#: not key(c) in d
d = {"a": "avalue", "b": "bvalue", "d": "dvalue"}
entitiesToREmove = ('a', 'b', 'c')
#: python2
map(lambda x: functools.partial(d.pop, x, None)(), entitiesToREmove)
#: python3
list(map(lambda x: functools.partial(d.pop, x, None)(), entitiesToREmove))
print(d)
# output: {'d': 'dvalue'}
I think using the fact that the keys can be treated as a set is the nicest way if you're on python 3:
def remove_keys(d, keys):
to_remove = set(keys)
filtered_keys = d.keys() - to_remove
filtered_values = map(d.get, filtered_keys)
return dict(zip(filtered_keys, filtered_values))
Example:
>>> remove_keys({'k1': 1, 'k3': 3}, ['k1', 'k2'])
{'k3': 3}
It would be nice to have full support for set methods for dictionaries (and not the unholy mess we're getting with Python 3.9) so that you could simply "remove" a set of keys. However, as long as that's not the case, and you have a large dictionary with potentially a large number of keys to remove, you might want to know about the performance. So, I've created some code that creates something large enough for meaningful comparisons: a 100,000 x 1000 matrix, so 10,000,00 items in total.
from itertools import product
from time import perf_counter
# make a complete worksheet 100000 * 1000
start = perf_counter()
prod = product(range(1, 100000), range(1, 1000))
cells = {(x,y):x for x,y in prod}
print(len(cells))
print(f"Create time {perf_counter()-start:.2f}s")
clock = perf_counter()
# remove everything above row 50,000
keys = product(range(50000, 100000), range(1, 100))
# for x,y in keys:
# del cells[x, y]
for n in map(cells.pop, keys):
pass
print(len(cells))
stop = perf_counter()
print(f"Removal time {stop-clock:.2f}s")
10 million items or more is not unusual in some settings. Comparing the two methods on my local machine I see a slight improvement when using map and pop, presumably because of fewer function calls, but both take around 2.5s on my machine. But this pales in comparison to the time required to create the dictionary in the first place (55s), or including checks within the loop. If this is likely then its best to create a set that is a intersection of the dictionary keys and your filter:
keys = cells.keys() & keys
In summary: del is already heavily optimised, so don't worry about using it.
Another map() way to remove list of keys from dictionary
and avoid raising KeyError exception
dic = {
'key1': 1,
'key2': 2,
'key3': 3,
'key4': 4,
'key5': 5,
}
keys_to_remove = ['key_not_exist', 'key1', 'key2', 'key3']
k = list(map(dic.pop, keys_to_remove, keys_to_remove))
print('k=', k)
print('dic after = \n', dic)
**this will produce output**
k= ['key_not_exist', 1, 2, 3]
dic after = {'key4': 4, 'key5': 5}
Duplicate keys_to_remove is artificial, it needs to supply defaults values for dict.pop() function.
You can add here any array with len_ = len(key_to_remove)
For example
dic = {
'key1': 1,
'key2': 2,
'key3': 3,
'key4': 4,
'key5': 5,
}
keys_to_remove = ['key_not_exist', 'key1', 'key2', 'key3']
k = list(map(dic.pop, keys_to_remove, np.zeros(len(keys_to_remove))))
print('k=', k)
print('dic after = ', dic)
** will produce output **
k= [0.0, 1, 2, 3]
dic after = {'key4': 4, 'key5': 5}
def delete_keys_from_dict(dictionary, keys):
"""
Deletes the unwanted keys in the dictionary
:param dictionary: dict
:param keys: list of keys
:return: dict (modified)
"""
from collections.abc import MutableMapping
keys_set = set(keys)
modified_dict = {}
for key, value in dictionary.items():
if key not in keys_set:
if isinstance(value, list):
modified_dict[key] = list()
for x in value:
if isinstance(x, MutableMapping):
modified_dict[key].append(delete_keys_from_dict(x, keys_set))
else:
modified_dict[key].append(x)
elif isinstance(value, MutableMapping):
modified_dict[key] = delete_keys_from_dict(value, keys_set)
else:
modified_dict[key] = value
return modified_dict
_d = {'a': 1245, 'b': 1234325, 'c': {'a': 1245, 'b': 1234325}, 'd': 98765,
'e': [{'a': 1245, 'b': 1234325},
{'a': 1245, 'b': 1234325},
{'t': 767}]}
_output = delete_keys_from_dict(_d, ['a', 'b'])
_expected = {'c': {}, 'd': 98765, 'e': [{}, {}, {'t': 767}]}
print(_expected)
print(_output)
I'm late to this discussion but for anyone else. A solution may be to create a list of keys as such.
k = ['a','b','c','d']
Then use pop() in a list comprehension, or for loop, to iterate over the keys and pop one at a time as such.
new_dictionary = [dictionary.pop(x, 'n/a') for x in k]
The 'n/a' is in case the key does not exist, a default value needs to be returned.