Related
Let's say we have a Python dictionary d, and we're iterating over it like so:
for k, v in d.iteritems():
del d[f(k)] # remove some item
d[g(k)] = v # add a new item
(f and g are just some black-box transformations.)
In other words, we try to add/remove items to d while iterating over it using iteritems.
Is this well defined? Could you provide some references to support your answer?
See also How to avoid "RuntimeError: dictionary changed size during iteration" error? for the separate question of how to avoid the problem.
Alex Martelli weighs in on this here.
It may not be safe to change the container (e.g. dict) while looping over the container.
So del d[f(k)] may not be safe. As you know, the workaround is to use d.copy().items() (to loop over an independent copy of the container) instead of d.iteritems() or d.items() (which use the same underlying container).
It is okay to modify the value at an existing index of the dict, but inserting values at new indices (e.g. d[g(k)] = v) may not work.
It is explicitly mentioned on the Python doc page (for Python 2.7) that
Using iteritems() while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries.
Similarly for Python 3.
The same holds for iter(d), d.iterkeys() and d.itervalues(), and I'll go as far as saying that it does for for k, v in d.items(): (I can't remember exactly what for does, but I would not be surprised if the implementation called iter(d)).
You cannot do that, at least with d.iteritems(). I tried it, and Python fails with
RuntimeError: dictionary changed size during iteration
If you instead use d.items(), then it works.
In Python 3, d.items() is a view into the dictionary, like d.iteritems() in Python 2. To do this in Python 3, instead use d.copy().items(). This will similarly allow us to iterate over a copy of the dictionary in order to avoid modifying the data structure we are iterating over.
I have a large dictionary containing Numpy arrays, so the dict.copy().keys() thing suggested by #murgatroid99 was not feasible (though it worked). Instead, I just converted the keys_view to a list and it worked fine (in Python 3.4):
for item in list(dict_d.keys()):
temp = dict_d.pop(item)
dict_d['some_key'] = 1 # Some value
I realize this doesn't dive into the philosophical realm of Python's inner workings like the answers above, but it does provide a practical solution to the stated problem.
The following code shows that this is not well defined:
def f(x):
return x
def g(x):
return x+1
def h(x):
return x+10
try:
d = {1:"a", 2:"b", 3:"c"}
for k, v in d.iteritems():
del d[f(k)]
d[g(k)] = v+"x"
print d
except Exception as e:
print "Exception:", e
try:
d = {1:"a", 2:"b", 3:"c"}
for k, v in d.iteritems():
del d[f(k)]
d[h(k)] = v+"x"
print d
except Exception as e:
print "Exception:", e
The first example calls g(k), and throws an exception (dictionary changed size during iteration).
The second example calls h(k) and throws no exception, but outputs:
{21: 'axx', 22: 'bxx', 23: 'cxx'}
Which, looking at the code, seems wrong - I would have expected something like:
{11: 'ax', 12: 'bx', 13: 'cx'}
Python 3 you should just:
prefix = 'item_'
t = {'f1': 'ffw', 'f2': 'fca'}
t2 = dict()
for k,v in t.items():
t2[k] = prefix + v
or use:
t2 = t1.copy()
You should never modify original dictionary, it leads to confusion as well as potential bugs or RunTimeErrors. Unless you just append to the dictionary with new key names.
This question asks about using an iterator (and funny enough, that Python 2 .iteritems iterator is no longer supported in Python 3) to delete or add items, and it must have a No as its only right answer as you can find it in the accepted answer. Yet: most of the searchers try to find a solution, they will not care how this is done technically, be it an iterator or a recursion, and there is a solution for the problem:
You cannot loop-change a dict without using an additional (recursive) function.
This question should therefore be linked to a question that has a working solution:
How can I remove a key:value pair wherever the chosen key occurs in a deeply nested dictionary? (= "delete")
Also helpful as it shows how to change the items of a dict on the run: How can I replace a key:value pair by its value wherever the chosen key occurs in a deeply nested dictionary? (= "replace").
By the same recursive methods, you will also able to add items as the question asks for as well.
Since my request to link this question was declined, here is a copy of the solution that can delete items from a dict. See How can I remove a key:value pair wherever the chosen key occurs in a deeply nested dictionary? (= "delete") for examples / credits / notes.
import copy
def find_remove(this_dict, target_key, bln_overwrite_dict=False):
if not bln_overwrite_dict:
this_dict = copy.deepcopy(this_dict)
for key in this_dict:
# if the current value is a dict, dive into it
if isinstance(this_dict[key], dict):
if target_key in this_dict[key]:
this_dict[key].pop(target_key)
this_dict[key] = find_remove(this_dict[key], target_key)
return this_dict
dict_nested_new = find_remove(nested_dict, "sub_key2a")
The trick
The trick is to find out in advance whether a target_key is among the next children (= this_dict[key] = the values of the current dict iteration) before you reach the child level recursively. Only then you can still delete a key:value pair of the child level while iterating over a dictionary. Once you have reached the same level as the key to be deleted and then try to delete it from there, you would get the error:
RuntimeError: dictionary changed size during iteration
The recursive solution makes any change only on the next values' sub-level and therefore avoids the error.
I got the same problem and I used following procedure to solve this issue.
Python List can be iterate even if you modify during iterating over it.
so for following code it will print 1's infinitely.
for i in list:
list.append(1)
print 1
So using list and dict collaboratively you can solve this problem.
d_list=[]
d_dict = {}
for k in d_list:
if d_dict[k] is not -1:
d_dict[f(k)] = -1 # rather than deleting it mark it with -1 or other value to specify that it will be not considered further(deleted)
d_dict[g(k)] = v # add a new item
d_list.append(g(k))
Today I had a similar use-case, but instead of simply materializing the keys on the dictionary at the beginning of the loop, I wanted changes to the dict to affect the iteration of the dict, which was an ordered dict.
I ended up building the following routine, which can also be found in jaraco.itertools:
def _mutable_iter(dict):
"""
Iterate over items in the dict, yielding the first one, but allowing
it to be mutated during the process.
>>> d = dict(a=1)
>>> it = _mutable_iter(d)
>>> next(it)
('a', 1)
>>> d
{}
>>> d.update(b=2)
>>> list(it)
[('b', 2)]
"""
while dict:
prev_key = next(iter(dict))
yield prev_key, dict.pop(prev_key)
The docstring illustrates the usage. This function could be used in place of d.iteritems() above to have the desired effect.
Let's say we have a Python dictionary d, and we're iterating over it like so:
for k, v in d.iteritems():
del d[f(k)] # remove some item
d[g(k)] = v # add a new item
(f and g are just some black-box transformations.)
In other words, we try to add/remove items to d while iterating over it using iteritems.
Is this well defined? Could you provide some references to support your answer?
See also How to avoid "RuntimeError: dictionary changed size during iteration" error? for the separate question of how to avoid the problem.
Alex Martelli weighs in on this here.
It may not be safe to change the container (e.g. dict) while looping over the container.
So del d[f(k)] may not be safe. As you know, the workaround is to use d.copy().items() (to loop over an independent copy of the container) instead of d.iteritems() or d.items() (which use the same underlying container).
It is okay to modify the value at an existing index of the dict, but inserting values at new indices (e.g. d[g(k)] = v) may not work.
It is explicitly mentioned on the Python doc page (for Python 2.7) that
Using iteritems() while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries.
Similarly for Python 3.
The same holds for iter(d), d.iterkeys() and d.itervalues(), and I'll go as far as saying that it does for for k, v in d.items(): (I can't remember exactly what for does, but I would not be surprised if the implementation called iter(d)).
You cannot do that, at least with d.iteritems(). I tried it, and Python fails with
RuntimeError: dictionary changed size during iteration
If you instead use d.items(), then it works.
In Python 3, d.items() is a view into the dictionary, like d.iteritems() in Python 2. To do this in Python 3, instead use d.copy().items(). This will similarly allow us to iterate over a copy of the dictionary in order to avoid modifying the data structure we are iterating over.
I have a large dictionary containing Numpy arrays, so the dict.copy().keys() thing suggested by #murgatroid99 was not feasible (though it worked). Instead, I just converted the keys_view to a list and it worked fine (in Python 3.4):
for item in list(dict_d.keys()):
temp = dict_d.pop(item)
dict_d['some_key'] = 1 # Some value
I realize this doesn't dive into the philosophical realm of Python's inner workings like the answers above, but it does provide a practical solution to the stated problem.
The following code shows that this is not well defined:
def f(x):
return x
def g(x):
return x+1
def h(x):
return x+10
try:
d = {1:"a", 2:"b", 3:"c"}
for k, v in d.iteritems():
del d[f(k)]
d[g(k)] = v+"x"
print d
except Exception as e:
print "Exception:", e
try:
d = {1:"a", 2:"b", 3:"c"}
for k, v in d.iteritems():
del d[f(k)]
d[h(k)] = v+"x"
print d
except Exception as e:
print "Exception:", e
The first example calls g(k), and throws an exception (dictionary changed size during iteration).
The second example calls h(k) and throws no exception, but outputs:
{21: 'axx', 22: 'bxx', 23: 'cxx'}
Which, looking at the code, seems wrong - I would have expected something like:
{11: 'ax', 12: 'bx', 13: 'cx'}
Python 3 you should just:
prefix = 'item_'
t = {'f1': 'ffw', 'f2': 'fca'}
t2 = dict()
for k,v in t.items():
t2[k] = prefix + v
or use:
t2 = t1.copy()
You should never modify original dictionary, it leads to confusion as well as potential bugs or RunTimeErrors. Unless you just append to the dictionary with new key names.
This question asks about using an iterator (and funny enough, that Python 2 .iteritems iterator is no longer supported in Python 3) to delete or add items, and it must have a No as its only right answer as you can find it in the accepted answer. Yet: most of the searchers try to find a solution, they will not care how this is done technically, be it an iterator or a recursion, and there is a solution for the problem:
You cannot loop-change a dict without using an additional (recursive) function.
This question should therefore be linked to a question that has a working solution:
How can I remove a key:value pair wherever the chosen key occurs in a deeply nested dictionary? (= "delete")
Also helpful as it shows how to change the items of a dict on the run: How can I replace a key:value pair by its value wherever the chosen key occurs in a deeply nested dictionary? (= "replace").
By the same recursive methods, you will also able to add items as the question asks for as well.
Since my request to link this question was declined, here is a copy of the solution that can delete items from a dict. See How can I remove a key:value pair wherever the chosen key occurs in a deeply nested dictionary? (= "delete") for examples / credits / notes.
import copy
def find_remove(this_dict, target_key, bln_overwrite_dict=False):
if not bln_overwrite_dict:
this_dict = copy.deepcopy(this_dict)
for key in this_dict:
# if the current value is a dict, dive into it
if isinstance(this_dict[key], dict):
if target_key in this_dict[key]:
this_dict[key].pop(target_key)
this_dict[key] = find_remove(this_dict[key], target_key)
return this_dict
dict_nested_new = find_remove(nested_dict, "sub_key2a")
The trick
The trick is to find out in advance whether a target_key is among the next children (= this_dict[key] = the values of the current dict iteration) before you reach the child level recursively. Only then you can still delete a key:value pair of the child level while iterating over a dictionary. Once you have reached the same level as the key to be deleted and then try to delete it from there, you would get the error:
RuntimeError: dictionary changed size during iteration
The recursive solution makes any change only on the next values' sub-level and therefore avoids the error.
I got the same problem and I used following procedure to solve this issue.
Python List can be iterate even if you modify during iterating over it.
so for following code it will print 1's infinitely.
for i in list:
list.append(1)
print 1
So using list and dict collaboratively you can solve this problem.
d_list=[]
d_dict = {}
for k in d_list:
if d_dict[k] is not -1:
d_dict[f(k)] = -1 # rather than deleting it mark it with -1 or other value to specify that it will be not considered further(deleted)
d_dict[g(k)] = v # add a new item
d_list.append(g(k))
Today I had a similar use-case, but instead of simply materializing the keys on the dictionary at the beginning of the loop, I wanted changes to the dict to affect the iteration of the dict, which was an ordered dict.
I ended up building the following routine, which can also be found in jaraco.itertools:
def _mutable_iter(dict):
"""
Iterate over items in the dict, yielding the first one, but allowing
it to be mutated during the process.
>>> d = dict(a=1)
>>> it = _mutable_iter(d)
>>> next(it)
('a', 1)
>>> d
{}
>>> d.update(b=2)
>>> list(it)
[('b', 2)]
"""
while dict:
prev_key = next(iter(dict))
yield prev_key, dict.pop(prev_key)
The docstring illustrates the usage. This function could be used in place of d.iteritems() above to have the desired effect.
I often deal with heterogeneous datasets and I acquire them as dictionaries in my python routines. I usually face the problem that the key of the next entry I am going to add to the dictionary already exists.
I was wondering if there exists a more "pythonic" way to do the following task: check whether the key exists and create/update the corresponding pair key-item of my dictionary
myDict = dict()
for line in myDatasetFile:
if int(line[-1]) in myDict.keys():
myDict[int(line[-1])].append([line[2],float(line[3])])
else:
myDict[int(line[-1])] = [[line[2],float(line[3])]]
Use a defaultdict.
from collections import defaultdict
d = defaultdict(list)
# Every time you try to access the value of a key that isn't in the dict yet,
# d will call list with no arguments (producing an empty list),
# store the result as the new value, and give you that.
for line in myDatasetFile:
d[int(line[-1])].append([line[2],float(line[3])])
Also, never use thing in d.keys(). In Python 2, that will create a list of keys and iterate through it one item at a time to find the key instead of using a hash-based lookup. In Python 3, it's not quite as horrible, but it's still redundant and still slower than the right way, which is thing in d.
Its what that dict.setdefault is for.
setdefault(key[, default])
If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.
example :
>>> d={}
>>> d.setdefault('a',[]).append([1,2])
>>> d
{'a': [[1, 2]]}
Python follows the idea that it's easier to ask for forgiveness than permission.
so the true Pythonic way would be:
try:
myDict[int(line[-1])].append([line[2],float(line[3])])
except KeyError:
myDict[int(line[-1])] = [[line[2],float(line[3])]]
for reference:
https://docs.python.org/2/glossary.html#term-eafp
https://stackoverflow.com/questions/6092992/why-is-it-easier-to-ask-forgiveness-than-permission-in-python-but-not-in-java
Try to catch the Exception when you get a KeyError
myDict = dict()
for line in myDatasetFile:
try:
myDict[int(line[-1])].append([line[2],float(line[3])])
except KeyError:
myDict[int(line[-1])] = [[line[2],float(line[3])]]
Or use:
myDict = dict()
for line in myDatasetFile:
myDict.setdefault(int(line[-1]),[]).append([line[2],float(line[3])])
I'm having a problem with setdefault and unions not working like I expect them to. My code looks like:
#!/usr/bin/python3.3
kanjidic = {
'恕': {'radical':{'multi_radical': {'口心女'}}},
'靛': {'radical':{'multi_radical': {'亠宀月疋二青土'}}},
}
k_rad = {}
for k,v in kanjidic.items():
if 'radical' in v and 'multi_radical' in v['radical']:
print (k, set(v['radical']['multi_radical']))
k_rad[k] = k_rad.setdefault(k, set()).update(
set(v['radical']['multi_radical']))
print('>>', k_rad[k])
The print output looks like:
恕 {'口心女'}
>> None
靛 {'亠宀月疋二青土'}
>> None
If I substitute the two lines below for setting k_rad:
k_rad[k] = k_rad.setdefault(k, set())
k_rad[k].update(set(v['radical']['multi_radical']))
My output looks like this:
靛 {'亠宀月疋二青土'}
>> {'亠宀月疋二青土'}
恕 {'口心女'}
>> {'口心女'}
If I understand setdefault, (which obviously I don't) the output should be the same,right?
What am I missing? Why am is dict.setupdate(key,set()).update(set(...)) returning None?
As pointed out below, the problem is that update returns None. I really didn't understand
how update and setdefault work together. Since setdefault sets the dict to the default if
we're creating a new dict element and returning the the hash and update updates the element
I didn't need the assignment. All I really needed was:
for k,v in kanjidic.items():
if 'radical' in v and 'multi_radical' in v['radical']:
k_rad.setdefault(k, set()).update(v['radical']['multi_radical'])
Thanks for the assistance!
dict.setdefault returns a set in your case. And set.update is an in-place operation, which means it changes the original set and returns None. So if you assign the result to a variable you just assign it a None.
None seems to work as a dictionary key, but I am wondering if that will just lead to trouble later. For example, this works:
>>> x={'a':1, 'b':2, None:3}
>>> x
{'a': 1, None: 3, 'b': 2}
>>> x[None]
3
The actual data I am working with is educational standards. Every standard is associated with a content area. Some standards are also associated with content subareas. I would like to make a nested dictionary of the form {contentArea:{contentSubArea:[standards]}}. Some of those contentSubArea keys would be None.
In particular, I am wondering if this will lead to confusion if I look for a key that does not exist at some point, or something unanticipated like that.
Any hashable value is a valid Python Dictionary Key. For this reason, None is a perfectly valid candidate. There's no confusion when looking for non-existent keys - the presence of None as a key would not affect the ability to check for whether another key was present. Ex:
>>> d = {1: 'a', 2: 'b', None: 'c'}
>>> 1 in d
True
>>> 5 in d
False
>>> None in d
True
There's no conflict, and you can test for it just like normal. It shouldn't cause you a problem. The standard 1-to-1 Key-Value association still exists, so you can't have multiple things in the None key, but using None as a key shouldn't pose a problem by itself.
You want trouble? here we go:
>>> json.loads(json.dumps({None:None}))
{u'null': None}
So yea, better stay away from json if you do use None as a key. You can patch this by custom (de/)serializer, but I would advise against use of None as a key in the first place.
None is not special in any particular way, it's just another python value. Its only distinction is that it happens to be the return value of a function that doesn't specify any other return value, and it also happens to be a common default value (the default arg of dict.get(), for instance).
You won't cause any run-time conflicts using such a key, but you should ask yourself if that's really a meaningful value to use for a key. It's often more helpful, from the point of view of reading code and understanding what it does, to use a designated instance for special values. Something like:
NoSubContent = SubContentArea(name=None)
{"contentArea":
{NoSubContent:[standards],
SubContentArea(name="Fruits"): ['apples', 'bananas']}}
jsonify does not support a dictionary with None key.
From Flask import jsonify
def json_():
d = {None: 'None'}
return jsonify(d)
This will throw an error:
TypeError: '<' not supported between instances of 'NoneType' and 'str'
It seems to me, the larger, later problem is this. If your process is creating pairs and some pairs have a "None" key, then it will overwrite all the previous None pairs. Your dictionary will silently throw out values because you had duplicate None keys. No?
Funny though, even this works :
d = {None: 'None'}
In [10]: None in d
Out[10]: True