.append() behaves differently for Python dictionaries initialized from two different methods - python

ky = ['a','b','c']
val = 15
dict_1 = dict.fromkeys(ky,[])
dict_1['a'].append(val)
dict_2 = {'a':[],'b':[],'c':[]}
dict_2['a'].append(val)
print(dict_1)
print(dict_2)

They all refer to the same object when you use dict.fromkeys(). A change to one is a change to all, since every key maps to the same list.
You could use a dict comprehension instead of dict.fromkeys():
keys = ['a','b','c']
value = [0, 0]
{key: list(value) for key in keys}
{'a': [0, 0], 'b': [0, 0], 'c': [0, 0]}

Please correct me if I'm wrong, but given your post, I'm presuming your question is why dict_1 has the same value (15) for all its keys a, b, and c as the end result, right?
If so, I agree that this is unintuitive at first glance. What is happening is that by passing [] (an empty list) as the optional second argument to fromkeys, you populate dict_1 with references to one single empty list. Hence, when you append val through key a, the change is visible through every other key as well: fromkeys does not copy the value, it stores the same object under each key.
As you noticed yourself, there are other ways to initialize a dictionary of empty lists in Python (as you did "manually" with dict_2).
In case you're interested in another, "less manual" way to initialize a dictionary of empty lists in Python, you could use a dict comprehension, as in the following example:
dict_3 = {k: [] for k in ky}
That would initialize dict_3 with a separate empty list for each key (the expression [] is evaluated once per key), so keys b and c would be unaffected when you append val to the key a.
I hope that helps to clarify your example. Kind regards.

With dict_1 = dict.fromkeys(ky,[]), all keys are mapped to the same (empty) list.
So whenever you change the contents of that list, each one of these keys will (still) point to the same list containing these new contents.

dict_1 = dict.fromkeys(ky,[]) creates a single list and then calls fromkeys. That one list is used for the value paired with each key. One list referenced through all of the keys.
If you want unique lists, use a dictionary comprehension:
dict_3 = {k:[] for k in ky}
Here, k:[] is evaluated for each key and a new list is created for each value.
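One way to confirm what the answers above describe is to compare object identity with the is operator. This is only a minimal sketch; the names shared and separate are made up for illustration.

```python
ky = ['a', 'b', 'c']

# fromkeys stores the very same list object under every key
shared = dict.fromkeys(ky, [])
# the comprehension evaluates [] once per key, giving three distinct lists
separate = {k: [] for k in ky}

shared['a'].append(15)
separate['a'].append(15)

print(shared['a'] is shared['b'])      # True: one list, three references
print(separate['a'] is separate['b'])  # False: independent lists
print(shared)    # {'a': [15], 'b': [15], 'c': [15]}
print(separate)  # {'a': [15], 'b': [], 'c': []}
```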


list(dictionary) vs dictionary.keys() vs list(dictionary.keys()) in python

Given a dictionary my_dict, we apply list(my_dict) and my_dict.keys() in the following code
my_dict = {'1': 1, '2': 2, '3': 3, '4': 4}
list_keys = list(my_dict)
view_keys = my_dict.keys()
list_from_view = list(my_dict.keys())
print("list_keys : ", list_keys)
print("view_keys : ", view_keys)
print("list_from_view", list_from_view)
Results:
list_keys : ['1', '2', '3', '4']
view_keys : dict_keys(['1', '2', '3', '4'])
list_from_view ['1', '2', '3', '4']
What are the differences between using list(my_dict), my_dict.keys(), and list(my_dict.keys()), especially:
list(my_dict) vs my_dict.keys()
list(my_dict) vs list(my_dict.keys()) (what is the best (fast) way to get a list of keys)
Thanks.
A Python list is an iterable, but not all iterables are lists...
Let us examine your expressions:
1. list_keys = list(my_dict): here you use my_dict as an iterable over the keys and build a new list from it. Long story made short, you have copied the keys into a list. From that point on, you can apply any changes to the list or the initial dict without changing anything to the other object.
2. view_keys = my_dict.keys(): here you get a dict_keys view on the dictionary. It is a non modifiable iterable that can be used to access the keys of the dictionary. If you add an item to the dictionary, you will see it immediately in the view, but you can neither add a new element to view_keys nor change or remove one.
3. list_from_view = list(my_dict.keys()): here you access the view on the keys, and iterate it to build a list. In the end, it is exactly the same as the first way: you get an independent list.
Which one is best? It depends.
As I have already said, 1 and 3 give equivalent lists. 1 is probably more Pythonic because it uses the fact that a Python dictionary is implicitly an iterable over its keys. 3 is probably easier to understand for new Python users because it explicitly references an operation on keys.
2 is a completely different animal because instead of having an independent list object, you have a view on the initial dictionary that will follow its changes.
Now for the question:
what is the best (fast) way to get a list of keys
2 will not return a list of keys, because a list has append and remove methods that a view does not have.
1 and 3 should be seen as equivalent from a performance point of view, and I have already spoken of readability, which is the most important quality of Python code.
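If you want to check the performance claim on your own machine, a rough sketch with the standard timeit module could look like this. The dictionary size and the repeat count are arbitrary choices, and the timings will vary between runs and Python versions, so no numbers are quoted here.

```python
import timeit

my_dict = {str(i): i for i in range(1000)}

# Both calls build an independent list of the keys
t_direct = timeit.timeit(lambda: list(my_dict), number=2000)
t_via_view = timeit.timeit(lambda: list(my_dict.keys()), number=2000)

print(list(my_dict) == list(my_dict.keys()))  # True
print(f"list(my_dict):        {t_direct:.4f} s")
print(f"list(my_dict.keys()): {t_via_view:.4f} s")
```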
The question that you should have asked:
what is the more pythonic way to iterate over the list of keys of a dictionary?
Without a doubt: for key in my_dict. There is no need to convert it to a list, and a view is seldom necessary.
The major difference between these lies in the two types involved, namely list and dict_keys.
list(d) copies all of the dictionary's keys at that moment and stores them in a new list object.
A dict_keys object, on the other hand, provides you with a view on the dictionary's keys.
Difference between these is shown in the following:
d = {1: 2, 3: 4}
a = d.keys()
b = list(d)
a
# dict_keys([1, 3])
b
# [1, 3]
d[5] = 6
a
# dict_keys([1, 3, 5])
b
# [1, 3]
In conclusion, dict_keys object will show you the updates to your dict as soon as they are introduced, while list will stay the same.
Should you make changes to the list, those changes will not be reflected in the dict; on the other hand, you cannot make changes through dict_keys at all.
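The second half of that conclusion can be checked the same way. The sketch below reuses the dictionary from the example above; the names view and copy are made up for illustration.

```python
d = {1: 2, 3: 4}
view = d.keys()
copy = list(d)

# A view has no list-style mutators
print(hasattr(view, 'append'))  # False

# Changing the list leaves the dictionary alone
copy.append(99)
print(99 in d)  # False

# Removing a key from the dictionary shows up in the view, not in the list
del d[1]
print(list(view))  # [3]
print(copy)        # [1, 3, 99]
```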

python dictionary: how does appending of items work? [duplicate]

I came across this behavior that surprised me in Python 2.6 and 3.2:
>>> xs = dict.fromkeys(range(2), [])
>>> xs
{0: [], 1: []}
>>> xs[0].append(1)
>>> xs
{0: [1], 1: [1]}
However, dict comprehensions in 3.2 show a more polite demeanor:
>>> xs = {i:[] for i in range(2)}
>>> xs
{0: [], 1: []}
>>> xs[0].append(1)
>>> xs
{0: [1], 1: []}
>>>
Why does fromkeys behave like that?
Your Python 2.6 example is equivalent to the following, which may help to clarify:
>>> a = []
>>> xs = dict.fromkeys(range(2), a)
Each entry in the resulting dictionary will have a reference to the same object. The effects of mutating that object will be visible through every dict entry, as you've seen, because it's one object.
>>> xs[0] is a and xs[1] is a
True
Use a dict comprehension, or if you're stuck on Python 2.6 or older and you don't have dictionary comprehensions, you can get the dict comprehension behavior by using dict() with a generator expression:
xs = dict((i, []) for i in range(2))
In the first version, you use the same empty list object as the value for both keys, so if you change one, you change the other, too.
Look at this:
>>> empty = []
>>> d = dict.fromkeys(range(2), empty)
>>> d
{0: [], 1: []}
>>> empty.append(1) # same as d[0].append(1) because d[0] references empty!
>>> d
{0: [1], 1: [1]}
In the second version, a new empty list object is created in every iteration of the dict comprehension, so both are independent from each other.
As to "why" fromkeys() works like that - well, it would be surprising if it didn't work like that. fromkeys(iterable, value) constructs a new dict with keys from iterable that all have the value value. If that value is a mutable object, and you change that object, what else could you reasonably expect to happen?
To answer the actual question being asked: fromkeys behaves like that because there is no other reasonable choice. It is not reasonable (or even possible) to have fromkeys decide whether or not your argument is mutable and make new copies every time. In some cases it doesn't make sense, and in others it's just impossible.
The second argument you pass in is therefore just a reference, and is copied as such. An assignment of [] in Python means "a single reference to a new list", not "make a new list every time I access this variable". The alternative would be to pass in a function that generates new instances, which is the functionality that dict comprehensions supply for you.
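To make the "pass in a function that generates new instances" idea concrete, here is a minimal sketch. fromkeys_with_factory is a hypothetical helper, not something dict provides; it simply wraps a dict comprehension.

```python
def fromkeys_with_factory(keys, factory):
    """Hypothetical helper: call factory() once per key to get a fresh value."""
    return {k: factory() for k in keys}

xs = fromkeys_with_factory(range(2), list)
xs[0].append(1)
print(xs)  # {0: [1], 1: []}
```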
Here are some options for creating multiple actual copies of a mutable container:
As you mention in the question, dict comprehensions allow you to evaluate an arbitrary expression for each element:
d = {k: [] for k in range(2)}
The important thing here is that this is equivalent to putting the assignment d[k] = [] in a for loop. Each iteration creates a new list and assigns it as the value for that key.
Use the form of the dict constructor suggested by @Andrew Clark:
d = dict((k, []) for k in range(2))
This creates a generator which again makes the assignment of a new list to each key-value pair when it is executed.
Use a collections.defaultdict instead of a regular dict:
import collections
d = collections.defaultdict(list)
This option is a little different from the others. Instead of creating the new list references up front, defaultdict will call list every time you access a key that's not already there. You can therefore add the keys as lazily as you want, which can be very convenient sometimes:
for k in range(2):
    d[k].append(42)
Since you've set up the factory for new elements, this will actually behave exactly as you expected fromkeys to behave in the original question.
Use dict.setdefault when you access potentially new keys. This does something similar to what defaultdict does, but it has the advantage of being more controlled, in the sense that only the access you want to create new keys actually creates them:
d = {}
for k in range(2):
    d.setdefault(k, []).append(42)
The disadvantage is that a new empty list object gets created every time you call the function, even if it never gets assigned to a value. This is not a huge problem, but it could add up if you call it frequently and/or your container is not as simple as list.
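The trade-off between the last two options can be made visible with a small counting experiment. This is only a sketch: CountingList is a made-up list subclass whose sole purpose is to count how many instances get created.

```python
import collections

class CountingList(list):
    """List subclass that counts how many instances are created (illustration only)."""
    created = 0

    def __init__(self, *args):
        super().__init__(*args)
        CountingList.created += 1

# setdefault: the default argument is built on every call, used or not
d = {}
for _ in range(3):
    d.setdefault('k', CountingList()).append(42)
print(CountingList.created)  # 3 created, only the first one was stored
print(d)                     # {'k': [42, 42, 42]}

# defaultdict: the factory runs only when a key is missing
CountingList.created = 0
dd = collections.defaultdict(CountingList)
for _ in range(3):
    dd['k'].append(42)
print(CountingList.created)  # 1

# ...but a plain read of a missing key also creates it
dd['other']
print(sorted(dd))  # ['k', 'other']
```

This is the "more controlled" point made above: with setdefault, a key is only created where you explicitly ask for it, at the cost of building a throwaway default on each call.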

Find out if no items in a list are keys in a dictionary

I have this list:
source = ['sourceid', 'SubSourcePontiflex', 'acq_source', 'OptInSource', 'source',
'SourceID', 'Sub-Source', 'SubSource', 'LeadSource_295', 'Source',
'SourceCode', 'source_code', 'SourceSubID']
I am iterating over XML in Python to create a dictionary for each child node. The dictionary varies in length and keys with each iteration. Sometimes the dictionary will contain a key that is also an item in this list. Sometimes it won't. What I want to be able to do is, if a key in the dictionary is also an item in this list, append the value to a new list. If none of the keys in the dictionary are in the list source, I'd like to append a default value. I'm really having a brain block on how to do this. Any help would be appreciated.
Just use the in keyword to check for membership of some key in a dictionary.
The following example will print [3, 1] since 3 and 1 are keys in the dictionary and also elements of the list.
someList = [8, 9, 7, 3, 1]
someDict = {1:2, 2:3, 3:4, 4:5, 5:6}
intersection = [i for i in someList if i in someDict]
print(intersection)
You can just check if this intersection list is empty at every iteration. If the list is empty then you know that no items in the list are keys in the dictionary.
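Applied to the question, the same idea might look like the sketch below. collect_source_values is a hypothetical helper name, and the example dictionaries and the 'unknown' default are made-up data.

```python
def collect_source_values(node_dict, source, default):
    """Return the values whose keys appear in source, or [default] if none do."""
    matches = [node_dict[key] for key in source if key in node_dict]
    return matches if matches else [default]

source = ['sourceid', 'SourceID', 'Source']   # shortened list from the question

print(collect_source_values({'SourceID': 'web', 'name': 'x'}, source, 'unknown'))  # ['web']
print(collect_source_values({'name': 'x'}, source, 'unknown'))                     # ['unknown']
```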
in_source_and_dict = set(mydict.keys()).intersection(set(source))
in_dict_not_source = set(mydict.keys()) - set(source)
in_source_not_dict = set(source) - set(mydict.keys())
Iterate over the result of which one you want. In this case I guess you'll want to iterate over in_source_not_dict to provide default values.
In Python 3, you can perform set operations directly on the object returned by dict.keys():
in_source_and_dict = mydict.keys() & source
in_dict_not_source = mydict.keys() - source
in_source_not_dict = source - mydict.keys()
This will also work in Python 2.7 if you replace .keys() by .viewkeys().
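A short Python 3 sketch of the view-based set operations, with made-up node data. It also uses the isdisjoint() method of the keys view, which answers the "no items in the list are keys" question directly.

```python
source = ['sourceid', 'SourceID', 'Source']   # shortened list from the question
mydict = {'SourceID': 'web', 'name': 'x'}     # made-up node data

print(mydict.keys() & source)          # {'SourceID'}
print(sorted(mydict.keys() - source))  # ['name']

# True when no item of source is a key of the dictionary
print(mydict.keys().isdisjoint(source))         # False
print({'name': 'x'}.keys().isdisjoint(source))  # True
```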
my_dict = {...}  # placeholder: the dictionary for the current node
values = []
for s in source:
    if my_dict.get(s):
        values.append(my_dict[s])
if not values:
    values.append(default)
You can loop through the source list and see if there is a value for that key in the dictionary. If there is, append it to values. After that loop, if values is empty, append the default value (default stands for whatever fallback you choose).
Note: because my_dict.get(s) is tested for truthiness, a pair such as (key, None), or any other falsy value like 0 or an empty string, will not be appended to the list. If that is an issue, test s in my_dict instead.
You can do this with the any() function:
my_dict = {...}
keys = [...]
if not any(key in my_dict for key in keys):
    pass  # no keys here
Equivalently, with all() (De Morgan's laws):
if all(key not in my_dict for key in keys):
    pass  # no keys here

Correspondence between list indices originated from dictionary

I wrote the below code working with dictionary and list:
d = computeRanks()  # dictionary of id : interestRank pairs
lst = list(d.items())  # tuples (id, interestRank)
interestingIds = []
for i in range(20):  # randomly choose 20 highly ranked ids
    choice = randomWeightedChoice(d.values())  # returns random index from list
    interestingIds.append(lst[choice][0])
There may be an error here, because I'm not sure whether the indices in lst correspond to those in d.values().
Do you know how to write this better?
One of the policies of dict is that the results of dict.keys() and dict.values() will correspond so long as the contents of the dictionary are not modified.
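That guarantee is easy to check on a small example. The dictionary below is made-up data standing in for the id : rank pairs.

```python
d = {'a': 3, 'b': 1, 'c': 2}   # made-up id: rank pairs

# Pairing keys() with values() reproduces items() as long as d is not modified in between
print(list(zip(d.keys(), d.values())) == list(d.items()))  # True

# The same index therefore selects matching entries in both sequences
ids, ranks = list(d), list(d.values())
print(all(d[ids[i]] == ranks[i] for i in range(len(d))))   # True
```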
As @Ignacio says, the index choice does correspond to the intended element of lst, so your code's logic is correct. But your code should be much simpler: d already contains IDs for the elements, so rewrite randomWeightedChoice to take a dictionary and return an ID.
Perhaps it will help you to know that you can iterate over a dictionary's key-value pairs with d.items():
for k, v in d.items():
    ...  # etc.
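One possible shape for the suggested rewrite, sketched with the standard library's random.choices (Python 3.6+). random_weighted_ids is a hypothetical name, the ranks are made-up data, and the sketch assumes the ranks are non-negative numbers meant to be used directly as sampling weights, drawn with replacement like the loop in the question.

```python
import random

def random_weighted_ids(ranks, k):
    """Hypothetical replacement for randomWeightedChoice: return k IDs drawn
    with probability proportional to their rank (with replacement)."""
    ids = list(ranks)
    weights = list(ranks.values())
    return random.choices(ids, weights=weights, k=k)

d = {'id1': 5.0, 'id2': 1.0, 'id3': 0.5}   # made-up id: interestRank pairs
interestingIds = random_weighted_ids(d, 20)
print(interestingIds)
```

Because the function works on the dictionary itself, the question of whether list indices line up with d.values() no longer arises.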
