Appending to list in Python dictionary [duplicate] - python

This question already has answers here:
Initialize List to a variable in a Dictionary inside a loop
(2 answers)
Closed 8 years ago.
Is there a more elegant way to write this code?
What I am doing: I have keys and dates. There can be a number of dates assigned to a key and so I am creating a dictionary of lists of dates to represent this. The following code works fine, but I was hoping for a more elegant and Pythonic method.
dates_dict = dict()
for key, date in cur:
if key in dates_dict:
dates_dict[key].append(date)
else:
dates_dict[key] = [date]
I was expecting the below to work, but I keep getting a NoneType has no attribute append error.
dates_dict = dict()
for key, date in cur:
dates_dict[key] = dates_dict.get(key, []).append(date)
This probably has something to do with the fact that
print([].append(1))
None
but why?

list.append returns None, since it is an in-place operation and you are assigning it back to dates_dict[key]. So, the next time when you do dates_dict.get(key, []).append you are actually doing None.append. That is why it is failing. Instead, you can simply do
dates_dict.setdefault(key, []).append(date)
But, we have collections.defaultdict for this purpose only. You can do something like this
from collections import defaultdict
dates_dict = defaultdict(list)
for key, date in cur:
dates_dict[key].append(date)
This will create a new list object, if the key is not found in the dictionary.
Note: Since the defaultdict will create a new list if the key is not found in the dictionary, this will have unintented side-effects. For example, if you simply want to retrieve a value for the key, which is not there, it will create a new list and return it.

Is there a more elegant way to write this code?
Use collections.defaultdict:
from collections import defaultdict
dates_dict = defaultdict(list)
for key, date in cur:
dates_dict[key].append(date)

dates_dict[key] = dates_dict.get(key, []).append(date) sets dates_dict[key] to None as list.append returns None.
In [5]: l = [1,2,3]
In [6]: var = l.append(3)
In [7]: print var
None
You should use collections.defaultdict
import collections
dates_dict = collections.defaultdict(list)

Related

Nested dictionaries from file in Python

So I'm trying to create a nested dictionary but I can't seem to wrap my head around the logic.
So say I have input coming in from a csv:
1,2,3
2,3,4
1,4,5
Now'd like to create a dictionary as follows:
d ={1:{2:3,4:5}, 2:{3:4}}
Such that for the first being some ID column that we create keys in the sub dictionary corresponding to second value.
The way I tried it was to go:
d[row[0]] = {row[1]:row[2]}
But that overwrites the first instead of appending/pushing to it, how would I go about this problem? I can't seem to wrap my mind around what keys to use.
Any guidance is appreciated.
Yes, cause dict[row[0]] = is dict[1] = what overwrites previous dict[1] value
You should use :
dict.setdefault(row[0],{})[row[1]] = row[2]
remember there must be no duplicates for row[1] then
or
dict.setdefault(row[0],{}).update({row[1]:row[2]})
if dict[row[0]] == None:
dict[row[0]] = {row[1]:row[2]}
else:
dict[row[0]][row[1]] = row[2]
you can use defaultdict, which is similaire to the built-in dictionaries, but will create dictionaries with a default values, in your case a default value will be a dictionary
from collections import defaultdict
res = defaultdict(dict)
with this code we are creating a defaultdict, with it's values default value being of type dict, so next we would do
for row in l:
res[row[0]][row[1]] = row[2]

Elegant way to set values in a nested json in python

I am setting up some values in a nested JSON. In the JSON, it is not necessary that the keys would always be present.
My sample code looks like below.
if 'key' not in data:
data['key'] = {}
if 'nested_key' not in data['key']:
data['key']['nested_key'] = some_value
Is there any other elegant way to achieve this? Simply assigning the value without if's like - data['key']['nested_key'] = some_value can sometimes throw KeyError.
I referred multiple similar questions about "getting nested JSON" on StackOverflow but none fulfilled my requirement. So I have added a new question. In case, this is a duplicate question then I'll remove this one once guided towards the right question.
Thanks
Please note that, for the insertion you need not check for the key and you can directly add it. But, defaultdict can be used. It is particularly helpful incase of values like lists.
from collections import defaultdict
data = defaultdict(dict)
data['key']['nested_key'] = some_value
defaultdict will ensure that you will never get a key error. If the key doesn't exist, it returns an empty object of the type with which you have initialized it.
List based example:
from collections import defaultdict
data = defaultdict(list)
data['key'].append(1)
which otherwise will have to be done like below:
data = {}
if 'key' not in data:
data['key'] = ['1']
else:
data['key'].append('2')
Example based on existing dict:
from collections import defaultdict
data = {'key1': 'sample'}
data_new = defaultdict(dict,data)
data_new['key']['something'] = 'nothing'
print data_new
Output:
defaultdict(<type 'dict'>, {'key1': 'sample', 'key': {'something': 'nothing'}})
You can write in one statement:
data.setdefault('key', {})['nested_value'] = some_value
but I am not sure it looks more elegant.
PS: if you prefer to use defaultdict as proposed by Jay, you can initialize the new dict with the original one returned by json.loads(), then passes it to json.dumps():
data2 = defaultdict(dict, data)
data2['key'] = value
json.dumps(data2) # print the expected dict

Faster way to add values to existing dictionary?

I have a dictionary of dictionaries of dictionaries
#Initialize the dictionary
myDict=dict()
for f in ncp:
myDict[f]={}
for t in ncp:
myDict[f][t] = {}
And now I go through and add a value to the lowest level (which happens to be a dictionary key and value of None), like so, but my current method is very slow
for s in subsetList:
stIndex = 0
for f in list(allNodes.intersection(set(s)))
for t in list(allNodes.difference(set( allNodes.intersection(s)))):
myDict[f][t]['st_'+str(stIndex)]=None
stIndex+=1
I try to do it with principles of comprehension, but I fail miserably because the examples I find for comprehension are creating the dictionary, not iterating through an already existing one to add. My attempt to do so wont even 'compile':
myDict[f][t]['st_'+str(stIndex)]
for f in list(allNodes.intersection(set(s)))
for t in list(allNodes.difference(set( allNodes.intersection(s)))) = None
I would write your code like this:
myDict = {}
for i, s in enumurate(subsetList):
tpl = ('st_%d' % (i,), None) # Used to create a new {'st_n': None} later
x = allNodes.intersection(s)
for f in x:
myDict[f] = {}
for t in allNodes.difference(x):
myDict[f][t] = dict([tpl])
This cuts down on the number of new objects you need to create, as well as initializing myDict on-demand.
This should be faster...
from itertools import product
from collections import defaultdict
mydict = defaultdict(dict)
for f, t in product(ncp, repeat=2):
myDict[f][t] = {}
for s in subsetList:
myDict[f][t]['st_'+str(stIndex)] = None
Or if the innermost key level is the same each time...
from itertools import product
from collections import defaultdict
innerDict = {}
for s in subsetList:
innerDict['st_'+str(stIndex)] = None
mydict = defaultdict(dict)
for f, t in product(ncp, repeat=2):
myDict[f][t] = innerDict.copy()
But I'm not sure whether creating a copy of the innermost dictionary is faster than iterating through your subsetList and creating the new dictionary each time. You'd need to time the two options.
Answering my own question here with a theory on best approach after much trial: The final result is myDict and it is a function of 2 elements: allNodes and subsetList, both of which are effectively static tables imported from SQL at the start of my program. So, why not calculate myDict once and store it in SQL and import it also. So instead of rebuilding it every time the program runs which takes 2 minutes, it is just a couple second pyodbc read. I know its kind of a cop out, but it works for the time being.

How to make a list as the default value for a dictionary? [duplicate]

This question already has answers here:
How to create key or append an element to key?
(6 answers)
Closed 1 year ago.
I have a Python code that looks like:
if key in dict:
dict[key].append(some_value)
else:
dict[key] = [some_value]
but I figure there should be some method to get around this 'if' statement. I tried:
dict.setdefault(key, [])
dict[key].append(some_value)
and
dict[key] = dict.get(key, []).append(some_value)
but both complain about "TypeError: unhashable type: 'list'". Any recommendations?
The best method is to use collections.defaultdict with a list default:
from collections import defaultdict
dct = defaultdict(list)
Then just use:
dct[key].append(some_value)
and the dictionary will create a new list for you if the key is not yet in the mapping. collections.defaultdict is a subclass of dict and otherwise behaves just like a normal dict object.
When using a standard dict, dict.setdefault() correctly sets dct[key] for you to the default, so that version should have worked just fine. You can chain that call with .append():
>>> dct = {}
>>> dct.setdefault('foo', []).append('bar') # returns None!
>>> dct
{'foo': ['bar']}
However, by using dct[key] = dct.get(...).append() you replace the value for dct[key] with the output of .append(), which is None.

How to go from a values_list to a dictionary of lists

I have a django queryset that returns a list of values:
[(client pk, timestamp, value, task pk), (client pk, timestamp, value, task pk),....,].
I am trying to get it to return a dictionary of this format:
{client pk:[[timestamp, value],[timestamp, value],...,], client pk:[list of lists],...,}
The values_list may have multiple records for each client pk. I have been able to get dictionaries of lists for client or task pk using:
def dict_from_vl(vls_list):
keys=[values_list[x][3] for x in range(0,len(values_list),1)]
values = [[values_list[x][1], values_list[x][2]] for x in range(0,len(values_list),1)]
target_dict=dict(zip(keys,values))
return target_dict
However using this method, values for the same key write over previous values as it iterates through the values_list, rather than append them to a list. So this works great for getting the most recent if the values list is sorted oldest records to newest, but not for the purpose of creating a list of lists for the dict value.
Instead of target_dict=dict(zip(keys,values)), do
target_dict = defaultdict(list)
for i, key in enumerate(keys):
target_dict[k].append(values[i])
(defaultdict is available in the standard module collections.)
from collections import defaultdict
d = defaultdict(list)
for x in vls_list:
d[x].append(list(x[1:]))
Although I'm not sure if I got the question right.
I know in Python you're supposed to cram everything into a single line, but you could do it the old fashioned way...
def dict_from_vl(vls_list):
target_dict = {}
for v in vls_list:
if v[0] not in target_dict:
target_dict[v[0]] = []
target_dict[v[0]].append([v[1], v[2]])
return target_dict
For better speed, I suggest you don't create the keys and values lists separately but simply use only one loop:
tgt_dict = defaultdict(list)
for row in vas_list:
tgt_dict[row[0]].append([row[1], row[2]])

Categories

Resources