Accessing values in a sub-dictionary within a dictionary - python

Hello I have a dictionary that looks like this:
dictionary = {'John': {'car':12, 'house':10, 'boat':3},
'Mike': {'car':5, 'house':4, 'boat':6}}
I want to gain access and extract the keys within the sub-dictionary and assign them to variables like this:
cars_total = dictionary['car']
house_total = dictionary['house']
boat_total = dictionary['boat']
Now, when I run the lines above I get a KeyError. That is understandable, because I need to access the outer dictionary first. I would appreciate a helping hand on how to access the keys and values within the sub-dictionaries, as those are the values I want to use.
I would also like to create a new key. This may not be right, but something along these lines:
car = dictionary['car']
house = dictionary['house']
boat = dictionary['boat']
dictionary['total_assets'] = car + house + boat
I want to be able to access all those keys in the dictionary and create a new key. The outer keys such as 'John' and 'Mike' should both contain the newly made key at the end.
I know this throws an error but it will give you an idea on what I want to achieve. Thanks for the help

I would just use a Counter object to get the totals:
>>> from collections import Counter
>>> totals = Counter()
>>> for v in dictionary.values():
...     totals.update(v)
...
>>> totals
Counter({'car': 17, 'house': 14, 'boat': 9})
>>> totals['car']
17
>>> totals['house']
14
>>>
This has the added benefit of working nicely even if the keys aren't always present.
If you want the total assets, you can then simply sum the values:
>>> totals['total_assets'] = sum(totals.values())
>>> totals
Counter({'total_assets': 40, 'car': 17, 'house': 14, 'boat': 9})
>>>
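To illustrate the point about missing keys, here is a minimal sketch. The `assets` data, in which 'Mike' owns no boat and 'Ann' owns only a bike, is made up for the example and is not from the question.

```python
from collections import Counter

# Hypothetical data: not every person has every asset type.
assets = {
    'John': {'car': 12, 'house': 10, 'boat': 3},
    'Mike': {'car': 5, 'house': 4},
    'Ann': {'bike': 2},
}

totals = Counter()
for owned in assets.values():
    # update() adds to keys already seen and creates the ones that are new.
    totals.update(owned)

print(totals['car'])    # 17
print(totals['boat'])   # 3
print(totals['plane'])  # 0, a missing key reads as zero instead of raising KeyError
```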

To sum the total assets for each person and add it as a new key:
for person in dictionary:
    dictionary[person]['total_assets'] = sum(dictionary[person].values())
which will result in:
dictionary = {'John': {'car':12, 'house':10, 'boat':3, 'total_assets':25},
'Mike': {'car':5, 'house':4, 'boat':6, 'total_assets':15}}
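One thing to watch: if that loop runs a second time, the existing total_assets value is itself counted as an asset. A sketch that avoids this by skipping the key when summing; the helper name `add_totals` is hypothetical, and the key name is taken from the question:

```python
def add_totals(people, total_key='total_assets'):
    """Store each person's asset sum under total_key, ignoring any old total."""
    for assets in people.values():
        assets[total_key] = sum(
            value for key, value in assets.items() if key != total_key
        )
    return people

dictionary = {'John': {'car': 12, 'house': 10, 'boat': 3},
              'Mike': {'car': 5, 'house': 4, 'boat': 6}}

add_totals(dictionary)
add_totals(dictionary)  # safe to repeat: the old total is not summed again
print(dictionary['John']['total_assets'])  # 25
print(dictionary['Mike']['total_assets'])  # 15
```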

dictionary doesn't have a key 'car', as you've seen. But dictionary['John'] does.
>>> dictionary['John']
{'car': 12, 'boat': 3, 'house': 10}
>>> dictionary['John']['car']
12
>>>
The value associated with each key in dictionary is, itself, another dictionary, which you index separately. There is no single object that contains, e.g., the car value for each subdictionary; you have to iterate over each value.
# Iterate over the dictionary once per aggregate
cars_total = sum(d['car'] for d in dictionary.values())
house_total = sum(d['house'] for d in dictionary.values())
boat_total = sum(d['boat'] for d in dictionary.values())
or
# Iterate over the dictionary once total
cars_total = 0
house_total = 0
boat_total = 0
for d in dictionary.values():
    cars_total += d['car']
    house_total += d['house']
    boat_total += d['boat']
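Both versions assume every person has all three keys; d['boat'] raises KeyError otherwise. A sketch using dict.get with a default of 0, where the helper `total_of` and the sample data are illustrative only:

```python
def total_of(people, asset):
    """Sum one asset across all people, treating a missing key as 0."""
    return sum(d.get(asset, 0) for d in people.values())

people = {'John': {'car': 12, 'house': 10, 'boat': 3},
          'Mike': {'car': 5, 'house': 4}}  # Mike has no boat in this example

print(total_of(people, 'car'))   # 17
print(total_of(people, 'boat'))  # 3
```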

dictionary = {'John': {'car':12, 'house':10, 'boat':3},'Mike': {'car':5, 'house':4, 'boat':6}}
total_cars=sum([dictionary[x]['car'] for x in dictionary ])
total_house=sum([dictionary[x]['house'] for x in dictionary ])
total_boats=sum([dictionary[x]['boat'] for x in dictionary ])
print(total_cars)
print(total_house)
print(total_boats)

Sample iteration method:
from collections import defaultdict
totals = defaultdict(int)
for person in dictionary:
    for element in dictionary[person]:
        totals[element] += dictionary[person][element]
print(totals)
Output:
defaultdict(<type 'int'>, {'car': 17, 'boat': 9, 'house': 14})

Related

Convert a list with duplicating keys into a dictionary and sum the values for each duplicating key

I am new to Python so I do apologize that my first question might not be asked clearly to achieve the right answer.
I thought if I converted a list with duplicating keys into a dictionary then I would be able to sum the values of each duplicating key. I have tried to search on Google and Stack Overflow but I actually still can't solve this problem.
Can anybody help, please? Thank you very much in advance and I truly appreciate your help.
list1 = ["a:2", "b:5", "c:7", "a:8", "b:12"]
My expected output is:
dict = {a: 10, b: 17, c: 7}
You can try this code:
list1 = ["a:2", "b:5", "c:7", "a:8", "b:12"]
l1 = [each.split(":") for each in list1]
d1 = {}
for each in l1:
    if each[0] not in d1:
        d1[each[0]] = int(each[1])
    else:
        d1[each[0]] += int(each[1])
d1
Output: {'a': 10, 'b': 17, 'c': 7}
Explanation:
Step 1. Convert your given list to key-value pairs by splitting each element of the original list at the ":" and storing the results in a list.
Step 2. Initialize an empty dictionary.
Step 3. Iterate through each key-value pair in the newly created list and store it in the dictionary. If the key doesn't exist yet, add a new key-value pair to the dictionary; otherwise add the value to its corresponding key.
A list does not have "keys" per se; rather, it has elements. In your example, the elements themselves are key-value pairs. To make the dictionary you want, you have to do 3 things:
Parse each element into its key value pair
Handle duplicate values
Add each pair to the dictionary.
The code should look like this:
list1 = ["a:2", "b:5", "c:7", "a:8", "b:12"]
dict1 = {}  # make an empty dictionary
for element in list1:
    key, value = element.split(':')  # split each list element into a (key, value) pair
    if key in dict1:  # check if the key is already in the dictionary
        dict1[key] += int(value)  # add to the existing key
    else:
        dict1[key] = int(value)  # initialize a new key
print(dict1)
That code prints out
{'a': 10, 'c': 7, 'b': 17}
You could use a defaultdict, iterate over each string and add the corresponding value after splitting it to a pair (key, value).
>>> from collections import defaultdict
>>> res = defaultdict(int)
>>> for el in list1:
...     k, v = el.split(':')
...     res[k] += int(v)
...
>>> res
defaultdict(<class 'int'>, {'a': 10, 'b': 17, 'c': 7})
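The same accumulation can also be written with collections.Counter, which treats a missing key as 0. This is a sketch of an alternative, not one of the answers above:

```python
from collections import Counter

list1 = ["a:2", "b:5", "c:7", "a:8", "b:12"]

totals = Counter()
for item in list1:
    # Split on the first colon only, in case a value ever contains one.
    key, value = item.split(":", 1)
    totals[key] += int(value)

print(dict(totals))  # {'a': 10, 'b': 17, 'c': 7}
```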

Retrieving items from a nested dictionary with a nested for loop results in KeyError

I need to systematically access dictionaries that are nested within a list within a dictionary at the 3rd level, like this:
responses = {'1': {'responses': [{1st dict to be retrieved}, {2nd dict to be retrieved}, ...]},
'2': {'responses': [{1st dict to be retrieved}, {2nd dict to be retrieved}, ...]}, ...}
I need to unnest and transform these nested dicts into dataframes, so the end result should look like this:
responses = {'1': df1,
'2': df2, ...}
In order to achieve this, I built a for-loop to loop through all keys on the first level. Within that loop, I am using another loop to extract each item from the nested dicts into a new empty dictionary called responses_dict:
responses_dict = {}
for key in responses.keys():
    for item in responses[key]['responses']:
        responses_dict[key].update(item)
However, I get:
KeyError: '1'
The inner loop works if I use it individually on a key within the dict, but that doesn't really help me since the data comes from an API and has to be updated dynamically every few minutes in production.
The next loop to transform the result into dataframes would look like this:
for key in responses_dict:
    responses_df[key] = pd.DataFrame.from_dict(responses_dict[key], orient='index')
But I haven't gotten to try that out since the first operation fails.
Try this:
from collections import defaultdict
responses_dict = defaultdict(dict) # instead of {}
Then your code will work.
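A self-contained sketch of that change. The `responses` payload here is made-up sample data standing in for the API response described in the question:

```python
from collections import defaultdict

# Made-up stand-in for the API payload.
responses = {
    '1': {'responses': [{'a': 1, 'b': 2}, {'c': 3}]},
    '2': {'responses': [{'e': 5}, {'f': 6}]},
}

responses_dict = defaultdict(dict)
for key in responses:
    for item in responses[key]['responses']:
        # The first access to responses_dict[key] creates an empty dict,
        # so update() has something to merge into.
        responses_dict[key].update(item)

print(dict(responses_dict))
# {'1': {'a': 1, 'b': 2, 'c': 3}, '2': {'e': 5, 'f': 6}}
```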
The problem is that responses_dict[key] does not exist yet for the key '1'.
Even a plain print(responses_dict[key]) raises the same error: '1' is not a key of that dict, so there is nothing for update to be called on.
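Try the following syntax:
responses_dict = {}
for key in responses.keys():
    print(key)
    for item in responses[key]['responses']:
        # update(key = item) would store everything under the literal string 'key',
        # so create the inner dict on first use and merge each item into it
        responses_dict.setdefault(key, {}).update(item)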
I prefer using dictionaries while updating a dictionary.
If you update with an existing key, the value of that key will be updated.
If you update with an new key-value pair, the pair will be added to that dictionary.
>>> d1 = {1: 10, 2: 20}
>>> d1.update({1: 20})
>>> d1
{1: 20, 2: 20}
>>> d1.update({3: 30})
>>> d1
{1: 20, 2: 20, 3: 30}
Try fixing your line with:
responses_dict = {}
for key in responses.keys():
    for item in responses[key]['responses']:
        # replaces any earlier item stored under the same key
        responses_dict.update({key: item})
So basically, use a dictionary to update a dictionary; it is more readable and easier.
Try this:
import pandas as pd
from itertools import chain
responses = {'1': {'responses': [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}]},
             '2': {'responses': [{'e': 5}, {'f': 6}]}}
result = {k: pd.DataFrame(chain.from_iterable(v['responses'])) for k, v in responses.items()}
for df in result.values():
    print(df, end='\n\n')
Output:
   0
0  a
1  b
2  c
3  d

   0
0  e
1  f

Make a new list depending on group number and add scores up as well

If I have a list within another list that looks like this...
[['Harry',9,1],['Harry',17,1],['Jake',4,1], ['Dave',9,2],['Sam',17,2],['Sam',4,2]]
How can I add the middle elements together, so that 'Harry', for example, shows up as ['Harry', 26], and also have Python look at the group number (3rd element) and output only the winner (the one with the highest score, which is the middle element)? For each group, there needs to be one winner. So the final output shows:
[['Harry', 26],['Sam',21]]
THIS QUESTION IS NOT A DUPLICATE: It has a third element as well which I am stuck about
The similar question gave me an answer of:
grouped_scores = {}
for name, score, group_number in players_info:
    if name not in grouped_scores:
        grouped_scores[name] = score
        grouped_scores[group_number] = group_number
    else:
        grouped_scores[name] += score
But that only adds the scores up, it doesn't take out the winner from each group. Please help.
I had thought of doing something like this, but I'm not sure exactly what to do...
grouped_scores = {}
for name, score, group_number in players_info:
    if name not in grouped_scores:
        grouped_scores[name] = score
    else:
        grouped_scores[name] += score
for group in group_number:
    if grouped_scores[group_number] = group_number:
        [don't know what to do here]
Solution:
Use itertools.groupby, and collections.defaultdict:
l=[['Harry',9,1],['Harry',17,1],['Jake',4,1], ['Dave',9,2],['Sam',17,2],['Sam',4,2]]
from itertools import groupby
from collections import defaultdict
l2=[list(y) for x,y in groupby(l,key=lambda x: x[-1])]
l3=[]
for x in l2:
    d=defaultdict(int)
    for x,y,z in x:
        d[x]+=y
    l3.append(max(list(map(list,dict(d).items())),key=lambda x: x[-1]))
Now:
print(l3)
Is:
[['Harry', 26], ['Sam', 21]]
Explanation:
The two import lines load the modules. The next line uses groupby to split the list into two groups based on the last element of each sub-list. Then an empty list is created. The outer loop iterates through the groups and creates a defaultdict for each one. The inner loop adds each score to that defaultdict under the player's name. The last line converts the dictionary into a list of [name, score] pairs and appends the pair with the highest score.
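Note that groupby only groups consecutive items, so the approach above relies on the input already being ordered by group number. A sketch that sorts first, using the question's rows in a deliberately shuffled order; the variable names are illustrative:

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

# Same rows as in the question, shuffled so the groups are not contiguous.
rows = [['Sam', 17, 2], ['Harry', 9, 1], ['Dave', 9, 2],
        ['Harry', 17, 1], ['Sam', 4, 2], ['Jake', 4, 1]]

winners = []
# Sorting by group number first makes each group one consecutive run.
for group, members in groupby(sorted(rows, key=itemgetter(2)), key=itemgetter(2)):
    scores = defaultdict(int)
    for name, score, _ in members:
        scores[name] += score
    # Keep only the highest total within this group.
    winners.append(list(max(scores.items(), key=itemgetter(1))))

print(winners)  # [['Harry', 26], ['Sam', 21]]
```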
I would aggregate the data first with a defaultdict.
>>> from collections import defaultdict
>>>
>>> combined = defaultdict(lambda: defaultdict(int))
>>> data = [['Harry',9,1],['Harry',17,1],['Jake',4,1], ['Dave',9,2],['Sam',17,2],['Sam',4,2]]
>>>
>>> for name, score, group in data:
...     combined[group][name] += score
...
>>> combined
defaultdict(<function __main__.<lambda>()>,
{1: defaultdict(int, {'Harry': 26, 'Jake': 4}),
2: defaultdict(int, {'Dave': 9, 'Sam': 21})})
Then apply max to each value in that dict.
>>> from operator import itemgetter
>>> [list(max(v.items(), key=itemgetter(1))) for v in combined.values()]
[['Harry', 26], ['Sam', 21]]
Use itertools.groupby, sum the middle values of each grouped element, and then append the result to a list based on the maximum condition:
import itertools
l=[['Harry',9,1],['Harry',17,1],['Jake',4,1], ['Dave',9,2],['Sam',17,2],['Sam',4,2]]
maxlist=[]
maxmiddleindexvalue=0
for key,value in itertools.groupby(l,key=lambda x:x[0]):
    s=0
    m=0
    for element in value:
        s+=element[1]
        m=max(m,element[1])
    if(m==maxmiddleindexvalue):
        maxlist.append([(key,s)])
    if(m>maxmiddleindexvalue):
        maxlist=[(key,s)]
        maxmiddleindexvalue=m
print(maxlist)
OUTPUT
[('Harry', 26), [('Sam', 21)]]

How can I correct my code to produce a nested dictionary?

I am trying to create a nested dictionary. I have a list of tuples (called 'kinetic_parameters') which looks like this:
('New Model','v7','k1',0.1)
('New Model','v8','k2',0.2)
('New Model','v8','k3',0.3)
I need the second column to be the outer key and the value to be another dictionary with inner key being the third column and value being the number in the fourth.
I currently have:
for i in kinetic_parameters:
    dict[i[1]]={}
    dict[i[1]][i[2]]=i[3]
But this code will not deal with multiple keys in the inner dictionary so I lose some information. Does anybody know how to correct my problem?
I'm using Python 2.7 and I want the output to look like this:
{'v7': {'k1': 0.1}, 'v8':{'k2':0.2, 'k3': 0.3}}
Use a defaultdict, and don't use dict as a variable name, since we need it to refer to the dictionary type:
import collections
d = collections.defaultdict(dict)
for i in kinetic_parameters:
    d[i[1]][i[2]]=i[3]
This will create the dictionaries automatically.
Right, if the major ("outer") key has been seen before, you should be using the existing dictionary. Or put the other way around: create an embedded dictionary only if it does not exist yet, then add the value. Here's the logic, using tuple assignment for clarity:
nested = dict()
for row in kinetic_parameters:
    _model, outkey, inkey, val = row
    if outkey not in nested:
        nested[outkey] = dict()
    nested[outkey][inkey] = val
Or you can skip the existence check by using defaultdict, which can create new embedded dicts as needed:
from collections import defaultdict
nested = defaultdict(dict)
for row in kinetic_parameters:
    _model, outkey, inkey, val = row
    nested[outkey][inkey] = val
In your code, the inner dictionary is re-initialized on every loop iteration. You need to initialize the dictionaries first and then add items to them:
d = {}
for i in kinetic_parameters:
    d[i[1]] = {}
for i in kinetic_parameters:
    d[i[1]][i[2]] = i[3]
or check before initializing:
d = {}
for i in kinetic_parameters:
    if d.get(i[1]) is None:
        d[i[1]] = {}
    d[i[1]][i[2]] = i[3]
kinetic_parameters = [('New Model','v7','k1',0.1),
('New Model','v8','k2',0.2),
('New Model','v8','k3',0.3)
]
d = {}
for i in kinetic_parameters:
    if i[1] not in d.keys():  # Check if v7, v8 etc is present.
        d[i[1]] = {}  # Create an empty dict if absent
    d[i[1]][i[2]] = i[3]
print(d)
Output is what you expected:
{'v7': {'k1': 0.1}, 'v8': {'k3': 0.3, 'k2': 0.2}}
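You can use a list comprehension to get the last 3 items of each tuple, then use the reduce function to create a nested dictionary (reduce is a builtin in Python 2; in Python 3 it must be imported from functools):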
>>> l=[('New Model','v7','k1',0.1),
... ('New Model','v8','k2',0.2),
... ('New Model','v8','k3',0.3)]
>>>
>>> [reduce(lambda x,y:{y:x},p) for p in [i[1:][::-1] for i in l]]
[{'v7': {'k1': 0.1}},
{'v8': {'k2': 0.2}},
{'v8': {'k3': 0.3}}]
This also works with longer lists:
>>> l=[('New Model','v7','k1',0.1,'c','5','r',9),
... ('New Model','v8','k2',0.2,'d','6'),
... ('New Model','v8','k3',0.3)]
>>> [reduce(lambda x,y:{y:x},p) for p in [i[1:][::-1] for i in l]]
[{'v7': {'k1': {0.1: {'c': {'5': {'r': 9}}}}}},
{'v8': {'k2': {0.2: {'d': '6'}}}},
{'v8': {'k3': 0.3}}]
Edit: If you want a dictionary as the main container you can use a generator expression within dict to convert your list to dictionary :
>>> g=[reduce(lambda x,y:{y:x},p) for p in [i[1:][::-1] for i in l]]
>>> dict(next(i.iteritems()) for i in g)
{'v8': {'k3': 0.3}, 'v7': {'k1': {0.1: {'c': {'5': {'r': 9}}}}}}
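Note that the reduce version builds one dictionary per row, and as the last output shows, the v8/k2 entry is lost when they are combined. A sketch of an alternative that merges rows sharing an outer key and still handles longer rows; the helper name `insert_path` is hypothetical:

```python
def insert_path(tree, keys, value):
    """Walk or create one nested dict per key, then store value at the last key."""
    node = tree
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value

rows = [('New Model', 'v7', 'k1', 0.1),
        ('New Model', 'v8', 'k2', 0.2),
        ('New Model', 'v8', 'k3', 0.3)]

nested = {}
for row in rows:
    # Skip the model name; the last field is the value, the rest are keys.
    insert_path(nested, row[1:-1], row[-1])

print(nested)  # {'v7': {'k1': 0.1}, 'v8': {'k2': 0.2, 'k3': 0.3}}
```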
Here is a solution, which I believe is easy to understand:
import collections
kinetic_parameters = [
('New Model','v7','k1',0.1),
('New Model','v8','k2',0.2),
('New Model','v8','k3',0.3),
]
result = collections.defaultdict(dict)
for _, outter_key, inner_key, inner_value in kinetic_parameters:
    outter_value = {inner_key: inner_value}
    result[outter_key].update(outter_value)
In this solution, we use defaultdict for the outer dictionary. The first time we encounter result[outter_key], an empty dictionary will be created and assigned to the value. The next step is to update that value (the inner dictionary).
Update
If you don't want to use defaultdict:
result = {}
for _, outter_key, inner_key, inner_value in kinetic_parameters:
    outter_value = {inner_key: inner_value}
    result.setdefault(outter_key, {})
    result[outter_key].update(outter_value)
The setdefault method creates a new dictionary and assigns it to the outer dictionary only the first time a given key is seen; on later calls it leaves the existing inner dictionary in place.
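Because setdefault also returns the inner dictionary, the lookup and the assignment can be combined into one line. A compact sketch of the same idea, with the variable names respelled as `outer_key` for the example:

```python
kinetic_parameters = [
    ('New Model', 'v7', 'k1', 0.1),
    ('New Model', 'v8', 'k2', 0.2),
    ('New Model', 'v8', 'k3', 0.3),
]

result = {}
for _, outer_key, inner_key, inner_value in kinetic_parameters:
    # setdefault returns the existing inner dict, or inserts and returns a new one.
    result.setdefault(outer_key, {})[inner_key] = inner_value

print(result)  # {'v7': {'k1': 0.1}, 'v8': {'k2': 0.2, 'k3': 0.3}}
```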

Inverting a dictionary when some of the original values are identical

Say I have a dictionary called word_counter_dictionary that counts how many words are in the document in the form {'word' : number}. For example, the word "secondly" appears one time, so the key/value pair would be {'secondly' : 1}. I want to make an inverted list so that the numbers will become keys and the words will become the values for those keys so I can then graph the top 25 most used words. I saw somewhere where the setdefault() function might come in handy, but regardless I cannot use it because so far in the class I am in we have only covered get().
inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    inverted_dictionary[new_key] = word_counter_dictionary.get(new_key, '') + str(key)
inverted_dictionary
So far, using this method above, it works fine until it reaches another word with the same value. For example, the word "saves" also appears once in the document, so Python will add the new key/value pair just fine. BUT it erases the {1 : 'secondly'} with the new pair so that only {1 : 'saves'} is in the dictionary.
So, bottom line, my goal is to get ALL of the words and their respective number of repetitions in this new dictionary called inverted_dictionary.
A defaultdict is perfect for this
word_counter_dictionary = {'first':1, 'second':2, 'third':3, 'fourth':2}
from collections import defaultdict
d = defaultdict(list)
for key, value in word_counter_dictionary.items():  # items() works in Python 2 and 3
    d[value].append(key)
print(d)
Output:
defaultdict(<type 'list'>, {1: ['first'], 2: ['second', 'fourth'], 3: ['third']})
What you can do is turn each value into a list of the words that share the same count:
word_counter_dictionary = {'first':1, 'second':2, 'third':3, 'fourth':2}
inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    if new_key in inverted_dictionary:
        inverted_dictionary[new_key].append(str(key))
    else:
        inverted_dictionary[new_key] = [str(key)]
print(inverted_dictionary)
{1: ['first'], 2: ['second', 'fourth'], 3: ['third']}
Python dicts do NOT allow repeated keys, so you can't use a simple dictionary to store multiple elements with the same key (1 in your case). For your example, I'd rather have a list as the value of your inverted dictionary, and store in that list the words that share the number of appearances, like:
inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    if new_key in inverted_dictionary:
        inverted_dictionary[new_key].append(key)
    else:
        inverted_dictionary[new_key] = [key]
In order to get the 25 most repeated words, you should iterate through the (sorted) keys in the inverted_dictionary and store the words:
common_words = []
for key in sorted(inverted_dictionary.keys(), reverse=True):
    if len(common_words) < 25:
        common_words.extend(inverted_dictionary[key])
    else:
        break
common_words = common_words[:25]  # In case there are more than 25 words
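Since the question says only get() has been covered in class, the inversion can also be written with get() alone. A minimal sketch, reusing the sample counts from the answers above:

```python
word_counter_dictionary = {'first': 1, 'second': 2, 'third': 3, 'fourth': 2}

inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    # Look up the list already stored under this count (or start an empty one),
    # then store a new list with the word added.
    inverted_dictionary[new_key] = inverted_dictionary.get(new_key, []) + [key]

print(inverted_dictionary)  # {1: ['first'], 2: ['second', 'fourth'], 3: ['third']}
```

The lookup is done on inverted_dictionary, not on word_counter_dictionary, so words that share a count accumulate in the same list.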
Here's a version that doesn't "invert" the dictionary:
>>> import operator
>>> A = {'a':10, 'b':843, 'c': 39, 'd': 10}
>>> B = sorted(A.items(), key=operator.itemgetter(1), reverse=True)
>>> B
[('b', 843), ('c', 39), ('a', 10), ('d', 10)]
Instead, it creates a list that is sorted, highest to lowest, by value.
To get the top 25, you simply slice it: B[:25].
And here's one way to get the keys and values separated (after putting them into a list of tuples):
>>> [x[0] for x in B]
['b', 'c', 'a', 'd']
>>> [x[1] for x in B]
[843, 39, 10, 10]
or
>>> C, D = zip(*B)
>>> C
('b', 'c', 'a', 'd')
>>> D
(843, 39, 10, 10)
Note that if you only want to extract the keys or the values (and not both), you should do so earlier. These are just examples of how to handle the list of tuples.
For getting the largest elements of some dataset an inverted dictionary might not be the best data structure.
Either put the items in a sorted list (the example assumes you want the two most frequent words):
word_counter_dictionary = {'first':1, 'second':2, 'third':3, 'fourth':2}
counter_word_list = sorted((count, word) for word, count in word_counter_dictionary.items())
Result:
>>> print(counter_word_list[-2:])
[(2, 'second'), (3, 'third')]
Or use Python's included batteries (heapq.nlargest in this case):
import heapq, operator
print(heapq.nlargest(2, word_counter_dictionary.items(), key=operator.itemgetter(1)))
Result:
[('third', 3), ('second', 2)]
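Another option from the standard library is collections.Counter, whose most_common method returns the highest counts directly. A sketch, assuming the counts already live in a plain dict as in the question:

```python
from collections import Counter

word_counter_dictionary = {'first': 1, 'second': 2, 'third': 3, 'fourth': 2}

# Counter accepts an existing mapping of counts.
top_two = Counter(word_counter_dictionary).most_common(2)
print(top_two)  # [('third', 3), ('second', 2)]
```

For the top 25 words, pass 25 instead of 2.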
