Slices from keys of a multidimensional dictionary - python

I need to assign and retrieve slices of data using two keys, and I do not a priori know the values for one of the keys.
Specifically, I'm downloading and processing text data files that list float values by year and duration (e.g., 1 hour). The duration keys are predetermined, but the years are not. The data are provided sequentially, one line at a time (not tabular, in other words).
Because I don't know all the years in a given file, so far I've tried using defaultdict(dict). Here's my sample code.
from collections import defaultdict
a = defaultdict(dict)
a[2006][2]=0.024
a[2004][2]=0.157
a[2000][1]=0.64
a[2005][2]=0.346
a[2003][2]=0.165
a[2003][6]=0.8
a[2007][12]=0.642
a[2003][1]=0.664
a[2002][6]=0.579
a[2004][1]=0.829
a[2001][6]=0.344
a[2003][3]=0.508
a[2003][12]=0.66
a[2002][1]=0.923
:a
defaultdict(dict,
{2006: {2: 0.024},
2004: {2: 0.157, 1: 0.829},
2000: {1: 0.64},
2005: {2: 0.346},
2003: {2: 0.165, 6: 0.8, 1: 0.664, 3: 0.508, 12: 0.66},
2007: {12: 0.642},
2002: {6: 0.579, 1: 0.923},
2001: {6: 0.344}})
I need to do three things.
Retrieve all the year keys. Remember I don't know them ahead of time.
For each year, retrieve the duration key:value pairs. I figured that one out.
: a[2002]
{6: 0.579, 1: 0.923}
For each duration, retrieve the year key: value pairs. I'm stuck on this one.
I appreciate any help you can offer. If I should be doing this in numpy, pandas, or something else, feel free to redirect me. Keep in mind I don't know the year range ahead of time, and even if I did there are random gap years with no data.

I'm not sure exactly what you're looking for, but to get all the year keys and their values you can iterate over the dictionary like this:
for i in a:
    print(i, a[i])
Output:
2006 {2: 0.024}
2004 {2: 0.157, 1: 0.829}
2000 {1: 0.64}
2005 {2: 0.346}
2003 {2: 0.165, 6: 0.8, 1: 0.664, 3: 0.508, 12: 0.66}
2007 {12: 0.642}
2002 {6: 0.579, 1: 0.923}
2001 {6: 0.344}
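If the years need to come back in order, or you need to check for a gap year, a small sketch along these lines may help. It assumes `a` is the `defaultdict(dict)` from the question (only a few entries are repeated here). Note that merely indexing a missing key on a `defaultdict` creates it, so a membership test or `.get` is safer for lookups.

```python
from collections import defaultdict

a = defaultdict(dict)
a[2006][2] = 0.024
a[2004][2] = 0.157
a[2004][1] = 0.829
a[2000][1] = 0.64

# Year keys in ascending order (a dict only preserves insertion order).
years = sorted(a)
print(years)  # [2000, 2004, 2006]

# Test for a gap year without creating it as a side effect.
has_2001 = 2001 in a
print(has_2001)  # False

# .get() also leaves the defaultdict untouched for missing years.
missing = a.get(2001, {})
print(missing, len(a))  # {} 3

# Indexing a missing year, by contrast, inserts an empty dict for it.
_ = a[2001]
print(2001 in a, len(a))  # True 4
```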

from collections import defaultdict
a = defaultdict(dict)
a[2006][2]=0.024
a[2004][2]=0.157
a[2000][1]=0.64
a[2005][2]=0.346
a[2003][2]=0.165
a[2003][6]=0.8
a[2007][12]=0.642
a[2003][1]=0.664
a[2002][6]=0.579
a[2004][1]=0.829
a[2001][6]=0.344
a[2003][3]=0.508
a[2003][12]=0.66
a[2002][1]=0.923
print(a)
"""
defaultdict(<class 'dict'>, {2006: {2: 0.024}, 2004: {2: 0.157, 1: 0.829}, 2000: {1: 0.64}, 2005: {2: 0.346}, 2003: {2: 0.165, 6:0.8, 1: 0.664, 3: 0.508, 12: 0.66}, 2007: {12: 0.642}, 2002: {6:0.579, 1: 0.923}, 2001: {6: 0.344}})
"""
# Retrieve all the year keys. Remember I don't know them ahead of time.
for item in a:
    print(item)
"""
2006
2004
2000
2005
2003
2007
2002
2001
"""
# For each year, retrieve the duration key:value pairs. I figured that one out.
for year in a:
    dur_key_val = a[year]
    print(year, '=>', dur_key_val)
# For each duration, retrieve the year key: value pairs. I'm stuck on this one.
durationDict = {}
for year in a:
    for duration, value in a[year].items():
        # Group by duration; setdefault creates the inner dict on first use.
        durationDict.setdefault(duration, {})[year] = value
print(durationDict)
"""
{2: {2006: 0.024, 2004: 0.157, 2005: 0.346, 2003: 0.165}, 1: {2004: 0.829, 2000: 0.64, 2003: 0.664, 2002: 0.923}, 6: {2003: 0.8, 2002: 0.579, 2001: 0.344}, 3: {2003: 0.508}, 12: {2003: 0.66, 2007: 0.642}}
"""

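An alternative layout that may suit this problem is a single flat dictionary keyed by `(year, duration)` tuples, which can be sliced along either key with a comprehension. This is only a sketch: the helper names `by_year` and `by_duration` are made up for illustration, and only part of the sample data is repeated.

```python
# Flat store: one entry per (year, duration) pair, filled in any order.
data = {}
data[(2006, 2)] = 0.024
data[(2004, 2)] = 0.157
data[(2003, 6)] = 0.8
data[(2003, 1)] = 0.664
data[(2002, 6)] = 0.579
data[(2004, 1)] = 0.829


def by_year(store, year):
    """Return {duration: value} for one year."""
    return {d: v for (y, d), v in store.items() if y == year}


def by_duration(store, duration):
    """Return {year: value} for one duration."""
    return {y: v for (y, d), v in store.items() if d == duration}


# The set comprehension collects each year once; sorted() orders them.
years = sorted({y for y, _ in data})
print(years)                 # [2002, 2003, 2004, 2006]
print(by_year(data, 2003))   # {6: 0.8, 1: 0.664}
print(by_duration(data, 6))  # {2003: 0.8, 2002: 0.579}
```

Each slice scans the whole dictionary, so for very large files a nested index per key is likely to be faster.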
Related

Update value and keep the key on python dictionary

I have a parking place app where I want to calculate the available days in a week, and the total available hours per day.
I have different cities with different timeshifts, some have full time (for example 7:00-20:00) and others have separated time (for example 7:00-14:00 and 16:00-20:00).
Here is the loop I tried:
def get_available_hours(self):
    _dict = []
    time = 0
    query_set = TimeTableCity.objects.filter(city_id=self._city_id)
    for i in query_set:
        initial_hour_datetime = datetime.strptime(i.initial_hour, '%H:%M')
        end_hour_datetime = datetime.strptime(i.end_hour, '%H:%M')
        time = end_hour_datetime - initial_hour_datetime
        _dict.append({i.day_table.id: time.seconds / 3600})
        time = 0
    return _dict
And the returned dict at the end is the following:
[{4: 5.0}, {4: 4.0}, {5: 5.0}, {5: 4.0}, {1: 5.0}, {1: 4.0}, {2: 5.0}, {2: 4.0}, {3: 5.0}, {3: 4.0}]
The key is the day of the week, and the value is the hours for that shift.
Is there a way to sum the values for the same key?
You can use the dictionary's get method with a default of 0, so each shift is added to the running total for its day:
def get_available_hours(self):
    _dict = {}
    time = 0
    query_set = TimeTableCity.objects.filter(city_id=self._city_id)
    for i in query_set:
        initial_hour_datetime = datetime.strptime(i.initial_hour, '%H:%M')
        end_hour_datetime = datetime.strptime(i.end_hour, '%H:%M')
        time = end_hour_datetime - initial_hour_datetime
        _dict[i.day_table.id] = _dict.get(i.day_table.id, 0) + (time.seconds / 3600)
        time = 0
    return _dict
Have a look at Counter: https://docs.python.org/3/library/collections.html#collections.Counter
It can be used to sum the values of separate dicts that share the same key.
For example:
from collections import Counter
a = {'a': 5, 'b': 7}
b = {'a': 3, 'b': 2, 'c': 5}
dict(Counter(a)+Counter(b))
--
Out[7]: {'a': 8, 'b': 9, 'c': 5}
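One caveat with this approach: `Counter` addition drops any key whose total is zero or negative, whereas `Counter.update` keeps every key. A rough sketch applying `update` to a list of single-key dictionaries like the one in the question (the name `shifts` is just for illustration):

```python
from collections import Counter

shifts = [{4: 5.0}, {4: 4.0}, {5: 5.0}, {5: 4.0}, {1: 5.0}, {1: 4.0}]

totals = Counter()
for shift in shifts:
    # update() adds values to existing keys instead of replacing them.
    totals.update(shift)

print(dict(totals))  # {4: 9.0, 5: 9.0, 1: 9.0}

# The + operator silently discards non-positive totals; update() does not.
added = dict(Counter({'a': 0}) + Counter({'b': 2}))
print(added)  # {'b': 2}
```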
This method works without any imports. Someone will probably post a cleaner approach, but I think it is pretty readable:
d = [{4: 5.0}, {4: 4.0}, {5: 5.0}, {5: 4.0}, {1: 5.0}, {1: 4.0}, {2: 5.0}, {2: 4.0}, {3: 5.0}, {3: 4.0}]
summed = {}
for item in d:
    day_of_week = list(item.keys())[0]
    if day_of_week not in summed:
        summed[day_of_week] = item[day_of_week]
    else:
        summed[day_of_week] += item[day_of_week]
result:
Out[12]: {4: 9.0, 5: 9.0, 1: 9.0, 2: 9.0, 3: 9.0}

Inverted Index Python

I'm trying to create a dictionary of the form:
{<term>: [<number>, {<number1>: <number2>}]}
For example:
d = {term: [number, {number1: number2}]}
I tried to create the inner dictionary, but I'm new to Python and couldn't work out how to do it. I want d to have this form, and when I find term I want to update either number or the inner dictionary that maps number1 to number2.
So the question is:
Is it possible to create a dictionary like d? And if so, how can I access term, number and the inner dictionary?
d = {"term": [5, {6: 7}]}
The key's value is a list:
d["term"]
[5, {6: 7}]
The list's first element:
d["term"][0]
5
The list's second element is a dictionary:
d["term"][1]
{6: 7}
The value stored under the inner dictionary's key 6 is 7:
d["term"][1][6]
7
Edit:
Some examples for modification:
d = {"term": [5, {6: 7}]}
d["term"].append(10)
print(d)
Out: {'term': [5, {6: 7}, 10]}
l=d["term"]
l[0]=55
print(d)
Out: {'term': [55, {6: 7}, 10]}
insidedict=l[1]
print(insidedict)
{6: 7}
insidedict[66]=77
print(d)
{'term': [55, {6: 7, 66: 77}, 10]}
Sure, just define it as you have:
d = {'term': [5, {6: 7}]}
Since your dictionary has just one key, you can access the key via:
key = next(iter(d))
You can then access the value 5 via a couple of ways:
number = d[key][0]
number = next(iter(d.values()))[0]
Similarly, you can access the inner dictionary via either:
inner_dict = d[key][1]
inner_dict = next(iter(d.values()))[1]
And repeat the process for inner_dict if you want to access its key / value.
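Since the title mentions an inverted index, here is a minimal sketch of filling such a structure. It assumes, and this is only a guess at the intent, that `number` is the term's total count, `number1` a document id, and `number2` the count within that document. The `docs` sample and the name `index` are hypothetical.

```python
# Hypothetical input: document id -> text.
docs = {1: "red fish blue fish", 2: "blue sky"}

index = {}
for doc_id, text in docs.items():
    for term in text.split():
        # Create [total, {doc_id: count}] the first time a term is seen.
        entry = index.setdefault(term, [0, {}])
        entry[0] += 1                                   # update number
        entry[1][doc_id] = entry[1].get(doc_id, 0) + 1  # update number2

print(index["fish"])  # [2, {1: 2}]
print(index["blue"])  # [2, {1: 1, 2: 1}]
```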

Finding 3 biggest value in dictionaries contained within a list

my_list = [{0: 0}, {1: 4.2}, {2: 3.7}, {3: 5.0}, {4: 4.0}, {5: 3.3}, {6: 4.3}, {7: 4.0}, {8: 3.9}, 0, {10: 4.0}]
What I want my program to do is go through the list, record the highest value (as in the value from a key-value pair) once it's scanned through the entire thing, append that key-pair value to a new list, remove that key-pair value from the original list [my_list], and repeat the process twice more. So the desired output would look like this:
desired output: [{3: 5.0},{6: 4.3},{1: 4.2}]
I'm not sure how to achieve the desired output.
I'm assuming that the single integer in your my_list is a typo.
Use the heapq module to get the three largest items. This has slightly better complexity and memory efficiency than sorting the whole list and then extracting the last three elements.
>>> from heapq import nlargest
>>> my_list = [{0: 0}, {1: 4.2}, {2: 3.7}, {3: 5.0}, {4: 4.0}, {5: 3.3}, {6: 4.3}, {7: 4.0}, {8: 3.9}, {10: 4.0}]
>>> nlargest(3, my_list, key=lambda d: next(iter(d.values())))
[{3: 5.0}, {6: 4.3}, {1: 4.2}]
The key function specifies the criterion by which the items in your list are ordered. Here it simply fetches the only value each individual dictionary holds.
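If the stray `0` in the original list is real data rather than a typo, it has to be skipped before ranking. A possible Python 3 sketch, assuming every dictionary holds exactly one key-value pair; `only_value` is a helper name invented here:

```python
from heapq import nlargest

my_list = [{0: 0}, {1: 4.2}, {2: 3.7}, {3: 5.0}, {4: 4.0}, {5: 3.3},
           {6: 4.3}, {7: 4.0}, {8: 3.9}, 0, {10: 4.0}]


def only_value(d):
    """Return the single value stored in a one-item dictionary."""
    return next(iter(d.values()))


# Ignore anything that is not a dictionary (here, the bare integer 0).
dicts = [item for item in my_list if isinstance(item, dict)]
top_three = nlargest(3, dicts, key=only_value)
print(top_three)  # [{3: 5.0}, {6: 4.3}, {1: 4.2}]
```

`nlargest` leaves `my_list` unchanged, so removing the selected items from the original list would be a separate step.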

Group data in nested dictionary Python

I have a dictionary like this,
data = {'04-01-2012': [{1: 0.93}, {2: 0.9195000000000001}, {3: 0.9090000000000001}, {4: 0.8985000000000002},
{5: 0.8880000000000002}, {6: 0.8775000000000003}, {7: 0.8670000000000003},
{8: 0.8565000000000004}, {9: 0.8460000000000004}],
'12-01-2012': [{1: 0.96}],
'07-01-2012': [{1: 0.96}, {2: 0.95}, {3: 0.94}, {4: 0.9299999999999999}, {5: 0.9199999999999999},
{6: 0.9099999999999999}],
'06-01-2012': [{1: 0.945}, {2: 0.9365}, {3: 0.928}, {4: 0.9195000000000001}, {5: 0.9110000000000001},
{6: 0.9025000000000002}, {7: 0.8940000000000002}],
'10-01-2012': [{1: 0.93}, {2: 0.9244}, {3: 0.9188}],
'05-01-2012': [{1: 0.935}, {2: 0.926}, {3: 0.917}, {4: 0.908}, {5: 0.899}, {6: 0.89}, {7: 0.881}, {8: 0.872}],
'11-01-2012': [{1: 0.945}, {2: 0.9325}],
'02-01-2012': [{1: 0.94}, {2: 0.9299999999999999}, {3: 0.9199999999999999}, {4: 0.9099999999999999},
{5: 0.8999999999999999}, {6: 0.8899999999999999}, {7: 0.8799999999999999},
{8: 0.8699999999999999}, {9: 0.8599999999999999}, {10: 0.8499999999999999},
{11: 0.8399999999999999}],
'03-01-2012': [{1: 0.955}, {2: 0.9455}, {3: 0.936}, {4: 0.9265000000000001}, {5: 0.9170000000000001},
{6: 0.9075000000000002}, {7: 0.8980000000000002}, {8: 0.8885000000000003},
{9: 0.8790000000000003}, {10: 0.8695000000000004}],
'08-01-2012': [{1: 0.94}, {2: 0.9295}, {3: 0.919}, {4: 0.9085000000000001}, {5: 0.8980000000000001}],
'01-01-2012': [{1: 0.95}, {2: 0.94}, {3: 0.9299999999999999}, {4: 0.9199999999999999}, {5: 0.9099999999999999},
{6: 0.8999999999999999}, {7: 0.8899999999999999}, {8: 0.8799999999999999},
{9: 0.8699999999999999}, {10: 0.8599999999999999}, {11: 0.8499999999999999},
{12: 0.8399999999999999}],
'09-01-2012': [{1: 0.92}, {2: 0.91}, {3: 0.9}, {4: 0.89}]}
I need to iterate over the dictionary values and group all the 1's, 2's and so on.
This is my code so far
from collections import defaultdict
final = defaultdict(list)
for k, v in data.items():
    new_data = next(iter(v))
    for m, n in new_data.items():
        final[m].append(n)
print(final)
# defaultdict(<class 'list'>, {1: [0.935, 0.92, 0.955, 0.96, 0.94, 0.93, 0.95, 0.96, 0.945, 0.94, 0.945, 0.93]})
It groups only the 1's, not the 2's and so on. What am I doing wrong?
You forgot to iterate over the many tiny dictionaries:
from collections import defaultdict
final = defaultdict(list)
for k, v in data.items():
    for d in v:  # <-- this was missing
        for m, n in d.items():
            final[m].append(n)
print(final)
(You only called next(...), which yields the first item only.)
Output:
defaultdict(<class 'list'>, {1: [0.96, 0.935, 0.93, 0.945, 0.96, 0.95, 0.93, 0.94, 0.945, 0.955, 0.94, 0.92], 2: [0.926, 0.9244, 0.9365, 0.95, 0.94, 0.9195000000000001, 0.9299999999999999, 0.9325, 0.9455, 0.9295, 0.91], 3: [0.917, 0.9188, 0.928, 0.94, 0.9299999999999999, 0.9090000000000001, 0.9199999999999999, 0.936, 0.919, 0.9], 4: [0.908, 0.9195000000000001, 0.9299999999999999, 0.9199999999999999, 0.8985000000000002, 0.9099999999999999, 0.9265000000000001, 0.9085000000000001, 0.89], 5: [0.899, 0.9110000000000001, 0.9199999999999999, 0.9099999999999999, 0.8880000000000002, 0.8999999999999999, 0.9170000000000001, 0.8980000000000001], 6: [0.89, 0.9025000000000002, 0.9099999999999999, 0.8999999999999999, 0.8775000000000003, 0.8899999999999999, 0.9075000000000002], 7: [0.881, 0.8940000000000002, 0.8899999999999999, 0.8670000000000003, 0.8799999999999999, 0.8980000000000002], 8: [0.872, 0.8799999999999999, 0.8565000000000004, 0.8699999999999999, 0.8885000000000003], 9: [0.8699999999999999, 0.8460000000000004, 0.8599999999999999, 0.8790000000000003], 10: [0.8599999999999999, 0.8499999999999999, 0.8695000000000004], 11: [0.8499999999999999, 0.8399999999999999], 12: [0.8399999999999999]})
new_data = next(iter(v))
This is the line where it goes wrong. It returns only the first element of the list v, which in this case is the small dictionary with key 1; the remaining dictionaries in the list are never visited.
You can see this by adding a print statement:
from collections import defaultdict
final = defaultdict(list)
print(final)
for k, v in data.items():
    new_data = next(iter(v))
    for m, n in new_data.items():
        print(new_data)
        final[m].append(n)
#{1: 0.96}
#{1: 0.935}
#{1: 0.93}
#{1: 0.945}
#{1: 0.96}
#{1: 0.95}
#{1: 0.93}
#{1: 0.94}
#{1: 0.945}
#{1: 0.955}
#{1: 0.94}
#{1: 0.92}
The solution is to grab all the items instead of just the first. This can be done in many ways; the simplest, if not the prettiest, is to nest another loop.
from collections import defaultdict
final = defaultdict(list)
for k, v in data.items():
    iterable = iter(v)
    for i in range(len(v)):
        new_data = next(iterable)  # next() works in both Python 2 and 3
        for m, n in new_data.items():
            final[m].append(n)
print(final)
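The nested loops can also be flattened with `itertools.chain.from_iterable`, which walks every small dictionary in every list. This is just one alternative sketch, shown on a shortened subset of `data`:

```python
from collections import defaultdict
from itertools import chain

# Shortened sample in the same shape as the question's data.
data = {'12-01-2012': [{1: 0.96}],
        '11-01-2012': [{1: 0.945}, {2: 0.9325}],
        '10-01-2012': [{1: 0.93}, {2: 0.9244}, {3: 0.9188}]}

final = defaultdict(list)
# chain.from_iterable yields each {index: value} dict from every list.
for small_dict in chain.from_iterable(data.values()):
    for index, value in small_dict.items():
        final[index].append(value)

print(dict(final))
```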

How to change value for item in a list of dict in python in a simple way?

Suppose a list composed by several dict in python:
a = [{1: u'100'}, {2: u'200'}, {3: u'300'}]
I'd like to change the datatype of items of the dict from unicode to float, i.e.,
a = [{1: 100.0}, {2: 200.0}, {3: 300.0}]
Here is my current code:
for i in a:
    for j in i.keys():
        if type(i[j]) == unicode:
            i[j] = float(i[j])
It works but I hate this stupid expression.
There must be some much more elegant expression.
Please help.
>>> a = [{1: u'100'}, {2: u'200'}, {3: u'300'}]
>>> [{k:float(v) for k,v in d.iteritems()} for d in a]
[{1: 100.0}, {2: 200.0}, {3: 300.0}]
If you need to add a Unicode type check, you can, but then arguably a nested list/dict comprehension isn't all that readable any more:
>>> a = [{1: u'100'}, {2: u'200'}, {3: u'300', 4: "not unicode"}]
>>> [{k:float(v) if isinstance(v, unicode) else v for k,v in d.iteritems()} for d in a]
[{1: 100.0}, {2: 200.0}, {3: 300.0, 4: 'not unicode'}]
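The answers above target Python 2 (`unicode`, `iteritems`). On Python 3, where `str` is already Unicode, the type check can no longer tell the two kinds of string apart, so a roughly equivalent sketch might try the conversion and fall back to the original value. The helper name `to_float` is invented for this example.

```python
a = [{1: '100'}, {2: '200'}, {3: '300', 4: 'not a number'}]


def to_float(value):
    """Convert numeric strings to float; return other values unchanged."""
    if isinstance(value, str):
        try:
            return float(value)
        except ValueError:
            return value
    return value


converted = [{k: to_float(v) for k, v in d.items()} for d in a]
print(converted)  # [{1: 100.0}, {2: 200.0}, {3: 300.0, 4: 'not a number'}]
```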
