Dictionary of lists to nested dictionary - python

I have the following dictionary {44: [0, 1, 0, 3, 6]} and need to convert this to dict1 = {44: {0:0, 1:1, 2:0, 3:3, 4:6}} but my current for loop doesn't work:
maxnumbers = 5 #this is how many values are within the list
for i in list(range(maxnumbers)):
    for k in list(dict1.keys()):
        for g in dict1[k]:
            newdict[i] = g
print(num4)
Can you help me? Thanks in advance.

You can use a dictionary comprehension with enumerate:
d = {44: [0, 1, 0, 3, 6]}
{k:dict(enumerate(v)) for k,v in d.items()}
# {44: {0: 0, 1: 1, 2: 0, 3: 3, 4: 6}}

Use a simple nested dictionary-comprehension that uses enumerate:
d = {44: [0, 1, 0, 3, 6]}
print({k: {i: x for i, x in enumerate(v)} for k, v in d.items()})
# {44: {0: 0, 1: 1, 2: 0, 3: 3, 4: 6}}

a = {44: [0, 1, 0, 3, 6]}
a = {i: {j: a[i][j] for j in range(len(a[i]))} for i in a}
print(a)
Output:
{44: {0: 0, 1: 1, 2: 0, 3: 3, 4: 6}}

Why your current implementation doesn't work:
for i in list(range(maxnumbers)):
    for k in list(dict1.keys()):
        for g in dict1[k]:
            # this will iterate over all of the values in
            # dict1[k] and the i: v pair will be overwritten by
            # the last value
            newdict[i] = g
Taken in steps, this would look like:
# for value in [0, 1, 0, 3, 6]: Just take this set of values as an example
# first, value is 0, and say we are on i = 1, in the outer for loop
newdict[1] = 0
# Then it will progress to value = 1, but i has not changed
# which overwrites the previous value
newdict[1] = 1
# continues until that set of values is complete
In order to fix this, you'll want i and the values of dict1[k] to increment together. This can be accomplished with zip:
for index, value in zip(range(maxnumbers), dict1[k]):
    newdict[index] = value
Also, if you need access to both the keys and values, use dict.items():
for k, values in dict1.items():
    # then you can use zip on the values
    for idx, value in zip(range(maxnumbers), values):
        # rest of loop
However, the enumerate function already facilitates this:
for k, values in dict1.items():
    for idx, value in enumerate(values):
        # rest of loop
This is more robust, since you don't have to find what maxnumbers is ahead of time.
To do this in the traditional for loop that you've been using:
new_dict = {}
for k, v in dict1.items():
    sub_d = {} # create a new sub_dictionary
    for i, x in enumerate(v):
        sub_d[i] = x
    # assign that new sub_d as an element in new_dict
    # when the inner for loop completes
    new_dict[k] = sub_d
Or, more compactly:
d = {44: [0, 1, 0, 3, 6]}
new_d = {}
for k, v in d.items():
    new_d[k] = dict(enumerate(v))
Where the dict constructor will take an iterable of 2-element tuples as an argument, which enumerate provides
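As a minimal sketch, the compact loop above can be wrapped in a small helper. The name index_lists is made up for illustration; it relies only on dict() accepting the (index, value) pairs that enumerate() yields.

```python
def index_lists(d):
    """Return a new dict mapping each key to {position: item} for its list."""
    # dict() consumes the (index, value) tuples produced by enumerate()
    return {key: dict(enumerate(values)) for key, values in d.items()}


d = {44: [0, 1, 0, 3, 6], 7: []}
result = index_lists(d)
print(result)  # {44: {0: 0, 1: 1, 2: 0, 3: 3, 4: 6}, 7: {}}
```

An empty list simply becomes an empty inner dictionary, and the original d is left untouched.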

Related

Python dictionary find key of max value

In Python, I want to find the max value in d, but only among the keys 1, 2 and 3 rather than all the keys in d. How can I do this? Thank you.
d = {1: 5, 2: 0, 3: 4, 4: 0, 5: 1}
Collect the (key, value) pairs for the keys 1, 2 and 3 in a list of tuples, sort the list by value in descending order, and take the key of the first tuple ([0][0]).
d = {1: 5, 2: 0, 3: 4, 4: 0, 5: 1}
key_max_val = sorted([(k,v) for k,v in d.items() if k in [1,2,3]], key=lambda kv: kv[1], reverse=True)[0][0]
print(key_max_val) # Outputs 1
You can use operator.itemgetter.
It returns the key with the maximum value (over all keys of d; use d.iteritems() on Python 2):
In [873]: import operator
In [874]: d = {1: 5, 2: 0, 3: 4, 4: 0, 5: 1}
In [875]: max(d.items(), key=operator.itemgetter(1))[0]
Out[875]: 1
I think the following should work (based on
@Mayank Porwal's idea; sorry, I cannot comment yet). Note that it returns the maximum value, not its key:
d = {1: 5, 2: 0, 3: 4, 4: 0, 5: 1}
max(v for k,v in d.items())
Use a generator and the max builtin function:
Max value
max(v for k,v in d.items() if k in [1,2,3])
Max key
max(k for k,v in d.items() if k in [1,2,3])
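If what is wanted is the key whose value is largest, restricted to a subset of keys, one possible sketch is to rank the allowed keys by their value with key=d.get. The name allowed is an illustrative assumption, not something from the question.

```python
d = {1: 5, 2: 0, 3: 4, 4: 0, 5: 1}
allowed = [1, 2, 3]  # hypothetical name: the keys we are willing to consider

# Keep only allowed keys that actually exist in d,
# then pick the one whose value in d is largest.
candidates = [k for k in allowed if k in d]
key_of_max = max(candidates, key=d.get)

print(key_of_max, d[key_of_max])  # 1 5
```

Note that max() raises ValueError when candidates is empty, so a real caller may want to guard against that case.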

Delete dictionary keys that are out of bounds (python)

If you have a dictionary of integers:
d = {
1:[0],
2:[1],
3:[0,1,2,3,4],
4:[0],
5:[1],
6:[0,1,2,3,4],
11:[0],
22:[1],
33:[0,1,2,3,4],
44:[0],
55:[1],
66:[0,1,2,3,4]
}
You want to:
Validate that the keys are between 0 and 25.
Delete any keys that are outside of the range as they are not valid and will ruin the data set.
Dictionary keys are not naturally sorted.
Given that, how would you validate that your keys are in the required range?
My try:
for x,y in d.items():
    if x<0 or x>25:
        del d[x]
When run, I get the error:
RuntimeError: dictionary changed size during iteration
How would I compensate for this?
In your example, you are deleting keys from d while iterating over it, which is what raises the RuntimeError.
The easiest way to do this if you don't need to change the original d is to use a dictionary comprehension:
d = {k: v for k, v in d.items() if 0 <= k <= 25}
If you want to delete keys while iterating, you need to iterate over a copy instead and pop keys that don't hold to your condition:
d = {1:[0], 2:[1], 3:[0,1,2,3,4], 4:[0], 5:[1], 6:[0,1,2,3,4], 11:[0], 22:[1], 33:[0,1,2,3,4], 44:[0], 55:[1], 66:[0,1,2,3,4]}
for k in d.copy(): # or list(d)
    if not 0 <= k <= 25:
        d.pop(k) # or del d[k]
Which outputs:
{1: [0], 2: [1], 3: [0, 1, 2, 3, 4], 4: [0], 5: [1], 6: [0, 1, 2, 3, 4], 11: [0], 22: [1]}
As others have shown, reconstructing a new dictionary is always an easy way around this.
You can use a basic dict comprehension here:
{k: d[k] for k in d if 0 <= k <= 25}
Or even a functional approach with filter():
dict(filter(lambda x: 0 <= x[0] <= 25, d.items()))
You can use a dictionary comprehension:
d = { 1:[0], 2:[1], 3:[0,1,2,3,4], 4:[0], 5:[1], 6:[0,1,2,3,4], 11:[0], 22:[1], 33:[0,1,2,3,4], 44:[0], 55:[1], 66:[0,1,2,3,4] }
new_d = {a:b for a, b in d.items() if a <= 25 and a >= 0}
Output:
{1: [0], 2: [1], 3: [0, 1, 2, 3, 4], 4: [0], 5: [1], 6: [0, 1, 2, 3, 4], 11: [0], 22: [1]}
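When the dictionary has to be modified in place (for example because other code holds a reference to it), a minimal sketch is to collect the offending keys first and delete them afterwards. The helper name drop_keys_outside and its parameters are assumptions made for illustration.

```python
def drop_keys_outside(d, low, high):
    """Remove, in place, every key of d that is not within [low, high]."""
    # Build the list of keys to delete first, so d is never resized
    # while it is being iterated over.
    doomed = [k for k in d if not low <= k <= high]
    for k in doomed:
        del d[k]
    return doomed


d = {1: [0], 22: [1], 33: [0, 1, 2, 3, 4], -4: [0]}
removed = drop_keys_outside(d, 0, 25)
print(d)        # {1: [0], 22: [1]}
print(removed)  # [33, -4]
```

Returning the removed keys is optional, but it makes it easy to log or report which entries were considered invalid.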

Python dictionary from two lists

I have two lists, one of them is a list of values and the other is a list of dates.
I want to create a dictionary with values and dates as keys. But a lot of the values have the same "key" (date). I need to add the values with the same date (same key) together before making a dictionary.
Both of the lists have the same number of elements but the list of dates has some values duplicated (since every date has more than one value).
What would be the best way to group the values (add them together) based on the keys (dates)?
Examples of the lists
dates = [datetime(2014, 2, 1, 0, 0),datetime(2014, 2, 1, 0, 0),datetime(2014, 2, 1, 0, 0),datetime(2014, 3, 1, 0, 0),datetime(2014, 3, 1, 0, 0)]
values = [2,7,4,8,4]
I want my dictionary to look like this:
dict = {datetime(2014, 2, 1, 0, 0): 13, datetime(2014, 3, 1, 0, 0): 12}
If you have repeating dates and want to group the values for repeating keys, use a defaultdict:
from collections import defaultdict
d = defaultdict(int)
for dte, val in zip(dates, values):
    d[dte] += val
Output:
defaultdict(<class 'int'>, {datetime.datetime(2014, 2, 1, 0, 0): 13, datetime.datetime(2014, 3, 1, 0, 0): 12})
Or using a normal dict and dict.setdefault:
d = {}
for dte, val in zip(dates,values):
    d.setdefault(dte,0)
    d[dte] += val
Lastly you can use dict.get with a default value of 0:
d = {}
for dte, val in zip(dates,values):
    d[dte] = d.get(dte, 0) + val
The defaultdict is likely to be the most efficient approach, as it is designed for exactly this purpose.
Assuming this is your input:
>>> dates = ['2015-01-01', '2015-01-01', '2015-01-02', '2015-01-03']
>>> values = [10, 15, 10, 10]
Combine the values:
>>> data = list(zip(dates, values))
>>> data
[('2015-01-01', 10), ('2015-01-01', 15), ('2015-01-02', 10), ('2015-01-03', 10)]
Aggregate the values for the same dates:
>>> import itertools
>>> new_data = []
>>> for key, group in itertools.groupby(data, lambda x: x[0]):
...     tmp = [key, 0] #: '0' is the default value
...     for thing in group:
...         tmp[1] += thing[1]
...     new_data.append(tmp)
...
Print new_data:
>>> new_data
[['2015-01-01', 25], ['2015-01-02', 10], ['2015-01-03', 10]]
Now build the final dictionary:
>>> dict(new_data)
{'2015-01-01': 25, '2015-01-02': 10, '2015-01-03': 10}
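One caveat with this approach: itertools.groupby only groups consecutive items, so it gives per-date totals only when equal dates sit next to each other. A minimal sketch, using made-up unsorted sample data, sorts the pairs first:

```python
import itertools

dates = ['2015-01-02', '2015-01-01', '2015-01-02', '2015-01-01']
values = [10, 10, 5, 15]

# Sort the (date, value) pairs by date so equal dates become adjacent.
pairs = sorted(zip(dates, values), key=lambda pair: pair[0])

# Sum the values of each run of identical dates.
totals = {
    date: sum(value for _, value in group)
    for date, group in itertools.groupby(pairs, key=lambda pair: pair[0])
}
print(totals)  # {'2015-01-01': 25, '2015-01-02': 15}
```

Without the sort, the same date would appear as several separate groups and later groups would overwrite earlier ones in the resulting dictionary.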
itertools and defaultdict are pretty unnecessary for this. I think that this is simpler and easier to read.
dates = [datetime(2014, 2, 1, 0, 0),datetime(2014, 2, 1, 0, 0),datetime(2014, 2, 1, 0, 0),datetime(2014, 3, 1, 0, 0),datetime(2014, 3, 1, 0, 0)]
values = [2,7,4,8,4]
combined = {}
for (date,value) in zip(dates,values):
    if date in combined:
        combined[date] += value
    else:
        combined[date] = value
Performance analysis
I'm not saying that defaultdict is a bad solution, I was only pointing out that it requires more tacit knowledge to use without pitfalls.
It is not however the fastest solution.
from collections import defaultdict
from datetime import datetime
import timeit
dates = [datetime(2014, 2, 1, 0, 0),datetime(2014, 2, 1, 0, 0),datetime(2014, 2, 1, 0, 0),datetime(2014, 3, 1, 0, 0),datetime(2014, 3, 1, 0, 0)]
values = [2,7,4,8,4]
def combine_default_dict(dates=dates,values=values):
    d = defaultdict(int)
    for dte, val in zip(dates, values):
        d[dte] += val
    return d
def combine_setdefault(dates=dates,values=values):
    d = {}
    for dte, val in zip(dates,values):
        d.setdefault(dte,0)
        d[dte] += val
    return d
def combine_get(dates=dates,values=values):
    d = {}
    for dte, val in zip(dates,values):
        d[dte] = d.get(dte, 0) + val
    return d
def combine_contains(dates=dates,values=values):
    d = {}
    for (date,value) in zip(dates,values):
        if date in d:
            d[date] += value
        else:
            d[date] = value
    return d
def time_them(number=100000):
    # time every function in this module whose name starts with 'combine_'
    for func_name in [k for k in sorted(globals().keys()) if k.startswith('combine_')]:
        timer = timeit.Timer("{0}()".format(func_name),"from __main__ import {0}".format(func_name))
        time_taken = timer.timeit(number=number)
        print("{0} - {1}".format(time_taken,func_name))
Yields:
>>> time_them()
0.388070106506 - combine_contains
0.485766887665 - combine_default_dict
0.415601968765 - combine_get
0.472551822662 - combine_setdefault
I've tried it on a couple of different machines and Python versions. combine_default_dict competes with combine_setdefault for the slowest. combine_contains has been consistently the fastest.

List comprehension inside dictionary comprehension - scope

I am trying to create a complete graph in a Python Dictionary in 1 line. But when creating the list comprehension for the values I can not figure out how to specify that the key_value can not appear in the list of values (in graph speak, no self loop).
for n nodes
G = {k:[v for v in range(n)] for k in range(n) }
results in this (example n = 3)
{0: [0, 1, 2], 1: [0, 1, 2], 2: [0, 1, 2]}
but what I want is this
{0: [1, 2], 1: [0, 2], 2: [0, 1]}
But trying something similar to this
G = {k:[v for v in range(n) for v !=k] for k in range(n) }
will throw an error at the k in the list comprehension. So k must be out of scope for the list comprehension, which makes sense.
Can G be defined in this method?
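The error is a syntax error, not a scoping problem: a filter in a comprehension is written with if, not with a second for, and k is visible inside the inner list comprehension. To exclude the key from its own list of values, put the condition in your list comprehension.
G = { k: [v for v in range(n) if v != k] for k in range(n) }
So for n = 3 your graph G would be: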
{0: [1, 2], 1: [0, 2], 2: [0, 1]}
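As a small sketch, the same comprehension can be wrapped in a function and checked for a larger n. The function name complete_graph is an illustrative assumption.

```python
def complete_graph(n):
    """Adjacency lists of the complete graph on nodes 0..n-1, without self loops."""
    # k from the outer comprehension is visible inside the inner one
    return {k: [v for v in range(n) if v != k] for k in range(n)}


G = complete_graph(4)
print(G)  # {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
```

Every node ends up with n - 1 neighbours, and no node appears in its own list.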

List of dicts to/from dict of lists

I want to change back and forth between a dictionary of (equal-length) lists:
DL = {'a': [0, 1], 'b': [2, 3]}
and a list of dictionaries:
LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
For those of you that enjoy clever/hacky one-liners.
Here is DL to LD:
v = [dict(zip(DL,t)) for t in zip(*DL.values())]
print(v)
and LD to DL:
v = {k: [dic[k] for dic in LD] for k in LD[0]}
print(v)
LD to DL is a little hackier since you are assuming that the keys are the same in each dict. Also, please note that I do not condone the use of such code in any kind of real system.
If you're allowed to use outside packages, Pandas works great for this:
import pandas as pd
pd.DataFrame(DL).to_dict(orient="records")
Which outputs:
[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
You can also use orient="list" to get back the original structure
{'a': [0, 1], 'b': [2, 3]}
Perhaps consider using numpy:
import numpy as np
arr = np.array([(0, 2), (1, 3)], dtype=[('a', int), ('b', int)])
print(arr)
# [(0, 2) (1, 3)]
Here we access columns indexed by names, e.g. 'a', or 'b' (sort of like DL):
print(arr['a'])
# [0 1]
Here we access rows by integer index (sort of like LD):
print(arr[0])
# (0, 2)
Each value in the row can be accessed by column name (sort of like LD):
print(arr[0]['b'])
# 2
Going from the list of dictionaries to the dictionary of lists is straightforward. You can use this form:
DL={'a':[0,1],'b':[2,3], 'c':[4,5]}
LD=[{'a':0,'b':2, 'c':4},{'a':1,'b':3, 'c':5}]
nd={}
for d in LD:
    for k,v in d.items():
        try:
            nd[k].append(v)
        except KeyError:
            nd[k]=[v]
print(nd)
#{'a': [0, 1], 'b': [2, 3], 'c': [4, 5]}
Or use defaultdict:
import collections as cl
nd=cl.defaultdict(list)
for d in LD:
    for key,val in d.items():
        nd[key].append(val)
print(dict(nd.items()))
#{'a': [0, 1], 'b': [2, 3], 'c': [4, 5]}
Going the other way is problematic. You need some information about the order in which the dictionary's keys should be inserted into the list. Recall that before Python 3.7 the order of keys in a dict was not guaranteed to match the original insertion order.
For giggles, assume the insertion order is based on sorted keys. You can then do it this way:
nl=[]
nl_index=[]
for k in sorted(DL.keys()):
    nl.append({k:[]})
    nl_index.append(k)
for key,l in DL.items():
    for item in l:
        nl[nl_index.index(key)][key].append(item)
print(nl)
#[{'a': [0, 1]}, {'b': [2, 3]}, {'c': [4, 5]}]
If your question was based on curiosity, there is your answer. If you have a real-world problem, let me suggest you rethink your data structures. Neither of these seems to be a very scalable solution.
Here are the one-line solutions (spread out over multiple lines for readability) that I came up with:
if dl is your original dict of lists:
dl = {"a":[0, 1],"b":[2, 3]}
Then here's how to convert it to a list of dicts:
ld = [{key:value[index] for key,value in dl.items()}
      for index in range(max(map(len,dl.values())))]
If you assume that all your lists are the same length, you can simplify this and gain some performance:
ld = [{key:value[index] for key, value in dl.items()}
      for index in range(len(next(iter(dl.values()))))]
Here's how to convert that back into a dict of lists:
import functools
dl2 = {key:[item[key] for item in ld]
       for key in list(functools.reduce(
           lambda x, y: x.union(y),
           (set(dicts.keys()) for dicts in ld)
       ))
      }
If you're using Python 2 instead of Python 3, you can just use reduce instead of functools.reduce there.
You can simplify this if you assume that all the dicts in your list will have the same keys:
dl2 = {key:[item[key] for item in ld] for key in ld[0].keys() }
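When the dicts do not all share the same keys, indexing with item[key] over the union of keys raises KeyError for any dict that lacks a key. A hedged sketch of one way around this is to skip the dicts that do not have the key; the sample data below is invented for illustration.

```python
ld = [{"a": 0, "b": 2}, {"a": 1}, {"b": 3, "c": 9}]

# Collect every key in first-seen order (dicts keep insertion order in Python 3.7+).
all_keys = dict.fromkeys(key for item in ld for key in item)

# For each key, keep only the values from dicts that actually contain it.
dl2 = {key: [item[key] for item in ld if key in item] for key in all_keys}
print(dl2)  # {'a': [0, 1], 'b': [2, 3], 'c': [9]}
```

The resulting lists can have different lengths, so positions no longer line up across keys; whether that is acceptable depends on what the data represents.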
cytoolz.dicttoolz.merge_with
from cytoolz.dicttoolz import merge_with
merge_with(list, *LD)
{'a': [0, 1], 'b': [2, 3]}
Non-cython version
from toolz.dicttoolz import merge_with
merge_with(list, *LD)
{'a': [0, 1], 'b': [2, 3]}
The pandas library offers an easy-to-understand solution. As a complement to @chiang's answer, the solutions for both DL-to-LD and LD-to-DL are as follows:
import pandas as pd
DL = {'a': [0, 1], 'b': [2, 3]}
out1 = pd.DataFrame(DL).to_dict('records')
Output:
[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
In the other direction:
LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
out2 = pd.DataFrame(LD).to_dict('list')
Output:
{'a': [0, 1], 'b': [2, 3]}
The cleanest way I can think of on a summer Friday. As a bonus, it supports lists of different lengths (but in this case, DLtoLD(LDtoDL(l)) is no longer the identity).
From list to dict
Actually less clean than @dwerk's defaultdict version.
def LDtoDL (l) :
    result = {}
    for d in l :
        for k, v in d.items() :
            result[k] = result.get(k,[]) + [v] #inefficient
    return result
From dict to list
def DLtoLD (d) :
    if not d :
        return []
    #reserve as much *distinct* dicts as the longest sequence
    result = [{} for i in range(max (map (len, d.values())))]
    #fill each dict, one key at a time
    for k, seq in d.items() :
        for oneDict, oneValue in zip(result, seq) :
            oneDict[k] = oneValue
    return result
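To illustrate the remark about the round trip, here is a small sketch with dicts whose keys differ. The two functions restate the logic above under snake_case names chosen for this example; the sample rows are invented.

```python
def ld_to_dl(rows):
    """List of dicts -> dict of lists (values appended in row order)."""
    result = {}
    for row in rows:
        for key, value in row.items():
            result.setdefault(key, []).append(value)
    return result


def dl_to_ld(columns):
    """Dict of lists -> list of dicts, one dict per position."""
    if not columns:
        return []
    result = [{} for _ in range(max(map(len, columns.values())))]
    for key, seq in columns.items():
        for row, value in zip(result, seq):
            row[key] = value
    return result


rows = [{"a": 0}, {"a": 1, "b": 3}]
columns = ld_to_dl(rows)         # {'a': [0, 1], 'b': [3]}
round_trip = dl_to_ld(columns)   # [{'a': 0, 'b': 3}, {'a': 1}]
print(round_trip == rows)        # False
```

The value 3 moves from the second dict to the first, because the dict of lists does not record which row each value came from.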
I needed a method that works for lists of different lengths (so this is a generalization of the original question). Since I did not find any code here that works the way I expected, here is the code that works for me:
import itertools
from typing import Dict, List, TypeVar
S = TypeVar("S")
T = TypeVar("T")
def dict_of_lists_to_list_of_dicts(dict_of_lists: Dict[S, List[T]]) -> List[Dict[S, T]]:
    keys = list(dict_of_lists.keys())
    list_of_values = [dict_of_lists[key] for key in keys]
    product = list(itertools.product(*list_of_values))
    return [dict(zip(keys, product_elem)) for product_elem in product]
Examples:
>>> dict_of_lists_to_list_of_dicts({1: [3], 2: [4, 5]})
[{1: 3, 2: 4}, {1: 3, 2: 5}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5]})
[{1: 3, 2: 5}, {1: 4, 2: 5}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5, 6]})
[{1: 3, 2: 5}, {1: 3, 2: 6}, {1: 4, 2: 5}, {1: 4, 2: 6}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5, 6], 7: [8, 9, 10]})
[{1: 3, 2: 5, 7: 8},
{1: 3, 2: 5, 7: 9},
{1: 3, 2: 5, 7: 10},
{1: 3, 2: 6, 7: 8},
{1: 3, 2: 6, 7: 9},
{1: 3, 2: 6, 7: 10},
{1: 4, 2: 5, 7: 8},
{1: 4, 2: 5, 7: 9},
{1: 4, 2: 5, 7: 10},
{1: 4, 2: 6, 7: 8},
{1: 4, 2: 6, 7: 9},
{1: 4, 2: 6, 7: 10}]
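Note that this function builds the Cartesian product of the lists, which is a different operation from the position-wise pairing asked about in the original question. A short sketch contrasting the two on the question's data:

```python
import itertools

dl = {"a": [0, 1], "b": [2, 3]}
keys = list(dl)

# Position-wise pairing: row i takes element i of every list.
zipped = [dict(zip(keys, row)) for row in zip(*dl.values())]

# Cartesian product: every combination of one element per list.
combos = [dict(zip(keys, row)) for row in itertools.product(*dl.values())]

print(zipped)  # [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
print(combos)  # [{'a': 0, 'b': 2}, {'a': 0, 'b': 3}, {'a': 1, 'b': 2}, {'a': 1, 'b': 3}]
```

The product grows multiplicatively with the list lengths, so it is only appropriate when all combinations are genuinely wanted.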
Here is my small script:
a = {'a': [0, 1], 'b': [2, 3]}
elem = {}
result = []
for i in range(len(a['a'])): # (1)
    for key, value in a.items():
        elem[key] = value[i]
    result.append(elem)
    elem = {}
print(result)
I'm not sure this is the most elegant way.
(1) This assumes that all the lists have the same length.
Here is a solution without any libraries used:
def dl_to_ld(initial):
    finalList = []
    neededLen = 0
    for key in initial:
        if(len(initial[key]) > neededLen):
            neededLen = len(initial[key])
    for i in range(neededLen):
        finalList.append({})
    for i in range(len(finalList)):
        for key in initial:
            try:
                finalList[i][key] = initial[key][i]
            except IndexError:
                # this list is shorter than the longest one; skip the key
                pass
    return finalList
You can call it as follows:
dl = {'a':[0,1],'b':[2,3]}
print(dl_to_ld(dl))
#[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
If you don't mind a generator, you can use something like
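def f(dl):
    l = [(k, iter(v)) for k, v in dl.items()]
    while True:
        d = {}
        for k, i in l:
            # Python 3: a StopIteration escaping a generator expression
            # becomes a RuntimeError (PEP 479), so catch it explicitly
            try:
                d[k] = next(i)
            except StopIteration:
                break  # stop filling this dict at the first exhausted list
        if not d:
            break
        yield d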
It's not as "clean" as it could be for Technical Reasons: My original implementation did yield dict(...), but this ends up being the empty dictionary because (in Python 2.5) a for b in c does not distinguish between a StopIteration exception when iterating over c and a StopIteration exception when evaluating a.
On the other hand, I can't work out what you're actually trying to do; it might be more sensible to design a data structure that meets your requirements instead of trying to shoehorn it in to the existing data structures. (For example, a list of dicts is a poor way to represent the result of a database query.)
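For comparison, a lazy version can also be sketched with zip, which sidesteps the StopIteration handling entirely. This is an alternative rather than the author's generator: it stops at the shortest list instead of yielding partial dicts, and the name iter_rows is made up for illustration.

```python
def iter_rows(dl):
    """Lazily yield one dict per position, stopping at the shortest list."""
    keys = list(dl)
    # zip(*values) walks all lists in lockstep, one position at a time
    for row in zip(*dl.values()):
        yield dict(zip(keys, row))


rows = iter_rows({"a": [0, 1], "b": [2, 3, 4]})
print(next(rows))  # {'a': 0, 'b': 2}
print(list(rows))  # [{'a': 1, 'b': 3}]
```

Because it is a generator, no dict is built until the caller asks for it.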
List of dicts ⟶ dict of lists
from collections import defaultdict
from typing import TypeVar
K = TypeVar("K")
V = TypeVar("V")
def ld_to_dl(ld: list[dict[K, V]]) -> dict[K, list[V]]:
    dl = defaultdict(list)
    for d in ld:
        for k, v in d.items():
            dl[k].append(v)
    return dl
defaultdict creates an empty list if one does not exist upon key access.
Dict of lists ⟶ list of dicts
Collecting into "jagged" dictionaries
from typing import TypeVar
K = TypeVar("K")
V = TypeVar("V")
def dl_to_ld(dl: dict[K, list[V]]) -> list[dict[K, V]]:
    ld = []
    for k, vs in dl.items():
        ld += [{} for _ in range(len(vs) - len(ld))]
        for i, v in enumerate(vs):
            ld[i][k] = v
    return ld
This generates a list of dictionaries ld that may be missing items if the lengths of the lists in dl are unequal. It loops over all key-values in dl, and creates empty dictionaries if ld does not have enough.
Collecting into "complete" dictionaries only
(Usually intended only for equal-length lists.)
from typing import TypeVar
K = TypeVar("K")
V = TypeVar("V")
def dl_to_ld(dl: dict[K, list[V]]) -> list[dict[K, V]]:
    ld = [dict(zip(dl.keys(), v)) for v in zip(*dl.values())]
    return ld
This generates a list of dictionaries ld that have the length of the smallest list in dl.
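A third option, sketched here as an assumption rather than part of the answer above, is to pad the shorter lists instead of truncating or leaving keys out. itertools.zip_longest does this with a fill value; the name dl_to_ld_padded and the fill parameter are illustrative.

```python
from itertools import zip_longest


def dl_to_ld_padded(dl, fill=None):
    """Like the zip-based version, but pad short lists with `fill`."""
    keys = list(dl)
    # zip_longest keeps going until the longest list is exhausted
    return [dict(zip(keys, row)) for row in zip_longest(*dl.values(), fillvalue=fill)]


print(dl_to_ld_padded({"a": [0, 1, 2], "b": [3]}))
# [{'a': 0, 'b': 3}, {'a': 1, 'b': None}, {'a': 2, 'b': None}]
```

Every resulting dict then has the full set of keys, at the cost of not being able to tell a padded value from a genuine None unless a distinct fill value is chosen.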
DL={'a':[0,1,2,3],'b':[2,3,4,5]}
LD=[{'a':0,'b':2},{'a':1,'b':3}]
Empty_list = []
Empty_dict = {}
# find the length of the longest list among the dictionary's values
len_list = 0
for i in DL.values():
    if len_list < len(i):
        len_list = len(i)
for k in range(len_list):
    for i,j in DL.items():
        Empty_dict[i] = j[k]
    Empty_list.append(Empty_dict)
    Empty_dict = {}
LD = Empty_list
