I have a dictionary "my_dict" in this format:
{'l1':{'c1': {'a': 0, 'b': 1, 'c': 2},
'c2': {'a': 3, 'b': 4, 'c': 5}},
'l2':{'c1': {'a': 0, 'b': 1, 'c': 2},
'c2': {'a': 3, 'b': 4, 'c': 5}}
}
Currently, I am using pd.DataFrame.from_dict(my_dict, orient='index') and get a df like this:
c2 c1
l1 {u'a': 3, u'c': 5, u'b': 4} {u'a': 0, u'c': 2, u'b': 1}
l2 {u'a': 3, u'c': 5, u'b': 4} {u'a': 0, u'c': 2, u'b': 1}
However, what I want is both l1/l2 and c2/c3 as indexes and a/b/c as columns.
Something like this:
a b c
l1 c1 0 1 2
c2 3 4 5
l2 c1 0 1 2
c2 3 4 5
What's the best way to do this?
Consider a dictionary comprehension to build a dictionary with tuple keys. Then, use pandas' MultiIndex.from_tuples. Below ast is used to rebuild you original dictionary from string (ignore the step on your end).
import pandas as pd
import ast
origDict = ast.literal_eval("""
{'l1':{'c1': {'a': 0, 'b': 1, 'c': 2},
'c2': {'a': 3, 'b': 4, 'c': 5}},
'l2':{'c1': {'a': 0, 'b': 1, 'c': 2},
'c2': {'a': 3, 'b': 4, 'c': 5}}
}""")
# DICTIONARY COMPREHENSION
newdict = {(k1, k2):v2 for k1,v1 in origDict.items() \
for k2,v2 in origDict[k1].items()}
print(newdict)
# {('l1', 'c2'): {'c': 5, 'a': 3, 'b': 4},
# ('l2', 'c1'): {'c': 2, 'a': 0, 'b': 1},
# ('l1', 'c1'): {'c': 2, 'a': 0, 'b': 1},
# ('l2', 'c2'): {'c': 5, 'a': 3, 'b': 4}}
# DATA FRAME ASSIGNMENT
df = pd.DataFrame([newdict[i] for i in sorted(newdict)],
index=pd.MultiIndex.from_tuples([i for i in sorted(newdict.keys())]))
print(df)
# a b c
# l1 c1 0 1 2
# c2 3 4 5
# l2 c1 0 1 2
# c2 3 4 5
Related
I have 2 array of objects:
a = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 5, 'f': 6}]
b = [{'a': 1, 'b': 2}, {'g': 3, 'h': 4}, {'f': 6, 'e': 5}]
Output:
a - b = [{'c': 3, 'd': 4}] ("-" symbol is only for representation, showing difference. Not mathematical minus.)
b - a = [{'g': 3, 'h': 4}]
In every array, the order of key may be different. I can try following and check for that:
for i in range(len(a)):
current_val = a[i]
for x, y in current_val.items:
//search x keyword in array b and compare it with b
but this approach doesn't feel right. Is there simpler way to do this or any utility library which can do this similar to fnc or pydash?
You can use lambda:
g = lambda a,b : [x for x in a if x not in b]
g(a,b) # a-b
[{'c': 3, 'd': 4}]
g(b,a) # b-a
[{'g': 3, 'h': 4}]
Just test if all elements are in the other array
a = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 5, 'f': 6}]
b = [{'a': 1, 'b': 2}, {'g': 3, 'h': 4}, {'f': 6, 'e': 5}]
def find_diff(array_a, array_b):
diff = []
for e in array_a:
if e not in array_b:
diff.append(e)
return diff
print(find_diff(a, b))
print(find_diff(b, a))
the same with list comprehension
def find_diff(array_a, array_b):
return [e for e in array_a if e not in array_b]
here is the code for subtracting list of dictionaries
a = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 6, 'f': 6}]
b = [{'a': 1, 'b': 2}, {'g': 3, 'h': 4}, {'f': 6, 'e': 6}]
a_b = []
b_a = []
for element in a:
if element not in b:
a_b.append( element )
for element in b:
if element not in a:
b_a.append( element )
print("a-b =",a_b)
print("b-a =",b_a)
If I have many dictionaries that I would like to modify (e.g., to filter out some value in all of them), how can I proceed in a efficient/pythonic way?
In the following example, the filtering operation within the loop works, but the actual dictionaries are not changed/affected:
d1 = {key:val for key, val in zip(('a', 'b', 'c', 'd', 'e'), range(5))}
d2 = {key:val for key, val in zip(('a', 'b', 'c', 'd', 'e'), range(4, 9))}
for d in (d1, d2):
print d
d = {key: d[key] for key in d if d[key] != 4}
print d
print d1
print d2
# {'a': 0, 'c': 2, 'b': 1, 'e': 4, 'd': 3}
# {'a': 0, 'c': 2, 'b': 1, 'd': 3}
# {'a': 4, 'c': 6, 'b': 5, 'e': 8, 'd': 7}
# {'c': 6, 'b': 5, 'e': 8, 'd': 7}
# {'a': 0, 'c': 2, 'b': 1, 'e': 4, 'd': 3}
# {'a': 4, 'c': 6, 'b': 5, 'e': 8, 'd': 7}
This should do the trick:
d1 = {key:val for key, val in zip(('a', 'b', 'c', 'd', 'e'), range(5))}
d2 = {key:val for key, val in zip(('a', 'b', 'c', 'd', 'e'), range(4, 9))}
dicts = [d1, d2]
print dicts
#[{'a': 0, 'c': 2, 'b': 1, 'e': 4, 'd': 3}, {'a': 4, 'c': 6, 'b': 5, 'e': 8, 'd': 7}]
for i, d in enumerate(dicts):
for k, v in d.items():
if v == 4:
del dicts[i][k]
print dicts
#[{'a': 0, 'c': 2, 'b': 1, 'd': 3}, {'c': 6, 'b': 5, 'e': 8, 'd': 7}]
print d1
#{'a': 0, 'b': 1, 'c': 2, 'd': 3}
print d2
#{'b': 5, 'c': 6, 'd': 7, 'e': 8}
This question already has answers here:
How do I merge two dictionaries in a single expression in Python?
(43 answers)
Closed 5 years ago.
I have two lists of dictionaries which I am trying to get the product of:
from itertools import product
list1 = [{'A': 1, 'B': 1}, {'A': 2, 'B': 2}, {'A': 2, 'B': 1}, {'A': 1, 'B': 2}]
list2 = [{'C': 1, 'D': 1}, {'C': 1, 'D': 2}]
for p in product(list1, list2):
print p
and this gives me the output:
({'A': 1, 'B': 1}, {'C': 1, 'D': 1})
({'A': 1, 'B': 1}, {'C': 1, 'D': 2})
({'A': 2, 'B': 2}, {'C': 1, 'D': 1})
({'A': 2, 'B': 2}, {'C': 1, 'D': 2})
({'A': 2, 'B': 1}, {'C': 1, 'D': 1})
({'A': 2, 'B': 1}, {'C': 1, 'D': 2})
({'A': 1, 'B': 2}, {'C': 1, 'D': 1})
({'A': 1, 'B': 2}, {'C': 1, 'D': 2})
How would I flatten these so the output is a single dict rather than a tuple of dicts?:
{'A': 1, 'B': 1, 'C': 1, 'D': 1}
{'A': 1, 'B': 1, 'C': 1, 'D': 2}
{'A': 2, 'B': 2, 'C': 1, 'D': 1}
{'A': 2, 'B': 2, 'C': 1, 'D': 2}
{'A': 2, 'B': 1, 'C': 1, 'D': 1}
{'A': 2, 'B': 1, 'C': 1, 'D': 2}
{'A': 1, 'B': 2, 'C': 1, 'D': 1}
{'A': 1, 'B': 2, 'C': 1, 'D': 2}
Looks like you want to merge the dictionaries
for p1, p2 in product(list1, list2):
merged = {**p1, **p2}
print(merged)
In earlier versions of Python, you can't merge with this expression. Use p1.update(p2) instead.
I want to get all combinations of the values in a dictionary as multiple dictionaries (each containing every key of the original but only one value of the original values). Say I want to parametrize a function call with:
kwargs = {'a': [1, 2, 3], 'b': [1, 2, 3]}
How do I get a list of all the combinations like this:
combinations = [{'a': 1, 'b': 1}, {'a': 1, 'b': 2}, {'a': 1, 'b': 3},
{'a': 2, 'b': 1}, {'a': 2, 'b': 2}, {'a': 2, 'b': 3},
{'a': 3, 'b': 1}, {'a': 3, 'b': 2}, {'a': 3, 'b': 3}]
There can be an arbitary amount of keys in the original kwargs and each value is garantueed to be an iterable but the number of values is not fixed.
If possible: the final combinations should be a generator (not a list).
You can flatten the kwargs to something like this
>>> kwargs = {'a': [1, 2, 3], 'b': [1, 2, 3]}
>>> flat = [[(k, v) for v in vs] for k, vs in kwargs.items()]
>>> flat
[[('b', 1), ('b', 2), ('b', 3)], [('a', 1), ('a', 2), ('a', 3)]]
Then, you can use itertools.product like this
>>> from itertools import product
>>> [dict(items) for items in product(*flat)]
[{'a': 1, 'b': 1},
{'a': 2, 'b': 1},
{'a': 3, 'b': 1},
{'a': 1, 'b': 2},
{'a': 2, 'b': 2},
{'a': 3, 'b': 2},
{'a': 1, 'b': 3},
{'a': 2, 'b': 3},
{'a': 3, 'b': 3}]
itertools.product actually returns an iterator. So you can get the values on demand and build your dictionaries. Or you can use map, which also returns an iterator.
>>> for item in map(dict, product(*flat)):
... print(item)
...
...
{'b': 1, 'a': 1}
{'b': 1, 'a': 2}
{'b': 1, 'a': 3}
{'b': 2, 'a': 1}
{'b': 2, 'a': 2}
{'b': 2, 'a': 3}
{'b': 3, 'a': 1}
{'b': 3, 'a': 2}
{'b': 3, 'a': 3}
Just another way, building the value tuples first and then combining with keys afterwards (pretty much the opposite of #thefourtheye's way :-).
>>> combinations = (dict(zip(kwargs, vs)) for vs in product(*kwargs.values()))
>>> for c in combinations:
print(c)
{'a': 1, 'b': 1}
{'a': 1, 'b': 2}
{'a': 1, 'b': 3}
{'a': 2, 'b': 1}
{'a': 2, 'b': 2}
{'a': 2, 'b': 3}
{'a': 3, 'b': 1}
{'a': 3, 'b': 2}
{'a': 3, 'b': 3}
I have 2 lists like this:
l1 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4}, {'a': 5, 'b': 6, 'c': 7, 'd': 8}]
l2 = [{'a': 5, 'b': 6, 'e': 100}, {'a': 1, 'b': 2, 'e': 101}]
and I want to obtain a list l3, which is a join of l1 and l2 where values of 'a' and 'b' are equal in both l1 and l2
i.e.
l3 = [{'a': 1, 'b: 2, 'c': 3, 'd': 4, 'e': 101}, {'a': 5, 'b: 6, 'c': 7, 'd': 8, 'e': 100}]
How can I do this?
You should accumulate the results in a dictionary. You should use the values of 'a' and 'b' to form a key of this dictionary
Here, I have used a defaultdict to accumulate the entries
l1 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4}, {'a': 5, 'b': 6, 'c': 7, 'd': 8}]
l2 = [{'a': 5, 'b': 6, 'e': 100}, {'a': 1, 'b': 2, 'e': 101}]
from collections import defaultdict
D = defaultdict(dict)
for lst in l1, l2:
for item in lst:
key = item['a'], item['b']
D[key].update(item)
l3 = D.values()
print l3
output:
[{'a': 1, 'c': 3, 'b': 2, 'e': 101, 'd': 4}, {'a': 5, 'c': 7, 'b': 6, 'e': 100, 'd': 8}]
Simple list operations would do the thing for you as well:
l1 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4}, {'a': 5, 'b': 6, 'c': 7, 'd': 8}]
l2 = [{'a': 5, 'b': 6, 'e': 100}, {'a': 1, 'b': 2, 'e': 101}]
l3 = []
for i in range(len(l1)):
for j in range(len(l2)):
if l1[i]['a'] == l2[j]['a'] and l1[i]['b'] == l2[j]['b']:
l3.append(dict(l1[i]))
l3[i].update(l2[j])
My approach is to sort the the combined list by the key, which is keys a + b. After that, for each group of dictionaries with similar key, combine them:
from itertools import groupby
def ab_key(dic):
return dic['a'], dic['b']
def combine_lists_of_dicts(list_of_dic1, list_of_dic2, keyfunc):
for key, dic_of_same_key in groupby(sorted(list_of_dic1 + list_of_dic2, key=keyfunc), keyfunc):
combined_dic = {}
for dic in dic_of_same_key:
combined_dic.update(dic)
yield combined_dic
l1 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4}, {'a': 5, 'b': 6, 'c': 7, 'd': 8}]
l2 = [{'a': 5, 'b': 6, 'e': 100}, {'a': 1, 'b': 2, 'e': 101}]
for dic in combine_lists_of_dicts(l1, l2, ab_key):
print dic
Discussion
The function ab_key returns a tuple of value for key a and b, used for sorting a groupping
The groupby function groups all the dictionaries with similar keys together
This solution is less efficient than that of John La Rooy, but should work fine for small lists
One can achieve a nice and quick solution using pandas.
l1 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4}, {'a': 5, 'b': 6, 'c': 7, 'd': 8}]
l2 = [{'a': 5, 'b': 6, 'e': 100}, {'a': 1, 'b': 2, 'e': 101}]
import pandas as pd
pd.DataFrame(l1).merge(pd.DataFrame(l2), on=['a','b']).to_dict('records')