I have a list of dictionaries like this:
X = [{"t":1, "a":1, "b":3},
{"t":2, "a":2, "b":4}]
How do I get:
[{"t":1, "a":1, "b":3}, {"t":2, "a":3, "b":7}]
In the second element of the desired output, the value for "b" is 7, which is the cumulative sum of "b" values up to that point, and likewise for other keys.
I know I can do this via pandas. But is there a more pythonic solution?
You can use the fact that collections.Counter objects update in the way you want to accumulate:
import collections
def cumulative_elementwise_sum(ds):
    result = collections.Counter()
    for d in ds:
        result.update(d)
        yield dict(result)
This will also "do the right thing" when a new key is encountered. Example:
>>> x = [
... {'t': 1, 'a': 1, 'b': 3},
... {'t': 2, 'a': 2, 'b': 4},
... {'t': 1, 'a': 4, 'b': 1, 'd': 2},
... ]
>>> list(cumulative_elementwise_sum(x))
[{'t': 1, 'a': 1, 'b': 3},
{'t': 3, 'a': 3, 'b': 7},
{'t': 4, 'a': 7, 'b': 8, 'd': 2}]
If you're using Python 3.8 or later, the itertools.accumulate function has gained an initial argument, so this can be simplified to:
import itertools
def updated(c, items):
    c.update(items)
    return c
map(dict, itertools.accumulate(x, updated, initial=collections.Counter()))
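One detail worth checking with the accumulate version: when initial is given, accumulate emits that initial value first, so the mapped sequence starts with an empty dict. The following is a minimal sketch, assuming Python 3.8 or later, that skips that first item with itertools.islice; the name cumulative_sums is made up for illustration and is not part of the original answer.

```python
import collections
import itertools

def updated(c, items):
    # Mutate the running Counter in place and hand it back to accumulate.
    c.update(items)
    return c

def cumulative_sums(ds):
    # Hypothetical wrapper name. accumulate() emits the initial (empty)
    # Counter first, so islice skips exactly one item.
    running = itertools.accumulate(ds, updated, initial=collections.Counter())
    # dict() snapshots each step before the shared Counter is mutated again.
    return map(dict, itertools.islice(running, 1, None))

x = [
    {'t': 1, 'a': 1, 'b': 3},
    {'t': 2, 'a': 2, 'b': 4},
    {'t': 1, 'a': 4, 'b': 1, 'd': 2},
]
result = list(cumulative_sums(x))
```

Because every step shares one Counter, the dict() copy has to happen lazily, step by step; collecting the Counters into a list first and copying afterwards would give identical entries.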
If you only need the final result, and not the whole sequence of intermediate results, that can be obtained using functools.reduce, of course:
>>> import functools
>>> functools.reduce(updated, x, collections.Counter())
Counter({'t': 4, 'a': 7, 'b': 8, 'd': 2})
# dict version
>>> dict(functools.reduce(updated, x, collections.Counter()))
{'t': 4, 'a': 7, 'b': 8, 'd': 2}
I have only recently learned about coroutines using generators and tried to implement the concept in the following recursive function:
def _recursive_nWay_generator(input: list, output={}):
    '''
    Helper function; used to generate parameter-value pairs
    to submit to the model for the simulation.
    Parameters
    ----------
    input : list of tuple
        every tuple of the list must be of the form:
        ``('name_of_parameter', iterable_of_values)``
    output : list, optional
        parameter used for recursion; allows for list building
        across subgenerators
    Returns
    -------
    Generator :
        Specifications used for simulation setup of the form:
        ``{'par1': val1, ...}``
    '''
    # exit condition
    if len(input) == 0:
        yield output
    # recursive loop
    else:
        curr = input[0]
        par_name = curr[0]
        for par_value in curr[1]:
            output[par_name] = par_value
            # coroutines for the win!
            yield from _recursive_nWay_generator(input[1:], output=output)
Function somewhat works as intended:
testlist = [('a', (1, 2, 3)), ('b', (4, 5, 6)), ('c', (7, 8))]
for a in _recursive_nWay_generator(testlist):
    print(a)
Output:
{'a': 1, 'b': 4, 'c': 7}
{'a': 1, 'b': 4, 'c': 8}
{'a': 1, 'b': 5, 'c': 7}
{'a': 1, 'b': 5, 'c': 8}
{'a': 1, 'b': 6, 'c': 7}
{'a': 1, 'b': 6, 'c': 8}
{'a': 2, 'b': 4, 'c': 7}
{'a': 2, 'b': 4, 'c': 8}
{'a': 2, 'b': 5, 'c': 7}
{'a': 2, 'b': 5, 'c': 8}
{'a': 2, 'b': 6, 'c': 7}
{'a': 2, 'b': 6, 'c': 8}
{'a': 3, 'b': 4, 'c': 7}
{'a': 3, 'b': 4, 'c': 8}
{'a': 3, 'b': 5, 'c': 7}
{'a': 3, 'b': 5, 'c': 8}
{'a': 3, 'b': 6, 'c': 7}
{'a': 3, 'b': 6, 'c': 8}
However, it breaks when I try to append to an existing list or construct a new one:
gen = _recursive_nWay_generator(testlist)
print(list(gen))
Output:
[{'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}, {'a': 3, 'b': 6, 'c': 8}]
This question was attempting to do something close to what I have, but I'm not seeing answers that could help.
I am honestly clueless as to how to solve this; the online searches I tried turned up nothing, no matter how I phrased the question. If this was answered before, I'll be happy to just follow the link.
The problem with your code is reusing the same mutable output dict during the iteration and recursive calls. That is, you yield output and then later on you modify it with output[par_name] = par_value but it's the same dict in each case - so you're modifying the instance which was already returned! If you append each result into a list and then print them all at the end, you'll see that they're identical - it's the same result yielded each time.
The simplest way to "fix" your existing code is to yield copies, i.e. change the line:
yield output
into this:
yield dict(output.items())
However, this algorithm is not great, and I recommend you look for something better. Recursion is a poor choice here. I'll offer you a simple, direct way to generate the sequence more efficiently:
import itertools as it
testlist = [('a', (1, 2, 3)), ('b', (4, 5, 6)), ('c', (7, 8))]
keys, vals = zip(*testlist)
for p in it.product(*vals):
    print(dict(zip(keys, p)))
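If you still want a generator you can pass to list(), the product-based loop can be wrapped in a function. This is only a sketch: the name n_way_grid is made up here, and the empty-input branch is an assumption chosen to mirror what the recursive version yields for an empty list.

```python
import itertools

def n_way_grid(params):
    # params: list of (name, iterable_of_values) pairs, as in testlist.
    if not params:
        # Assumption: no parameters gives a single empty specification.
        yield {}
        return
    keys, vals = zip(*params)
    for combo in itertools.product(*vals):
        # A fresh dict per combination, so list() keeps distinct results.
        yield dict(zip(keys, combo))

testlist = [('a', (1, 2, 3)), ('b', (4, 5, 6)), ('c', (7, 8))]
grid = list(n_way_grid(testlist))
```

Each yielded dict is a new object, so the aliasing problem from the original code cannot occur.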
I have two arrays of objects:
a = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 5, 'f': 6}]
b = [{'a': 1, 'b': 2}, {'g': 3, 'h': 4}, {'f': 6, 'e': 5}]
Output:
a - b = [{'c': 3, 'd': 4}] ("-" symbol is only for representation, showing difference. Not mathematical minus.)
b - a = [{'g': 3, 'h': 4}]
In every array, the order of keys may be different. I can try the following and check for that:
for i in range(len(a)):
    current_val = a[i]
    for x, y in current_val.items():
        # search for key x in array b and compare its value with y
        pass
but this approach doesn't feel right. Is there a simpler way to do this, or a utility library that can do this, similar to fnc or pydash?
You can use lambda:
g = lambda a,b : [x for x in a if x not in b]
g(a,b) # a-b
[{'c': 3, 'd': 4}]
g(b,a) # b-a
[{'g': 3, 'h': 4}]
Just test if all elements are in the other array
a = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 5, 'f': 6}]
b = [{'a': 1, 'b': 2}, {'g': 3, 'h': 4}, {'f': 6, 'e': 5}]
def find_diff(array_a, array_b):
    diff = []
    for e in array_a:
        if e not in array_b:
            diff.append(e)
    return diff
print(find_diff(a, b))
print(find_diff(b, a))
The same with a list comprehension:
def find_diff(array_a, array_b):
    return [e for e in array_a if e not in array_b]
Here is the code for subtracting lists of dictionaries:
a = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 6, 'f': 6}]
b = [{'a': 1, 'b': 2}, {'g': 3, 'h': 4}, {'f': 6, 'e': 6}]
a_b = []
b_a = []
for element in a:
    if element not in b:
        a_b.append(element)
for element in b:
    if element not in a:
        b_a.append(element)
print("a-b =",a_b)
print("b-a =",b_a)
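All of the answers above rely on "x not in b", which compares each dictionary against every element of the other list. For larger lists, a set-based lookup may be faster. The sketch below assumes every value in the dictionaries is hashable; dict_key and find_diff_hashed are names invented for this example.

```python
def dict_key(d):
    # Order-independent, hashable stand-in for a dict.
    # Assumption: all values in d are hashable.
    return frozenset(d.items())

def find_diff_hashed(array_a, array_b):
    # Build the lookup set once, then test membership in constant time.
    seen = {dict_key(d) for d in array_b}
    return [d for d in array_a if dict_key(d) not in seen]

a = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 5, 'f': 6}]
b = [{'a': 1, 'b': 2}, {'g': 3, 'h': 4}, {'f': 6, 'e': 5}]
a_minus_b = find_diff_hashed(a, b)
b_minus_a = find_diff_hashed(b, a)
```

Since a frozenset ignores ordering, {'e': 5, 'f': 6} and {'f': 6, 'e': 5} map to the same key, just as they compare equal as dicts.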
I have 4 dictionaries, let's call them:
dict1 , dict2 , dict3 , dict4
Example:
dict1 = {'A': 1, 'B':2}
dict2 = {'A': 3, 'C':4}
dict3 = {'B': 5, 'D':6}
dict4 = {'A': 7, 'B':8, 'C': 9, 'D':10, 'E':11}
Each dictionary level is "stronger" than those who come after it. As in, A found in dict1 will be 'stronger' than A found in dict2 in terms of precedence.
Is there a short, elegant script to create a new dictionary, assembled from all four, where each key is taken from the "strongest" dictionary that contains that key?
The result should be: dict = {'A':1, 'B':2, 'C':4, 'D':6, 'E':11}
I think the easiest/clearest approach here would be to create a new dictionary then use its update method, which overwrites existing keys. Something like this makes the precedence pretty obvious:
>>> x = {}
>>> x.update(dict4)
>>> x.update(dict3)
>>> x.update(dict2)
>>> x.update(dict1)
>>> x
{'A': 1, 'B': 2, 'C': 4, 'D': 6, 'E': 11}
You could of course make a utility of some sort for this, something like:
>>> def collapse(*dicts):
...     x = {}
...     for d in dicts:
...         x.update(d)
...     return x
...
>>>
>>> collapse(dict4, dict3, dict2, dict1)
{'A': 1, 'B': 2, 'C': 4, 'D': 6, 'E': 11}
(Though you'd need to remember to pass the dictionaries in the correct order.)
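One way to make the ordering harder to get wrong is to let the caller pass the dictionaries strongest first and reverse them internally. This is a sketch of that idea only; merge_strongest_first is a hypothetical name, not something from the standard library.

```python
def merge_strongest_first(*dicts):
    # Hypothetical helper: arguments are given strongest first.
    merged = {}
    for d in reversed(dicts):
        # Weaker dicts are applied first; stronger ones overwrite them.
        merged.update(d)
    return merged

dict1 = {'A': 1, 'B': 2}
dict2 = {'A': 3, 'C': 4}
dict3 = {'B': 5, 'D': 6}
dict4 = {'A': 7, 'B': 8, 'C': 9, 'D': 10, 'E': 11}
result = merge_strongest_first(dict1, dict2, dict3, dict4)
```

The call site then reads in the same order as the precedence described in the question.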
You could do the following (works on python 3.5 and newer):
result = {**dict4, **dict3, **dict2, **dict1}
Here's a fairly simple way for an arbitrary number of dictionaries:
dict1 = {'A': 1, 'B':2}
dict2 = {'A': 3, 'C':4}
dict3 = {'B': 5, 'D':6}
dict4 = {'A': 7, 'B':8, 'C': 9, 'D':10, 'E':11}
# strongest dictionary last
dictionaries = [dict4, dict3, dict2, dict1]
dict(i for d in dictionaries for i in d.items())
Output:
{'A': 1, 'B': 2, 'C': 4, 'D': 6, 'E': 11}
You probably want a ChainMap, which is perfect for simulating scope.
>>> import collections
>>> cm = collections.ChainMap(dict1, dict2, dict3, dict4)
>>> dict(cm)
{'A': 1, 'B': 2, 'C': 4, 'D': 6, 'E': 11}
>>> cm['A'] = 'foo'
>>> cm
ChainMap({'A': 'foo', 'B': 2}, {'A': 3, 'C': 4}, {'B': 5, 'D': 6}, {'A': 7, 'B': 8, 'C': 9, 'D': 10, 'E': 11})
>>> dict1
{'A': 'foo', 'B': 2}
Let's say I have a list of dictionaries:
>>> d = [{'a': 2, 'b': 3, 'c': 4}, {'a': 5, 'b': 6, 'c': 7}]
And I want to perform a map operation where I change just one value in each dictionary. One possible way to do that is to create a new dictionary which simply contains the original values along with the changed ones:
>>> map(lambda x: {'a': x['a'], 'b': x['b'] + 1, 'c': x['c']}, d)
[{'a': 2, 'c': 4, 'b': 4}, {'a': 5, 'c': 7, 'b': 7}]
This can get unruly if the dictionaries have many items.
Another way might be to define a function which copies the original dictionary and only changes the desired values:
>>> def change_b(x):
...     new_x = x.copy()
...     new_x['b'] = x['b'] + 1
...     return new_x
...
>>> map(change_b, d)
[{'a': 2, 'c': 4, 'b': 4}, {'a': 5, 'c': 7, 'b': 7}]
This, however, requires writing a separate function and loses the elegance of a lambda expression.
Is there a better way?
This works (and is compatible with Python 2 and Python 3 [1]):
>>> map(lambda x: dict(x, b=x['b']+1), d)
[{'a': 2, 'c': 4, 'b': 4}, {'a': 5, 'c': 7, 'b': 7}]
With that said, I think that more often than not, lambda based solutions are less elegant than non-lambda counterparts... The rationale behind this statement is that I can immediately look at the non-lambda solution that you proposed and I know exactly what it does. The lambda based solution that I just wrote would take a bit of thinking to parse and then more thinking to actually understand...
[1] Though, map will give you an iterator on Python 3.x rather than a list...
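On Python 3, wrapping the map call in list() gives the same list as the Python 2 session above. A small sketch, with the caveat that the keyword form dict(x, b=...) only works when the key being changed is a string that is a valid identifier:

```python
d = [{'a': 2, 'b': 3, 'c': 4}, {'a': 5, 'b': 6, 'c': 7}]

# dict(x, b=...) copies x and overrides 'b'; list() forces the lazy map.
changed = list(map(lambda x: dict(x, b=x['b'] + 1), d))
```

The original dictionaries in d are left untouched, because dict(x, ...) builds a new dictionary for each element.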
First, writing a function doesn't seem that inelegant to me in the first place. That said, welcome to the brave new world of Python 3.5 and PEP 448:
>>> d = [{'a': 2, 'b': 3, 'c': 4}, {'a': 5, 'b': 6, 'c': 7}]
>>> d
[{'b': 3, 'a': 2, 'c': 4}, {'b': 6, 'a': 5, 'c': 7}]
>>> [{**x, 'b': x['b']+1} for x in d]
[{'b': 4, 'a': 2, 'c': 4}, {'b': 7, 'a': 5, 'c': 7}]
From how your map is behaving, it's clear you're using Python 2, but that's easy enough to fix. :-)
You can use a for loop with an update call. Here is a hacky one-liner:
dcts = [{'a': 2, 'b': 3, 'c': 4}, {'a': 5, 'b': 6, 'c': 7}]
dcts = [d.update({'b': d['b']+1}) or d for d in dcts]
Edit: To preserve original dicts:
from copy import copy
dcts = [d.update({'b': d['b']+1}) or d for d in map(copy, dcts)]
I have 2 lists like this:
l1 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4}, {'a': 5, 'b': 6, 'c': 7, 'd': 8}]
l2 = [{'a': 5, 'b': 6, 'e': 100}, {'a': 1, 'b': 2, 'e': 101}]
and I want to obtain a list l3, which is a join of l1 and l2 where values of 'a' and 'b' are equal in both l1 and l2
i.e.
l3 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 101}, {'a': 5, 'b': 6, 'c': 7, 'd': 8, 'e': 100}]
How can I do this?
You should accumulate the results in a dictionary, using the values of 'a' and 'b' to form the key of this dictionary.
Here, I have used a defaultdict to accumulate the entries:
l1 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4}, {'a': 5, 'b': 6, 'c': 7, 'd': 8}]
l2 = [{'a': 5, 'b': 6, 'e': 100}, {'a': 1, 'b': 2, 'e': 101}]
from collections import defaultdict
D = defaultdict(dict)
for lst in l1, l2:
    for item in lst:
        key = item['a'], item['b']
        D[key].update(item)
l3 = list(D.values())
print(l3)
output:
[{'a': 1, 'c': 3, 'b': 2, 'e': 101, 'd': 4}, {'a': 5, 'c': 7, 'b': 6, 'e': 100, 'd': 8}]
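Note that the defaultdict approach keeps every ('a', 'b') key it sees, including entries that appear in only one of the lists, so it behaves like an outer join. If only matching entries are wanted, an index on one list can be used instead. The sketch below assumes each ('a', 'b') pair occurs at most once in the right-hand list; inner_join is a made-up name, and the {**x, **y} syntax needs Python 3.5 or later.

```python
def inner_join(left, right, keys=('a', 'b')):
    # Index the right-hand list by its key tuple.
    # Assumption: each key tuple occurs at most once in `right`.
    index = {tuple(item[k] for k in keys): item for item in right}
    joined = []
    for item in left:
        match = index.get(tuple(item[k] for k in keys))
        if match is not None:
            # Merge into a new dict so the inputs stay unchanged.
            joined.append({**item, **match})
    return joined

l1 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4}, {'a': 5, 'b': 6, 'c': 7, 'd': 8}]
l2 = [{'a': 5, 'b': 6, 'e': 100}, {'a': 1, 'b': 2, 'e': 101}]
l3 = inner_join(l1, l2)
```

The result follows the order of the left-hand list, and unmatched entries from either side are dropped.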
Simple list operations would do the thing for you as well:
l1 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4}, {'a': 5, 'b': 6, 'c': 7, 'd': 8}]
l2 = [{'a': 5, 'b': 6, 'e': 100}, {'a': 1, 'b': 2, 'e': 101}]
l3 = []
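for i in range(len(l1)):
    for j in range(len(l2)):
        if l1[i]['a'] == l2[j]['a'] and l1[i]['b'] == l2[j]['b']:
            # build the merged entry first; l3[i] would be the wrong index
            # whenever an earlier item of l1 had no match
            merged = dict(l1[i])
            merged.update(l2[j])
            l3.append(merged)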
My approach is to sort the combined list by a key made of the values of a and b. After that, for each group of dictionaries with the same key, combine them:
from itertools import groupby
def ab_key(dic):
    return dic['a'], dic['b']
def combine_lists_of_dicts(list_of_dic1, list_of_dic2, keyfunc):
    for key, dic_of_same_key in groupby(sorted(list_of_dic1 + list_of_dic2, key=keyfunc), keyfunc):
        combined_dic = {}
        for dic in dic_of_same_key:
            combined_dic.update(dic)
        yield combined_dic
l1 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4}, {'a': 5, 'b': 6, 'c': 7, 'd': 8}]
l2 = [{'a': 5, 'b': 6, 'e': 100}, {'a': 1, 'b': 2, 'e': 101}]
for dic in combine_lists_of_dicts(l1, l2, ab_key):
    print(dic)
Discussion
The function ab_key returns a tuple of the values for keys a and b, used for sorting and grouping
The groupby function groups all the dictionaries with the same key together
This solution is less efficient than that of John La Rooy, but should work fine for small lists
One can achieve a nice and quick solution using pandas.
l1 = [{'a': 1, 'b': 2, 'c': 3, 'd': 4}, {'a': 5, 'b': 6, 'c': 7, 'd': 8}]
l2 = [{'a': 5, 'b': 6, 'e': 100}, {'a': 1, 'b': 2, 'e': 101}]
import pandas as pd
pd.DataFrame(l1).merge(pd.DataFrame(l2), on=['a','b']).to_dict('records')