I have a list with multiple dicts in, I need to check which dicts are repeated and create a new list with only a single occurrence of each but with the amount of repeated elements in the first list.
For example:
I have that list:
[{'a': 123, 'b': 1234, 'c': 'john', 'amount': 1},
{'a': 456, 'b': 1234, 'c': 'doe','amount': 1},
{'a': 456, 'b': 1234, 'c': 'steve','amount': 1},
{'a': 123, 'b': 1234, 'c': 'john','amount': 1},
{'a': 123, 'b': 1234, 'c': 'john','amount': 1}]
I need to output:
[{'a': 123, 'b': 1234, 'c': 'john', 'amount': 3},
{'a': 456, 'b': 1234, 'c': 'steve','amount': 1},
{'a': 456, 'b': 1234, 'c': 'doe','amount': 1}]
I've tried some things I found by Googling but nothing works completely, the last that I've tried let me know where the repeated ones where, but I'm stuck in what to do next.
def index(lst, element):
result = []
offset = -1
while True:
try:
offset = lst.index(element, offset+1)
except ValueError:
return result
result.append(offset)
for i in l:
if len(index(l,i)) > 1:
i['amount'] += 1
print l
But it returns
[{'a': 123, 'c': 'john', 'b': 1234, 'amount': 2},
{'a': 456, 'c': 'doe', 'b': 1234, 'amount': 1},
{'a': 456, 'c': 'steve', 'b': 1234, 'amount': 1},
{'a': 123, 'c': 'john', 'b': 1234, 'amount': 2},
{'a': 123, 'c': 'john', 'b': 1234, 'amount': 1}]
Here is an option using pandas by which we can concatenate the dictionary into a data frame, and then we can groupby column a, b and c and calculate the sum of amount. And if we want a dictionary back, pandas data frame has a built in to_dict() function. Specifying the parameter as index, we can get a dictionary as the desired output:
import pandas as pd
list(pd.DataFrame(mylist).groupby(['a', 'b', 'c']).sum().reset_index().to_dict('index').values())
# [{'a': 123, 'amount': 3, 'b': 1234, 'c': 'john'},
# {'a': 456, 'amount': 1, 'b': 1234, 'c': 'doe'},
# {'a': 456, 'amount': 1, 'b': 1234, 'c': 'steve'}]
Related
I have a dict I'm trying to loop through which stores integer values:
myDict = {"a":1,"b":2,"c":3}
And I'm trying to loop through every combination of these integers from 1 to their value. For example:
{"a":1,"b":1,"c":1}
{"a":1,"b":1,"c":2}
{"a":1,"b":1,"c":3}
{"a":1,"b":2,"c":1}
{"a":1,"b":2,"c":2}
And so on...
Ending at their max values at the start while going through all the combinations.
Is there an elegant way to this that applies to any size dictionary? :
myDict = {"a":1,"b":2,"c":3,"d":5}
I'm currently doing this with nested if statements that edit a clone dictionary, so any solution setting dict values would work :) :
copyDict["c"] = 2
Yes, this will do it, for any size input (that fits in memory, of course):
import itertools
myDict = {"a":1,"b":2,"c":3,"d":5}
keys = list(myDict.keys())
ranges = tuple(range(1,k+1) for k in myDict.values())
outs = [dict(zip(keys,s)) for s in itertools.product(*ranges)]
print(outs)
You can actually replace *ranges with the *(range(1,k+1) for k in myDict.values()) and reduce it by one line, but it's not easier to read.
Another way with itertools.product:
>>> [dict(zip(myDict.keys(), p)) for p in itertools.product(myDict.values(), repeat=len(myDict))]
[{'a': 1, 'b': 1, 'c': 1},
{'a': 1, 'b': 1, 'c': 2},
{'a': 1, 'b': 1, 'c': 3},
{'a': 1, 'b': 2, 'c': 1},
{'a': 1, 'b': 2, 'c': 2},
{'a': 1, 'b': 2, 'c': 3},
{'a': 1, 'b': 3, 'c': 1},
{'a': 1, 'b': 3, 'c': 2},
{'a': 1, 'b': 3, 'c': 3},
{'a': 2, 'b': 1, 'c': 1},
{'a': 2, 'b': 1, 'c': 2},
{'a': 2, 'b': 1, 'c': 3},
{'a': 2, 'b': 2, 'c': 1},
{'a': 2, 'b': 2, 'c': 2},
{'a': 2, 'b': 2, 'c': 3},
{'a': 2, 'b': 3, 'c': 1},
{'a': 2, 'b': 3, 'c': 2},
{'a': 2, 'b': 3, 'c': 3},
{'a': 3, 'b': 1, 'c': 1},
{'a': 3, 'b': 1, 'c': 2},
{'a': 3, 'b': 1, 'c': 3},
{'a': 3, 'b': 2, 'c': 1},
{'a': 3, 'b': 2, 'c': 2},
{'a': 3, 'b': 2, 'c': 3},
{'a': 3, 'b': 3, 'c': 1},
{'a': 3, 'b': 3, 'c': 2},
{'a': 3, 'b': 3, 'c': 3}]
In a given list:
unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
Find all 'key' pairs and print out and if no pairs found for given dictionary print out that dictionary.
What I managed to write so far sort of works but it keeps testing some items of the list even though they were already tested. Not sure how to fix it.
for i in range(len(unmatched_items_array)):
for j in range(i + 1, len(unmatched_items_array)):
# when keys are the same print matching dictionary pairs
if unmatched_items_array[i].keys() == unmatched_items_array[j].keys():
print(unmatched_items_array[i], unmatched_items_array[j])
break
# when no matching pairs print currently processed dictionary
print(unmatched_items_array[i])
Output:
{'c': 45} {'c': 35}
{'c': 45}
{'c': 35}
{'d': 5}
{'a': 3.2} {'a': 3}
{'a': 3.2}
{'a': 3}
What the output should be:
{'c': 45} {'c': 35}
{'d': 5}
{'a': 3.2} {'a': 3}
What am I doing wrong here?
Using collections.defaultdict
Ex:
from collections import defaultdict
unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
result = defaultdict(list)
for i in unmatched_items_array:
key, _ = i.items()[0]
result[key].append(i) #Group by key.
for _, v in result.items(): #print Result.
print(v)
Output:
[{'a': 3.2}, {'a': 3}]
[{'c': 45}, {'c': 35}]
[{'d': 5}]
With itertools.groupby:
from itertools import groupby
unmatched_items_array = [{'d': 5}, {'c': 35}, {'a': 3}, {'a': 3.2}, {'c': 45}]
for v, g in groupby(sorted(unmatched_items_array, key=lambda k: tuple(k.keys())), lambda k: tuple(k.keys())):
print([*g])
Prints:
[{'a': 3}, {'a': 3.2}]
[{'c': 35}, {'c': 45}]
[{'d': 5}]
EDIT: If your items in the list are sorted by keys already, then you can skip the sorted() call:
for v, g in groupby(unmatched_items_array, lambda k: tuple(k.keys()) ):
print([*g])
I have a list of dictionaries like
[
{'a': {'q': 1}, 'b': {'r': 2}, 'c': {'s': 3}},
{'a': {'t': 4}, 'b': {'u': 5}, 'c': {'v': 6}},
{'a': {'w': 7}, 'b': {'x': 8}, 'c': {'z': 9}}
]
and I want the output to be
{
'a': {'q': 1, 't': 4, 'w': 7},
'b': {'r': 2, 'u': 5, 'x': 8},
'c': {'s': 3, 'v': 6, 'z': 9}
}
There are several ways of doing this, one with usage of collections.defaultdict:
import collections
result = collections.defaultdict(dict)
lst = [
{'a': {'q': 1}, 'b': {'r': 2}, 'c': {'s': 3}},
{'a': {'t': 4}, 'b': {'u': 5}, 'c': {'v': 6}},
{'a': {'w': 7}, 'b': {'x': 8}, 'c': {'z': 9}}
]
for dct in lst:
for key, value in dct.items():
result[key].update(value)
print(result)
Imagine that you have to sort a list of dicts, by the value of a particular key. Note that the key might be missing from some of the dicts, in which case you default to the value of that key to being 0.
sample input
input = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
sample output (sorted by value of key 'a')
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
note that {'b': 5} is first in the sort-order because it has the lowest value for 'a' (0)
I would've used input.sort(key=operator.itemgetter('a')), if all the dicts were guaranteed to have the key 'a'. Or I could convert the input dicts to collections.defaultdict and then sort.
Is there a way to do this in-place without having to creating new dicts or updating the existing dicts? Can operator.itemgetter handle missing keys?
>>> items = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
>>> sorted(items, key=lambda d: d.get('a', 0))
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
Or to update the existing dictionary in-place
items.sort(key=lambda d: d.get('a', 0))
Or if in sorted:
>>> items = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
>>> sorted(items,key=lambda x: x['a'] if 'a' in x else 0)
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
>>>
I am trying to update values for a dict() key dynamically with a for loop.
def update_dict():
f = []
for i, j in enumerate(list_range):
test_dict.update({'a': i})
j['c'] = test_dict
print(j)
f.append(j)
print(f)
test_dict = dict({'a': 1})
list_range = [{'b': i} for i in range(0, 5)]
update_dict()
Even print(j) gives iterating value (0,1,2,3,4), somehow the last dict is getting overwritten all over the list and giving wrong output (4,4,4,4,4).
Expected Output,
[{'b': 0, 'c': {'a': 0}}, {'b': 1, 'c': {'a': 1}}, {'b': 2, 'c': {'a': 2}}, {'b': 3, 'c': {'a': 3}}, {'b': 4, 'c': {'a': 4}}]
Output obtained,
[{'b': 0, 'c': {'a': 4}}, {'b': 1, 'c': {'a': 4}}, {'b': 2, 'c': {'a': 4}}, {'b': 3, 'c': {'a': 4}}, {'b': 4, 'c': {'a': 4}}]
I need to understand how the dictionaries are getting overwritten and what could be the best solution to avoid this?
Thanks in advance!
P.S. : please avoid suggesting list or dict comprehension method as bare answer as i am aware of them and the only purpose of this question is to understand the wrong behavior of dict().
The reason of such behaviour is that all references in list points to the same dict. Line j['c'] = test_dict doesn't create copy of dictionary, but just make j['c'] refer to test_dict. To get expected result you need change this line to:
j['c'] = test_dict.copy(). It will make deep copy of test_dict and assign it to j['c'].
You try to add values to same dictionary every time in the loop and as loop progresses, you keep replacing the values.
You need to define dictionary in every iteration to create separate references of the dictionary:
def update_dict():
f = []
for i, j in enumerate(list_range):
test_dict = {'a': i}
j['c'] = test_dict
f.append(j)
print(f)
list_range = [{'b': i} for i in range(0, 5)]
update_dict()
# [{'b': 0, 'c': {'a': 0}},
# {'b': 1, 'c': {'a': 1}},
# {'b': 2, 'c': {'a': 2}},
# {'b': 3, 'c': {'a': 3}},
# {'b': 4, 'c': {'a': 4}}]
A simpler solution could be to iterate through list_range and create c using the values from b
lista = [{'b': i } for i in range(0, 5)]
for i in lista:
i['c'] = {'a': i['b']}
# [{'b': 0, 'c': {'a': 0}}, {'b': 1, 'c': {'a': 1}}, {'b': 2, 'c': {'a': 2}}, {'b': 3, 'c': {'a': 3}}, {'b': 4, 'c': {'a': 4}}]
def update_dict():
f = []
for i, j in enumerate(list_range):
j['c'] = {'a': i}
print(j)
f.append(j)
return f
list_range = [{'b': i} for i in range(0, 5)]
print(update_dict())
#output
{'b': 0, 'c': {'a': 0}}
{'b': 1, 'c': {'a': 1}}
{'b': 2, 'c': {'a': 2}}
{'b': 3, 'c': {'a': 3}}
{'b': 4, 'c': {'a': 4}}