Use counter on a list of Python dictionaries - python

I'm trying to use Counter on a list of dictionaries in order to count how many times each dictionary repeats itself.
Not all the dictionaries in the list necessarily have the same keys.
Let's assume I have the following list:
my_list=({"id":1,"first_name":"Jhon","last_name":"Smith"},{"id":2,"first_name":"Jeff","last_name":"Levi"},{"id":3,"first_name":"Jhon"},{"id":1,"first_name":"Jhon","last_name":"Smith"})
My desired solution is
solution={
{"id":1,"first_name":"Jhon","last_name":"Smith"}:2,
{"id":2,"first_name":"Jeff","last_name":"Levi"}:1,
{"id":3,"first_name":"Jhon"}:1}
I have tried
import collections
c=collections.Counter(my_list)
but I get the following error
TypeError: unhashable type: 'dict'
Do you have any suggestions?
Thanks

You can't use a dictionary as a key in another dictionary, because dictionaries are mutable and therefore not hashable. That's why you get TypeError: unhashable type: 'dict'.
You can serialize each dictionary to a JSON string, which can be used as a dictionary key.
import json
import collections
my_list = [{"id":1,"first_name":"Jhon","last_name":"Smith"},
{"id":2,"first_name":"Jeff","last_name":"Levi"},
{"id":3,"first_name":"Jhon"},
{"id":1,"first_name":"Jhon","last_name":"Smith"}]
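# sort_keys=True makes dictionaries with equal content but different key order serialize identically
c = collections.Counter(json.dumps(d, sort_keys=True) for d in my_list)
print(c)
>>> Counter({'{"first_name": "Jhon", "id": 1, "last_name": "Smith"}': 2,
'{"first_name": "Jeff", "id": 2, "last_name": "Levi"}': 1,
'{"first_name": "Jhon", "id": 3}': 1})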
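If JSON strings are inconvenient as keys, another option is a frozenset of each dictionary's items. The following is a minimal sketch, assuming every value in the dictionaries is itself hashable (strings and numbers, as in the question); the name result is only illustrative.

```python
from collections import Counter

my_list = [
    {"id": 1, "first_name": "Jhon", "last_name": "Smith"},
    {"id": 2, "first_name": "Jeff", "last_name": "Levi"},
    {"id": 3, "first_name": "Jhon"},
    {"last_name": "Smith", "first_name": "Jhon", "id": 1},  # same content, different key order
]

# frozenset(d.items()) is hashable and does not depend on key order.
# Assumption: all values are hashable; nested dicts or lists would raise TypeError.
counts = Counter(frozenset(d.items()) for d in my_list)

# Turn the keys back into dictionaries, most frequent first.
result = [(dict(items), n) for items, n in counts.most_common()]
```

Since a dict cannot be a key, the result is a list of (dict, count) pairs rather than the mapping asked for in the question.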

Counter is a tool that stores the items of an iterable as a dict, wherein dict.keys() represents the items and dict.values() represents the count of each item in the iterable.
In a dictionary, however, you cannot have repeated keys, as the keys must be unique. There is therefore no point in counting them, since we already know each count is 1. On the other hand, there may be repeated values stored in the dict. For instance:
>>> from collections import Counter
>>> my_dict = {'a': 'me', 'b':'you', 'c':'me', 'd':'me'}
>>> Counter(my_dict) # As plain dict.
Counter({'b': 'you', 'a': 'me', 'c': 'me', 'd': 'me'})
>>> Counter(my_dict.values()) # As dict values.
Counter({'me': 3, 'you': 1})
Now let's say we have a list of dictionaries and want to count the values in those dictionaries, as is the case in your question:
>>> my_dicts = [
...     {'age': 30, 'name': 'John'},
...     {'age': 20, 'name': 'Jeff'},
...     {'age': 30, 'name': 'John'},
...     {'age': 25, 'name': 'John'}
... ]
>>> b = Counter(tuple(i.values()) for i in my_dicts) # As a generator of value tuples.
>>> b
Counter({(30, 'John'): 2, (20, 'Jeff'): 1, (25, 'John'): 1})
Note that this relies on every dictionary having the same keys in the same order. Now you can of course take these tuples and convert them to a dict:
>>> {key: value for key, value in b.items()}
{(30, 'John'): 2, (20, 'Jeff'): 1, (25, 'John'): 1}
or go even further and use named tuples from collections.namedtuple, identifying your tuples by name so that you can later refer to them much more easily and clearly.
Hope this helps.
Learn more about collections.Counter from the documentation or this useful set of examples. You can also refer to the videos by Raymond Hettinger (maintainer of Python's collections module) on YouTube. He has some great tutorials on different tools.
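The named tuple suggestion above could look roughly like this. It is only a sketch: the Person type and the people list are hypothetical names introduced for illustration, and it assumes every dictionary has both a name and an age key.

```python
from collections import Counter, namedtuple

# Hypothetical record type; field names mirror the dictionary keys.
Person = namedtuple("Person", ["name", "age"])

people = [
    {"age": 30, "name": "John"},
    {"age": 20, "name": "Jeff"},
    {"age": 30, "name": "John"},
    {"age": 25, "name": "John"},
]

# Building the tuple by field name avoids depending on dict key order.
counts = Counter(Person(name=d["name"], age=d["age"]) for d in people)

# The most frequent record can then be read by attribute.
most_common_person, n = counts.most_common(1)[0]
```

The keys of counts can be accessed as most_common_person.name and most_common_person.age instead of by position.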

Unfortunately, dicts are not hashable, so I wrote this code. The result is not exactly your desired solution (because that is not possible), but maybe you can use it.
ids_l = [i['id'] for i in my_list]
ids_s = list(set(ids_l))
# k is basically [id, how many]
k = [[i, ids_l.count(i)] for i in ids_s]
# find the first dict in my_list with the given id
def finder(x):
    for i in my_list:
        if i['id'] == x:
            return i
res = []
for i in range(len(ids_s)):
    # k[i][1] is the count
    # finder(k[i][0]) returns the dict
    res.append([k[i][1], finder(k[i][0])])
print(res)
This code returns:
[
[2, {'id': 1, 'first_name': 'Jhon', 'last_name': 'Smith'}],
[1, {'id': 2, 'first_name': 'Jeff', 'last_name': 'Levi'}],
[1, {'id': 3, 'first_name': 'Jhon'}]
]
P.S. Sorry for my poor English.
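The same id-based idea can be written more compactly with Counter. This is a sketch under the same assumption as the code above, namely that two dictionaries with the same id are the same record; first_by_id is an illustrative helper name.

```python
from collections import Counter

my_list = (
    {"id": 1, "first_name": "Jhon", "last_name": "Smith"},
    {"id": 2, "first_name": "Jeff", "last_name": "Levi"},
    {"id": 3, "first_name": "Jhon"},
    {"id": 1, "first_name": "Jhon", "last_name": "Smith"},
)

# Count how often each id occurs (ids are ints, so they are hashable).
id_counts = Counter(d["id"] for d in my_list)

# Remember the first dictionary seen for each id.
first_by_id = {}
for d in my_list:
    first_by_id.setdefault(d["id"], d)

# Same [count, dict] layout as the result shown above.
res = [[n, first_by_id[i]] for i, n in id_counts.items()]
```

This makes a single pass for the counts instead of calling list.count once per id.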

Related

[python - get the list of keys in a list of dictionary]

I have a list of dictionaries
input:
x = [{'id': 19, 'number': 123, 'count': 1},
{'id': 1, 'number': 23, 'count': 7},
{'id': 2, 'number': 238, 'count': 17},
{'id': 1, 'number': 9, 'count': 1}]
How would I get the list of numbers:
[123, 23, 238, 9]
Thank you for reading
To get these numbers you can use
>>> [ d['number'] for d in x ]
But this is not the "list of keys" for which you ask in the question title.
The list of keys of each dictionary d in x is obtained as d.keys()
which would yield something like ['id', 'number', ...]. Do for example
>>> [ list(d.keys()) for d in x ]
to see. If they are all equal you are probably only interested in the first of these lists. You can get it as
>>> list( x[0].keys() )
Note also that the "elements" of a dictionary are actually the keys rather than the values. So you will also get the list ['id', 'number',...] if you write
>>> [ key for key in x[0] ]
or simply (and better):
>>> list( x[0] )
Getting the first element is trickier when x is not a list but a set or dict. In that case you can use next(iter(x)).
P.S.: You should actually think what you really want the keys to be -- a priori that should be the 'id's, not the 'number's, but your 'id's have duplicates which is contradictory to the concept and very definition / meaning of 'id' -- and then use the chosen keys as identifiers to index the elements of your collection 'x'. So if the keys are the 'number's, you should have a dictionary (rather than a list)
x = {123: {'id': 19, 'count': 1}, 23: {'id': 1, 'count': 7}, ...}
(where I additionally assumed that the numbers are indeed integers [which is more efficient] rather than strings, but that's up to you).
Then you can also do, e.g., x[123]['count'] += 1 to increment the 'count' of entry 123.
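Re-indexing the original list by 'number', as suggested above, could be sketched like this. It assumes the numbers are unique; with duplicates, later entries would silently overwrite earlier ones. The name by_number is illustrative.

```python
x = [
    {"id": 19, "number": 123, "count": 1},
    {"id": 1, "number": 23, "count": 7},
    {"id": 2, "number": 238, "count": 17},
    {"id": 1, "number": 9, "count": 1},
]

# Key each record by its 'number' and keep the remaining fields as the value.
# Assumption: 'number' is unique across the list.
by_number = {
    d["number"]: {k: v for k, v in d.items() if k != "number"}
    for d in x
}

# Records can now be looked up and updated directly by number.
by_number[123]["count"] += 1
```

The list of numbers is then simply list(by_number).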
You can use a list comprehension:
numbers = [dictionary.get('number') for dictionary in list_of_dictionaries]
Using a functional programming approach:
from operator import itemgetter
x_k = list(map(itemgetter('number'), x))
#[123, 23, 238, 9]

How to filter list of dicts to leave only unique dicts

I have a query created using SQLAlchemy. This query returns 4 objects, each of which has a list of rules. Each rule is a dict like: {'type': u'psome_type', 'options': {}, 'weight': 1.0}. I want to get one iterable that contains only the unique dicts.
I tried to use set():
rule_sets = DBSession.query(ProductSortRuleSet).all()
rules = set()
for i in rule_sets:
    for j in i.rules:
        rules.update(j)
But when a dictionary is passed to set.update, the set is updated only with its keys, whereas I want to add the whole dictionary.
How can I do this?
Your loop adds all the keys of the different dictionaries to a set. If, instead, you do rules.update(d.items()) you would get a set of all the items, and could re-create one dict with all the entries from it (assuming no duplicate keys with different values), but that seems not to be what you want.
>>> lst = [{"foo": 42, "bar": 23}, {"foo": 3.14}, {"bar": 23, "foo": 42}]
>>> s = set()
>>> for d in lst:
... s.update(d.items())
...
>>> dict(s)
{'foo': 3.14, 'bar': 23}
Note how the "foo": 42 entry disappeared. Instead, you can create a hashable frozenset from each dictionary's items (this requires the values themselves to be hashable), collect those in a set, and create new unique dictionaries from those:
>>> lst = [{"foo": 42, "bar": 23}, {"foo": 3.14}, {"bar": 23, "foo": 42}]
>>> [dict(x) for x in set(frozenset(d.items()) for d in lst)]
[{'foo': 3.14}, {'foo': 42, 'bar': 23}]
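The rules in the question contain an 'options' value that is itself a dict, so frozenset(d.items()) would raise TypeError for them. One possible workaround, sketched here under the assumption that the rules are JSON-serializable, is to deduplicate on a sorted JSON string. The sample rules are made up for illustration.

```python
import json

# Made-up sample data shaped like the rules in the question.
rules = [
    {"type": "some_type", "options": {}, "weight": 1.0},
    {"type": "other_type", "options": {"reverse": True}, "weight": 0.5},
    {"weight": 1.0, "options": {}, "type": "some_type"},  # duplicate of the first
]

seen = set()
unique_rules = []
for rule in rules:
    # sort_keys=True gives equal dicts the same string regardless of key order.
    key = json.dumps(rule, sort_keys=True)
    if key not in seen:
        seen.add(key)
        unique_rules.append(rule)
```

Unlike the set-based version, this keeps the first occurrence of each rule in its original order.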

Python - Filter list of dictionaries based on if key value contains all items in another list

I have a list of dictionaries that looks like this, but with around 500 items:
listOfDicts = [{'ID': 1, 'abc': {'123': 'foo'}}, ... {'ID': 7, 'abc': {'123':'foo','456': 'bar'}}]
sampleFilterList = ['123', '456']
I am trying to filter listOfDicts for all entries where every value in sampleFilterList is a key of the dictionary stored under 'abc'.
The result should be a list:
[{'ID': 7, 'abc': {'123':'foo','456': 'bar'}}, ...]
I tried [i for i in listOfDicts if a for a in sampleFilterList in i['abc']], but I am getting an UnboundLocalError: local variable 'a' referenced before assignment
First, convert your second list to a set for more efficient comparisons:
sampleFilterSet = set(sampleFilterList)
Now, compare the 'abc' keys for each list item to the aforesaid set:
[item for item in listOfDicts if not (sampleFilterSet - item['abc'].keys())]
#[{'ID': 7, 'abc': {'123': 'foo', '456': 'bar'}}]
This is the faster of the two variants shown here. An equivalent (but somewhat slower) solution uses filter():
list(filter(lambda item: not (sampleFilterSet - item['abc'].keys()), listOfDicts))
#[{'ID': 7, 'abc': {'123': 'foo', '456': 'bar'}}]
You need to move the in condition test before the for keyword in the list comprehension. It is also safer to use get, which returns a default value instead of raising an error, in case you are not sure that all the dictionaries in the list have the key abc:
listOfDicts = [{'ID': 1, 'abc': {'123': 'foo'}}, {'ID': 7, 'abc': {'123':'foo','456': 'bar'}}]
sampleFilterList = ['123', '456']
[d for d in listOfDicts if all(s in d.get('abc', {}) for s in sampleFilterList)]
# [{'ID': 7, 'abc': {'123': 'foo', '456': 'bar'}}]
Or, if you use a set as @DYZ suggests, you can use issubset:
filterSet = set(sampleFilterList)
[d for d in listOfDicts if filterSet.issubset(d.get('abc', {}))]
# [{'ID': 7, 'abc': {'123': 'foo', '456': 'bar'}}]
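The issubset approach can be wrapped in a small reusable function. This is a sketch only: filter_by_required_keys and its field parameter are hypothetical names, and the third record is added here to show how an entry without 'abc' is handled.

```python
def filter_by_required_keys(records, required, field="abc"):
    """Return the records whose `field` dict contains every key in `required`."""
    required = set(required)
    # A missing `field` is treated as an empty dict, so such records only
    # match when `required` is empty.
    return [r for r in records if required.issubset(r.get(field, {}))]


records = [
    {"ID": 1, "abc": {"123": "foo"}},
    {"ID": 7, "abc": {"123": "foo", "456": "bar"}},
    {"ID": 9},  # no 'abc' key at all
]

matches = filter_by_required_keys(records, ["123", "456"])
```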
Here is a working version with nested list comprehensions. Your problem is that the a for a in... is a list comprehension, and needs to be used in constructing a new list.
[i for i in listOfDicts if [a for a in sampleFilterList if a in i['abc']] == sampleFilterList]
You could try the following one-liner:
passed_the_filter = [[dictionary_entry for dictionary_entry in list_of_dicts if filter_test in dictionary_entry['abc']] for filter_test in filter]
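It is a nested list comprehension that iterates through both the filter and the dictionary list. It checks whether each filter value is a key in the dictionary entries' "abc" value. Note that the result contains one list per filter value rather than a single list of the entries that match all of them. Your problem was that you used the wrong list comprehension syntax.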
N.B. You may want to note that you might not be sure that an element has a "abc" key!
Thank you for reading this.
for i in zip(listOfDicts):
    a = i[0]['abc']
    print(a)
or:
for i in zip(listOfDicts):
    if 'abc' in i[0]:
        a = i
        print(a)
This is an elegant way to do it, I hope it will be useful.

Python - Get dictionary element in a list of dictionaries after an if statement

How can I get a dictionary value in a list of dictionaries, based on the dictionary satisfying some condition? For instance, if one of the dictionaries in the list has the id=5, I want to print the value corresponding to the name key of that dictionary:
list = [{'name': 'Mike', 'id': 1}, {'name': 'Ellen', 'id': 5}]
id = 5
if any(m['id'] == id for m in list):
    print m['name']
This won't work because m is not defined outside the if statement.
You have a list of dictionaries, so you can use a list comprehension:
[d for d in lst if d['id'] == 5]
# [{'id': 5, 'name': 'Ellen'}]
new_list = [m['name'] for m in list if m['id'] == 5]
print('\n'.join(new_list))
This is easy to accomplish with a single for loop:
for d in list:
    if 'id' in d and d['id'] == 5:
        print(d['name'])
There are two key concepts to learn here. The first is that we used a for loop to go through each element of the list. The second is that we used the in operator to check whether a dictionary has a certain key.
How about the following?
for entry in list:
    if entry['id'] == 5:
        print(entry['name'])
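ChainMap doesn't exist in Python 2, but in Python 3 you could use a ChainMap instead of a list. Note, however, that a lookup returns the value from the first mapping that contains the key, so the code below prints 1 and does not select the dictionary whose id is 5.
import collections
d = collections.ChainMap(*[{'name':'Mike', 'id': 1}, {'name':'Ellen', 'id': 5}])
if 'id' in d:
    print(d['id'])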
You can do it by using the filter function:
lis = [ {'name': 'Mike', 'id': 1}, {'name':'Ellen', 'id': 5}]
# In Python 3, filter() returns an iterator, so wrap it in list() before indexing
result = list(filter(lambda dic: dic['id'] == 5, lis))[0]['name']
print(result)
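Several of the answers above raise an error or print nothing when no dictionary matches. A common alternative is next() with a generator expression and a default. The sketch below uses illustrative names (people, wanted_id) instead of list and id to avoid shadowing the built-ins.

```python
people = [{"name": "Mike", "id": 1}, {"name": "Ellen", "id": 5}]
wanted_id = 5

# next() stops at the first match; the second argument is returned
# when no dictionary satisfies the condition.
match = next((p for p in people if p.get("id") == wanted_id), None)
name = match["name"] if match is not None else None

# A lookup for an id that does not exist yields the default instead of raising.
missing = next((p for p in people if p.get("id") == 42), None)
```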

Get sorted list of indices, for a list of dictionaries sorted by a given key

I have a list of dictionaries with multiple keys and I would like to get multiple lists each with the indices of the dictionaries, sorted by a particular key.
For example, I have the following list
a = [{"name": "Zoe", "age": 13}, {"name": "Adam", "age": 31}]
I know that I can do the following
from operator import itemgetter
sorted(a, key=itemgetter("name"))
to get the following
[{'name': 'Adam', 'age': 31}, {'name': 'Zoe', 'age': 13}]
but I really want to be able to get two lists with the indices instead:
[1, 0] # order of the elements in `a` as sorted by name
[0, 1] # order of the elements in `a` as sorted by age
My real dictionary has many more key-value pairs and it's more efficient for me to return a and the additional lists with indices as sorted by different keys rather than multiple sorted lists of a.
You can produce a sorted list of indices by producing a range(), and then adjusting the sort key to translate an index to a specific value from a dictionary from the a list:
sorted(range(len(a)), key=lambda i: a[i]['name'])
sorted(range(len(a)), key=lambda i: a[i]['age'])
If you know the keys up front, just loop and produce multiple lists; you could perhaps create a dictionary keyed by each sorting key:
{key: sorted(range(len(a)), key=lambda i: a[i][key]) for key in ('name', 'age')}
The above dictionary comprehension then gives you access to each sorted list based on the sort key alone:
>>> a = [{'name': "Zoe", 'age': 13}, {'name': "Adam", 'age': 31}]
>>> {key: sorted(range(len(a)), key=lambda i: a[i][key]) for key in ('name', 'age')}
{'age': [0, 1], 'name': [1, 0]}
A simple way to do this is to add an index field to the dicts in your list.
from operator import itemgetter
a = [{'name':"Zoe", 'age':13}, {'name':"Adam", 'age':31}]
for i, d in enumerate(a):
    d['index'] = i
def get_indices(seq):
    return [d['index'] for d in seq]
by_name = sorted(a, key=itemgetter("name"))
print(by_name)
print(get_indices(by_name))
by_age = sorted(a, key=itemgetter("age"))
print(by_age)
print(get_indices(by_age))
output
[{'index': 1, 'age': 31, 'name': 'Adam'}, {'index': 0, 'age': 13, 'name': 'Zoe'}]
[1, 0]
[{'index': 0, 'age': 13, 'name': 'Zoe'}, {'index': 1, 'age': 31, 'name': 'Adam'}]
[0, 1]
Of course, you don't need to keep the sorted lists of dicts, you could just do
by_name = get_indices(sorted(a, key=itemgetter("name")))
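If adding an 'index' field to the dictionaries is undesirable, the index can be carried alongside each dictionary with enumerate instead. This is a minimal sketch; sorted_indices is a hypothetical helper name.

```python
a = [{"name": "Zoe", "age": 13}, {"name": "Adam", "age": 31}]


def sorted_indices(seq, key):
    """Return the indices of seq ordered by each dictionary's value for key."""
    # enumerate() pairs every dictionary with its position, so the
    # dictionaries themselves are left unmodified.
    pairs = sorted(enumerate(seq), key=lambda pair: pair[1][key])
    return [index for index, _ in pairs]


by_name = sorted_indices(a, "name")
by_age = sorted_indices(a, "age")
```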
