This is a follow-up to this question: Using pandas to add list elements together. I would like to generalize this function to getting unique elements in an array, even if they're not of a 'hashable' type, such as a dict. Here is the input array:
items = [
{
'FirstName': 'David',
'LastName': 'Smith',
'Residence': [{'Place': 'X', 'Age': 22}, {'Place': 'Y', 'Age': 23}]
},
{
'FirstName': 'David',
'LastName': 'Smith',
'Residence': [{'Place': 'Z', 'Age': 20}]
},
{
'FirstName': 'David',
'LastName': 'Smith',
'Residence': [{'Place': 'Z', 'Age': 20}]
},
{
'FirstName': 'Bob',
'LastName': 'Jones',
'Residence': [{'Place': 'Z', 'Age': 20}]
}
]
I want to combine the unique Residences (dicts) for each person, so the final result would be:
items = [
{
'FirstName': 'David',
'LastName': 'Smith',
'Residence': [{'Place': 'X', 'Age': 22}, {'Place': 'Y', 'Age': 23}, {'Place': 'Z', 'Age': 20}]
},
{
'FirstName': 'Bob',
'LastName': 'Jones',
'Residence': [{'Place': 'Z', 'Age': 20}]
}
]
The SQL I would use would be something like this:
SELECT FirstName, LastName, GROUP_CONCAT(DISTINCT **Residence Object**)
FROM items
GROUP BY FirstName, LastName
How would I do this in pandas, so that I don't get an unhashable type error when trying to get the distinct array elements?
Barring anything else, I don't think Pandas would give you any real benefit here:
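from collections import defaultdict
d = defaultdict(list)
for e in items:
    residences = d[(e['FirstName'], e['LastName'])]
    for r in e['Residence']:
        # dicts are unhashable, so test membership in the list instead of using a set
        if r not in residences:
            residences.append(r)
items = [{'FirstName': k[0], 'LastName': k[1], 'Residence': v} for k, v in d.items()]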
Solution from pandas
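import pandas as pd
df = pd.DataFrame(items)
# concatenate each group's lists, then de-duplicate via hashable tuples of items
df.groupby(['FirstName','LastName']).Residence.\
apply(lambda x : x.sum()).\
apply(lambda x : [dict(y) for y in set(tuple(t.items()) for t in x)]).\
reset_index().to_dict('records')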
Out[104]:
[{'FirstName': 'Bob',
'LastName': 'Jones',
'Residence': [{'Age': 20, 'Place': 'Z'}]},
{'FirstName': 'David',
'LastName': 'Smith',
'Residence': [{'Age': 20, 'Place': 'Z'},
{'Age': 23, 'Place': 'Y'},
{'Age': 22, 'Place': 'X'}]}]
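The set-of-tuples step above does not keep the original order, and it treats two dicts with the same content but a different key order as distinct. A minimal sketch of an order-preserving alternative follows; the helper name unique_dicts is hypothetical, and it assumes every value in the dicts is hashable and the keys can be sorted.

```python
def unique_dicts(dicts):
    """Return the dicts with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for d in dicts:
        # Sorting the items makes the lookup key independent of key order.
        key = tuple(sorted(d.items()))
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


residences = [
    {'Place': 'X', 'Age': 22},
    {'Place': 'Z', 'Age': 20},
    {'Age': 20, 'Place': 'Z'},  # same content as the previous dict
]
print(unique_dicts(residences))
```

The same helper could be passed to the second apply in the pandas chain in place of the lambda.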
Suppose I have a named list as follows:
myListOfPeople = [{'ID': 0, 'Name': 'Mary', 'Age': 25}, {'ID': 1, 'Name': 'John', 'Age': 28}]
I want to select the whole element (not only the field) whose field meets a certain criterion, e.g., the element with the minimum 'Age'. Something like:
youngerPerson = [person for person in myListOfPeople if person = ***person with minimum age***]
And get as the answer:
>>youngerPerson: {'ID': 0, 'Name': 'Mary', 'Age': 25}
How can I do that?
You can use the key parameter of min:
>>> myListOfPeople = [{'ID': 0, 'Name': 'Mary', 'Age': 25}, {'ID': 1, 'Name': 'John', 'Age': 28}]
>>>
>>> min(myListOfPeople, key=lambda x: x["Age"])
{'ID': 0, 'Name': 'Mary', 'Age': 25}
>>>
You can use itemgetter:
from operator import itemgetter
myListOfPeople = [{'ID': 0, 'Name': 'Mary', 'Age': 25}, {'ID': 1, 'Name': 'John', 'Age': 28}]
sorted(myListOfPeople, key=itemgetter('Age'))[0]
# {'ID': 0, 'Name': 'Mary', 'Age': 25}
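Two edge cases are worth a quick sketch: ties on the minimum and an empty list. The example below combines min with itemgetter; the third person is hypothetical data added only to create a tie, and the default argument of min assumes Python 3.4 or later.

```python
from operator import itemgetter

# Hypothetical data: a third person is added to create a tie on 'Age'.
people = [
    {'ID': 0, 'Name': 'Mary', 'Age': 25},
    {'ID': 1, 'Name': 'John', 'Age': 28},
    {'ID': 2, 'Name': 'Ann', 'Age': 25},
]

# min() scans the list once and returns the first element with the smallest key.
youngest = min(people, key=itemgetter('Age'))

# Collect every element that shares the minimum age.
all_youngest = [p for p in people if p['Age'] == youngest['Age']]

# On an empty list, default avoids the ValueError that min() would otherwise raise.
nobody = min([], key=itemgetter('Age'), default=None)

print(youngest)
print(all_youngest)
print(nobody)
```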
Imagine that you have the following lists.
name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]
How do you convert them to an array of dictionaries?
[
{
"name": "bob",
"age": 35,
"gender": "Male"
},
{
"name": "kate",
"age": 12,
"gender": "Female"
},
{
"name": "john",
"age": 57,
"gender": "Male"
}
]
A generic method which works for any number of lists with customizable field names
import pprint
def make_complex(**kwargs):
    # zip the column values row by row, then pair each row with the field names
    return [dict(zip(kwargs.keys(), a)) for a in zip(*kwargs.values())]
name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]
l = make_complex(name=name, age=age, gender=gender)
pprint.pprint(l)
l = make_complex(user=name, year=age, sex=gender)
pprint.pprint(l)
output:
[{'age': 35, 'gender': 'Male', 'name': 'bob'},
{'age': 12, 'gender': 'Female', 'name': 'kate'},
{'age': 57, 'gender': 'Male', 'name': 'john'}]
[{'sex': 'Male', 'user': 'bob', 'year': 35},
{'sex': 'Female', 'user': 'kate', 'year': 12},
{'sex': 'Male', 'user': 'john', 'year': 57}]
Using zip and a list comprehension
Code:
name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]
dic= [ {"name":val[0], "age":val[1], "gender":val[2]} for val in zip(name, age, gender)]
Output:
[{'name':'bob','age':35,'gender':'Male'},
{'name':'kate','age':12,'gender':'Female'},
{'name':'john','age':57,'gender':'Male'}]
Using a simple loop it would look something like:
name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]
result = []  # avoid naming this "list", which would shadow the built-in
for i in range(len(name)):
    temp = {}
    temp['name'] = name[i]
    temp['age'] = age[i]
    temp['gender'] = gender[i]
    result.append(temp)
Using a list comprehension and zip (in Python 2, itertools.izip is the lazy equivalent; it was removed in Python 3, where the built-in zip is already lazy)
d = [{'name': n, 'age': a, 'gender': g} for n, a, g in zip(name, age, gender)]
Use list comprehension.
In [3]: [{"name":n,"age":a,"gender":g} for n,a,g in zip(name, age, gender)]
Out[3]:
[{'age': 35, 'gender': 'Male', 'name': 'bob'},
{'age': 12, 'gender': 'Female', 'name': 'kate'},
{'age': 57, 'gender': 'Male', 'name': 'john'}]
or,
In [5]: [dict(zip(['name','age','gender'], t)) for t in zip(name, age, gender)]
Out[5]:
[{'age': 35, 'gender': 'Male', 'name': 'bob'},
{'age': 12, 'gender': 'Female', 'name': 'kate'},
{'age': 57, 'gender': 'Male', 'name': 'john'}]
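One thing the zip-based answers share is that zip stops at the shortest input, so a list that is one element short silently drops a record. A minimal sketch that checks the lengths first is shown below; the function name columns_to_records is hypothetical and not part of any answer above.

```python
def columns_to_records(**columns):
    """Build one dict per row from equally long, named columns."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        # zip() would silently truncate to the shortest column, so fail loudly instead.
        raise ValueError("columns have different lengths: %s" % sorted(lengths))
    return [dict(zip(columns, row)) for row in zip(*columns.values())]


name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]

records = columns_to_records(name=name, age=age, gender=gender)
print(records)
```

On Python 3.10 or later, passing strict=True to zip gives a similar check without the helper.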
Go for this.
name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]
keys = [name, age, gender] #If there are more data to be added just change this one place
def get_var_name(var):
    # return the first global name bound to this exact object
    for k, v in list(globals().items()):
        if v is var:
            return k
d = []
for i in range(len(keys[0])):
    d.append({})
    for key in keys:
        d[i][get_var_name(key)] = key[i]
print(d)
Or use a dict comprehension to avoid the inner loop:
d = []
for i in range(len(name)):
    d.append({get_var_name(key): key[i] for key in keys})
print(d)
To make it a one-liner, combine an inner dict comprehension with an outer list comprehension:
print([{get_var_name(key): key[i] for key in keys} for i in range(len(keys[0]))])
I have a list which looks like this:
some_list = [{'id':1, 'name':'Steve', 'age':23}, {'id':2, 'name':'John', 'age':17}, {'id':3, 'name':'Matt', 'age':31}]
I would like to sort the list by the name value in each dictionary. So instead of the above order, it would be John, then Matt, then Steve.
How would I go about this? Thanks.
You can use operator.itemgetter:
>>> import operator
>>> some_list = [dict(id=1, name='Steve', age=23), dict(id=2, name='John', age=17), dict(id=3, name='Matt', age=31)]
>>> sorted(some_list, key=operator.itemgetter('name'))
[{'id': 2, 'age': 17, 'name': 'John'}, {'id': 3, 'age': 31, 'name': 'Matt'}, {'id': 1, 'age': 23, 'name': 'Steve'}]
Or a lambda function:
>>> some_list = [dict(id=1, name='Steve', age=23), dict(id=2, name='John', age=17), dict(id=3, name='Matt', age=31)]
>>> sorted(some_list, key=lambda x: x['name'])
[{'id': 2, 'age': 17, 'name': 'John'}, {'id': 3, 'age': 31, 'name': 'Matt'}, {'id': 1, 'age': 23, 'name': 'Steve'}]
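Both variants compare the names as plain strings, so uppercase letters sort before lowercase ones. The sketch below shows a case-insensitive key and an in-place descending sort; the lowercase 'john' is a hypothetical change to the data, made only so the difference is visible.

```python
from operator import itemgetter

# Hypothetical variation of the data: 'john' is lowercase here.
some_list = [
    {'id': 1, 'name': 'Steve', 'age': 23},
    {'id': 2, 'name': 'john', 'age': 17},
    {'id': 3, 'name': 'Matt', 'age': 31},
]

# Plain string comparison: uppercase letters sort before lowercase ones.
case_sensitive = [d['name'] for d in sorted(some_list, key=itemgetter('name'))]

# Lowercasing inside the key function ignores case without changing the data.
case_insensitive = [d['name'] for d in sorted(some_list, key=lambda d: d['name'].lower())]

print(case_sensitive)    # ['Matt', 'Steve', 'john']
print(case_insensitive)  # ['john', 'Matt', 'Steve']

# list.sort() reorders the list in place; reverse=True gives descending order.
some_list.sort(key=itemgetter('age'), reverse=True)
print([d['age'] for d in some_list])  # [31, 23, 17]
```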
I have a Python list as follows,
demo= {'age': 90, 'id': '12#2'}
{'age': 12, 'id': '12#3'}
{'age': 67, 'id': '12#1'}
{'age': 56, 'id': '12#2'}
{'age': 34, 'id': '12#2'}
How can I sort this list by the id attribute?
I have tried
sorted(demo, key=lambda x: x.id) # sort by id
but it failed.
Expected output as follows:
{'age': 90, 'id': '12#2'}
{'age': 56, 'id': '12#2'}
{'age': 34, 'id': '12#2'}
{'age': 12, 'id': '12#3'}
{'age': 67, 'id': '12#1'}
Your code fails with an AttributeError because x.id is an attribute lookup, and dict objects have no id attribute. You need to access the dictionary key instead:
sorted(demo, key=lambda x: x['id'])
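However, that will fail with a KeyError if at least one entry in the list does not have the id key. In that case, you can use get, which returns None for a missing key:
sorted(demo, key=lambda x: x.get("id"))
This works in Python 2, where None sorts before every string. In Python 3 it raises a TypeError as soon as None is compared with a str, so there you need to pass a default value to get. The default also lets you decide whether the entries with no id go above or below the rest. With the ids shown here, the following would send entries with no id to the bottom:
sorted(demo, key=lambda x: x.get("id", "99"))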
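The "99" default only works while every real id sorts before "99". A sketch of a key that places missing ids last regardless of their values is shown below; the helper name id_or_last and the entry without an id are hypothetical, added for illustration.

```python
# Hypothetical data: the second entry has no 'id' key.
demo = [
    {'age': 90, 'id': '12#2'},
    {'age': 12},
    {'age': 67, 'id': '12#1'},
]


def id_or_last(item):
    value = item.get('id')
    # False sorts before True, so entries that have an id come first.
    # The second tuple element is always a str, so nothing is compared with None.
    return (value is None, value or '')


result = sorted(demo, key=id_or_last)
for el in result:
    print(el)
```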
It may also happen that you have an id like 12#10 and you don't want it to be between 12#1 and 12#2. To solve that problem, you need to split the id and have a more complex sorting function.
def get_values(item):
    # '12#10' -> [12, 10]
    return [int(x) for x in item['id'].split('#')]
def compare(a, b):
    a = get_values(a)
    b = get_values(b)
    # compare the part before '#' first, then the part after it
    if not a[0] == b[0]:
        return a[0] - b[0]
    return a[1] - b[1]
Then, in Python 2, you call sorted using that comparison function:
sorted(demo, cmp=compare)
Or in Python 3, where cmp has been eliminated:
from functools import cmp_to_key
sorted(demo, key=cmp_to_key(compare))
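The same numeric ordering can probably be expressed without a comparison function, because tuples compare element by element. The sketch below uses a key function instead; the name id_key and the '12#10' entry are assumptions added for illustration.

```python
def id_key(item):
    # '12#10' -> (12, 10); tuples are compared element by element.
    return tuple(int(part) for part in item['id'].split('#'))


# Hypothetical data that includes the '12#10' case mentioned above.
demo = [
    {'age': 90, 'id': '12#2'},
    {'age': 12, 'id': '12#10'},
    {'age': 67, 'id': '12#1'},
]

by_text = [d['id'] for d in sorted(demo, key=lambda x: x['id'])]
by_number = [d['id'] for d in sorted(demo, key=id_key)]

print(by_text)    # ['12#1', '12#10', '12#2']
print(by_number)  # ['12#1', '12#2', '12#10']
```

A key function is called once per element, whereas a comparison function is called for every pair that gets compared, so the key form is usually the cheaper of the two.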
If demo is the list (note the brackets and commas)
demo= [{'age': 90, 'id': '12#2'},
{'age': 12, 'id': '12#3'},
{'age': 67, 'id': '12#1'},
{'age': 56, 'id': '12#2'},
{'age': 34, 'id': '12#2'}]
Then you could sort it by id with:
sorted(demo, key=lambda x: x['id'])
For example:
In [5]: sorted(demo, key=lambda x: x['id'])
Out[5]:
[{'age': 67, 'id': '12#1'},
{'age': 90, 'id': '12#2'},
{'age': 56, 'id': '12#2'},
{'age': 34, 'id': '12#2'},
{'age': 12, 'id': '12#3'}]
demo= [{'age': 90, 'id': '12#2'},
{'age': 12, 'id': '12#3'},
{'age': 67, 'id': '12#1'},
{'age': 56, 'id': '12#2'},
{'age': 34, 'id': '12#2'}]
a = sorted(demo, key=lambda x: x['id'])
for el in a:
    print(el)
gives
{'age': 67, 'id': '12#1'}
{'age': 90, 'id': '12#2'}
{'age': 56, 'id': '12#2'}
{'age': 34, 'id': '12#2'}
{'age': 12, 'id': '12#3'}
which is sorted by id.
Sort by multiple attributes
demo= [{'age': 90, 'id': '12#2'},
{'age': 12, 'id': '12#3'},
{'age': 67, 'id': '12#1'},
{'age': 56, 'id': '12#2'},
{'age': 34, 'id': '12#2'}]
a = sorted(demo, key=lambda x: (x['id'], x['age']))
for el in a:
    print(el)
gives
{'age': 67, 'id': '12#1'}
{'age': 34, 'id': '12#2'}
{'age': 56, 'id': '12#2'}
{'age': 90, 'id': '12#2'}
{'age': 12, 'id': '12#3'}
which is first sorted by id and then by age (ascending).
Alternatively, if you want to sort ascending by id and descending by age, you can negate the age in the key:
demo= [{'age': 90, 'id': '12#2'},
{'age': 12, 'id': '12#3'},
{'age': 67, 'id': '12#1'},
{'age': 56, 'id': '12#2'},
{'age': 34, 'id': '12#2'}]
a = sorted(demo, key=lambda x: (x['id'], -x['age']))
for el in a:
    print(el)
which gives
{'age': 67, 'id': '12#1'}
{'age': 90, 'id': '12#2'}
{'age': 56, 'id': '12#2'}
{'age': 34, 'id': '12#2'}
{'age': 12, 'id': '12#3'}
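Negating the value only works for numbers. When the descending field is a string or some other type that cannot be negated, one option is to rely on the stability of Python's sort and sort twice, secondary key first. A minimal sketch with the same data:

```python
from operator import itemgetter

demo = [{'age': 90, 'id': '12#2'},
        {'age': 12, 'id': '12#3'},
        {'age': 67, 'id': '12#1'},
        {'age': 56, 'id': '12#2'},
        {'age': 34, 'id': '12#2'}]

# Step 1: sort by the secondary key (age) in descending order.
by_age_desc = sorted(demo, key=itemgetter('age'), reverse=True)

# Step 2: sort by the primary key (id). The sort is stable, so entries
# with equal ids keep the descending age order from step 1.
result = sorted(by_age_desc, key=itemgetter('id'))

for el in result:
    print(el)
```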
Your example does not include a list; you need to have [] around your dictionaries. I have fixed that for you:
>>> demo= [{'age': 90, 'id': '12#2'},
{'age': 12, 'id': '12#3'},
{'age': 67, 'id': '12#1'},
{'age': 56, 'id': '12#2'},
{'age': 34, 'id': '12#2'}]
>>> sorted(demo, key=lambda x: x['id'])
[{'age': 67, 'id': '12#1'},
{'age': 90, 'id': '12#2'},
{'age': 56, 'id': '12#2'},
{'age': 34, 'id': '12#2'},
{'age': 12, 'id': '12#3'}]