How does pandas do sorting - python

At a high level, how does pandas do sorting? For example, if I had the following dataframe:
l = [{
'Name': 'Todd',
'Age': 20,
}, {
'Name': 'Sarah',
'Age': 25,
}, {
'Name': 'Sarah',
'Age': 29,
}]
import pandas as pd
df = pd.DataFrame(l)
df.sort_values(by=['Name', 'Age'])
Is this using a Python built-in, such as:
sorted(l, key = lambda x: (x['Name'], x['Age']))
Or is the pandas sort something more complex? What is its basic implementation?
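As far as I know, pandas does not call sorted on row dicts. It works on whole columns: it computes an array of row positions (an indexer) from the key columns and then reorders every column with that indexer. The following is a pure-Python sketch of that idea only, not pandas's actual implementation; argsort_by and take are hypothetical helper names, and the repeated stable sort is an assumption chosen for simplicity.

```python
# Conceptual sketch only: column-oriented sorting through an indexer.
columns = {
    "Name": ["Todd", "Sarah", "Sarah"],
    "Age": [20, 25, 29],
}

def argsort_by(columns, by):
    """Return row positions ordered by the given key columns."""
    n_rows = len(next(iter(columns.values())))
    indexer = list(range(n_rows))
    # list.sort is stable, so sorting by the least significant key first
    # and the most significant key last yields a multi-key ordering.
    for key in reversed(by):
        col = columns[key]
        indexer.sort(key=lambda i: col[i])
    return indexer

def take(columns, indexer):
    """Reorder every column with the same row positions."""
    return {name: [col[i] for i in indexer] for name, col in columns.items()}

indexer = argsort_by(columns, ["Name", "Age"])
sorted_columns = take(columns, indexer)
print(indexer)
print(sorted_columns)
```

The row order matches what sorted(l, key=lambda x: (x['Name'], x['Age'])) produces for the same data, but the work is done per column rather than per row.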

Related

How to find the max value from a list of dicts based on different keys in Python

I have the list of dicts below:
[{
'NAV': 50,
'id': '61e6b2a1d0c32b744d3e3b2d'
}, {
'NAV': 25,
'id': '61e7fbe2d0c32b744d3e6ab4'
}, {
'NAV': 30,
'id': '61e801cbd0c32b744d3e7003'
}, {
'NAV': 30,
'id': '61e80663d0c32b744d3e7c51'
}, {
'NAV': 30,
'id': '61e80d9ad0c32b744d3e8da6'
}, {
'NAV': 30,
'id': '61e80f5fd0c32b744d3e93f0'
}, {
'NAV': 30,
'id': '61e90908d0c32b744d3ea967'
}, {
'NAV': 30,
'id': '61ea7cf3d0c32b744d3ed1b2'
}, {
'NAV': 50,
'id': '61fa387127e14670f3a67194'
}, {
'NAV': 30,
'id': '61fa3cea27e14670f3a6772c'
}, {
'Amount': 30,
'id': '61e6b373d0c32b744d3e3d14'
}, {
'Amount': 30,
'id': '61e6b49cd0c32b744d3e3ea0'
}, {
'Amount': 25,
'id': '61e7fe90d0c32b744d3e6ccd'
}, {
'Amount': 20,
'id': '61e80246d0c32b744d3e7242'
}, {
'Amount': 20,
'id': '61e80287d0c32b744d3e74ae'
}, {
'Amount': 20,
'id': '61e80253d0c32b744d3e733e'
}, {
'Amount': 34,
'id': '61e80697d0c32b744d3e7edd'
}, {
'Amount': 20,
'id': '61e806a3d0c32b744d3e7ff9'
}, {
'Amount': 30,
'id': '61e80e0ad0c32b744d3e906e'
}, {
'Amount': 30,
'id': '61e80e22d0c32b744d3e9198'
}, {
'Amount': 20,
'id': '61e81011d0c32b744d3e978e'
}, {
'Amount': 20,
'id': '61e8104bd0c32b744d3e9a92'
}, {
'Amount': 20,
'id': '61e81024d0c32b744d3e98cd'
}, {
'Amount': 20,
'id': '61e90994d0c32b744d3eac2b'
}, {
'Amount': 20,
'id': '61e909aad0c32b744d3ead76'
}, {
'Amount': 50,
'id': '61fa392a27e14670f3a67337'
}, {
'Amount': 50,
'id': '61fa393727e14670f3a67347'
}, {
'Amount': 50,
'id': '61fa3d6727e14670f3a67750'
}, {
'Amount': 150,
'id': '61fa3d7127e14670f3a67760'
}]
The list above contains dicts that have either NAV or Amount as a key. I need to find the max value separately among all the dicts for NAV and for Amount, so that the output is:
NAV = 50
Amount = 150
I have tried an approach like:
max(outList, key=lambda x: x['NAV'])
But this gives me a KeyError for 'NAV'. What is the best way to do it?
I don't understand why you are looking up Current NAV ($ M); that key doesn't exist in the list you have provided. Anyway, I came up with the code below, which returns the dict with the largest value across both keys:
def getMax(value):
    # Fall back to "Amount" when the dict has no "NAV" key
    if "NAV" in value:
        return value["NAV"]
    else:
        return value["Amount"]
max(outList, key=getMax)
If you are interested in finding the max value for NAV and Amount separately, you can filter the list by key first and then call max on the remaining values:
print(max([x["NAV"] for x in outList if "NAV" in x]))
print(max([x["Amount"] for x in outList if "Amount" in x]))
You could try something like this, where t is your list:
print(max([i["NAV"] for i in t if "NAV" in i]))
print(max([i["Amount"] for i in t if "Amount" in i]))
Result:
50
150
Or, as a reusable function:
def max_from_list(t_list, key):
    return max([i[key] for i in t_list if key in i])
If you are not only looking for NAV and Amount, you can collect the max of every key except id, where d is your list:
res = {}
for i in d:
    for k, v in i.items():
        if k != 'id':
            # Keep the larger of the stored max and the current value
            res[k] = max(res[k], v) if k in res else v
You are on the right track with your max solution. Missing keys can be handled with dict.get and a default of negative infinity.
Assuming your list is called outList (by the way, that is not a Pythonic name; try out_list instead):
nav = max(outList, key=lambda item: item.get("NAV", float("-inf")))['NAV']
amount = max(outList, key=lambda item: item.get("Amount", float("-inf")))['Amount']
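The filtering idea from the answers above can be wrapped in one small function. This is a minimal sketch on a shortened, made-up version of the list (the ids are placeholders), and max_for_key is a hypothetical name. It uses the default argument of max so that a key missing from every dict gives None instead of raising ValueError.

```python
# Shortened sample data; the ids are placeholders, not the real ones.
out_list = [
    {"NAV": 50, "id": "a"},
    {"NAV": 25, "id": "b"},
    {"Amount": 30, "id": "c"},
    {"Amount": 150, "id": "d"},
]

def max_for_key(records, key):
    """Return the largest value stored under key, or None if no record has it."""
    return max((r[key] for r in records if key in r), default=None)

nav = max_for_key(out_list, "NAV")
amount = max_for_key(out_list, "Amount")
missing = max_for_key(out_list, "Price")  # no dict has this key

print(f"NAV = {nav}")
print(f"Amount = {amount}")
```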

python list of dictionaries add a value by key

data = [
{
'name': 'Jack',
'points': 10
},
{
'name': 'John',
'points': 12
},
{
'name': 'Jack',
'points': 15
},
{
'name': 'Harry',
'points': 11
}
]
Output:
Jack: 25 points ,
John: 12 points ,
Harry: 11 points
Is there any way to achieve this without using a for loop?
I can achieve this by storing name and points as key-value pairs in a dictionary and adding the points if the name already exists. But is there an alternative way to achieve this?
You can use groupby from itertools. The data has to be sorted by the same key first, because groupby only groups consecutive items:
from itertools import groupby
key = lambda d: d['name']
result = {n: sum(v['points'] for v in vs) for n, vs in groupby(sorted(data, key=key), key)}
Result:
{'Harry': 11, 'Jack': 25, 'John': 12}
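The sorted call in that one-liner matters. Here is a small sketch, using the data from the question, of what happens when it is left out: groupby starts a new group every time the key changes, so the two Jack entries end up in separate groups and the later one overwrites the earlier one in the dict.

```python
from itertools import groupby

data = [
    {'name': 'Jack', 'points': 10},
    {'name': 'John', 'points': 12},
    {'name': 'Jack', 'points': 15},
    {'name': 'Harry', 'points': 11},
]

def by_name(d):
    return d['name']

# Without sorting: the two 'Jack' rows are not adjacent, so they form
# two groups and the second total replaces the first.
unsorted_totals = {n: sum(v['points'] for v in vs)
                   for n, vs in groupby(data, by_name)}

# With sorting: equal names are adjacent and are summed together.
sorted_totals = {n: sum(v['points'] for v in vs)
                 for n, vs in groupby(sorted(data, key=by_name), by_name)}

print(unsorted_totals)
print(sorted_totals)
```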

how to convert a pandas dataframe to a list of dictionaries in python?

I have a dataframe like this:
data = {'id': [1, 1, 2, 2, 2, 3],
        'value': ['a', 'b', 'c', 'd', 'e', 'f']}
df = pd.DataFrame(data, columns=['id', 'value'])
I want to convert it to a list of dictionaries like:
df_dict = [
{
'id': 1,
'value':['a','b']
},
{
'id': 2,
'value':['c','d','e']
},
{
'id': 3,
'value':['f']
}
]
And then eventually insert this list df_dict into another dictionary:
{
"products": [
{
"productID": 1234,
"tag": df_dict
}
]
}
We don't need to worry about what the other dictionary looks like. We can simply use the example I gave above.
How do I do that? Many thanks!
You can group by id, aggregate each group into a list, and then use to_dict to convert the result to a list of dictionaries:
>>> df.groupby(df['id'], as_index=False).agg(list).to_dict(orient="records")
[{'id': 1, 'value': ['a', 'b']}, {'id': 2, 'value': ['c', 'd', 'e']}, {'id': 3, 'value': ['f']}]

Python: Convert multiple list into an array of dictionary

Imagine that you have the following lists:
name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]
How do you convert them to a list of dictionaries?
[
    {
        "name": "bob",
        "age": 35,
        "gender": "Male"
    },
    {
        "name": "kate",
        "age": 12,
        "gender": "Female"
    },
    {
        "name": "john",
        "age": 57,
        "gender": "Male"
    }
]
A generic method which works for any number of lists, with customizable field names:
import pprint
def make_complex(**kwargs):
    # Pair each keyword name with the matching item of every row
    return [dict(zip(kwargs.keys(), a)) for a in zip(*kwargs.values())]
name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]
l = make_complex(name=name, age=age, gender=gender)
pprint.pprint(l)
l = make_complex(user=name, year=age, sex=gender)
pprint.pprint(l)
output:
[{'age': 35, 'gender': 'Male', 'name': 'bob'},
{'age': 12, 'gender': 'Female', 'name': 'kate'},
{'age': 57, 'gender': 'Male', 'name': 'john'}]
[{'sex': 'Male', 'user': 'bob', 'year': 35},
{'sex': 'Female', 'user': 'kate', 'year': 12},
{'sex': 'Male', 'user': 'john', 'year': 57}]
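One caveat with zip-based approaches like this is that zip silently stops at the shortest input, so lists of unequal length lose data without any warning. Below is a minimal sketch of a stricter variant, assuming you would rather fail loudly. records_from_columns is a hypothetical helper name, not part of the answer above.

```python
def records_from_columns(**columns):
    """Build one dict per row from equally long, named lists."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError(f"columns have different lengths: {sorted(lengths)}")
    # Iterating over a dict yields its keys, so zip(columns, row)
    # pairs each field name with the matching value of the row.
    return [dict(zip(columns, row)) for row in zip(*columns.values())]

name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]

people = records_from_columns(name=name, age=age, gender=gender)
print(people[0])

# A deliberately short list triggers the length check.
try:
    records_from_columns(name=name, age=[35, 12])
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```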
Using zip and a list comprehension
Code:
name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]
dic = [{"name": val[0], "age": val[1], "gender": val[2]} for val in zip(name, age, gender)]
Output:
[{'name':'bob','age':35,'gender':'Male'},
{'name':'kate','age':12,'gender':'Female'},
{'name':'john','age':57,'gender':'Male'}]
Using a simple loop it would look something like:
name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]
result = []  # not named "list", which would shadow the built-in
for i in range(len(name)):
    temp = {}
    temp['name'] = name[i]
    temp['age'] = age[i]
    temp['gender'] = gender[i]
    result.append(temp)
Using a list comprehension and zip (itertools.izip exists only in Python 2; in Python 3 the built-in zip is already lazy):
d = [{'name': n, 'age': a, 'gender': g} for n, a, g in zip(name, age, gender)]
Use list comprehension.
In [3]: [{"name":n,"age":a,"gender":g} for n,a,g in zip(name, age, gender)]
Out[3]:
[{'age': 35, 'gender': 'Male', 'name': 'bob'},
{'age': 12, 'gender': 'Female', 'name': 'kate'},
{'age': 57, 'gender': 'Male', 'name': 'john'}]
or,
In [5]: [dict(zip(['name','age','gender'], t)) for t in zip(name, age, gender)]
Out[5]:
[{'age': 35, 'gender': 'Male', 'name': 'bob'},
{'age': 12, 'gender': 'Female', 'name': 'kate'},
{'age': 57, 'gender': 'Male', 'name': 'john'}]
Go for this. It looks up each list's variable name in globals(), so it only works for module-level variables and is fragile if two names refer to the same list:
name = ['bob', 'kate', 'john']
age = [35, 12, 57]
gender = ["Male", "Female", "Male"]
keys = [name, age, gender]  # If there is more data to be added, just change this one place
def get_var_name(var):
    # Return the first global name bound to this exact object
    for k, v in list(globals().items()):
        if v is var:
            return k
d = []
for i in range(len(keys[0])):
    d.append({})
    for key in keys:
        d[i][get_var_name(key)] = key[i]
print(d)
Or use a dict comprehension to avoid the inner loop:
d = []
for i in range(len(name)):
    d.append({get_var_name(key): key[i] for key in keys})
print(d)
To make it a one-liner, combine an inner dict comprehension with an outer list comprehension:
print([{get_var_name(key): key[i] for key in keys} for i in range(len(keys[0]))])

How to sort list like this?

I have a list like this:
li = [
{
'name': 'Lee',
'age': 22
},
{
'name': 'Mike',
'age': 34
},
{
'name': 'John',
'age': 23
}
]
I want to sort the list with the sorted function, ordering by the age key.
How can I achieve this?
Use a key function:
li_sorted = sorted(li, key=lambda x: x['age'])
The Python 3 equivalent of what @kojiro suggests is this:
>>> sorted(li, key=lambda x: sorted(x.items()))
[{'age': 22, 'name': 'Lee'}, {'age': 23, 'name': 'John'}, {'age': 34, 'name': 'Mike'}]
Clearly this is less efficient than sorting on the age key directly:
>>> sorted(li, key=lambda x: x['age'])
[{'age': 22, 'name': 'Lee'}, {'age': 23, 'name': 'John'}, {'age': 34, 'name': 'Mike'}]
Sorting on the key directly also has the advantage that it doesn't rely on the fact that 'age' < 'name'.
Here's how to write the same thing using itemgetter
>>> from operator import itemgetter
>>> sorted(li, key=itemgetter('age'))
[{'age': 22, 'name': 'Lee'}, {'age': 23, 'name': 'John'}, {'age': 34, 'name': 'Mike'}]
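Building on the key-function answers above, here is a small sketch of two common variations: sorting in descending order, and sorting by more than one key. The extra row with a repeated name is made-up sample data added only to show the tie-breaking.

```python
from operator import itemgetter

li = [
    {'name': 'Lee', 'age': 22},
    {'name': 'Mike', 'age': 34},
    {'name': 'John', 'age': 23},
]

# Oldest first: reverse=True flips the order produced by the key.
oldest_first = sorted(li, key=itemgetter('age'), reverse=True)

# itemgetter with several keys returns a tuple, so ties on the first
# key are broken by the second one.
people = li + [{'name': 'Lee', 'age': 19}]  # hypothetical extra row
by_name_then_age = sorted(people, key=itemgetter('name', 'age'))

print([p['name'] for p in oldest_first])
print([(p['name'], p['age']) for p in by_name_then_age])
```

sorted returns a new list and leaves li unchanged; li.sort(key=itemgetter('age')) would sort it in place instead.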
