I have a DF, Sample below:
Group $ Type
1 50 A
1 0 B
1 0 C
2 150 A
2 0 B
2 0 C
What I want to do is populate the $ column with the value associated with Type A, for each group.
Resulting DF will look like the below:
Group $ Type
1 50 A
1 50 B
1 50 C
2 150 A
2 150 B
2 150 C
I have tried various np.where functions but can't seem to get the desired output.
Thanks in advance!
Try groupby with transform('max'):
df['new$'] = df.groupby('Group')['$'].transform('max')
df
Out[371]:
Group $ Type new$
0 1 50 A 50
1 1 0 B 50
2 1 0 C 50
3 2 150 A 150
4 2 0 B 150
5 2 0 C 150
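Note that transform('max') reproduces the desired column here only because the Type A row happens to hold the largest $ in each group. A minimal sketch of a lookup keyed on Type instead, assuming every group has exactly one Type A row (the frame is rebuilt from the sample in the question, and the new$ name follows the answer above):

```python
import pandas as pd

df = pd.DataFrame({
    "Group": [1, 1, 1, 2, 2, 2],
    "$": [50, 0, 0, 150, 0, 0],
    "Type": ["A", "B", "C", "A", "B", "C"],
})

# One "$" value per group, taken from the Type A row.
# Assumption: every group has exactly one Type A row.
a_values = df.loc[df["Type"].eq("A")].set_index("Group")["$"]

# Broadcast that value to every row of the same group.
df["new$"] = df["Group"].map(a_values)
print(df)
```

If a group could contain several Type A rows, the lookup Series would need to be reduced to one value per group first (for example with a groupby), since map needs a unique index.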
I have a dataframe that looks something like this:
Individual Category Amount Extras
A 1 250 30
A 1 300 10
A 1 500 8
A 2 350 12
B 1 200 9
B 2 300 20
B 2 450 15
I want to get a dataframe that looks like this:
Individual Category Count Amount Extras
A 1 3 1050 48
A 2 1 350 12
B 1 1 200 9
B 2 2 750 35
I know that you can use groupby with Pandas, but is it possible to group using count and sum simultaneously?
You could try as follows:
output_df = df.groupby(['Individual','Category']).agg(
Count=('Individual', 'count'),
Amount=('Amount','sum'),
Extras=('Extras','sum')).reset_index(drop=False)
print(output_df)
Individual Category Count Amount Extras
0 A 1 3 1050 48
1 A 2 1 350 12
2 B 1 1 200 9
3 B 2 2 750 35
Here we use df.groupby and then apply named aggregation, which allows us to "[name] output columns when applying multiple aggregation functions to specific columns".
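For reference, a self-contained sketch of the same named aggregation, with the frame rebuilt from the sample in the question. One detail worth knowing: 'count' counts non-null values in the chosen column, while 'size' counts rows. The two agree here because the sample has no missing values, and 'size' is used below only to illustrate the alternative.

```python
import pandas as pd

df = pd.DataFrame({
    "Individual": ["A", "A", "A", "A", "B", "B", "B"],
    "Category": [1, 1, 1, 2, 1, 2, 2],
    "Amount": [250, 300, 500, 350, 200, 300, 450],
    "Extras": [30, 10, 8, 12, 9, 20, 15],
})

output_df = (
    df.groupby(["Individual", "Category"])
      .agg(
          Count=("Amount", "size"),   # number of rows in each group
          Amount=("Amount", "sum"),
          Extras=("Extras", "sum"),
      )
      .reset_index()
)
print(output_df)
```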
I have a pandas dataframe:
A B C   D
1 1 0  32
    1   4
  2 0  43
    1  12
  3 0  58
    1  34
2 1 0  37
    1   5
[..]
where A, B and C are index columns. For every unique combination of A and B, I want to compute (D where C=1) / (D where C=0).
The result should look like this:
A B NEW
1 1 4/32
  2 12/43
  3 34/58
2 1 5/37
[..]
Can you help me?
Use Series.unstack first so that the values of C become columns, then divide column 1 by column 0:
new = df['D'].unstack()
new = new[1].div(new[0]).to_frame('NEW')
print (new)
          NEW
A B
1 1  0.125000
  2  0.279070
  3  0.586207
2 1  0.135135
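The steps above can be tried end to end with a small frame rebuilt from the visible rows of the question. This is only a sketch: the elided rows ([..]) are left out, and A, B and C are set as the index as the question describes.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1, 1, 2, 2],
    "B": [1, 1, 2, 2, 3, 3, 1, 1],
    "C": [0, 1, 0, 1, 0, 1, 0, 1],
    "D": [32, 4, 43, 12, 58, 34, 37, 5],
}).set_index(["A", "B", "C"])

# unstack() moves the innermost index level (C) into columns 0 and 1.
wide = df["D"].unstack()

# D where C == 1 divided by D where C == 0, one row per (A, B) pair.
new = wide[1].div(wide[0]).to_frame("NEW")
print(new)
```

If some (A, B) pair lacks a C=0 or C=1 row, unstack fills the gap with NaN and the ratio for that pair is NaN as well.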
This seems easy, but I couldn't find a working solution for it.
I have a dataframe with 3 columns:
df = pd.DataFrame({'A': [0,0,2,2,2],
'B': [1,1,2,2,3],
'C': [1,1,2,3,4]})
A B C
0 0 1 1
1 0 1 1
2 2 2 2
3 2 2 3
4 2 3 4
I want to select rows based on the values of column A, group them by the values of column B, and finally replace the values of column C with their sum. Something along the lines of this (obviously not working) code:
df[df['A'].isin(['2']), 'C'] = df[df['A'].isin(['2']), 'C'].groupby('B').transform('sum')
desired output for above example is:
A B C
0 0 1 1
1 0 1 1
2 2 2 5
3 2 3 4
I also know how to split the dataframe and do it. I am looking for a solution that does not need split + concat/merge. Thank you.
Is it just this?
s = df['A'].isin([2])
pd.concat((df[s].groupby(['A','B'])['C'].sum().reset_index(),
df[~s])
)
Output:
A B C
0 2 2 5
1 2 3 4
0 0 1 1
1 0 1 1
Update: Without splitting, you can assign a new column indicating special values of A:
(df.sort_values('A')
.assign(D=(~df['A'].isin([2])).cumsum())
.groupby(['D','A','B'])['C'].sum()
.reset_index('D',drop=True)
.reset_index()
)
Output:
A B C
0 0 1 1
1 0 1 1
2 2 2 5
3 2 3 4
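A variant of the same trick that avoids sorting, offered as a sketch rather than as the answer's own method: build a helper key (the name key is hypothetical) that is shared by the selected rows and unique for every other row, then group on it together with A and B. It assumes the index labels are unique and never equal to -1.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 0, 2, 2, 2],
                   "B": [1, 1, 2, 2, 3],
                   "C": [1, 1, 2, 3, 4]})

s = df["A"].isin([2])

# Hypothetical helper key: selected rows share the key -1, every other row
# keeps its own index label and therefore forms a group of its own.
key = df.index.to_series().where(~s, -1).rename("key")

result = (df.groupby([key, "A", "B"], sort=False)["C"].sum()
            .droplevel("key")      # the helper level is no longer needed
            .reset_index())
print(result)
```

With sort=False the groups come out in order of first appearance, so the untouched rows keep their original position.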
I am working with a Pandas DataFrame in Python and trying to print, for each row in my dataset, the list of columns whose value is 1. Assume that each column holds either 0 or 1. E.g.:
id A B C D
0 1 1 1 1
1 0 1 0 1
2 1 1 0 0
3 1 0 0 0
Now, I need my output to be:
id output
0 A,B,C,D
1 B,D
2 A,B
3 A
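Please note that I need a generic function that works regardless of the column names or how many columns there are.
You can do this (assuming id is the index, so that every column is a 0/1 flag):
# multiplying a name by 0 or 1 drops or keeps it; the trailing comma is stripped at the end
df = df.assign(output=df.dot(df.columns + ',').str.rstrip(','))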
df[['output']]
id output
0 A,B,C,D
1 B,D
2 A,B
3 A
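A more explicit sketch of the same idea, for readers who prefer not to rely on string arithmetic: walk each row and join the names of the columns that hold 1. It assumes id is the index, so that every remaining column is a 0/1 flag; flagged_columns is a hypothetical helper name, and a row-wise apply is likely to be slower than a vectorised approach on large frames.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 0, 1, 1], "B": [1, 1, 1, 0], "C": [1, 0, 0, 0], "D": [1, 1, 0, 0]},
    index=pd.Index([0, 1, 2, 3], name="id"),
)

def flagged_columns(row):
    """Return the comma-separated names of the columns whose value is 1."""
    return ",".join(name for name, value in row.items() if value == 1)

# axis=1 passes each row to the helper as a Series indexed by column name.
output = df.apply(flagged_columns, axis=1).rename("output")
print(output)
```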
Hi, I will show what I'm trying to do through an example:
I start with a dataframe like this:
> pd.DataFrame({'A':['a','a','a','c'],'B':[1,1,2,3], 'count':[5,6,1,7]})
A B count
0 a 1 5
1 a 1 6
2 a 2 1
3 c 3 7
I need to find a way to get all the unique combinations of columns A and B and merge the rows that share a combination. The count values of the merged rows should be added together, so the result should look like the following:
A B count
0 a 1 11
1 a 2 1
2 c 3 7
Thanks for any help.
Use groupby and aggregate with sum:
print (df.groupby(['A','B'], as_index=False)['count'].sum())
A B count
0 a 1 11
1 a 2 1
2 c 3 7
print (df.groupby(['A','B'])['count'].sum().reset_index())
A B count
0 a 1 11
1 a 2 1
2 c 3 7
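A self-contained sketch that runs both spellings above on the sample data and compares the resulting frames:

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "a", "a", "c"],
                   "B": [1, 1, 2, 3],
                   "count": [5, 6, 1, 7]})

# Keep the group keys as ordinary columns from the start ...
by_flag = df.groupby(["A", "B"], as_index=False)["count"].sum()

# ... or aggregate first and turn the index back into columns afterwards.
by_reset = df.groupby(["A", "B"])["count"].sum().reset_index()

print(by_flag)
print(by_flag.equals(by_reset))
```

The bracket notation df['count'] matters here: df.count is the DataFrame method, so attribute access would not reach the column.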