Adding an entire column's data below another column in pandas [duplicate] - python

This question already has answers here:
Convert columns into rows with Pandas
(6 answers)
Closed 2 years ago.
I have a dataframe like this:
time a b
0 10 20
1 11 21
Now I need a dataframe like this:
time a
0 10
1 11
0 20
1 21

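This can be done with melt. Recent pandas versions (2.0 and later) raise a ValueError when value_name matches an existing column, so melt into the default value column and rename it afterwards:
df.melt('time').drop(columns='variable').rename(columns={'value': 'a'})
Output:
time a
0 0 10
1 1 11
2 0 20
3 1 21
Or if you have columns other than a,b in your data:
df.melt('time', ['a','b']).drop(columns='variable').rename(columns={'value': 'a'})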
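For reference, here is a minimal runnable sketch of the melt approach on the frame from the question. It assumes the default value column name (value) and renames it at the end, which sidesteps any clash between value_name and the existing column a; the variable name stacked is only illustrative.

```python
import pandas as pd

# Frame reconstructed from the question
df = pd.DataFrame({"time": [0, 1], "a": [10, 11], "b": [20, 21]})

stacked = (
    df.melt(id_vars="time", value_vars=["a", "b"])  # long format: time, variable, value
      .drop(columns="variable")                     # the source column name is not needed
      .rename(columns={"value": "a"})               # call the stacked column 'a'
)
print(stacked)
```

Because melt resets the index by default, the result is numbered 0 to 3 as in the output shown above.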

Related

How to pivot tables with duplicate entries? Order matters [duplicate]

This question already has answers here:
How can I pivot a dataframe?
(5 answers)
Closed 6 months ago.
I have a Pandas DataFrame in Python such as this:
Group Pre/post Value
0 A Pre 3
1 A Pre 5
2 A Post 13
3 A Post 15
4 B Pre 7
5 B Pre 8
6 B Post 17
7 B Post 18
And I'd like to turn it into a different table such as:
Group Pre Post
0 A 3 13
1 A 5 15
2 B 7 17
3 B 8 18
I tried pivoting with df.pivot(index='Group', columns='Pre/post', values='Value'), but since I have repeated values and order is important, it raised an error.
Here is one way to do it: use list as the aggfunc in pivot_table to collect the duplicate values for each index and column pair into a list, then use explode to split those lists into multiple rows.
df.pivot_table(index='Group', columns='Pre/post', values='Value', aggfunc=list
).reset_index().explode(['Post','Pre'], ignore_index=True)
Pre/post Group Post Pre
0 A 13 3
1 A 15 5
2 B 17 7
3 B 18 8
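A different technique from the one in the answer, shown here only as a sketch: number the repeats within each (Group, Pre/post) pair with groupby().cumcount() and include that counter in the pivot index, so every index/column combination is unique. The helper column name n is an assumption, and passing a list as index to pivot needs a reasonably recent pandas.

```python
import pandas as pd

# Frame reconstructed from the question
df = pd.DataFrame({
    "Group": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "Pre/post": ["Pre", "Pre", "Post", "Post", "Pre", "Pre", "Post", "Post"],
    "Value": [3, 5, 13, 15, 7, 8, 17, 18],
})

# n = position of each row within its (Group, Pre/post) pair: 0, 1, ...
numbered = df.assign(n=df.groupby(["Group", "Pre/post"]).cumcount())

wide = (
    numbered.pivot(index=["Group", "n"], columns="Pre/post", values="Value")
            .reset_index()
            .rename_axis(columns=None)     # drop the leftover 'Pre/post' axis label
            [["Group", "Pre", "Post"]]     # keep the requested column order
)
print(wide)
```

Unlike the list-and-explode route, this keeps the value columns numeric rather than object dtype.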

How to add a new pandas column whose value is conditioned on one column, but value depends on other columns? [duplicate]

This question already has answers here:
Pandas conditional creation of a series/dataframe column
(13 answers)
Closed 1 year ago.
I have a dataframe that looks like this:
idx group valA valB
-----------------------
0 A 10 5
1 A 22 7
2 B 9 0
3 B 6 1
I want to add a new column 'val' that takes 'valA' if group = 'A' and takes 'valB' if group = 'B'.
idx group valA valB val
---------------------------
0 A 10 5 10
1 A 22 7 22
2 B 9 0 0
3 B 6 1 1
How can I do this?
This should do the trick
df['val'] = df.apply(lambda x: x['valA'] if x['group'] == 'A' else x['valB'], axis=1)
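The row-wise apply above works, but it calls a Python function once per row. As a hedged alternative (not part of the original answer), numpy.where makes the same choice in a vectorized way; the sketch assumes group only ever holds 'A' or 'B', as in the question.

```python
import numpy as np
import pandas as pd

# Frame reconstructed from the question
df = pd.DataFrame({
    "group": ["A", "A", "B", "B"],
    "valA": [10, 22, 9, 6],
    "valB": [5, 7, 0, 1],
})

# Take valA where group is 'A', otherwise valB
df["val"] = np.where(df["group"] == "A", df["valA"], df["valB"])
print(df)
```

With more than two groups, numpy.select with a list of conditions would be the analogous tool.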

DataFrame values frequency [duplicate]

This question already has answers here:
Count number of values in an entire DataFrame
(3 answers)
Closed 1 year ago.
I have a DataFrame and want to find the frequency of each value across the whole frame.
a b
0 5 7
1 7 8
2 5 7
The result should be like:
5 2
7 3
8 1
Use DataFrame.stack with Series.value_counts and Series.sort_index:
s = df.stack().value_counts().sort_index()
Or DataFrame.melt:
s = df.melt()['value'].value_counts().sort_index()
print (s)
5 2
7 3
8 1
Name: value, dtype: int64
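A self-contained sketch of the stack variant on the question's data, for anyone who wants to run it directly. The name and dtype line printed for the resulting Series can differ between pandas versions, so no exact output is shown here.

```python
import pandas as pd

# Frame reconstructed from the question
df = pd.DataFrame({"a": [5, 7, 5], "b": [7, 8, 7]})

# stack() flattens all columns into one Series; value_counts() then counts
# each distinct value, and sort_index() orders the result by value
s = df.stack().value_counts().sort_index()
print(s)
```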
A simple way is to use a pd.Series and value_counts to count how often each unique value occurs:
import pandas as pd
# creating the series
s = pd.Series(data = [5,10,9,8,8,4,5,9,10,0,1])
# finding the unique count
print(s.value_counts())
Output:
10 2
9 2
8 2
5 2
4 1
1 1
0 1

How to add all top cells in present cell of a column in dataframe [duplicate]

This question already has an answer here:
Cumsum as a new column in an existing Pandas data
(1 answer)
Closed 2 years ago.
For example, assume we have the dataframe below:
Num
0 2
1 4
2 1
3 5
4 3
The expected output, in a new column "sum", should be as below:
Num sum
0 2 2
1 4 6 (2+4)
2 1 7 (2+4+1)
3 5 12 (2+4+1+5)
4 3 15 (2+4+1+5+3)
This can be achieved using cumsum:
df['sum'] = df['Num'].cumsum()
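A minimal runnable version of the cumsum answer, using the numbers from the question:

```python
import pandas as pd

# Frame reconstructed from the question
df = pd.DataFrame({"Num": [2, 4, 1, 5, 3]})

# Each row of 'sum' holds the total of 'Num' from the first row down to that row
df["sum"] = df["Num"].cumsum()
print(df)
```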

Sort Pandas dataframe according to list of column names [duplicate]

This question already has answers here:
How to change the order of DataFrame columns?
(41 answers)
Closed 4 years ago.
I have a pandas dataframe like this-
d = {'class': [0, 1,1,0,1,0], 'A': [0,4,8,1,0,0],'B':[4,1,0,0,3,1],'Z':[0,9,3,1,4,7]}
df = pd.DataFrame(data=d)
A B Z class
0 0 4 0 0
1 4 1 9 1
2 8 0 3 1
3 1 0 1 0
4 0 3 4 1
5 0 1 7 0
and I have a list like this: ['Z','B','class','A']
Now I want to reorder the columns of my dataframe according to this list of column names,
so the new dataframe would have the columns in this order:
Z B class A
Use reindex:
L = ['Z','B','class','A']
df = df.reindex(columns=L)
Or select by subset:
df = df[L]
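Both options can be tried on the question's data with the sketch below. One difference worth knowing: reindex creates an all-NaN column for any name in the list that is missing from the frame, while plain selection raises a KeyError. The names by_reindex and by_selection are only illustrative.

```python
import pandas as pd

d = {"class": [0, 1, 1, 0, 1, 0], "A": [0, 4, 8, 1, 0, 0],
     "B": [4, 1, 0, 0, 3, 1], "Z": [0, 9, 3, 1, 4, 7]}
df = pd.DataFrame(data=d)

L = ["Z", "B", "class", "A"]

by_reindex = df.reindex(columns=L)  # tolerant: unknown names become NaN columns
by_selection = df[L]                # strict: unknown names raise KeyError

print(by_reindex.columns.tolist())
print(by_selection.columns.tolist())
```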
