I have the following data frame:
ID value freq
A a 0.1
A b 0.12
A c 0.19
B a 0.15
B b 0.2
B c 0.09
C a 0.39
C b 0.15
C c 0.01
and I would like to get the following
ID freq_a freq_b freq_c
A 0.1 0.12 0.19
B 0.15 0.2 0.09
C 0.39 0.15 0.01
Any ideas how to easily do this?
Using pivot:
df.pivot(index='ID', columns='value', values='freq').add_prefix('freq_').reset_index()
Output:
value ID freq_a freq_b freq_c
0 A 0.10 0.12 0.19
1 B 0.15 0.20 0.09
2 C 0.39 0.15 0.01
Use pivot_table:
out = df.pivot_table('freq', 'ID', 'value').add_prefix('freq_') \
.rename_axis(columns=None).reset_index()
print(out)
# Output
ID freq_a freq_b freq_c
0 A 0.10 0.12 0.19
1 B 0.15 0.20 0.09
2 C 0.39 0.15 0.01
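For anyone who wants to run both answers end to end, here is a minimal sketch that rebuilds the sample frame from the question and also shows where the two methods differ. The extra duplicate row (`dup`) is hypothetical and only there for illustration: `pivot` raises on duplicate index/column pairs, while `pivot_table` aggregates them (mean by default).

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "ID": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "value": ["a", "b", "c"] * 3,
    "freq": [0.1, 0.12, 0.19, 0.15, 0.2, 0.09, 0.39, 0.15, 0.01],
})

# pivot requires every (ID, value) pair to be unique.
wide = (
    df.pivot(index="ID", columns="value", values="freq")
      .add_prefix("freq_")
      .rename_axis(columns=None)  # drop the leftover "value" label
      .reset_index()
)
print(wide)

# Hypothetical duplicate (ID="A", value="a") row, not part of the question.
extra = pd.DataFrame({"ID": ["A"], "value": ["a"], "freq": [0.3]})
dup = pd.concat([df, extra], ignore_index=True)

try:
    dup.pivot(index="ID", columns="value", values="freq")
    pivot_failed = False
except ValueError:
    pivot_failed = True  # pivot refuses duplicate index/column pairs

# pivot_table aggregates the duplicates instead (mean by default).
agg = dup.pivot_table(values="freq", index="ID", columns="value")
```

If your data can contain duplicates, `pivot_table` is probably the safer choice, but check that averaging is really what you want.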
Let's say in the dataframe df there is:
a b c d
ana 31% 26% 29%
bob 52% 45% 9%
cal 11% 6% 23%
dan 29% 12% 8%
where columns a, b, c and d all have dtype object. I want to convert b, c and d to their decimal forms with:
df.columns = df.columns.str.rstrip('%').astype('float') / 100.0
but I don't know how to exclude column a.
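You can use update together with to_numeric. Column a becomes NaN after coercion, and update skips NaN values, so it is left unchanged:
df.update(df.apply(lambda x: pd.to_numeric(x.str.rstrip('%'), errors='coerce')) / 100)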
df
Out[128]:
a b c d
0 ana 0.31 0.26 0.29
1 bob 0.52 0.45 0.09
2 cal 0.11 0.06 0.23
3 dan 0.29 0.12 0.08
Use Index.drop to select every column except a, strip the % sign with DataFrame.replace, convert to floats and divide by 100:
cols = df.columns.drop('a')
df[cols] = df[cols].replace('%', '', regex=True).astype('float') / 100.0
print (df)
a b c d
0 ana 0.31 0.26 0.29
1 bob 0.52 0.45 0.09
2 cal 0.11 0.06 0.23
3 dan 0.29 0.12 0.08
Or you can move the first column into the index with DataFrame.set_index, so that every remaining column is processed:
df = df.set_index('a').replace('%', '', regex=True).astype('float') / 100.0
print (df)
b c d
a
ana 0.31 0.26 0.29
bob 0.52 0.45 0.09
cal 0.11 0.06 0.23
dan 0.29 0.12 0.08
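Here is a self-contained sketch of the same idea, in case you want something you can paste and run. It assumes every value in b, c and d is a string ending in `%`, as in the question, and it uses `str.rstrip` with `pd.to_numeric` per column instead of a regex replace.

```python
import pandas as pd

# Sample data reconstructed from the question; all columns start as object dtype.
df = pd.DataFrame({
    "a": ["ana", "bob", "cal", "dan"],
    "b": ["31%", "52%", "11%", "29%"],
    "c": ["26%", "45%", "6%", "12%"],
    "d": ["29%", "9%", "23%", "8%"],
})

# Every column except the name column "a".
cols = df.columns.drop("a")

# Strip the trailing "%", parse as numbers, and scale to a fraction.
df[cols] = df[cols].apply(lambda s: pd.to_numeric(s.str.rstrip("%")) / 100)

print(df)
print(df.dtypes)
```

Unlike the update approach, this replaces the columns outright, so b, c and d end up as float64 rather than object.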
Hello, I have a dataframe with two value columns and want to create another column based on conditions on the values in the first (or second) column.
Example dataframe:
stage 21_A_ex1 21_B_ex2
stage1 0 1
stage2 0.55 0.45
stage3 0.66 0.34
stage4 0.87 0.13
stage5 0.63 0.37
stage6 1 0
stage7 0.95 0.05
stage8 0.97 0.03
stage9 0.02 0.98
My conditions on the first column are: if the value is <= 0.95 and > 0.05, the new column should be "BOTH"; if the value is > 0.95, it should be "ex1"; if the value is < 0.05, it should be "ex2".
0.05 < value <= 0.95 -> BOTH
value > 0.95 -> ex1
value < 0.05 -> ex2
Expected output:
stage 2131_A_ex1 2131_B_ex2 2131
stage1 0 1 ex2
stage2 0.55 0.45 BOTH
stage3 0.66 0.34 BOTH
stage4 0.87 0.13 BOTH
stage5 0.63 0.37 BOTH
stage6 1 0 ex1
stage7 0.95 0.05 BOTH
stage8 0.97 0.03 ex1
stage9 0.02 0.98 ex2
I tried the command below, but it does not give the output I want because it only covers one of the conditions. Can anyone help me add the other conditions?
df['Type'] = df.apply(lambda x: "BOTH" if x["21_A_ex1"] <= 0.95 else "ex1", axis=1)
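You should use np.select:
import numpy as np
# 0.05 < value <= 0.95 -> BOTH
c1 = (0.05 < df["21_A_ex1"]) & (df["21_A_ex1"] <= 0.95)
# value <= 0.05 -> ex2 (the boundary value 0.05 itself is also treated as ex2)
c2 = df["21_A_ex1"] <= 0.05
# every remaining row (here, value > 0.95) gets the default, ex1
df['Type'] = np.select([c1, c2], ['BOTH', 'ex2'], 'ex1')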
Out[148]:
stage 21_A_ex1 21_B_ex2 Type
0 stage1 0.00 1.00 ex2
1 stage2 0.55 0.45 BOTH
2 stage3 0.66 0.34 BOTH
3 stage4 0.87 0.13 BOTH
4 stage5 0.63 0.37 BOTH
5 stage6 1.00 0.00 ex1
6 stage7 0.95 0.05 BOTH
7 stage8 0.97 0.03 ex1
8 stage9 0.02 0.98 ex2
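As a runnable sketch of the above, the snippet below rebuilds the sample frame and, as an alternative that is not in the original answer, does the same binning with `pd.cut`. The column name `Type_cut` is just a name chosen for the comparison. With right-closed bins, `pd.cut` treats the edges the same way as the `np.select` conditions (0.05 goes to ex2, 0.95 goes to BOTH).

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "stage": [f"stage{i}" for i in range(1, 10)],
    "21_A_ex1": [0, 0.55, 0.66, 0.87, 0.63, 1, 0.95, 0.97, 0.02],
    "21_B_ex2": [1, 0.45, 0.34, 0.13, 0.37, 0, 0.05, 0.03, 0.98],
})

col = df["21_A_ex1"]

# np.select: first matching condition wins, otherwise the default is used.
conditions = [(col > 0.05) & (col <= 0.95), col <= 0.05]
df["Type"] = np.select(conditions, ["BOTH", "ex2"], default="ex1")

# Alternative: right-closed bins (-inf, 0.05], (0.05, 0.95], (0.95, inf).
df["Type_cut"] = pd.cut(
    col,
    bins=[-np.inf, 0.05, 0.95, np.inf],
    labels=["ex2", "BOTH", "ex1"],
).astype(str)

print(df)
```

`pd.cut` tends to read better once there are more than two or three thresholds, while `np.select` is more flexible when the conditions involve several columns.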
I have a pandas dataframe that looks like:
Best_val A B C Value(1 - Best_Val)
A 0.1 0.29 0.3 0.9
B 0.33 0.21 0.45 0.79
A 0.16 0.71 0.56 0.84
C 0.51 0.26 0.85 0.15
For each row, I want to take the value in Best_val, use it as a column name to look up the value in that column, subtract that value from 1, and store the result in Value.
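Use DataFrame.lookup for performance. Note that lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so this only runs on older versions.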
df['Value'] = 1 - df.lookup(df.index, df.BestVal)
df
BestVal A B C Value
0 A 0.10 0.29 0.30 0.90
1 B 0.33 0.21 0.45 0.79
2 A 0.16 0.71 0.56 0.84
3 C 0.51 0.26 0.85 0.15
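On pandas versions without `DataFrame.lookup`, one possible replacement is to factorize the label column and index into the underlying NumPy array. This is a sketch of that pattern, not the original answer's code; it assumes every label in BestVal is also a column name.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame(
    [["A", 0.1, 0.29, 0.3],
     ["B", 0.33, 0.21, 0.45],
     ["A", 0.16, 0.71, 0.56],
     ["C", 0.51, 0.26, 0.85]],
    columns=["BestVal", "A", "B", "C"],
)

# codes[i] is the position of row i's label within `labels`.
codes, labels = pd.factorize(df["BestVal"])

# Put the value columns in the same order as `labels`, as a plain array.
values = df.reindex(columns=labels).to_numpy()

# Pick one cell per row: (row i, column codes[i]), then subtract from 1.
df["Value"] = 1 - values[np.arange(len(df)), codes]

print(df)
```

Like `lookup`, this stays vectorized, so it should scale better than a row-wise `apply` on large frames.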
You could use apply:
import pandas as pd
data = [['A', 0.1, 0.29, 0.3],
['B', 0.33, 0.21, 0.45],
['A', 0.16, 0.71, 0.56],
['C', 0.51, 0.26, 0.85]]
df = pd.DataFrame(data=data, columns=['BestVal', 'A', 'B', 'C'])
df['Value'] = df.apply(lambda x: 1 - x[x.BestVal], axis=1)
print(df)
Output
BestVal A B C Value
0 A 0.10 0.29 0.30 0.90
1 B 0.33 0.21 0.45 0.79
2 A 0.16 0.71 0.56 0.84
3 C 0.51 0.26 0.85 0.15
I need to do something like this, turning the first table into the second:
ID 20170101 20170106 20170111
A 0.31 0.1 0.2
B 0.3 0.2 0.1
C 0.11 0.12 0.13
D 0.3 0.3 0.4
ID DATES NDVI_mean
A 20170101 0.31
A 20170106 0.1
A 20170111 0.2
B 20170101 0.3
B 20170106 0.2
B 20170111 0.1
C 20170101 0.11
C 20170106 0.12
C 20170111 0.13
D 20170101 0.3
D 20170106 0.3
D 20170111 0.4
Description: I have one column with "id" and many date columns, each containing NDVI values. I need to gather all the date columns into a single column named "Dates", with the corresponding values in another column named "NDVI_mean". The id field must be repeated once for each date column.
I can't use the "transpose fields" tool of arcpy, only free code.
Please, help me.
Thank you
You can use the melt function:
In [1611]: df
Out[1617]:
ID 20170101 20170106 20170111
0 A 0.31 0.10 0.20
1 B 0.30 0.20 0.10
2 C 0.11 0.12 0.13
3 D 0.30 0.30 0.40
In [1613]: pd.melt(df, id_vars='ID', var_name='Date', value_name="NDVI_mean").sort_values('ID')
Out[1614]:
ID Date NDVI_mean
0 A 20170101 0.31
4 A 20170106 0.10
8 A 20170111 0.20
1 B 20170101 0.30
5 B 20170106 0.20
9 B 20170111 0.10
2 C 20170101 0.11
6 C 20170106 0.12
10 C 20170111 0.13
3 D 20170101 0.30
7 D 20170106 0.30
11 D 20170111 0.40
Let me know if it works.
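If it helps, here is the same thing as a self-contained script. It assumes the date column names are strings, uses the DATES name from the question rather than Date, and sorts on both keys with a fresh index; those choices are mine, not part of the answer above.

```python
import pandas as pd

# Sample data reconstructed from the question (date columns as strings).
df = pd.DataFrame({
    "ID": ["A", "B", "C", "D"],
    "20170101": [0.31, 0.3, 0.11, 0.3],
    "20170106": [0.1, 0.2, 0.12, 0.3],
    "20170111": [0.2, 0.1, 0.13, 0.4],
})

# Every column not listed in id_vars is unpivoted into (DATES, NDVI_mean) pairs.
long_df = (
    df.melt(id_vars="ID", var_name="DATES", value_name="NDVI_mean")
      .sort_values(["ID", "DATES"], ignore_index=True)
)

print(long_df)
```

Each ID is repeated once per date column, so four IDs and three dates give twelve rows.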
I have a dataframe that looks as follows:
Type Month Value
A 1 0.29
A 2 0.90
A 3 0.44
A 4 0.43
B 1 0.29
B 2 0.50
B 3 0.14
B 4 0.07
I want to change the dataframe to the following format:
Type A B
1 0.29 0.29
2 0.90 0.50
3 0.44 0.14
4 0.43 0.07
Is this possible?
Use set_index + unstack
df.set_index(['Month', 'Type']).Value.unstack()
Type A B
Month
1 0.29 0.29
2 0.90 0.50
3 0.44 0.14
4 0.43 0.07
To match your exact output
df.set_index(['Month', 'Type']).Value.unstack().rename_axis(None)
Type A B
1 0.29 0.29
2 0.90 0.50
3 0.44 0.14
4 0.43 0.07
Pivot solution:
In [70]: df.pivot(index='Month', columns='Type', values='Value')
Out[70]:
Type A B
Month
1 0.29 0.29
2 0.90 0.50
3 0.44 0.14
4 0.43 0.07
In [71]: df.pivot(index='Month', columns='Type', values='Value').rename_axis(None)
Out[71]:
Type A B
1 0.29 0.29
2 0.90 0.50
3 0.44 0.14
4 0.43 0.07
You have a long-format table that you want to transform into wide format.
pandas handles this natively:
df.pivot(index='Month', columns='Type', values='Value')
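To tie the answers together, here is a small runnable sketch that rebuilds the sample frame and checks that the `pivot` call and the `set_index` + `unstack` approach give the same wide table. The variable names are just for illustration.

```python
import pandas as pd

# Sample long-format data reconstructed from the question.
df = pd.DataFrame({
    "Type": ["A"] * 4 + ["B"] * 4,
    "Month": [1, 2, 3, 4] * 2,
    "Value": [0.29, 0.90, 0.44, 0.43, 0.29, 0.50, 0.14, 0.07],
})

# Long to wide: one row per Month, one column per Type.
wide_pivot = df.pivot(index="Month", columns="Type", values="Value")

# Same reshape spelled out as set_index + unstack.
wide_unstack = df.set_index(["Month", "Type"])["Value"].unstack()

same = wide_pivot.equals(wide_unstack)
print(wide_pivot)
print("identical:", same)
```

Both forms need each (Month, Type) pair to appear only once; if that is not guaranteed, `pivot_table` with an explicit aggregation is worth a look.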