Convert pandas dataframe from row to column [duplicate] - python

I have the following data frame:
ID value freq
A a 0.1
A b 0.12
A c 0.19
B a 0.15
B b 0.2
B c 0.09
C a 0.39
C b 0.15
C c 0.01
and I would like to get the following:
ID freq_a freq_b freq_c
A 0.1 0.12 0.19
B 0.15 0.2 0.09
C 0.39 0.15 0.01
Any ideas how to do this easily?

Using pivot:
df.pivot(index='ID', columns='value', values='freq').add_prefix('freq_').reset_index()
Output:
value ID freq_a freq_b freq_c
0 A 0.10 0.12 0.19
1 B 0.15 0.20 0.09
2 C 0.39 0.15 0.01

Use pivot_table:
out = (df.pivot_table(values='freq', index='ID', columns='value')
         .add_prefix('freq_')
         .rename_axis(columns=None)
         .reset_index())
print(out)
# Output
ID freq_a freq_b freq_c
0 A 0.10 0.12 0.19
1 B 0.15 0.20 0.09
2 C 0.39 0.15 0.01
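For anyone who wants to run the two answers above end to end, here is a minimal sketch that rebuilds the sample frame from the question. The names `wide`, `dup` and `agg` are only illustrative, and the extra duplicate row is made up for the demonstration. It also shows one practical difference between the two calls: `pivot` raises a `ValueError` when an (ID, value) pair occurs more than once, whereas `pivot_table` aggregates the duplicates (mean by default).

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "ID":    ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "value": ["a", "b", "c", "a", "b", "c", "a", "b", "c"],
    "freq":  [0.10, 0.12, 0.19, 0.15, 0.20, 0.09, 0.39, 0.15, 0.01],
})

# Long -> wide, with prefixed column names and a flat header
wide = (
    df.pivot(index="ID", columns="value", values="freq")
      .add_prefix("freq_")
      .rename_axis(columns=None)
      .reset_index()
)
print(wide)

# Hypothetical extra row that duplicates the (A, a) pair
extra = pd.DataFrame({"ID": ["A"], "value": ["a"], "freq": [0.30]})
dup = pd.concat([df, extra], ignore_index=True)

# pivot cannot reshape when index/column pairs repeat
try:
    dup.pivot(index="ID", columns="value", values="freq")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the repeated pair instead of failing
agg = dup.pivot_table(values="freq", index="ID", columns="value", aggfunc="mean")
print(agg)
```

If the data is guaranteed to have one row per (ID, value) pair, `pivot` is the stricter choice because it fails loudly on unexpected duplicates.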

Related

Converting all object columns to float except for one column

Let's say in the dataframe df there is:
a b c d
ana 31% 26% 29%
bob 52% 45% 9%
cal 11% 6% 23%
dan 29% 12% 8%
where the data types of a, b, c and d are all object. I want to convert b, c and d to their decimal forms with:
df.columns = df.columns.str.rstrip('%').astype('float') / 100.0
but I don't know how to exclude column a.
Let us use update with to_numeric:
df.update(df.apply(lambda x: pd.to_numeric(x.str.rstrip('%'), errors='coerce')) / 100)
df
Out[128]:
a b c d
0 ana 0.31 0.26 0.29
1 bob 0.52 0.45 0.09
2 cal 0.11 0.06 0.23
3 dan 0.29 0.12 0.08
Use Index.drop to select all columns except a, strip the % sign with DataFrame.replace, convert to floats and divide by 100:
cols = df.columns.drop('a')
df[cols] = df[cols].replace('%', '', regex=True).astype('float') / 100.0
print(df)
a b c d
0 ana 0.31 0.26 0.29
1 bob 0.52 0.45 0.09
2 cal 0.11 0.06 0.23
3 dan 0.29 0.12 0.08
Or you can convert the first column to the index with DataFrame.set_index, so that all remaining columns are processed:
df = df.set_index('a').replace('%', '', regex=True).astype('float') / 100.0
print(df)
b c d
a
ana 0.31 0.26 0.29
bob 0.52 0.45 0.09
cal 0.11 0.06 0.23
dan 0.29 0.12 0.08
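The answers above can be tried on a small, self-contained frame. This is only a sketch under the assumption that every column other than `a` holds strings ending in `%`; `pct_cols` is an illustrative name. It uses `str.rstrip` per column instead of a regex replace, which is one option among several.

```python
import pandas as pd

# Sample data from the question; all columns start out as object dtype
df = pd.DataFrame({
    "a": ["ana", "bob", "cal", "dan"],
    "b": ["31%", "52%", "11%", "29%"],
    "c": ["26%", "45%", "6%", "12%"],
    "d": ["29%", "9%", "23%", "8%"],
})

# Every column except "a" is treated as a percentage column
pct_cols = df.columns.drop("a")

# Strip the trailing "%", convert to float and scale to a fraction
df[pct_cols] = df[pct_cols].apply(lambda s: s.str.rstrip("%").astype(float) / 100)

print(df)
```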

Pandas: compare two columns and get another column with three different values according to a condition

Hello, I have a dataframe with two columns and want to apply some conditions to create another column. The conditions depend on the values in the first or second column.
Example dataframe:
stage 21_A_ex1 21_B_ex2
stage1 0 1
stage2 0.55 0.45
stage3 0.66 0.34
stage4 0.87 0.13
stage5 0.63 0.37
stage6 1 0
stage7 0.95 0.05
stage8 0.97 0.03
stage9 0.02 0.98
My conditions are based on the first value column (21_A_ex1): if the value is > 0.05 and <= 0.95, the new column should be "BOTH"; if it is > 0.95, the new column should be "ex1"; if it is < 0.05, the new column should be "ex2":
0.05 < value <= 0.95 -> BOTH
value > 0.95 -> ex1
value < 0.05 -> ex2
Expected output:
stage 2131_A_ex1 2131_B_ex2 2131
stage1 0 1 ex2
stage2 0.55 0.45 BOTH
stage3 0.66 0.34 BOTH
stage4 0.87 0.13 BOTH
stage5 0.63 0.37 BOTH
stage6 1 0 ex1
stage7 0.95 0.05 BOTH
stage8 0.97 0.03 ex1
stage9 0.02 0.98 ex2
I tried the command below but did not get the expected output. I know it does not cover all the conditions. Can anyone help me add the other conditions?
df['Type'] = df.apply(lambda x: "BOTH" if x["21_A_ex1"] <= 0.95 else "ex1", axis=1)
You should use np.select:
import numpy as np
c1 = (0.05 < df["21_A_ex1"]) & (df["21_A_ex1"] <= 0.95 )
c2 = df["21_A_ex1"] <= 0.05
df['Type'] = np.select([c1, c2], ['BOTH', 'ex2'], 'ex1')
Out[148]:
stage 21_A_ex1 21_B_ex2 Type
0 stage1 0.00 1.00 ex2
1 stage2 0.55 0.45 BOTH
2 stage3 0.66 0.34 BOTH
3 stage4 0.87 0.13 BOTH
4 stage5 0.63 0.37 BOTH
5 stage6 1.00 0.00 ex1
6 stage7 0.95 0.05 BOTH
7 stage8 0.97 0.03 ex1
8 stage9 0.02 0.98 ex2
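Since the three labels correspond to three contiguous value ranges, the same result can also be sketched with `pd.cut`. This is an alternative to the `np.select` answer, not a replacement for it. It assumes the same boundaries as that answer (values up to and including 0.05 map to "ex2", values up to and including 0.95 map to "BOTH"), and the frame below is rebuilt from the question's sample data.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "stage": [f"stage{i}" for i in range(1, 10)],
    "21_A_ex1": [0, 0.55, 0.66, 0.87, 0.63, 1, 0.95, 0.97, 0.02],
    "21_B_ex2": [1, 0.45, 0.34, 0.13, 0.37, 0, 0.05, 0.03, 0.98],
})

# Bins are right-inclusive by default:
# (-inf, 0.05] -> ex2, (0.05, 0.95] -> BOTH, (0.95, inf) -> ex1
df["Type"] = pd.cut(
    df["21_A_ex1"],
    bins=[-np.inf, 0.05, 0.95, np.inf],
    labels=["ex2", "BOTH", "ex1"],
).astype(str)

print(df)
```

`np.select` stays the more general tool when the conditions involve several columns or ranges that are not contiguous.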

Lookup values in cells based on values in another column

I have a pandas dataframe that looks like:
Best_val A B C Value(1 - Best_Val)
A 0.1 0.29 0.3 0.9
B 0.33 0.21 0.45 0.79
A 0.16 0.71 0.56 0.84
C 0.51 0.26 0.85 0.15
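I want to take the value in Best_val for each row, use it as a column name to look up that row's value in the named column, subtract it from 1, and store the result in Value.
Use DataFrame.lookup for performance (note that lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so this only works on older versions):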
df['Value'] = 1 - df.lookup(df.index, df.BestVal)
df
BestVal A B C Value
0 A 0.10 0.29 0.30 0.90
1 B 0.33 0.21 0.45 0.79
2 A 0.16 0.71 0.56 0.84
3 C 0.51 0.26 0.85 0.15
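On pandas versions where `DataFrame.lookup` is not available, a similar vectorized lookup can be sketched with NumPy integer indexing. The names `value_cols`, `col_idx` and `picked` are illustrative, and the sketch assumes every entry in `BestVal` is the name of one of the value columns.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    [["A", 0.10, 0.29, 0.30],
     ["B", 0.33, 0.21, 0.45],
     ["A", 0.16, 0.71, 0.56],
     ["C", 0.51, 0.26, 0.85]],
    columns=["BestVal", "A", "B", "C"],
)

value_cols = ["A", "B", "C"]

# Position of each row's BestVal label among the value columns
col_idx = df[value_cols].columns.get_indexer(df["BestVal"])

# Pick one element per row: row i, column col_idx[i]
picked = df[value_cols].to_numpy()[np.arange(len(df)), col_idx]

df["Value"] = 1 - picked
print(df)
```

`get_indexer` returns -1 for labels it cannot find, so data with unexpected `BestVal` entries should be checked for that before indexing.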
You could use apply:
import pandas as pd
data = [['A', 0.1, 0.29, 0.3],
        ['B', 0.33, 0.21, 0.45],
        ['A', 0.16, 0.71, 0.56],
        ['C', 0.51, 0.26, 0.85]]
df = pd.DataFrame(data=data, columns=['BestVal', 'A', 'B', 'C'])
df['Value'] = df.apply(lambda x: 1 - x[x.BestVal], axis=1)
print(df)
Output
BestVal A B C Value
0 A 0.10 0.29 0.30 0.90
1 B 0.33 0.21 0.45 0.79
2 A 0.16 0.71 0.56 0.84
3 C 0.51 0.26 0.85 0.15

Transpose Fields csv, Python (Numpy or Pandas)

I need to do something like this.
Input:
ID 20170101 20170106 20170111
A 0.31 0.1 0.2
B 0.3 0.2 0.1
C 0.11 0.12 0.13
D 0.3 0.3 0.4
Desired output:
ID DATES NDVI_mean
A 20170101 0.31
A 20170106 0.1
A 20170111 0.2
B 20170101 0.3
B 20170106 0.2
B 20170111 0.1
C 20170101 0.11
C 20170106 0.12
C 20170111 0.13
D 20170101 0.3
D 20170106 0.3
D 20170111 0.4
Description: I have one column with "id" and many date columns, each containing NDVI values. I need to transpose every date into one column named "Dates" and the values for those dates into another column named "NDVI_mean". The ID field must be repeated as many times as there are date columns.
I can't use the arcpy "transpose fields" tool, only free code.
Please, help me.
Thank you
You can use the melt function:
In [1611]: df
Out[1617]:
ID 20170101 20170106 20170111
0 A 0.31 0.10 0.20
1 B 0.30 0.20 0.10
2 C 0.11 0.12 0.13
3 D 0.30 0.30 0.40
In [1613]: pd.melt(df, id_vars='ID', var_name='Date', value_name="NDVI_mean").sort_values('ID')
Out[1614]:
ID Date NDVI_mean
0 A 20170101 0.31
4 A 20170106 0.10
8 A 20170111 0.20
1 B 20170101 0.30
5 B 20170106 0.20
9 B 20170111 0.10
2 C 20170101 0.11
6 C 20170106 0.12
10 C 20170111 0.13
3 D 20170101 0.30
7 D 20170106 0.30
11 D 20170111 0.40
Let me know if it works.
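A self-contained version of the melt answer, as a sketch: the frame is built inline instead of being read from the CSV, the date column names are assumed to be strings in YYYYMMDD form, and the output column is named `DATES` to match the question. Converting that column to real datetimes is an optional extra step, not something the question asks for.

```python
import pandas as pd

# Sample data from the question (date column names kept as strings)
df = pd.DataFrame({
    "ID": ["A", "B", "C", "D"],
    "20170101": [0.31, 0.30, 0.11, 0.30],
    "20170106": [0.10, 0.20, 0.12, 0.30],
    "20170111": [0.20, 0.10, 0.13, 0.40],
})

# Wide -> long: one row per (ID, date) pair
long = (
    pd.melt(df, id_vars="ID", var_name="DATES", value_name="NDVI_mean")
      .sort_values(["ID", "DATES"])
      .reset_index(drop=True)
)

# Optional: parse the YYYYMMDD strings into datetimes
long["DATES"] = pd.to_datetime(long["DATES"], format="%Y%m%d")

print(long)
```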

python pandas change dataframe to pivoted columns

I have a dataframe that looks as following:
Type Month Value
A 1 0.29
A 2 0.90
A 3 0.44
A 4 0.43
B 1 0.29
B 2 0.50
B 3 0.14
B 4 0.07
I want to change the dataframe to following format:
Type A B
1 0.29 0.29
2 0.90 0.50
3 0.44 0.14
4 0.43 0.07
Is this possible ?
Use set_index + unstack:
df.set_index(['Month', 'Type']).Value.unstack()
Type A B
Month
1 0.29 0.29
2 0.90 0.50
3 0.44 0.14
4 0.43 0.07
To match your exact output:
df.set_index(['Month', 'Type']).Value.unstack().rename_axis(None)
Type A B
1 0.29 0.29
2 0.90 0.50
3 0.44 0.14
4 0.43 0.07
Pivot solution:
In [70]: df.pivot(index='Month', columns='Type', values='Value')
Out[70]:
Type A B
Month
1 0.29 0.29
2 0.90 0.50
3 0.44 0.14
4 0.43 0.07
In [71]: df.pivot(index='Month', columns='Type', values='Value').rename_axis(None)
Out[71]:
Type A B
1 0.29 0.29
2 0.90 0.50
3 0.44 0.14
4 0.43 0.07
You have a long-format table that you want to transform to wide format.
This is natively handled in pandas:
df.pivot(index='Month', columns='Type', values='Value')
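To make the long/wide relationship concrete, the sketch below rebuilds the question's frame, reshapes it with both approaches from the answers, and then goes back to long format with `melt`. The names `via_pivot`, `via_unstack` and `back` are illustrative only.

```python
import pandas as pd

# Sample data from the question (long format)
df = pd.DataFrame({
    "Type":  ["A", "A", "A", "A", "B", "B", "B", "B"],
    "Month": [1, 2, 3, 4, 1, 2, 3, 4],
    "Value": [0.29, 0.90, 0.44, 0.43, 0.29, 0.50, 0.14, 0.07],
})

# Long -> wide, two ways
via_pivot = df.pivot(index="Month", columns="Type", values="Value")
via_unstack = df.set_index(["Month", "Type"])["Value"].unstack()
print(via_pivot)

# Wide -> long again, one row per (Month, Type) pair
back = via_pivot.reset_index().melt(
    id_vars="Month", var_name="Type", value_name="Value"
)
print(back)
```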
