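I'm trying to calculate the slope of the SMA:
from datetime import datetime as dt  # assumed import, so that dt.toordinal resolves
import pandas as pd
from scipy import stats

df.date = pd.to_datetime(df.date)
df['date_ordinal'] = pd.to_datetime(df['date']).map(dt.toordinal)
slope, intercept, r_value, p_value, std_err = stats.linregress(df['date_ordinal'], df['SMA'])
df['slope'] = slope
Why is the slope NaN?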
Dataframe:
date open high low close volume token SMA serial date_ordinal slope
0 2021-07-05 59.15 60.75 58.75 59.85 219009 168456 NaN 26 737976 NaN
1 2021-07-06 59.90 63.90 59.90 61.40 345452 168456 NaN 25 737977 NaN
2 2021-07-07 61.65 62.90 60.70 61.75 120200 168456 NaN 24 737978 NaN
3 2021-07-08 61.00 63.80 61.00 61.85 173059 168456 NaN 23 737979 NaN
4 2021-07-09 62.20 62.60 61.00 61.30 80536 168456 NaN 22 737980 NaN
5 2021-07-12 61.30 65.50 61.30 64.25 433789 168456 NaN 21 737983 NaN
6 2021-07-13 65.05 66.75 65.00 65.80 343672 168456 NaN 20 737984 NaN
7 2021-07-14 66.70 66.70 63.25 64.60 186786 168456 NaN 19 737985 NaN
8 2021-07-15 64.95 66.70 64.00 64.55 267449 168456 NaN 18 737986 NaN
9 2021-07-16 65.00 69.45 63.60 65.15 824427 168456 NaN 17 737987 NaN
10 2021-07-19 65.55 70.00 65.55 67.60 506566 168456 63.463636 16 737990 NaN
11 2021-07-20 68.90 69.15 65.60 66.25 345355 168456 64.045455 15 737991 NaN
12 2021-07-22 67.50 67.90 66.05 66.65 101745 168456 64.522727 14 737993 NaN
13 2021-07-23 67.50 67.55 64.60 65.05 176110 168456 64.822727 13 737994 NaN
14 2021-07-26 65.40 65.80 63.35 63.95 114623 168456 65.013636 12 737997 NaN
15 2021-07-27 64.00 64.90 62.50 62.95 124095 168456 65.163636 11 737998 NaN
16 2021-07-28 63.80 63.80 60.20 62.85 110505 168456 65.036364 10 737999 NaN
17 2021-07-29 63.50 64.50 63.00 64.20 58880 168456 64.890909 9 738000 NaN
18 2021-07-30 64.00 68.65 62.70 66.50 505882 168456 65.063636 8 738001 NaN
19 2021-08-02 66.70 68.40 66.20 66.60 191472 168456 65.250000 7 738004 NaN
20 2021-08-03 67.50 69.90 65.55 67.45 581423 168456 65.459091 6 738005 NaN
21 2021-08-04 68.40 69.05 65.00 65.90 177188 168456 65.304545 5 738006 NaN
22 2021-08-05 66.50 66.50 63.50 63.75 112842 168456 65.077273 4 738007 NaN
23 2021-08-06 64.20 66.60 64.00 66.25 102939 168456 65.040909 3 738008 NaN
24 2021-08-09 67.40 67.40 63.25 63.90 88957 168456 64.936364 2 738011 NaN
25 2021-08-10 65.45 65.45 59.00 60.30 202877 168456 64.604545 1 738012 NaN
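A likely cause, judging from the table above: the first ten rows of SMA are NaN, and stats.linregress returns NaN when its input contains NaN. Below is a minimal sketch, assuming the intended fix is to fit only the rows where SMA is present. The small frame is made-up data, not the original.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Made-up stand-in for the real data: the SMA is NaN for the first rows,
# as it is during a moving-average warm-up period.
df = pd.DataFrame({
    "date": pd.date_range("2021-07-05", periods=8, freq="D"),
    "SMA": [np.nan, np.nan, np.nan, 1.0, 3.0, 5.0, 7.0, 9.0],
})
df["date_ordinal"] = df["date"].map(pd.Timestamp.toordinal)

# NaN in the input propagates into the fitted slope.
raw = stats.linregress(df["date_ordinal"], df["SMA"])

# Keep only the rows where SMA is defined, then fit again.
valid = df.dropna(subset=["SMA"])
fit = stats.linregress(valid["date_ordinal"], valid["SMA"])

print(raw.slope)  # NaN, because SMA contains NaN
print(fit.slope)  # slope in SMA units per day
```

Note that this gives one slope for the whole series. A slope per row would need a rolling fit instead.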
I have a time-series dataset with the amounts of gases released at each time step, as shown below. The data is monitored day to day: the Date column holds the sampling time, and the other columns hold the amount of each released gas.
import pandas as pd
from statistics import mean
import numpy as np
Data = pd.read_csv('PTR 69.csv')
Data.columns = ['Date', 'H2', 'CH4', 'C2H6', 'C2H4', 'C2H2', 'CO', 'CO2', 'O2']
Data.dropna(how='all', axis=1, inplace=True)
Data.head()
It looks like this:
Date H2 CH4 C2H6 C2H4 C2H2 CO CO2 O2
0 2021-04-14 2:00 8.301259 10.889560 7.205929 3.485577 0.108262 318.616211 1659.179688 866.826721
1 2021-04-13 3:00 8.190150 10.224614 7.369829 3.561115 0.130052 318.895599 1641.014526 883.500305
2 2021-04-12 4:00 8.223248 10.297009 7.571199 3.479434 0.113566 315.364594 1636.670776 896.083679
3 2021-04-11 5:00 8.342580 10.233653 7.726023 3.474085 0.234786 316.315277 1641.205078 875.664856
4 2021-04-10 6:00 8.365788 9.825816 7.640978 3.621368 0.320388 320.200409 1658.575806 880.871399
5 2021-04-09 7:00 8.113251 11.198173 7.588203 3.561790 0.200721 318.738922 1651.639038 886.923401
6 2021-04-08 8:00 7.881397 7.967482 7.382273 3.528960 0.180016 315.252838 1625.236328 878.604309
7 2021-04-07 9:00 7.833044 6.773924 7.292545 3.475330 0.401435 317.085449 1628.325562 893.305664
8 2021-04-06 10:00 7.908926 9.419571 7.018494 3.347562 0.406113 317.643768 1620.742554 912.732422
9 2021-04-05 11:00 8.192807 9.262563 7.227449 3.275920 0.133978 312.931152 1601.240845 932.079102
10 2021-04-04 12:00 8.086914 9.480316 6.515196 3.312712 0.000000 315.486816 1609.530884 928.141907
11 2021-04-03 13:00 7.984566 9.406860 6.712120 3.476949 0.336859 312.862793 1596.182495 938.904724
12 2021-04-02 14:00 8.077889 8.335327 7.443592 3.605910 0.416443 315.546539 1605.549438 928.619568
13 2021-04-01 15:00 7.996786 9.087573 7.950811 3.626776 0.745824 311.601471 1608.987183 897.747498
14 2021-03-31 16:00 8.433417 10.078784 6.567528 3.646854 0.682301 313.811615 1619.164673 825.123596
15 2021-03-30 17:00 8.445275 9.768773 7.460344 3.712297 0.353539 314.944672 1606.494751 811.027161
16 2021-03-29 18:00 8.398427 9.607062 7.446943 3.674934 0.287205 314.554596 1599.793823 828.780090
17 2021-03-28 19:00 8.272332 9.678397 7.303371 3.617573 0.430137 311.486664 1590.122192 828.557312
18 2021-03-27 20:00 8.478241 9.364383 7.153194 3.616118 0.548547 314.538849 1578.516235 821.565125
19 2021-03-26 21:00 8.452413 10.828227 6.825691 3.260484 0.642971 314.990082 1561.811890 826.468079
First I converted the [Date] column with [pd.to_datetime] and split it into month and year, as you can see:
Data['Date'] = pd.to_datetime(Data['Date'])
# How long is the dataset
Data['Date'].max() - Data['Date'].min()
Results:
Timedelta('1364 days 12:49:00')
Data['Month'] = Data['Date'].dt.month
Data['Year'] = Data['Date'].dt.year
Data.head()
Then it looks like this:
Date H2 CH4 C2H6 C2H4 C2H2 CO CO2 O2 Month Year
0 2021-04-14 02:00:00 8.301259 10.889560 7.205929 3.485577 0.108262 318.616211 1659.179688 866.826721 4 2021
1 2021-04-13 03:00:00 8.190150 10.224614 7.369829 3.561115 0.130052 318.895599 1641.014526 883.500305 4 2021
2 2021-04-12 04:00:00 8.223248 10.297009 7.571199 3.479434 0.113566 315.364594 1636.670776 896.083679 4 2021
3 2021-04-11 05:00:00 8.342580 10.233653 7.726023 3.474085 0.234786 316.315277 1641.205078 875.664856 4 2021
4 2021-04-10 06:00:00 8.365788 9.825816 7.640978 3.621368 0.320388 320.200409 1658.575806 880.871399 4 2021
5 2021-04-09 07:00:00 8.113251 11.198173 7.588203 3.561790 0.200721 318.738922 1651.639038 886.923401 4 2021
6 2021-04-08 08:00:00 7.881397 7.967482 7.382273 3.528960 0.180016 315.252838 1625.236328 878.604309 4 2021
7 2021-04-07 09:00:00 7.833044 6.773924 7.292545 3.475330 0.401435 317.085449 1628.325562 893.305664 4 2021
8 2021-04-06 10:00:00 7.908926 9.419571 7.018494 3.347562 0.406113 317.643768 1620.742554 912.732422 4 2021
9 2021-04-05 11:00:00 8.192807 9.262563 7.227449 3.275920 0.133978 312.931152 1601.240845 932.079102 4 2021
So, two other columns [Month] and [Year] are added to the data frame.
My question: How can I calculate the rate of change of H2 over a month?
I know that I should first calculate the mean of H2 for each month of each year, since my data is a time series.
Mean_month = Data.set_index('Date').groupby(pd.Grouper(freq = 'M'))['H2'].mean().reset_index()
As before, I converted the date with [pd.to_datetime] and extracted the month and year:
Mean_month['Date'] = pd.to_datetime(Mean_month['Date'])
Mean_month['Month_mean'] = Mean_month['Date'].dt.month
Mean_month['Year_mean'] = Mean_month['Date'].dt.year
Mean_month.head()
It looks like this:
Date H2 CH4 C2H2 C2H4 C2H6 CO CO2 O2 Month_mean Year_mean
0 2017-07-31 0.892207 0.797776 0.572518 0.119328 0.203212 23.137884 230.986328 1756.658813 7 2017
1 2017-08-31 NaN NaN NaN NaN NaN NaN NaN NaN 8 2017
2 2017-09-30 NaN NaN NaN NaN NaN NaN NaN NaN 9 2017
3 2017-10-31 NaN NaN NaN NaN NaN NaN NaN NaN 10 2017
4 2017-11-30 NaN NaN NaN NaN NaN NaN NaN NaN 11 2017
5 2017-12-31 NaN NaN NaN NaN NaN NaN NaN NaN 12 2017
6 2018-01-31 NaN NaN NaN NaN NaN NaN NaN NaN 1 2018
7 2018-02-28 NaN NaN NaN NaN NaN NaN NaN NaN 2 2018
8 2018-03-31 NaN NaN NaN NaN NaN NaN NaN NaN 3 2018
9 2018-04-30 NaN NaN NaN NaN NaN NaN NaN NaN 4 2018
10 2018-05-31 NaN NaN NaN NaN NaN NaN NaN NaN 5 2018
11 2018-06-30 3.376091 1.780959 0.488345 0.431397 1.777461 59.424690 246.135108 2927.244192 6 2018
12 2018-07-31 3.785872 1.710799 0.479277 0.405084 2.416031 63.220747 256.035651 2971.905932 7 2018
13 2018-08-31 3.789915 1.874313 0.444453 0.339609 2.516580 67.629768 264.437564 3016.440033 8 2018
14 2018-09-30 3.882403 1.842717 0.443967 0.342131 2.848867 71.592693 271.972792 3073.598901 9 2018
15 2018-10-31 3.858354 2.037401 0.364234 0.358209 2.651448 75.036622 274.889362 3150.082060 10 2018
16 2018-11-30 3.861638 1.854492 0.276273 0.289241 2.813399 78.563868 289.631986 3176.243186 11 2018
17 2018-12-31 5.029865 2.526096 0.232814 0.510899 3.423260 95.641880 409.359902 2831.721010 12 2018
18 2019-01-31 6.103601 2.528294 0.177558 0.612607 4.039948 116.639744 516.362618 2423.434258 1 2019
19 2019-02-28 7.480646 3.316433 0.239254 0.959470 5.319684 142.571229 662.409360 1877.447767 2 2019
20 2019-03-31 8.363644 3.779225 0.213011 1.171834 6.179431 167.295488 815.904473 1415.431158 3 2019
21 2019-04-30 9.523452 4.620810 0.233048 1.703750 8.359211 195.914846 1044.554593 898.940531 4 2019
22 2019-05-31 10.118435 5.524447 0.311802 1.904199 9.275237 213.531002 1178.495602 657.617859 5 2019
23 2019-06-30 10.283766 6.186843 0.377420 2.165453 10.729356 226.061226 1226.489872 589.417023 6 2019
24 2019-07-31 9.943331 6.648062 0.492584 2.326774 11.791042 234.309877 1257.822071 572.162592 7 2019
25 2019-08-31 9.812387 6.681962 0.510871 2.483979 13.067311 243.440762 1302.643938 568.994610 8 2019
26 2019-09-30 9.661653 7.323367 0.420726 2.628199 13.308826 252.133648 1383.259943 550.533951 9 2019
27 2019-10-31 9.246261 7.644706 0.372446 2.673924 13.880747 257.093790 1407.996110 565.502500 10 2019
28 2019-11-30 8.226894 6.606762 0.411812 2.290050 12.958136 257.590110 1306.817593 654.086494 11 2019
29 2019-12-31 7.985734 7.461197 0.314830 2.417687 13.255049 259.519881 1309.507549 684.085808 12 2019
30 2020-01-31 7.754674 7.804206 0.336518 2.506526 13.554615 262.188585 1312.052006 700.065050 1 2020
31 2020-02-29 7.662918 7.607357 0.283796 2.483387 13.803671 264.348120 1300.252926 710.281917 2 2020
32 2020-03-31 7.602619 8.326974 0.278294 2.629290 13.983202 268.429411 1351.023144 698.012543 3 2020
33 2020-04-30 7.585870 8.028798 0.389348 2.856049 15.635886 273.859451 1426.279447 703.866225 4 2020
34 2020-05-31 7.752543 8.622809 0.329810 2.974434 16.470193 279.636700 1484.100789 685.164897 5 2020
35 2020-06-30 7.935418 8.632543 0.408732 3.410121 18.330232 287.545439 1593.554077 653.294214 6 2020
36 2020-07-31 8.226212 9.180892 0.474289 3.646311 19.746735 295.059049 1688.793476 613.164837 7 2020
37 2020-08-31 8.535027 9.583940 0.517722 3.860195 20.853958 303.025472 1759.655769 597.264223 8 2020
38 2020-09-30 8.782468 9.318198 0.447619 3.780273 21.613501 309.644693 1790.096266 594.891798 9 2020
39 2020-10-31 8.766880 17.531840 0.436720 3.671641 21.794714 312.511920 1783.446248 622.681765 10 2020
40 2020-11-30 8.535022 9.695740 0.427224 3.352291 11.561881 311.624202 1676.413354 713.680609 11 2020
41 2020-12-31 8.374398 9.114723 0.340198 3.351321 6.768138 312.902290 1642.077442 766.767532 12 2020
42 2021-01-31 8.238818 9.373566 0.344173 3.372903 6.670032 313.475182 1604.747685 788.205679 1 2021
43 2021-02-28 8.191080 9.900578 0.334562 3.352319 6.802692 314.076140 1572.294619 815.143081 2 2021
44 2021-03-31 8.317389 9.627182 0.385551 3.209554 5.862067 312.134351 1484.145511 867.169165 3 2021
45 2021-04-30 8.107043 9.457317 0.266317 3.488106 7.331760 316.181560 1627.434300 900.000397 4 2021
Because the [Mean_month] data frame is sorted in ascending order, I re-sorted it in descending order:
Srt_Mean = Mean_month.sort_values(['Date'], ascending=False)
Srt_Mean
The results are:
Date H2 CH4 C2H2 C2H4 C2H6 CO CO2 O2 Month_mean Year_mean
45 2021-04-30 8.107043 9.457317 0.266317 3.488106 7.331760 316.181560 1627.434300 900.000397 4 2021
44 2021-03-31 8.317389 9.627182 0.385551 3.209554 5.862067 312.134351 1484.145511 867.169165 3 2021
43 2021-02-28 8.191080 9.900578 0.334562 3.352319 6.802692 314.076140 1572.294619 815.143081 2 2021
42 2021-01-31 8.238818 9.373566 0.344173 3.372903 6.670032 313.475182 1604.747685 788.205679 1 2021
41 2020-12-31 8.374398 9.114723 0.340198 3.351321 6.768138 312.902290 1642.077442 766.767532 12 2020
40 2020-11-30 8.535022 9.695740 0.427224 3.352291 11.561881 311.624202 1676.413354 713.680609 11 2020
39 2020-10-31 8.766880 17.531840 0.436720 3.671641 21.794714 312.511920 1783.446248 622.681765 10 2020
38 2020-09-30 8.782468 9.318198 0.447619 3.780273 21.613501 309.644693 1790.096266 594.891798 9 2020
37 2020-08-31 8.535027 9.583940 0.517722 3.860195 20.853958 303.025472 1759.655769 597.264223 8 2020
36 2020-07-31 8.226212 9.180892 0.474289 3.646311 19.746735 295.059049 1688.793476 613.164837 7 2020
35 2020-06-30 7.935418 8.632543 0.408732 3.410121 18.330232 287.545439 1593.554077 653.294214 6 2020
34 2020-05-31 7.752543 8.622809 0.329810 2.974434 16.470193 279.636700 1484.100789 685.164897 5 2020
33 2020-04-30 7.585870 8.028798 0.389348 2.856049 15.635886 273.859451 1426.279447 703.866225 4 2020
32 2020-03-31 7.602619 8.326974 0.278294 2.629290 13.983202 268.429411 1351.023144 698.012543 3 2020
31 2020-02-29 7.662918 7.607357 0.283796 2.483387 13.803671 264.348120 1300.252926 710.281917 2 2020
30 2020-01-31 7.754674 7.804206 0.336518 2.506526 13.554615 262.188585 1312.052006 700.065050 1 2020
29 2019-12-31 7.985734 7.461197 0.314830 2.417687 13.255049 259.519881 1309.507549 684.085808 12 2019
28 2019-11-30 8.226894 6.606762 0.411812 2.290050 12.958136 257.590110 1306.817593 654.086494 11 2019
27 2019-10-31 9.246261 7.644706 0.372446 2.673924 13.880747 257.093790 1407.996110 565.502500 10 2019
26 2019-09-30 9.661653 7.323367 0.420726 2.628199 13.308826 252.133648 1383.259943 550.533951 9 2019
25 2019-08-31 9.812387 6.681962 0.510871 2.483979 13.067311 243.440762 1302.643938 568.994610 8 2019
24 2019-07-31 9.943331 6.648062 0.492584 2.326774 11.791042 234.309877 1257.822071 572.162592 7 2019
23 2019-06-30 10.283766 6.186843 0.377420 2.165453 10.729356 226.061226 1226.489872 589.417023 6 2019
22 2019-05-31 10.118435 5.524447 0.311802 1.904199 9.275237 213.531002 1178.495602 657.617859 5 2019
21 2019-04-30 9.523452 4.620810 0.233048 1.703750 8.359211 195.914846 1044.554593 898.940531 4 2019
20 2019-03-31 8.363644 3.779225 0.213011 1.171834 6.179431 167.295488 815.904473 1415.431158 3 2019
19 2019-02-28 7.480646 3.316433 0.239254 0.959470 5.319684 142.571229 662.409360 1877.447767 2 2019
18 2019-01-31 6.103601 2.528294 0.177558 0.612607 4.039948 116.639744 516.362618 2423.434258 1 2019
17 2018-12-31 5.029865 2.526096 0.232814 0.510899 3.423260 95.641880 409.359902 2831.721010 12 2018
16 2018-11-30 3.861638 1.854492 0.276273 0.289241 2.813399 78.563868 289.631986 3176.243186 11 2018
15 2018-10-31 3.858354 2.037401 0.364234 0.358209 2.651448 75.036622 274.889362 3150.082060 10 2018
14 2018-09-30 3.882403 1.842717 0.443967 0.342131 2.848867 71.592693 271.972792 3073.598901 9 2018
13 2018-08-31 3.789915 1.874313 0.444453 0.339609 2.516580 67.629768 264.437564 3016.440033 8 2018
12 2018-07-31 3.785872 1.710799 0.479277 0.405084 2.416031 63.220747 256.035651 2971.905932 7 2018
11 2018-06-30 3.376091 1.780959 0.488345 0.431397 1.777461 59.424690 246.135108 2927.244192 6 2018
10 2018-05-31 NaN NaN NaN NaN NaN NaN NaN NaN 5 2018
9 2018-04-30 NaN NaN NaN NaN NaN NaN NaN NaN 4 2018
8 2018-03-31 NaN NaN NaN NaN NaN NaN NaN NaN 3 2018
7 2018-02-28 NaN NaN NaN NaN NaN NaN NaN NaN 2 2018
6 2018-01-31 NaN NaN NaN NaN NaN NaN NaN NaN 1 2018
5 2017-12-31 NaN NaN NaN NaN NaN NaN NaN NaN 12 2017
4 2017-11-30 NaN NaN NaN NaN NaN NaN NaN NaN 11 2017
3 2017-10-31 NaN NaN NaN NaN NaN NaN NaN NaN 10 2017
2 2017-09-30 NaN NaN NaN NaN NaN NaN NaN NaN 9 2017
1 2017-08-31 NaN NaN NaN NaN NaN NaN NaN NaN 8 2017
0 2017-07-31 0.892207 0.797776 0.572518 0.119328 0.203212 23.137884 230.986328 1756.658813 7 2017
I also set the index on both data frames because, in the end, I want to divide the [H2] column of the first data frame by the [H2] column of the second data frame:
df_Data = Data.set_index(['Month', 'Year'])
df_Data.head(50)
df_Srt_Mean = Srt_Mean.set_index(['Month_mean', 'Year_mean'])
df_Srt_Mean.head(50)
Date H2 CH4 C2H6 C2H4 C2H2 CO CO2 O2
Month Year
4 2021 2021-04-14 02:00:00 8.301259 10.889560 7.205929 3.485577 0.108262 318.616211 1659.179688 866.826721
2021 2021-04-13 03:00:00 8.190150 10.224614 7.369829 3.561115 0.130052 318.895599 1641.014526 883.500305
2021 2021-04-12 04:00:00 8.223248 10.297009 7.571199 3.479434 0.113566 315.364594 1636.670776 896.083679
2021 2021-04-11 05:00:00 8.342580 10.233653 7.726023 3.474085 0.234786 316.315277 1641.205078 875.664856
2021 2021-04-10 06:00:00 8.365788 9.825816 7.640978 3.621368 0.320388 320.200409 1658.575806 880.871399
2021 2021-04-09 07:00:00 8.113251 11.198173 7.588203 3.561790 0.200721 318.738922 1651.639038 886.923401
2021 2021-04-08 08:00:00 7.881397 7.967482 7.382273 3.528960 0.180016 315.252838 1625.236328 878.604309
2021 2021-04-07 09:00:00 7.833044 6.773924 7.292545 3.475330 0.401435 317.085449 1628.325562 893.305664
2021 2021-04-06 10:00:00 7.908926 9.419571 7.018494 3.347562 0.406113 317.643768 1620.742554 912.732422
2021 2021-04-05 11:00:00 8.192807 9.262563 7.227449 3.275920 0.133978 312.931152 1601.240845 932.079102
2021 2021-04-04 12:00:00 8.086914 9.480316 6.515196 3.312712 0.000000 315.486816 1609.530884 928.141907
2021 2021-04-03 13:00:00 7.984566 9.406860 6.712120 3.476949 0.336859 312.862793 1596.182495 938.904724
2021 2021-04-02 14:00:00 8.077889 8.335327 7.443592 3.605910 0.416443 315.546539 1605.549438 928.619568
2021 2021-04-01 15:00:00 7.996786 9.087573 7.950811 3.626776 0.745824 311.601471 1608.987183 897.747498
3 2021 2021-03-31 16:00:00 8.433417 10.078784 6.567528 3.646854 0.682301 313.811615 1619.164673 825.123596
2021 2021-03-30 17:00:00 8.445275 9.768773 7.460344 3.712297 0.353539 314.944672 1606.494751 811.027161
2021 2021-03-29 18:00:00 8.398427 9.607062 7.446943 3.674934 0.287205 314.554596 1599.793823 828.780090
2021 2021-03-28 19:00:00 8.272332 9.678397 7.303371 3.617573 0.430137 311.486664 1590.122192 828.557312
2021 2021-03-27 20:00:00 8.478241 9.364383 7.153194 3.616118 0.548547 314.538849 1578.516235 821.565125
2021 2021-03-26 21:00:00 8.452413 10.828227 6.825691 3.260484 0.642971 314.990082 1561.811890 826.468079
2021 2021-03-25 22:00:00 8.420037 10.468951 6.614395 3.279383 0.442519 314.821197 1538.289673 835.261902
2021 2021-03-24 23:00:00 8.290853 9.943011 5.952219 3.263231 0.077059 313.060883 1498.917969 859.999023
2021 2021-03-24 00:00:00 8.053485 9.717534 5.773523 3.210894 0.477235 309.256561 1461.547974 867.371643
2021 2021-03-23 01:00:00 8.813514 10.700623 5.444063 2.965948 0.421797 312.926971 1437.077026 867.363709
2021 2021-03-22 02:00:00 8.149124 9.727563 4.518490 2.958276 0.368664 311.796661 1420.417358 916.602539
2021 2021-03-21 03:00:00 8.169525 8.859634 5.212233 3.129839 0.416121 312.702301 1419.987427 904.523865
2021 2021-03-20 04:00:00 7.999515 8.994797 5.137753 3.148643 0.475540 307.183685 1420.932739 913.971130
2021 2021-03-19 05:00:00 8.183563 10.373088 4.949068 3.037351 0.584536 312.275482 1440.424683 895.362122
2021 2021-03-18 06:00:00 9.914630 10.722699 4.891720 3.121366 0.364292 312.476959 1446.715210 889.638367
2021 2021-03-17 07:00:00 8.063797 9.449814 4.965353 3.158536 0.332817 307.930389 1443.011108 883.420349
2021 2021-03-16 08:00:00 8.858215 9.454753 5.053194 3.093672 0.249709 313.467071 1456.114624 902.091492
2021 2021-03-15 09:00:00 8.146770 8.423282 5.213614 3.038460 0.228652 312.719238 1443.799438 900.013672
2021 2021-03-14 10:00:00 8.160034 14.032947 5.426914 2.981697 0.391028 313.519440 1459.276245 891.870300
2021 2021-03-13 11:00:00 7.876873 5.985085 5.602545 2.998276 0.607312 311.964203 1447.259399 886.466492
2021 2021-03-12 12:00:00 8.299830 9.434842 5.768423 2.931913 0.374833 312.165375 1450.703979 893.731873
2021 2021-03-11 13:00:00 8.258931 9.164996 5.773973 2.917338 0.367790 312.416412 1447.783203 884.459534
2021 2021-03-10 14:00:00 8.285775 9.396652 5.687450 3.018778 0.367582 312.764160 1452.421875 883.869568
2021 2021-03-09 15:00:00 8.069007 9.174088 5.641685 3.134619 0.282684 307.792206 1445.247192 887.044922
2021 2021-03-08 16:00:00 8.150889 8.341151 5.952223 3.310198 0.276260 310.551758 1453.108765 881.680664
2021 2021-03-07 17:00:00 8.148776 8.571256 5.962189 3.365770 0.321035 311.439789 1450.016235 881.019348
2021 2021-03-06 18:00:00 8.235992 9.840173 5.190016 3.325249 0.390993 313.732513 1476.067505 880.206055
2021 2021-03-05 19:00:00 8.041183 8.705338 6.181820 3.528234 0.299884 308.838959 1456.264038 857.722656
2021 2021-03-04 20:00:00 8.286016 8.883926 5.667931 3.196103 0.350631 314.590729 1479.576538 861.197266
2021 2021-03-03 21:00:00 8.245660 9.066014 5.785030 3.191303 0.378657 313.044281 1479.022095 850.414856
2021 2021-03-02 22:00:00 8.386712 9.401718 6.162895 3.043518 0.363813 312.941315 1493.645142 840.161438
2021 2021-03-01 23:00:00 8.231705 10.864131 6.184435 3.010111 0.217610 309.424164 1501.307983 834.103943
2021 2021-03-01 00:00:00 8.253326 10.673305 5.977970 3.028328 0.349412 310.304413 1501.962891 825.492371
2 2021 2021-02-28 01:00:00 8.313703 10.718976 5.379131 3.017091 0.303016 313.576935 1511.731079 837.980774
2021 2021-02-27 02:00:00 8.315781 10.122794 5.632700 3.183661 0.419333 309.140228 1502.215210 855.478516
2021 2021-02-26 03:00:00 7.974852 10.396459 6.063492 3.239314 0.497979 314.248688 1523.176880 852.766907
Date H2 CH4 C2H2 C2H4 C2H6 CO CO2 O2
Month_mean Year_mean
4 2021 2021-04-30 8.107043 9.457317 0.266317 3.488106 7.331760 316.181560 1627.434300 900.000397
3 2021 2021-03-31 8.317389 9.627182 0.385551 3.209554 5.862067 312.134351 1484.145511 867.169165
2 2021 2021-02-28 8.191080 9.900578 0.334562 3.352319 6.802692 314.076140 1572.294619 815.143081
1 2021 2021-01-31 8.238818 9.373566 0.344173 3.372903 6.670032 313.475182 1604.747685 788.205679
12 2020 2020-12-31 8.374398 9.114723 0.340198 3.351321 6.768138 312.902290 1642.077442 766.767532
11 2020 2020-11-30 8.535022 9.695740 0.427224 3.352291 11.561881 311.624202 1676.413354 713.680609
10 2020 2020-10-31 8.766880 17.531840 0.436720 3.671641 21.794714 312.511920 1783.446248 622.681765
9 2020 2020-09-30 8.782468 9.318198 0.447619 3.780273 21.613501 309.644693 1790.096266 594.891798
8 2020 2020-08-31 8.535027 9.583940 0.517722 3.860195 20.853958 303.025472 1759.655769 597.264223
7 2020 2020-07-31 8.226212 9.180892 0.474289 3.646311 19.746735 295.059049 1688.793476 613.164837
6 2020 2020-06-30 7.935418 8.632543 0.408732 3.410121 18.330232 287.545439 1593.554077 653.294214
5 2020 2020-05-31 7.752543 8.622809 0.329810 2.974434 16.470193 279.636700 1484.100789 685.164897
4 2020 2020-04-30 7.585870 8.028798 0.389348 2.856049 15.635886 273.859451 1426.279447 703.866225
3 2020 2020-03-31 7.602619 8.326974 0.278294 2.629290 13.983202 268.429411 1351.023144 698.012543
2 2020 2020-02-29 7.662918 7.607357 0.283796 2.483387 13.803671 264.348120 1300.252926 710.281917
1 2020 2020-01-31 7.754674 7.804206 0.336518 2.506526 13.554615 262.188585 1312.052006 700.065050
12 2019 2019-12-31 7.985734 7.461197 0.314830 2.417687 13.255049 259.519881 1309.507549 684.085808
11 2019 2019-11-30 8.226894 6.606762 0.411812 2.290050 12.958136 257.590110 1306.817593 654.086494
10 2019 2019-10-31 9.246261 7.644706 0.372446 2.673924 13.880747 257.093790 1407.996110 565.502500
9 2019 2019-09-30 9.661653 7.323367 0.420726 2.628199 13.308826 252.133648 1383.259943 550.533951
8 2019 2019-08-31 9.812387 6.681962 0.510871 2.483979 13.067311 243.440762 1302.643938 568.994610
7 2019 2019-07-31 9.943331 6.648062 0.492584 2.326774 11.791042 234.309877 1257.822071 572.162592
6 2019 2019-06-30 10.283766 6.186843 0.377420 2.165453 10.729356 226.061226 1226.489872 589.417023
5 2019 2019-05-31 10.118435 5.524447 0.311802 1.904199 9.275237 213.531002 1178.495602 657.617859
4 2019 2019-04-30 9.523452 4.620810 0.233048 1.703750 8.359211 195.914846 1044.554593 898.940531
3 2019 2019-03-31 8.363644 3.779225 0.213011 1.171834 6.179431 167.295488 815.904473 1415.431158
2 2019 2019-02-28 7.480646 3.316433 0.239254 0.959470 5.319684 142.571229 662.409360 1877.447767
1 2019 2019-01-31 6.103601 2.528294 0.177558 0.612607 4.039948 116.639744 516.362618 2423.434258
12 2018 2018-12-31 5.029865 2.526096 0.232814 0.510899 3.423260 95.641880 409.359902 2831.721010
11 2018 2018-11-30 3.861638 1.854492 0.276273 0.289241 2.813399 78.563868 289.631986 3176.243186
10 2018 2018-10-31 3.858354 2.037401 0.364234 0.358209 2.651448 75.036622 274.889362 3150.082060
9 2018 2018-09-30 3.882403 1.842717 0.443967 0.342131 2.848867 71.592693 271.972792 3073.598901
8 2018 2018-08-31 3.789915 1.874313 0.444453 0.339609 2.516580 67.629768 264.437564 3016.440033
7 2018 2018-07-31 3.785872 1.710799 0.479277 0.405084 2.416031 63.220747 256.035651 2971.905932
6 2018 2018-06-30 3.376091 1.780959 0.488345 0.431397 1.777461 59.424690 246.135108 2927.244192
5 2018 2018-05-31 NaN NaN NaN NaN NaN NaN NaN NaN
4 2018 2018-04-30 NaN NaN NaN NaN NaN NaN NaN NaN
3 2018 2018-03-31 NaN NaN NaN NaN NaN NaN NaN NaN
2 2018 2018-02-28 NaN NaN NaN NaN NaN NaN NaN NaN
1 2018 2018-01-31 NaN NaN NaN NaN NaN NaN NaN NaN
12 2017 2017-12-31 NaN NaN NaN NaN NaN NaN NaN NaN
11 2017 2017-11-30 NaN NaN NaN NaN NaN NaN NaN NaN
10 2017 2017-10-31 NaN NaN NaN NaN NaN NaN NaN NaN
9 2017 2017-09-30 NaN NaN NaN NaN NaN NaN NaN NaN
8 2017 2017-08-31 NaN NaN NaN NaN NaN NaN NaN NaN
7 2017 2017-07-31 0.892207 0.797776 0.572518 0.119328 0.203212 23.137884 230.986328 1756.658813
Now I have one mean for each month of each year. How can I divide the H2 column of the first data frame by this column, which holds one number per month? For example:
April 2021 has 30 days and one mean,
May 2021 has 31 days and one mean.
The division should be performed based on the index of these two data frames.
I would really appreciate it if you could help me find a solution.
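One possible approach, sketched under the assumption that each daily H2 value should be divided by the mean of its own calendar month: groupby with transform puts the monthly mean on every daily row, so no second data frame or index matching is needed. The frame and the names H2_month_mean and H2_ratio are made up for illustration.

```python
import pandas as pd

# Made-up daily readings spanning two months (not the original data).
data = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2021-03-30", "2021-03-31", "2021-04-01", "2021-04-02"]
    ),
    "H2": [8.0, 10.0, 6.0, 2.0],
})

# One label per calendar month (e.g. 2021-03), so the same month in
# different years is never mixed together.
month = data["Date"].dt.to_period("M")

# transform("mean") returns the monthly mean repeated on every row of
# that month, so it lines up with the daily values.
data["H2_month_mean"] = data.groupby(month)["H2"].transform("mean")
data["H2_ratio"] = data["H2"] / data["H2_month_mean"]

print(data)
```

A ratio above 1 means the reading is above that month's average, and a ratio below 1 means it is under it.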
I have a dataset
df
Time Spot Ubalance
0 2017-01-01T00:00:00+01:00 20.96 NaN
1 2017-01-01T01:00:00+01:00 20.90 29.40
2 2017-01-01T02:00:00+01:00 18.13 24.73
3 2017-01-01T03:00:00+01:00 16.03 24.73
4 2017-01-01T04:00:00+01:00 16.43 27.89
5 2017-01-01T05:00:00+01:00 13.75 28.26
6 2017-01-01T06:00:00+01:00 11.10 30.43
7 2017-01-01T07:00:00+01:00 15.47 32.85
8 2017-01-01T08:00:00+01:00 16.88 33.91
9 2017-01-01T09:00:00+01:00 21.81 28.58
10 2017-01-01T10:00:00+01:00 26.24 28.58
I want to generate a series/dataframe holding, for each row, the difference between the highest and the lowest value found in the last n rows across multiple columns. For example, for these "last" 10 rows the result would be
33.91 (the highest, here in "Ubalance") - 11.10 (the lowest, in "Spot") = 22.81
I've tried .rolling(), but it apparently has no difference method.
Expected outcome:
Time Spot Ubalance Diff
0 2017-01-01T00:00:00+01:00 20.96 NaN NaN
1 2017-01-01T01:00:00+01:00 20.90 29.40 NaN
2 2017-01-01T02:00:00+01:00 18.13 24.73 NaN
3 2017-01-01T03:00:00+01:00 16.03 24.73 NaN
4 2017-01-01T04:00:00+01:00 16.43 27.89 NaN
5 2017-01-01T05:00:00+01:00 13.75 28.26 NaN
6 2017-01-01T06:00:00+01:00 11.10 30.43 NaN
7 2017-01-01T07:00:00+01:00 15.47 32.85 NaN
8 2017-01-01T08:00:00+01:00 16.88 33.91 NaN
9 2017-01-01T09:00:00+01:00 21.81 28.58 NaN
10 2017-01-01T10:00:00+01:00 26.24 28.58 22.81
Use Rolling.aggregate and then subtract:
df1 = df['Spot'].rolling(10).agg(['min','max'])
print (df1)
min max
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 NaN NaN
6 NaN NaN
7 NaN NaN
8 NaN NaN
9 11.1 21.81
10 11.1 26.24
df['dif'] = df1['max'].sub(df1['min'])
print (df)
Time Spot Ubalance dif
0 2017-01-01T00:00:00+01:00 20.96 NaN NaN
1 2017-01-01T01:00:00+01:00 20.90 29.40 NaN
2 2017-01-01T02:00:00+01:00 18.13 24.73 NaN
3 2017-01-01T03:00:00+01:00 16.03 24.73 NaN
4 2017-01-01T04:00:00+01:00 16.43 27.89 NaN
5 2017-01-01T05:00:00+01:00 13.75 28.26 NaN
6 2017-01-01T06:00:00+01:00 11.10 30.43 NaN
7 2017-01-01T07:00:00+01:00 15.47 32.85 NaN
8 2017-01-01T08:00:00+01:00 16.88 33.91 NaN
9 2017-01-01T09:00:00+01:00 21.81 28.58 10.71
10 2017-01-01T10:00:00+01:00 26.24 28.58 15.14
Or use a custom function with a lambda:
df['diff'] = df['Spot'].rolling(10).agg(lambda x: x.max() - x.min())
EDIT:
To process all columns in a list, use:
cols = ['Spot','Ubalance']
N = 10
df['dif'] = (df[cols].stack(dropna=False)
.rolling(len(cols) * N)
.agg(lambda x: x.max() - x.min())
.groupby(level=0)
.max())
print (df)
Time Spot Ubalance dif
0 2017-01-01T00:00:00+01:00 20.96 NaN NaN
1 2017-01-01T01:00:00+01:00 20.90 29.40 NaN
2 2017-01-01T02:00:00+01:00 18.13 24.73 NaN
3 2017-01-01T03:00:00+01:00 16.03 24.73 NaN
4 2017-01-01T04:00:00+01:00 16.43 27.89 NaN
5 2017-01-01T05:00:00+01:00 13.75 28.26 NaN
6 2017-01-01T06:00:00+01:00 11.10 30.43 NaN
7 2017-01-01T07:00:00+01:00 15.47 32.85 NaN
8 2017-01-01T08:00:00+01:00 16.88 33.91 NaN
9 2017-01-01T09:00:00+01:00 21.81 28.58 NaN
10 2017-01-01T10:00:00+01:00 26.24 28.58 22.81
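A possible alternative without stacking, sketched on the sample rows from the question: take the rolling max and min per column, then reduce across the columns. Passing skipna=False is a deliberate choice here so that a row stays NaN until every column has a full window, which matches the expected outcome.

```python
import numpy as np
import pandas as pd

# Sample values copied from the question (Time column omitted).
df = pd.DataFrame({
    "Spot": [20.96, 20.90, 18.13, 16.03, 16.43, 13.75,
             11.10, 15.47, 16.88, 21.81, 26.24],
    "Ubalance": [np.nan, 29.40, 24.73, 24.73, 27.89, 28.26,
                 30.43, 32.85, 33.91, 28.58, 28.58],
})
cols = ["Spot", "Ubalance"]
N = 10

roll = df[cols].rolling(N)
# Per-column rolling extremes, then the extreme across columns.
# skipna=False keeps the result NaN until every column has a full window.
highest = roll.max().max(axis=1, skipna=False)
lowest = roll.min().min(axis=1, skipna=False)
df["dif"] = highest - lowest

print(df)
```

Only the last row has ten valid values in both columns, so it is the only row with a non-NaN result.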
You could use a rolling window like this:
n = 10
df[['Spot', 'Ubalance']].rolling(n).apply(lambda x: x.max() - x.min())
Select the columns you want before calling rolling; the lambda is then applied to each selected column separately.
I'm trying to concatenate a Series onto the right side of a dataframe with the column name 'RSI'. However, because the Series is of shorter length than the other columns in the dataframe, I need to ensure that NaN values are appended to the top of the column and not the bottom. Right now, I've used the following code but I can't find an argument that would allow me to have the desired output.
RSI = pd.Series(RSI)
df = pd.concat((df, RSI.rename('RSI')), axis='columns')
So far, this is my output:
Dates Prices Volumes RSI
0 2013-02-08 201.68 2893254 47.7357
1 2013-02-11 200.16 2944651 53.3967
2 2013-02-12 200.04 2461779 56.3866
3 2013-02-13 200.09 2169757 60.1845
4 2013-02-14 199.65 3294126 62.1784
5 2013-02-15 200.98 3627887 63.9720
6 2013-02-19 200.32 2998317 62.9671
7 2013-02-20 199.31 3715311 63.9232
8 2013-02-21 198.33 3923051 66.8817
9 2013-02-22 201.09 3107876 72.8258
10 2013-02-25 197.51 3845276 69.6578
11 2013-02-26 199.14 3391562 63.8458
12 2013-02-27 202.33 4185545 64.2776
13 2013-02-28 200.83 4689698 67.2445
14 2013-03-01 202.91 3308544 58.2408
15 2013-03-04 205.19 3693365 57.7058
16 2013-03-05 206.53 3807706 53.7482
17 2013-03-06 208.38 3594899 57.5396
18 2013-03-07 209.42 3884317 53.2722
19 2013-03-08 210.38 3700086 58.6824
20 2013-03-11 210.08 3048901 56.0161
21 2013-03-12 210.55 3591261 60.2066
22 2013-03-13 212.06 3355969 55.3322
23 2013-03-14 215.80 5505484 51.7492
24 2013-03-15 214.92 7935024 47.1241
25 2013-03-18 213.21 3006125 46.9102
26 2013-03-19 213.44 3198577 46.6569
27 2013-03-20 215.06 3019153 54.0822
28 2013-03-21 212.26 5830566 56.2525
29 2013-03-22 212.08 3015847 51.8359
... ... ... ... ...
1229 2017-12-26 152.83 2479017 80.1930
1230 2017-12-27 153.13 2149257 80.7444
1231 2017-12-28 154.04 2687624 56.4425
1232 2017-12-29 153.42 3327087 56.9183
1233 2018-01-02 154.25 4202503 63.6958
1234 2018-01-03 158.49 9441567 61.1962
1235 2018-01-04 161.70 7556249 61.3816
1236 2018-01-05 162.49 5195764 64.7724
1237 2018-01-08 163.47 5237523 63.0508
1238 2018-01-09 163.83 4341810 53.9559
1239 2018-01-10 164.18 4174105 54.1351
1240 2018-01-11 164.20 3794453 50.6824
1241 2018-01-12 163.14 5031886 43.0222
1242 2018-01-16 163.85 7794195 32.7428
1243 2018-01-17 168.65 11710033 39.4754
1244 2018-01-18 169.12 14259345 37.3409
1245 2018-01-19 162.37 21172488 NaN
1246 2018-01-22 162.60 8480795 NaN
1247 2018-01-23 166.25 7466232 NaN
1248 2018-01-24 165.37 5645003 NaN
1249 2018-01-25 165.47 3302520 NaN
1250 2018-01-26 167.34 3787913 NaN
1251 2018-01-29 166.80 3516995 NaN
1252 2018-01-30 163.62 4902341 NaN
1253 2018-01-31 163.70 4072830 NaN
1254 2018-02-01 162.40 4434242 NaN
1255 2018-02-02 159.03 5251938 NaN
1256 2018-02-05 152.53 8746599 NaN
1257 2018-02-06 155.34 9867678 NaN
1258 2018-02-07 153.85 6149207 NaN
However, I need it to look like this:
Dates Prices Volumes RSI
0 2013-02-08 201.68 2893254 NaN
1 2013-02-11 200.16 2944651 NaN
2 2013-02-12 200.04 2461779 NaN
3 2013-02-13 200.09 2169757 NaN
4 2013-02-14 199.65 3294126 NaN
5 2013-02-15 200.98 3627887 NaN
6 2013-02-19 200.32 2998317 NaN
7 2013-02-20 199.31 3715311 NaN
8 2013-02-21 198.33 3923051 NaN
9 2013-02-22 201.09 3107876 NaN
10 2013-02-25 197.51 3845276 NaN
11 2013-02-26 199.14 3391562 NaN
12 2013-02-27 202.33 4185545 NaN
13 2013-02-28 200.83 4689698 NaN
14 2013-03-01 202.91 3308544 NaN
15 2013-03-04 205.19 3693365 57.7058
16 2013-03-05 206.53 3807706 53.7482
17 2013-03-06 208.38 3594899 57.5396
18 2013-03-07 209.42 3884317 53.2722
19 2013-03-08 210.38 3700086 58.6824
20 2013-03-11 210.08 3048901 56.0161
21 2013-03-12 210.55 3591261 60.2066
22 2013-03-13 212.06 3355969 55.3322
23 2013-03-14 215.80 5505484 51.7492
24 2013-03-15 214.92 7935024 47.1241
25 2013-03-18 213.21 3006125 46.9102
26 2013-03-19 213.44 3198577 46.6569
27 2013-03-20 215.06 3019153 54.0822
28 2013-03-21 212.26 5830566 56.2525
29 2013-03-22 212.08 3015847 51.8359
... ... ... ... ...
1229 2017-12-26 152.83 2479017 80.1930
1230 2017-12-27 153.13 2149257 80.7444
1231 2017-12-28 154.04 2687624 56.4425
1232 2017-12-29 153.42 3327087 56.9183
1233 2018-01-02 154.25 4202503 63.6958
1234 2018-01-03 158.49 9441567 61.1962
1235 2018-01-04 161.70 7556249 61.3816
1236 2018-01-05 162.49 5195764 64.7724
1237 2018-01-08 163.47 5237523 63.0508
1238 2018-01-09 163.83 4341810 53.9559
1239 2018-01-10 164.18 4174105 54.1351
1240 2018-01-11 164.20 3794453 50.6824
1241 2018-01-12 163.14 5031886 43.0222
1242 2018-01-16 163.85 7794195 32.7428
1243 2018-01-17 168.65 11710033 39.4754
1244 2018-01-18 169.12 14259345 36.9999
1245 2018-01-19 162.37 21172488 41.1297
1246 2018-01-22 162.60 8480795 12.1231
1247 2018-01-23 166.25 7466232 39.0977
1248 2018-01-24 165.37 5645003 63.6958
1249 2018-01-25 165.47 3302520 56.4425
1250 2018-01-26 167.34 3787913 80.7444
1251 2018-01-29 166.80 3516995 61.1962
1252 2018-01-30 163.62 4902341 58.6824
1253 2018-01-31 163.70 4072830 53.7482
1254 2018-02-01 162.40 4434242 43.0222
1255 2018-02-02 159.03 5251938 61.1962
1256 2018-02-05 152.53 8746599 56.4425
1257 2018-02-06 155.34 9867678 36.0978
1258 2018-02-07 153.85 6149207 41.1311
Thanks for the help.
Another way is to realign the index of the rsi Series so that it matches the df index from the bottom up (I use only rows 0 to 13 of your sample for the demo):
size_diff = df.index.size - rsi.index.size
rsi.index = df.index[size_diff:]
pd.concat([df, rsi], axis=1)
Out[1490]:
Dates Prices Volumes RSI
0 2013-02-08 201.68 2893254 NaN
1 2013-02-11 200.16 2944651 NaN
2 2013-02-12 200.04 2461779 NaN
3 2013-02-13 200.09 2169757 NaN
4 2013-02-14 199.65 3294126 NaN
5 2013-02-15 200.98 3627887 47.7357
6 2013-02-19 200.32 2998317 53.3967
7 2013-02-20 199.31 3715311 56.3866
8 2013-02-21 198.33 3923051 60.1845
9 2013-02-22 201.09 3107876 62.1784
10 2013-02-25 197.51 3845276 63.9720
11 2013-02-26 199.14 3391562 62.9671
12 2013-02-27 202.33 4185545 63.9232
13 2013-02-28 200.83 4689698 66.8817
Try shifting the column down by the number of missing values, and assign the result back:
df["RSI"] = df["RSI"].shift(len(df) - len(df["RSI"].dropna()))
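A minimal sketch of why the shift works, assuming the RSI values sit at the top of the column and the NaN padding sits at the bottom. The toy frame and its values are made up for illustration; only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Toy frame (made-up values): RSI values at the top, NaN padding at the bottom.
df = pd.DataFrame({
    "Prices": [201.68, 200.16, 200.04, 200.09, 199.65],
    "RSI": [57.7, 53.7, 57.5, np.nan, np.nan],
})

# The number of missing values is the number of rows to push the column down by.
n_missing = int(df["RSI"].isna().sum())

# shift() returns a new Series, so assign it back to keep the result.
df["RSI"] = df["RSI"].shift(n_missing)
print(df)
```

Note that shift() discards whatever falls off the bottom, so this only gives the intended result when the trailing rows really are the NaN padding.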
First, get the difference in row count between the DataFrame and the Series.
Then build that many NaN values with np.repeat and prepend them to the Series.
Finally, concatenate the padded Series with your original DataFrame along axis=1 (columns).
diff = df.shape[0] - RSI.shape[0]
rpts = np.repeat(np.nan, diff)  # np.NaN was removed in NumPy 2.0; np.nan works everywhere
RSI = pd.concat([pd.Series(rpts, name='RSI'), RSI], ignore_index=True)
pd.concat([df, RSI['RSI']], axis=1).head(20)
Dates Prices Volumes RSI
0 2013-02-08 201.68 2893254 NaN
1 2013-02-11 200.16 2944651 NaN
2 2013-02-12 200.04 2461779 NaN
3 2013-02-13 200.09 2169757 NaN
4 2013-02-14 199.65 3294126 NaN
5 2013-02-15 200.98 3627887 NaN
6 2013-02-19 200.32 2998317 NaN
7 2013-02-20 199.31 3715311 NaN
8 2013-02-21 198.33 3923051 NaN
9 2013-02-22 201.09 3107876 NaN
10 2013-02-25 197.51 3845276 NaN
11 2013-02-26 199.14 3391562 NaN
12 2013-02-27 202.33 4185545 NaN
13 2013-02-28 200.83 4689698 47.7357
14 2013-03-01 202.91 3308544 53.3967
15 2013-03-04 205.19 3693365 56.3866
16 2013-03-05 206.53 3807706 60.1845
17 2013-03-06 208.38 3594899 62.1784
18 2013-03-07 209.42 3884317 63.9720
19 2013-03-08 210.38 3700086 62.9671
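The same padding idea can be sketched for the case where the RSI values are held in a plain Series that is shorter than the DataFrame. The names `rsi` and `padded` and the toy values are assumptions for illustration, not taken from the question.

```python
import numpy as np
import pandas as pd

# Toy data (made-up values): a 5-row frame and a 3-value RSI Series.
df = pd.DataFrame({"Prices": [201.68, 200.16, 200.04, 200.09, 199.65]})
rsi = pd.Series([47.7, 53.4, 56.4], name="RSI")

# How many rows the Series is short by.
diff = len(df) - len(rsi)

# Put the NaN block on top, then reuse df's index so the rows line up.
padded = pd.Series(
    np.concatenate([np.full(diff, np.nan), rsi.to_numpy()]),
    index=df.index,
    name="RSI",
)

out = pd.concat([df, padded], axis=1)
print(out)
```

Building the padded Series on df's own index avoids relying on both objects happening to have a default 0..n-1 index.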
I have a dataset like the one below:
date =
2012-01-01 NaN NaN NaN
2012-01-02 NaN NaN NaN
2012-01-03 NaN NaN NaN
2012-01-04 0.880 2.981 -0.0179
2012-01-05 0.857 2.958 -0.0261
2012-01-06 0.858 2.959 0.0012
2012-01-07 NaN NaN NaN
2012-01-08 NaN NaN NaN
2012-01-09 0.880 2.981 0.0256
2012-01-10 0.905 3.006 0.0284
2012-01-11 0.905 3.006 0.0000
2012-01-12 0.902 3.003 -0.0033
2012-01-13 0.880 2.981 -0.0244
2012-01-14 NaN NaN NaN
2012-01-15 NaN NaN NaN
2012-01-16 0.858 2.959 -0.0250
2012-01-17 0.891 2.992 0.0385
2012-01-18 0.878 2.979 -0.0146
2012-01-19 0.887 2.988 0.0103
2012-01-20 0.899 3.000 0.0135
2012-01-21 NaN NaN NaN
2012-01-22 NaN NaN NaN
2012-01-23 NaN NaN NaN
2012-01-24 NaN NaN NaN
2012-01-25 NaN NaN NaN
2012-01-26 NaN NaN NaN
2012-01-27 NaN NaN NaN
2012-01-28 NaN NaN NaN
2012-01-29 NaN NaN NaN
2012-01-30 0.892 2.993 -0.0078
... ... ... ...
2016-12-02 1.116 3.417 -0.0124
2016-12-03 NaN NaN NaN
2016-12-04 NaN NaN NaN
2016-12-05 1.111 3.412 -0.0045
2016-12-06 1.111 3.412 0.0000
2016-12-07 1.120 3.421 0.0081
2016-12-08 1.113 3.414 -0.0063
2016-12-09 1.109 3.410 -0.0036
2016-12-10 NaN NaN NaN
2016-12-11 NaN NaN NaN
2016-12-12 1.072 3.373 -0.0334
2016-12-13 1.075 3.376 0.0028
2016-12-14 1.069 3.370 -0.0056
2016-12-15 1.069 3.370 0.0000
2016-12-16 1.073 3.374 0.0037
2016-12-17 NaN NaN NaN
2016-12-18 NaN NaN NaN
2016-12-19 1.071 3.372 -0.0019
2016-12-20 1.067 3.368 -0.0037
2016-12-21 1.076 3.377 0.0084
2016-12-22 1.076 3.377 0.0000
2016-12-23 1.066 3.367 -0.0093
2016-12-24 NaN NaN NaN
2016-12-25 NaN NaN NaN
2016-12-26 1.041 3.372 0.0047
2016-12-27 1.042 3.373 0.0010
2016-12-28 1.038 3.369 -0.0038
2016-12-29 1.035 3.366 -0.0029
2016-12-30 1.038 3.369 0.0029
2016-12-31 1.038 3.369 0.0000
When I do:
in_range_df = Days_Count_Sum["2012-01-01":"2016-12-31"]
print("In range: ",in_range_df)
Week_Count = in_range_df.groupby(in_range_df.index.week)
print("in_range_df.index.week: ",in_range_df.index.week)
print("Group by Week: ",Week_Count.sum())
The result always contains only the groups 1 to 53 (weeks).
Printing in_range_df.index.week gives: [52 1 1 ..., 52 52 52]
I noticed that the printed values appear to be "52" after the first year of the range (2012).
How can I group by week over a range that spans more than one year?
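One possible approach, sketched here under assumptions: week numbers alone repeat every year, so rows from different years fall into the same group. Grouping by year and week together, or resampling into weekly bins, keeps the years apart. The frame below is a hypothetical stand-in for Days_Count_Sum with made-up values. Note also that DatetimeIndex.week was deprecated in pandas 1.1 and removed in 2.0, so the sketch uses isocalendar() instead.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data spanning two years (values are made up).
idx = pd.date_range("2012-01-01", "2013-12-31", freq="D")
df = pd.DataFrame({"value": np.ones(len(idx))}, index=idx)

# Option 1: group by ISO year and ISO week, so week 1 of 2012 and
# week 1 of 2013 land in separate groups.
iso = df.index.isocalendar()  # DataFrame with columns: year, week, day
by_year_week = df.groupby([iso["year"], iso["week"]]).sum()

# Option 2: resample into weekly bins (with "W", each bin ends on a Sunday).
weekly = df.resample("W").sum()

print(by_year_week.head())
print(weekly.head())
```

With ISO calendars the first and last days of a year can belong to a week of the neighbouring year (2012-01-01 falls in ISO year 2011), which is why the sketch groups on the ISO year rather than on index.year.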
I set up a new data frame SimMean:
columns = ['Tenor','5x16', '7x8', '2x16H']
index = range(0,12)
SimMean = pd.DataFrame(index=index, columns=columns)
SimMean
Tenor 5x16 7x8 2x16H
0 NaN NaN NaN NaN
1 NaN NaN NaN NaN
2 NaN NaN NaN NaN
3 NaN NaN NaN NaN
4 NaN NaN NaN NaN
5 NaN NaN NaN NaN
6 NaN NaN NaN NaN
7 NaN NaN NaN NaN
8 NaN NaN NaN NaN
9 NaN NaN NaN NaN
10 NaN NaN NaN NaN
11 NaN NaN NaN NaN
I have another data frame FwdDf:
FwdDf
Tenor 5x16 7x8 2x16H
0 2017-01-01 50.94 34.36 43.64
1 2017-02-01 50.90 32.60 42.68
2 2017-03-01 42.66 26.26 37.26
3 2017-04-01 37.08 22.65 32.46
4 2017-05-01 42.21 20.94 33.28
5 2017-06-01 39.30 22.05 32.29
6 2017-07-01 50.90 21.80 38.51
7 2017-08-01 42.77 23.64 35.07
8 2017-09-01 37.45 19.61 32.68
9 2017-10-01 37.55 21.75 32.10
10 2017-11-01 35.61 22.73 32.90
11 2017-12-01 40.16 29.79 37.49
12 2018-01-01 53.45 36.09 47.61
13 2018-02-01 52.89 35.74 45.00
14 2018-03-01 44.67 27.79 38.62
15 2018-04-01 38.48 24.21 34.43
16 2018-05-01 43.87 22.17 34.69
17 2018-06-01 40.24 22.85 34.31
18 2018-07-01 49.98 23.58 39.96
19 2018-08-01 45.57 24.76 37.23
20 2018-09-01 38.90 21.74 34.22
21 2018-10-01 39.75 23.36 35.20
22 2018-11-01 38.04 24.20 34.62
23 2018-12-01 42.68 31.03 40.00
Now I need to assign the 'Tenor' data from rows 12 to 23 of FwdDf to the new data frame SimMean.
I used
SimMean.loc[0:11,'Tenor'] = FwdDf.loc[12:23,'Tenor']
but it didn't work:
SimMean
Tenor 5x16 7x8 2x16H
0 None NaN NaN NaN
1 None NaN NaN NaN
2 None NaN NaN NaN
3 None NaN NaN NaN
4 None NaN NaN NaN
5 None NaN NaN NaN
6 None NaN NaN NaN
7 None NaN NaN NaN
8 None NaN NaN NaN
9 None NaN NaN NaN
10 None NaN NaN NaN
11 None NaN NaN NaN
I'm new to Python. I would appreciate your help. Thanks.
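Assignment with .loc aligns the right-hand Series on index labels. Rows 12 to 23 of FwdDf share no labels with rows 0 to 11 of SimMean, so nothing is copied. Call .values to drop the index and assign by position: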
In [35]:
SimMean.loc[0:11,'Tenor'] = FwdDf.loc[12:23,'Tenor'].values
SimMean
Out[35]:
Tenor 5x16 7x8 2x16H
0 2018-01-01 NaN NaN NaN
1 2018-02-01 NaN NaN NaN
2 2018-03-01 NaN NaN NaN
3 2018-04-01 NaN NaN NaN
4 2018-05-01 NaN NaN NaN
5 2018-06-01 NaN NaN NaN
6 2018-07-01 NaN NaN NaN
7 2018-08-01 NaN NaN NaN
8 2018-09-01 NaN NaN NaN
9 2018-10-01 NaN NaN NaN
10 2018-11-01 NaN NaN NaN
11 2018-12-01 NaN NaN NaN
EDIT
Since your column actually holds datetimes, you need to convert the dtype again:
In [46]:
SimMean['Tenor'] = pd.to_datetime(SimMean['Tenor'])
SimMean
Out[46]:
Tenor 5x16 7x8 2x16H
0 2018-01-01 NaN NaN NaN
1 2018-02-01 NaN NaN NaN
2 2018-03-01 NaN NaN NaN
3 2018-04-01 NaN NaN NaN
4 2018-05-01 NaN NaN NaN
5 2018-06-01 NaN NaN NaN
6 2018-07-01 NaN NaN NaN
7 2018-08-01 NaN NaN NaN
8 2018-09-01 NaN NaN NaN
9 2018-10-01 NaN NaN NaN
10 2018-11-01 NaN NaN NaN
11 2018-12-01 NaN NaN NaN
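To make the alignment behaviour concrete, here is a small sketch with hypothetical frames `src` and `dst` standing in for FwdDf and SimMean. It shows that aligning labels 12 to 23 onto an index of 0 to 11 yields only missing values, while a positional assignment copies the data. It uses to_numpy(), which serves the same purpose as .values here.

```python
import pandas as pd

# Hypothetical stand-ins for FwdDf and SimMean.
src = pd.DataFrame({"Tenor": pd.date_range("2017-01-01", periods=24, freq="MS")})
dst = pd.DataFrame(index=range(12), columns=["Tenor", "5x16"])

# Labels 12..23 of the source do not exist in dst's index 0..11, so
# aligning the slice onto dst's index leaves nothing but missing values.
aligned = src.loc[12:23, "Tenor"].reindex(dst.index)
print(aligned.isna().all())

# Dropping the index makes the assignment positional.
dst["Tenor"] = src.loc[12:23, "Tenor"].to_numpy()
print(dst["Tenor"].dtype)
print(dst.head())
```

Assigning the whole column at once, as done here, also lets the column take the datetime dtype directly, so a separate pd.to_datetime call is not needed in this variant.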