I am trying to work on this requirement where I need to increment the date in weeks, here is the below code for the same:
import pandas as pd
import numpy as np
import datetime  # needed for the strptime call below
c=15
s={'week':[1,2,3,4,5,6,7,8],'Sales':[10,20,30,40,50,60,70,80]}
p=pd.DataFrame(data=s)
p['week'] = p['week'].apply(
    lambda x: datetime.datetime.strptime(f'2021-{x:02}-1', '%Y-%U-%u')
)
Output: the week column becomes weekly Mondays from 2021-01-04 through 2021-02-22.
How can I increment from the last row of the week column to get the next 15 weeks?
Basically, the desired output of week starts from 2021-03-01 and runs for the next 14 weeks.
One option is to use date_range to generate additional dates, then use set_index + reindex to append them:
# 8 existing weeks + 14 new ones
p = (p.set_index('week')
       .reindex(pd.date_range('2021-01-04', periods=8 + 14, freq='W-MON'))
       .rename_axis('week')
       .reset_index())
Output:
week Sales
0 2021-01-04 10.0
1 2021-01-11 20.0
2 2021-01-18 30.0
3 2021-01-25 40.0
4 2021-02-01 50.0
5 2021-02-08 60.0
6 2021-02-15 70.0
7 2021-02-22 80.0
8 2021-03-01 NaN
9 2021-03-08 NaN
10 2021-03-15 NaN
11 2021-03-22 NaN
12 2021-03-29 NaN
13 2021-04-05 NaN
14 2021-04-12 NaN
15 2021-04-19 NaN
16 2021-04-26 NaN
17 2021-05-03 NaN
18 2021-05-10 NaN
19 2021-05-17 NaN
20 2021-05-24 NaN
21 2021-05-31 NaN
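If you would rather not hard-code the start date, here is a minimal sketch (my variation, assuming p['week'] already holds datetimes from the strptime step and that 14 extra weeks are wanted):
# derive the same index from the data itself instead of spelling out '2021-01-04'
full_idx = pd.date_range(p['week'].iloc[0], periods=len(p) + 14, freq='W-MON')
p = p.set_index('week').reindex(full_idx).rename_axis('week').reset_index()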
You can control the length of the week list with the range() function and your variable c, but you must also check the length of sales, which has to end up with the same number of elements:
import pandas as pd
import numpy as np
import datetime
c=15
weeks = list(range(1, c+1))
sales = [10,20,30,40,50,60,70,80]
pad = max(len(weeks) - len(sales), 0)  # pad Sales with None so both lists match in length
s = {'week': weeks, 'Sales': sales + [None] * pad}
p = pd.DataFrame(data=s)
p['week'] = p['week'].apply(
    lambda x: datetime.datetime.strptime(f'2021-{x:02}-1', '%Y-%U-%u')
)
print(p)
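As a quick sanity check of what print(p) should show (my addition, relying on %U/%u mapping week 1's Monday to 2021-01-04, consistent with the outputs above):
assert p['week'].iloc[0] == datetime.datetime(2021, 1, 4)    # week 1
assert p['week'].iloc[-1] == datetime.datetime(2021, 4, 12)  # week 15, 14 Mondays later
assert p['Sales'].isna().sum() == c - len(sales)             # 7 padded rows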
Another option is pd.DateOffset:
# append c-1 = 14 new weekly rows after the last existing week
p = pd.concat(
    [p, pd.DataFrame({'week': [p.iloc[-1, 0] + pd.DateOffset(weeks=i) for i in range(1, c)]})],
    ignore_index=True,
)
>>> p
week Sales
0 2021-01-04 10.0
1 2021-01-11 20.0
2 2021-01-18 30.0
3 2021-01-25 40.0
4 2021-02-01 50.0
5 2021-02-08 60.0
6 2021-02-15 70.0
7 2021-02-22 80.0
8 2021-03-01 NaN
9 2021-03-08 NaN
10 2021-03-15 NaN
11 2021-03-22 NaN
12 2021-03-29 NaN
13 2021-04-05 NaN
14 2021-04-12 NaN
15 2021-04-19 NaN
16 2021-04-26 NaN
17 2021-05-03 NaN
18 2021-05-10 NaN
19 2021-05-17 NaN
20 2021-05-24 NaN
21 2021-05-31 NaN
I have two dataframes
dt AAPL AMC AMZN ASO ATH ... SPCE SRNE TH TSLA VIAC WKHS
0 2021-04-12 36 28 6 20 1 ... 5 0 0 50 23 0
1 2021-04-13 46 15 5 16 6 ... 5 0 0 122 12 1
2 2021-04-14 12 4 1 5 2 ... 2 0 0 39 1 0
3 2021-04-15 30 23 3 14 2 ... 15 0 0 101 9 0
dt AAPL AMC AMZN ASO ATH ... SPCE SRNE TH TSLA VIAC WKHS
0 2021-04-12 41 28 4 33 10 ... 5 0 0 56 14 3
1 2021-04-13 76 22 7 12 29 ... 4 0 0 134 8 2
2 2021-04-14 21 15 2 7 16 ... 2 0 0 61 3 0
3 2021-04-15 54 43 9 2 31 ... 16 0 0 83 13 1
I want to remove numbers from the two dataframes that are lower than 10. If a value is deleted from one dataframe, the same cell should be removed in the other dataframe, and the same applies the other way around.
Appreciate your help
Use a mask:
## pre-requisite
df1 = df1.set_index('dt')
df2 = df2.set_index('dt')
## processing
mask = df1.lt(10) | df2.lt(10)
df1 = df1.mask(mask)
df2 = df2.mask(mask)
Output:
>>> df1
AAPL AMC AMZN ASO ATH SPCE SRNE TH TSLA VIAC WKHS
dt
2021-04-12 36 28.0 NaN 20.0 NaN NaN NaN NaN 50 23.0 NaN
2021-04-13 46 15.0 NaN 16.0 NaN NaN NaN NaN 122 NaN NaN
2021-04-14 12 NaN NaN NaN NaN NaN NaN NaN 39 NaN NaN
2021-04-15 30 23.0 NaN NaN NaN 15.0 NaN NaN 101 NaN NaN
>>> df2
AAPL AMC AMZN ASO ATH SPCE SRNE TH TSLA VIAC WKHS
dt
2021-04-12 41 28.0 NaN 33.0 NaN NaN NaN NaN 56 14.0 NaN
2021-04-13 76 22.0 NaN 12.0 NaN NaN NaN NaN 134 NaN NaN
2021-04-14 21 NaN NaN NaN NaN NaN NaN NaN 61 NaN NaN
2021-04-15 54 43.0 NaN NaN NaN 16.0 NaN NaN 83 NaN NaN
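For context (my note, not part of the original answer): DataFrame.mask replaces values where the boolean condition is True, with NaN by default, which is why the same cells vanish from both frames. A minimal sketch on a hypothetical two-column frame:
import pandas as pd

small = pd.DataFrame({'a': [5, 12], 'b': [30, 7]})
print(small.mask(small.lt(10)))  # values below 10 become NaN
#       a     b
# 0   NaN  30.0
# 1  12.0   NaN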
I want to calculate the daily log rate of return for Optionvalue, but only for the first 252 days in the data. I'm getting KeyError: 'log return'.
import pandas as pd
import numpy as np
EUR = pd.read_csv('C:eurpln_d.csv', sep = ",", parse_dates=['Date'])
USD = pd.read_csv('C:usdpln_d.csv', sep = ",", parse_dates=['Date'])
w_1 = 0.5
w_2 = 1-w_1
EUR.merge(USD, on="Date")
EUR["Optionvalue"] = EUR["Close"]*w_1 + EUR["Close"]*w_2
So what I would like to have is the log return, but only on the first 252 days (which is to say I need to take only the first 252 occurrences in this daily log return calculation): log(y_t) - log(y_{t-1}). I've tried to use the below.
EUR['log return'].iloc[0:252]= np.log(EUR["Optionvalue"]) - np.log(EUR["Optionvalue"].iloc[0])
Is my "np.log(EUR["Optionvalue"].iloc[0]" correctly taking previous value when calculating log return?
How can I limit data so I can calculate daily log return based only on first 252 dates? Above .iloc[0:252] seems to not work..Please help!
Here is a small example; a few notes first:
iloc[0] will just give you the first row of something, not the previous value. You can use shift(1) (shown below) to get the previous value.
When taking the previous value, the first item will be NaN, since there is no previous value for the first row. You can use fillna to provide an "artificial" value (1 in the example below).
Note that the first value in the last column is therefore artificial. Remove the fillna to keep this value NaN.
You should use iloc on an existing column. You can initialize a new column with a fixed value (e.g. -1, as below).
You can remove the prev column below and use its value directly in the last assignment, if desired.
import pandas as pd
import numpy as np
from datetime import datetime
d = {'date': [datetime(2020, 5, d) for d in range(1, 30)],
'current': [x for x in range(1, 30)]}
df = pd.DataFrame(data=d)
df['prev'] = df.shift(1).fillna(1)['current']
df['logdiff'] = -1
df['logdiff'].iloc[0:20] = np.log(df['current']) - np.log(df['prev'])
print(df)
date current prev logdiff
0 2020-05-01 1 1.0 0.000000
1 2020-05-02 2 1.0 0.693147
2 2020-05-03 3 2.0 0.405465
3 2020-05-04 4 3.0 0.287682
4 2020-05-05 5 4.0 0.223144
5 2020-05-06 6 5.0 0.182322
6 2020-05-07 7 6.0 0.154151
7 2020-05-08 8 7.0 0.133531
8 2020-05-09 9 8.0 0.117783
9 2020-05-10 10 9.0 0.105361
10 2020-05-11 11 10.0 0.095310
11 2020-05-12 12 11.0 0.087011
12 2020-05-13 13 12.0 0.080043
13 2020-05-14 14 13.0 0.074108
14 2020-05-15 15 14.0 0.068993
15 2020-05-16 16 15.0 0.064539
16 2020-05-17 17 16.0 0.060625
17 2020-05-18 18 17.0 0.057158
18 2020-05-19 19 18.0 0.054067
19 2020-05-20 20 19.0 0.051293
20 2020-05-21 21 20.0 -1.000000
21 2020-05-22 22 21.0 -1.000000
22 2020-05-23 23 22.0 -1.000000
23 2020-05-24 24 23.0 -1.000000
24 2020-05-25 25 24.0 -1.000000
25 2020-05-26 26 25.0 -1.000000
26 2020-05-27 27 26.0 -1.000000
27 2020-05-28 28 27.0 -1.000000
28 2020-05-29 29 28.0 -1.000000
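Applied back to the original EUR frame, a sketch (my adaptation, assuming the column names from the question and at least 252 rows; not tested against the asker's data):
EUR['log return'] = np.nan  # create the column first to avoid the KeyError
EUR.loc[:251, 'log return'] = np.log(EUR['Optionvalue']) - np.log(EUR['Optionvalue'].shift(1))  # rows 0..251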
I have a tricky data manipulation question. Basically, I have a list of dates. On each day, there is a count of how many issues are open. I want to create a new column, ideal_issues_left, that uses np.linspace to calculate the ideal number of issues left, if they are all to be completed at a steady rate each day to zero at the end of the date range.
I have managed to create a dataframe of the estimates per day from each starting point, but what I want to do now is fill the ideal_issues_left column with the estimates based on the following logic:
If the number of open issues differs from the previous day, fill ideal_issues_left with the first column from the estimates dataframe.
If the number of open issues is the same as the previous day, fill ideal_issues_left with data from columns 1 onward, until a new open_issues value is reached.
For example, say this is the date range and open issues:
import pandas as pd
import numpy as np  # needed for np.linspace below
chart_data = pd.DataFrame({
'date': pd.date_range('2018-08-19', '2018-09-01', freq='d'),
'open_issues': [23.0, 25.0, 26.0, 26.0, 28.0, 36.0, 33.0, 39.0, 39.0, 38.0, 38.0, 38.0, 38.0, 38.0]
})
chart_data
date open_issues
0 2018-08-19 23.0
1 2018-08-20 25.0
2 2018-08-21 26.0
3 2018-08-22 26.0
4 2018-08-23 28.0
5 2018-08-24 36.0
6 2018-08-25 33.0
7 2018-08-26 39.0
8 2018-08-27 39.0
9 2018-08-28 38.0
10 2018-08-29 38.0
11 2018-08-30 38.0
12 2018-08-31 38.0
13 2018-09-01 38.0
p = []
for day, val in enumerate(chart_data.loc[:, 'open_issues']):
    days_left = 14 - day
    p.append(np.linspace(start=val, stop=0, num=days_left))
estimates = pd.DataFrame(p)
estimates
0 1 2 3 4 5 6 7 8 9 10 11 12 13
0 23.0 21.230769 19.461538 17.692308 15.923077 14.153846 12.384615 10.615385 8.846154 7.076923 5.307692 3.538462 1.769231 0.0
1 25.0 22.916667 20.833333 18.750000 16.666667 14.583333 12.500000 10.416667 8.333333 6.250000 4.166667 2.083333 0.000000 NaN
2 26.0 23.636364 21.272727 18.909091 16.545455 14.181818 11.818182 9.454545 7.090909 4.727273 2.363636 0.000000 NaN NaN
3 26.0 23.400000 20.800000 18.200000 15.600000 13.000000 10.400000 7.800000 5.200000 2.600000 0.000000 NaN NaN NaN
4 28.0 24.888889 21.777778 18.666667 15.555556 12.444444 9.333333 6.222222 3.111111 0.000000 NaN NaN NaN NaN
5 36.0 31.500000 27.000000 22.500000 18.000000 13.500000 9.000000 4.500000 0.000000 NaN NaN NaN NaN NaN
6 33.0 28.285714 23.571429 18.857143 14.142857 9.428571 4.714286 0.000000 NaN NaN NaN NaN NaN NaN
7 39.0 32.500000 26.000000 19.500000 13.000000 6.500000 0.000000 NaN NaN NaN NaN NaN NaN NaN
8 39.0 31.200000 23.400000 15.600000 7.800000 0.000000 NaN NaN NaN NaN NaN NaN NaN NaN
9 38.0 28.500000 19.000000 9.500000 0.000000 NaN NaN NaN NaN NaN NaN NaN NaN NaN
10 38.0 25.333333 12.666667 0.000000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
11 38.0 19.000000 0.000000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
12 38.0 0.000000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
13 38.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
The desired end result should be:
chart_data
date open_issues ideal_issues_left
0 2018-08-19 23.0 23.0
1 2018-08-20 25.0 25.0
2 2018-08-21 26.0 26.0 # <- this value is from estimates row 2 col 0
3 2018-08-22 26.0 23.6 # <- this value is from estimates row 2 col 1
4 2018-08-23 28.0 28.0
5 2018-08-24 36.0 36.0
6 2018-08-25 33.0 33.0
7 2018-08-26 39.0 39.0 # <- this value is from estimates row 7 col 0
8 2018-08-27 39.0 32.5 # <- this value is from estimates row 7 col 1
9 2018-08-28 38.0 38.0 # <- this value is from estimates row 9 col 0
10 2018-08-29 38.0 28.5 # <- this value is from estimates row 9 col 1
11 2018-08-30 38.0 19.0 # <- this value is from estimates row 9 col 2
12 2018-08-31 38.0 9.5 # <- this value is from estimates row 9 col 3
13 2018-09-01 38.0 0.0 # <- this value is from estimates row 9 col 4
Thank you!
If consecutive days have the same number of open issues, a cumulative count within each run of equal values is computed (groupby on the change points + cumcount). That count is then used to pick the matching column from the estimates dataframe for that open_issues value.
# identify runs of equal open_issues, then count the position within each run
runs = (chart_data['open_issues'] != chart_data['open_issues'].shift()).cumsum()
chart_data['flg'] = chart_data['open_issues'].groupby(runs).cumcount()
chart_data
date open_issues flg
0 2018-08-19 23.0 0
1 2018-08-20 25.0 0
2 2018-08-21 26.0 0
3 2018-08-22 26.0 1
4 2018-08-23 28.0 0
5 2018-08-24 36.0 0
6 2018-08-25 33.0 0
7 2018-08-26 39.0 0
8 2018-08-27 39.0 1
9 2018-08-28 38.0 0
10 2018-08-29 38.0 1
11 2018-08-30 38.0 2
12 2018-08-31 38.0 3
13 2018-09-01 38.0 4
for i, issues in enumerate(chart_data['open_issues']):
    k = chart_data.loc[i, 'flg']             # position within the current run
    df = estimates[estimates[0] == issues]   # estimates rows that start at this issue count
    l = df.iloc[:1, k].values                # k-th column of the first matching row
    chart_data.loc[i, 'ideal_issues_left'] = l
chart_data
date open_issues flg ideal_issues_left
0 2018-08-19 23.0 0 23.000000
1 2018-08-20 25.0 0 25.000000
2 2018-08-21 26.0 0 26.000000
3 2018-08-22 26.0 1 23.636364
4 2018-08-23 28.0 0 28.000000
5 2018-08-24 36.0 0 36.000000
6 2018-08-25 33.0 0 33.000000
7 2018-08-26 39.0 0 39.000000
8 2018-08-27 39.0 1 32.500000
9 2018-08-28 38.0 0 38.000000
10 2018-08-29 38.0 1 28.500000
11 2018-08-30 38.0 2 19.000000
12 2018-08-31 38.0 3 9.500000
13 2018-09-01 38.0 4 0.000000
If your dataset is large and you want to avoid looping, you can use merge instead.
chart_data["prev_day_open_issues"] = chart_data["open_issues"].shift(1)
chart_data["no match"] = chart_data["open_issues"] != chart_data["prev_day_open_issues"]
# same idea as in r-beginners code
chart_data["ideal_pos"] = (chart_data["open_issues"]
.groupby(chart_data["no match"].cumsum())
.cumcount())
# tidy up and remove temp columns
new_chart_data = chart_data[["date", "open_issues", "ideal_pos"]]
# make your estimates dataframe into a one-to-one lookup in long format
estimates["open_issues"] = estimates[0]
new_estimates = (estimates
.drop_duplicates(subset=["open_issues"])
.melt(id_vars="open_issues", var_name="ideal_pos",
value_name="ideal_issues_left"))
# join
final = new_chart_data.merge(new_estimates, how="left", on=["open_issues", "ideal_pos"])
print(final[["date", "open_issues", "ideal_issues_left"]])
date open_issues ideal_issues_left
0 2018-08-19 23.0 23.000000
1 2018-08-20 25.0 25.000000
2 2018-08-21 26.0 26.000000
3 2018-08-22 26.0 23.636364
4 2018-08-23 28.0 28.000000
5 2018-08-24 36.0 36.000000
6 2018-08-25 33.0 33.000000
7 2018-08-26 39.0 39.000000
8 2018-08-27 39.0 32.500000
9 2018-08-28 38.0 38.000000
10 2018-08-29 38.0 28.500000
11 2018-08-30 38.0 19.000000
12 2018-08-31 38.0 9.500000
13 2018-09-01 38.0 0.000000
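One detail worth noting (my observation): the left merge preserves chart_data's row order and row count only because the melted lookup table is unique on the join keys. A quick guard for that assumption:
assert not new_estimates.duplicated(subset=["open_issues", "ideal_pos"]).any()
assert len(final) == len(new_chart_data)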
I have a dataframe like so:
df = pd.DataFrame({'time':['23:59:45','23:49:50','23:59:55','00:00:00','00:00:05','00:00:10','00:00:15'],
'X':[-5,-4,-2,5,6,10,11],
'Y':[3,4,5,9,20,22,23]})
As you can see, the times are strings and run across midnight. The time is given every 5 seconds!
My goal, however, is to add empty rows (filled with NaN, for example) so that there is a row for every second. Finally, the time column should be converted to a timestamp and set as the index.
Could you please suggest a smart and elegant way to achieve my goal?
Here is what the output should look like:
X Y
time
23:59:45 -5.0 3.0
23:59:46 NaN NaN
23:59:47 NaN NaN
23:59:48 NaN NaN
... ... ...
00:00:10 10.0 22.0
00:00:11 NaN NaN
00:00:12 NaN NaN
00:00:13 NaN NaN
00:00:14 NaN NaN
00:00:15 11.0 23.0
Note: I do not need the dates.
Use to_timedelta with reindex by timedelta_range:
df['time'] = pd.to_timedelta(df['time'])
idx = pd.timedelta_range('0', '23:59:59', freq='S', name='time')
df = df.set_index('time').reindex(idx).reset_index()
print (df.head(10))
time X Y
0 00:00:00 5.0 9.0
1 00:00:01 NaN NaN
2 00:00:02 NaN NaN
3 00:00:03 NaN NaN
4 00:00:04 NaN NaN
5 00:00:05 6.0 20.0
6 00:00:06 NaN NaN
7 00:00:07 NaN NaN
8 00:00:08 NaN NaN
9 00:00:09 NaN NaN
If you need to replace the NaNs:
df = df.set_index('time').reindex(idx, fill_value=0).reset_index()
print (df.head(10))
time X Y
0 00:00:00 5 9
1 00:00:01 0 0
2 00:00:02 0 0
3 00:00:03 0 0
4 00:00:04 0 0
5 00:00:05 6 20
6 00:00:06 0 0
7 00:00:07 0 0
8 00:00:08 0 0
9 00:00:09 0 0
Another solution uses resample, but it is possible that some rows will be missing at the end:
df = df.set_index('time').resample('S').first()
print (df.tail(10))
X Y
time
23:59:46 NaN NaN
23:59:47 NaN NaN
23:59:48 NaN NaN
23:59:49 NaN NaN
23:59:50 NaN NaN
23:59:51 NaN NaN
23:59:52 NaN NaN
23:59:53 NaN NaN
23:59:54 NaN NaN
23:59:55 -2.0 5.0
EDIT1:
idx1 = pd.timedelta_range('23:59:45', '23:59:59', freq='S', name='time')
idx2 = pd.timedelta_range('0', '00:00:15', freq='S', name='time')
idx = np.concatenate([idx1, idx2])
df['time'] = pd.to_timedelta(df['time'])
df = df.set_index('time').reindex(idx).reset_index()
print (df.head(10))
time X Y
0 23:59:45 -5.0 3.0
1 23:59:46 NaN NaN
2 23:59:47 NaN NaN
3 23:59:48 NaN NaN
4 23:59:49 NaN NaN
5 23:59:50 NaN NaN
6 23:59:51 NaN NaN
7 23:59:52 NaN NaN
8 23:59:53 NaN NaN
9 23:59:54 NaN NaN
print (df.tail(10))
time X Y
21 00:00:06 NaN NaN
22 00:00:07 NaN NaN
23 00:00:08 NaN NaN
24 00:00:09 NaN NaN
25 00:00:10 10.0 22.0
26 00:00:11 NaN NaN
27 00:00:12 NaN NaN
28 00:00:13 NaN NaN
29 00:00:14 NaN NaN
30 00:00:15 11.0 23.0
EDIT:
Another solution: shift the times that belong to the next day into 1-day timedeltas:
df['time'] = pd.to_timedelta(df['time'])
a = pd.to_timedelta(df['time'].diff().dt.days.abs().cumsum().fillna(1).sub(1), unit='d')
df['time'] = df['time'] + a
print (df)
X Y time
0 -5 3 0 days 23:59:45
1 -4 4 0 days 23:49:50
2 -2 5 0 days 23:59:55
3 5 9 1 days 00:00:00
4 6 20 1 days 00:00:05
5 10 22 1 days 00:00:10
6 11 23 1 days 00:00:15
idx = pd.timedelta_range(df['time'].min(), df['time'].max(), freq='S', name='time')
df = df.set_index('time').reindex(idx).reset_index()
print (df.head(10))
time X Y
0 23:49:50 -4.0 4.0
1 23:49:51 NaN NaN
2 23:49:52 NaN NaN
3 23:49:53 NaN NaN
4 23:49:54 NaN NaN
5 23:49:55 NaN NaN
6 23:49:56 NaN NaN
7 23:49:57 NaN NaN
8 23:49:58 NaN NaN
9 23:49:59 NaN NaN
print (df.tail(10))
time X Y
616 1 days 00:00:06 NaN NaN
617 1 days 00:00:07 NaN NaN
618 1 days 00:00:08 NaN NaN
619 1 days 00:00:09 NaN NaN
620 1 days 00:00:10 10.0 22.0
621 1 days 00:00:11 NaN NaN
622 1 days 00:00:12 NaN NaN
623 1 days 00:00:13 NaN NaN
624 1 days 00:00:14 NaN NaN
625 1 days 00:00:15 11.0 23.0
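Finally, since the dates are not needed, here is a sketch (my addition, not from the original answer) that turns the timedelta column back into plain HH:MM:SS index labels via an arbitrary anchor date:
# assumption: only the time of day matters, so the day component is dropped
labels = (pd.Timestamp('2000-01-01') + df['time']).dt.strftime('%H:%M:%S')
out = df.drop(columns='time').set_index(labels.rename('time'))
print(out.head())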