Slicing Pandas based on threshold and timestamp before and after

I have a dataframe that looks as follows:
Timestamp (Index) Status value
2017-01-01 12:01:00 OPEN 83
2017-01-01 12:02:00 OPEN 82
2017-01-01 12:03:00 OPEN 87
2017-01-01 12:04:00 CLOSE 82
2017-01-01 12:05:00 CLOSE 81
2017-01-01 12:06:00 CLOSE 81
2017-01-01 12:07:00 CLOSE 81
2017-01-01 12:08:00 CLOSE 81
2017-01-01 12:09:00 CLOSE 81
2017-01-01 12:10:00 CLOSE 81
2017-01-01 12:11:00 CLOSE 81
2017-01-01 12:12:00 OPEN 81
2017-01-01 12:13:00 OPEN 81
2017-01-01 12:14:00 OPEN 81
2017-01-01 12:15:00 OPEN 81
2017-01-01 12:16:00 CLEAR 34
2017-01-01 12:17:00 CLOSE 23
2017-01-01 12:18:00 CLOSE 23
2017-01-01 12:19:00 CLOSE 75
2017-01-01 12:20:00 CLOSE 65
2017-01-01 12:21:00 CLOSE 72
2017-01-01 12:22:00 CLOSE 76
2017-01-01 12:23:00 CLOSE 77
2017-01-01 12:24:00 OPEN 87
2017-01-01 12:25:00 OPEN 87
2017-01-01 12:26:00 OPEN 87
2017-01-01 12:27:00 OPEN 87
2017-01-01 12:28:00 OPEN 87
2017-01-01 12:29:00 CLOSE 75
2017-01-01 12:30:00 CLOSE 75
2017-01-01 12:31:00 CLOSE 75
If the first value of a run of consecutive CLOSE rows is below 70, I want to delete the OPEN block that comes before it together with that CLOSE block. The result should look like this:
Timestamp (Index) Status value
2017-01-01 12:01:00 OPEN 83
2017-01-01 12:02:00 OPEN 82
2017-01-01 12:03:00 OPEN 87
2017-01-01 12:04:00 CLOSE 82
2017-01-01 12:05:00 CLOSE 81
2017-01-01 12:06:00 CLOSE 81
2017-01-01 12:07:00 CLOSE 81
2017-01-01 12:08:00 CLOSE 81
2017-01-01 12:09:00 CLOSE 81
2017-01-01 12:10:00 CLOSE 81
2017-01-01 12:11:00 CLOSE 81
2017-01-01 12:24:00 OPEN 87
2017-01-01 12:25:00 OPEN 87
2017-01-01 12:26:00 OPEN 87
2017-01-01 12:27:00 OPEN 87
2017-01-01 12:28:00 OPEN 87
2017-01-01 12:29:00 CLOSE 75
2017-01-01 12:30:00 CLOSE 75
2017-01-01 12:31:00 CLOSE 75
Any idea on how I can get hold of the relevant Timestamps in order to remove those periods?

Try:
df[df.groupby((df.status.shift().bfill().ne(df.status) & df.status.eq('OPEN')).cumsum())['value'].transform('min').ge(70)]
result:
status value
timestamp
2017-01-01 12:01:00 OPEN 83
2017-01-01 12:02:00 OPEN 82
2017-01-01 12:03:00 OPEN 87
2017-01-01 12:04:00 CLOSE 82
2017-01-01 12:05:00 CLOSE 81
2017-01-01 12:06:00 CLOSE 81
2017-01-01 12:07:00 CLOSE 81
2017-01-01 12:08:00 CLOSE 81
2017-01-01 12:09:00 CLOSE 81
2017-01-01 12:10:00 CLOSE 81
2017-01-01 12:11:00 CLOSE 81
2017-01-01 12:24:00 OPEN 87
2017-01-01 12:25:00 OPEN 87
2017-01-01 12:26:00 OPEN 87
2017-01-01 12:27:00 OPEN 87
2017-01-01 12:28:00 OPEN 87
2017-01-01 12:29:00 CLOSE 75
2017-01-01 12:30:00 CLOSE 75
2017-01-01 12:31:00 CLOSE 75
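The method is to start a new group every time the status switches to 'OPEN', so each group holds one OPEN block plus the blocks that follow it up to the next OPEN.
Then only the groups whose minimum value is greater than or equal to 70 are kept. Note that this tests the minimum of the whole group rather than only the first CLOSE value; for the sample data both criteria give the same result.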
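The grouping idea can be spelled out step by step. The following is a minimal sketch on a small made-up sample (not the question's full data), assuming lower-case column names status and value; the names starts_open, group_id, kept_by_min and kept_by_first_close are illustrative. It shows both the minimum-per-group criterion and a variant that checks only the first CLOSE value of each group, which is closer to the wording of the question.

```python
import pandas as pd

# Small made-up sample in the same shape as the question's data.
idx = pd.date_range("2017-01-01 12:01", periods=10, freq="min")
df = pd.DataFrame(
    {
        "status": ["OPEN", "OPEN", "CLOSE", "CLOSE", "OPEN",
                   "OPEN", "CLOSE", "CLOSE", "OPEN", "CLOSE"],
        "value": [83, 82, 82, 81, 81, 81, 23, 75, 87, 75],
    },
    index=idx,
)

# A new group starts on every row where the status switches to OPEN.
previous = df["status"].shift(fill_value="")
starts_open = df["status"].eq("OPEN") & previous.ne("OPEN")
group_id = starts_open.cumsum()

# Variant 1: keep groups whose minimum value is at least 70.
group_min = df.groupby(group_id)["value"].transform("min")
kept_by_min = df[group_min.ge(70)]

# Variant 2: look only at the first CLOSE value of each group.
# Groups without any CLOSE row map to NaN and are dropped by ge(70).
close_rows = df[df["status"].eq("CLOSE")]
first_close = close_rows.groupby(group_id[close_rows.index])["value"].first()
kept_by_first_close = df[group_id.map(first_close).ge(70)]

print(kept_by_min)
```

Unlike the one-liner, this sketch also opens a group on the very first row when it is OPEN; the filtered result is the same. The two variants differ only when a group's first CLOSE value is at least 70 but a later value in the same group drops below it.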

Related

Error plotting a time column as x-axis ticks

I have a df as follows
Time Samstag
0 00:15:00 80.6
1 00:30:00 74.6
2 00:45:00 69.2
3 01:00:00 63.6
4 01:15:00 57.1
5 01:30:00 50.4
6 01:45:00 44.1
7 02:00:00 39.1
8 02:15:00 36.0
9 02:30:00 34.4
10 02:45:00 33.7
11 03:00:00 33.3
12 03:15:00 32.7
13 03:30:00 32.0
14 03:45:00 31.5
15 04:00:00 31.3
16 04:15:00 31.5
17 04:30:00 31.7
18 04:45:00 31.5
19 05:00:00 30.3
20 05:15:00 28.1
21 05:30:00 26.4
22 05:45:00 27.1
23 06:00:00 32.3
24 06:15:00 42.9
25 06:30:00 56.2
26 06:45:00 68.5
27 07:00:00 76.3
28 07:15:00 77.0
29 07:30:00 72.9
30 07:45:00 67.3
31 08:00:00 63.6
32 08:15:00 64.5
33 08:30:00 69.5
34 08:45:00 77.4
35 09:00:00 87.1
36 09:15:00 97.4
37 09:30:00 108.4
38 09:45:00 119.9
39 10:00:00 132.1
40 10:15:00 144.7
41 10:30:00 156.7
42 10:45:00 166.9
43 11:00:00 174.1
44 11:15:00 177.4
45 11:30:00 177.7
46 11:45:00 176.2
47 12:00:00 174.1
48 12:15:00 172.6
49 12:30:00 172.0
50 12:45:00 172.4
51 13:00:00 174.1
52 13:15:00 177.1
53 13:30:00 180.4
54 13:45:00 183.0
55 14:00:00 183.9
56 14:15:00 182.4
57 14:30:00 179.5
58 14:45:00 176.6
59 15:00:00 175.1
60 15:15:00 176.0
61 15:30:00 178.9
62 15:45:00 182.8
63 16:00:00 186.8
64 16:15:00 190.3
65 16:30:00 193.8
66 16:45:00 197.9
67 17:00:00 203.5
68 17:15:00 210.8
69 17:30:00 218.8
70 17:45:00 226.3
71 18:00:00 231.8
72 18:15:00 234.4
73 18:30:00 234.5
74 18:45:00 233.0
75 19:00:00 230.9
76 19:15:00 228.7
77 19:30:00 226.9
78 19:45:00 225.3
79 20:00:00 224.0
80 20:15:00 223.0
81 20:30:00 221.5
82 20:45:00 218.9
83 21:00:00 214.2
84 21:15:00 207.0
85 21:30:00 197.0
86 21:45:00 184.4
87 22:00:00 169.2
88 22:15:00 151.8
89 22:30:00 133.7
90 22:45:00 116.7
91 23:00:00 102.7
92 23:15:00 93.0
93 23:30:00 86.6
94 23:45:00 82.2
I am trying to plot this as follows:
sns.lineplot(x="Time", y="Samstag", data=w_df)
plt.xticks(rotation=15)
plt.xlabel("Time")
plt.ylabel("KWH")
plt.show()
This produces a plot whose x-axis labels are 00:00, 05:33:20, ... and so on.
I want the values of the Time column to appear as the x-axis ticks instead.
I tried:
t = pd.to_datetime(w_df["Time"], format='%H:%M:%S')
t = t.apply(lambda x: x.strftime('%H:%M:%S'))
sns.lineplot(x="Time", y="Samstag", data=w_df)
plt.xticks(ticks=t, rotation=15)
plt.xlabel("Time")
plt.ylabel("KWH")
plt.show()
It throws the following error:
Traceback (most recent call last):
  File "", line 2, in
    plt.xticks(ticks=t, rotation=15)
  File "/home/user/anaconda3/lib/python3.7/site-packages/matplotlib/pyplot.py", line 1540, in xticks
    locs = ax.set_xticks(ticks)
  File "/home/user/anaconda3/lib/python3.7/site-packages/matplotlib/axes/_base.py", line 3350, in set_xticks
    ret = self.xaxis.set_ticks(ticks, minor=minor)
  File "/home/user/anaconda3/lib/python3.7/site-packages/matplotlib/axis.py", line 1755, in set_ticks
    self.set_view_interval(min(ticks), max(ticks))
  File "/home/user/anaconda3/lib/python3.7/site-packages/matplotlib/axis.py", line 1892, in setter
    setter(self, min(vmin, vmax, oldmin), max(vmin, vmax, oldmax),
TypeError: '<' not supported between instances of 'numpy.ndarray' and 'str'
Can anyone tell me what mistake I am making?
Also,
w_df.dtypes
Out[27]:
Time object
Samstag float64
Sonntag float64
Werktag float64
dtype: object

I took some of your data and tried to reproduce your result, but could not: my Seaborn plot already shows the x-axis ticks in the format you want. This may have to do with the type of your time column. When I built my small dataset from your example, I made the time column a string, and everything plots fine.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# First 24 rows of the question's data, with Time stored as strings
d = {'Time': ["00:15:00", "00:30:00", "00:45:00", "01:00:00", "01:15:00", "01:30:00", "01:45:00",
              "02:00:00", "02:15:00", "02:30:00", "02:45:00", "03:00:00", "03:15:00", "03:30:00", "03:45:00",
              "04:00:00", "04:15:00", "04:30:00", "04:45:00", "05:00:00", "05:15:00", "05:30:00",
              "05:45:00", "06:00:00"],
     'Samstag': [80.6, 74.6, 69.2, 63.6, 57.1, 50.4, 44.1, 39.1, 36.0, 34.4, 33.7, 33.3, 32.7, 32.0,
                 31.5, 31.3, 31.5, 31.7, 31.5, 30.3, 28.1, 26.4, 27.1, 32.3]
     }
df = pd.DataFrame(d)
sns.lineplot(x="Time", y="Samstag", data=df)
plt.xticks(rotation=15)
plt.xlabel("Time")
plt.ylabel("KWH")
plt.show()
This makes every time stamp a tick mark. Perhaps you can change your time column to be a string, if it is not already.
df['Time'] = df['Time'].astype(str)
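To illustrate what that conversion does, here is a minimal sketch assuming the Time column holds datetime.time objects (one plausible reason for an object dtype; the question does not confirm it). The frame below contains only the first four rows of the question's data, and tick_positions and tick_labels are illustrative names.

```python
import datetime as dt

import pandas as pd

# Assumption: Time is an object column of datetime.time values.
w_df = pd.DataFrame(
    {
        "Time": [dt.time(0, 15), dt.time(0, 30), dt.time(0, 45), dt.time(1, 0)],
        "Samstag": [80.6, 74.6, 69.2, 63.6],
    }
)

# str() of a datetime.time yields 'HH:MM:SS', so the column becomes plain strings.
w_df["Time"] = w_df["Time"].astype(str)

# With 95 rows, labelling every row gets crowded; keep every second label here.
tick_positions = list(range(0, len(w_df), 2))
tick_labels = w_df["Time"].iloc[::2].tolist()

print(w_df["Time"].tolist())
print(tick_positions, tick_labels)
```

Because string x values are plotted as categories at positions 0, 1, 2, ..., the thinned positions and labels could then be passed as plt.xticks(tick_positions, tick_labels, rotation=15) after the lineplot call.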

Pandas - Sum of first X hours of datetime index

I have a dataframe with a datetime index and 100 columns.
I want to have a new dataframe with the same datetime index and columns, but the values would contain the sum of the first 10 hours of each day.
So if I had an original dataframe like this:
A B C
---------------------------------
2018-01-01 00:00:00 2 5 -10
2018-01-01 01:00:00 6 5 7
2018-01-01 02:00:00 7 5 9
2018-01-01 03:00:00 9 5 6
2018-01-01 04:00:00 10 5 2
2018-01-01 05:00:00 7 5 -1
2018-01-01 06:00:00 1 5 -1
2018-01-01 07:00:00 -4 5 10
2018-01-01 08:00:00 9 5 10
2018-01-01 09:00:00 21 5 -10
2018-01-01 10:00:00 2 5 -1
2018-01-01 11:00:00 8 5 -1
2018-01-01 12:00:00 8 5 10
2018-01-01 13:00:00 8 5 9
2018-01-01 14:00:00 7 5 -10
2018-01-01 15:00:00 7 5 5
2018-01-01 16:00:00 7 5 -10
2018-01-01 17:00:00 4 5 7
2018-01-01 18:00:00 5 5 8
2018-01-01 19:00:00 2 5 8
2018-01-01 20:00:00 2 5 4
2018-01-01 21:00:00 8 5 3
2018-01-01 22:00:00 1 5 3
2018-01-01 23:00:00 1 5 1
2018-01-02 00:00:00 2 5 2
2018-01-02 01:00:00 3 5 8
2018-01-02 02:00:00 4 5 6
2018-01-02 03:00:00 5 5 6
2018-01-02 04:00:00 1 5 7
2018-01-02 05:00:00 7 5 7
2018-01-02 06:00:00 5 5 1
2018-01-02 07:00:00 2 5 2
2018-01-02 08:00:00 4 5 3
2018-01-02 09:00:00 6 5 4
2018-01-02 10:00:00 9 5 4
2018-01-02 11:00:00 11 5 5
2018-01-02 12:00:00 2 5 8
2018-01-02 13:00:00 2 5 0
2018-01-02 14:00:00 4 5 5
2018-01-02 15:00:00 5 5 4
2018-01-02 16:00:00 7 5 4
2018-01-02 17:00:00 -1 5 7
2018-01-02 18:00:00 1 5 7
2018-01-02 19:00:00 1 5 7
2018-01-02 20:00:00 5 5 7
2018-01-02 21:00:00 2 5 7
2018-01-02 22:00:00 2 5 7
2018-01-02 23:00:00 8 5 7
So for all rows with date 2018-01-01:
The value for column A would be 68 (2+6+7+9+10+7+1-4+9+21)
The value for column B would be 50 (5+5+5+5+5+5+5+5+5+5)
The value for column C would be 22 (-10+7+9+6+2-1-1+10+10-10)
So for all rows with date 2018-01-02:
The value for column A would be 39 (2+3+4+5+1+7+5+2+4+6)
The value for column B would be 50 (5+5+5+5+5+5+5+5+5+5)
The value for column C would be 46 (2+8+6+6+7+7+1+2+3+4)
The outcome would be:
A B C
---------------------------------
2018-01-01 00:00:00 68 50 22
2018-01-01 01:00:00 68 50 22
2018-01-01 02:00:00 68 50 22
2018-01-01 03:00:00 68 50 22
2018-01-01 04:00:00 68 50 22
2018-01-01 05:00:00 68 50 22
2018-01-01 06:00:00 68 50 22
2018-01-01 07:00:00 68 50 22
2018-01-01 08:00:00 68 50 22
2018-01-01 09:00:00 68 50 22
2018-01-01 10:00:00 68 50 22
2018-01-01 11:00:00 68 50 22
2018-01-01 12:00:00 68 50 22
2018-01-01 13:00:00 68 50 22
2018-01-01 14:00:00 68 50 22
2018-01-01 15:00:00 68 50 22
2018-01-01 16:00:00 68 50 22
2018-01-01 17:00:00 68 50 22
2018-01-01 18:00:00 68 50 22
2018-01-01 19:00:00 68 50 22
2018-01-01 20:00:00 68 50 22
2018-01-01 21:00:00 68 50 22
2018-01-01 22:00:00 68 50 22
2018-01-01 23:00:00 68 50 22
2018-01-02 00:00:00 39 50 46
2018-01-02 01:00:00 39 50 46
2018-01-02 02:00:00 39 50 46
2018-01-02 03:00:00 39 50 46
2018-01-02 04:00:00 39 50 46
2018-01-02 05:00:00 39 50 46
2018-01-02 06:00:00 39 50 46
2018-01-02 07:00:00 39 50 46
2018-01-02 08:00:00 39 50 46
2018-01-02 09:00:00 39 50 46
2018-01-02 10:00:00 39 50 46
2018-01-02 11:00:00 39 50 46
2018-01-02 12:00:00 39 50 46
2018-01-02 13:00:00 39 50 46
2018-01-02 14:00:00 39 50 46
2018-01-02 15:00:00 39 50 46
2018-01-02 16:00:00 39 50 46
2018-01-02 17:00:00 39 50 46
2018-01-02 18:00:00 39 50 46
2018-01-02 19:00:00 39 50 46
2018-01-02 20:00:00 39 50 46
2018-01-02 21:00:00 39 50 46
2018-01-02 22:00:00 39 50 46
2018-01-02 23:00:00 39 50 46
I figured I'd group by date first and perform a sum and then merge the results based on the date. Is there a better/faster way to do this?
Thanks.
EDIT: In the meantime I came up with this:
df = df.between_time('0:00', '9:00').groupby(pd.Grouper(freq='D')).sum()
df = df.resample('1H').ffill()

You need to group by df.index.date and use transform with a lambda function that sums the first 10 values of each group:
df.loc[:,['A','B','C']] = df.groupby(df.index.date).transform(lambda x: x[:10].sum())
Or, if the grouped result has the same columns in the same order as the original DataFrame:
df.loc[:,:] = df.groupby(df.index.date).transform(lambda x: x[:10].sum())
print(df)
A B C
2018-01-01 00:00:00 68 50 22
2018-01-01 01:00:00 68 50 22
2018-01-01 02:00:00 68 50 22
2018-01-01 03:00:00 68 50 22
2018-01-01 04:00:00 68 50 22
2018-01-01 05:00:00 68 50 22
2018-01-01 06:00:00 68 50 22
2018-01-01 07:00:00 68 50 22
2018-01-01 08:00:00 68 50 22
2018-01-01 09:00:00 68 50 22
2018-01-01 10:00:00 68 50 22
2018-01-01 11:00:00 68 50 22
2018-01-01 12:00:00 68 50 22
2018-01-01 13:00:00 68 50 22
2018-01-01 14:00:00 68 50 22
2018-01-01 15:00:00 68 50 22
2018-01-01 16:00:00 68 50 22
2018-01-01 17:00:00 68 50 22
2018-01-01 18:00:00 68 50 22
2018-01-01 19:00:00 68 50 22
2018-01-01 20:00:00 68 50 22
2018-01-01 21:00:00 68 50 22
2018-01-01 22:00:00 68 50 22
2018-01-01 23:00:00 68 50 22
2018-01-02 00:00:00 39 50 46
2018-01-02 01:00:00 39 50 46
2018-01-02 02:00:00 39 50 46
2018-01-02 03:00:00 39 50 46
2018-01-02 04:00:00 39 50 46
2018-01-02 05:00:00 39 50 46
2018-01-02 06:00:00 39 50 46
2018-01-02 07:00:00 39 50 46
2018-01-02 08:00:00 39 50 46
2018-01-02 09:00:00 39 50 46
2018-01-02 10:00:00 39 50 46
2018-01-02 11:00:00 39 50 46
2018-01-02 12:00:00 39 50 46
2018-01-02 13:00:00 39 50 46
2018-01-02 14:00:00 39 50 46
2018-01-02 15:00:00 39 50 46
2018-01-02 16:00:00 39 50 46
2018-01-02 17:00:00 39 50 46
2018-01-02 18:00:00 39 50 46
2018-01-02 19:00:00 39 50 46
2018-01-02 20:00:00 39 50 46
2018-01-02 21:00:00 39 50 46
2018-01-02 22:00:00 39 50 46
2018-01-02 23:00:00 39 50 46
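The transform above takes the first 10 rows of each day, which equals the first 10 hours only when no hourly rows are missing. As an alternative, the following sketch selects rows by clock hour instead; it is an illustration under that assumption, not the answer's method. It uses columns A and B from the question, and the names mask, day, daily and result are illustrative.

```python
import pandas as pd

# Columns A and B from the question, two full days of hourly data.
idx = pd.date_range("2018-01-01", periods=48, freq="60min")
A = [2, 6, 7, 9, 10, 7, 1, -4, 9, 21, 2, 8, 8, 8, 7, 7, 7, 4, 5, 2, 2, 8, 1, 1,
     2, 3, 4, 5, 1, 7, 5, 2, 4, 6, 9, 11, 2, 2, 4, 5, 7, -1, 1, 1, 5, 2, 2, 8]
df = pd.DataFrame({"A": A, "B": 5}, index=idx)

# Rows whose clock hour is 00:00 through 09:00.
mask = df.index.hour < 10
# Midnight of each row's day, used as the grouping key.
day = df.index.normalize()

# One row of sums per day, computed from the selected hours only.
daily = df[mask].groupby(day[mask]).sum()

# Broadcast each day's sums back onto every timestamp of that day.
result = daily.reindex(day)
result.index = df.index

print(result.iloc[[0, 23, 24, 47]])
```

Selecting by hour keeps the result correct when a day has gaps, at the cost of one extra reindex step.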

Slicing window on pandas dataframe

I have a pandas dataframe with time-series data in 1-min intervals. Is there a pythonic way to slice my data for every 15 min like this?
import numpy as np
import pandas as pd

a = pd.DataFrame(index=pd.date_range('2017-01-01 00:04', '2017-01-01 01:04', freq='1min'))
a['data'] = np.arange(61)
for i in range(0, len(a), 15):
    print(a[i:i+15])
Is there any built in function for this in pandas?

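If I understand correctly, you can group with pd.Grouper and freq='15min':
for _, g in a.groupby(pd.Grouper(freq='15min')):
    print(g)
You can also do:
groups = a.groupby(pd.Grouper(freq='15min'))
list(groups)
Output (first three groups; the bins are aligned to clock quarter-hours, so the first group holds only 11 rows):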
data
2017-01-01 00:04:00 0
2017-01-01 00:05:00 1
2017-01-01 00:06:00 2
2017-01-01 00:07:00 3
2017-01-01 00:08:00 4
2017-01-01 00:09:00 5
2017-01-01 00:10:00 6
2017-01-01 00:11:00 7
2017-01-01 00:12:00 8
2017-01-01 00:13:00 9
2017-01-01 00:14:00 10
data
2017-01-01 00:15:00 11
2017-01-01 00:16:00 12
2017-01-01 00:17:00 13
2017-01-01 00:18:00 14
2017-01-01 00:19:00 15
2017-01-01 00:20:00 16
2017-01-01 00:21:00 17
2017-01-01 00:22:00 18
2017-01-01 00:23:00 19
2017-01-01 00:24:00 20
2017-01-01 00:25:00 21
2017-01-01 00:26:00 22
2017-01-01 00:27:00 23
2017-01-01 00:28:00 24
2017-01-01 00:29:00 25
data
2017-01-01 00:30:00 26
2017-01-01 00:31:00 27
2017-01-01 00:32:00 28
2017-01-01 00:33:00 29
2017-01-01 00:34:00 30
2017-01-01 00:35:00 31
2017-01-01 00:36:00 32
2017-01-01 00:37:00 33
2017-01-01 00:38:00 34
2017-01-01 00:39:00 35
2017-01-01 00:40:00 36
2017-01-01 00:41:00 37
2017-01-01 00:42:00 38
2017-01-01 00:43:00 39
2017-01-01 00:44:00 40
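If the windows should instead start at the first timestamp, as in the question's loop over 15-row slices, one option is the origin argument of pd.Grouper. This is a sketch that assumes pandas 1.1 or newer, where origin='start' is available; clock_sizes and start_sizes are illustrative names.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(
    {"data": np.arange(61)},
    index=pd.date_range("2017-01-01 00:04", periods=61, freq="min"),
)

# Default: bins are anchored to the clock (00:00, 00:15, 00:30, ...).
clock_sizes = a.groupby(pd.Grouper(freq="15min")).size().tolist()

# origin="start": bins are anchored to the first timestamp (00:04, 00:19, ...).
start_sizes = a.groupby(pd.Grouper(freq="15min", origin="start")).size().tolist()

print(clock_sizes)  # [11, 15, 15, 15, 5]
print(start_sizes)  # [15, 15, 15, 15, 1]
```

The second grouping reproduces the slices a[i:i+15] from the question as long as no minutes are missing from the index.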

python pandas String to Timestamps conversion ambiguous

I'm trying to slice a DataFrame using its DatetimeIndex, but I ran into an issue.
When the sliced DataFrame moves on to a new month, the day and the month are swapped.
Here is my dataframe:
Valeur
date
2015-01-08 00:00:00 93
2015-01-08 00:10:00 90
2015-01-08 00:20:00 88
2015-01-08 00:30:00 103
2015-01-08 00:40:00 86
2015-01-08 00:50:00 88
2015-01-08 01:00:00 86
2015-01-08 01:10:00 84
2015-01-08 01:20:00 95
2015-01-08 01:30:00 88
2015-01-08 01:40:00 85
2015-01-08 01:50:00 92
... ...
2016-10-30 22:20:00 98
2016-10-30 22:30:00 94
2016-10-30 22:40:00 94
2016-10-30 22:50:00 103
2016-10-30 23:00:00 92
2016-10-30 23:10:00 85
2016-10-30 23:20:00 98
2016-10-30 23:30:00 96
2016-10-30 23:40:00 95
2016-10-30 23:50:00 101
[65814 rows x 1 columns]
Here are my two Timestamps:
startingDate : 2015-10-31 23:50:00
lastDate : 2016-10-30 23:50:00
When I slice my df like this:
dfconso = dfconso[startingDate:lastDate]
I get something like this:
Valeur
date
2015-10-31 23:50:00 88
2015-01-11 00:00:00 83
2015-01-11 00:10:00 82
2015-01-11 00:20:00 87
2015-01-11 00:30:00 77
2015-01-11 00:40:00 72
2015-01-11 00:50:00 86
2015-01-11 01:00:00 77
2015-01-11 01:10:00 80
... ...
2016-10-30 23:10:00 85
2016-10-30 23:20:00 98
2016-10-30 23:30:00 96
2016-10-30 23:40:00 95
2016-10-30 23:50:00 101
The problem is that the slice starts at the right date, but when the DatetimeIndex moves on to the next month, something goes wrong:
it jumps from 31 October 2015 to 11 January 2015.
I don't understand why.
I printed the month and day to check and got this:
In:
print("Index 0 : month", dfconso.index[0].month, ", day", dfconso.index[0].day)
print("Index 1 : month", dfconso.index[1].month, ", day", dfconso.index[1].day)
Out:
Index 0 : month 10 , day 31
Index 1 : month 1 , day 11
Does anyone have an idea?
EDIT:
After calling df.sort_index() on my df, I can see that the conversion from string dates to Timestamps sometimes went wrong and swapped month and day.
String format:
"31/08/2015 20:00:00"
My code to convert from string to Timestamps:
dfconso.index = pd.to_datetime(dfconso.index, infer_datetime_format=True, format="%d/%m/%Y")

SOLUTION:
It was a misuse of pd.to_datetime. I replaced infer_datetime_format with dayfirst:
dfconso.index = pd.to_datetime(dfconso.index, dayfirst=True)
That solved my problem.
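To make the ambiguity concrete, here is a small sketch using the string format from the question. The second date string is made up for illustration, and the explicit format string is an assumed alternative to dayfirst=True, not something the original post used.

```python
import pandas as pd

# First string is from the question; the second is a made-up ambiguous date.
raw = ["31/08/2015 20:00:00", "01/11/2015 00:00:00"]

# Option 1: tell pandas that the day comes first.
by_flag = pd.to_datetime(raw, dayfirst=True)

# Option 2: spell out the complete format, including the time part.
by_format = pd.to_datetime(raw, format="%d/%m/%Y %H:%M:%S")

# Parsed on its own without any hint, an ambiguous string is read month-first.
ambiguous = pd.to_datetime("01/11/2015 00:00:00")

print(by_flag[1])   # 1 November 2015
print(ambiguous)    # 11 January 2015
```

Strings such as "31/08/2015" can only be read day-first, while "01/11/2015" is valid either way, which would explain why only some dates ended up with month and day swapped.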

The error might not be a mix-up of day and month, but just an ordering problem. Try sorting the data before slicing it (the part of your data that you posted looks fine, but the rest may not be in order).
Here is how reordering works: Sort a pandas datetime index

How can I efficiently convert hourly data into dates and times for every day of the year using Python pandas?

I have a pandas DataFrame that represents a value for every hour of a day and I want to report each value of each day for a year. I have written the 'naive' way to do it. Is there a more efficient way?
Naive way (that works correctly, but takes a lot of time):
import pandas as pd

dfConsoFrigo = pd.read_csv("../assets/datas/refregirateur.csv", sep=';')
dataframe = pd.DataFrame(columns=['Puissance'])
iterator = 0
for day in pd.date_range("01 Jan 2017 00:00", "31 Dec 2017 23:00", freq='1H'):
    iterator = iterator % 24
    dataframe.loc[day] = dfConsoFrigo.iloc[iterator]['Puissance']
    iterator += 1
Input (time;value) 24 rows:
Heure;Puissance
00:00;48.0
01:00;47.0
02:00;46.0
03:00;46.0
04:00;45.0
05:00;46.0
...
19:00;55.0
20:00;53.0
21:00;51.0
22:00;50.0
23:00;49.0
Expected Output (8760 rows):
Puissance
2017-01-01 00:00:00 48
2017-01-01 01:00:00 47
2017-01-01 02:00:00 46
2017-01-01 03:00:00 46
2017-01-01 04:00:00 45
...
2017-12-31 20:00:00 53
2017-12-31 21:00:00 51
2017-12-31 22:00:00 50
2017-12-31 23:00:00 49

I think you need numpy.tile:
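import numpy as np
import pandas as pd

# Random stand-in for the 24 hourly values read from the CSV
np.random.seed(10)
df = pd.DataFrame({'Puissance': np.random.randint(100, size=24)})
rng = pd.date_range("01 Jan 2017 00:00", "31 Dec 2017 23:00", freq='1H')
# Repeat the 24 values once for each of the 365 days
df = pd.DataFrame({'a': np.tile(df['Puissance'].values, 365)}, index=rng)
print(df.head(30))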
a
2017-01-01 00:00:00 9
2017-01-01 01:00:00 15
2017-01-01 02:00:00 64
2017-01-01 03:00:00 28
2017-01-01 04:00:00 89
2017-01-01 05:00:00 93
2017-01-01 06:00:00 29
2017-01-01 07:00:00 8
2017-01-01 08:00:00 73
2017-01-01 09:00:00 0
2017-01-01 10:00:00 40
2017-01-01 11:00:00 36
2017-01-01 12:00:00 16
2017-01-01 13:00:00 11
2017-01-01 14:00:00 54
2017-01-01 15:00:00 88
2017-01-01 16:00:00 62
2017-01-01 17:00:00 33
2017-01-01 18:00:00 72
2017-01-01 19:00:00 78
2017-01-01 20:00:00 49
2017-01-01 21:00:00 51
2017-01-01 22:00:00 54
2017-01-01 23:00:00 77
2017-01-02 00:00:00 9
2017-01-02 01:00:00 15
2017-01-02 02:00:00 64
2017-01-02 03:00:00 28
2017-01-02 04:00:00 89
2017-01-02 05:00:00 93
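A variation on the same idea is to look each timestamp's hour up in the 24-row profile, which avoids hard-coding 365 and also works for leap years or ranges that do not cover whole days. The sketch below is an illustration only: the CSV content is generated with placeholder values (40.0 plus the hour) rather than the question's data, and profile and year are illustrative names.

```python
import io

import numpy as np
import pandas as pd

# Placeholder 24-row profile in the question's "Heure;Puissance" layout.
csv_text = "Heure;Puissance\n" + "\n".join(
    f"{h:02d}:00;{40.0 + h}" for h in range(24)
)
profile = pd.read_csv(io.StringIO(csv_text), sep=";")

rng = pd.date_range("2017-01-01 00:00", "2017-12-31 23:00", freq="60min")

# Use each timestamp's hour (0..23) as a positional index into the profile.
hours = np.asarray(rng.hour)
year = pd.DataFrame({"Puissance": profile["Puissance"].to_numpy()[hours]}, index=rng)

print(len(year))
print(year.head(3))
```

This relies on the profile rows being ordered from 00:00 to 23:00, as they are in the question's input.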
