I have the pandas data frame below. I need to group by column B, sum column A, and drop the timestamp. So for the data below I should end up with one record with the A values summed up. How do I do this in pandas?
A B
2013-03-15 17:00:00 1 134
2013-03-15 18:00:00 810 134
2013-03-15 19:00:00 1797 134
2013-03-15 20:00:00 813 134
2013-03-15 21:00:00 1323 134
2013-03-16 05:00:00 98 134
2013-03-16 06:00:00 515 134
2013-03-16 10:00:00 377 134
2013-03-16 11:00:00 1798 134
2013-03-16 12:00:00 985 134
2013-03-17 08:00:00 258 134
This can be done with a straight-forward groupby operation:
import io
import pandas as pd
content='''\
date time A B
2013-03-15 17:00:00 1 134
2013-03-15 18:00:00 810 134
2013-03-15 19:00:00 1797 134
2013-03-15 20:00:00 813 135
2013-03-15 21:00:00 1323 134
2013-03-16 05:00:00 98 134
2013-03-16 06:00:00 515 135
2013-03-16 10:00:00 377 134
2013-03-16 11:00:00 1798 136
2013-03-16 12:00:00 985 136
2013-03-17 08:00:00 258 137'''
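# io.StringIO is needed because content is a str, not bytes (Python 3)
df = pd.read_csv(io.StringIO(content), sep=r'\s+')
# combine the separate date and time columns into a DatetimeIndex
df.index = pd.to_datetime(df.pop('date') + ' ' + df.pop('time'))
print(df.groupby('B').sum())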
yields
A
B
134 4406
135 1328
136 2783
137 258
Some of the values in B were changed to show a more interesting groupby operation.
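If the result should be ordinary rows with B as a regular column rather than as the index, one option is as_index=False. A minimal sketch, assuming a frame shaped like the one above (the small inline frame is made up for illustration):

```python
import pandas as pd

# Illustrative data only: a datetime index with columns A and B.
df = pd.DataFrame(
    {"A": [1, 810, 813, 98], "B": [134, 134, 135, 134]},
    index=pd.to_datetime(
        [
            "2013-03-15 17:00:00",
            "2013-03-15 18:00:00",
            "2013-03-15 20:00:00",
            "2013-03-16 05:00:00",
        ]
    ),
)

# Grouping discards the datetime index; as_index=False keeps B as a column.
summed = df.groupby("B", as_index=False)["A"].sum()
print(summed)
```

The timestamps disappear on their own here, because the aggregated result is keyed by B and no longer by the original index.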
Related
I have the data frame below (a datetime index containing all working days in the US calendar):
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
import random
us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())
dt_rng = pd.date_range(start='1/1/2018', end='12/31/2018', freq=us_bd)
n1 = [round(random.uniform(20, 35),2) for _ in range(len(dt_rng))]
n2 = [random.randint(100, 200) for _ in range(len(dt_rng))]
df = pd.DataFrame(list(zip(n1,n2)), index=dt_rng, columns=['n1','n2'])
print(df)
n1 n2
2018-01-02 24.78 197
2018-01-03 23.33 176
2018-01-04 33.19 128
2018-01-05 32.49 110
... ... ...
2018-12-26 31.34 173
2018-12-27 29.72 166
2018-12-28 31.07 104
2018-12-31 33.52 184
[251 rows x 2 columns]
For each row in column n1, how can I get the value from the same column for the same day of the next month? If the value for that exact day is not available (because of a weekend or holiday), I should get the value at the next available date. I tried df.n1.shift(21), but it does not work because the number of working days differs from month to month.
Expected output:
n1 n2 next_mnth_val
2018-01-02 25.97 184 28.14
2018-01-03 24.94 133 27.65 # these three values are the same, because in Feb 2018 the next working day after the 2nd is the 5th
2018-01-04 23.99 143 27.65
2018-01-05 24.69 182 27.65
2018-01-08 28.43 186 28.45
2018-01-09 31.47 104 23.14
... ... ... ...
2018-12-26 29.06 194 20.45
2018-12-27 29.63 158 20.45
2018-12-28 30.60 148 20.45
2018-12-31 20.45 121 20.45
For December, the next-month value should be the last value of the data frame, i.e. the value at index 2018-12-31 (20.45).
Please help.
This is an interesting problem. I would shift the date by 1 month, then shift it again to the next business day:
df1 = df.copy().reset_index()
df1['new_date'] = df1['index'] + pd.DateOffset(months=1) + pd.offsets.BDay()
df.merge(df1, left_index=True, right_on='new_date')
Output (first 31 days):
n1_x n2_x index n1_y n2_y new_date
0 34.82 180 2018-01-02 29.83 129 2018-02-05
1 34.82 180 2018-01-03 24.28 166 2018-02-05
2 34.82 180 2018-01-04 27.88 110 2018-02-05
3 24.89 186 2018-01-05 25.34 111 2018-02-06
4 31.66 137 2018-01-08 26.28 138 2018-02-09
5 25.30 162 2018-01-09 32.71 139 2018-02-12
6 25.30 162 2018-01-10 34.39 159 2018-02-12
7 25.30 162 2018-01-11 20.89 132 2018-02-12
8 23.44 196 2018-01-12 29.27 167 2018-02-13
12 25.40 153 2018-01-19 28.52 185 2018-02-20
13 31.38 126 2018-01-22 23.49 141 2018-02-23
14 30.90 133 2018-01-23 25.56 145 2018-02-26
15 30.90 133 2018-01-24 23.06 155 2018-02-26
16 30.90 133 2018-01-25 24.95 174 2018-02-26
17 29.39 138 2018-01-26 21.28 157 2018-02-27
18 32.94 173 2018-01-29 20.26 189 2018-03-01
19 32.94 173 2018-01-30 22.41 196 2018-03-01
20 32.94 173 2018-01-31 27.32 149 2018-03-01
21 28.09 119 2018-02-01 31.39 192 2018-03-02
22 32.21 199 2018-02-02 28.22 151 2018-03-05
23 21.78 120 2018-02-05 34.82 180 2018-03-06
24 28.25 127 2018-02-06 24.89 186 2018-03-07
25 22.06 189 2018-02-07 32.85 125 2018-03-08
26 33.78 121 2018-02-08 30.12 102 2018-03-09
27 30.79 137 2018-02-09 31.66 137 2018-03-12
28 29.88 131 2018-02-12 25.30 162 2018-03-13
29 20.02 143 2018-02-13 23.44 196 2018-03-14
30 20.28 188 2018-02-14 20.04 102 2018-03-15
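A different route to the lookup described in the question, offered only as a sketch and not as the method used in the answer above: add one calendar month to each date, then use searchsorted to find the first index entry on or after that target. The helper name next_month_value is hypothetical, and the n1 column is replaced by a deterministic counter so the result is easy to check by eye.

```python
import numpy as np
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay


def next_month_value(frame, column):
    """Value of `column` one calendar month later, or at the next available date."""
    values = frame[column].to_numpy()
    # Same day of the next month (month ends are clipped, e.g. Jan 31 -> Feb 28).
    targets = frame.index + pd.DateOffset(months=1)
    # Position of the first index entry that is >= each target date.
    pos = frame.index.searchsorted(targets, side="left")
    # Targets past the end of the data fall back to the last row.
    pos = np.minimum(pos, len(frame) - 1)
    return pd.Series(values[pos], index=frame.index)


us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())
dt_rng = pd.date_range(start="1/1/2018", end="12/31/2018", freq=us_bd)
# Deterministic stand-in for the random n1 column in the question.
df = pd.DataFrame({"n1": np.arange(len(dt_rng), dtype=float)}, index=dt_rng)
df["next_mnth_val"] = next_month_value(df, "n1")
print(df.head(6))
```

This assumes the index is sorted in ascending order, which searchsorted requires. With that in place, 2018-01-03, 2018-01-04 and 2018-01-05 all pick up the value at 2018-02-05, and the December rows fall back to the last row of the frame.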
I'm getting a 5-minute feed and storing it in a dataframe. My EWM 200 doesn't match the MarketWatch EWM 200.
I've tried the piece of code posted on Feb 13th 2018. My data is already sorted by date in ascending order, but for some reason it doesn't do the trick.
df = df.drop(df.index[-1])
print(df[['date','low','close']])
df.sort_values(by='date')
df = df.sort_index()
print(df[['date','low','close']])
#print(df)
df['ewm_5'] = round(df['close'].ewm(span=5,min_periods=0,adjust=False,ignore_na=False).mean(),2)
df['ewm_9'] = round(df['close'].ewm(span=9,min_periods=0,adjust=False,ignore_na=False).mean(),2)
df['ewm_15'] = df['close'].ewm(span=15,min_periods=0,adjust=False,ignore_na=False).mean()
df['ewm_65'] = df['close'].ewm(span=65,min_periods=0,adjust=False,ignore_na=False).mean()
df['ewm_200'] = df['close'].ewm(span=200,min_periods=0,adjust=False,ignore_na=False).mean()
print(df[['date','low','close','ewm_9','ewm_15','ewm_65', 'ewm_200','volume']])
At 6:30 AM MarketWatch says the EWM 200 is 155.70, while mine says 161.78 when I use:
df['ewm_200'] = df['close'].ewm(span=200,min_periods=0,adjust=False,ignore_na=False).mean()
Here is my updated raw data, with more than 200 data points by 6:30 AM. My value and MarketWatch's still do not coincide, or even come close to each other: mine now says 159.036336 and MarketWatch is at 155.70.
date low close ewm_9 ewm_15 ewm_65 ewm_200 volume
0 20190214 04:15:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 1
1 20190214 04:20:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
2 20190214 04:25:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
3 20190214 04:30:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
4 20190214 04:35:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
5 20190214 04:40:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
6 20190214 04:45:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
7 20190214 04:50:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
8 20190214 04:55:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
9 20190214 05:00:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
10 20190214 05:05:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
11 20190214 05:10:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
12 20190214 05:15:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
13 20190214 05:20:00 153.51 153.51 153.51 153.510000 153.510000 153.510000 0
14 20190214 05:25:00 153.50 153.50 153.51 153.508750 153.509697 153.509900 1
15 20190214 05:30:00 153.50 153.50 153.51 153.507656 153.509403 153.509802 0
16 20190214 05:35:00 153.50 153.50 153.51 153.506699 153.509118 153.509704 0
17 20190214 05:40:00 153.50 153.50 153.50 153.505862 153.508842 153.509608 0
18 20190214 05:45:00 153.50 153.50 153.50 153.505129 153.508574 153.509512 0
19 20190214 05:50:00 153.50 153.50 153.50 153.504488 153.508314 153.509418 0
20 20190214 05:55:00 153.50 153.50 153.50 153.503927 153.508062 153.509324 0
21 20190214 06:00:00 153.50 153.50 153.50 153.503436 153.507818 153.509231 0
22 20190214 06:05:00 153.50 153.50 153.50 153.503007 153.507581 153.509139 0
23 20190214 06:10:00 153.50 153.50 153.50 153.502631 153.507351 153.509048 0
24 20190214 06:15:00 153.50 153.50 153.50 153.502302 153.507128 153.508958 0
25 20190214 06:20:00 153.70 153.70 153.54 153.527014 153.512973 153.510859 1
26 20190214 06:25:00 153.70 153.70 153.57 153.548637 153.518641 153.512741 0
27 20190214 06:30:00 153.70 153.70 153.60 153.567558 153.524136 153.514605 0
28 20190214 06:35:00 153.70 153.70 153.62 153.584113 153.529465 153.516449 0
29 20190214 06:40:00 153.70 153.70 153.63 153.598599 153.534633 153.518276 0
30 20190214 06:45:00 153.70 153.70 153.65 153.611274 153.539644 153.520084 0
31 20190214 06:50:00 153.70 153.70 153.66 153.622365 153.544504 153.521874 0
32 20190214 06:55:00 153.70 153.70 153.67 153.632069 153.549216 153.523646 0
33 20190214 07:00:00 153.81 154.05 153.74 153.684311 153.564391 153.528884 31
34 20190214 07:05:00 154.00 154.00 153.79 153.723772 153.577591 153.533572 3
35 20190214 07:10:00 154.07 154.37 153.91 153.804550 153.601604 153.541894 19
36 20190214 07:15:00 154.16 154.20 153.97 153.853981 153.619737 153.548442 15
37 20190214 07:20:00 154.28 154.56 154.09 153.942234 153.648230 153.558508 17
38 20190214 07:25:00 154.47 154.48 154.16 154.009455 153.673435 153.567677 5
39 20190214 07:30:00 154.43 154.43 154.22 154.062023 153.696361 153.576257 2
40 20190214 07:35:00 154.50 154.50 154.27 154.116770 153.720714 153.585449 30
41 20190214 07:40:00 154.50 154.50 154.32 154.164674 153.744328 153.594549 10
42 20190214 07:45:00 154.37 154.45 154.35 154.200339 153.765712 153.603061 20
43 20190214 07:50:00 154.35 154.35 154.35 154.219047 153.783418 153.610493 13
44 20190214 07:55:00 154.26 154.30 154.34 154.229166 153.799072 153.617354 29
45 20190214 08:00:00 153.90 154.17 154.30 154.221770 153.810312 153.622853 122
46 20190214 08:05:00 154.24 154.40 154.32 154.244049 153.828182 153.630585 26
47 20190214 08:10:00 154.37 154.43 154.34 154.267293 153.846419 153.638540 29
48 20190214 08:15:00 154.49 154.50 154.38 154.296381 153.866224 153.647111 6
49 20190214 08:20:00 154.45 154.45 154.39 154.315584 153.883914 153.655100 16
50 20190214 08:25:00 154.25 154.26 154.36 154.308636 153.895311 153.661119 18
51 20190214 08:30:00 153.97 154.02 154.30 154.272556 153.899089 153.664690 168
52 20190214 08:35:00 153.71 153.97 154.23 154.234737 153.901238 153.667728 165
53 20190214 08:40:00 153.50 153.51 154.09 154.144145 153.889382 153.666159 114
54 20190214 08:45:00 153.33 153.34 153.94 154.043627 153.872734 153.662913 57
55 20190214 08:50:00 153.40 153.71 153.89 154.001923 153.867803 153.663382 43
56 20190214 08:55:00 153.04 153.19 153.75 153.900433 153.847264 153.658672 142
57 20190214 09:00:00 152.25 152.51 153.50 153.726629 153.806740 153.647242 48
58 20190214 09:05:00 152.50 152.93 153.39 153.627050 153.780173 153.640105 139
59 20190214 09:10:00 152.55 152.75 153.26 153.517419 153.748955 153.631249 69
60 20190214 09:15:00 152.68 152.90 153.19 153.440242 153.723229 153.623972 82
61 20190214 09:20:00 152.50 152.50 153.05 153.322711 153.686162 153.612789 29
62 20190214 09:25:00 152.50 152.97 153.03 153.278622 153.664460 153.606393 21
63 20190214 09:30:00 151.10 152.30 152.89 153.156295 153.623113 153.593394 5143
64 20190214 09:35:00 151.97 153.76 153.06 153.231758 153.627261 153.595052 5437
65 20190214 09:40:00 153.12 153.39 153.13 153.251538 153.620071 153.593011 4097
66 20190214 09:45:00 152.75 153.71 153.24 153.308846 153.622796 153.594175 3310
67 20190214 09:50:00 153.36 154.53 153.50 153.461490 153.650287 153.603487 3637
68 20190214 09:55:00 154.39 155.29 153.86 153.690054 153.699975 153.620268 5561
69 20190214 10:00:00 154.96 155.12 154.11 153.868797 153.743006 153.635191 3372
70 20190214 10:05:00 154.75 154.76 154.24 153.980197 153.773824 153.646383 2327
71 20190214 10:10:00 154.47 154.58 154.31 154.055173 153.798254 153.655673 2215
72 20190214 10:15:00 154.15 154.49 154.35 154.109526 153.819216 153.663975 2565
73 20190214 10:20:00 154.21 154.33 154.34 154.137085 153.834694 153.670602 2410
74 20190214 10:25:00 153.75 154.10 154.29 154.132450 153.842734 153.674874 2822
75 20190214 10:30:00 153.77 153.91 154.22 154.104644 153.844772 153.677214 2188
76 20190214 10:35:00 152.31 152.57 153.89 153.912813 153.806143 153.666197 3800
77 20190214 10:40:00 152.56 153.11 153.73 153.812461 153.785048 153.660663 2184
78 20190214 10:45:00 153.10 154.26 153.84 153.868404 153.799440 153.666626 2952
79 20190214 10:50:00 153.82 153.85 153.84 153.866103 153.800972 153.668451 1483
80 20190214 10:55:00 153.46 153.98 153.87 153.880340 153.806397 153.671551 1842
81 20190214 11:00:00 153.32 153.46 153.79 153.827798 153.795900 153.669446 1386
82 20190214 11:05:00 153.33 153.84 153.80 153.829323 153.797237 153.671143 963
83 20190214 11:10:00 153.55 153.81 153.80 153.826908 153.797623 153.672524 1048
84 20190214 11:15:00 153.80 153.95 153.83 153.842294 153.802241 153.675285 1344
85 20190214 11:20:00 153.79 153.88 153.84 153.847007 153.804597 153.677322 859
86 20190214 11:25:00 153.61 153.75 153.82 153.834882 153.802943 153.678046 731
87 20190214 11:30:00 153.35 153.38 153.73 153.778021 153.790126 153.675080 1137
88 20190214 11:35:00 153.37 153.60 153.71 153.755769 153.784365 153.674333 814
89 20190214 11:40:00 153.40 153.42 153.65 153.713798 153.773324 153.671802 954
90 20190214 11:45:00 153.36 153.60 153.64 153.699573 153.768071 153.671088 1370
91 20190214 11:50:00 153.27 153.29 153.57 153.648376 153.753584 153.667296 2310
92 20190214 11:55:00 153.16 153.40 153.54 153.617329 153.742870 153.664636 727
93 20190214 12:00:00 153.13 153.37 153.50 153.586413 153.731571 153.661704 928
94 20190214 12:05:00 153.36 153.55 153.51 153.581861 153.726068 153.660593 721
95 20190214 12:10:00 153.35 153.53 153.52 153.575379 153.720127 153.659293 840
96 20190214 12:15:00 153.52 154.03 153.62 153.632206 153.729517 153.662982 1047
97 20190214 12:20:00 153.91 154.20 153.73 153.703181 153.743774 153.668326 961
98 20190214 12:25:00 154.05 154.34 153.86 153.782783 153.761842 153.675009 872
99 20190214 12:30:00 154.25 154.49 153.98 153.871185 153.783907 153.683118 1223
100 20190214 12:35:00 154.41 154.53 154.09 153.953537 153.806516 153.691545 1276
101 20190214 12:40:00 154.51 154.70 154.21 154.046845 153.833591 153.701579 942
102 20190214 12:45:00 154.63 154.66 154.30 154.123489 153.858634 153.711116 957
103 20190214 12:50:00 154.55 154.60 154.36 154.183053 153.881099 153.719960 687
104 20190214 12:55:00 154.61 154.84 154.46 154.265171 153.910157 153.731105 1217
105 20190214 13:00:00 154.70 154.71 154.51 154.320775 153.934395 153.740845 1945
106 20190214 13:05:00 154.52 154.74 154.55 154.373178 153.958807 153.750787 768
107 20190214 13:10:00 154.51 154.63 154.57 154.405281 153.979146 153.759536 819
108 20190214 13:15:00 154.51 154.59 154.57 154.428371 153.997657 153.767799 674
109 20190214 13:20:00 154.27 154.41 154.54 154.426074 154.010152 153.774189 708
110 20190214 13:25:00 154.28 154.46 154.52 154.430315 154.023784 153.781013 727
111 20190214 13:30:00 154.29 154.35 154.49 154.420276 154.033669 153.786675 814
112 20190214 13:35:00 154.27 154.35 154.46 154.411491 154.043255 153.792280 859
113 20190214 13:40:00 154.30 154.44 154.46 154.415055 154.055278 153.798725 632
114 20190214 13:45:00 154.37 154.46 154.46 154.420673 154.067542 153.805305 1216
115 20190214 13:50:00 154.43 154.45 154.46 154.424339 154.079132 153.811719 709
116 20190214 13:55:00 154.34 154.38 154.44 154.418797 154.088249 153.817374 1004
117 20190214 14:00:00 153.99 154.06 154.36 154.373947 154.087393 153.819788 907
118 20190214 14:05:00 154.05 154.16 154.32 154.347204 154.089593 153.823173 626
119 20190214 14:10:00 154.04 154.12 154.28 154.318803 154.090514 153.826127 521
120 20190214 14:15:00 154.09 154.16 154.26 154.298953 154.092620 153.829449 659
121 20190214 14:20:00 154.11 154.35 154.28 154.305334 154.100419 153.834629 737
122 20190214 14:25:00 154.24 154.28 154.28 154.302167 154.105861 153.839060 1391
123 20190214 14:30:00 153.98 154.32 154.29 154.304396 154.112350 153.843846 1289
124 20190214 14:35:00 154.16 154.22 154.27 154.293847 154.115612 153.847588 692
125 20190214 14:40:00 154.05 154.18 154.25 154.279616 154.117564 153.850896 883
126 20190214 14:45:00 153.98 154.04 154.21 154.249664 154.115213 153.852778 792
127 20190214 14:50:00 153.83 154.03 154.18 154.222206 154.112631 153.854541 1201
128 20190214 14:55:00 153.75 153.87 154.11 154.178180 154.105278 153.854695 1094
129 20190214 15:00:00 153.84 153.97 154.09 154.152158 154.101179 153.855842 1212
130 20190214 15:05:00 153.94 154.28 154.12 154.168138 154.106598 153.860063 2079
131 20190214 15:10:00 153.91 154.06 154.11 154.154621 154.105186 153.862052 1125
132 20190214 15:15:00 154.05 154.15 154.12 154.154043 154.106544 153.864917 1284
133 20190214 15:20:00 154.01 154.02 154.10 154.137288 154.103921 153.866460 1073
134 20190214 15:25:00 153.95 154.08 154.10 154.130127 154.103196 153.868585 2261
135 20190214 15:30:00 153.92 153.92 154.06 154.103861 154.097645 153.869097 1736
136 20190214 15:35:00 153.83 154.05 154.06 154.097128 154.096201 153.870897 1809
137 20190214 15:40:00 154.00 154.10 154.07 154.097487 154.096316 153.873176 2143
138 20190214 15:45:00 154.01 154.25 154.10 154.116551 154.100973 153.876926 2935
139 20190214 15:50:00 154.23 154.38 154.16 154.149482 154.109429 153.881932 4572
140 20190214 15:55:00 154.31 154.53 154.23 154.197047 154.122173 153.888380 5166
141 20190214 16:00:00 154.29 154.30 154.25 154.209916 154.127562 153.892476 4169
142 20190214 16:05:00 154.13 154.49 154.30 154.244927 154.138545 153.898421 174
143 20190214 16:10:00 154.01 154.40 154.32 154.264311 154.146468 153.903412 109
144 20190214 16:15:00 154.30 154.95 154.44 154.350022 154.170817 153.913826 157
145 20190214 16:20:00 154.50 163.74 156.30 155.523769 154.460793 154.011599 7078
146 20190214 16:25:00 163.00 168.55 158.75 157.152048 154.887738 154.156260 6894
147 20190214 16:30:00 166.67 166.86 160.37 158.365542 155.250534 154.282665 3482
148 20190214 16:35:00 166.21 167.03 161.70 159.448599 155.607488 154.409504 1863
149 20190214 16:40:00 165.10 166.00 162.56 160.267524 155.922412 154.524832 2184
150 20190214 16:45:00 165.13 166.00 163.25 160.984084 156.227794 154.639013 1633
151 20190214 16:50:00 165.50 166.29 163.86 161.647323 156.532709 154.754943 1110
152 20190214 16:55:00 165.70 165.90 164.27 162.178908 156.816566 154.865839 347
153 20190214 17:00:00 165.80 166.25 164.66 162.687794 157.102428 154.979115 1028
154 20190214 17:05:00 166.30 167.12 165.15 163.241820 157.405991 155.099920 491
155 20190214 17:10:00 167.00 167.40 165.60 163.761593 157.708840 155.222308 935
156 20190214 17:15:00 166.60 166.95 165.87 164.160144 157.988875 155.339002 571
157 20190214 17:20:00 166.76 167.40 166.18 164.565126 158.274060 155.459012 289
158 20190214 17:25:00 167.10 167.45 166.43 164.925735 158.552119 155.578325 601
159 20190214 17:30:00 166.75 166.85 166.52 165.166268 158.803570 155.690481 307
160 20190214 17:35:00 166.80 167.00 166.61 165.395485 159.051947 155.803014 297
161 20190214 17:40:00 167.00 167.25 166.74 165.627299 159.300373 155.916914 307
162 20190214 17:45:00 167.25 167.94 166.98 165.916387 159.562179 156.036547 1285
163 20190214 17:50:00 167.25 167.35 167.05 166.095588 159.798174 156.149118 501
164 20190214 17:55:00 166.59 166.80 167.00 166.183640 160.010351 156.255097 531
165 20190214 18:00:00 166.30 166.45 166.89 166.216935 160.205491 156.356539 477
166 20190214 18:05:00 166.20 166.21 166.76 166.216068 160.387446 156.454583 289
167 20190214 18:10:00 165.31 165.35 166.47 166.107809 160.537827 156.543095 619
168 20190214 18:15:00 164.66 165.21 166.22 165.995583 160.679408 156.629333 679
169 20190214 18:20:00 165.01 165.04 165.99 165.876135 160.811547 156.713021 383
170 20190214 18:25:00 164.25 164.42 165.67 165.694118 160.920894 156.789707 620
171 20190214 18:30:00 164.30 164.30 165.40 165.519854 161.023291 156.864437 259
172 20190214 18:35:00 163.60 163.74 165.07 165.297372 161.105616 156.932850 433
173 20190214 18:40:00 163.68 164.08 164.87 165.145200 161.195748 157.003966 218
174 20190214 18:45:00 163.76 163.80 164.66 164.977050 161.274665 157.071588 137
175 20190214 18:50:00 163.80 163.98 164.52 164.852419 161.356645 157.140329 186
176 20190214 18:55:00 163.81 164.02 164.42 164.748367 161.437353 157.208783 309
177 20190214 19:00:00 163.80 163.85 164.31 164.636071 161.510463 157.274865 69
178 20190214 19:05:00 163.80 163.80 164.20 164.531562 161.579843 157.339792 207
179 20190214 19:10:00 163.70 163.75 164.11 164.433867 161.645605 157.403575 125
180 20190214 19:15:00 163.41 163.50 163.99 164.317133 161.701799 157.464236 80
181 20190214 19:20:00 163.16 163.50 163.89 164.214992 161.756290 157.524293 166
182 20190214 19:25:00 163.17 163.30 163.77 164.100618 161.803069 157.581763 160
183 20190214 19:30:00 163.07 163.07 163.63 163.971791 161.841461 157.636372 233
184 20190214 19:35:00 162.91 163.01 163.51 163.851567 161.876871 157.689841 309
185 20190214 19:40:00 162.84 162.90 163.39 163.732621 161.907875 157.741684 130
186 20190214 19:45:00 162.58 162.58 163.23 163.588543 161.928243 157.789826 63
187 20190214 19:50:00 162.20 162.34 163.05 163.432475 161.940720 157.835101 268
188 20190214 19:55:00 162.29 162.75 162.99 163.347166 161.965244 157.884006 509
189 20190215 04:00:00 160.86 161.62 162.72 163.131270 161.954782 157.921180 14
190 20190215 04:05:00 161.62 162.25 162.62 163.021111 161.963728 157.964253 36
191 20190215 04:10:00 161.88 161.99 162.50 162.892222 161.964524 158.004310 17
192 20190215 04:15:00 162.00 162.00 162.40 162.780695 161.965599 158.044068 1
193 20190215 04:20:00 161.06 161.06 162.13 162.565608 161.938157 158.074077 7
194 20190215 04:25:00 160.93 161.05 161.91 162.376157 161.911243 158.103689 25
195 20190215 04:30:00 161.05 161.05 161.74 162.210387 161.885145 158.133005 0
196 20190215 04:35:00 161.05 161.05 161.60 162.065339 161.859837 158.162030 0
197 20190215 04:40:00 161.16 161.16 161.51 161.952171 161.838630 158.191860 1
198 20190215 04:45:00 161.40 162.08 161.63 161.968150 161.845944 158.230548 49
199 20190215 04:50:00 162.18 162.18 161.74 161.994631 161.856067 158.269846 1
200 20190215 04:55:00 162.00 162.00 161.79 161.995302 161.860429 158.306962 1
201 20190215 05:00:00 162.00 162.25 161.88 162.027140 161.872234 158.346197 8
202 20190215 05:05:00 162.05 162.50 162.01 162.086247 161.891257 158.387528 24
203 20190215 05:10:00 162.38 162.38 162.08 162.122966 161.906067 158.427254 1
204 20190215 05:15:00 162.35 162.35 162.13 162.151345 161.919520 158.466286 1
205 20190215 05:20:00 162.21 162.21 162.15 162.158677 161.928322 158.503537 43
206 20190215 05:25:00 162.22 162.22 162.16 162.166343 161.937161 158.540517 4
207 20190215 05:30:00 162.30 162.30 162.19 162.183050 161.948156 158.577925 2
208 20190215 05:35:00 162.30 162.30 162.21 162.197669 161.958818 158.614960 2
209 20190215 05:40:00 162.33 162.33 162.24 162.214210 161.970066 158.651926 1
210 20190215 05:45:00 162.34 162.68 162.32 162.272434 161.991579 158.692006 9
211 20190215 05:50:00 162.60 162.65 162.39 162.319630 162.011531 158.731389 9
212 20190215 05:55:00 162.64 162.65 162.44 162.360926 162.030879 158.770380 5
213 20190215 06:00:00 162.00 162.00 162.35 162.315810 162.029943 158.802516 92
214 20190215 06:05:00 162.24 162.24 162.33 162.306334 162.036309 158.836720 2
215 20190215 06:10:00 162.20 162.25 162.31 162.299292 162.042784 158.870683 4
216 20190215 06:15:00 162.23 162.77 162.41 162.358131 162.064821 158.909482 87
217 20190215 06:20:00 163.00 163.00 162.52 162.438364 162.093160 158.950184 28
218 20190215 06:25:00 163.10 163.10 162.64 162.521069 162.123670 158.991475 1
219 20190215 06:30:00 163.40 163.50 162.81 162.643435 162.165377 159.036336 23
220 20190215 06:35:00 163.50 163.50 162.95 162.750506 162.205820 159.080751 0
221 20190215 06:40:00 163.25 163.25 163.01 162.812943 162.237462 159.122236 15
222 20190215 06:45:00 163.35 163.35 163.08 162.880075 162.271175 159.164303 2
223 20190215 06:50:00 163.62 163.64 163.19 162.975065 162.312655 159.208837 5
224 20190215 06:55:00 163.05 163.39 163.23 163.026932 162.345302 159.250441 5
225 20190215 07:00:00 161.77 161.77 162.94 162.869816 162.327868 159.275511 257
226 20190215 07:05:00 161.70 162.40 162.83 162.811089 162.330054 159.306601 228
227 20190215 07:10:00 162.19 162.37 162.74 162.755953 162.331264 159.337082 79
228 20190215 07:15:00 162.35 163.25 162.84 162.817709 162.359105 159.376017 183
229 20190215 07:20:00 163.01 163.01 162.87 162.841745 162.378829 159.412176 28
230 20190215 07:25:00 163.05 163.35 162.97 162.905277 162.408258 159.451358 171
231 20190215 07:30:00 163.20 163.20 163.02 162.942117 162.432251 159.488658 134
232 20190215 07:35:00 163.00 163.21 163.05 162.975603 162.455819 159.525686 147
233 20190215 07:40:00 163.30 163.35 163.11 163.022402 162.482915 159.563739 90
234 20190215 07:45:00 163.36 163.75 163.24 163.113352 162.521312 159.605394 54
235 20190215 07:50:00 163.90 164.49 163.49 163.285433 162.580969 159.653997 194
236 20190215 07:55:00 164.40 164.80 163.75 163.474754 162.648212 159.705201 240
237 20190215 08:00:00 161.70 164.32 163.87 163.580410 162.698873 159.751119 1167
238 20190215 08:05:00 163.65 163.79 163.85 163.606608 162.731937 159.791307 216
239 20190215 08:10:00 163.35 163.67 163.81 163.614532 162.760363 159.829901 153
If that is all the data you have, you are calculating a 200-period EWM on only 30 samples, so you certainly won't get the same results.
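To see why the amount of history matters: with adjust=False the average is seeded with the first observation, and after n observations that seed still carries a weight of (1 - alpha)^(n - 1), where alpha = 2 / (span + 1). The sketch below uses a synthetic random walk (made-up prices, not the feed from the question) and compares an EWM 200 computed over a long history with one computed over only the last 220 bars. The helper seed_weight is a hypothetical name introduced here.

```python
import numpy as np
import pandas as pd


def seed_weight(span, n_obs):
    """Weight still carried by the first observation after n_obs bars (adjust=False)."""
    alpha = 2.0 / (span + 1)
    return (1.0 - alpha) ** (n_obs - 1)


# Synthetic prices for illustration only.
rng = np.random.default_rng(0)
prices = pd.Series(150 + rng.normal(0, 0.5, 5000).cumsum())

# Same span, different amounts of history.
full = prices.ewm(span=200, adjust=False).mean()
short = prices.iloc[-220:].ewm(span=200, adjust=False).mean()

print("EWM 200 on 5000 bars:", full.iloc[-1])
print("EWM 200 on  220 bars:", short.iloc[-1])
print("weight left on the first bar after 220 bars:", seed_weight(200, 220))
```

After 220 bars roughly 11% of the result is still the very first price, so two series that end on the same bars but start at different points will usually disagree. A charting site presumably computes its average over a much longer history, which would be one plausible source of the gap, though that is an assumption about their method.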
I have some time series data as:
import pandas as pd
index = pd.date_range('06/01/2014',periods=24*30,freq='H')
df1 = pd.DataFrame(range(len(index)),index=index)
Now I want to select the data for the dates below:
selec_dates = ['2014-06-10','2014-06-15','2014-06-20']
I tried the following statement, but it is not working:
sub_data = df1.loc[df1.index.isin(pd.to_datetime(selec_dates))]
What am I doing wrong? Is there another approach to subset the data for the selected days?
You need to compare dates, and to test membership use numpy.in1d (numpy.isin in recent NumPy versions, where in1d is deprecated):
import numpy as np
sub_data = df1.loc[np.in1d(df1.index.date, pd.to_datetime(selec_dates).date)]
print (sub_data)
a
2014-06-10 00:00:00 216
2014-06-10 01:00:00 217
2014-06-10 02:00:00 218
2014-06-10 03:00:00 219
2014-06-10 04:00:00 220
2014-06-10 05:00:00 221
2014-06-10 06:00:00 222
2014-06-10 07:00:00 223
2014-06-10 08:00:00 224
2014-06-10 09:00:00 225
2014-06-10 10:00:00 226
...
If you want to use isin, you need to create a Series with the same index:
sub_data = df1.loc[pd.Series(df1.index.date, index=df1.index)
.isin(pd.to_datetime(selec_dates).date)]
print (sub_data)
a
2014-06-10 00:00:00 216
2014-06-10 01:00:00 217
2014-06-10 02:00:00 218
2014-06-10 03:00:00 219
2014-06-10 04:00:00 220
2014-06-10 05:00:00 221
2014-06-10 06:00:00 222
2014-06-10 07:00:00 223
2014-06-10 08:00:00 224
2014-06-10 09:00:00 225
2014-06-10 10:00:00 226
2014-06-10 11:00:00 227
...
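The attempt in the question matches only the midnight rows because isin compares full timestamps, and pd.to_datetime('2014-06-10') is 2014-06-10 00:00:00. A variation on the same idea, as a sketch: strip the time part with DatetimeIndex.normalize() and call isin directly on the index. The column name a is chosen to match the output shown above, and the hourly frequency is passed as a Timedelta only to avoid depending on a frequency alias.

```python
import pandas as pd

# Hourly data for June 2014, as in the question.
index = pd.date_range("2014-06-01", periods=24 * 30, freq=pd.Timedelta(hours=1))
df1 = pd.DataFrame({"a": range(len(index))}, index=index)
selec_dates = ["2014-06-10", "2014-06-15", "2014-06-20"]

# normalize() sets every timestamp to midnight, so whole days can be compared.
mask = df1.index.normalize().isin(pd.to_datetime(selec_dates))
sub_data = df1.loc[mask]
print(sub_data)

# For comparison: without normalize() only the three midnight rows match.
exact = df1.loc[df1.index.isin(pd.to_datetime(selec_dates))]
print(len(exact))
```

Because the comparison uses full dates rather than the day of the month, it also behaves correctly when the index spans several months or years.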
Sorry, I misunderstood your question.
df1[pd.Series(df1.index.date, index=df1.index).isin(pd.to_datetime(selec_dates).date)]
This should do what is needed.
Original answer:
Please check the pandas documentation on selection.
You can simply do:
sub_data = df1.loc[pd.to_datetime(selec_dates)]
You can use the .query() method:
In [202]: df1.query('@index.normalize() in @selec_dates')
Out[202]:
0
2014-06-10 00:00:00 216
2014-06-10 01:00:00 217
2014-06-10 02:00:00 218
2014-06-10 03:00:00 219
2014-06-10 04:00:00 220
2014-06-10 05:00:00 221
2014-06-10 06:00:00 222
2014-06-10 07:00:00 223
2014-06-10 08:00:00 224
2014-06-10 09:00:00 225
... ...
2014-06-20 14:00:00 470
2014-06-20 15:00:00 471
2014-06-20 16:00:00 472
2014-06-20 17:00:00 473
2014-06-20 18:00:00 474
2014-06-20 19:00:00 475
2014-06-20 20:00:00 476
2014-06-20 21:00:00 477
2014-06-20 22:00:00 478
2014-06-20 23:00:00 479
[72 rows x 1 columns]
Edit: I have been made aware that this only works if you are working with a date range in the same month and year as in your query. For a more general (and better) answer, see @jezrael's solution.
You can use np.in1d and .day on your index if you want to do it the way you tried:
selec_dates = ['2014-06-10','2014-06-15','2014-06-20']
df1.loc[np.in1d(df1.index.day, (pd.to_datetime(selec_dates).day))]
This gives what you require:
2014-06-10 00:00:00 216
2014-06-10 01:00:00 217
2014-06-10 02:00:00 218
2014-06-10 03:00:00 219
2014-06-10 04:00:00 220
2014-06-10 05:00:00 221
2014-06-10 06:00:00 222
2014-06-10 07:00:00 223
2014-06-10 08:00:00 224
2014-06-10 09:00:00 225
2014-06-10 10:00:00 226
2014-06-10 11:00:00 227
2014-06-10 12:00:00 228
2014-06-10 13:00:00 229
2014-06-10 14:00:00 230
2014-06-10 15:00:00 231
2014-06-10 16:00:00 232
2014-06-10 17:00:00 233
2014-06-10 18:00:00 234
2014-06-10 19:00:00 235
2014-06-10 20:00:00 236
2014-06-10 21:00:00 237
2014-06-10 22:00:00 238
2014-06-10 23:00:00 239
2014-06-15 00:00:00 336
2014-06-15 01:00:00 337
2014-06-15 02:00:00 338
2014-06-15 03:00:00 339
2014-06-15 04:00:00 340
2014-06-15 05:00:00 341
...
2014-06-15 18:00:00 354
2014-06-15 19:00:00 355
2014-06-15 20:00:00 356
2014-06-15 21:00:00 357
2014-06-15 22:00:00 358
2014-06-15 23:00:00 359
2014-06-20 00:00:00 456
2014-06-20 01:00:00 457
2014-06-20 02:00:00 458
2014-06-20 03:00:00 459
2014-06-20 04:00:00 460
2014-06-20 05:00:00 461
2014-06-20 06:00:00 462
2014-06-20 07:00:00 463
2014-06-20 08:00:00 464
2014-06-20 09:00:00 465
2014-06-20 10:00:00 466
2014-06-20 11:00:00 467
2014-06-20 12:00:00 468
2014-06-20 13:00:00 469
2014-06-20 14:00:00 470
2014-06-20 15:00:00 471
2014-06-20 16:00:00 472
2014-06-20 17:00:00 473
2014-06-20 18:00:00 474
2014-06-20 19:00:00 475
2014-06-20 20:00:00 476
2014-06-20 21:00:00 477
2014-06-20 22:00:00 478
2014-06-20 23:00:00 479
[72 rows x 1 columns]
I used these sources for this answer:
- Selecting a subset of a Pandas DataFrame indexed by DatetimeIndex with a list of TimeStamps
- In Python-Pandas, How can I subset a dataframe by specific datetime index values?
- return pandas DF column with the number of days elapsed between index and today's date
- Get weekday/day-of-week for Datetime column of DataFrame
- https://stackoverflow.com/a/36893416/2254228
Use the string representation of the date, leaving out the time of day. Row selection by a partial date string goes through .loc:
pd.concat([df1.loc['2014-06-10'], df1.loc['2014-06-15'], df1.loc['2014-06-20']])
I am trying to set the x-axis tick labels to the year but draw the gridlines at the fiscal quarters. The data is quite simple, just a groupby date count (see below). Each date has a count, and I am plotting it as a line plot.
rc[(rc['form']=='Bakken')&(rc['tgt']=='oil')].groupby(['date']).date.count()
date count
2010-01-08 65
2010-01-15 68
2010-01-22 73
2010-01-29 76
2010-02-05 79
2010-02-12 76
2010-02-19 79
2010-02-26 83
2010-03-05 81
2010-03-12 83
2010-03-19 80
2010-03-26 87
2010-04-02 84
2010-04-09 87
2010-04-16 87
2010-04-23 91
2010-04-30 86
2010-05-07 92
2010-05-14 95
2010-05-21 91
2010-05-28 100
2010-06-04 96
2010-06-11 101
2010-06-18 100
2010-06-25 113
2010-07-02 112
2010-07-09 119
2010-07-16 121
2010-07-23 119
2010-07-30 115
2010-08-06 115
2010-08-13 114
2010-08-20 111
2010-08-27 114
2010-09-03 121
2010-09-10 128
2010-09-17 121
2010-09-24 118
2010-10-01 109
2010-10-08 120
2010-10-15 122
2010-10-22 120
2010-10-29 118
2010-11-05 117
2010-11-12 115
2010-11-19 113
2010-11-26 106
2010-12-03 112
2010-12-10 114
2010-12-17 122
2010-12-24 120
2010-12-31 120
2011-01-07 139
2011-01-14 141
2011-01-21 141
2011-01-28 145
2011-02-04 146
2011-02-11 145
2011-02-18 148
2011-02-25 149
2011-03-04 150
2011-03-11 149
2011-03-18 145
2011-03-25 140
2011-04-01 150
2011-04-08 153
2011-04-15 151
2011-04-22 148
2011-04-29 150
2011-05-06 148
2011-05-13 154
2011-05-20 155
2011-05-27 152
2011-06-03 158
2011-06-10 155
2011-06-17 152
2011-06-24 148
2011-07-01 160
2011-07-08 164
2011-07-15 163
2011-07-22 147
2011-07-29 158
2011-08-05 161
2011-08-12 166
2011-08-19 158
2011-08-26 154
2011-09-02 161
2011-09-09 166
2011-09-16 160
2011-09-23 169
2011-09-30 171
2011-10-07 155
2011-10-14 159
2011-10-21 156
2011-10-28 168
2011-11-04 154
2011-11-11 166
2011-11-18 168
2011-11-25 164
2011-12-02 179
2011-12-09 171
2011-12-16 172
2011-12-23 165
2011-12-30 170
2012-01-06 162
2012-01-13 172
2012-01-20 172
2012-01-27 186
2012-02-03 183
2012-02-10 175
2012-02-17 188
2012-02-24 182
2012-03-02 184
2012-03-09 189
2012-03-16 190
2012-03-23 181
2012-03-30 186
2012-04-06 180
2012-04-13 178
2012-04-20 179
2012-04-27 174
2012-05-04 201
2012-05-11 201
2012-05-18 201
2012-05-25 201
2012-06-01 206
2012-06-08 206
2012-06-15 199
2012-06-22 201
2012-06-29 186
2012-07-06 194
2012-07-13 192
2012-07-20 189
2012-07-27 189
2012-08-03 189
2012-08-10 194
2012-08-17 190
2012-08-24 192
2012-08-31 177
2012-09-07 186
2012-09-14 173
2012-09-21 178
2012-09-28 180
2012-10-05 173
2012-10-12 165
2012-10-19 167
2012-10-26 160
2012-11-02 160
2012-11-09 167
2012-11-16 159
2012-11-23 161
2012-11-30 166
2012-12-07 161
2012-12-14 150
2012-12-21 158
2012-12-28 122
2013-01-04 121
2013-01-11 115
2013-01-18 116
2013-01-25 119
2013-02-01 113
2013-02-08 112
2013-02-15 125
2013-02-22 113
2013-03-01 117
2013-03-08 113
2013-03-15 113
2013-03-22 116
2013-03-29 125
2013-04-05 113
2013-04-12 120
2013-04-19 120
2013-04-26 128
2013-05-03 131
2013-05-10 129
2013-05-17 135
2013-05-24 125
2013-05-31 140
2013-06-07 131
2013-06-14 129
2013-06-21 130
2013-06-28 139
2013-07-05 136
2013-07-12 137
2013-07-19 131
2013-07-26 132
2013-08-02 131
2013-08-09 138
2013-08-16 138
2013-08-23 140
2013-08-30 137
2013-09-06 132
2013-09-13 132
2013-09-20 129
2013-09-27 129
2013-10-04 128
2013-10-11 129
2013-10-18 130
2013-10-25 135
2013-11-01 128
2013-11-08 131
2013-11-15 130
2013-11-22 128
2013-11-29 134
2013-12-06 140
2013-12-13 131
2013-12-20 130
2013-12-27 125
2014-01-03 134
2014-01-10 138
2014-01-17 139
2014-01-24 129
2014-01-31 142
2014-02-07 145
2014-02-14 135
2014-02-21 140
2014-02-28 137
2014-03-07 148
2014-03-14 148
2014-03-21 140
2014-03-28 141
2014-04-04 148
2014-04-11 145
2014-04-18 145
2014-04-25 140
2014-05-02 157
2014-05-09 146
2014-05-16 143
2014-05-23 159
2014-05-30 152
2014-06-06 141
2014-06-13 145
2014-06-20 152
2014-06-27 145
2014-07-03 144
2014-07-11 150
2014-07-18 145
2014-07-25 146
2014-08-01 149
2014-08-08 145
2014-08-15 146
2014-08-22 151
2014-08-29 142
2014-09-05 155
2014-09-12 149
2014-09-19 158
2014-09-26 149
2014-10-03 154
2014-10-10 141
2014-10-17 150
2014-10-24 135
2014-10-31 145
2014-11-07 145
2014-11-14 155
2014-11-21 143
2014-11-26 148
2014-12-05 149
2014-12-12 151
2014-12-19 155
2014-12-26 143
2015-01-02 131
2015-01-09 132
2015-01-16 124
2015-01-23 132
2015-01-30 121
2015-02-06 116
2015-02-13 115
2015-02-20 105
2015-02-27 77
2015-03-06 73
2015-03-13 72
2015-03-20 65
2015-03-27 64
2015-04-03 65
2015-04-10 62
2015-04-17 61
2015-04-24 59
2015-05-01 56
2015-05-08 58
2015-05-15 54
2015-05-22 53
2015-05-29 50
2015-06-05 50
2015-06-12 52
2015-06-19 54
2015-06-26 52
2015-07-02 50
2015-07-10 48
2015-07-17 45
2015-07-24 44
2015-07-31 43
2015-08-07 42
2015-08-14 45
2015-08-21 45
2015-08-28 47
2015-09-04 46
2015-09-11 43
2015-09-18 43
2015-09-25 44
2015-10-02 44
2015-10-09 44
2015-10-16 40
2015-10-23 38
2015-10-30 39
2015-11-06 32
2015-11-13 30
2015-11-20 31
2015-11-27 28
2015-12-04 31
2015-12-11 26
2015-12-18 26
2015-12-25 28
2016-01-01 25
2016-01-08 26
2016-01-15 25
2016-01-22 21
2016-01-29 23
2016-02-05 20
2016-02-12 21
2016-02-19 37
2016-02-26 34
2016-03-04 32
2016-03-11 31
2016-03-18 32
2016-03-24 30
2016-04-01 27
2016-04-08 25
2016-04-15 23
2016-04-22 23
lanery pointed to the right place. You need to define your quarters and use them in the same fashion.
Define years
years = ['2009-12-31', '2010-12-31', '2011-12-30', '2012-12-31',
'2013-12-31', '2014-12-31', '2015-12-31']
Define quarters
quarters = ['2009-12-31', '2010-03-31', '2010-06-30', '2010-09-30',
'2010-12-31', '2011-03-31', '2011-06-30', '2011-09-30',
'2011-12-30', '2012-03-30', '2012-06-29', '2012-09-28',
'2012-12-31', '2013-03-29', '2013-06-28', '2013-09-30',
'2013-12-31', '2014-03-31', '2014-06-30', '2014-09-30',
'2014-12-31', '2015-03-31', '2015-06-30', '2015-09-30',
'2015-12-31', '2016-03-31']
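The hand-written dates above appear to be the last business day of each year and quarter. If that is the intent, they could probably be generated instead of typed out. This is a minimal sketch under that assumption; `quarter_labels` and `year_labels` are illustrative names, not part of the original answer.

```python
import pandas as pd

# Assumption: the lists are business year ends and business quarter ends.
# Offset objects are passed directly instead of string aliases.
years = pd.date_range("2009-12-31", "2015-12-31", freq=pd.offsets.BYearEnd())
quarters = pd.date_range("2009-12-31", "2016-03-31", freq=pd.offsets.BQuarterEnd())

# Same string form as the hand-written lists
year_labels = years.strftime("%Y-%m-%d").tolist()
quarter_labels = quarters.strftime("%Y-%m-%d").tolist()

print(year_labels)
print(quarter_labels)
```

Note that business-day offsets only skip weekends, not holidays, so a quarter end that falls on a public holiday is still returned.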
Load the data you supplied
import pandas as pd
import matplotlib.pyplot as plt
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
text = """date count
2010-01-08 65
2010-01-15 68
2010-01-22 73
2010-01-29 76
2010-02-05 79
2010-02-12 76
2010-02-19 79
2010-02-26 83
2010-03-05 81
2010-03-12 83
2010-03-19 80
2010-03-26 87
2010-04-02 84
2010-04-09 87
2010-04-16 87
2010-04-23 91
2010-04-30 86
2010-05-07 92
2010-05-14 95
2010-05-21 91
2010-05-28 100
2010-06-04 96
2010-06-11 101
2010-06-18 100
2010-06-25 113
2010-07-02 112
2010-07-09 119
2010-07-16 121
2010-07-23 119
2010-07-30 115
2010-08-06 115
2010-08-13 114
2010-08-20 111
2010-08-27 114
2010-09-03 121
2010-09-10 128
2010-09-17 121
2010-09-24 118
2010-10-01 109
2010-10-08 120
2010-10-15 122
2010-10-22 120
2010-10-29 118
2010-11-05 117
2010-11-12 115
2010-11-19 113
2010-11-26 106
2010-12-03 112
2010-12-10 114
2010-12-17 122
2010-12-24 120
2010-12-31 120
2011-01-07 139
2011-01-14 141
2011-01-21 141
2011-01-28 145
2011-02-04 146
2011-02-11 145
2011-02-18 148
2011-02-25 149
2011-03-04 150
2011-03-11 149
2011-03-18 145
2011-03-25 140
2011-04-01 150
2011-04-08 153
2011-04-15 151
2011-04-22 148
2011-04-29 150
2011-05-06 148
2011-05-13 154
2011-05-20 155
2011-05-27 152
2011-06-03 158
2011-06-10 155
2011-06-17 152
2011-06-24 148
2011-07-01 160
2011-07-08 164
2011-07-15 163
2011-07-22 147
2011-07-29 158
2011-08-05 161
2011-08-12 166
2011-08-19 158
2011-08-26 154
2011-09-02 161
2011-09-09 166
2011-09-16 160
2011-09-23 169
2011-09-30 171
2011-10-07 155
2011-10-14 159
2011-10-21 156
2011-10-28 168
2011-11-04 154
2011-11-11 166
2011-11-18 168
2011-11-25 164
2011-12-02 179
2011-12-09 171
2011-12-16 172
2011-12-23 165
2011-12-30 170
2012-01-06 162
2012-01-13 172
2012-01-20 172
2012-01-27 186
2012-02-03 183
2012-02-10 175
2012-02-17 188
2012-02-24 182
2012-03-02 184
2012-03-09 189
2012-03-16 190
2012-03-23 181
2012-03-30 186
2012-04-06 180
2012-04-13 178
2012-04-20 179
2012-04-27 174
2012-05-04 201
2012-05-11 201
2012-05-18 201
2012-05-25 201
2012-06-01 206
2012-06-08 206
2012-06-15 199
2012-06-22 201
2012-06-29 186
2012-07-06 194
2012-07-13 192
2012-07-20 189
2012-07-27 189
2012-08-03 189
2012-08-10 194
2012-08-17 190
2012-08-24 192
2012-08-31 177
2012-09-07 186
2012-09-14 173
2012-09-21 178
2012-09-28 180
2012-10-05 173
2012-10-12 165
2012-10-19 167
2012-10-26 160
2012-11-02 160
2012-11-09 167
2012-11-16 159
2012-11-23 161
2012-11-30 166
2012-12-07 161
2012-12-14 150
2012-12-21 158
2012-12-28 122
2013-01-04 121
2013-01-11 115
2013-01-18 116
2013-01-25 119
2013-02-01 113
2013-02-08 112
2013-02-15 125
2013-02-22 113
2013-03-01 117
2013-03-08 113
2013-03-15 113
2013-03-22 116
2013-03-29 125
2013-04-05 113
2013-04-12 120
2013-04-19 120
2013-04-26 128
2013-05-03 131
2013-05-10 129
2013-05-17 135
2013-05-24 125
2013-05-31 140
2013-06-07 131
2013-06-14 129
2013-06-21 130
2013-06-28 139
2013-07-05 136
2013-07-12 137
2013-07-19 131
2013-07-26 132
2013-08-02 131
2013-08-09 138
2013-08-16 138
2013-08-23 140
2013-08-30 137
2013-09-06 132
2013-09-13 132
2013-09-20 129
2013-09-27 129
2013-10-04 128
2013-10-11 129
2013-10-18 130
2013-10-25 135
2013-11-01 128
2013-11-08 131
2013-11-15 130
2013-11-22 128
2013-11-29 134
2013-12-06 140
2013-12-13 131
2013-12-20 130
2013-12-27 125
2014-01-03 134
2014-01-10 138
2014-01-17 139
2014-01-24 129
2014-01-31 142
2014-02-07 145
2014-02-14 135
2014-02-21 140
2014-02-28 137
2014-03-07 148
2014-03-14 148
2014-03-21 140
2014-03-28 141
2014-04-04 148
2014-04-11 145
2014-04-18 145
2014-04-25 140
2014-05-02 157
2014-05-09 146
2014-05-16 143
2014-05-23 159
2014-05-30 152
2014-06-06 141
2014-06-13 145
2014-06-20 152
2014-06-27 145
2014-07-03 144
2014-07-11 150
2014-07-18 145
2014-07-25 146
2014-08-01 149
2014-08-08 145
2014-08-15 146
2014-08-22 151
2014-08-29 142
2014-09-05 155
2014-09-12 149
2014-09-19 158
2014-09-26 149
2014-10-03 154
2014-10-10 141
2014-10-17 150
2014-10-24 135
2014-10-31 145
2014-11-07 145
2014-11-14 155
2014-11-21 143
2014-11-26 148
2014-12-05 149
2014-12-12 151
2014-12-19 155
2014-12-26 143
2015-01-02 131
2015-01-09 132
2015-01-16 124
2015-01-23 132
2015-01-30 121
2015-02-06 116
2015-02-13 115
2015-02-20 105
2015-02-27 77
2015-03-06 73
2015-03-13 72
2015-03-20 65
2015-03-27 64
2015-04-03 65
2015-04-10 62
2015-04-17 61
2015-04-24 59
2015-05-01 56
2015-05-08 58
2015-05-15 54
2015-05-22 53
2015-05-29 50
2015-06-05 50
2015-06-12 52
2015-06-19 54
2015-06-26 52
2015-07-02 50
2015-07-10 48
2015-07-17 45
2015-07-24 44
2015-07-31 43
2015-08-07 42
2015-08-14 45
2015-08-21 45
2015-08-28 47
2015-09-04 46
2015-09-11 43
2015-09-18 43
2015-09-25 44
2015-10-02 44
2015-10-09 44
2015-10-16 40
2015-10-23 38
2015-10-30 39
2015-11-06 32
2015-11-13 30
2015-11-20 31
2015-11-27 28
2015-12-04 31
2015-12-11 26
2015-12-18 26
2015-12-25 28
2016-01-01 25
2016-01-08 26
2016-01-15 25
2016-01-22 21
2016-01-29 23
2016-02-05 20
2016-02-12 21
2016-02-19 37
2016-02-26 34
2016-03-04 32
2016-03-11 31
2016-03-18 32
2016-03-24 30
2016-04-01 27
2016-04-08 25
2016-04-15 23
2016-04-22 23"""
Parse your data
data = pd.read_csv(StringIO(text), index_col=[0], parse_dates=[0], sep=r'\s+')
Use the approach from the question "How to add a grid line at a specific location in matplotlib plot?":
fig, ax = plt.subplots()
ax.set_xticks(quarters, minor=True)
ax.set_xticks(years, minor=False)
ax.xaxis.grid(True, which='minor')
ax.xaxis.grid(False, which='major')
data.plot(ax=ax)
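On recent matplotlib versions, passing date strings straight to `set_xticks` may not land on a date axis as intended, so here is a hedged variant of the same idea that converts the tick positions explicitly and plots with matplotlib directly. It uses only a handful of rows from the data above to stay short. The `Agg` backend and the `which='both'` grid are my own choices: as far as I know, newer matplotlib drops minor ticks that coincide with major ones, so the grid is enabled for both to keep a line at each year end.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
from io import StringIO

# A few rows taken from the data above
text = """date count
2010-01-08 65
2010-02-05 79
2010-03-05 81
2010-04-02 84
2010-05-07 92
2010-06-04 96
2010-07-02 112"""
data = pd.read_csv(StringIO(text), sep=r"\s+", index_col=0, parse_dates=[0])

years = ["2009-12-31", "2010-12-31"]
quarters = ["2009-12-31", "2010-03-31", "2010-06-30", "2010-09-30", "2010-12-31"]

def to_num(dates):
    # Convert date strings to matplotlib's float date representation
    return mdates.date2num(pd.to_datetime(dates).to_pydatetime())

fig, ax = plt.subplots()
ax.plot(data.index, data["count"])           # plot first so the axis uses date units
ax.set_xticks(to_num(years))                 # major ticks at year ends
ax.set_xticks(to_num(quarters), minor=True)  # minor ticks at quarter ends
ax.xaxis.grid(True, which="both")            # grid lines at every quarter, including year ends
```

Call `fig.savefig(...)` or `plt.show()` afterwards, depending on whether you want a file or a window.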
I have a pandas dataframe like this
order_id buyer_id item_id time
537 79 93 2016-01-04 10:20:00
540 191 93 2016-01-04 10:30:00
556 251 82 2016-01-04 13:39:00
589 191 104 2016-01-05 10:59:00
596 251 99 2016-01-05 13:48:00
609 79 106 2016-01-06 10:39:00
611 261 97 2016-01-06 10:50:00
680 64 135 2016-01-11 11:58:00
681 261 133 2016-01-11 12:03:00
682 309 135 2016-01-11 12:08:00
I want to subset this dataframe on date == '2016-01-04'. The datatypes of the df dataframe are:
df.dtypes
Out[1264]:
order_id object
buyer_id object
item_id object
time datetime64[ns]
This is what I am doing in Python:
df[df['time'] == '2016-01-04']
But it returns an empty dataframe. However, when I do
df[df['time'] < '2016-01-05']
it works. Please help.
The problem here is that the comparison is performed as an exact match: '2016-01-04' is interpreted as midnight (00:00:00) on that date, and since none of the times are '00:00:00', no matches occur. You have to compare just the date components for this to work:
In [20]:
df[df['time'].dt.date == pd.to_datetime('2016-01-04').date()]
Out[20]:
order_id buyer_id item_id time
0 537 79 93 2016-01-04 10:20:00
1 540 191 93 2016-01-04 10:30:00
2 556 251 82 2016-01-04 13:39:00
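To make the difference concrete, here is a small self-contained sketch built from the first five rows of the question's data. It contrasts the exact-match comparison with the date-only comparison shown above, and adds `dt.normalize()` as an alternative of my own that keeps the column as datetimes instead of Python `date` objects. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "order_id": ["537", "540", "556", "589", "596"],
    "buyer_id": ["79", "191", "251", "191", "251"],
    "item_id": ["93", "93", "82", "104", "99"],
    "time": pd.to_datetime([
        "2016-01-04 10:20:00", "2016-01-04 10:30:00", "2016-01-04 13:39:00",
        "2016-01-05 10:59:00", "2016-01-05 13:48:00",
    ]),
})

target = pd.Timestamp("2016-01-04")  # midnight on that day

# Exact match against midnight: no row has time 00:00:00, so nothing matches
exact = df[df["time"] == target]

# Compare only the date component, as in the answer above
by_date = df[df["time"].dt.date == target.date()]

# Alternative: reset every timestamp to midnight, then compare
by_norm = df[df["time"].dt.normalize() == target]

print(len(exact), len(by_date), len(by_norm))
```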
IIUC you can use DatetimeIndex Partial String Indexing:
print(df)
order_id buyer_id item_id time
0 537 79 93 2016-01-04 10:20:00
1 540 191 93 2016-01-04 10:30:00
2 556 251 82 2016-01-04 13:39:00
3 589 191 104 2016-01-05 10:59:00
4 596 251 99 2016-01-05 13:48:00
5 609 79 106 2016-01-06 10:39:00
6 611 261 97 2016-01-06 10:50:00
7 680 64 135 2016-01-11 11:58:00
8 681 261 133 2016-01-11 12:03:00
9 682 309 135 2016-01-11 12:08:00
df = df.set_index('time')
print(df.loc['2016-01-04'])
order_id buyer_id item_id
time
2016-01-04 10:20:00 537 79 93
2016-01-04 10:30:00 540 191 93
2016-01-04 13:39:00 556 251 82
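A runnable version of the partial string indexing approach, again using a few rows from the question. It goes through `.loc`, since selecting rows with a bare `df['2016-01-04']` is, to my knowledge, no longer supported in recent pandas. `indexed` and `day` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "order_id": ["537", "540", "556", "589", "596"],
    "buyer_id": ["79", "191", "251", "191", "251"],
    "item_id": ["93", "93", "82", "104", "99"],
    "time": pd.to_datetime([
        "2016-01-04 10:20:00", "2016-01-04 10:30:00", "2016-01-04 13:39:00",
        "2016-01-05 10:59:00", "2016-01-05 13:48:00",
    ]),
})

# Move the timestamps into the index so partial string indexing applies
indexed = df.set_index("time")

# A date-only string selects every row that falls on that day
day = indexed.loc["2016-01-04"]
print(day)
```

The same mechanism accepts coarser strings too, for example `indexed.loc["2016-01"]` for the whole month.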