How to plot candlestick skipping empty dates using matplotlib? - python

I'm still new to matplotlib. I have the dataset below for plotting:
Date Open High Low Close
Trade_Date
2018-01-02 736696.0 42.45 42.45 41.45 41.45
2018-01-03 736697.0 41.60 41.70 40.70 40.95
2018-01-04 736698.0 40.90 41.05 40.20 40.25
2018-01-05 736699.0 40.35 41.60 40.35 41.50
2018-01-08 736702.0 40.20 40.20 37.95 38.00
2018-01-09 736703.0 37.15 39.00 37.15 38.00
2018-01-10 736704.0 38.70 38.70 37.15 37.25
2018-01-11 736705.0 37.50 37.50 36.55 36.70
2018-01-12 736706.0 37.00 37.40 36.90 37.20
2018-01-15 736709.0 37.50 37.70 37.15 37.70
2018-01-16 736710.0 37.80 38.25 37.45 37.95
2018-01-17 736711.0 38.00 38.05 37.65 37.75
2018-01-18 736712.0 38.00 38.20 37.70 37.75
2018-01-19 736713.0 36.70 37.10 35.30 36.45
2018-01-22 736716.0 36.25 36.25 35.50 36.10
2018-01-23 736717.0 36.20 36.30 35.65 36.00
2018-01-24 736718.0 35.80 36.00 35.60 36.00
2018-01-25 736719.0 36.10 36.10 35.45 35.45
2018-01-26 736720.0 35.50 35.75 35.00 35.00
2018-01-29 736723.0 34.80 35.00 33.65 33.70
2018-01-30 736724.0 33.70 34.45 33.65 33.90
I've converted the dates to numbers using mdates.date2num.
After that, I tried to plot a candlestick chart with the code below:
f1, ax = plt.subplots(figsize= (10,5))
candlestick_ohlc(ax, ohlc.values, width=.6, colorup='red', colordown='green')
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.show()
However, I'm still getting the graph with gaps.
I've tried the suggested solution from "How do I plot only weekdays using Python's matplotlib candlestick?", but I was not able to solve my problem with it.
Can anyone kindly help me with this issue?
Thanks!
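One common way to avoid the gaps is to plot the candles against consecutive integer positions instead of date numbers, and to map the tick positions back to date strings with a formatter. The sketch below illustrates that idea under some assumptions: it draws the candles with plain matplotlib calls rather than candlestick_ohlc, the helper name plot_candles_without_gaps is hypothetical, and only four rows of the data above are used.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter, MaxNLocator

# Four rows from the question; 2018-01-06 and 2018-01-07 (a weekend) are missing.
ohlc = pd.DataFrame(
    {
        "Open": [40.90, 40.35, 40.20, 37.15],
        "High": [41.05, 41.60, 40.20, 39.00],
        "Low": [40.20, 40.35, 37.95, 37.15],
        "Close": [40.25, 41.50, 38.00, 38.00],
    },
    index=pd.to_datetime(["2018-01-04", "2018-01-05", "2018-01-08", "2018-01-09"]),
)


def plot_candles_without_gaps(ax, ohlc, width=0.6, colorup="red", colordown="green"):
    """Draw candles at integer positions 0..n-1 so missing dates take no space."""
    values = ohlc[["Open", "High", "Low", "Close"]].to_numpy()
    for x, (o, h, l, c) in enumerate(values):
        color = colorup if c >= o else colordown
        ax.vlines(x, l, h, color=color, linewidth=1)  # wick: low to high
        ax.bar(x, abs(c - o), width, bottom=min(o, c), color=color)  # body

    labels = ohlc.index.strftime("%Y-%m-%d")

    def format_date(x, pos=None):
        # Map an integer tick position back to the date of that row.
        i = int(round(x))
        return labels[i] if 0 <= i < len(labels) else ""

    ax.xaxis.set_major_locator(MaxNLocator(integer=True))
    ax.xaxis.set_major_formatter(FuncFormatter(format_date))
    return format_date


fig, ax = plt.subplots(figsize=(10, 5))
format_date = plot_candles_without_gaps(ax, ohlc)
fig.autofmt_xdate()
```

The same idea should carry over to candlestick_ohlc by replacing the date-number column with np.arange(len(ohlc)) and keeping the formatter, although that variant is not tested here.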

Related

how to convert a dictionary to pandas dataframe?

I want this output, which is a dict, to be converted into a pandas DataFrame with a few columns of interest. Note that my actual output is longer; I have posted only part of it, but I hope it is clear what I want.
My DataFrame should have the columns `['Date','Change in OI','Open interest']`. Preferably, Date should be the index.
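from datetime import date
# get_history is assumed to be imported already (its origin is not shown in the question)
strikes=[690,700,710]
data={}
for s in strikes:
    data[s]=get_history(symbol="CIPLA",
                        start=date(2020,7,1),
                        end=date(2020,7,17),  # last date shown in the output below
                        option_type="CE",
                        strike_price=s,
                        expiry_date=date(2020,7,30))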
OUTPUT
{690: Symbol Expiry Option Type Strike Price Open High Low \
Date
2020-07-01 CIPLA 2020-07-30 CE 690.0 11.50 11.50 7.70
2020-07-02 CIPLA 2020-07-30 CE 690.0 8.90 20.90 8.50
2020-07-03 CIPLA 2020-07-30 CE 690.0 17.75 17.75 12.00
2020-07-06 CIPLA 2020-07-30 CE 690.0 11.30 11.30 9.60
2020-07-07 CIPLA 2020-07-30 CE 690.0 10.70 12.25 10.60
2020-07-08 CIPLA 2020-07-30 CE 690.0 12.95 14.10 11.45
2020-07-09 CIPLA 2020-07-30 CE 690.0 14.00 14.00 11.60
2020-07-10 CIPLA 2020-07-30 CE 690.0 12.50 13.00 10.95
2020-07-13 CIPLA 2020-07-30 CE 690.0 11.10 11.65 9.65
2020-07-14 CIPLA 2020-07-30 CE 690.0 10.65 10.70 8.40
2020-07-15 CIPLA 2020-07-30 CE 690.0 7.55 10.00 7.55
2020-07-16 CIPLA 2020-07-30 CE 690.0 11.20 18.65 7.25
2020-07-17 CIPLA 2020-07-30 CE 690.0 18.85 25.75 14.80
Close Last Settle Price Number of Contracts Turnover \
Date
2020-07-01 8.85 8.85 8.85 66 5.995900e+07
2020-07-02 16.85 20.50 16.85 68 6.235500e+07
2020-07-03 13.00 13.25 13.00 117 1.069840e+08
2020-07-06 10.65 10.75 10.65 76 6.918800e+07
2020-07-07 11.00 11.00 11.00 64 5.836300e+07
2020-07-08 11.95 11.95 11.95 84 7.674300e+07
2020-07-09 12.00 12.00 12.00 25 2.284000e+07
2020-07-10 11.10 11.10 11.10 50 4.564100e+07
2020-07-13 10.05 10.05 10.05 36 3.278000e+07
2020-07-14 8.50 8.40 8.50 39 3.546700e+07
2020-07-15 8.45 8.40 8.45 31 2.816200e+07
2020-07-16 17.20 16.80 17.20 803 7.350000e+08
2020-07-17 20.05 19.30 20.05 1708 1.577693e+09
Premium Turnover Open Interest Change in OI Underlying
Date
2020-07-01 757000.0 119600 5200 NaN
2020-07-02 1359000.0 113100 -6500 646.20
2020-07-03 2035000.0 131300 18200 638.80
2020-07-06 1016000.0 123500 -7800 NaN
2020-07-07 955000.0 128700 5200 636.55
2020-07-08 1395000.0 130000 1300 NaN
2020-07-09 415000.0 130000 0 NaN
2020-07-10 791000.0 130000 0 NaN
2020-07-13 488000.0 123500 -6500 NaN
2020-07-14 484000.0 115700 -7800 NaN
2020-07-15 355000.0 124800 9100 NaN
2020-07-16 14709000.0 302900 178100 NaN
2020-07-17 45617000.0 243100 -59800 689.10 }
The output continues in the same way for the other values in strikes (700 and 710).
I have already tried pd.DataFrame.from_dict(data), but it did not help.
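You can try the following approaches:
1. Pass the dictionary directly to the DataFrame constructor:
import pandas as pd
df = pd.DataFrame(<<your dictionary>>) # you can pass the dictionary
2. Alternatively, use from_dict, which also lets you specify column names:
import pandas as pd
cols = [<<list of column names>>] # to specify different column names
# from_dict only accepts columns= together with orient="index"
df = pd.DataFrame.from_dict(<<dictionary name>>, orient="index", columns=cols)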
Pass your dictionary to the DataFrame constructor to convert it into a DataFrame:
import pandas as pd
df = pd.DataFrame(your_dictionary) # pass the dict object itself, not a string
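Since each value in the question's dict already appears to be a DataFrame indexed by Date, one option is to select the columns of interest from each frame and stack the frames with pd.concat. The sketch below is a minimal illustration of that, assuming the column labels match the printed output ("Open Interest", "Change in OI"). The small data dict is a stand-in for the real one: the rows for strike 690 are copied from the output above, while the rows for 700 are placeholder numbers, not real market data.

```python
import pandas as pd

# Stand-in for the dict built in the question: strike -> DataFrame indexed by Date.
dates = pd.DatetimeIndex(["2020-07-01", "2020-07-02"], name="Date")
data = {
    690: pd.DataFrame(
        {"Close": [8.85, 16.85],
         "Open Interest": [119600, 113100],
         "Change in OI": [5200, -6500]},
        index=dates,
    ),
    700: pd.DataFrame(  # placeholder values, for illustration only
        {"Close": [1.0, 2.0],
         "Open Interest": [100, 200],
         "Change in OI": [10, 20]},
        index=dates,
    ),
}

cols = ["Change in OI", "Open Interest"]

# A single strike: Date is already the index, so selecting columns is enough.
df_690 = data[690][cols]

# All strikes stacked into one frame with a (Strike, Date) MultiIndex.
combined = pd.concat(
    {strike: frame[cols] for strike, frame in data.items()},
    names=["Strike", "Date"],
)
print(combined)
```

If a flat table is preferred, combined.reset_index() turns Strike and Date into ordinary columns.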

Creating a temporal range time-series spiral plot

Similarly to this question, I'm interested in creating time series spirals. The solution doesn't necessarily have to be implemented in R or using ggplot, but it seems the majority of solutions have been implemented in R with ggplot, with a handful in Python and one in d3. My attempts so far have all used R. Unlike this question, I'm interested in displaying specific ranges of data without quantizing/binning the data. That is, I'd like to display a spiral timeline showing when particular events start and stop, where theta-min and theta-max of every event represent specific points in time.
Consider this travel data:
trip_start trip_stop dist
2017-04-01 17:42:00 2017-04-01 18:34:00 1.95
2017-04-01 18:42:00 2017-04-01 19:05:00 6.54
2017-04-02 01:09:00 2017-04-02 01:12:00 1.07
2017-04-02 01:22:00 2017-04-02 01:27:00 1.03
2017-04-02 08:17:00 2017-04-02 08:23:00 1.98
2017-04-02 11:23:00 2017-04-02 11:30:00 1.98
2017-04-02 15:44:00 2017-04-02 15:56:00 4.15
2017-04-02 16:29:00 2017-04-02 16:45:00 4.08
2017-04-03 10:24:00 2017-04-03 10:55:00 19.76
2017-04-03 14:01:00 2017-04-03 14:18:00 8.21
2017-04-03 14:25:00 2017-04-03 14:31:00 1.49
2017-04-03 14:45:00 2017-04-03 14:50:00 1.59
2017-04-03 15:44:00 2017-04-03 16:10:00 4.44
2017-04-03 16:14:00 2017-04-03 16:37:00 9.96
2017-04-03 16:40:00 2017-04-03 16:45:00 0.7
2017-04-03 17:15:00 2017-04-03 17:46:00 16.92
2017-04-03 17:56:00 2017-04-03 18:19:00 5.23
2017-04-03 18:42:00 2017-04-03 18:45:00 0.49
2017-04-03 19:02:00 2017-04-03 19:04:00 0.48
2017-04-04 07:24:00 2017-04-04 07:27:00 0.66
2017-04-04 07:30:00 2017-04-04 08:04:00 13.55
2017-04-04 08:32:00 2017-04-04 09:25:00 25.09
2017-04-04 13:32:00 2017-04-04 13:40:00 3.06
2017-04-04 13:52:00 2017-04-04 13:57:00 1.3
2017-04-04 14:55:00 2017-04-04 15:01:00 2.47
2017-04-04 18:40:00 2017-04-04 19:12:00 22.71
2017-04-04 22:16:00 2017-04-04 23:54:00 38.28
2017-04-04 23:59:00 2017-04-05 00:03:00 1.02
2017-04-05 11:04:00 2017-04-05 11:49:00 25.73
2017-04-05 12:05:00 2017-04-05 12:18:00 2.97
2017-04-05 15:19:00 2017-04-05 16:25:00 25.13
2017-04-05 16:38:00 2017-04-05 16:40:00 0.41
2017-04-05 18:58:00 2017-04-05 19:02:00 1.25
2017-04-05 19:13:00 2017-04-05 19:18:00 1.09
2017-04-05 19:25:00 2017-04-05 19:48:00 6.63
2017-04-06 10:01:00 2017-04-06 10:44:00 20.81
2017-04-06 13:22:00 2017-04-06 13:33:00 1.63
2017-04-06 20:58:00 2017-04-06 21:25:00 24.85
2017-04-06 21:32:00 2017-04-06 21:56:00 6.06
2017-04-07 10:55:00 2017-04-07 11:37:00 24.53
2017-04-07 17:14:00 2017-04-07 17:48:00 19.66
2017-04-07 17:57:00 2017-04-07 18:07:00 2.12
2017-04-08 20:57:00 2017-04-08 21:06:00 1.06
2017-04-08 21:23:00 2017-04-08 21:36:00 2.97
2017-04-09 08:14:00 2017-04-09 08:19:00 1.99
2017-04-09 11:40:00 2017-04-09 11:50:00 2.24
2017-04-09 11:50:00 2017-04-09 11:57:00 1.64
2017-04-09 16:29:00 2017-04-09 16:34:00 0.53
2017-04-09 16:43:00 2017-04-09 16:45:00 0.5
2017-04-09 17:46:00 2017-04-09 17:48:00 0.44
2017-04-09 17:53:00 2017-04-09 17:56:00 0.4
2017-04-09 21:33:00 2017-04-09 21:56:00 2.48
2017-04-09 21:57:00 2017-04-09 22:14:00 2.92
2017-04-09 22:22:00 2017-04-09 22:25:00 0.9
2017-04-10 10:37:00 2017-04-10 11:22:00 19.27
2017-04-10 16:12:00 2017-04-10 16:59:00 21.31
2017-04-11 11:14:00 2017-04-11 11:18:00 1.24
2017-04-11 11:21:00 2017-04-11 11:48:00 22.95
2017-04-11 18:24:00 2017-04-11 19:05:00 28.64
2017-04-11 19:21:00 2017-04-11 19:34:00 5.37
2017-04-12 11:00:00 2017-04-12 12:08:00 28.91
2017-04-12 14:03:00 2017-04-12 15:20:00 28.56
2017-04-12 20:24:00 2017-04-12 20:29:00 1.17
2017-04-12 20:32:00 2017-04-12 21:09:00 30.89
2017-04-13 01:37:00 2017-04-13 02:09:00 32.3
2017-04-13 08:08:00 2017-04-13 08:39:00 19.39
2017-04-13 10:53:00 2017-04-13 11:23:00 24.59
2017-04-13 18:56:00 2017-04-13 19:22:00 22.74
2017-04-14 01:06:00 2017-04-14 01:37:00 31.36
2017-04-14 01:48:00 2017-04-14 01:51:00 1.03
2017-04-14 12:08:00 2017-04-14 12:22:00 1.94
2017-04-14 12:29:00 2017-04-14 13:01:00 19.07
2017-04-14 16:17:00 2017-04-14 17:03:00 19.74
2017-04-14 17:05:00 2017-04-14 17:32:00 3.99
2017-04-14 21:57:00 2017-04-14 22:02:00 1.98
2017-04-15 01:46:00 2017-04-15 01:49:00 1.07
2017-04-15 01:56:00 2017-04-15 01:58:00 1.03
2017-04-15 07:13:00 2017-04-15 07:15:00 0.45
2017-04-15 07:19:00 2017-04-15 07:21:00 0.41
2017-04-15 15:54:00 2017-04-15 16:05:00 1.94
2017-04-15 22:23:00 2017-04-15 22:26:00 0.86
2017-04-15 22:46:00 2017-04-15 22:47:00 0.25
2017-04-15 22:51:00 2017-04-15 22:53:00 0.71
2017-04-16 11:35:00 2017-04-16 11:54:00 11.4
2017-04-16 11:58:00 2017-04-16 12:15:00 10.43
2017-04-17 10:44:00 2017-04-17 10:53:00 3.04
2017-04-17 10:55:00 2017-04-17 11:22:00 18.26
2017-04-17 18:09:00 2017-04-17 18:12:00 0.85
2017-04-17 18:21:00 2017-04-17 19:07:00 37.22
2017-04-18 02:07:00 2017-04-18 02:47:00 32.41
2017-04-18 10:55:00 2017-04-18 10:57:00 0.41
2017-04-18 11:02:00 2017-04-18 11:12:00 2.3
2017-04-18 11:15:00 2017-04-18 11:52:00 24.05
2017-04-18 16:59:00 2017-04-18 17:55:00 22.66
2017-04-19 00:46:00 2017-04-19 01:35:00 39.25
2017-04-19 10:57:00 2017-04-19 11:44:00 24.06
2017-04-19 13:23:00 2017-04-19 14:10:00 25.96
2017-04-19 16:21:00 2017-04-19 17:07:00 18.05
2017-04-19 23:32:00 2017-04-20 00:19:00 39.67
2017-04-20 10:47:00 2017-04-20 11:13:00 24.07
2017-04-20 16:21:00 2017-04-20 16:30:00 0.86
2017-04-20 16:36:00 2017-04-20 16:58:00 0.85
2017-04-20 17:41:00 2017-04-20 17:44:00 0.37
2017-04-20 17:49:00 2017-04-20 18:40:00 19.32
2017-04-20 22:22:00 2017-04-20 22:53:00 29.2
2017-04-20 23:07:00 2017-04-20 23:27:00 10.94
2017-04-21 08:29:00 2017-04-21 08:40:00 1.91
2017-04-21 11:30:00 2017-04-21 11:32:00 0.42
2017-04-21 11:38:00 2017-04-21 11:40:00 0.4
2017-04-21 11:42:00 2017-04-21 12:15:00 19.09
2017-04-21 16:50:00 2017-04-21 18:17:00 40.61
2017-04-21 18:55:00 2017-04-21 19:11:00 1.73
2017-04-21 22:20:00 2017-04-21 22:53:00 28.26
2017-04-21 23:01:00 2017-04-21 23:22:00 11.76
2017-04-22 08:56:00 2017-04-22 08:58:00 0.63
2017-04-22 09:04:00 2017-04-22 09:08:00 0.3
2017-04-22 09:12:00 2017-04-22 09:15:00 0.42
2017-04-22 16:48:00 2017-04-22 16:52:00 0.54
2017-04-22 17:06:00 2017-04-22 17:09:00 0.51
2017-04-22 17:10:00 2017-04-22 17:13:00 1.03
2017-04-22 17:22:00 2017-04-22 17:27:00 1.1
2017-04-23 08:13:00 2017-04-23 08:15:00 0.41
2017-04-23 08:19:00 2017-04-23 08:20:00 0.4
2017-04-23 08:21:00 2017-04-23 08:25:00 1.99
2017-04-23 11:41:00 2017-04-23 11:48:00 2.04
2017-04-23 12:35:00 2017-04-23 12:50:00 7.59
2017-04-23 14:08:00 2017-04-23 14:21:00 7.31
2017-04-23 14:33:00 2017-04-23 15:38:00 37.6
2017-04-24 00:26:00 2017-04-24 01:18:00 39.21
2017-04-24 10:24:00 2017-04-24 10:26:00 0.41
2017-04-24 10:31:00 2017-04-24 10:35:00 1.37
2017-04-24 10:38:00 2017-04-24 10:43:00 1.19
2017-04-24 10:49:00 2017-04-24 11:15:00 19.58
2017-04-24 17:13:00 2017-04-24 18:20:00 37.42
2017-04-24 19:02:00 2017-04-24 19:08:00 1.76
2017-04-24 19:49:00 2017-04-24 19:55:00 1.79
2017-04-24 20:41:00 2017-04-24 21:16:00 32.31
2017-04-25 10:53:00 2017-04-25 11:25:00 24.83
2017-04-25 15:15:00 2017-04-25 15:24:00 3.07
2017-04-25 15:30:00 2017-04-25 15:40:00 3.01
2017-04-25 17:34:00 2017-04-25 18:18:00 24.8
2017-04-26 09:59:00 2017-04-26 10:28:00 24.05
2017-04-26 12:56:00 2017-04-26 13:40:00 29.13
2017-04-26 14:37:00 2017-04-26 15:34:00 21
2017-04-27 08:57:00 2017-04-27 10:21:00 40.56
2017-04-27 16:12:00 2017-04-27 16:44:00 9.89
2017-04-27 17:09:00 2017-04-27 18:01:00 17.51
2017-04-28 05:18:00 2017-04-28 06:06:00 39.28
2017-04-28 12:57:00 2017-04-28 13:52:00 35.82
2017-04-28 16:48:00 2017-04-28 18:14:00 39.1
2017-05-01 11:41:00 2017-05-01 12:20:00 18.74
2017-05-01 18:53:00 2017-05-01 19:34:00 37.15
2017-05-01 23:08:00 2017-05-01 23:09:00 0.06
2017-05-01 23:18:00 2017-05-02 00:11:00 38.61
2017-05-02 11:05:00 2017-05-02 11:42:00 24.07
2017-05-02 17:34:00 2017-05-02 18:53:00 26.42
2017-05-03 12:13:00 2017-05-03 12:25:00 3.96
2017-05-03 12:25:00 2017-05-03 12:56:00 21.15
2017-05-03 13:26:00 2017-05-03 13:44:00 3.32
2017-05-03 13:57:00 2017-05-03 14:08:00 3.49
2017-05-03 18:39:00 2017-05-03 19:08:00 24.85
2017-05-03 19:09:00 2017-05-03 19:13:00 0.99
2017-05-03 19:29:00 2017-05-03 19:32:00 0.84
2017-05-04 10:38:00 2017-05-04 11:06:00 24.05
2017-05-04 13:34:00 2017-05-04 14:10:00 1.73
2017-05-04 17:14:00 2017-05-04 18:23:00 24.68
2017-05-05 20:38:00 2017-05-05 20:52:00 2.24
2017-05-06 11:45:00 2017-05-06 12:30:00 20.19
2017-05-06 14:36:00 2017-05-06 15:35:00 14.49
2017-05-06 15:48:00 2017-05-06 16:17:00 5.25
2017-05-06 17:11:00 2017-05-06 17:13:00 0.43
2017-05-06 17:19:00 2017-05-06 17:21:00 0.43
2017-05-07 08:16:00 2017-05-07 08:22:00 3.27
2017-05-07 12:09:00 2017-05-07 12:16:00 2.01
2017-05-07 17:28:00 2017-05-07 17:50:00 10.36
2017-05-07 17:54:00 2017-05-07 18:01:00 1.19
2017-05-07 18:02:00 2017-05-07 18:35:00 28.31
2017-05-07 21:48:00 2017-05-07 21:52:00 1.46
2017-05-07 22:01:00 2017-05-07 22:05:00 1.37
2017-05-08 00:59:00 2017-05-08 02:19:00 39.23
2017-05-08 11:30:00 2017-05-08 11:58:00 22.55
2017-05-08 18:08:00 2017-05-08 18:30:00 10.47
2017-05-08 18:33:00 2017-05-08 19:09:00 28.44
2017-05-08 22:25:00 2017-05-08 23:09:00 38.65
2017-05-08 23:14:00 2017-05-08 23:17:00 1.04
2017-05-09 11:35:00 2017-05-09 12:19:00 23.99
2017-05-09 17:57:00 2017-05-09 18:59:00 29.38
2017-05-09 20:03:00 2017-05-09 20:13:00 1.9
2017-05-10 10:18:00 2017-05-10 10:54:00 24.06
2017-05-10 15:43:00 2017-05-10 16:46:00 24.71
2017-05-11 12:28:00 2017-05-11 13:07:00 21.75
2017-05-11 18:00:00 2017-05-11 18:31:00 19.3
2017-05-12 08:26:00 2017-05-12 08:55:00 20.46
2017-05-12 13:00:00 2017-05-12 13:34:00 14.6
2017-05-13 08:44:00 2017-05-13 08:46:00 0.38
2017-05-13 08:57:00 2017-05-13 09:01:00 0.33
2017-05-13 14:22:00 2017-05-13 14:41:00 6.86
2017-05-13 15:17:00 2017-05-13 15:35:00 5.2
2017-05-13 18:10:00 2017-05-13 18:21:00 1.91
2017-05-14 11:22:00 2017-05-14 11:26:00 0.9
2017-05-14 11:36:00 2017-05-14 11:38:00 0.39
2017-05-14 14:56:00 2017-05-14 15:59:00 40.07
2017-05-14 16:34:00 2017-05-14 16:41:00 1.49
2017-05-14 16:56:00 2017-05-14 17:04:00 1.45
2017-05-14 19:05:00 2017-05-14 20:06:00 39.21
2017-05-15 11:24:00 2017-05-15 11:33:00 1.91
2017-05-15 11:41:00 2017-05-15 12:13:00 19.84
2017-05-15 17:41:00 2017-05-15 18:11:00 16
2017-05-15 18:15:00 2017-05-15 19:23:00 31.52
2017-05-15 23:41:00 2017-05-16 00:26:00 39.32
2017-05-16 09:49:00 2017-05-16 11:02:00 24.91
2017-05-16 16:08:00 2017-05-16 16:32:00 3.37
2017-05-16 17:11:00 2017-05-16 17:32:00 4.8
2017-05-16 17:42:00 2017-05-16 17:56:00 1.81
2017-05-16 18:13:00 2017-05-16 18:46:00 24.85
2017-05-16 21:07:00 2017-05-16 21:10:00 1.04
2017-05-16 21:26:00 2017-05-16 21:29:00 1.02
2017-07-28 16:10:00 2017-07-28 16:17:00 2.22
2017-07-28 16:17:00 2017-07-28 16:42:00 7.84
2017-08-10 12:00:00 2017-08-10 12:44:00 24.05
2017-08-10 14:56:00 2017-08-10 15:10:00 1.61
2017-08-10 18:51:00 2017-08-10 19:21:00 24.85
2017-08-10 19:46:00 2017-08-10 19:56:00 1.14
2017-08-10 20:08:00 2017-08-10 20:12:00 1.09
2017-08-11 12:44:00 2017-08-11 12:49:00 0.82
2017-08-11 12:59:00 2017-08-11 13:01:00 0.56
2017-08-11 13:18:00 2017-08-11 15:12:00 1.79
2017-08-11 15:14:00 2017-08-11 16:53:00 34.6
2017-08-11 19:27:00 2017-08-11 20:34:00 34.91
2017-08-12 13:52:00 2017-08-12 13:56:00 1.05
2017-08-12 13:59:00 2017-08-12 14:02:00 0.28
2017-08-12 14:10:00 2017-08-12 14:30:00 1.22
2017-08-12 17:15:00 2017-08-12 17:36:00 11.37
2017-08-12 20:49:00 2017-08-12 21:05:00 10.43
2017-08-13 12:16:00 2017-08-13 12:44:00 12.96
2017-08-13 16:03:00 2017-08-13 16:32:00 14.33
2017-08-13 18:19:00 2017-08-13 18:42:00 9.32
2017-08-13 18:52:00 2017-08-13 19:05:00 3.99
2017-08-13 21:42:00 2017-08-13 21:53:00 5.6
2017-08-14 08:50:00 2017-08-14 09:45:00 24.1
2017-08-14 13:22:00 2017-08-14 13:54:00 24.84
2017-08-14 14:02:00 2017-08-14 15:34:00 36.92
2017-08-14 15:58:00 2017-08-14 17:17:00 35.7
2017-08-14 17:35:00 2017-08-14 17:45:00 1.99
2017-08-14 18:07:00 2017-08-14 18:27:00 9.92
2017-08-15 10:15:00 2017-08-15 10:51:00 25
2017-08-15 19:23:00 2017-08-15 19:29:00 0.4
2017-08-15 19:51:00 2017-08-15 20:45:00 24.39
2017-08-15 20:56:00 2017-08-15 21:04:00 2.78
2017-08-15 21:09:00 2017-08-15 21:37:00 19.22
2017-08-16 00:03:00 2017-08-16 00:27:00 15.51
2017-08-16 00:36:00 2017-08-16 00:41:00 1.23
2017-08-16 00:46:00 2017-08-16 01:18:00 11.35
2017-08-16 09:38:00 2017-08-16 09:41:00 1.21
2017-08-16 09:41:00 2017-08-16 09:43:00 0.08
2017-08-16 09:47:00 2017-08-16 10:32:00 22.89
2017-08-16 16:51:00 2017-08-16 17:11:00 3.14
2017-08-16 17:12:00 2017-08-16 17:25:00 2.76
2017-08-16 17:41:00 2017-08-16 18:36:00 24.78
2017-08-17 09:34:00 2017-08-17 10:13:00 24.03
2017-08-17 12:32:00 2017-08-17 13:07:00 24.82
2017-08-17 13:35:00 2017-08-17 13:40:00 0.4
2017-08-17 13:47:00 2017-08-17 15:07:00 36.06
2017-08-17 15:18:00 2017-08-17 15:24:00 0.06
2017-08-17 16:03:00 2017-08-17 18:05:00 35.16
2017-08-18 09:47:00 2017-08-18 10:23:00 24.47
2017-08-18 16:04:00 2017-08-18 16:42:00 1.63
2017-08-18 17:56:00 2017-08-18 18:25:00 10.74
2017-08-18 18:27:00 2017-08-18 18:48:00 1.85
2017-08-19 00:07:00 2017-08-19 00:41:00 18.92
2017-08-19 00:52:00 2017-08-19 00:55:00 0.99
2017-08-19 11:52:00 2017-08-19 12:14:00 7.56
2017-08-19 15:57:00 2017-08-19 16:12:00 4.02
2017-08-19 16:37:00 2017-08-19 16:56:00 5.32
2017-08-19 23:32:00 2017-08-19 23:50:00 7.54
2017-08-19 23:51:00 2017-08-20 00:17:00 9.59
2017-08-20 09:03:00 2017-08-20 09:16:00 5.22
2017-08-20 19:17:00 2017-08-20 19:32:00 4.69
2017-08-21 09:24:00 2017-08-21 09:40:00 2.31
2017-08-21 10:59:00 2017-08-21 11:02:00 0.47
2017-08-21 13:40:00 2017-08-21 15:29:00 36.09
2017-08-21 15:54:00 2017-08-21 16:48:00 2.24
2017-08-21 16:57:00 2017-08-21 18:15:00 32.3
2017-08-22 08:38:00 2017-08-22 09:06:00 0.65
2017-08-22 09:18:00 2017-08-22 09:19:00 0.04
2017-08-22 09:22:00 2017-08-22 10:05:00 23.49
2017-08-22 14:30:00 2017-08-22 15:02:00 1.7
2017-08-22 16:37:00 2017-08-22 17:41:00 24.8
2017-08-23 17:16:00 2017-08-23 18:14:00 24.01
2017-08-23 18:27:00 2017-08-23 18:32:00 1.05
2017-08-23 19:24:00 2017-08-23 20:04:00 18.14
2017-08-23 22:01:00 2017-08-23 22:28:00 16.33
2017-08-23 22:46:00 2017-08-23 22:50:00 1.04
2017-08-24 09:41:00 2017-08-24 09:44:00 0.02
2017-08-24 09:59:00 2017-08-24 10:00:00 0.02
2017-08-24 13:57:00 2017-08-24 15:33:00 42.51
2017-08-24 16:43:00 2017-08-24 17:00:00 0.07
2017-08-24 17:06:00 2017-08-24 17:33:00 10.01
2017-08-24 18:12:00 2017-08-24 19:03:00 27.67
2017-08-25 09:36:00 2017-08-25 09:55:00 2.63
2017-08-25 10:01:00 2017-08-25 10:32:00 20.92
2017-08-25 20:40:00 2017-08-25 21:45:00 17.41
2017-08-25 21:49:00 2017-08-25 22:14:00 16.02
2017-08-26 00:10:00 2017-08-26 02:14:00 29.77
2017-08-26 16:31:00 2017-08-26 16:55:00 7.15
2017-08-26 17:54:00 2017-08-26 18:19:00 10
2017-08-26 20:07:00 2017-08-26 20:08:00 0.19
2017-08-26 20:08:00 2017-08-26 20:11:00 1.35
2017-08-27 12:39:00 2017-08-27 12:54:00 1
2017-08-27 12:55:00 2017-08-27 13:48:00 9.29
2017-08-27 14:00:00 2017-08-27 14:34:00 3.86
2017-08-27 15:56:00 2017-08-27 16:37:00 10.45
2017-08-27 16:44:00 2017-08-27 16:51:00 1.8
2017-08-27 16:55:00 2017-08-27 17:00:00 0.68
2017-08-27 17:04:00 2017-08-27 17:19:00 4.96
2017-08-27 17:28:00 2017-08-27 17:39:00 2.33
2017-08-27 17:47:00 2017-08-27 18:58:00 24.19
2017-08-27 22:17:00 2017-08-27 22:41:00 16.24
2017-08-28 00:33:00 2017-08-28 01:22:00 13.62
2017-08-28 12:48:00 2017-08-28 12:51:00 0.47
2017-08-28 14:01:00 2017-08-28 14:03:00 0.4
2017-08-28 14:12:00 2017-08-28 15:31:00 34.86
2017-08-28 15:56:00 2017-08-28 17:04:00 34.47
2017-08-28 22:15:00 2017-08-28 22:38:00 18.57
2017-08-29 01:42:00 2017-08-29 02:05:00 18.88
2017-08-29 11:40:00 2017-08-29 11:44:00 1.04
2017-08-29 11:48:00 2017-08-29 12:09:00 0.03
2017-08-29 12:18:00 2017-08-29 12:21:00 0.03
2017-08-29 12:26:00 2017-08-29 12:32:00 1.05
2017-08-29 12:35:00 2017-08-29 13:15:00 24.05
2017-08-29 19:40:00 2017-08-29 19:42:00 0.35
2017-08-29 19:50:00 2017-08-29 20:19:00 27.72
2017-08-29 20:25:00 2017-08-29 20:41:00 10.42
2017-08-30 10:00:00 2017-08-30 10:47:00 24.25
2017-08-30 14:31:00 2017-08-30 14:56:00 1.68
2017-08-30 17:19:00 2017-08-30 17:43:00 0.04
2017-08-30 17:43:00 2017-08-30 17:50:00 0.29
2017-08-30 17:56:00 2017-08-30 18:40:00 16.85
2017-08-30 22:57:00 2017-08-30 23:35:00 17.31
2017-08-31 11:30:00 2017-08-31 11:41:00 0.43
2017-08-31 14:04:00 2017-08-31 14:06:00 0.41
2017-08-31 14:24:00 2017-08-31 14:26:00 0.68
2017-08-31 14:31:00 2017-08-31 15:42:00 34.88
2017-08-31 16:01:00 2017-08-31 17:07:00 30.45
2017-08-31 20:54:00 2017-08-31 21:21:00 19.6
2017-09-01 10:30:00 2017-09-01 10:59:00 17.63
2017-09-01 14:07:00 2017-09-01 15:07:00 27.45
2017-09-01 17:17:00 2017-09-01 17:36:00 1.93
2017-09-01 18:16:00 2017-09-01 19:19:00 20.58
2017-09-01 19:25:00 2017-09-01 19:38:00 4.8
2017-09-01 21:30:00 2017-09-01 21:54:00 1.94
2017-09-02 15:46:00 2017-09-02 16:06:00 0.99
2017-09-02 16:13:00 2017-09-02 16:16:00 1.01
2017-09-02 16:56:00 2017-09-02 16:59:00 0.42
2017-09-02 17:04:00 2017-09-02 17:06:00 0.4
2017-09-02 22:52:00 2017-09-02 22:54:00 0.07
2017-09-02 22:55:00 2017-09-02 23:15:00 18.62
2017-09-03 01:46:00 2017-09-03 02:10:00 18.9
2017-09-03 14:49:00 2017-09-03 15:04:00 3.14
2017-09-03 15:50:00 2017-09-03 16:07:00 10.17
2017-09-03 16:21:00 2017-09-03 16:38:00 7.79
2017-09-03 16:47:00 2017-09-03 16:52:00 1.11
2017-09-03 18:32:00 2017-09-03 18:37:00 1.2
2017-09-03 18:37:00 2017-09-03 18:44:00 0.91
2017-09-04 15:50:00 2017-09-04 15:54:00 0.42
2017-09-04 15:59:00 2017-09-04 16:11:00 2.3
2017-09-04 16:21:00 2017-09-04 16:43:00 8.31
2017-09-04 17:05:00 2017-09-04 17:15:00 2.54
2017-09-04 17:26:00 2017-09-04 17:41:00 4.52
2017-09-04 17:49:00 2017-09-04 18:25:00 29.55
2017-09-04 19:36:00 2017-09-04 19:51:00 0.93
2017-09-04 19:54:00 2017-09-04 19:59:00 0.5
2017-09-04 21:21:00 2017-09-04 21:55:00 29.37
2017-09-05 11:08:00 2017-09-05 11:51:00 35.5
2017-09-05 12:36:00 2017-09-05 13:07:00 2.29
2017-09-05 13:19:00 2017-09-05 13:22:00 0.51
2017-09-05 13:26:00 2017-09-05 14:03:00 33.09
2017-09-05 14:13:00 2017-09-05 15:01:00 24.03
2017-09-05 17:33:00 2017-09-05 18:11:00 14.55
2017-09-05 19:01:00 2017-09-05 19:19:00 11.31
2017-09-06 09:21:00 2017-09-06 09:39:00 7.73
2017-09-06 10:14:00 2017-09-06 10:30:00 7.75
2017-09-06 10:37:00 2017-09-06 11:13:00 24.13
2017-09-06 16:48:00 2017-09-06 17:35:00 25.3
2017-09-06 17:49:00 2017-09-06 17:55:00 0.18
2017-09-06 17:58:00 2017-09-06 18:00:00 0.39
2017-09-06 18:38:00 2017-09-06 19:04:00 15.93
2017-09-06 23:45:00 2017-09-07 00:14:00 19.45
2017-09-07 00:26:00 2017-09-07 00:30:00 1.01
2017-09-07 10:42:00 2017-09-07 11:35:00 31.74
2017-09-07 14:04:00 2017-09-07 14:39:00 27.38
2017-09-07 14:43:00 2017-09-07 14:52:00 3.06
2017-09-07 14:54:00 2017-09-07 16:00:00 32.96
2017-09-07 16:32:00 2017-09-07 16:33:00 0.07
2017-09-07 16:38:00 2017-09-07 17:04:00 2.31
2017-09-07 17:23:00 2017-09-07 18:14:00 33.03
2017-09-08 10:02:00 2017-09-08 10:30:00 19.73
2017-09-08 18:09:00 2017-09-08 18:37:00 18.97
2017-09-08 19:04:00 2017-09-08 19:18:00 1.87
2017-09-09 02:25:00 2017-09-09 02:28:00 1.1
2017-09-09 02:33:00 2017-09-09 02:35:00 1.05
2017-09-10 17:09:00 2017-09-10 17:44:00 14.25
2017-09-10 22:50:00 2017-09-10 22:53:00 0.25
2017-09-10 22:56:00 2017-09-10 22:57:00 0.02
2017-09-10 23:00:00 2017-09-10 23:23:00 16.18
2017-09-11 00:01:00 2017-09-11 00:19:00 1.83
2017-09-11 09:59:00 2017-09-11 10:06:00 1.91
2017-09-11 10:12:00 2017-09-11 10:51:00 27.49
2017-09-11 13:39:00 2017-09-11 14:13:00 27.23
2017-09-11 14:31:00 2017-09-11 15:31:00 35.45
2017-09-11 16:03:00 2017-09-11 17:09:00 36.01
2017-09-11 17:39:00 2017-09-11 18:01:00 9.88
2017-09-11 23:01:00 2017-09-11 23:05:00 1.14
2017-09-11 23:16:00 2017-09-11 23:30:00 5.93
2017-09-11 23:30:00 2017-09-11 23:54:00 4.94
2017-09-12 02:56:00 2017-09-12 04:00:00 25.87
2017-09-12 10:06:00 2017-09-12 10:46:00 24.84
2017-09-12 16:33:00 2017-09-12 17:20:00 22.43
2017-09-12 19:38:00 2017-09-12 20:14:00 21.79
2017-09-13 06:24:00 2017-09-13 06:59:00 25.84
2017-09-13 07:02:00 2017-09-13 07:14:00 5.77
2017-09-13 11:14:00 2017-09-13 11:36:00 16.26
2017-09-13 16:01:00 2017-09-13 16:57:00 24.79
2017-09-13 17:07:00 2017-09-13 17:48:00 15.94
2017-09-13 23:13:00 2017-09-13 23:35:00 16.73
2017-09-14 12:00:00 2017-09-14 12:27:00 19.71
2017-09-14 12:28:00 2017-09-14 12:30:00 0.18
2017-09-14 14:36:00 2017-09-14 15:06:00 14.98
2017-09-14 15:11:00 2017-09-14 15:17:00 2.99
2017-09-14 15:26:00 2017-09-14 16:44:00 37.48
2017-09-14 17:03:00 2017-09-14 18:17:00 34.18
2017-09-14 18:32:00 2017-09-14 18:41:00 3.03
2017-09-15 10:25:00 2017-09-15 10:26:00 0.05
2017-09-15 10:45:00 2017-09-15 10:48:00 0.29
2017-09-15 10:59:00 2017-09-15 11:05:00 0.3
2017-09-15 11:09:00 2017-09-15 11:36:00 10.82
2017-09-15 13:00:00 2017-09-15 13:17:00 8.37
2017-09-15 13:36:00 2017-09-15 14:30:00 25.19
2017-09-15 14:37:00 2017-09-15 15:01:00 0.45
2017-09-15 15:04:00 2017-09-15 16:59:00 85.51
2017-09-15 17:06:00 2017-09-15 18:57:00 129.72
2017-09-15 19:03:00 2017-09-15 20:02:00 60.96
2017-09-16 10:18:00 2017-09-16 10:39:00 16.04
2017-09-16 11:52:00 2017-09-16 12:12:00 16.68
2017-09-16 12:28:00 2017-09-16 13:29:00 49
2017-09-16 18:36:00 2017-09-16 19:30:00 45.7
2017-09-16 19:39:00 2017-09-16 19:47:00 2.1
2017-09-17 13:32:00 2017-09-17 13:41:00 2.24
2017-09-17 14:19:00 2017-09-17 14:48:00 14.68
2017-09-17 18:25:00 2017-09-17 18:26:00 0.05
2017-09-17 18:36:00 2017-09-17 19:03:00 12.26
2017-09-18 07:52:00 2017-09-18 08:03:00 2.04
2017-09-18 08:21:00 2017-09-18 08:56:00 37.94
2017-09-18 09:01:00 2017-09-18 09:53:00 65.7
2017-09-18 10:04:00 2017-09-18 10:34:00 39.43
2017-09-18 10:46:00 2017-09-18 11:07:00 14.25
2017-09-18 11:19:00 2017-09-18 13:29:00 138.98
2017-09-18 14:24:00 2017-09-18 14:26:00 0.04
2017-09-18 14:28:00 2017-09-18 15:23:00 35.52
2017-09-18 15:53:00 2017-09-18 17:49:00 36.64
2017-09-19 09:24:00 2017-09-19 10:22:00 24.37
2017-09-19 15:55:00 2017-09-19 16:53:00 15.87
2017-09-19 16:53:00 2017-09-19 17:20:00 0.85
2017-09-19 17:33:00 2017-09-19 18:06:00 10.95
2017-09-19 18:10:00 2017-09-19 18:34:00 8.41
2017-09-19 21:06:00 2017-09-19 21:10:00 1.24
2017-09-19 21:17:00 2017-09-19 21:21:00 1.05
2017-09-20 11:12:00 2017-09-20 11:16:00 1.22
2017-09-20 11:18:00 2017-09-20 11:59:00 24.15
2017-09-20 17:20:00 2017-09-20 18:07:00 24.15
2017-09-20 18:50:00 2017-09-20 19:17:00 16.02
2017-09-20 22:05:00 2017-09-20 22:32:00 17.5
2017-09-21 13:38:00 2017-09-21 13:44:00 0.72
2017-09-21 13:50:00 2017-09-21 15:26:00 35.81
2017-09-21 15:59:00 2017-09-21 16:15:00 8.26
2017-09-21 16:19:00 2017-09-21 17:32:00 28.1
2017-09-21 18:49:00 2017-09-21 19:25:00 16.05
2017-09-21 22:30:00 2017-09-21 22:59:00 16.97
2017-09-22 10:19:00 2017-09-22 10:21:00 0.43
2017-09-22 10:25:00 2017-09-22 10:26:00 0.4
2017-09-22 10:30:00 2017-09-22 10:54:00 19.15
2017-09-22 11:58:00 2017-09-22 12:02:00 1.05
2017-09-22 18:32:00 2017-09-22 18:59:00 20.95
2017-09-23 08:34:00 2017-09-23 08:51:00 1.15
2017-09-23 09:19:00 2017-09-23 10:31:00 37.57
2017-09-23 11:09:00 2017-09-23 11:23:00 5.67
2017-09-23 11:51:00 2017-09-23 12:15:00 4.64
2017-09-23 12:47:00 2017-09-23 13:40:00 8.45
2017-09-23 13:56:00 2017-09-23 15:08:00 34.62
2017-09-23 15:37:00 2017-09-23 16:07:00 1.56
2017-09-24 14:59:00 2017-09-24 15:02:00 0.43
2017-09-24 15:14:00 2017-09-24 17:09:00 6.6
2017-09-24 17:37:00 2017-09-24 18:01:00 7.05
2017-09-24 18:05:00 2017-09-24 18:07:00 0.41
2017-09-24 19:35:00 2017-09-24 20:31:00 25.28
2017-09-25 00:24:00 2017-09-25 00:26:00 0.42
2017-09-25 00:30:00 2017-09-25 01:10:00 23.13
2017-09-25 12:12:00 2017-09-25 12:38:00 19.45
2017-09-25 14:22:00 2017-09-25 14:50:00 19.86
2017-09-25 14:52:00 2017-09-25 15:54:00 35.53
2017-09-25 16:37:00 2017-09-25 18:17:00 34.54
2017-09-25 20:36:00 2017-09-25 21:08:00 28.91
2017-09-26 01:46:00 2017-09-26 02:21:00 26.32
2017-09-26 09:36:00 2017-09-26 10:18:00 24.02
2017-09-26 14:05:00 2017-09-26 14:39:00 25.3
2017-09-26 15:49:00 2017-09-26 15:58:00 1.53
2017-09-26 16:15:00 2017-09-26 16:22:00 1.1
2017-09-27 09:15:00 2017-09-27 10:16:00 24.76
2017-09-27 16:26:00 2017-09-27 17:49:00 35.87
2017-09-27 17:58:00 2017-09-27 18:46:00 27.64
2017-09-27 18:51:00 2017-09-27 18:59:00 2.08
2017-09-27 19:10:00 2017-09-27 20:17:00 21.17
2017-09-27 20:25:00 2017-09-27 21:56:00 3.6
2017-09-27 22:04:00 2017-09-27 22:32:00 16.56
2017-09-28 06:46:00 2017-09-28 07:19:00 14.4
2017-09-28 09:05:00 2017-09-28 09:29:00 8.06
2017-09-28 10:41:00 2017-09-28 11:21:00 22.34
2017-09-28 14:26:00 2017-09-28 16:05:00 35.57
2017-09-28 16:09:00 2017-09-28 16:21:00 1.17
2017-09-28 20:37:00 2017-09-28 20:40:00 1.1
2017-09-28 20:56:00 2017-09-28 21:00:00 1.15
2017-09-29 09:32:00 2017-09-29 10:02:00 19.73
I'd like to plot these discrete events the same way the below plots do, but where 2pi is one week rather than 24 hours in order to illuminate the periodicity of these events, where color represents distance.
I've attempted modifying the solution linked at the beginning of this question, but it hasn't gotten me anywhere. My new approach is to modify this solution, but I'm having a difficult time getting anything but horizontal and vertical lines scattered about a spiral. Making them curve and display in the correct locations is tough.
I'm open to any approach that successfully displays the data in a spiral plot without quantizing/binning it into specific intervals but rather allows the intervals themselves to describe discrete events along a continuous spiralling timeline. Likewise, I'm not interested in converting this to a raw single-point time series format where I'd have a great deal of data representing the time between trips. I'd like to achieve this in a temporal format (one that describes a time window rather than an event at a particular time).
This still needs work, but it's a start, using Python and matplotlib.
The idea is to plot a spiral timeline in polar coordinates with a period of one week. Each event is an arc of this spiral, with a color that depends on the dist value.
There are lots of overlapping intervals, though, which this visualization tends to hide. Semitransparent arcs with a carefully chosen colormap might work better.
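The coordinate mapping behind this idea can be sketched in isolation: time is measured in weeks since a fixed origin, the radius is that number of weeks, and the angle is 2π times it, so each full turn corresponds to one week. The helper names week_origin and to_polar are hypothetical, and the modulo in week_origin is an assumption of this sketch that keeps the origin on or before the first timestamp.

```python
import numpy as np
import pandas as pd

FIRSTDAY = 6  # 0=Mon, 6=Sun


def week_origin(first_ts, firstday=FIRSTDAY):
    """Midnight of the last `firstday` on or before first_ts."""
    days_back = (first_ts.weekday() - firstday) % 7  # always in 0..6
    return (first_ts - pd.Timedelta(days=days_back)).normalize()


def to_polar(ts, origin):
    """Map a timestamp to (theta, r): r is weeks since origin, theta = 2*pi*r."""
    t = (ts - origin) / pd.Timedelta(weeks=1)
    return 2 * np.pi * t, t


# The first trip in the question starts on Saturday 2017-04-01,
# so the Sunday-based origin is 2017-03-26 00:00.
origin = week_origin(pd.Timestamp("2017-04-01 17:42:00"))

# Wednesday noon, 1.5 weeks after the origin: halfway around the second turn.
theta, r = to_polar(pd.Timestamp("2017-04-05 12:00:00"), origin)
print(origin, r, theta % (2 * np.pi))
```

Because the radius grows with time, two events at the same weekday and hour in different weeks share an angle but sit on different turns of the spiral.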
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.patheffects as mpe
import pandas as pd
# styling
LINEWIDTH=4
EDGEWIDTH=1
CAPSTYLE="projecting"
COLORMAP="viridis_r"
ALPHA=1
FIRSTDAY=6 # 0=Mon, 6=Sun
# load dataset and parse timestamps
df = pd.read_csv('trips.csv')
df[['trip_start', 'trip_stop']] = df[['trip_start', 'trip_stop']].apply(pd.to_datetime)
# set origin at midnight of the last FIRSTDAY on or before the first trip
first_trip = df['trip_start'].min()
# the modulo keeps the offset in 0..6 days, so the origin never falls after the first trip
origin = (first_trip - pd.to_timedelta((first_trip.weekday() - FIRSTDAY) % 7, unit='d')).replace(hour=0, minute=0, second=0)
# seven labels, one per day of the week, starting at FIRSTDAY
weekdays = pd.date_range(origin, periods=7).strftime("%a").tolist()
# convert trip timestamps to week fractions
df['start'] = (df['trip_start'] - origin) / np.timedelta64(1, 'W')
df['stop'] = (df['trip_stop'] - origin) / np.timedelta64(1, 'W')
# sort dataset so shortest trips are plotted last
# should prevent longer events from covering shorter ones, still suboptimal
df = df.sort_values('dist', ascending=False).reset_index()
fig = plt.figure(figsize=(8, 6))
# fig.gca(projection=...) is no longer accepted by recent matplotlib releases
ax = fig.add_subplot(projection="polar")
cmap = plt.get_cmap(COLORMAP)
max_dist = df['dist'].max()
for idx, event in df.iterrows():
    # sample normalized distance from colormap
    ndist = event['dist'] / max_dist
    color = cmap(ndist)
    tstart, tstop = event.loc[['start', 'stop']]
    # timestamps are in week fractions, 2pi is one week
    # use at least 2 samples so that very short trips still produce an arc
    nsamples = max(2, int(1000. * (tstop - tstart)))
    t = np.linspace(tstart, tstop, nsamples)
    theta = 2 * np.pi * t
    arc, = ax.plot(theta, t, lw=LINEWIDTH, color=color, solid_capstyle=CAPSTYLE, alpha=ALPHA)
    if EDGEWIDTH > 0:
        arc.set_path_effects([mpe.Stroke(linewidth=LINEWIDTH+EDGEWIDTH, foreground='black'), mpe.Normal()])
# grid and labels
ax.set_rticks([])
ax.set_theta_zero_location("N")
ax.set_theta_direction(-1)
ax.set_xticks(np.linspace(0, 2*np.pi, 7, endpoint=False))
ax.set_xticklabels(weekdays)
ax.tick_params('x', pad=2)
ax.grid(True)
# setup a custom colorbar, everything's always a bit tricky with mpl colorbars
vmin = df['dist'].min()
vmax = df['dist'].max()
norm = mpl.colors.Normalize(vmin=vmin, vmax=vmax)
sm = plt.cm.ScalarMappable(cmap=COLORMAP, norm=norm)
sm.set_array([])
plt.colorbar(sm, ticks=np.linspace(vmin, vmax, 10), fraction=0.04, aspect=60, pad=0.1, label="distance", ax=ax)
plt.savefig("spiral.png", pad_inches=0, bbox_inches="tight")
Full timeline
To see that it's a spiral that never overlaps, and that it works for longer events too, you can plot the full timeline (here with LINEWIDTH=3.5 to limit moiré fringing).
fullt = np.linspace(df['start'].min(), df['stop'].max(), 10000)
theta = 2 * np.pi * fullt
# EDGEWIDTH is the border width defined in the styling block
ax.plot(theta, fullt, lw=LINEWIDTH,
        path_effects=[mpe.Stroke(linewidth=LINEWIDTH+EDGEWIDTH, foreground='black'), mpe.Normal()])
Example with a random set...
Here's the plot for a random dataset of 200 mainly short trips with the occasional 1 to 2 weeks long ones.
N = 200
df = pd.DataFrame()
df["start"] = np.random.uniform(0, 20, size=N)
df["stop"] = df["start"] + np.random.choice([np.random.uniform(0, 0.1),
np.random.uniform(1., 2.)], p=[0.98, 0.02], size=N)
df["dist"] = np.random.random(size=N)
... and different styles
inferno_r color map, rounded or butted linecaps, semitransparent, bolder edges, etc.
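The mapping behind these plots (one turn per week, radius growing with elapsed time) can be checked without matplotlib. Below is a minimal sketch using only the standard library; `week_origin` and `spiral_coords` are hypothetical helper names, not part of the answer's code, and the sample timestamp is made up.

```python
import math
from datetime import datetime, timedelta

FIRSTDAY = 6  # 0=Mon, 6=Sun, same convention as the styling block above


def week_origin(first_event, firstday=FIRSTDAY):
    """Midnight of the last `firstday` on or before `first_event`."""
    # The modulo keeps the offset in 0..6, so the origin is never in the future.
    days_back = (first_event.weekday() - firstday) % 7
    midnight = first_event.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=days_back)


def spiral_coords(when, origin):
    """Return (theta, r): r is elapsed time in weeks, theta is 2*pi*r."""
    weeks = (when - origin) / timedelta(weeks=1)
    return 2 * math.pi * weeks, weeks


first = datetime(2017, 9, 6, 8, 30)  # made-up trip start, a Wednesday
origin = week_origin(first)
theta, r = spiral_coords(first, origin)
print(origin, round(r, 3))
```

An event exactly one week later gets the same angle plus a full turn and a radius larger by 1, which is why successive weeks stack as rings of the same spiral.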
Here's a start. Let me know if this is what you had in mind.
I began with your data sample and put trip_start and trip_stop into POSIXct format before continuing with the code below.
library(tidyverse)
library(lubridate)
dat = dat %>%
# seconds are divided by 60 so that the whole numerator is in minutes
mutate(start=(hour(trip_start)*60 + minute(trip_start) + second(trip_start)/60)/(24*60) + wday(trip_start),
stop=(hour(trip_stop)*60 + minute(trip_stop) + second(trip_stop)/60)/(24*60) + wday(trip_stop),
tod = case_when(hour(trip_start) < 6 ~ "night",
hour(trip_start) < 12 ~ "morning",
hour(trip_start) < 18 ~ "afternoon",
hour(trip_start) < 24 ~ "evening"))
ggplot(dat) +
geom_segment(aes(x=start, xend=stop,
y=trip_start,
yend=trip_stop,
colour=tod),
size=5, show.legend = FALSE) +
coord_polar() +
scale_y_datetime(breaks=seq(as.POSIXct("2017-09-01"), as.POSIXct("2017-12-31"), by="week")) +
scale_x_continuous(limits=c(1,8), breaks=1:7,
labels=weekdays(x=as.Date(seq(7)+2, origin="1970-01-01"),
abbreviate=TRUE))+
expand_limits(y=as.POSIXct("2017-08-25")) +
theme_bw() +
scale_colour_manual(values=c(night="black", morning="orange",
afternoon="orange", evening="blue")) +
labs(x="",y="")
This could be achieved relatively straightforwardly with d3. I'll use your data to create a rough template of one basic possible approach. Here's what the result of this approach might look like:
The key ingredient is d3's radial line component that lets us define a line by plotting angle and radius (here's a recent answer showing another spiral graph, that answer started me down the path on this answer).
All we need to do is scale angle and radius to be able to use this effectively (for which we need the first time and last time in the dataset):
var angle = d3.scaleTime()
.domain([start,end])
.range([0,Math.PI * 2 * numberWeeks])
var radius = d3.scaleTime()
.domain([start,end])
.range([minInnerRadius,maxOuterRadius])
From there we can create a spiral quite easily: we sample some dates throughout the interval and then pass them to the radial line function:
var spiral = d3.radialLine()
.curve(d3.curveCardinal)
.angle(angle)
.radius(radius);
Here's a quick demonstration of just the spiral covering your time period. I'm assuming a base familiarity with d3 for this answer, so have not touched on a few parts of the code.
Once we have that, it's just a matter of adding sections from the data. The simplest way is to draw a stroke with some width and color it appropriately. This requires the same as above, but rather than sampling points between the start and end times of the dataset, we sample between the start and end times of each datum:
// append segments on spiral:
var segments = g.selectAll()
.data(data)
.enter()
.append("path")
.attr("d", function(d) {
return /* sample points and feed to spiral function here */;
})
.style("stroke-width", /* appropriate width here */ )
.style("stroke",function(d) { return /* color logic here */ })
This might look something like this (with data mouseover).
This is just a proof of concept. If you want more control and a nicer look, you could create a polygonal path for each data entry and use both fill and stroke. As is, you'll have to make do with layering strokes to get borders if desired, and with SVG options such as line caps.
Also, since this is d3 and longer timespans may be hard to show all at once, you could show less time but rotate the spiral so that it animates through your time span, dropping off events at the end and creating them at the origin. Depending on the number of nodes, the chart might need to be drawn on canvas for this to run smoothly, but converting to canvas is relatively trivial in this case.
For the sake of filling out the visualization a little with a legend and day labels, this is what I have.

Having trouble plotting this data frame of mutual funds

First off, here is my dataframe:
Date 2012-09-04 00:00:00 2012-09-05 00:00:00 2012-09-06 00:00:00 2012-09-07 00:00:00 2012-09-10 00:00:00 2012-09-11 00:00:00 2012-09-12 00:00:00 2012-09-13 00:00:00 2012-09-14 00:00:00 2012-09-17 00:00:00 ... 2017-08-22 00:00:00 2017-08-23 00:00:00 2017-08-24 00:00:00 2017-08-25 00:00:00 2017-08-28 00:00:00 2017-08-29 00:00:00 2017-08-30 00:00:00 2017-08-31 00:00:00 2017-09-01 00:00:00 Type
AABTX 9.73 9.73 9.83 9.86 9.83 9.86 9.86 9.96 9.98 9.96 ... 11.44 11.45 11.44 11.46 11.46 11.47 11.47 11.51 11.52 Hybrid
AACTX 9.66 9.65 9.77 9.81 9.78 9.81 9.82 9.92 9.95 9.93 ... 12.32 12.32 12.31 12.33 12.34 12.34 12.35 12.40 12.41 Hybrid
AADTX 9.71 9.70 9.85 9.90 9.86 9.89 9.91 10.02 10.07 10.05 ... 13.05 13.04 13.03 13.05 13.06 13.06 13.08 13.14 13.15 Hybrid
AAETX 9.92 9.91 10.07 10.13 10.08 10.12 10.14 10.26 10.32 10.29 ... 13.84 13.84 13.82 13.85 13.86 13.86 13.89 13.96 13.98 Hybrid
AAFTX 9.85 9.84 10.01 10.06 10.01 10.05 10.07 10.20 10.26 10.23 ... 14.09 14.08 14.07 14.09 14.11 14.11 14.15 14.24 14.26 Hybrid
That is a bit hard to read, but essentially these are just closing prices for several mutual funds (638), with the Type label in the last column. I'd like to plot all of these on a single plot and have a legend labeling what type each line is.
I'd like to see how many potential clusters I may need. This was my first thought for visualizing the data, but if you have any other recommendations, feel free to suggest them.
Also, in my first attempt, I tried:
parallel_coordinates(closing_data, 'Type', alpha=0.2, colormap=dark2_cmap)
plt.show()
It just shows up as a black blob, and after some research I found that it doesn't handle a large number of features that well.
My suggestion is to transpose the dataframe, as timestamp comes more naturally as an index and you will be able to address individual time series as df.AABTX or df['AABTX'].
With a smaller number of time series you could have tried df.plot(), but when the number is rather large you should not be surprised to see some mess initially.
Try plotting a subset of your data, but please make sure the time is in the index, not in the column names.
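As a rough illustration of the transpose-then-plot suggestion, here is a minimal sketch. The small `closing_data` frame is a made-up stand-in for the real one (the ticker FUND3 and the type "Equity" are invented), and coloring lines by type with one legend entry per type is just one possible choice.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up stand-in: one row per fund, one column per date, plus a 'Type' column.
dates = pd.date_range("2012-09-04", periods=5, freq="B")
closing_data = pd.DataFrame(
    [[9.73, 9.73, 9.83, 9.86, 9.83],
     [9.66, 9.65, 9.77, 9.81, 9.78],
     [20.10, 20.30, 20.20, 20.50, 20.40]],
    index=["AABTX", "AACTX", "FUND3"],
    columns=dates,
)
closing_data["Type"] = ["Hybrid", "Hybrid", "Equity"]

# Separate the labels from the prices, then put time in the index.
fund_type = closing_data["Type"]
prices = closing_data.drop(columns="Type").T.astype(float)
prices.index = pd.to_datetime(prices.index)

# One color per type, taken from the default color cycle.
cycle = plt.rcParams["axes.prop_cycle"].by_key()["color"]
palette = dict(zip(fund_type.unique(), cycle))

fig, ax = plt.subplots(figsize=(10, 5))
for ticker in prices.columns:
    kind = fund_type[ticker]
    ax.plot(prices.index, prices[ticker], color=palette[kind], alpha=0.5, label=kind)

# Keep a single legend entry per type instead of one per fund.
handles, labels = ax.get_legend_handles_labels()
by_type = dict(zip(labels, handles))
ax.legend(list(by_type.values()), list(by_type.keys()), title="Type")
```

After the transpose, an individual series is available as prices['AABTX'], and a subset can be plotted by selecting a few columns first.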
You may be looking for something like silhouette analysis, which is implemented in the scikit-learn machine learning library. It should help you find an optimal number of clusters to consider for your data.
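A minimal sketch of how a silhouette comparison could look with scikit-learn. The data here is synthetic (three well-separated blobs), and the use of KMeans and of a candidate range of 2 to 6 clusters are assumptions made for illustration; for the fund data, each row of X would be one fund described by whatever features you choose.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in: 3 groups of 30 points each in a 2-D feature space.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Mean silhouette score for each candidate number of clusters.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The candidate with the highest mean silhouette score is a reasonable starting point, not a guarantee; it is worth inspecting the clusters themselves as well.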

Modify output from series.rolling to 2 decimal points

Using the following data:
Open High Low Last Volume
Timestamp
2016-06-10 16:10:00 2088.00 2088.0 2087.75 2087.75 1418
2016-06-10 16:11:00 2088.00 2088.0 2087.75 2088.00 450
2016-06-10 16:12:00 2088.00 2088.0 2087.25 2087.25 2898
I am looking to use a rolling moving average as follows:
data["sma_9_volume"] = data.Volume.rolling(window=9,center=False).mean()
and this gives me this output:
Open High Low Last Volume candle_range sma_9_close sma_9_volume
Timestamp
2014-03-04 09:38:00 1785.50 1785.50 1784.75 1785.25 24 0.75 1785.416667 48.000000
2014-03-04 09:39:00 1785.50 1786.00 1785.25 1785.25 13 0.75 1785.500000 30.444444
2014-03-04 09:40:00 1786.00 1786.25 1783.50 1783.75 28 2.75 1785.333333 30.444444
2014-03-04 09:41:00 1784.00 1785.00 1784.00 1784.25 12 1.00 1785.083333 22.777778
2014-03-04 09:42:00 1784.25 1784.75 1784.00 1784.25 18 0.75 1784.972222 20.222222
2014-03-04 09:43:00 1784.75 1785.00 1784.50 1784.50 10 0.50 1784.888889 20.111111
2014-03-04 09:44:00 1784.25 1784.25 1783.75 1784.00 32 0.50 1784.694444 18.222222
What is the best way to take the output from:
data["sma_9_volume"] = data.Volume.rolling(window=9,center=False).mean()
and have it show only 2 decimal places, i.e. 48.00 instead of 48.000000?
You can use pandas' round function:
data["sma_9_volume"]=data["sma_9_volume"].round(decimals=2)
or directly:
data["sma_9_volume"] = data.Volume.rolling(window=9,center=False).mean().round(decimals=2)
See the pandas documentation for Series.round.
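To make the effect concrete, here is a small sketch on made-up volumes, with a window of 3 instead of 9 to keep it short. It also shows an alternative that is not part of the answer above: formatting to strings, which only makes sense for display because the column stops being numeric.

```python
import pandas as pd

# Made-up volumes; window=3 is used only to keep the example small.
volume = pd.Series([10, 20, 33, 41, 58], name="Volume")
sma = volume.rolling(window=3, center=False).mean()

# round() changes the stored values but keeps the column numeric.
rounded = sma.round(decimals=2)

# Formatting yields strings with exactly two decimals, for display only.
as_text = sma.dropna().map("{:.2f}".format)

print(rounded.tolist())
print(as_text.tolist())
```

Keep the rounded numeric column if further calculations are needed, and use the string form only at the point of output.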

vectorize for-loop to fill Pandas DataFrame

For a financial application, I'm trying to create a DataFrame where each row is a session date value for a particular equity. To get the data, I'm using Pandas Remote Data. So, for example, the features I'm trying to create might be the adjusted closes for the preceding 32 sessions.
This is easy to do in a for-loop, but it takes quite a long time for large feature sets (like going back to 1960 on "ge" and making each row contain the preceding 256 session values). Does anyone see a good way to vectorize this code?
import pandas as pd
def featurize(equity_data, n_sessions, col_label='Adj Close'):
    """
    Generate a raw (unnormalized) feature set from the input data.
    The value at col_label on the given date is taken
    as a feature, and each row contains values for n_sessions
    """
    features = pd.DataFrame(index=equity_data.index[(n_sessions - 1):],
                            columns=range((-n_sessions + 1), 1))
    for i in range(len(features.index)):
        features.iloc[i, :] = equity_data[i:(n_sessions + i)][col_label].values
    return features
I could alternatively just multi-thread this easily, but I'm guessing that pandas does that automatically if I can vectorize it. I mention that mainly because my primary concern is performance. So, if multi-threading is likely to outperform vectorization in any significant way, then I'd prefer that.
Short example of input and output:
>>> eq_data
Open High Low Close Volume Adj Close
Date
2014-01-02 15.42 15.45 15.28 15.44 31528500 14.96
2014-01-03 15.52 15.64 15.30 15.51 46122300 15.02
2014-01-06 15.72 15.76 15.52 15.58 42657600 15.09
2014-01-07 15.73 15.74 15.35 15.38 54476300 14.90
2014-01-08 15.60 15.71 15.51 15.54 48448300 15.05
2014-01-09 15.83 16.02 15.77 15.84 67836500 15.34
2014-01-10 16.01 16.11 15.94 16.07 44984000 15.57
2014-01-13 16.37 16.53 16.08 16.11 57566400 15.61
2014-01-14 16.31 16.43 16.17 16.40 44039200 15.89
2014-01-15 16.37 16.73 16.35 16.70 64118200 16.18
2014-01-16 16.67 16.76 16.56 16.73 38410800 16.21
2014-01-17 16.78 16.78 16.45 16.52 37152100 16.00
2014-01-21 16.64 16.68 16.36 16.41 35597200 15.90
2014-01-22 16.44 16.62 16.37 16.55 28741900 16.03
2014-01-23 16.49 16.53 16.31 16.43 37860800 15.92
2014-01-24 16.19 16.21 15.78 15.83 66023500 15.33
2014-01-27 15.90 15.91 15.52 15.71 51218700 15.22
2014-01-28 15.97 16.01 15.51 15.72 57677500 15.23
2014-01-29 15.48 15.53 15.20 15.26 52241500 14.90
2014-01-30 15.43 15.45 15.18 15.25 32654100 14.89
2014-01-31 15.09 15.10 14.90 14.96 64132600 14.61
>>> features = data.featurize(eq_data, 3)
>>> features
-2 -1 0
Date
2014-01-06 14.96 15.02 15.09
2014-01-07 15.02 15.09 14.9
2014-01-08 15.09 14.9 15.05
2014-01-09 14.9 15.05 15.34
2014-01-10 15.05 15.34 15.57
2014-01-13 15.34 15.57 15.61
2014-01-14 15.57 15.61 15.89
2014-01-15 15.61 15.89 16.18
2014-01-16 15.89 16.18 16.21
2014-01-17 16.18 16.21 16
2014-01-21 16.21 16 15.9
2014-01-22 16 15.9 16.03
2014-01-23 15.9 16.03 15.92
2014-01-24 16.03 15.92 15.33
2014-01-27 15.92 15.33 15.22
2014-01-28 15.33 15.22 15.23
2014-01-29 15.22 15.23 14.9
2014-01-30 15.23 14.9 14.89
2014-01-31 14.9 14.89 14.61
So each row of features is a series of 3 (n_sessions) successive values from the 'Adj Close' column of the eq_data DataFrame.
====================
Improved version based on Primer's answer below:
def featurize(equity_data, n_sessions, column='Adj Close'):
    """
    Generate a raw (unnormalized) feature set from the input data.
    The value at column on the given date is taken
    as a feature, and each row contains values for n_sessions
    >>> timeit.timeit('data.featurize(data.get("ge", dt.date(1960, 1, 1),
    dt.date(2014, 12, 31)), 256)', setup=s, number=1)
    1.6771750450134277
    """
    # list(...) so the column labels are a concrete sequence under Python 3
    features = pd.DataFrame(index=equity_data.index[(n_sessions - 1):],
                            columns=list(map(str, range((-n_sessions + 1), 1))), dtype='float64')
    values = equity_data[column].values
    # fill one column at a time: column i holds the series shifted by i sessions
    for i in range(n_sessions - 1):
        features.iloc[:, i] = values[i:(-n_sessions + i + 1)]
    features.iloc[:, n_sessions - 1] = values[(n_sessions - 1):]
    return features
It looks like shift is your friend here and something like this will do:
import numpy as np
import pandas as pd
df = pd.DataFrame({'adj close': np.random.random(10) + 15},index=pd.date_range(start='2014-01-02', periods=10, freq='B'))
df.index.name = 'date'
df
adj close
date
2014-01-02 15.650
2014-01-03 15.775
2014-01-06 15.750
2014-01-07 15.464
2014-01-08 15.966
2014-01-09 15.475
2014-01-10 15.164
2014-01-13 15.281
2014-01-14 15.568
2014-01-15 15.648
features = pd.DataFrame(data=df['adj close'], index=df.index)
features.columns = ['0']
features['-1'] = df['adj close'].shift()
features['-2'] = df['adj close'].shift(2)
features.dropna(inplace=True)
features
0 -1 -2
date
2014-01-06 15.750 15.775 15.650
2014-01-07 15.464 15.750 15.775
2014-01-08 15.966 15.464 15.750
2014-01-09 15.475 15.966 15.464
2014-01-10 15.164 15.475 15.966
2014-01-13 15.281 15.164 15.475
2014-01-14 15.568 15.281 15.164
2014-01-15 15.648 15.568 15.281
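Another way to build the same table without a Python-level loop is NumPy's sliding_window_view (available in NumPy 1.20 and later). This is a sketch, not something taken from the answers above: `featurize_windows` is a hypothetical name, and the five-row `eq_data` is a shortened copy of the sample in the question.

```python
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def featurize_windows(equity_data, n_sessions, column="Adj Close"):
    """Each row holds the n_sessions values ending on that row's date."""
    values = equity_data[column].to_numpy()
    # Shape (len(values) - n_sessions + 1, n_sessions); copied because the view is read-only.
    windows = sliding_window_view(values, n_sessions).copy()
    return pd.DataFrame(
        windows,
        index=equity_data.index[n_sessions - 1:],
        columns=range(-n_sessions + 1, 1),
    )


# Shortened copy of the question's sample data.
eq_data = pd.DataFrame(
    {"Adj Close": [14.96, 15.02, 15.09, 14.90, 15.05]},
    index=pd.to_datetime(
        ["2014-01-02", "2014-01-03", "2014-01-06", "2014-01-07", "2014-01-08"]
    ),
)
features = featurize_windows(eq_data, 3)
print(features)
```

The columns are labeled -2, -1, 0 as in the question, with 0 being the session on the row's own date.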
