I've spent some time working out how to plot data with time on the x-axis. The way I found is to convert the datetime objects to Matplotlib date numbers with matplotlib.dates.date2num (imported here as pltdates) and then call matplotlib.pyplot.plot_date.
X_d = pltdates.date2num(X) # X is an array containing datetime objects
(...)
plt.plot_date(X_d, Y)
It works great, all my data is plotted properly.
[Image: plot with dates appearing on the x-axis]
However, all the measurements I want to plot were made on the same day (17/12/2021); the only difference is the time.
As shown in the image, matplotlib still displays the day of the month (17) even though it is the same across the whole plot.
Does anyone have a clue how to keep only the time, while still using plot_date?
Set a DateFormatter that contains only the time fields as the major formatter of the x-axis, as in this example:
import matplotlib.dates  # explicit import of the submodule used below
import matplotlib.pyplot as plt
from datetime import datetime
origin = ['2020-02-05 04:11:55',
'2020-02-05 05:01:51',
'2020-02-05 07:44:49']
a = [datetime.strptime(d, '%Y-%m-%d %H:%M:%S') for d in origin]
b = [35.764299, 20.3008, 36.94704]  # numeric values; strings would be plotted as categories
x = matplotlib.dates.date2num(a)
formatter = matplotlib.dates.DateFormatter('%H:%M')
figure = plt.figure()
axes = figure.add_subplot(1, 1, 1)
axes.xaxis.set_major_formatter(formatter)
plt.setp(axes.get_xticklabels(), rotation=15)
axes.plot(x, b)
plt.show()
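For comparison, here is a minimal sketch of the same idea that passes the datetime objects straight to `Axes.plot`, which Matplotlib converts on its own, and pins the tick labels to hours and minutes. The helper name `plot_times_only` and the sample values are made up for illustration, and the `Agg` backend is selected only so that the snippet runs without a display.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt


def plot_times_only(times, values):
    """Plot values against datetimes, labelling the x-axis with HH:MM only."""
    fig, ax = plt.subplots()
    ax.plot(times, values, marker="o")
    ax.xaxis.set_major_locator(mdates.HourLocator())             # one tick per hour
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))  # hide the date part
    fig.autofmt_xdate(rotation=15)
    return fig, ax


# Hypothetical measurements, all taken on the same day
times = [datetime(2021, 12, 17, 4, 11),
         datetime(2021, 12, 17, 5, 1),
         datetime(2021, 12, 17, 7, 44)]
fig, ax = plot_times_only(times, [35.76, 20.30, 36.95])
fig.canvas.draw()  # force the tick labels to be generated
labels = [tick.get_text() for tick in ax.get_xticklabels()]
```

The formatter only changes how the ticks are labelled; the underlying x-values still carry the full date, so zooming and limits behave as before.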
I am pretty much brand new to all things Python, and much to my chagrin I have been trying to produce a fairly straightforward OHLC chart. Code below with dataframe samples.
I am trying to plot and save an OHLC chart for a single stock on a single trading day, in 1-minute ticks. The y-axis appears to be working fine, but the chart is blank when shown. The x-axis shows the starting time of 09:30 but none of the other 1-minute ticks. Moving the cursor over the blank figure shows values for y, but x= nada.
[Image: example]
What I am hoping to eventually achieve is for the x-axis labels to show the time in minutes, no date required, rotated 90 degrees, at say 15-minute intervals. I would rather have an OHLC chart than a candlestick, but I also want it to be decipherable, as I have seen many versions that are just a blur of tiny vertical lines that are no use to anyone. If the figure needs to be stretched horizontally in order to fit the 377 one-minute records in the dataframe, then so be it. If it is too cluttered, I would like to be able to space out the bars, perhaps to one every 2 or 5 minutes. The x-axis tick labels should still remain at 15-minute intervals, however. I would then like to save the result as a jpg.
I have tried so many variations of mplfinance that I no longer know which module is the current, valid one. I have tried both 'quotes' and values in the candlestick_ohlc statement, and there seems to be no apparent difference. I have read and re-read and tried so many examples, but they all seem to fail at the translation of the time for anything to do with the x-axis, which is very confusing for me and beyond frustrating .. heh.
If anyone could kindly point me in the right direction here I would be very grateful for any and all assistance.
Many thanks, Tim.D
import sys  # needed for sys.argv below
import pandas as pd
import numpy as np
from datetime import datetime, date, timedelta
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mplfinance.original_flavor import candlestick_ohlc
sym = sys.argv[1] #symbol in all caps
run_dt = sys.argv[2] #run date of the required process requires the date to be surrounded by 'quotes'
run_int = sys.argv[2].replace('/', '-')
run_int = run_int.replace("'", "")
import pyodbc #database connectivity
cnxn = pyodbc.connect(dsn='abc', user='abc', password='abc', autocommit=False)
df = pd.read_sql_query(" \
SELECT TIMESTAMP(ACT_DATE||' '||TIME(TICK)) AS TIME, OPEN, HIGH, LOW, CLOSE \
FROM INTRADAY_IDX \
WHERE ACT_DATE = "+run_dt+" \
AND SYMBOL = '"+sym+"' \
ORDER BY 1",cnxn, )
print(df)
This produces a dataframe as follows:
TIME OPEN HIGH LOW CLOSE
0 2021-02-12 09:30:00 314.27 314.50 314.22 314.49
1 2021-02-12 09:31:00 314.51 314.73 314.44 314.63
2 2021-02-12 09:32:00 314.63 314.79 314.54 314.73
.. ... ... ... ... ...
375 2021-02-12 15:59:00 315.01 315.14 314.85 315.00
376 2021-02-12 16:00:00 315.00 315.18 314.97 315.18
df.TIME = mdates.date2num(df.TIME.dt.to_pydatetime())
print(df.head(5))
TIME OPEN HIGH LOW CLOSE
0 737833.395833 314.27 314.50 314.22 314.49
1 737833.396528 314.51 314.73 314.44 314.63
2 737833.397222 314.63 314.79 314.54 314.73
3 737833.397917 314.83 314.89 314.76 314.85
...
#quotes = [tuple(x) for x in df[['TIME', 'OPEN', 'HIGH', 'LOW', 'CLOSE']].to_records(index=False)]
#print(quotes)
fig, ax = plt.subplots(figsize=(12,7))
plt.yscale('linear') #default scaling of the y axis
ax.set_xlim('09:30', '16:00') #sets the start and end values for the xaxis charting
start, end = ax.get_xlim() #initializes the start and end variables
ax.xaxis.set_ticks(np.arange(start, end, 1800)) #sets the tick values for charting
plt.xticks(rotation=90, fontsize=12) #sets the rotation value of the x axis ticks
plt.yticks(fontsize=12)
ax.set_title(sym+' OHLC Intraday Chart', fontsize=14, fontweight = 'bold')
ax.set_ylabel('Price', fontsize=12, fontweight = 'bold')
ax.set_xlabel('Time', fontsize=12, fontweight = 'bold')
plt.tight_layout() #reduces the space padding surrounding the graph
ax.grid(True)
candlestick_ohlc(ax, df.values, width = 1/(24*60*2.5), alpha = 1.0, colorup = 'g', colordown ='r')
#candlestick_ohlc(ax, quotes, width = 1/(24*60*2.5), alpha = 1.0, colorup = 'g', colordown ='r')
#plt.savefig('c:\\temp\\charts\\'+sym+'_OHLC_'+run_int+'.jpg', format='jpg', quality=95, bbox_inches='tight') #saves the chart to a jpg file
#plt.close()
plt.show()
Thanks so much for the response. Using your code I have managed to get it working now, and I have also added a secondary plot. Code below:
import sys, os, time, warnings #csv
import pandas as pd
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
#import numpy as np
#from datetime import datetime, date, timedelta
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
#from matplotlib import dates, ticker
from mplfinance.original_flavor import candlestick_ohlc
sym = sys.argv[1] #symbol in all caps
run_dt = sys.argv[2] #run date of the required process requires the date to be surrounded by 'quotes'
run_int = sys.argv[2].replace('/', '-') #reformat the date
run_int = run_int.replace("'", "") #reformat the date
import pyodbc #database connectivity
cnxn = pyodbc.connect(dsn='abc', user='abc', password='abc', autocommit=False)
db = pd.read_sql_query(" \
SELECT timestamp(ACT_DATE||' '||TIME(TICK)) AS TIME, OPEN, HIGH, LOW, CLOSE \
FROM SQ4_INTRADAY_IDX \
WHERE ACT_DATE = "+run_dt+" \
AND SYMBOL = '"+sym+"' \
ORDER BY 1",cnxn, )
print(db)
db['TIME']= pd.to_datetime(db['TIME'])
db.set_index('TIME', inplace=True) #this resets the dataframe index to the time values
#db.info() #shows column data types
#setup an array for the candlestick chart
dd = db.copy() #create a copy of the dataframe
dd.index = mdates.date2num(dd.index) #set the datetime to numeric for the chart to work
dd_data = dd.reset_index().values #set the index
#print(dd_data)
clse = db["CLOSE"] #setup the data for plotting an additional subplot line
fig, ax = plt.subplots(figsize=(12,7))
ax.set_title(sym+' OHLC Intraday Chart', fontsize=14, fontweight='bold')
ax.set_ylabel('Price', fontsize=12, fontweight='bold')
ax.set_xlabel('Time', fontsize=12, fontweight='bold')
candlestick_ohlc(ax, dd_data, width=.0003, alpha=.8, colorup='g', colordown='r')
ax.plot(clse, color = 'k', linestyle='--', linewidth = .5, label='Close')
plt.xticks(rotation=90, fontsize=12) #sets the rotation value of the x axis ticks
plt.yticks(fontsize=12) #sets the font size of the y axis ticks
ax.xaxis.set_major_locator(mdates.MinuteLocator(interval=30))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
plt.tight_layout() #reduces the space padding surrounding the graph
plt.savefig('c:\\temp\\'+sym+'_OHLC Intraday Chart for '+run_int+'.jpg', format='jpg', pil_kwargs={'quality': 95}, bbox_inches='tight') #saves the chart to a jpg file; recent Matplotlib versions take the JPEG quality via pil_kwargs
plt.show()
This produces the attached chart.
My issue is that I am trying to remove the padding between the plotted data and the left and right edges of the chart. In other words, I would like the 9:30 label to appear directly under the left margin and 16:00 directly under the right margin. Basically, I guess I am trying to stretch the chart to fill the entire chart box.
Also, is there any way to show the Price scale values on both the left and right sides?
Thanks for the assistance, much appreciated.
Regards, Tim.D
candlestick_ohlc expects its data argument to be an array-like of rows (time, open, high, low, close), and the time values must first be converted to Matplotlib date numbers with mdates.date2num(). After that, the date and time display is controlled with a locator and a formatter. I think the ax.set_xlim('09:30', '16:00') call in your code is the cause of the error, since it passes strings to an axis that holds date numbers. The sample data below is downloaded from Yahoo Finance.
import pandas as pd
import numpy as np
from datetime import datetime, date, timedelta
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mplfinance.original_flavor import candlestick_ohlc
import yfinance as yf
dia = yf.download("DIA", period='1d', interval='1m', start="2021-02-11", end='2021-02-12')
df = dia.copy()
df.index = mdates.date2num(df.index)
data = df.reset_index().values
fig, ax = plt.subplots(figsize=(12,7))
sym = 'DIA'
candlestick_ohlc(ax, data, width=1/(24*60*2.5), alpha=1.0, colorup='g', colordown='r')
ax.set_title(sym+' OHLC Intraday Chart', fontsize=14, fontweight='bold')
ax.set_ylabel('Price', fontsize=12, fontweight='bold')
ax.set_xlabel('Time', fontsize=12, fontweight='bold')
# update start
ax.set_xlim(data[0][0], data[-1][0])  # first and last timestamps, whatever the number of rows
ax1 = ax.twinx()
ax1.set_yticks(ax.get_yticks())
ax1.set_ybound(ax.get_ybound())
ax1.set_yticklabels([str(x) for x in ax.get_yticks()])
# update end
ax.grid()
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.AutoDateFormatter(locator))
plt.show()
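The axis handling on its own can be sketched without any market data or mplfinance. The snippet below is only an illustration under stated assumptions: the one-minute series is synthetic, a plain line stands in for the candlesticks, and `tick_params(labelright=True)` is swapped in as a simpler alternative to the `twinx()` approach above for repeating the price scale on the right.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic one-minute closes for a single session (placeholder data)
index = pd.date_range("2021-02-12 09:30", "2021-02-12 16:00", freq="1min")
close = pd.Series(314.0 + np.linspace(0.0, 1.0, len(index)), index=index)

fig, ax = plt.subplots(figsize=(12, 7))
ax.plot(index.to_pydatetime(), close.to_numpy(), color="k", linewidth=0.5)

# A tick at every quarter hour, labelled with the time only
ax.xaxis.set_major_locator(mdates.MinuteLocator(byminute=[0, 15, 30, 45]))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
ax.tick_params(axis="x", labelrotation=90)

# Trim the horizontal padding so the data runs from edge to edge
ax.set_xlim(index[0].to_pydatetime(), index[-1].to_pydatetime())

# Repeat the price scale on the right-hand side without a second Axes
ax.tick_params(axis="y", right=True, labelright=True)

fig.canvas.draw()  # force the tick labels to be generated
xticks = [tick.get_text() for tick in ax.get_xticklabels()]
```

Because the tick positions come from the locator rather than from the data, the bar spacing can change without moving the 15-minute labels.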
I am new to Python and learning data visualization using matplotlib.
I am trying to plot Date/Time vs Values using matplotlib from this CSV file:
https://drive.google.com/file/d/1ex2sElpsXhxfKXA4ZbFk30aBrmb6-Y3I/view?usp=sharing
Following is the code snippet which I have been playing around with:
import pandas as pd
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
plt.style.use('seaborn')
years = mdates.YearLocator()
months = mdates.MonthLocator()
days = mdates.DayLocator()
hours = mdates.HourLocator()
minutes = mdates.MinuteLocator()
years_fmt = mdates.DateFormatter('%H:%M')
data = pd.read_csv('datafile.csv')
data.sort_values('Date/Time', inplace=True)
fig, ax = plt.subplots()
ax.plot('Date/Time', 'Discharge', data=data)
# format the ticks
ax.xaxis.set_major_locator(minutes)
ax.xaxis.set_major_formatter(years_fmt)
ax.xaxis.set_minor_locator(hours)
datemin = min(data['Date/Time'])
datemax = max(data['Date/Time'])
ax.set_xlim(datemin, datemax)
ax.format_xdata = mdates.DateFormatter('%Y.%m.%d %H:%M')
ax.format_ydata = lambda x: '%1.2f' % x # format the price.
ax.grid(True)
fig.autofmt_xdate()
plt.show()
The code is plotting the graph but it is not labeling the X-Axis and also giving some unknown values (on mouse over) for x on the bottom right corner as shown in the below screenshot:
[Image: screenshot of the matplotlib figure window]
Can someone please suggest what changes are needed to plot the x-axis dates and also make the correct values appear when I move the cursor over the graph?
Thanks
I haven't used matplotlib directly; instead I used pandas plotting:
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('datafile.csv')
# Parse the strings first so that the sort is chronological rather than alphabetical
data["Date/Time"] = pd.to_datetime(data["Date/Time"], format="%d.%m.%Y %H:%M")
data.sort_values('Date/Time', inplace=True)
ax = data.plot.line(x='Date/Time', y='Discharge')
plt.show()
Here, you need to convert the Date/Time to pandas datetime type.
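To see why the conversion matters, here is a small sketch with made-up rows in the same day.month.year layout (the values are hypothetical, not taken from the linked CSV). Sorting the raw strings and sorting the parsed timestamps give different orders.

```python
import pandas as pd

# Hypothetical rows in the same day.month.year layout as the CSV
raw = pd.DataFrame({
    "Date/Time": ["02.01.2020 00:00", "01.03.2020 00:00", "15.02.2020 12:30"],
    "Discharge": [1.0, 3.0, 2.0],
})

# Sorting the raw strings compares characters, so 1 March lands before 2 January
text_order = raw.sort_values("Date/Time")["Discharge"].tolist()

# Parsing first gives a datetime64 column that sorts chronologically
parsed = raw.copy()
parsed["Date/Time"] = pd.to_datetime(parsed["Date/Time"], format="%d.%m.%Y %H:%M")
time_order = parsed.sort_values("Date/Time")["Discharge"].tolist()

print(text_order)  # [3.0, 1.0, 2.0]
print(time_order)  # [1.0, 2.0, 3.0]
```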
The main issue you have there is that the date formats are mixed up: your data uses '%d.%m.%Y %H:%M', but you set '%Y.%m.%d %H:%M', and this is why you saw 'rubbish' values in the x tick labels. In any case, the number of lines in your code can be reduced considerably if you convert your Date/Time column to timestamps, i.e.:
import pandas as pd
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
plt.style.use('seaborn')
data = pd.read_csv('datafile.csv')
# Convert to timestamps first, then sort chronologically
data["Date/Time"] = pd.to_datetime(data["Date/Time"], format="%d.%m.%Y %H:%M")
data.sort_values('Date/Time', inplace=True)
fig, ax = plt.subplots()
ax.plot('Date/Time', 'Discharge', data=data)
ax.format_xdata = mdates.DateFormatter('%Y.%m.%d %H:%M')
ax.tick_params(axis='x', rotation=45)
ax.grid(True)
fig.autofmt_xdate()
plt.show()
Note that the format of the labels in the plot depends on the zoom level, so you will need to enlarge a portion of the graph to see hours and minutes in the tick labels. The cursor readout on the bottom bar of the window, however, should always display the detailed timestamp under the cursor.
If I run the following, it appears to work as expected, but the y-axis is limited to the earliest and latest times in the data. I want it to show midnight to midnight. I thought I could do that with the code that's commented out. But when I uncomment it, I get the correct y-axis, yet nothing plots. Where am I going wrong?
from datetime import datetime
import matplotlib.pyplot as plt
data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44', '2018-01-03 15:30:27', '2018-01-04 11:55:09']
x = []
y = []
for i in range(0, len(data)):
    t = datetime.strptime(data[i], '%Y-%m-%d %H:%M:%S')
    x.append(t.strftime('%Y-%m-%d')) # X-axis = date
    y.append(t.strftime('%H:%M:%S')) # Y-axis = time
plt.plot(x, y, '.')
# begin = datetime.strptime('00:00:00', '%H:%M:%S').strftime('%H:%M:%S')
# end = datetime.strptime('23:59:59', '%H:%M:%S').strftime('%H:%M:%S')
# plt.ylim(begin, end)
plt.show()
Edit: I also noticed that the x-axis isn't right either. The data skips Jan 2, but I want that on the axis so the data is to scale.
This is a dramatically simplified version of code dealing with over a year's worth of data with over 2,500 entries.
If Pandas is available to you, consider this approach:
import pandas as pd
import matplotlib.pyplot as plt
data = pd.to_datetime(data, yearfirst=True)
plt.plot(data.date, data.time)
_=plt.ylim(["00:00:00", "23:59:59"])
Update per comments
X-axis date formatting can be adjusted using the Locator and Formatter classes of the matplotlib.dates module. A Locator finds the tick positions, and a Formatter specifies how you want the labels to appear.
Sometimes Matplotlib/Pandas just gets it right, other times you need to call out exactly what you want using these extra methods. In this case, I'm not sure why those numbers are showing up, but this code will remove them.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
f, ax = plt.subplots()
data = pd.to_datetime(data, yearfirst=True)
ax.plot(data.date, data.time)
ax.set_ylim(["00:00:00", "23:59:59"])
days = mdates.DayLocator()
d_fmt = mdates.DateFormatter('%m-%d')
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(d_fmt)
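If Pandas is not available, one alternative is to keep everything as real datetime objects, so that both axes stay to scale. The following is a sketch of that swapped-in technique, not the approach above: each time of day is attached to an arbitrary anchor date (`REF_DAY`, a name chosen here for illustration) so that Matplotlib's date machinery can draw a midnight-to-midnight y-axis, while the x-axis uses the calendar day.

```python
from datetime import datetime, timedelta

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

stamps = ['2018-01-01 09:28:52', '2018-01-03 13:02:44',
          '2018-01-03 15:30:27', '2018-01-04 11:55:09']
parsed = [datetime.strptime(s, '%Y-%m-%d %H:%M:%S') for s in stamps]

REF_DAY = datetime(2000, 1, 1)  # arbitrary anchor date, never shown on the axis
days = [datetime.combine(p.date(), datetime.min.time()) for p in parsed]  # x: midnight of each day
times = [datetime.combine(REF_DAY.date(), p.time()) for p in parsed]      # y: time on the anchor day

fig, ax = plt.subplots()
ax.plot(days, times, '.')

# Midnight to midnight on the y-axis, labelled with the time only
ax.set_ylim(REF_DAY, REF_DAY + timedelta(days=1))
ax.yaxis.set_major_locator(mdates.HourLocator(interval=3))
ax.yaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))

# One tick per calendar day, so days without data still appear
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d'))

fig.canvas.draw()  # force the tick labels to be generated
x_labels = [tick.get_text() for tick in ax.get_xticklabels()]
```

Since the x-values are dates rather than strings, Jan 2 gets its own tick even though no data falls on it.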
My data looks as follows:
2012021305, 65217
2012021306, 82418
2012021307, 71316
2012021308, 66833
2012021309, 69406
2012021310, 76422
2012021311, 94188
2012021312, 111817
2012021313, 127002
2012021314, 141099
2012021315, 147830
2012021316, 136330
2012021317, 122252
2012021318, 118619
2012021319, 115763
2012021320, 121393
2012021321, 130022
2012021322, 137658
2012021323, 139363
The first column is the date in YYYYMMDDHH format. I'm trying to graph the data using the csv2rec function. I can get the data to graph, but the x-axis and its labels are not showing up the way I expect them to.
import matplotlib
matplotlib.use('Agg')
from matplotlib.mlab import csv2rec
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from pylab import *
output_image_name='plot1.png'
input_filename="data.log"
input = open(input_filename, 'r')
input.close()
data = csv2rec(input_filename, names=['time', 'count'])
rcParams['figure.figsize'] = 10, 5
rcParams['font.size'] = 8
fig = plt.figure()
plt.plot(data['time'], data['count'])
ax = fig.add_subplot(111)
ax.plot(data['time'], data['count'])
hours = mdates.HourLocator()
fmt = mdates.DateFormatter('%Y%M%D%H')
ax.xaxis.set_major_locator(hours)
ax.xaxis.set_major_formatter(fmt)
ax.grid()
plt.ylabel("Count")
plt.title("Count Log Per Hour")
fig.autofmt_xdate(bottom=0.2, rotation=90, ha='left')
plt.savefig(output_image_name)
I assume this has something to do with the date format. Any suggestions?
You need to convert the x-values to datetime objects.
Something like:
from datetime import datetime
time_vec = [datetime.strptime(str(x), '%Y%m%d%H') for x in data['time']]
plot(time_vec, data['count'])
Currently, you are telling Python to format integers (2012021305) as dates, which it does not know how to do, so it returns an empty string (although I suspect that errors are being raised somewhere).
You should also check your format string: in '%Y%M%D%H', %M means minutes, whereas month and day of the month are %m and %d.
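As a quick check of the conversion on its own, here is a minimal sketch that parses a few of the sample values with the standard library. The helper name `parse_hour_stamp` is made up for illustration.

```python
from datetime import datetime


def parse_hour_stamp(value):
    """Turn an integer such as 2012021305 (YYYYMMDDHH) into a datetime."""
    return datetime.strptime(str(value), '%Y%m%d%H')


raw_times = [2012021305, 2012021306, 2012021323]  # taken from the sample data
time_vec = [parse_hour_stamp(v) for v in raw_times]
print(time_vec[0])  # 2012-02-13 05:00:00
```

The resulting list can be passed to plot() in place of the integer column, after which HourLocator and DateFormatter have real dates to work with.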