I am trying to write a Python program that shows me the stock price chart for Google.
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from pandas_datareader import data as pdr
import yfinance as yf
yf.pdr_override()
#set the start and end date
start_date = "2020-03-01"
end_date = "2020-04-12"
#choose stock ticker symbol
ticker = "GOOGL"  # Yahoo Finance symbol for Alphabet (Google) class A shares
#get stock price
stock = pdr.get_data_yahoo(ticker, start=start_date, end=end_date)
print(stock)
#obtain dates
stock["Date"]=stock.index.map(mdates.date2num)
#choose figure size
fig = plt.figure(dpi=128, figsize=(10, 6))
#format date to place on the x-axis
formatter = mdates.DateFormatter('%m/%d/%Y')
plt.gca().xaxis.set_major_formatter(formatter)
# Plot data.
plt.plot(stock['Date'], stock['Adj Close'], c='red')
# Format plot.
plt.title("The Stock Price", fontsize=16)
plt.xlabel('Date', fontsize=10)
fig.autofmt_xdate()
plt.ylabel("Price", fontsize=10)
plt.show()
But the program keeps showing me this error:
AttributeError: 'BlockManager' object has no attribute 'refs'
The traceback keeps pointing at the 'Date' key, and reports that another exception occurred while handling that one.
Update on the problem: it somehow resolved itself. I don't know how, but I ran the same code again after 24 hours and it gave me the desired output.
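The post does not explain the intermittent error, so the following does not attempt to. It is only a minimal sketch of the plotting part, assuming the downloaded DataFrame has a DatetimeIndex and an 'Adj Close' column as in the code above. Matplotlib can plot a DatetimeIndex directly, so the extra 'Date' column built with date2num is not strictly needed. The helper name plot_adj_close and the made-up prices are illustrative assumptions, not real quotes.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


def plot_adj_close(stock, title="The Stock Price"):
    """Plot the 'Adj Close' column against the DataFrame's DatetimeIndex."""
    fig, ax = plt.subplots(dpi=128, figsize=(10, 6))
    # The index is passed as-is; no manual date2num conversion is required.
    ax.plot(stock.index, stock["Adj Close"], c="red")
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%m/%d/%Y"))
    ax.set_title(title, fontsize=16)
    ax.set_xlabel("Date", fontsize=10)
    ax.set_ylabel("Price", fontsize=10)
    fig.autofmt_xdate()
    return fig, ax


# Made-up prices standing in for the downloaded data (assumption, not real quotes).
dates = pd.bdate_range("2020-03-02", "2020-04-10")
fake = pd.DataFrame({"Adj Close": np.linspace(1300.0, 1200.0, len(dates))}, index=dates)

# Usage: fig, ax = plot_adj_close(fake); plt.show()
```

Passing the real DataFrame returned by the download call instead of fake should work the same way, provided it carries the same column name.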
I am pretty much brand new to all things Python, and much to my chagrin I have been trying to produce a fairly straightforward OHLC chart. Code below with dataframe samples.
I am trying to plot and save an OHLC chart for a single stock on a single trading day, in 1m ticks. The y-axis appears to be working fine, but the chart is blank when shown. The x-axis shows the starting time of 09:30 but none of the other 1m ticks. Moving the cursor over the blank figure shows values for y, but nothing for x.
Example
What I am hoping to eventually achieve is an x-axis label that shows the time in minutes (no date required), rotated 90 degrees, at, say, 15min intervals. I would rather have an OHLC chart than a candlestick, but I also want it to be decipherable, as I have seen many versions that are just a blur of tiny vertical lines that are of no use to anyone. If the chart needs to be stretched horizontally to fit the 376 or so 1m records in the dataframe, then so be it. If it is too cluttered, I would like to be able to space out the tick interval, perhaps to every 2 or 5 mins. The x-axis ticks should still remain at 15min intervals, however. I would then like to save the result as a jpg.
I have tried so many variations of mplfinance that I no longer know which module is the most recent or valid. I have tried both 'quotes' and values in the candlestick_ohlc statement, and there seems to be no apparent difference. I have read and re-read and tried so many examples, but they all seem to fail at the translation of the time for the x-axis, which I find very confusing and beyond frustrating .. heh.
If anyone could kindly point me in the right direction here I would be very grateful for any and all assistance.
Many thanks, Tim.D
import sys
import pandas as pd
import numpy as np
from datetime import datetime, date, timedelta
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mplfinance.original_flavor import candlestick_ohlc
sym = sys.argv[1] #symbol in all caps
run_dt = sys.argv[2] #run date of the required process requires the date to be surrounded by 'quotes'
run_int = sys.argv[2].replace('/', '-')
run_int = run_int.replace("'", "")
import pyodbc #database connectivity
cnxn = pyodbc.connect(dsn='abc', user='abc', password='abc', autocommit=False)
df = pd.read_sql_query(" \
SELECT TIMESTAMP(ACT_DATE||' '||TIME(TICK)) AS TIME, OPEN, HIGH, LOW, CLOSE \
FROM INTRADAY_IDX \
WHERE ACT_DATE = "+run_dt+" \
AND SYMBOL = '"+sym+"' \
ORDER BY 1",cnxn, )
print(df)
This produces a dataframe as follows:
TIME OPEN HIGH LOW CLOSE
0 2021-02-12 09:30:00 314.27 314.50 314.22 314.49
1 2021-02-12 09:31:00 314.51 314.73 314.44 314.63
2 2021-02-12 09:32:00 314.63 314.79 314.54 314.73
.. ... ... ... ... ...
375 2021-02-12 15:59:00 315.01 315.14 314.85 315.00
376 2021-02-12 16:00:00 315.00 315.18 314.97 315.18
df.TIME = mdates.date2num(df.TIME.dt.to_pydatetime())
print(df.head(5))
TIME OPEN HIGH LOW CLOSE
0 737833.395833 314.27 314.50 314.22 314.49
1 737833.396528 314.51 314.73 314.44 314.63
2 737833.397222 314.63 314.79 314.54 314.73
3 737833.397917 314.83 314.89 314.76 314.85
...
#quotes = [tuple(x) for x in df[['TIME', 'OPEN', 'HIGH', 'LOW', 'CLOSE']].to_records(index=False)]
#print(quotes)
fig, ax = plt.subplots(figsize=(12,7))
plt.yscale('linear') #default scaling of the y axis
ax.set_xlim('09:30', '16:00') #sets the start and end values for the xaxis charting
start, end = ax.get_xlim() #initializes the start and end variables
ax.xaxis.set_ticks(np.arange(start, end, 1800)) #sets the tick values for charting
plt.xticks(rotation=90, fontsize=12) #sets the rotation value of the x axis ticks
plt.yticks(fontsize=12)
ax.set_title(sym+' OHLC Intraday Chart', fontsize=14, fontweight = 'bold')
ax.set_ylabel('Price', fontsize=12, fontweight = 'bold')
ax.set_xlabel('Time', fontsize=12, fontweight = 'bold')
plt.tight_layout() #reduces the space padding surrounding the graph
ax.grid(True)
candlestick_ohlc(ax, df.values, width = 1/(24*60*2.5), alpha = 1.0, colorup = 'g', colordown ='r')
#candlestick_ohlc(ax, quotes, width = 1/(24*60*2.5), alpha = 1.0, colorup = 'g', colordown ='r')
#plt.savefig('c:\\temp\\charts\\'+sym+'_OHLC_'+run_int+'.jpg', format='jpg', quality=95, bbox_inches='tight') #saves the chart to a jpg file
#plt.close()
plt.show()
Thanks so much for the response. Using your code I have managed to get it working, and I have also added a secondary plot. Code below:
import sys, os, time, warnings #csv
import pandas as pd
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
#import numpy as np
#from datetime import datetime, date, timedelta
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
#from matplotlib import dates, ticker
from mplfinance.original_flavor import candlestick_ohlc
sym = sys.argv[1] #symbol in all caps
run_dt = sys.argv[2] #run date of the required process requires the date to be surrounded by 'quotes'
run_int = sys.argv[2].replace('/', '-') #reformat the date
run_int = run_int.replace("'", "") #reformat the date
import pyodbc #database connectivity
cnxn = pyodbc.connect(dsn='abc', user='abc', password='abc', autocommit=False)
db = pd.read_sql_query(" \
SELECT timestamp(ACT_DATE||' '||TIME(TICK)) AS TIME, OPEN, HIGH, LOW, CLOSE \
FROM SQ4_INTRADAY_IDX \
WHERE ACT_DATE = "+run_dt+" \
AND SYMBOL = '"+sym+"' \
ORDER BY 1",cnxn, )
print(db)
db['TIME']= pd.to_datetime(db['TIME'])
db.set_index('TIME', inplace=True) #this resets the dataframe index to the time values
#db.info() #shows column data types
#setup an array for the candlestick chart
dd = db.copy() #create a copy of the dataframe
dd.index = mdates.date2num(dd.index) #set the datetime to numeric for the chart to work
dd_data = dd.reset_index().values #set the index
#print(dd_data)
clse = db["CLOSE"] #setup the data for plotting an additional subplot line
fig, ax = plt.subplots(figsize=(12,7))
ax.set_title(sym+' OHLC Intraday Chart', fontsize=14, fontweight='bold')
ax.set_ylabel('Price', fontsize=12, fontweight='bold')
ax.set_xlabel('Time', fontsize=12, fontweight='bold')
candlestick_ohlc(ax, dd_data, width=.0003, alpha=.8, colorup='g', colordown='r')
ax.plot(clse, color = 'k', linestyle='--', linewidth = .5, label='Close')
plt.xticks(rotation=90, fontsize=12) #sets the rotation value of the x axis ticks
plt.yticks(fontsize=12) #sets the font size of the y axis ticks
ax.xaxis.set_major_locator(mdates.MinuteLocator(interval=30))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
plt.tight_layout() #reduces the space padding surrounding the graph
plt.savefig('c:\\temp\\'+sym+'_OHLC Intraday Chart for '+run_int+'.jpg', format='jpg', quality=95, bbox_inches='tight') #saves the chart to a jpg file
plt.show()
This produces the attached chart.
My issue is that I am trying to remove the padded space between the data and the left and right y-axis edges. In other words, I would like the 9:30 label to appear directly under the left margin and 16:00 directly under the right margin. Basically, I guess I am trying to stretch the chart to fill the entire chart box.
Also, is there any way to show the left Price scale values on both the left and right sides?
Thanks for the assistance, much appreciated.
Regards, Tim.D
The data argument of candlestick_ohlc must be an array, and the date/time column must be converted to numbers with mdates.date2num(). Beyond that, the way dates and times are displayed is controlled with a locator and a formatter. I think the ax.set_xlim('09:30', '16:00') call in your code is the cause of the error. In the code below, the data is acquired from Yahoo Finance.
import pandas as pd
import numpy as np
from datetime import datetime, date, timedelta
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from mplfinance.original_flavor import candlestick_ohlc
import yfinance as yf
dia = yf.download("DIA", period='1d', interval='1m', start="2021-02-11", end='2021-02-12')
df = dia.copy()
df.index = mdates.date2num(df.index)
data = df.reset_index().values
fig, ax = plt.subplots(figsize=(12,7))
sym = 'DIA'
candlestick_ohlc(ax, data, width=1/(24*60*2.5), alpha=1.0, colorup='g', colordown='r')
ax.set_title(sym+' OHLC Intraday Chart', fontsize=14, fontweight='bold')
ax.set_ylabel('Price', fontsize=12, fontweight='bold')
ax.set_xlabel('Time', fontsize=12, fontweight='bold')
# update start
ax.set_xlim(data[0][0], data[-1][0])  # first and last timestamps, whatever the row count
ax1 = ax.twinx()
ax1.set_yticks(ax.get_yticks())
ax1.set_ybound(ax.get_ybound())
ax1.set_yticklabels([str(x) for x in ax.get_yticks()])
# update end
ax.grid()
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.AutoDateFormatter(locator))
plt.show()
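The locator/formatter approach from this answer can also be sketched without a data download. The example below is a rough illustration under stated assumptions: the prices are synthetic, the helper names make_fake_ohlc and plot_ohlc_bars are hypothetical, and the bars are drawn with plain matplotlib vlines/hlines calls rather than candlestick_ohlc. It places x ticks every 15 minutes, formats them as HH:MM, pins the x limits to the first and last bar, and repeats the price labels on the right-hand side with tick_params instead of a twin axis.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


def make_fake_ohlc(day="2021-02-12", seed=0):
    """Hypothetical stand-in for the database query: 1-minute bars, 09:30 to 16:00."""
    idx = pd.date_range(f"{day} 09:30", f"{day} 16:00", freq="1min")
    rng = np.random.default_rng(seed)
    close = 314 + np.cumsum(rng.normal(0, 0.05, len(idx)))
    open_ = np.r_[close[0], close[:-1]]
    high = np.maximum(open_, close) + 0.05
    low = np.minimum(open_, close) - 0.05
    return pd.DataFrame({"OPEN": open_, "HIGH": high, "LOW": low, "CLOSE": close}, index=idx)


def plot_ohlc_bars(df):
    """Draw OHLC bars on a numeric date axis with 15-minute HH:MM ticks."""
    t = mdates.date2num(df.index.to_pydatetime())  # dates as float day numbers
    half = 0.3 / (24 * 60)                         # tick-mark length: 0.3 minute, in days
    colors = np.where(df["CLOSE"] >= df["OPEN"], "g", "r").tolist()

    fig, ax = plt.subplots(figsize=(16, 7))
    ax.vlines(t, df["LOW"].to_numpy(), df["HIGH"].to_numpy(), colors=colors, linewidth=0.8)
    ax.hlines(df["OPEN"].to_numpy(), t - half, t, colors=colors, linewidth=0.8)   # open: left tick
    ax.hlines(df["CLOSE"].to_numpy(), t, t + half, colors=colors, linewidth=0.8)  # close: right tick

    ax.xaxis.set_major_locator(mdates.MinuteLocator(byminute=[0, 15, 30, 45]))
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
    ax.set_xlim(t[0], t[-1])                        # numeric limits, not '09:30' strings
    ax.tick_params(axis="x", labelrotation=90)
    ax.tick_params(axis="y", right=True, labelright=True)  # price labels on both sides
    ax.grid(True)
    return fig, ax


# Usage: fig, ax = plot_ohlc_bars(make_fake_ohlc()); plt.show()
```

The key point is that the x limits are given in the same units as the plotted data (date numbers), which is why passing the strings '09:30' and '16:00' does not work.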
I am new to Python and learning data visualization using matplotlib.
I am trying to plot Date/Time vs Values using matplotlib from this CSV file:
https://drive.google.com/file/d/1ex2sElpsXhxfKXA4ZbFk30aBrmb6-Y3I/view?usp=sharing
Following is the code snippet which I have been playing around with:
import pandas as pd
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
plt.style.use('seaborn')
years = mdates.YearLocator()
months = mdates.MonthLocator()
days = mdates.DayLocator()
hours = mdates.HourLocator()
minutes = mdates.MinuteLocator()
years_fmt = mdates.DateFormatter('%H:%M')
data = pd.read_csv('datafile.csv')
data.sort_values('Date/Time', inplace=True)
fig, ax = plt.subplots()
ax.plot('Date/Time', 'Discharge', data=data)
# format the ticks
ax.xaxis.set_major_locator(minutes)
ax.xaxis.set_major_formatter(years_fmt)
ax.xaxis.set_minor_locator(hours)
datemin = min(data['Date/Time'])
datemax = max(data['Date/Time'])
ax.set_xlim(datemin, datemax)
ax.format_xdata = mdates.DateFormatter('%Y.%m.%d %H:%M')
ax.format_ydata = lambda x: '%1.2f' % x # format the price.
ax.grid(True)
fig.autofmt_xdate()
plt.show()
The code plots the graph, but it does not label the x-axis, and it shows unfamiliar values for x (on mouse-over) in the bottom right corner, as shown in the screenshot below:
Screenshot of matplotlib figure window
Can someone please suggest what changes are needed to plot the x-axis dates and also make the correct values appear when I move the cursor over the graph?
Thanks
I haven't used matplotlib directly. Instead, I used pandas plotting:
import pandas as pd
data = pd.read_csv('datafile.csv')
# parse the strings first, then sort, so that the order is chronological
data["Date/Time"] = pd.to_datetime(data["Date/Time"], format="%d.%m.%Y %H:%M")
data.sort_values('Date/Time', inplace=True)
ax = data.plot.line(x='Date/Time', y='Discharge')
Here, you need to convert the Date/Time to pandas datetime type.
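A small sketch of why the conversion matters, assuming the 'dd.mm.yyyy HH:MM' layout described for the CSV. The three rows are made up for illustration. Sorting the raw strings orders them by day-of-month text, whereas sorting after pd.to_datetime orders them chronologically.

```python
import pandas as pd

# Made-up rows in the 'dd.mm.yyyy HH:MM' layout (assumption based on the question's data).
raw = pd.DataFrame({
    "Date/Time": ["01.02.2020 00:15", "15.01.2020 12:00", "02.01.2020 23:45"],
    "Discharge": [3.0, 2.0, 1.0],
})

# Sorting the strings compares characters left to right, so the day comes first.
by_string = raw.sort_values("Date/Time")["Discharge"].tolist()

# Parsing with an explicit format gives real timestamps that sort chronologically.
parsed = raw.copy()
parsed["Date/Time"] = pd.to_datetime(parsed["Date/Time"], format="%d.%m.%Y %H:%M")
by_time = parsed.sort_values("Date/Time")["Discharge"].tolist()

print(by_string)  # [3.0, 1.0, 2.0]
print(by_time)    # [1.0, 2.0, 3.0]
```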
The main issue is that the date formats are mixed up: your data uses '%d.%m.%Y %H:%M', but you set '%Y.%m.%d %H:%M', which is why you saw 'rubbish' values in the x tick labels. In any case, the code can be shortened considerably if you convert your Date/Time column to timestamps, i.e.:
import pandas as pd
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
plt.style.use('seaborn')
data = pd.read_csv('datafile.csv')
data["Date/Time"] = pd.to_datetime(data["Date/Time"], format="%d.%m.%Y %H:%M")
data.sort_values('Date/Time', inplace=True)  # sort chronologically after parsing
fig, ax = plt.subplots()
ax.plot('Date/Time', 'Discharge', data=data)
ax.format_xdata = mdates.DateFormatter('%Y.%m.%d %H:%M')
ax.tick_params(axis='x', rotation=45)
ax.grid(True)
fig.autofmt_xdate()
plt.show()
Note that the format of labels in the plot will depend on the zoom level, so you will need to enlarge a portion of the graph to see hours and minutes in the tick labels, but the cursor locator on the bottom bar of the window should be always displaying the detailed timestamp under the cursor.
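As a rough illustration of the zoom-dependent labels mentioned above, the sketch below pairs an AutoDateLocator with a ConciseDateFormatter, and sets format_xdata so the cursor readout always shows the full timestamp. The helper name plot_with_auto_dates and the sample series are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


def plot_with_auto_dates(times, values):
    """Plot a time series whose tick labels adapt to the visible date range."""
    fig, ax = plt.subplots()
    ax.plot(times, values)
    locator = mdates.AutoDateLocator()            # picks years/months/days/hours as needed
    ax.xaxis.set_major_locator(locator)
    ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
    # The cursor readout in the figure window is independent of the tick labels.
    ax.format_xdata = mdates.DateFormatter("%d.%m.%Y %H:%M")
    ax.grid(True)
    return fig, ax


# Made-up series: one reading every 15 minutes for one day.
times = pd.date_range("2020-01-01", periods=96, freq="15min")
values = np.sin(np.linspace(0, 6, len(times)))

# Usage: fig, ax = plot_with_auto_dates(times, values); plt.show()
```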
My question is if there is any way to use matplotlib date tick labels with a log xscale.
I find whenever I try to set_xscale('log') it just erases the labels and doesn't actually log the xscale...
Example code:
import datetime
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.cbook as cbook
years = mdates.YearLocator() # every year
months = mdates.MonthLocator() # every month
yearsFmt = mdates.DateFormatter('%Y')
# Load a numpy record array from yahoo csv data with fields date, open, close,
# volume, adj_close from the mpl-data/example directory. The record array
# stores the date as an np.datetime64 with a day unit ('D') in the date column.
with cbook.get_sample_data('goog.npz') as datafile:
    r = np.load(datafile)['price_data'].view(np.recarray)
# Matplotlib works better with datetime.datetime than np.datetime64, but the
# latter is more portable.
date = r.date.astype('O')
fig, ax = plt.subplots()
ax.plot(date, r.adj_close)
# format the ticks
ax.xaxis.set_major_locator(years)
ax.xaxis.set_major_formatter(yearsFmt)
ax.xaxis.set_minor_locator(months)
datemin = datetime.date(date.min().year, 1, 1)
datemax = datetime.date(date.max().year + 1, 1, 1)
ax.set_xlim(datemin, datemax)
# format the coords message box
def price(x):
    return '$%1.2f' % x
ax.format_xdata = mdates.DateFormatter('%Y-%m-%d')
ax.format_ydata = price
ax.grid(True)
# rotates and right aligns the x labels, and moves the bottom of the
# axes up to make room for them
fig.autofmt_xdate()
ax.set_xscale('log')
plt.show()
Try using ScalarFormatter:
from matplotlib.ticker import ScalarFormatter
ax.xaxis.set_major_formatter(ScalarFormatter())
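A different approach, offered only as a sketch and not as what the answer above proposes: matplotlib stores dates as day counts from a fixed epoch, so a log scale applied to those large numbers changes the spacing very little over a few years. One workaround is to plot against days elapsed since a chosen start date, apply the log scale to that, and convert the tick values back to dates with a FuncFormatter. The start date, the made-up values and the helper names elapsed_to_label and plot_log_elapsed are all assumptions for illustration.

```python
import datetime
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

start = datetime.date(2004, 8, 19)   # arbitrary reference date (assumption)
elapsed = np.arange(1, 2001)         # days since start; begins at 1 because log(0) is undefined
values = 100 + np.sqrt(elapsed)      # made-up series, not real prices


def elapsed_to_label(x, pos=None):
    """Turn 'days since start' back into a calendar date string."""
    return (start + datetime.timedelta(days=float(x))).strftime("%Y-%m-%d")


def plot_log_elapsed():
    fig, ax = plt.subplots()
    ax.plot(elapsed, values)
    ax.set_xscale("log")                                  # log scale on elapsed days
    ax.xaxis.set_major_formatter(FuncFormatter(elapsed_to_label))
    fig.autofmt_xdate()
    return fig, ax


# Usage: fig, ax = plot_log_elapsed(); plt.show()
```

With this layout the early days are stretched and the later ones compressed, which is usually what a log time axis is meant to show.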
I'm trying to build matplotlib charts whose x-axis is a dateIndex from a pandas dataframe. Trying to mimic some examples from matplotlib, I've been unsuccessful. The xaxis ticks and labels never appear.
I thought maybe matplotlib wasn't properly digesting the pandas index, so I converted it to ordinal with the matplotlib date2num helper function, but that gave the same result.
# https://matplotlib.org/api/dates_api.html
# https://matplotlib.org/examples/api/date_demo.html
import datetime as dt
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.cbook as cbook
import matplotlib.dates as mpd
years = mdates.YearLocator() # every year
months = mdates.MonthLocator() # every month
yearsFmt = mdates.DateFormatter('%Y')
majorLocator = years
majorFormatter = yearsFmt #FormatStrFormatter('%d')
minorLocator = months
y1 = np.arange(100)*0.14+1
y2 = -(np.arange(100)*0.04)+12
"""neither of these indices works"""
x = pd.date_range(start='4/1/2012', periods=len(y1))
#x = map(mpd.date2num, pd.date_range(start='4/1/2012', periods=len(y1)))
fig, ax = plt.subplots()
ax.plot(x,y1)
ax.plot(x,y2)
ax.xaxis.set_major_locator(years)
ax.xaxis.set_major_formatter(yearsFmt)
ax.xaxis.set_minor_locator(months)
datemin = x[0]
datemax = x[-1]
ax.set_xlim(datemin, datemax)
fig.autofmt_xdate()
plt.show()
The problem is the following. pd.date_range(start='4/1/2012', periods=len(y1)) creates dates from the first of April 2012 to the 9th of July 2012.
Now you set the major locator to be a YearLocator. This means, that you want to have a tick for each year on the axis. However, all dates are within the same year 2012. So there is no major tick to be shown within the plot range.
The suggestion would be to use a MonthLocator instead, so that the first of each month is ticked. It would also make sense to use a formatter that actually shows the months, e.g. '%b %Y'. If you want small tick marks for each day, you may use a DayLocator for the minor ticks.
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
ax.xaxis.set_minor_locator(mdates.DayLocator())
Complete example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
y1 = np.arange(100)*0.14+1
y2 = -(np.arange(100)*0.04)+12
x = pd.date_range(start='4/1/2012', periods=len(y1))
fig, ax = plt.subplots()
ax.plot(x,y1)
ax.plot(x,y2)
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
ax.xaxis.set_minor_locator(mdates.DayLocator())
fig.autofmt_xdate()
plt.show()
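The date arithmetic behind this explanation can be checked with pandas alone. This is a small sketch using the same date_range call as the question: it confirms the last date, and counts how many year starts and month starts fall inside the plotted range.

```python
import pandas as pd

x = pd.date_range(start="4/1/2012", periods=100)

first, last = x[0], x[-1]
# 1 January never occurs in the range, so a yearly locator has nothing to tick.
year_starts = x[(x.month == 1) & (x.day == 1)]
# The first of a month occurs four times: April, May, June and July.
month_starts = x[x.day == 1]

print(first.date(), last.date())             # 2012-04-01 2012-07-09
print(len(year_starts), len(month_starts))   # 0 4
```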
You could use pd.DataFrame.plot to handle most of that:
df = pd.DataFrame(dict(
y1=y1, y2=y2
), index=x)
df.plot()