Adding arrows to mpf finance plots - python

I am trying to add an arrow at a given date and price to an mplfinance (mpf) plot. To do this I have the following code:
import datetime
import pandas as pd
import yfinance as yf
import mplfinance as mpf
import matplotlib.pyplot as plt
from dateutil.relativedelta import relativedelta
db = yf.download(tickers='goog', start=datetime.datetime.now()-relativedelta(days=7), end= datetime.datetime.now(), interval="5m")
db = db.dropna()
a = db['Close'][31:32]
test = mpf.make_addplot(a, type='scatter', markersize=200, marker='^')
mpf.plot(db, type='candle', style= 'charles', addplot=test)
But it is producing the following error:
ValueError: x and y must be the same size
Could you please advise how I can resolve this?

The data passed into mpf.make_addplot() must be the same length as the dataframe passed into mpf.plot(). To plot only some points, the remaining points must be filled with nan values (float('nan'), or np.nan).
The documentation shows this clearly at cell **In [7]** (and in the cells that follow), where the signal data is generated as follows:
import numpy as np
def percentB_belowzero(percentB, price):
    signal = []
    previous = -1.0
    # .items() replaces .iteritems(), which was removed in pandas 2.0
    for date, value in percentB.items():
        if value < 0 and previous >= 0:
            signal.append(price[date]*0.99)
        else:
            signal.append(np.nan)  # <- Make `nan` where no marker needed.
        previous = value
    return signal
Note: alternatively, the signal data can be generated by first initializing it to all nan values and then replacing the nans wherever you want arrows:
signal = [float('nan')]*len(db)
signal[31] = db['Close'].iloc[31]  # a single price, not a one-row Series
test = mpf.make_addplot(signal, type='scatter', markersize=200, marker='^')
...
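To make the length requirement concrete, here is a minimal sketch that needs no market data. It assumes a synthetic price frame (the name `prices` and its values are made up for illustration) and builds a marker series on the same index, with nan everywhere except the one row to be marked.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded data (values are illustrative only).
index = pd.date_range("2021-01-04 09:30", periods=60, freq="5min")
prices = pd.DataFrame({"Close": np.linspace(100.0, 110.0, 60)}, index=index)

# Start with all-nan markers on the same index, then fill only the row to mark.
signal = pd.Series(np.nan, index=prices.index)
signal.iloc[31] = prices["Close"].iloc[31]

print(len(signal) == len(prices))  # the marker series matches the frame's length
print(int(signal.notna().sum()))   # number of rows that will get a marker
```

A series built this way has one entry per row of the plotted frame, so it should be usable as the data argument of mpf.make_addplot() in place of the one-row slice from the question.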

If your ultimate goal is what the question title says, adding an arrow, you can do it the way shown in @Daniel Goldfarb's answer to "How to add value of hlines in y axis using mplfinance python". I used that answer to write code that meets this goal. As the answer shows, the approach is to get the axes back from the plot and then add an annotation to them, where 31 is the position of the date/time in the index and a.iloc[0] is the closing price.
import datetime
import pandas as pd
import yfinance as yf
import mplfinance as mpf
from dateutil.relativedelta import relativedelta
db = yf.download(tickers='goog', start=datetime.datetime.now()-relativedelta(days=7), end=datetime.datetime.now(), interval="5m")
db = db.dropna()
a = db['Close'][31:32]
# returnfig=True returns the figure and its list of axes so they can be annotated
fig, axlist = mpf.plot(db, type='candle', style='charles', returnfig=True)
# a.iloc[0] reads the price by position; 'X' is drawn at xytext with an arrow to (31, price)
axlist[0].annotate('X', (31, a.iloc[0]), fontsize=20, xytext=(34, a.iloc[0]+20),
                   color='r',
                   arrowprops=dict(
                       arrowstyle='->',
                       facecolor='r',
                       edgecolor='r'))
mpf.show()

Related

matplotlib how do I reduce the amount of space between bars in a stacked bar chart when x-axis are dates 1-week apart?

import pandas as pd
from datetime import datetime
import matplotlib.pyplot as plt
import numpy as np
x=pd.date_range(end=datetime.today(),periods=150,freq='W').to_pydatetime().tolist()
x_1 = np.random.rand(150)
x_2 = np.random.rand(150)/2
fig = plt.figure(figsize=(10,6),dpi=100)
ax=fig.add_subplot(111)
ax.bar(x,x_1,label='x_1')
ax.bar(x,x_2,label='x_2',bottom=x_1)
plt.legend()
plt.show()
The code above produces this stacked bar chart (image: stacked_chart1).
Because the x-axis values are dates one week apart, the gaps between the bars are very large.
I would like to change the chart so that the bars sit next to each other with no space between them, as in the chart produced by this code:
x=np.arange(150)
x_1 = np.random.rand(150)
x_2 = np.random.rand(150)/2
fig = plt.figure(figsize=(10,6),dpi=100)
ax=fig.add_subplot(111)
ax.bar(x,x_1,label='x_1')
ax.bar(x,x_2,label='x_2',bottom=x_1)
plt.legend()
plt.show()
(image: stacked_chart2)
However, instead of numbers on the x-axis, I would still like to keep the dates from chart 1. Is there a way to do that? Thanks!
The reason for the difference is that matplotlib tries to simplify the x-axis when you pass datetimes, because usually you cannot fit every date into the x-ticks. It doesn't do this for int or string types, which is why your second sample looks normal.
However, I'm unable to figure out why the spacing is so odd in this particular example. I looked at this post to no avail.
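One likely explanation, offered here as an addition rather than something the answer above states: on a date axis matplotlib measures x in days, and the default bar width is 0.8, so each bar is 0.8 days wide while the bars are 7 days apart. A minimal sketch that passes width=7 so that weekly bars touch (the fixed end date is an arbitrary choice to keep the example repeatable):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Weekly dates, as in the question, but ending on a fixed (arbitrary) date.
x = pd.date_range(end="2021-01-03", periods=150, freq="W").to_pydatetime().tolist()
x_1 = np.random.rand(150)
x_2 = np.random.rand(150) / 2

fig, ax = plt.subplots(figsize=(10, 6))
# On a date axis a numeric width is in days; 7 spans the full week between bars.
bars_1 = ax.bar(x, x_1, width=7, label="x_1")
bars_2 = ax.bar(x, x_2, width=7, bottom=x_1, label="x_2")
ax.legend()
# plt.show()  # uncomment when running interactively
```

A slightly smaller value such as width=6 would keep a thin gap between bars while still using the date axis.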
In any case, there are other plotting modules that tend to handle dates a little more elegantly.
import pandas as pd
from datetime import datetime
import plotly.express as px
import numpy as np
x=pd.date_range(end=datetime.today(),periods=150,freq='W').tolist()
x_1 = np.random.rand(150)
x_2 = np.random.rand(150)/2
df = pd.DataFrame({
'date':x,
'x_1':x_1,
'x_2':x_2}).melt(id_vars='date')
px.bar(df, x='date', y='value',color='variable')

how to change xy axis with matplot in python

import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
corona_data = pd.read_csv("서울시 코로나19 확진자 현황 csv.csv", encoding="cp949")
confirmed_dates = corona_data["확진일"]
confirmed_date = [datetime.strptime(date, "%Y-%m-%d") for date in confirmed_dates]
corona_data["확진일"]= confirmed_date
plt.rc('font', family='Malgun Gothic')
corona_data["확진일"].plot(title="확진일 별 확진자 추이")
plt.show()
This plot shows plain numbers on the x-axis and dates on the y-axis, but I want dates on the x-axis and numbers on the y-axis. How can I do that?
If your data is in a dataframe, I recommend using Seaborn to visualize it. It has a great API that lets you plot elements of your dataframe by referencing column names. Here is a toy example:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
# Load data
df = pd.read_csv(...)
# Plot scatter plot (the function is scatterplot; seaborn has no sns.scatter)
sns.scatterplot(x='col_1', y='col_2', data=df)
plt.show()
Check out the Seaborn documentation for more.
The problem seems to be that your dataframe contains only one dataset, namely the dates. You could add a column containing the row numbers and then choose what goes on the x- and y-axis by passing the column names to the plot function:
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
corona_data = pd.read_csv("서울시 코로나19 확진자 현황 csv.csv", encoding="cp949")
confirmed_dates = corona_data["확진일"]  # "확진일" is the confirmation-date column
confirmed_date = [datetime.strptime(date, "%Y-%m-%d") for date in confirmed_dates]
corona_data["확진일"] = confirmed_date
# now add the row numbers to the dataset
corona_data["numbers"] = list(range(len(confirmed_dates)))
plt.rc('font', family='Malgun Gothic')
# and tell the plot function that you want "확진일" as the x-axis and "numbers" as the y-axis
# (the title means "Confirmed cases by confirmation date")
corona_data.plot("확진일", "numbers", title="확진일 별 확진자 추이")
plt.show()
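If the goal is the number of confirmed cases per day, which the plot title suggests but the question does not state outright, one option is to count rows per date before plotting. The sketch below uses a small made-up frame and a hypothetical column name `confirmed_date` in place of the CSV.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: one row per confirmed case.
cases = pd.DataFrame({
    "confirmed_date": ["2020-03-01", "2020-03-01", "2020-03-02",
                       "2020-03-04", "2020-03-04", "2020-03-04"],
})
cases["confirmed_date"] = pd.to_datetime(cases["confirmed_date"], format="%Y-%m-%d")

# Count rows per date; the result is a Series indexed by date.
daily_counts = cases.groupby("confirmed_date").size()
print(daily_counts)
```

Calling daily_counts.plot() on such a series puts its index (the dates) on the x-axis and the counts on the y-axis.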

Pandas line plot suppresses half of the xticks, how to stop it?

I am trying to make a line plot in which every one of the elements from the index appears as an xtick.
import pandas as pd
ind = ['16-12', '17-01', '17-02', '17-03', '17-04',
'17-05','17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1,3,5,2,3,6,4,7,8,5,3,8]
df = pd.DataFrame(data,index=ind)
df.plot(kind='line',x_compat=True)
However, the resulting plot skips every second element of the index.
My call to plot includes the x_compat=True parameter, which the pandas documentation suggests should stop the automatic tick configuration, but it seems to have no effect.
You need to set a ticker locator on the axis and then pass that axis when plotting.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
ind = ['16-12', '17-01', '17-02', '17-03', '17-04',
'17-05','17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1,3,5,2,3,6,4,7,8,5,3,8]
df = pd.DataFrame(data,index=ind)
ax2 = plt.axes()
ax2.xaxis.set_major_locator(ticker.MultipleLocator(1))
df.plot(kind='line', ax=ax2)
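An alternative sketch, not taken from the answer above: set the tick positions and labels explicitly after plotting. It assumes that pandas draws a string index at integer positions 0, 1, 2, and so on, which is worth checking on your pandas version.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

ind = ['16-12', '17-01', '17-02', '17-03', '17-04', '17-05',
       '17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1, 3, 5, 2, 3, 6, 4, 7, 8, 5, 3, 8]
df = pd.DataFrame(data, index=ind)

ax = df.plot(kind='line')
# One tick per row, labelled with the corresponding index entry.
ax.set_xticks(list(range(len(df))))
ax.set_xticklabels(list(df.index))
labels = [tick.get_text() for tick in ax.get_xticklabels()]
# plt.show()  # uncomment when running interactively
```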

time format of xaxis does not change correctly on pandas' plot

I'd like to plot pandas DataFrame as a line chart with x-axis having '%H:%M' style.
However, when I try this:
import datetime
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
d = pd.DataFrame([np.random.rand(3) for _ in range(9)])
d.index = [datetime.time(i, 0) for i in range(15,24)]
ax = d.plot(xticks=d.index)
plt.xticks([x.strftime("%H:%M") for x in d.index][0::4])
A chart with a %H:%M:%S style x-axis is plotted:
Following this answer, I've added DateFormatter:
import matplotlib.dates as md
d = pd.DataFrame([np.random.rand(3) for _ in range(9)])
d.index = [datetime.time(i, 0) for i in range(15,24)]
ax = d.plot(xticks=d.index)
#plt.xticks([x.strftime("%H:%M") for x in d.index][0::4])
ax.xaxis.set_major_formatter(md.DateFormatter('%H:%M'))
Unfortunately, I got a plot with only 00:00 on the x-axis.
I'd like to get a %H:%M style x-axis with tick labels at proper intervals, like "15:00, 17:00, 19:00, 21:00" (every 2 hours here). How can I achieve that?
I believe this happens because DateFormatter is not recognizing the "%H:%M" directives. Replace them with the integer directive, %d, and it should give you the result you want (or at least something you can adapt). In the following fix of your code I also rotated the labels to avoid clutter on the x-axis:
import datetime
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as md
d = pd.DataFrame([np.random.rand(3) for _ in range(9)])
d.index = [datetime.time(i, 0) for i in range(15,24)]
ax = d.plot(xticks=d.index)
plt.xticks(rotation=45) # Changed here
ax.xaxis.set_major_formatter(md.DateFormatter('%d:%d')) # and here...
plt.show()
EDIT: The example above only changes the format string passed to the formatter. I didn't touch the data or anything else. Ideally you should be using this:
ax.xaxis.set_major_locator(md.HourLocator(byhour=range(0,24,2)))
That should achieve what you want, but I just tested it and it appears to have a bug. My advice is therefore to pass your desired labels directly to xticks.
import datetime
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
d = pd.DataFrame([np.random.rand(3) for _ in range(9)])
d.index = [datetime.time(i, 0).hour for i in range(15,24)]
ax = d.plot(xticks=d.index)
plt.xticks([datetime.time(i, 0).hour for i in range(15,24)][0::2],["%d:00"%datetime.time(i, 0).hour for i in range(15,24)][0::2],rotation=45) # Changed here
plt.show()
This produces the tick labels 15:00, 17:00, 19:00, 21:00 and 23:00.
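Another possible route, sketched here under the assumption that plotting with matplotlib directly is acceptable: turn each time into a full datetime (the date itself is arbitrary) so that the date locators and formatters apply. Note that DateFormatter uses strftime-style codes, in which %d means day of the month, so "%H:%M" is the pattern for hours and minutes.

```python
import datetime
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as md
import matplotlib.pyplot as plt
import numpy as np

# Arbitrary date, used only so each time becomes a full datetime.
day = datetime.date(2021, 1, 1)
times = [datetime.datetime.combine(day, datetime.time(h, 0)) for h in range(15, 24)]
values = np.random.rand(len(times), 3)  # three random series, as in the question

fig, ax = plt.subplots()
ax.plot(times, values)
# Place a major tick every second hour, starting at 15:00.
ax.xaxis.set_major_locator(md.HourLocator(byhour=range(15, 24, 2)))
formatter = md.DateFormatter("%H:%M")
ax.xaxis.set_major_formatter(formatter)

# The formatter can be called directly to check how a tick would be labelled.
first_label = formatter(md.date2num(times[0]))
print(first_label)
# plt.show()  # uncomment when running interactively
```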

Add date tickers to a matplotlib/python chart

I have a question that sounds simple, but it has been driving me mad for days. I have a historical time series stored in two lists: the first contains prices, say P = [1, 1.5, 1.3 ...], and the second contains the corresponding dates, say D = [01/01/2010, 02/01/2010...]. I would like to show only SOME of these dates as tick labels (the "best" result I have so far shows all of them, creating a black cloud of unreadable text on the x-axis), with more detail appearing when you zoom in. This picture shows the automatic numeric range that Matplotlib currently produces:
Instead of 0, 200, 400 etc., I would like to see the dates that correspond to the plotted data points. Moreover, when I zoom in I get the following:
Just as I get more detail between 0 and 200 (20, 40 etc.), I would like to get the corresponding dates from the list.
I'm sure this is a simple problem to solve, but I'm new to Matplotlib as well as to Python, and any hint would be appreciated. Thanks in advance.
Matplotlib has sophisticated support for plotting dates. I'd recommend the use of AutoDateFormatter and AutoDateLocator. They are even locale-specific, so they choose month-names according to your locale.
import matplotlib.pyplot as plt
from matplotlib.dates import AutoDateFormatter, AutoDateLocator
xtick_locator = AutoDateLocator()
xtick_formatter = AutoDateFormatter(xtick_locator)
ax = plt.axes()
ax.xaxis.set_major_locator(xtick_locator)
ax.xaxis.set_major_formatter(xtick_formatter)
EDIT
For use with multiple subplots, use a separate locator/formatter pair for each axis:
import datetime
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.dates import AutoDateFormatter, AutoDateLocator, date2num
x = [datetime.datetime.now() + datetime.timedelta(days=30*i) for i in range(20)]
y = np.random.random((20))
for i in range(4):
    ax = plt.subplot(2, 2, i+1)
    # create a fresh pair per subplot; a locator should not be shared between axes
    xtick_locator = AutoDateLocator()
    xtick_formatter = AutoDateFormatter(xtick_locator)
    ax.xaxis.set_major_locator(xtick_locator)
    ax.xaxis.set_major_formatter(xtick_formatter)
    ax.plot(date2num(x), y)
plt.show()
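Applied to the two lists from the question, the same idea might look like the sketch below. The date strings are assumed to be day/month/year, which the question does not specify, and the sample values are only illustrative.

```python
import datetime
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.dates import AutoDateFormatter, AutoDateLocator

P = [1, 1.5, 1.3]
D = ["01/01/2010", "02/01/2010", "03/01/2010"]
# Assumption: the strings are day/month/year; adjust the format if they are not.
dates = [datetime.datetime.strptime(d, "%d/%m/%Y") for d in D]

fig, ax = plt.subplots()
ax.plot(dates, P)  # passing datetimes puts real dates on the x-axis
locator = AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(AutoDateFormatter(locator))
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
# plt.show()  # uncomment when running interactively
```

Because the locator picks tick positions from the current view limits, zooming in an interactive window should yield finer-grained date labels.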
You can make a time series plot with pandas.
For details, refer to http://pandas.pydata.org/pandas-docs/dev/timeseries.html and
http://pandas.pydata.org/pandas-docs/dev/generated/pandas.Series.plot.html
import pandas as pd
import matplotlib.pyplot as plt
DateStrList = ['01/01/2010', '02/01/2010']
P = [2, 3]
D = pd.to_datetime(DateStrList)  # parse the date strings into a DatetimeIndex
series = pd.Series(P, index=D)
# pandas.TimeSeries has been removed from pandas; a Series with a datetime index does the same job
series.plot()
plt.show()
