OHLC python chart - python

I'm new to pandas and matplotlib, and I'm trying to code some algorithmic trading.
I bought a course, and now I understand more, but...
It does not include sample code for an intraday OHLC chart (I mean, it is not complete).
I have other problems too: my native language is not English, and there is no quality material in Spanish about these libraries.
All the material I found online only plots daily charts and is based on matplotlib.finance, which is now deprecated; its current replacement is mplfinance.
Please, I need sample code to chart the CSV file in seconds, minutes, hours and days.
I really have tried, I'm not a lazy person, but it is taking a lot of time just to plot that chart, and the course does not cover what I need.
Here is a CSV file for Alibaba (BABA) with 1-second, 5-second, 15-second, 30-second and 1-minute OHLC data.
My data

MPLFINANCE
You can use mplfinance. I tried it and it worked; here is the sample code.
Note: you need to rename the columns in your source data so that Open, High, Low and Close start with an uppercase letter.
import mplfinance as mpf
import pandas as pd
data = pd.read_csv('NYSE_BABA, 5s.csv', index_col=0)
data.index = pd.to_datetime(data.index)
mpf.plot(data,type='candle')
The candlesticks are hard to see because the data covers such a short time range, but you get the idea. Hope it helps!
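For the other timeframes the question asks about (minutes, hours, days), one option is to resample the bars with pandas before plotting. The following is a minimal sketch, not part of the original answer: the helper name resample_ohlc and the synthetic 5-second data are made up for illustration, and it assumes the columns are already named Open, High, Low, Close and the index is a DatetimeIndex.

```python
import pandas as pd


def resample_ohlc(df, rule):
    """Aggregate OHLC bars to a coarser timeframe such as '1min', '60min' or '1D'."""
    agg = {"Open": "first", "High": "max", "Low": "min", "Close": "last"}
    # Bins without any bars (for example overnight) come out as NaN and are dropped.
    return df.resample(rule).agg(agg).dropna()


# Synthetic 5-second bars standing in for the CSV data (illustrative values only).
idx = pd.date_range("2021-01-04 09:30:00", periods=24, freq="5s")
close = [100.0 + i for i in range(24)]
bars = pd.DataFrame(
    {
        "Open": [c - 0.5 for c in close],
        "High": [c + 1.0 for c in close],
        "Low": [c - 1.5 for c in close],
        "Close": close,
    },
    index=idx,
)

one_min = resample_ohlc(bars, "1min")
hourly = resample_ohlc(bars, "60min")
daily = resample_ohlc(bars, "1D")
print(one_min)
```

Each resampled frame can then be passed to mpf.plot(..., type='candle') in the same way as the original data.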
PLOTLY
You might want to consider Plotly for a nicer visualization.
import plotly.graph_objects as go
import pandas as pd
data = pd.read_csv('NYSE_BABA, 5s.csv')
data['time'] = pd.to_datetime(data['time'], unit='s')
fig = go.Figure(data=[go.Candlestick(x=data['time'],
                                     open=data['Open'],
                                     high=data['High'],
                                     low=data['Low'],
                                     close=data['Close'])])
fig.show()

Related

Pandas Data frames and sorting values

I am having a difficult time with this homework assignment and am not sure where I went wrong. I have tried several things, and I believe my issue lies in sort_values or maybe in the groupby command.
I want to display graph data from the year 2007 only (using pandas and plotly in a Jupyter notebook for my class). I mostly have the graph I want, but I cannot get it to display the data correctly: it simply isn't filtering out the other years or taking data from the specific dates requested.
import pandas as pd
import plotly.express as px
df = pd.read_csv('Data/Country_Data.csv')
print(df.shape)
df.head(2)
df_Q1 = df.query("year == '2007'")
print(df_Q1.shape)
df_Q1.head()
This is where the issue begins, because it prints a table with only header information. That is, it prints all the column names but none of the data for them, and later on it displays a graph of what I assume is the most recent death data rather than the year 2007 as specified.
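An empty result like this is what one would expect if the year column holds integers while the query compares it to the string '2007'. That is an assumption, since the CSV is not shown; checking df['year'].dtype would confirm it. A minimal sketch with made-up data (the country and deaths columns are hypothetical):

```python
import pandas as pd

# Made-up stand-in for Data/Country_Data.csv; only the "year" column name
# comes from the question, the other columns are hypothetical.
df = pd.DataFrame(
    {
        "country": ["A", "A", "B"],
        "year": [2006, 2007, 2007],
        "deaths": [10, 12, 7],
    }
)

# An integer dtype here means the filter has to compare against a number.
print(df["year"].dtype)

# Compare against the integer 2007 rather than the string '2007'.
df_2007 = df.query("year == 2007")

# Equivalent boolean-mask form.
same_rows = df[df["year"] == 2007]
print(df_2007)
```

If the dtype turns out to be a string type instead, the quoted form would be the right one, or the column could be converted first with df["year"].astype(int).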

How can I change the formatting of the mplfinance volume on the chart?

I am using the mplfinance package to plot candlestick charts of a stock, and I am trying to figure out how to change the formatting of the volume axis. In all the examples provided by the package, and in my own chart, the volume comes out in scientific notation such as 1e23. I would like the volume to reflect the actual numerical values in the pandas dataframe. I trade myself, and the charts on actual trading platforms show the real volume. But in the matplotlib, pandas and mplfinance examples I find online, the volume is always shown in this notation.
Example of what I am talking about
Alternatively, to show the volumes without scientific notation while keeping the original values (not scaled down), using the same data and code as in the answer from @r-beginners:
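import matplotlib.ticker as mticker
import mplfinance as mpf

# daily is the DataFrame loaded in the other answer's snippet
fig, axlist = mpf.plot(daily, type='candle', volume=True,
                       title='\nS&P 500, Nov 2019',
                       ylabel='OHLC Candles',
                       ylabel_lower='Shares\nTraded',
                       returnfig=True)
# axlist[2] is the volume panel's axis; print whole numbers instead of exponent notation
axlist[2].yaxis.set_major_formatter(mticker.FormatStrFormatter('%d'))
mpf.show()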
The result:
In theory it would be relatively easy to enhance mplfinance to accept a kwarg for formatting the axis labels, but for now the above will work.
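If thousands separators are preferred over plain digits, matplotlib's StrMethodFormatter can be attached in the same way. This is a small sketch rather than part of the original answer; the variable name volume_formatter is made up, and the commented line assumes the axlist returned by mpf.plot(..., returnfig=True).

```python
import matplotlib.ticker as mticker

# "{x:,.0f}" renders the tick value with thousands separators and no decimals.
volume_formatter = mticker.StrMethodFormatter("{x:,.0f}")

# Assuming axlist comes from mpf.plot(..., volume=True, returnfig=True):
# axlist[2].yaxis.set_major_formatter(volume_formatter)

# Formatters are callable with (value, position), which allows a quick check.
print(volume_formatter(3500000000, 0))
```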
The volume axis automatically switches to exponential notation when the volume values are large, so one way to avoid it is to scale the original data down to a smaller unit. The following example divides the volume by 1 million. The data is taken from the official website.
daily['Volume'] = daily['Volume'] / 1000000
The complete code:
%matplotlib inline
import pandas as pd
import mplfinance as mpf

daily = pd.read_csv('data/SP500_NOV2019_Hist.csv', index_col=0, parse_dates=True)
daily['Volume'] = daily['Volume'] / 1000000  # express volume in millions of shares
mpf.plot(daily, type='candle', volume=True,
         title='\nS&P 500, Nov 2019',
         ylabel='OHLC Candles',
         ylabel_lower='Shares\nTraded')
Example of normal output

Plotting multiple time series from pandas dataframe

I have a pandas dataframe loaded from file in the following format:
ID,Date,Time,Value1,Value2,Value3,Value4
0063,04/21/2020,11:22:55,0.0347,0.41,1440,10.5
0064,04/21/2020,11:22:56,0.0355,0.41,1440,10.4
...
9849,04/22/2020,10:46:19,0.058,1.05,1460,10.6
I have tried multiple methods of plotting a line graph of each value vs date/time or a single graph with multiple subplots with limited success. I am hoping someone with much more experience may have an elegant solution to try as opposed to my blind swinging. Note that the dataset may have large breaks in time between days.
Thanks!
Parsing dates while importing the dataframe turned out to be my biggest issue. Once I added parse_dates to pd.read_csv, I was able to define the dt column and plot with matplotlib as expected.
df = pd.read_csv(input_text, parse_dates = [["Date", "Time"]])
dt = df["Date_Time"]
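Newer pandas versions deprecate combining columns through a nested parse_dates list, so an alternative is to build the timestamp column after loading. A minimal sketch under that assumption, using the sample rows from the question; the io.StringIO wrapper and the variable names csv_text and values are only there to make the example self-contained.

```python
import io

import pandas as pd

# Sample rows copied from the question; a real run would pass the file path instead.
csv_text = """ID,Date,Time,Value1,Value2,Value3,Value4
0063,04/21/2020,11:22:55,0.0347,0.41,1440,10.5
0064,04/21/2020,11:22:56,0.0355,0.41,1440,10.4
9849,04/22/2020,10:46:19,0.058,1.05,1460,10.6
"""

# Keep ID as text so the leading zeros survive.
df = pd.read_csv(io.StringIO(csv_text), dtype={"ID": str})

# Join the two text columns and parse them with an explicit format.
df["Date_Time"] = pd.to_datetime(
    df["Date"] + " " + df["Time"], format="%m/%d/%Y %H:%M:%S"
)
df = df.set_index("Date_Time")

values = df[["Value1", "Value2", "Value3", "Value4"]]
print(values)
```

With matplotlib installed, values.plot(subplots=True) should then draw one panel per value column against the combined timestamp.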

python use plotly to plot time series without date gaps

I am using Python 3. I have a price quote series at 1-minute frequency. Quotes are only available during trading hours. I tried to plot the series with plotly, but there are gaps for non-trading hours and weekends. How can I make the plot continuous?
My code looks like this:
ifBasisPlot=go.Scatter( x=ifBasis.date, y=ifBasis.basis, line=go.Line(width=1,color='blue'), name='basis' )
data = go.Data([ifBasisPlot])
ifBasisPlot_url = py.plot(data, filename='ifBasisPlot', auto_open=False,)
The plot and the data are here: https://plot.ly/~shuaihou96/14/if/
I believe there is an open PR for the plotly project. link
As mentioned in the PR, we could use a tickformat x-axis attribute; @etpinard made a proof-of-concept chart, but that may not work if zooming is involved.
You can try changing this code
ifBasisPlot=go.Scatter( x=ifBasis.date, y=ifBasis.basis, line=go.Line(width=1,color='blue'), name='basis' )
into
ifBasisPlot=go.Scatter( x=range(len(ifBasis.date)), y=ifBasis.basis, line=go.Line(width=1,color='blue'), name='basis' )
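Plotting against integer positions removes the gaps but also loses the date labels. One way to get them back is to supply tick positions and tick texts by hand. The following is a rough sketch rather than part of the original answer: the helper name gapless_axis, the label format and the sample timestamps are all made up for illustration.

```python
import pandas as pd


def gapless_axis(timestamps, n_ticks=5):
    """Return integer x positions plus tick positions and labels for them."""
    ts = pd.DatetimeIndex(timestamps)
    x = list(range(len(ts)))
    # Label roughly n_ticks evenly spaced points with their original timestamps.
    step = max(1, len(ts) // n_ticks)
    tickvals = x[::step]
    ticktext = [ts[i].strftime("%m-%d %H:%M") for i in tickvals]
    return x, tickvals, ticktext


# Two short made-up sessions separated by an overnight gap.
day1 = pd.date_range("2020-01-02 14:58", periods=3, freq="1min")
day2 = pd.date_range("2020-01-03 09:30", periods=3, freq="1min")
x, tickvals, ticktext = gapless_axis(day1.append(day2), n_ticks=3)
print(x, tickvals, ticktext)
```

The x list would go into go.Scatter(x=x, ...), and the labels can be applied with fig.update_xaxes(tickmode="array", tickvals=tickvals, ticktext=ticktext).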

Pandas timeseries plot -- show date when hovering mouse over the line

I am plotting a pandas Series whose index is a daily date range. When I call series.plot(), the chart is generated correctly. The problem is that when I hover the mouse over interesting points on the chart, it only shows the month and year of that point, not the exact date.
Below is a sample of the code. Depending on luck, when I mouse over the line, sometimes I see the exact date in the status bar and sometimes only the year and month.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
idx = pd.date_range('2011-1-1', '2015-1-1')
x = pd.Series(np.cumsum(np.random.randn(len(idx))), idx)
df = pd.DataFrame(x)
df.plot()
plt.show()
Is there any way to display the exact date? How does matplotlib control what is displayed in the status bar? I wonder whether it has something to do with pandas changing the default configuration after some code is called.
When I run your code, everything seems to work, and a complete date (the x-coordinate) is shown in the status bar all the time. However, the two coordinates are also shown when I am not directly over the line, so it is difficult to read off the actual values of your graph. Are you looking for a tooltip that shows the exact date when mousing over the graph, or are the right values (complete dates) in the status bar enough? Can you post a screenshot of what the problem looks like when it occurs, and provide details on the versions you are using? I am on matplotlib 1.4.3 and numpy 1.9.2 combined with pandas 0.15.2.
Also have a look at the matplotlib recipes (http://matplotlib.org/users/recipes.html). The section "Fixing common date annoyances" sounds very similar to your problem.
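The recipe mentioned above works by setting the axes' fmt_xdata attribute, which controls how the x value is rendered in the status bar. Here is a minimal sketch of that idea applied to the question's data. Passing x_compat=True is an assumption on my part, so that pandas uses matplotlib's regular date units, which mdates.DateFormatter expects.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

idx = pd.date_range("2011-01-01", "2015-01-01")
series = pd.Series(np.cumsum(np.random.randn(len(idx))), index=idx)

# x_compat=True makes pandas plot with matplotlib's own date numbers.
ax = series.plot(x_compat=True)

# Render the x coordinate in the status bar as a full year-month-day date.
ax.fmt_xdata = mdates.DateFormatter("%Y-%m-%d")
```

Calling plt.show() afterwards, as in the question's snippet, opens the interactive window where the status bar should show the full date.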
