Summary
I want to use streamlit to create a dashboard of all the trades (buy and sell) happening in a given market. I connect to a websocket stream to receive data of BTCUSDT from the Binance exchange. Messages are received every ~0.1s and I would like to update my dashboard in ~0.09s.
How can you handle this kind of situation where messages are delivered at high frequency? With my code I successfully create a dashboard, but it doesn't get updated fast enough, and I suspect it is lagging behind the stream.
The dashboard must display the buy and sell volumes at any moment in time as bar charts. I am also adding metrics that show the total buy and sell volumes, as well as their change.
Steps to reproduce
My code is structured in the following way.
There is a streamer.py file that defines a class Stream. A Stream object is a WebSocket client: it connects to a stream, handles messages, and updates the dashboard. Whenever a new message is received, Stream acquires a threading.Lock() and updates the pandas dataframes (one dataframe for buy orders and one for sell orders). If there are multiple orders at the same timestamp, it combines them by summing the corresponding volumes. It then releases the threading.Lock() and spawns a new thread that runs the update function (defined in streamer.py). The update function acquires the same lock so that it does not read the dataframes while they are being modified.
In the main.py file, the Streamlit dashboard and the Stream object are initialized.
To reproduce the following code you need to connect to the WebSocket from a region where Binance is not restricted. Since I live in the US, I must use a VPN to properly receive the data.
Code snippet:
main.py file
# main.py
import streamer
import pandas as pd
import streamlit as st # web development
import numpy as np # np mean, np random
import time # to simulate a real time data, time loop
import plotly.express as px # interactive charts
df_buy = pd.DataFrame(columns = [ 'Price', 'Quantity', 'USD Value'])
df_sell = pd.DataFrame(columns = [ 'Price', 'Quantity', 'USD Value'])
st.set_page_config(
    page_title='Real-Time Data Science Dashboard',
    page_icon='✅',
    layout='wide'
)
# dashboard title
st.title("Real-Time / Live Data Science Dashboard")
placeholder = st.empty()
streamer.Stream(df_buy,df_sell,placeholder).connect()
streamer.py file
# streamer.py
import websocket
import json
import streamlit as st
import plotly.express as px
import pandas as pd
from threading import Thread, Lock
from streamlit.script_run_context import add_script_run_ctx
from datetime import datetime
import time
def on_close(ws, close_status_code, close_msg):
    print('LOG', 'Closed orderbook client')


def update(df_buy, df_sell, placeholder, lock):
    lock.acquire()
    with placeholder.container():
        # create two columns for the metrics
        kpi1, kpi2 = st.columns(2)
        current_sumSellVolumes = df_sell['Quantity'].sum()
        previous_sumSellVolumes = df_sell.iloc[:-1]['Quantity'].sum()
        current_sumBuyVolumes = df_buy['Quantity'].sum()
        previous_sumBuyVolumes = df_buy.iloc[:-1]['Quantity'].sum()
        # fill in the two columns with the respective metrics or KPIs
        kpi2.metric(label="Sell quantity 📉", value=round(current_sumSellVolumes, 2),
                    delta=round(current_sumSellVolumes - previous_sumSellVolumes, 2))
        kpi1.metric(label="Buy quantity 📈", value=round(current_sumBuyVolumes, 2),
                    delta=round(current_sumBuyVolumes - previous_sumBuyVolumes, 2))
        # create two columns for the charts
        fig_col1, fig_col2 = st.columns(2)
        with fig_col1:
            st.markdown("### Buy Volumes")
            fig = px.bar(data_frame=df_buy, x=df_buy.index, y='Quantity')
            st.write(fig)
        with fig_col2:
            st.markdown("### Sell Volumes")
            fig2 = px.bar(data_frame=df_sell, x=df_sell.index, y='Quantity')
            st.write(fig2)
        st.markdown("### Detailed Data View")
        st.dataframe(df_buy)
        st.dataframe(df_sell)
    lock.release()
class Stream():
    def __init__(self, df_buy, df_sell, placeholder):
        self.symbol = 'BTCUSDT'
        self.df_buy = df_buy
        self.df_sell = df_sell
        self.placeholder = placeholder
        self.lock = Lock()
        self.url = "wss://stream.binance.com:9443/ws"
        self.stream = f"{self.symbol.lower()}@aggTrade"
        self.times = []

    def on_error(self, ws, error):
        print(self.times)
        print('ERROR', error)

    def on_open(self, ws):
        print('LOG', f'Opening WebSocket stream for {self.symbol}')
        subscribe_message = {"method": "SUBSCRIBE",
                             "params": [self.stream],
                             "id": 1}
        ws.send(json.dumps(subscribe_message))

    def handle_message(self, message):
        self.lock.acquire()
        timestamp = datetime.utcfromtimestamp(int(message['T']) / 1000)
        price = float(message['p'])
        qty = float(message['q'])
        USDvalue = price * qty
        side = 'BUY' if message['m'] == False else 'SELL'
        if side == 'BUY':
            df = self.df_buy
        else:
            df = self.df_sell
        if timestamp not in df.index:
            df.loc[timestamp] = [price, qty, USDvalue]
        else:
            df.loc[df.index == timestamp, 'Quantity'] += qty
            df.loc[df.index == timestamp, 'USD Value'] += USDvalue
        self.lock.release()

    def on_message(self, ws, message):
        message = json.loads(message)
        self.times.append(time.time())
        if 'e' in message:
            self.handle_message(message)
            thr = Thread(target=update, args=(self.df_buy, self.df_sell, self.placeholder, self.lock,))
            add_script_run_ctx(thr)
            thr.start()

    def connect(self):
        print('LOG', 'Connecting to websocket')
        self.ws = websocket.WebSocketApp(self.url, on_close=on_close, on_error=self.on_error,
                                         on_open=self.on_open, on_message=self.on_message)
        self.ws.run_forever()
Debug info
Streamlit version: 1.4.0
Python version: 3.10.4
OS version: MacOS 13.1
Browser version: Safari 16.2
Related
The goal is to pull real-time data in the background (say every 5 seconds) and bring it into the dashboard when needed. Here is my code. It kinda works, but I am seeing two issues: 1. If I move st.write("TESTING!") to the end, it never gets executed because of the while loop. Is there a way to improve this? I can imagine that as the dashboard grows there will be multiple pages/tables etc., so this won't give much flexibility. 2. The return px line in the async function: I am not very comfortable with it because I got it right via trial and error. Sorry for being such a newbie, but if there are better ways to do it, I would really appreciate it.
Thank you!
import asyncio
import streamlit as st
import numpy as np

st.set_page_config(layout="wide")


async def data_generator(test):
    while True:
        with test:
            px = np.random.randn(5, 1)
            await asyncio.sleep(1)
            return px


test = st.empty()
st.write("TESTING!")

with test:
    while True:
        px = asyncio.run(data_generator(test))
        st.write(px[0])
From my experience, the trick to using asyncio is to create your layout ahead of time, using empty widgets where you need to display async info. The async coroutine would take in these empty slots and fill them out. This should help you create a more complex application.
Then the asyncio.run command can become the last streamlit action taken. Any streamlit commands after this wouldn't be processed, as you have observed.
I would also recommend arranging any input widgets outside of the async function, during the initial layout, and then sending the widget output in for processing. Of course you could draw your input widgets inside the function, but the layout might then become tricky.
If you still want to have your input widgets inside your async function, you'd definitely have to put them outside of the while loop, otherwise you would get a duplicated-widget error. (You might try to overcome this by creating new widgets all the time, but then the input widgets would be "reset" and interaction isn't achieved, not to mention the possible memory issues.)
Here's a complete example of what I mean:
import asyncio
import pandas as pd
import plotly.express as px
import streamlit as st
from datetime import datetime

CHOICES = [1, 2, 3]


def main():
    print('\nmain...')
    # layout your app beforehand, with st.empty
    # for the widgets that the async function would populate
    graph = st.empty()
    radio = st.radio('Choose', CHOICES, horizontal=True)
    table = st.empty()

    try:
        # async run the draw function, sending in all the
        # widgets it needs to use/populate
        asyncio.run(draw_async(radio, graph, table))
    except Exception as e:
        print(f'error...{type(e)}')
        raise
    finally:
        # some additional code to handle user clicking stop
        print('finally')
        # this doesn't actually get called, I think :(
        table.write('User clicked stop!')


async def draw_async(choice, graph, table):
    # must send in all the streamlit widgets that
    # this fn would interact with...

    # this could possibly work, but layout is tricky
    # choice2 = st.radio('Choose 2', CHOICES)

    while True:
        # this would not work because you'd be creating duplicated
        # radio widgets
        # choice3 = st.radio('Choose 3', CHOICES)

        timestamp = datetime.now()
        sec = timestamp.second

        graph_df = pd.DataFrame({
            'x': [0, 1, 2],
            'y': [max(CHOICES), choice, choice * sec / 60.0],
            'color': ['max', 'current', 'ticking']
        })

        df = pd.DataFrame({
            'choice': CHOICES,
            'current_choice': len(CHOICES) * [choice],
            'time': len(CHOICES) * [timestamp]
        })

        graph.plotly_chart(px.bar(graph_df, x='x', y='y', color='color'))
        table.dataframe(df)

        _ = await asyncio.sleep(1)


if __name__ == '__main__':
    main()
Would something like this work?
import asyncio
import streamlit as st


async def tick(placeholder):
    tick = 0
    while True:
        with placeholder:
            tick += 1
            st.write(tick)
        await asyncio.sleep(1)


async def main():
    st.header("Async")
    placeholder = st.empty()
    await tick(placeholder)


asyncio.run(main())
import datetime
import time
import requests
from plyer import notification

covidDATA = None  # no data initially
try:
    covidDATA = requests.get('https://corona-rest-api.herokuapp.com/Api/india')
except:
    print('Please check your internet connection')

if covidDATA != None:
    # converting data to JSON format (for easier reading)
    data = covidDATA.json()['Success']
    while True:
        notification.notify(
            # title for the notification
            title='COVID19 stats on {}'.format(datetime.date.today()),
            # body of the notification
            message='Total cases = {totalcases}\nToday cases : {todaycases}\nToday deaths :{todaydeaths}\nTotal active:{active}'.format(
                totalcases=data['cases'],
                todaycases=data['todayCases'],
                todaydeaths=data['todayDeaths'],
                active=data['active']),
            # icon for the notification
            app_icon='alarm-bell_icon-icons.com_68596.ico',
            # how long the notification stays visible (seconds)
            timeout=50
        )
        # wait before repeating
        time.sleep(60 * 60 * 4)
I can see the icon showing up, but there are no notifications; even after clicking on the icon, no notifications pop up. I am a learner and don't know a lot about these modules. How can the notifications be made to appear?
I am trying to resample live ticks from the KiteTicker websocket into OHLC candles using pandas. This is the code I have written, which works fine with a single instrument (the commented-out trd_portfolio) but doesn't work with multiple instruments (the uncommented trd_portfolio), as it mixes up data from different instruments.
Is there any way to relate the final candles df to the instrument tokens, or to make this work with multiple instruments?
I would like to run my algo on multiple instruments at once; please suggest if there is a better way to go about it.
from kiteconnect import KiteTicker
from kiteconnect import KiteConnect
import logging
import time, os, datetime, math
import winsound
import pandas as pd

trd_portfolio = {954883: "USDINR19MARFUT", 4632577: "JUBLFOOD"}
# trd_portfolio = {954883:"USDINR19MARFUT"}

trd_tkn1 = []
for x in trd_portfolio:
    trd_tkn1.append(x)

c_id = '****************'
ak = '************'
asecret = '*************************'

kite = KiteConnect(api_key=ak)
print('[*] Generate access Token : ', kite.login_url())
request_tkn = input('[*] Enter Your Request Token Here : ')[-32:]
data = kite.generate_session(request_tkn, api_secret=asecret)
kite.set_access_token(data['access_token'])
kws = KiteTicker(ak, data['access_token'])

# columns in data frame
df_cols = ["Timestamp", "Token", "LTP"]
data_frame = pd.DataFrame(data=[], columns=df_cols, index=[])


def on_ticks(ws, ticks):
    global data_frame, df_cols
    data = dict()
    for company_data in ticks:
        token = company_data["instrument_token"]
        ltp = company_data["last_price"]
        timestamp = company_data['timestamp']
        data[timestamp] = [timestamp, token, ltp]
    tick_df = pd.DataFrame(data.values(), columns=df_cols, index=data.keys())
    data_frame = data_frame.append(tick_df)
    ggframe = data_frame.set_index(['Timestamp'], ['Token'])
    print(ggframe)
    gticks = ggframe.ix[:, ['LTP']]
    candles = gticks['LTP'].resample('1min').ohlc().dropna()
    print(candles)


def on_connect(kws, response):
    print('Connected')
    kws.subscribe(trd_tkn1)
    kws.set_mode(kws.MODE_FULL, trd_tkn1)


def on_close(ws, code, reason):
    print('Connection Error')


kws.on_ticks = on_ticks
kws.on_connect = on_connect
kws.on_close = on_close
kws.connect()
I don't have access to the Kite API, but I've been looking at some code snippets that use it trying to figure out another issue I'm having related to websockets. I came across this open question, and I think I can help, though I can't really test this solution.
The problem, I think, is that you're not calculating OHLC for each token separately... it just does it across all tokens at once.
data_frame = data_frame.append(tick_df)
ggframe = data_frame.set_index('Timestamp')
candles = ggframe.groupby('Token').resample('1min').agg({'LTP': 'ohlc'})
You'll get a multi-index output, but the column names might not quite line up for the rest of your code. To fix that:
candles.columns = ['open', 'high', 'low', 'close']
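Since I can't run this against live ticks, here is a small self-contained sketch of the same groupby/resample idea on made-up tick data (the column names match your df_cols; the prices and timestamps are invented), just to show the shape of the output:
import pandas as pd

# made-up ticks for two instrument tokens, a few seconds apart
ticks = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2019-03-01 09:15:01", "2019-03-01 09:15:20", "2019-03-01 09:15:45",
        "2019-03-01 09:15:05", "2019-03-01 09:15:30",
    ]),
    "Token": [954883, 954883, 954883, 4632577, 4632577],
    "LTP": [69.10, 69.15, 69.05, 1310.0, 1311.5],
})

ggframe = ticks.set_index("Timestamp")

# one OHLC row per (Token, minute); the rows get a MultiIndex
candles = ggframe.groupby("Token").resample("1min").agg({"LTP": "ohlc"})
candles.columns = ["open", "high", "low", "close"]
print(candles)

# the candles for a single instrument can then be pulled out by token
print(candles.loc[954883])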
I am trying to export IB positions/account values into a data frame for further processing in Python, but I have failed to figure out how to achieve this. Can anyone help?
import pandas as pd
import numpy as np
import time
import ibapi
from ibapi.client import EClient
from ibapi.wrapper import EWrapper
import threading
import sys
import queue
from ibapi.contract import Contract


class MyWrapper(EWrapper):
    ##property
    def updatePortfolio(self, contract: Contract, position: float, marketPrice: float, marketValue: float,
                        averageCost: float, unrealizedPNL: float, realizedPNL: float, accountName: str):
        super().updatePortfolio(contract, position, marketPrice, marketValue, averageCost,
                                unrealizedPNL, realizedPNL, accountName)
        if (len(contract.symbol) < 5) & (contract.secType == 'STK'):
            new_symbol = contract.symbol.zfill(5)
        else:
            new_symbol = contract.symbol
        print(contract.secType, contract.exchange, new_symbol, "Position:", position, "MarketPrice:", marketPrice,
              "MarketValue:", marketValue, "AverageCost:", averageCost, "UnrealizedPNL:", unrealizedPNL,
              "RealizedPNL:", realizedPNL)


accountName = ''
callback = MyWrapper()  # wrapper = MyWrapper()
# instantiate MyWrapper as the callback
tws = EClient(callback)  # app = EClient(wrapper)
# instantiate EClient and return data to the callback

host = '127.0.0.1'
port = 4001
clientID = 8
tws.connect(host, port, clientID)
print("serverVersion:%s connectionTime:%s" % (tws.serverVersion(), tws.twsConnectionTime()))
print(tws.isConnected())

tws.reqAccountUpdates(1, accountName)
time.sleep(2)
tws.run()

accvalue = pd.DataFrame(callback.updatePortfolio, columns=['Symbol', 'Position', 'MarketPrice', 'MarketValue',
                                                           'AverageCost', 'UnrealisedPnL', 'RealisedPnL'])
# accvalue = callback.updateAccountValue

print('Account: \n' + accvalue)
You are on the right track. You need to set up the queue class objects inside of the wrapper to collect the response from the client function you are calling. Then, you can do anything you want with the data. Take a look at this blog --> https://qoppac.blogspot.com/2017/03/interactive-brokers-native-python-api.html
There is some code there you can reuse to help with the implementation.
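As a rough, untested sketch of that idea (I don't have a TWS/Gateway session to verify against, and the fixed 2-second wait is only for illustration), the wrapper can push every updatePortfolio row onto a queue and the main thread can drain the queue into a DataFrame:
import queue
import threading
import time

import pandas as pd
from ibapi.client import EClient
from ibapi.contract import Contract
from ibapi.wrapper import EWrapper


class MyWrapper(EWrapper):
    def __init__(self):
        super().__init__()
        # rows produced on the API thread are collected here
        self.portfolio_q = queue.Queue()

    def updatePortfolio(self, contract: Contract, position: float, marketPrice: float, marketValue: float,
                        averageCost: float, unrealizedPNL: float, realizedPNL: float, accountName: str):
        self.portfolio_q.put({
            'Symbol': contract.symbol,
            'Position': position,
            'MarketPrice': marketPrice,
            'MarketValue': marketValue,
            'AverageCost': averageCost,
            'UnrealisedPnL': unrealizedPNL,
            'RealisedPnL': realizedPNL,
        })


wrapper = MyWrapper()
app = EClient(wrapper)
app.connect('127.0.0.1', 4001, clientId=8)

# run the client message loop in the background so this script can continue
threading.Thread(target=app.run, daemon=True).start()

app.reqAccountUpdates(True, '')
time.sleep(2)  # crude: give TWS a moment to stream the portfolio updates

rows = []
while not wrapper.portfolio_q.empty():
    rows.append(wrapper.portfolio_q.get())

accvalue = pd.DataFrame(rows)
print(accvalue)

app.disconnect()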
I am trying to display live price updates coming from a Redis pub/sub channel in a grid in Jupyter. Every time there is a price update, the message will be added at the end of the grid. In other words, a grid-view widget will be tied to a DataFrame, so every time the DataFrame changes, the grid view will change too. The idea is to get something like this:
I tried to do that by displaying and clearing the output. However, I am not getting a streaming grid that updates in place, but rather output that is displayed and then cleared, which is very annoying.
Here is the output widget in one jupyter cell
import ipywidgets as iw
from IPython.display import display

o = iw.Output()


def output_to_widget(df, output_widget):
    output_widget.clear_output()
    with output_widget:
        display(df)


o
Here is the code to subscribe to Redis and handle the messages
import redis, json, time
import pandas as pd

r = redis.StrictRedis(host=HOST, password=PASS, port=PORT, db=DB)
p = r.pubsub(ignore_subscribe_messages=True)
p.subscribe('QUOTES')

mdf = pd.DataFrame()
while True:
    message = p.get_message()
    if message:
        json_msg = json.loads(message['data'])
        df = pd.DataFrame([json_msg]).set_index('sym')
        mdf = mdf.append(df)
        output_to_widget(mdf, o)
    time.sleep(0.001)
Try changing the first line of output_to_widget to output_widget.clear_output(wait = True).
https://ipython.org/ipython-doc/3/api/generated/IPython.display.html
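Concretely, only the clear_output call in your handler changes; with wait=True the old frame stays on screen until the new one is ready, which removes the flicker (a minimal sketch using your widget and function names):
import ipywidgets as iw
from IPython.display import display

o = iw.Output()

def output_to_widget(df, output_widget):
    # defer clearing until the next display() call arrives
    output_widget.clear_output(wait=True)
    with output_widget:
        display(df)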
I was able to get it to work using Streaming DataFrames from the streamz library.
Here is the class that emits the data to the streaming dataframe.
import json
import pandas as pd


class DataEmitter:
    def __init__(self, pubsub, src):
        self.pubsub = pubsub
        self.src = src
        self.thread = None

    def emit_data(self, channel):
        self.pubsub.subscribe(**{channel: self._handler})
        self.thread = self.pubsub.run_in_thread(sleep_time=0.001)

    def stop(self):
        self.pubsub.unsubscribe()
        self.thread.stop()

    def _handler(self, message):
        json_msg = json.loads(message['data'])
        df = pd.DataFrame([json_msg])
        self.src.emit(df)
and here is the cell to display the streaming dataframe
from streamz import Stream

r = redis.StrictRedis(host=HOST, password=PASS, port=PORT, db=DB)
p = r.pubsub(ignore_subscribe_messages=True)

source = Stream()
emitter = DataEmitter(p, source)
emitter.emit_data(channel='PRICE_UPDATES')

# sample of how the dataframe is going to look
example = pd.DataFrame({'time': [], 'sym': []})
sdf = source.to_dataframe(example=example)
sdf