I want to implement a class that can start several websockets in different threads to retrieve market data and update the class attributes. I am using the kucoin-python-sdk library for that purpose.
The code below works fine in Spyder; however, when I run my script from a conda batch file it fails with the following errors over and over.
Thank you.
<Task finished name='Task-4' coro=<ConnectWebsocket._run() done, defined at path\lib\site-packages\kucoin\websocket\websocket.py:33> exception=RuntimeError("can't register atexit after shutdown")> got an exception can't register atexit after shutdown
pending name='Task-3' coro=<ConnectWebsocket._recover_topic_req_msg() running at path\lib\site-packages\kucoin\websocket\websocket.py:127> wait_for=> cancel ok.
_reconnect over.
<Task finished name='Task-7' coro=<ConnectWebsocket._run() done, defined at path\lib\site-packages\kucoin\websocket\websocket.py:33> exception=RuntimeError("can't register atexit after shutdown")> got an exception can't register atexit after shutdown
pending name='Task-6' coro=<ConnectWebsocket._recover_topic_req_msg() running at path\lib\site-packages\kucoin\websocket\websocket.py:127> wait_for=> cancel ok.
_reconnect over.
Hence I am wondering:
Does the issue come from the Kucoin package, or is my implementation of threads/asyncio incorrect?
How can the different behavior between Spyder and conda, on the same environment, be explained?
Python 3.9.13 | Spyder 5.3.3 | Spyder kernel 2.3.3 | websocket 0.2.1 | nest-asyncio 1.5.6 | kucoin-python 1.0.11
Class_X.py
import asyncio
import nest_asyncio
nest_asyncio.apply()
from kucoin.client import WsToken
from kucoin.ws_client import KucoinWsClient
from threading import Thread

class class_X():
    def __init__(self):
        self.msg = ""

    async def main(self):
        async def book_msg(msg):
            self.msg = msg

        client = WsToken()
        ws_client = await KucoinWsClient.create(None, client, book_msg, private=False)
        await ws_client.subscribe('/market/level2:BTC-USDT')
        while True:
            await asyncio.sleep(20)

    def launch(self):
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        loop.run_until_complete(self.main())

instance = class_X()
t = Thread(target=instance.launch)
t.start()
Batch
call path\anaconda3\Scripts\activate myENV
python "path1\class_X.py"
conda deactivate
I want to say it's your implementation but I haven't tried using that client the way you're doing it. Here's a pared down skeleton of what I'm doing to implement that kucoin-python in async.
import asyncio
from kucoin.client import WsToken
from kucoin.ws_client import KucoinWsClient
from kucoin.client import Market
from kucoin.client import User
from kucoin.client import Trade

# `config` and `tradeable_holdings` are defined elsewhere in my bot.

async def main():
    async def handle_event(msg):
        if '/market/snapshot:' in msg['topic']:
            snapshot = msg['data']['data']
            ## trade logic here using snapshot data
        elif msg['topic'] == '/spotMarket/tradeOrders':
            print(msg['data'])
        else:
            print("Unhandled message type")
            print(msg)

    async def unsubscribeFromPublicSnapshot(symbol):
        await ksm.unsubscribe('/market/snapshot:' + symbol)

    async def subscribeToPublicSnapshot(symbol):
        try:
            print("subscribing to " + symbol)
            await ksm.subscribe('/market/snapshot:' + symbol)
        except Exception as e:
            print("Error subscribing to snapshot for " + symbol)
            print(e)

    pubClient = WsToken()
    print("creating websocket client")
    ksm = await KucoinWsClient.create(None, pubClient, handle_event, private=False)

    # for private topics pass private=True
    privateClient = WsToken(config["tradeKey"], config["tradeSecret"], config["tradePass"])
    ksm_private = await KucoinWsClient.create(None, privateClient, handle_event, private=True)

    # Always subscribe to BTC-USDT
    await subscribeToPublicSnapshot('BTC-USDT')

    # Subscribe to the currency-BTC spot market for each available currency
    for doc in tradeable_holdings:
        if doc['currency'] != 'BTC':  # Don't need to resubscribe :D
            await subscribeToPublicSnapshot(doc['currency'] + "-BTC")

    # Subscribe to spot market trade orders
    await ksm_private.subscribe('/spotMarket/tradeOrders')

if __name__ == "__main__":
    print("Step 1: Kubot initialized")
    print("Step 2: ???")
    print("Step 3: Profit")
    loopMain = asyncio.get_event_loop()
    loopMain.create_task(main())
    loopMain.run_forever()
    loopMain.close()
As you can probably guess, "tradeable_holdings" is a list of symbols I'm interested in that I already own. You'll also notice I'm using the snapshot instead of the market/ticker subscription. I think at 100ms updates on the ticker, it could quickly run into latency and race conditions - at least until I figure out how to deal with those. So I opted for the snapshot which only updates every 2 seconds and for the less active coins, not even that often.
Anyway, I'm not yet at the point where it actually trades, but I'm quickly getting to that logic.
Hope this helps you figure your implementation out even though it's different.
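On the Spyder vs. batch difference, one plausible explanation (an assumption, not verified against this exact setup): in a plain `python` run the main thread reaches the end of the script right after `t.start()`, interpreter shutdown begins, and from Python 3.9 on any later attempt to create a thread pool fails with "can't register atexit after shutdown". asyncio creates such a pool lazily the first time it needs one. In Spyder the kernel's main thread stays alive, so shutdown never starts. A minimal sketch of keeping the main thread alive with `join()`; `Feed` and `fake_stream` are hypothetical stand-ins for `class_X` and the KuCoin websocket:

```python
import asyncio
from threading import Thread


class Feed:
    """Hypothetical stand-in for class_X; no KuCoin dependency."""

    def __init__(self):
        self.msg = ""

    async def fake_stream(self, n):
        # Pretend to receive n websocket messages and store the latest one.
        for i in range(n):
            await asyncio.sleep(0.01)
            self.msg = f"tick {i}"

    def launch(self, n=3):
        # asyncio.run creates a fresh event loop in this thread and closes it afterwards.
        asyncio.run(self.fake_stream(n))


def run_and_wait(n=3):
    feed = Feed()
    t = Thread(target=feed.launch, args=(n,))
    t.start()
    # Block the main thread until the worker is done, so interpreter
    # shutdown does not begin while the event loop is still running.
    t.join()
    return feed.msg
```

With a websocket that never finishes, `t.join()` simply blocks forever, which is the intended behavior for a long-running batch script.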
Related
I am running a code to receive messages from Azure Eventhub, similar to the official startup example. However, I am using Jupyter notebooks so I cannot use asyncio to manage the tasks (if I understand correctly). I want to stop receiving messages when I cross a certain time threshold (e.g. 10 minutes before now) however the task continues running even if I raise an exception. The only way to stop it is by interrupting the kernel.
How to approach this? I am new to asynchronous programming so I might have misunderstood some concepts.
Thank you in advance for any help.
A code example (raising exception that did not work):
from azure.eventhub.aio import EventHubConsumerClient
from azure.eventhub.extensions.checkpointstoreblobaio import BlobCheckpointStore
from datetime import datetime, timedelta

# progress_dict is not shown in this snippet.

async def on_event_batch(partition_context, event_batch):
    # Collect the event data.
    result_data = []
    for event in event_batch:
        result_data.append(event.body_as_json(encoding='UTF-8'))
    max_session_time = max([datetime.utcfromtimestamp(event['session']['serverTime']/1000) for event in result_data])
    # Update the checkpoint so that the program doesn't read the events
    # that it has already read when you run it next time.
    await partition_context.update_checkpoint(event)
    if datetime.utcnow() - progress_dict[partition_context.partition_id] < timedelta(minutes=10):
        print('Calculation finished')
        raise Exception('Calculation finished')

async def main():
    # Create an Azure blob checkpoint store to store the checkpoints.
    checkpoint_store = BlobCheckpointStore.from_connection_string("AZURE STORAGE CONNECTION STRING", "BLOB CONTAINER NAME")
    # Create a consumer client for the event hub.
    client = EventHubConsumerClient.from_connection_string("EVENT HUBS NAMESPACE CONNECTION STRING", consumer_group="$Default", eventhub_name="EVENT HUB NAME", checkpoint_store=checkpoint_store)
    try:
        async with client:
            # Call the receive method. Read from the beginning of the partition (starting_position: "-1")
            await client.receive_batch(on_event_batch=on_event_batch,  # on_error=on_error,
                                       max_batch_size=100, starting_position="-1")
    except Exception as e:
        if e.args == ('Calculation finished',):
            print(e)
        else:
            raise e
I have a class that contains a function that I would like to be able to invoke by calling a flask-restful endpoint. Is there a way to define an asynchronous function that would await/subscribe to this endpoint being called? I can make changes to the Flask app if required (but can't switch to SocketIO), or write some sort of async requests function. I can only work with the base Anaconda 3.7 library and I don't have any additional message brokers installed or available.
class DaemonProcess:
def __init__(self):
pass
async def await_signal():
signal = await http://ip123/signal
self.(process_signal) # do stuff with signal
For context, this isn't the main objective of the process. I simply want to be able to use this to tell my process remotely or via UI to shut down worker processes either gracefully or forcefully. The only other idea I came up with is pinging a database table repeatedly to see if a signal has been inserted, but time is of the essence and would require pinging at too short of intervals in my opinion and an asynchronous approach would be favored. The database would be SQLite3 and it doesn't appear to support update_hook callbacks.
Here's a sample pattern to send a signal and process it:
import asyncio
import aiotools

async def process(reader, writer):
    data = await reader.read(100)
    writer.write(data)
    print(f"We got a message {data} - time to do something about it.")
    await writer.drain()
    writer.close()

@aiotools.server
async def worker(loop, pidx, args):
    # The loop argument of asyncio.start_server was removed in Python 3.10; drop it there.
    server = await asyncio.start_server(process, '127.0.0.1', 8888,
                                        reuse_port=True, loop=loop)
    print(f'[{pidx}] started')
    yield  # wait until terminated
    server.close()
    await server.wait_closed()
    print(f'[{pidx}] terminated')

class DaemonProcess:
    def start(self):
        # Run the above server using 4 worker processes.
        aiotools.start_server(worker, num_workers=4)

if __name__ == '__main__':
    d = DaemonProcess()
    d.start()
If you save it in a file, for example process.py, you should be able to start it:
python3 process.py
Now once you have this daemon in background, you should be able to ping it (see a sample client below):
import asyncio

async def tcp_echo_client(message):
    reader, writer = await asyncio.open_connection('127.0.0.1', 8888)
    print(f'Send: {message!r}')
    writer.write(message.encode())
    await writer.drain()
    data = await reader.read(100)
    print(f'Received: {data.decode()!r}')
    print('Close the connection')
    writer.close()
    await writer.wait_closed()
And now, somewhere in your Flask view, you should be able to invoke:
asyncio.run(tcp_echo_client('I want my daemon to do something for me'))
Notice that this all uses localhost (127.0.0.1) and port 8888, so those need to be available. If you have your own IPs and ports, configure them accordingly.
Also notice the use of aiotools, a module providing a set of common asyncio patterns (daemons, etc.).
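The same signal round trip can be tried in a single process with only the standard library. This is a minimal sketch, not the aiotools setup above: `handle_signal`, `send_signal`, and `run_demo` are hypothetical names, the `ack:` reply format is made up for illustration, and port 0 asks the OS for any free port.

```python
import asyncio


async def handle_signal(reader, writer):
    # Daemon side: read one newline-terminated signal and acknowledge it.
    data = await reader.readline()
    writer.write(b"ack:" + data.strip())
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def send_signal(host, port, message):
    # Client side: what a Flask view could call to notify the daemon.
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(message.encode() + b"\n")
    await writer.drain()
    reply = await reader.read()  # read until the daemon closes the connection
    writer.close()
    await writer.wait_closed()
    return reply.decode()


async def demo():
    # Port 0 lets the OS choose a free port; look it up from the socket.
    server = await asyncio.start_server(handle_signal, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        return await send_signal("127.0.0.1", port, "shutdown")


def run_demo():
    return asyncio.run(demo())
```

In a real deployment the server would live in the daemon process and only `send_signal` would be called from the Flask side.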
I am running into some trouble with Azure Event Hub with Python. Below is my starter code for the connection (taken from the Microsoft docs):
import asyncio
from azure.eventhub.aio import EventHubConsumerClient
from azure.eventhub.extensions.checkpointstoreblobaio import BlobCheckpointStore

async def on_event(partition_context, event):
    # Print the event data.
    print("Received the event: \"{}\" from the partition with ID: \"{}\"".format(event.body_as_str(encoding='UTF-8'), partition_context.partition_id))
    # Update the checkpoint so that the program doesn't read the events
    # that it has already read when you run it next time.
    await partition_context.update_checkpoint(event)

async def main():
    # Create an Azure blob checkpoint store to store the checkpoints.
    checkpoint_store = BlobCheckpointStore.from_connection_string("AZURE STORAGE CONNECTION STRING", "BLOB CONTAINER NAME")
    # Create a consumer client for the event hub.
    client = EventHubConsumerClient.from_connection_string("EVENT HUBS NAMESPACE CONNECTION STRING", consumer_group="$Default", eventhub_name="EVENT HUB NAME", checkpoint_store=checkpoint_store)
    async with client:
        # Call the receive method. Read from the beginning of the partition (starting_position: "-1")
        await client.receive(on_event=on_event, starting_position="-1")

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    # Run the main method.
    loop.run_until_complete(main())
Here, the receiver/consumer keeps listening. If I remove any of the awaits, the consumer throws an error.
Does anyone know how to stop the consumer after it has run for some time (like a timeout)?
@Abhishek
There are two options here:
You could stop listening when there has been no activity for a certain period of time.
You could stop listening after a fixed duration.
Both are detailed in the steps below.
OPTION 1
You could use the max_wait_time parameter to stop listening when there has been no activity for a certain time.
I spun up a simple use case of the above, which you could optimize further.
import asyncio
from azure.eventhub.aio import EventHubConsumerClient

event_hub_connection_str = '<CON_STR>'
eventhub_name = '<EventHub_NAME>'

consumer = EventHubConsumerClient.from_connection_string(
    conn_str=event_hub_connection_str,
    consumer_group='$Default',
    eventhub_name=eventhub_name  # EventHub name should be specified if it doesn't show up in connection string.
)

# Flag read by main(); on_event sets it to False once max_wait_time passes without an event.
receive = True

# This callback is invoked when an event is received or when max_wait_time elapses.
async def on_event(partition_context, event):
    global receive
    print(event)  # Optional - to see output
    # event is None when the callback fires because max_wait_time elapsed without an event.
    if event is not None:
        print("Received the event: \"{}\" from the partition with ID: \"{}\"".format(event.body_as_str(encoding='UTF-8'), partition_context.partition_id))
        # You can add other code here, such as updating the blob checkpoint store.
    else:
        print("Timeout is Hit")
        # Tell main() to stop receiving.
        receive = False

async def close():
    print("Closing the client.")
    await consumer.close()
    print("Closed")

async def main():
    recv_task = asyncio.ensure_future(consumer.receive(on_event=on_event, max_wait_time=15))
    while True:  # check the flag every 3 seconds
        await asyncio.sleep(3)
        if not receive:
            print("Cancelling the Task")
            recv_task.cancel()  # stop receiving by cancelling the task
            break

asyncio.run(main())
asyncio.run(close())  # closing the client
With the above code, if there is no activity for 15 seconds, the async task is cancelled and the consumer client is closed, so the program exits gracefully.
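The "stop after a period of inactivity" idea can be tried without an Event Hub. The sketch below is plain asyncio, not the Event Hubs API: an `asyncio.Queue` stands in for the event source, and `drain_until_idle` and `idle_timeout` are hypothetical names chosen for illustration.

```python
import asyncio


async def drain_until_idle(queue, idle_timeout):
    """Collect items until none arrives within idle_timeout seconds."""
    items = []
    while True:
        try:
            # Wait for the next item, but give up after idle_timeout seconds.
            item = await asyncio.wait_for(queue.get(), timeout=idle_timeout)
        except asyncio.TimeoutError:
            # Nothing arrived in time: treat it as inactivity and stop.
            return items
        items.append(item)


async def demo():
    queue = asyncio.Queue()
    for i in range(3):
        queue.put_nowait(i)
    return await drain_until_idle(queue, idle_timeout=0.05)
```

Here the timer restarts after every item, so a steady stream keeps the loop alive and only a quiet period ends it.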
OPTION 2
If you would like the client to listen for a fixed time, such as one hour, you could refer to the code below.
Reference Code
event_hub_connection_str = '<>'
eventhub_name = '<>'
import asyncio
from azure.eventhub.aio import EventHubConsumerClient

consumer = EventHubConsumerClient.from_connection_string(
    conn_str=event_hub_connection_str,
    consumer_group='$Default',
    eventhub_name=eventhub_name  # EventHub name should be specified if it doesn't show up in connection string.
)

async def on_event(partition_context, event):
    # Put your code here.
    # If the operation is i/o intensive, async will have better performance.
    print("Received event from partition: {}".format(partition_context.partition_id))

# The receive method is a coroutine which will be blocking when awaited.
# It can be executed in an async task for non-blocking behavior, and combined with the 'close' method.
async def main():
    recv_task = asyncio.ensure_future(consumer.receive(on_event=on_event))
    await asyncio.sleep(15)  # keep receiving for 15 seconds
    recv_task.cancel()  # stop receiving

async def close():
    print("Closing.....")
    await consumer.close()
    print("Closed")

asyncio.run(main())
asyncio.run(close())  # closing the client
The part of the code below is what makes the client listen for a certain time:
recv_task = asyncio.ensure_future(consumer.receive(on_event=on_event))
await asyncio.sleep(3)  # keep receiving for 3 seconds
recv_task.cancel()
You could increase the time as per your need.
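The fixed-duration pattern can be checked in isolation with a stand-in receiver. In this sketch `fake_receive` is a hypothetical replacement for `consumer.receive(...)`, and awaiting the cancelled task is an addition of mine so that the receiver has finished cleaning up before the function returns.

```python
import asyncio


async def fake_receive(received):
    # Hypothetical stand-in for consumer.receive(...): runs until cancelled.
    n = 0
    try:
        while True:
            await asyncio.sleep(0.01)
            received.append(n)
            n += 1
    except asyncio.CancelledError:
        received.append("cancelled")  # cleanup would go here
        raise


async def receive_for(seconds):
    received = []
    recv_task = asyncio.ensure_future(fake_receive(received))
    await asyncio.sleep(seconds)  # keep receiving for `seconds`
    recv_task.cancel()            # request cancellation
    try:
        await recv_task           # wait until the task has actually stopped
    except asyncio.CancelledError:
        pass
    return received
```

Running everything inside one `asyncio.run(...)` call also keeps the receiver and its cleanup on the same event loop.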
@Satya V I tried option 2, but I am seeing this error:
There is no current event loop in thread 'MainThread'.
However, your code pointed me in the right direction. I configured the code with a Storage Account checkpoint store:
import asyncio
import os
from azure.eventhub.aio import EventHubConsumerClient
from azure.eventhub.extensions.checkpointstoreblobaio import BlobCheckpointStore

CONNECTION_STR = ''
EVENTHUB_NAME = ''
STORAGE_CONNECTION_STR = ''
BLOB_CONTAINER_NAME = ""

async def on_event(partition_context, event):
    print("Received event from partition: {}.".format(partition_context.partition_id))
    await partition_context.update_checkpoint(event)

async def receive(client):
    await client.receive(
        on_event=on_event,
        starting_position="-1",  # "-1" is from the beginning of the partition.
    )

async def main():
    checkpoint_store = BlobCheckpointStore.from_connection_string(STORAGE_CONNECTION_STR, BLOB_CONTAINER_NAME)
    client = EventHubConsumerClient.from_connection_string(
        CONNECTION_STR,
        consumer_group="$Default",
        eventhub_name=EVENTHUB_NAME,
        checkpoint_store=checkpoint_store,  # For load-balancing and checkpoint. Leave None for no load-balancing.
    )
    async with client:
        recv_task = asyncio.ensure_future(receive(client))
        await asyncio.sleep(4)  # keep receiving for 4 seconds
        recv_task.cancel()  # stop receiving
        await client.close()

async def close():
    print("Closing.....")
    print("Closed")

if __name__ == '__main__':
    asyncio.run(main())
    asyncio.run(close())  # only prints; the client is already closed in main()
I am trying to create a client which uses an asyncio.Queue to feed the messages I want to send to the server. Receiving data from the websocket server works great. Sending data which is just generated by the producer works, too. To explain what works and what fails, here is my code first:
import sys
import asyncio
import websockets

class WebSocketClient:
    def __init__(self):
        self.send_queue = asyncio.Queue()
        #self.send_queue.put_nowait('test-message-1')

    async def startup(self):
        await self.connect_websocket()
        consumer_task = asyncio.create_task(
            self.consumer_handler()
        )
        producer_task = asyncio.create_task(
            self.producer_handler()
        )
        done, pending = await asyncio.wait(
            [consumer_task, producer_task],
            return_when=asyncio.ALL_COMPLETED
        )
        for task in pending:
            task.cancel()

    async def connect_websocket(self):
        try:
            self.connection = await websockets.client.connect('ws://my-server')
        except ConnectionRefusedError:
            sys.exit('error: cannot connect to backend')

    async def consumer_handler(self):
        async for message in self.connection:
            await self.consumer(message)

    async def consumer(self, message):
        self.send_queue.put_nowait(message)
        # await self.send_queue.put(message)
        print('mirrored message %s now in queue, queue size is %s' % (message, self.send_queue.qsize()))

    async def producer_handler(self):
        while True:
            message = await self.producer()
            await self.connection.send(message)

    async def producer(self):
        result = await self.send_queue.get()
        self.send_queue.task_done()
        #await asyncio.sleep(10)
        #result = 'test-message-2'
        return result

if __name__ == '__main__':
    wsc = WebSocketClient()
    asyncio.run(wsc.startup())
Connecting works great. If I send something from my server to the client, this works great too and prints the message in consumer(). But producer never gets any message I put in send_queue inside consumer().
The reason why I chose send_queue.put_nowait in consumer() was that I wanted to prevent deadlocks. If I use await self.send_queue.put(message) instead of self.send_queue.put_nowait(message), it makes no difference.
I thought maybe the queue does not work at all, so I put something into the queue right at creation in __init__(): self.send_queue.put_nowait("test-message-1"). This works and is sent to my server. So the basic concept of the queue and await queue.get() works.
I also thought maybe there is some issue with the producer, so I just generated messages at runtime: result = "test-message-2" instead of result = await self.send_queue.get(). This works too: every 10 seconds 'test-message-2' is sent to my server.
EDIT: This also happens if I try to add items to the queue from another source on the fly. I built a small asyncio socket server which pushes any message to the queue, which works great, and you can see the messages I added from the other source with qsize() in consumer(), but there is still no successful queue.get(). So the queue itself seems to work, just not get(). This is, by the way, the reason for the queue, too: I would like to send data from quite different sources.
So, this is the point where I'm stuck. My wild guess is that the queue I use in producer() is not the same as in consumer(), something which happens at threading quite easily if you use non-thread-safe queues like asyncio.Queue, but as I understood it I don't use threading at all, just coroutines. So, what else went wrong here?
Just for the context: it's a Ubuntu 20.04 python 3.8.2 inside a docker container.
Thanks,
Ernesto
Just for the record, the solution to my problem was quite simple: I defined send_queue outside the event loop created by my websocket client. So it called events.get_event_loop() and got its own loop, which was not the main loop and therefore never ran, so await queue.get() never got anything back.
In normal mode, you don't see any message that hints at this issue. But the Python documentation came to the rescue; of course they mention it at https://docs.python.org/3/library/asyncio-dev.html : logging.DEBUG gave the hints I needed to find the problem.
It should look like this:
class WebSocketClient:
    async def startup(self):
        self.send_queue = asyncio.Queue()
        await self.connect_websocket()
Then the queue is defined inside the main loop.
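A minimal sketch of that fix without a websocket server: the queue is created inside the coroutine that `asyncio.run()` executes, so it belongs to the running loop. `Mirror`, `sent`, and the `count` parameter are hypothetical names used only for this illustration. As far as I know, from Python 3.10 on `asyncio.Queue` binds to the running loop on first use, so the original symptom is specific to older versions such as 3.8.

```python
import asyncio


class Mirror:
    async def startup(self, messages):
        # Create the queue inside the running event loop, not in __init__.
        self.send_queue = asyncio.Queue()
        self.sent = []
        producer_task = asyncio.create_task(self.producer_handler(len(messages)))
        for message in messages:
            await self.send_queue.put(message)
        await producer_task
        return self.sent

    async def producer_handler(self, count):
        # Stand-in for sending over the websocket: record what was dequeued.
        for _ in range(count):
            message = await self.send_queue.get()
            self.sent.append(message)
            self.send_queue.task_done()
```

Calling `asyncio.run(Mirror().startup(["a", "b"]))` should hand back the messages in the order they were queued.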
For an internship on the Python library fluidimage, we are investigating if it could be a good idea to write a HPC parallel application with a client/servers model using the library trio.
For asynchronous programming and i/o, trio is indeed great!
Then, I'm wondering how to:
spawn processes (the servers doing the CPU/GPU-bound work)
communicate complex Python objects (potentially containing large numpy arrays) between the processes.
I didn't find what was the recommended way to do this with trio in its documentation (even if the echo client/server tutorial is a good start).
One obvious way for spawning processes in Python and communicate is using multiprocessing.
In the HPC context, I think one good solution would be to use MPI (http://mpi4py.readthedocs.io/en/stable/overview.html#dynamic-process-management). For reference, I also have to mention rpyc (https://rpyc.readthedocs.io/en/latest/docs/zerodeploy.html#zerodeploy).
I don't know if one can use such tools together with trio and what would be the right way to do this.
An interesting related question
Share python object between multiprocess in python3
Remark PEP 574
It seems to me that the PEP 574 (see https://pypi.org/project/pickle5/) could also be part of a good solution to this problem.
Unfortunately, as of today (July 2018), Trio doesn't yet have support for spawning and communicating with subprocesses, or any kind of high-level wrappers for MPI or other high-level inter-process coordination protocols.
This is definitely something we want to get to eventually, and if you want to talk in more detail about what would need to be implemented, then you can hop in our chat, or this issue has an overview of what's needed for core subprocess support. But if your goal is to have something working within a few months for your internship, honestly you might want to consider more mature HPC tools like dask.
As of mid-2018, Trio doesn't do that yet. Your best option to date is to use trio_asyncio to leverage asyncio's support for the features which Trio still needs to learn.
I post a very naive example of a code using multiprocessing and trio (in the main program and in the server). It seems to work.
from multiprocessing import Process, Queue
import trio
import numpy as np

async def sleep():
    print("enter sleep")
    await trio.sleep(0.2)
    print("end sleep")

def cpu_bounded_task(input_data):
    result = input_data.copy()
    for i in range(1000000-1):
        result += input_data
    return result

def server(q_c2s, q_s2c):
    async def main_server():
        # get the data to be processed
        input_data = await trio.run_sync_in_worker_thread(q_c2s.get)
        print("in server: input_data received", input_data)
        # a CPU-bounded task
        result = cpu_bounded_task(input_data)
        print("in server: sending back the answer", result)
        await trio.run_sync_in_worker_thread(q_s2c.put, result)

    trio.run(main_server)

async def client(q_c2s, q_s2c):
    input_data = np.arange(10)
    print("in client: sending the input_data", input_data)
    await trio.run_sync_in_worker_thread(q_c2s.put, input_data)
    result = await trio.run_sync_in_worker_thread(q_s2c.get)
    print("in client: result received", result)

async def parent(q_c2s, q_s2c):
    async with trio.open_nursery() as nursery:
        nursery.start_soon(sleep)
        nursery.start_soon(client, q_c2s, q_s2c)
        nursery.start_soon(sleep)

def main():
    q_c2s = Queue()
    q_s2c = Queue()
    p = Process(target=server, args=(q_c2s, q_s2c))
    p.start()
    trio.run(parent, q_c2s, q_s2c)
    p.join()

if __name__ == '__main__':
    main()
A simple example with mpi4py... It may be a bad work around from the trio point of view, but it seems to work.
Communications are done with trio.run_sync_in_worker_thread so (as written by Nathaniel J. Smith) (1) no cancellation (and no control-C support) and (2) use more memory than trio tasks (but one Python thread does not use so much memory).
But for communications involving large numpy arrays, I would go like this since communication of buffer-like objects is going to be very efficient with mpi4py.
import sys
from functools import partial
import trio
import numpy as np
from mpi4py import MPI

async def sleep():
    print("enter sleep")
    await trio.sleep(0.2)
    print("end sleep")

def cpu_bounded_task(input_data):
    print("cpu_bounded_task starting")
    result = input_data.copy()
    for i in range(1000000-1):
        result += input_data
    print("cpu_bounded_task finished ")
    return result

if "server" not in sys.argv:
    comm = MPI.COMM_WORLD.Spawn(sys.executable,
                                args=['trio_spawn_comm_mpi.py', 'server'])

    async def client():
        input_data = np.arange(4)
        print("in client: sending the input_data", input_data)
        send = partial(comm.send, dest=0, tag=0)
        await trio.run_sync_in_worker_thread(send, input_data)
        print("in client: recv")
        recv = partial(comm.recv, tag=1)
        result = await trio.run_sync_in_worker_thread(recv)
        print("in client: result received", result)

    async def parent():
        async with trio.open_nursery() as nursery:
            nursery.start_soon(sleep)
            nursery.start_soon(client)
            nursery.start_soon(sleep)

    trio.run(parent)
    print("in client, end")
    comm.barrier()
else:
    comm = MPI.Comm.Get_parent()

    async def main_server():
        # get the data to be processed
        recv = partial(comm.recv, tag=0)
        input_data = await trio.run_sync_in_worker_thread(recv)
        print("in server: input_data received", input_data)
        # a CPU-bounded task
        result = cpu_bounded_task(input_data)
        print("in server: sending back the answer", result)
        send = partial(comm.send, dest=0, tag=1)
        await trio.run_sync_in_worker_thread(send, result)

    trio.run(main_server)
    comm.barrier()
You could also check out tractor, which finally seems to have a first alpha release out.
It has a built-in, function-focused RPC system (in a style much like trio's) using TCP and msgpack (but I think they have more transports planned). You just call functions in other processes directly and stream or get results back in a variety of ways.
Here's their first example:
"""
Run with a process monitor from a terminal using::

    $TERM -e watch -n 0.1 "pstree -a $$" \
        & python examples/parallelism/single_func.py \
        && kill $!
"""
import os
import tractor
import trio

async def burn_cpu():
    pid = os.getpid()
    # burn a core # ~ 50kHz
    for _ in range(50000):
        await trio.sleep(1/50000/50)
    return os.getpid()

async def main():
    async with tractor.open_nursery() as n:
        portal = await n.run_in_actor(burn_cpu)
        # burn rubber in the parent too
        await burn_cpu()
        # wait on result from target function
        pid = await portal.result()
    # end of nursery block
    print(f"Collected subproc {pid}")

if __name__ == '__main__':
    trio.run(main)