I am writing a Tornado webserver in Python 3.7 to display the status of processes run by the multiprocessing library.
The following code works, but I'd like to do it with Tornado's built-in facilities instead of hacking in the threading library. I haven't figured out how to avoid blocking Tornado during queue.get. I think the correct solution is to wrap the get calls in some sort of future, but after hours of trying I haven't worked out how.
Inside of my multiprocessing script:
class ProcessToMonitor(multiprocessing.Process):
    def __init__(self):
        multiprocessing.Process.__init__(self)
        self.queue = multiprocessing.Queue()

    def run(self):
        while True:
            # do stuff
            self.queue.put(value)
Then, in my Tornado script
class MyWebSocket(tornado.websocket.WebSocketHandler):
    connections = set()

    def open(self):
        self.connections.add(self)

    def on_close(self):
        self.connections.remove(self)

    @classmethod
    def emit(cls, message):
        for client in cls.connections:
            client.write_message(message)
def worker():
    ptm = ProcessToMonitor()
    ptm.start()

    while True:
        message = ptm.queue.get()
        MyWebSocket.emit(message)
if __name__ == '__main__':
    app = tornado.web.Application([
        (r'/', MainHandler),  # Not shown
        (r'/websocket', MyWebSocket)
    ])
    app.listen(8888)

    threading.Thread(target=worker).start()

    ioloop = tornado.ioloop.IOLoop.current()
    ioloop.start()
queue.get isn't a blocking call when you await it; it just waits until there's an item in the queue in case the queue is empty. I can see from your code that queue.get fits perfectly for your use case inside a while loop.
I think you're probably just using it incorrectly. You'll have to make the worker function a coroutine (async/await syntax):
async def worker():
    ...
    while True:
        message = await queue.get()
        ...
However, if you don't want to wait for an item and would like to proceed immediately, the alternative is queue.get_nowait.
One thing to note here is that queue.get_nowait will raise a QueueEmpty exception if the queue is empty, so you'll need to handle that exception.
Example:
while True:
    try:
        message = queue.get_nowait()
    except QueueEmpty:
        # wait for some time before the next iteration,
        # otherwise this loop will keep spinning for no reason
        await asyncio.sleep(0.1)
        continue
    MyWebSocket.emit(message)
As you can see, you have to pause the while loop for a while when the queue is empty to prevent it from overwhelming the system.
So why not use queue.get in the first place?
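The await pattern above needs a queue whose get returns an awaitable (for example tornado.queues.Queue or asyncio.Queue). The queue in the question is a multiprocessing.Queue, whose get really does block the calling thread, so one way to make it awaitable, and to drop the threading module entirely, is to offload the blocking call to an executor. This is only a sketch of that idea, assuming Tornado 5+ (where the IOLoop runs on top of asyncio), not the answerer's code:

async def worker():
    ptm = ProcessToMonitor()
    ptm.start()
    loop = tornado.ioloop.IOLoop.current()
    while True:
        # the blocking get runs in the default executor, so the IOLoop
        # stays responsive while we wait for the next item
        message = await loop.run_in_executor(None, ptm.queue.get)
        MyWebSocket.emit(message)

if __name__ == '__main__':
    app = tornado.web.Application([
        (r'/', MainHandler),  # Not shown
        (r'/websocket', MyWebSocket)
    ])
    app.listen(8888)
    ioloop = tornado.ioloop.IOLoop.current()
    ioloop.spawn_callback(worker)  # schedule the coroutine on the IOLoop
    ioloop.start()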
Related
I have a Python asyncio script that needs to run a long-running task in a thread. While it runs, the thread needs to make network connections to another server. Is there any problem calling network/socket write functions from a thread, as opposed to doing it in the main thread?
I know that in the Twisted library, for example, one must always do network operations in the main thread. Are there any such limitations in asyncio? And if so, how does one get around this problem?
Here's my sample code:
import asyncio
import threading
#
# global servers dict keeps track of connected instances of each protocol
#
servers={}
class SomeOtherServer(asyncio.Protocol):
    def __init__(self):
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport
        servers["SomeOtherServer"] = self

    def connection_lost(self, exc):
        self.transport = None
class MyServer(asyncio.Protocol):
    def __init__(self):
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport
        servers["MyServer"] = self

    def connection_lost(self, exc):
        self.transport = None

    def long_running_task(self, data):
        # some long running operations here, then write data to the other server;
        # other_server is also an instance of some sort of asyncio.Protocol.
        # Is it ok to call this like this, even though this method is running in a thread?
        other_server = servers["SomeOtherServer"]
        other_server.transport.write(data)

    def data_received(self, data):
        task_thread = threading.Thread(target=self.long_running_task, args=[data])
        task_thread.start()
async def main():
    global loop
    loop = asyncio.get_running_loop()
    other_server_obj = await loop.create_server(lambda: SomeOtherServer(), "localhost", 9001)
    my_server_obj = await loop.create_server(lambda: MyServer(), "localhost", 9002)
    async with other_server_obj, my_server_obj:
        while True:
            await asyncio.sleep(3600)

asyncio.run(main())
Note that data_received sets up and calls long_running_task in a thread, and long_running_task makes a network write to the other server from that task thread, not the main thread. Is this OK, or is there some other way this must be done?
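For reference, asyncio's loops and transports are not thread-safe; the documented way to hand work back to the loop from another thread is loop.call_soon_threadsafe. A minimal sketch of how long_running_task could delegate the write back to the loop thread (it reuses the global loop that main() sets, which is an assumption about how the rest of the program is wired):

    def long_running_task(self, data):
        # long-running work happens here, off the event loop thread
        other_server = servers["SomeOtherServer"]
        # transport.write is not guaranteed to be thread-safe, so schedule
        # it on the event loop thread instead of calling it directly
        loop.call_soon_threadsafe(other_server.transport.write, data)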
I want to start a new Process (Pricefeed) from my Executor class and then have the Executor class keep running in its own event loop (the shoot method). In my current attempt, the asyncio loop gets blocked on the line p.join(). However, without that line, my code just exits. How do I do this properly?
Note: fh.run() blocks as well.
import asyncio
from multiprocessing import Process, Queue
from cryptofeed import FeedHandler
from cryptofeed.defines import L2_BOOK
from cryptofeed.exchanges.ftx import FTX
class Pricefeed(Process):
    def __init__(self, queue: Queue):
        super().__init__()
        self.coin_symbol = 'SOL-USD'
        self.fut_symbol = 'SOL-USD-PERP'
        self.queue = queue

    async def _book_update(self, feed, symbol, book, timestamp, receipt_timestamp):
        self.queue.put(book)

    def run(self):
        fh = FeedHandler()
        fh.add_feed(FTX(symbols=[self.fut_symbol, self.coin_symbol], channels=[L2_BOOK],
                        callbacks={L2_BOOK: self._book_update}))
        fh.run()
class Executor:
    def __init__(self):
        self.q = Queue()

    async def shoot(self):
        print('in shoot')
        for i in range(5):
            msg = self.q.get()
            print(msg)
            await asyncio.sleep(1)  # do some stuff

    async def run(self):
        asyncio.create_task(self.shoot())
        p = Pricefeed(self.q)
        p.start()
        p.join()


async def main():
    g = Executor()
    await g.run()


if __name__ == '__main__':
    asyncio.run(main())
Since you're using a queue to communicate, this is a somewhat tricky problem. To answer your first question about why the loop gets stuck on p.join(): join blocks until the process finishes, and in asyncio you can't do anything blocking in a function marked async or it will freeze the event loop. To do this properly you'll need to run your process with the asyncio event loop's run_in_executor method, which runs things in a process pool and returns an awaitable that the asyncio event loop can wait on.
Secondly, to properly share your queue you'll need to use a multiprocessing Manager, which creates shared state that can be used by multiple processes; managers directly support creating a shared queue. Using these two bits of knowledge you can adapt your code to something like the following, which works:
import asyncio
import functools
import time
from multiprocessing import Manager
from concurrent.futures import ProcessPoolExecutor


def run_pricefeed(queue):
    i = 0
    while True:  # simulate putting an item on the queue every 250ms
        queue.put(f'test-{i}')
        i += 1
        time.sleep(.25)


class Executor:
    async def shoot(self, queue):
        print('in shoot')
        for i in range(5):
            while not queue.empty():
                msg = queue.get(block=False)
                print(msg)
            await asyncio.sleep(1)  # do some stuff

    async def run(self):
        with ProcessPoolExecutor() as pool:
            with Manager() as manager:
                queue = manager.Queue()
                asyncio.create_task(self.shoot(queue))
                await asyncio.get_running_loop().run_in_executor(pool, functools.partial(run_pricefeed, queue))


async def main():
    g = Executor()
    await g.run()


if __name__ == '__main__':
    asyncio.run(main())
This code has a drawback in that you need to empty the queue in a non-blocking fashion from your asyncio process and wait a while for new items to come in before emptying it again, effectively implementing a polling mechanism. If you don't await after emptying it, the loop never yields control and you will freeze the event loop again. This isn't as good as just waiting for the queue to have an item in it by blocking, but it may suit your needs. If possible, I would avoid asyncio here and use multiprocessing entirely, for example by implementing the queue processing as a separate process.
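If the polling is undesirable, one variation (my own sketch, not part of the answer above, reusing the manager queue and Executor from that code) is to hand the blocking queue.get itself to a thread executor, so the coroutine can await the next item without freezing the event loop:

class Executor:
    async def shoot(self, queue):
        print('in shoot')
        loop = asyncio.get_running_loop()
        for i in range(5):
            # queue.get blocks a worker thread, not the event loop,
            # so the coroutine can simply await the next item
            msg = await loop.run_in_executor(None, queue.get)
            print(msg)
            await asyncio.sleep(1)  # do some stuff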
I'm trying to combine multiprocessing with asyncio. The program has two main components - one which streams/generates content, and another that consumes it.
What I want to do is to create multiple processes in order to exploit multiple CPU cores - one for the stream listener/generator, another for the consumer, and a simple one to shut down everything when the consumer has stopped.
My approach so far has been to create the processes, and start them. Each such process creates an async task. Once all processes have started, I run the asyncio tasks. What I have so far (stripped down) is:
def consume_task(loop, consumer):
    loop.create_task(consume_queue(consumer))

def stream_task(loop, listener, consumer):
    loop.create_task(create_stream(listener, consumer))

def shutdown_task(loop, listener):
    loop.create_task(shutdown(consumer))

async def shutdown(consumer):
    print("Shutdown task created")
    while not consumer.is_stopped():
        print("No activity")
        await asyncio.sleep(5)
    print("Shutdown initiated")
    loop.stop()

async def create_stream(listener, consumer):
    stream = Stream(auth, listener)
    print("Stream created")
    stream.filter(track=KEYWORDS, is_async=True)
    await asyncio.sleep(EVENT_DURATION)
    print("Stream finished")
    consumer.stop()

async def consume_queue(consumer):
    await consumer.run()

loop = asyncio.get_event_loop()

p_stream = Process(target=stream_task, args=(loop, listener, consumer, ))
p_consumer = Process(target=consume_task, args=(loop, consumer, ))
p_shutdown = Process(target=shutdown_task, args=(loop, consumer, ))

p_stream.start()
p_consumer.start()
p_shutdown.start()

loop.run_forever()
loop.close()
The problem is that everything hangs (or does it block?) - no tasks are actually running. My solution was to change the first three functions to:
def consume_task(loop, consumer):
    loop.create_task(consume_queue(consumer))
    loop.run_forever()

def stream_task(loop, listener, consumer):
    loop.create_task(create_stream(listener, consumer))
    loop.run_forever()

def shutdown_task(loop, listener):
    loop.create_task(shutdown(consumer))
    loop.run_forever()
This does actually run. However, the consumer and the listener objects are not able to communicate. As a simple example, when the create_stream function calls consumer.stop(), the consumer does not stop. Even when I change a consumer class variable, the changes are not made - case in point, the shared queue remains empty. This is how I am creating the instances:
queue = Queue()
consumer = PrintConsumer(queue)
listener = QueuedListener(queue, max_time=EVENT_DURATION)
Please note that if I do not use processes, but only asyncio tasks, everything works as expected, so I do not think it's a reference issue:
loop = asyncio.get_event_loop()
stream_task(loop, listener, consumer)
consume_task(loop, consumer)
shutdown_task(loop, listener)
loop.run_forever()
loop.close()
Is it because they are running in different processes? How should I go about fixing this issue, please?
Found the problem! Multiprocessing creates a copy of each instance in every process, so changes made in one process are not visible in the others. The solution is to use a Manager, which owns the shared objects itself and hands out proxies to them.
EDIT [11/2/2020]:
import asyncio
from multiprocessing import Process, Manager

"""
These two functions will be created as separate processes.
"""
def task1(loop, shared_list):
    output = loop.run_until_complete(asyncio.gather(async1(shared_list)))

def task2(loop, shared_list):
    output = loop.run_until_complete(asyncio.gather(async2(shared_list)))

"""
These two functions will be called (in different processes) asynchronously.
"""
async def async1(shared_list):
    pass

async def async2(shared_list):
    pass

"""
Create the manager; Manager() already starts its server process.
From this manager, also create a list that is shared by functions in different processes.
"""
manager = Manager()
shared_list = manager.list()

loop = asyncio.get_event_loop()  # the event loop

"""
Create two processes.
"""
process1 = Process(target=task1, args=(loop, shared_list, ))
process2 = Process(target=task2, args=(loop, shared_list, ))

"""
Start the two processes and wait for them to finish.
"""
process1.start()
process2.start()

process1.join()
process2.join()

"""
Clean up
"""
loop.close()
manager.shutdown()
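To make the sharing concrete, the placeholder coroutines above could mutate the managed list; this filling-in of async1/async2 is my own illustration, not part of the original answer. Appends made in either process go through the manager's proxy and are visible to the other:

async def async1(shared_list):
    # append through the proxy; the manager's server process holds the real list
    shared_list.append("from process 1")

async def async2(shared_list):
    await asyncio.sleep(0.1)        # give process 1 a head start (illustrative only)
    shared_list.append("from process 2")
    print(shared_list[:])           # both entries show up here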
Trying to use pyserial with asyncio on a Windows machine.
Inspired by https://stackoverflow.com/a/27927704/1629704, my code constantly watches a serial port for incoming data.
# This coroutine is added as a task to the event loop.
@asyncio.coroutine
def get_from_serial_port(self):
    while 1:
        serial_data = yield from self.get_byte_async()
        # <doing other stuff with serial_data>
# The method which gets executed in the executor
def get_byte(self):
    data = self.s.read(1)
    time.sleep(0.5)
    tst = self.s.read(self.s.inWaiting())
    data += tst
    return data
# Runs blocking function in executor, yielding the result
@asyncio.coroutine
def get_byte_async(self):
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        res = yield from self.loop.run_in_executor(executor, self.get_byte)
        return res
After serial data has been returned, the coroutine get_byte_async is called again inside the while loop, creating a new executor each time. I've always learned that creating a new thread is expensive, so I feel I should take another approach, but I am not sure how to do that.
I've been reading this article https://hackernoon.com/threaded-asynchronous-magic-and-how-to-wield-it-bba9ed602c32#.964j4a5s7
I guess I need to do the reading of the serial port in another thread. But how do I get the serial data back to the "main" loop?
You can either use the default executor and lock the access to get_byte with an asyncio lock:
async def get_byte_async(self):
    async with self.lock:
        return await self.loop.run_in_executor(None, self.get_byte)
Or simply create your own executor once:
async def get_byte_async(self):
    if self.executor is None:
        self.executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    return await self.loop.run_in_executor(self.executor, self.get_byte)
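For completeness, here is a sketch of where those attributes might be initialized; the class name and constructor arguments are assumptions based on the snippets above, not the answerer's actual code:

import asyncio
import concurrent.futures
import serial  # pyserial

class SerialReader:
    def __init__(self, port, baudrate=9600):
        self.s = serial.Serial(port, baudrate, timeout=0)  # serial handle used by get_byte
        self.loop = asyncio.get_event_loop()
        self.lock = asyncio.Lock()     # used by the first variant
        self.executor = None           # lazily created by the second variant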
I'm trying to understand whether it is possible to start an asyncio.Server instance while the event loop is already running via run_forever (from a separate thread, of course).
As I understand, the server can be started either by loop.run_until_complete(asyncio.start_server(...)) or by
await asyncio.start_server(...), if the loop is already running.
The first way is not acceptable for me, since the loop is already running via run_forever. But I also can't use the await expression, since I would be starting it from outside the "loop area" (e.g. from the main method, which can't be marked as async, right?)
def loop_thread(loop):
    asyncio.set_event_loop(loop)
    try:
        loop.run_forever()
    finally:
        loop.close()
        print("loop closed")
class SchedulerTestManager:
    def __init__(self):
        ...
        self.loop = asyncio.get_event_loop()
        self.servers_loop_thread = threading.Thread(
            target=loop_thread, args=(self.loop, ))
        ...

    def start_test(self):
        self.servers_loop_thread.start()
        return self.servers_loop_thread

    def add_router(self, router):
        r = self.endpoint.add_router(router)
        host = router.ConnectionParameters.Host
        port = router.ConnectionParameters.Port
        srv = TcpServer(host, port)
        server_coro = asyncio.start_server(
            self.handle_connection, self.host, self.port)
        # does not work since add_router is not async
        # self.server = await server_coro
        # does not work, since the loop is already running
        # self.server = self.loop.run_until_complete(server_coro)
        return r
def main():
    st_manager = SchedulerTestManager()
    thread = st_manager.start_test()
    router = st_manager.add_router(router)
Of course, the simplest solution is to add all routers (servers) before starting the test (before running the loop). But I want to try to implement it so that it is possible to add a router while a test is already running. I thought the loop.call_soon (call_soon_threadsafe) methods could help me, but it seems they can't schedule a coroutine, just a plain function.
Hope that my explanation is not very confusing. Thanks in advance!
For communicating between an event loop running in one thread and good old threaded code running in another thread, you might use the janus library.
It's a queue with two interfaces: an async one and a thread-safe synchronous one.
This is a usage example:
import asyncio
import janus

loop = asyncio.get_event_loop()
queue = janus.Queue(loop=loop)

def threaded(sync_q):
    for i in range(100):
        sync_q.put(i)
    sync_q.join()

@asyncio.coroutine
def async_coro(async_q):
    for i in range(100):
        val = yield from async_q.get()
        assert val == i
        async_q.task_done()

fut = loop.run_in_executor(None, threaded, queue.sync_q)
loop.run_until_complete(async_coro(queue.async_q))
loop.run_until_complete(fut)
You may create a task that waits for new messages from the queue in a loop and starts new servers on request; another thread can push a message into the queue asking for a new server, as sketched below.
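A rough sketch of that idea, using the older janus API from the example above (Queue(loop=...)); the (host, port) message format, handle_connection, and the daemon loop thread are assumptions for illustration, not part of the original answer:

import asyncio
import threading
import time
import janus

async def handle_connection(reader, writer):
    writer.close()  # placeholder connection handler

async def server_starter(async_q):
    # runs on the event loop: wait for (host, port) requests and start servers
    while True:
        host, port = await async_q.get()
        await asyncio.start_server(handle_connection, host, port)
        print("serving on {}:{}".format(host, port))
        async_q.task_done()

def add_server(sync_q, host, port):
    # thread-safe side: push a request; the loop thread does the actual start
    sync_q.put((host, port))

loop = asyncio.get_event_loop()
queue = janus.Queue(loop=loop)
loop.create_task(server_starter(queue.async_q))

# run the loop forever in a background thread, as in the question
threading.Thread(target=loop.run_forever, daemon=True).start()

# from the main (non-async) thread, ask for a new server while the loop runs
add_server(queue.sync_q, "localhost", 9001)
time.sleep(1)  # keep the demo alive long enough for the server to start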