I have a setup that uses Tornado as the HTTP server together with a custom-made HTTP framework. The idea is to have a single Tornado handler: every request that arrives is simply submitted to a ThreadPoolExecutor, leaving Tornado free to listen for new requests. Once a thread finishes processing a request, a callback is called that sends the response to the client on the same thread where the IO loop is running.
Stripped down, the code looks something like this. Base HTTP server class:
class HttpServer():
    def __init__(self, router, port, max_workers):
        self.router = router
        self.port = port
        self.max_workers = max_workers

    def run(self):
        raise NotImplementedError()
Tornado backed implementation of HttpServer:
from concurrent import futures
from tornado import ioloop, web

class TornadoServer(HttpServer):
    def run(self):
        executor = futures.ThreadPoolExecutor(max_workers=self.max_workers)

        def submit(callback, **kwargs):
            # Run the request on a worker thread; callback fires when it is done
            future = executor.submit(Request(**kwargs))
            future.add_done_callback(callback)
            return future

        application = web.Application([
            (r'(.*)', MainHandler, {
                'submit': submit,
                'router': self.router
            })
        ])
        application.listen(self.port)
        ioloop.IOLoop.instance().start()
Main handler that handles all Tornado requests (only GET is implemented, but the others would be the same):
class MainHandler(web.RequestHandler):
    def initialize(self, submit, router):
        self.submit = submit
        self.router = router

    def worker(self, request):
        responder, kwargs = self.router.resolve(request)
        response = responder(**kwargs)
        return response

    def on_response(self, response):
        # when this is called response should already have result
        if isinstance(response, Future):
            response = response.result()
        # response is my own class, just write returned content to client
        self.write(response.data)
        self.flush()
        self.finish()

    def _on_response_ready(self, response):
        # schedule response processing in ioloop, to be on ioloop thread
        ioloop.IOLoop.current().add_callback(
            partial(self.on_response, response)
        )

    @web.asynchronous
    def get(self, url):
        self.submit(
            self._on_response_ready,  # callback
            url=url, method='post', original_request=self.request
        )
Server is started with something like:
router = Router()
server = TornadoServer(router, 1111, max_workers=50)
server.run()
So, as you can see, the main handler just submits every request to the thread pool, and when processing is done, the callback (_on_response_ready) is called, which just schedules the request to be finished on the IO loop (to make sure it happens on the same thread where the IO loop is running).
This works. At least it looks like it does.
My problem here is performance with respect to max workers in the ThreadPoolExecutor.
All handlers are IO bound, there is no computation going on (they are mostly waiting for the DB or external services), so with 50 workers I would expect 50 concurrent requests to finish approximately 50 times faster than 50 concurrent requests with only one worker.
But that is not the case. What I see is an almost identical number of requests per second whether I have 50 workers in the thread pool or 1 worker.
For measuring, I used ApacheBench with something like:
ab -n 100 -c 10 http://localhost:1111/some_url
Does anybody have an idea what I am doing wrong? Did I misunderstand how Tornado or ThreadPoolExecutor works? Or the combination?
The momoko wrapper for postgres remedies this issue, as suggested by kwarunek. If you want to solicit further debugging advice from outside collaborators, it would help to post timestamped debug logs from a test task that does sleep(10) before each DB access.
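The expectation in the question (IO-bound work scaling with the number of workers) can be checked in isolation, without Tornado or a database. A minimal sketch, assuming each request is modelled by a task that only sleeps; `fake_io_task` and `run_batch` are illustrative names, not part of the original code:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(delay):
    """Stand-in for a DB or HTTP call: blocks the worker thread for `delay` seconds."""
    time.sleep(delay)
    return delay


def run_batch(max_workers, n_tasks=8, delay=0.05):
    """Run n_tasks blocking tasks on a pool and return the elapsed wall time."""
    started = time.monotonic()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(fake_io_task, [delay] * n_tasks))
    assert len(results) == n_tasks
    return time.monotonic() - started
```

Comparing `run_batch(1)` with `run_batch(8)` should show the pool itself scaling for sleeping tasks. If it does but the real handlers do not, the blocking is probably happening in something the threads share, for example a single DB connection. Note also that `ab -c 10` never has more than 10 requests in flight, so workers beyond the tenth cannot help in that benchmark.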
Related
I have the following FastAPI application:
from fastapi import FastAPI
import socket

app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Hello World"}

@app.get("/healthcheck")
def health_check():
    result = some_network_operation()
    return result

def some_network_operation():
    HOST = "192.168.30.12"  # This host does not exist so the connection will time out
    PORT = 4567
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(10)
        s.connect((HOST, PORT))
        s.sendall(b"Are you ok?")
        data = s.recv(1024)
        print(data)
This is a simple application with two routes:
- / handler that is async
- /healthcheck handler that is sync
With this particular example, if you call /healthcheck, it won't complete until after 10 seconds because the socket connection will time out. However, if you make a call to / in the meantime, it will return the response right away because FastAPI's main thread is not blocked. This makes sense because, according to the docs, FastAPI runs sync handlers on an external threadpool.
My question is whether it is at all possible to block the application (block FastAPI's main thread) by doing something inside the health_check method.
Perhaps by acquiring the global interpreter lock?
Some other kind of lock?
Yes, if you do sync work in an async method it will block FastAPI, something like this:
@router.get("/healthcheck")
async def health_check():
    result = some_network_operation()
    return result
Where some_network_operation() is blocking the event loop because it is a synchronous method.
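The effect can be reproduced with plain asyncio, without FastAPI. A minimal sketch, assuming a hypothetical `blocking_call` that stands in for the synchronous network operation: a heartbeat coroutine records a tick whenever the loop is free, and `measure` counts how many ticks arrive while the blocking call runs either directly on the loop or in a worker thread.

```python
import asyncio
import time


def blocking_call(delay=0.2):
    """Hypothetical stand-in for a synchronous network operation."""
    time.sleep(delay)
    return "done"


async def heartbeat(ticks, interval=0.02):
    """Record a timestamp every `interval` seconds while the loop is free."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def measure(offload):
    """Return the number of heartbeat ticks recorded while blocking_call runs."""
    ticks = []
    beat = asyncio.ensure_future(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    before = len(ticks)
    if offload:
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, blocking_call)  # runs in a worker thread
    else:
        blocking_call()  # runs on the event loop thread and stalls it
    during = len(ticks) - before
    beat.cancel()
    return during
```

With `offload=False` the heartbeat gets no chance to run during the call, which is the situation of a sync call inside an `async def` handler. With `offload=True` the loop keeps ticking, which is roughly what happens when a framework runs a plain `def` handler on a threadpool.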
I think I may have an answer to my question, which is that there are some weird edge cases where a sync endpoint handler can block FastAPI.
For instance, if we adjust the some_network_operation in my example to the following, it will block the entire application.
def some_network_operation():
    """ No, this is not a network operation, but it illustrates the point """
    block = pow(363, 100000000000000)
I reached this conclusion based on this question: pow function blocking all threads with ThreadPoolExecutor.
So, it looks like the GIL may be the culprit here.
That SO question suggests using the multiprocessing module (which will get around GIL). However, I tried this, and it still resulted in the same behavior. So my root problem remains unsolved.
Either way, here is the entire example in the question edited to reproduce the problem:
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Hello World"}

@app.get("/healthcheck")
def health_check():
    result = some_network_operation()
    return result

def some_network_operation():
    block = pow(363, 100000000000000)
I have a TCP server running and have a handler function which needs to take the contents of the request, add it to an asyncio queue and reply with an OK status.
In the background I have an async coroutine running that detects when a new item is added and performs some processing.
How do I put items in the asyncio queue from the handler function, which isn't and can't be an async coroutine?
I am running a DICOM server pynetdicom which listens on port 104 for incoming TCP requests (DICOM C-STORE specifically).
I need to save the contents of the request to a queue and return a 0x0000 response so that the listener is available to the network.
This is modeled by a producer-consumer pattern.
I have tried to define a consumer co-routine consume_dicom() that is currently stuck in await queue.get() since I can't properly define the producer.
The producer needs to simply invoke queue.put(produce_item) but this happens inside a handle_store(event) function which is not part of the event_loop but is called every time a request is received by the server.
import asyncio

from pynetdicom import (
    AE, evt,
    StoragePresentationContexts
)

class PacsServer():
    def __init__(self, par, listen=True):
        # Initialize other stuff...

        # Initialize DICOM server
        ae = AE(ae_title='DICOM-NODE')
        ae.supported_contexts = StoragePresentationContexts

        # When a C-STORE request comes, it will be passed to self.handle_store
        handlers = [(evt.EVT_C_STORE, self.handle_store)]

        # Define queue (the loop argument was removed in Python 3.10)
        self.loop = asyncio.get_event_loop()
        self.queue = asyncio.Queue(loop=self.loop)

        # Define consumer
        self.loop.create_task(self.consume_dicom(self.queue))

        # Start server in the background with specified handlers
        self.scp = ae.start_server(('', 104), block=False, evt_handlers=handlers)

        # Start async loop
        self.loop.run_forever()

    def handle_store(self, event):
        # Request handling
        ds = event.dataset
        # Here I want to add to the queue but this is not an async method,
        # so the await below is not valid
        await self.queue.put(ds)
        return 0x0000

    async def consume_dicom(self, queue):
        while True:
            print("AWAITING FROM QUEUE")
            ds = await queue.get()
            do_some_processing(ds)
I would like to find a way to add items to the queue and return the OK status in the handle_store() function.
Since handle_store is running in a different thread, it needs to tell the event loop to enqueue the item. This is done with call_soon_threadsafe:
self.loop.call_soon_threadsafe(self.queue.put_nowait, ds)
Note that you need to call queue.put_nowait instead of queue.put because the former is a plain function rather than a coroutine. The function will always succeed for unbounded queues (the default); for a bounded queue it raises an exception if the queue is full.
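The hand-off can be tried without pynetdicom. A minimal sketch, assuming a hypothetical synchronous `handle_store` that is called from a plain thread (standing in for the DICOM server's handler thread) and a consumer coroutine on the event loop:

```python
import asyncio
import threading


def handle_store(loop, queue, item):
    """Hypothetical synchronous handler running outside the event loop thread."""
    # put_nowait is a plain function, so the loop can run it as a callback.
    loop.call_soon_threadsafe(queue.put_nowait, item)
    return 0x0000


async def consume(queue, n_items):
    """Collect n_items from the queue, then return them."""
    received = []
    for _ in range(n_items):
        received.append(await queue.get())
    return received


async def main():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    # Simulate three incoming requests handled on a separate thread.
    producer = threading.Thread(
        target=lambda: [handle_store(loop, queue, i) for i in range(3)]
    )
    producer.start()
    items = await consume(queue, 3)
    producer.join()
    return items
```

The handler returns its status immediately; the actual enqueue happens a moment later on the loop thread, in the order the callbacks were scheduled.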
TL;DR: Calling future.set_result doesn't immediately resolve loop.run_until_complete. Instead it blocks for an additional 5 seconds.
Full context:
In my project, I'm using autobahn and asyncio to send and receive messages with a websocket server. For my use case, I need a 2nd thread for websocket communication, since I have arbitrary blocking code that will be running in the main thread. The main thread also needs to be able to schedule messages for the communication thread to send back and forth with the server. My current goal is to send a message originating from the main thread and block until the response comes back, using the communication thread for all message passing.
Here is a snippet of my code:
import asyncio
import threading
from autobahn.asyncio.websocket import WebSocketClientFactory, WebSocketClientProtocol

CLIENT = None

class MyWebSocketClientProtocol(WebSocketClientProtocol):
    # -------------- Boilerplate --------------
    is_connected = False
    msg_queue = []
    msg_listeners = []

    def onOpen(self):
        self.is_connected = True
        for msg in self.msg_queue[::]:
            self.publish(msg)

    def onClose(self, wasClean, code, reason):
        self.is_connected = False

    def onMessage(self, payload, isBinary):
        for listener in self.msg_listeners:
            listener(payload)

    def publish(self, msg):
        if not self.is_connected:
            self.msg_queue.append(msg)
        else:
            self.sendMessage(msg.encode('utf-8'))
    # /----------------------------------------

    def send_and_wait(self):
        future = asyncio.get_event_loop().create_future()

        def listener(msg):
            print('set result')
            future.set_result(123)

        self.msg_listeners.append(listener)
        self.publish('hello')
        return future

def worker(loop, ready):
    asyncio.set_event_loop(loop)
    factory = WebSocketClientFactory('ws://127.0.0.1:9000')
    factory.protocol = MyWebSocketClientProtocol
    transport, protocol = loop.run_until_complete(loop.create_connection(factory, '127.0.0.1', 9000))
    global CLIENT
    CLIENT = protocol
    ready.set()
    loop.run_forever()

if __name__ == '__main__':
    # Set up communication thread to talk to the server
    threaded_loop = asyncio.new_event_loop()
    thread_is_ready = threading.Event()
    thread = threading.Thread(target=worker, args=(threaded_loop, thread_is_ready))
    thread.start()
    thread_is_ready.wait()

    # Send a message and wait for response
    print('starting')
    loop = asyncio.get_event_loop()
    result = loop.run_until_complete(CLIENT.send_and_wait())
    print('done')  # this line gets called 5 seconds after it should
I'm using the autobahn echo server example to respond to my messages.
Problem: The WebSocketClientProtocol receives the response to its outgoing message and calls set_result on its pending future, but loop.run_until_complete blocks an additional ~4.9 seconds until eventually resolving.
I understand that run_until_complete also processes other pending events on the event loop. Is it possible that the main thread has somehow queued up a bunch of events that have to now get processed once I start the loop? Also, if I move run_until_complete into the communications thread or move the create_connection into the main thread, then the event loop doesn't block me.
Lastly, I tried to recreate this problem without using autobahn, but I couldn't cause the extra delay. I'm curious if maybe this is an issue with the nature of autobahn's callback timing (onMessage for example).
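One possible contributor, offered as an assumption rather than a confirmed diagnosis: in send_and_wait the future is created on the main thread's loop, while set_result is called from the communication thread, and asyncio futures are not thread-safe. A common alternative is to keep the future on the communication loop and block the main thread with `asyncio.run_coroutine_threadsafe`. A minimal sketch, where `fake_round_trip` is a hypothetical coroutine standing in for "publish, then wait for the echo" and no autobahn code is involved:

```python
import asyncio
import threading


async def fake_round_trip(message, delay=0.05):
    """Hypothetical stand-in for sending a message and awaiting the reply."""
    loop = asyncio.get_running_loop()
    future = loop.create_future()  # created on the loop that will resolve it
    loop.call_later(delay, future.set_result, message.upper())
    return await future


def start_background_loop():
    """Start an event loop in a daemon thread and return (loop, thread)."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    return loop, thread


def send_and_wait(loop, message, timeout=2.0):
    """Block the calling thread until the coroutine finishes on `loop`."""
    handle = asyncio.run_coroutine_threadsafe(fake_round_trip(message), loop)
    return handle.result(timeout=timeout)  # a concurrent.futures.Future
```

Here the main thread never runs an event loop of its own; it only waits on the `concurrent.futures.Future` returned by `run_coroutine_threadsafe`, which is designed to be resolved from another thread.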
I managed to code a rather silly bug that would make one of my request handlers run a very slow DB query.
The interesting bit is that I noticed that, even long after Siege completed, Tornado was still churning through requests (sometimes 90s later). (Comment: I'm not 100% sure of the workings of Siege, but I'm fairly sure it closed the connection.)
My question in two parts:
- Does Tornado cancel request handlers when client closes the connection?
- Is there a way to timeout request handlers in Tornado?
I read through the code and can't seem to find anything. Even though my request handlers are running asynchronously in the above bug the number of pending requests piled up to a level where it was slowing down the app and it would have been better to close out the connections.
Tornado does not automatically close the request handler when the client drops the connection. However, you can override on_connection_close to be alerted when the client drops, which would allow you to cancel the connection on your end. A context manager (or a decorator) could be used to handle setting a timeout for handling the request; use tornado.ioloop.IOLoop.add_timeout to schedule some method that times out the request to run after timeout as part of the __enter__ of the context manager, and then cancel that callback in the __exit__ block of the context manager. Here's an example demonstrating both of those ideas:
import time
import contextlib

from tornado.ioloop import IOLoop
import tornado.web
from tornado import gen

@gen.coroutine
def async_sleep(timeout):
    yield gen.Task(IOLoop.instance().add_timeout, time.time() + timeout)

@contextlib.contextmanager
def auto_timeout(self, timeout=2):  # Seconds
    handle = IOLoop.instance().add_timeout(time.time() + timeout, self.timed_out)
    try:
        yield handle
    except Exception as e:
        print("Caught %s" % e)
    finally:
        IOLoop.instance().remove_timeout(handle)
        if not self._timed_out:
            self.finish()
        else:
            raise Exception("Request timed out")  # Don't continue on past this point

class TimeoutableHandler(tornado.web.RequestHandler):
    def initialize(self):
        self._timed_out = False

    def timed_out(self):
        self._timed_out = True
        self.write("Request timed out!\n")
        self.finish()  # Connection to client closes here.
        # You might want to do other clean up here.

class MainHandler(TimeoutableHandler):

    @gen.coroutine
    def get(self):
        with auto_timeout(self):  # We'll timeout after 2 seconds spent in this block.
            self.sleeper = async_sleep(5)
            yield self.sleeper
        print("writing")  # get will abort before we reach here if we timed out.
        self.write("hey\n")

    def on_connection_close(self):
        # This isn't the greatest way to cancel a future, since it will not actually
        # stop the work being done asynchronously. You'll need to cancel that some
        # other way. Should be pretty straightforward with a DB connection (close
        # the cursor/connection, maybe?)
        self.sleeper.set_exception(Exception("cancelled"))

application = tornado.web.Application([
    (r"/test", MainHandler),
])
application.listen(8888)
IOLoop.instance().start()
Another solution to this problem is to use gen.with_timeout:
import logging
import time

import tornado.web
from tornado import gen
from tornado.util import TimeoutError

logger = logging.getLogger(__name__)

class MainHandler(tornado.web.RequestHandler):

    @gen.coroutine
    def get(self):
        try:
            # I'm using gen.sleep here but you can use any future in this place
            yield gen.with_timeout(time.time() + 2, gen.sleep(5))
            self.write("This will never be reached!!")
        except TimeoutError as te:
            logger.warning(te.__repr__())
            self.timed_out()

    def timed_out(self):
        self.write("Request timed out!\n")
I liked the way the contextlib solution handles it, but I was always getting logging leftovers.
The native coroutine solution would be:
    async def get(self):
        try:
            await gen.with_timeout(time.time() + 2, gen.sleep(5))
            self.write("This will never be reached!!")
        except TimeoutError as te:
            logger.warning(te.__repr__())
            self.timed_out()
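Tornado 5.0 and later run on top of the asyncio event loop, so the same timeout pattern can also be written with the standard library's `asyncio.wait_for`. A minimal sketch, assuming a hypothetical `slow_operation` coroutine in place of the slow DB query; this is plain asyncio, not Tornado handler code:

```python
import asyncio


async def slow_operation(delay):
    """Hypothetical stand-in for a slow DB query."""
    await asyncio.sleep(delay)
    return "result"


async def handle_with_timeout(delay, timeout):
    """Return the result, or a timeout message if the operation is too slow."""
    try:
        return await asyncio.wait_for(slow_operation(delay), timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the inner coroutine before raising.
        return "Request timed out!"
```

One difference worth checking against the Tornado documentation: as far as I know, `gen.with_timeout` leaves the wrapped future running after the timeout, whereas `asyncio.wait_for` cancels it.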
I am implementing a WebSockets server in Tornado 3.2. The client connecting to the server won't be a browser.
For cases in which there is back-and-forth communication between server and client, I would like to add a max. time the server will wait for a client response before closing the connection.
This is roughly what I've been trying:
import datetime

import tornado.ioloop
from tornado.websocket import WebSocketHandler

class WSHandler(WebSocketHandler):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.timeout = None

    def _close_on_timeout(self):
        if self.ws_connection:
            self.close()

    def open(self):
        initialize()

    def on_message(self, message):
        # Remove previous timeout, if one exists.
        if self.timeout:
            tornado.ioloop.IOLoop.instance().remove_timeout(self.timeout)
            self.timeout = None

        if is_last_message:
            self.write_message(message)
            self.close()
        else:
            # Add a new timeout.
            self.timeout = tornado.ioloop.IOLoop.instance().add_timeout(
                datetime.timedelta(milliseconds=1000), self._close_on_timeout)
            self.write_message(message)
Am I being a klutz and is there a much simpler way of doing this? I can't even seem to schedule a simple print statement via add_timeout above.
I also need some help testing this. This is what I have so far:
from tornado.websocket import websocket_connect
from tornado.testing import AsyncHTTPTestCase, gen_test
import time

class WSTests(AsyncHTTPTestCase):

    @gen_test
    def test_long_response(self):
        ws = yield websocket_connect('ws://address', io_loop=self.io_loop)

        # First round trip.
        ws.write_message('First message.')
        result = yield ws.read_message()
        self.assertEqual(result, 'First response.')

        # Wait longer than the timeout.
        # The test is in its own IOLoop, so a blocking sleep should be okay?
        time.sleep(1.1)

        # Expect either write or read to fail because of a closed socket.
        ws.write_message('Second message.')
        result = yield ws.read_message()
        self.assertNotEqual(result, 'Second response.')
The client has no problem writing to and reading from the socket. This is presumably because the add_timeout isn't firing.
Does the test need to yield somehow to allow the timeout callback on the server to run? I would have thought not since the docs say the tests run in their own IOLoop.
Edit
This is the working version, per Ben's suggestions.
import datetime

import tornado.ioloop
from tornado.websocket import WebSocketHandler

class WSHandler(WebSocketHandler):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.timeout = None

    def _close_on_timeout(self):
        if self.ws_connection:
            self.close()

    def open(self):
        initialize()

    def on_message(self, message):
        # Remove previous timeout, if one exists.
        if self.timeout:
            tornado.ioloop.IOLoop.current().remove_timeout(self.timeout)
            self.timeout = None

        if is_last_message:
            self.write_message(message)
            self.close()
        else:
            # Add a new timeout.
            self.timeout = tornado.ioloop.IOLoop.current().add_timeout(
                datetime.timedelta(milliseconds=1000), self._close_on_timeout)
            self.write_message(message)
The test:
import datetime

from tornado import gen
from tornado.websocket import websocket_connect
from tornado.testing import AsyncHTTPTestCase, gen_test

class WSTests(AsyncHTTPTestCase):

    @gen_test
    def test_long_response(self):
        ws = yield websocket_connect('ws://address', io_loop=self.io_loop)

        # First round trip.
        ws.write_message('First message.')
        result = yield ws.read_message()
        self.assertEqual(result, 'First response.')

        # Wait a little more than the timeout.
        yield gen.Task(self.io_loop.add_timeout, datetime.timedelta(seconds=1.1))

        # Expect either write or read to fail because of a closed socket.
        ws.write_message('Second message.')
        result = yield ws.read_message()
        self.assertEqual(result, None)
The timeout-handling code in your first example looks correct to me.
For testing, each test case gets its own IOLoop, but there is only one IOLoop for both the test and anything else it runs, so you must use add_timeout instead of time.sleep() here as well to avoid blocking the server.
Ey Ben, I know this question was long ago resolved, but I wanted to share with any user reading this the solution I made for this.
It's basically based on yours, but it solves the problem from an external Service that can be easily integrated within any websocket using composition instead of inheritance:
class TimeoutWebSocketService():
    _default_timeout_delta_ms = 10 * 60 * 1000  # 10 min

    def __init__(self, websocket, ioloop=None, timeout=None):
        # Timeout
        self.ioloop = ioloop or tornado.ioloop.IOLoop.current()
        self.websocket = websocket
        self._timeout = None
        self._timeout_delta_ms = timeout or TimeoutWebSocketService._default_timeout_delta_ms

    def _close_on_timeout(self):
        self._timeout = None
        if self.websocket.ws_connection:
            self.websocket.close()

    def refresh_timeout(self, timeout=None):
        timeout = timeout or self._timeout_delta_ms
        if timeout > 0:
            # Clean last timeout, if one exists
            self.clean_timeout()

            # Add a new timeout (must be None from clean).
            self._timeout = self.ioloop.add_timeout(
                datetime.timedelta(milliseconds=timeout), self._close_on_timeout)

    def clean_timeout(self):
        if self._timeout is not None:
            # Remove previous timeout, if one exists.
            self.ioloop.remove_timeout(self._timeout)
            self._timeout = None
In order to use the service, it's as easy as creating a new TimeoutWebSocketService instance (passing the websocket, and optionally the timeout in ms and the ioloop where it should be executed) and calling the method refresh_timeout to either set the timeout for the first time or reset an already existing timeout, or clean_timeout to stop the timeout service.
class BaseWebSocketHandler(WebSocketHandler):
    def prepare(self):
        # The service needs the websocket it should close on timeout
        self.timeout_service = TimeoutWebSocketService(self, timeout=(1000*60))

        ## Optionally starts the service here
        self.timeout_service.refresh_timeout()

        ## rest of prepare method

    def on_message(self, message):
        self.timeout_service.refresh_timeout()

    def on_close(self):
        self.timeout_service.clean_timeout()
Thanks to this approach, you can control exactly when and under which conditions you want to restart the timeout, which might differ from app to app. As an example, you might only want to refresh the timeout if the user meets certain conditions, or if the message is the expected one.
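The refresh/clean mechanics can be tried without Tornado or a websocket. A minimal sketch using plain asyncio, where `IdleTimeout` and `demo` are illustrative names and `loop.call_later` plays the role of `add_timeout`:

```python
import asyncio


class IdleTimeout:
    """Call `on_timeout` if refresh() is not called again within `seconds`."""

    def __init__(self, seconds, on_timeout):
        self._seconds = seconds
        self._on_timeout = on_timeout
        self._handle = None

    def refresh(self):
        """Start the timer, or restart it if it is already running."""
        self.clean()
        loop = asyncio.get_running_loop()
        self._handle = loop.call_later(self._seconds, self._fire)

    def clean(self):
        """Cancel the pending timer, if any."""
        if self._handle is not None:
            self._handle.cancel()
            self._handle = None

    def _fire(self):
        self._handle = None
        self._on_timeout()


async def demo():
    events = []
    timer = IdleTimeout(0.2, lambda: events.append("closed"))
    timer.refresh()
    await asyncio.sleep(0.1)
    timer.refresh()           # activity arrived: the deadline moves forward
    await asyncio.sleep(0.1)  # 0.2 s since start, but only 0.1 s idle
    still_open = not events
    await asyncio.sleep(0.3)  # now idle for longer than the timeout
    return still_open, events
```

In a handler, `refresh()` would be called from on_message and `clean()` from on_close, mirroring the service above.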
I hope people enjoy this solution!