I have a lot of code in my Tornado app that looks like this:
@tornado.web.asynchronous
def get(self):
    ...
    some_async_call(..., callback=self._step1)

def _step1(self, response):
    ...
    some_async_call(..., callback=self._step2)

def _step2(self, response):
    ...
    some_async_call(..., callback=self._finish_request)

def _finish_request(self, response):
    ...
    self.write(something)
    self.finish()
Inline callbacks would obviously simplify that code a lot; it would look something like this:
@inlineCallbacks
@tornado.web.asynchronous
def get(self):
    ...
    response = yield some_async_call(...)
    ...
    response = yield some_async_call(...)
    ...
    response = yield some_async_call(...)
    ...
    self.write(something)
    self.finish()
Is there a way of having inline callbacks or otherwise simplifying the code in Tornado?
You could even factor out the calls.
As written, your code issues one async call after another, so it doesn't give you the maximum latency improvement.
If the calls don't depend on each other (for example, if no call needs the result of a previous one), you can start them all simultaneously:
@tornado.web.asynchronous
@gen.engine
def get(self):
    responses = yield [gen.Task(call) for call in required_calls]
This way, all calls start at the same time, so your overall latency is max(all calls) instead of sum(all calls).
I've used this in an app that needs to aggregate many third-party WS or database calls, and it improves the overall latency a lot.
Of course, it doesn't work if there are dependencies between the calls (as mentioned above).
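To see the max-versus-sum effect in isolation, here is a minimal sketch. It uses the standard library's asyncio rather than tornado.gen so that it runs without a Tornado install; asyncio.gather plays the role of yielding a list of tasks. The fake_call helper, the call names and the delays are made up for illustration.

```python
import asyncio
import time


async def fake_call(name, delay):
    # Hypothetical stand-in for a third-party web service or database call.
    await asyncio.sleep(delay)
    return name


async def fetch_all():
    delays = {"users": 0.1, "orders": 0.2, "stock": 0.3}
    started = time.monotonic()
    # gather() starts every call before waiting for any of them,
    # and returns the results in the order the calls were passed in.
    responses = await asyncio.gather(
        *(fake_call(name, delay) for name, delay in delays.items())
    )
    elapsed = time.monotonic() - started
    return responses, elapsed, sum(delays.values())


responses, elapsed, sequential_cost = asyncio.run(fetch_all())
print(responses, "took about", round(elapsed, 2), "s; one after another:", sequential_cost, "s")
```

The elapsed time should be close to the slowest call (0.3 s here) rather than the 0.6 s the calls would take one after another.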
Found it. In Tornado it's not called inline callbacks, but rather "a generator-based interface" — tornado.gen. Thus my code should look something like:
@tornado.web.asynchronous
@gen.engine
def get(self):
    ...
    # gen.Task takes the function and its arguments separately
    # and supplies the callback itself.
    response = yield gen.Task(some_async_call, ...)
    ...
    response = yield gen.Task(some_async_call, ...)
    ...
    response = yield gen.Task(some_async_call, ...)
    ...
    self.write(something)
    self.finish()
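To make the idea of a generator-based interface concrete, here is a toy driver in plain Python. It is only a sketch of the general technique, not Tornado's actual implementation: run, some_async_call and handler are hypothetical names, and the fake async call invokes its callback immediately instead of waiting for I/O.

```python
def run(generator_function):
    """Drive a generator whose yielded items are functions taking a callback."""
    generator = generator_function()

    def step(value=None):
        try:
            # Resume the generator, handing it the result of the last call.
            pending = generator.send(value)
        except StopIteration:
            return
        # Start the next async call; it calls step(result) when it is done.
        pending(step)

    step()


def some_async_call(x):
    # Hypothetical callback-style call that "completes" immediately.
    return lambda callback: callback(x * 2)


log = []


def handler():
    first = yield some_async_call(1)
    second = yield some_async_call(first)
    log.append((first, second))


run(handler)
print(log)
```

The handler reads top to bottom like synchronous code, while the driver is the only place that deals with callbacks.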
You might also consider just using Cyclone, which would allow you to use @inlineCallbacks (and any other Twisted code that you want) directly.
I have a tricky post handler: sometimes it takes a lot of time (depending on the input values), sometimes not.
What I want is to write a response back as soon as 1 second has passed, choosing the response dynamically.
def post(self):
    def callback():
        self.write('too-late')
        self.finish()

    timeout_obj = IOLoop.current().add_timeout(
        dt.timedelta(seconds=1),
        callback,
    )
    # some asynchronous operations
    if not self.request.connection.stream.closed():
        self.write('here is your response')
        self.finish()
        IOLoop.current().remove_timeout(timeout_obj)
It turns out I can't do much from within the callback.
Even an exception raised there is suppressed by the inner context and is not propagated to the post method.
Are there any other ways to achieve the goal?
Thank you.
UPD 2020-05-15:
I found a similar question.
Thanks @ionut-ticus, using with_timeout() is much more convenient.
After some tries, I think I came really close to what I'm looking for:
def wait(fn):
    @gen.coroutine
    @wraps(fn)
    def wrap(*args):
        try:
            result = yield gen.with_timeout(
                dt.timedelta(seconds=20),
                IOLoop.current().run_in_executor(None, fn, *args),
            )
            raise gen.Return(result)
        except gen.TimeoutError:
            logging.error('### TOO LONG')
            raise gen.Return('Next time, bro')
    return wrap

@wait
def blocking_func(item):
    time.sleep(30)
    # this is not a Subprocess.
    # It is a file IO and DB
    return 'we are done here'
I'm still not sure: should the wait() decorator be wrapped in a coroutine?
Sometimes, somewhere in the chain of calls made by blocking_func(), there can be another ThreadPoolExecutor. My concern is whether this would work without making "mine" a global one and passing it to Tornado's run_in_executor().
Tornado: v5.1.1
An example of usage of tornado.gen.with_timeout. Keep in mind the task needs to be async or else the IOLoop will be blocked and won't be able to process the timeout:
@gen.coroutine
def async_task(self):
    # some async code
    ...

@gen.coroutine
def get(self):
    delta = datetime.timedelta(seconds=1)
    try:
        task = self.async_task()
        result = yield gen.with_timeout(delta, task)
        self.write("success")
    except gen.TimeoutError:
        self.write("timeout")
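The same pattern can be tried without a Tornado install by using the standard library, where asyncio.wait_for plays the role of gen.with_timeout. This is a sketch under that assumption; async_task and handle are illustrative names and the delays are arbitrary.

```python
import asyncio


async def async_task(delay):
    # Non-blocking wait: the event loop stays free to fire the timeout.
    await asyncio.sleep(delay)
    return "success"


async def handle(delay, limit):
    try:
        return await asyncio.wait_for(async_task(delay), timeout=limit)
    except asyncio.TimeoutError:
        return "timeout"


fast = asyncio.run(handle(delay=0.01, limit=0.5))
slow = asyncio.run(handle(delay=0.5, limit=0.05))
print(fast, slow)
```

The first call finishes inside its limit and the second does not, so the two results differ even though the task code is identical.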
I'd advise using https://github.com/aio-libs/async-timeout:
import asyncio
import async_timeout

# "async with" is only valid inside a native coroutine, hence "async def".
async def post(self):
    try:
        async with async_timeout.timeout(1):
            # some asynchronous operations
            if not self.request.connection.stream.closed():
                self.write('here is your response')
                self.finish()
    except asyncio.TimeoutError:
        self.write('too-late')
        self.finish()
I am playing with this piece of code to understand @tornado.web.asynchronous. The code is supposed to handle web requests asynchronously, but it doesn't seem to work as intended. There are two endpoints:
1) http://localhost:5000/A (this is the time-consuming request and takes a few seconds)
2) http://localhost:5000/B (this is the fast request and returns almost immediately)
However, when I open http://localhost:5000/A in the browser and then, while that is running, go to http://localhost:5000/B, the second request is queued and runs only after A has finished.
In other words, the time-consuming task blocks the other, faster task. What am I doing wrong?
import tornado.web
from tornado.ioloop import IOLoop
import sys, random, signal

class TestHandler(tornado.web.RequestHandler):
    """
    In below function goes your time consuming task
    """
    def background_task(self):
        sm = 0
        for i in range(10 ** 8):
            sm = sm + 1
        return str(sm + random.randint(0, sm)) + "\n"

    @tornado.web.asynchronous
    def get(self):
        """ Request that asynchronously calls background task. """
        res = self.background_task()
        self.write(str(res))
        self.finish()

class TestHandler2(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self):
        self.write('Response from server: ' + str(random.randint(0, 100000)) + "\n")
        self.finish()

def sigterm_handler(signal, frame):
    # save the state here or do whatever you want
    print('SIGTERM: got kill, exiting')
    sys.exit(0)

def main(argv):
    signal.signal(signal.SIGTERM, sigterm_handler)
    try:
        if argv:
            print ":argv:", argv
        application = tornado.web.Application([
            (r"/A", TestHandler),
            (r"/B", TestHandler2),
        ])
        application.listen(5000)
        IOLoop.instance().start()
    except KeyboardInterrupt:
        print "Caught interrupt"
    except Exception as e:
        print e.message
    finally:
        print "App: exited"

if __name__ == '__main__':
    sys.exit(main(sys.argv))
According to the documentation:
To minimize the cost of concurrent connections, Tornado uses a
single-threaded event loop. This means that all application code
should aim to be asynchronous and non-blocking because only one
operation can be active at a time.
To achieve this goal you need to prepare the RequestHandler properly. Simply adding the @tornado.web.asynchronous decorator to one of the methods (get, post, etc.) is not enough if the method performs only synchronous actions.
What does the @tornado.web.asynchronous decorator do?
Let's look at the get method. Its statements are executed one after another, synchronously. Without the decorator, the request is finished automatically when get() returns: a call to self.finish() is made under the hood. With the @tornado.web.asynchronous decorator, the request is not closed when the method returns, so you must call self.finish() yourself to finish the HTTP request.
Look at "Example 21" from this page (tornado.web.asynchronous):
@web.asynchronous
def get(self):
    http = httpclient.AsyncHTTPClient()
    http.fetch("http://example.com/", self._on_download)

def _on_download(self, response):
    self.finish()
The get() method makes an asynchronous call to http://example.com/. Let's assume this call takes a long time. http.fetch() is called and a moment later get() returns (http.fetch() is still running in the background). Tornado's IOLoop can move on to serve the next request while the data from http://example.com/ is being fetched. Once http.fetch() has finished, the callback, self._on_download, is called. Then self.finish() is called and the request is finally closed. This is the moment when the user can see the result in the browser.
This is possible thanks to httpclient.AsyncHTTPClient(). If you used the synchronous httpclient.HTTPClient() instead, you would have to wait for the call to http://example.com/ to finish; only then would get() return and the next request be processed.
To sum up, use the @tornado.web.asynchronous decorator if you use asynchronous code inside the RequestHandler, which is advised. Otherwise it doesn't make much difference to performance.
EDIT: To solve your problem you can run your time-consuming function in a separate thread. Here's a simple example of your TestHandler class:
class TestHandler(tornado.web.RequestHandler):
    # Named _on_task_done rather than on_finish: RequestHandler already
    # defines an on_finish() hook that Tornado calls with no arguments.
    def _on_task_done(self, response):
        self.write(response)
        self.finish()

    def async_function(base_function):
        @functools.wraps(base_function)
        def run_in_a_thread(*args, **kwargs):
            func_t = threading.Thread(target=base_function, args=args, kwargs=kwargs)
            func_t.start()
        return run_in_a_thread

    @async_function
    def background_task(self, callback):
        sm = 0
        for i in range(10 ** 8):
            sm = sm + 1
        callback(str(sm + random.randint(0, sm)))

    @tornado.web.asynchronous
    def get(self):
        res = self.background_task(self._on_task_done)
You also need to add these imports to your code:
import functools
import threading
async_function is a decorator. If you're not familiar with the topic, I suggest reading about it (e.g. decorators) and trying it on your own. In general, our decorator lets the function return immediately (so the main program can carry on) while the processing takes place at the same time in a separate thread. Once the function in the thread has finished, it calls a callback that writes the result out to the end user and closes the connection.
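A related sketch of the same idea, using the standard library's asyncio so that it runs on its own: loop.run_in_executor moves the blocking work to a thread and hands the result back to the event-loop thread, which is where the response would then be written. Tornado's documentation notes that RequestHandler methods such as write() and finish() are not thread-safe, so returning to the loop thread before writing is the more cautious variant. The handler names and the 0.2 second sleep are stand-ins, not part of the code above.

```python
import asyncio
import time


def background_task():
    # Stand-in for the time-consuming work; it blocks only its own thread.
    time.sleep(0.2)
    return "slow"


async def handler_a(order):
    loop = asyncio.get_running_loop()
    # None selects the loop's default thread pool executor.
    result = await loop.run_in_executor(None, background_task)
    # Execution resumes on the event-loop thread here.
    order.append(result)


async def handler_b(order):
    order.append("fast")


async def main():
    order = []
    await asyncio.gather(handler_a(order), handler_b(order))
    return order


order = asyncio.run(main())
print(order)
```

Although handler_a is started first, the fast handler completes first because the loop is never blocked.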
Here is my code:
# /test
class Test(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    @tornado.gen.coroutine
    def get(self):
        res = yield self.inner()
        self.write(res)

    @tornado.gen.coroutine
    def inner(self):
        import time
        time.sleep(15)
        raise tornado.gen.Return('hello')

# /test_1
class Test1(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    @tornado.gen.coroutine
    def get(self):
        res = yield self.inner()
        self.write(res)

    @tornado.gen.coroutine
    def inner(self):
        raise tornado.gen.Return('hello test1')
When I fetch /test and then fetch /test_1, /test_1 does not respond until /test has responded. How can I fix this?
Don't use time.sleep(); it blocks the IOLoop. Instead, use
yield tornado.gen.Task(tornado.ioloop.IOLoop.instance().add_timeout,
time.time() + sleep_seconds)
You've hit both of the frequently asked questions:
http://www.tornadoweb.org/en/stable/faq.html
First, please don't use time.sleep() in a Tornado application; use gen.sleep() instead. Second, be aware that most browsers won't fetch two pages from the same domain simultaneously: use "curl" or "wget" to test your application instead.
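The difference between a blocking and a cooperative sleep can be measured with a small sketch. It uses asyncio from the standard library (asyncio.sleep standing in for gen.sleep) and two hypothetical handlers that are run concurrently, so neither curl nor a browser is needed.

```python
import asyncio
import time


async def blocking_handler():
    time.sleep(0.2)           # blocks the whole event loop


async def cooperative_handler():
    await asyncio.sleep(0.2)  # yields control to the loop while waiting


async def timed(handler):
    started = time.monotonic()
    # Run two "requests" on the same loop at the same time.
    await asyncio.gather(handler(), handler())
    return time.monotonic() - started


blocking_elapsed = asyncio.run(timed(blocking_handler))
cooperative_elapsed = asyncio.run(timed(cooperative_handler))
print(round(blocking_elapsed, 2), round(cooperative_elapsed, 2))
```

With the blocking sleep the two requests run back to back (roughly 0.4 s in total), while with the cooperative sleep they overlap (roughly 0.2 s).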
I'm trying to use AsyncHTTPClient in Tornado to do multiple callouts to a "device" available over http:
def ext_call(self, params):
    device = AsyncHTTPClient()
    request = HTTPRequest(...)
    return partial(device.fetch, request)

@coroutine
def _do_call(self, someid):
    acall = self.ext_call(params)
    waitkey = str(someid)
    acall(callback=(yield Callback(waitkey)))
    response = yield Wait(waitkey)
    raise Return(response)

def get_device_data(self, lst):
    for someid in lst:
        r = self._do_call(someid)
        print 'response', r
But instead of the HTTP responses that AsyncHTTPClient should return from .fetch, I'm getting this:
response <tornado.concurrent.TracebackFuture object at 0x951840c>
Why is this not working like the examples in http://www.tornadoweb.org/en/stable/gen.html?
Got this one solved. It appears that @coroutine has to be applied all the way down from the get/post method of your class inheriting from RequestHandler; otherwise the @coroutine/yield magic does not work.
Apparently this is a case of Tornado newbiness combined with bad design on my part: according to a colleague, one should not write "callback spaghetti" of nested @coroutine and yield calls, but rather move all the synchronous code out of the request handler, call it before or after the async code, and keep the @coroutine call hierarchy flat rather than deep.
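The symptom in the question (a future printed instead of a response) can be reproduced in a few lines. The sketch below uses native coroutines from the standard library instead of @coroutine and yield, and do_call is a made-up stand-in for the HTTP fetch. The principle is the same: calling a coroutine only gives you an object to wait on, and you get the result only by yielding or awaiting it from another coroutine.

```python
import asyncio
import inspect


async def do_call(some_id):
    await asyncio.sleep(0)  # stand-in for the HTTP fetch
    return "response for %s" % some_id


async def get_device_data(ids):
    not_awaited = do_call(ids[0])  # a coroutine object, not a response
    was_coroutine = inspect.iscoroutine(not_awaited)
    first = await not_awaited      # awaiting it produces the actual value
    rest = [await do_call(i) for i in ids[1:]]
    return was_coroutine, [first] + rest


was_coroutine, responses = asyncio.run(get_device_data([1, 2]))
print(was_coroutine, responses)
```

This is why the caller itself has to be a coroutine: a plain function has no way to wait for the value.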
I would like to sort a simple query, but I'm not sure how this works with gen.Task, as it takes a method as its first argument and the parameter as its second.
This works fine:
response, error = yield gen.Task(db.client().collection.find, {"user_id":user_id})
if response:
    #blablabla
But then how do I give it the sort()?
UPDATE: This now throws a 'callback must be callable' error, which seems to be some other issue with Tornado.
def findsort(self, find, callback):
    return callback(db.client().collection.find(find).sort({"myfield":1}))

@gen.engine
def anotherfunction(self):
    response, error = yield gen.Task(self.findsort, {"user_id":user_id})
Use asyncmongo, it works perfectly with gen.
After juggling you will get something like this:
DB = asyncmongo.Client()

class MainHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    @gen.engine
    def get(self):
        result, error = yield gen.Task(DB.collection.find, {}, limit=50, sort=[('myfield', 1)])
As for 'callback must be callable': when working with gen, always declare one extra argument in functions that are called through gen.Task.
def findsort(self, find, params, callback):  # here we receive self + 3 args; if we remove params, callback will contain {"user_id":user_id}
    return callback(db.client().collection.find(find).sort({"myfield":1}))

@gen.engine
def anotherfunction(self):
    response, error = yield gen.Task(self.findsort, {"user_id":user_id})  # we see 2 args, but it passes 3 args to findsort
It looks like you are trying to make your MongoDB calls asynchronous. By default pymongo is blocking, but there is a separate branch called motor that makes async queries possible.
See http://emptysquare.net/blog/introducing-motor-an-asynchronous-mongodb-driver-for-python-and-tornado/ for more details.
It supports the tornado.gen generator pattern too.
I'm not familiar with gen.Task but maybe you could try:
@gen.engine
def anotherfunction(self):
    def findsort(find):
        return db.client().collection.find(find).sort({"myfield":1})
    response, error = yield gen.Task(findsort, {"user_id":user_id})
You should use asyncmongo, an async implementation of pymongo.
gen.Task requires the function to have a callback parameter.
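To illustrate the callback convention, here is a rough stand-in for what gen.Task does: it calls the function with an extra callback keyword argument and resolves when that callback fires. The task helper and find_sorted are hypothetical, the in-memory rows replace a real database, and asyncio is used so that the sketch runs without Tornado.

```python
import asyncio


def task(func, *args, **kwargs):
    """Hypothetical stand-in for gen.Task: adds callback=... to the call."""
    future = asyncio.get_running_loop().create_future()
    # The wrapped function must accept a "callback" keyword argument.
    func(*args, callback=future.set_result, **kwargs)
    return future


def find_sorted(query, callback):
    # Made-up in-memory data instead of a real collection.
    rows = [{"myfield": 3}, {"myfield": 1}, {"myfield": 2}]
    callback(sorted(rows, key=lambda row: row["myfield"]))


async def main():
    return await task(find_sorted, {"user_id": 42})


result = asyncio.run(main())
print(result)
```

A function without a callback parameter would fail here with a TypeError, because the helper always passes callback as a keyword.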