Does bottle handle requests with no concurrency? - python

At first, I think Bottle will handle requests concurrently, so I wrote test code bellow:
import json
from bottle import Bottle, run, request, response, get, post
import time
app = Bottle()
NUMBERS = 0
#app.get("/test")
def test():
id = request.query.get('id', 0)
global NUMBERS
n = NUMBERS
time.sleep(0.2)
n += 1
NUMBERS = n
return id
#app.get("/status")
def status():
return json.dumps({"numbers": NUMBERS})
run(app, host='0.0.0.0', port=8000)
Then I use jmeter to request /test url with 10 threads loops 20 times.
After that, /status gives me {"numbers": 200}, which seems like that bottle does not handle requests concurrently.
Did I misunderstand anything?
UPDATE
I did another test, I think it can prove that bottle deal with requests one by one(with no concurrency). I did a little change to the test function:
#app.get("/test")
def test():
t1 = time.time()
time.sleep(5)
t2 = time.time()
return {"t1": t1, "t2": t2}
And when I access /test twice in a browser I get:
{
"t2": 1415941221.631711,
"t1": 1415941216.631761
}
{
"t2": 1415941226.643427,
"t1": 1415941221.643508
}

Concurrency isn't a function of your web framework -- it's a function of the web server you use to serve it. Since Bottle is WSGI-compliant, it means you can serve Bottle apps through any WSGI server:
wsgiref (reference server in the Python stdlib) will give you no concurrency.
CherryPy dispatches through a thread pool (number of simultaneous requests = number of threads it's using).
nginx + uwsgi gives you multiprocess dispatch and multiple threads per process.
Gevent gives you lightweight coroutines that, in your use case, can easily achieve C10K+ with very little CPU load (on Linux -- on Windows it can only handle 1024 simultaneous open sockets) if your app is mostly IO- or database-bound.
The latter two can serve massive numbers of simultaneous connections.
According to http://bottlepy.org/docs/dev/api.html , when given no specific instructions, bottle.run uses wsgiref to serve your application, which explains why it's only handling one request at once.

Related

How to make flask application run every 5 minutes?

I have this application that works how I want to but now I want it to grab live data from some virtual machines every 5 minutes. In the code below I have it set to renew every ten seconds just to see if it works but nothing is happening. I am using time.sleep. What else am I missing?
import time
from flask import Flask, render_template
from testapi import grab_cpu
app = Flask(__name__)
starttime = time.time()
while True:
machines =["build05", "build06", "build07","build08", "build09", "build10", "build11", "build12","build14","build15", "winbuild10","winbuild11", "winbuild12", "winbuild13", "wbuild14", "wbuild15", "winbuild16", "winbuild17", "winbuild18"]
cpu_percentage =[grab_cpu("build05"), grab_cpu("build06"),grab_cpu("build07"),
grab_cpu("build08"), grab_cpu("build09"), grab_cpu("build10"), grab_cpu("build11"), grab_cpu("build12"), grab_cpu("build13"), grab_cpu("build14"), grab_cpu("build15"), grab_cpu("winbuild10"), grab_cpu("winbuild11"), grab_cpu("winbuild12"), grab_cpu("winbuild14"), grab_cpu("winbuild15"), grab_cpu("winbuild16"), grab_cpu("winbuild17"), grab_cpu("winbuild18")]
#app.route("/") # this sets the route to this page
def home():
return render_template('testdoc.html', len = len(machines), machines = machines, cpu_percentage = cpu_percentage)
app.run(use_reloader = True, debug = True)
time.sleep(10.0 - ((time.time() - starttime) % 10.0))
Edit (this is an update with the suggestions, it's still not working as i'd like):
Edit 2 more info: I have one file with a function, grab_cpu, that does an api call to a vm and returns the percentage of usage. I have another file called test doc.html which just displays the html. From these responses i'm guessing I need to use some javascript and something with sockets. Can someone please drop a link to point me in the right direction?
import time
from flask import Flask, render_template
from testapi import grab_cpu
app = Flask(__name__)
#app.route("/") # this sets the route to this page
def home():
starttime = time.time()
while True:
machines =["build05", "build06", "build07","build08", "build09", "build10", "build11", "build12","build14","build15", "winbuild10","winbuild11", "winbuild12", "winbuild13", "wbuild14", "wbuild15", "winbuild16", "winbuild17", "winbuild18"]
cpu_percentage =[grab_cpu("build05"), grab_cpu("build06"),grab_cpu("build07"),
grab_cpu("build08"), grab_cpu("build09"), grab_cpu("build10"), grab_cpu("build11"), grab_cpu("build12"), grab_cpu("build13"), grab_cpu("build14"), grab_cpu("build15"), grab_cpu("winbuild10"), grab_cpu("winbuild11"), grab_cpu("winbuild12"), grab_cpu("winbuild14"), grab_cpu("winbuild15"), grab_cpu("winbuild16"), grab_cpu("winbuild17"), grab_cpu("winbuild18")]
return render_template('testdoc.html', len = len(machines), machines = machines, cpu_percentage = cpu_percentage)
time.sleep(10.0 - ((time.time() - starttime) % 10.0))
app.run(use_reloader = True, debug = True)
Here is the page:
Thank you.
I recommend against using Flask to handle the scheduling. The intended flow for Flask requests is:
Receive an HTTP request from the browser
Generate a response as quickly as possible (ideally in a few milliseconds, but almost always less than a few seconds)
Return the response to the browser
The intention of the code above seems to be to use Flask to push updates down to the browser, but Flask can only respond to incoming requests, not force the browser to change.
For the use case you're describing, a simpler solution would be to handle the refresh logic in the browser.
For a very rudimentary example, put this into your testdoc.html template:
<script type="text/javascript">
setTimeout(function () {
location.reload();
}, 10 * 1000);
</script>
That will reload the page every 10 seconds, which will generate a new request to your Flask server and display updated information in your browser.
If you want to get fancier and avoid reloading the entire page, you can use JavaScript XmlHttpRequests or the more modern Fetch API to update specific elements of the page asynchronously.
Also, here's a suggested simplification of the Python code:
from flask import Flask, render_template
from testapi import grab_cpu
app = Flask(__name__)
build_machines = list(map(lambda i: 'build%02d' % i, range(5, 16)))
win_build_machines = list(map(lambda i: 'winbuild%02d' % i, range(10, 19)))
machines = build_machines + win_build_machines
# Handle HTTP GET requests to / route
#app.route("/", methods=['GET'])
def home():
cpu_percentage = list(map(lambda b: grab_cpu(b), machines))
return render_template('testdoc.html', len = len(machines), machines = machines, cpu_percentage = cpu_percentage)
app.run(use_reloader = True, debug = True)
Your code needs some work, Flask views are not meant to be declared inside a loop. I would suggest the following:
Remove the Flask view from inside the while loop.
Declare the server outside the loop too.
Write your code inside a function.
Run the while loop just calling the function to grab the information from your sources.
Based on what I think is just an example made on the run, I will also make some assumptions about your requirement:
This is not exactly your code.
Your code works perfectly, but this implementation is lacking.
Your project has some level of complexity.
You need to show this data somewhere else.
Base on these (and perhaps additional conditions), I would say you have two ways to achieve in an effective way what you need:
(Not recommended) Create a Cron Job and run everything in a dedicated script.
(My favorite) Encapsulate the logic of your script inside a Flask API call, a method, or a function, and declare it as a Celery task, scheduled to run every 5 seconds, updating your database, and use a view with some JS reactivity to show the data in realtime.

issue where Flask app request handling is choked when a regex.serach is happening?

Setup :
Language i am using is Python
I am running a flask app with threaded =True and inside the flask app when an endpoint is hit, it starts a thread and returns a thread started successfully with status code 200.
Inside the thread, there is re.Search happening, which takes 30-40 seconds(worst case) and once the thread is completed it hits a callback URL completing one request.
Ideally, the Flask app is able to handle concurrent requests
Issue:
When the re.search is happening inside the thread, the flask app is not accepting concurrent requests. I am assuming some kind of thread locking is happening and unable to figure out.
Question :
is it ok to do threading(using multi processing) inside flask app which has "threaded = True"?
When regex is happening does it do any thread locking?
Code snippet: hit_callback does a post request to another api which is not needed for this issue.
import threading
from flask import *
import re
app = Flask(__name__)
#app.route(/text)
def temp():
text = request.json["text"]
def extract_company(text)
attrib_list = ["llc","ltd"] # realone is 700 in length
entity_attrib = r"((\s|^)" + r"(.)?(\s|$)|(\s|^)".join(attrib_list) + r"(\s|$))"
raw_client = re.search("(.*(?:" + entity_attrib + "))", text, re.I)
hit_callback(raw_client)
extract_thread = threading.Thread(target=extract_company, )
extract_thread.start()
return jsonify({"Response": True}), 200
if __name__ == "__main__":
app.run(debug=True, host='0.0.0.0', port=4557, threaded=True)
Please read up on the GIL - basically, Python can only execute ONE piece of code at the same time - even if in threads. So your re-search runs and blocks all other threads until run to completion.
Solutions: Use Gunicorn etc. pp to run multiple flask processes - do not attempt to do everything in one flask process.
To add something: Also your design is problematic - 40 seconds or so for a http answer is way to long. A better design would have at least two services: a web server and a search service. The first would register a search and give back an id. Your search service would communicate asynchronously with the web service and return resullt +id when the result is ready. Your clients could pull the web service until they get a result.

How can we create asynchronous API in django?

I want to create a third party chatbot API which is asynchronous and replies "ok" after 10 seconds pause.
import time
def wait():
time.sleep(10)
return "ok"
# views.py
def api(request):
return wait()
I have tried celery for the same as follows where I am waiting for celery response in view itself:
import time
from celery import shared_task
#shared_task
def wait():
time.sleep(10)
return "ok"
# views.py
def api(request):
a = wait.delay()
work = AsyncResult(a.id)
while True:
if work.ready():
return work.get(timeout=1)
But this solution works synchronously and makes no difference. How can we make it asynchronous without asking our user to keep on requesting until the result is received?
As mentioned in #Blusky's answer:
The asynchronous API will exist in django 3.X. not before.
If this is not an option, then the answer is just no.
Please note as well, that even with django 3.X any django code, that accesses the database will not be asynchronous it had to be executed in a thread (thread pool)
Celery is intended for background tasks or deferred tasks, but celery will never return an HTTP response as it didn't receive the HTTP request to which it should respond to. Celery is also not asyncio friendly.
You might have to think of changing your architecture / implementation.
Look at your overall problem and ask yourself whether you really need an asynchronous API with Django.
Is this API intended for browser applications or for machine to machine applications?
Could your client's use web sockets and wait for the answer?
Could you separate blocking and non blocking parts on your server side?
Use django for everything non blocking, for everything periodic / deferred (django + celelry) and implement the asynchronous part with web server plugins or python ASGI code or web sockets.
Some ideas
Use Django + nginx nchan (if your web server is nginx)
Link to nchan: https://nchan.io/
your API call would create a task id, start a celery task, return immediately the task id or a polling url.
The polling URL would be handled for example via an nchan long polling channel.
your client connects to the url corresponding to an nchan long polling channel and celery deblocks it whenever you're task is finished (the 10s are over)
Use Django + an ASGI server + one handcoded view and use strategy similiar to nginx nchan
Same logic as above, but you don't use nginx nchan, but implement it yourself
Use an ASGI server + a non blocking framework (or just some hand coded ASGI views) for all blocking urls and Django for the rest.
They might exchange data via the data base, local files or via local http requests.
Just stay blocking and throw enough worker processes / threads at your server
This is probably the worst suggestion, but if it is just for personal use,
and you know how many requests you will have in parallel then just make sure you have enough Django workers, so that you can afford to be blocking.
I this case you would block an entire Django worker for each slow request.
Use websockets. e.g. with the channels module for Django
Websockets can be implemented with earlier versions of django (>= 2.2) with the django channels module (pip install channels) ( https://github.com/django/channels )
You need an ASGI server to server the asynchronous part. You could use for example Daphne ot uvicorn (The channel doc explains this rather well)
Addendum 2020-06-01: simple async example calling synchronous django code
Following code uses the starlette module as it seems quite simple and small
miniasyncio.py
import asyncio
import concurrent.futures
import os
import django
from starlette.applications import Starlette
from starlette.responses import Response
from starlette.routing import Route
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'pjt.settings')
django.setup()
from django_app.xxx import synchronous_func1
from django_app.xxx import synchronous_func2
executor = concurrent.futures.ThreadPoolExecutor(max_workers=2)
async def simple_slow(request):
""" simple function, that sleeps in an async matter """
await asyncio.sleep(5)
return Response('hello world')
async def call_slow_dj_funcs(request):
""" slow django code will be called in a thread pool """
loop = asyncio.get_running_loop()
future_func1 = executor.submit(synchronous_func1)
func1_result = future_func1.result()
future_func2 = executor.submit(synchronous_func2)
func2_result = future_func2.result()
response_txt = "OK"
return Response(response_txt, media_type="text/plain")
routes = [
Route("/simple", endpoint=simple_slow),
Route("/slow_dj_funcs", endpoint=call_slow_dj_funcs),
]
app = Starlette(debug=True, routes=routes)
you could for example run this code with
pip install uvicorn
uvicorn --port 8002 miniasyncio:app
then on your web server route these specific urls to uvicorn and not to your django application server.
The best option is to use the futur async API, which will be proposed on Django in 3.1 release (which is already available in alpha)
https://docs.djangoproject.com/en/dev/releases/3.1/#asynchronous-views-and-middleware-support
(however, you will need to use an ASGI Web Worker to make it work properly)
Checkout Django 3 ASGI (Asynchronous Server Gateway Interface) support:
https://docs.djangoproject.com/en/3.0/howto/deployment/asgi/

Is the Flask Caching filesystem cache shared across processes?

Let us assume I use Flask with the filesystem cache in combination with uWSGI or gunicorn, either of them starting multiple processes or workers. Do all these processes share the same cache? Or asked differently, do the functions and parameters always evaluate to the same cache key regardless of process pid, thread state etc.?
For instance, consider the following minimal example:
import time
from flask import Flask, jsonify
from flask_caching import Cache
app = Flask(__name__)
cache = Cache(app, config={
'CACHE_TYPE': 'filesystem',
'CACHE_DIR': 'my_cache_directory',
'CACHE_DEFAULT_TIMEOUT': 3600,
})
#cache.memoize()
def compute(param):
time.sleep(5)
return param + 1
#app.route('/')
#app.route('/<int:param>')
def main(param=41):
expensive = compute(param)
return jsonify({"Hello expensive": expensive})
if __name__ == '__main__':
app.run()
Will www.example.com/41 just take 5 seconds once and then (for 3600 seconds) be available instantly regardless of the uWSGI or gunicorn workers?
If i run it locally on my machine, the cache is stable across different processes as well as even different restarts of the entire server.
I am finding that the flask-caching filesystem cache is kept per-worker. Simple example I just tried in my app (4 workers):
#app.route("/product/<id>", methods=["GET"])
#app.cache.cached()
def product(id):
product = Product.from_id(id)
app.pp.pprint(product.get_data())
I'm re-loading the page that calls that view and I see the pprint output 4 times in the console, then no more after that.

Gevent joinall blocking error

A bit of background on the application:
Clients make web requests
Server has to handle web requests ( each request takes 20 seconds )
Send response to clients.
My approach was to first parallelize the request serving part using webpy and mod_wsgi. If I start the code with 20 or so threads, but since each thread takes 20 seconds to complete, the request level multi-threading is of less use. So I had to reduce the 20 seconds which I did by spawning greenlets. Something like this
File wordprocess_gevent.py
import gevent
urls = ('/'. A)
class A:
output = {}
def POST(self):
#words is a list of words
method A(words):
def A(self, words):
threads = []
for word in words:
i_thread = gevent.spawn(B, word)
threads.append(i_thread)
gevent.joinall(threads, timeout=2)
def B(self, word):
#Process word takes 20 seconds
result = process(word)
a[word] = result
application = web.application(urls, globals()).wsgifunc()
I start this code using the mod_wsgi-express as given below:
mod_wsgi-express start-server wordprocess_gevent.py --processes 5 --server-root wsgi_logs/ --with-wdb &
When simultaneous POST requests arrive I get this error
LoopExit: This operation would block forever
at the line
gevent.joinall(threads, timeout=2)
BUT if I post a single POST request - I get the required results. Can someone help me out here please.
So I solved this by removing mod_wsgi altogether. Even with one process I was getting the same result when multiple requests came.
Added the following and its now working like a charm :)
if __name__ == "__main__":
application = web.application(urls, globals()).wsgifunc()
appserver = WSGIServer(('', 8000), application)
appserver.serve_forever()

Categories

Resources