Using Python's bottle module, I would like to process a request internally, without invoking a call from outside.
Suppose I have the following minimal bottle application running at localhost:8080 and I would like to invoke foo from inside bar. One way to do this is:
from bottle import *
import requests

@get('/foo')
def foo():
    return 'foo'

@get('/bar')
def bar():
    return requests.get('http://localhost:8080/foo').text

app = default_app()
run(app, port=8080)
Now what I would like to do is get rid of the HTTP call using requests. I would simply love to use something like:
@get('/bar')
def bar():
    return bottle.process_internally('/foo', 'GET')
For me, this would have two big advantages:
Only a single worker is required (the worker is blocked while processing a request, hence using requests.get leads to a deadlock if only one worker is running).
No overhead caused by the HTTP protocol.
My true motivation is that I wish to process batches containing request URLs in the form of JSON arrays. Very inefficient, yet very fast to implement.
Is that somehow possible?
You may want to consider the "thin controllers" (and skinny everything) paradigm.
With this concept, all your code logic is elsewhere other than the controller (perhaps in your service or model classes).
If you have the bare minimum amount of logic in your controllers, then your foo and bar routes can call the same functions in your models/services, and you won't need to resort to your routes calling each other.
There are some frameworks that have support for internal redirect (Ruby's Sinatra is one), but I've always considered these a hacky workaround for not writing the code in a flexible way.
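As a sketch of that refactor applied to the bottle example above (plain Python; the route wiring is shown in comments since it doesn't change):

```python
# Service-layer function: the logic lives here, not in the route handlers.
def foo_service():
    return 'foo'

# In the bottle app, both routes become thin wrappers around it:
#
#   @get('/foo')
#   def foo():
#       return foo_service()
#
#   @get('/bar')
#   def bar():
#       return foo_service()  # direct call, no internal HTTP request needed

print(foo_service())  # → foo
```

With this shape, only one worker is needed and there is no HTTP overhead, since bar never leaves the process.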
Related
I am writing a Flask application and I am trying to add a multi-threaded implementation for certain server-related features. I noticed this weird behavior, so I wanted to understand why it is happening and how to solve it. I have the following code:
from flask import Blueprint
from flask_login import current_user, login_required
import threading

posts = Blueprint('posts', __name__)

@posts.route("/foo")
@login_required
def foo():
    print(current_user)
    thread = threading.Thread(target=goo)
    thread.start()
    thread.join()
    return ''

def goo():
    print(current_user)
    # ...
The main process correctly prints the current_user, while the child thread prints None.
User('Username1', 'email1@email.com', 'Username1-ProfilePic.jpg')
None
Why is this happening? How can I obtain the current_user in the child thread as well? I tried passing it as an argument to goo, but I still get the same behavior.
I found this post but I can't understand how to ensure the context is not changing in this situation, so I tried providing a simpler example.
A partially working workaround
I tried passing as a parameter a newly created User object populated with the data from current_user:
def foo():
    # ...
    user = User.query.filter_by(username=current_user.username).first_or_404()
    thread = threading.Thread(target=goo, args=[user])
    # ...

def goo(user):
    print(user)
    # ...
And it correctly prints the information of the current user. But since I am also performing database operations inside goo, I get the following error:
RuntimeError: No application found. Either work inside a view function
or push an application context. See
http://flask-sqlalchemy.pocoo.org/contexts/.
So, as I suspected, I assume it's a problem of context.
I also tried inserting this inside goo, as suggested by the error:
def goo():
    from myapp import create_app
    app = create_app()
    app.app_context().push()
    # ... database access
But I still get the same errors and if I try to print current_user I get None.
How can I pass the old context to the new thread? Or should I create a new one?
This is because Flask uses thread-local variables to store this for each request's thread. That simplifies things in many cases, but makes it hard to use multiple threads. See https://flask.palletsprojects.com/en/1.1.x/design/#thread-local.
If you want to use multiple threads to handle a single request, Flask might not be the best choice. You can always interact with Flask exclusively on the initial thread if you want and then forward anything you need on other threads back and forth yourself through a shared object of some kind. For database access on secondary threads, you can use a thread-safe database library with multiple threads as long as Flask isn't involved in its usage.
In summary, treat Flask as single threaded. Any extra threads shouldn't interact directly with Flask to avoid problems. You can also consider either not using threads at all and run everything sequentially or trying e.g. Tornado and asyncio for easier concurrency with coroutines depending on the needs.
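One way to apply that advice to the current_user example above: extract plain values from current_user on the request thread, and hand only those to the worker thread. A minimal sketch (goo's body and the stand-in values are illustrative):

```python
import threading

results = []

# The worker receives plain data instead of touching Flask's thread-local current_user.
def goo(username, email):
    results.append('processing %s <%s>' % (username, email))

# Inside the view function you would do something like:
#   username, email = current_user.username, current_user.email
username, email = 'Username1', 'email1@email.com'  # stand-in values

t = threading.Thread(target=goo, args=(username, email))
t.start()
t.join()
print(results[0])  # → processing Username1 <email1@email.com>
```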
Your server serves multiple users, each handled in its own thread.
flask_login was not designed for extra threading inside a request, which is why the child thread prints None.
I suggest using the database to pass variables between users, and running an additional Docker container if you need a separate process.
That is because current_user is implemented as a thread-local resource:
https://github.com/maxcountryman/flask-login/blob/main/flask_login/utils.py#L26
Read:
https://werkzeug.palletsprojects.com/en/1.0.x/local/#module-werkzeug.local
I have hosted a Flask app on Heroku, written in Python. I have a function which is something like this:
@app.route("/execute")
def execute():
    doSomething()
    return Response()
Now, the problem is that doSomething() takes more than 30 seconds to execute, exceeding Heroku's 30-second request timeout, and the request is killed.
I could make another thread and execute doSomething() inside it, but the Response object needs to return a file that will be made available only after doSomething() has finished execution.
I also tried working with generators and yield, but couldn't get them to work either. Something like:
@app.route("/execute")
def execute():
    def generate():
        yield ''
        doSomething()
        yield file
    return Response(generate())
but the app requires me to refresh the page in order to get the second yielded object.
What I basically need to do is return an empty Response object initially, start the execution of doSomething(), and then return another Response object. How do I accomplish this?
Usually with HTTP, one request means one response; that's it.
For your issue you might want to look into:
Streaming responses, which are used for large responses with many parts.
Sockets, which allow multiple "responses" for a single "request".
Making multiple requests from your client; if you have control over the client code, this is most likely the easiest solution.
I'd recommend reading this, it gets a bit technical but it helped me understand a lot of things.
What you are trying to make is an asynchronous job. For that I recommend you use Celery (here you have a good example: https://blog.miguelgrinberg.com/post/using-celery-with-flask/page/7) or some other tool for asynchronous jobs. In the front-end you can do simple polling to wait for the response; I recommend SocketIO (https://socket.io/). It's a simple and efficient solution.
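For illustration, here is a minimal in-process sketch of the submit-then-poll pattern that Celery and a front-end poller formalize (all names here are illustrative, not Celery's API):

```python
import threading
import time
import uuid

jobs = {}  # job_id -> {'status': ..., 'result': ...}

def do_something():
    time.sleep(0.1)  # stand-in for the long-running work
    return 'file-contents'

def start_job():
    job_id = str(uuid.uuid4())
    jobs[job_id] = {'status': 'running', 'result': None}

    def run():
        jobs[job_id]['result'] = do_something()
        jobs[job_id]['status'] = 'done'

    threading.Thread(target=run).start()
    return job_id

# /execute would return the job id immediately; the client then polls
# a status endpoint (or listens on a socket) until the job is done.
job_id = start_job()
while jobs[job_id]['status'] != 'done':
    time.sleep(0.02)
print(jobs[job_id]['result'])  # → file-contents
```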
It's basically an asynchronous job. You can use Celery or asyncio for these operations. You can never ask a user to wait more than 3 to 10 seconds for any operation.
1) Make an AJAX Request
2) Initialize a socket that listens to your operation.
3) As soon as you finish the operation, the socket sends the message back, and you can show it to the user later through a popup.
This is the best approach you can take.
If you could share what computation you are making, you could get more alternative approaches.
I have created a module that does some heavy computations and returns some data to be stored in a NoSQL database. The computation process is started via a POST request in my Flask application. The Flask function executes the computation code, and the returned results are then stored in the database. I was thinking of Celery. But I am wondering, and haven't found any clear info on it, whether it would be possible to use Python threading instead, e.g.:
from mysci_module import heavy_compute

@route('/initiate_task/', methods=['POST',])
def run_computation():
    import thread
    thread.start_new_thread(heavy_compute, post_data)
    return response
It's very abstract, I know. The only problem I see in this method is that my function will have to know about, and be responsible for, storing data in the database, so it is not very independent of the database used. Correct? Why is Celery better (is it really?) than the method above?
Since CPython's GIL prevents true thread-level concurrency, all computations will in fact happen serially. Instead, you could use the Python multiprocessing module and create a pool of processes to complete your heavy computation task.
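A minimal sketch of the multiprocessing approach (heavy_compute here is a stand-in for the real CPU-bound function):

```python
from multiprocessing import Pool

def heavy_compute(x):
    return x * x  # stand-in for the real CPU-bound computation

if __name__ == '__main__':
    # A pool of worker processes sidesteps the GIL for CPU-bound work.
    with Pool(processes=4) as pool:
        results = pool.map(heavy_compute, [1, 2, 3, 4])
    print(results)  # → [1, 4, 9, 16]
```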
There are a few microframeworks, such as Twisted's Klein, apart from Celery, that can also help achieve the concurrency and independence you're looking for. They aren't necessarily better, but they are available for those who don't want to get their hands messy with the various issues that are likely to come up when synchronizing Flask with the actual business logic, especially when the response depends on that activity.
I would suggest the following method: start a thread for the long procedure first, then let Flask communicate with the procedure from time to time, as your requirements dictate:
from mysci_module import heavy_compute
import thread

thread.start_new_thread(heavy_compute, post_data)

@route('/initiate_task/', methods=['POST',])
def check_computation():
    response = heavy_compute.status
    return response
The best part of this method is that you keep a callable thread running in the background the whole time, while it is still possible to get the necessary result and even pass some parameters to the task.
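In Python 3 (where the thread module is gone) the same idea could look like this; the computation is a stand-in:

```python
import threading

class HeavyCompute:
    """Holds status/result so a route handler can inspect them at any time."""
    def __init__(self):
        self.status = 'pending'
        self.result = None

    def run(self, data):
        self.status = 'running'
        self.result = sum(data)  # stand-in for the real computation
        self.status = 'done'

task = HeavyCompute()
worker = threading.Thread(target=task.run, args=([1, 2, 3],))
worker.start()
# ... a check_computation route would simply return task.status here ...
worker.join()
print(task.status, task.result)  # → done 6
```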
I have a WSGI application which can potentially run in different Python web server environments (CherryPy, uWSGI or Gunicorn), but not greenlet-based ones. The application handles different paths for some REST APIs and user interfaces. During any HTTP call to the app, there is a need to know the context of the call, since the implementation methods share code between API calls and UI calls, and logic spread across many modules should react differently depending on the context. The simple and straightforward way is to pass a parameter to the implementation code, e.g. ApiCall(caller=client_service_id) or UserCall(caller=user_id), but it's a pain to propagate this parameter to all the possible modules. Is it a good solution to just set the context on the thread object, like this?
import threading

def set_context(ctx):
    threading.current_thread().ctx = ctx

def get_context():
    return threading.current_thread().ctx
So call set_context somewhere at the beginning of the HTTP call handler, where we can construct the context object depending on the environment data, and then just use get_context() in any part of the code that must react depending on the context. What are the best practices to achieve this? Thank you!
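For what it's worth, the standard library's threading.local packages exactly this kind of per-thread state; a sketch of the same set_context/get_context API built on it:

```python
import threading

_ctx = threading.local()

def set_context(ctx):
    _ctx.value = ctx

def get_context():
    # default to None when the current thread never set a context
    return getattr(_ctx, 'value', None)

# Each thread sees only its own context:
seen = {}

def handler(name):
    set_context({'caller': name})
    seen[name] = get_context()['caller']

threads = [threading.Thread(target=handler, args=(n,)) for n in ('api', 'ui')]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(seen.items()))  # → [('api', 'api'), ('ui', 'ui')]
```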
So I have a handler below:
class PublishHandler(BaseHandler):
    def post(self):
        message = self.get_argument("message")
        some_function(message)
        self.write("success")
The problem that I'm facing is that some_function() takes some time to execute and I would like the post request to return straight away when called and for some_function() to be executed in another thread/process if possible.
I'm using Berkeley DB as the database, and what I'm trying to do is relatively simple.
I have a database of users each with a filter. If the filter matches the message, the server will send the message to the user. Currently I'm testing with thousands of users and hence upon each publication of a message via a post request it's iterating through thousands of users to find a match. This is my naive implementation of doing things and hence my question. How do I do this better?
You might be able to accomplish this by using your IOLoop's add_callback method like so:
loop.add_callback(lambda: some_function(message))
Tornado will execute the callback in the next IOLoop pass, which may (I'd have to dig into Tornado's guts to know for sure, or alternatively test it) allow the request to complete before that code gets executed.
The drawback is that the long-running code you've written will still take time to execute, and this may end up blocking another request. That's not ideal if you have a lot of these requests coming in at once.
The more foolproof solution is to run it in a separate thread or process. The best way in Python is to use a process, due to the GIL (I'd highly recommend reading up on that if you're not familiar with it). However, on a single-processor machine the threaded implementation will work just fine, and may be simpler to implement.
If you're going the threaded route, you can build a nice "async executor" module with a mutex, a thread, and a queue. Check out the multiprocessing module if you want to go the route of using a separate process.
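A minimal sketch of such an "async executor" built from a thread and a queue (illustrative, not Tornado-specific):

```python
import queue
import threading

class AsyncExecutor:
    """One worker thread draining a queue of (func, args) tasks."""
    def __init__(self):
        self.tasks = queue.Queue()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def _run(self):
        while True:
            func, args = self.tasks.get()
            if func is None:  # sentinel: shut down
                break
            func(*args)
            self.tasks.task_done()

    def submit(self, func, *args):
        self.tasks.put((func, args))

    def shutdown(self):
        self.tasks.put((None, ()))
        self.worker.join()

handled = []
executor = AsyncExecutor()
# The request handler would call submit() and return immediately.
executor.submit(handled.append, 'message matched and sent')
executor.tasks.join()   # wait here only so we can observe the result
executor.shutdown()
print(handled)  # → ['message matched and sent']
```

In the handler, post() would submit the filter-matching work and write "success" without waiting for it.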
I've tried this, and I believe the request does not complete before the callbacks are called.
I think a dirty hack would be to call two levels of add_callback, e.g.:
def get(self):
    ...
    def _defered():
        ioloop.add_callback(<whatever you want>)
    ioloop.add_callback(_defered)
    ...
But these are hacks at best. I'm looking for a better solution right now, probably will end up with some message queue or simple thread solution.