I'm writing a web app. Users can post text, and I need to store the posts in my DB as well as sync them to a Twitter account.
The problem is that I'd like to respond to the user immediately after inserting the message into the DB, and run the "sync to Twitter" process in the background.
How could I do that? Thanks
Either you choose zrxq's solution, or you can do that with a thread, if you take care of two things:
you don't tamper with objects from the main thread (be careful with iterators),
you take good care of killing your thread once the job is done.
Something that would look like:
import threading

class TwitterThreadQueue(threading.Thread):
    queue = []  # class attribute, shared by all instances

    def run(self):
        # drain the queue, then let the thread finish
        while len(self.queue) != 0:
            post_on_twitter(self.queue.pop())  # here is your code to post on twitter

    def add_to_queue(self, msg):
        self.queue.append(msg)
and then you instantiate it in your code:
tweetQueue = TwitterThreadQueue()
# ...
tweetQueue.add_to_queue(message)
tweetQueue.start() # you can check if it's not already started
# ...
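A variation on the idea above, as a minimal sketch. It assumes the standard-library queue.Queue (which is thread-safe) in place of the plain list, and a single long-lived worker that is stopped with a sentinel value once the work is done. post_on_twitter is a hypothetical stand-in that only records the message; the real Twitter call would go there.

```python
import queue
import threading

_STOP = object()  # sentinel telling the worker to exit

posted = []  # stand-in "Twitter": records what would have been posted


def post_on_twitter(msg):
    # Hypothetical placeholder for the real API call.
    posted.append(msg)


class TwitterWorker(threading.Thread):
    def __init__(self):
        super().__init__(daemon=True)
        self.jobs = queue.Queue()  # thread-safe, per-instance queue

    def run(self):
        while True:
            msg = self.jobs.get()  # blocks until a job is available
            if msg is _STOP:
                break
            try:
                post_on_twitter(msg)
            except Exception as exc:  # keep the worker alive on failures
                print(f"posting failed: {exc!r}")

    def add_to_queue(self, msg):
        self.jobs.put(msg)

    def stop(self):
        # Queue the sentinel behind pending jobs, then wait for the thread.
        self.jobs.put(_STOP)
        self.join()


worker = TwitterWorker()
worker.start()                      # start once, e.g. at application startup
worker.add_to_queue("first post")   # returns immediately
worker.add_to_queue("second post")
worker.stop()                       # e.g. at application shutdown
```

Because the worker blocks on get(), it can be started once and reused for every request, which sidesteps the "is it already started" check.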
Related
I have a DRF application with a python queue that I'm writing tests for. Somehow,
My queue thread cannot find an object that exists in the test database.
The main thread cannot destroy the db as it's in use by 1 other session.
To explain the use case a bit further, I use Django's user model and have a table for metadata of files which you can upload. One of these fields is created_by, which is a ForeignKey to django.conf.settings.AUTH_USER_MODEL. As shown below, I create a user in the TestCase's setUp(), which I then use to create an entry in the Files table. The creation of this entry happens in a queue, however. During testing, this results in an error DETAIL: Key (created_by_id)=(4) is not present in table "auth_user"..
When the tests are completed, and the tearDown tries to destroy the test DB, I get another error DETAIL: There is 1 other session using the database.. The two seem related, and I'm probably handling the queue incorrectly.
The tests are written with Django's TestCase and run with python manage.py test.
from django.contrib.auth.models import User
from rest_framework.test import APIClient
from django.test import TestCase

class MyTest(TestCase):
    def setUp(self):
        self.client = APIClient()
        self.client.force_authenticate()
        user = User.objects.create_user('TestUser', 'test@test.test', 'testpass')
        self.client.force_authenticate(user)

    def test_failing(self):
        self.client.post('/totestapi', data={'files': [open('tmp.txt', 'rt')]})
The queue is defined in a separate file, app/queue.py.
from app.models import FileMeta
from queue import Queue
from threading import Thread

def queue_handler():
    while True:
        user, files = queue.get()
        for file in files:
            upload(file)
            FileMeta(user=user, filename=file.name).save()
        queue.task_done()

queue = Queue()
thread = Thread(target=queue_handler, daemon=True)

def start_upload_thread():
    thread.start()

def put_upload_thread(*args):
    queue.put(args)
Finally, the queue is started from app/views.py, which is always called when Django is started, and contains all the APIs.
from rest_framework.views import APIView
from app.queue import start_upload_thread, put_upload_thread

start_upload_thread()

class ToTestAPI(APIView):
    def post(self, request):
        put_upload_thread(request.user, request.FILES.getlist('files'))
Apologies that this is not a "real" answer but it was getting longer than a comment would allow.
The new ticket looks good. I did notice that there was no stopping of the background thread, as you did. That is probably what is causing the issue with the db still being active.
You use TestCase, which runs a db transaction and undoes all database changes when the test function ends. That means you won't be able to see data from the test case in another thread using a different connection to the database. You can see it inside your tests and views, since they share a connection.
Celery and RQ are the standard job queues - Celery is more flexible, but RQ is simpler. Start with RQ and keep things simple and isolated.
Some notes:
Pass in the PK of objects, not the whole object.
Read up on pickle if you do need to pass larger data.
Set the queues to async=False (run like normal code) in tests.
Queue consumers are a separate process running anywhere in the system, so data needs to get to them somehow. If you use full objects those need to be pickled, or serialized, and saved in the queue itself (i.e. redis) to be retrieved and processed. Just be careful and don't pass large objects this way - use the PK, store the file somewhere in S3 or another object storage, etc.
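As a rough illustration of "pass the PK, not the object", here is a sketch of a small job payload that contains only identifiers. UploadJob, user_id and storage_key are hypothetical names, and JSON is used purely for illustration; a real job queue may serialize arguments differently.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class UploadJob:
    user_id: int      # primary key of the user, not the User object
    storage_key: str  # where the file was stored, not the file contents


def encode_job(job: UploadJob) -> str:
    # What the producer (the view) would put on the queue.
    return json.dumps(asdict(job))


def decode_job(raw: str) -> UploadJob:
    # What the consumer would do before looking the user up by user_id.
    return UploadJob(**json.loads(raw))


raw = encode_job(UploadJob(user_id=4, storage_key="uploads/tmp.txt"))
job = decode_job(raw)
```

The consumer then loads the user by job.user_id through its own database connection, so nothing tied to the request or to the producer's connection crosses the queue.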
For Django-RQ I use this snippet to set the queues to sync mode when in testing, and then just run things as normal.
if IS_TESTING:
    for q in RQ_QUEUES.keys():
        RQ_QUEUES[q]['ASYNC'] = False
Good luck!
I am writing a Flask application and I am trying to add a multi-threaded implementation for certain server-related features. I noticed this weird behavior, so I wanted to understand why it is happening and how to solve it. I have the following code:
from flask import Blueprint
from flask_login import current_user, login_required
import threading

posts = Blueprint('posts', __name__)

@posts.route("/foo")
@login_required
def foo():
    print(current_user)
    thread = threading.Thread(target=goo)
    thread.start()
    thread.join()
    return

def goo():
    print(current_user)
    # ...
The main process correctly prints the current_user, while the child thread prints None.
User('Username1', 'email1@email.com', 'Username1-ProfilePic.jpg')
None
Why is it happening? How can I manage to obtain the current_user also in the child process? I tried passing it as argument of goo but I still get the same behavior.
I found this post but I can't understand how to ensure the context is not changing in this situation, so I tried providing a simpler example.
A partially working workaround
I also tried passing, as a parameter, a newly created User object populated with the data from current_user:
def foo():
    # ...
    user = User.query.filter_by(username=current_user.username).first_or_404()
    thread = threading.Thread(target=goo, args=[user])
    # ...

def goo(user):
    print(user)
    # ...
And it correctly prints the information of the current user. But since inside goo I am also performing database operations I get the following error:
RuntimeError: No application found. Either work inside a view function
or push an application context. See
http://flask-sqlalchemy.pocoo.org/contexts/.
So as I suspected I assume it's a problem of context.
I tried also inserting this inside goo as suggested by the error:
def goo():
    from myapp import create_app
    app = create_app()
    app.app_context().push()
    # ... database access
But I still get the same errors and if I try to print current_user I get None.
How can I pass the old context to the new thread? Or should I create a new one?
This is because Flask uses thread-local variables to store this for each request's thread. That simplifies things in many cases, but makes it hard to use multiple threads. See https://flask.palletsprojects.com/en/1.1.x/design/#thread-local.
If you want to use multiple threads to handle a single request, Flask might not be the best choice. You can always interact with Flask exclusively on the initial thread if you want and then forward anything you need on other threads back and forth yourself through a shared object of some kind. For database access on secondary threads, you can use a thread-safe database library with multiple threads as long as Flask isn't involved in its usage.
In summary, treat Flask as single threaded. Any extra threads shouldn't interact directly with Flask to avoid problems. You can also consider either not using threads at all and run everything sequentially or trying e.g. Tornado and asyncio for easier concurrency with coroutines depending on the needs.
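The effect can be reproduced without Flask. The sketch below uses the standard-library threading.local as a stand-in (an assumption for illustration; it is not Flask's own machinery) to show that a value stored per thread is invisible in a child thread, while plain data passed explicitly arrives as usual.

```python
import threading

request_state = threading.local()  # each thread sees its own attributes
seen = {}


def child_reads_thread_local():
    # Runs in a new thread: the attribute set by the parent is not visible.
    seen["thread_local"] = getattr(request_state, "user", None)


def child_gets_argument(username):
    # Plain data passed explicitly is available as usual.
    seen["argument"] = username


request_state.user = "Username1"  # set in the "request" thread

t1 = threading.Thread(target=child_reads_thread_local)
t2 = threading.Thread(target=child_gets_argument, args=(request_state.user,))
for t in (t1, t2):
    t.start()
    t.join()

print(seen)  # {'thread_local': None, 'argument': 'Username1'}
```

In the Flask case this suggests extracting plain values (for example the username or user id) in the request thread and handing those to the worker, rather than handing over current_user itself.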
Your server serves multiple users, which are threads by themselves.
flask_login was not designed for extra threading in it; that's why the child thread prints None.
I suggest you use the DB to pass variables from users, and run an additional Docker container if you need a separate process.
That is because current_user is implemented as a context-local proxy (werkzeug's LocalProxy):
https://github.com/maxcountryman/flask-login/blob/main/flask_login/utils.py#L26
Read:
https://werkzeug.palletsprojects.com/en/1.0.x/local/#module-werkzeug.local
I'm building a Telegram bot, and to start I used the structure from an example in the API wrapper. In the .py script there is an infinite loop that polls the Telegram API for new messages for the bot and processes each new message one by one.
while True:
    for update in bot.getUpdates(offset=LAST_UPDATE_ID, timeout=10):
        chat_id = update.message.chat.id
        update_id = update.update_id
        if update.message.text:
            # do things with the message \ start other functions and so on
            pass
What I foresee already is that some messages/requests will take longer to process, and other messages, even if they came in at the same time, will have to wait. For the user it will look like a delay in answering. This boils down to a simple dependency: more users chatting = more delay.
I was thinking this: can I have the main script bot.py run and check for new messages, and each time a message arrives, have it kickstart another script, answer.py, to process the message and reply?
And start as many of those answer.py scripts in parallel as needed.
I could also use bot.py to log all incoming messages into a DB with reference data about the user who sent each message, and then have another process handle all newly logged data and mark it as answered. But then it should also process the new entries in parallel.
I'm not a guru in Python and am asking for some ideas and guidance on how to approach this. Thank you!
What you need are threads, or some frameworks that can handle many requests asynchronously, e.g. Twisted, Tornado, or asyncio in Python 3.4.
Here is an implementation using threads:
import threading

def handle(message):
    ##### do your response here
    pass

offset = None

while True:
    for update in bot.getUpdates(offset=offset, timeout=10):
        if update.message.text:
            t = threading.Thread(target=handle, args=(update.message,))
            t.start()
        offset = update.update_id + 1
        ##### log the message if you want
This way, the call to handle() would not block, and the loop can go on handling the next message.
For more complicated situations, for example if you have to maintain states across messages from the same chat_id, I recommend taking a look at telepot, and this answer:
Handle multiple questions for Telegram bot in python
In short, telepot spawns threads for you, freeing you from worrying about the low-level details and letting you focus on the problem at hand.
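If one thread per message turns out to be too many under load, a bounded pool is a possible alternative. The following is only a sketch using the standard-library concurrent.futures; the handle function and the incoming list are hypothetical stand-ins for the real reply logic and for the updates returned by polling.

```python
import time
from concurrent.futures import ThreadPoolExecutor

replies = []


def handle(message):
    # Placeholder for slow per-message work (API calls, DB access, ...).
    time.sleep(0.1)
    replies.append(f"reply to {message}")


incoming = ["hi", "help", "status", "bye"]  # stand-in for polled updates

with ThreadPoolExecutor(max_workers=4) as pool:
    for message in incoming:
        pool.submit(handle, message)  # returns immediately, work runs in the pool
# Leaving the with-block waits for all submitted work to finish.
```

In a real bot the pool would be created once, outside the polling loop, and kept open for the lifetime of the process; max_workers caps how many messages are processed at the same time.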
I'm trying to solve a problem with sending mail (or any long task) in a web.py project. What I want is to start sending the mail and return the HTTP response right away, but the sending task takes a long time. Is there any solution?
Example:
import web

# some settings, urls, etc.

class Index:
    def GET(self):
        # task
        sending_mail()
        return 'response'
I found many examples of async tasks, but I think that if this task is put in the background and 'response' is returned, it will fail.
You could get away with sending email in a separate thread (you can spawn one when you need to send an email):
import threading
threading.Thread(target=sending_email).start()
However, the all-around best (and standard) solution would be to use an asynchronous task processor, such as Celery. In your web thread, simply create a new task, and Celery will asynchronously execute it.
There is no reason why "returning response" would fail when using a message queue, unless your response depends on the email being sent prior to sending the response (but in that case, you have an architectural problem).
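To make the thread-based variant concrete, here is a small self-contained sketch with no web framework involved. sending_email is a placeholder that blocks until the demo releases it, and handle_request is a hypothetical stand-in for the GET handler; the point is only that the response is produced while the email is still in flight.

```python
import threading

sent = threading.Event()      # set once the "email" has gone out
release = threading.Event()   # lets the demo control how long sending takes


def sending_email():
    # Placeholder for the slow mail call; blocks until released.
    release.wait(timeout=5)
    sent.set()


def handle_request():
    threading.Thread(target=sending_email, daemon=True).start()
    return "response"  # returned without waiting for the email


response = handle_request()
sent_before_response = sent.is_set()  # False: the email is still in flight
release.set()                         # let the background task finish
sent.wait(timeout=5)
```

A plain thread like this gives no retries and no visibility into failures, which is one reason the answers here recommend a task queue for anything beyond simple cases.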
Moving the sending_email() task to a background queue would be the best solution. This would allow you to return the response immediately and get the results of the sending_email task later on.
Let me also suggest taking a look at RQ.
It is a lightweight alternative to Celery that I find easier to get up and running. I have used it in the past for sending emails in the background and it didn't disappoint.
I'm writing code for a simple chat client in Python. I have the GUI and a PHP server to store strings and other data. I want my code to update the chat (the conversation Text field) every second.
I post a bit of pseudo-code:
Initialize Gui
Setup Users
UserX write messageX
messageX sent to server
At this point I need something that checks every second whether userX (which could be user1 or user2) has new messages to display.
If I put something like:
while True:
    time.sleep(1)
    checkAndDisplayNewMessages()
the GUI doesn't appear, because the mainloop() call at the end of the code is never reached.
To sum up, I want my code to let the user send and receive messages asynchronously: one part of the code sends messages whenever the user types one, and the other part constantly checks for new messages while the program runs.
You did not mention which GUI toolkit you are using; from mainloop() I guess it's Tk.
The answer to this question explains how to set up a recurring event. Multithreading is not required.
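A recurring event of that kind can be sketched roughly as follows, assuming the toolkit is indeed Tk. It relies on the widget method after(ms, callback), which schedules a one-shot call, so the callback re-arms itself. start_polling and check_and_display_new_messages are hypothetical names.

```python
def start_polling(widget, callback, interval_ms=1000):
    """Call callback every interval_ms using the toolkit's own timer."""
    def tick():
        callback()
        widget.after(interval_ms, tick)  # re-arm the one-shot timer
    widget.after(interval_ms, tick)


def main():
    import tkinter as tk

    root = tk.Tk()
    conversation = tk.Text(root)
    conversation.pack()

    def check_and_display_new_messages():
        # Hypothetical: fetch from the server and append to the Text widget.
        conversation.insert("end", "checked for new messages\n")

    start_polling(root, check_and_display_new_messages)
    root.mainloop()  # the timer fires from inside the main loop
```

Calling main() opens the window. The callback runs on the UI thread, so it should return quickly; a slow network request inside it would freeze the GUI for that long.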
You need to detach the way you fetch new messages from the main thread of your application. That can easily be done with threads in Python; it'd look something like this:
import threading
import time

def fetch_messages(ui):
    while not ui.ready():
        # this loop syncs this thread with the UI:
        # we don't want to start showing messages
        # until the UI is ready
        time.sleep(1)
    while True:
        time.sleep(1)
        checkAndDisplayNewMessages()

def mainlogic():
    thread_messages = threading.Thread(target=fetch_messages, args=(some_ui,))
    thread_messages.start()
    some_ui.show()  # here you can go ahead with your UI stuff
                    # while messages are fetched. This method should
                    # set the UI to ready.
This implementation runs the message-fetching loop in parallel with the UI. It is important that the UI is in sync with the fetching thread, otherwise you'd end up with funny exceptions. This is achieved by the first loop in the fetch_messages function.
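One way to reduce the risk of such exceptions is to keep all widget access on the UI thread and let the background thread only hand messages over through a queue. This is a sketch under that assumption (many GUI toolkits are not safe to call from a second thread); fetch and display are hypothetical callables supplied by the application.

```python
import queue
import threading

inbox = queue.Queue()  # messages fetched by the worker, waiting for the UI


def fetch_messages(fetch, stop_event, interval=1.0):
    # Background thread: talks to the server, never to the widgets.
    while not stop_event.is_set():
        for message in fetch():
            inbox.put(message)
        stop_event.wait(interval)  # sleep, but wake up early on stop


def drain_inbox(display):
    # UI thread: call periodically (for example from a timer callback).
    while True:
        try:
            message = inbox.get_nowait()
        except queue.Empty:
            return
        display(message)
```

The worker would be started with threading.Thread(target=fetch_messages, args=(fetch, stop_event)) and stopped by calling stop_event.set() when the window closes.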