I have a process that should accept requests from two different sources. What those requests are is not important; think of them as simple string messages. Requests from both sources are filed into a PriorityQueue, and the main process handles the requests in the queue. One of the two sources is a Telegram bot created with the python-telegram-bot package. Every source needs to run its own "event" loop to provide requests, so I want to launch them in separate threads.
The (pseudo) code to show the intention would read as follows:
queue = PriorityQueue()
handler = RequestHandler(queue)
telegramRequester = TelegramRequester(queue)
anotherRequester = SomeOtherSourceRequester(queue)
telegramRequester.start() # launches telegram bot polling/idle loop
anotherRequester.start() # launches the loop of another request source
handler.handleRequestsLoop() # launches the loop that handles incoming requests
The telegram bot and corresponding requester look something like this:
import threading
import telegram.ext

class Bot:
    def __init__(self):
        self._updater = telegram.ext.Updater("my api token", use_context=True)

    def run(self):
        self._updater.start_polling(drop_pending_updates=True)
        self._updater.idle()

    def otherFunctions(self):
        # like registering commands, command handlers etc.
        # I've got my bot working and tested as I want it.
        pass

class TelegramRequester:
    def __init__(self, queue: RequestQueue) -> None:
        self._queue: RequestQueue = queue
        self._bot = Bot()
        self._thread: threading.Thread = threading.Thread(target=self._bot.run)

    def start(self):
        if not self._thread.is_alive():
            self._thread.start()
However, when running this, I receive the following error message:
File "...\myscript.py", line 83, in run
self._updater.idle()
File "...\env\lib\site-packages\telegram\ext\updater.py", line 885, in idle
signal(sig, self._signal_handler)
File "C:\Program Files\aaaProgrammieren\Python\Python3_9\lib\signal.py", line 47, in signal
handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
ValueError: signal only works in main thread of the main interpreter
This is the first time I have used the Telegram API and the first time I have had multiple threads running in parallel. I also have no experience in web or network programming. Is it that simple: you shall not run a Telegram bot in a separate thread? Or is there something very simple I am missing that would make a construct like mine possible?
Just some general hints on the usage of PTB (v13.x) in this context:
Updater.idle() is intended to keep the main thread alive - nothing else. This is because Updater.start_polling starts a few background threads that do the actual work but don't prevent the main thread from ending. If you have multiple things going on in the main thread, you'll probably have a custom "keep alive" and "shutdown" logic, so you'll likely not need Updater.idle() at all. Instead, you can just call Updater.stop() when you want it to shut down.
Updater.idle() allows you to customize which signals it sets. You can pass an empty list to not set any signal handlers.
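To see why the signal handlers are the sticking point, here is a small standard-library sketch (no PTB involved) that makes the same kind of `signal.signal` call shown in the traceback, but from a worker thread. The function name `install_handler` and the `errors` list are only illustrative.

```python
import signal
import threading

errors = []


def install_handler():
    # Registering a signal handler is only allowed in the main thread,
    # so this call fails when it runs inside a worker thread.
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
    except ValueError as exc:
        errors.append(str(exc))


worker = threading.Thread(target=install_handler)
worker.start()
worker.join()

# The collected message is the ValueError from the question's traceback.
print(errors)
```

This is the restriction `Updater.idle()` runs into when it is called from a thread other than the main one, which is why either skipping `idle()` or passing it an empty list of signals avoids the error.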
Disclaimer: I'm currently the maintainer of PTB
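As a rough illustration of the custom "keep alive" and "shutdown" logic mentioned above, here is a standard-library sketch in which the main thread handles the queue and decides when everything stops. The `fake_source` function is a hypothetical stand-in for both request sources; with PTB v13 the Telegram side would instead call `Updater.start_polling()` (which starts its own background threads) and later `Updater.stop()`.

```python
import threading
import time
from queue import Empty, PriorityQueue


def fake_source(name, queue, stop_event, priority):
    # Hypothetical stand-in for a request source: files string
    # messages into the shared queue until asked to stop.
    n = 0
    while not stop_event.is_set():
        queue.put((priority, f"{name}-{n}"))
        n += 1
        time.sleep(0.01)


def handle_requests(queue, stop_event, max_requests):
    # Runs in the main thread and therefore keeps the process alive.
    handled = []
    while len(handled) < max_requests:
        try:
            _priority, message = queue.get(timeout=1)
        except Empty:
            break
        handled.append(message)
    # Custom shutdown logic: tell every source to finish.
    stop_event.set()
    return handled


queue = PriorityQueue()
stop_event = threading.Event()
sources = [
    threading.Thread(target=fake_source, args=("telegram", queue, stop_event, 1)),
    threading.Thread(target=fake_source, args=("other", queue, stop_event, 2)),
]
for thread in sources:
    thread.start()

handled = handle_requests(queue, stop_event, max_requests=5)

for thread in sources:
    thread.join()
```

Because the handler loop occupies the main thread, nothing like `Updater.idle()` is needed to keep the program running.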
I would like to send a message to the maintainer of the bot (a specific ID) whenever it initializes. The whole script is built on handlers, and I can't make it work with both methods of sending messages:
def main() -> None:
    """Start the bot."""
    # Create the Application and pass it your bot's token.
    application = Application.builder().token("TOKEN").build()
    application.bot.send_message("SOME ID","Hello! I am working.") # Problematic line
    # Handlers
    application.add_handler(CommandHandler("command", command_function))
    ...
    # on non command i.e message - echo the message on Telegram
    application.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, default_command))
    # Run the bot until the user presses Ctrl-C
    application.run_polling()
If I try to run the previous script, I get the following warning:
RuntimeWarning: coroutine 'ExtBot.send_message' was never awaited
application.bot.send_message("SOME ID","Hello! I am working.") # Problematic line
I can guess that the problem is between both methods of sending messages. But is there any way I can use both at the same time? (Or at least once, only when the bot initializes)
To run asyncio-code on startup, the Application(Builder) class provides the post_init hook. The documentation also includes an example of how to use this hook to call bot methods on startup.
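A minimal sketch of how that hook might be wired up, assuming python-telegram-bot v20 or later. `ADMIN_CHAT_ID` is a hypothetical placeholder for the maintainer's chat ID, and the handler registration from the question is left out.

```python
ADMIN_CHAT_ID = 123456789  # hypothetical placeholder, replace with the real chat ID


async def post_init(application):
    # Runs once during startup, inside the event loop,
    # so the coroutine can be awaited here.
    await application.bot.send_message(
        chat_id=ADMIN_CHAT_ID, text="Hello! I am working."
    )


def main():
    # Imported inside main() only so that the hook above can be
    # exercised without the library installed.
    from telegram.ext import Application

    application = (
        Application.builder().token("TOKEN").post_init(post_init).build()
    )
    # Handlers would be added here, as in the question.
    application.run_polling()
```

Calling `main()` starts the bot. The warning in the question appears because `send_message` is a coroutine function in this version, so calling it without `await` only creates a coroutine object and never sends anything.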
Disclaimer: I'm currently the maintainer of python-telegram-bot
I'm building an async library with aiohttp. The library has a single client that, on instantiation, creates a ClientSession and uses it to make requests to an API (it's a REST API wrapper).
The problem I'm facing is how to cleanly close the client session on exit.
If the session is not explicitly closed, a whole lot of errors come out, but I can't simply use context managers to close the session since I don't know when the program will end.
A typical use would be this:
import asyncio
from mylibrary import Client

client = Client()

async def main():
    await client.get_foo(...)
    await client.patch_bar(...)

asyncio.run(main())
I could add await client.close_session() to main, but I want to remove this responsibility from the end user, so ideally the client would automatically close the ClientSession when the program ends.
How can I do this?
I have tried using __del__ on the client to get the loop and close the session without success as well as using the atexit library, but it seems that by the time these run the asyncio loop has already been destroyed and I still get the warnings.
The specific error is:
Fatal error on SSL transport
protocol: <asyncio.sslproto.SSLProtocol object at 0x0000013ACFD54AF0>
transport: <_ProactorSocketTransport fd=1052 read=<_OverlappedFuture cancelled>>
I did some research on this error, and Google seems to think it's because I need to implement flow control. I have done that, however, and this error only occurs if I don't explicitly close the session.
Unfortunately, it seems like the only clean pattern that can apply there is to make your client itself an (async) context manager, and require that your users use it in an async with block.
The __del__ method could work in some cases - but it would require that code from your users would not "leak" the Client instance itself.
So the code is trivial, but the burden on your users is not zero:
class Client:
    ...
    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc_value, tb):
        await self.close_session()
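For illustration, this is roughly how that looks from the user's side. It is only a sketch: `FakeSession` is a hypothetical stand-in for `aiohttp.ClientSession` so the example runs without network access, and `get_foo` just returns a fixed value.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in for aiohttp.ClientSession."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


class Client:
    def __init__(self):
        self._session = FakeSession()

    async def close_session(self):
        await self._session.close()

    async def get_foo(self):
        # A real client would make a request through the session here.
        return "foo"

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc_value, tb):
        # Runs on normal exit and when an exception leaves the block.
        await self.close_session()


async def main():
    async with Client() as client:
        result = await client.get_foo()
    # The session has been closed by the time the block is left.
    return result, client._session.closed


result, closed = asyncio.run(main())
```

The session is closed while the event loop is still running, which is exactly what `__del__` and `atexit` cannot guarantee.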
Creating a pseudo-hook on loop.stop:
Another way, though not "clean" and not guaranteed to work, could be to decorate the running loop stop function to add a call to close_session.
If the user code just "halts" and does not tear down the loop properly, this can't help anyway - but I guess it might be an option for "well behaved" users.
The big problem here is that this is not documented - but taking a peek at asyncio internals, it looks like shutdown will always go through self.stop().
import asyncio

class ShutDownCb:
    def __init__(self, cb):
        self.cb = cb
        self.stopping = False
        # Needs a running loop, so create this from inside a coroutine
        loop = self.loop = asyncio.get_running_loop()
        self.original_stop = loop.stop
        loop.stop = self.new_stop

    async def _stop(self):
        # Wait for the cleanup callback to finish, then really stop the loop
        await self.task
        return self.original_stop()

    def new_stop(self):
        if not self.stopping:
            # First call: schedule the cleanup and defer the real stop
            self.stopping = True
            self.task = self.loop.create_task(self.cb())
            self.loop.create_task(self._stop())
            return
        return self.original_stop()

class Client:
    def __init__(self, ...):
        ...
        ShutDownCb(self.close_session)
I have a Flask server that accepts HTTP requests from a client. This HTTP server needs to delegate work to a third-party server using a websocket connection (for performance reasons).
I find it hard to wrap my head around how to create a permanent websocket connection that can stay open for HTTP requests. Sending requests to the websocket server in a run-once script works fine and looks like this:
async def send(websocket, payload):
    await websocket.send(json.dumps(payload).encode("utf-8"))

async def recv(websocket):
    data = await websocket.recv()
    return json.loads(data)

async def main(payload):
    uri = f"wss://the-third-party-server.com/xyz"
    async with websockets.connect(uri) as websocket:
        future = send(websocket, payload)
        future_r = recv(websocket)
        _, output = await asyncio.gather(future, future_r)
    return output

asyncio.get_event_loop().run_until_complete(main({...}))
Here, main() establishes a WSS connection and closes it when done, but how can I keep that connection open for incoming HTTP requests, such that I can call main() for each of those without re-establishing the WSS connection?
The main problem there is that when you code a web app responding to HTTP(S), your code has a "life cycle" that is very peculiar to that: usually you have a "view" function that will get the request data, perform all actions needed to gather the response data and return it.
This "view" function in most web frameworks has to be independent from the rest of the system: it should be able to perform its duty relying on no data or objects other than what it gets when called, namely the request data and system configuration. That independence lets the application server (the framework parts designed to actually connect your program to the internet) choose among a variety of ways to serve your program: it may run your view function in several parallel threads, in several parallel processes, or even in different processes in various containers or physical servers. Your application does not need to care about that.
If you want a resource that is available across calls to your view functions, you need to break out of this paradigm. For example, frameworks will typically want to create a pool of database connections, so that views in the same process can re-use those connections. These database connections are usually supplied by the framework itself, which implements a mechanism for allowing them to be reused and made available transparently, as needed. You have to recreate a mechanism of the same sort if you want to keep a websocket connection alive.
In a certain way, you need a Python object that can mediate your websocket data, behaving like a "server" for your web view functions.
That is simpler to do than it sounds: a special Python class designed to have a single instance per process, which keeps the connection and is able to send and receive data from parallel calls without mangling it, is enough. A callable that ensures this instance exists in the current process will then work under any strategy configured to serve your app to the web.
If you are using Flask, which does not use asyncio, you get a further complication: you will lose the async-ability inside your views, and they will have to wait for the websocket request to be completed. It will then be the job of your application server to run your view in different threads or processes to ensure availability. And it is your job to have the asyncio loop for your websocket running in a separate thread, so that it can make the requests it needs.
Here is some example code. Please note that apart from using a single websocket per process, this has no provisions in case of failure of any kind, but, most important: it does nothing in parallel. All pairs of send-recv are blocking, as you give no clue of a mechanism that would allow one to pair each outgoing message with its response.
import asyncio
import json
import threading
from queue import Queue

import websockets

class AWebSocket:
    instance = None

    def __new__(cls, *args, **kw):
        # Create the single per-process instance on first use
        if cls.instance is None:
            cls.instance = super().__new__(cls)
        return cls.instance

    def __init__(self, *args, **kw):
        # init will be called even if new finds the existing instance,
        # so we have to check again
        if getattr(self, "_initialized", False):
            return
        self._initialized = True
        self.outgoing = Queue()
        self.responses = Queue()
        self.socket_thread = threading.Thread(target=self.start_socket)
        self.socket_thread.start()

    def start_socket(self):
        # starts an async loop in a separate thread, and keeps
        # the web socket running in this separate thread
        asyncio.run(self.core())

    async def _send(self, websocket, payload):
        await websocket.send(json.dumps(payload).encode("utf-8"))

    async def _recv(self, websocket):
        data = await websocket.recv()
        return json.loads(data)

    async def core(self):
        uri = "wss://the-third-party-server.com/xyz"
        loop = asyncio.get_running_loop()
        async with websockets.connect(uri) as websocket:
            self.websocket = websocket
            while True:
                # This code is as you wrote it:
                # it essentially blocks until a message is sent
                # and the answer is received back.
                # You have to have a mechanism in your websocket
                # messages allowing you to identify the corresponding
                # answer to each request. On doing so, this is trivially
                # parallelizable simply by calling asyncio.create_task
                # instead of awaiting on asyncio.gather
                # Wait for the next payload in a worker thread so the
                # event loop itself is not blocked while the queue is empty
                payload = await loop.run_in_executor(None, self.outgoing.get)
                future = self._send(websocket, payload)
                future_r = self._recv(websocket)
                _, response = await asyncio.gather(future, future_r)
                self.responses.put(response)

    def send(self, payload):
        # This is the method you call from your views
        # simply do:
        # `output = AWebSocket().send(payload)`
        self.outgoing.put(payload)
        return self.responses.get()
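A variation on the same idea, sketched with only the standard library: keep one event loop running in a background thread and submit coroutines to it from synchronous code with `asyncio.run_coroutine_threadsafe`. This is a different mechanism from the two queues above, not a drop-in replacement. `LoopThread` and `fake_round_trip` are hypothetical names, and `fake_round_trip` merely stands in for a send/receive pair on the real websocket.

```python
import asyncio
import threading


class LoopThread:
    """Owns an event loop that runs forever in a background thread."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self.thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self.thread.start()

    def call(self, coro, timeout=5):
        # Schedule the coroutine on the background loop and block the
        # calling (synchronous) thread until its result is available.
        future = asyncio.run_coroutine_threadsafe(coro, self.loop)
        return future.result(timeout=timeout)

    def stop(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self.thread.join()
        self.loop.close()


async def fake_round_trip(payload):
    # Hypothetical stand-in for sending a payload and awaiting the reply.
    await asyncio.sleep(0.01)
    return {"echo": payload}


worker = LoopThread()
output = worker.call(fake_round_trip({"x": 1}))
worker.stop()
```

Each call gets its own future back, so a view cannot receive another view's result, although matching replies to requests on a single shared websocket still needs the identifying mechanism described above.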
I have a web endpoint for users to upload a file.
When the endpoint receives the request, I want to run a background job to process the file.
Since the job would take time to complete, I wish to return the job_id to the user to track the status of the request while the job is running in background.
I am wondering if asyncio would help in this case.
import asyncio

@asyncio.coroutine
def process_file(job_id, file_obj):
    <process the file and dump results in db>

@app.route('/file-upload', methods=['POST'])
def upload_file():
    job_id = uuid()
    process_file(job_id, request.files['file'])  # I want this call to be async without any await
    return jsonify({'message': 'Request received. Track the status using: ' + job_id})
With the above code, process_file method is never called. Not able to understand why.
I am not sure if this is the right way to do it though, please help if I am missing something.
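Flask 1.x doesn't support async views. (Flask 2.0 later added them, but a view still has to finish before its response is sent, so that alone does not give you background jobs.)
To create and execute heavy tasks in the background you can use the Celery library: https://flask.palletsprojects.com/en/1.1.x/patterns/celery/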
You can use this for reference:
Making an asynchronous task in Flask
Official documentation:
http://docs.celeryproject.org/en/latest/getting-started/first-steps-with-celery.html#installing-celery
Even though you decorated the function with @asyncio.coroutine, it is never awaited, and awaiting (or scheduling) is what makes a coroutine actually run and return its result.
asyncio is not a good fit for this kind of task, because the processing is blocking work. It is usually used for calls that spend their time waiting on I/O and return results fast.
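To make the "return a job_id now, process later" flow concrete, here is a small sketch using only the standard library. It is not Celery: the thread pool and the in-memory `jobs` dict are assumptions that only hold inside a single server process, and `process_file` is a placeholder that just returns the size of the upload.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
jobs = {}  # job_id -> Future; lost on restart, not shared between processes


def process_file(data):
    # Placeholder for the slow processing and database writes.
    time.sleep(0.05)
    return len(data)


def submit_job(data):
    # What the upload view would do: start the work, return immediately.
    job_id = str(uuid.uuid4())
    jobs[job_id] = executor.submit(process_file, data)
    return job_id


def job_status(job_id):
    # What a status endpoint would report for a given job_id.
    future = jobs[job_id]
    if not future.done():
        return {"status": "running"}
    if future.exception() is not None:
        return {"status": "failed"}
    return {"status": "done", "result": future.result()}


job_id = submit_job(b"hello")
first = job_status(job_id)
jobs[job_id].result()  # only for this demo: wait until the job finishes
final = job_status(job_id)
executor.shutdown()
```

A task queue such as Celery gives the same shape (submit, get an ID, poll for status) while also surviving restarts and working across several worker processes.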
As @py_dude mentioned, Flask does not support async calls. If you are looking for a library that functions and feels similar to Flask but is asynchronous, I recommend checking out Sanic. Here is some sample code:
from sanic import Sanic
from sanic.response import json

app = Sanic("example")  # recent Sanic versions require an app name

@app.route("/")
async def test(request):
    return json({"hello": "world"})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
Updating your database asynchronously shouldn't be an issue; refer to here to find asyncio-supported database drivers. For processing your file, check out aiohttp. You can run your server extremely fast on a single thread without any hiccup if you do so asynchronously.
We have a single-threaded Flask server that receives requests from a Python client app. In this Flask server we use RabbitMQ with the pika library to distribute messages to other clients.
What is happening is that in the get function the program is crashing with the error:
pika.exceptions.ConnectionClosed: (505, 'UNEXPECTED_FRAME - expected
content header for class 60, got non content header frame instead')
I've searched a lot of topics about this on Stack Overflow and elsewhere, but they all address problems with multithreading, which is not the case here. Flask should only serve with one thread unless it is started with app.run(threaded=True).
The program normally crashes when multiple messages are sent in a short interval (e.g. 5 per second) and it's also important to note that messages are being received every second with a request to this function:
@app.route('/api/users/getMessages', methods=['POST'])
def get_Messages():
    data = json.loads(request.data)
    token = data['token']
    payload = jwt.decode(token, 'SECRET', algorithms=['HS256'])
    istid = payload['istid']
    print('istid: '+istid)
    messages = []
    queue = channel.queue_declare(queue=istid)
    for i in range(queue.method.message_count):
        method_frame, header_frame, body = channel.basic_get(queue=istid, no_ack=True)
        if method_frame:
            #print(method_frame, header_frame, body)
            messages.append(body)
        else:
            print('No message returned')
    res = {'messages':messages, 'error':0}
    return jsonify(res)
In this code it crashes normally in the line:
queue = channel.queue_declare(queue=istid)
But we also tried to change the code to use a while instead of a for where it ends when the body is None and it crashes in the line:
method_frame, header_frame, body = channel.basic_get(queue=istid, no_ack=True)
in that case.
Also important: the crashes are random. It can work a few times and then crash after a get request while messages are being sent. If anyone knows anything related to this, we would appreciate any help.
Another note: we thought about using basic_consume with a callback instead of basic_get, but we didn't find a way in which this would work, since we have to send the messages back and have several users making requests to this same function.
EDIT #1:
In the RabbitMQ docs, if you search for the function "def basic_get", you will notice there are some TODO comments and also this note:
Due to implementation details, this cannot be called a second time
until the callback is executed.
So I suspected that this could be what was happening but even if it is I don't know how could it be solved.
For anyone interested in the solution: as noted in the other comments, the program was not thread safe, since Flask as of version 1.0 uses threaded=True by default.
The solution is either:
1) running Flask with app.run(threaded=False)
2) making the program thread safe by taking a lock whenever the pika channel/connection is accessed.
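A sketch of what option 2 could look like, using only the standard library. `FakeChannel` is a hypothetical stand-in for the shared pika channel that merely counts overlapping calls, and `get_messages` plays the role of the view function.

```python
import threading


class FakeChannel:
    """Hypothetical stand-in for a pika channel; records overlapping calls."""

    def __init__(self):
        self.active = 0
        self.overlaps = 0
        self.calls = 0

    def basic_get(self, queue):
        self.active += 1
        if self.active > 1:
            self.overlaps += 1
        self.calls += 1
        self.active -= 1
        return None


channel = FakeChannel()
channel_lock = threading.Lock()


def get_messages(queue_name):
    # Every access to the shared channel goes through the same lock,
    # so only one request thread talks to it at a time.
    with channel_lock:
        return channel.basic_get(queue=queue_name)


threads = [
    threading.Thread(target=get_messages, args=("user",)) for _ in range(20)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

In the real view, the whole sequence of `queue_declare` and the `basic_get` loop would sit inside the same `with channel_lock:` block, so that frames from two requests can never interleave on the connection.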