So, I'm trying to write a queue in Python for load testing and I'm stuck. What I have:
A REST API with authentication;
A POST route for sending a request;
The requests package for sending requests;
An array of user emails that also contains their user_id.
After authentication I get a user_id and an access_token. I can save them in a dictionary or a list, but then I need to use them when sending requests: the user_id is needed for the route, the access_token for the checksum calculation. I don't see a clean way to wire these together. I thought about two for loops, but then I need a way to save the access_token and user_id for all of my users in a dictionary and reuse them, and I'm not sure about that approach either.
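For the storage part specifically, here is a minimal sketch of the dictionary idea (not from the original post; it assumes ApiAuthorization.get_token_generated(email) returns a (user_id, access_token) pair):

import concurrent.futures

def authorize_all(emails):
    # phase 1: authenticate every user first and remember both pieces per email
    credentials = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=100) as pool:
        futures = {pool.submit(ApiAuthorization.get_token_generated, email): email for email in emails}
        for future in concurrent.futures.as_completed(futures):
            # assumption: get_token_generated(email) returns (user_id, access_token)
            user_id, access_token = future.result()
            credentials[futures[future]] = {"user_id": user_id, "access_token": access_token}
    return credentials

# phase 2 (the timed load test) can then iterate over credentials.items(),
# using user_id for the route and access_token for the checksum,
# so authentication time never counts against the measured send phase.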
I tried to do it with multithreading. For example:
import concurrent.futures
import datetime
import queue
import time

q = queue.Queue()

def run_with_queue_api_authorize():
    # authenticate all users and put the finished auth futures on the queue
    with concurrent.futures.ThreadPoolExecutor(max_workers=100) as pool:
        auth = [pool.submit(ApiAuthorization.get_token_generated, email) for email in ApiVariables.users]
        for r in concurrent.futures.as_completed(auth):
            q.put(r)
    print(q.qsize())
    return q

def run_with_queue_send_activity():
    start_time = int(datetime.datetime.now().timestamp())
    run_with_queue_api_authorize()
    end_time = int(datetime.datetime.now().timestamp())
    print(f"Execution time for auth: {end_time - start_time}, start time is {start_time}, end_time is {end_time}")

    time.sleep(10)

    start_time = int(datetime.datetime.now().timestamp())
    with concurrent.futures.ThreadPoolExecutor(max_workers=100) as pool:
        while not q.empty():
            task = q.get()
            if task.done():
                # hand each finished auth future to the send routine
                pool.submit(MultiQueue.multi_post_user_activity, task)
    end_time = int(datetime.datetime.now().timestamp())
    print(f"Execution time for post user activity: {end_time - start_time}, start time is {start_time}, end_time is {end_time}")
When I run this code I can send my test data, but only for one user out of the 3000 threads; the other 2999 fail. I don't understand why it doesn't work.
I also tried to create a queue with the threading package, but it finishes immediately without any output. I think the ThreadPoolExecutor solution is more reliable and more feasible.
I could take a user_id from a separate array, but I would still need to get an access_token for that user, so that doesn't look workable.
Why do I need this? Because I need to measure the number of requests per second. But when I do authorization in the threads, I'm also sending data at the same time asynchronously, so the total time for my tests includes the authorization time.
How can I resolve this problem? I've watched some videos about multithreading and read a lot about the subject, but I can't apply it to my case.
I'd be grateful for any advice.
Related
I am trying to set up a FastAPI app doing the following:
Accept messages as post requests and put them in a queue;
A background job is, from time to time, pulling messages (up to a certain batch size) from the queue, processing them in a batch, and storing results in a dictionary;
The app is retrieving results from the dictionary and sending them back "as soon as" they are done.
To do so, I've set up a background job with apscheduler communicating via a queue trying to make a simplified version of this post: https://levelup.gitconnected.com/fastapi-how-to-process-incoming-requests-in-batches-b384a1406ec. Here is the code of my app:
import queue
import uuid
from asyncio import sleep

import uvicorn
from pydantic import BaseModel
from fastapi import FastAPI
from apscheduler.schedulers.asyncio import AsyncIOScheduler

app = FastAPI()
app.input_queue = queue.Queue()
app.output_dict = {}
app.queue_limit = 2

def upper_messages():
    # pull up to queue_limit messages from the queue and process them as a batch
    for i in range(app.queue_limit):
        try:
            obj = app.input_queue.get_nowait()
            app.output_dict[obj['request_id']] = obj['text'].upper()
        except queue.Empty:
            pass

app.scheduler = AsyncIOScheduler()
app.scheduler.add_job(upper_messages, 'interval', seconds=5)
app.scheduler.start()

async def get_result(request_id):
    # poll the output dictionary until the result for this request shows up
    while True:
        if request_id in app.output_dict:
            result = app.output_dict[request_id]
            del app.output_dict[request_id]
            return result
        await sleep(0.001)

class Payload(BaseModel):
    text: str

@app.post('/upper')
async def upper(payload: Payload):
    request_id = str(uuid.uuid4())
    app.input_queue.put({'text': payload.text, 'request_id': request_id})
    return await get_result(request_id)

if __name__ == "__main__":
    uvicorn.run(app)
however it's not really running asynchronously; if I invoke the following test script:
from time import time
import requests

texts = [
    'text1',
    'text2',
    'text3',
    'text4'
]

time_start = time()
for text in texts:
    result = requests.post('http://127.0.0.1:8000/upper', json={'text': text})
    print(result.text, time() - time_start)
the messages do get processed, but the whole processing takes 15-20 seconds, the output being something like:
"TEXT1" 2.961090087890625
"TEXT2" 7.96642279624939
"TEXT3" 12.962305784225464
"TEXT4" 17.96261429786682
I was instead expecting the whole processing to take 5-10 seconds (after less than 5 seconds the first two messages should be processed, and the other two more or less exactly 5 seconds later). It seems instead that the second message is not being put to the queue until the first one is processed - i.e. the same as if I were just using a single thread.
Questions:
Does anyone know how to modify the code above so that all the incoming messages are put to the queue immediately upon receiving them?
[bonus question 1]: The above holds true if I run the script (say, debug_app.py) from the command line via uvicorn debug_app:app. But if I run it with python3 debug_app.py no message is returned at all. Messages are received (doing CTRL+C results in Waiting for connections to close. (CTRL+C to force quit)) but never processed.
[bonus question 2]: Another thing I don't understand is why, if I remove the line await sleep(0.001) inside the definition of get_result, the behaviour gets even worse: no matter what I do, the app freezes, I cannot terminate it (i.e. neither CTRL+C nor kill work), I have to send a sigkill (kill -9) to stop it.
Background
If you are wondering why I am doing this: as in the blog post linked above, the purpose is to do efficient deep learning inference. The model I have takes (roughly) the same time to process one request or a dozen requests at once, so batching can dramatically increase throughput. I first tried setting up a FastAPI frontend + RabbitMQ + Flask backend pipeline, and it worked, but the overhead of the complicated setup (and/or my inability to work with it) outweighed the time it took to compute the model itself, nullifying the gain... so I'm first trying to get a minimalistic version to work. The upper_messages method in this toy example will become either a direct invocation of the model (if this computationally heavier step does not block incoming connections too much) or an async call to another process actually doing the computations - I'll see about that later...
... after looking into it more carefully, it turns out the application was indeed working as I wanted it to; my error was in the way I tested it...
Indeed, when sending a POST request to the uvicorn server, the client is left waiting for an answer to come back - which is the intended behaviour. Of course, this also means that the next request is not sent until the first answer is collected. So the server is not batching anything, because there is nothing to batch!
To test this correctly, I slightly altered the test.py script to:
from time import time
import requests
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('prefix')
args = parser.parse_args()

texts = [
    'text1',
    'text2',
    'text3',
    'text4'
]
texts = [args.prefix + '_' + t for t in texts]

time_start = time()
for text in texts:
    result = requests.post('http://127.0.0.1:8000/upper', json={'text': text})
    print(result.text, time() - time_start)
And run it in multiple processes via:
python3 test.py user1 & python3 test.py user2 & python3 test.py user3
The output is now as expected, with pairs of messages (from different users!) being processed in a batch (and the exact order is a bit randomized, although the same user gets, of course, answers in the order of the requests it made):
"USER1_TEXT1" 4.340522766113281
"USER3_TEXT1" 4.340718030929565
"USER2_TEXT1" 9.334393978118896
"USER1_TEXT2" 9.340892553329468
"USER3_TEXT2" 14.33926010131836
"USER2_TEXT2" 14.334421396255493
"USER1_TEXT3" 19.339791774749756
"USER3_TEXT3" 19.33999013900757
"USER1_TEXT4" 24.33989715576172
"USER2_TEXT3" 24.334784030914307
"USER3_TEXT4" 29.338693857192993
"USER2_TEXT4" 29.333901166915894
I'm leaving the question open (and not accepting my own answer) because for the "bonus questions" above (about the application becoming frozen) I still don't have an answer.
I'm having trouble coming up with an architecture that will solve the following problem:
I have a web application (producer) that receives some data on request. I also have a number of processes (consumers) that should process this data. One request generates one batch of data, and each batch should be processed by only one consumer.
My current solution consists of receiving the data, caching it in memory with Redis, and sending a message through a message channel that data has been written, while the consumers listen on the same channel and then process the data. The issue here is that I need to stop multiple consumers from working on the same data. So how can I inform the other consumers that I have started working on this task?
Producer code (Flask endpoint):
data = request.get_json()
db = redis.Redis(connection_pool=pool)
db.set(data["externalId"], data)
# Subscribe to the batches channel and publish the id
db.pubsub()
db.publish('batches', request_key)
results = None
result_key = str(data["externalId"])
# Wait till the batch is processed
while results is None:
    results = db.get(result_key)
if results is not None:
    results = results.decode('utf8')
db.delete(data["externalId"])
db.delete(result_key)
Consumer:
db = redis.Redis(connection_pool=pool)
channel = db.pubsub()
channel.subscribe('batches')
while True:
    try:
        message = channel.get_message()
        message_data = bytes(message['data']).decode('utf8')
        external_id = message_data.split('-')[-1]
        data = json.loads(db.get(external_id).decode('utf8'))
        result = DataProcessor.process(data)
        db.set(str(external_id), result)
    except Exception:
        pass
PUBSUB is often problematic for task queuing for exactly this reason. From the docs (https://redis.io/topics/pubsub):
SUBSCRIBE, UNSUBSCRIBE and PUBLISH implement the Publish/Subscribe messaging paradigm where (citing Wikipedia) senders (publishers) are not programmed to send their messages to specific receivers (subscribers). Rather, published messages are characterized into channels, without knowledge of what (if any) subscribers there may be.
A popular alternative to consider would be to implement "publish" by pushing an element to the end of a Redis list, and "subscribe" by having your worker poll that list at some interval (exponential backoff is often an appropriate choice). In order to avoid cases where multiple workers get the same job, use lpop to get and remove an element from the list. Redis is single-threaded, so you're guaranteed only one worker will receive each element.
So, on the publish side, aim for something like this:
db = redis.Redis(connection_pool=pool)
db.rpush("my_queue", task_payload)
And on the subscribe side, you can safely run a loop like this in parallel as many times as you need:
while True:
    db = redis.Redis(connection_pool=pool)
    payload = db.lpop("my_queue")
    if not payload:
        continue
    < deserialize and process payload here >
Note this is a first-in-first-out (FIFO) queue, since we're pushing onto the right side with rpush and popping off the left with lpop. You can get the last-in-first-out (LIFO) version trivially by using lpush/lpop (or rpush/rpop) instead.
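As a variation on the polling loop above (not part of the original answer), redis-py's blpop blocks until an element is available, which removes the need for a sleep/backoff loop while keeping the one-worker-per-element guarantee; a minimal sketch:

import redis

pool = redis.ConnectionPool(host="localhost", port=6379, db=0)
db = redis.Redis(connection_pool=pool)

while True:
    # blpop blocks until an element arrives (or the timeout expires) and
    # still delivers each element to exactly one worker
    item = db.blpop("my_queue", timeout=5)
    if item is None:
        continue  # timed out; loop and wait again
    _key, payload = item
    # deserialize and process payload here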
I am new to working with message exchanges and ran into problems finding a proper manual for this task.
I need to organize a pool of queues so that:
A producer creates a random empty queue and writes a whole pack of messages into it (usually 100 messages).
A consumer finds a non-empty, non-locked queue, reads from it until it is empty, then deletes it and looks for the next one.
So my task is to work with messages as packs. I understand how to produce and consume using the same key in one queue, but I can't find how to work with a pool of queues.
Several producers and consumers can run in parallel, and it doesn't matter which of them sends to whom; we don't need to (and in fact can't) link a particular producer to a particular consumer.
General task: we have a lot of clients that receive push notifications. We group pushes by some parameters to process them later as a group, so each group should live in one RabbitMQ queue and be produced and consumed as a group, while each group is independent of the others.
Big thanks to Hannu for the help. The key idea of his simple and robust solution is that we can have one persistent queue with a known name, where producers write the names of the queues they create and consumers read those names from.
To make his solution more readable and easier to work with in my own task, I split publish_data() in the producer into two functions: one creates a random queue and writes its name to control_queue, and the other receives this random queue name and fills the queue with messages. The same idea works for the consumer: one function processes a queue, another is called to process each message itself. A sketch of that split is below.
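A minimal sketch of that split (hypothetical function names, assuming kombu's SimpleQueue as in the answer below; the queue name is announced on the control queue only after the messages are written, matching the original flow so a consumer never sees a half-filled queue):

import uuid
from kombu import Connection

def fill_random_queue(conn, messages):
    # create a queue with a random name and write the whole pack of messages into it
    random_name = "q" + uuid.uuid4().hex
    data_queue = conn.SimpleQueue(random_name)
    for message in messages:
        data_queue.put(message)
    data_queue.close()
    return random_name

def announce_queue(control_queue, queue_name):
    # tell consumers, via the control queue with the known name, that a filled queue is ready
    control_queue.put(queue_name)

with Connection('amqp://guest:guest@localhost:5672//') as conn:
    control_queue = conn.SimpleQueue('control_queue')
    name = fill_random_queue(conn, range(42))
    announce_queue(control_queue, name)
    control_queue.close()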
I have done something like this, but with Pika. I had to clean up and kombu-fy an old code snippet for the examples. It is probably not very kombu-ish (this is absolutely my first code snippet written with it), but this is how I would solve it. Basically, I would set up a control queue with a known name.
Publishers create a random queue name for a pack of messages, dump N messages into it (in my case the numbers 1-42) and then post the queue name to the control queue. A consumer then receives this queue name, binds to it, reads messages until the queue is empty and then deletes the queue.
This keeps things relatively simple, as publishers do not need to figure out where they are allowed to publish their groups of data (every queue is new, with a random name). Receivers do not need to worry about timeouts or "all done" messages, as a receiver receives a queue name only when a group of data has been written to the queue and every message is already there waiting.
There is also no need to tinker with locks or signalling or anything else that would complicate things. You can have as many consumers and producers as you want. And of course using exchanges and routing keys there could be different sets of consumers for different tasks etc.
Publisher
from kombu import Connection
import uuid
from time import sleep

def publish_data(conn):
    # create a queue with a random name and dump a pack of messages into it
    random_name = "q" + str(uuid.uuid4()).replace("-", "")
    random_queue = conn.SimpleQueue(random_name)
    for i in range(0, 42):
        random_queue.put(i)
    random_queue.close()
    return random_name

with Connection('amqp://guest:guest@localhost:5672//') as conn:
    control_queue = conn.SimpleQueue('control_queue')
    _a = 0
    while True:
        y_name = publish_data(conn)
        message = y_name
        # announce the freshly filled queue on the control queue
        control_queue.put(message)
        print('Sent: {0}'.format(message))
        _a += 1
        sleep(0.3)
        if _a > 20:
            break
    control_queue.close()
Consumer
from queue import Empty
from kombu import Connection, Queue

def process_msg(foo):
    print(str(foo))
    with Connection("amqp://guest:guest@localhost:5672//") as _conn:
        sub_queue = _conn.SimpleQueue(str(foo))
        # drain the named queue until it is empty
        while True:
            try:
                _msg = sub_queue.get(block=False)
                print(_msg.payload)
                _msg.ack()
            except Empty:
                break
        sub_queue.close()
        # delete the now-empty queue
        chan = _conn.channel()
        dq = Queue(name=str(foo), exchange="")
        bdq = dq(chan)
        bdq.delete()

with Connection('amqp://guest:guest@localhost:5672//') as conn:
    rec = conn.SimpleQueue('control_queue')
    while True:
        msg = rec.get(block=True)
        entry = msg.payload
        msg.ack()
        process_msg(entry)
I have a function get_data(request) that requests some data from a server. Every time this function is called, it requests the data from a different server. All of them should return the same response.
I would like to get the response as soon as possible, so I need to create a function that calls get_data several times and returns the first response it gets.
EDIT:
I came up with the idea of using multiprocessing.Pipe(), but I have the feeling this is a very bad way to solve it. What do you think?
from multiprocessing import Pipe
from threading import Thread

def get_data(request, pipe):
    data = ...  # makes the request to a server; this can take a random amount of time
    pipe.send(data)

def multiple_requests(request, num_servers):
    my_pipe, his_pipe = Pipe()
    for i in range(num_servers):
        Thread(target=get_data, args=(request, his_pipe)).start()
    # return whatever answer arrives first on the pipe
    return my_pipe.recv()

multiple_requests("the_request_string", 6)
I think this is a bad way of doing it because I'm passing the same pipe to all threads, and I don't know for sure, but I suspect that has to be very unsafe.
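As a point of comparison (an editorial sketch, not from the original post), the same "first response wins" behaviour can be had with concurrent.futures, which avoids sharing a pipe between threads; here get_data is assumed to take only the request and return the data:

import concurrent.futures

def multiple_requests(request, num_servers):
    # fire the same request at several servers and return whichever answer arrives first
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=num_servers)
    futures = [pool.submit(get_data, request) for _ in range(num_servers)]
    done, _pending = concurrent.futures.wait(
        futures, return_when=concurrent.futures.FIRST_COMPLETED)
    # don't wait for the slower servers; their threads finish in the background
    pool.shutdown(wait=False)
    return next(iter(done)).result()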
I think Redis RQ would be a good fit for this: get_data is a job that you put in the queue six times. Jobs execute asynchronously, and the docs also explain how to work with the results.
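A rough sketch of that idea (not from the original answer; it assumes a local Redis, that get_data takes just the request and returns the data, and that get_data lives in a module an RQ worker can import):

import time
from redis import Redis
from rq import Queue

def first_response(request, num_servers=6):
    q = Queue(connection=Redis())
    # enqueue the same job six times; each call hits a different server
    jobs = [q.enqueue(get_data, request) for _ in range(num_servers)]
    # poll the jobs and return the first finished result
    while True:
        for job in jobs:
            if job.is_finished:
                return job.result
        time.sleep(0.1)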
I have a python app where user can initiate a certain task.
The whole purpose of a task is to execute a given number of POST/GET requests to a given URL at a particular interval.
So the user gives N - the number of requests, and V - the number of requests per second.
What is the best way to design such a task, taking into account that, due to I/O latency, the actual requests-per-second rate could be bigger or smaller?
First of all, I decided to use Celery with Eventlet, because otherwise I would need dozens of workers, which is not acceptable.
My naive approach:
Client starts a task using task.delay()
Inside task I do something like this:
@task
def task(number_of_requests, time_period):
    for _ in range(number_of_requests):
        start = time.time()
        params_for_concrete_subtask = ...
        # .... do some IO with the monkey-patched eventlet requests library
        elapsed = time.time() - start
        # If we completed this subtask too fast
        if elapsed < time_period / number_of_requests:
            eventlet.sleep(time_period / number_of_requests)
A working example is here.
If we are too fast, we wait to keep the desired speed. If we are too slow, that's OK from the client's perspective; we do not violate the requests-per-second requirement. But will this resume correctly if I restart Celery?
I think this should work, but I thought there might be a better way.
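As an aside on the pacing logic above (an editorial sketch, not from the original post): sleeping the full interval whenever a subtask finishes early overshoots the target period, because the subtask's own elapsed time is added on top; sleeping only the remainder keeps the rate closer to N requests over time_period:

import time
import eventlet

def paced_requests(number_of_requests, time_period, do_request):
    # target interval between consecutive requests
    interval = time_period / number_of_requests
    for _ in range(number_of_requests):
        start = time.time()
        do_request()  # the actual (monkey-patched) I/O call, supplied by the caller
        elapsed = time.time() - start
        # sleep only for whatever is left of the interval, if anything
        eventlet.sleep(max(0.0, interval - elapsed))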
In Celery I can define a task with a particular rate limit, which almost gives me the guarantee I need. So I could use Celery's group feature and write:
@task(rate_limit=...)
def task(...):
    # ...

task_executor = task.s(number_of_requests, time_period)
group(task_executor(params_for_concrete_task) for params_for_concrete_task in ...).delay()
But here I hardcode the rate_limit, which should be dynamic, and I do not see a way of changing it. I saw an example:
task.s(....).set(... params ...)
But when I tried to pass rate_limit to the set method, it did not work.
Another, maybe better, idea was to use Celery's periodic task scheduler. With the default implementation, the periods and the tasks to be executed periodically are fixed.
I need to be able to dynamically create tasks that run periodically a given number of times with a specific rate limit. Maybe I need to run my own scheduler that takes tasks from a DB? But I do not see any documentation around this.
Another approach was to try to use the chain function, but I could not figure out whether there is a delay-between-tasks parameter.
If you want to adjust the rate_limit dynamically, you can do it using the following code. It also creates the chain() at runtime.
Run this and you will see that we successfully override the rate_limit from 5/sec to 0.5/sec.
test_tasks.py
from celery import Celery, signature, chain
import datetime as dt

app = Celery('test_tasks')
app.config_from_object('celery_config')

@app.task(bind=True, rate_limit=5)
def test_1(self):
    print(dt.datetime.now())

# dynamically lower the rate limit on the workers from 5/sec to 0.5/sec
app.control.broadcast('rate_limit',
                      arguments={'task_name': 'test_tasks.test_1',
                                 'rate_limit': 0.5})

test_task = signature('test_tasks.test_1').set(immutable=True)
l = [test_task] * 100
task_chain = chain(*l)
res = task_chain()
I also tried to override the attribute from within the class, but IMO the rate_limit is set when the task is registered by the worker, which is why .set() has no effect. I'm speculating here; one would have to check the source code.
Solution 2
Implement your own waiting mechanism using the end time of the previous call; in a chain, the return value of one task is passed to the next one.
So it would look like this:
from celery import Celery, signature, chain
import datetime as dt
import time

app = Celery('test_tasks')
app.config_from_object('celery_config')

@app.task(bind=True)
def test_1(self, prev_endtime=dt.datetime.now(), wait_seconds=5):
    wait = dt.timedelta(seconds=wait_seconds)
    print(dt.datetime.now() - prev_endtime)
    # sleep only for whatever is left of the interval since the previous task ended
    wait = wait - (dt.datetime.now() - prev_endtime)
    wait = wait.total_seconds()
    print(wait)
    time.sleep(max(0, wait))
    now = dt.datetime.now()
    print(now)
    return now

# app.control.rate_limit('test_tasks.test_1', '0.5')

test_task = signature('test_tasks.test_1')
l = [test_task] * 100
task_chain = chain(*l)
res = task_chain()
I think this is actually more reliable than the broadcast.