gRPC response streaming with repeated field in Python

I am currently designing an API that is supposed to handle relatively small messages but many data entries. There are operations to add, delete and list all items stored in a database.
Now to my question: I want to return all entries (up to 5 Million) in a short amount of time. I figured response streaming would be the way to go.
Does it make sense to stream messages with a repeated field, so that each message returns multiple entries? So far I haven't seen any indication of whether that is faster or not.
Example:
rpc ListDataSet (ListDataSetRequest) returns (stream ListDataSetResponse);

message ListDataSetResponse {
  string transaction_id = 1;
  repeated Entries entries = 2;
}
On the server I would append a certain number of entries to each message and yield the messages from a generator while looping over the list of entries.
Any recommendations or tips would be appreciated.

Yes, it makes sense to stream messages containing repeated fields.
From a performance perspective, you may want to benchmark the alternatives to prove this to yourself.
gRPC doesn't publish comprehensive best practices, but smaller messages are generally recommended, and 4 MiB is often cited as a good, notional upper bound per message.
Another thing to bear in mind is that it's not just the performance of your servers that matters, but that of your clients too.
A more common pattern is arguably to page large results and give the client control over requesting the next (or another) page. This may be worth evaluating too.
For exceptionally "huge" (unspecified) results, you'd likely be better off returning, in your gRPC message, a reference to an out-of-band object (e.g. in object storage).
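For completeness, a rough sketch of the batched-streaming servicer described in the question might look like the one below, assuming generated stubs named dataset_pb2 / dataset_pb2_grpc, a service called DataSet, and a fetch_entries() helper; all of those names are placeholders, not part of the original code.

import uuid

import dataset_pb2
import dataset_pb2_grpc

BATCH_SIZE = 1000  # entries per streamed message; tune this by benchmarking

class DataSetServicer(dataset_pb2_grpc.DataSetServicer):
    def ListDataSet(self, request, context):
        tx_id = str(uuid.uuid4())  # one id for the whole streamed result (assumption)
        batch = []
        for entry in fetch_entries():  # placeholder: iterate the stored entries
            batch.append(entry)
            if len(batch) == BATCH_SIZE:
                yield dataset_pb2.ListDataSetResponse(
                    transaction_id=tx_id, entries=batch)
                batch = []
        if batch:  # flush the final, partial batch
            yield dataset_pb2.ListDataSetResponse(
                transaction_id=tx_id, entries=batch)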

Related

Middleware to optimize postgres

In my company, we have an ingestion service written in Go whose job is to take messages from an HTTP endpoint and store them in Postgres. It receives a peak throughput of 50,000 messages/second. However, our database can handle a maximum of 30,000 messages/second.
Is it possible to write a middleware in Python to optimize this? If so, please explain.
It seems to be pretty unrelated to Python or any particular programming language.
These are the typical questions to ask, and the usual answers:
Are there duplicates? If so, don't save every message immediately; instead, wait for duplicates to arrive, which requires some kind of in-memory cache (the simplest being a hash table, thread-safe if needed).
Batch your messages into large enough packs and then dump each pack into PostgreSQL all at once; you have to determine what "large enough" means through load tests (see the sketch after this list).
Can you drop some of the messages? If your data is not of critical importance, or at least not all of it is, you can detect overload by tracking the number of pending messages and start discarding incoming ones until the load becomes acceptable again.
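A minimal sketch of the batching idea, assuming psycopg2 and a simple messages(id, payload) table; the table layout, batch size, and helper names are illustrative, not prescriptive. The ON CONFLICT clause also gives you cheap deduplication if id is unique.

import psycopg2
from psycopg2.extras import execute_values

BATCH_SIZE = 5000  # find the real "large enough" value by load testing
buffer = []

conn = psycopg2.connect("dbname=ingest")  # placeholder connection string

def handle_message(msg):
    buffer.append((msg["id"], msg["payload"]))
    if len(buffer) >= BATCH_SIZE:
        flush()

def flush():
    global buffer
    if not buffer:
        return
    with conn, conn.cursor() as cur:
        # One round trip per batch instead of one INSERT per message.
        execute_values(
            cur,
            "INSERT INTO messages (id, payload) VALUES %s"
            " ON CONFLICT (id) DO NOTHING",
            buffer,
        )
    buffer = []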

Latched topic in ZeroMQ

Is it possible to have a "latched" topic in ZeroMQ, such that the last message sent to the topic is repeated to newly joined subscribers?
At the moment I have to create a REQ-REP-socket pair in addition to the PUB-SUB pair, so that when the new SUB joins, it asks for that last message using the REQ-socket. But this additional work, which is all boilerplate, is highly undesirable.
ROS has the "latched" option and it is described as:
When a connection is latched, the last message published is saved and automatically sent to any future subscribers that connect. This is useful for slow-changing to static data like a map. Note that if there are multiple publishers on the same topic, instantiated in the same node, then only the last published message from that node will be sent, as opposed to the last published message from each publisher on that single topic.
Well, your idea is doable in ZeroMQ:
A bit of history first: for reasons of distributed-computing performance, memory capacity, and the low cost of traffic, topic filtering was originally implemented on the SUB side(s), whereas later ZeroMQ versions moved this feature to the PUB side.
Your application can never know in advance which ZeroMQ version each client will use, so the problem is, in principle, undecidable at the library level.
That said,
your user code on the PUB side can solve this, for example by sending 2-in-1 formatted messages, and the SUB side can be made aware of this soft logic embedded in the message stream.
Simply implement the "latched" logic in your user code, be it via a naive re-send of the last message per topic line or by some other means.
Yes, the user code is the only place that can handle this,
not the PUB/SUB Scalable Formal Communication Pattern archetype, for two reasons. First, it is not a general, universally applicable behaviour but a user-specific speciality. Second, the topic filter (whether it operates on the PUB side or the SUB side) has no prior knowledge of lexical branching: subscriptions are interpreted lexically from left to right, and no one can say a priori what a future subscriber will actually subscribe to. A "latched" last-message store therefore cannot be pre-populated until the next subscriber actually joins and sets its topic-filter subscription (and storing all combinatorially possible {sub-|super-}topic options to circumvent that principal undecidability would be a very bad idea, wouldn't it?).
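As one possible shape of that user-code logic, here is a minimal last-value-cache sketch assuming pyzmq; it relies on the XPUB socket type, which exposes subscription frames, so the publisher can re-send the cached last message per topic to each newly joining subscriber. The port and the publish() helper are made up for illustration.

import zmq

ctx = zmq.Context.instance()
pub = ctx.socket(zmq.XPUB)  # XPUB instead of PUB: subscriptions become visible
pub.bind("tcp://*:5556")

last_message = {}  # topic -> last payload published on that topic

poller = zmq.Poller()
poller.register(pub, zmq.POLLIN)

def publish(topic, payload):
    last_message[topic] = payload
    pub.send_multipart([topic, payload])

while True:
    events = dict(poller.poll(timeout=100))
    if pub in events:
        # Subscription frames start with 0x01 (subscribe), followed by the topic.
        frame = pub.recv()
        if frame and frame[0] == 1:
            topic = frame[1:]
            if topic in last_message:
                pub.send_multipart([topic, last_message[topic]])
    # ... application code calls publish(topic, payload) as new data arrives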

Match Making using GAE + ndb

I have a game in which users contact a server to find a user of their level who wants to play a game. Here is the basic architecture of a game request.
I am using ndb to store a waiting queue for each user level in the Google DataStore.
I am accessing these queues by their keys to ensure strong consistency (per this article). The entities are stored in the queue using a repeated (list of) LocalStructuredProperty.
Questions:
An entity is deleted from a waiting queue because it is matched to a request. The transaction is committed but not yet applied. That same entity is matched with another request and deleted. Will this throw an error?
These strongly consistent accesses are limited to ~1 write/sec. Is there a better architecture that would eliminate this constraint?
One thing I've considered for the latter question is to maintain multiple queues (whose number grows and shrinks with demand).
Not sure about your first question, but you might be able to simulate it with a sleep statement in your transaction.
For your second question, there is another architecture that you could use. If the waiting queue duration is relatively short (minutes instead of hours), you might want to use memcache. It will be a lot faster than writing to disk and you can avoid dealing with consistency issues.
1. If you do the entity get and the subsequent put inside a transaction, then the same entity cannot be matched to a second game, so there is no error and the data remains consistent.
2. The ~1 write per second figure is the limit for transactions within the same entity group. If you need more, you can shard the queue entity (a sketch follows below).
You can also use a dedicated memcache or a Redis instance to avoid contention; these are much faster than the Datastore.
See how these guys use tree nodes to do the matchmaking:
https://www.youtube.com/watch?v=9nWyWwY2Onc
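A rough sketch of the sharding suggestion, assuming the App Engine ndb API; the WaitingQueueShard model, the shard count, and the key scheme are made up, and a repeated StringProperty stands in for the original LocalStructuredProperty.

import random
from google.appengine.ext import ndb

NUM_SHARDS = 10  # each shard is its own entity group, so roughly 10 writes/sec total

class WaitingQueueShard(ndb.Model):
    waiting = ndb.StringProperty(repeated=True)  # player ids waiting at this level

def _shard_key(level, shard):
    return ndb.Key(WaitingQueueShard, "%s-%d" % (level, shard))

@ndb.transactional
def enqueue(level, player_id):
    # Writes are spread over NUM_SHARDS entity groups instead of one.
    key = _shard_key(level, random.randrange(NUM_SHARDS))
    queue = key.get() or WaitingQueueShard(key=key)
    queue.waiting.append(player_id)
    queue.put()

@ndb.transactional
def try_match(level):
    # A get-by-key inside a transaction is strongly consistent; check one
    # randomly chosen shard per attempt to stay within a single entity group.
    key = _shard_key(level, random.randrange(NUM_SHARDS))
    queue = key.get()
    if queue and queue.waiting:
        opponent = queue.waiting.pop(0)
        queue.put()
        return opponent
    return None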

Is there a way to simulate python random.shuffle of a queue using a pseudorandom sequence or hash function?

I'm building an application based around a task queue: it serves a series of tasks to multiple, asynchronously connected clients. The twist is that the tasks must be served in a random order.
My problem is that the algorithm I'm using now is computationally expensive, because it relies on many large queries and transfers from the database. I have a strong hunch that there's a cheaper way to achieve the same result, but I can't quite see the solution. Can you think of a clever fix for this problem?
Here's the (computationally expensive) algorithm I'm using now:
When the client queries for a new task...
1. Query the database for "unfinished" tasks
2. Put all tasks in a list
3. Shuffle the list (using random.shuffle)
4. Flag the first task as "in progress"
5. Send the task parameters to the client for completion
When the client finishes the task...
6a. Record the result and flag the task as "finished."
If the client fails to finish the task by some deadline...
6b. Re-flag the task as "unfinished."
Seems like we could do better by replacing steps 1, 2, and 3, with pseudorandom sequences or hash functions. But I can't quite figure out the whole solution. Ideas?
Other considerations:
In case it's important, I'm using python and mongodb for all of this. (Mongodb doesn't have some clever "use find_one to efficiently return a random matching entry" usage, does it?)
The term "queue" is a little misleading. All the tasks are stored in subfields of a single collection within the mongodb. The length (total number of tasks) in the collection is known and fixed at the outset.
If it's necessary, it might be okay to let the same task be assigned multiple times, as long as the occurrence is rare. But instances of this kind would need to be very rare, because completing each task is costly.
I have identifying information on each client, so we know exactly who originates each task request.
There is an easy way to get a random document from MongoDB!
See Random record from MongoDB
If you don't want a task to be picked twice, you could mark the task as active and not select it.
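For illustration, one way to combine those two ideas with pymongo might look like the sketch below, assuming a MongoDB version that supports the $sample aggregation stage and a status field on each task document (both are assumptions, as are the database and collection names).

from pymongo import MongoClient, ReturnDocument

client = MongoClient()
tasks = client["mydb"]["tasks"]

def claim_random_task():
    # Sample one random unfinished task directly on the server...
    cursor = tasks.aggregate([
        {"$match": {"status": "unfinished"}},
        {"$sample": {"size": 1}},
    ])
    for doc in cursor:
        # ...then flag it "in progress" only if it is still unfinished, so two
        # clients that sampled the same document cannot both claim it.
        return tasks.find_one_and_update(
            {"_id": doc["_id"], "status": "unfinished"},
            {"$set": {"status": "in progress"}},
            return_document=ReturnDocument.AFTER,
        )
    return None  # nothing left to claim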
Ah, based on the comments that I missed, you can do something along these lines:
import random

available = list(range(lengthofdatabase))  # list() so that .pop() works in Python 3
inprogress = []

while len(available) > 0:
    taskindex = available.pop(random.randrange(0, len(available)))
    # I'm not sure of your implementation, but you said something
    # along these lines was possible
    task = GetTask(taskindex)
    inprogress.append(taskindex)
I'm not sure of any of the functions you are using - this is just an algorithm.
Happy Coding!

Python twisted asynchronous write using deferred

With regard to the Python Twisted framework, can someone explain to me how to asynchronously write a very large string of data to a consumer, say the protocol.transport object?
I think what I am missing is a write(data_chunk) function that returns a Deferred. This is what I would like to do:
data_block = get_lots_and_lots_data()
CHUNK_SIZE = 1024  # write 1 KiB at a time.

def write_chunk(data, i):
    d = transport.deferredWrite(data[i:i + CHUNK_SIZE])
    d.addCallback(write_chunk, data, i + CHUNK_SIZE)

write_chunk(data_block, 0)
But, after a day of wandering around in the Twisted API/documentation, I can't seem to locate anything like a deferredWrite equivalent. What am I missing?
As Jean-Paul says, you should use IProducer and IConsumer, but you should also note that the lack of deferredWrite is a somewhat intentional omission.
For one thing, creating a Deferred for potentially every byte of data that gets written is a performance problem: we tried it in the web2 project and found that it was the most significant performance issue with the whole system, and we are trying to avoid that mistake as we backport web2 code to twisted.web.
More importantly, however, having a Deferred which gets returned when the write "completes" would provide a misleading impression: that the other end of the wire has received the data that you've sent. There's no reasonable way to discern this. Proxies, smart routers, application bugs and all manner of network contrivances can conspire to fool you into thinking that your data has actually arrived on the other end of the connection, even if it never gets processed. If you need to know that the other end has processed your data, make sure that your application protocol has an acknowledgement message that is only transmitted after the data has been received and processed.
The main reason to use producers and consumers in this kind of code is to avoid allocating memory in the first place. If your code really does read all of the data that it's going to write to its peer into a giant string in memory first (data_block = get_lots_and_lots_data() pretty directly implies that) then you won't lose much by doing transport.write(data_block). The transport will wake up and send a chunk of data as often as it can. Plus, you can simply do transport.write(hugeString) and then transport.loseConnection(), and the transport won't actually disconnect until either all of the data has been sent or the connection is otherwise interrupted. (Again: if you don't wait for an acknowledgement, you won't know if the data got there. But if you just want to dump some bytes into the socket and forget about it, this works okay.)
If get_lots_and_lots_data() is actually reading a file, you can use the included FileSender class. If it's something which is sort of like a file but not exactly, the implementation of FileSender might be a useful example.
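For illustration, a minimal sketch of that FileSender approach might look like the following; the protocol class name and the file name are made up here.

from twisted.internet import protocol
from twisted.protocols.basic import FileSender

class SendBigFile(protocol.Protocol):
    def connectionMade(self):
        # FileSender acts as a producer: it registers with the transport and
        # only reads and writes more data when the transport is ready for it.
        sender = FileSender()
        d = sender.beginFileTransfer(open("huge_data.bin", "rb"), self.transport)
        d.addCallback(lambda lastByte: self.transport.loseConnection())
        d.addErrback(lambda failure: self.transport.loseConnection())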
The way large amounts of data are generally handled in Twisted is with the Producer/Consumer APIs. This doesn't give you a write method that returns a Deferred, but it does notify you when it's time to write more data.
