Python twisted asynchronous write using deferred

With regard to the Python Twisted framework, can someone explain to me how to asynchronously write a very large data string to a consumer, say the protocol.transport object?
I think what I am missing is a write(data_chunk) function that returns a Deferred. This is what I would like to do:
data_block = get_lots_and_lots_data()
CHUNK_SIZE = 1024  # write 1 KB at a time

def write_chunk(data, i):
    d = transport.deferredWrite(data[i:i + CHUNK_SIZE])
    d.addCallback(lambda _: write_chunk(data, i + CHUNK_SIZE))

write_chunk(data_block, 0)
But, after a day of wandering around in the Twisted API/documentation, I can't seem to locate anything like a deferredWrite equivalent. What am I missing?

As Jean-Paul says, you should use IProducer and IConsumer, but you should also note that the lack of deferredWrite is a somewhat intentional omission.
For one thing, creating a Deferred for potentially every byte of data that gets written is a performance problem: we tried it in the web2 project and found that it was the most significant performance issue with the whole system, and we are trying to avoid that mistake as we backport web2 code to twisted.web.
More importantly, however, having a Deferred which gets returned when the write "completes" would provide a misleading impression: that the other end of the wire has received the data that you've sent. There's no reasonable way to discern this. Proxies, smart routers, application bugs and all manner of network contrivances can conspire to fool you into thinking that your data has actually arrived on the other end of the connection, even if it never gets processed. If you need to know that the other end has processed your data, make sure that your application protocol has an acknowledgement message that is only transmitted after the data has been received and processed.
The main reason to use producers and consumers in this kind of code is to avoid allocating memory in the first place. If your code really does read all of the data that it's going to write to its peer into a giant string in memory first (data_block = get_lots_and_lots_data() pretty directly implies that) then you won't lose much by doing transport.write(data_block). The transport will wake up and send a chunk of data as often as it can. Plus, you can simply do transport.write(hugeString) and then transport.loseConnection(), and the transport won't actually disconnect until either all of the data has been sent or the connection is otherwise interrupted. (Again: if you don't wait for an acknowledgement, you won't know if the data got there. But if you just want to dump some bytes into the socket and forget about it, this works okay.)
If get_lots_and_lots_data() is actually reading a file, you can use the included FileSender class. If it's something which is sort of like a file but not exactly, the implementation of FileSender might be a useful example.
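For reference, FileSender lives in twisted.protocols.basic. A minimal sketch of using it (the file name and the cleanup callback are my illustration, not from the answer):

from twisted.protocols.basic import FileSender

# Stream an already-open file to the transport; the Deferred fires
# with the last byte sent once the transfer completes.
sender = FileSender()
d = sender.beginFileTransfer(open('big.bin', 'rb'), transport)
d.addCallback(lambda lastByte: transport.loseConnection())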

The way large amounts of data are generally handled in Twisted is by using the Producer/Consumer APIs. This doesn't give you a write method that returns a Deferred, but it does give you notification about when it's time to write more data.
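A minimal sketch of what that looks like with a pull producer (the class name and chunk size here are illustrative, not a fixed convention):

from zope.interface import implementer
from twisted.internet.interfaces import IPullProducer

@implementer(IPullProducer)
class StringProducer:
    # Writes one chunk each time the transport asks for more data.
    def __init__(self, consumer, data, chunk_size=1024):
        self.consumer = consumer
        self.data = data
        self.offset = 0
        self.chunk_size = chunk_size
        consumer.registerProducer(self, streaming=False)  # False = pull producer

    def resumeProducing(self):
        chunk = self.data[self.offset:self.offset + self.chunk_size]
        if not chunk:
            self.consumer.unregisterProducer()
            return
        self.offset += self.chunk_size
        self.consumer.write(chunk)

    def stopProducing(self):
        pass

# usage: StringProducer(protocol.transport, data_block)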

Related

Python 3.5 non blocking functions

I have a fairly large Python package that interacts synchronously with a third-party API server and carries out various operations with the server. I am now also starting to collect some of the data for future analysis by pickling the JSON responses. After profiling several serialisation/database methods, I found pickle to be the fastest in my case. My basic pseudo-code is:
while True:
    do_existing_api_stuff()...

    # additional data pickling
    data = {'info': []}  # there are multiple keys in the real version!
    if pickle_file_exists:
        data = unpickle_file()
    data['info'].append(new_data)
    pickle_data(data)
    if len(data['info']) >= 100:  # file size limited for read/write speed
        create_new_pickle_file()

    # intensive section...
    # move files from "wip" (Work In Progress) dir to "complete"
    if number_of_pickle_files >= 100:
        compress_pickle_files()  # with lzma
        move_compressed_files_to_another_dir()
My main issue is that compressing and moving the files takes several seconds to complete and therefore slows down my main loop. What is the easiest way to call these functions in a non-blocking way without any major modifications to my existing code? I do not need any return value from the functions, but they will raise an error if anything fails. Another "nice to have" would be for pickle.dump() to also be non-blocking. Again, I am not interested in the return value beyond "did it raise an error?". I am aware that unpickling, appending, and re-pickling on every loop is not particularly efficient, but it avoids data loss when the API drops out due to connection issues, server errors, etc.
I have zero knowledge on threading, multiprocessing, asyncio, etc and after much searching, I am currently more confused than I was 2 days ago!
FYI, all of the file related functions are in a separate module/class, so that could be made asynchronous if necessary.
EDIT:
There may be multiple calls to the above functions, so I guess some sort of queuing will be required?
Easiest solution is probably the threading standard library package. This will allow you to spawn a thread to do the compression while your main loop continues.
There is almost certainly quite a bit of 'dead time' in your existing loop waiting for the API to respond, and conversely there is quite a bit of time spent doing the compression when you could usefully be making another API call. For this reason I'd suggest separating these two aspects. There are lots of good tutorials on threading, so I'll just describe a pattern which you could aim for (see the sketch after this list):
- Keep the API call and the pickling in the main loop, but add a step which passes the file path of each pickle to a queue after it is written.
- Write a function which takes the queue as its input and works through the file paths, performing the compression.
- Before starting the main loop, start a thread with the new function as its target.
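A minimal sketch of that pattern, reusing the placeholder function names from the question (treat this as a shape rather than working code):

import queue
import threading

compress_queue = queue.Queue()

def compress_worker(q):
    # Pull pickle file paths off the queue and compress/move them.
    while True:
        path = q.get()
        if path is None:  # sentinel: shut the worker down
            break
        try:
            compress_pickle_files(path)                 # from the question
            move_compressed_files_to_another_dir(path)  # from the question
        finally:
            q.task_done()

worker = threading.Thread(target=compress_worker, args=(compress_queue,), daemon=True)
worker.start()

while True:
    do_existing_api_stuff()
    path_written = pickle_data(data)  # assumes pickle_data returns the path it wrote
    compress_queue.put(path_written)

Note that an exception raised inside the worker thread will not propagate to the main loop on its own; you would need to catch it in compress_worker and report it, for example by putting it on a second queue that the main loop checks.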

Read a file using threads

I'm trying to write a Python program that sends files from one PC to another using Python's sockets. But as the file size increases, it takes a lot of time. Is it possible to read lines of a file sequentially using threads?
The concept I have in mind is as follows:
Each thread separately and sequentially reads lines from the file and sends them over the socket. Is it possible to do? Or do you have any suggestions?
First, if you want to speed this up as much as possible without using threads, reading and sending a line at a time can be pretty slow. Python does a great job of buffering up the file to give you a line at a time for reading, but then you're sending tiny 72-byte packets over the network. You want to try to send at least 1.5KB at a time when possible.
Ideally, you want to use the sendfile method. Python will tell the OS to send the whole file over the socket in whatever way is most efficient, without getting your code involved at all. Unfortunately, this doesn't work on Windows; if you care about that, you may want to drop to the native APIs [1] directly with pywin32 or switch to a higher-level networking library like twisted or asyncio.
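For the non-Windows case, a minimal sketch of socket.sendfile (the function name and path are my illustration):

import socket

def send_whole_file(sock, path):
    # socket.sendfile uses os.sendfile where available and falls back
    # to a plain send loop on platforms without it.
    with open(path, 'rb') as f:
        sock.sendfile(f)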
Now, what about threading?
Well, reading a line at a time in different threads is not going to help very much. The threads have to read sequentially, fighting over the read pointer (and buffer) in the file object, and they presumably have to write to the socket sequentially, and you probably even need a mutex to make sure they write things in order. So, whichever one of those is slowest, all of your threads are going to end up waiting for their turn. [2]
Also, even forgetting about the sockets: Reading a file in parallel can be faster in some situations on modern hardware, but in general it's actually a lot slower. Imagine the file is on a slow magnetic hard drive. One thread is trying to read the first chunk, the next thread is trying to read the 64th chunk, the next thread is trying to read the 4th chunk… this means you spend more time seeking the disk head back and forth than actually reading data.
But, if you think you might be in one of those situations where parallel reads might help, you can try it. It's not trivial, but it's not that hard.
First, you want to do binary reads of fixed-size chunks. You're going to need to experiment with different sizes—maybe 4KB is fastest, maybe 1MB… so make sure to make it a constant you can easily change in just one place in the code.
Next, you want to be able to send the data as soon as you can get it, rather than serializing. This means you have to send some kind of identifier, like the offset into the file, before each chunk.
The function will look something like this:
import struct

def sendchunk(sock, lock, file, offset):
    with lock:
        sock.send(struct.pack('>Q', offset))  # 8-byte big-endian offset header
        sent = sock.sendfile(file, offset, CHUNK_SIZE)
        if sent < CHUNK_SIZE:
            raise OopsError(f'Only sent {sent} out of {CHUNK_SIZE} bytes')
… except that (unless your files actually are all multiples of CHUNK_SIZE) you need to decide what you want to do for a legitimate EOF. Maybe send the total file size before any of the chunks, and pad the last chunk with null bytes, and have the receiver truncate the last chunk.
The receiving side can then just loop reading 8+CHUNK_SIZE bytes, unpacking the offset, seeking, and writing the bytes.
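A rough sketch of that receiving loop (recv_exactly is a hypothetical helper, not from the answer above, and this assumes the sender transmits the total file size first as described):

import struct

def recv_exactly(sock, n):
    # Loop until exactly n bytes arrive; sock.recv may return fewer.
    buf = b''
    while len(buf) < n:
        data = sock.recv(n - len(buf))
        if not data:
            raise ConnectionError('socket closed early')
        buf += data
    return buf

def receive_file(sock, out_file, total_size):
    num_chunks = -(-total_size // CHUNK_SIZE)  # ceiling division
    for _ in range(num_chunks):
        (offset,) = struct.unpack('>Q', recv_exactly(sock, 8))
        chunk = recv_exactly(sock, CHUNK_SIZE)
        out_file.seek(offset)
        out_file.write(chunk)
    out_file.truncate(total_size)  # drop the padding on the last chunk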
[1] See TransmitFile. But in order to use that, you have to know how to go between Python-level socket objects and Win32-level HANDLEs, and so on; if you've never done that, there's a learning curve, and I don't know of a good tutorial to get you started.
[2] If you're really lucky and, say, the file reads are only twice as fast as the socket writes, you might actually get a 33% speedup from pipelining: only one thread can be writing at a time, but the threads waiting to write have mostly already done their reading, so at least you don't need to wait there.
Not Threads.
import shutil

source_path = r"\\mynetworkshare"
dest_path = r"C:\TEMP"
file_name = "\\myfile.txt"

shutil.copyfile(source_path + file_name, dest_path + file_name)
https://docs.python.org/3/library/shutil.html
shutil offers a high-level copy function that delegates the copying to OS-level facilities where possible. It is your best bet for this scenario.

mpi4py recv data cap?

I am working with a group of people on a program that is communication intensive. I'm not particularly good at debugging distributed programs, but I have a strong suspicion that too many messages are being sent at once to a process. I have reimplemented the actor model in mpi4py. Each process has a "mailbox" of jobs, and when it finishes its mailbox it goes into CHECK_FOR_UPDATES mode, where it sees if there are any new messages it can receive.
I had issues with the program that a group of students and I have been working on. When the load became too big it would start to crash, but we couldn't figure out where the issue was because we're all pretty bad at debugging.
I asked someone at my school if he had any ideas, and he suggested that, since we are reimplementing an actor system, we should consider using Akka. A student this year said that there may still be a problem: one actor may get inundated with messages and crash. I asked about it here. The stream model seems not to be what we want (see my comment for more details), and I have since looked back at the mpi4py program, as I had not accounted for this problem before.
In the plain C or Fortran implementation, there is a count parameter for MPI_Recv. I noticed that comm.recv has no count parameter, and I suspect that when a process goes into CHECK_FOR_UPDATES mode it just consumes a ton of messages from a variety of sources and dies. (Technically, I don't know for sure, but we suspect this might be the case.) Is there a way to cap the amount of data comm.recv accepts?
(Note: I want to avoid using the comm.Recv variant, as it restricts the user to using numpy arrays.)
Found the answer:
The recv() and irecv() methods may be passed a buffer object that can be repeatedly used to receive messages, avoiding internal memory allocation. The buffer must be sufficiently large to accommodate the transmitted messages.
Emphasis mine. Therefore, I have to use Send and Recv.
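For illustration, a minimal sketch of capping the receive size with the buffer-based Recv (the dtype and the cap value are made up):

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
MAX_ITEMS = 1024  # hypothetical cap on message size, in ints

buf = np.empty(MAX_ITEMS, dtype='i')
status = MPI.Status()
comm.Recv([buf, MPI.INT], source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
count = status.Get_count(MPI.INT)  # how many ints actually arrived
message = buf[:count]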

Improving speed of xmlrpclib

I'm working with a device that is essentially a black box, and the only known communication method for it is XML-RPC. It works for most needs, except for when I need to execute two commands very quickly after each other. Due to the overhead and waiting for the RPC response, this is not as quick as desired.
My main question is, how does one reduce this overhead to make this functionality possible? I know the obvious solution is to ditch XML-RPC, but I don't think that's possible for this device, as I have no control over implementing any other protocols from the "server". This also makes it impossible to do a MultiCall, as I can not add valid instructions for MultiCall. Does MultiCall have to be implemented server side? For example, if I have method1(), method2(), and method3() all implemented by the server already, should this block of code work to execute them all in one reply? I'd assume no from my testing so far, as the documentation shows examples where I need to initialize commands on the server side.
server = xmlrpclib.ServerProxy(serverURL)
multicall = xmlrpclib.MultiCall(server)
multicall.method1()
multicall.method2()
multicall.method3()
results = multicall()
Also, looking through the source of xmlrpclib, I see references to a "FastParser" as opposed to a default one that is used. However, I can not determine how to enable this parser over the default. Additionally, the comment on this answer mentions that it parses one character at a time. I believe this is related, but again, no idea how to change this setting.
Unless the bulk size of your requests or responses is very large, it's unlikely that changing the parser will affect the turnaround time (since CPU is much faster than network).
You might want to consider, if possible, sending more than one command to the device without waiting for the response to the first one. If the device can handle multiple requests at once, then this may be of benefit. Even if the device only handles requests in sequence, you can still have the next request waiting at the device so that there is no delay after processing the previous one. If the device serialises requests in this way, then that's going to be about the best you can do.
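One way to try that from the client side is to issue the calls from separate threads. A rough sketch (the URL and method names are placeholders, and this only helps if the device accepts overlapping requests):

import threading
import xmlrpclib  # xmlrpc.client on Python 3

server_url = 'http://device.example:8080/RPC2'  # placeholder

def call(method_name):
    # ServerProxy instances aren't documented as thread-safe,
    # so each thread gets its own.
    proxy = xmlrpclib.ServerProxy(server_url)
    getattr(proxy, method_name)()

threads = [threading.Thread(target=call, args=(name,))
           for name in ('method1', 'method2', 'method3')]
for t in threads:
    t.start()
for t in threads:
    t.join()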

New transport and reader type in Twisted

I'm trying to add a new transport to Twisted, which will read data from a stream - either a file in a tail -f way, or from a pipe - but I have some problems with the Twisted architecture.
I've got the transport itself (implements ITransport) ready - it handles all file opening. I've got streaming functions/deferreds ready. How do I put it together now? I'd like to report the new data back to some protocol's dataReceived().
I could of course create a new object that sets up the I/O monitors with proper callbacks, registers a callback for reactor shutdown (to close the files/protocols), and starts everything up manually - but is that "the right way"? Is there any nicer abstraction I could use? I've seen reactor.connectWith(), but it doesn't really provide much of an abstraction...
Also - how am I supposed to pass the data from my reader to the protocol? ITransport doesn't define any interface for it, even though it seems like exactly the transport's responsibility.
It sounds like you've mostly figured out how to do this. You might be interested in twisted.internet.fdesc.readFromFD, but it's only a few lines long and it's not doing anything particularly complicated (it's a few lines you don't have to maintain, though). Aside from that - yes, you have to do the I/O monitoring in this case, because regular file descriptors aren't supported by select/poll/epoll (they always get reported as ready, not what you want).
Some work has been done on supporting inotify in Twisted (http://twistedmatrix.com/trac/ticket/972), but this isn't complete yet, so it's not going to be directly useful to you now (unless you want to help finish it and then use it). Assuming you just use time-based polling, much of what's in the reactor isn't going to help you out much, since that code is focused on using a system-provided readiness API (i.e., select/poll/epoll) to trigger events.
For the pipe case, though, you should be able to use and benefit from IReactorFDSet's methods - addReader et al.
Your time-based polling transport may still benefit from implementing ITransport - although I'm not sure how you would implement write for a tail -f-like transport. You will definitely benefit from having your transport deliver data via the IProtocol interface, since this simplifies code-reuse. IProtocol.dataReceived is exactly how you want to pass data from your reader (I think that's the same as your transport, isn't it?). This isn't defined on ITransport because it's a method you call on some other object which is not the transport.
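To make that concrete, here is a rough sketch of a time-polled, read-only transport delivering data via dataReceived. The class name, polling interval, and the read-only write() behaviour are all illustrative assumptions, not Twisted API:

import os
from zope.interface import implementer
from twisted.internet import fdesc, task
from twisted.internet.error import ConnectionDone
from twisted.internet.interfaces import ITransport
from twisted.python.failure import Failure

@implementer(ITransport)
class TailTransport:
    # Polls a file on a timer and hands any new bytes to the protocol.
    def __init__(self, path, protocol, interval=1.0):
        self.fd = os.open(path, os.O_RDONLY)
        fdesc.setNonBlocking(self.fd)
        self.protocol = protocol
        self.protocol.makeConnection(self)
        self.poller = task.LoopingCall(self._poll)
        self.poller.start(interval)

    def _poll(self):
        fdesc.readFromFD(self.fd, self.protocol.dataReceived)

    def write(self, data):
        raise NotImplementedError("this transport is read-only")

    def writeSequence(self, data):
        raise NotImplementedError("this transport is read-only")

    def loseConnection(self):
        self.poller.stop()
        os.close(self.fd)
        self.protocol.connectionLost(Failure(ConnectionDone()))

    def getPeer(self):
        return None

    def getHost(self):
        return None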
reactor.connectWith probably isn't going to buy you anything. As you say, it's not much of an abstraction; I'd say it's more of a mistake. :)
Don't worry too much about not being able to add methods directly to the reactor. A free-function which accepts a reactor as a parameter is just as easy to use.
For the shutdown callback, addReader should actually get you most of the way there. Any reader in the reactor at shutdown time will have connectionLost called on it (part of IFileDescriptor). You should implement this to clean up the files and protocol.
