Twisted RPC message aggregation - python

I'm working with a Python application that makes remote procedure calls over a TCP connection using Twisted Perspective Broker's callRemote. From a system call trace, it appears that multiple remote procedure calls from the sender can be aggregated into a single sendto() call on the socket. The same behavior was observed with the receiver's responses. I would have thought that as long as the socket was writable and there was data to send, Perspective Broker would send it out on the socket, but that does not appear to be the case.
Does Twisted's Perspective Broker aggregate multiple RPC messages for a specific reason before they are sent on the socket? In other words, does Twisted do something similar to Nagle's algorithm in TCP?
If so, is there an option to turn off this behavior?

Twisted performs write buffering in the underlying twisted.internet.abstract.FileDescriptor object. You can try changing the twisted.internet.abstract.FileDescriptor.SEND_LIMIT attribute to something smaller to force it to write to the socket more frequently.
See the Twisted bug 4089 for discussion about the SEND_LIMIT and bufferSize attributes.

Related

Detecting when a tcp client is not active for more than 5 seconds

I'm trying to set up TCP communication where the server sends a message through a socket every x seconds and should stop sending those messages when the client has not sent any message for 5 seconds.
In more detail, the client also sends constant messages on the same socket, all of which are ignored by the server, and it can stop sending them at any unknown time. For simplicity, these messages serve as alive messages that tell the server the communication is still relevant.
The problem is that if I want to send repeated messages from the server, I cannot allow it to "get busy" receiving messages instead, so I cannot detect when a new message arrives from the other side and act accordingly.
The problem is independent of the programming language, but to be more specific, I'm using Python and cannot access the client's code.
Is there any way to receive and send messages on a single socket simultaneously?
Thanks!
Option 1
Use two threads, one will write to the socket and the second will read from it.
This works since sockets are full-duplex (allow bi-directional simultaneous access).
Option 2
Use a single thread that manages all keep-alives using select.epoll. This way one thread can handle multiple clients. Remember, though, that if this isn't the only thread that uses the sockets, you might need to handle thread safety on your own.
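A minimal single-threaded sketch of Option 2, assuming one blocking socket per client. It uses select.select in place of select.epoll so that it also runs outside Linux; the function name send_until_idle and its parameters are hypothetical, not part of any library.

```python
import select
import socket
import time


def send_until_idle(sock, message, interval, idle_timeout):
    """Send `message` every `interval` seconds until the peer has been
    silent for `idle_timeout` seconds (or closes the connection).

    Returns the number of messages sent.
    """
    sent = 0
    now = time.monotonic()
    last_heard = now   # time of the most recent data from the peer
    next_send = now    # time at which the next message is due
    while True:
        now = time.monotonic()
        if now - last_heard >= idle_timeout:
            return sent                      # peer went quiet: stop sending
        if now >= next_send:
            sock.sendall(message)
            sent += 1
            next_send = now + interval
        # Sleep until the next send or the idle deadline, whichever comes
        # first, but wake up early if the peer sends something.
        deadline = min(next_send, last_heard + idle_timeout)
        wait = max(0.0, deadline - time.monotonic())
        readable, _, _ = select.select([sock], [], [], wait)
        if readable:
            data = sock.recv(4096)
            if not data:
                return sent                  # peer closed the connection
            last_heard = time.monotonic()    # content is ignored, only timing matters
```

For the question's case this would be called with idle_timeout=5 and interval set to the x seconds between server messages. The timeout passed to select is what lets one thread both send on schedule and notice incoming data.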
As discussed in another answer, threads are one common approach. The other approach is to use an event loop and nonblocking I/O. Recent versions of Python (I think starting at 3.4) include a package called asyncio that supports this.
You can call the create_connection method on an event loop to create an asyncio connection. See this example for a simple server that reads and writes over TCP.
In many cases an event loop can permit higher performance than threads, but it has the disadvantage of requiring most or all of your code to be aware of the event model.
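As a rough sketch of the event loop approach, here is a connection handler written against asyncio's stream API (asyncio.start_server rather than create_connection, since the question is about the server side). The handler name handle_client, the b"tick\n" payload, and the default timings are assumptions made for illustration.

```python
import asyncio


async def handle_client(reader, writer, interval=1.0, idle_timeout=5.0):
    """Send a message every `interval` seconds and stop once the client
    has been silent for `idle_timeout` seconds."""

    async def ticker():
        # Periodic sender; runs concurrently with the read loop below.
        while True:
            writer.write(b"tick\n")
            await writer.drain()
            await asyncio.sleep(interval)

    sender = asyncio.ensure_future(ticker())
    try:
        while True:
            # Each successful read restarts the idle timer.
            data = await asyncio.wait_for(reader.read(4096), timeout=idle_timeout)
            if not data:
                break            # client closed the connection
    except asyncio.TimeoutError:
        pass                     # client stayed silent for too long
    finally:
        sender.cancel()
        try:
            await sender
        except (asyncio.CancelledError, ConnectionError):
            pass
        writer.close()


async def main(host="127.0.0.1", port=8888):
    # Hypothetical address; serves until interrupted.
    server = await asyncio.start_server(handle_client, host, port)
    async with server:
        await server.serve_forever()
```

Running asyncio.run(main()) would start the server. Reading and writing share one thread here because every await hands control back to the event loop.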

Is it possible for Python TLS over TCP to make application logic fail?

I have a TCP proxy written in Python 2.6.
It works fine in all cases with the following setup:
client ---> proxy ---> server
I then wrapped the TCP connection from the proxy to the server in TLS:
client ---> proxy ==++ssl++==> server
That works fine in some cases and fails in others.
The failure is that, at the 26th round trip, the server is waiting for more information from the client, but the client sends nothing more. (The successful cases also go beyond 26 round trips.)
I cannot share more detail, but I thought SSL should be transparent to the application logic.
Any idea which part of the functionality fails? How should I debug it?
Edit: In python 2.6, the tls version can only be 1.0.
It is hard to tell what you are doing without an example demonstrating the problem, but depending on how your application works, SSL/TLS is not just a transparent replacement for TCP sockets. While it might be transparent in most cases if you use only blocking sockets, things are different with non-blocking I/O. In that case you have to deal with user-space buffering, where select will not report available data even though there is unread data. You also have to deal with situations where you temporarily fail to write because the TLS stack needs a read first, or the other way around.
For more details about the differences with non-blocking I/O and select, see Behavior of python's select() with partial recv() on SSL socket or select and ssl in python. Additionally, non-blocking I/O needs special handling with accept and connect too, but I doubt that there is useful support for it in the old Python version you are using.
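One common way to cope with the user-space buffering is to keep reading after select reports the socket readable until the TLS layer says it needs more network I/O. The sketch below assumes a non-blocking ssl.SSLSocket on Python 3.3 or later, where ssl.SSLWantReadError and ssl.SSLWantWriteError exist; Python 2.6 does not have these exception classes, so this is not a drop-in fix for the proxy in the question. The helper name drain_tls and its return convention are hypothetical.

```python
import ssl


def drain_tls(sock, bufsize=4096):
    """Read everything that is currently available from a non-blocking
    TLS socket, including data already decrypted into the TLS buffer.

    Returns (data, state), where state is one of:
      "want_read"  - wait until select() reports the socket readable
      "want_write" - TLS must write first; wait until it is writable
      "eof"        - the peer closed the connection
    """
    chunks = []
    while True:
        try:
            data = sock.recv(bufsize)
        except ssl.SSLWantReadError:
            state = "want_read"
            break
        except ssl.SSLWantWriteError:
            state = "want_write"
            break
        if not data:
            state = "eof"
            break
        chunks.append(data)
    return b"".join(chunks), state
```

Calling recv only once per select wakeup can leave decrypted bytes sitting in the TLS buffer where select cannot see them, which would look like the peer waiting forever for data that was in fact received.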

Proper way to close tcp sockets in python

I am currently working on a server + client combo in Python and I'm using TCP sockets. From networking classes I know that a TCP connection should be closed step by step: first one side signals that it wants to close the connection and waits for confirmation, then the other side does the same. After that, the socket can be safely closed.
I've seen the function socket.shutdown(flag) in the Python documentation, but I don't see how it could be used in this standard, theoretical method of closing a TCP socket. As far as I know, it just blocks reading, writing, or both.
What is the best, most correct way to close a TCP socket in Python? Are there standard functions for the closing signals, or do I need to implement them myself?
shutdown is useful when you have to signal the remote client that no more data is being sent. You can specify in the shutdown() parameter which half-channel you want to close.
Most commonly, you want to close the TX half-channel by calling shutdown(1). At the TCP level, this sends a FIN packet, and the remote end will receive 0 bytes if it is blocking on read(), but the remote end can still send data back, because the RX half-channel is still open.
Some application protocols use this to signal the end of the message. Some other protocols find the EOM based on data itself. For example, in an interactive protocol (where messages are exchanged many times) there may be no opportunity, or need, to close a half-channel.
In HTTP, shutdown(1) is one method that a client can use to signal that an HTTP request is complete. But the HTTP protocol itself embeds data that makes it possible to detect where a request ends, so multiple-request HTTP connections are still possible.
I don't think that calling shutdown() before close() is always necessary, unless you need to explicitly close a half-channel. If you want to cease all communication, close() does that too. Calling shutdown() and forgetting to call close() is worse because the file descriptor resources are not freed.
From Wikipedia: "On SVR4 systems use of close() may discard data. The use of shutdown() or SO_LINGER may be required on these systems to guarantee delivery of all data." This means that, if you have outstanding data in the output buffer, a close() could discard this data immediately on a SVR4 system. Linux, BSD and BSD-based systems like Apple are not SVR4 and will try to send the output buffer in full after close(). I am not sure if any major commercial UNIX is still SVR4 these days.
Again using HTTP as an example, an HTTP client running on SVR4 would not lose data using close(), because it keeps the connection open after the request to get the response. An HTTP server under SVR4 would have to be more careful, calling shutdown(2) before close() after sending the whole response, because part of the response would still be in the output buffer.
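The half-close sequence described above can be sketched as follows, assuming a connected, blocking TCP socket. socket.SHUT_WR is the named constant for the TX half-channel; the helper name graceful_close is hypothetical.

```python
import socket


def graceful_close(sock, bufsize=4096):
    """Close the sending side, read whatever the peer still sends,
    then release the socket. Returns the bytes received after the
    half-close."""
    sock.shutdown(socket.SHUT_WR)   # sends FIN; the receive side stays open
    leftover = []
    while True:
        data = sock.recv(bufsize)
        if not data:                # empty read: the peer has closed its side too
            break
        leftover.append(data)
    sock.close()                    # frees the file descriptor
    return b"".join(leftover)
```

A peer that reads until end of stream and then replies will see its recv return empty bytes as soon as shutdown is called, and its reply still arrives because only one direction was closed.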
According to the Python documentation:
Strictly speaking, you’re supposed to use shutdown on a socket before
you close it. The shutdown is an advisory to the socket at the other
end. Depending on the argument you pass it, it can mean “I’m not going
to send anymore, but I’ll still listen”, or “I’m not listening, good
riddance!”. Most socket libraries, however, are so used to programmers
neglecting to use this piece of etiquette that normally a close is the
same as shutdown(); close(). So in most situations, an explicit
shutdown is not needed.
I think the most correct way to close a TCP connection is to call shutdown before close, because close is not atomic, and this can cause bugs. Suppose you call close without shutdown and the data was not sent to the server correctly: Python closes the connection at the same time, the server can't reply to the client, and the socket at the other end may hang indefinitely.

Which protocol should I use for pyzmq?

I am working on a project where I have a client-server model in Python. I set up a server to monitor requests and send back data. pyzmq supports tcp, udp, pgm, epgm, inproc and ipc. I have been using tcp for interprocess communication, but have no idea what I should use for sending a request over the internet to a server. I simply need something to put in:
socket.bind(BIND_ADDRESS)
DIAGRAM: Client Communicating over internet to server running a program
Any particular reason you're not using ipc or inproc for interprocess communication?
Other than that, generally, you can consider tcp the universal communicator; it's not always the best choice, but no matter what (so long as you actually have an IP address) it will work.
Here's what you need to know when making a choice between transports:
PGM/EPGM are multicast transports - the idea is that you send one message and it gets delivered as a single message until the last possible moment where it will be broken up into multiple messages, one for each receiver. Unless you absolutely know you need this, you don't need this.
IPC/Inproc are for interprocess communication... if you're communicating between different threads in the same process, or different processes on the same logical host, then these might be appropriate. You get the benefit of a little less overhead. If you might ever add new logical hosts, this is probably not appropriate.
Russle Borogove enumerates the difference between TCP and UDP well. Typically you'll want to use TCP. Use UDP only if absolute speed is more important than reliability.
It was always my understanding that UDP wasn't supported by ZMQ, so if it's there it's probably added by the pyzmq binding.
Also, I took a look at your diagram - you probably want the server ZMQ socket to bind and the client ZMQ socket to connect... there are some reasons why you might reverse this, but as a general rule the server is considered the "reliable" peer, and the client is the "transient" peer, and you want the "reliable" peer to bind, the "transient" peer to connect.
Over the internet, TCP or UDP are the usual choices. I don't know if pyzmq has its own delivery guarantees on top of the transport protocol. If it doesn't, TCP will guarantee in-order delivery of all messages, while UDP may drop messages if the network is congested.
If you don't know what you want, TCP is the simplest and safest choice.

Multi-threaded UDP server with Python

I want to create a simple video streaming (actually, image streaming) server that can manage different protocols (TCP Push/Pull, UDP Push/Pull/Multicast).
I managed to get TCP Push/Pull working with the SocketServer.TCPServer class and ThreadingMixIn for processing each connected client in a different thread.
But now that I'm working on the UDP protocol, I just realized that ThreadingMixIn creates a thread per call of handle(), that is, per client query (as there's no such thing as a "connection" in UDP).
The problem is that I need to process a sequence of queries from the same client, for all the clients. How could I manage that?
The only way I can see to handle that is to keep a list of (client address, processing thread) pairs and send each query to the matching thread (or create a new one if the client hasn't sent any query yet). Is there an easier way to do that?
Thanks!
P.S.: I can't use any external or too "high-level" library for this, as it's a school assignment meant to teach how sockets work.
Take a look at Twisted. This will remove the need to do any thread dispatch from your application. You still have to match up packets to a particular session in order to handle them, but this isn't difficult (use a port per client and dispatch based on the port, or require packets in a session to always come from the same address and use the peer address, or use one of the existing protocols that solves this problem such as SIP).
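Since the question rules out external libraries, here is a standard-library sketch of the peer-address variant: each datagram is routed to a worker thread keyed by the sender's address, so one client's queries are processed in order by the same thread. The class name PerClientDispatcher, the serve function, and the handler signature are hypothetical; this is one possible layout, not the only one.

```python
import queue
import socket
import threading


class PerClientDispatcher:
    """Route each datagram to a worker thread keyed by the sender's address."""

    def __init__(self, handler):
        self.handler = handler   # called as handler(addr, payload) in the worker
        self.queues = {}         # addr -> queue of pending payloads
        self.threads = []

    def dispatch(self, addr, payload):
        q = self.queues.get(addr)
        if q is None:
            # First datagram from this address: start a dedicated worker.
            q = queue.Queue()
            self.queues[addr] = q
            t = threading.Thread(target=self._worker, args=(addr, q), daemon=True)
            t.start()
            self.threads.append(t)
        q.put(payload)

    def _worker(self, addr, q):
        while True:
            payload = q.get()
            if payload is None:  # sentinel pushed by stop()
                return
            self.handler(addr, payload)

    def stop(self):
        for q in self.queues.values():
            q.put(None)
        for t in self.threads:
            t.join()


def serve(sock, dispatcher):
    """Receive datagrams forever on a bound UDP socket and dispatch them."""
    while True:
        payload, addr = sock.recvfrom(65535)
        dispatcher.dispatch(addr, payload)
```

dispatch is only ever called from the single receiving thread, so the dictionary needs no lock. A real server would also need to expire workers for clients that stop sending, which this sketch leaves out.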
