I have built a Python server to which various clients can connect, and I need to set up a predefined series of messages from the clients to the server - for example, a client passes a name to the server when it first connects.
I was wondering what the best way to approach this is. How should I build a simple protocol for their communication?
Should the messages start with a specific set of bytes to mark them out as part of this protocol, and then contain some sort of message ID? Any suggestions or further reading would be appreciated.
First, you need to decide whether you want your protocol to be human-readable (much more overhead) or binary. If the former, you probably want to use regular expressions to decode the messages; for this, use the Python module re. If the latter, the struct module is your friend.
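For the binary case, here is a minimal sketch of what struct-based framing could look like; the magic marker, field widths and message id are made up purely for illustration:

import struct

# Hypothetical frame layout: 2-byte magic marker, 2-byte message id,
# 4-byte payload length, followed by the raw payload ("!" = network byte order).
HEADER = struct.Struct("!2sHI")
MAGIC = "MP"   # arbitrary marker identifying "our" protocol

def pack_message(msg_id, payload):
    return HEADER.pack(MAGIC, msg_id, len(payload)) + payload

def unpack_header(data):
    magic, msg_id, length = HEADER.unpack(data[:HEADER.size])
    if magic != MAGIC:
        raise ValueError("not one of our messages")
    return msg_id, length

frame = pack_message(1, "alice")   # e.g. message id 1 = "set name"
print unpack_header(frame)         # -> (1, 5)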
Second, if you are building a protocol that is somehow stateful (e.g. first we do a handshake, then we transfer data, then we check checksums and say goodbye), you probably want to create some sort of FSM (finite-state machine) to track the state.
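A hand-rolled FSM can be as simple as a transition table; here is a rough sketch for a hypothetical handshake/transfer/goodbye flow (the state and message names are invented, and real protocols would also handle errors and timeouts):

class ProtocolFSM(object):
    # (current state, incoming message type) -> next state
    TRANSITIONS = {
        ("waiting_hello", "HELLO"): "transferring",
        ("transferring", "DATA"): "transferring",
        ("transferring", "BYE"): "closed",
    }

    def __init__(self):
        self.state = "waiting_hello"

    def handle(self, message_type):
        key = (self.state, message_type)
        if key not in self.TRANSITIONS:
            raise ValueError("%s not allowed in state %s" % (message_type, self.state))
        self.state = self.TRANSITIONS[key]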
Third, if protocol design is not a familiar subject, read some simple protocol specifications, for example those published by the IETF.
If this is not a learning exercise, you might want to build on something existing, such as Python Twisted.
Depending on your requirements, you might want to consider using JSON: newline-terminated strings carrying JSON-encoded content. The transport protocol could be HTTP: with it you get access to all the "connection-related" facilities (e.g. status codes) while keeping a JSON-encoded payload (a minimal sketch of the newline-delimited variant follows the list below).
The advantages of using JSON over HTTP:
human readable (debugging etc.)
libraries available for all languages/platforms
cross-platform
browser debuggable (to some extent)
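As a rough illustration of the newline-delimited JSON variant (the host, port and message fields are made up, and error handling is omitted):

import json
import socket

# Send one newline-terminated JSON message over a plain TCP connection.
sock = socket.create_connection(("localhost", 9000))
sock.sendall(json.dumps({"type": "hello", "name": "alice"}) + "\n")

# Read the reply: buffer until a newline arrives, then decode it.
buf = ""
while "\n" not in buf:
    buf += sock.recv(4096)
line, buf = buf.split("\n", 1)
reply = json.loads(line)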
Of course, there are many other ways to skin this cat, but the time to a working prototype with this approach is very low. It is worth considering if your requirements (which aren't very detailed here) can be met this way.
Read some protocols, and try to find one that looks like what you need. Does it need to be message-oriented or stream-oriented? Does request order need to be preserved? Do requests need to be paired with responses? Do you need message identifiers? Retries and back-off? Is it an RPC protocol or a message-queue protocol?
See http://www.faqs.org/docs/artu/ch05s02.html and http://www.faqs.org/docs/artu/ch05s03.html for a good overview and discussion on data file formats and protocols.
I'm trying to use Spyne (http://spyne.io) in my server with ZeroMQ and MessagePack. I've followed the examples to program the server side, but I can't find any example that shows me how to program the client side.
I've found the class spyne.client.zeromq.ZeroMQClient, but I don't know what the 'app' parameter of its constructor is supposed to be.
Thank you in advance!
Edit:
The (simplified) server-side code:
from spyne.application import Application
from spyne.protocol.msgpack import MessagePackRpc
from spyne.server.zeromq import ZeroMQServer
from spyne.service import ServiceBase
from spyne.decorator import srpc
from spyne.model.primitive import Unicode

class RadianteRPC(ServiceBase):
    @srpc(_returns=Unicode)
    def whoiam():
        return "Hello I am Seldon!"

radiante_rpc = Application(
    [RadianteRPC],
    tns="radiante.rpc",
    in_protocol=MessagePackRpc(validator="soft"),
    out_protocol=MessagePackRpc()
)

s = ZeroMQServer(radiante_rpc, "tcp://127.0.0.1:5001")
s.serve_forever()
Spyne author here.
There are many issues with Spyne's client transports.
First and most important: they require server code to work. That's because Spyne's WSDL parser is only halfway done, so there is no way to communicate the interface the server exposes to a client.
Once the WSDL parser is done, Spyne's client transports will be revived as well. They do work (the tests pass), but they are slightly obsolete and, as you noticed, don't have proper docs.
Now back to your question: the app parameter to the client constructor is the same Application instance that goes to the server constructor. So if you do this:
from spyne.client.zeromq import ZeroMQClient

c = ZeroMQClient("tcp://127.0.0.1:5001", radiante_rpc)
print c.service.whoiam()
It will print "Hello I am Seldon!"
Here's the full code I just committed: https://github.com/arskom/spyne/tree/master/examples/zeromq
BUT:
All this said, you should not use ZeroMQ for RPC.
I looked at ZeroMQ for RPC purposes back when its hype was at crazy levels (I even got my name in the ZeroMQ contributors list :)). I did not like what I saw, and I moved on.
Pasting my relevant news.yc comment from https://news.ycombinator.com/item?id=6089252 here:
In my experience, ZeroMQ is very fragile in RPC-like applications,
especially because it tries to abstract away the "connection". This
mindset is very appropriate when you're doing multicast (and ZeroMQ
rocks when doing multicast), but for unicast, I actually want to
detect a disconnection or a connection failure and handle it
appropriately before my outgoing buffers are choked to death. So, I'd
evaluate other alternatives before settling on ZeroMQ as a transport
for internal RPC-type messaging.
If you are fine with having the whole message in memory before parsing
(or sending) it (Http is not that bad when it comes to transferring
huge documents over the network), writing raw MessagePack document to
a regular TCP stream (or tucking it inside a UDP datagram) will do the
trick just fine. MessagePack library does support parsing streams --
see e.g. its Python example in its homepage (http://msgpack.org).
Disclosure: I'm just a happy MessagePack (and sometimes ZeroMQ) user.
I work on Spyne (http://spyne.io) so I just have experience with some
of the most popular protocols out there.
I seem to have written that comment more than a year ago. Fast-forward to today: the MessagePack transport is implemented and released in Spyne 2.11. So if you're looking for a lightweight transport for internally passing small messages, my recommendation would be to use it instead of ZeroMQ.
However, once you're outside HTTP-land, you're back to dealing with sockets at the system level, which may or may not be what you want, depending especially on the amount of resources you have to spare for this bit of your project.
Sadly, there is no documentation about it besides the examples I just put together here: https://github.com/arskom/spyne/tree/master/examples/msgpack_transport
The server code is fairly standard Spyne/Twisted code but the client is using system-level sockets to illustrate how it's supposed to work. I'd happily accept a pull request wrapping it to a proper Spyne client transport.
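For illustration only (this is not the repository example), here is a minimal sketch of the "raw MessagePack over a plain TCP stream" idea from the quoted comment, using msgpack-python's streaming Unpacker; the address and message contents are made up:

import socket
import msgpack

# Sending side: write a raw MessagePack document straight onto the TCP stream.
sock = socket.create_connection(("localhost", 5002))
sock.sendall(msgpack.packb({"cmd": "whoiam"}))

# Receiving side: feed whatever arrives into a streaming Unpacker and pull
# out complete objects as they become available.
unpacker = msgpack.Unpacker()
while True:
    data = sock.recv(4096)
    if not data:
        break
    unpacker.feed(data)
    for obj in unpacker:
        print obj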
I hope this helps. Patches are welcome.
Best regards,
I want to build a chat application which supports text messaging, group messaging and file transfer (like NetMeeting). When I send data over a TCP socket, I see that it is not structured: everything is sent as a plain string. I want to send it in a structured way, with a few headers such as name, ip, data and data_type_flag (file or text message). One Stack Overflow member told me to use Telepathy, but I can't find a simple tutorial to understand it. How can I send structured data over a socket? Or can anyone suggest a good tutorial for implementing Telepathy properly? I want to communicate over the network peer-to-peer rather than through a dedicated server. Thanks
Try Google Protocol Buffers or Apache Thrift. There are many examples of how to use them.
As for your comment about "peer to peer", please realize that even in peer-to-peer one of the peers is always acting as a server (sometimes both are).
TCP is a transport-layer protocol, as opposed to an application-layer one. This means that TCP is not responsible for the type of data you send, only for the raw bits. HTTP has headers and other metadata because it is an application-level protocol.
For a project like the one you're describing, you will want to implement your own application-layer protocol, but this is not exactly a trivial task. I would look at the Python source code of the httplib module for an example of how to implement such a protocol, but note that it is likely still fairly different from what you want, as you will want persistent socket connections to be a first-class citizen in a peer-to-peer chat protocol like the one you're describing.
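As a rough sketch of how simple such a scheme can start out (this is not a standard protocol, just an illustration): serialize each message as JSON and prefix it with a 4-byte length so the receiver knows where one message ends and the next begins:

import json
import socket
import struct

def send_message(sock, message):
    # message is a dict such as {"name": ..., "ip": ..., "data": ..., "data_type_flag": ...}
    payload = json.dumps(message)
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exactly(sock, n):
    # Keep reading until exactly n bytes have arrived.
    chunks = ""
    while len(chunks) < n:
        chunk = sock.recv(n - len(chunks))
        if not chunk:
            raise EOFError("connection closed")
        chunks += chunk
    return chunks

def recv_message(sock):
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return json.loads(recv_exactly(sock, length))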
Another option is to use one of the various RPC libraries, e.g. xmlrpclib, which will handle a decent amount of the required low-level networking for you (although not file transfer; other libraries such as ftplib can do that).
Pickle your data before you send it, and unpickle it on the other end?
http://docs.python.org/library/pickle.html
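Roughly like this; you still need some framing (e.g. a length prefix) so the receiver knows where one pickle ends, and note that unpickling data from untrusted peers is unsafe:

import pickle

# Sender: turn the object into a byte string and write it to the socket.
payload = pickle.dumps({"name": "alice", "data_type_flag": "text", "data": "hi"})

# Receiver: rebuild the object from the bytes read off the socket.
obj = pickle.loads(payload)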
Should it be possible to send a plain, single HTTP POST request (not chunk-encoded) in more than one segment? I was thinking of using httplib.HTTPConnection and calling its send method more than once (and calling read on the response object after each send).
(Context: I'm collaborating to the design of a server that offers services analogous to transactions [series of interrelated requests-responses]. I'm looking for the simplest, most compatible HTTP representation.)
After being convinced by friends that this should be possible, I found a way to do it. I override httplib.HTTPResponse (n.b. httplib.HTTPConnection is nice enough to let you specify the response_class it will instantiate).
Looking at socket.py and httplib.py (especially _fileobject.read()), I noticed that read() only allows two things:
read an exact number of bytes (this returns immediately, even if the connection is not closed)
read all bytes until the connection is closed
I was able to extend this behavior and allow free streaming with just a few lines of code. I also had to set the will_close member of my HTTPResponse to 0.
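For reference, here is a rough sketch of the shape of that override, with the custom streaming read() logic omitted; the host, path and payload are made up:

import httplib

class StreamingResponse(httplib.HTTPResponse):
    def begin(self):
        httplib.HTTPResponse.begin(self)
        # Keep the connection usable instead of treating the response as
        # something that will close the socket when it is done.
        self.will_close = 0

conn = httplib.HTTPConnection("example.com")
conn.response_class = StreamingResponse    # httplib instantiates this class
conn.putrequest("POST", "/transaction")
conn.putheader("Content-Length", "1024")   # total size, sent in segments
conn.endheaders()
conn.send("first segment...")
# ... later, from the same connection:
conn.send("second segment...")
resp = conn.getresponse()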
I'd still be interested to hear if this is considered acceptable or abusive usage of HTTP.
I am looking for an abstract and clean way to exchange strings between two Python programs. The protocol is really simple: the client/server sends a string to the server/client, which takes the corresponding action - via a handler, I suppose - and replies (or not) to the other side with another string. Strings can be three things: an acknowledgement, signalling one side that the other is still alive; a pickled class containing a command, if going from the "client" to the "server", or a response, if going from the "server" to the "client"; and finally a "lock" command, which signals one side of the conversation that the other is working and that no further questions should be asked until another lock packet is received.
I have been looking at Python's built-in SocketServer.TCPServer, but it's way too low-level: it does not easily support reconnection, and the client has to use the socket interface directly, which I would prefer to be encapsulated.
I then explored the Twisted framework, particularly the LineOnlyReceiver protocol and the server examples, but I found the initial learning curve too steep, the online documentation assuming a little too much knowledge, and a general lack of examples and good documentation (except for the 2005 O'Reilly book - is that still valid?).
I then tried the pyliblo library, which would be perfect for the task, but alas it is unidirectional: there is no way to "answer" a client, and I need the answer to be associated with the specific command.
So my question is: is there an existing framework/library/module that gives me a client object in the server, to read commands from and send replies to, and a server object in the client, to read replies from and send commands to, which I can use after a simple setup (client: the server address is host:port; server: you are listening on port X), with the underlying socket, reconnection handling and so on taken care of?
Thanks in advance for any answer (pardon my English and inexperience, this is my first question).
Python also provides an asynchat module that simplifies much of the server/client behaviour common to chat-like communications.
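Here is a minimal sketch of what an asynchat-based server could look like, assuming newline-terminated string messages; the port and the echo behaviour are made up for the example:

import asyncore
import asynchat
import socket

class ChatHandler(asynchat.async_chat):
    def __init__(self, sock):
        asynchat.async_chat.__init__(self, sock)
        self.set_terminator("\n")   # messages end with a newline
        self.buffer = []

    def collect_incoming_data(self, data):
        self.buffer.append(data)

    def found_terminator(self):
        # A full line has arrived; handle it and reply.
        line = "".join(self.buffer)
        self.buffer = []
        self.push("echo: %s\n" % line)

class ChatServer(asyncore.dispatcher):
    def __init__(self, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.set_reuse_addr()
        self.bind(("", port))
        self.listen(5)

    def handle_accept(self):
        pair = self.accept()
        if pair is not None:
            ChatHandler(pair[0])

ChatServer(9000)
asyncore.loop()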
What you want to do seems a lot like RPC, so the things that come to my mind are XML-RPC, or JSON-RPC if you don't want to use XML.
Python has an XML-RPC library (xmlrpclib) that you can use; it uses HTTP as the transport, so it also addresses your concern about working at too low a level. However, if you could provide more detail about what exactly you want to do, perhaps we can give a better solution.
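For instance, a bare-bones sketch with the standard library (the service name, port and handler are made up):

# Server side: expose a function over XML-RPC; HTTP transport is handled for you.
from SimpleXMLRPCServer import SimpleXMLRPCServer

def execute(command):
    # Placeholder handler: dispatch on the command string and return a reply.
    return "ack: " + command

server = SimpleXMLRPCServer(("localhost", 8000))
server.register_function(execute, "execute")
server.serve_forever()

# Client side (run in a separate process):
# import xmlrpclib
# proxy = xmlrpclib.ServerProxy("http://localhost:8000/")
# print proxy.execute("status")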
Is it possible with RabbitMQ and Python to do content-based routing?
The AMQP standard and RabbitMQ claim to support content-based routing, but are there any libraries for Python which support specifying content-based bindings, etc.?
The library I am currently using (py-amqplib http://barryp.org/software/py-amqplib/) seems to only support topic-based routing with simple pattern-matching (#, *).
The answer is "yes", but there's more to it... :)
Let's first agree on what content-based routing means. There are two possible meanings. Some people say that it is based on the header portion of a message. Others say it's based on the data portion of a message.
If we take the first definition, these are more or less the assumptions we make:
The data comes into existence somewhere, and it gets sent to the AMQP broker by some piece of software. We assume that this piece of software knows enough about the data to put key-value (KV) pairs in the header of the message that describe the content. Ideally, the sender is also the producer of the data, so it has as much information as we could ever want. Let's say the data is an image. We could then have the sender put KV pairs in the message header like this:
width=1024
height=768
mode=bw
photographer=John Doe
Now we can implement content-based routing by creating appropriate queues. Let's say we have a separate operation to perform on black-and-white images and a separate one on colour images. We can create two queues, one that receives messages with mode=bw and another with mode=colour. Then we have separate clients listening on those queues. The broker performs the routing, and there is nothing in our client that needs to be aware of the routing.
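For illustration, here is roughly what such a setup could look like with py-amqplib and a headers exchange; the exchange, queue and header names are made up, and it assumes queue_bind accepts an arguments dict (AMQP's queue.bind carries one):

from amqplib import client_0_8 as amqp

conn = amqp.Connection(host="localhost:5672", userid="guest", password="guest")
chan = conn.channel()

# A headers exchange routes on the message headers instead of the routing key.
chan.exchange_declare(exchange="images", type="headers", durable=False, auto_delete=False)
chan.queue_declare(queue="bw_images")
chan.queue_bind(queue="bw_images", exchange="images",
                arguments={"x-match": "all", "mode": "bw"})

# Publish a message whose headers describe the content.
msg = amqp.Message("...image bytes...",
                   application_headers={"mode": "bw", "width": 1024, "height": 768})
chan.basic_publish(msg, exchange="images")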
If we take the second definition, we start from different assumptions. We assume that the data comes into existence somewhere, and that it gets sent to the AMQP broker by some piece of software. But we assume that it's not sensible to demand that that software populate the header with KV pairs. Instead, we want to make a routing decision based on the data itself.
There are two options for this in AMQP: you can decide to implement a new exchange for your particular data format, or you can delegate the routing to a client.
In RabbitMQ, there are direct (1-to-1), fanout (1-to-N), headers (header-filtered 1-to-N) and topic (topic-filtered 1-to-N) exchanges, but you can implement your own according to the AMQP standard. This would require reading a lot of RabbitMQ documentation and implementing the exchange in Erlang.
The other option is to make an AMQP client in Python that listens to a special "content routing queue". Whenever a message arrives at the queue, your router-client picks it up, does whatever is needed to make a routing decision, and sends the message back to the broker to a suitable queue. So to implement the scenario above, your Python program would detect whether an image is in black-and-white or colour, and would (re)send it to a "black-and-white" or a "colour" queue, where some suitable client would take over.
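A rough sketch of such a router-client with py-amqplib; is_black_and_white() is a placeholder for whatever inspection logic you need, and the queue names are made up:

from amqplib import client_0_8 as amqp

def route(msg):
    # Inspect the body, decide, and re-publish via the default exchange
    # (whose routing key equals the target queue name).
    target = "bw_images" if is_black_and_white(msg.body) else "colour_images"
    chan.basic_publish(amqp.Message(msg.body), exchange="", routing_key=target)

conn = amqp.Connection(host="localhost:5672", userid="guest", password="guest")
chan = conn.channel()
chan.basic_consume(queue="content_routing_queue", callback=route, no_ack=True)
while True:
    chan.wait()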
So, on your second question: there's really nothing you do in your client that does any content-based binding. Either your client(s) work as described above, or you create a new exchange type in RabbitMQ itself and then, in your client setup code, declare the exchange to be of your new type.
Hope this answers your question!
In RabbitMQ, routing is the process by which an exchange decides which queues to place your message on. You publish all messages to an exchange, but you only receive messages from a queue. This means that the exchange is an active part of the process that makes some decisions about message forwarding or copying.
The topic exchange included with RabbitMQ looks at a string on the incoming messages (the routing_key) and matches that with the patterns (the binding_keys) supplied by all queues which declare their desire to receive messages from the exchange.
RabbitMQ source code is on the web so you can have a look at the topic exchange code here:
http://hg.rabbitmq.com/rabbitmq-server/file/9b22dde04c9f/src/rabbit_exchange_type_topic.erl
A lot of the complexity there is to handle a data structure called a trie which allows for very fast lookups. In fact the same data structure is used inside Internet routers.
The headers exchange found here http://hg.rabbitmq.com/rabbitmq-server/file/9b22dde04c9f/src/rabbit_exchange_type_headers.erl
is probably easier to understand. As you can see, there is not a lot of code required to make a different type of exchange. If you wanted to examine the content (or maybe just peek at the first few bytes of a message), you should be able to quickly identify XML versus JSON versus something else. And if your JSON objects and XML documents maintain a specific sequence of elements, then you should be able to distinguish between different JSON objects (or XML document types) without parsing the entire message body.
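Peeking at the first non-whitespace byte is often enough for that; a trivial sketch:

def guess_format(body):
    # Tell XML from JSON without parsing the whole message body.
    head = body.lstrip()[:1]
    if head == "<":
        return "xml"
    if head in ("{", "["):
        return "json"
    return "unknown"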