Send strings over the network - python

Here's a simple question and I'm surprised I haven't come across a similar one already.
I would like two processes to send strings (messages) to each other with send() and receive() functions. Here's a basic example:
# Process 1
# ... deal with sockets, connect to process 2 ...
msg = 'An arbitrarily long string\nMaybe with line breaks'
conn.send(msg)
msg = conn.receive()
if process1(msg):
    conn.send('ok')
else:
    conn.send('nok')
and
# Process 2
# ... deal with sockets, connect to process 1 ...
msg = conn.receive()
conn.send(process2(msg))
msg = conn.receive()
if msg == 'ok':
    print('Success')
elif msg == 'nok':
    print('Failure')
else:
    print('Protocol error')
I know it is quite easy with bare stream sockets, but that's still cumbersome and error-prone (calling conn.recv() several times inside a loop and checking for a size, as in HTTP, or for an end-of-stream marker, as in SMTP, etc.).
By the way, it doesn't necessarily need to use sockets, as long as messages of any size can be reliably carried through the network in an efficient manner.
Am I doing something wrong? Isn't there a simple library (Twisted AMP doesn't look simple) doing exactly that? I've been searching the Internet for a few hours without success :)

You can use ZeroMQ, there is an excellent Python binding called pyzmq.
It is a library for writing all kinds of distributed software, based on the
concept of message queues.
The project got a lot of hype lately, and you will find numerous examples and
tutorials on the web.

Related

I need advice should I use select or threading?

I'm building a live radio streamer, and I was wondering how I should handle multiple connections. Now from my experience select will block the audio from being streamed. It only plays 3 seconds then stops playing. I will provide an example of what I mean.
import socket, select
headers = """
HTTP/1.0 200 OK\n
Content-Type: audio/mpeg\n
Connection: keep-alive\n
\n\n
"""
file="/path/to/file.mp3"
bufsize=4096 # actually have no idea what this should be but python-shout uses this amount
sock = socket.socket()
cons = list()
buf = 0
nbuf = 0
def runMe():
    cons.append(sock)
    f = open(file, "rb")  # separate name so the path variable is not shadowed
    nbuf = f.read(bufsize) # current buffer
    while True:
        buf = nbuf
        nbuf = f.read(bufsize)
        if len(buf) == 0:
            break
        rl, wl, xl = select.select(cons, [], [], 0.2)
        for s in rl:
            if s == sock:
                con, addr = s.accept()
                con.setblocking(0)
                cons.append(con)
                con.send(headers)
            else:
                data = s.recv(1024)
                if not data:
                    s.close()
                    cons.remove(s)
                else:
                    s.send(buf)
That is an example of how I'd use select. But the song will not play all the way through. If I send outside the select loop it'll play, but it'll die on a 2nd connection. Should I use threading?
You can do it either way, but if your select-implementation isn't working properly it's because your code is incorrect, not because a select-based implementation isn't capable of doing the job -- and I don't think a multithreaded solution will be easier to get right than a select-based solution.
Regardless of which implementation you choose, one issue you're going to have to think about is timing/throughput. Do you want your program to send out the audio data at approximately the same rate it is meant to be played back, or do you want to send out audio data as fast as the client is willing to read it, and leave it up to the client to read the data at the appropriate speed? Keep in mind that each TCP stream's send-rate will be different, depending on how fast the client chooses to recv() the information, as well as on how well the network path between your server and the client performs.
The next problem to deal with after that is the problem of a slow client -- what do you want your program to do when one of the TCP connections is very slow, e.g. due to network congestion? Right now your code just blindly calls send() on all sockets without checking the return value, which (given that the sockets are non-blocking) means that if a given socket's output-buffer is full, then some (probably many) bytes of the file will simply get dropped -- maybe that is okay for your purpose, I don't know. Will the clients be able to make use of an mp3 data stream that has arbitrary sections missing? I imagine that the person running that client will hear glitches, at best.
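One way to avoid silently dropping bytes is to keep a per-client buffer of unsent data and discard only what send() reports as written. The following is a minimal sketch of that idea, not the asker's code: the class name OutBuffer, its limit parameter, and the policy of refusing data for clients that fall too far behind are all assumptions made for illustration.

```python
import socket


class OutBuffer:
    """Pending bytes for one non-blocking client socket (hypothetical helper)."""

    def __init__(self, sock, limit=1 << 20):
        self.sock = sock
        self.pending = bytearray()
        self.limit = limit  # assumed cap before the client counts as too slow

    def queue(self, data):
        """Append data; return False if the client has fallen too far behind."""
        if len(self.pending) + len(data) > self.limit:
            return False
        self.pending += data
        return True

    def flush(self):
        """Send as much as the kernel accepts and keep the rest for later.

        Returns the number of bytes still waiting to be sent.
        """
        while self.pending:
            try:
                sent = self.sock.send(self.pending)
            except BlockingIOError:
                break  # output buffer is full; retry once the socket is writable
            del self.pending[:sent]
        return len(self.pending)
```

A server would call queue() when new audio data is read and flush() whenever select reports the socket as writable. What to do when queue() returns False (drop the client, skip data) is the policy decision the paragraph above describes.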
Implementation issues aside, if it was me I'd prefer the single-threaded/select() approach, simply because it will be easier to test and validate. Either approach is going to take some doing to get right, but with a single thread, your program's behavior is much more deterministic -- either it works right or it doesn't, and running a given test will generally give the same result each time (assuming consistent network conditions). In a multithreaded program, OTOH, the scheduling of the threads is non-deterministic, which makes it very easy to end up with a program that works correctly 99.99% of the time and then seriously malfunctions, but only once in a blue moon -- a situation that can be very difficult to debug, as you end up spending hours or days just reproducing the fault, let alone diagnosing and fixing it.

How does the select() function in the select module of Python exactly work?

I am working on writing a network-oriented application in Python. I had earlier worked on using blocking sockets, but after a better understanding of the requirement and concepts, I am wanting to write the application using non-blocking sockets and thus an event-driven server.
I understand that the functions in the select module in Python are to be used to conveniently see which socket interests us and so forth. Towards that I was basically trying to flip through a couple of examples of an event-driven server and I had come across this one:
"""
An echo server that uses select to handle multiple clients at a time.
Entering any line of input at the terminal will exit the server.
"""
import select
import socket
import sys
host = ''
port = 50000
backlog = 5
size = 1024
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((host,port))
server.listen(backlog)
input = [server,sys.stdin]
running = 1
while running:
    inputready,outputready,exceptready = select.select(input,[],[])
    for s in inputready:
        if s == server:
            # handle the server socket
            client, address = server.accept()
            input.append(client)
        elif s == sys.stdin:
            # handle standard input
            junk = sys.stdin.readline()
            running = 0
        else:
            # handle all other sockets
            data = s.recv(size)
            if data:
                s.send(data)
            else:
                s.close()
                input.remove(s)
server.close()
The parts that I didn't seem to understand are the following:
In the code snippet inputready,outputready,exceptready = select.select(input,[],[]), I believe the select() function returns three possibly empty lists of waitable objects for input, output and exceptional conditions. So it makes sense that the first argument to the select() function is the list containing the server socket and the stdin. However, where I face confusion is in the else block of the code.
Since we are for-looping over the list of inputready sockets, it is clear that the select() function will choose a client socket that is ready to be read. However, after we read data using recv() and find that the socket has actually sent data, we would want to echo it back to the client. My question is how can we write to this socket without adding it to the list passed as second argument to the select() function call? Meaning, how can we call send() on the new socket directly without 'registering' it with select() as a writable socket?
Also, why do we loop only over the sockets ready to be read (inputready in this case)? Isn't it necessary to loop over even the outputready list to see which sockets are ready to be written?
Obviously, I am missing something here.
It would also be really helpful if somebody could explain in a little more detailed fashion the working of select() function or point to good documentation.
Thank you.
That snippet is probably just a simple example, so it is not exhaustive. You are free to write to and read from any socket, even if select does not tell you that it is ready. But, of course, if you do this you cannot be sure that your send() won't block.
So, yes, it would be best practice to rely on select for write operations as well.
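As a rough illustration of relying on select for writes too, the sketch below performs one select pass over already-connected client sockets and only sends on sockets that select reports as writable. The function name echo_step and the pending dictionary are assumptions introduced for this example; they are not part of the select module or of the snippet in the question.

```python
import select
import socket


def echo_step(clients, pending, timeout=0.0):
    """Run one select() pass over connected client sockets.

    clients: list of connected sockets
    pending: dict mapping socket -> bytes waiting to be echoed back
    """
    # Only ask about writability for sockets that actually have queued data.
    want_write = [s for s in clients if pending.get(s)]
    readable, writable, _ = select.select(clients, want_write, [], timeout)

    for s in readable:
        data = s.recv(1024)
        if data:
            # Queue the data instead of sending it straight away.
            pending[s] = pending.get(s, b"") + data
        else:
            # An empty read means the peer closed the connection.
            clients.remove(s)
            pending.pop(s, None)
            s.close()

    for s in writable:
        if s in pending:
            sent = s.send(pending[s])
            pending[s] = pending[s][sent:]  # keep whatever was not sent
```

Registering a socket in the write list only while it has queued data matters: an idle socket is almost always writable, so listing every socket there would make select return immediately on each call.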
There are also many other functions with a similar purpose, and in many cases they are better than select (e.g. epoll), but they are not available on all platforms.
Information about select, epoll, and other such functions can be found in the Linux man pages.
However, in Python there are many nice libraries for handling many connections, such as Twisted and gevent.

Simplest python network messaging

I have a machine control system in Python that currently looks roughly like this
goal = GoalState()
while True:
    current = get_current_state()
    move_toward_goal(current,goal)
Now, I'm trying to add in the ability to control the machine over the network. The code I want to write would be something like this:
goal = GoalState()
while True:
    if message_over_network():
        goal = new_goal_from_message()
    current = get_current_state()
    move_toward_goal(current,goal)
What would be the simplest and most Pythonic way of adding this sort of networking capability into my application? Sockets could work, though they don't particularly feel Pythonic. I've looked at XMLRPC and Twisted, but both seemed like they would require major revisions to the code. I also looked at ØMQ, but it felt like I was adding an external dependency that didn't offer anything that I didn't already have with sockets.
I'm not opposed to using any of the systems that I've addressed above, as what I believe to be failings are probably misunderstandings on my part. I'm simply curious as to the idiomatic way of handling this simple, common issue.
There are at least two issues you need to decide on:
1. How to exchange messages?
2. In what format?
Regarding 1: TCP sockets are the lowest level, and you would need to deal with low-level details such as recognizing message boundaries. Also, a TCP connection gives you reliable delivery only as long as the connection is not reset (for example, by a temporary network failure). If you want your application to recover gracefully when a TCP connection resets, you need to implement some form of message acknowledgement to keep track of what needs to be resent over the new connection. 0MQ gives you a higher level of abstraction than a plain TCP connection: you deal with whole messages rather than a stream of bytes. It still does not give you reliable delivery (messages can get lost), but it offers several communication patterns that can be used to ensure reliable delivery. 0MQ is also highly performant; IMO it is a good choice.
Regarding 2: if interoperability with other languages is not needed, Pickle is a very convenient and Pythonic choice. If interoperability is needed, you can consider JSON or, if performance is an issue, a binary format such as Google Protocol Buffers. This last choice would require the most work (you'll need to define message formats in .proto files) and would definitely not feel Pythonic.
Here is what exchanging messages (any serializable Python object) over a plain socket can look like:
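import pickle
import struct
def send(sockfd, message):
    # Serialize the object, then send its length followed by the payload
    string_message = pickle.dumps(message)
    write_int(sockfd, len(string_message))
    write(sockfd, string_message)
def write_int(sockfd, integer):
    # 4-byte big-endian signed integer
    integer_buf = struct.pack('>i', integer)
    write(sockfd, integer_buf)
def write(sockfd, data):
    # send() may accept only part of the data, so loop until all of it is sent
    data_len = len(data)
    offset = 0
    while offset != data_len:
        offset += sockfd.send(data[offset:])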
Not bad, but as you can see having to deal with serialization of a message length is quite low level.
And to receive such a message:
def receive(self):
    # Method of a connection class that stores the socket as self.sockfd
    message_size = read_int(self.sockfd)
    if message_size is None:
        return None
    data = read(self.sockfd, message_size)
    if data is None:
        return None
    message = pickle.loads(data)
    return message
def read_int(sockfd):
    int_size = struct.calcsize('>i')
    intbuf = read(sockfd, int_size)
    if intbuf is None:
        return None
    return struct.unpack('>i', intbuf)[0]
def read(sockfd, size):
    # recv() may return fewer bytes than requested, so loop until size is reached
    data = b""
    while len(data) != size:
        newdata = sockfd.recv(size - len(data))
        if len(newdata) == 0:
            return None  # peer closed the connection mid-message
        data = data + newdata
    return data
But this does not gracefully deal with errors (no attempt to determine which messages were delivered successfully).
If you're familiar with sockets, I would consider SocketServer.UDPServer (see http://docs.python.org/library/socketserver.html#socketserver-udpserver-example). UDP is definitely the simplest messaging system but, obviously, you'll have to deal with the fact that some messages can be lost, duplicated, or delivered out of order. If your protocol is very simple, that is relatively easy to handle. The advantage is that you need neither additional threads nor external dependencies. It also might be a very good choice if your application has no concept of a session.
This might be good for a start, but there are many more details to consider that are not included in your question. I also wouldn't worry about sockets not being very Pythonic. In the end you're going to use sockets anyway; someone will just wrap them for you, and you'll be forced to learn the framework, which in the best case may be overwhelming for your requirements.
(Please note my opinion is highly biased, as I love dealing with raw sockets.)
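To make the UDP suggestion concrete, here is a minimal sketch of how a non-blocking UDP socket could stand in for the question's message_over_network() check without extra threads. It uses the plain socket module rather than SocketServer.UDPServer, and the function names make_command_socket and poll_message, the loopback address, and the "keep only the newest datagram" policy are assumptions made for illustration.

```python
import socket


def make_command_socket(host="127.0.0.1", port=0):
    """Create a non-blocking UDP socket (port 0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.setblocking(False)
    return sock


def poll_message(sock, max_size=4096):
    """Return the newest pending datagram, or None if nothing has arrived.

    Older queued datagrams are discarded, on the assumption that only the
    most recent goal matters to the control loop.
    """
    latest = None
    while True:
        try:
            data, _addr = sock.recvfrom(max_size)
        except BlockingIOError:
            return latest  # nothing (more) waiting; never blocks the loop
        latest = data
```

Inside the control loop, the call would be something like msg = poll_message(sock), followed by updating goal only when msg is not None. How the bytes are decoded into a goal is left open here, as it is in the question.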

Twisted Socket Send Message Immediately

I am making an iPhone application that communicates to a Twisted socket and it works great when I have one message to send. However, my issue is I need to send many different bits of information to the app. Here is my code.
if numrows == 1:
    #Did login
    msg = "%s: Login Credentials Success" % _UDID
    print msg
    for c in self.factory.clients:
        c.message(msg)
    time.sleep(0.5)
    for result in results:
        for i in range(1, 6):
            msg = "%s:L%d;%s" % (_UDID, i, result[i])
            print msg
            for c in self.factory.clients:
                c.message(msg)
            time.sleep(0.5)
else:
    msg = "%s: Login Credentials Failed" % _UDID
    print msg
    for c in self.factory.clients:
        c.message(msg)
    time.sleep(0.5)
cursor.close()
database.close()
#print msg
#for c in self.factory.clients:
#c.message(msg)
def message(self, message):
    self.transport.write(message)
If I sent only the first msg (and removed every later msg along with the print and for loop below it), the message Login Credentials Success would reach the app. But with the rest in place, as shown, nothing goes through, because everything is sent at once, even with the time.sleep calls in the code.
The app checks for a response every .05 seconds or less. Even though the login credentials message is sent first, it doesn't go through because more data is sent after it; without everything after the credentials message, it would go through.
I am desperate to find the answer to this. I've tried everything I can think of. The app is not the issue, it's the Python.
Thanks.
At a risk of offending by contradicting, you may want to reexamine the claim that your app is not the problem. It sounds like you are expecting to have complete control over the content of each outgoing TCP packet, and that your app is depending on packet boundaries to determine the message boundaries. This isn't a very good approach to networking in general; some network intermediaries may split up (fragment) or even combine packets along the way, which would confuse your app. Remember that TCP is not a packet protocol: it is a stream protocol, so you're only really guaranteed that the octets you sent will be received in order, if at all.
A common approach to dealing with messages of varying size is to prefix each message with an X-bit big-endian size field stating how large the following message is. The receiving end of the communication reads X bits first, then reads 'size' octets after that to get the full message (blocking until that point, and leaving any additional information in the buffer for the next message handler to get).
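The size-prefix approach described above can be sketched as follows. This is only an illustration under stated assumptions: the 4-byte unsigned header width and the function names send_msg, recv_exact, and recv_msg are choices made for this example, and it uses blocking sockets rather than Twisted.

```python
import socket
import struct

# 4-byte unsigned big-endian length prefix (the width is an assumption)
HEADER = struct.Struct(">I")


def send_msg(sock, payload):
    """Send one message: the length header followed by the payload bytes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, or return None if the peer closes first."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            return None
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Read one length-prefixed message, or None at end of stream."""
    header = recv_exact(sock, HEADER.size)
    if header is None:
        return None
    (length,) = HEADER.unpack(header)
    return recv_exact(sock, length)
```

Because the receiver never reads past the announced length, two messages written back to back are still delivered as two separate messages, however TCP happens to split or merge them in transit.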
I wouldn't mind helping more with your Twisted code, but it may very well be that it's already written properly. At least, I recommend not depending on trying to make Twisted flush network write buffers immediately. While it may help make your code appear to work right, right now, it will really just be hiding problems that will surface later.
Your issue appears to be that Twisted buffers the data you write to it.
Apparently, there is no easy way to force the data to be sent on its own without refactoring a great deal of your code. See Twisted transport.write and Using Twisted's twisted.web classes, how do I flush my outgoing buffers?.
Without knowing what your code looks like before the snippet you pasted, and going by the accepted answer of the last link:
1. Define the wait function just as it is in the accepted answer.
2. Make sure that your class inherits from http.Request.
3. Decorate your message method with defer.inlineCallbacks.
4. Add yield wait(N) (for a value of N that you have to test and determine) after the calls to write in the message method.
I don't have enough experience with Twisted to know which of those steps are needed and which are just cruft from the original answer's code that doesn't apply to yours.
It may be possible (and easier) though, to re-write the iPhone application to accept the kind of messages that get sent when they include multiple writes in a single message.

Reading socket buffer using asyncore

I'm new to Python (I have been programming in Java for multiple years now though), and I am working on a simple socket-based networking application (just for fun). The idea is that my code connects to a remote TCP end-point and then listens for any data being pushed from the server to the client, and perform some parsing on this.
The data being pushed from server -> client is UTF-8 encoded text, and each line is delimited by CRLF (\x0D\x0A). You probably guessed: the idea is that the client connects to the server (until cancelled by the user), and then reads and parses the lines as they come in.
I've managed to get this to work, however, I'm not sure that I'm doing this quite the right way. So hence my actual questions (code to follow):
Is this the right way to do it in Python (ie. is it really this simple)?
Any tips/tricks/useful resources (apart from the reference documentation) regarding buffers/asyncore?
Currently, the data is being read and buffered as follows:
def handle_read(self):
    self.ibuffer = b""
    while True:
        self.ibuffer += self.recv(self.buffer_size)
        if ByteUtils.ends_with_crlf(self.ibuffer):
            self.logger.debug("Got full line including CRLF")
            break
        else:
            self.logger.debug("Buffer not full yet (%s)", self.ibuffer)
    self.logger.debug("Filled up the buffer with line")
    print(str(self.ibuffer, encoding="UTF-8"))
The ByteUtils.ends_with_crlf function simply checks the last two bytes of the buffer for \x0D\x0A. The first question is the main one (answer is based on this), but any other ideas/tips are appreciated. Thanks.
TCP is a stream, and you are not guaranteed that your buffer will not contain the end of one message and the beginning of the next.
So checking for \r\n only at the end of the buffer will not work as expected in all situations. You have to scan the buffered data for the delimiter wherever it occurs.
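A minimal sketch of that idea, independent of asyncore: keep whatever has been received in a buffer, split off every complete line, and carry the incomplete tail over to the next read. The class name LineBuffer and its feed method are hypothetical names introduced for this example.

```python
class LineBuffer:
    """Accumulate stream data and split it into delimiter-terminated lines."""

    def __init__(self, delimiter=b"\r\n"):
        self.delimiter = delimiter
        self.buffer = b""

    def feed(self, data):
        """Add newly received bytes; return the list of complete lines.

        Any trailing partial line stays in self.buffer until more data arrives.
        """
        self.buffer += data
        # split() always yields at least one element; the last one is the
        # (possibly empty) incomplete remainder.
        *lines, self.buffer = self.buffer.split(self.delimiter)
        return lines
```

In the question's handler, handle_read could call self.recv() once, pass the result to feed(), and decode each returned line as UTF-8. That handles a delimiter split across two reads as well as several lines arriving in a single read.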
And, I would strongly recommend that you use Twisted instead of asyncore.
Something like this (from memory, might not work out of the box):
from twisted.internet import reactor, protocol
from twisted.protocols.basic import LineReceiver
class MyHandler(LineReceiver):
    def lineReceived(self, line):
        print("Got line: %r" % (line,))
f = protocol.ClientFactory()
f.protocol = MyHandler
reactor.connectTCP("127.0.0.1", 4711, f)
reactor.run()
It's even simpler -- look at asynchat and its set_terminator method (and other helpful tidbits in that module). Twisted is orders of magnitude richer and more powerful, but, for sufficiently simple tasks, asyncore and asynchat (which are designed to interoperate smoothly) are indeed very simple to use, as you've started observing.
