Handling websockets in Python

I am developing a messaging service that uses websockets, with Python/Django on the server side. There are options such as:
Tornado
django-websockets-redis
Crossbar.io
Flask-SocketIO
I am confused about which of these I should use in a production environment where the number of active connections is large.

Websockets in Tornado are relatively straightforward. This example shows how to integrate websockets with extremely basic connection management (the open and on_close methods).
For upstream traffic (browser -> server), you can implement the WebSocketHandler methods:
def on_message(self, message):
    # Called once per message the browser sends over the socket.
    pass

def data_received(self, chunk):
    # Called with each chunk of streamed request data.
    pass
For downstream traffic, there's WebSocketHandler.write_message:
def broadcast_to_all_websockets(self, message):
    # cl is the list of connected WebSocketHandler instances.
    # Iterate over a copy so stale sockets can be removed safely.
    for ws in list(cl):
        if not ws.ws_connection.stream.socket:
            print("Web socket %s does not exist anymore!" % ws)
            cl.remove(ws)
        else:
            ws.write_message(message)

I highly suggest Autobahn|Python. I'm currently using it for a WebSocket project in Python; it's easy to work with and has a lot of classes already built for you, such as a WebSocket server. It also lets you choose your implementation (asyncio or Twisted).

Related

How to handle coroutine when using tornado websockets

I'm using tornado websockets and it works fine.
However, I'd like to listen for changes to a MongoDB Collection and send new changes to the websocket client.
I cannot get it running with threads, and I saw that using threads with tornado is discouraged.
I'm really stuck right now. How can I proceed?
My (blocking) code right now:
def open(self):
    print("Opening Connection")
    with self.collection.watch() as stream:
        for change in stream:
            doc = change["fullDocument"]
            self.write_message(u"%s" % json.dumps(doc))
It looks like Motor can handle MongoDB change streams: https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_change_stream.html
Personally, I find RethinkDB or Firebase better alternatives for realtime features like this, but without knowing your needs I cannot say whether that is a good fit for you.
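If you want to keep the blocking PyMongo driver instead of switching to Motor, another option is to run the watch loop in a worker thread and hand each change to the event loop through an asyncio.Queue. A minimal sketch of the pattern, assuming recent Tornado where the IOLoop wraps the asyncio loop (pump, consume, and handle are illustrative names; the websocket's write_message would play the role of handle):

```python
import asyncio
import threading

def pump(blocking_iter, loop, changes):
    # Worker thread: drain the blocking iterator (e.g. collection.watch())
    # and hand each item to the event loop thread-safely.
    for item in blocking_iter:
        loop.call_soon_threadsafe(changes.put_nowait, item)
    loop.call_soon_threadsafe(changes.put_nowait, None)  # sentinel: stream ended

async def consume(blocking_iter, handle):
    # Event-loop side: await items as they arrive and forward them,
    # e.g. to WebSocketHandler.write_message.
    loop = asyncio.get_running_loop()
    changes = asyncio.Queue()
    threading.Thread(target=pump, args=(blocking_iter, loop, changes),
                     daemon=True).start()
    while (item := await changes.get()) is not None:
        await handle(item)
```

The key point is that the thread never touches the loop directly; call_soon_threadsafe is the only safe way to cross the boundary, which is why plain threads plus write_message are discouraged.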

Does asyncio from python support coroutine-based API for UDP networking?

I was browsing the Python asyncio module documentation tonight, looking for ideas for one of my course projects, and I soon found what looks like a missing feature in Python's standard asyncio module.
If you look through the documentation, you'll find that there is a callback-based API and a coroutine-based API. The callback API can be used to build both UDP and TCP applications, while the coroutine API can apparently only be used to build TCP applications, as it relies on a stream-style abstraction.
This is a problem for me because I was looking for a coroutine-based API for UDP networking. I did find that asyncio supports low-level coroutine-based socket methods like sock_recv and sock_sendall, but the crucial APIs for UDP networking, recvfrom and sendto, are not there.
What I wish to do is write code like this:
async def handle_income_packet(sock):
    data, addr = await sock.recvfrom(4096)
    # data handling here...
    await sock.sendto(response, addr)
I know that this could be implemented equivalently with the callback API, but the problem is that callbacks are regular functions, not coroutines, so inside one you cannot yield control back to the event loop and preserve the function's execution state.
Just look at the code above: if we need to do some blocking IO in the data-handling part, the coroutine version has no problem, as long as our IO operations are themselves done in coroutines:
async def handle_income_packet(sock):
    data, addr = await sock.recvfrom(4096)
    async with aiohttp.ClientSession() as session:
        info = await session.get(...)
    response = generate_response_from_info(info)
    await sock.sendto(response, addr)
As long as we use await, the event loop takes the control flow from that point and handles other things until the IO is done. But sadly this code is not usable at the moment, because asyncio has no coroutine version of socket.sendto or socket.recvfrom.
What we could do instead is use the transport-protocol callback API:
class EchoServerClientProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        peername = transport.get_extra_info('peername')
        self.transport = transport

    def data_received(self, data):
        info = requests.get(...)
        response = generate_response_from_info(info)
        self.transport.write(response)
        self.transport.close()
We cannot await a coroutine there, because callbacks are not coroutines, and using a blocking IO call like the one above stalls the control flow inside the callback and prevents the loop from handling any other events until the IO is done.
Another recommended implementation idea is to create a Future object in data_received, add it to the event loop, store any needed state variables on the Protocol class, and explicitly return control to the loop. While this works, it creates a lot of complex code that simply isn't needed in the coroutine version.
There is also an example of using a non-blocking socket with add_reader to handle UDP sockets, but that code still looks complex compared to the coroutine version's few lines.
The point I want to make is that coroutines are a really good design: they harness the power of concurrency in a single thread while also offering a straightforward programming model that saves both brainpower and unnecessary lines of code. But the crucial piece needed to make them work for UDP networking is missing from the asyncio standard library.
What do you guys think about this?
Also, if there are any suggestions for third-party libraries supporting this kind of API for UDP networking, I would be extremely grateful, for the sake of my course project. I found that Bluelet is quite close to such a thing, but it does not seem to be actively maintained.
edit:
It seems that this PR did implement the feature but was rejected by the asyncio developers. They claim that everything can be implemented with create_datagram_endpoint(), the transport-protocol API. But as I discussed above, the coroutine API has the virtue of simplicity over the callback API in many use cases; it is really unfortunate that we do not have it for UDP.
The reason a stream-based API is not provided is because streams offer ordering on top of the callbacks, and UDP communication is inherently unordered, so the two are fundamentally incompatible.
But none of that means you can't invoke coroutines from your callbacks - it's in fact quite easy! Starting from the EchoServerProtocol example, you can do this:
def datagram_received(self, data, addr):
    loop = asyncio.get_event_loop()
    loop.create_task(self.handle_income_packet(data, addr))

async def handle_income_packet(self, data, addr):
    # echo back the message, but 2 seconds later
    await asyncio.sleep(2)
    self.transport.sendto(data, addr)
Here datagram_received starts your handle_income_packet coroutine which is free to await any number of coroutines. Since the coroutine runs in the "background", the event loop is not blocked at any point and datagram_received returns immediately, just as intended.
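To make that pattern concrete, here is a self-contained sketch you can run as-is: a UDP echo protocol whose datagram_received schedules a coroutine, plus a tiny client protocol that resolves a future on the first reply. EchoLaterProtocol and OneReply are names made up for this demo, and the delay is shortened to 0.1 s:

```python
import asyncio

class EchoLaterProtocol(asyncio.DatagramProtocol):
    def connection_made(self, transport):
        self.transport = transport

    def datagram_received(self, data, addr):
        # Plain callback: schedule a coroutine and return immediately.
        asyncio.get_running_loop().create_task(
            self.handle_income_packet(data, addr))

    async def handle_income_packet(self, data, addr):
        await asyncio.sleep(0.1)          # free to await anything here
        self.transport.sendto(data, addr)

class OneReply(asyncio.DatagramProtocol):
    """Client side: expose the first reply as an awaitable future."""
    def __init__(self):
        self.reply = asyncio.get_running_loop().create_future()

    def connection_made(self, transport):
        self.transport = transport

    def datagram_received(self, data, addr):
        if not self.reply.done():
            self.reply.set_result(data)

async def demo():
    loop = asyncio.get_running_loop()
    server_tr, _ = await loop.create_datagram_endpoint(
        EchoLaterProtocol, local_addr=("127.0.0.1", 0))
    client_tr, client = await loop.create_datagram_endpoint(
        OneReply, remote_addr=server_tr.get_extra_info("sockname"))
    client_tr.sendto(b"hello")            # connected socket: no addr needed
    data = await asyncio.wait_for(client.reply, 5)
    server_tr.close()
    client_tr.close()
    return data
```

Running `asyncio.run(demo())` exercises the whole round trip; the loop stays responsive while the echo is delayed, which is exactly the point of handing the datagram off to a task.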
You might be interested in this module providing high-level UDP endpoints for asyncio:
async def main():
    # Create a local UDP endpoint
    local = await open_local_endpoint('localhost', 8888)
    # Create a remote UDP endpoint, pointing to the first one
    remote = await open_remote_endpoint(*local.address)
    # The remote endpoint sends a datagram
    remote.send(b'Hey Hey, My My')
    # The local endpoint receives the datagram, along with the address
    data, address = await local.receive()
    # Print: Got 'Hey Hey, My My' from 127.0.0.1 port 50603
    print(f"Got {data!r} from {address[0]} port {address[1]}")
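A minimal version of such an endpoint is easy to build yourself on top of create_datagram_endpoint; the sketch below wraps the callback API in an awaitable queue. DatagramEndpoint, open_endpoint, recvfrom, and address are illustrative names, not the module's actual API:

```python
import asyncio

class DatagramEndpoint(asyncio.DatagramProtocol):
    def __init__(self):
        self._queue = asyncio.Queue()
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport

    def datagram_received(self, data, addr):
        # Callback side: push into the queue; coroutines await the other end.
        self._queue.put_nowait((data, addr))

    async def recvfrom(self):
        return await self._queue.get()

    def sendto(self, data, addr=None):
        self.transport.sendto(data, addr)

    @property
    def address(self):
        return self.transport.get_extra_info('sockname')

    def close(self):
        self.transport.close()

async def open_endpoint(host='0.0.0.0', port=0):
    loop = asyncio.get_running_loop()
    _, protocol = await loop.create_datagram_endpoint(
        DatagramEndpoint, local_addr=(host, port))
    return protocol
```

With roughly this much code, the question's wished-for coroutine style (`data, addr = await sock.recvfrom(...)`) becomes usable on top of the standard library.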
asyncudp provides easy-to-use UDP sockets for asyncio.
Here is an example:
import asyncio
import asyncudp

async def main():
    sock = await asyncudp.create_socket(remote_addr=('127.0.0.1', 9999))
    sock.sendto(b'Hello!')
    print(await sock.recvfrom())
    sock.close()

asyncio.run(main())
I thought I would post my solution for others coming in from a search engine. When I was learning async network programming in Python, I couldn't find an async API for UDP. I searched for why that was and eventually came across an old mailing-list post about the issue, in which Python's creator called the idea a bad one. I don't agree with this.
Yes, it's true that UDP packets are unordered and may not arrive, but there's no technical reason why there shouldn't be awaitable APIs for send/recv/open/close over UDP. So I built a library that adds them.
Here's what it looks like to do async UDP.
First, start the Python REPL with await support:
python3 -m asyncio
from p2pd import *
# Load internal interface details.
netifaces = await init_p2pd()
# Load the default interface.
i = await Interface(netifaces=netifaces)
# Open a UDP echo client.
route = await i.route().bind()
dest = await Address("p2pd.net", 7, route)
pipe = await pipe_open(UDP, dest, route)
# Queue all responses.
pipe.subscribe()
# Send / recv.
await pipe.send(b"echo back this data", dest.tup)
out = await pipe.recv()
print(out)
# Cleanup.
await pipe.close()
There are many more problems my library solves. It properly handles interface management, address lookup, and NAT enumeration. It makes IPv6 as easy to use as IPv4. It provides the same API for UDP/TCP, server and client. It supports peer-to-peer connections. And there's a REST API that can be used from other languages.
You can read more about the problems Python async networking has at https://roberts.pm/p2pd and the docs from my library are at https://p2pd.readthedocs.io/en/latest/.

cherrypy as a gevent-socketio server

I have just started using gevent-socketio and it's great!
But I have been using the default SocketIOServer and socketio_manage from the chat tutorial, and I was wondering how to integrate socketio with CherryPy.
Essentially, how do I turn this:
class MyNamespace(BaseNamespace): ...

def application(environ, start_response):
    if environ['PATH_INFO'].startswith('/socket.io'):
        return socketio_manage(environ, {'/app': MyNamespace})
    else:
        return serve_file(environ, start_response)

def serve_file(...): ...

sio_server = SocketIOServer(
    ('', 8080), application,
    policy_server=False)
sio_server.serve_forever()
into a normal cherrypy server?
Gevent-socketio is based on gevent and gevent's web server. There are two implementations: pywsgi, which is pure Python, and wsgi, which uses libevent's HTTP implementation.
See the paragraph starting with "The difference between pywsgi.WSGIServer and wsgi.WSGIServer" here:
http://www.gevent.org/servers.html
Only those servers are "green", in the sense that they yield control to the gevent loop, so as far as I know you can only use those servers. The reason is that the server is present at the very beginning of the request, knows how to handle the "Upgrade" and WebSocket protocol negotiations, and passes values inside the environ that the next layer (SocketIO) expects and knows how to handle.
You will also need the gevent-websocket package, because it is green (and gevent-socketio is based on it). You can't just swap out the websocket stack.
Hope this helps.
CherryPy doesn't implement the socket.io protocol, nor does it support WebSocket as a built-in. However, there is an extension to CherryPy, called ws4py, that implements the bare WebSocket protocol on top of its stack. That's probably a good place to start.

Python xmpp jabber client in tornado web application

I am a desktop programmer, but I want to learn something about web services, and I decided on Python. I am trying to understand how web applications work. I know how to create a basic Tornado website (request -> response) and a working Jabber client, but I don't know how to mix them. Can I use any Python components in web services? Do they have to have a specific structure (sync or async)? I'm stuck on the event loops:
If Tornado starts its web server with:
app = Application()
app.listen(options.port)
tornado.ioloop.IOLoop.instance().start()
... then how (and where) can I start the XMPP loop?
client.connect()
client.run()
I think the Tornado listen loop should also handle the XMPP listening, but I don't know how.
Regards.
Edit: I forgot. I am using pyxmpp2
I believe what you are trying to accomplish is not feasible in a single Python thread, as both libraries are trying to run a blocking listen loop at the same time. Might I suggest looking at this tutorial on threading.
Another question is whether you are trying to make a web-based XMPP client, or just have an XMPP and an HTML server running in the same script. If the former, I would advise you to look into inter-thread communication, either with zeromq or a queue.
Maybe WebSocketHandler and a Thread will help you.
Demo:
class BotThread(threading.Thread):
    def __init__(self, my_jid, settings, on_message):
        super(BotThread, self).__init__()
        # EchoBot is pyxmpp2's Client
        self.bot = EchoBot(my_jid, settings, on_message=on_message)

    def run(self):
        self.bot.run()

class ChatSocketHandler(tornado.websocket.WebSocketHandler):
    def open(self):
        # init xmpp client
        my_jid = ...    # your JID
        settings = ...  # your pyxmpp2 settings
        bot = BotThread(my_jid, settings, on_message=self.on_message)
        bot.start()

How to serve data from UDP stream over HTTP in Python?

I am currently working on exposing data from a legacy system over the web. I have a (legacy) server application that sends and receives data over UDP. The software uses UDP to send sequential updates to a given set of variables in (near) real time (updates every 5-10 ms). Thus, I do not need to capture all the UDP data -- it is sufficient that the latest update is retrieved.
To expose this data over the web, I am considering building a lightweight web server that reads/writes the UDP data and exposes it over HTTP.
As I am experienced with Python, I am considering using it.
The question is: how can I (continuously) read data from UDP and serve snapshots of it over TCP/HTTP on demand with Python? So basically, I am trying to build a kind of "UDP2HTTP" adapter to interface with the legacy app, so that I don't need to touch the legacy code.
A WSGI-compliant solution would be much preferred. Of course any tips are very welcome and MUCH appreciated!
Twisted would be very suitable here. It supports many protocols (UDP, HTTP) and its asynchronous nature makes it possible to stream UDP data straight to HTTP without shooting yourself in the foot with (blocking) threading code. It also supports WSGI.
Here's a quick proof-of-concept app using the Twisted framework. It assumes the legacy UDP service is listening on localhost:8000 and will start sending UDP data in response to a datagram containing "Send me data", and that the data is three 32-bit integers. It also responds to an HTTP GET / on port 2080.
You can start it with twistd -noy example.py:
example.py
from twisted.internet import protocol, defer
from twisted.application import service
from twisted.python import log
from twisted.web import resource, server as webserver
import struct

class legacyProtocol(protocol.DatagramProtocol):
    def startProtocol(self):
        self.transport.connect(self.service.legacyHost, self.service.legacyPort)
        self.sendMessage(b"Send me data")

    def stopProtocol(self):
        # Assume the transport is closed, do any tidying that you need to.
        return

    def datagramReceived(self, datagram, addr):
        # Inspect the datagram payload, do sanity checking.
        try:
            val1, val2, val3 = struct.unpack("!iii", datagram)
        except struct.error:
            # Problem unpacking data: log and ignore
            log.err()
            return
        self.service.update_data(val1, val2, val3)

    def sendMessage(self, message):
        self.transport.write(message)

class legacyValues(resource.Resource):
    def __init__(self, service):
        resource.Resource.__init__(self)
        self.service = service
        self.putChild(b"", self)

    def render_GET(self, request):
        data = "\n".join(["<li>%s</li>" % x for x in self.service.get_data()])
        return ("""<html><head><title>Legacy Data</title>
<body><h1>Data</h1><ul>
%s
</ul></body></html>""" % (data,)).encode("utf-8")

class protocolGatewayService(service.Service):
    def __init__(self, legacyHost, legacyPort):
        self.legacyHost = legacyHost
        self.legacyPort = legacyPort
        self.udpListeningPort = None
        self.httpListeningPort = None
        self.lproto = None
        self.reactor = None
        self.data = [1, 2, 3]

    def startService(self):
        # called by application handling
        if not self.reactor:
            from twisted.internet import reactor
            self.reactor = reactor
        self.reactor.callWhenRunning(self.startStuff)

    def stopService(self):
        # called by application handling
        defers = []
        if self.udpListeningPort:
            defers.append(defer.maybeDeferred(self.udpListeningPort.loseConnection))
        if self.httpListeningPort:
            defers.append(defer.maybeDeferred(self.httpListeningPort.stopListening))
        return defer.DeferredList(defers)

    def startStuff(self):
        # UDP legacy stuff
        proto = legacyProtocol()
        proto.service = self
        self.udpListeningPort = self.reactor.listenUDP(0, proto)
        # Website
        factory = webserver.Site(legacyValues(self))
        self.httpListeningPort = self.reactor.listenTCP(2080, factory)

    def update_data(self, *args):
        self.data[:] = args

    def get_data(self):
        return self.data

application = service.Application('LegacyGateway')
services = service.IServiceCollection(application)
s = protocolGatewayService('127.0.0.1', 8000)
s.setServiceParent(services)
Afterthought
This isn't a WSGI design. The idea would be to run this program daemonized, with its HTTP port on a local IP, and have Apache or similar proxy requests to it. It could be refactored for WSGI; it was just quicker to knock up this way and easier to debug.
The software uses UDP to send sequential updates to a given set of variables in (near) real-time (updates every 5-10 ms). thus, I do not need to capture all UDP data -- it is sufficient that the latest update is retrieved
What you must do is this.
Step 1.
Build a Python app that collects the UDP data and caches it into a file. Create the file using XML, CSV or JSON notation.
This runs independently as some kind of daemon. This is your listener or collector.
Write the file to a directory from which it can be trivially downloaded by Apache or some other web server. Choose names and directory paths wisely and you're done.
Done.
If you want fancier results, you can do more. You don't need to, since you're already done.
Step 2.
Build a web application that allows someone to request this data being accumulated by the UDP listener or collector.
Use a web framework like Django for this. Write as little as possible. Django can serve flat files created by your listener.
You're done. Again.
Some folks think relational databases are important. If so, you can do this. Even though you're already done.
Step 3.
Modify your data collection to create a database that the Django ORM can query. This requires some learning and some adjusting to get a tidy, simple ORM model.
Then write your final Django application to serve the UDP data being collected by your listener and loaded into your Django database.
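Step 1 and Step 2 above can be sketched with nothing but the standard library: one daemon thread keeps only the newest datagram (the question's key simplification), and a small HTTP server serves a JSON snapshot of it on demand. The port numbers, the latest dict, and the field name are all illustrative choices here:

```python
import json
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

latest = {"data": None}      # most recent UDP payload
lock = threading.Lock()      # guards `latest` across threads

def udp_listener(host="127.0.0.1", port=9999):
    # Collector: read datagrams forever, keeping only the newest one.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        payload, _ = sock.recvfrom(4096)
        with lock:
            latest["data"] = payload.decode(errors="replace")

class SnapshotHandler(BaseHTTPRequestHandler):
    # HTTP side: serve a JSON snapshot of the latest value on demand.
    def do_GET(self):
        with lock:
            body = json.dumps(latest).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def main():
    threading.Thread(target=udp_listener, daemon=True).start()
    ThreadingHTTPServer(("127.0.0.1", 2080), SnapshotHandler).serve_forever()
```

Like the Twisted version, this isn't WSGI, but do_GET maps directly onto a WSGI callable (or a Django view reading the same cached value) if that constraint matters.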
