uWSGI: working with multiple clients through websockets - python

I'm new to uWSGI and I'm working on a web application that would require a lot of web-socket communication. I decided to practice by creating a simple chat application.
From uWSGI docs:
def application(env, start_response):
    # complete the handshake
    uwsgi.websocket_handshake(env['HTTP_SEC_WEBSOCKET_KEY'], env.get('HTTP_ORIGIN', ''))
    while True:
        msg = uwsgi.websocket_recv()
        uwsgi.websocket_send(msg)
I don't want to use this approach because:
1. You can send a message to a client only after it has sent you something.
2. You don't have access to other websocket connections, so you can communicate with only one client.
However the same page has an example of how to implement a chat. It would be great but they use Redis in their example:
r = redis.StrictRedis(host='localhost', port=6379, db=0)
channel = r.pubsub()
channel.subscribe('foobar')
websocket_fd = uwsgi.connection_fd()
redis_fd = channel.connection._sock.fileno()
while True:
    uwsgi.wait_fd_read(websocket_fd, 3)
    uwsgi.wait_fd_read(redis_fd)
    uwsgi.suspend()
From what I see, Redis is used here as an external server that allows different uWSGI request handlers to use the same data.
Does this really need to be that difficult?
Take a look at the chat solution using Node.js (example taken from javascript.ru):
var WebSocketServer = new require('ws');
// connected clients
var clients = {};
// WebSocket server listens on port 8081
var webSocketServer = new WebSocketServer.Server({
    port: 8081
});
webSocketServer.on('connection', function(ws) {
    var id = Math.random();
    clients[id] = ws;
    console.log("new connection " + id);
    ws.on('message', function(message) {
        console.log("received a new message: " + message);
        for (var key in clients) {
            clients[key].send(message);
        }
    });
    ws.on('close', function() {
        console.log("connection closed " + id);
        delete clients[id];
    });
});
What I really like here is:
1. The event-based approach lets me send data to clients whenever I want to.
2. All clients are stored in a simple clients dictionary. I can access it easily, with no need for an external server to exchange data between clients.
To solve the first problem (non-blocking event-based approach) in uWSGI I wrote this code snippet:
import uwsgi
from threading import Thread

class WebSocket(Thread):
    def __init__(self, env):
        super().__init__()
        self.listeners = []
        self._env = env

    def run(self):
        self._working = True
        uwsgi.websocket_handshake(
            self._env['HTTP_SEC_WEBSOCKET_KEY'], self._env.get('HTTP_ORIGIN', ''))
        while self._working:
            msg = uwsgi.websocket_recv()
            for listener in self.listeners:
                listener(msg)

    def send(self, msg):
        uwsgi.websocket_send(msg)

    def close(self):
        self._working = False
So my first question is whether or not this will work.
The second question is how I exchange data between request handlers. I feel like I completely misunderstand uWSGI design.
I use uwsgi --http :80 --wsgi-file=main.py --master --static-map /st=web-static to test my application. Ideally I would just define an object in main.py and work with it, but I assume that main.py will be initialized multiple times in different workers/threads/processes.
I've already seen a similar question on data exchanging: Communication between workers in uwsgi
The answer was
Pay attention, it will work only if you have a single worker/process. Another common approach is using the uWSGI caching framework (the name is misleading, in fact it is a shared dictionary). It will allow you to share data between workers and threads.
I see this uWSGI caching framework as some kind of independent external data storage (see the Redis example above). But after seeing that neat Node.js implementation, I don't want to use any caching framework; I just want to share the same Python object across all request handlers.

First of all, you should invest in learning the sync vs. async programming paradigms. The Node approach seems easier, but only because you have a single process to manage. If you need to scale (to multiple machines or simply multiple processes) you are back to the "Python problems". Having an external channel (like Redis) is a common pattern; you should use it because it will allow you to scale easily.
Regarding Python, uWSGI and websockets, I strongly suggest you look at gevent. The uWSGI websockets system supports it and there are a lot of examples out there. You will be able to increase concurrency without needing to rely on callback-based programming.
Alternatively (but only if you like callbacks) you can take a look at Tornado.
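For the single-process case that the question describes, the Node clients dictionary can be mirrored in Python with one outgoing queue per connection. The following is a minimal sketch under that assumption; Hub, join, leave and broadcast are hypothetical names and not part of uWSGI. With more than one worker, each process would hold its own Hub, which is exactly why an external channel such as Redis becomes necessary.

```python
import itertools
import queue
import threading


class Hub:
    """In-process registry of connected clients (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids = itertools.count(1)
        self._clients = {}  # client id -> queue of outgoing messages

    def join(self):
        """Register a client; return its id and its outgoing queue."""
        outbox = queue.Queue()
        with self._lock:
            client_id = next(self._ids)
            self._clients[client_id] = outbox
        return client_id, outbox

    def leave(self, client_id):
        """Forget a client; unknown ids are ignored."""
        with self._lock:
            self._clients.pop(client_id, None)

    def broadcast(self, message):
        """Queue a message for every client; return how many were reached."""
        with self._lock:
            outboxes = list(self._clients.values())
        for outbox in outboxes:
            outbox.put(message)
        return len(outboxes)


hub = Hub()
alice_id, alice_outbox = hub.join()
bob_id, bob_outbox = hub.join()
hub.broadcast("hello")
hub.leave(bob_id)
hub.broadcast("bye")
```

Each request handler would then drain its own queue and forward the messages to its websocket. Under gevent, a gevent-aware queue would presumably take the place of queue.Queue.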

Related

grpc client python: How to create grpc client connection pool for better throughput?

Our use case is to make a large number of requests. Each request returns 1 MB of data. Right now, on the client side, we create a single gRPC channel and then run the following function in a loop:
content_grpc_channel = grpc.insecure_channel(content_netloc)
test_stub = test_pb2_grpc.ContentServiceInternalStub(
    content_grpc_channel)

def get_feature_data_future(feature_id, span_context=()):
    req_feature = test_pb2.GetFeatureRequest()
    req_feature.feature_id = feature_id
    resp_feature_future = test_stub.GetFeature.future(
        req_feature, metadata=span_context)
    return resp_feature_future
My question is: how can I create a gRPC client connection pool in Python for better throughput?
In Golang I see this https://godoc.org/google.golang.org/api/option#WithGRPCConnectionPool but I am having a hard time finding an equivalent in the Python docs.
Is there such a utility in python to create grpc connection pool? Or should I create multiple grpc channels and manage those myself? I assume each channel will have different tcp connection, correct?
gRPC uses HTTP/2 and can multiplex many requests on one connection, so a gRPC client channel should be reused for the lifetime of the client app.
The Golang link you mentioned says that WithGRPCConnectionPool is used to balance requests across connections. You might look into load balancing if that is what you need, but remember that load balancing only makes sense if you have multiple gRPC server instances.
If you are looking for a connection pool like the ones used with databases, I would say you don't need to worry about it: a channel is opened once and then reused, so there is no per-request connection overhead.
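If measurements do show that a single channel is the bottleneck, managing several channels yourself can be as simple as handing them out in turn. This is a rough sketch, not a gRPC API: RoundRobinPool and make_fake_channel are hypothetical names, and the factory is a stand-in so the code runs without gRPC installed. Whether several channels to the same target really use separate TCP connections depends on the gRPC implementation and channel options, so that should be verified rather than assumed.

```python
import itertools
import threading


class RoundRobinPool:
    """Hand out pooled objects in turn (hypothetical helper, not a gRPC API)."""

    def __init__(self, factory, size):
        self._items = [factory() for _ in range(size)]
        self._cycle = itertools.cycle(self._items)
        self._lock = threading.Lock()

    def next(self):
        # Guard the shared iterator so concurrent callers do not race on it.
        with self._lock:
            return next(self._cycle)


# Stand-in factory (assumption). In real use it would be something like:
#   lambda: grpc.insecure_channel(content_netloc)
created = []


def make_fake_channel():
    created.append(object())
    return created[-1]


pool = RoundRobinPool(make_fake_channel, size=3)
picked = [pool.next() for _ in range(6)]
```

Each pooled channel would get its own stub, and every call would pick the next one from the pool.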

Starting and stopping flask on demand

I am writing an application, which can expose a simple RPC interface implemented with flask. However I want it to be possible to activate and deactivate that interface. Also it should be possible to have multiple instances of the application running in the same python interpreter, which each have their own RPC interface.
The service is only exposed to localhost and this is a prototype, so I am not worried about security. I am looking for a small and easy solution.
The obvious way here seems to use the flask development server, however I can't find a way to shut it down.
I have created a flask blueprint for the functionality I want to expose and now I am trying to write a class to wrap the RPC interface similar to this:
from threading import Thread

from flask import Flask

class RPCInterface:
    def __init__(self, creating_app, config):
        self.flask_app = Flask(__name__)
        self.flask_app.config.update(config)
        self.flask_app.my_app = creating_app
        self.flask_app.register_blueprint(my_blueprint)
        self.flask_thread = Thread(target=Flask.run, args=(self.flask_app,),
                                   name='flask_thread', daemon=True)

    def shutdown(self):
        # Seems impossible with the flask server
        raise NotImplementedError()
I am using the variable my_app of the current app to pass the instance of my application this RPC interface is working with into the context of the requests.
It can be shut down from inside a request (as described here http://flask.pocoo.org/snippets/67/), so one solution would be to create a shutdown endpoint and send a request with the test client to initiate a shutdown. However that requires a flask endpoint just for this purpose. This is far from clean.
I looked into the source code of flask and werkzeug and figured out the important part (Context at https://github.com/pallets/werkzeug/blob/master/werkzeug/serving.py#L688) looks like this:
def inner():
    try:
        fd = int(os.environ['WERKZEUG_SERVER_FD'])
    except (LookupError, ValueError):
        fd = None
    srv = make_server(hostname, port, application, threaded,
                      processes, request_handler,
                      passthrough_errors, ssl_context,
                      fd=fd)
    if fd is None:
        log_startup(srv.socket)
    srv.serve_forever()
make_server returns an instance of Werkzeug's server class, which inherits from Python's http.server.HTTPServer. This in turn is a socketserver.BaseServer, which exposes a shutdown method. The problem is that the server created here is just a local variable and thus not accessible from anywhere.
This is where I ran into a dead end. So my question is:
Does anybody have another idea how to shut down this server easily?
Is there any other simple server to run flask on? Something which does not require an external process and can just be started and stopped in a few lines of code? Everything listed in the flask doc seems to have a complex setup.
Answering my own question in case this ever happens again to anyone.
The first solution involved switching from Flask to Klein. Klein is basically Flask with fewer features, but running on top of the Twisted reactor. This way the integration is very simple. Basically it works like this:
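from klein import Klein
from twisted.internet import reactor
from twisted.internet.endpoints import serverFromString
from twisted.web.server import Site

app = Klein()

@app.route('/')
def home(request):
    return 'Some website'

# endpoint_string is a Twisted endpoint description (for example "tcp:8080")
endpoint = serverFromString(reactor, endpoint_string)
endpoint.listen(Site(app.resource()))
reactor.run()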
Now all the twisted tools can be used to start and stop the server as needed.
The second solution I switched to further down the road was to get rid of HTTP as a transport protocol. I switched to JSONRPC on top of twisted's LineReceiver protocol. This way everything got even simpler and I didn't use any of the HTTP stuff anyway.
This is a terrible, horrendous hack that nobody should ever use for any purpose whatsoever... except maybe if you're trying to write an integration test suite. There are probably better approaches - but if you're trying to do exactly what the question is asking, here goes...
import sys
from socketserver import BaseServer

# implementing the shutdown() method above
def shutdown(self):
    for frame in sys._current_frames().values():
        while frame is not None:
            if 'srv' in frame.f_locals and isinstance(frame.f_locals['srv'], BaseServer):
                frame.f_locals['srv'].shutdown()
                break
            # walk up the call stack; without this the loop never ends
            frame = frame.f_back
        else:
            continue
        break
    self.flask_thread.join()
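A less intrusive option, sketched here with the standard library only, is to build the server object yourself and keep a reference to it, so that shutdown() is reachable. StoppableServer, QuietHandler and demo_app are hypothetical names; a Flask application object is a WSGI callable, so it could presumably be passed in place of demo_app. This is an illustration for a localhost prototype, not a production server.

```python
import http.client
import threading
from wsgiref.simple_server import WSGIRequestHandler, make_server


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # silence per-request logging


class StoppableServer:
    """Run a WSGI app in a background thread (hypothetical helper)."""

    def __init__(self, wsgi_app, host="127.0.0.1", port=0):
        # port=0 asks the OS for a free port
        self._server = make_server(host, port, wsgi_app,
                                   handler_class=QuietHandler)
        self.port = self._server.server_port
        self._thread = threading.Thread(target=self._server.serve_forever,
                                        daemon=True)

    def start(self):
        self._thread.start()

    def shutdown(self):
        self._server.shutdown()      # makes serve_forever() return
        self._server.server_close()  # releases the listening socket
        self._thread.join()


def demo_app(environ, start_response):
    # Placeholder WSGI app standing in for the real application.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"pong"]


rpc = StoppableServer(demo_app)
rpc.start()
conn = http.client.HTTPConnection("127.0.0.1", rpc.port, timeout=5)
conn.request("GET", "/")
reply = conn.getresponse().read()
conn.close()
rpc.shutdown()
```

Because every StoppableServer owns its own server and thread, several instances can run side by side in one interpreter, each on its own port.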

Handling websockets in python

I am developing a messaging service that uses websockets. And I shall be using python/django as a server side language. There are options such as:
Tornado
django-websockets-redis
Crossbar.io
Flask-SocketIO
I am unsure which of these I should use in a production environment where the number of active connections is large.
Websockets in tornado are relatively straightforward. This example shows how you can integrate websockets with extremely basic management (open and on_close methods).
For upstream traffic (browser -> server) you can implement the WebSocketHandler methods:
def on_message(self, message):
    # call message callback
    pass

def data_received(self, chunk):
    # do something with chunked data
    pass
For downstream traffic, there's WebSocketHandler.write_message:
def broadcast_to_all_websockets(self, message):
    # iterate over a copy so that removing dead sockets is safe
    for ws in list(cl):
        if not ws.ws_connection.stream.socket:
            print("Web socket %s does not exist anymore!" % ws)
            cl.remove(ws)
        else:
            ws.write_message(message)
I highly suggest using autobahn|Python. I am currently using it for a WebSocket project in Python; it is easy to work with and has a lot of classes already built for you, like a WebSocketServer. It lets you choose your implementation too (a choice between asyncio and Twisted).

How to serve data from UDP stream over HTTP in Python?

I am currently working on exposing data from a legacy system over the web. I have a (legacy) server application that sends and receives data over UDP. The software uses UDP to send sequential updates to a given set of variables in (near) real-time (updates every 5-10 ms). Thus, I do not need to capture all UDP data -- it is sufficient that the latest update is retrieved.
In order to expose this data over the web, I am considering building a lightweight web server that reads/writes UDP data and exposes this data over HTTP.
As I am experienced with Python, I am considering using it.
The question is the following: how can I (continuously) read data from UDP and send snapshots of it over TCP/HTTP on-demand with Python? So basically, I am trying to build a kind of "UDP2HTTP" adapter to interface with the legacy app so that I wouldn't need to touch the legacy code.
A solution that is WSGI compliant would be much preferred. Of course any tips are very welcome and MUCH appreciated!
Twisted would be very suitable here. It supports many protocols (UDP, HTTP) and its asynchronous nature makes it possible to directly stream UDP data to HTTP without shooting yourself in the foot with (blocking) threading code. It also supports WSGI.
Here's a quick "proof of concept" app using the Twisted framework. This assumes that the legacy UDP service is listening on localhost:8000 and will start sending UDP data in response to a datagram containing "Send me data", and that the data is three 32-bit integers. Additionally, it will respond to an "HTTP GET /" on port 2080.
You could start this with twistd -noy example.py:
example.py
from twisted.internet import protocol, defer
from twisted.application import service
from twisted.python import log
from twisted.web import resource, server as webserver
import struct


class legacyProtocol(protocol.DatagramProtocol):
    def startProtocol(self):
        self.transport.connect(self.service.legacyHost, self.service.legacyPort)
        self.sendMessage(b"Send me data")

    def stopProtocol(self):
        # Assume the transport is closed, do any tidying that you need to.
        return

    def datagramReceived(self, datagram, addr):
        # Inspect the datagram payload, do sanity checking.
        try:
            val1, val2, val3 = struct.unpack("!iii", datagram)
        except struct.error:
            # Problem unpacking data: log and ignore
            log.err()
            return
        self.service.update_data(val1, val2, val3)

    def sendMessage(self, message):
        # message must be bytes
        self.transport.write(message)


class legacyValues(resource.Resource):
    def __init__(self, service):
        resource.Resource.__init__(self)
        self.service = service
        self.putChild(b"", self)

    def render_GET(self, request):
        data = "\n".join(["<li>%s</li>" % x for x in self.service.get_data()])
        page = """<html><head><title>Legacy Data</title></head>
<body><h1>Data</h1><ul>
%s
</ul></body></html>""" % (data,)
        # twisted.web expects bytes from render methods
        return page.encode("utf-8")


class protocolGatewayService(service.Service):
    def __init__(self, legacyHost, legacyPort):
        self.legacyHost = legacyHost
        self.legacyPort = legacyPort
        self.udpListeningPort = None
        self.httpListeningPort = None
        self.lproto = None
        self.reactor = None
        self.data = [1, 2, 3]

    def startService(self):
        # called by application handling
        if not self.reactor:
            from twisted.internet import reactor
            self.reactor = reactor
        self.reactor.callWhenRunning(self.startStuff)

    def stopService(self):
        # called by application handling
        defers = []
        if self.udpListeningPort:
            defers.append(defer.maybeDeferred(self.udpListeningPort.loseConnection))
        if self.httpListeningPort:
            defers.append(defer.maybeDeferred(self.httpListeningPort.stopListening))
        return defer.DeferredList(defers)

    def startStuff(self):
        # UDP legacy stuff
        proto = legacyProtocol()
        proto.service = self
        self.udpListeningPort = self.reactor.listenUDP(0, proto)
        # Website
        factory = webserver.Site(legacyValues(self))
        self.httpListeningPort = self.reactor.listenTCP(2080, factory)

    def update_data(self, *args):
        self.data[:] = args

    def get_data(self):
        return self.data


application = service.Application('LegacyGateway')
services = service.IServiceCollection(application)
s = protocolGatewayService('127.0.0.1', 8000)
s.setServiceParent(services)
Afterthought
This isn't a WSGI design. The idea would be to run this program daemonized, with its HTTP port on a local IP, and have Apache or similar proxy requests to it. It could be refactored for WSGI. It was quicker to knock up this way and easier to debug.
The software uses UDP to send sequential updates to a given set of variables in (near) real-time (updates every 5-10 ms). Thus, I do not need to capture all UDP data -- it is sufficient that the latest update is retrieved.
What you must do is this.
Step 1.
Build a Python app that collects the UDP data and caches it into a file. Create the file using XML, CSV or JSON notation.
This runs independently as some kind of daemon. This is your listener or collector.
Write the file to a directory from which it can be trivially downloaded by Apache or some other web server. Choose names and directory paths wisely and you're done.
Done.
If you want fancier results, you can do more. You don't need to, since you're already done.
Step 2.
Build a web application that allows someone to request this data being accumulated by the UDP listener or collector.
Use a web framework like Django for this. Write as little as possible. Django can serve flat files created by your listener.
You're done. Again.
Some folks think relational databases are important. If so, you can do this. Even though you're already done.
Step 3.
Modify your data collection to create a database that the Django ORM can query. This requires some learning and some adjusting to get a tidy, simple ORM model.
Then write your final Django application to serve the UDP data being collected by your listener and loaded into your Django database.
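If writing a file feels like too much indirection, the collector and the web side can also live in one process. The sketch below uses only the standard library: a background thread keeps the latest datagram and a WSGI callable serves it as JSON. LatestValueCollector and make_wsgi_app are hypothetical names, and the payload layout (three 32-bit integers in network byte order) is the same assumption made in the Twisted example above, not something known about the legacy system.

```python
import json
import socket
import struct
import threading


class LatestValueCollector:
    """Keep only the most recent UDP update (hypothetical helper)."""

    def __init__(self, host="127.0.0.1", port=0):
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self._sock.bind((host, port))  # port=0 lets the OS pick a free port
        self.address = self._sock.getsockname()
        self._lock = threading.Lock()
        self._latest = None
        self.updated = threading.Event()
        threading.Thread(target=self._listen, daemon=True).start()

    def _listen(self):
        while True:
            datagram, _ = self._sock.recvfrom(1024)
            try:
                values = struct.unpack("!iii", datagram)
            except struct.error:
                continue  # ignore malformed datagrams
            with self._lock:
                self._latest = values  # older updates are simply overwritten
            self.updated.set()

    def snapshot(self):
        with self._lock:
            return self._latest


def make_wsgi_app(collector):
    def app(environ, start_response):
        body = json.dumps({"values": collector.snapshot()}).encode("utf-8")
        start_response("200 OK", [("Content-Type", "application/json"),
                                  ("Content-Length", str(len(body)))])
        return [body]
    return app


collector = LatestValueCollector()
application = make_wsgi_app(collector)

# Simulate the legacy sender with one datagram of three 32-bit integers.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(struct.pack("!iii", 7, 8, 9), collector.address)
sender.close()
collector.updated.wait(timeout=5)
```

Note that a WSGI server running several worker processes would start one collector per worker, so this layout fits a single-process deployment best.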

How does AMF communication work?

How does Flash communicate with services / scripts on servers via AMF?
Regarding the AMF libraries for Python / Perl / PHP which are easier to develop than .NET / Java:
do they execute script files whenever Flash sends a Remote Procedure Call?
or do they communicate via sockets, to script classes that are running as services?
Regarding typical AMF functionality:
How is data transferred? Is it by method arguments that are automatically serialised?
How can servers "push" to clients? Do Flash movies have to connect on a socket?
Thanks for your time.
The only AMF library I'm familiar with is PyAMF, which has been great to work with so far. Here are the answers to your questions for PyAMF:
I'd imagine you can run it as a script (do you mean like CGI?), but the easiest IMO is to set up an app server specifically for AMF requests
the easiest way is to define functions in pure python, which PyAMF wraps to serialize incoming / outgoing AMF data
you can communicate via sockets if that's what you need to do, but again, it's the easiest to use pure Python functions; one use for sockets is to keep an open connection and 'push' data to clients, see this example
Here's an example of three simple AMF services being served on localhost:8080:
import codecs
from wsgiref import simple_server
from pyamf.remoting.gateway.wsgi import WSGIGateway

## amf services ##################################################
def echo(data):
    return data

def reverse(data):
    return data[::-1]

def rot13(data):
    # codecs.encode works where str.encode('rot13') is not available
    return codecs.encode(data, 'rot13')

services = {
    'myservice.echo': echo,
    'myservice.reverse': reverse,
    'myservice.rot13': rot13,
}

## server ########################################################
def main():
    app = WSGIGateway(services)
    simple_server.make_server('localhost', 8080, app).serve_forever()

if __name__ == '__main__':
    main()
I would definitely recommend PyAMF. Check out the examples to see what it's capable of and what the code looks like.
How does Flash communicate with services / scripts on servers via AMF?
Data is transferred over a TCP/IP connection. Sometimes an existing HTTP connection is used, and in other cases a new TCP/IP connection is opened for the AMF data. When the HTTP or additional TCP connections are opened, the sockets interface is probably used. The AMF definitely travels over a TCP connection of some sort, and the sockets interface is practically the only way to open such a connection.
The "data" that is transferred consists of ECMA-script (Javascript(tm)) data types such as "integer", "string", "object", and so on.
For a technical specification of how the objects are encoded into binary, Adobe has published a specification: AMF 3.0 Spec at Adobe.com
Generally the way an AMF-using client/server system works is something like this:
The client displays some user interface and opens a TCP connection to the server.
The server sends some data to the client, which updates its user interface.
If the user makes a command, the client sends some data to the server over the TCP connection.
Continue steps 2-3 until the user exits.
For example, if the user clicks a "send mail" button in the UI, then the client code might do this:
public class UICommandMessage extends my.CmdMsg
{
    public function UICommandMessage(action:String, arg:String)
    {
        this.cmd = action;
        this.data = arg;
    }
}
Then later:
var msg:UICommandMessage = new UICommandMessage("Button_Press", "Send_Mail");
server_connection.sendMessage(msg);
In the server code, the server also monitors the connection for incoming AMF objects. It receives the message and passes control to an appropriate response function. This is called "dispatching a message".
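On the server side, dispatching can be as small as a lookup table from command name to function. This is a minimal sketch that assumes the AMF layer has already decoded the message into a dict with cmd and data keys; HANDLERS, on_button_press and dispatch are hypothetical names and not part of any AMF library.

```python
def on_button_press(button):
    """Handle a UI button press; `button` names the button that was clicked."""
    return "handled button %s" % button


# Hypothetical command names mapped to their handler functions.
HANDLERS = {
    "Button_Press": on_button_press,
}


def dispatch(message):
    """Route a decoded message (a dict with 'cmd' and 'data') to its handler."""
    handler = HANDLERS.get(message.get("cmd"))
    if handler is None:
        return "unknown command: %r" % message.get("cmd")
    return handler(message.get("data"))


result = dispatch({"cmd": "Button_Press", "data": "Send_Mail"})
unknown = dispatch({"cmd": "Nope", "data": None})
```

Adding a new command then only means registering one more entry in the table.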
With more information about what you are trying to accomplish, I could give you more useful details.
