I am building a web application that has a real-time feed (similar to Facebook's newsfeed) that I want to update via a long-polling mechanism. I understand that with Python, my choices are pretty much to either use Stackless (building from their Comet wsgi example) or Cometd + Twisted. Unfortunately there is very little documentation regarding these options and I cannot find good information online about production scale users of comet on Python.
Has anyone successfully implemented comet on Python in a production system? How did you go about doing it and where can I find resources to implement my own?
Orbited seems like a nice solution, though I haven't tried it.
Update: things have changed in the last 2.5 years.
We now have WebSockets in all major browsers except IE (naturally), and a couple of very good abstractions over them that provide many methods of emulating real-time communication:
socket.io along with tornadio (socket.io 0.6) and tornadio2 (socket.io 0.7+)
sock.js along with SockJS-tornado
I recommend StreamHub Comet Server. It's used by a lot of people, and I personally use it with a couple of Django sites I run. You will need to write a tiny bit of Java to handle the streaming; I did this using Jython. The front-end code is some really simple JavaScript a la:
var hub = new StreamHub();
hub.connect("http://myserver.com/");
hub.subscribe("newsfeed", function(sTopic, oData) { alert("new news item: " + oData.Title); });
The documentation is pretty good. I had problems similar to yours trying to get started with the sparse docs of Cometd et al. For a start I'd read Getting Started With Comet and StreamHub, download it and see how some of the examples work, and refer to the API docs if you need to:
Javascript API JSDoc
Streaming from Java Javadoc
Here is a full-featured example of combining Django, Orbited, and Twisted to create a real-time (Comet) app in Python: http://github.com/clemesha/hotdot
I've done tons of APIs using twisted for stuff like that, most of which are available on my github account.
Most are client-side, but slosh is a server I wrote to do a realtime cheap pubsub sort of thing. It scales somewhat horizontally for reads by allowing for simple stream replication. Writes are a little different when you stick to plain HTTP, but I've pushed a decent amount through it for a demo.
Otherwise, you have full-on BOSH which most XMPP servers support and will allow you to decouple the message distribution from the web frontend.
I haven't done it, but this guy has and writes a good article about it, with Django examples and pointers (which I haven't checked) to other frameworks.
The Orbited and Redis solutions are nice, but no longer relevant when you have something like PubSubHubbub, which Google released. It makes it very easy to be the publisher of or the subscriber to a given feed. http://code.google.com/p/pubsubhubbub/
Here's an example that does long-polling with gevent and Django.
It uses greenlet - stack switching functionality from Stackless packaged as a CPython extension.
Related
I am currently working on a Django project and would like to add the ability for users to enter a video conference with each other using their webcams. I understand HTML5 has capabilities for this, but I would like to stay away from it for now, as quite a few browsers don't yet support it. Does anyone have any suggestions as to how I could do this? Thanks.
It's hard to say use this one thing when really it will be a collection of things that meet your individual needs. Here are some links to some resources that should get you started.
OpenCV - Has Python wrappers for webcam
Tornado - Python web framework and asynchronous networking library
Twisted - Event driven networking engine written in Python
On the client side, you might want to look at getUserMedia.js for handling capturing the video from the camera - it implements a Flash fallback for browsers that don't support the getUserMedia() API.
On the server side, I think Drewness's answer covers it.
The short answer is that you have to use Flash or narrow down which browsers you want to support.
Getting the stream from your webcam into the browser is somewhat supported by HTML5 and fully supported by Flash in modern browsers.
The tricky part is streaming that to other people in a call. There are two approaches - have everyone pipe their feed to a central server which then beams the collected feeds down to everyone in the room, or have peers directly connect to one another.
For any kind of real time chat app, you want to be using the latter (the latency of a central server architecture makes it unusable).
On the web your options are WebRTC, RTMFP, HLS, or plugins. WebRTC is fantastic, but the standard is still a work in progress. Most significantly, IE does not support it, so if you expect this to be a public-facing web app, a sizable percentage of your users will be out of luck. HLS is an Apple technology that also has patchy support (and isn't particularly efficient).
For RTMFP, have a look at Cirrus/Stratus. They have a sample app that illustrates the technology (BTW this is what ChatRoulette uses). Of course this requires Flash, but IMO it's your best bet for covering as many platforms as possible without having your users install something first.
The choice of web framework (Django in your case) doesn't matter very much since you don't want your users sending their streams up to the server anyway. The server's job is simply to help with discovery and connection, and for this you should look into a push/comet server like APE.
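As a rough illustration of that discovery role, here is a minimal sketch of an in-memory registry that tells a joining peer who is already in the room. The `Rendezvous` class and its methods are hypothetical names made up for this example, and `address` stands in for whatever connection information your peer-to-peer technology exchanges. No media passes through it.

```python
class Rendezvous:
    """Tracks who is in each room so peers can find one another."""

    def __init__(self):
        self._rooms = {}  # room name -> {peer_id: address}

    def join(self, room, peer_id, address):
        """Register a peer and return the peers that were already present."""
        members = self._rooms.setdefault(room, {})
        others = dict(members)       # snapshot taken before adding the newcomer
        members[peer_id] = address
        return others

    def leave(self, room, peer_id):
        """Remove a peer; drop the room once it is empty."""
        members = self._rooms.get(room, {})
        members.pop(peer_id, None)
        if not members and room in self._rooms:
            del self._rooms[room]
```

In a real deployment the `join` result would be pushed to the other members through the push/comet channel so they can open direct connections to the newcomer.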
I currently have a one-page Bottle project working through localhost:8080.
For the purposes of this question, assume that single page is naught but a basic short-polling chat, retrieving chatline objects from Python that contain only the sender's name and the body of the message.
Those chatline objects are stored in chat objects, with the project allowing multiple chats.
The chat and sender are determined by the URL. For example, if a chatline is sent from localhost:8080/chat/23/50, it is sent to chat 23 as sender 50, and localhost:8080/chat/23/* will display all chatlines of chat 23 in a basic overflow:auto div.
The current short-polling AJAX requests data from Python once every second. I want to make things more real-time and have decided to go with long-polling (although if you love HTML5 WebSockets, I wouldn't mind learning about them too).
My question is in two parts:
How would I go about implementing a long-poll approach in such a chat system, preferably while still using Python's Bottle module?
How would I then deliver the project through an actual server, accessible externally (i.e., not only from localhost)? Even making it available through LAN would be good.
I'm aware that long-polling can cause severe performance issues with servers such as Apache and would appreciate it if that fact could be factored into any answers; I'd like as scalable a solution as possible.
Any help is appreciated!
I recently attended a presentation about a real-time client-server application that made great use of gevent on the Python/server side and socket.io on the client side. The speaker, Alexandre Bourget, released a gevent-socketio module on GitHub that can be used to make all the plumbing easier.
Everything worked with HTTP long polling only (but socket.io contains all the logic to switch to HTML5 WebSocket or Flash socket). Although the framework was Pyramid, I believe it should work with Bottle too!
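Whatever server you pick, the long-polling idea itself is small: the request handler blocks until there is something new or a timeout passes, and the client immediately asks again. Here is a minimal sketch of that core using only the standard library. The `ChatRoom` class and its method names are made up for illustration and are not part of Bottle or gevent-socketio.

```python
import threading


class ChatRoom:
    """In-memory chat whose readers can block until new lines arrive."""

    def __init__(self):
        self._lines = []                       # list of (sender, body) tuples
        self._changed = threading.Condition()

    def post(self, sender, body):
        """Append a chatline and wake every waiting long-poll request."""
        with self._changed:
            self._lines.append((sender, body))
            self._changed.notify_all()

    def wait_for_lines(self, since, timeout=25.0):
        """Return (lines after index `since`, new index).

        Blocks until something new exists or `timeout` seconds pass.
        On timeout the list is empty and the client simply polls again.
        """
        with self._changed:
            self._changed.wait_for(lambda: len(self._lines) > since, timeout)
            return self._lines[since:], len(self._lines)
```

A route handler would call `wait_for_lines` with the index the client last saw and return the result as JSON. Note that with a thread-per-request server every waiting client ties up a thread, which is why event-driven options such as gevent or Tornado tend to come up for this kind of workload.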
I haven't tried it myself, but I think you can use Bottle together with Tornado http://www.tornadoweb.org/ (see Tornado - mount Bottle app).
It is possible to do long-polling with Tornado. Look at the tornadio project https://github.com/mrjoes/tornadio.
You may also be interested in http://pypi.python.org/pypi/bottle-tornado-websocket. I never used this one but it looks like the thing you are looking for.
The Tornado docs have a section about running in production: http://www.tornadoweb.org/documentation/overview.html#running-tornado-in-production
I hope it helps
I need to build a client-server application. The client will be written with python-gtk,
and all processing will happen on the server side to free the client of that workload.
I searched Google for client-server protocols and found that CORBA and RPC are closest to what I had in mind, but I also want the app to be ready to accept web and mobile clients, so I also found REST and SOAP.
After all that reading I am left with these doubts: should I implement two different protocols, one for the GTK client (like RPC or CORBA) and another for web and mobile (REST or SOAP)?
Can I use REST or SOAP for everything?
I've implemented webservices using SOAP/XMLRPC (it was easy to support both, the framework I was using at the time made it pretty trivial) before; I had thought about using standard HTTP without the SOAP/XMLRPC layer (before I was aware that REST had a name :) but decided against it in the end because "I didn't want to write client-side code to handle the datastructures". (The Perl client also had easy SOAP/XMLRPC APIs.)
In the end, I regretted the decision I made: I could have written the code to handle the datastructures myself in an afternoon (or at the most a day) -- or if I had chosen to use JSON, probably two hours. But the burden of the SOAP/XMLRPC API and library dependencies lives on, years after I saved a few hours of developing, and will continue to be a burden for future development of the product.
So I recommend giving REST a really good try before going with an RPC framework.
Use REST. It's the simplest, and therefore the most widely accessible. If you really find a need for SOAP, RPC, or CORBA later, you can add them then.
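To make that concrete, here is a minimal sketch of a JSON-over-HTTP endpoint using only the standard library. The `/tasks/<id>` resource, the `TASKS` data, and the names `TaskHandler` and `make_server` are made up for illustration. A GTK client, a browser, and a mobile app could all consume the same response.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory data; a real service would query a database.
TASKS = {1: {"id": 1, "title": "write report"}}


class TaskHandler(BaseHTTPRequestHandler):
    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # GET /tasks/<id> returns one task as JSON, or a 404 error object.
        prefix = "/tasks/"
        task_id = self.path[len(prefix):] if self.path.startswith(prefix) else ""
        if task_id.isdigit() and int(task_id) in TASKS:
            self._send_json(200, TASKS[int(task_id)])
        else:
            self._send_json(404, {"error": "not found"})

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port=0):
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), TaskHandler)
```

Calling `make_server(8000).serve_forever()` would serve it locally. Any HTTP library on the client side can then fetch `/tasks/1` and decode the JSON, with no SOAP or CORBA stubs involved.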
I would like to write a simple server-push implementation, using either long polling or comet, that integrates into the server.
I don't want to use a networking framework like twisted because I want to learn how everything is done internally.
What exactly should I learn?
What specifications should I look at?
I'd prefer something that fits with Apache, so long polling is better, right?
Is there a way to implement such a thing without any external framework like Stackless Python?
This is not possible with Django alone, because Django works behind a standard HTTP server. To push, you need to write a server that supports a large number of parallel connections. To start with, I recommend reading the Orbited source code, both the server (Python) and the client (JavaScript).
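To get a feel for why parked connections need an event-driven design, here is a minimal sketch using the standard library's asyncio rather than Orbited or Twisted. It is not how Orbited works internally. The names `Broker`, `wait_for_message`, `publish`, and `demo` are made up for this example. Each waiting client costs one pending future instead of one thread.

```python
import asyncio


class Broker:
    """Holds one pending Future per waiting client and resolves them on publish."""

    def __init__(self):
        self._waiters = set()

    async def wait_for_message(self, timeout=30.0):
        """Park the caller until publish() is called or the timeout expires."""
        future = asyncio.get_running_loop().create_future()
        self._waiters.add(future)
        try:
            return await asyncio.wait_for(future, timeout)
        except asyncio.TimeoutError:
            return None              # the client should reconnect and wait again
        finally:
            self._waiters.discard(future)

    def publish(self, message):
        """Deliver `message` to every parked client; return how many were woken."""
        woken = 0
        for future in list(self._waiters):
            if not future.done():
                future.set_result(message)
                woken += 1
        return woken


async def demo(n_clients=1000):
    """Park n_clients waiters, publish once, and collect what each received."""
    broker = Broker()
    clients = [asyncio.create_task(broker.wait_for_message())
               for _ in range(n_clients)]
    await asyncio.sleep(0)           # let every client start waiting
    broker.publish("news item")
    return await asyncio.gather(*clients)
```

Running `asyncio.run(demo())` parks a thousand waiters in a single thread. A real server would attach each waiter to an open HTTP connection and write the message to the socket when the future resolves.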
I have been working with python for a while now. Recently I got into Sockets with Twisted which was good for learning Telnet, SSH, and Message Passing. I wanted to take an idea and implement it in a web fashion. A week of searching and all I can really do is create a resource that handles GET and POST all to itself. And this I am told is bad practice.
So the questions I have after one week:
* Are other options like Tornado and Standard Python Sockets a better (or more popular) approach?
* Should one really use separate resources in Twisted GET and POST operations?
* What is a good resource to start in this area of Python Development?
My language background is C, Java, and HTML/DHTML/XHTML/XML, and my main systems (even at home) are Linux.
I'd recommend against building your own web server and handling raw socket calls to build web applications; it makes much more sense to just write your web services as wsgi applications and use an existing web server, whether it's something like tornado or apache with mod_wsgi.
If what you're doing is more of a web site than an API, look into using a normal web framework like Django.
I'll try to answer your various points individually.
Are other options like Tornado and Standard Python Sockets a better (or more popular) approach?
WSGI frameworks are by far the most popular options these days. They can give you access to GET and POST primitives, but often wrap them with enough syntactic sugar to get you off to the races quickly.
Hardly anyone deals with sockets directly for HTTP. To give you an idea, one of the more popular HTTP libraries, requests, wrapped urllib2 until recently.
Should one really use separate resources in Twisted GET and POST operations?
I can't speak to this as I'm not a Twisted developer. It seems to be a language unto itself.
What is a good resource to start in this area of Python Development?
For handling GETs and POSTs, Webob is probably a good place to start.
For some more context, webob is wrapping base Python primitives coming from WSGI (rhymes with "whiskey"). WSGI is an interface between web applications and servers, not unlike CGI.
PEP 3333, the document which defined the WSGI standard, is a really good place to start if you're interested in the nitty gritty of http.
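To show what those primitives look like, here is a minimal sketch of a bare WSGI application that handles GET and POST with no framework at all. The `name` parameter and the response text are made up for illustration.

```python
from urllib.parse import parse_qs


def app(environ, start_response):
    """Minimal WSGI application that answers GET and POST."""
    method = environ["REQUEST_METHOD"]
    if method == "GET":
        # GET parameters arrive in the query string.
        params = parse_qs(environ.get("QUERY_STRING", ""))
    elif method == "POST":
        # POST parameters arrive in the request body (form-encoded here).
        length = int(environ.get("CONTENT_LENGTH") or 0)
        params = parse_qs(environ["wsgi.input"].read(length).decode("utf-8"))
    else:
        start_response("405 Method Not Allowed", [("Allow", "GET, POST")])
        return [b""]
    name = params.get("name", ["world"])[0]
    body = ("%s says hello to %s\n" % (method, name)).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

To try it locally, pass `app` to `wsgiref.simple_server.make_server("127.0.0.1", 8000, app)` and call `serve_forever()` on the result. The same callable can be handed to gunicorn or mod_wsgi unchanged, which is the point of the interface.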
Going a bit lower in the stack, there are also a number of WSGI servers worth checking out. Cloud-hosted Platform-as-a-Service (PaaS) options like Google App Engine and Heroku will take care of the details for you. On the other hand, there are specialized WSGI servers like gunicorn and Tornado, the latter of which you're already familiar with.
If you are looking to just get stuff done, check out Bottle, Flask, Django, or any of the other great Python web frameworks out there.