I'm getting started with websockets and am trying to write a Python server for a browser-based (JavaScript) client.
I have also never really done asynchronous programming before (except "events"). I was trying to avoid it - I have searched and searched for an example of websocket use that did not involve importing tornado or asyncio. But I've found nothing, even the "most basic examples" do it.
So now I'm internalising it, but clear it up for me - is "full duplex" server code necessarily asynchronous?
Full-duplex servers are necessarily concurrent. In Tornado and asyncio, concurrency is based on the asynchronous programming model, so if you use a websocket library based on one of those packages, your code will need to be asynchronous.
But that's not the only way: full-duplex websockets could be implemented in a synchronous way by dedicating a thread to reading from the connection (in addition to whatever other threads you're using). I don't know if there are any Python websocket implementations that support this kind of multithreading model for full-duplex, but that's how Go's websocket implementation works, for example.
That said, the asynchronous/event-driven model is a natural fit for websockets (it's how everything works on the JavaScript side), so I would encourage you to get comfortable with that model instead of trying to find a way to work with websockets synchronously.
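To make the dedicated-reader-thread idea concrete, here is a minimal sketch using only the standard library. It uses a plain socket pair rather than a real websocket connection (no handshake or framing), and the names `reader` and `demo` are made up for illustration. The point is only that one thread can block on reads while another writes to the same connection.

```python
import socket
import threading

def reader(conn, received):
    """Block on recv() in a dedicated thread and collect incoming data."""
    while True:
        data = conn.recv(1024)
        if not data:              # peer closed its sending side
            break
        received.append(data)

def demo():
    # socketpair() stands in for an accepted client connection in this sketch.
    server_side, client_side = socket.socketpair()
    received = []
    t = threading.Thread(target=reader, args=(server_side, received))
    t.start()

    # The main thread is free to write while the reader thread blocks on recv().
    server_side.sendall(b"server push")
    client_side.sendall(b"client message")
    pushed = client_side.recv(1024)

    client_side.shutdown(socket.SHUT_WR)   # signals end-of-stream to the reader
    t.join()
    server_side.close()
    client_side.close()
    return pushed, b"".join(received)

if __name__ == "__main__":
    print(demo())
```

The code stays synchronous and both directions are serviced at once; the cost is one extra thread per connection.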
I need to build a client-server application. The client will be written with python-gtk,
and all processing will happen on the server side to free the client of that workload.
So I searched Google for client-server protocols and found that CORBA and RPC are closest to what I had in mind, but I also want this app to be ready to accept web and mobile clients, so I found REST and SOAP.
After all that reading I'm left with these doubts: should I implement two different protocols, one for the GTK client (like RPC or CORBA) and another for web and mobile (REST or SOAP)?
Can I use REST or SOAP for everything?
I've implemented web services using SOAP/XMLRPC (it was easy to support both, the framework I was using at the time made it pretty trivial) before; I had thought about using standard HTTP without the SOAP/XMLRPC layer (before I was aware that REST had a name :) but decided against it in the end because "I didn't want to write client-side code to handle the data structures". (The Perl client also had easy SOAP/XMLRPC APIs.)
In the end, I regretted the decision I made: I could have written the code to handle the data structures myself in an afternoon (or at the most a day) -- or if I had chosen to use JSON, probably two hours. But the burden of the SOAP/XMLRPC API and library dependencies lives on, years after I saved a few hours of developing, and will continue to be a burden for future development of the product.
So I recommend giving REST a really good try before going with an RPC framework.
Use REST. It's the simplest, and therefore the most widely accessible. If you really find a need for SOAP, RPC, or CORBA later, you can add them then.
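As a rough illustration of how little is needed, here is a sketch of a JSON-over-HTTP resource using only the standard library. The `/items/<id>` URL layout, the `ITEMS` data and the helper `fetch_item` are all assumptions made up for this example; a real service would more likely sit behind a web framework.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

ITEMS = {"1": {"name": "example item"}}   # hypothetical in-memory data

class ItemHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # GET /items/<id> returns one resource as JSON, or a 404 error body.
        prefix = "/items/"
        item = ITEMS.get(self.path[len(prefix):]) if self.path.startswith(prefix) else None
        found = item is not None
        body = json.dumps(item if found else {"error": "not found"}).encode("utf-8")
        self.send_response(200 if found else 404)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet

def fetch_item(item_id):
    """Start a throwaway server on a free port and fetch one item from it."""
    server = HTTPServer(("127.0.0.1", 0), ItemHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/items/" + item_id)
        data = json.loads(conn.getresponse().read().decode("utf-8"))
        conn.close()
        return data
    finally:
        server.shutdown()
        server.server_close()

if __name__ == "__main__":
    print(fetch_item("1"))
```

Any client that can make an HTTP request and parse JSON (GTK, web or mobile) can talk to the same endpoint.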
I decided to improve my knowledge of Python network programming, and here is the deal: I have a simple server for Windows, which interacts with a client on a mobile device over Wi-Fi. I also have a packet sniffer (Wireshark).
Now I want to ask: what do I need in order to write a Linux version of this server? How do I determine the structure of the packets and establish the connection? What should I use: sockets, Twisted, maybe Tornado?
Start with the SocketServer module (renamed socketserver in Python 3) and build from there.
Note that this will take a lot of guesswork if there is no documentation about the protocol. If you're lucky, they are using XML or HTML. If not, you will have to make the existing server send a lot of test data which you then manipulate in some way (by changing fields and seeing what changes in the data stream).
Good luck!
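As a starting point for that kind of experimentation, here is a minimal sketch of a TCP server that hex-dumps whatever the client sends, so it can be compared with the Wireshark capture. It assumes the protocol runs over TCP, which the question doesn't say, and the echo reply is only a placeholder, not the real protocol. `DumpHandler`, `captured` and `run_once` are names invented for this example.

```python
import socket
import socketserver
import threading

captured = []   # raw payloads seen by the server, kept for later inspection

class DumpHandler(socketserver.BaseRequestHandler):
    def handle(self):
        data = self.request.recv(4096)
        captured.append(data)
        print(data.hex())            # hex view, comparable with a packet capture
        self.request.sendall(data)   # placeholder reply: echo the bytes back

def run_once(payload):
    """Start the server on a free local port, send one payload, return the reply."""
    with socketserver.TCPServer(("127.0.0.1", 0), DumpHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        with socket.create_connection(server.server_address) as client:
            client.sendall(payload)
            reply = client.recv(4096)
        server.shutdown()
    return reply

if __name__ == "__main__":
    run_once(b"\x01\x02hello")
```

Once the packet layout is understood, the echo in `handle` would be replaced by code that parses the request and builds a proper response.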
I am building a web application that has a real-time feed (similar to Facebook's newsfeed) that I want to update via a long-polling mechanism. I understand that with Python, my choices are pretty much to either use Stackless (building from their Comet wsgi example) or Cometd + Twisted. Unfortunately there is very little documentation regarding these options and I cannot find good information online about production scale users of comet on Python.
Has anyone successfully implemented comet on Python in a production system? How did you go about doing it and where can I find resources to implement my own?
Orbited seems like a nice solution. I haven't tried it, though.
Update: things have changed in the last 2.5 years.
We now have websockets in all major browsers, except IE (naturally), and a couple of very good abstractions over them that provide many methods of emulating real-time communication:
socket.io along with tornadio (socket.io 0.6) and tornadio2 (socket.io 0.7+)
sock.js along with SockJS-tornado
I recommend StreamHub Comet Server. It's used by a lot of people, and I personally use it with a couple of Django sites I run. You will need to write a tiny bit of Java to handle the streaming; I did this using Jython. The front-end code is some really simple JavaScript, a la:
var hub = new StreamHub();
hub.connect("http://myserver.com/");
hub.subscribe("newsfeed", function(sTopic, oData) { alert("new news item: " + oData.Title); });
The documentation is pretty good. I had similar problems to yours trying to get started with the sparse docs of Cometd et al. For a start I'd read Getting Started With Comet and StreamHub, download it and see how some of the examples work, and refer to the API docs if you need to:
Javascript API JSDoc
Streaming from Java Javadoc
Here is a full-featured Python example of combining Django, Orbited, and Twisted to create a real-time (Comet) app: http://github.com/clemesha/hotdot
I've done tons of APIs using twisted for stuff like that, most of which are available on my github account.
Most are client-side, but slosh is a server I wrote to do a realtime cheap pubsub sort of thing. It scales somewhat horizontally for reads by allowing for simple stream replication. Writes are a little different when you stick to plain HTTP, but I've pushed a decent amount through it for a demo.
Otherwise, you have full-on BOSH which most XMPP servers support and will allow you to decouple the message distribution from the web frontend.
I haven't done it, but this guy has and writes a good article about it, with Django examples and pointers (which I haven't checked) to other frameworks.
The Orbited and Redis solutions are nice, but no longer relevant when you have something like PubSubHubbub, which Google released. It makes it very easy to be the publisher of or the subscriber to a given feed. http://code.google.com/p/pubsubhubbub/
Here's an example that does long-polling with gevent and Django.
It uses greenlet - stack switching functionality from Stackless packaged as a CPython extension.
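The core of long polling is a request that blocks until there is something new to return or a timeout expires. The sketch below shows that idea with standard-library threads and a condition variable instead of gevent greenlets, so it is a different concurrency mechanism from the linked example. The `Feed` class and its method names are made up for illustration.

```python
import threading

class Feed:
    """In-memory feed that lets readers block until something new arrives."""

    def __init__(self):
        self._items = []
        self._cond = threading.Condition()

    def publish(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify_all()      # wake every waiting poll

    def poll(self, last_seen, timeout=30.0):
        """Return items after index last_seen, waiting up to timeout seconds."""
        with self._cond:
            self._cond.wait_for(lambda: len(self._items) > last_seen, timeout)
            return self._items[last_seen:]

if __name__ == "__main__":
    feed = Feed()
    # Simulate a new story arriving while a client is waiting.
    threading.Timer(0.1, feed.publish, args=("new story",)).start()
    print(feed.poll(0, timeout=5))
```

A request handler would call `poll` with the index of the last item the client has seen and return the result as the HTTP response; the client then immediately issues the next request. With gevent the waiting would happen in greenlets, which are much cheaper than one OS thread per waiting client.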
I'm looking for a good server/client protocol supported in Python for making data requests/file transfers between one server and many clients. Security is also an issue - so secure login would be a plus. I've been looking into XML-RPC, but it looks to be a pretty old (and possibly unused these days?) protocol.
If you are looking to do file transfers, XMLRPC is likely a bad choice. It will require that you encode all of your data as XML (and load it into memory).
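To see the overhead, here is a small sketch using the standard library's `xmlrpc.client` (Python 3 naming). Binary data has to be wrapped in `Binary`, which base64-encodes it inside the XML document; the method name `upload` is hypothetical.

```python
import os
import xmlrpc.client

payload = os.urandom(30000)   # stand-in for the contents of a file

# The whole payload is base64-encoded and embedded in one in-memory XML string.
request = xmlrpc.client.dumps((xmlrpc.client.Binary(payload),), methodname="upload")

print(len(payload), len(request))   # the request is roughly a third larger

# The receiving side likewise has to parse the whole document to get the bytes back.
params, method = xmlrpc.client.loads(request)
assert params[0].data == payload
```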
"Data requests" and "file transfers" sounds a lot like plain old HTTP to me, but your statement of the problem doesn't make your requirements clear. What kind of information needs to be encoded in the request? Would a URL like "http://yourserver.example.com/service/request?color=yellow&flavor=banana" be good enough?
There are lots of HTTP clients and servers in Python, none of which are especially great, but all of which I'm sure will get the job done for basic file transfers. You can do security the "normal" web way, which is to use HTTPS and passwords, which will probably be sufficient.
If you want two-way communication then HTTP falls down, and a protocol like Twisted's perspective broker (PB) or asynchronous messaging protocol (AMP) might suit you better. These protocols are certainly well-supported by Twisted.
ProtocolBuffers was released by Google as a way of serializing data in a very compact efficient way. They have support for C++, Java and Python. I haven't used it yet, but looking at the source, there seem to be RPC clients and servers for each language.
I personally have used XML-RPC on several projects, and it always did exactly what I was hoping for. I was usually going between C++, Java and Python. I use libxmlrpc in Python often because it's easy to memorize and type interactively, but it is actually much slower than the alternative pyxmlrpc.
PyAMF is mostly for RPC with Flash clients, but it's a compact RPC format worth looking at too.
When you have Python on both ends, I don't believe anything beats Pyro (Python Remote Objects). Pyro even has a "name server" that lets services announce their availability to a network. Clients use the name server to find the services they need no matter where those services are active at a particular moment. This gives you free redundancy, and the ability to move services from one machine to another without any downtime.
For security, I'd tunnel over SSH, or use TLS or SSL at the connection level. Of course, all these options are essentially the same, they just have various difficulties of setup.
Pyro (Python Remote Objects) is fairly clever if all your servers and clients are going to be in Python. I use XMPP a lot, though, since I'm communicating with hosts that are not always Python. XMPP lends itself to being extended fairly easily too.
There is an excellent XMPP library for Python called PyXMPP, which is reasonably up to date and has no dependency on Twisted.
I suggest you look at:
1. XMLRPC
2. JSONRPC
3. SOAP
4. REST/ATOM
XMLRPC is a valid choice. Don't worry that it is old; that is not a problem. It is so simple that little has needed changing since the original specification. The pro is that every programming language I know has a library for writing a client, and certainly Python does. I made it work with mod_python and had no problem at all.
The big problem with it is its verbosity. For simple values there is a lot of XML overhead. You can gzip it of course, but then you lose some debugging ability with tools like Fiddler.
My personal preference is JSONRPC. It has all of the XMLRPC advantages and it is very compact. Further, JavaScript clients can "eval" it, so no separate parser is necessary. Most implementations are built for version 1.0 of the standard. I have seen diverse attempts to improve on it, called 1.1, 1.2 and 2.0, but they are not built one on top of another and, to my knowledge, are not widely supported yet. 2.0 looks the best, but I would still stick with 1.0 for now (October 2008).
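The difference in size is easy to check. This sketch encodes the same call both ways with the standard library (Python 3 module names); the method name `add_item` and its parameters are invented for the example, and the JSON object follows the JSON-RPC 1.0 request layout (`method`, `params`, `id`).

```python
import json
import xmlrpc.client

params = (3, "banana", [1.5, 2.5])

# XML-RPC request body for add_item(3, "banana", [1.5, 2.5])
xml_request = xmlrpc.client.dumps(params, methodname="add_item")

# The same call as a JSON-RPC 1.0 request object
json_request = json.dumps({"method": "add_item", "params": list(params), "id": 1})

print(len(xml_request), len(json_request))
```

The XML version is several times longer for small values like these, which is the verbosity described above.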
The third candidate would be REST/ATOM. REST is a principle, and ATOM is how you convey the bulk of the data when needed, for POST and PUT requests and GET responses.
For a very nice implementation of it, look at GData, Google's API. Real real nice.
SOAP is old, and lots and lots of libraries and languages support it. It is heavy and complicated, but if your primary clients are .NET or Java, it might be worth the bother.
Visual Studio would import your WSDL file and create a wrapper, and to a C# programmer it would look just like a local assembly.
The nice thing about all this is that if you architect your solution right, existing libraries for Python will let you support more than one of these with almost no overhead. XMLRPC and JSONRPC are an especially good match.
Regarding authentication: XMLRPC and JSONRPC don't bother defining one. It is independent of the serialization, so you can implement Basic Authentication, Digest Authentication or your own scheme with any of them. I have seen a couple of examples of client-side Digest Authentication for Python, but have yet to see a server-side one. If you use Apache, you might not need one, and can use the mod_auth_digest Apache module instead. This depends on the nature of your application.
Transport security: it is obviously SSL (HTTPS). I can't currently remember how XMLRPC deals with it, but with the JSONRPC implementation that I have it is trivial: you merely change http to https in your JSONRPC URLs and it goes over an SSL-enabled transport.
HTTP seems to suit your requirements and is very well supported in Python.
Twisted is good for serious asynchronous network programming in Python, but it has a steep learning curve, so it might be worth using something simpler unless you know your system will need to handle a lot of concurrency.
To start, I would suggest using urllib for the client and a WSGI service behind Apache for the server. Apache can be set up to deal with HTTPS fairly simply.
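Here is a minimal sketch of that pairing, with a few assumptions: it uses `wsgiref`'s built-in development server in place of Apache so that it runs on its own, the Python 3 name `urllib.request` for the client, and a made-up `app` that just echoes the query string. Behind Apache the same `app` callable would be served by something like mod_wsgi.

```python
import threading
import urllib.request
from wsgiref.simple_server import WSGIRequestHandler, make_server

def app(environ, start_response):
    """WSGI application: report the query string it was called with."""
    body = ("you asked for " + environ.get("QUERY_STRING", "")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]

class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request logging in this sketch

def demo():
    server = make_server("127.0.0.1", 0, app, handler_class=QuietHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Ignore any proxy settings, since this demo only talks to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    url = "http://127.0.0.1:%d/service/request?color=yellow" % server.server_port
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()

if __name__ == "__main__":
    print(demo())
```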
SSH can be a good choice for file transfer and remote control, especially if you are concerned with secure login. Most Linux and Solaris servers will already run an SSH service for administration, so if your Python program uses SSH then you don't need to open up any additional ports or services on remote machines.
OpenSSH is the standard and portable SSH client and server, and can be used via subprocesses from Python. If you want more flexibility, Twisted includes Twisted Conch, an SSH client and server implementation which provides flexible programmable control of an SSH stack, on both Linux and Windows. I use both in production.
I'd use http and start with understanding what the Python library offers.
Then I'd move onto the more industrial strength Twisted library.
There is no need to use HTTP (indeed, HTTP is not good for RPC in general in some respects), and no need to use a standards-based protocol if you're talking about a python client talking to a python server.
Use a Python-specific RPC library such as Pyro, or what Twisted provides (twisted.spread).
XMLRPC is very simple to get started with, and at my previous job, we used it extensively for intra-node communication in a distributed system. As long as you keep track of the fact that the None value can't be easily transferred, it's dead easy to work with, and included in Python's standard library.
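The None limitation is easy to demonstrate with the standard library's marshaller (`xmlrpc.client` in Python 3). The method name `store` is hypothetical. By default None is rejected; the `allow_none` flag enables the non-standard `<nil/>` extension, which the other end must also understand.

```python
import xmlrpc.client

# Default behaviour: None cannot be marshalled.
try:
    xmlrpc.client.dumps((None,), methodname="store")
    strict_ok = True
except TypeError:
    strict_ok = False

# Opting in to the <nil/> extension makes it work.
relaxed = xmlrpc.client.dumps((None,), methodname="store", allow_none=True)
params, method = xmlrpc.client.loads(relaxed)

print(strict_ok, params)
```

`ServerProxy` and `SimpleXMLRPCServer` accept the same `allow_none` argument, so both ends can opt in if they are both under your control.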
Run it over https and add a username/password parameter to all calls, and you'll have simple security in place. Not sure about how easy it is to verify server certificate in Python, though.
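On the certificate question: in current Python 3 this is straightforward, which may not have been the case when the answer above was written. A minimal sketch, assuming a hypothetical endpoint `https://rpc.example.com/`: a context from `ssl.create_default_context()` verifies the server certificate and hostname against the system trust store, and `ServerProxy` accepts it through its `context` argument.

```python
import ssl
import xmlrpc.client

# Default context: certificate verification and hostname checking are both on.
context = ssl.create_default_context()

# Hypothetical endpoint; no connection is made until a method is actually called.
proxy = xmlrpc.client.ServerProxy("https://rpc.example.com/", context=context)

print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

A call made through `proxy` would then fail with an SSL error if the server's certificate could not be verified.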
However, if you are transferring large amounts of data, the coding into XML might become a bottleneck, so using a REST-inspired architecture over https may be as good as xmlrpclib.
Facebook's Thrift project may be a good answer. It uses a lightweight protocol to pass objects around and allows you to use any language you wish. It may fall down on security, though, as I believe there is none.
In the RPC field, JSON-RPC will bring a big performance improvement over XML-RPC:
http://json-rpc.org/wiki/python-json-rpc