I'm looking for a good server/client protocol supported in Python for making data requests/file transfers between one server and many clients. Security is also an issue - so secure login would be a plus. I've been looking into XML-RPC, but it looks to be a pretty old (and possibly unused these days?) protocol.
If you are looking to do file transfers, XMLRPC is likely a bad choice. It will require that you encode all of your data as XML (and load it into memory).
"Data requests" and "file transfers" sounds a lot like plain old HTTP to me, but your statement of the problem doesn't make your requirements clear. What kind of information needs to be encoded in the request? Would a URL like "http://yourserver.example.com/service/request?color=yellow&flavor=banana" be good enough?
There are lots of HTTP clients and servers in Python, none of which are especially great, but all of which I'm sure will get the job done for basic file transfers. You can do security the "normal" web way, which is to use HTTPS and passwords, which will probably be sufficient.
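For instance, fetching a file over HTTPS with a username and password is only a few lines with the requests library (a sketch; the URL, credentials, and filename are placeholders):

import requests

# Download a file over HTTPS with HTTP Basic auth; the server's
# certificate is verified by default.
resp = requests.get("https://yourserver.example.com/files/report.csv",
                    auth=("alice", "secret"))
resp.raise_for_status()       # fail loudly if the server returned an error
with open("report.csv", "wb") as f:
    f.write(resp.content)     # save the transferred file locally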
If you want two-way communication then HTTP falls down, and a protocol like Twisted's Perspective Broker (PB) or Asynchronous Messaging Protocol (AMP) might suit you better; both are well supported by Twisted.
ProtocolBuffers was released by Google as a way of serializing data in a very compact, efficient way. They have support for C++, Java and Python. I haven't used it yet, but looking at the source, there seem to be RPC clients and servers for each language.
I personally have used XML-RPC on several projects, and it always did exactly what I was hoping for. I was usually going between C++, Java and Python. I use libxmlrpc in Python often because it's easy to memorize and type interactively, but it is actually much slower than the alternative pyxmlrpc.
PyAMF is mostly for RPC with Flash clients, but it's a compact RPC format worth looking at too.
When you have Python on both ends, I don't believe anything beats Pyro (Python Remote Objects). Pyro even has a "name server" that lets services announce their availability to a network. Clients use the name server to find the services they need, no matter where those services are active at a particular moment. This gives you free redundancy, and the ability to move services from one machine to another without any downtime.
For security, I'd tunnel over SSH, or use TLS or SSL at the connection level. Of course, all these options are essentially equivalent; they just differ in how hard they are to set up.
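To give a feel for the name server, here is a minimal sketch using the Pyro4 API; the class, logical name, and method are placeholders, and it assumes a name server has been started separately with "python -m Pyro4.naming":

# server.py: register an object with the Pyro name server
import Pyro4

@Pyro4.expose
class Greeter(object):
    def hello(self, name):
        return "Hello, %s" % name

daemon = Pyro4.Daemon()              # network listener for this process
ns = Pyro4.locateNS()                # find the running name server
uri = daemon.register(Greeter)       # publish the object
ns.register("example.greeter", uri)  # announce it under a logical name
daemon.requestLoop()

# client.py: look the service up by name, wherever it currently lives
import Pyro4

greeter = Pyro4.Proxy("PYRONAME:example.greeter")
print(greeter.hello("world"))

Because the client only knows the logical name, the server can move to another machine and re-register without any client changes.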
Pyro (Python Remote Objects) is fairly clever if all your servers/clients are going to be in Python. I use XMPP a lot, though, since I'm communicating with hosts that are not always Python. XMPP lends itself to being extended fairly easily too.
There is an excellent XMPP library for Python called PyXMPP, which is reasonably up to date and has no dependency on Twisted.
I suggest you look at:
1. XMLRPC
2. JSONRPC
3. SOAP
4. REST/ATOM
XMLRPC is a valid choice. Don't worry that it is too old; that is not a problem. It is so simple that little has needed changing since the original specification. The pro is that every programming language I know of has a client library for it, certainly Python. I made it work with mod_python and had no problem at all.
The big problem with it is its verbosity: for simple values there is a lot of XML overhead. You can gzip it, of course, but then you lose some debugging ability with tools like Fiddler.
My personal preference is JSONRPC. It has all of the XMLRPC advantages and is very compact. Furthermore, JavaScript clients can "eval" it, so no parsing is necessary. Most implementations are built for version 1.0 of the standard. I have seen diverse attempts to improve on it, called 1.1, 1.2, and 2.0, but they are not built one on top of another and, to my knowledge, are not widely supported yet. 2.0 looks the best, but I would still stick with 1.0 for now (October 2008).
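To show how compact it is, here is a hand-rolled JSON-RPC 1.0 call using only the standard library (Python 3 module names; the endpoint URL and the "add" method are hypothetical):

import json
import urllib.request

# A JSON-RPC 1.0 request is just a small JSON document POSTed over HTTP.
payload = json.dumps({"method": "add", "params": [2, 3], "id": 1}).encode()
req = urllib.request.Request(
    "http://localhost:8080/jsonrpc",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.loads(resp.read())
print(reply["result"])   # the answer; reply["error"] is set on failure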
The third candidate would be REST/ATOM. REST is a principle, and ATOM is how you convey the bulk of the data when needed, in POST and PUT requests and GET responses.
For a very nice implementation of it, look at GData, Google's API. Really, really nice.
SOAP is old, and lots of libraries/languages support it. It is heavy and complicated, but if your primary clients are .NET or Java, it might be worth the bother.
Visual Studio would import your WSDL file and create a wrapper, and to a C# programmer it would look just like a local assembly.
The nice thing about all this is that if you architect your solution right, existing Python libraries would allow you to support more than one protocol with almost no overhead. XMLRPC and JSONRPC are an especially good match.
Regarding authentication: XMLRPC and JSONRPC don't bother defining one; authentication is independent of the serialization, so you can implement Basic authentication, Digest authentication, or your own scheme with any of them. I have seen a couple of examples of client-side Digest authentication for Python, but have yet to see a server-side one. If you use Apache, you might not need one, using the mod_auth_digest Apache module instead. This depends on the nature of your application.
Transport security: it is obviously SSL (HTTPS). I can't currently remember how XMLRPC deals with it, but with the JSONRPC implementation that I have it is trivial: you merely change http to https in your JSONRPC URLs and the calls go over an SSL-enabled transport.
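Putting those two points together, here is a sketch with the standard XML-RPC client (Python 3 module name; the host, path, credentials, and method are hypothetical):

import xmlrpc.client

# Credentials embedded in an https:// URL give you Basic authentication
# over SSL; the standard client parses them out and sends the
# Authorization header on each call.
proxy = xmlrpc.client.ServerProxy("https://alice:secret@example.com/rpc")
print(proxy.some_method())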
HTTP seems to suit your requirements and is very well supported in Python.
Twisted is good for serious asynchronous network programming in Python, but it has a steep learning curve, so it might be worth using something simpler unless you know your system will need to handle a lot of concurrency.
To start, I would suggest using urllib for the client and a WSGI service behind Apache for the server. Apache can be set up to deal with HTTPS fairly simply.
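A minimal WSGI application of the kind you would deploy behind Apache (for example with mod_wsgi) might look like this sketch; the response body is a placeholder:

def application(environ, start_response):
    # Apache/mod_wsgi calls this function once per request.
    body = b"hello from the server"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

if __name__ == "__main__":
    # Stand-in for Apache during local testing.
    from wsgiref.simple_server import make_server
    make_server("localhost", 8000, application).serve_forever()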
SSH can be a good choice for file transfer and remote control, especially if you are concerned with secure login. Most Linux and Solaris servers already run an SSH service for administration, so if your Python program uses SSH then you don't need to open up any additional ports or services on remote machines.
OpenSSH is the standard and portable SSH client and server, and can be used via subprocesses from Python. If you want more flexibility, Twisted includes Twisted Conch, an SSH client and server implementation that provides flexible, programmable control of an SSH stack on both Linux and Windows. I use both in production.
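Driving OpenSSH from a subprocess looks roughly like this (a sketch; the host, user, and paths are placeholders, and key-based login is assumed):

import subprocess

# Copy a file to the remote machine with scp.
subprocess.check_call(
    ["scp", "report.csv", "user@example.com:/srv/data/report.csv"])

# Run a remote command over ssh and capture its output.
output = subprocess.check_output(["ssh", "user@example.com", "uname -a"])
print(output.decode())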
I'd use HTTP and start by understanding what the Python standard library offers.
Then I'd move on to the more industrial-strength Twisted library.
There is no need to use HTTP (indeed, HTTP is not good for RPC in general in some respects), and no need to use a standards-based protocol if you're talking about a python client talking to a python server.
Use a Python-specific RPC library such as Pyro, or what Twisted provides (Twisted.spread).
XMLRPC is very simple to get started with, and at my previous job, we used it extensively for intra-node communication in a distributed system. As long as you keep track of the fact that the None value can't be easily transferred, it's dead easy to work with, and included in Python's standard library.
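If you do need to pass None, both ends can opt in to the non-standard <nil> extension via allow_none; a sketch with the Python 3 module names:

from xmlrpc.server import SimpleXMLRPCServer
import xmlrpc.client

# Server side: allow_none=True lets marshalled values contain None.
server = SimpleXMLRPCServer(("localhost", 9000), allow_none=True)
server.register_function(lambda: None, "gives_none")
# server.serve_forever()  # start it as usual

# Client side must opt in as well.
client = xmlrpc.client.ServerProxy("http://localhost:9000", allow_none=True)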
Run it over HTTPS and add a username/password parameter to all calls, and you'll have simple security in place. I'm not sure how easy it is to verify the server certificate in Python, though.
However, if you are transferring large amounts of data, the encoding into XML might become a bottleneck, so a REST-inspired architecture over HTTPS may be a better fit than xmlrpclib.
Facebook's Thrift project may be a good answer. It uses a lightweight protocol to pass objects around and allows you to use any language you wish. It may fall down on security, though, as I believe there is none built in.
In the RPC field, Json-RPC will bring a big performance improvement over xml-rpc:
http://json-rpc.org/wiki/python-json-rpc
I'm a beginner in Python and I'd like to know the difference between the socket and requests modules. I'm sorry if this question is badly formatted; it may be because I don't know exactly how the different Internet protocols work. If so, I'd be grateful if you could point me to the relevant literature so that I can look for the answer myself. Can you give examples?
How Everything is Related
Requests is one of the most used, highest-level Python HTTP clients. It is built on urllib3, which is in turn built on the standard library's http.client (httplib in Python 2). These are all HTTP application-protocol libraries that use sockets to make their calls, and thus sockets are the foundation of them all.
More on the Foundational Sockets
Sockets are widely used by most operating systems to communicate with networks, since they can manage many types of connections and transmit data over them. HTTP libraries rely on sockets, so there are no HTTP/TCP connections without them. Sockets, however, can exist without HTTP.
Conclusion
In the end, the requests library is fundamentally built on several libraries, with sockets as (basically) the bottom level. You'll find that requests is the easiest to use, since it is the highest level and has the most documentation. You can also drop down to sockets themselves, which is much more difficult because it is lower level, but there is documentation out there that can lead you in the right direction for making your own connections and sending your own encoded data to act as an HTTP request. The main trade-off is that sockets are harder to work with but much more customizable and fast. Most Python web apps and scrapers use requests; raw sockets are used more rarely, by experts, in special cases.
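To make the layering concrete, here is the same HTTP GET done at both levels (a sketch; example.com stands in for any host):

# High level: one call with requests.
import requests
print(requests.get("http://example.com/").status_code)   # e.g. 200

# Low level: speak HTTP by hand over a raw socket.
import socket

sock = socket.create_connection(("example.com", 80))
sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
data = b""
while True:
    chunk = sock.recv(4096)
    if not chunk:          # server closed the connection
        break
    data += chunk
sock.close()
print(data.split(b"\r\n", 1)[0])   # the status line, e.g. b'HTTP/1.1 200 OK'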
More Reading
Intro to HTTP: https://www.tutorialspoint.com/http/index.htm
Distinguishing between HTTP and HTTPS is very important on the web: https://www.cloudflare.com/learning/ssl/why-is-http-not-secure/
Requests (Python's most widely used HTTP library): https://docs.python-requests.org/en/latest/
PySocks (a SOCKS proxy library built on top of Python sockets): https://pypi.org/project/PySocks/
The layered relationship of senders and receivers (advanced): https://www.geeksforgeeks.org/layers-of-osi-model/
I'm working on a project to expose a set of methods from various client machines to a server for the purpose of information gathering and automation. I'm using Python at the moment, and SimpleXMLRPCServer seems to work great on a local network, where I know the addresses of the client machines, and there's no NAT or firewall.
The problem is that the client/server model is backwards for what I want to do. Rather than have an RPC server running on the client machine, exposing a service to the software client, I'd like to have a server listening for connections from clients, which connect and expose the service to the server.
I'd thought about tunneling, remote port forwarding with SSH, or a VPN, but those options don't scale well, and introduce more overhead and complexity than I'd like.
I'm thinking I could write a server and client to reverse the model, but I don't want to reinvent the wheel if it already exists. It seems to me that this would be a common enough problem that there would be a solution for it already.
I'm also just cutting my teeth on Python and networked services, so it's possible I'm asking the wrong question entirely.
What you want is probably WAMP routed RPC.
It seems to address your issue and it's very convenient once you get used to it.
The idea is to put the WAMP router (let's say) in the cloud, and both RPC caller and RPC callee are clients with outbound connections to the router.
I was also using a VPN for connecting IoT devices together through the internet, but switching to this router model really simplified things, and it scales pretty well.
By the way WAMP is implemented in different languages, including Python.
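As a sketch of the callee side using the Autobahn library (this assumes a WAMP router such as Crossbar.io is reachable at the URL shown; the URL, realm, and procedure name are placeholders):

from autobahn.asyncio.wamp import ApplicationSession, ApplicationRunner

class Callee(ApplicationSession):
    async def onJoin(self, details):
        # Register a procedure with the router; any client connected to
        # the same realm can now call it by name, NAT or no NAT.
        def add(a, b):
            return a + b
        await self.register(add, "com.example.add")

# Outbound connection to the router in the cloud.
ApplicationRunner("ws://router.example.com:8080/ws", "realm1").run(Callee)

A caller is symmetric: it connects to the same router and invokes self.call("com.example.add", 2, 3), without ever knowing where the callee is.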
Maybe Pyro can be of use? It allows for many forms of distributed computing in Python. You are not very clear about your requirements, so it is hard to say whether this might work for you, but I advise you to have a look at the documentation or the many Pyro examples and see if there's something that matches what you want to do.
Pyro abstracts most of the networking intricacy away; you simply invoke a method on a (remote) Python object.
I need to do a client-server application; the client will be made with python-gtk, and all procedures will run on the server side to free the client of this workload.
So I searched on Google for client-server protocols and found that CORBA and RPC are closest to what I had in mind, BUT I also want to make this app ready to accept web and mobile clients, so I found REST and SOAP.
From all that reading I found myself with these doubts: should I implement two different protocols, one for the gtk client (like RPC or CORBA) and another for web and mobile (REST or SOAP)?
Can I use REST or SOAP for all of them?
I've implemented webservices using SOAP/XMLRPC (it was easy to support both, the framework I was using at the time made it pretty trivial) before; I had thought about using standard HTTP without the SOAP/XMLRPC layer (before I was aware that REST had a name :) but decided against it in the end because "I didn't want to write client-side code to handle the datastructures". (The Perl client also had easy SOAP/XMLRPC APIs.)
In the end, I regretted the decision I made: I could have written the code to handle the datastructures myself in an afternoon (or at the most a day) -- or if I had chosen to use JSON, probably two hours. But the burden of the SOAP/XMLRPC API and library dependencies lives on, years after I saved a few hours of developing, and will continue to be a burden for future development of the product.
So I recommend giving REST a really good try before going with an RPC framework.
Use REST. It's the simplest, and therefore the most widely accessible. If you really find a need for SOAP, RPC, or CORBA later, you can add them then.
What library should I use for network programming? Is sockets the best, or is there a higher level interface, that is standard?
I need something that will be pretty cross platform (ie. Linux, Windows, Mac OS X), and it only needs to be able to connect to other Python programs using the same library.
You just want to send python data between nodes (possibly on separate computers)? You might want to look at SimpleXMLRPCServer. It's based on the inbuilt HTTP server, which is based on the inbuilt Socket server, neither of which are the most industrial-strength servers around, but it's easy to set up in a hurry:
from SimpleXMLRPCServer import SimpleXMLRPCServer

server = SimpleXMLRPCServer(("localhost", 9876))

def my_func(a, b):
    return a + b

server.register_function(my_func)  # expose my_func to remote callers
server.serve_forever()
And easy to connect to:
import xmlrpclib

s = xmlrpclib.ServerProxy('http://localhost:9876')
print s.my_func(2, 3)         # 5
print type(s.my_func(2, 3))   # <type 'int'>
print s.my_func(2, 3.0)       # 7.0
Twisted is popular for industrial applications, but it's got a brutal learning curve.
There is a framework that you may be interested in: Twisted
The answer depends on what you are trying to do; "What library should I use for network programming?" is pretty vague.
For example, if you want to do HTTP, you might look at standard libraries such as urllib, urllib2, httplib, or socket. It all depends on which protocol you are looking to use and which network layer you want to work at.
There are libraries in Python for various network tasks: email, web, RPC, and so on.
For starters, look over the standard library reference manual and see which tasks you want to do, then go from there: http://docs.python.org/library/index.html
As previously mentioned, Twisted is the most popular (by far). However, there are a lot of other alternatives worth exploring. Tornado and Diesel are probably the top two contenders. A more complete comparison is found here.
Personally I just use asyncore from the standard library, which is a bit like a very cut-down version of Twisted, but this is because I prefer a simple and low level interface. If you want a higher level interface, especially just to communicate with another instance of your own program, you don't necessarily have to worry about the networking layer, and can consider something higher level like RPyC or pyro instead. The network then becomes an implementation detail and you can concentrate on just sending the information.
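For example, with RPyC a remote call stays at the level of a plain method invocation (a sketch; the service name, port, and method are placeholders):

# server.py
import rpyc
from rpyc.utils.server import ThreadedServer

class CalcService(rpyc.Service):
    def exposed_add(self, a, b):   # the exposed_ prefix makes it remotely callable
        return a + b

ThreadedServer(CalcService, port=18861).start()

# client.py
import rpyc

conn = rpyc.connect("localhost", 18861)
print(conn.root.add(2, 3))         # the network is an implementation detail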
A lot of people like Twisted. I was a huge fan for a while, but on working with it a bit and thinking about it more I've become disenchanted. It's complex, and last I looked, a lot of it assumed that your program would always be able to send data, leading to possible situations in which your program's memory usage grows endlessly, buffering data to send that isn't being picked up by the remote side, or isn't being picked up fast enough.
In my opinion, it depends a lot on what kind of network programming you want to do. A lot of the time you don't really care about getting stuff done while you're waiting for I/O. HTTP, for example, is very request-response oriented, and if you're only talking to a single server there is little reason to need something like Twisted; plain sockets or Python's built-in HTTP libraries will work fine.
If you're writing a server of any kind, you almost certainly need to be event-driven. Twisted has a slight edge there, but it still seems overly complex to me. Bittorrent, for example, was written in Python and doesn't use Twisted at all.
Another factor favoring Twisted is that there is code for a lot of protocols already written for it. So if you want to speak an existing protocol a lot of hard work may already have been done for you.
The socket module in the standard lib is in my opinion a good choice if you don't need high performance.
It is a very famous API that is known by almost every developer in almost every language. It's quite simple, and there is a lot of information available on the Internet. Moreover, it will be easier for other people to understand your code.
I guess that an event-driven framework like Twisted has better performance, but in basic cases standard sockets are enough.
Of course, if you use a higher-level protocol (HTTP, FTP, ...), you should use the corresponding implementation in the Python standard library.
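And if you do roll your own protocol, a blocking TCP echo server with the standard socket module is only a few lines (a sketch; the port is arbitrary):

import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("localhost", 9999))
server.listen(1)

while True:
    conn, addr = server.accept()   # blocks until a client connects
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:           # client closed the connection
                break
            conn.sendall(data)     # echo the bytes back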
Socket is a low-level API; it maps directly onto the operating system interface.
Twisted, Tornado, etc. are high-level frameworks (built on sockets, of course, since sockets are the low-level layer).
When it comes to TCP/IP programming, you should have some basic knowledge to decide what you should use:
Will you use a well-known protocol like HTTP or FTP, or create your own protocol?
Blocking or non-blocking? Twisted and Tornado are non-blocking frameworks (basically like Node.js).
Of course, sockets can do everything, since every other framework is based on them ;)
I have been working with Python for a while now. Recently I got into sockets with Twisted, which was good for learning Telnet, SSH, and message passing. I wanted to take an idea and implement it in a web fashion. A week of searching, and all I can really do is create a single resource that handles GET and POST all to itself. And this, I am told, is bad practice.
So The questions I have after one week:
* Are other options like Tornado and Standard Python Sockets a better (or more popular) approach?
* Should one really use separate resources in Twisted GET and POST operations?
* What is a good resource to start in this area of Python Development?
My background with languages are C, Java, HTML/DHTML/XHTML/XML and my main systems (even home) are Linux.
I'd recommend against building your own web server and handling raw socket calls to build web applications; it makes much more sense to just write your web services as wsgi applications and use an existing web server, whether it's something like tornado or apache with mod_wsgi.
If what you're doing is more of a web site than an API, look into using a normal web framework like Django.
I'll try to answer your various points individually.
Are other options like Tornado and standard Python sockets a better (or more popular) approach?

WSGI frameworks are by far the most popular options these days. They can give you access to GET and POST primitives, but often wrap them with enough syntactic sugar to get you off to the races quickly.
Hardly anyone deals with raw sockets for HTTP. To give you an idea, one of the more popular HTTP libraries, requests, initially wrapped urllib2 up until recently.
Should one really use separate resources in Twisted GET and POST operations?
I can't speak to this as I'm not a Twisted developer. It seems to be a language unto itself.
What is a good resource to start in this area of Python Development?
For handling GETs and POSTs, Webob is probably a good place to start.
For some more context, WebOb wraps the base Python primitives coming from WSGI (rhymes with "whiskey"). WSGI is an interface between web applications and servers, not unlike CGI.
PEP 3333, the document that defines the WSGI standard, is a really good place to start if you're interested in the nitty-gritty of HTTP.
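Here is a sketch of WebOb smoothing over the raw WSGI interface that PEP 3333 defines (the response text is a placeholder):

from webob import Request, Response

def application(environ, start_response):
    req = Request(environ)                  # wraps the raw WSGI environ dict
    resp = Response("you sent a %s request" % req.method)
    return resp(environ, start_response)    # a Response is itself a WSGI app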
Going a bit lower in the stack, there are also a number of WSGI servers worth checking out. Cloud-hosted Platform-as-a-Service (PaaS) options like Google App Engine and Heroku will take care of the details for you. On the other hand, there are specialized WSGI servers like Gunicorn and Tornado, the latter of which you're already familiar with.
If you are looking to just get stuff done, check out Bottle, Flask, Django, or any of the other great Python web frameworks out there.
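To give a flavor of those frameworks, here is a minimal Flask application handling GET and POST on a single resource (a sketch; the route and form field are placeholders):

from flask import Flask, request

app = Flask(__name__)

@app.route("/items", methods=["GET", "POST"])
def items():
    if request.method == "POST":
        # The framework has already parsed the form body for us.
        return "created %s" % request.form.get("name", "?"), 201
    return "here is the list of items"

if __name__ == "__main__":
    app.run(port=5000)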