I'm a beginner in python and I'd like to know what the difference between socket and requests modules in python is. I'm sorry if this question format is bad, if this may be due to the fact that I do not know exactly how different protocols of Internet work, then I will be grateful if you share the relevant literature, so that I myself look for the answer to my question. Can you give examples?
How Everything is Related
Requests is one of the most used, highest level python HTTP clients. It is built on urllib3, which is built on httpclient lib. These are all HTTP application protocol libraries that utilize sockets to make calls, and thus sockets is foundation of them all.
More on the Foundational Sockets
Sockets are widely used with most Operating Systems to communicate with networks, since they can control tons of types of connections and transmit data. HTTP libraries rely on sockets, and thus there is no HTTP/TCP protocol connections without them. However, on the opposite side, sockets can exist without HTTP.
Conclusion
In the end, the requests library is fundamentally built on several libs with sockets being the (basically) bottom level. You’ll find that requests is the easiest to use since it is the highest level with the most documentation. You can also use pysocks, however it is much more difficult since it is lower level, but there is some documentation out there that can lead you to the right direction for making your own connections and sending your own encoded data to act as an HTTP request. The main trade off is pysocks are harder to work with, but much more customizable and fast. Most python web-apps/scrapers use requests, and sockets are used more rarely by experts in some cases.
More Reading
Intro to HTTP: https://www.tutorialspoint.com/http/index.htm
Distinguishing Between HTTP and HTTPS protocols are very important to the web: https://www.cloudflare.com/learning/ssl/why-is-http-not-secure/
Requests (pythons objectively most widely used HTTP library): https://docs.python-requests.org/en/latest/
Pysocks (pythons use of sockets): https://pypi.org/project/PySocks/
The Relation of Senders/Receivers (advanced): https://www.geeksforgeeks.org/layers-of-osi-model/
Related
I'm trying to use spyne (http://spyne.io) in my server with ZeroMQ and MsgPack. I've followed the examples to program the server side, but i can't find any example that helps me to know how to program the client side.
I've found the class spyne.client.zeromq.ZeroMQClient , but I don't know what it's supposed to be the 'app' parameter of its constructor.
Thank you in advance!
Edit:
The (simplified) server-side code:
from spyne.application import Application
from spyne.protocol.msgpack import MessagePackRpc
from spyne.server.zeromq import ZeroMQServer
from spyne.service import ServiceBase
from spyne.decorator import srpc
from spyne.model.primitive import Unicode
class RadianteRPC(ServiceBase):
#srpc(_returns=Unicode)
def whoiam():
return "Hello I am Seldon!"
radiante_rpc = Application(
[RadianteRPC],
tns="radiante.rpc",
in_protocol=MessagePackRpc(validator="soft"),
out_protocol=MessagePackRpc()
)
s = ZeroMQServer(radiante_rpc, "tcp://127.0.0.1:5001")
s.serve_forever()
Spyne author here.
There are many issues with the Spyne's client transports.
First and most important being that they require server code to work. And that's because Spyne's wsdl parser is just halfway done, so there's no way to communicate the interface the server exposes to a client.
Once the Wsdl parser is done, Spyne's client transports will be revived as well. They're working just fine though, the tests pass, but they are (slightly) obsolete and, as you noticed, don't have proper docs.
Now back to your question: The app parameter to the client constructor is the same application instance that goes to the server constructor. So if you do this:
c = ZeroMQClient("tcp://127.0.0.1:5001", radiante_rpc)
print c.service.whoiam()
It will print "Hello I am Seldon!"
Here's the full code I just committed: https://github.com/arskom/spyne/tree/master/examples/zeromq
BUT:
All this said, you should not use ZeroMQ for RPC.
I looked at ZeroMQ for RPC purposes back when its hype was up at crazy levels, (I even got my name in ZeroMQ contributors list :)) I did not like what I saw, and I moved on.
Pasting my relevant news.yc comment from https://news.ycombinator.com/item?id=6089252 here:
In my experience, ZeroMQ is very fragile in RPC-like applications,
especially because it tries to abstract away the "connection". This
mindset is very appropriate when you're doing multicast (and ZeroMQ
rocks when doing multicast), but for unicast, I actually want to
detect a disconnection or a connection failure and handle it
appropriately before my outgoing buffers are choked to death. So, I'd
evaluate other alternatives before settling on ZeroMQ as a transport
for internal RPC-type messaging.
If you are fine with having the whole message in memory before parsing
(or sending) it (Http is not that bad when it comes to transferring
huge documents over the network), writing raw MessagePack document to
a regular TCP stream (or tucking it inside a UDP datagram) will do the
trick just fine. MessagePack library does support parsing streams --
see e.g. its Python example in its homepage (http://msgpack.org).
Disclosure: I'm just a happy MessagePack (and sometimes ZeroMQ) user.
I work on Spyne (http://spyne.io) so I just have experience with some
of the most popular protocols out there.
I seem to have written that comment more than a year ago. Fast forward to today, I got the MessagePack transport implemented and released in Spyne 2.11. So if you're looking for a lightweight transport for internally passing small messages, my recommendation would be to use it instead of ZeroMQ.
However, once you're outside the Http-land, you're back to dealing with sockets at the system-level, which may or may not be what you want, depending especially on the amount of resources you have to spare for this bit of your project.
Sadly, there is no documentation about it besides the examples I just put together here: https://github.com/arskom/spyne/tree/master/examples/msgpack_transport
The server code is fairly standard Spyne/Twisted code but the client is using system-level sockets to illustrate how it's supposed to work. I'd happily accept a pull request wrapping it to a proper Spyne client transport.
I hope this helps. Patches are welcome.
Best regards,
thanks for the interesting responses thus far. In light of said responses I have changed my question a bit.
guess what I really need to know is, is socketserver as opposed to the straight-up socket library designed to handle both periods of latency and stress, i.e. does it have additional mechanisms or features that justify its implicitly advertised status as a "server," or is it just slightly easier to use?
everyone seems to be recommending socketserver but I'm still not entirely clear why, as opposed to socket.
thanks!!!
I've built some server programs in
python based on the standard socket
library
http://docs.python.org/library/socket.html
I've noticed that they seem to work
just fine except that without load
they have a tendency to go to sleep
after a while. I guess this may not
be an issue in production (no doubt
there will be plenty of other issues)
but I would like to know if I am
using the right code for the job here.
Looking around I saw that python also
provides a socketserver library -
http://docs.python.org/library/socketserver.html
The socket library provides the
ability to listen for multiple
connections, typically up to 5.
According to the socketserver page,
its services are synchronous, i.e.
blocking, but one may support
asynchronous behavior via threading.
I did notice it has the ability to
maintain a request queue, with a
default value of up to 5 requests...so
maybe not much difference there.
I have also read that Twisted runs
socketserver under the hood. Though I
would rather not get into a beast the
size of Twisted unless it's going to
be worthwhile.
so my question is, is socketserver
more robust than socket? If so, why?
(And how do you know?)
incidentally, is socketserver built on
top of python's socket or is it
entirely separate?
finally, as a bonus if anyone knows
what one could do wrong such that
standard sockets 'fall asleep' please
feel free to chime in on that too.
Oh, and I'm talking python 2.x rather
than 3.x here if that makes a
difference.
thanks folks!
jsh
Well, I don't have a technical answer but I've implemented SocketServer per folks' recommendations and it IS definitely more reliable. If anyone ever comes up with the low-level explanation please let me know...thanks!
The socket module is a very low-level module for sending and receiving packets. As said in the documentation, it "provides access to the BSD socket interface".
If you want something more elaborate, there is "socketserver" that takes care of the gory details for you, but it is still relatively low level.
On top of that you can find an HTTP server, with or without CGI, an XML-RPC server, and so on. These are frameworks, which usually means that their code calls your code. It makes things simpler because you just have to fill some "gaps" to have a fully working server, but it also means you have a little bit less control over what it does.
If you only need features of socketserver, I would probably go with it, unless you want to reinvent the wheel for some reason (and there are always good reasons to design new wheels, for example to understand how it works).
I have a project that is based on Twisted used to
communicate with network devices and I am adding support for a new
vendor (Citrix NetScaler) whose API is SOAP. Unfortunately the
support for SOAP in Twisted still relies on SOAPpy, which is badly out
of date. In fact as of this question (I just checked), twisted.web.soap
itself hasn't even been updated in 21 months!
I would like to ask if anyone has any experience they would be willing
to share with utilizing Twisted's superb asynchronous transport
functionality with SUDS. It seems like plugging in a custom Twisted
transport would be a natural fit in SUDS' Client.options.transport, I'm just having
a hard time wrapping my head around it.
I did come up with a way to call the SOAP method with SUDS
asynchronously by utilizing twisted.internet.threads.deferToThread(),
but this feels like a hack to me.
Here is an example of what I've done, to give you an idea:
# netscaler is a module I wrote using suds to interface with NetScaler SOAP
# Source: http://bitbucket.org/jathanism/netscaler-api/src
import netscaler
import os
import sys
from twisted.internet import reactor, defer, threads
# netscaler.API is the class that sets up the suds.client.Client object
host = 'netscaler.local'
username = password = 'nsroot'
wsdl_url = 'file://' + os.path.join(os.getcwd(), 'NSUserAdmin.wsdl')
api = netscaler.API(host, username=username, password=password, wsdl_url=wsdl_url)
results = []
errors = []
def handleResult(result):
print '\tgot result: %s' % (result,)
results.append(result)
def handleError(err):
sys.stderr.write('\tgot failure: %s' % (err,))
errors.append(err)
# this converts the api.login() call to a Twisted thread.
# api.login() should return True and is is equivalent to:
# api.service.login(username=self.username, password=self.password)
deferred = threads.deferToThread(api.login)
deferred.addCallbacks(handleResult, handleError)
reactor.run()
This works as expected and defers return of the api.login() call until
it is complete, instead of blocking. But as I said, it doesn't feel
right.
Thanks in advance for any help, guidance, feedback, criticism,
insults, or total solutions.
Update: The only solution I've found is twisted-suds, which is a fork of Suds modified to work with Twisted.
The default interpretation of transport in the context of Twisted is probably an implementation of twisted.internet.interfaces.ITransport. At this layer, you're basically dealing with raw bytes being sent and received over a socket of some sort (UDP, TCP, and SSL being the most commonly used three). This isn't really what a SUDS/Twisted integration library is interested in. Instead, what you want is an HTTP client which SUDS can use to make the necessary requests and which presents all of the response data so that SUDS can determine what the result was. That is to say, SUDS doesn't really care about the raw bytes on the network. What it cares about is the HTTP requests and responses.
If you examine the implementation of twisted.web.soap.Proxy (the client part of the Twisted Web SOAP API), you'll see that it doesn't really do much. It's about 20 lines of code that glues SOAPpy to twisted.web.client.getPage. That is, it's hooking SOAPpy in to Twisted in just the way I described above.
Ideally, SUDS would provide some kind of API along the lines of SOAPpy.buildSOAP and SOAPpy.parseSOAPRPC (perhaps the APIs would be a bit more complicated, or accept a few more parameters - I'm not a SOAP expert, so I don't know if SOAPpy's particular APIs are missing something important - but the basic idea should be the same). Then you could write something like twisted.web.soap.Proxy based on SUDS instead. If twisted.web.client.getPage doesn't offer enough control over the requests or enough information about the responses, you could also use twisted.web.client.Agent instead, which is more recently introduced and offers much more control over the whole request/response process. But again, that's really the same idea as the current getPage-based code, just a more flexible/expressive implementation.
Having just looked at the API documentation for Client.options.transport, it sounds like a SUDS transport is basically an HTTP client. The problem with this kind of integration is that SUDS wants to send a request and then be able to immediately get the response. Since Twisted is largely based on callbacks, a Twisted-based HTTP client API can't immediately return a response to SUDS. It can only return a Deferred (or equivalent).
This is why things work better if the relationship is inverted. Instead of giving SUDS an HTTP client to play with, give SUDS and an HTTP client to a third piece of code and let it orchestrate the interactions.
It may not be impossible to have things work by creating a Twisted-based SUDS transport (aka HTTP client), though. The fact that Twisted primarily uses Deferred (aka callbacks) to expose events doesn't mean that this is the only way it can work. By using a third-party library such as greenlet, it's possible to provide a coroutine-based API, where a request for an asynchronous operation involves switching execution from one coroutine to another, and events are delivered by switching back to the original coroutine. There is a project called corotwine which can do just this. It may be possible to use this to provide SUDS with the kind of HTTP client API it wants; however, it's not guaranteed. It depends on SUDS not breaking when a context switch is suddenly inserted where previously there was none. This is a very subtle and fragile property of SUDS and can easily be changed (unintentionally, even) by the SUDS developers in a future release, so it's probably not the ideal solution, even if you can get it to work now (unless you can get cooperation from the SUDS maintainers in the form of a promise to test their code in this kind of configuration to ensure it continues to work).
As an aside, the reason Twisted Web's SOAP support is still based on SOAPpy and hasn't been modified for nearly two years is that no clear replacement for SOAPpy has ever shown up. There have been many contenders (What SOAP client libraries exist for Python, and where is the documentation for them? covers several of them). If things ever settle down, it may make sense to try to update Twisted's built-in SOAP support. Until then, I think it makes more sense to do these integration libraries separately, so they can be updated more easily and so Twisted itself doesn't end up with a big pile of different SOAP integration that no one wants (which would be worse than the current situation, where there's just one SOAP integration module that no one wants).
What library should I use for network programming? Is sockets the best, or is there a higher level interface, that is standard?
I need something that will be pretty cross platform (ie. Linux, Windows, Mac OS X), and it only needs to be able to connect to other Python programs using the same library.
You just want to send python data between nodes (possibly on separate computers)? You might want to look at SimpleXMLRPCServer. It's based on the inbuilt HTTP server, which is based on the inbuilt Socket server, neither of which are the most industrial-strength servers around, but it's easy to set up in a hurry:
from SimpleXMLRPCServer import SimpleXMLRPCServer
server = SimpleXMLRPCServer(("localhost", 9876))
def my_func(a,b):
return a + b
server.register_function(my_func)
server.serve_forever()
And easy to connect to:
import xmlrpclib
s = xmlrpclib.ServerProxy('http://localhost:9876')
print s.my_func(2,3)
>>> 5
print type(s.my_func(2,3))
>>> <type 'int'>
print s.my_func(2,3.0):
>>> 7.0
Twisted is popular for industrial applications, but it's got a brutal learning curve.
There is a framework that you may be interested in: Twisted
the answer depends on what you are trying to do.
"What library should I use for network programming?" is pretty vague.
for example, if you want to do HTTP, you might look at such standard libraries as urllib, urllib2, httplib, sockets. It all depends on which protocol you are looking to use, and which network layer you want to work at.
there are libraries in python for various network tasks... email, web, rpc, etc etc...
for starters, look over the standard library reference manual and see which tasks you want to do, then go from there: http://docs.python.org/library/index.html
As previously mentioned, Twisted is the most popular (by far). However, there are a lot of other alternative worth exploring. Tornado and Diesel are probably the top two contenders. A more complete comparison is found here.
Personally I just use asyncore from the standard library, which is a bit like a very cut-down version of Twisted, but this is because I prefer a simple and low level interface. If you want a higher level interface, especially just to communicate with another instance of your own program, you don't necessarily have to worry about the networking layer, and can consider something higher level like RPyC or pyro instead. The network then becomes an implementation detail and you can concentrate on just sending the information.
A lot of people like Twisted. I was a huge fan for awhile, but on working with it a bit and thinking about it more I've become disenchanted. It's complex, and last I looked, a lot of it had the assumption that your program would always be able to send data leading to possible situations in which your program grows memory usage endlessly buffering data to send that isn't being picked up by the remote side or isn't being picked up fast enough.
In my opinion, it depends a lot on what kind of network programming you want to do. A lot of times you don't really care about getting stuff done while you're waiting for IO. HTTP, for example, is very request-response oriented, and if you're only talking to a single server there is little reason to need something like Twisted and plain sockets or Python's built-in HTTP libraries will work fine.
If you're writing a server of any kind, you almost certainly need to be event-driven. Twisted has a slight edge there, but it still seems overly complex to me. Bittorrent, for example, was written in Python and doesn't use Twisted at all.
Another factor favoring Twisted is that there is code for a lot of protocols already written for it. So if you want to speak an existing protocol a lot of hard work may already have been done for you.
The socket module in the standard lib is in my opinion a good choice if you don't need high performance.
It is a very famous API that is known by almost every developpers of almost every languages. It's quite sipmple and there is a lot of information available on the internet. Moreover, it will be easier for other people to understand your code.
I guess that an event-driven framework like Twisted has better performance but in basic cases standard sockets are enough.
Of course, if you use a higher-level protocol (http, ftp...), you should use the corresponding implementation in the python standard library.
Socket is low level api, it is mapped directly to operating system interface.
Twisted, Tornado ... are high level framework (of course they are built on socket because socket is low level).
When it come to TCP/IP programming, you should have some basic knowledge to make a decision about what you shoud use:
Will you use well-known protocol like HTTP, FTP or create your own protocol?
Blocking or non-blocking? Twisted, Tornado are non-blocking framework (basically like nodejs).
Of course, socket can do everything because every other framework is base on its ;)
I'm looking for a good server/client protocol supported in Python for making data requests/file transfers between one server and many clients. Security is also an issue - so secure login would be a plus. I've been looking into XML-RPC, but it looks to be a pretty old (and possibly unused these days?) protocol.
If you are looking to do file transfers, XMLRPC is likely a bad choice. It will require that you encode all of your data as XML (and load it into memory).
"Data requests" and "file transfers" sounds a lot like plain old HTTP to me, but your statement of the problem doesn't make your requirements clear. What kind of information needs to be encoded in the request? Would a URL like "http://yourserver.example.com/service/request?color=yellow&flavor=banana" be good enough?
There are lots of HTTP clients and servers in Python, none of which are especially great, but all of which I'm sure will get the job done for basic file transfers. You can do security the "normal" web way, which is to use HTTPS and passwords, which will probably be sufficient.
If you want two-way communication then HTTP falls down, and a protocol like Twisted's perspective broker (PB) or asynchronous messaging protocol (AMP) might suit you better. These protocols are certainly well-supported by Twisted.
ProtocolBuffers was released by Google as a way of serializing data in a very compact efficient way. They have support for C++, Java and Python. I haven't used it yet, but looking at the source, there seem to be RPC clients and servers for each language.
I personally have used XML-RPC on several projects, and it always did exactly what I was hoping for. I was usually going between C++, Java and Python. I use libxmlrpc in Python often because it's easy to memorize and type interactively, but it is actually much slower than the alternative pyxmlrpc.
PyAMF is mostly for RPC with Flash clients, but it's a compact RPC format worth looking at too.
When you have Python on both ends, I don't believe anything beats Pyro (Python Remote Objects.) Pyro even has a "name server" that lets services announce their availability to a network. Clients use the name server to find the services it needs no matter where they're active at a particular moment. This gives you free redundancy, and the ability to move services from one machine to another without any downtime.
For security, I'd tunnel over SSH, or use TLS or SSL at the connection level. Of course, all these options are essentially the same, they just have various difficulties of setup.
Pyro (Python Remote Objects) is fairly clever if all your server/clients are going to be in Python. I use XMPP alot though since I'm communicating with hosts that are not always Python. XMPP lends itself to being extended fairly easily too.
There is an excellent XMPP library for python called PyXMPP which is reasonably up to date and has no dependancy on Twisted.
I suggest you look at 1. XMLRPC 2. JSONRPC 3. SOAP 4. REST/ATOM
XMLRPC is a valid choice. Don't worry it is too old. That is not a problem. It is so simple that little needed changing since original specification. The pro is that in every programming langauge I know there is a library for a client to be written in. Certainly for python. I made it work with mod_python and had no problem at all.
The big problem with it is its verbosity. For simple values there is a lot of XML overhead. You can gzip it of cause, but then you loose some debugging ability with the tools like Fiddler.
My personal preference is JSONRPC. It has all of the XMLRPC advantages and it is very compact. Further, Javascript clients can "eval" it so no parsing is necessary. Most of them are built for version 1.0 of the standard. I have seen diverse attempts to improve on it, called 1.1 1.2 and 2.0 but they are not built one on top of another and, to my knowledge, are not widely supported yet. 2.0 looks the best, but I would still stick with 1.0 for now (October 2008)
Third candidate would be REST/ATOM. REST is a principle, and ATOM is how you convey bulk of data when it needs to for POST, PUT requests and GET responses.
For a very nice implementation of it, look at GData, Google's API. Real real nice.
SOAP is old, and lots lots of libraries / langauges support it. IT is heeavy and complicated, but if your primary clients are .NET or Java, it might be worth the bother.
Visual Studio would import your WSDL file and create a wrapper and to C# programmer it would look like local assembly indeed.
The nice thing about all this, is that if you architect your solution right, existing libraries for Python would allow you support more then one with almost no overhead. XMLRPC and JSONRPC are especially good match.
Regarding authentication. XMLRPC and JSONRPC don't bother defining one. It is independent thing from the serialization. So you can implement Basic Authentication, Digest Authentication or your own with any of those. I have seen couple of examples of client side Digest Authentication for python, but am yet to see the server based one. If you use Apache, you might not need one, using mod_auth_digest Apache module instead. This depens on the nature of your application
Transport security. It is obvously SSL (HTTPS). I can't currently remember how XMLRPC deals with, but with JSONRPC implementation that I have it is trivial - you merely change http to https in your URLs to JSONRPC and it shall be going over SSL enabled transport.
HTTP seems to suit your requirements and is very well supported in Python.
Twisted is good for serious asynchronous network programming in Python, but it has a steep learning curve, so it might be worth using something simpler unless you know your system will need to handle a lot of concurrency.
To start, I would suggest using urllib for the client and a WSGI service behind Apache for the server. Apache can be set up to deal with HTTPS fairly simply.
SSH can be a good choice for file transfer and remote control, especially if you are concerned with secure login. Most Linux and Solaris servers will already run an SSH service for administration, so if your Python program use ssh then you don't need to open up any additional ports or services on remote machines.
OpenSSH is the standard and portable SSH client and server, and can be used via subprocesses from Python. If you want more flexibility Twisted includes Twisted Conch which is a SSH client and server implementation which provides flexible programmable control of an SSH stack, on both Linux and Windows. I use both in production.
I'd use http and start with understanding what the Python library offers.
Then I'd move onto the more industrial strength Twisted library.
There is no need to use HTTP (indeed, HTTP is not good for RPC in general in some respects), and no need to use a standards-based protocol if you're talking about a python client talking to a python server.
Use a Python-specific RPC library such as Pyro, or what Twisted provides (Twisted.spread).
XMLRPC is very simple to get started with, and at my previous job, we used it extensively for intra-node communication in a distributed system. As long as you keep track of the fact that the None value can't be easily transferred, it's dead easy to work with, and included in Python's standard library.
Run it over https and add a username/password parameter to all calls, and you'll have simple security in place. Not sure about how easy it is to verify server certificate in Python, though.
However, if you are transferring large amounts of data, the coding into XML might become a bottleneck, so using a REST-inspired architecture over https may be as good as xmlrpclib.
Facebook's thrift project may be a good answer. It uses a light-weight protocol to pass object around and allows you to use any language you wish. It may fall-down on security though as I believe there is none.
In the RPC field, Json-RPC will bring a big performance improvement over xml-rpc:
http://json-rpc.org/wiki/python-json-rpc