Is Twisted an httplib2/socket replacement? - python

Many python libraries, even recently written ones, use httplib2 or the socket interface to perform networking tasks.
Those are obviously easier to code on than Twisted due to their blocking nature, but I think this is a drawback when integrating them with other code, especially GUI one. If you want scalability, concurrency or GUI integration while avoiding multithreading, Twisted is then a natural choice.
So I would be interested in opinions in those matters:
Should new networking code (with the exception of small command line tools) be written with Twisted?
Would you mix Twisted, http2lib or socket code in the same project?
Is Twisted pythonic for most libraries (it is more complex than alternatives, introduce a dependency to a non-standard package...)?
Edit: please let me phrase this in another way. Do you feel writing new library code with Twisted may add a barrier to its adoption? Twisted has obvious benefits (especially portability and scalability as stated by gimel), but the fact that it is not a core python library may be considered by some as a drawback.

See asychronous-programming-in-python-twisted, you'll have to decide if depending on a non-standard (external) library fits your needs. Note the answer by #Glyph, he is the founder of the Twisted project, and can authoritatively answer any Twisted related question.
At the core of libraries like Twisted, the function in the main loop is not sleep, but an operating system call like select() or poll(), as exposed by a module like the Python select module. I say "like" select, because this is an API that varies a lot between platforms, and almost every GUI toolkit has its own version. Twisted currently provides an abstract interface to 14 different variations on this theme. The common thing that such an API provides is provide a way to say "Here are a list of events that I'm waiting for. Go to sleep until one of them happens, then wake up and tell me which one of them it was."

Should new networking code (with the exception of small command line tools) be written with Twisted?
Maybe. It really depends. Sometimes its just easy enough to wrap the blocking calls in their own thread. Twisted is good for large scale network code.
Would you mix Twisted, http2lib or socket code in the same project?
Sure. But just remember that Twisted is single threaded, and that any blocking call in Twisted will block the entire engine.
Is Twisted pythonic for most libraries (it is more complex than alternatives, introduce a dependency to a non-standard package...)?
There are many Twisted zealots that will say it belongs in the Python standard library. But many people can implement decent networking code with asyncore/asynchat.

Related

What is the core difference between asyncio and trio?

Today, I found a library named trio which says itself is an asynchronous API for humans. These words are a little similar with requests'. As requests is really a good library, I am wondering what is the advantages of trio.
There aren't many articles about it, I just find an article discussing curio and asyncio. To my surprise, trio says itself is even better than curio(next-generation curio).
After reading half of the article, I cannot find the core difference between these two asynchronous framework. It just gives some examples that curio's implementation is more convenient than asyncio's. But the underlying structure is almost the same.
So could someone give me a reason I have to accept that trio or curio is better than asyncio? Or explain more about why I should choose trio instead of built-in asyncio?
Where I'm coming from: I'm the primary author of trio. I'm also one of the top contributors to curio (and wrote the article about it that you link to), and a Python core dev who's been heavily involved in discussions about how to improve asyncio.
In trio (and curio), one of the core design principles is that you never program with callbacks; it feels more like thread-based programming than callback-based programming. I guess if you open up the hood and look at how they're implemented internally, then there are places where they use callbacks, or things that are sorta equivalent to callbacks if you squint. But that's like saying that Python and C are equivalent because the Python interpreter is implemented in C. You never use callbacks.
Anyway:
Trio vs asyncio
Asyncio is more mature
The first big difference is ecosystem maturity. At the time I'm writing this in March 2018, there are many more libraries with asyncio support than trio support. For example, right now there aren't any real HTTP servers with trio support. The Framework :: AsyncIO classifier on PyPI currently has 122 libraries in it, while the Framework :: Trio classifier only has 8. I'm hoping that this part of the answer will become out of date quickly – for example, here's Kenneth Reitz experimenting with adding trio support in the next version of requests – but right now, you should expect that if you're trio for anything complicated, then you'll run into missing pieces that you need to fill in yourself instead of grabbing a library from pypi, or that you'll need to use the trio-asyncio package that lets you use asyncio libraries in trio programs. (The trio chat channel is useful for finding out about what's available, and what other people are working on.)
Trio makes your code simpler
In terms of the actual libraries, they're also very different. The main argument for trio is that it makes writing concurrent code much, much simpler than using asyncio. Of course, when was the last time you heard someone say that their library makes things harder to use... let me give a concrete example. In this talk (slides), I use the example of implementing RFC 8305 "Happy eyeballs", which is a simple concurrent algorithm used to efficiently establish a network connection. This is something that Glyph has been thinking about for years, and his latest version for Twisted is ~600 lines long. (Asyncio would be about the same; Twisted and asyncio are very similar architecturally.) In the talk, I teach you everything you need to know to implement it in <40 lines using trio (and we fix a bug in his version while we're at it). So in this example, using trio literally makes our code an order of magnitude simpler.
You might also find these comments from users interesting: 1, 2, 3
There are many many differences in detail
Why does this happen? That's a much longer answer :-). I'm gradually working on writing up the different pieces in blog posts and talks, and I'll try to remember to update this answer with links as they become available. Basically, it comes down to Trio having a small set of carefully designed primitives that have a few fundamental differences from any other library I know of (though of course build on ideas from lots of places). Here are some random notes to give you some idea:
A very, very common problem in asyncio and related libraries is that you call some_function(), and it returns, so you think it's done – but actually it's still running in the background. This leads to all kinds of tricky bugs, because it makes it difficult to control the order in which things happen, or know when anything has actually finished, and it can directly hide problems because if a background task crashes with an unhandled exception, asyncio will generally just print something to the console and then keep going. In trio, the way we handle task spawning via "nurseries" means that none of these things happen: when a function returns then you know it's done, and Trio's currently the only concurrency library for Python where exceptions always propagate until you catch them.
Trio's way of managing timeouts and cancellations is novel, and I think better than previous state-of-the-art systems like C# and Golang. I actually did write a whole essay on this, so I won't go into all the details here. But asyncio's cancellation system – or really, systems, it has two of them with slightly different semantics – are based on an older set of ideas than even C# and Golang, and are difficult to use correctly. (For example, it's easy for code to accidentally "escape" a cancellation by spawning a background task; see previous paragraph.)
There's a ton of redundant stuff in asyncio, which can make it hard to tell which thing to use when. You have futures, tasks, and coroutines, which are all basically used for the same purpose but you need to know the differences between them. If you want to implement a network protocol, you have to pick whether to use the protocols/transports layer or the streams layer, and they both have tricky pitfalls (this is what the first part of the essay you linked is about).
Trio's currently the only concurrency library for Python where control-C just works the way you expect (i.e., it raises KeyboardInterrupt where-ever your code is). It's a small thing, but it makes a big difference :-). For various reasons, I don't think this is fixable in asyncio.
Summing up
If you need to ship something to production next week, then you should use asyncio (or Twisted or Tornado or gevent, which are even more mature). They have large ecosystems, other people have used them in production before you, and they're not going anywhere.
If trying to use those frameworks leaves you frustrated and confused, or if want to experiment with a different way of doing things, then definitely check out trio – we're friendly :-).
If you want to ship something to production a year from now... then I'm not sure what to tell you. Python concurrency is in flux. Trio has many advantages at the design level, but is that enough to overcome asyncio's head start? Will asyncio being in the standard library be an advantage, or a disadvantage? (Notice how these days everyone uses requests, even though the standard library has urllib.) How many of the new ideas in trio can be added to asyncio? No-one knows. I expect that there will be a lot of interesting discussions about this at PyCon this year :-).

Which way to go with twisted and web-programming?

So, I programmed this twisted application a few months ago, which I now would like to extend with a web-based user interface for configuration.
The Twisted website recommends Nevow, but I am not really sure if this is a good choice. Their website is down for a while it seems and their launchpad page hadn't seen any update in half a year. Is this project dead?
Additionally I have seen discussion of moving parts of Nevow into twisted.web on the twisted-web mailinglist. So, is it still recommended for new developments?
Another idea was using Django. I would need user authentication and permissions anyway in the config-interface, and I am quite familiar with it. (I have never worked with Nevow or twisted.web)
But it seems quite difficult to interface both worlds, all I could find were examples of running Django with WSGI in Twisted.
Are there any other possibilities to have a slick looking user interface on top of twisted?
First, let me address the perception that Nevow is dead. The launchpad project containing the code for Nevow (and the rest of the Divmod projects) is divmod.org on launchpad. A hardware failure has badly impacted the project's public presence, but it's still there, and other things (like the wiki and the tickets) are in the process of being recovered. There isn't a lot of active maintenance work going on right now, but that's mostly because it's good enough for most of its users; there are lots of people who depend on Nevow and would be very upset if it stopped working. Those people have the skills and experience necessary to continue maintaining it. So, while it's not being actively promoted right now, I think it's unlikely that it's going to go away.
My long-term hope for Nevow would be as follows. (I'd say "plan", but since I haven't been actively involved with its maintenance lately, this is really up to those who are.) First, I'd like to extract its templating facilities and move them into twisted.web. The clean, non-deprecated API for Nevow is mostly covered by nevow.page.Element and the various loaders. Twisted itself wants to generate HTML in a few places and these facilities could be useful. Then we should throw out the "appserver" and resource-model parts of Nevow. Those are mostly just a random collection of bugfixes or alterations for twisted.web, most of which were present in some form in twisted.web2 and will therefore either be rolled back into twisted.web anyway, or have already been applied there. Finally there's the question of Athena. While two-way communication is one of Twisted's strengths, Athena is itself a gigantic, sprawling JavaScript codebase and should probably remain its own project.
Third, on to the main question, given this information, what should you do now?
Generally speaking, I'd say, "use nevow". The project has some warts, it needs more documentation and its API needs to be trimmed to eliminate some old and broken stuff, but it's still quite useful and very much alive. To make up for the slightly sparse documentation, you can join the #divmod or #twisted.web channels on Freenode to get help with it. If you help out by contributing patches where you can, you will find that you'll get a lot of enthusiastic help there. When you ignore the deprecated parts Nevow has a pretty small, sane, twisted friendly API. The consequence of the plan for Nevow's evolution that I outlined above are actually pretty minimal. If it even happens at all, what it means for you is, in 1-5 years, when you upgrade to a new version of Twisted, you'll get a couple of deprecation warnings, change some import lines in your code from from nevow.page import ...; from nevow.loaders import ... to some hypothetical new thing like from twisted.web.page.element import ...; from twisted.web.page.templates import ..., or somesuch. Most of the API past that point should remain the same, and definitely the high-level concepts shouldn't change much.
The main advantage that you get from using Nevow is that it's async-friendly and can render pages in your main thread without blocking things. Plus, you can then get really easy COMET for free with Athena.
You can also use Django. This is not quite as async-friendly but it obviously does have a broader base of support. However, "not as async friendly" doesn't mean "hard to use". You can run it in twisted.web via WSGIResource, and simply use blockingCallFromThread in your Django application to invoke any Twisted API that returns a Deferred, which should be powerful enough to do just about anything you want. If you have a more specific question about how to instantiate Twisted web resources to combine Twisted Web and Django, you should probably ask it in its own Stack Overflow question.
Nevow is still a good choice if you want support for Deferreds in the templating system you use (it's not dead). It also has a few advantages over plain Twisted Web when it comes to complicated URL dispatch. However, it is basically just a templating system. Twisted Web is the real web server. So either way, you're going to use Twisted Web. In fact, even if you use Django in Twisted Web's WSGI container, you're still going to use Twisted Web. So learning things about Twisted Web isn't going to hurt you.
If you're going to be generating any amount of HTML, then you very much want to use an HTML templating library. By this point no one should be constructing HTML using primitive string operations. So if you want to use one of the other Python HTML templating libraries out there - Cheetah, Quixote, etc - instead of Nevow, that's great! You're just going to use the templating library to get a string to write out in response to an HTTP request. Twisted Web doesn't care where the string came from.
And if you do want to do something with Django (or another WSGI-based system), then you can certainly deploy this in your Twisted process using Twisted Web's WSGI support. And you can still interact between the WSGI applications and the rest of your Twisted code, as long as you exercise a little care - WSGI applications run in a thread pool, and Twisted APIs are not thread-safe, you have to invoke them with reactor.callFromThread or one of the small number of similar APIs (in particular, blockingCallFromThread is sometimes a useful higher-level tool to use).
At this point Nevow is definitively dead. As illustration of how dead it is, there is a bug that prevents installation of Nevow using pip, which was fixed on trunk in 2009, but it isn't in any release because there has been no release since then.
twisted.web and in particular twisted.web.template cover pretty much all of what was useful in Nevow, and should be used for any new project that was considering using Nevow.

Why is there a need for Twisted?

I have been playing around with the twisted framework for about a week now(more because of curiosity rather than having to use it) and its been a lot of fun doing event driven asynchronous network programming.
However, there is something that I fail to understand. The twisted documentation starts off with
Twisted is a framework designed to be very flexible and let you write powerful servers.
My doubt is :- Why do we need such an event-driven library to write powerful servers when there are already very efficient implementations of various servers out there?
Surely, there must have been more than a couple of concrete implementations which the twisted developers had in mind while writing this event-driven I\O library. What are those? Why exactly was twisted made?
In a comment on another answer, you say "Every library is supposed to have ...". "Supposed" by whom? Having use-cases is certainly a nice way to nail down your requirements, but it's not the only way. It also doesn't make sense to talk about the use-cases for all of Twisted at once. There is no use case that justifies every single API in Twisted. There are hundreds or thousands of different use cases, each which justifies a lesser or greater subdivision of Twisted. These came and went over the years of Twisted's development, and no attempt has been made to keep a list of them. I can say that I worked on part of Twisted Names so that I would have a topic for a paper I was presenting at the time. I implemented the vt102 parser in Twisted Conch because I am obsessed with terminals and wanted a fun project involving them. And I implemented the IMAP4 support in Twisted Mail because I worked at a company developing a mail server which required tighter control over the mail store than any other IMAP4 server at the time offered.
So, as you can see, different parts of Twisted were written for widely differing reasons (and I've only given examples of my own reasons, not the reasons of any other developers).
The initial reason for a program being written often doesn't matter much in the long run though. Now the code is written: Twisted Names now runs the DNS for many domain names on the internet, the vt102 parser helped me get a job, and the company that drove the IMAP4 development is out of business. What really matters is what useful things you can do with the code now. As MattH points out, the resulting plethora of functionality has resulted in a library that (perhaps uniquely) addresses a wide array of interesting problems.
Why do we need such an event-driven library to write powerful servers when there are already very efficient implementations of various servers out there?
So paraphrasing: you can't imagine why anyone would need a toolkit when dyecast products already exist?
I'm guessing you've never needed to knock up a protocol gateway, e.g.
- write a daemon to md5 local files on demand over a unix socket
- interrogate a piece of software using udp and expose statistics over http.
I wrote a little proof-of-concept for the second example for a question here on SO in a handful of minutes. I couldn't do that without twisted.
Have you looked at: ProjectsUsingTwisted?
More on 'why': (disclaimer: I'm not a developer of Twisted proper), it's necessary to consider Twisted's high age (relative to Python's). When Twisted was written there was no sufficiently powerful non-blocking network/event driven library written around the reactor pattern (almost everyone was using threads back then). Twisted's initial use case was a large multiplayer game, although the specifics of this game seems to be somewhat lost in time.
Since the origins, as #MattH's link suggest, a very large amount of various network servers written in Python is based on Twisted.
This PyCon talk by the creator of Twisted should give you answers.
It has changed my opinion of Twisted. Before I viewed it as a massive piece of software with interfaces and weird names, two things that many developers dislike but that are actually just superficial things, and now that I’ve seen the history behind and the amazing number of use cases I respect it a lot. Life is short, you need Twisted :)

networking application and GUI in python

I'm writing an application that sends files over network, I want to develop a custom protocol to not limit myself in term on feature richness (http wouldn't be appropriate, the nearest thing is the bittorrent protocol maybe).
I've tried with twisted, I've built a good app but there's a bug in twisted that makes my GUI blocking, so I've to switch to another framework/strategy.
What do you suggest? Using raw sockets and using gtk mainloop (there are select-like functions in the toolkit) is too much difficult?
It's viable running two mainloops in different threads?
Asking for suggestions
Disclaimer: I have little experience with network applications.
That being said, the raw sockets isn't terribly difficult to wrap your head around/use, especially if you're not too worried about optimization. That takes more thought, of course. But using GTK and raw sockets should be fairly straightforward. Especially since you've used the twisted framework, which IIRC, just abstracts some of the more nitty-gritty details of socket managing.
Two threads: one for the GUI, one for sending/receiving data. Tkinter would be a perfectly fine toolkit for this. You don't need twisted or any other external libraries or toolkits -- what comes out of the box is sufficient to get the job done.
If your application is somewhat similar to bittorrent, why not check the source code of Deluge http://deluge-torrent.org/ and build from it? It is written in Python, it does use the bittorrent protocol and it does have a GTK user interface.
As an alternative to twisted and whatever GUI library you seem to be using, how about trying PyQt? It provides a GUI and non-blocking sockets all in the same event loop. That way you don't have to worry about interoperability issues, which seem to be the issue you are facing.
Hope this helps!

Suggestion Needed - Networking in Python - A good idea?

I am considering programming the network related features of my application in Python instead of the C/C++ API. The intended use of networking is to pass text messages between two instances of my application, similar to a game passing player positions as often as possible over the network.
Although the python socket modules seems sufficient and mature, I want to check if there are limitations of the python module which can be a problem at a later stage of the development.
What do you think of the python socket module :
Is it reliable and fast enough for production quality software ?
Are there any known limitations which can be a problem if my app. needs more complex networking other than regular client-server messaging ?
Thanks in advance,
Paul
Check out Twisted, a Python engine for Networking. Has built-in support for TCP, UDP, SSL/TLS, multicast, Unix sockets, a large number of protocols (including HTTP, NNTP, IMAP, SSH, IRC, FTP, and others)
Python is a mature language that can do almost anything that you can do in C/C++ (even direct memory access if you really want to hurt yourself).
You'll find that you can write beautiful code in it in a very short time, that this code is readable from the start and that it will stay readable (you will still know what it does even after returning one year later).
The drawback of Python is that your code will be somewhat slow. "Somewhat" as in "might be too slow for certain cases". So the usual approach is to write as much as possible in Python because it will make your app maintainable. Eventually, you might run into speed issues. That would be the time to consider to rewrite a part of your app in C.
The main advantages of this approach are:
You already have a running application. Translating the code from Python to C is much more simple than write it from scratch.
You already have a running application. After the translation of a small part of Python to C, you just have to test that small part and you can use the rest of the app (that didn't change) to do it.
You don't pay a price upfront. If Python is fast enough for you, you'll never have to do the optional optimization.
Python is much, much more powerful than C. Every line of Python can do the same as 100 or even 1000 lines of C.
To answer #1, I know that among other things, EVE Online (the MMO) uses a variant of Python for their server code.
The python that EVE online uses is StacklessPython (http://www.stackless.com/), and as far as i understand they use it for how it implements threading through using tasklets and whatnot. But since python itself can handle stuff like MMO with 40k people online i think it can do anything.
This bad answer and not really an answer to your question, rather addition to previous answer.
Alan.

Categories

Resources