Gevent's libev, and Twisted's reactor

I'm trying to figure out how Gevent works with respect to other asynchronous frameworks in python, like Twisted.
The key difference between Gevent and Twisted is that Gevent uses greenlets and monkey-patches the standard library to get implicit behavior and a synchronous programming model, whereas Twisted requires specific libraries and callbacks, which makes the behavior explicit. The event loop in Gevent is libev/libevent, which is written in C, and the event loop in Twisted is the reactor, which is written in Python.
Is there anything special about libev/libevent that allows for this implicit behavior? Why not use an event loop written in Python? Conversely, why isn't Twisted using libev/libevent? Is there any particular reason? Maybe it was simply a design choice and could have gone either way...
Theoretically, can Gevent's libev be replaced with another event loop, written in python, like Twisted's reactor? And can Twisted's reactor be replaced with libev?

Short answer: Twisted is a network framework. Gevent tries to act as a library that doesn't require the programmer to change the way they program. That's their focus, and not so much how it is achieved under the hood.
Long answer:
All async I/O libraries (gevent, asyncio, etc.) work pretty much the same way:
Have a main loop running endlessly on a single thread.
When an event occurs, it's captured by the main loop.
The main loop decides, based on its scheduling rules, whether it should continue checking for events or switch temporarily and hand control to any functions subscribed to the event.
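The three steps above can be sketched with nothing but the standard library. This is only a toy illustration, not how gevent, asyncio or Twisted actually implement their loops; run_loop, demo and on_readable are names made up for this example.

```python
import selectors
import socket


def run_loop(sel, max_events):
    """Toy main loop: wait for ready sockets, then hand control to their callbacks."""
    handled = 0
    while handled < max_events:
        # Block until at least one registered socket is readable.
        for key, _mask in sel.select():
            callback = key.data      # the "subscriber" stored at registration time
            callback(key.fileobj)    # control passes to the subscriber...
            handled += 1             # ...and comes back here when it returns
    return handled


def demo():
    sel = selectors.DefaultSelector()
    reader, writer = socket.socketpair()
    received = []

    def on_readable(sock):
        received.append(sock.recv(1024))

    # Subscribe on_readable to "reader has data".
    sel.register(reader, selectors.EVENT_READ, data=on_readable)
    writer.sendall(b"ping")          # this is the event
    run_loop(sel, max_events=1)

    sel.unregister(reader)
    sel.close()
    reader.close()
    writer.close()
    return received
```

Calling demo() registers one subscriber, fires one event and lets the loop dispatch it. Real loops differ mainly in what they poll with (epoll, kqueue, libev) and in how they decide what runs next.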
greenlet is a different library. It's very simple in that it just changes the order in which Python code runs and lets you jump back and forth between functions. Gevent uses it under the hood to implement its async features.
asyncio, which ships with Python 3, is like gevent. The big difference is again the interface: it requires the programmer to mark functions with async, and lets them explicitly wait on a subscribed function in the main loop with await.
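A minimal sketch of that explicit style, using asyncio.sleep as a stand-in for real I/O (the names fetch and main are invented for the example):

```python
import asyncio


async def fetch(name, delay):
    # `await` marks the exact point where control goes back to the event loop.
    await asyncio.sleep(delay)
    return name


async def main():
    # Both coroutines run on the same loop, in the same thread.
    # gather returns results in argument order, not completion order.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Every place the code may be suspended is visible as an await in the source, which is the main interface difference from gevent's patched, synchronous-looking calls.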
Gevent is like asyncio. But instead of the keywords it patches existing code where appropriate. It uses greenlet under the hood to switch between main loop and subscribed functions and make it all work seamlessly.
Twisted, as mentioned, feels more like a framework than a library. It requires the programmer to follow very specific ways to achieve concurrency. Again, though, it has a main loop under the hood, called the reactor, like everything else.
Back to your initial question: You can in theory replace the reactor with any loop (including gevent). But that would defeat the purpose. Probably Twisted's team decided to use their own version of a main loop for optimisation reasons. All these libraries use different scheduling in their main loops to meet their needs.

Related

Using Python asyncio interfaces with Cython libraries

I've got an external library in C++ that has been wrapped by Cython. This C++ library itself I cannot change. I would like to use the library as part of a Python application that uses asyncio as its primary process control.
The Cython library essentially does network work with a proprietary protocol. The Cython library however is blocking when the event handler for the library is started in python. I've gotten it to a stage where I can pass a Python function and receive callbacks for events received from the C++ library. I can resolve the library hanging the application at the library event handler if I run the event handler within event_loop.run_in_executor.
My question is, how can I best model this to work with asyncio in a way that fits well with its interfaces, rather than hack up ad hoc solutions to use the Cython library methods? I had a look into writing this as an asyncio.Protocol and asyncio.Transport that then uses the Cython library as its underlying communication mechanism. However, it looks like it's a lot of effort with some monkey patching to make it look like a socket. Is there a better way or abstraction to put a wrapper on external libraries to make them work with asyncio?

To answer my own question, as far as I can see there is no obligation to use the abstractions provided by Protocol or Transport in asyncio for structuring applications. The best modeling for this I found is to use a regular class with its methods defined as async. The class can then be made to look like whatever pattern fits your requirement. This is especially relevant if the code you are wrapping doesn't have the same overall use case as a socket. The asyncio provided abstractions themselves are pretty barebones.
For things that are complicated like Cython wrapped C++ blocking code, you will need to deal with it with multiprocessing. This is to avoid hanging the interpreter. Asyncio does not make it possible to run blocking code without changes. The code must be specifically written to be asyncio compatible.
What I did was put the entire blocking code, including the construction of the object, into a function that was executed with event_loop.run_in_executor. In addition to this I used a unix socket to communicate with the process for commands and callback data. Because these are unix sockets you can use asyncio methods in your main application; the same goes for pipes.
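A rough sketch of the "regular class with async methods" idea combined with run_in_executor. It is not the code I actually ran: blocking_receive is a hypothetical stand-in for the wrapped library call, LibraryClient is an invented name, and a thread pool is used to keep the example short. A thread only helps if the blocking call releases the GIL; if it doesn't, a separate process is needed, as described above.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_receive(n):
    """Hypothetical stand-in for a blocking call into the wrapped library."""
    time.sleep(0.05)
    return n * 2


class LibraryClient:
    """Plain class whose public methods are coroutines; no Protocol or Transport."""

    def __init__(self):
        self._executor = ThreadPoolExecutor(max_workers=1)

    async def receive(self, n):
        loop = asyncio.get_running_loop()
        # The blocking call runs in the executor, so the event loop stays free.
        return await loop.run_in_executor(self._executor, blocking_receive, n)

    def close(self):
        self._executor.shutdown(wait=True)


async def main():
    client = LibraryClient()
    try:
        ticks = 0
        task = asyncio.create_task(client.receive(21))
        while not task.done():
            await asyncio.sleep(0.01)  # the loop keeps servicing other work
            ticks += 1
        return task.result(), ticks
    finally:
        client.close()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The ticks counter is only there to show that the loop kept running while the blocking call was in flight.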
Here are some results I got from sending 128 bytes from the multiprocessing Process producer to the asyncio main process. The data was generated at a 10-millisecond interval. The duration was timed using time.perf_counter(). Results below are in nanoseconds. The machine itself was an Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz running Linux kernel 4.10.17.
Asyncio with uvloop
count     10001.000000
mean      76435.956504
std        8887.459462
min       63608.000000
25%       71709.000000
50%       74104.000000
75%       79496.000000
max      287204.000000
Standard Asyncio event loop
count     10001.000000
mean     199741.937506
std       27900.377114
min      173321.000000
25%      185545.000000
50%      191839.000000
75%      205279.000000
max      529246.000000

Twisted threads for multiple clients on server

I have implemented a server program using Twisted. I am using twisted.protocols.basic.LineReceiver along with twisted.internet.protocol.ServerFactory.
I would like to have each client that connects to the server run a set of functions in parallel (I'm thinking of multi-threading for this).
I have some confusion with using twisted.internet.threads.deferToThread for this problem.
Should I call deferToThread in the ServerFactory for this purpose?
Are twisted threads, thread-safe with respect to race conditions?
Previously, I tried using multiprocessing in my server program but it seemed not to work in combination with the Twisted reactor, while deferToThread did the job.
I'm wondering how are Twisted threads implemented? Don't they utilize multiprocessing?

Previously, I tried using multiprocessing in my server program but it seemed not to work in combination with the Twisted reactor, while deferToThread did the job. I'm wondering how are Twisted threads implemented? Don't they utilize multiprocessing?
You didn't say whether you used the multi-threaded version of multiprocessing or the multi-process version of multiprocessing.
You can read about mixing Twisted and multiprocessing on Stack Overflow, though:
Mix Python Twisted with multiprocessing?
Twisted network client with multiprocessing workers?
is twisted incompatible with multiprocessing events and queues?
(And more)
To answer the shorter part of this question - no, Twisted does not use the stdlib multiprocessing package to implement its threading APIs. It uses the stdlib threading module.
Are twisted threads, thread-safe with respect to race conditions?
The answer to this is implied by the above answer: no. "Twisted threads" aren't really a thing. Twisted's threading APIs are just a layer on top of the stdlib threading module (which is really just a Python API for POSIX threads, or something kind of similar but different on Windows). Twisted's threading APIs don't magically eliminate the possibility of race conditions (if there is any magic in Twisted, it is the ability to do certain things concurrently without using threads at all - which helps reduce the number of race conditions in your program, though it doesn't entirely eliminate the possibility of creating them).
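Since code handed to deferToThread runs on ordinary stdlib threads, shared state needs the same protection as in any threaded program. A minimal sketch using only the threading module (no Twisted involved; Counter and run are names invented for the example):

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # `self.value += 1` is a read-modify-write, so it is guarded by a lock.
        with self._lock:
            self.value += 1


def run(n_threads=4, n_increments=10_000):
    counter = Counter()

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, run() always returns n_threads * n_increments. Without it, the total is not guaranteed.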
Should I call deferToThread in the ServerFactory for this purpose?
I'm not quite sure what the point of this question is. Are you wondering if a method on your ServerFactory subclass is the best place to put your calls to deferToThread? That probably depends on the details of your implementation approach. It probably doesn't make a huge difference overall, though. If you like the pattern of having the factory provide services to protocol instances - go for it.

Twisted and libtorrent - do I need to worry about blocking?

I am looking into building a multi-protocol application using twisted. One of those protocols is bittorrent. Since libtorrent is a fairly complete implementation, its python bindings seem to be a good choice.
Now the question is:
When using libtorrent with twisted, do I need to worry about blocking?
Does the libtorrent networking layer (using boost.asio, an async networking loop) interfere with twisted's epoll in any way?
Should I perhaps run the libtorrent session in a thread or target a multi-process application design?

I may be able to provide answers to some of those questions.
All of libtorrent's logic, including networking and disk I/O, is done in separate threads. So, overall, the concern of "blocking" is not that great, assuming you mean libtorrent functions not returning immediately.
Some operations are guaranteed to return immediately: functions that don't return any state or information. However, functions that do return something must synchronize with the libtorrent main thread, and if it is under heavy load (especially when built in debug mode with invariant checks and no optimization) this synchronization may be noticeable, especially when making many such calls, and often.
There are ways to use libtorrent that are more asynchronous in nature, and there is an ongoing effort in minimizing the need for using functions that synchronize. For example, instead of querying the status of all torrents individually, one can subscribe to torrent status updates. Asynchronous notifications are returned via pop_alerts().
Whether it would interfere with twisted's epoll, I can't say for sure, but it doesn't seem very likely.
I don't think there's much need to interact with libtorrent via another layer of threads, since all of the work is already done in separate threads.

Python asyncore & dbus

Is it possible to integrate asyncore with dbus through the same main loop?
Usually, DBus integration is done through the glib main loop: is it possible to have either asyncore integrate with this main loop or have dbus use asyncore's?

asyncore sucks. glib already provides async stuff, so just use glib's mainloop to do everything.

I wrote a trivial GSource wrapper for one of my own projects called AsyncoreGSource
Just attach it to an appropriate MainContext:
source = AsyncoreGSource([socket_map])
source.attach([main_context])
Naturally the defaults are asyncore.socket_map and the default MainContext respectively.
You can also try monkey-patching asyncore.socket_map, which would have been my solution had I not poked through the GLib python bindings source code for GSource.

Although you got what is probably a perfectly reasonable answer, there is another approach - you don't need to use asyncore's loop per se. Just call asyncore.loop with a zero timeout and a count of 1, which stops it iterating (and thus makes the function name completely misleading) and polls the sockets just once. Call this as often as you need.
I don't know anything about glib's async support but if it requires threads you might still get better performance by using asyncore in this way since it will use select or poll and won't need to spawn additional threads.

Threading in a PyQt application: Use Qt threads or Python threads?

I'm writing a GUI application that regularly retrieves data through a web connection. Since this retrieval takes a while, this causes the UI to be unresponsive during the retrieval process (it cannot be split into smaller parts). This is why I'd like to outsource the web connection to a separate worker thread.
[Yes, I know, now I have two problems.]
Anyway, the application uses PyQt4, so I'd like to know what the better choice is: Use Qt's threads or use the Python threading module? What are advantages / disadvantages of each? Or do you have a totally different suggestion?
Edit (re bounty): While the solution in my particular case will probably be using a non-blocking network request like Jeff Ober and Lukáš Lalinský suggested (so basically leaving the concurrency problems to the networking implementation), I'd still like a more in-depth answer to the general question:
What are advantages and disadvantages of using PyQt4's (i.e. Qt's) threads over native Python threads (from the threading module)?
Edit 2: Thanks all for your answers. Although there's no 100% agreement, there seems to be widespread consensus that the answer is "use Qt", since the advantage of that is integration with the rest of the library, while causing no real disadvantages.
For anyone looking to choose between the two threading implementations, I highly recommend they read all the answers provided here, including the PyQt mailing list thread that abbot links to.
There were several answers I considered for the bounty; in the end I chose abbot's for the very relevant external reference; it was, however, a close call.
Thanks again.

This was discussed not too long ago on the PyQt mailing list. Quoting Giovanni Bajo's comments on the subject:
It's mostly the same. The main difference is that QThreads are better
integrated with Qt (asynchronous signals/slots, event loop, etc.).
Also, you can't use Qt from a Python thread (you can't for instance
post event to the main thread through QApplication.postEvent): you
need a QThread for that to work.
A general rule of thumb might be to use QThreads if you're going to interact somehow with Qt, and use Python threads otherwise.
And an earlier comment on this subject from PyQt's author: "they are both wrappers around the same native thread implementations". And both implementations use the GIL in the same way.

Python's threads will be simpler and safer, and since it is for an I/O-based application, they are able to bypass the GIL. That said, have you considered non-blocking I/O using Twisted or non-blocking sockets/select?
EDIT: more on threads
Python threads
Python's threads are system threads. However, Python uses a global interpreter lock (GIL) to ensure that the interpreter is only ever executing a certain size block of byte-code instructions at a time. Luckily, Python releases the GIL during input/output operations, making threads useful for simulating non-blocking I/O.
Important caveat: This can be misleading, since the number of byte-code instructions does not correspond to the number of lines in a program. Even a single assignment may not be atomic in Python, so a mutex lock is necessary for any block of code that must be executed atomically, even with the GIL.
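To illustrate the point about the GIL being released during blocking operations, here is a small sketch that uses time.sleep as a stand-in for a blocking network read (fake_io and run_concurrently are invented names). Four waits overlap instead of adding up.

```python
import threading
import time


def fake_io(delay, results, index):
    # time.sleep releases the GIL while blocked, as a blocking socket read does.
    time.sleep(delay)
    results[index] = index  # each thread writes to its own slot


def run_concurrently(n=4, delay=0.2):
    results = [None] * n
    start = time.perf_counter()
    threads = [
        threading.Thread(target=fake_io, args=(delay, results, i))
        for i in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return results, elapsed
```

On an otherwise idle machine the elapsed time should be close to one delay rather than n delays. CPU-bound Python code would not overlap this way.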
QT threads
When Python hands off control to a 3rd party compiled module, it releases the GIL. It becomes the responsibility of the module to ensure atomicity where required. When control is passed back, Python will use the GIL. This can make using 3rd party libraries in conjunction with threads confusing. It is even more difficult to use an external threading library because it adds uncertainty as to where and when control is in the hands of the module vs the interpreter.
QT threads operate with the GIL released. QT threads are able to execute QT library code (and other compiled module code that does not acquire the GIL) concurrently. However, the Python code executed within the context of a QT thread still acquires the GIL, and now you have to manage two sets of logic for locking your code.
In the end, both QT threads and Python threads are wrappers around system threads. Python threads are marginally safer to use, since those parts that are not written in Python (implicitly using the GIL) use the GIL in any case (although the caveat above still applies.)
Non-blocking I/O
Threads add extraordinary complexity to your application, especially when dealing with the already complex interaction between the Python interpreter and compiled module code. While many find event-based programming difficult to follow, event-based, non-blocking I/O is often much less difficult to reason about than threads.
With asynchronous I/O, you can always be sure that, for each open descriptor, the path of execution is consistent and orderly. There are, obviously, issues that must be addressed, such as what to do when code depending on one open channel further depends on the results of code to be called when another open channel returns data.
One nice solution for event-based, non-blocking I/O is the new Diesel library. It is restricted to Linux at the moment, but it is extraordinarily fast and quite elegant.
It is also worth your time to learn pyevent, a wrapper around the wonderful libevent library, which provides a basic framework for event-based programming using the fastest available method for your system (determined at compile time).

The advantage of QThread is that it's integrated with the rest of the Qt library. That is, thread-aware methods in Qt will need to know in which thread they run, and to move objects between threads, you will need to use QThread. Another useful feature is running your own event loop in a thread.
If you are accessing a HTTP server, you should consider QNetworkAccessManager.

I asked myself the same question when I was working on PyTalk.
If you are using Qt, you need to use QThread to be able to use the Qt framework, and especially the signal/slot system.
With the signal/slot engine, you will be able to talk from one thread to another and to every part of your project.
Moreover, performance is not much of a question in this choice, since both are C++ bindings.
That is my experience with PyQt and threads.
I encourage you to use QThread.

Jeff has some good points. Only one main thread can do any GUI updates. If you do need to update the GUI from within the thread, Qt-4's queued connection signals make it easy to send data across threads and will automatically be invoked if you're using QThread; I'm not sure if they will be if you're using Python threads, although it's easy to add a parameter to connect().

I can't really recommend either, but I can try describing differences between CPython and Qt threads.
First of all, CPython threads do not run concurrently, at least not Python code. Yes, they do create system threads for each Python thread, however only the thread currently holding the Global Interpreter Lock is allowed to run (C extensions and FFI code might bypass it, but Python bytecode is not executed while a thread doesn't hold the GIL).
On the other hand, we have Qt threads, which are basically a common layer over system threads, don't have a Global Interpreter Lock, and thus are capable of running concurrently. I'm not sure how PyQt deals with it, however unless your Qt threads call Python code, they should be able to run concurrently (bar various extra locks that might be implemented in various structures).
For extra fine-tuning, you can modify the number of bytecode instructions that are interpreted before switching ownership of the GIL - lower values mean more context switching (and possibly higher responsiveness) but lower performance per individual thread (context switches have their cost - if you try switching every few instructions it doesn't help speed.)
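The instruction-count knob described above is Python 2's sys.setcheckinterval. In Python 3 (since 3.2) the equivalent setting is time-based: sys.setswitchinterval takes a duration in seconds. A small sketch, where with_switch_interval is a helper name made up for this example:

```python
import sys


def with_switch_interval(seconds, func):
    """Run func() with a temporary GIL switch interval, then restore the old one."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        sys.setswitchinterval(previous)


if __name__ == "__main__":
    print("default interval:", sys.getswitchinterval())
    print("inside helper:", with_switch_interval(0.001, sys.getswitchinterval))
```

The trade-off is the same as described above: a shorter interval means more frequent switching between threads, at some cost in throughput.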
Hope it helps with your problems :)

I can't comment on the exact differences between Python and PyQt threads, but I've been doing what you're attempting to do using QThread, QNetworkAccessManager and making sure to call QApplication.processEvents() while the thread is alive. If GUI responsiveness is really the issue you're trying to solve, the latter will help.
