A lot of the functions in asyncio have deprecated loop parameters, scheduled to be removed in Python 3.10. Examples include as_completed(), sleep(), and wait().
I'm looking for some historical context on these parameters and their removal.
What problems did loop solve? Why would one have used it in the first place?
What was wrong with loop? Why is it being removed en masse?
What replaces loop, now that it's gone?
What problems did loop solve? Why would one have used it in the first place?
Prior to Python 3.6, asyncio.get_event_loop() was not guaranteed to return the event loop currently running when called from an asyncio coroutine or callback. It would return whatever event loop was previously set using set_event_loop(some_loop), or the one automatically created by asyncio. But sync code could easily create a different loop with another_loop = asyncio.new_event_loop() and spin it up using another_loop.run_until_complete(some_coroutine()). In this scenario, get_event_loop() called inside some_coroutine and the coroutines it awaits would return some_loop rather than another_loop. This kind of thing wouldn't occur when using asyncio casually, but it had to be accounted for by async libraries which couldn't assume that they were running under the default event loop. (For example, in tests or in some usages involving threads, one might want to spin up an event loop without disturbing the global setting with set_event_loop.) The libraries would offer the explicit loop argument where you'd pass another_loop in the above case, and which you'd use whenever the running loop differed from the loop set up with asyncio.set_event_loop().
This issue would be fixed in Python 3.6 and 3.5.3, where get_event_loop() was modified to reliably return the running loop if called from inside one, returning another_loop in the above scenario. Python 3.7 would additionally introduced get_running_loop() which completely ignores the global setting and always returns the currently running loop, raising an exception if not inside one. See this thread for the original discussion.
Once get_event_loop() became reliable, another problem was that of performance. Since the event loop was needed for some very frequently used calls, most notably call_soon, it was simply more efficient to pass around and cache the loop object. Asyncio itself did that, and many libraries followed suit. Eventually get_event_loop() was accelerated in C and was no longer a bottleneck.
These two changes made the loop arguments redundant.
What was wrong with loop? Why is it being removed en masse?
As any other redundancy, it complicates the API and opens up possibilities for errors. Async code should almost never just randomly communicate with a different loop, and now that get_event_loop() is both correct and fast, there is no reason not to use it.
Also, passing the loop through all the layers of abstraction of a typical application is simply tedious. With async/await becoming mainstream in other languages, it has become clear that manually propagating a global object is not ergonomic and should not be required from programmers.
What replaces loop, now that it's gone?
Just use get_event_loop() to get the loop when you need it. Alternatively, you can use get_running_loop() to assert that a loop is running.
The need for accessing the event loop is somewhat reduced in Python 3.7, as some functions that were previously only available as methods on the loop, such as create_task, are now available as stand-alone functions.
The loop parameter was the way to pass the global event loop around. New implementations of the same functions no longer require you to pass the global event loop, they instead just request it where it's needed.
As the documentation suggests https://docs.python.org/3/library/asyncio-eventloop.html: "Application developers should typically use the high-level asyncio functions, such as asyncio.run(), and should rarely need to reference the loop object or call its methods."
Removing the need for you to pass it around to library functions aligns with that principle. The loop is not replaced, but its disappearance simply means you no longer have to deal with it 'manually'.
Related
I'm working on a personal project that has been refactored a number of times. It started off using multithreading, then parts of it used asyncio, and now it is back to being mainly single threaded.
As a result of all these changes I have a number of threading.Lock()'s in the code that I would like to remove and cleanup to prevent future issues.
How can I easily work out which locks are in use and hit by more than one thread during the runtime of the application?
If I am in the situation to find that out, I would try to replace the lock with a wrapper that do the counting (or print something, raise an exception, etc.) for me when the undesired behavior happened. Python is hacky, so I can simply create a function and overwrite the original threading.Lock to get the job done. That might need some careful implementation, e.g., catch both all possible pathway to lock and unlock.
However, you have to be careful that even so, you might not exercise all possible code path and thus never know if you really remove all "bugs".
My understanding of Gevent is that it's merely concurrency and not parallelism. My understanding of concurrency mechanisms like Gevent and AsyncIO is that, nothing in the Python application is ever executing at the same time.
The closest you get is, calling a non-blocking IO method, and while waiting for that call to return other methods within the Python application are able to be executed. Again, none of the methods within the Python application ever actually execute Python code at the same time.
With that said, why is there a need for gevent.queue? It sounds to me like the Python application doesn't really need to worry about more than one Python method accessing a queue instance at a time.
I'm sure there's a scenario that I'm not seeing that gevent.queue fixes, I'm just curious what that is.
Although you are right that no two statements execute at the same time within a single Python process, you might want to ensure that a series of statements execute atomically, or you might want to impose an order on the execution of certain statements, and in that case things like gevent.queue become useful. A tutorial is here.
From documentation:
An NDB tasklet is a piece of code that might run concurrently with
other code. If you write a tasklet, your application can use it much
like it uses an async NDB function: it calls the tasklet, which
returns a Future; later, calling the Future's get_result() method gets
the result.
The explanation and examples in the document really likes a magic for me.
I can use it, but feel hard to understand it properly.
For example:
May I put any kind of code inside a function and decorate it as ndb.tasklet? Then used it as async function later. Or it must be appengine RPC?
Does this kind of decorator also works on my PC?
Is it the same as tasklet for pypy
If you look at the implementation of a Future, its very comparable to what a generator is in python. In fact, it uses the same yield keyword to achieve what it says it does. Read the intro comments on the tasklets.py for some clarification.
When you use the #tasklet decorator, it creates a Future and waits for a value on the wrapped function. If the value is a generator, it adds the Future to the event loop. When you yield on a Future, the event loop runs through ALL queued Futures until the Future you want is ready. The concurrency here is that each Future will execute its code until it either returns (using raise ndb.Return(...) or the function completes), an exception is thrown, or yield is used again. I guess technically, you can use yield in the code just to stop executing that function in favor of letting the event loop continue running other Futures, but I would assume this wouldn't help much unless you really have a clever use-case in mind.
To answer your questions:
Technically yes, but it will not run asynchronously. When you decorate a non-yielding function with #tasklet, its Future's value is computed and set when you call that function. That is, it runs through the entire function when you call it. If you want to achieve asynchronous operation, you must yield on something that does asynchronous work. Generally in GAE it will work its way down to an RPC call.
If by work on your PC you mean does the dev appserver implement tasklets/Futures like GAE, then yes, although this is more accurate with the devappserver2 (now the default in the newer SDK). I'm actually not 100% sure if local RPC calls will run in parallel when using Futures, but there is an eventloop going through Futures whether its local or in production. If you want to use Future's in your other, non-GAE code, then I think you would be better off using Python 3.2's built-in future (or find a backport here)
Kind of, its not really a simple comparison. Look at the documentation here. The idea is somewhat the same (the scheduler can be compared to the eventloop), but the low-level implementation greatly differs.
This is quite an essential part of my program and I need to have sorted out as soon as possible so anything would be a massive help.
My program consists of three modules which are imported to each other. One module consists of my user interface for which I am using tkinter. The user inputs data on a canvas which is sent to a second program to be processed and is then sent to the third module which contains the algorithm which I intend to step through with the user.
The "first" and "third" modules can interact with each other and during certain points in explaining the algorithm I change the appearance of the canvas and some text on the interface. The third module should then pause (for which I'm currently using a basic sleep method), and wait (ideally it will wait for the user to press the "Next Step" button on the user interface). It is during this step that my interface decides that it wants to freeze.
Is there any way I can stop this?
Many thanks in advance.
Edit: I've found a way to fix this. Thank you for all the suggestions!
Calling time.sleep() will stop your program doing anything until it finishes sleeping. You need Tkinter to keep processing events until it should run the next part of your code.
To do that, put the next part of your code in a separate function, and get Tkinter to call it when it's ready. Typically, you want this to happen when the user triggers it (e.g. by clicking a button), so you need to bind it to an event (docs). If you actually want it to happen after a fixed time, you can use the .after() method on any tkinter widget (docs).
GUI programming takes a bit of getting used to. You don't code as a series of things happening one after the other, you write separate bits of code which are triggered by what the user does.
Terminology note: if your Python files import each other, you have three modules, but it's still all one program. Talking about the "first program" will confuse people.
H.E.P -
The traditional way to do this does indeed involve using a separate thread and co-ordinating the work between the "worker" thread and the GUI thread using some sort of polling or eventing mechanism.
But, as Thomas K. points out, that can get very complex and tricky, especially regarding Python's use of the Global Interpreter Lock (GIL) etc. and having to also contend with Tkinter's processing loop.
(The only good reason to use a multi-threaded GUI is if you absolutely MUST ensure that the GUI remains responsive during a potentially long-running background task, which I don't believe is the issue in this case.)
What I would suggest instead is a generator-based "co-routine"-type architecture.
As noted in "The Python (2.7) Language Reference", Section 6.8, [the "yield" statement is used when defining a generator function and is only used in the body of the generator function. Using a yield statement in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.]
(This effectively forms the basis of a co-routine architecture. (ed.))
[When a generator function is called, it returns an iterator known as a generator iterator, or more commonly, a generator. The body of the generator function is executed by calling the generator’s next() method repeatedly until it raises an exception.
When a yield statement is executed, the state of the generator is frozen and the value of expression_list is returned to next()‘s caller. By “frozen” we mean that all local state is retained, including the current bindings of local variables, the instruction pointer, and the internal evaluation stack: enough information is saved so that the next time next() is invoked, the function can proceed exactly as if the yield statement were just another external call.]
(Also see "PEP 0342 - Coroutines via Enhanced Generators " for additional background and general info.)
This should allow your GUI to call the next part of your algorithm specification generator, on demand, without it having to be put to sleep until the operator presses the "Next" button.
You would basically just be creating a little 'domain-specific language', (DSL), consisting of just the list of steps for your presentation of this particular algorithm, and the generator (iterator) would simply execute each next step when called (on demand).
Much simpler and easier to maintain.
A GUI program is always waiting for some action to occur. When actions do occur, the event code corresponding to that action is executed. Therefore, there is no need to call sleep(). All you need to do is set it up so that the third program is executed from the appropriate event.
I am working on a text-based game in Python 3.1 that would use timing as it's major source of game play. In order to do this effectively (rather than check the time every mainloop, my current method, which can be inaccurate, and slow if multiple people are playing the game at once) I was thinking about using the Threading.Timer class. Is it a bad thing to have multiple timers going at the same time? if so, how many timers is recommended?
For example, the user inputs to start the game. every second after the game starts it decides whether or not something happens, so there's a Timer(1) for every user playing at the same time. If something happens, the player has a certain time to react to it, so a timer must be set for that. If the user reacts quickly enough, that timer needs to end and it will set a new timer depending on what's going to happen next, etc
I think its a bad idea to use Timers in your case.
Using the delayed threads in python will result in more complex code, less accuracy, and quite possible worse performance. Basically, the rule is that if you think you need threads, you don't. Very few programs benefit from the use of threads.
I don't know what you are doing for input. You make reference to multiple players and I'm not sure whether thats on a single keyboard or perhaps networked. Regardless, your current strategy of a main loop may well be the best strategy. Although without seeing how your main loop operates its hard to say for certain.
It should be perfectly safe to have multiple timers going at the same time. Beware that it may not give much of a performance boost, as the CPython interpreter (the standard Python interpreter) uses a GIL (Global Interpreter Lock) which makes threading stuff a bit.... slow.