Here's an esoteric pure-Python question.
I'm doing some statistical profiling using sys._current_frames(). i.e. I've got a background thread that runs sys._current_frames() once every second, dumps the results in a text file, and then later I've got some Python code that sorts the tracebacks from most common to least.
One curious phenomenon I've seen is tracebacks like these:
File "/opt/foo/bar.py", line 1437, in __iter__
yield key
This yield is in a generator that I wrote. The curious thing is that there's just one frame in this traceback. How could this be? The other tracebacks have lots of frames, starting either from the top level of the process or the top level of the thread. What is the meaning of this single-frame stack trace?
One theory that I had is that this is a generator's frozen state, after it's yielded a value and it's waiting to have next called on it again. But I think I disproved this theory with a separate experiment: I made a generator, ensured it was paused, called sys._current_frames() and I didn't see that kind of stacktrace.
As the sys._current_frames() documentation warns,
This is most useful for debugging deadlock: this function does not require the deadlocked threads’ cooperation, and such threads’ call stacks are frozen for as long as they remain deadlocked. The frame returned for a non-deadlocked thread may bear no relationship to that thread’s current activity by the time calling code examines the frame.
sys._current_frames() is naturally prone to race conditions in any situation where you cannot guarantee the threads of interest are paused.
As you suspected, you're seeing a stack trace for a suspended generator. When a generator suspends, its stack frame has no parent frames: its f_back is None.
sys._current_frames() retrieves stack frames for currently running threads, but by the time you look at those frames, they may not be running any more. If a generator suspends between the time you call sys._current_frames() and the time you inspect the frame, this is what it will look like. You might also see it on top of a call stack that looks completely different from when you actually called sys._current_frames(), if it gets resumed somewhere else.
Your test didn't show the generator frame because you suspended the generator before calling sys._current_frames() instead of afterward. The generator's stack frame was not the active frame of any thread at that point.
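The f_back behavior described above can be checked directly. This is a minimal sketch, assuming CPython; the names probe, while_running and while_suspended are made up for the illustration. It looks at the generator's frame once while the body is running and once after it has suspended:

```python
import sys

def probe():
    # While the generator body runs, its frame is linked to the caller's frame.
    frame = sys._getframe()
    yield frame.f_back is not None

g = probe()
while_running = next(g)               # evaluated inside the running generator
while_suspended = g.gi_frame.f_back   # inspected after the generator has suspended

print(while_running)    # True: the running frame had a parent
print(while_suspended)  # None: the suspended frame has no parent
```

A profiler that walks f_back from such a frame after the generator has suspended would therefore print exactly one frame.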
Related
I have a GTK window. It turns out that if I schedule several redraw calls all at once, without any delay, from a separate thread, using idle_add(window.queue_draw), only one call will execute.
While if I do idle_add(custom_function), every single scheduled call to custom_function will run.
While it's clear that this is done for optimization, I can't see if/where this is mentioned in the documentation [1] and I also wonder if there are other such rules for idle_add.
[1] https://developer.gnome.org/pygobject/stable/glib-functions.html
It actually isn't idle_add that is causing that behavior. The docs for widget-queue-draw-region, which gets called by queue_draw, state that redrawing only gets done once the main loop is no longer busy, so several requests queued before then are handled by a single redraw.
I'm having trouble understanding the differences between stack frames and execution frames, mostly with respect to the traceback and inspect modules (in Python 3).
I thought they were the same, but the docs imply they are not, as functions of the inspect module return frame objects whereas functions of the traceback module do not (i.e. inspect.stack() vs traceback.print_stack()).
From googling, I understand that a stack frame is a data structure containing subroutine state information (function call and argument data). However, as per the docs, an execution frame is something similar:
An execution frame contains some administrative information (used for debugging), determines where and how execution continues after the code block's execution has completed, and (perhaps most importantly) defines two namespaces, the local and the global namespace, that affect execution of the code block.
So what exactly is the difference between a stack frame and an execution frame?
With normal function calls, the program state is mostly described by a simple call stack. It is printed out as a traceback after an uncaught exception, it can be examined with inspect.stack, and it can be displayed in a debugger after a breakpoint.
In the presence of generators, generator-based coroutines, and async def-based coroutines, I don't think the call stack is enough. What's a good way to mentally visualize the program state? How do I inspect it at runtime?
There are functions inspect.getgeneratorstate and inspect.getcoroutinestate, but they only provide information about whether the generator/coroutine is created, running, suspended, or closed. In the case the state is RUNNING, I want to be able to examine the actual line number the generator or coroutine is currently executing and the stack frames that correspond to the other functions it might have called. In the case it's SUSPENDED, I want to examine other generators / coroutines it sent data to or yielded to.
Edit: I found a related question on SO which pointed me to this excellent article that explains everything I asked about in this question.
You just have to find all instances of generators and co-routines reachable from the "traditional" frames: either search for them recursively in all objects in all frames, or try using the garbage collector (gc) module to get a reference to all these instances.
Generators and co-routines have, respectively, a gi_frame and a cr_frame attribute.
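The gc-based approach could look roughly like the following. This is only a sketch, assuming CPython; countdown, live_generators and describe are hypothetical helper names, not part of any library. Coroutines can be handled the same way using types.CoroutineType and cr_frame.

```python
import gc
import inspect
import types

def countdown(n):
    while n > 0:
        yield n
        n -= 1

def live_generators():
    """Return every generator object the garbage collector currently tracks."""
    return [obj for obj in gc.get_objects() if type(obj) is types.GeneratorType]

def describe(gen):
    """Summarize where a generator is paused, using its gi_frame attribute."""
    frame = gen.gi_frame  # None once the generator has finished
    return {
        "name": gen.__name__,
        "state": inspect.getgeneratorstate(gen),
        "line": frame.f_lineno if frame is not None else None,
        "locals": dict(frame.f_locals) if frame is not None else {},
    }

gen = countdown(3)
next(gen)  # run up to the first yield, leaving the generator suspended

found = [g for g in live_generators() if g is gen]
info = describe(gen)
print(info["state"], info["locals"])
```

Scanning gc.get_objects() visits every tracked object in the process, so this is better suited to occasional debugging than to frequent sampling.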
The Python reference manual says that
A code block is executed in an execution frame. A frame contains some
administrative information (used for debugging) and determines where
and how execution continues after the code block’s execution has
completed.
and
Frame objects represent execution frames. They may occur
in traceback objects
But I don't understand how a frame works. How can I get access to the current frame object? When is a frame object created? Is a frame object created every time the code of a new block starts to execute?
These frames are representations of the stack frames created by function calls. You should not need to access them in normal programming. A new frame is indeed created every time a function is called, and destroyed when it exits or raises an uncaught exception. Since function calls can go many levels deep your program ends up with a bunch of nested stack frames, but it's not good programming practice (unless you are writing a debugger or similar application) to mess about with the frames even though Python does make them available.
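For the debugging cases mentioned, the current frame can be obtained with inspect.currentframe() (or sys._getframe() on CPython). A small sketch, with inner and outer as made-up function names, showing that each call gets its own frame and that f_back leads to the caller:

```python
import inspect

def inner():
    frame = inspect.currentframe()  # the frame created for this call
    # Returning the frame keeps it alive after the call, for illustration only.
    return frame, frame.f_code.co_name, frame.f_back.f_code.co_name

def outer():
    return inner()

first, callee, caller = outer()
second, _, _ = outer()

print(callee, caller)        # inner outer
print(first is not second)   # True: each call got its own frame object
```

Holding on to frame objects like this also keeps their local variables alive, which is one more reason to avoid it outside of debugging tools.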
It is important to understand frames in Python 3, especially if you maintain production code that is sensitive to memory usage, such as deep learning models.
Frames are objects that represent areas of memory where local variables are stored when a function is called. They can be stored and manipulated, which is what debuggers do. Understanding how frames are handled by Python can help you avoid memory leaks.
Each time a function is called, a new frame is created to hold the function's variables and parameters. These frames are normally destroyed when the function finishes executing normally. However, if an exception is raised, the associated frame and all parent frames are stored in a traceback object, which is an attribute of the Exception object (__traceback__). This can cause memory leaks if the Exception object lives for a long time, because it holds onto the frames, so none of their local variables can be garbage collected.
This matters quite a lot.
For example, it is one reason why you need to call the close method on file objects even if you don't create reference cycles: the file object may be referenced by a traceback object stored on an Exception. This prevents the file from being garbage collected even after it goes out of scope.
The issue is worse in interactive Python shells (like Jupyter), where each unhandled exception ends up living forever in a few places. I'm working on a way to clean that up, which is how I found this issue.
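The retention described above can be observed with a weak reference. This is a minimal sketch, assuming CPython; Payload, fail and capture are hypothetical names standing in for a large local object and the code that raises:

```python
import gc
import weakref

class Payload:
    """Stand-in for a large local object."""

refs = []

def fail():
    payload = Payload()                   # local variable of the failing frame
    refs.append(weakref.ref(payload))     # lets us watch it without keeping it alive
    raise ValueError("boom")

def capture():
    try:
        fail()
    except ValueError as exc:
        return exc                        # the exception outlives the except block

saved = capture()
gc.collect()
alive_while_exception_kept = refs[0]() is not None

saved.__traceback__ = None                # drop the traceback and its frames
gc.collect()
alive_after_clearing = refs[0]() is not None

print(alive_while_exception_kept, alive_after_clearing)  # True False
```

If the exception object itself is still needed, traceback.clear_frames(exc.__traceback__) is another option: it clears the local variables of the frames while keeping the traceback.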
In Python, the types module provides the FrameType and TracebackType types, which represent frames and tracebacks, respectively. FrameType cannot be instantiated directly; TracebackType can be, since Python 3.7. https://docs.python.org/3/library/types.html#types.FrameType
The __traceback__ attribute was introduced in Python 3 with PEP 3134, which discusses the ramifications of this change in some detail, so it is a good read for the curious.
This is quite an essential part of my program and I need to have it sorted out as soon as possible, so anything would be a massive help.
My program consists of three modules which are imported to each other. One module consists of my user interface for which I am using tkinter. The user inputs data on a canvas which is sent to a second program to be processed and is then sent to the third module which contains the algorithm which I intend to step through with the user.
The "first" and "third" modules can interact with each other and during certain points in explaining the algorithm I change the appearance of the canvas and some text on the interface. The third module should then pause (for which I'm currently using a basic sleep method), and wait (ideally it will wait for the user to press the "Next Step" button on the user interface). It is during this step that my interface decides that it wants to freeze.
Is there any way I can stop this?
Many thanks in advance.
Edit: I've found a way to fix this. Thank you for all the suggestions!
Calling time.sleep() will stop your program doing anything until it finishes sleeping. You need Tkinter to keep processing events until it should run the next part of your code.
To do that, put the next part of your code in a separate function, and get Tkinter to call it when it's ready. Typically, you want this to happen when the user triggers it (e.g. by clicking a button), so you need to bind it to an event. If you actually want it to happen after a fixed time, you can use the .after() method on any tkinter widget.
GUI programming takes a bit of getting used to. You don't code as a series of things happening one after the other, you write separate bits of code which are triggered by what the user does.
Terminology note: if your Python files import each other, you have three modules, but it's still all one program. Talking about the "first program" will confuse people.
H.E.P -
The traditional way to do this does indeed involve using a separate thread and co-ordinating the work between the "worker" thread and the GUI thread using some sort of polling or eventing mechanism.
But, as Thomas K. points out, that can get very complex and tricky, especially regarding Python's use of the Global Interpreter Lock (GIL) etc. and having to also contend with Tkinter's processing loop.
(The only good reason to use a multi-threaded GUI is if you absolutely MUST ensure that the GUI remains responsive during a potentially long-running background task, which I don't believe is the issue in this case.)
What I would suggest instead is a generator-based "co-routine"-type architecture.
As noted in "The Python (2.7) Language Reference", Section 6.8, [the "yield" statement is used when defining a generator function and is only used in the body of the generator function. Using a yield statement in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.]
(This effectively forms the basis of a co-routine architecture. (ed.))
[When a generator function is called, it returns an iterator known as a generator iterator, or more commonly, a generator. The body of the generator function is executed by calling the generator’s next() method repeatedly until it raises an exception.
When a yield statement is executed, the state of the generator is frozen and the value of expression_list is returned to next()'s caller. By “frozen” we mean that all local state is retained, including the current bindings of local variables, the instruction pointer, and the internal evaluation stack: enough information is saved so that the next time next() is invoked, the function can proceed exactly as if the yield statement were just another external call.]
(Also see "PEP 0342 - Coroutines via Enhanced Generators" for additional background and general info.)
This should allow your GUI to call the next part of your algorithm specification generator, on demand, without it having to be put to sleep until the operator presses the "Next" button.
You would basically just be creating a little 'domain-specific language', (DSL), consisting of just the list of steps for your presentation of this particular algorithm, and the generator (iterator) would simply execute each next step when called (on demand).
Much simpler and easier to maintain.
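A rough sketch of what this could look like, leaving tkinter itself out so it runs anywhere. The names algorithm_steps, Stepper and on_next_step are invented for the example, and the "algorithm" is just a running total. Note that Python 3 uses next(gen) rather than the gen.next() method from the 2.7 reference quoted above.

```python
def algorithm_steps(data):
    """Hypothetical algorithm: a running total, pausing after each step."""
    total = 0
    for value in data:
        total += value
        # Hand a description of this step back to the GUI, then pause here.
        yield f"added {value}, total is now {total}"

class Stepper:
    """Advances the generator by one step each time the GUI asks for it."""

    def __init__(self, steps, show):
        self._steps = steps
        self._show = show  # callable that displays text, e.g. a label update

    def on_next_step(self):
        # In tkinter this method could be passed as a Button's command.
        try:
            self._show(next(self._steps))
        except StopIteration:
            self._show("finished")

shown = []
stepper = Stepper(algorithm_steps([2, 3]), shown.append)
for _ in range(3):  # simulate three clicks on "Next Step"
    stepper.on_next_step()
print(shown)
```

Because the generator is only advanced from the button callback, nothing ever sleeps and the event loop stays free to redraw the canvas between steps.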
A GUI program is always waiting for some action to occur. When actions do occur, the event code corresponding to that action is executed. Therefore, there is no need to call sleep(). All you need to do is set it up so that the third program is executed from the appropriate event.