twisted: no exception trace if error from a callback - python

Consider the following code:
from twisted.internet import defer

df = defer.Deferred()
def hah(_): raise ValueError("4")
df.addCallback(hah)
df.callback(None)
When it runs, that exception just gets eaten. Where did it go? How can I get it to be displayed? Doing defer.setDebugging(True) has no effect.
I ask this because other times, I get a printout saying "Unhandled error in Deferred:". How do I get that to happen in this case? I see that if I add an errback to df then the errback gets called with the exception, but all I want to do is print the error and do nothing else, and I don't want to manually add that handler to every deferred I create.

The exception is still sitting in the Deferred. There are two possible outcomes at this point:
You could add an errback to the Deferred. As soon as you do, it will get called with a Failure containing the exception that was raised.
You could let the Deferred be garbage collected (explicitly delete df, or return from the function, or lose the reference in any other way). This triggers the "Unhandled error in Deferred" code.
Because an errback can be added to a Deferred at any time (ie, the first point above), Deferreds don't do anything with otherwise unhandled errors right away. They don't know if the error is really unhandled, or just unhandled so far. It's only when the Deferred is garbage collected that it can be sure no one else is going to handle the exception, so that's when it gets logged.
In general, you want to be sure you have errbacks on Deferreds, precisely because it's sometimes hard to predict when a Deferred will get garbage collected. It might be a long time, which means it might be a long time before you learn about the exception if you don't have your own errback attached.
This doesn't have to be a terrible burden. Any Deferred (a) which is returned from a callback on another Deferred (b) (ie, when chaining happens) will pass its errors along to b. So (a) doesn't need extra errbacks on it for logging and reporting, only (b) does. If you have a single logical task which is complicated and involves many asynchronous operations, it's almost always the case that all of the Deferreds involved in those operations should channel their results (success or failure) to one main Deferred that represents the logical operation. You often only need special error handling behavior on that one Deferred, and that will let you handle errors from any of the other Deferreds involved.
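For the simple "just print the error" case from the question, one minimal sketch (not the only way to do it) is to start Twisted's logging and attach log.err as the last errback:

import sys
from twisted.internet import defer
from twisted.python import log

log.startLogging(sys.stderr)  # send Twisted's log output, including errors, to stderr

def hah(_):
    raise ValueError("4")

df = defer.Deferred()
df.addCallback(hah)
df.addErrback(log.err)  # logs the Failure's traceback instead of leaving it in the Deferred
df.callback(None)

Because the logging errback returns None, the failure is considered handled, so the "Unhandled error in Deferred" message will not appear later when the Deferred is garbage collected.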

Related

Is SIGINT intrinsically unreliable in python?

I have an application that relies on SIGINT for a graceful shutdown. I noticed that every once in a while it just keeps running. The cause turned out to be a generator in xml/etree/ElementTree.py.
If SIGINT arrives while that generator is being cleaned up, all exceptions are ignored (recall that the default action for SIGINT is to raise a KeyboardInterrupt). That's not unique to this particular generator, or even to generators in general.
From the python docs:
"Due to the precarious circumstances under which __del__() methods are invoked, exceptions that occur during their execution are ignored, and a warning is printed to sys.stderr instead"
In over five years of programming in python, this is the first time I've run into this issue.
If garbage collection can occur at any point, then SIGINT can also theoretically be ignored at any point, and I can't ever rely on it. Is that correct? Have I just been lucky this whole time?
Or is it something about this particular package and this particular generator?
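The behaviour the docs describe is easy to reproduce without signals at all; this hypothetical class just raises inside __del__ and the exception never propagates:

class Noisy:
    def __del__(self):
        # Simulates an exception (e.g. KeyboardInterrupt) arriving during cleanup.
        raise KeyboardInterrupt

obj = Noisy()
del obj                    # CPython prints "Exception ignored in: ..." to stderr
print("still running")     # execution continues as if nothing happened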

Does MemoryError cause python to flush its cache?

I just saw a MemoryError happen on a machine and I noticed the available cache on the server increased drastically after this. Is there some kind of way that Python triggers a memory-management task when the error gets thrown? Or is this potentially managed by the server (Linux / CentOs)?
MemoryError isn't handled specially in any way that would cause this to happen for it and not for other exceptions, but:
Exceptions do unwind the stack, and objects referenced solely along the stack between the exception being raised and when it is caught will generally be released when the exception handling is complete (during handling, the exception traceback tends to create cyclic references that prevent cleanup from occurring)
MemoryError inherits from BaseException, not Exception, so it's less likely to be handled by "generic" except Exception: blocks, meaning more stack layers are unwound and eventually released
The CPython cyclic garbage collector determines when to run collections based on the number of allocations and deallocations that have occurred; if the large stack unwind frees a lot of objects, even more might be freed if it's enough to trigger a collection
All of this increases the odds that memory will be released, but none of it is specific to MemoryError; the same behavior could be observed if you hit Ctrl-C and triggered a KeyboardInterrupt. More likely, you're seeing Python exit, or Linux is responding to the extreme memory request by dumping its cache; the MemoryError would come after the cache is dumped to try to satisfy the large memory request, particularly if the request is made in several sequential requests for blocks of memory instead of a single huge request.
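To see the stack-unwind effect in isolation, here is a small sketch (the class and names are illustrative) showing that an object referenced only from an unwound frame is released once the exception has been handled:

import gc
import weakref

class Payload:
    """Stand-in for a large allocation."""

probe = None

def allocate_and_fail():
    global probe
    obj = Payload()                  # referenced only from this frame
    probe = weakref.ref(obj)
    raise RuntimeError("simulated")  # unwinds the stack past this frame

try:
    allocate_and_fail()
except RuntimeError:
    pass

gc.collect()                         # clear any traceback-induced cycles
print(probe() is None)               # True: the frame-only reference is gone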

tearDown not called after timeout in twisted trial?

I'm seeing an issue in my test suite in trial where everything works fine until I get a timeout. If a test fails due to a timeout, the tearDown function never gets called, leaving the reactor unclean which in turn causes the rest of the tests to fail. I believe tearDown should be called after a timeout, does anyone know why this might happen?
You are correct that tearDown() should be called regardless of what happens in your test. From the documentation for tearDown():
This is called even if the test method raised an exception
However, there is a catch. From the same documentation:
This method will only be called if the setUp() succeeds, regardless of the outcome of the test method.
So it sounds like you perhaps start the reactor in setUp() and when it times out, this is preventing your tearDown() from running - the idea being that whatever you were trying to "set up" in setUp() was not successfully set up, so you do not want to try to tear it down. However, it would be hard to diagnose with certainty unless you provide the code of your setUp and tearDown methods, along with the code of any relevant tests.
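A quick way to convince yourself of that rule is a trial test case whose setUp() raises; neither the test body nor tearDown() runs (this mirrors the standard unittest semantics that trial inherits):

from twisted.trial import unittest

class SetUpFailureTests(unittest.TestCase):
    def setUp(self):
        raise RuntimeError("setUp failed")  # the test is reported as an error...

    def tearDown(self):
        self.fail("never reached")          # ...and tearDown is skipped entirely

    def test_something(self):
        pass                                # the test body is skipped too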
It's rather strange because on my box, the teardown executes even if a timeout occurs. The tests should stop running if the reactor is not in a clean state, unless you use the --unclean-warnings flag. Does the test runner stop after the timeout for you? What version of Python and Twisted are you running?
As a side note, if you need to run a unique teardown for a specific test function, there's a very convenient addCleanup() hook. It comes in handy if you need to cancel callbacks, LoopingCalls, or callLater calls so that the reactor isn't left in a dirty state. Cleanup functions may return Deferreds, which trial will wait on, so you can perform an ad hoc asynchronous teardown there. It might be a good option to try if the class teardown isn't working for you, as sketched below.
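For example, a sketch of that pattern (the LoopingCall here is purely illustrative) might look like:

from twisted.internet import task
from twisted.trial import unittest

class PollingTests(unittest.TestCase):
    def test_polling_stops_cleanly(self):
        ticks = []
        loop = task.LoopingCall(ticks.append, "tick")
        loop.start(0.1)
        # Runs after the test (and tearDown), even if an assertion fails,
        # so the reactor is not left with a pending delayed call.
        self.addCleanup(loop.stop)
        self.assertTrue(loop.running)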
PS
I've been so used to writing "well behaved" Twisted code, I don't even recall how to get into an unclean reactor state :D I swear I'm not bragging. Could you provide me a brief summary of what you're doing so that I could test it out on my end?
I found the problem, I'll put this here in case it's helpful to anyone else in the future.
I was returning a deferred from the test that had already been called (as in, deferred.callback had already been called), but it still had an unfinished callback chain. From what I can see in the trial code here https://github.com/twisted/twisted/blob/twisted-16.5.0/src/twisted/trial/_asynctest.py#L92, the reactor is crashed when this happens, which explains why tearDown doesn't get called. The solution for me was to return a deferred from the offending tests whose callback chain does not live for a long time (its callbacks do not return deferreds themselves).
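For illustration, this is roughly the kind of Deferred that caused it: one that has already fired, but whose callback chain is paused waiting on an inner Deferred that never fires:

from twisted.internet import defer

d = defer.Deferred()
d.callback(None)                 # d has a result...
inner = defer.Deferred()
d.addCallback(lambda _: inner)   # ...but its chain now pauses on `inner`,
                                 # which never fires, so d never finishes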

How to avoid suppressing KeyboardInterrupt during garbage collection in python?

Default handler for SIGINT raises KeyboardInterrupt. However, if a program is inside a __del__ method (because of an ongoing garbage collection), the exception is ignored with the following message printed to stderr:
Exception KeyboardInterrupt in <...> ignored
As a result, the program continues to work despite receiving SIGINT. Of course, I can define my own handler for SIGINT that sets a global variable sigint_received to True, and then often check the value of the variable in my program. But this looks ugly.
Is there an elegant and reliable way to make sure that the python program gets interrupted after receiving SIGINT?
Before I dive into my solution, I want to highlight the scary red "Warning:" sidebar in the docs for object.__del__ (emphasis mine):
Due to the precarious circumstances under which __del__() methods are invoked, exceptions that occur during their execution are ignored, and a warning is printed to sys.stderr instead. [...] __del__() methods should do the absolute minimum needed to maintain external invariants.
This suggests to me that any __del__ method that's at serious risk of being interrupted by an interactive user's Ctrl-C might be doing too much. So my first suggestion would be to look for ways to minimize your __del__ method, whatever it is.
Or to put it another way: If your __del__ method really does do "the absolute minimum needed", then how can it be safe to kill the process half-way through?
Custom Signal Handler
The only solution I could find was indeed a custom signal handler for signal.SIGINT... but a lot of the obvious tricks didn't work:
Failed: sys.exit
Calling sys.exit from the signal handler just raised a SystemExit exception, which was ignored. Python's C API docs suggest that it is impossible for the Python interpreter to raise any exception during a __del__ method:
void PyErr_WriteUnraisable(PyObject *obj)
[Called when...] it is impossible for the interpreter to actually raise the exception [...] for example, when an exception occurs in an __del__() method.
Partial Success: Flag Variable
Your idea of setting a global "drop dead" variable inside the signal handler worked only partially --- although it updated the variable, nothing got a chance to read that variable until after the __del__ method returned. So for several seconds, the Ctrl-C appeared to have done nothing.
This might be good enough if you just want to terminate the process "eventually", since it will exit whenever the __del__ method returns. But since you probably want to shut down the process without waiting (both SIGINT and KeyboardInterrupt typically come from an impatient user), this won't do.
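For reference, the flag approach described above looks roughly like this (the names are illustrative):

import signal

shutdown_requested = False

def _note_sigint(signum, frame):
    global shutdown_requested
    shutdown_requested = True   # the main loop must poll this between work items

signal.signal(signal.SIGINT, _note_sigint)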
Success: os.kill
Since I couldn't find a way to convince the Python interpreter to kill itself, my solution was to have the (much more persuasive) operating system do it for me. This signal handler uses os.kill to send a stronger SIGTERM to its own process ID, causing the Python interpreter itself to exit.
import os
import signal

def _sigterm_this_process(signum, frame):
    # SIGTERM is not swallowed by __del__, so the interpreter exits promptly.
    pid = os.getpid()
    os.kill(pid, signal.SIGTERM)

# Elsewhere...
signal.signal(signal.SIGINT, _sigterm_this_process)
Once the custom signal handler was set, Ctrl-C caused the __del__ method (and the entire program) to exit immediately.

twisted: check whether a deferred has already been called

This is what I'm trying to accomplish. I'm making a remote call to a server for information, and I want to block to wait for the info. I created a function that returns a Deferred such that when the RPC comes in with the reply, the deferred is called. Then I have a function called from a thread that goes threads.blockingCallFromThread(reactor, deferredfunc, args).
If something goes wrong - for example, the server goes down - then the call will never un-block. I'd prefer the deferred to go off with an exception in these cases.
I partially succeeded. I have a deferred, onConnectionLost which goes off when the connection is lost. I modified my blocking call function to:
deferred = deferredfunc(args)
self.onConnectionLost.addCallback(lambda _: deferred.errback(
    failure.Failure(Exception("connection lost while getting run"))))
result = threads.blockingCallFromThread(
    reactor, lambda _: deferred, None)
return result
This works fine. If the server goes down, the connection is lost, and the errback is triggered. However, if the server does not go down and everything shuts down cleanly, onConnectionLost still gets fired, and the anonymous callback here attempts to trigger the errback, causing an AlreadyCalled exception to be raised.
Is there any neat way to check that a deferred has already been fired? I want to avoid wrapping it in a try/except block, but I can always resort to that if that's the only way.
There are ways, but you really shouldn't do it. Your code that is firing the Deferred should be keeping track of whether it's fired the Deferred or not in the associated state. Really, when you fire the Deferred, you should lose track of it so that it can get properly garbage collected; that way you never need to worry about calling it twice, since you won't have a reference to it any more.
Also, it looks like you're calling deferredfunc from the same thread that you're calling blockingCallFromThread. Don't do that; functions which return Deferreds are most likely calling reactor APIs, and those APIs are not thread safe. In fact, Deferred itself is not thread safe. This is why it's blockingCallFromThread, not blockOnThisDeferredFromThread. You should do blockingCallFromThread(reactor, deferredfunc, args).
If you really want errback-if-it-hasn't-fired-yet-otherwise-do-nothing behavior, you may want to cancel the Deferred.
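If you do go that route, a minimal sketch (Deferred.cancel has been available since Twisted 10.1) would be:

from twisted.internet import defer

d = defer.Deferred()

# ... later, e.g. from the onConnectionLost handler:
d.cancel()   # errbacks with defer.CancelledError if d has not fired yet;
             # does not raise AlreadyCalledError if it already has a result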
