Tracing an ignored exception in Python?

My app has a custom audio library that itself uses the BASS library.
I create and destroy BASS stream objects throughout the program.
When my program exits, randomly (I haven't figured out the pattern yet) I get the following notice on my console:
Exception TypeError: "'NoneType' object is not callable" in <bound method stream.__del__ of <audio.audio_player.stream object at 0xaeda2f0>> ignored
My audio library (audio/audio_player.py [class stream]) contains a class that creates a BASS stream object and then lets the rest of the code manipulate it. When an instance is destroyed (in its __del__ method) it calls BASS_StreamFree to free any resources BASS might have allocated.
(audio_player.py)
from pybass import *
from ctypes import pointer, c_float, c_long, c_ulong, c_buffer
import os.path, time, threading
# initialize the BASS engine
BASS_Init(-1, 44100, 0, 0, None)
class stream(object):
    """Represents a single audio stream"""

    def __init__(self, file):
        # check for file existence
        if (os.path.isfile(file) == False):
            raise ValueError("File %s not found." % file)
        # initialize a bass channel
        self.cAddress = BASS_StreamCreateFile(False, file, 0, 0, 0)

    def __del__(self):
        BASS_StreamFree(self.cAddress)

    def play(self):
        BASS_ChannelPlay(self.cAddress, True)
        while (self.playing == False):
            pass
..more code..
My first inclination, based on this message, is that somewhere in my code an instance of my stream class is being orphaned (no longer assigned to a variable), and Python still tries to call its __del__ method when the app closes, but by the time it tries, the object it needs has already gone away.
This app does use wxWidgets and thus involves some threading. The fact that I'm not being given an actual variable name leads me to believe what I stated in the previous paragraph.
I'm not sure exactly what code would be relevant to debug this. The message does seem harmless but I don't like the idea of an "ignored" exception in the final production code.
Is there any tips anyone has for debugging this?

The exception is reported as ignored because all exceptions raised in a __del__ method are ignored, in order to keep the data model sane. Here's the relevant portion of the docs:
Warning: Due to the precarious circumstances under which __del__() methods are invoked, exceptions that occur during their execution are ignored, and a warning is printed to sys.stderr instead. Also, when __del__() is invoked in response to a module being deleted (e.g., when execution of the program is done), other globals referenced by the __del__() method may already have been deleted or in the process of being torn down (e.g. the import machinery shutting down). For this reason, __del__() methods should do the absolute minimum needed to maintain external invariants. Starting with version 1.5, Python guarantees that globals whose name begins with a single underscore are deleted from their module before other globals are deleted; if no other references to such globals exist, this may help in assuring that imported modules are still available at the time when the __del__() method is called.
As for debugging it, you could start by putting a try/except block around the code in your __del__ method and printing out more information about the program's state at the time it occurs. Or you could consider doing less in the __del__ method, or getting rid of it entirely!
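For example, a minimal sketch of both ideas, assuming the same pybass setup as in the question (the explicit close() method is an addition of this sketch, not part of the original class):

from pybass import *   # as in the question: provides BASS_StreamCreateFile, BASS_StreamFree

class stream(object):
    def __init__(self, file):
        self.cAddress = BASS_StreamCreateFile(False, file, 0, 0, 0)

    def close(self):
        # Explicit cleanup the application can call while the interpreter
        # (and the BASS bindings) are still fully alive.
        if self.cAddress:
            BASS_StreamFree(self.cAddress)
            self.cAddress = None

    def __del__(self):
        # Defensive fallback: do the bare minimum, and report failures
        # instead of letting them surface as an "ignored" exception.
        try:
            self.close()
        except Exception as exc:
            import sys
            sys.stderr.write("stream.__del__ failed: %r\n" % (exc,))

Calling close() explicitly (or from a wx close handler) before interpreter shutdown avoids relying on __del__ at all.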

import inside a function: is memory reclaimed upon function exit?

If an import statement is inside a function, will the memory occupied by it get reclaimed once the function exits? If yes, is the timing of the reclamation deterministic (or even -ish)?
def func():
    import os
    ...
    # function about to exit; will memory occupied by `os` be freed?
If anyone has knowledge on the behavior of micropython on this topic, bonus points.
The first import executes the code in the module. It creates the module object's attributes. Each subsequent import just references the module object created by the first import.
Module objects in Python are effectively singletons. For this to work, the Python implementation has to keep the one and only module instance around after the first import, regardless of the name the module was bound to, or whether it was bound to a name at all, since there are also imports of the form from some_module import some_name.
So no, the memory isn't reclaimed.
No idea about Micropython, but I would be surprised if it changes semantics here that drastically. You can simply test this yourself:
some_module.py:
value = 0
some_other_module.py:
def f():
    import some_module
    some_module.value += 1
    print(some_module.value)

f()
f()
This should print the numbers 1 and 2.
To second what @BlackJack wrote: per Python semantics, an import statement adds a module reference to sys.modules, and that alone keeps the module object from being garbage collected.
You can try to do del sys.modules["some_module"], but there's no guarantee that all memory taken by the module would be reclaimed. (That issue popped up previously, but I don't remember its current state, e.g. whether bytecode objects can be garbage-collected.)
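For illustration, with the same some_module as above (this only removes the cache entry; objects the module created may still be referenced elsewhere and stay alive):

import sys
import some_module                   # first import executes some_module.py and caches it

print "some_module" in sys.modules   # True: the module object is cached

del sys.modules["some_module"]       # drop the cache entry

import some_module                   # executes some_module.py again, as a new module object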
If yes, is the timing of the reclamation deterministic (or even -ish)?
In MicroPython, "reclamation time" is guaranteed to be non-deterministic, because it uses a pure garbage-collection scheme with no reference counting. That means that any resource-consuming objects (files, sockets) should be closed explicitly.
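So on MicroPython you would lean on explicit cleanup (try/finally or a with block) rather than relying on finalizers, for example:

f = open("data.txt")
try:
    data = f.read()
finally:
    f.close()   # do not rely on garbage collection to release the file

# or, equivalently:
with open("data.txt") as f:
    data = f.read()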
Otherwise, function-level imports are a valid and useful idiom in Python, and they are especially useful in MicroPython: they let a module be imported only if a particular code path is hit. E.g. if the user never calls some function, the module is never imported, leaving more memory for the tasks the user actually needs in this particular application/invocation. A sketch of the idiom follows.
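A minimal sketch of that lazy-import idiom (the json module and the function here are just illustrative):

def export_report(data, path):
    # json is imported (and its memory used) only if this function
    # is actually called during this run of the program.
    import json
    with open(path, "w") as f:
        json.dump(data, f)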

How exactly do modules in Python work?

I am trying to better understand Python's modules, coming mostly from a C background.
I have main.py with the following:
def g():
    print obj # Need access to the object below

if __name__ == "__main__":
    obj = {}
    import child
    child.f()
And child.py:
def f():
    import main
    main.g()
This particular structure of code may seem strange at first, but rest assured this is stripped from a larger project I am working on, where delegation of responsibility and decoupling forces the kind of inter-module function call sequence you see.
I need to be able to access the actual object I create when first executing main (python main.py). Is this possible without explicitly passing obj around as a parameter? I will have other variables as well, and I don't want to pass those around either. If desperate, I can create a single "state" object for the entire main module and pass that around, but even that is a last resort for me. This is global variables at its simplest in C, but in Python it is a different beast, I suppose (module-level globals only?).
One solution, short of passing parameters around, revolves around the following fact: when the main module is executed directly (e.g. via python main.py, so that the if clause succeeds and obj gets bound), the module and its state are registered in sys.modules under the name __main__, not main. So when the child module needs the actual instance of the main module, it must import __main__ rather than main; otherwise two distinct copies of the module would exist, each with its own state.
'Fixed' child.py:
def f():
    import __main__
    __main__.g()
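To see the two-copies situation directly, you can add a purely diagnostic check inside child.f() (not part of the fix):

def f():
    import sys, __main__
    print "__main__" in sys.modules   # True: the script that was executed
    print "main" in sys.modules       # True only if main.py was imported a second time
    __main__.g()                      # uses the obj bound by the running script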

How to inherit from a multiprocessing queue?

With the following code, it seems that the queue instance passed to the worker isn't initialized:
from multiprocessing import Process
from multiprocessing.queues import Queue
class MyQueue(Queue):
    def __init__(self, name):
        Queue.__init__(self)
        self.name = name

def worker(queue):
    print queue.name

if __name__ == "__main__":
    queue = MyQueue("My Queue")
    p = Process(target=worker, args=(queue,))
    p.start()
    p.join()
This throws:
... line 14, in worker
    print queue.name
AttributeError: 'MyQueue' object has no attribute 'name'
I can't re-initialize the queue, because I'd lose the original value of queue.name. I could pass the queue's name as a separate argument to the worker (that should work), but it's not a clean solution.
So, how can I inherit from multiprocessing.queues.Queue without getting this error?
On POSIX, Queue objects are shared with the child processes by simple inheritance.*
On Windows, that isn't possible, so it has to pickle the Queue, send it over a pipe to the child, and unpickle it.
(This may not be obvious, because if you actually try to pickle a Queue, you get an exception, RuntimeError: MyQueue objects should only be shared between processes through inheritance. If you look through the source, you'll see that this is really a lie: it only raises this exception if you try to pickle a Queue while multiprocessing is not in the middle of spawning a child process.)
Of course generic pickling and unpickling wouldn't do any good, because you'd end up with two identical queues, not the same queue in two processes. So, multiprocessing extends things a bit, by adding a register_after_fork mechanism for objects to use when unpickling.** If you look at the source for Queue, you can see how it works.
But you don't really need to know how it works to hook it; you can hook it the same way as any other class's pickling. For example, this should work:***
def __getstate__(self):
    return self.name, super(MyQueue, self).__getstate__()

def __setstate__(self, state):
    self.name, state = state
    super(MyQueue, self).__setstate__(state)
For more details, the pickle documentation explains the different ways you can influence how your class is pickled.
(If it doesn't work, and I haven't made a stupid mistake… then you do have to know at least a little about how it works to hook it… but most likely just to figure out whether to do your extra work before or after the _after_fork(), which would just require swapping the last two lines…)
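Putting it together, a sketch of the whole thing under those assumptions (Python 2.7 style, matching the question; on Python 3.4+ multiprocessing.queues.Queue.__init__ also requires a ctx argument, so the constructor would need adjusting):

from multiprocessing import Process
from multiprocessing.queues import Queue

class MyQueue(Queue):
    def __init__(self, name):
        Queue.__init__(self)
        self.name = name

    def __getstate__(self):
        # Bundle our extra attribute together with whatever Queue pickles.
        return self.name, super(MyQueue, self).__getstate__()

    def __setstate__(self, state):
        # Restore our attribute first, then let Queue restore its own state.
        self.name, state = state
        super(MyQueue, self).__setstate__(state)

def worker(queue):
    print queue.name

if __name__ == "__main__":
    queue = MyQueue("My Queue")
    p = Process(target=worker, args=(queue,))
    p.start()
    p.join()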
* I'm not sure it's actually guaranteed to use simple fork inheritance on POSIX platforms. That happens to be true on 2.7 and 3.3. But there's a fork of multiprocessing that uses the Windows-style pickle-everything on all platforms for consistency, and another one that uses a hybrid on OS X to allow using CoreFoundation in single-threaded mode, or something like that, and it's clearly doable that way.
** Actually, I think Queue is only using register_after_fork for convenience, and could be rewritten without it… but it's depending on the magic that Pipe does in its _after_fork on Windows, or Lock and BoundedSemaphore on POSIX.
*** This is only correct because I happen to know, from reading the source, that Queue is a new-style class, doesn't override __reduce__ or __reduce_ex__, and never returns a falsey value from __getstate__. If you didn't know that, you'd have to write more code.

Unpickling a function into a different context in Python

I have written a Python interface to a process-centric job distribution system we're developing/using internally at my workplace. The primary people using this interface are research scientists: reasonably skilled programmers, but not software developers, so ease of use and keeping the interface out of the way as much as possible are paramount.
My library unrolls a sequence of inputs into a sequence of pickle files on a shared file server, then spawns jobs that load those inputs, perform the computation, pickle the results, and exit; the client script then picks back up and produces a generator that loads and yields the results (or rethrows any exception the calculation function did.)
This is only useful since the calculation function itself is one of the serialized inputs. cPickle is quite content to pickle function references, but requires the pickled function to be reimportable in the same context. This is problematic. I've already solved the problem of finding the module to reimport it, but the vast majority of the time, it is a top-level function that is pickled and, thus, does not have a module path. The only strategy I've found to be able to unpickle such a function on the computation nodes is this nauseating little approach towards simulating the original environment in which the function was pickled before unpickling it:
...
# At this point, we've identified the source of the target function.
# A string by its name lives in "modname".
# In the real code, there is significant try/except work here.
targetModule = __import__(modname)
globalRef = globals()
for thingie in dir(targetModule):
    if thingie not in globalRef:
        globalRef[thingie] = targetModule.__dict__[thingie]
# sys.argv[2]: the path to the pickle file common to all jobs, which contains
# any data in common to all invocations of the target function, then the
# target function itself
commonFile = open(sys.argv[2], "rb")
commonUnpickle = cPickle.Unpickler(commonFile)
commonData = commonUnpickle.load()
# the actual function unpack I'm having trouble with:
doIt = commonUnpickle.load()
The final line is the most important one here- it's where my module is picking up the function it should actually be running. This code, as written, works as desired, but directly manipulating the symbol tables like this is unsettling.
How can I do this, or something very much like it, without forcing the research scientists to separate their calculation scripts into a proper class structure the way Pickle desperately wants (they use Python like the most excellent graphing calculator ever, and I would like to let them keep doing so), and without the unpleasant, unsafe, and just plain scary __dict__-and-globals() manipulation I'm using above? I fervently believe there has to be a better way, but exec "from {0} import *".format("modname") didn't do it, several attempts to inject the pickle load into the targetModule reference didn't do it, and eval("commonUnpickle.load()", targetModule.__dict__, locals()) didn't do it. All of these fail with Unpickle's AttributeError over being unable to find the function in <module>.
What is a better way?
Pickling functions can be rather annoying if trying to move them into a different context. If the function does not reference anything from the module that it is in and references (if anything) modules that are guaranteed to be imported, you might check some code from a Rudimentary Database Engine found on the Python Cookbook.
In order to support views, the academic module grabs the code from the callable when pickling the query. When it comes time to unpickle the view, a LambdaType instance is created with the code object and a reference to a namespace containing all imported modules. The solution has limitations but worked well enough for the exercise.
Example for Views
class _View:

    def __init__(self, database, query, *name_changes):
        "Initializes _View instance with details of saved query."
        self.__database = database
        self.__query = query
        self.__name_changes = name_changes

    def __getstate__(self):
        "Returns everything needed to pickle _View instance."
        return self.__database, self.__query.__code__, self.__name_changes

    def __setstate__(self, state):
        "Sets the state of the _View instance when unpickled."
        database, query, name_changes = state
        self.__database = database
        self.__query = types.LambdaType(query, sys.modules)
        self.__name_changes = name_changes
Sometimes it appears necessary to make modifications to the registered modules available in the system. If, for example, you need to make a reference to the first module (__main__), you may need to create a new module object with your available namespace loaded into it. The same recipe used the following technique.
Example for Modules
def test_northwind():
    "Loads and runs some test on the sample Northwind database."
    import os, imp
    # Patch the module namespace to recognize this file.
    name = os.path.splitext(os.path.basename(sys.argv[0]))[0]
    module = imp.new_module(name)
    vars(module).update(globals())
    sys.modules[name] = module
Your question was long, and I was too caffeinated to make it all the way through… However, I think you are looking to do something for which a pretty good solution already exists. There's a fork of the parallel python (i.e. pp) library that takes functions and objects and serializes them, sends them to different servers, and then unpickles and executes them. The fork lives inside the pathos package, but you can download it independently here:
http://danse.cacr.caltech.edu/packages/dev_danse_us
The "other context" in that case is another server… and the objects are transported by converting the objects to source code and then back to objects.
If you are looking to use pickling, much in the way you are doing already, there's an extension to mpi4py that serializes arguments and functions, and returns pickled return values… The package is called pyina, and is commonly used to ship code and objects to cluster nodes in coordination with a cluster scheduler.
Both pathos and pyina provide map abstractions (and pipe), and try to hide all of the details of parallel computing behind the abstractions, so scientists don't need to learn anything except how to program normal serial python. They just use one of the map or pipe functions, and get parallel or distributed computing.
Oh, I almost forgot. The dill serializer includes dump_session and load_session functions that allow the user to easily serialize their entire interpreter session and send it to another computer (or just save it for later use). That's pretty handy for changing contexts, in a different sense.
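A minimal sketch of that session round-trip with dill (the file name is arbitrary):

import dill

x = 42
def square(n):
    return n * n

# Serialize everything defined so far in this interpreter session...
dill.dump_session("session.pkl")

# ...and later (possibly in a fresh interpreter, or on another machine,
# provided dill is installed there) restore it and keep working.
dill.load_session("session.pkl")
print square(x)   # 1764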
Get dill, pathos, and pyina here: https://github.com/uqfoundation
For a module to be recognized as loaded, I think it must be in sys.modules, not just have its contents imported into your global/local namespace. Try to exec everything, then get the result out of an artificial environment.
env = {"fn": sys.argv[2]}
code = """\
import %s  # maybe more
import cPickle
commonFile = open(fn, "rb")
commonUnpickle = cPickle.Unpickler(commonFile)
commonData = commonUnpickle.load()
doIt = commonUnpickle.load()
""" % modname
exec code in env
return env["doIt"]
While functions are advertised as first-class objects in Python, this is one case where it can be seen that they are really second-class objects. It is the reference to the callable, not the object itself, that is pickled. (You cannot directly pickle a lambda expression.)
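A quick illustration of that reference-only behavior (module-level function versus lambda; the names are arbitrary):

import pickle

def top_level(x):
    return x + 1

# Only the qualified name is stored; unpickling re-imports the module
# and looks the function up again by that name.
data = pickle.dumps(top_level)
print repr(data)   # contains the module and function name, not the code

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: x + 1)
except pickle.PicklingError as exc:
    print exc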
There is an alternate usage of __import__ that you might prefer:
import sys
from functools import partial

def importer(modulename, symbols=None):
    u"importer('foo') returns module foo; importer('foo', ['bar']) returns {'bar': object}"
    if modulename in sys.modules:
        module = sys.modules[modulename]
    else:
        module = __import__(modulename, fromlist=['*'])
    if symbols is None:
        return module
    else:
        return dict(zip(symbols, map(partial(getattr, module), symbols)))
So these would all be basically equivalent:
from mymodule.mysubmodule import myfunction
myfunction = importer('mymodule.mysubmodule').myfunction
globals()['myfunction'] = importer('mymodule.mysubmodule', ['myfunction'])['myfunction']

Python would not let me use methods inside the class while the class definition has not been finished

I am using Python 2.6 as a batch script replacement. It will be started by double-clicking, so all output to stdout will be lost/ignored by the user. So, I decided to add logging, and to make matters simple, I wrote a class for that. The idea is that I can use Logging.Logger anywhere in my code, and the logger will be ready to go.
I wish to have no more than 10 old log files in a directory, so I clear out the old ones manually. I have not found functionality like that in the API, plus I am paranoid and want to log everything, even the fact that there were files with unexpected names inside the log directory.
So, below is my attempt at such class, but I get an error when I try to test (run) it:
>>> ================================ RESTART ================================
>>>
Traceback (most recent call last):
  File "C:/AutomationScripts/build scripts/Deleteme.py", line 6, in <module>
    class Logging:
  File "C:/AutomationScripts/build scripts/Deleteme.py", line 42, in Logging
    __clearOldLogs(dummySetting)
  File "C:/AutomationScripts/build scripts/Deleteme.py", line 38, in __clearOldLogs
    _assert(Logger, 'Logger does not exist yet! Why?')
NameError: global name '_assert' is not defined
>>>
Yes, I come from Java/C# background. I am probably not doing things "the Pythonic" way. Please help me do the right thing, and please give a complete answer that would work, instead of simply pointing out holes in my knowledge. I believe I have provided enough of a code sample. Sorry, it would not run without a Settings class, but hopefully you get the idea.
# This file has been tested on Python 2.6.*. For Windows only.
import logging       # For actual logging
import tkMessageBox  # To display an error (if logging is unavailable)
import os, traceback # Needed for os.path.join and traceback.format_stack below
class Logging:
    """
    Logging class provides simplified interface to logging
    as well as provides some handy functions.
    """

    # To be set when the logger is properly configured.
    Logger = None

    #staticmethod
    def _assert(condition, message):
        """ Like a regular assert, except that it should be visible to the user. """
        if condition: return
        # Else log and fail
        if Logger:
            Logger.debug(traceback.format_stack())
            Logger.error('Assert failed: ' + message)
        else:
            tkMessageBox.showinfo('Assert failed: ' + message, traceback.format_stack())
        assert(condition)

    #staticmethod
    def _removeFromEnd(string, endString):
        _assert(string.endswith(endString),
                "String '{0}' does not end in '{1}'.".format(string, endString))
        return string[:-len(endString)]

    def __clearOldLogs(logSettings):
        """
        We want to clear old (by date) files if we get too many.
        We should call this method only after variable 'Logger' has been created.
        """
        # The following check should not be necessary from outside of
        # Logging class, when it is fully defined
        _assert(Logger, 'Logger does not exist yet! Why?')
        # Do more work here

    def __createLogger(logSettings):
        logFileName = logSettings.LogFileNameFunc()
        #_assert(False, 'Test')
        logName = _removeFromEnd(logFileName, logSettings.FileExtension)
        logFileName = os.path.join(logSettings.BaseDir, logFileName)
        # If someone tried to log something before basicConfig is called,
        # Python creates a default handler that goes to the console and will
        # ignore further basicConfig calls. Remove the handler if there is one.
        root = logging.getLogger()
        if root.handlers:
            for handler in root.handlers:
                root.removeHandler(handler)
        logging.basicConfig(filename = logFileName, name = logName, level = logging.DEBUG, format = "%(asctime)s - %(levelname)s - %(message)s")
        logger = logging.getLogger(logName)
        return logger

    # Settings is a separate class (not dependent on this one).
    Logger = __createLogger(Settings.LOGGING)
    __clearOldLogs(Settings.LOGGING)


if __name__ == '__main__':
    # This code section will not run when the class is imported.
    # If it is run directly, then we will print debugging info.
    logger = Logging.Logger
    logger.debug('Test debug message.')
    logger.info('Test info message.')
    logger.warning('Test warning message.')
    logger.error('Test error message.')
    logger.critical('Test critical message.')
Relevant questions, style suggestions and complete answers are welcome. Thanks!
You're getting that exception because you're calling _assert() rather than Logging._assert(). The error message tells you it's looking for _assert() in the module's global namespace rather than in the class namespace; to get it to look in the latter, you have to explicitly specify it.
Of course, in this case, you're trying to do it while the class is still being defined and the name is not available until the class is complete, so it's going to be tough to make that work.
A solution would be to un-indent the following two lines (which I've edited to use the fully qualified name) so that they are outside the class definition; they will be executed right after it.
Logger = Logging.__createLogger(Settings.LOGGING)
Logging.__clearOldLogs(Settings.LOGGING)
A style suggestion that will help: rather than making a class with a bunch of static methods, consider making them top-level functions in the module. Users of your module (including yourself) will find it easier to get to the functionality they want. There's no reason to use a class just to be a container; the module itself is already just such a container.
A module is basically a single *.py file (though you can create modules that have multiple files, this will do for now). When you do an import, what you're importing is a module. In your example, tkMessageBox and logging are both modules. So just make a separate file (making sure its name does not conflict with existing Python module names), save it in the same directory as your main script, and import it in your main script. If you named it mylogging.py then you would import mylogging and access functions in it as mylogging.clearOldLogs() or whatever (similar to how you'd address them now as a class).
"Global" names in Python are not truly global, they're only global to the module they're defined in. So a module is a good way to compartmentalize your functionality, especially parts (like logging) that you anticipate reusing in a lot of your future scripts.
Replace the line
_assert(Logger, 'Logger does not exist yet! Why?')
with
Logging._assert(Logger, 'Logger does not exist yet! Why?')
This is because you define _assert as a static method of the class, and a static method has to be referred to as ClassName.methodName, even when you are calling it from another method of the same class.
Some comments.
Do not use double underscores unless you are absolutely sure that you have to and why.
To access the _assert method from an instance method, you call it with self, like so: self._assert(Logger, 'Logger does not exist yet! Why?'). In a static method, as in your example, you use the class name: Logging._assert(). Python is very explicit.
Classes are only created at the END of the class definition. That's just how it is with Python. But your error is not related to that.
I'm not sure what this code is supposed to do:
# Settings is a separate class (not dependent on this one).
Logger = __createLogger(Settings.LOGGING)
__clearOldLogs(Settings.LOGGING)
But I suspect you should put it in the __init__ method. I do not immediately see any code that needs access to the class during class construction.
__createLogger is a strange function. Isn't the class called Logger? Shouldn't that just be the __init__ of the Logger class?
I think it just needs to be Logging._assert. Python doesn't do namespace resolution the way java does: an unqualified name is either method-local or global. Enclosing scopes like classes won't be searched automatically.
The chapter and verse on how Python deals with names is in section 4.1. Naming and binding in the language reference manual. The bit that's relevant here is probably:
The scope of names defined in a class block is limited to the class block; it does not extend to the code blocks of methods
Contrast this with names defined in functions, which are inherited downwards (into method-local functions and classes):
If the definition occurs in a function block, the scope extends to any blocks contained within the defining one, unless a contained block introduces a different binding for the name
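A tiny illustration of the difference (the names here are arbitrary):

x = "module"

class C:
    x = "class"   # visible inside the class block itself
    y = x         # works: uses the binding made in the class block

    def show(self):
        # The class block is not an enclosing scope for its methods,
        # so this 'x' resolves to the module-level name, not C.x.
        print x

C().show()   # prints: module
print C.y    # prints: class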
