Segmentation fault in destructor with Python

I have made a class to represent my LED strip, and I would like to switch off the strip when I stop it (i.e. when the program stops and the object is destroyed). So, as I would do in C++, I created a destructor to do that. But it looks like Python calls it after it has already destroyed the object, and then I get a segmentation fault.
Here is my class; the destructor just has to call the function that sets the colour of each LED to 0.
class LedStrip:
    def __init__(self, led_count, led_pin, led_freq_hz, led_dma, led_invert, led_brightness, led_channel, color=MyColor(0, 0, 0)):
        self.__strip = Adafruit_NeoPixel(led_count, led_pin, led_freq_hz, led_dma, led_invert, led_brightness, led_channel)
        self.__color = color
        self.__strip.begin()

    def __del__(self):
        self.__color = MyColor(0, 0, 0)
        self.colorWipe(10)

    # ATTRIBUTES (getter/setter)
    @property
    def color(self):
        return self.__color

    @color.setter
    def color(self, color):
        if isinstance(color, MyColor):
            self.__color = color
        else:
            self.__color = MyColor(0, 0, 0)

    def __len__(self):
        return self.__strip.numPixels()

    # METHODS
    def colorWipe(self, wait_ms=50):
        """Wipe color across display a pixel at a time."""
        color = self.__color.toNum()
        for i in range(self.__strip.numPixels()):
            self.__strip.setPixelColor(i, color)
            self.__strip.show()
            time.sleep(wait_ms / 1000.0)
MyColor is just a class that I made to represent an RGB colour. What would be the correct way to achieve this task in Python? I come from C++, so my OOP habits are very C++ oriented and I have some difficulty thinking in a Pythonic way.
Thanks in advance

You have to be very careful when writing __del__ methods (finalizers). They can be called at virtually any time after an object is no longer referenced (it doesn’t necessarily happen immediately) and there's really no guarantee that they'll be called at interpreter exit time. If they do get called during interpreter exit, other objects (such as global variables and other modules) might already have been cleaned up, and therefore unavailable to your finalizer. They exist so that objects can clean up state (such as low-level file handles, connections, etc.), and don't function like C++ destructors. In my Python experience, you rarely need to write your own __del__ methods.
There are other mechanisms you could use here. One choice would be try/finally:
leds = LedStrip(...)
try:
    ...  # application logic to interact with the LEDs
finally:
    leds.clear()  # or whatever logic you need to clear the LEDs to zero
This is still pretty explicit. If you want something a bit more implicit, you could consider using the Python context manager structure instead. To use a context manager, you use the with keyword:
with open("file.txt", "w") as outfile:
    outfile.write("Hello!\n")
The with statement calls the special __enter__ method to initialize the "context". When the block ends, the __exit__ method will be called to end the "context". For the case of a file, __exit__ would close the file. The key is that __exit__ will be called even if an exception occurs inside the block (kind of like finally on a try block).
You could implement __enter__ and __exit__ on your LED strip, then write:
with LedStrip(...) as leds:
    ...  # do whatever you want with the leds
and when the block ends, the __exit__ method could reset the state of all the LEDs.
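A minimal sketch of those two methods on your class, assuming the LedStrip posted in the question (the reset logic just reuses your colorWipe):

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs even if the with-block raised an exception: blank the strip.
        self.color = MyColor(0, 0, 0)
        self.colorWipe(10)
        return False  # do not suppress exceptions

Usage (the constructor arguments are hypothetical wiring values; use whatever matches your hardware):

with LedStrip(30, 18, 800000, 10, False, 255, 0) as leds:
    leds.color = MyColor(255, 0, 0)
    leds.colorWipe()
# the strip has been cleared here, even if the block raised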

Let's put it this way. Firstly, the "...as I would do in C++" approach is not appropriate, as I'm sure you know yourself. It goes without saying that Python is a totally different language, but in this particular case it should be stressed, because Python's memory management is quite different from C++'s: Python uses reference counting, and when an object's reference count drops to zero its memory is released (i.e. the object is garbage collected), and so on.
Python user-defined classes sometimes do need to define a __del__() method, but it is not a destructor in any sense (certainly not in the C++ sense); it is a finalizer. Moreover, it is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits. You can invoke __del__() explicitly, but that shouldn't be necessary in your case; instead, I would advise making the LED switch-off an explicit method rather than relying on Python's internals. Just as the Zen of Python (the import this command) puts it:
Explicit is better than implicit.
For more information on __del__(), check this good answer. For more on reference counting check this article.

Related

Observe when variable is deleted

I'm using Python as a wrapper to a library that, for desired reasons, keeps certain objects in memory until the process is killed and system GC removes them (or, a command is sent to explicitly remove them).
A user can retrieve references to one of these objects using a Python function, so I know when a user has accessed them, but I don't know when a user is done accessing them.
My question is: is it possible in Python to observe when a variable is deleted (because of reassignment, going out of scope, garbage collection, etc.)? Can I observe state changes on variables at all (similar to Swift's didSet/willSet)?
Python calls the __del__ magic method when it is about to destroy an object.
You could override it and add your logic.
class ObserveDel:
    def __del__(self):
        ...  # do your stuff
Or patch it onto an existing class after the fact. Note that special methods are looked up on the type, not on the instance, so the hook has to be assigned to the class rather than to an individual object:
def _handle_del(obj):
    ...  # do your stuff

type(a).__del__ = _handle_del
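For instance, a minimal runnable sketch of the class-defined hook (the print stands in for whatever bookkeeping you need):

import gc

class ObserveDel:
    def __del__(self):
        print("instance is about to be reclaimed")  # do your stuff here

obj = ObserveDel()
del obj        # CPython reclaims it immediately via reference counting
gc.collect()   # other implementations (e.g. PyPy) may only finalize on a collection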

Difference between calling a method vs using the field from __init__ in python within a class?

So I have a class with a couple of methods defined as:
class Recognizer(object):
    def __init__(self):
        self.image = None
        self.reduced_image = None

    def load_image(self, path):
        self.image = cv2.imread(path)
        return self.image
Say I want to add a third method that uses the return value of load_image(). Should I define it like this:
def shrink_image(self):
    self.reduced_img = cv2.resize(self.image, (300, 300))
    return self.reduced_img
Or should I define it like this:
def shrink_image(self, path):
    reduced_img = cv2.resize(self.load_image(path), (300, 300))
    return reduced_img
What exactly is the difference between the two? I can see that I have access to the fields set in __init__ from any method I declare in the class, so I guess that if I update those fields I can read them back from the instance later.
Is there a consensus on which way is better?
What exactly is the difference between the two?
In Python, the method named __init__ initializes a newly created object (the closest thing to a constructor); it is invoked implicitly when you call the class, as in Recognizer().
The term "better" is vague, because in the former example you are saving the image as an attribute on the object, hence making the object larger.
But in the second example you are simply returning the data from the function, to be used by the caller.
So it's a matter of context and style.
A simple rule of thumb: if you are going to use reduced_img elsewhere in the context of the Recognizer object, it is ideal to save it as an attribute on the object, accessed via self. If the caller simply consumes reduced_img and Recognizer is unaware of any state changes, then it's fine to just return it from the function.
In the second way the variable is scoped to the shrink_image function.
In the first way the variable is scoped to the object's lifetime, and setting self.reduced_img is a side effect of the method.
Looking only at your code sample, without seeing callers, the second case is "better", because reduced_img isn't used anywhere else and it is unnecessary to bind it to the instance. There may well be a use case where you need to persist the result of the last shrink_image call, which would make self.reduced_img a necessary side effect.
In general it is extremely helpful to minimize side effects. Having side effects especially ones that mutate state can make reasoning about your program more difficult.
This is especially seen when you have multiple accessors to your object.
Imagine having the first shrink_image: you release your program, there is a single caller at a single call site calling shrink_image, easy peasy. After the call, self.reduced_img will be the result.
Now imagine sharing the object between multiple call sites. It introduces a temporal coupling: you can no longer assume what reduced_img holds, and its value before you call shrink_image may no longer be None, because there may be other callers!
Compare this to the second shrink_image: callers no longer share mutable state, and it's easier to reason about the state of a Recognizer instance across shrink_image calls.
Something really nuts happens with the first example when concurrent calls are introduced: it goes from being hard to reason about and potentially logically incorrect to being a synchronization and data-race issue.
Without concurrent callers this isn't going to be an issue. But it's definitely a possibility: if you use this call in a web framework and share a single instance between multiple web workers, you get this implicit concurrency and could potentially be subject to race conditions.
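To make the shared-state concern concrete, here is a minimal usage sketch (the image paths are hypothetical, and the two shrink_image variants are the alternatives from the question, not methods of the same class):

rec = Recognizer()

# First (stateful) variant: the result is also stored on the instance.
rec.load_image("cat.jpg")
small_cat = rec.shrink_image()       # rec.reduced_img now holds the shrunken cat
rec.load_image("dog.jpg")
small_dog = rec.shrink_image()       # rec.reduced_img silently changed to the dog
# any other code still reading rec.reduced_img now sees the dog

# Second (stateless) variant: each caller keeps its own result.
small_cat = rec.shrink_image("cat.jpg")
small_dog = rec.shrink_image("dog.jpg")   # no reduced_img attribute mutated behind small_cat's back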

Cleaning up in pypy

I've been looking for ways to clean up objects in python.
I'm currently using pypy.
I found a web page and an example.
First a basic example:
class FooType(object):
    def __init__(self, id):
        self.id = id
        print self.id, 'born'

    def __del__(self):
        print self.id, 'died'

ft = FooType(1)
This SHOULD print:
1 born
1 died
BUT it just prints
1 born
So my question is: how do I clean anything up in PyPy?
When you need a "cleanup" to run for sure at a specific time, use a context manager.
class FooType(object):
    def __init__(self, id):
        self.id = id
        print 'born'

    def __enter__(self):
        print 'entered'
        return self

    def __exit__(self, *exc):
        print 'exited'

with FooType(1) as ft:
    pass  # do something with ft
The same way you should do it in every other Python implementation, including CPython: By explicitly managing the lifetime in code, rather than by relying on automatic memory management and __del__.
Even in CPython, there's more than one case where __del__ is not called at all, and quite a few cases where it's called much later than you might expect (e.g., any time any object gets caught up in a reference cycle, which can happen quite easily). That means it's essentially useless, except perhaps to debug lifetime issues (but there are memory profilers for that!) and as a last resort if some code neglects cleaning up explicitly.
By being explicit about cleanup, you sidestep all these issues. Have a method that does the cleanup, and make client code call it at the right time. Context managers can make this easier to get right in the face of exceptions, and more readable. It often also allows cleaning up sooner than __del__, even if reference counting "immediately" calls __del__. For example, this:
def parse_file(path):
    f = open(path)
    return parse(f.read())  # file stays open during parsing
is worse than this w.r.t. resource usage:
def parse_file(path):
    with open(path) as f:
        s = f.read()
    # file is closed here
    return parse(s)
I would also argue that such a design is cleaner, because it doesn't confuse the lifetime of the resource wrapper object with the lifetime of the wrapped resource. Sometimes, it can make sense to have that object outlive the resource, or even make it take ownership of a new resource.
In your example, __del__ is not called, but that's only because the test program you wrote finishes immediately. PyPy guarantees that __del__ is called some time after the object is no longer reachable, but only as long as the program continues to execute. So if you do ft = FooType(1) in an infinite loop, it will after some time print the 'died' line too.
As the other answers explain, CPython doesn't really guarantee anything, but in simple cases (e.g. no reference cycles) it will call __del__ immediately and reliably. Still, the point is that you shouldn't strictly rely on this.

Creating a hook to a frequently accessed object

I have an application which relies heavily on a Context instance that serves as the access point to the context in which a given calculation is performed.
If I want to provide access to the Context instance, I can:
rely on global
pass the Context as a parameter to all the functions that require it
I would rather not use global variables, and passing the Context instance to all the functions is cumbersome and verbose.
How would you "hide, but make accessible" the calculation Context?
For example, imagine that Context simply computes the state (position and velocity) of planets according to different data.
class Context(object):
    def state(self, planet, epoch):
        """base class --- suppose `state` is meant
        to return a tuple of vectors."""
        raise NotImplementedError("provide an implementation!")

class DE405Context(Context):
    """Concrete context using DE405 planetary ephemeris"""
    def state(self, planet, epoch):
        """suppose that a de405 reader exists and can provide
        the required (position, velocity) tuple."""
        return de405reader(planet, epoch)

def angular_momentum(planet, epoch, context):
    """suppose we care about the angular momentum of the planet,
    and that `cross` exists"""
    r, v = context.state(planet, epoch)
    return cross(r, v)

# a second alternative, a "Calculator" class that contains the context
class Calculator(object):
    def __init__(self, context):
        self._ctx = context

    def angular_momentum(self, planet, epoch):
        r, v = self._ctx.state(planet, epoch)
        return cross(r, v)

# use as follows:
my_context = DE405Context()
now = now()  # assume this function returns an epoch

# first case:
print angular_momentum("Saturn", now, my_context)

# second case:
calculator = Calculator(my_context)
print calculator.angular_momentum("Saturn", now)
Of course, I could add all the operations directly into "Context", but it does not feel right.
In real life, the Context not only computes positions of planets! It computes many more things, and it serves as the access point to a lot of data.
So, to make my question more succinct: how do you deal with objects which need to be accessed by many classes?
I am currently exploring Python's context managers, but without much luck. I also thought about dynamically adding a context attribute to each function directly (functions are objects, so they can hold a reference to an arbitrary object), i.e.:
def angular_momentum(planet, epoch):
    r, v = angular_momentum.ctx.state(planet, epoch)
    return cross(r, v)

# somewhere before calling anything...
import angular_momentum
angular_momentum.ctx = my_context
edit
Something that would be great, is to create a "calculation context" with a with statement, for example:
with my_context:
    h = angular_momentum("Earth", now)
Of course, I can already do that if I simply write:
with my_context as ctx:
    h = angular_momentum("Earth", now, ctx)  # first implementation above
Maybe a variation of this with the Strategy pattern?
You generally don't want to "hide" anything in Python. You may want to signal human readers that they should treat it as "private", but this really just means "you should be able to understand my API even if you ignore this object", not "you can't access this".
The idiomatic way to do that in Python is to prefix it with an underscore—and, if your module might ever be used with from foo import *, add an explicit __all__ global that lists all the public exports. Again, neither of these will actually prevent anyone from seeing your variable, or even accessing it from outside after import foo.
See PEP 8 on Global Variable Names for more details.
Some style guides suggest special prefixes, all-caps-names, or other special distinguishing marks for globals, but PEP 8 specifically says that the conventions are the same, except for the __all__ and/or leading underscore.
Meanwhile, the behavior you want is clearly that of a global variable—a single object that everyone implicitly shares and references. Trying to disguise it as anything other than what it is will do you no good, except possibly for passing a lint check or a code review that you shouldn't have passed. All of the problems with global variables come from being a single object that everyone implicitly shares and references, not from being directly in the globals() dictionary or anything like that, so any decent fake global is just as bad as a real global. If that truly is the behavior you want, make it a global variable.
Putting it together:
# do not include _context here
__all__ = ['Context', 'DE405Context', 'Calculator', …
_context = Context()
Also, of course, you may want to call it something like _global_context or even _private_global_context, instead of just _context.
But keep in mind that globals are still members of a module, not of the entire universe, so even a public context will still be scoped as foo.context when client code does an import foo. And this may be exactly what you want. If you want a way for client scripts to import your module and then control its behavior, maybe foo.context = foo.Context(…) is exactly the right way. Of course this won't work in multithreaded (or gevent/coroutine/etc.) code, and it's inappropriate in various other cases, but if that's not an issue, in some cases, this is fine.
Since you brought up multithreading in your comments: In the simple style of multithreading where you have long-running jobs, the global style actually works perfectly fine, with a trivial change—replace the global Context with a global threading.local instance that contains a Context. Even in the style where you have small jobs handled by a thread pool, it's not much more complicated. You attach a context to each job, and then when a worker pulls a job off the queue, it sets the thread-local context to that job's context.
However, I'm not sure multithreading is going to be a good fit for your app anyway. Multithreading is great in Python when your tasks occasionally have to block for IO and you want to be able to do that without stopping other tasks—but, thanks to the GIL, it's nearly useless for parallelizing CPU work, and it sounds like that's what you're looking for. Multiprocessing (whether via the multiprocessing module or otherwise) may be more of what you're after. And with separate processes, keeping separate contexts is even simpler. (Or, you can write thread-based code and switch it to multiprocessing, leaving the threading.local variables as-is and only changing the way you spawn new tasks, and everything still works just fine.)
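To make the thread-local idea from the previous paragraph concrete, here is a minimal sketch; it reuses the Context class and the cross helper assumed in the question, and the set_context/current_context names are just illustrative:

import threading

_local = threading.local()

def set_context(ctx):
    _local.context = ctx

def current_context():
    return _local.context

def angular_momentum(planet, epoch):
    r, v = current_context().state(planet, epoch)
    return cross(r, v)

# In each worker thread (or per job pulled from the queue):
#     set_context(job.context)
#     h = angular_momentum("Saturn", job.epoch)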
It may make sense to provide a "context" in the context manager sense, as an external version of the standard library's decimal module did, so someone can write:
with foo.Context(…):
    # do stuff under custom context
# back to default context
However, nobody could really think of a good use case for that (especially since, at least in the naive implementation, it doesn't actually solve the threading/etc. problem), so it wasn't added to the standard library, and you may not need it either.
If you want to do this, it's pretty trivial. If you're using a private global, just add this to your Context class:
def __enter__(self):
    global _context
    self._stashedcontext = _context
    _context = self

def __exit__(self, *args):
    global _context
    _context = self._stashedcontext
And it should be obvious how to adjust this to public, thread-local, etc. alternatives.
Another alternative is to make everything a member of the Context object. The top-level module functions then just delegate to the global context, which has a reasonable default value. This is exactly how the standard library random module works—you can create a random.Random() and call randrange on it, or you can just call random.randrange(), which calls the same thing on a global default random.Random() object.
If creating a Context is too heavy to do at import time, especially if it might not get used (because nobody might ever call the global functions), you can use the singleton pattern to create it on first access. But that's rarely necessary. And when it's not, the code is trivial. For example, the source to random, starting at line 881, does this:
_inst = Random()
seed = _inst.seed
random = _inst.random
uniform = _inst.uniform
…
And that's all there is to it.
And finally, as you suggested, you could make everything a member of a different Calculator object which owns a Context object. This is the traditional OOP solution; overusing it tends to make Python feel like Java, but using it when it's appropriate is not a bad thing.
You might consider using a proxy object; here's a library that helps in creating object proxies:
http://pypi.python.org/pypi/ProxyTypes
Flask uses object proxies for its "current_app", "request" and other variables; all it takes to reference them is:
from flask import request
You could create a proxy object that is a reference to your real context, and use thread locals to manage the instances (if that would work for you).
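For illustration, here is a minimal hand-rolled proxy (this is not the ProxyTypes API, just the idea): it forwards attribute access to whatever Context is currently installed, e.g. the module-private _context from the other answer:

class ContextProxy(object):
    def __init__(self, lookup):
        self._lookup = lookup  # zero-argument callable returning the real Context

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself,
        # so Context attributes and methods are forwarded transparently.
        return getattr(self._lookup(), name)

context = ContextProxy(lambda: _context)  # or a thread-local lookup instead of _context

# Client code can then write context.state(planet, epoch), and the call is
# routed to whichever Context object is currently installed.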

How to do cleanup reliably in python?

I have some ctypes bindings, and for each body.New I should call body.Free. The library I'm binding doesn't have its allocation routines insulated from the rest of the code (they can be called just about anywhere), and to use a couple of useful features I need to create cyclic references.
I think it would be solved if I could find a reliable way to hook a destructor to an object. (Weakrefs would help if they gave me the callback just before the data is dropped.)
So obviously this code megafails when I put in velocity_func:
class Body(object):
    def __init__(self, mass, inertia):
        self._body = body.New(mass, inertia)

    def __del__(self):
        print '__del__ %r' % self
        if body:
            body.Free(self._body)

    ...

    def set_velocity_func(self, func):
        self._body.contents.velocity_func = ctypes_wrapping(func)
I also tried to solve it with weakrefs; with those, things just seem to get worse and much more unpredictable.
Even if I don't put in the velocity_func, cycles appear at least when I do this:
class Toy(object):
    def __init__(self, body):
        self.body = body
        self.body.owner = self

    ...

def collision(a, b, contacts):
    whatever(a.body.owner)
So how to make sure Structures will get garbage collected, even if they are allocated/freed by the shared library?
There's repository if you are interested about more details: http://bitbucket.org/cheery/ctypes-chipmunk/
What you want to do, that is, create an object that allocates things and then deallocates them automatically when the object is no longer in use, is almost impossible in Python, unfortunately. The __del__ method is not guaranteed to be called, so you can't rely on that.
The standard way in Python is simply:
try:
    allocate()
    dostuff()
finally:
    cleanup()
Or since 2.5 you can also create context-managers and use the with statement, which is a neater way of doing that.
But both of these are primarily for when you allocate/lock at the beginning of a code snippet. If you want to have things allocated for the whole run of the program, you need to allocate the resource at startup, before the main code of the program runs, and deallocate it afterwards. There is one situation which isn't covered here, and that is when you want to allocate and deallocate many resources dynamically and use them in many places in the code, for example if you want a pool of memory buffers or similar. But most of those cases are for memory, which Python will handle for you, so you don't have to bother about those. There are of course cases where you want dynamic pool allocation of things that are NOT memory, and then you would want the type of deallocation you attempt in your example, and that is tricky to do in Python.
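For the common case, though, the explicit style can be wrapped up neatly. Here is a minimal sketch around the body.New/body.Free pair from the question, exposing both a close() method and a context manager (the mass/inertia values are arbitrary):

class ManagedBody(object):
    def __init__(self, mass, inertia):
        self._body = body.New(mass, inertia)

    def close(self):
        # Safe to call more than once.
        if self._body is not None:
            body.Free(self._body)
            self._body = None

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()

with ManagedBody(10.0, 150.0) as b:
    pass  # use b._body with the rest of the bindings here
# body.Free has run at this point, whether or not an exception occurred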
If weakrefs aren't broken, I guess this may work:
from ctypes import cast, c_void_p
from weakref import ref

pointers = set()

class Pointer(object):
    def __init__(self, cfun, ptr):
        pointers.add(self)
        self.ref = ref(ptr, self.cleanup)
        self.data = cast(ptr, c_void_p).value  # Python's cast is smart, but it can't be smarter than this.
        self.cfun = cfun

    def cleanup(self, obj):
        print 'cleanup 0x%x' % self.data
        self.cfun(self.data)
        pointers.remove(self)

def cleanup(cfun, ptr):
    Pointer(cfun, ptr)
I have yet to try it. The important piece is that the Pointer doesn't hold any strong reference to the foreign pointer, only an integer. This should work if ctypes doesn't free memory that I'm supposed to free with the bindings. Yeah, it's basically a hack, but I think it may work better than the earlier things I've been trying.
Edit: Tried it, and it seems to work after some fine-tuning of my code. A surprising thing is that even after I removed __del__ from all of my structures, it still seems to fail. Interesting but frustrating.
Neither approach works; by some weird chance I've been able to drop the cyclic references in places, but things stay broken.
Edit: Well... weakrefs WERE broken after all! So there's likely no solution for reliable cleanup in Python, other than forcing it to be explicit.
In CPython, __del__ is a reliable destructor of an object, because it will always be called when the reference count reaches zero (note: there may be cases - like circular references between items that define __del__ - where the reference count never reaches zero, but that is another issue).
Update
From the comments, I understand the problem is related to the order of destruction of objects: body is a global object, and it is being destroyed before all other objects, thus it is no longer available to them.
Actually, using global objects is not good; not only because of issues like this one, but also because of maintenance.
I would then change your class to something like this:
class Body(object):
    def __init__(self, mass, inertia):
        self._bodyref = body
        self._body = body.New(mass, inertia)

    def __del__(self):
        print '__del__ %r' % self
        if self._bodyref:
            self._bodyref.Free(self._body)

    ...

    def set_velocity_func(self, func):
        self._body.contents.velocity_func = ctypes_wrapping(func)
A couple of notes:
The change adds (and uses) a reference to the global body object on each instance, so the library wrapper will live at least as long as all the objects created from this class.
Still, using a global object is not good for unit testing or maintenance; better would be to have a factory for the object that sets the correct "body" on the class and, in unit tests, can easily substitute a mock object. But that's really up to you and how much effort you think makes sense for this project.
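For example, a minimal sketch of such a factory (the names are illustrative): it binds the library wrapper once, and a unit test can pass in a fake with the same New/Free interface:

def make_body_class(lib):
    class Body(object):
        def __init__(self, mass, inertia):
            self._lib = lib                     # keeps the wrapper alive with the instance
            self._body = lib.New(mass, inertia)

        def __del__(self):
            if self._lib is not None:
                self._lib.Free(self._body)
    return Body

Body = make_body_class(body)                    # production: the real ctypes wrapper
# FakeBody = make_body_class(mock_library)      # tests: a stub providing New/Free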
