How should you create properties when using asyncio? - python

When creating a class which uses asyncio I've found myself in a situation where a property getter needs to perform an IO operation, so the getter should be a coroutine. However, awaiting a property feels unusual.
Here is a minimal working example of what I mean. The code is valid and runs.
import asyncio

class Person:
    """A class that represents a person"""

    def __init__(self, forename, surname):
        self.forename = forename
        self.surname = surname

    @property
    async def fullname(self):
        """Perform an IO operation and return something.
        This could be looking something up in a database for example.
        """
        await asyncio.sleep(0.1)
        return f"{self.forename} {self.surname}"
async def main():
    john = Person("John", "Smith")
    # Let's print out the forename here, using the standard property format
    print(john.forename)
    # When printing the full name we must introduce an await, which feels awkward.
    print(await john.fullname)

# Start the loop and run the main function
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
loop.close()
Is this the correct way of doing this?

Short answer: don't do this.
Longer answer: as mentioned in PEP 8:
Avoid using properties for computationally expensive operations; the attribute notation makes the caller believe that access is (relatively) cheap.
So anything requiring IO is obviously not a candidate for a property. FWIW, we don't only expect attribute access to be cheap, we also expect it to be safe (would you expect an attribute access to possibly raise an IOError, a database error, a socket error, or anything similar?)
FWIW, you mention that "awaiting a property feels unusual", which should already answer your question. Actually, as far as I'm concerned, the mere idea of an "async property" strikes me as just totally insane: properties are (semantically) about the object's state, and I just can't make sense of the concept of "async state".
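If the lookup has to stay asynchronous, the usual alternative is a plain coroutine method rather than a property, so the call site reads as an ordinary method call. A minimal sketch of the Person class from the question rewritten that way (the get_fullname name is my own choice, not from the question):

```python
import asyncio

class Person:
    """The Person class from the question, with the property replaced
    by an ordinary coroutine method."""

    def __init__(self, forename, surname):
        self.forename = forename
        self.surname = surname

    async def get_fullname(self):
        await asyncio.sleep(0.1)  # stands in for a database lookup
        return f"{self.forename} {self.surname}"

async def main():
    john = Person("John", "Smith")
    # An explicit method call makes the IO (and the await) unsurprising.
    print(await john.get_fullname())  # -> John Smith

asyncio.run(main())
```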

Is this the correct way of doing this?
It is, except for the stylistic question of whether a property should return an awaitable. The other answer argues against the practice on the grounds of common sense, but also based on the following quote from PEP 8:
Avoid using properties for computationally expensive operations; the attribute notation makes the caller believe that access is (relatively) cheap.
As written, this does not imply that properties should not return awaitables, for two reasons:
Accessing the property with the attribute notation is extremely cheap, because it only creates an awaitable (a coroutine object in case of a coroutine). It is only when you await the resulting object that you can suspend, and that is clearly marked with the use of an await.
Awaiting something is not computationally expensive - in fact, doing something computationally expensive in a coroutine is forbidden because it would interfere with other tasks. An await either immediately returns the value, or it suspends the enclosing coroutine. The latter can certainly take time (but that is the whole point of using await), but it is definitely not expensive in terms of CPU.
I believe the idea behind the PEP8 warning is that a simple attribute access shouldn't result in state change or a long pause. As argued above, that holds for async properties as well, since the access only gives you the coroutine object. On the other hand, if you then go on to explicitly await that object, you're not only allowing, but actually requesting the resolution of the awaitable. This is not much different from how <some list>.append gives you the bound method object without doing anything, but if you then call that object, the call will change the list.
In conclusion, if returning an awaitable from a property "feels wrong", then just don't do it, and use a method instead. But PEP 8 does not, as far as I can tell, oppose the practice.
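To make the first point concrete, here is a small sketch using the Person class from the question: the attribute access alone only builds a coroutine object, and nothing runs until it is awaited.

```python
import asyncio

class Person:
    def __init__(self, forename, surname):
        self.forename = forename
        self.surname = surname

    @property
    async def fullname(self):
        await asyncio.sleep(0.1)  # stands in for the IO operation
        return f"{self.forename} {self.surname}"

john = Person("John", "Smith")

# The attribute access is extremely cheap: it only creates a coroutine object.
coro = john.fullname
print(type(coro).__name__)  # -> coroutine

# Only awaiting it (here via asyncio.run) actually performs the work.
print(asyncio.run(coro))    # -> John Smith
```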

Is it possible to create a decorator inside a class?

Or maybe you can think of an alternative solution to what I'm trying to do. Basically I have a class with a bunch of functions that I want to wrap in try/except blocks to intercept the KeyboardInterrupt error as I have a function that tidily cleans up each of my functions.
Instead of putting huge try/except blocks in each function I figured I could create a decorator to do that, but I'm running into some issues. So far I have something like this
import sys

class MyClass:
    def catch_interrupt(self, func):
        def catcher():
            try:
                func()
            except KeyboardInterrupt:
                self.End()
        return catcher

    @catch_interrupt
    def my_func(self):
        # Start a long process that might need to be interrupted
        pass

    def End(self):
        # Cleans up things
        sys.exit()
The issue when I run this is that I get the error
TypeError: catch_interrupt() takes exactly 2 arguments (1 given)
Is this even possible? Is there a better way, or should I really just put try/except blocks around each function's innards?
It is indeed possible to create a decorator inside a class, but your implementation is faulty:
First of all, catch_interrupt() can't take self.
@catch_interrupt
def my_func(self):
    # Start a long process that might need to be interrupted
is equivalent to
def my_func(self):
    # Start a long process that might need to be interrupted
my_func = catch_interrupt(my_func)
Obviously this doesn't allow self.
Secondly, your inner wrapper function that you return from the decorator needs to at least take self as an argument and pass it on to func, as the functions you will be decorating expect self as their first arguments.
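Putting both fixes together, one way to write it looks like this (a sketch; note that catch_interrupt is deliberately written without self, because at class-definition time it receives only the function being decorated):

```python
class MyClass:
    def catch_interrupt(func):  # no self: at decoration time this gets only the function
        def catcher(self, *args, **kwargs):
            try:
                return func(self, *args, **kwargs)
            except KeyboardInterrupt:
                self.End()
        return catcher

    @catch_interrupt
    def my_func(self):
        # Stands in for a long process that gets interrupted.
        raise KeyboardInterrupt

    def End(self):
        print("cleaning up")  # the real code would call sys.exit() here

obj = MyClass()
obj.my_func()  # the interrupt is caught and End() runs
```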
You may also want to call your internal decorator _catch_interrupt to hint that it's meant for internal usage. That won't prevent anyone from calling it, but it's good practice given the behavior will be incorrect if called on an instance of the class (e.g. MyClass().catch_interrupt() will attempt to decorate the MyClass instance itself, which you probably don't want).
My suggestion, however, would be to implement a context manager instead and have it perform your cleanup. That is more Pythonic for the cases where you're just enclosing a group of statements, and if you implement it right you can actually use it as a decorator too.
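A sketch of that context-manager approach using contextlib.contextmanager (the interrupt_guard name is my own):

```python
from contextlib import contextmanager

class MyClass:
    @contextmanager
    def interrupt_guard(self):
        """Run the enclosed block, cleaning up on KeyboardInterrupt."""
        try:
            yield
        except KeyboardInterrupt:
            self.End()

    def End(self):
        print("cleaning up")

obj = MyClass()
with obj.interrupt_guard():
    pass  # long-running work goes here; a KeyboardInterrupt triggers End()
```

Since Python 3.2, objects produced by @contextmanager can in many cases also be used as decorators, so one guard can cover both styles.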
I am not sure you can use decorators inside a class in the way you propose. I think this would be counter-intuitive to the purpose of decorators themselves (a hack, so to say).
What is wrong with a try-except block? You already put all the cleanup code in a function so are adhering to the DRY principle. A decorator and/or wrapper would only limit the flexibility of your error handling, by fixing that the try-except statement wraps your entire function(s), without providing any real added benefit.

Python: store expected Exceptions in function attributes

Is it Pythonic to store the expected exceptions of a function as attributes of the function itself, or is it just a stinking bad practice?
Something like this
class MyCoolError(Exception):
    pass

def function(*args):
    """
    :raises: MyCoolError
    """
    # do something here
    if some_condition:
        raise MyCoolError

function.MyCoolError = MyCoolError
And there in other module
try:
    function(...)
except function.MyCoolError:
    # ...
Pro: Anywhere I have a reference to my function, I also have a reference to the exception it can raise, and I don't have to import it explicitly.
Con: I "have" to repeat the name of the exception to bind it to the function. This could be done with a decorator, but it is also added complexity.
EDIT
Why I am doing this: I append some methods in an irregular way to some classes, where I think a mixin is not worth it. Let's call it "tailored added functionality". For instance, let's say:
Class A uses method fn1 and fn2
Class B uses method fn2 and fn3
Class C uses fn4 ...
And like this for about 15 classes.
So when I call obj_a.fn2(), I have to explicitly import the exception it may raise (and it is not in the module where classes A, B, or C live, but in another one where the shared methods live)... which I think is a little bit annoying. Apart from that, the standard style in the project I'm working on forces one import per line, so it gets pretty verbose.
In some code I have seen exceptions stored as class attributes, and I have found it pretty useful, like:
try:
    obj.fn()
except obj.MyCoolError:
    ...
I think it is not Pythonic. I also think that it does not provide much of an advantage over the standard way, which is to just import the exception along with the function.
There is a reason (besides helping the interpreter) why Python programs use import statements to state where their code comes from: it helps in finding the code of the facilities (e.g. your exception in this case) you are using.
The whole idea has the smell of the declaration of exceptions, as is possible in C++ and partly mandatory in Java. There are discussions amongst the language lawyers whether this is a good idea or a bad one, and in the Python world the designers decided against it, so it is not Pythonic.
It also raises a whole bunch of further questions. What happens if your function A is using another function B which then, later, is changed so that it can throw an exception (a valid thing in Python). Are you willing to change your function A then to reflect that (or catch it in A)? Where would you want to draw the line — is using int(text) to convert a string to int reason enough to "declare" that a ValueError can be thrown?
All in all I think it is not Pythonic, no.

Cleaning up in pypy

I've been looking for ways to clean up objects in python.
I'm currently using pypy.
I found a web page and an example.
First a basic example:
class FooType(object):
    def __init__(self, id):
        self.id = id
        print self.id, 'born'

    def __del__(self):
        print self.id, 'died'

ft = FooType(1)
This SHOULD print:
1 born
1 died
BUT it just prints
1 born
So my question is: how do I clean anything up in PyPy?
When you need a "cleanup" to run for sure at a specific time, use a context manager.
class FooType(object):
    def __init__(self, id):
        self.id = id
        print 'born'

    def __enter__(self):
        print 'entered'
        return self

    def __exit__(self, *exc):
        print 'exited'

with FooType(1) as ft:
    pass  # do something with ft
The same way you should do it in every other Python implementation, including CPython: By explicitly managing the lifetime in code, rather than by relying on automatic memory management and __del__.
Even in CPython, there's more than one case where __del__ is not called at all, and quite a few cases where it's called much later than you might expect (e.g., any time any object gets caught up in a reference cycle, which can happen quite easily). That means it's essentially useless, except perhaps to debug lifetime issues (but there are memory profilers for that!) and as a last resort if some code neglects cleaning up explicitly.
By being explicit about cleanup, you sidestep all these issues. Have a method that does the cleanup, and make client code call it at the right time. Context managers can make this easier to get right in the face of exceptions, and more readable. It often also allows cleaning up sooner than __del__, even if reference counting "immediately" calls __del__. For example, this:
def parse_file(path):
    f = open(path)
    return parse(f.read())  # file stays open during parsing
is worse than this w.r.t. resource usage:
def parse_file(path):
    with open(path) as f:
        s = f.read()
    # file is closed here
    return parse(s)
I would also argue that such a design is cleaner, because it doesn't confuse the lifetime of the resource wrapper object with the lifetime of the wrapped resource. Sometimes, it can make sense to have that object outlive the resource, or even make it take ownership of a new resource.
In your example, __del__ is not called, but that's only because the test program you wrote finishes immediately. PyPy guarantees that __del__ is called some time after the object is no longer reachable, but only as long as the program continues to execute. So if you do ft = FooType(1) in an infinite loop, it will eventually print 'died' too.
As the other answers explain, CPython doesn't really guarantee anything, but in simple cases (e.g. no reference cycles) it will call __del__ immediately and reliably. Still, the point is that you shouldn't strictly rely on this.
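A small sketch of that difference (the deaths list is just an observable stand-in for the print): on CPython the finalizer runs as soon as the last reference disappears, while on PyPy it only runs after a garbage collection, which gc.collect() can force.

```python
import gc

deaths = []

class FooType(object):
    def __init__(self, id):
        self.id = id

    def __del__(self):
        deaths.append(self.id)  # observable stand-in for the print

ft = FooType(1)
ft = None      # the object becomes unreachable here
gc.collect()   # PyPy runs __del__ only after a collection;
               # CPython already ran it when the refcount hit zero
print(deaths)  # -> [1]
```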

Creating a hook to a frequently accessed object

I have an application which relies heavily on a Context instance that serves as the access point to the context in which a given calculation is performed.
If I want to provide access to the Context instance, I can:
rely on global
pass the Context as a parameter to all the functions that require it
I would rather not use global variables, and passing the Context instance to all the functions is cumbersome and verbose.
How would you "hide, but make accessible" the calculation Context?
For example, imagine that Context simply computes the state (position and velocity) of planets according to different data.
class Context(object):
    def state(self, planet, epoch):
        """base class --- suppose `state` is meant
        to return a tuple of vectors."""
        raise NotImplementedError("provide an implementation!")

class DE405Context(Context):
    """Concrete context using DE405 planetary ephemeris"""
    def state(self, planet, epoch):
        """suppose that de405reader exists and can provide
        the required (position, velocity) tuple."""
        return de405reader(planet, epoch)

def angular_momentum(planet, epoch, context):
    """suppose we care about the angular momentum of the planet,
    and that `cross` exists"""
    r, v = context.state(planet, epoch)
    return cross(r, v)

# a second alternative, a "Calculator" class that contains the context
class Calculator(object):
    def __init__(self, context):
        self._ctx = context

    def angular_momentum(self, planet, epoch):
        r, v = self._ctx.state(planet, epoch)
        return cross(r, v)

# use as follows:
my_context = DE405Context()
now = now()  # assume this function returns an epoch

# first case:
print angular_momentum("Saturn", now, my_context)

# second case:
calculator = Calculator(my_context)
print calculator.angular_momentum("Saturn", now)
Of course, I could add all the operations directly into "Context", but it does not feel right.
In real life, the Context not only computes positions of planets! It computes many more things, and it serves as the access point to a lot of data.
So, to make my question more succinct: how do you deal with objects which need to be accessed by many classes?
I am currently exploring Python's context managers, but without much luck. I also thought about dynamically adding a "context" attribute directly to the functions (functions are objects, so they can have an access point to arbitrary objects), i.e.:
def angular_momentum(self, planet, epoch):
    r, v = angular_momentum.ctx.state(planet, epoch)
    return cross(r, v)

# somewhere before calling anything...
import angular_momentum
angular_momentum.ctx = my_context
edit
Something that would be great, is to create a "calculation context" with a with statement, for example:
with my_context:
    h = angular_momentum("Earth", now)
Of course, I can already do that if I simply write:
with my_context as ctx:
    h = angular_momentum("Earth", now, ctx)  # first implementation above
Maybe a variation of this with the Strategy pattern?
You generally don't want to "hide" anything in Python. You may want to signal human readers that they should treat it as "private", but this really just means "you should be able to understand my API even if you ignore this object", not "you can't access this".
The idiomatic way to do that in Python is to prefix it with an underscore—and, if your module might ever be used with from foo import *, add an explicit __all__ global that lists all the public exports. Again, neither of these will actually prevent anyone from seeing your variable, or even accessing it from outside after import foo.
See PEP 8 on Global Variable Names for more details.
Some style guides suggest special prefixes, all-caps-names, or other special distinguishing marks for globals, but PEP 8 specifically says that the conventions are the same, except for the __all__ and/or leading underscore.
Meanwhile, the behavior you want is clearly that of a global variable—a single object that everyone implicitly shares and references. Trying to disguise it as anything other than what it is will do you no good, except possibly for passing a lint check or a code review that you shouldn't have passed. All of the problems with global variables come from being a single object that everyone implicitly shares and references, not from being directly in the globals() dictionary or anything like that, so any decent fake global is just as bad as a real global. If that truly is the behavior you want, make it a global variable.
Putting it together:
# do not include _context here
__all__ = ['Context', 'DE405Context', 'Calculator', …
_context = Context()
Also, of course, you may want to call it something like _global_context or even _private_global_context, instead of just _context.
But keep in mind that globals are still members of a module, not of the entire universe, so even a public context will still be scoped as foo.context when client code does an import foo. And this may be exactly what you want. If you want a way for client scripts to import your module and then control its behavior, maybe foo.context = foo.Context(…) is exactly the right way. Of course this won't work in multithreaded (or gevent/coroutine/etc.) code, and it's inappropriate in various other cases, but if that's not an issue, in some cases, this is fine.
Since you brought up multithreading in your comments: In the simple style of multithreading where you have long-running jobs, the global style actually works perfectly fine, with a trivial change—replace the global Context with a global threading.local instance that contains a Context. Even in the style where you have small jobs handled by a thread pool, it's not much more complicated. You attach a context to each job, and then when a worker pulls a job off the queue, it sets the thread-local context to that job's context.
However, I'm not sure multithreading is going to be a good fit for your app anyway. Multithreading is great in Python when your tasks occasionally have to block for IO and you want to be able to do that without stopping other tasks—but, thanks to the GIL, it's nearly useless for parallelizing CPU work, and it sounds like that's what you're looking for. Multiprocessing (whether via the multiprocessing module or otherwise) may be more of what you're after. And with separate processes, keeping separate contexts is even simpler. (Or, you can write thread-based code and switch it to multiprocessing, leaving the threading.local variables as-is and only changing the way you spawn new tasks, and everything still works just fine.)
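As a sketch of the thread-local variant mentioned above (names like set_context and current_context are my own):

```python
import threading

class Context:
    """Stand-in for the question's calculation Context."""
    def __init__(self, name):
        self.name = name

# One Context slot per thread instead of a single shared global.
_local = threading.local()

def set_context(ctx):
    _local.context = ctx

def current_context():
    return _local.context

def worker(name, results):
    set_context(Context(name))
    # Anything called from this thread now sees only this thread's context.
    results[name] = current_context().name

results = {}
threads = [threading.Thread(target=worker, args=(n, results))
           for n in ("job-a", "job-b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # each thread saw its own context
```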
It may make sense to provide a "context" in the context manager sense, as an external version of the standard library's decimal module did, so someone can write:
with foo.Context(…):
    # do stuff under custom context
# back to default context
However, nobody could really think of a good use case for that (especially since, at least in the naive implementation, it doesn't actually solve the threading/etc. problem), so it wasn't added to the standard library, and you may not need it either.
If you want to do this, it's pretty trivial. If you're using a private global, just add this to your Context class:
def __enter__(self):
    global _context
    self._stashedcontext = _context
    _context = self

def __exit__(self, *args):
    global _context
    _context = self._stashedcontext
And it should be obvious how to adjust this to public, thread-local, etc. alternatives.
Another alternative is to make everything a member of the Context object. The top-level module functions then just delegate to the global context, which has a reasonable default value. This is exactly how the standard library random module works—you can create a random.Random() and call randrange on it, or you can just call random.randrange(), which calls the same thing on a global default random.Random() object.
If creating a Context is too heavy to do at import time, especially if it might not get used (because nobody might ever call the global functions), you can use the singleton pattern to create it on first access. But that's rarely necessary. And when it's not, the code is trivial. For example, the source to random, starting at line 881, does this:
_inst = Random()
seed = _inst.seed
random = _inst.random
uniform = _inst.uniform
…
And that's all there is to it.
And finally, as you suggested, you could make everything a member of a different Calculator object which owns a Context object. This is the traditional OOP solution; overusing it tends to make Python feel like Java, but using it when it's appropriate is not a bad thing.
You might consider using a proxy object, here's a library that helps in creating object proxies:
http://pypi.python.org/pypi/ProxyTypes
Flask uses object proxies for its current_app, request, and other variables; all it takes to reference them is:
from flask import request
You could create a proxy object that is a reference to your real context, and use thread locals to manage the instances (if that would work for you).
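A minimal sketch of what such a proxy does under the hood (this is not the ProxyTypes API, just the core idea): every attribute access is forwarded to whatever object a lookup function currently returns.

```python
class ContextProxy:
    """Minimal object proxy: attribute access is forwarded to whatever
    object the lookup function returns at that moment."""
    def __init__(self, lookup):
        object.__setattr__(self, "_lookup", lookup)

    def __getattr__(self, name):
        return getattr(self._lookup(), name)

class Context:
    def __init__(self, name):
        self.name = name

_current = Context("default")
context = ContextProxy(lambda: _current)

print(context.name)   # -> default
_current = Context("other")
print(context.name)   # -> other (the proxy follows the swap)
```

In a threaded setup, the lookup function would consult a threading.local instead of a module global, which is essentially what Flask's proxies do.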

How to do cleanup reliably in python?

I have some ctypes bindings, and for each body.New I should call body.Free. The library I'm binding doesn't have its allocation routines insulated from the rest of the code (they can be called just about anywhere), and to use a couple of useful features I need to make cyclic references.
I think it would be solved if I found a reliable way to hook a destructor to an object. (Weakrefs would help if they gave me the callback just before the data is dropped.)
So obviously this code megafails when I put in velocity_func:
class Body(object):
    def __init__(self, mass, inertia):
        self._body = body.New(mass, inertia)

    def __del__(self):
        print '__del__ %r' % self
        if body:
            body.Free(self._body)

    ...

    def set_velocity_func(self, func):
        self._body.contents.velocity_func = ctypes_wrapping(func)
I also tried to solve it through weakrefs, but with those things just seem to get worse and far more unpredictable.
Even if I don't put in the velocity_func, cycles will appear, at least when I do this:
class Toy(object):
    def __init__(self, body):
        self.body.owner = self

    ...

def collision(a, b, contacts):
    whatever(a.body.owner)
So how to make sure Structures will get garbage collected, even if they are allocated/freed by the shared library?
There's a repository if you are interested in more details: http://bitbucket.org/cheery/ctypes-chipmunk/
What you want to do, that is, create an object that allocates things and then deallocates them automatically when the object is no longer in use, is almost impossible in Python, unfortunately. The __del__ method is not guaranteed to be called, so you can't rely on it.
The standard way in Python is simply:
try:
    allocate()
    dostuff()
finally:
    cleanup()
Or since 2.5 you can also create context-managers and use the with statement, which is a neater way of doing that.
But both of these are primarily for when you allocate/lock at the beginning of a code snippet. If you want things allocated for the whole run of the program, you need to allocate the resource at startup, before the main code runs, and deallocate it afterwards. There is one situation not covered here: when you want to allocate and deallocate many resources dynamically and use them in many places in the code, for example a pool of memory buffers or similar. But most such cases are for memory, which Python handles for you, so you don't have to bother with those. There are of course cases where you want dynamic pool allocation of things that are NOT memory, and then you would want the type of deallocation you attempt in your example, and that is tricky to do in Python.
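For completeness, the with-statement version of that allocate/cleanup pattern can be sketched with contextlib.contextmanager (allocate and free here are placeholder functions of my own):

```python
from contextlib import contextmanager

def allocate():
    print("allocated")
    return "resource"   # placeholder for the real allocation

def free(resource):
    print("freed")      # placeholder for the real cleanup

@contextmanager
def managed_resource():
    res = allocate()
    try:
        yield res
    finally:
        free(res)  # runs even if the body raises

with managed_resource() as res:
    print("using", res)
```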
If weakrefs aren't broken, I guess this may work:
from weakref import ref

pointers = set()

class Pointer(object):
    def __init__(self, cfun, ptr):
        pointers.add(self)
        self.ref = ref(ptr, self.cleanup)
        self.data = cast(ptr, c_void_p).value  # Python's cast is smart, but it can't be smarter than this
        self.cfun = cfun

    def cleanup(self, obj):
        print 'cleanup 0x%x' % self.data
        self.cfun(self.data)
        pointers.remove(self)

def cleanup(cfun, ptr):
    Pointer(cfun, ptr)
I haven't tried it yet. The important piece is that the Pointer doesn't hold any strong reference to the foreign pointer, except as an integer. This should work if ctypes doesn't free memory that I should free with the bindings. Yeah, it's basically a hack, but I think it may work better than the earlier things I've been trying.
Edit: Tried it, and it seems to work after some fine-tuning of my code. A surprising thing is that even after I removed __del__ from all of my structures, it still seems to fail. Interesting but frustrating.
Neither approach works; by some weird chance I've been able to drop the cyclic references in places, but things stay broken.
Edit: Well... weakrefs WERE broken after all! So there's likely no solution for reliable cleanup in Python, except forcing it to be explicit.
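For what it's worth, later Python versions (3.4+) added weakref.finalize, which packages exactly this weakref-callback pattern up in a more robust form. A sketch, with free_native standing in for body.Free:

```python
import weakref

freed = []

class Body:
    pass

def free_native(handle):
    # Stand-in for body.Free(handle) in the ctypes bindings.
    freed.append(handle)

b = Body()
# The finalizer holds no strong reference to b, so it does not keep it
# alive; it fires when b is collected (or at interpreter exit at the
# latest), and it also works for objects caught in reference cycles.
weakref.finalize(b, free_native, 0xdead)

del b
import gc
gc.collect()   # ensure a collection has happened on any implementation
print(freed)
```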
In CPython, __del__ is a reliable destructor of an object, because it will always be called when the reference count reaches zero (note: there may be cases, like circular references of items with a __del__ method defined, where the reference count never reaches zero, but that is another issue).
Update
From the comments, I understand the problem is related to the order of destruction of objects: body is a global object, and it is being destroyed before all other objects, thus it is no longer available to them.
Actually, using global objects is not good; not only because of issues like this one, but also because of maintenance.
I would then change your class to something like this:
class Body(object):
    def __init__(self, mass, inertia):
        self._bodyref = body
        self._body = body.New(mass, inertia)

    def __del__(self):
        print '__del__ %r' % self
        if body:
            body.Free(self._body)

    ...

    def set_velocity_func(self, func):
        self._body.contents.velocity_func = ctypes_wrapping(func)
A couple of notes:
The change only adds a reference to the global body object, which will thus live at least as long as all the objects created from this class.
Still, using a global object is not good for unit testing and maintenance; better would be to have a factory for the object that sets the correct "body" on the class, and in unit tests can easily substitute a mock object. But that's really up to you and how much effort you think makes sense for this project.
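A sketch of such a factory, shown with a fake stand-in for the body module as a unit test would use it (all names here are illustrative, not from the question's bindings):

```python
class Body(object):
    def __init__(self, body_module, mass, inertia):
        self._bodymod = body_module   # held by the instance, so it lives as long as we do
        self._body = body_module.New(mass, inertia)

    def free(self):
        self._bodymod.Free(self._body)

def make_body_factory(body_module):
    """Bind the 'body' dependency once; unit tests pass a mock instead."""
    def factory(mass, inertia):
        return Body(body_module, mass, inertia)
    return factory

# A stand-in for the real ctypes module, as a unit test would use:
class FakeBodyModule:
    def __init__(self):
        self.freed = []
    def New(self, mass, inertia):
        return (mass, inertia)
    def Free(self, handle):
        self.freed.append(handle)

fake = FakeBodyModule()
new_body = make_body_factory(fake)
b = new_body(1.0, 2.0)
b.free()
print(fake.freed)  # the fake recorded the Free call
```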
