Suppose that I have a function in my Python application that defines some kind of context - a user_id, for example. This function calls other functions that do not take this context as a function argument. For example:
def f1(user, operation):
    user_id = user.id
    # somehow define user_id as a global/context variable for any function call inside this scope
    f2(operation)

def f2(operation):
    # do something, not important, and then call another function
    f3(operation)

def f3(operation):
    # get user_id if there is a variable user_id in the context, get `None` otherwise
    user_id = getcontext("user_id")
    # do something with user_id and operation
My questions are:
Can the Context Variables of Python 3.7 be used for this? How?
Is this what these Context Variables are intended for?
How to do this with Python v3.6 or earlier?
EDIT
For multiple reasons (architectural legacy, libraries, etc.) I can't/won't change the signature of intermediary functions like f2, so I can't just pass user_id as an argument, nor can I place all those functions inside the same class.
You can use contextvars in Python 3.7 for what you're asking about. It's usually really easy:
import contextvars

user_id = contextvars.ContextVar("user_id")

def f1(user, operation):
    user_id.set(user.id)
    f2()

def f2():
    f3()

def f3():
    # the default is positional-only, so it cannot be passed as default=None
    print(user_id.get(None))  # gets the user_id value, or None if no value is set
The set method on the ContextVar returns a Token instance, which you can use to reset the variable to the value it had before the set operation took place. So if you wanted f1 to restore things the way they were (not really useful for a user_id context variable, but more relevant for something like setting the precision in the decimal module), you can do:
token = some_context_var.set(value)
try:
    do_stuff()  # can use value from some_context_var with some_context_var.get()
finally:
    some_context_var.reset(token)
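If you find yourself repeating that set/reset pairing, one option is to wrap it in a small context manager. This is only a sketch: bound_user_id, current_user_id and demo are names made up for illustration, and it assumes Python 3.7+.

```python
import contextlib
import contextvars

# Declared with a default so that get() never raises LookupError.
user_id = contextvars.ContextVar("user_id", default=None)

@contextlib.contextmanager
def bound_user_id(value):
    """Set user_id for the duration of the with block, then restore it."""
    token = user_id.set(value)
    try:
        yield
    finally:
        user_id.reset(token)  # restore whatever was there before

def current_user_id():
    # Stands in for a deeply nested function such as f3.
    return user_id.get()

def demo():
    seen = [current_user_id()]          # before the block
    with bound_user_id(42):
        seen.append(current_user_id())  # inside the block
    seen.append(current_user_id())      # after the block
    return seen
```

Calling demo() should show the value visible only inside the with block.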
There's more to the contextvars module than this, but you almost certainly don't need to deal with the other stuff. You probably only need to be creating your own contexts and running code in other contexts if you're writing your own asynchronous framework from scratch.
If you're just using an existing framework (or writing a library that you want to play nice with asynchronous code), you don't need to deal with that stuff. Just create a global ContextVar (or look up one already defined by your framework) and get and set values on it as shown above, and you should be good to go.
A lot of contextvars use is probably going to be in the background, as an implementation detail of various libraries that want to have a "global" state that doesn't leak changes between threads or between separate asynchronous tasks within a single thread. The example above might make more sense in this kind of situation: f1 and f3 are part of the same library, and f2 is a user-supplied callback passed into the library somewhere else.
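To make the "doesn't leak between tasks" point concrete, here is a small sketch (assuming Python 3.7+ and asyncio; run_demo is a made-up helper name). Each coroutine passed to asyncio.gather is wrapped in a task that runs in its own copy of the current context, so each call chain sees only the value its own f1 set:

```python
import asyncio
import contextvars

user_id = contextvars.ContextVar("user_id", default=None)

async def f3():
    await asyncio.sleep(0)  # yield control so the tasks interleave
    return user_id.get()

async def f2():
    # f2 knows nothing about user_id
    return await f3()

async def f1(uid):
    user_id.set(uid)
    return await f2()

async def main():
    # Each coroutine becomes a task with its own copy of the context.
    results = await asyncio.gather(f1(1), f1(2), f1(3))
    # The values set inside the tasks are not visible out here.
    return results, user_id.get()

def run_demo():
    return asyncio.run(main())
```

With a plain global variable instead of a ContextVar, the three tasks would overwrite each other's value.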
Essentially what you're looking for is a way to share state between a set of functions. The canonical way to do so in an object-oriented language is to use a class:
class Foo(object):
    def __init__(self, operation, user=None):
        self._operation = operation
        self._user_id = user.id if user else None

    def f1(self):
        print("in f1 : {}".format(self._user_id))
        self.f2()

    def f2(self):
        print("in f2 : {}".format(self._user_id))
        self.f3()

    def f3(self):
        print("in f3 : {}".format(self._user_id))

f = Foo(operation, user)
f.f1()
With this solution, your class instances (here f) are "the context" in which the functions are executed - each instance having its own dedicated context.
The functional programming equivalent would be to use closures. I'm not going to give an example here since, while Python supports closures, it's still first and foremost an object-oriented language, so the OO solution is the most obvious.
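For readers who want to see the closure variant anyway, a minimal sketch could look like the following. The names make_functions and User are invented for illustration, and the functions return strings instead of printing so the result is easy to check:

```python
class User(object):
    """Hypothetical stand-in for the application's user object."""
    def __init__(self, id):
        self.id = id

def make_functions(operation, user=None):
    # user_id and operation live in the enclosing scope; the inner
    # functions capture them, so nothing is passed along explicitly.
    user_id = user.id if user else None

    def f3():
        return "in f3: {} {}".format(user_id, operation)

    def f2():
        return f3()

    def f1():
        return f2()

    return f1

run = make_functions("delete", User(7))
```

Here the enclosing call to make_functions plays the role that the Foo instance plays in the class-based version.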
And finally, the clean procedural solution is to pass this context (which can be expressed as a dict or any similar datatype) all along the call chain, as shown in DFE's answer.
As a general rule: relying on global variables or some "magic" context that could - or not - be set by you-don't-know-who-nor-where-nor-when makes for code that is hard if not impossible to reason about, and that can break in the most unpredictable ways (googling for "globals evil" will yield an awful lot of literature on the topic).
You can use kwargs in your function calls in order to pass user_id along the call chain:
def f1(user, operation):
    user_id = user.id
    # somehow define user_id as a global/context variable for any function call inside this scope
    f2(operation, user_id=user_id)

def f2(operation, **kwargs):
    # do something, not important, and then call another function
    f3(operation, **kwargs)

def f3(operation, **kwargs):
    # get user_id if there is a variable user_id in the context, get `None` otherwise
    user_id = kwargs.get("user_id")
    # do something with user_id and operation
The kwargs dict is the equivalent of what you are looking for in context variables, but limited to a call stack. Each call builds a new kwargs dict, but the values in it are passed by reference, so the objects themselves are not duplicated in memory.
In my opinion (though I would like to hear what you all think), context variables are an elegant way to permit global variables while keeping them under control.
Related
I have a function foo that takes a parameter stuff
Stuff can be something in a database and I'd like to create a function that takes a stuff_id, get the stuff from the db, execute foo.
Here's my attempt to solve it:
1/ Create a second function with suffix from_stuff_id
def foo(stuff):
    ...  # do something

def foo_from_stuff_id(stuff_id):
    stuff = get_stuff(stuff_id)
    foo(stuff)
2/ Modify the first function
def foo(stuff=None, stuff_id=None):
    if stuff_id:
        stuff = get_stuff(stuff_id)
    ...  # do something
I don't like either way.
What's the most pythonic way to do it ?
Assuming foo is the main component of your application, your first way. Each function should have a different purpose. The moment you combine multiple purposes into a single function, you can easily get lost in long streams of code.
If, however, some other function can also provide stuff, then go with the second.
The only thing I would add is make sure you add docstrings (PEP-257) to each function to explain in words the role of the function. If necessary, you can also add comments to your code.
I'm not a big fan of type overloading in Python, but this is one of the cases where I might go for it if there's really a need:
def foo(stuff):
    if isinstance(stuff, int):
        stuff = get_stuff(stuff)
    ...
With type annotations it would look like this:
from typing import Union

def foo(stuff: Union[int, Stuff]):
    if isinstance(stuff, int):
        stuff = get_stuff(stuff)
    ...
It basically depends on how you've defined all these functions. If you're importing get_stuff from another module, the second approach is more Pythonic, because from an OOP perspective you create each function for one particular purpose, and in this case, when you've already defined get_stuff, you don't need to call it within another function.
If get_stuff is not defined in another module, then it depends on whether you are using classes or not. If you're using a class and you want to use all these modules together, you can use a method for either accessing or connecting to the database and use that method within other methods like foo.
Example:
from some_module import get_stuff

class MyClass:
    def __init__(self, *args, **kwargs):
        # ...
        self.stuff_id = kwargs['stuff_id']

    def foo(self):
        stuff = get_stuff(self.stuff_id)
        # do stuff
Or if the functionality of foo depends on the existence of stuff, you can keep stuff as an attribute and simply check whether it is valid:
class MyClass:
    def __init__(self, *args, **kwargs):
        # ...
        _stuff_id = kwargs['stuff_id']
        self.stuff = get_stuff(_stuff_id)  # can return None

    def foo(self):
        if self.stuff:
            pass  # do stuff
        else:
            pass  # do other stuff
Or another neat design pattern for such situations might be using a dispatcher function (or method in class) that delegates the execution to different functions based on the state of stuff.
def delegator(stuff, stuff_id):
    if stuff:  # or other condition
        foo(stuff)
    else:
        get_stuff(stuff_id)
I'd like to make a function which would also act as a context manager if called with a with statement. Example usage would be:
# Use as function
set_active_language("en")

# Use as context manager
with set_active_language("en"):
    ...
This is very similar to how the standard function open is used.
Here's the solution I came up with:
active_language = None  # global variable to store active language

class set_active_language(object):
    def __init__(self, language):
        global active_language
        self.previous_language = active_language
        active_language = language

    def __enter__(self):
        pass

    def __exit__(self, *args):
        global active_language
        active_language = self.previous_language
This code is not thread-safe, but this is not related to the problem.
What I don't like about this solution is that the class constructor pretends to be a simple function and is used only for its side effects.
Is there a better way to do this?
Note that I haven't tested this solution.
Update: the reason why I don't want to split the function and the context manager into separate entities is naming. The function and the context manager do the same thing, basically, so it seems reasonable to use one name for both. Naming the context manager would be problematic if I wanted to keep it separate. What should it be? active_language? This name may (and will) collide with the variable name. override_active_language might work, though.
Technically no, you cannot do this. But you can fake it well enough that people (who didn't overthink it) wouldn't notice.
def set_active_language(language):
    global active_language
    previous_language = active_language
    active_language = language

    class ActiveScope(object):
        def __enter__(self):
            pass

        def __exit__(self, *args):
            global active_language
            active_language = previous_language

    return ActiveScope()
When used as a function the ActiveScope class is just a slightly wasteful no-op.
Hopefully someone will prove me wrong, but I think the answer is no: there is no other way. And also, another short-coming of the method you chose is that it might misbehave when used along with other context managers in a with a, b, c: statement. The intended side-effect of the CM is executed on object construction, and not in the __enter__ method as would be expected.
To be able to do what you want, you would have to know from inside the class constructor whether it was initialized as a context manager in a with statement, or simply called as a function. As far as I can tell, there is no way to gather that, not even with the inspect module.
My Situation
I'm currently working on a project in Python which I want to use to learn a bit more about software architecture. I've read a few texts and watched a couple of talks about dependency injection and learned to love how clearly constructor injection shows the dependencies of an object.
However, I'm kind of struggling with how to get a dependency passed to an object. I decided NOT to use a DI framework since:
I don't have enough knowledge of DI to specify my requirements and thus cannot choose a framework.
I want to keep the code free of more "magical" stuff since I have the feeling that introducing a seldom used framework drastically decreases readability. (More code to read of which only a small part is used).
Thus, I'm using custom factory functions to create objects and explicitly pass their dependencies:
# Business and Data Objects
class Foo:
    def __init__(self, bar):
        self.bar = bar

    def do_stuff(self):
        print(self.bar)

class Bar:
    def __init__(self, prefix):
        self.prefix = prefix

    def __str__(self):
        return str(self.prefix) + "Hello"

# Wiring up dependencies
def create_bar():
    return Bar("Bar says: ")

def create_foo():
    return Foo(create_bar())

# Starting the application
f = create_foo()
f.do_stuff()
Alternatively, if Foo has to create a number of Bars itself, it gets the creator function passed through its constructor:
# Business and Data Objects
class Foo:
    def __init__(self, create_bar):
        self.create_bar = create_bar

    def do_stuff(self, times):
        for _ in range(times):
            bar = self.create_bar()
            print(bar)

class Bar:
    def __init__(self, greeting):
        self.greeting = greeting

    def __str__(self):
        return self.greeting

# Wiring up dependencies
def create_bar():
    return Bar("Hello World")

def create_foo():
    return Foo(create_bar)

# Starting the application
f = create_foo()
f.do_stuff(3)
While I'd love to hear improvement suggestions on the code, this is not really the point of this post. However, I feel that this introduction is required to understand
My Question
While the above looks rather clear, readable and understandable to me, I run into a problem when the prefix dependency of Bar is required to be identical in the context of each Foo object and thus is coupled to the Foo object lifetime. As an example consider a prefix which implements a counter (See code examples below for implementation details).
I have two ideas for how to realize this; however, neither of them seems perfect to me:
1) Pass Prefix through Foo
The first idea is to add a constructor parameter to Foo and make it store the prefix in each Foo instance.
The obvious drawback is, that it mixes up the responsibilities of Foo. It controls the business logic AND provides one of the dependencies to Bar. Once Bar does not require the dependency any more, Foo has to be modified. Seems like a no-go for me. Since I don't really think this should be a solution, I did not post the code here, but provided it on pastebin for the very interested reader ;)
2) Use Functions with State
Instead of placing the Prefix object inside Foo this approach is trying to encapsulate it inside the create_foo function. By creating one Prefix for each Foo object and referencing it in a nameless function using lambda, I keep the details (a.k.a there-is-a-prefix-object) away from Foo and inside my wiring-logic. Of course a named function would work, too (but lambda is shorter).
# Business and Data Objects
class Foo:
    def __init__(self, create_bar):
        self.create_bar = create_bar

    def do_stuff(self, times):
        for _ in range(times):
            bar = self.create_bar()
            print(bar)

class Bar:
    def __init__(self, prefix):
        self.prefix = prefix

    def __str__(self):
        return str(self.prefix) + "Hello"

class Prefix:
    def __init__(self, name):
        self.name = name
        self.count = 0

    def __str__(self):
        self.count += 1
        return self.name + " " + str(self.count) + ": "

# Wiring up dependencies
def create_bar(prefix):
    return Bar(prefix)

def create_prefix(name):
    return Prefix(name)

def create_foo(name):
    prefix = create_prefix(name)
    return Foo(lambda: create_bar(prefix))

# Starting the application
f1 = create_foo("foo1")
f2 = create_foo("foo2")
f1.do_stuff(3)
f2.do_stuff(2)
f1.do_stuff(2)
This approach seems much more useful to me. However, I'm not sure about common practices and thus fear that having state inside functions is not really recommended. Coming from a Java/C++ background, I'd expect a function to be dependent on its parameters, its class members (if it's a method) or some global state. Thus, a parameterless function that does not use global state would have to return exactly the same value every time it is called. This is not the case here. Once the returned object is modified (which means that the counter in prefix has been increased), the function returns an object which has a different state than it had when being returned the first time.
Is this assumption just caused by my restricted experience in python and do I have to change my mindset, i.e. don't think of functions but of something callable? Or is supplying functions with state an unintended misuse of lambda?
3) Using a Callable Class
To overcome my doubts on stateful functions I could use callable classes where the create_foo function of approach 2 would be replaced by this:
class BarCreator:
    def __init__(self, prefix):
        self.prefix = prefix

    def __call__(self):
        return create_bar(self.prefix)

def create_foo(name):
    return Foo(BarCreator(create_prefix(name)))
While this seems a usable solution for me, it is sooo much more verbose.
Summary
I'm not absolutely sure how to handle the situation. Although I prefer number 2, I still have my doubts. Furthermore, I still hope that someone comes up with a more elegant way.
Please comment, if there is anything you think is too vague or can be possibly misunderstood. I will improve the question as far as my abilities allow me to do :)
All examples should run under python2.7 and python3 - if you experience any problems, please report them in the comments and I'll try to fix my code.
If you want to inject a callable object but don't want it to have a complex setup -- if, as in your example, it's really just binding to a single input value -- you could try using functools.partial to provide a function/value pair:
import functools

def factory_function(arg):
    # processing here
    return configured_object_based_on_arg

class Consumer(object):
    def __init__(self, injection):
        self._injected = injection

    def use_injected_value(self):
        print(self._injected())

injectable = functools.partial(factory_function, 'this is the configuration argument')
example = Consumer(injectable)
example.use_injected_value()  # should print the result of your factory function and argument
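Applied to the Foo/Bar/Prefix classes from the question, the same idea might look roughly like this. It is a sketch, not the asker's exact code: do_stuff returns the strings instead of printing them so the result is easy to inspect, and the create_bar/create_prefix helpers are folded into create_foo for brevity.

```python
import functools

class Prefix:
    def __init__(self, name):
        self.name = name
        self.count = 0

    def __str__(self):
        # Every conversion to str bumps the counter.
        self.count += 1
        return "{} {}: ".format(self.name, self.count)

class Bar:
    def __init__(self, prefix):
        self.prefix = prefix

    def __str__(self):
        return str(self.prefix) + "Hello"

class Foo:
    def __init__(self, create_bar):
        self.create_bar = create_bar

    def do_stuff(self, times):
        # Returns the rendered bars (the question's version prints them).
        return [str(self.create_bar()) for _ in range(times)]

def create_foo(name):
    # One Prefix per Foo, bound into the factory with partial
    # instead of a lambda or a hand-written callable class.
    prefix = Prefix(name)
    return Foo(functools.partial(Bar, prefix))
```

Each Foo created this way keeps its own counter, just as in the lambda version, and Foo itself still knows nothing about Prefix.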
As an aside, if you're creating a dependency injection setup like your option 3, you probably want to put the knowledge about how to do the configuration into a factory class rather than doing it inline as you're doing here. That way you can swap out factories if you want to choose between strategies. It's not functionally very different (unless the creation is more complex than this example and involves persistent state) but it's more flexible down the road if the code looks like
factory = FooBarFactory()
bar1 = factory.create_bar()
alt_factory = FooBlahFactory(extra_info)
bar2 = alt_factory.create_bar()
I have inherited code in which there are standalone functions, one per country code. E.g.
def validate_fr(param):
    pass

def validate_uk(param):
    pass
My idea is to create a class to group them together and consolidate the code into one method. Unfortunately that breaks cohesion. Another option is to dispatch to instance methods?
class Validator(object):
    def validate(self, param, country_code):
        # dispatch
Alas, python does not have a switch statement.
UPDATE: I am still not convinced why I should leave them as global functions in my module. Lumping them as class methods seems cleaner.
I would keep the functions at module level -- no need for a class if you don't want to instantiate it anyway. The switch statement can easily be simulated using a dictionary:
def validate_fr(param):
    pass

def validate_uk(param):
    pass

validators = {"fr": validate_fr,
              "uk": validate_uk}

def validate(country_code, param):
    return validators[country_code](param)
Given the naming scheme, you could also do it without the dictionary:
def validate(country_code, param):
    return globals()["validate_" + country_code](param)
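If the list of countries grows, one way to keep the dictionary in sync with the functions is to let each function register itself. The sketch below assumes Python 3; the validator decorator and the placeholder validation rules are invented for illustration:

```python
validators = {}

def validator(country_code):
    """Decorator that registers a function under a country code."""
    def register(func):
        validators[country_code] = func
        return func
    return register

@validator("fr")
def validate_fr(param):
    return param.startswith("FR")  # placeholder rule, not a real check

@validator("uk")
def validate_uk(param):
    return param.startswith("UK")  # placeholder rule, not a real check

def validate(country_code, param):
    try:
        func = validators[country_code]
    except KeyError:
        # Turn the bare KeyError into a more descriptive error.
        raise ValueError(
            "no validator for country code {!r}".format(country_code)
        ) from None
    return func(param)
```

The functions stay at module level and remain callable directly; the dictionary is just an index over them.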
You do not need a switch statement for this.
validators = {
    'fr': Validator(...),
    'uk': Validator(...),
    ...
}
...
validators['uk'](foo)
Classes are not meant to group functions together, modules are. Functions in a class should be either methods that operate on the object itself (changing its state, emitting information about the state, etc.) or class methods that do the same, but for the class itself (classes in Python are also objects). There's not even a need for static methods in Python, since you can always have functions at module level. As they say: Flat is better than nested.
If you want to have a set of functions, place them in a separate module.
Can someone explain why the following code behaves the way it does:
import types

class Dummy():
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print "delete",self.name

d1 = Dummy("d1")
del d1
d1 = None
print "after d1"

d2 = Dummy("d2")
def func(self):
    print "func called"
d2.func = types.MethodType(func, d2)
d2.func()
del d2
d2 = None
print "after d2"

d3 = Dummy("d3")
def func(self):
    print "func called"
d3.func = types.MethodType(func, d3)
d3.func()
d3.func = None
del d3
d3 = None
print "after d3"
The output (note that the destructor for d2 is never called) is this (python 2.7)
delete d1
after d1
func called
after d2
func called
delete d3
after d3
Is there a way to "fix" the code so the destructor is called without deleting the method added? I mean, the best place to put the d2.func = None would be in the destructor!
Thanks
[edit] Based on the first few answers, I'd like to clarify that I'm not asking about the merits (or lack thereof) of using __del__. I tried to create the shortest function that would demonstrate what I consider to be non-intuitive behavior. I'm assuming a circular reference has been created, but I'm not sure why. If possible, I'd like to know how to avoid the circular reference....
You cannot assume that __del__ will ever be called - it is not a place to hope that resources are automagically deallocated. If you want to make sure that a (non-memory) resource is released, you should make a release() or similar method and then call that explicitly (or use it in a context manager as pointed out by Thanatos in comments below).
At the very least you should read the __del__ documentation very closely, and then you should probably not try to use __del__. (Also refer to the gc.garbage documentation for other bad things about __del__)
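As a rough illustration of the explicit-release approach: if the cleanup method is called close(), the standard library's contextlib.closing will call it for you when the with block ends. The Resource class and use_resource function below are hypothetical stand-ins, not code from the question:

```python
import contextlib

class Resource(object):
    """Hypothetical object holding a non-memory resource."""
    def __init__(self, name):
        self.name = name
        self.released = False

    def close(self):
        # Explicit cleanup; does not rely on __del__ being called.
        self.released = True

def use_resource():
    with contextlib.closing(Resource("d1")) as res:
        released_inside = res.released
    # closing() has called res.close() by the time we get here,
    # even if the block had raised an exception.
    return released_inside, res.released
```

The caller can also invoke close() by hand when a with block does not fit the control flow.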
I'm providing my own answer because, while I appreciate the advice to avoid __del__, my question was how to get it to work properly for the code sample provided.
Short version: The following code uses weakref to avoid the circular reference. I thought I'd tried this before posting the question, but I guess I must have done something wrong.
import types, weakref

class Dummy():
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print "delete",self.name

d2 = Dummy("d2")
def func(self):
    print "func called"
d2.func = types.MethodType(func, weakref.ref(d2)) #This works
#d2.func = func.__get__(weakref.ref(d2), Dummy) #This works too
d2.func()
del d2
d2 = None
print "after d2"
Longer version:
When I posted the question, I did search for similar questions. I know you can use with instead, and that the prevailing sentiment is that __del__ is BAD.
Using with makes sense, but only in certain situations. Opening a file, reading it, and closing it is a good example where with is a perfectly good solution. You've got a specific block of code where the object is needed, and you want to clean up the object at the end of the block.
A database connection seems to be used often as an example that doesn't work well using with, since you usually need to leave the section of code that creates the connection and have the connection closed in a more event-driven (rather than sequential) timeframe.
If with is not the right solution, I see two alternatives:
You make sure __del__ works (see this blog for a better description of weakref usage)
You use the atexit module to run a callback when your program closes. See this topic for example.
While I tried to provide simplified code, my real problem is more event-driven, so with is not an appropriate solution (with is fine for the simplified code). I also wanted to avoid atexit, as my program can be long-running, and I want to be able to perform the cleanup as soon as possible.
So, in this specific case, I find it to be the best solution to use weakref and prevent circular references that would prevent __del__ from working.
This may be an exception to the rule, but there are use-cases where using weakref and __del__ is the right implementation, IMHO.
Instead of del, you can use the with statement.
http://effbot.org/zone/python-with-statement.htm
Just like with file objects, you could do something like
with Dummy('d1') as d:
    pass  #stuff
#d's __exit__ method is guaranteed to have been called
del doesn't call __del__
del in the way you are using removes a local variable. __del__ is called when the object is destroyed. Python as a language makes no guarantees as to when it will destroy an object.
CPython, the most common implementation of Python, uses reference counting. As a result del will often work as you expect. However, it will not work in the case that you have a reference cycle.
d3 -> d3.func -> d3
Python doesn't detect this and so won't clean it up right away. And it's not just reference cycles. If an exception is thrown, you probably still want to call your destructor. However, Python will typically hold onto the local variables as part of its traceback.
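The cycle can be observed directly with a weak reference. The sketch below is written for CPython 3 and relies on its reference counting plus cyclic garbage collector; other implementations may behave differently. cycle_demo is a made-up name, and the automatic collector is switched off temporarily so the result does not depend on when a collection happens to run:

```python
import gc
import types
import weakref

class Dummy:
    def __init__(self, name):
        self.name = name

def func(self):
    return "func called"

def cycle_demo():
    gc.disable()  # keep the automatic collector out of the picture
    try:
        d = Dummy("d2")
        # The bound method holds a reference to d, and d holds the
        # bound method: d -> d.func -> d
        d.func = types.MethodType(func, d)
        probe = weakref.ref(d)  # lets us check whether d still exists

        del d  # removes the name, but the cycle keeps the object alive
        alive_after_del = probe() is not None

        gc.collect()  # an explicit collection finds and frees the cycle
        alive_after_collect = probe() is not None
        return alive_after_del, alive_after_collect
    finally:
        gc.enable()
```

Without the d.func assignment there is no cycle, and the object goes away as soon as del drops the last reference.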
The solution is not to depend on the __del__ method. Rather, use a context manager.
class Dummy:
    def __enter__(self):
        return self
    def __exit__(self, type, value, traceback):
        print "Destroying", self

with Dummy() as dummy:
    # Do whatever you want with dummy in here
    pass
# __exit__ will be called before you get here
This is guaranteed to work, and you can even check the parameters to see whether you are handling an exception and do something different in that case.
A full example of a context manager.
class Dummy(object):
    def __init__(self, name):
        self.name = name
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        # use self here rather than relying on the global name d
        print 'cleanup:', self
    def __repr__(self):
        return 'Dummy(%r)' % (self.name,)

with Dummy("foo") as d:
    print 'using:', d

print 'later:', d
It seems to me the real heart of the matter is here:
adding the functions is dynamic (at runtime) and not known in advance
I sense that what you are really after is a flexible way to bind different functionality to an object representing program state, also known as polymorphism. Python does that quite well, not by attaching/detaching methods, but by instantiating different classes. I suggest you look again at your class organization. Perhaps you need to separate a core, persistent data object from transient state objects. Use the has-a paradigm rather than is-a: each time state changes, you either wrap the core data in a state object, or you assign the new state object to an attribute of the core.
If you're sure you can't use that kind of pythonic OOP, you could still work around your problem another way by defining all your functions in the class to begin with and subsequently binding them to additional instance attributes (unless you're compiling these functions on the fly from user input):
class LongRunning(object):
    def bark_loudly(self):
        print("WOOF WOOF")

    def bark_softly(self):
        print("woof woof")

while True:
    d = LongRunning()
    d.bark = d.bark_loudly
    d.bark()

    d.bark = d.bark_softly
    d.bark()
An alternative solution to using weakref is to dynamically bind the function to the instance only when it is called by overriding __getattr__ or __getattribute__ on the class to return func.__get__(self, type(self)) instead of just func for functions bound to the instance. This is how functions defined on the class behave. Unfortunately (for some use cases) python doesn't perform the same logic for functions attached to the instance itself, but you can modify it to do this. I've had similar problems with descriptors bound to instances. Performance here probably isn't as good as using weakref, but it is an option that will work transparently for any dynamically assigned function with the use of only python builtins.
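A minimal sketch of that idea, assuming Python 3 and overriding __getattribute__ (the class name DynamicBinder and the greet function are made up for illustration). Plain functions stored in the instance dict are bound on each lookup, so the instance never stores a bound method that points back at itself:

```python
import types

class DynamicBinder(object):
    def __init__(self, name):
        self.name = name

    def __getattribute__(self, attr):
        value = object.__getattribute__(self, attr)
        instance_dict = object.__getattribute__(self, "__dict__")
        # Only plain functions stored on the instance get bound here;
        # methods defined on the class are already bound by the
        # normal lookup and pass through unchanged.
        if attr in instance_dict and isinstance(value, types.FunctionType):
            return value.__get__(self, type(self))
        return value

    def describe(self):
        return "I am " + self.name

def greet(self):
    return "hello from " + self.name

obj = DynamicBinder("d2")
obj.greet = greet  # stored as a plain function, so no reference cycle
```

Every attribute access now goes through the extra Python-level check, which is the performance cost mentioned above.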
If you find yourself doing this often, you might want a custom metaclass that does dynamic binding of instance-level functions.
Another alternative is to add the function directly to the class, which will then properly perform the binding when it's called. For a lot of use cases, this would have some headaches involved: namely, properly namespacing the functions so they don't collide. The instance id could be used for this, though since the id in CPython isn't guaranteed unique over the life of the program, you'd need to ponder this a bit to make sure it works for your use case... in particular, you probably need to make sure you delete the class function when an object goes out of scope, and thus its id/memory address is available again. __del__ is perfect for this :). Alternatively, you could clear out all methods namespaced to the instance on object creation (in __init__ or __new__).
Another alternative (rather than messing with python magic methods) is to explicitly add a method for calling your dynamically bound functions. This has the downside that your users can't call your function using normal python syntax:
class MyClass(object):
    def dynamic_func(self, func_name):
        return getattr(self, func_name).__get__(self, type(self))

    def call_dynamic_func(self, func_name, *args, **kwargs):
        return getattr(self, func_name).__get__(self, type(self))(*args, **kwargs)

    """
    Alternate without using descriptor functionality:
    def call_dynamic_func(self, func_name, *args, **kwargs):
        return getattr(self, func_name)(self, *args, **kwargs)
    """
Just to make this post complete, I'll show your weakref option as well:
import weakref

inst = MyClass()
def func(self):
    print 'My func'
# You could also use the types modules, but the descriptor method is cleaner IMO
inst.func = func.__get__(weakref.ref(inst), type(inst))
use eval()
In [1]: int('25.0')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-1-67d52e3d0c17> in <module>
----> 1 int('25.0')
ValueError: invalid literal for int() with base 10: '25.0'
In [2]: int(float('25.0'))
Out[2]: 25
In [3]: eval('25.0')
Out[3]: 25.0