Python - Function attributes or mutable default values - python

Say you have a function that needs to maintain some sort of state and behave differently depending on that state. I am aware of two ways to implement this where the state is stored entirely by the function:
Using a function attribute
Using a mutable default value
Using a slightly modified version of Felix Klings answer to another question, here is an example function that can be used in re.sub() so that only the third match to a regex will be replaced:
Function attribute:
def replace(match):
replace.c = getattr(replace, "c", 0) + 1
return repl if replace.c == 3 else match.group(0)
Mutable default value:
def replace(match, c=[0]):
c[0] += 1
return repl if c[0] == 3 else match.group(0)
To me the first seems cleaner, but I have seen the second more commonly. Which is preferable and why?

I use closure instead, no side effects.
Here is the example (I've just modified the original example of Felix Klings answer):
def replaceNthWith(n, replacement):
c = [0]
def replace(match):
c[0] += 1
return replacement if c[0] == n else match.group(0)
return replace
And the usage:
# reset state (in our case count, c=0) for each string manipulation
re.sub(pattern, replaceNthWith(n, replacement), str1)
re.sub(pattern, replaceNthWith(n, replacement), str2)
#or persist state between calls
replace = replaceNthWith(n, replacement)
re.sub(pattern, replace, str1)
re.sub(pattern, replace, str2)
For mutable what should happen if somebody call replace(match, c=[])?
For attribute you broke encapsulation (yes i know that python didn't implemented in classes from diff reasons ...)

Both ways feel strange to me. The first though is much better. But when you think about it this way: "Something with a state that can do operations with that state and additional input", it really sounds like a normal object. And when something sounds like an object, it should be an object...
SO, my solution would be to use a simple object with a __call__ method:
class StatefulReplace(object):
def __init__(self, initial_c=0):
self.c = initial_c
def __call__(self, match):
self.c += 1
return repl if self.c == 3 else match.group(0)
And then you can write in the global space or your module init:
replace = StatefulReplace(0)

How about:
Use a class
Use a global variable
True, these are not stored entirely within the function. I would probably use a class:
class Replacer(object):
c = 0
#staticmethod # if you like
def replace(match):
replace.c += 1
...
To answer your actual question, use getattr. It's a very clear and readable way to store data away for later. It should be pretty obvious to someone reading it what you're trying to do.
The mutable default argument version is an example of a common programming error (assuming you'll get a new list every time). For that reason alone I would avoid it. Someone reading it later might decide that it's a good idea without fully understanding the consequences. And even in this case, it seems though your function would only work once (your c value is never reset to zero).

To me, both of these approaches look dodgy. The problem is crying out for a class instance. We don't normally think about functions as maintaining state between calls; that's what classes are for.
That said, I've used function attributes before for this sort of thing. Particularly if it's a one-shot function defined within other code (i.e. not possible for it to be used from anywhere else), just tacking on attributes to it is more concise than defining a whole new class and creating an instance of it.
I would never abuse default values for this. There's a large barrier to understanding because the natural purpose of default values is to provide default values of arguments, not to maintain state between calls. Plus a default argument invites you to supply a non-default value, and you typically get very strange behaviour if you do that with a function that is abusing default values to maintain state.

Related

Get functions' previous return statement [duplicate]

Suppose I have a function like:
def foo():
x = 'hello world'
How do I get the function to return x, in such a way that I can use it as the input for another function or use the variable within the body of a program? I tried using return and then using the x variable in another function, but I get a NameError that way.
For the specific case of communicating information between methods in the same class, it is often best to store the information in self. See Passing variables between methods in Python? for details.
def foo():
x = 'hello world'
return x # return 'hello world' would do, too
foo()
print(x) # NameError - x is not defined outside the function
y = foo()
print(y) # this works
x = foo()
print(x) # this also works, and it's a completely different x than that inside
# foo()
z = bar(x) # of course, now you can use x as you want
z = bar(foo()) # but you don't have to
Effectively, there are two ways: directly and indirectly.
The direct way is to return a value from the function, as you tried, and let the calling code use that value. This is normally what you want. The natural, simple, direct, explicit way to get information back from a function is to return it. Broadly speaking, the purpose of a function is to compute a value, and return signifies "this is the value we computed; we are done here".
Directly using return
The main trick here is that return returns a value, not a variable. So return x does not enable the calling code to use x after calling the function, and does not modify any existing value that x had in the context of the call. (That's presumably why you got a NameError.)
After we use return in the function:
def example():
x = 'hello world'
return x
we need to write the calling code to use the return value:
result = example()
print(result)
The other key point here is that a call to a function is an expression, so we can use it the same way that we use, say, the result of an addition. Just as we may say result = 'hello ' + 'world', we may say result = foo(). After that, result is our own, local name for that string, and we can do whatever we want with it.
We can use the same name, x, if we want. Or we can use a different name. The calling code doesn't have to know anything about how the function is written, or what names it uses for things.1
We can use the value directly to call another function: for example, print(foo()).2 We can return the value directly: simply return 'hello world', without assigning to x. (Again: we are returning a value, not a variable.)
The function can only return once each time it is called. return terminates the function - again, we just determined the result of the calculation, so there is no reason to calculate any further. If we want to return multiple pieces of information, therefore, we will need to come up with a single object (in Python, "value" and "object" are effectively synonyms; this doesn't work out so well for some other languages.)
We can make a tuple right on the return line; or we can use a dictionary, a namedtuple (Python 2.6+), a types.simpleNamespace (Python 3.3+), a dataclass (Python 3.7+), or some other class (perhaps even one we write ourselves) to associate names with the values that are being returned; or we can accumulate values from a loop in a list; etc. etc. The possibilities are endless..
On the other hand, the function returns whether you like it or not (unless an exception is raised). If it reaches the end, it will implicitly return the special value None. You may or may not want to do it explicitly instead.
Indirect methods
Other than returning the result back to the caller directly, we can communicate it by modifying some existing object that the caller knows about. There are many ways to do that, but they're all variations on that same theme.
If you want the code to communicate information back this way, please just let it return None - don't also use the return value for something meaningful. That's how the built-in functionality works.
In order to modify that object, the called function also has to know about it, of course. That means, having a name for the object that can be looked up in a current scope. So, let's go through those in order:
Local scope: Modifying a passed-in argument
If one of our parameters is mutable, we can just mutate it, and rely on the caller to examine the change. This is usually not a great idea, because it can be hard to reason about the code. It looks like:
def called(mutable):
mutable.append('world')
def caller():
my_value = ['hello'] # a list with just that string
called(my_value)
# now it contains both strings
If the value is an instance of our own class, we could also assign to an attribute:
class Test:
def __init__(self, value):
self.value = value
def called(mutable):
mutable.value = 'world'
def caller():
test = Test('hello')
called(test)
# now test.value has changed
Assigning to an attribute does not work for built-in types, including object; and it might not work for some classes that explicitly prevent you from doing it.
Local scope: Modifying self, in a method
We already have an example of this above: setting self.value in the Test.__init__ code. This is a special case of modifying a passed-in argument; but it's part of how classes work in Python, and something we're expected to do. Normally, when we do this, the calling won't actually check for changes to self - it will just use the modified object in the next step of the logic. That's what makes it appropriate to write code this way: we're still presenting an interface, so the caller doesn't have to worry about the details.
class Example:
def __init__(self):
self._words = ['hello']
def add_word(self):
self._words.append('world')
def display(self):
print(*self.words)
x = Example()
x.add_word()
x.display()
In the example, calling add_word gave information back to the top-level code - but instead of looking for it, we just go ahead and call display.3
See also: Passing variables between methods in Python?
Enclosing scope
This is a rare special case when using nested functions. There isn't a lot to say here - it works the same way as with the global scope, just using the nonlocal keyword rather than global.4
Global scope: Modifying a global
Generally speaking, it is a bad idea to change anything in the global scope after setting it up in the first place. It makes code harder to reason about, because anything that uses that global (aside from whatever was responsible for the change) now has a "hidden" source of input.
If you still want to do it, the syntax is straightforward:
words = ['hello']
def add_global_word():
words.append('world')
add_global_word() # `words` is changed
Global scope: Assigning to a new or existing global
This is actually a special case of modifying a global. I don't mean that assignment is a kind of modification (it isn't). I mean that when you assign a global name, Python automatically updates a dict that represents the global namespace. You can get that dict with globals(), and you can modify that dict and it will actually impact what global variables exist. (I.e., the return from globals() is the dictionary itself, not a copy.)5
But please don't. That's even worse of an idea than the previous one. If you really need to get the result from your function by assigning to a global variable, use the global keyword to tell Python that the name should be looked up in the global scope:
words = ['hello']
def replace_global_words():
global words
words = ['hello', 'world']
replace_global_words() # `words` is a new list with both words
Global scope: Assigning to or modifying an attribute of the function itself
This is a rare special case, but now that you've seen the other examples, the theory should be clear. In Python, functions are mutable (i.e. you can set attributes on them); and if we define a function at top level, it's in the global namespace. So this is really just modifying a global:
def set_own_words():
set_own_words.words = ['hello', 'world']
set_own_words()
print(*set_own_words.words)
We shouldn't really use this to send information to the caller. It has all the usual problems with globals, and it's even harder to understand. But it can be useful to set a function's attributes from within the function, in order for the function to remember something in between calls. (It's similar to how methods remember things in between calls by modifying self.) The functools standard library does this, for example in the cache implementation.
Builtin scope
This doesn't work. The builtin namespace doesn't contain any mutable objects, and you can't assign new builtin names (they'll go into the global namespace instead).
Some approaches that don't work in Python
Just calculating something before the function ends
In some other programming languages, there is some kind of hidden variable that automatically picks up the result of the last calculation, every time something is calculated; and if you reach the end of a function without returning anything, it gets returned. That doesn't work in Python. If you reach the end without returning anything, your function returns None.
Assigning to the function's name
In some other programming languages, you are allowed (or expected) to assign to a variable with the same name as the function; and at the end of the function, that value is returned. That still doesn't work in Python. If you reach the end without returning anything, your function still returns None.
def broken():
broken = 1
broken()
print(broken + 1) # causes a `TypeError`
It might seem like you can at least use the value that way, if you use the global keyword:
def subtly_broken():
global subtly_broken
subtly_broken = 1
subtly_broken()
print(subtly_broken + 1) # 2
But this, of course, is just a special case of assigning to a global. And there's a big problem with it - the same name can't refer to two things at once. By doing this, the function replaced its own name. So it will fail next time:
def subtly_broken():
global subtly_broken
subtly_broken = 1
subtly_broken()
subtly_broken() # causes a `TypeError`
Assigning to a parameter
Sometimes people expect to be able to assign to one of the function's parameters, and have it affect a variable that was used for the corresponding argument. However, this does not work:
def broken(words):
words = ['hello', 'world']
data = ['hello']
broken(data) # `data` does not change
Just like how Python returns values, not variables, it also passes values, not variables. words is a local name; by definition the calling code doesn't know anything about that namespace.
One of the working methods that we saw is to modify the passed-in list. That works because if the list itself changes, then it changes - it doesn't matter what name is used for it, or what part of the code uses that name. However, assigning a new list to words does not cause the existing list to change. It just makes words start being a name for a different list.
For more information, see How do I pass a variable by reference?.
1 At least, not for getting the value back. If you want to use keyword arguments, you need to know what the keyword names are. But generally, the point of functions is that they're an abstraction; you only need to know about their interface, and you don't need to think about what they're doing internally.
2 In 2.x, print is a statement rather than a function, so this doesn't make an example of calling another function directly. However, print foo() still works with 2.x's print statement, and so does print(foo()) (in this case, the extra parentheses are just ordinary grouping parentheses). Aside from that, 2.7 (the last 2.x version) has been unsupported since the beginning of 2020 - which was nearly a 5 year extension of the normal schedule. But then, this question was originally asked in 2010.
3Again: if the purpose of a method is to update the object, don't also return a value. Some people like to return self so that you can "chain" method calls; but in Python this is considered poor style. If you want that kind of "fluent" interface, then instead of writing methods that update self, write methods that create a new, modified instance of the class.
4 Except, of course, that if we're modifying a value rather than assigning, we don't need either keyword.
5 There's also a locals() that gives you a dict of local variables. However, this cannot be used to make new local variables - the behaviour is undefined in 2.x, and in 3.x the dict is created on the fly and assigning to it has no effect. Some of Python's optimizations depend on the local variables for a function being known ahead of time.
>>> def foo():
return 'hello world'
>>> x = foo()
>>> x
'hello world'
You can use global statement and then achieve what you want without returning value from
the function. For example you can do something like below:
def foo():
global x
x = "hello world"
foo()
print x
The above code will print "hello world".
But please be warned that usage of "global" is not a good idea at all and it is better to avoid usage that is shown in my example.
Also check this related discussion on about usage of global statement in Python.

Why does the 'number' parameter does not pass? [duplicate]

Suppose I have a function like:
def foo():
x = 'hello world'
How do I get the function to return x, in such a way that I can use it as the input for another function or use the variable within the body of a program? I tried using return and then using the x variable in another function, but I get a NameError that way.
For the specific case of communicating information between methods in the same class, it is often best to store the information in self. See Passing variables between methods in Python? for details.
def foo():
x = 'hello world'
return x # return 'hello world' would do, too
foo()
print(x) # NameError - x is not defined outside the function
y = foo()
print(y) # this works
x = foo()
print(x) # this also works, and it's a completely different x than that inside
# foo()
z = bar(x) # of course, now you can use x as you want
z = bar(foo()) # but you don't have to
Effectively, there are two ways: directly and indirectly.
The direct way is to return a value from the function, as you tried, and let the calling code use that value. This is normally what you want. The natural, simple, direct, explicit way to get information back from a function is to return it. Broadly speaking, the purpose of a function is to compute a value, and return signifies "this is the value we computed; we are done here".
Directly using return
The main trick here is that return returns a value, not a variable. So return x does not enable the calling code to use x after calling the function, and does not modify any existing value that x had in the context of the call. (That's presumably why you got a NameError.)
After we use return in the function:
def example():
x = 'hello world'
return x
we need to write the calling code to use the return value:
result = example()
print(result)
The other key point here is that a call to a function is an expression, so we can use it the same way that we use, say, the result of an addition. Just as we may say result = 'hello ' + 'world', we may say result = foo(). After that, result is our own, local name for that string, and we can do whatever we want with it.
We can use the same name, x, if we want. Or we can use a different name. The calling code doesn't have to know anything about how the function is written, or what names it uses for things.1
We can use the value directly to call another function: for example, print(foo()).2 We can return the value directly: simply return 'hello world', without assigning to x. (Again: we are returning a value, not a variable.)
The function can only return once each time it is called. return terminates the function - again, we just determined the result of the calculation, so there is no reason to calculate any further. If we want to return multiple pieces of information, therefore, we will need to come up with a single object (in Python, "value" and "object" are effectively synonyms; this doesn't work out so well for some other languages.)
We can make a tuple right on the return line; or we can use a dictionary, a namedtuple (Python 2.6+), a types.simpleNamespace (Python 3.3+), a dataclass (Python 3.7+), or some other class (perhaps even one we write ourselves) to associate names with the values that are being returned; or we can accumulate values from a loop in a list; etc. etc. The possibilities are endless..
On the other hand, the function returns whether you like it or not (unless an exception is raised). If it reaches the end, it will implicitly return the special value None. You may or may not want to do it explicitly instead.
Indirect methods
Other than returning the result back to the caller directly, we can communicate it by modifying some existing object that the caller knows about. There are many ways to do that, but they're all variations on that same theme.
If you want the code to communicate information back this way, please just let it return None - don't also use the return value for something meaningful. That's how the built-in functionality works.
In order to modify that object, the called function also has to know about it, of course. That means, having a name for the object that can be looked up in a current scope. So, let's go through those in order:
Local scope: Modifying a passed-in argument
If one of our parameters is mutable, we can just mutate it, and rely on the caller to examine the change. This is usually not a great idea, because it can be hard to reason about the code. It looks like:
def called(mutable):
mutable.append('world')
def caller():
my_value = ['hello'] # a list with just that string
called(my_value)
# now it contains both strings
If the value is an instance of our own class, we could also assign to an attribute:
class Test:
def __init__(self, value):
self.value = value
def called(mutable):
mutable.value = 'world'
def caller():
test = Test('hello')
called(test)
# now test.value has changed
Assigning to an attribute does not work for built-in types, including object; and it might not work for some classes that explicitly prevent you from doing it.
Local scope: Modifying self, in a method
We already have an example of this above: setting self.value in the Test.__init__ code. This is a special case of modifying a passed-in argument; but it's part of how classes work in Python, and something we're expected to do. Normally, when we do this, the calling won't actually check for changes to self - it will just use the modified object in the next step of the logic. That's what makes it appropriate to write code this way: we're still presenting an interface, so the caller doesn't have to worry about the details.
class Example:
def __init__(self):
self._words = ['hello']
def add_word(self):
self._words.append('world')
def display(self):
print(*self.words)
x = Example()
x.add_word()
x.display()
In the example, calling add_word gave information back to the top-level code - but instead of looking for it, we just go ahead and call display.3
See also: Passing variables between methods in Python?
Enclosing scope
This is a rare special case when using nested functions. There isn't a lot to say here - it works the same way as with the global scope, just using the nonlocal keyword rather than global.4
Global scope: Modifying a global
Generally speaking, it is a bad idea to change anything in the global scope after setting it up in the first place. It makes code harder to reason about, because anything that uses that global (aside from whatever was responsible for the change) now has a "hidden" source of input.
If you still want to do it, the syntax is straightforward:
words = ['hello']
def add_global_word():
words.append('world')
add_global_word() # `words` is changed
Global scope: Assigning to a new or existing global
This is actually a special case of modifying a global. I don't mean that assignment is a kind of modification (it isn't). I mean that when you assign a global name, Python automatically updates a dict that represents the global namespace. You can get that dict with globals(), and you can modify that dict and it will actually impact what global variables exist. (I.e., the return from globals() is the dictionary itself, not a copy.)5
But please don't. That's even worse of an idea than the previous one. If you really need to get the result from your function by assigning to a global variable, use the global keyword to tell Python that the name should be looked up in the global scope:
words = ['hello']
def replace_global_words():
global words
words = ['hello', 'world']
replace_global_words() # `words` is a new list with both words
Global scope: Assigning to or modifying an attribute of the function itself
This is a rare special case, but now that you've seen the other examples, the theory should be clear. In Python, functions are mutable (i.e. you can set attributes on them); and if we define a function at top level, it's in the global namespace. So this is really just modifying a global:
def set_own_words():
set_own_words.words = ['hello', 'world']
set_own_words()
print(*set_own_words.words)
We shouldn't really use this to send information to the caller. It has all the usual problems with globals, and it's even harder to understand. But it can be useful to set a function's attributes from within the function, in order for the function to remember something in between calls. (It's similar to how methods remember things in between calls by modifying self.) The functools standard library does this, for example in the cache implementation.
Builtin scope
This doesn't work. The builtin namespace doesn't contain any mutable objects, and you can't assign new builtin names (they'll go into the global namespace instead).
Some approaches that don't work in Python
Just calculating something before the function ends
In some other programming languages, there is some kind of hidden variable that automatically picks up the result of the last calculation, every time something is calculated; and if you reach the end of a function without returning anything, it gets returned. That doesn't work in Python. If you reach the end without returning anything, your function returns None.
Assigning to the function's name
In some other programming languages, you are allowed (or expected) to assign to a variable with the same name as the function; and at the end of the function, that value is returned. That still doesn't work in Python. If you reach the end without returning anything, your function still returns None.
def broken():
broken = 1
broken()
print(broken + 1) # causes a `TypeError`
It might seem like you can at least use the value that way, if you use the global keyword:
def subtly_broken():
global subtly_broken
subtly_broken = 1
subtly_broken()
print(subtly_broken + 1) # 2
But this, of course, is just a special case of assigning to a global. And there's a big problem with it - the same name can't refer to two things at once. By doing this, the function replaced its own name. So it will fail next time:
def subtly_broken():
global subtly_broken
subtly_broken = 1
subtly_broken()
subtly_broken() # causes a `TypeError`
Assigning to a parameter
Sometimes people expect to be able to assign to one of the function's parameters, and have it affect a variable that was used for the corresponding argument. However, this does not work:
def broken(words):
words = ['hello', 'world']
data = ['hello']
broken(data) # `data` does not change
Just like how Python returns values, not variables, it also passes values, not variables. words is a local name; by definition the calling code doesn't know anything about that namespace.
One of the working methods that we saw is to modify the passed-in list. That works because if the list itself changes, then it changes - it doesn't matter what name is used for it, or what part of the code uses that name. However, assigning a new list to words does not cause the existing list to change. It just makes words start being a name for a different list.
For more information, see How do I pass a variable by reference?.
1 At least, not for getting the value back. If you want to use keyword arguments, you need to know what the keyword names are. But generally, the point of functions is that they're an abstraction; you only need to know about their interface, and you don't need to think about what they're doing internally.
2 In 2.x, print is a statement rather than a function, so this doesn't make an example of calling another function directly. However, print foo() still works with 2.x's print statement, and so does print(foo()) (in this case, the extra parentheses are just ordinary grouping parentheses). Aside from that, 2.7 (the last 2.x version) has been unsupported since the beginning of 2020 - which was nearly a 5 year extension of the normal schedule. But then, this question was originally asked in 2010.
3Again: if the purpose of a method is to update the object, don't also return a value. Some people like to return self so that you can "chain" method calls; but in Python this is considered poor style. If you want that kind of "fluent" interface, then instead of writing methods that update self, write methods that create a new, modified instance of the class.
4 Except, of course, that if we're modifying a value rather than assigning, we don't need either keyword.
5 There's also a locals() that gives you a dict of local variables. However, this cannot be used to make new local variables - the behaviour is undefined in 2.x, and in 3.x the dict is created on the fly and assigning to it has no effect. Some of Python's optimizations depend on the local variables for a function being known ahead of time.
>>> def foo():
return 'hello world'
>>> x = foo()
>>> x
'hello world'
You can use global statement and then achieve what you want without returning value from
the function. For example you can do something like below:
def foo():
global x
x = "hello world"
foo()
print x
The above code will print "hello world".
But please be warned that usage of "global" is not a good idea at all and it is better to avoid usage that is shown in my example.
Also check this related discussion on about usage of global statement in Python.

python beginner craps: global name not defined - naming variables [duplicate]

Suppose I have a function like:
def foo():
x = 'hello world'
How do I get the function to return x, in such a way that I can use it as the input for another function or use the variable within the body of a program? I tried using return and then using the x variable in another function, but I get a NameError that way.
For the specific case of communicating information between methods in the same class, it is often best to store the information in self. See Passing variables between methods in Python? for details.
def foo():
x = 'hello world'
return x # return 'hello world' would do, too
foo()
print(x) # NameError - x is not defined outside the function
y = foo()
print(y) # this works
x = foo()
print(x) # this also works, and it's a completely different x than that inside
# foo()
z = bar(x) # of course, now you can use x as you want
z = bar(foo()) # but you don't have to
Effectively, there are two ways: directly and indirectly.
The direct way is to return a value from the function, as you tried, and let the calling code use that value. This is normally what you want. The natural, simple, direct, explicit way to get information back from a function is to return it. Broadly speaking, the purpose of a function is to compute a value, and return signifies "this is the value we computed; we are done here".
Directly using return
The main trick here is that return returns a value, not a variable. So return x does not enable the calling code to use x after calling the function, and does not modify any existing value that x had in the context of the call. (That's presumably why you got a NameError.)
After we use return in the function:
def example():
x = 'hello world'
return x
we need to write the calling code to use the return value:
result = example()
print(result)
The other key point here is that a call to a function is an expression, so we can use it the same way that we use, say, the result of an addition. Just as we may say result = 'hello ' + 'world', we may say result = foo(). After that, result is our own, local name for that string, and we can do whatever we want with it.
We can use the same name, x, if we want. Or we can use a different name. The calling code doesn't have to know anything about how the function is written, or what names it uses for things.1
We can use the value directly to call another function: for example, print(foo()).2 We can return the value directly: simply return 'hello world', without assigning to x. (Again: we are returning a value, not a variable.)
The function can only return once each time it is called. return terminates the function - again, we just determined the result of the calculation, so there is no reason to calculate any further. If we want to return multiple pieces of information, therefore, we will need to come up with a single object (in Python, "value" and "object" are effectively synonyms; this doesn't work out so well for some other languages.)
We can make a tuple right on the return line; or we can use a dictionary, a namedtuple (Python 2.6+), a types.simpleNamespace (Python 3.3+), a dataclass (Python 3.7+), or some other class (perhaps even one we write ourselves) to associate names with the values that are being returned; or we can accumulate values from a loop in a list; etc. etc. The possibilities are endless..
On the other hand, the function returns whether you like it or not (unless an exception is raised). If it reaches the end, it will implicitly return the special value None. You may or may not want to do it explicitly instead.
Indirect methods
Other than returning the result back to the caller directly, we can communicate it by modifying some existing object that the caller knows about. There are many ways to do that, but they're all variations on that same theme.
If you want the code to communicate information back this way, please just let it return None - don't also use the return value for something meaningful. That's how the built-in functionality works.
In order to modify that object, the called function also has to know about it, of course. That means, having a name for the object that can be looked up in a current scope. So, let's go through those in order:
Local scope: Modifying a passed-in argument
If one of our parameters is mutable, we can just mutate it, and rely on the caller to examine the change. This is usually not a great idea, because it can be hard to reason about the code. It looks like:
def called(mutable):
mutable.append('world')
def caller():
my_value = ['hello'] # a list with just that string
called(my_value)
# now it contains both strings
If the value is an instance of our own class, we could also assign to an attribute:
class Test:
def __init__(self, value):
self.value = value
def called(mutable):
mutable.value = 'world'
def caller():
test = Test('hello')
called(test)
# now test.value has changed
Assigning to an attribute does not work for built-in types, including object; and it might not work for some classes that explicitly prevent you from doing it.
Local scope: Modifying self, in a method
We already have an example of this above: setting self.value in the Test.__init__ code. This is a special case of modifying a passed-in argument; but it's part of how classes work in Python, and something we're expected to do. Normally, when we do this, the calling won't actually check for changes to self - it will just use the modified object in the next step of the logic. That's what makes it appropriate to write code this way: we're still presenting an interface, so the caller doesn't have to worry about the details.
class Example:
def __init__(self):
self._words = ['hello']
def add_word(self):
self._words.append('world')
def display(self):
print(*self.words)
x = Example()
x.add_word()
x.display()
In the example, calling add_word gave information back to the top-level code - but instead of looking for it, we just go ahead and call display.3
See also: Passing variables between methods in Python?
Enclosing scope
This is a rare special case when using nested functions. There isn't a lot to say here - it works the same way as with the global scope, just using the nonlocal keyword rather than global.4
Global scope: Modifying a global
Generally speaking, it is a bad idea to change anything in the global scope after setting it up in the first place. It makes code harder to reason about, because anything that uses that global (aside from whatever was responsible for the change) now has a "hidden" source of input.
If you still want to do it, the syntax is straightforward:
words = ['hello']
def add_global_word():
words.append('world')
add_global_word() # `words` is changed
Global scope: Assigning to a new or existing global
This is actually a special case of modifying a global. I don't mean that assignment is a kind of modification (it isn't). I mean that when you assign a global name, Python automatically updates a dict that represents the global namespace. You can get that dict with globals(), and you can modify that dict and it will actually impact what global variables exist. (I.e., the return from globals() is the dictionary itself, not a copy.)5
But please don't. That's even worse of an idea than the previous one. If you really need to get the result from your function by assigning to a global variable, use the global keyword to tell Python that the name should be looked up in the global scope:
words = ['hello']
def replace_global_words():
global words
words = ['hello', 'world']
replace_global_words() # `words` is a new list with both words
Global scope: Assigning to or modifying an attribute of the function itself
This is a rare special case, but now that you've seen the other examples, the theory should be clear. In Python, functions are mutable (i.e. you can set attributes on them); and if we define a function at top level, it's in the global namespace. So this is really just modifying a global:
def set_own_words():
set_own_words.words = ['hello', 'world']
set_own_words()
print(*set_own_words.words)
We shouldn't really use this to send information to the caller. It has all the usual problems with globals, and it's even harder to understand. But it can be useful to set a function's attributes from within the function, in order for the function to remember something in between calls. (It's similar to how methods remember things in between calls by modifying self.) The functools standard library does this, for example in the cache implementation.
Builtin scope
This doesn't work. The builtin namespace doesn't contain any mutable objects, and you can't assign new builtin names (they'll go into the global namespace instead).
Some approaches that don't work in Python
Just calculating something before the function ends
In some other programming languages, there is some kind of hidden variable that automatically picks up the result of the last calculation, every time something is calculated; and if you reach the end of a function without returning anything, it gets returned. That doesn't work in Python. If you reach the end without returning anything, your function returns None.
Assigning to the function's name
In some other programming languages, you are allowed (or expected) to assign to a variable with the same name as the function; and at the end of the function, that value is returned. That still doesn't work in Python. If you reach the end without returning anything, your function still returns None.
def broken():
broken = 1
broken()
print(broken + 1) # causes a `TypeError`
It might seem like you can at least use the value that way, if you use the global keyword:
def subtly_broken():
global subtly_broken
subtly_broken = 1
subtly_broken()
print(subtly_broken + 1) # 2
But this, of course, is just a special case of assigning to a global. And there's a big problem with it - the same name can't refer to two things at once. By doing this, the function replaced its own name. So it will fail next time:
def subtly_broken():
global subtly_broken
subtly_broken = 1
subtly_broken()
subtly_broken() # causes a `TypeError`
Assigning to a parameter
Sometimes people expect to be able to assign to one of the function's parameters, and have it affect a variable that was used for the corresponding argument. However, this does not work:
def broken(words):
words = ['hello', 'world']
data = ['hello']
broken(data) # `data` does not change
Just like how Python returns values, not variables, it also passes values, not variables. words is a local name; by definition the calling code doesn't know anything about that namespace.
One of the working methods that we saw is to modify the passed-in list. That works because if the list itself changes, then it changes - it doesn't matter what name is used for it, or what part of the code uses that name. However, assigning a new list to words does not cause the existing list to change. It just makes words start being a name for a different list.
For more information, see How do I pass a variable by reference?.
1 At least, not for getting the value back. If you want to use keyword arguments, you need to know what the keyword names are. But generally, the point of functions is that they're an abstraction; you only need to know about their interface, and you don't need to think about what they're doing internally.
2 In 2.x, print is a statement rather than a function, so this doesn't make an example of calling another function directly. However, print foo() still works with 2.x's print statement, and so does print(foo()) (in this case, the extra parentheses are just ordinary grouping parentheses). Aside from that, 2.7 (the last 2.x version) has been unsupported since the beginning of 2020 - which was nearly a 5 year extension of the normal schedule. But then, this question was originally asked in 2010.
3Again: if the purpose of a method is to update the object, don't also return a value. Some people like to return self so that you can "chain" method calls; but in Python this is considered poor style. If you want that kind of "fluent" interface, then instead of writing methods that update self, write methods that create a new, modified instance of the class.
4 Except, of course, that if we're modifying a value rather than assigning, we don't need either keyword.
5 There's also a locals() that gives you a dict of local variables. However, this cannot be used to make new local variables - the behaviour is undefined in 2.x, and in 3.x the dict is created on the fly and assigning to it has no effect. Some of Python's optimizations depend on the local variables for a function being known ahead of time.
>>> def foo():
return 'hello world'
>>> x = foo()
>>> x
'hello world'
You can use global statement and then achieve what you want without returning value from
the function. For example you can do something like below:
def foo():
global x
x = "hello world"
foo()
print x
The above code will print "hello world".
But please be warned that usage of "global" is not a good idea at all and it is better to avoid usage that is shown in my example.
Also check this related discussion on about usage of global statement in Python.

Sentinel object and its applications?

I know in python the builtin object() returns a sentinel object. I'm curious to what it is, but mainly its applications.
object is the base class that all other classes inherit from in python 3. There's not a whole lot you can do with a plain old object. However an object's identity could be useful. For example the iter function takes a sentinel argument that signals when to stop termination. We could supply an object() to that.
sentinel = object()
def step():
inp = input('enter something: ')
if inp == 'stop' or inp == 'exit' or inp == 'done':
return sentinel
return inp
for inp in iter(step, sentinel):
print('you entered', inp)
This will ask for input until the user types stop, exit, or done. I'm not exactly sure when iter with a sentinel is more useful than a generator, but I guess it's interesting anyway.
I'm not sure if this answers your question. To be clear, this is just a possible application of object. Fundamentally its existence in the python language has nothing to do with it being usable as a sentinel value (to my knowledge).
Object identity and Classes in Python
You statement "I know in python the builtin object() returns a sentinel object." is slightly off, but not totally wrong, so let me first address that just to make sure we're on the same page:
object() in Python is merely the parent of all classes. In Python 2 this was for a while explicit. In Python 2 you had to write:
class Foo(object):
...
to get a so-called "new-style object". You could also define classes without that superclass, but that was only for backwards compatibility and not important for this question.
Today in Python 3, the object superclass is implicit. So all classes inherit from that class. As such, the two classes below are identical in Python 3:
class Foo:
pass
class Foo(object):
pass
Knowing this, we can slightly rephrase your initial statement:
... the builtin object() returns a sentinel object.
becomes then:
... the builtin object() returns an object instance of class "object"
So, when writing:
my_sentinel = object()
simply creates an empty object instance "somewhere in memory". That last part is importante, because by default, the builtin id() function and checks using ... is ..., rely on the memory address. For example:
>>> a = object()
>>> b = object()
>>> a is b
False
This gives you a way to create object instances that you can use to check for a certain kind of logic in your code that is otherwise very difficult or even impossible. That is the main use of "sentinel" objects.
Example use case: Making the difference between "None" and "Nothing/Uninitialised/Empty/..."
Sometimes the value None is a valid value for a variable and you may need to detect the difference between "empty" or something similar and None.
Let's assume you have a class doing lazy-loading for an expensive operation where "None" is a valid value. You can then write it like this:
#: sentinel value for uninitialised values
UNLOADED = object()
class MyLoader:
def __init__(self, remote_addr):
self.value = UNLOADED
self.remote_addr = remote_addr
def get(self):
if self.value is UNLOADED:
self.value = expensive_operation(self.remote_addr)
return self.value
Now expensive_operation can return any value. Even None or any other "falsy" value and the "caching" will work without unintended bugs. It also makes the code pretty readable as it communicates the intent pretty clearly to the reader of the code-block. You also save storage (albeit negilgable) for an additional "is_loaded" boolean value.
The same code using a boolean:
class MyLoader:
def __init__(self, remote_addr):
self.value = None
self.remote_addr = remote_addr
self.is_loaded = False # <- need for an additional variable
def get(self):
if not self.is_loaded:
self.value = expensive_operation(self.remote_addr)
self.is_loaded = True # <- source for a bug if this is forgotten
return self.value
or, using "None" as default:
class MyLoader:
def __init__(self, remote_addr):
self.value = None # <- We'll use this to detect load state
self.remote_addr = remote_addr
def get(self):
if self.value is None:
self.value = expensive_operation(self.remote_addr)
# If the above returned "None" we will never "cache" the result
return self.value
Final Thoughts
The above "MyLoader" example is just one example where sentinel values can be handy. They help making code more readable and more expressive. They also avoid certain types of bugs.
They are especially useful in areas where one is tempted to use None to signify a special value. Whenever you think something like "When X is the case, I will set the variable to None" it may be worth thinking about using a sentinel value. Because you now gave the value None a special meaning for a specific context.
Another such example would be to have special values for infinite integers. The concept of infinity only exists in floats. But if you want to ensure type-safety you may want to create your own "special" values like that to signify infinity.
Using sentinel values like that help distinguish between multiple different concepts which would otherwise be impossible. If you need many different "special" values and use None everywhere, you may end up using None from one concept in the context of another concept and end up with unintended side-effects which may be hard to debug. Imagine a contrived function like this:
SENTINEL_A = object()
SENTINEL_B = object()
def foobar(a = SENTINEL_A, b = SENTINEL_B):
if a is SENTINEL_A:
a = -12
if b is SENTINEL_B:
b = a * 2
print(a+b)
By using sentinels like this it becomes impossible to accidentally triggering the if-branches by mixing up the variables. For example, assume you refactor code and trip up somewhere, mixin a and b like this:
SENTINEL_A = object()
SENTINEL_B = object()
def foobar(a = SENTINEL_A, b = SENTINEL_B):
if b is SENTINEL_A: # <- bug: using *b* instead of *a*
a = -12
if b is SENTINEL_B:
b = a * 2
print(a+b)
In that case, the first if can never be true (unless the function is called improperly of course). If you would have used None as default instead, this bug would become harder to detect because you would end up with a = -12 in cases where you would not expect it.
In that sense, sentinels make your code more robust. And if logical-errors occur in your code they will be easier to find.
Having said all that, sentinel values are pretty rare. I personally find them very useful to avoid excessive usages of None for flagging special cases.
This is a source code example from the Python standard library for dataclasses on using sentinel values
# A sentinel object to detect if a parameter is supplied or not. Use
# a class to give it a better repr.
class _MISSING_TYPE:
pass
MISSING = _MISSING_TYPE()

How to maintain lists and dictionaries between function calls in Python?

I have a function. Inside that I'm maintainfing a dictionary of values.
I want that dictionary to be maintained between different function calls
Suppose the dic is :
a = {'a':1,'b':2,'c':3}
At first call,say,I changed a[a] to 100
Dict becomes a = {'a':100,'b':2,'c':3}
At another call,i changed a[b] to 200
I want that dic to be a = {'a':100,'b':200,'c':3}
But in my code a[a] doesn't remain 100.It changes to initial value 1.
I need an answer ASAP....I m already late...Please help me friends...
You might be talking about a callable object.
class MyFunction( object ):
def __init__( self ):
self.rememberThis= dict()
def __call__( self, arg1, arg2 ):
# do something
rememberThis['a'] = arg1
return someValue
myFunction= MyFunction()
From then on, use myFunction as a simple function. You can access the rememberThis dictionary using myFunction.rememberThis.
You could use a static variable:
def foo(k, v):
foo.a[k] = v
foo.a = {'a': 1, 'b': 2, 'c': 3}
foo('a', 100)
foo('b', 200)
print foo.a
Rather than forcing globals on the code base (that can be the decision of the caller) I prefer the idea of keeping the state related to an instance of the function. A class is good for this but doesn't communicate well what you are trying to accomplish and can be a bit verbose. Taking advantage of closures is, in my opinion, a lot cleaner.
def function_the_world_sees():
a = {'a':1,'b':2,'c':3}
def actual_function(arg0, arg1):
a[arg0] = arg1
return a
return actual_function
stateful_function = function_the_world_sees()
stateful_function("b", 100)
stateful_function("b", 200)
The main caution to keep in mind is that when you make assignments in "actual_function", they occur within "actual_function". This means you can't reassign a to a different variable. The work arounds I use are to put all of my variables I plan to reassign into either into a single element list per variable or a dictionary.
If 'a' is being created inside the function. It is going out of scope. Simply create it outside the function(and before the function is called). By doing this the list/hash will not be deleted after the program leaves the function.
a = {'a':1,'b':2,'c':3}
# call you funciton here
This question doesn't have an elegant answer, in my opinion. The options are callable objects, default values, and attribute hacks. Callable objects are the right answer, but they bring in a lot of structure for what would be a single "static" declaration in another language. Default values are a minor change to the code, but it's kludgy and can be confusing to a new python programmer looking at your code. I don't like them because their existence isn't hidden from anyone who might be looking at your API.
I generally go with an attribute hack. My preferred method is:
def myfunct():
if not hasattr(myfunct, 'state'): myfunct.state = list()
# access myfunct.state in the body however you want
This keeps the declaration of the state in the first line of the function where it belongs, as well as keeping myfunct as a function. The downside is you do the attribute check every time you call the function. This is almost certainly not going to be a bottleneck in most code.
You can 'cheat' using Python's behavior for default arguments. Default arguments are only evaluated once; they get reused for every call of the function.
>>> def testFunction(persistent_dict={'a': 0}):
... persistent_dict['a'] += 1
... print persistent_dict['a']
...
>>> testFunction()
1
>>> testFunction()
2
This isn't the most elegant solution; if someone calls the function and passes in a parameter it will override the default, which probably isn't what you want.
If you just want a quick and dirty way to get the results, that will work. If you're doing something more complicated it might be better to factor it out into a class like S. Lott mentioned.
EDIT: Renamed the dictionary so it wouldn't hide the builtin dict as per the comment below.
Personally, I like the idea of the global statement. It doesn't introduce a global variable but states that a local identifier actually refers to one in the global namespace.
d = dict()
l = list()
def foo(bar, baz):
global d
global l
l.append(bar, baz)
d[bar] = baz
In python 3.0 there is also a "nonlocal" statement.

Categories

Resources