Python and reference passing. Limitation? - python

I would like to do something like the following:
class Foo(object):
def __init__(self):
self.member = 10
pass
def factory(foo):
foo = Foo()
aTestFoo = None
factory(aTestFoo)
print aTestFoo.member
However it crashes with AttributeError: 'NoneType' object has no attribute 'member':
the object aTestFoo has not been modified inside the call of the function factory.
What is the pythonic way of performing that ? Is it a pattern to avoid ? If it is a current mistake, how is it called ?
In C++, in the function prototype, I would have added a reference to the pointer to be created in the factory... but maybe this is not the kind of things I should think about in Python.
In C#, there's the key word ref that allows to modify the reference itself, really close to the C++ way. I don't know in Java... and I do wonder in Python.

Python does not have pass by reference. One of the few things it shares with Java, by the way. Some people describe argument passing in Python as call by value (and define the values as references, where reference means not what it means in C++), some people describe it as pass by reference with reasoning I find quite questionable (they re-define it to use to what Python calls "reference", and end up with something which has nothing to do with what has been known as pass by reference for decades), others go for terms which are not as widely used and abused (popular examples are "{pass,call} by {object,sharing}"). See Call By Object on effbot.org for a rather extensive discussion on the defintions of the various terms, on history, and on the flaws in some of the arguments for the terms pass by reference and pass by value.
The short story, without naming it, goes like this:
Every variable, object attribute, collection item, etc. refers to an object.
Assignment, argument passing, etc. create another variable, object attribute, collection item, etc. which refers to the same object but has no knowledge which other variables, object attributes, collection items, etc. refer to that object.
Any variable, object attribute, collection item, etc. can be used to modify an object, and any other variable, object attribute, collection item, etc. can be used to observe that modification.
No variable, object attribute, collection item, etc. refers to another variable, object attribute, collection items, etc. and thus you can't emulate pass by reference (in the C++ sense) except by treating a mutable object/collection as your "namespace". This is excessively ugly, so don't use it when there's a much easier alternative (such as a return value, or exceptions, or multiple return values via iterable unpacking).
You may consider this like using pointers, but not pointers to pointers (but sometimes pointers to structures containing pointers) in C. And then passing those pointers by value. But don't read too much into this simile. Python's data model is significantly different from C's.

You are making a mistake here because in Python
"We call the argument passing technique _call by sharing_,
because the argument objects are shared between the
caller and the called routine. This technique does not
correspond to most traditional argument passing techniques
(it is similar to argument passing in LISP). In particular it
is not call by value because mutations of arguments per-
formed by the called routine will be visible to the caller.
And it is not call by reference because access is not given
to the variables of the caller, but merely to certain objects."
in Python, the variables in the formal argument list are bound to the
actual argument objects. the objects are shared between caller
and callee; there are no "fresh locations" or extra "stores" involved.
(which, of course, is why the CLU folks called this mechanism "call-
by-sharing".)
and btw, Python functions doesn't run in an extended environment, either. function bodies have very limited access to the surrounding environment.

The Assignment Statements section of the Python docs might be interesting.
The = statement in Python acts differently depending on the situation, but in the case you present, it just binds the new object to a new local variable:
def factory(foo):
# This makes a new instance of Foo,
# and binds it to a local variable `foo`,
foo = Foo()
# This binds `None` to a top-level variable `aTestFoo`
aTestFoo = None
# Call `factory` with first argument of `None`
factory(aTestFoo)
print aTestFoo.member
Although it can potentially be more confusing than helpful, the dis module can show you the byte-code representation of a function, which can reveal how Python works internally. Here is the disassembly of `factory:
>>> dis.dis(factory)
4 0 LOAD_GLOBAL 0 (Foo)
3 CALL_FUNCTION 0
6 STORE_FAST 0 (foo)
9 LOAD_CONST 0 (None)
12 RETURN_VALUE
What that says is, Python loads the global Foo class by name (0), and calls it (3, instantiation and calling are very similar), then stores the result in a local variable (6, see STORE_FAST). Then it loads the default return value None (9) and returns it (12)
What is the pythonic way of performing that ? Is it a pattern to avoid ? If it is a current mistake, how is it called ?
Factory functions are rarely necessary in Python. In the occasional case where they are necessary, you would just return the new instance from your factory (instead of trying to assign it to a passed-in variable):
class Foo(object):
def __init__(self):
self.member = 10
pass
def factory():
return Foo()
aTestFoo = factory()
print aTestFoo.member

Your factory method doesn't return anything - and by default it will have a return value of None. You assign aTestFoo to None, but never re-assign it - which is where your actual error is coming from.
Fixing these issues:
class Foo(object):
def __init__(self):
self.member = 10
pass
def factory(obj):
return obj()
aTestFoo = factory(Foo)
print aTestFoo.member
This should do what I think you are after, although such patterns are not that typical in Python (ie, factory methods).

Related

Refcount python objects

I am trying to understand how the refcounts work in python, Can someone explain why the count gets printed as 2 when its just 1 instance for the object? (is it because of the temporary being passed to the getrefcount method?) Also why is the number more when invoked when invoked from the member (is self reference bumping the refcount?)
import sys
class A:
def print_ref_count(self):
print sys.getrefcount(self)
a = A()
print sys.getrefcount(a) # prints 2
a.print_ref_count() # prints 4
print sys.getrefcount(a) # prints 2
There are implicit reference increments everywhere in Python. When you pass a as an argument to a function, it is incref-ed, as getrefcount's own docstring notes:
The count returned is generally one higher than you might expect, because it includes the (temporary) reference as an argument to getrefcount().
The two additional references when you invoke the method are, respectively:
The reference held by the name self within the method
The implicit reference created when it makes a bound method object specific to that instance when you do a.print_ref_count (even before actually calling it); type(a.print_ref_count) is a temporary bound method, which must include references to both the instance and the method itself (you could do to_print = a.print_ref_count, then del a, and to_print must still work even though a is no longer referencing the instance)
Getting hung up on the reference counts is not very helpful though; it's an implementation detail of Python (CPython specifically; other Python interpreters can and do use non-reference counting garbage collection), and as you can see, many implicit references are created that may not actually be stored in named variables at all.

Why do functions/methods in python need self as parameter? [duplicate]

This question already has answers here:
What is the purpose of the `self` parameter? Why is it needed?
(26 answers)
Closed 8 years ago.
I can understand why it is needed for local variables, (self.x), but why is is nessecary as parameter in a function? Is there something else you could put there instead of self?
Please explain in as much layman terms as possible, I never had decent programming education.
By default, every function declared in the namespace of a class assumes that its first argument will be a reference to an instance of that class. (Other types of functions are decorated with #classmethod and #staticmethod to change this assumption.) Such functions are called methods. By convention, Python programmers name that first parameter self, but Python doesn't care what you call it. When a method is called, you must provide that reference. For example (with self replaced by foobar to demonstrate that self is not the required name):
class A:
def __init__(foobar):
foobar.x = 5
def somefunc(foobar, y):
print foobar.x + y
a = A()
print A.somefunc(a, 3) # Prints 8
Python provides some syntactic sugar to make the link between an object and a method called on it more obvious, by letting you call a bound method instead of the function itself. That is, a.somefunc(3) and A.somefunc(a, 3) are equivalent. In Python terminology, A.somefunc is an unbound method, since it still needs an instance when it is called:
f = A.somefunc
print f(a, 3)
By contrast, a.somefunc is called a bound method, since you have already provided the instance to use as the first argument:
f = a.somefunc
print f(3)
If you consider it, EVERY programming language does that - or, at least, the most common languages like pascal, c++ or java do). EXCEPT that, in most programming languages, the this keyword is assumed and not passed as a parameter. Consider the function pointers in those languages: they're different than method-pointers.
Pascal:
function(a: Integer): Integer;
vs
function(a: Integer): Integer of object;
The latter considers the self pointer (yes, it's named self but it's an implicit pointer like the this in c++, while the python self is explicit).
C++:
typedef int (*mytype)(int a);
vs
typedef int Anyclass::(*mytype)(int a);
As difference with Pascal, in C++ you must specify the class owning the method. Anyway, this method pointer declaration states the difference between a function expecting a this or not.
But Python takes seriously it's Zen, as quichua people takes seriously their Ama Suway, Ama Llullay, Ama K'ellay:
Explicit is better than implicit.
So, that's why you see explicitly the self parameter (and must write it, of course) for instance methods and #classmethods (althought it's usually called cls there since it's intention is to dynamically know the class and not the instance). Python does not assume a this or self keyword must exist inside the methods (so, the namespace has only true variables - remember you are NOT forced to name them self or cls althought it's usual and expected).
Finally, if you get:
x = AClass.amethod #unbound method
You must call it as
x(aninstance, param, param2, ..., named=param, named2=param2, ...)
While getting it as:
x = anInstance.method #bound method, has `im_self` attribute set to the instance.
must be called as:
x(param, param2, ..., named=param, named2=param2, ...)
Yes, the self is explicit in the parameter list since it's not assumed a keyword or 'backdoor' must exist, but not in the argument list because of syntactic sugar every OOP language has (weird criteria, huh?).
It's how Python's implementation of object oriented programming works -- a method of an instance (a so-called bound method) is called with that instance as its first argument.
Besides variables of the instance (self.x) you can also use it for anything else, e.g. call another method (self.another_method()), pass this instance as a parameter to something else entirely (mod.some_function(3, self)), or use it to call this method in the superclass of this class (return super(ThisClass, self).this_method()).
You can give it an entirely different name as well. Using pony instead of self will work just as well. But you shouldn't, for obvious reasons.
The use of self for the first reference in a method is entirely convention.
You can call it something else, even inconsistently in the same class:
class Foo(object):
def __init__(notself, i):
notself.i=i # note 'notself' instead of 'self'
def __str__(self):
return str(self.i) # back to the convention
f=Foo(22)
print f
But please don't do that. It is confusing to others that may read your code (or yourself when you read it later).

Python: how to check if an argument's type is such that it will be passed by value or by reference?

It seems that atomic types (int, string, ...) are passed by value, and all others (objects, pointers to functions, pointers to methods, ...) are passed by reference.
What is the best way to check if a variable will be passed by value or by reference?
isinstance(input_, float) or isinstance(input_, basestring) or <...>
seems to be very inelegant.
The reason why I need it is below: I have a class that wraps wx.Button, if args/kwargs are of types that are passed by value, updating their values in other objects will not be taken into account. So some checks will be beneficial
class PalpyButton(wx.Button):
def __init__(self, parent, btnLabel, handler, successMsg = None, args = (), kwargs = {}):
super(PalpyButton, self).__init__(parent, -1, btnLabel)
self.handler = handler
self.successMsg = successMsg
parent.Bind(wx.EVT_BUTTON, lambda event: self.onClick(event, *args, **kwargs), self)
def onClick(self, event, *args, **kwargs):
try:
self.handler(*args, **kwargs)
if self.successMsg != None:
if hasattr(self.successMsg, '__call__'):
showInfoMessageBox(self.successMsg())
else:
showInfoMessageBox(self.successMsg)
except BaseException, detail:
showErrorMessageBox(detail)
The question does not arise. Everything is passed in exactly the same way, regardless of type, value, operating system, phase of the moon, etc.
Whether that style of argument passing is best described as pass by value, pass by reference, or something else, is up for debate (though note that using "pass by value/reference" requires non-standard definitions of "value"/"reference", which is the primary reasoning for other terms such as "pass by object"). It doesn't matter here. It's just always the same.
Edit: In terms of your example, the semantics of handler doesn't change based on the argument types. Either is assigns to the variables (AKA names):
def my_handler(x):
x = <whatever>
... but then my_handler(<anything>) does not change what object the argument refers to (and it doesn't matter either if you pass in a local variable, an object attribute, the result of a more complex expression, or anything else). By the way, everything is an object, your artificial distinctions ("atomic", "pointers to functions") make no sense.
Alternatively, the function (tries to) change the value of the objects (e.g. add/remove items to a collection, change an object attribute, call a method that changes some state that can be observed indirectly). That may raise exceptions if the object does not support that (e.g. call a method that does not exist), but that happens for any access, not just for mutation. If the object's value is indeed changed, this is always visible through every reference to that object. As new references to existing objects are created in a number of cases (such as argument passing, variable assignment, attribute assignment, etc.) you can observe that objects are not copied by changing their value in one place and observing the change in another place. This is true for all objects (again, provided you can change the object's value).
Now, some objects are immutable, which (by definition) means you can't change their value, and hence can't observe the object "sharing" in this way (you can observe it in other ways though). That doesn't mean assignment has different semantics depending on mutability (it doesn't), it just means you can do things with mutable objects which you can't do with immutable objects. The operations you can perform on both work the same for both.
If you really have the need to change the value of a variable passed to you, you can wrap it in an array to implement a passing-by-reference in the process:
def increaseInt(i_var):
i_var[0] += 1
i_ref = [42]
increase(i_ref)
print i_ref[0] # will print 43
In some cases this can be a (warty) solution to a problem like yours.
But in general, all values are passed by reference in Python (in the sense that no copies are created); some values are just immutable (like ints, strings, tuples, …). You might be confused by assigning a new value to the parameter variable — that definitely won't change the value which was in the parameter variable before, it just overwrites it with a new one (and thus removes additional reference to the original value). That is why the example above does not assign anything to i_var but something to i_var[0].

Python achieve pointer like behaviour

I have to write a testing module and have c++-Background. That said, I am aware that there are no pointers in python but how do I achieve the following:
I have a test method which looks in pseudocode like this:
def check(self,obj,prop,value):
if obj.prop <> value: #this does not work,
#getattr does not work either, (objects has no such method (interpreter output)
#I am working with objects from InCyte's python interface
#the supplied findProp method does not do either (i get
#None for objects I can access on the shell with obj.prop
#and yes I supply the method with a string 'prop'
if self._autoadjust:
print("Adjusting prop from x to y")
obj.prop = value #setattr does not work, see above
else:
print("Warning Value != expected value for obj")
Since I want to check many different objects in separate functions I would like to be able to keep the check method in place.
In general, how do I ensure that a function affects the passed object and does not create a copy?
myobj.size=5
resize(myobj,10)
print myobj.size #jython =python2.5 => print is not a function
I can't make resize a member method since the myobj implementation is out of reach, and I don't want to type myobj=resize(myobj, 10) everywhere
Also, how can I make it so that I can access those attributes in a function to which i pass the object and the attribute name?
getattr isn't a method, you need to call it like this
getattr(obj, prop)
similarly setattr is called like this
setattr(obj, prop, value)
In general how do I ensure that a function affects the passed object and does not create a copy?
Python is not C++, you never create copies unless you explicitly do so.
I cant make resize a member method since myobj implementation is out of reach, and I don't want to type myobj=resize(myobj,10) everywere
I don't get it? Why should be out of reach? if you have the instance, you can invoke its methods.
In general, how do I ensure that a function affects the passed object
By writing code inside the function that affects the passed-in object, instead of re-assigning to the name.
and does not create a copy?
A copy is never created unless you ask for one.
Python "variables" are names for things. They don't store objects; they refer to objects. However, unlike C++ references, they can be made to refer to something else.
When you write
def change(parameter):
parameter = 42
x = 23
change(x)
# x is still 23
The reason x is still 23 is not because a copy was made, because a copy wasn't made. The reason is that, inside the function, parameter starts out as a name for the passed-in integer object 23, and then the line parameter = 42 causes parameter to stop being a name for 23, and start being a name for 42.
If you do
def change(parameter):
parameter.append(42)
x = [23]
change(x)
# now x is [23, 42]
The passed-in parameter changes, because .append on a list changes the actual list object.
I can't make resize a member method since the myobj implementation is out of reach
That doesn't matter. When Python compiles, there is no type-checking step, and there is no step to look up the implementation of a method to insert the call. All of that is handled when the code actually runs. The code will get to the point myobj.resize(), look for a resize attribute of whatever object myobj currently refers to (after all, it can't know ahead of time even what kind of object it's dealing with; variables don't have types in Python but instead objects do), and attempt to call it (throwing the appropriate exceptions if (a) the object turns out not to have that attribute; (b) the attribute turns out not to actually be a method or other sort of function).
Also, how can I make it so that I can access those attributes in a function to which i pass the object and the attribute name? / getattr does not work either
Certainly it works if you use it properly. It is not a method; it is a built-in top-level function. Same thing with setattr.

Python functions can be given new attributes from outside the scope?

I didn't know you could do this:
def tom():
print "tom's locals: ", locals()
def dick(z):
print "z.__name__ = ", z.__name__
z.guest = "Harry"
print "z.guest = ", z.guest
print "dick's locals: ", locals()
tom() #>>> tom's locals: {}
#print tom.guest #AttributeError: 'function' object has no attribute 'guest'
print "tom's dir:", dir(tom) # no 'guest' entry
dick( tom) #>>> z.__name__ = tom
#>>> z.guest = Harry
#>>> dick's locals: {'z': <function tom at 0x02819F30>}
tom() #>>> tom's locals: {}
#print dick.guest #AttributeError: 'function' object has no attribute 'guest'
print tom.guest #>>> Harry
print "tom's dir:", dir(tom) # 'guest' entry appears
Function tom() has no locals. Function dick() knows where tom() lives and puts up Harry as 'guest' over at tom()'s place. harry doesn't appear as a local at tom()'s place, but if you ask for tom's guest, harry answers. harry is a new attribute at tom().
UPDATE: From outside tom(), you can say "print dir(tom)" and see the the tom-object's dictionary. (You can do it from inside tom(), too. So tom could find out he had a new lodger, harry, going under the name of 'guest'.)
So, attributes can be added to a function's namespace from outside the function? Is that often done? Is it acceptable practice? Is it recommended in some situations? Is it actually vital at times? (Is it Pythonic?)
UPDATE: Title now says 'attributes'; it used to say 'variables'. Here's a PEP about Function Attributes.
I think you might be conflating the concepts of local variables and function attributes. For more information on Python function attributes, see the SO question Python function attributes - uses and abuses.
#behindthefall, the motivation to give function objects generic assignable attributes (they didn't use to have them) was that, absent such possibilities, real and popular frameworks were abusing what few assignable attributes existed (typically __doc__) to record information about each given function object. So there was clearly a "pent-up demand" for this functionality, so Guido decided to address it directly (adding an optional dict to each function object to record its attributes isn't a big deal -- most function objects don't need it, and it is optional, so the cost is just 4 bytes for a null pointer;-).
Assigning such attributes in arbitrary places would be very bad practice, making the code harder to understand for no real benefit, but they're very useful when used in a controlled way -- for example, a decorator could usefully record all kinds of things about the function being decorated, and the context in which the decoration occurred, as attributes of the wrapper function, allowing trivially-easy introspection of such metadata to occur later at any time, as needed.
As other answers already pointed out, local variables (which are per-instance, not per-function object!) are a completely disjoint namespace from a function object's attributes held in its __dict__.
In python, a namespace is just a dictionary object, mapping variable name as a string (in this case, 'guest') to a value (in this case, 'Harry'). So as long as you have access to an object, and it's mutable, you can change anything about its namespace.
On small projects, it's not a huge problem, and lets you hack things together faster, but incredibly confusing on larger projects, where your data could be modified from anywhere.
There are ways of making attributes of classes "more private", such as Name Mangling.
tom.guest is just a property on the tom function object, it has nothing to do with the scope or locals() inside that function, and nothing to do with that fact that tom is a function, it would work on any object.
I have used this in the past to make a self-contained function with "enums" that go along with it.
Suppose I were implementing a seek() function. The built-in Python one (on file objects) takes an integer to tell it how to operate; yuck, give me an enum please.
def seek(f, offset, whence=0):
return f.seek(offset, whence)
seek.START = 0
seek.RELATIVE = 1
seek.END = 2
f = open(filename)
seek(f, 0, seek.START) # seek to start of file
seek(f, 0, seek.END) # seek to end of file
What do you think, too tricky and weird? I do like how it keeps the "enum" values bundled together with the function; if you import the function from a module, you get its "enum" values as well, automatically.
Python functions are lexically scoped so there is no way to add variables to the function outside of its defined scope.
However, the function still will have access to all parent scopes, if you really wanted to design the system like that (generally considered bad practice though):
>>> def foo():
>>> def bar():
>>> print x
>>> x = 1
>>> bar()
1
Mutating function variables is mostly a bad idea, since functions are assumed to be immutable. The most pythonic way of implementing this behavior is using classes and methods instead.
Python API documentation generation tools, such as pydoc and epydoc, use introspection to determine a function's name and docstring (available as the __name__ and __doc__ attributes). Well-behaved function decorators are expected to preserve these attributes, so such tools continue to work as expected (i.e. decorating a function should preserve the decorated function's documentation). You do this by copying these attributes from the decorated function to the decorator. Take a look at update_wrapper in the functools module:
WRAPPER_ASSIGNMENTS = ('__module__', '__name__', '__doc__')
WRAPPER_UPDATES = ('__dict__',)
def update_wrapper(wrapper,
wrapped,
assigned = WRAPPER_ASSIGNMENTS,
updated = WRAPPER_UPDATES):
"""Update a wrapper function to look like the wrapped function
wrapper is the function to be updated
wrapped is the original function
...
"""
for attr in assigned:
setattr(wrapper, attr, getattr(wrapped, attr))
for attr in updated:
getattr(wrapper, attr).update(getattr(wrapped, attr, {}))
...
So, that's at least one example where modifying function attributes is useful and accepted.
It some situations, it can be useful to "annotate" a function by setting an attribute; Django uses this in a few places:
You can set alters_data to True
on model methods that change the
database, preventing them from being
called in templates.
You can set
allow_tags on model methods that
will be displayed in the admin, to
signify that the method returns HTML
content, which shouldn't be
automatically escaped.
As always, use your judgement. If modifying attributes is accepted practice (for example, when writing a decorator), then by all means go ahead. If it's going to be part of a well documented API, it's probably fine too.

Categories

Resources