I'm relatively new to Python so I hope I haven't missed something, but here goes...
I'm trying to write a Python module, and I'd like to create a class with a "private" attribute that can (or maybe 'should') only be modified through one or more functions within the module. This is in an effort to make the module more robust, since setting of this attribute outside of these functions could lead to unwanted behaviour. For example, I might have:
A class that stores x and y values for a scatter plot, Data
A function to read x and y values from a file and store them in the class, read()
A function to plot them, plot()
In this case, I would prefer if the user wasn't able to do something like this:
data = Data()
read("file.csv", data)
data.x = [0, 3, 2, 6, 1]
plot(data)
I realise that adding a single leading underscore to the name indicates to the user that the attribute should not be changed, i.e. rename to _x and add a property decorator so that the user can access the value without feeling guilty. However, what if I wanted to add a setter property as well:
class Data(object):
    _x = []
    _y = []

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        # Do something with value
        self._x = value
I'm now in the same position as I was before - the user can no longer directly access the attribute _x, but they can still set it using:
data.x = [0, 3, 2, 6, 1]
Ideally I'd rename the property function definitions to _x(), but this leads to confusion about what self._x actually means (depending on the order in which they are declared, this seems to result in either the setter being called recursively or the setter being ignored in favour of the attribute).
A couple of solutions I can think of:
Add a double leading underscore to the attribute, __x, so that the name becomes mangled and does not get confused with the setter function. As I understand it, this should be reserved for attributes that a class does not wish to share with possible subclasses, so I'm not sure if this is a legitimate use.
Rename the attribute, e.g. _x_stored. While this solves the problem completely, it makes the code harder to read and introduces naming convention issues - which attributes do I rename? just the ones that are relevant? just the ones that have properties? just the ones within this class?
Are either of the above solutions applicable? And if not, is there a better way to solve this problem?
Edit
Thanks for the responses so far. A few points thrown up by the comments:
I want to retain the extra logic that the setter property gives me - the # Do something with value section in the above example - so internally setting the attribute through direct access of self._x doesn't solve the problem.
Removing the setter property and creating a separate function _set_x() does solve the problem, but is not a very neat solution since it allows setting of _x in two different ways - either by calling that function or through direct access of self._x. I'd then have to keep track of which attributes should be set by their own (non-property) setter function and which should be modified through direct access. I'd probably rather use one of the solutions I suggested above, because even though they make a mess of the naming conventions within the class they are at least consistent in their use outside of the class, i.e. they all use the syntactical sugar of properties. If there's no way of doing this in a neater way then I guess I just have to choose the one that causes the least disruption.
If you want to discourage users from changing a property, but want it to be clear that they can read it, I'd use @property without providing a setter, similar to what you described earlier:
class Data(object):
    def __init__(self):
        self._x = []
        self._y = []

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y
I know you mention "What if I wanted to add a setter to the property?", but I guess I would counter that with: Why add the setter if you don't want your clients to be able to set the property? Internally, you can access self._x directly.
As for a client directly accessing _x or _y, any variable with an '_' prefix is understood to be "private" in Python, so you should trust your clients to obey that. If they don't obey that, and end up screwing things up, that's their own fault. This kind of mindset runs counter to many other languages (C++, Java, etc.) where keeping data private is considered very important, but Python's culture is just different in this regard.
Edit
One other note, since your private properties in this particular case are lists, which are mutable (unlike strings or ints, which are immutable), a client could end up changing them somewhat accidentally:
>>> d = Data()
>>> print d.x
['1', '2']
>>> l = d.x
>>> print l
['1', '2']
>>> l.append("3")
>>> print d.x
['1', '2', '3'] # Oops!
If you want to avoid this, you'd need your property to return a copy of the list:
@property
def x(self):
    return list(self._x)
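A minimal sketch of the effect, assuming a hypothetical constructor argument x (the original Data class takes none) just so there is something to copy:

```python
class Data(object):
    def __init__(self, x):
        # Hypothetical constructor argument, added only so the sketch has data.
        self._x = list(x)

    @property
    def x(self):
        # Return a shallow copy so callers cannot mutate the internal list.
        return list(self._x)


d = Data(["1", "2"])
leaked = d.x
leaked.append("3")      # modifies only the copy
internal_after = d.x    # the internal list still has two items
```

Note that the copy is shallow: if the list held mutable elements (nested lists, say), those elements would still be shared with the caller.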
If you want less convoluted properties, that manage their own storage without leaving it open to "under the hood" alteration, you can define a class (similar to property) and use it to declare your class member:
I called mine 'Field':
class Field:
    def __init__(self, default=None):
        self.valueName = None              # actual attribute name
        self.default = default             # type or value or lambda
        if not callable(default):
            self.default = lambda: default
        self._didSet = None                # observers
        self._willSet = None

    def findName(self, owner):             # find name of field
        if self.valueName:
            return                         # once per field for class
        for name, attr in owner.__dict__.items():
            if attr is self:
                self.valueName = f"<{name}>"   # actual attribute name
                break

    def __get__(self, obj, owner=None):    # generic getter
        if obj is None:
            return self                    # accessed on the class itself
        self.findName(owner or type(obj))
        value = getattr(obj, self.valueName, self)   # attribute from instance
        if value is self:
            value = self.default()                   # default value
            setattr(obj, self.valueName, value)      # set at 1st reference
        return value

    def __set__(self, obj, value):         # generic setter
        self.findName(type(obj))
        if self._willSet:
            value = self._willSet(obj, value)
        if self._didSet:
            oldValue = self.__get__(obj)
        setattr(obj, self.valueName, value)          # attribute in instance
        if self._didSet:
            self._didSet(obj, oldValue)

    def willSet(self, f):
        self._willSet = f

    def didSet(self, f):
        self._didSet = f
usage:
class myClass:
    lastName = Field("Doe")
    firstName = Field("")
    age = Field(int)
    gender = Field("M")
    relatives = Field(list)

    @lastName.willSet
    def _(self, newValue):         # no function name needed
        return newValue.capitalize()

    @lastName.didSet
    def _(self, oldValue):         # no function name needed
        print('last name changed from', oldValue, 'to', self.lastName)
c = myClass()
c.firstName = "John"
c.lastName = "Smith"
# last name changed from Doe to Smith
c.relatives.extend(['Lucy','Frank'])
print(c.gender)
# M
print(c.__dict__)
# {'<firstName>': 'John', '<lastName>': 'Smith',
#  '<relatives>': ['Lucy', 'Frank'], '<gender>': 'M'}
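Attributes added to the instance cannot be reached with ordinary dotted access (e.g. c.<lastName> is a syntax error) because their names are not valid identifiers. They are still reachable through getattr() or the instance's __dict__, so this guards against accidents rather than determined access.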
Because you define default values at the class level, there is no need to set the field values in the constructor (although you could still do it as needed)
Field values are only added as instance attributes when they are referenced making the instance creation process more efficient.
Note that my actual Field class is a lot more sophisticated and supports change tracking, more observer functions, type checking, and read-only/calculated fields. I boiled it down to essentials for this response
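To illustrate the point about non-identifier attribute names, here is a small sketch. Holder is a hypothetical stand-in class, not part of the Field code:

```python
class Holder(object):
    pass


h = Holder()

# "<name>" is not a valid identifier, so it cannot be written after a dot,
# but setattr() accepts it as a plain string key.
setattr(h, "<name>", "Smith")

# The value is still reachable by string lookup.
via_getattr = getattr(h, "<name>")
via_dict = h.__dict__["<name>"]
```

So the odd names keep the storage out of the way of normal code, but they are not a security boundary.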
For Private/Public method protection, you may want to look at this answer
I want to be able to change the reference of a variable within the class Test
class Test():
    def change(self, Other_Class):
        self.__class__ = Other_Class.__class__
        self = Other

class Other():
    def set_data(self, data):
        self.data = data

    def one(self):
        print('foo')

a = Test()
b = Other()
b.set_data([1,2,3])
a.change(b)
a.data
AttributeError: 'Other' object has no attribute 'data'
How can I change the reference to a to be whatever variable I pass to Test().change?
I would like this to work for builtin datatypes as well, but I get a different error for that.
What would be the best way to do this?
Inside Test.change, that self is a parameter, and parameters are just local variables.
And rebinding local variables doesn't have any effect on anything outside of the function.
In particular, it has no effect on any other variables (or list elements, or attributes of other objects, etc.), like the global a, that were also bound to the same value. They remain names for that same value.
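A stripped-down sketch of that point (the rebind method is a hypothetical name used only for this illustration):

```python
class Test(object):
    def rebind(self, other):
        # This rebinds only the local name `self`; the caller's variable
        # still refers to the original Test instance.
        self = other
        return self


a = Test()
b = [1, 2, 3]
returned = a.rebind(b)

still_a_test = isinstance(a, Test)   # `a` was not affected by the rebinding
```

The only way for the caller to see the new object is to use the return value, for example a = a.rebind(b).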
It's not even clear what you're trying to do here. You change the type of a into the type of b, and that works. But what else do you want to do?
Do you want to change a into the object b, with the same identity? If so, you don't need any methods for that; that's what a = b means. Or do you want to be a distinct instance, but share an instance __dict__? Or to copy all of b's attributes into a? Shallow or deep? Should any extra attributes a had lying around be removed as well? Do you only care about attributes stored in the __dict__, or do you need, e.g., __slots__ to work?
Anyway, something that might be reasonable, for some strange use case, is this:
import inspect

def change(self, other):
    # Remove the instance-level attributes that self currently has.
    inherited = dict(inspect.getmembers(self.__class__))
    for name, value in inspect.getmembers(self):
        if name not in inherited and not name.startswith('__'):
            delattr(self, name)
    # Take on other's class, then copy its instance-level attributes.
    self.__class__ = other.__class__
    inherited = dict(inspect.getmembers(other.__class__))
    for name, value in inspect.getmembers(other):
        if name not in inherited and not name.startswith('__'):
            setattr(self, name, value)
Whether that's useful for your use case, I have no idea. But maybe it gives you an idea of the kinds of things you can actually do with the Python data model.
I cannot get my head around what is happening here.
I create my class and use double underscores to mangle the names as a form of encapsulation. I create an instance and have get methods to access the attributes. When I then try to set them from outside the class, it looks like, instead of attempting to modify the attributes and failing, Python is creating new variables, external to the object, that have the same names as the object's attributes.
Is that what is happening? If not, what?
class Person:
    def __init__(self, __name, __position):
        self.__name = __name
        self.__position = __position

    def talk(self, aStatement):
        print(aStatement)

    def walk(self, aPosition):
        self.__position = aPosition

    def getPosition(self):
        return self.__position

    def getName(self):
        return self.__name

person1 = Person("Mr. Table", ["Room 115", [13, 20]])
print(person1.getPosition())
person1.walk(["Room 117", [0, 0]])
print(person1.getPosition())
person1.name = "Pedro"
person1.position = ["Room XXX", [13, 20]]
print(person1.name)
print(person1.getName())
print(person1.position)
print(person1.getPosition())
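Python's name mangling is very simple. It transforms self.__name into self._Person__name when used in the Person class (when used in other classes it uses those other classes' names instead of Person). Code outside a class body is not mangled at all, so an assignment like person1.name = "Pedro" simply creates a new, unrelated attribute called name on the instance.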
Name mangling is intended to help you avoid accidental name conflicts, especially for mixin classes that can't know in advance what other attributes will exist on their instances.
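Here is a cut-down sketch of what is going on in the question's code, using a simplified one-argument Person (my simplification, not the original class):

```python
class Person(object):
    def __init__(self, name):
        self.__name = name      # actually stored as _Person__name

    def getName(self):
        return self.__name      # actually reads _Person__name


p = Person("Mr. Table")
p.name = "Pedro"                # creates a brand-new attribute called `name`

stored_names = sorted(vars(p))  # both attributes live in the same instance dict
```

The getter keeps returning "Mr. Table" because it reads _Person__name, which the outside assignment never touched.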
It's not really intended for encapsulation or making variables "private". Python's designers generally don't think that's very useful. The way to avoid other programmers using your class in the wrong way is to document your code well, not to declare things private or jump through a bunch of hoops to lock them out.
If you really do want to control the access to attributes on your class, consider using property (to control assignment to a single attribute) or __setattr__ (to control assignment to all attributes of a class). Often though it's not necessary to prevent attribute assignments, as long as the documentation clearly states what attributes are part of the public API (and what values are acceptable for each). If you find a clear need to validate the value of a specific attribute after the API is being used, you can still go back and turn it into a property after the fact. (The inability to refactor attribute access is the main reason other languages like C++ and Java tend to say you should always use getter and setter methods instead.)
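As a rough sketch of the __setattr__ route, assuming a hypothetical whitelist named _allowed (that name and the error message are mine, not a standard API):

```python
class Person(object):
    _allowed = ("name", "position")   # hypothetical whitelist for this sketch

    def __init__(self, name, position):
        self.name = name
        self.position = position

    def __setattr__(self, attr, value):
        # Called for every attribute assignment, including those in __init__.
        if attr not in self._allowed:
            raise AttributeError("cannot set %r on Person" % attr)
        object.__setattr__(self, attr, value)


p = Person("Mr. Table", ["Room 115", [13, 20]])
p.position = ["Room 117", [0, 0]]     # allowed

try:
    p.nickname = "Tabby"              # not in the whitelist
    rejected = False
except AttributeError:
    rejected = True
```

This blocks stray new attributes, but a determined caller can still go through object.__setattr__ or the instance __dict__, so it is a guard rail rather than real privacy.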
By default, the backing store of a Python instance is a dictionary, which can be manipulated pretty much like, well, a dictionary. You can get or set any attribute you want and add new ones. To limit this free-for-all a bit, you can use decorators to designate your accessors:
class Person:
    def __init__(self, name, position):
        self.__name = name
        self.__position = position

    @property
    def name(self):
        return self.__name

    @property
    def position(self):
        return self.__position

    @position.setter
    def position(self, aPosition):
        self.__position = aPosition

person1 = Person("Mr. Table", ["Room 115", [13, 20]])
print(person1.position)
person1.position = ["Room 117", [0, 0]]
print(person1.position)
person1.name = "Pedro"  # this will crash the code as it's "readonly"
person1.position = ["Room XXX", [13, 20]]
print(person1.name)
print(person1.position)
Although we'd expect the accessor to be person1.position(), it is now person1.position, which can be evaluated or set. The decorator functions set this up for you. It looks like a simple attribute but there's actually code behind it. Since we didn't include a setter for 'name', you can access it via person1.name but you can't set it that way.
You might read about __slots__ and be tempted to use it to limit access to the underlying backing store. This feature is really only for space savings on classes with huge numbers of instances and trying to use it to constrain the behavior of your object will only lead to tears.
This is a two-part query, which broadly relates to class attributes referencing mutable and immutable objects, and how these should be dealt with in code design. I have abstracted away the details to provide an example class below.
In this example, the class is designed for two instances. Through an instance method, each can access a class attribute that references a mutable object (a list in this case) and “take” elements from it into its own instance attribute, mutating both the class-level list and the list the instance attribute references. If one instance “takes” an element of the class attribute, that element is subsequently unavailable to the other instance, which is the effect I wish to achieve. I find this a convenient way of avoiding the use of class methods, but is it bad practice?
Also in this example, there is a class method that reassigns an immutable object (a Boolean value, in this case) to a class attribute based on the state of an instance attribute. I can achieve this by using a class method with cls as the first argument and self as the second argument, but I’m not sure if this is correct. On the other hand, perhaps this is how I should be dealing with the first part of this query?
class Foo(object):
    mutable_attr = ['1', '2']
    immutable_attr = False

    def __init__(self):
        self.instance_attr = []

    def change_mutable(self):
        self.instance_attr.append(self.mutable_attr[0])
        self.mutable_attr.remove(self.mutable_attr[0])

    @classmethod
    def change_immutable(cls, self):
        if len(self.instance_attr) == 1:
            cls.immutable_attr = True

eggs = Foo()
spam = Foo()
If you want a class-level attribute (which, as you say, is "visible" to all instances of this class) using a class method like you show is fine. This is, mostly, a question of style and there are no clear answers here. So what you show is fine.
I just want to point out that you don't have to use a class method to accomplish your goal. To accomplish your goal this is also perfectly fine (and in my opinion, more standard):
class Foo(object):
    # ... same as it ever was ...
    def change_immutable(self):
        """If instance has list length of 1, change immutable_attr for all insts."""
        if len(self.instance_attr) == 1:
            type(self).immutable_attr = True
Or even:
def change_immutable(self):
    """If instance has list length of 1, change immutable_attr for all insts."""
    if len(self.instance_attr) == 1:
        Foo.immutable_attr = True
if that's what you want to do. The major point being that you are not forced into using a class method to get/set class level attributes.
The type builtin function (https://docs.python.org/2/library/functions.html#type) simply returns the class of an instance. For new style classes (most classes nowadays, ones that ultimately descend from object) type(self) is the same as self.__class__, but using type is the more idiomatic way to access an object's type.
You use type when you want to write code that gets an object's ultimate type, even if it's subclassed. This may or may not be what you want to do. For example, say you have this:
class Baz(Foo):
    pass
bazzer = Baz()
bazzer.change_mutable()
bazzer.change_immutable()
Then the code:
type(self).immutable_attr = True
Changes the immutable_attr on the Baz class, not the Foo class. That may or may not be what you want -- just be aware that only objects that descend from Baz see this. If you want to make it visible to all descendants of Foo, then the more appropriate code is:
Foo.immutable_attr = True
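To make the difference concrete, here is a small sketch with a hypothetical class attribute called flag and two hypothetical methods, one for each spelling:

```python
class Foo(object):
    flag = False

    def set_on_own_class(self):
        type(self).flag = True      # sets the attribute on the instance's class

    def set_on_foo(self):
        Foo.flag = True             # always sets the attribute on Foo


class Baz(Foo):
    pass


Baz().set_on_own_class()
after_type = (Foo.flag, Baz.flag)   # (False, True): only Baz was changed

Baz().set_on_foo()
after_foo = (Foo.flag, Baz.flag)    # (True, True): Foo itself was changed
```

After the first call Baz has its own flag that shadows Foo's, which is why Foo.flag is still False at that point.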
Hope this helps -- this question is a good one but a bit open ended. Again, major point being you are not forced to use class methods to set/get class attrs -- but not that there's anything wrong with that either :)
Just finally note the way you first wrote it:
@classmethod
def change_immutable(cls, self):
    if len(self.instance_attr) == 1:
        cls.immutable_attr = True
Is like doing the:
type(self).immutable_attr = True
way, because the cls variable will not necessarily be Foo if it's subclassed. If you for sure want to set it for all instances of Foo, then just setting the Foo class directly:
Foo.immutable_attr = True
is the way to go.
This is one possibility:
class Foo(object):
    __mutable_attr = ['1', '2']
    __immutable_attr = False

    def __init__(self):
        self.instance_attr = []

    def change_mutable(self):
        self.instance_attr.append(self.__class__.__mutable_attr.pop(0))
        if len(self.instance_attr) == 1:
            self.__class__.__immutable_attr = True

    @property
    def immutable_attr(self):
        return self.__class__.__immutable_attr
So a little bit of explanation:
1. I'm making it harder to access class attributes from the outside to protect them from accidental change by prefixing them with double underscore.
2. I'm doing pop() and append() in one line.
3. I'm setting the value for __immutable_attr immediately after modifying __mutable_attr if the condition is met.
4. I'm exposing immutable_attr as a read-only property to provide an easy way to check its value.
5. I'm using self.__class__ to access the class of the instance. I find it more readable than type(self), and because the access is written inside the class body, the double-underscore names are mangled consistently.
What's the correct idiom for this please?
I want to define an object containing properties which can (optionally) be initialized from a dict (the dict comes from JSON; it may be incomplete). Later on I may modify the properties via setters.
There are actually 13+ properties, and I want to be able to use default getters and setters, but that doesn't seem to work for this case:
But I don't want to have to write explicit descriptors for all of prop1... propn
Also, I'd like to move the default assignments out of __init__() and into the accessors... but then I'd need explicit descriptors.
What's the most elegant solution? (other than move all the setter calls out of __init__() and into a method/classmethod _make()?)
[DELETED COMMENT The code for badprop using default descriptor was due to comment by a previous SO user, who gave the impression it gives you a default setter. But it doesn't - the setter is undefined and it necessarily throws AttributeError.]
class DubiousPropertyExample(object):
    def __init__(self,dct=None):
        self.prop1 = 'some default'
        self.prop2 = 'other default'
        #self.badprop = 'This throws AttributeError: can\'t set attribute'
        if dct is None: dct = dict() # or use defaultdict
        for prop,val in dct.items():
            setattr(self, prop, val)

    # How do I do default property descriptors? this is wrong
    #@property
    #def badprop(self): pass

    # Explicit descriptors for all properties - yukk
    @property
    def prop1(self): return self._prop1
    @prop1.setter
    def prop1(self,value): self._prop1 = value

    @property
    def prop2(self): return self._prop2
    @prop2.setter
    def prop2(self,value): self._prop2 = value

dub = DubiousPropertyExample({'prop2':'crashandburn'})
print(dub.__dict__)
# {'_prop1': 'some default', '_prop2': 'crashandburn'}
If you run this with line 5 self.badprop = ... uncommented, it fails:
self.badprop = 'This throws AttributeError: can\'t set attribute'
AttributeError: can't set attribute
[As ever, I read the SO posts on descriptors, implicit descriptors, calling them from init]
I think you're slightly misunderstanding how properties work. There is no "default setter". It throws an AttributeError on setting badprop not because it doesn't yet know that badprop is a property rather than a normal attribute (if that were the case it would just set the attribute with no error, because that's how normal attributes behave), but because you haven't provided a setter for badprop, only a getter.
Have a look at this:
>>> class Foo(object):
        @property
        def foo(self):
            return self._foo
        def __init__(self):
            self._foo = 1

>>> f = Foo()
>>> f.foo = 2
Traceback (most recent call last):
  File "<pyshell#12>", line 1, in <module>
    f.foo = 2
AttributeError: can't set attribute
You can't set such an attribute even from outside of __init__, after the instance is constructed. If you just use @property, then what you have is a read-only property (effectively a method call that looks like an attribute read).
If all you're doing in your getters and setters is redirecting read/write access to an attribute of the same name but with an underscore prepended, then by far the simplest thing to do is get rid of the properties altogether and just use normal attributes. Python isn't Java (and even in Java I'm not convinced of the virtue of private fields with the obvious public getter/setter anyway). An attribute that is directly accessible to the outside world is a perfectly reasonable part of your "public" interface. If you later discover that you need to run some code whenever an attribute is read/written you can make it a property then without changing your interface (this is actually what descriptors were originally intended for, not so that we could start writing Java style getters/setters for every single attribute).
If you're actually doing something in the properties other than changing the name of the attribute, and you do want your attributes to be readonly, then your best bet is probably to treat the initialisation in __init__ as directly setting the underlying data attributes with the underscore prepended. Then your class can be straightforwardly initialised without AttributeErrors, and thereafter the properties will do their thing as the attributes are read.
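A minimal sketch of that read-only arrangement, with a hypothetical Example class and an upper() call standing in for "actually doing something":

```python
class Example(object):
    def __init__(self, dct=None):
        # Write to the backing attributes directly; the properties have no setter.
        self._prop1 = "some default"
        self._prop2 = "other default"
        for name, value in (dct or {}).items():
            setattr(self, "_" + name, value)

    @property
    def prop1(self):
        return self._prop1

    @property
    def prop2(self):
        return self._prop2.upper()   # stand-in for real getter logic


e = Example({"prop2": "crashandburn"})

try:
    e.prop1 = "new"                  # no setter defined, so this is refused
    writable = True
except AttributeError:
    writable = False
```

One caveat with this sketch: it trusts the keys of the incoming dict, so an unexpected key would silently create an extra underscore-prefixed attribute.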
If you're actually doing something in the properties other than changing the name of the attribute, and you want your attributes to be readable and writable, then you'll need to actually specify what happens when you get/set them. If each attribute has independent custom behaviour, then there is no more efficient way to do this than explicitly providing a getter and a setter for each attribute.
If you're running exactly the same (or very similar) code in every single getter/setter (and it's not just adding an underscore to the real attribute name), and that's why you object to writing them all out (rightly so!), then you may be better served by implementing some of __getattr__, __getattribute__, and __setattr__. These allow you to redirect attribute reading/writing to the same code each time (with the name of the attribute as a parameter), rather than to two functions for each attribute (getting/setting).
It seems like the easiest way to go about this is to just implement __getattr__ and __setattr__ such that they will access any key in your parsed JSON dict, which you should set as an instance member. Alternatively, you could call update() on self.__dict__ with your parsed JSON, but that's not really the best way to go about things, as it means your input dict could potentially trample members of your instance.
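Something along these lines, as a sketch only. The class name JsonBacked, the _data member and the _defaults table are all names I made up for the illustration:

```python
class JsonBacked(object):
    # Hypothetical table of known properties and their default values.
    _defaults = {"prop1": "some default", "prop2": "other default"}

    def __init__(self, dct=None):
        # Bypass our own __setattr__ when creating the backing dict.
        object.__setattr__(self, "_data", dict(dct or {}))

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        data = object.__getattribute__(self, "_data")
        if name in data:
            return data[name]
        if name in self._defaults:
            return self._defaults[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        if name not in self._defaults:
            raise AttributeError("unknown property %r" % name)
        self._data[name] = value


obj = JsonBacked({"prop2": "crashandburn"})
obj.prop1 = "changed"
```

Because the parsed dict is kept in its own member rather than merged into self.__dict__, keys in the input cannot overwrite methods or other members of the instance.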
As to your setters and getters, you should only be creating them if they actually do something special other than directly set or retrieve the value in question. Python isn't Java (or C++ or anything else), you shouldn't try to mimic the private/set/get paradigm that is common in those languages.
I simply keep the dict on the instance (self.kwargs) and have the properties get and set their values there.
class test(object):
    def __init__(self,**kwargs):
        self.kwargs = kwargs
        #self.value = 20  assigning from __init__ is possible

    @property
    def value(self):
        if self.kwargs.get('value') is None:
            self.kwargs.update(value=0)  # default
        return self.kwargs.get('value')

    @value.setter
    def value(self,v):
        print(v)  # do something with v
        self.kwargs.update(value=v)
x = test()
print(x.value)
x.value = 10
x.value = 5
Output
0
10
5
I come from Java, so I'm getting confused here.
class Sample(object):
    x = 100  # class var?
    def __init__(self, value):
        self.y = value  # instance var?
        z = 300  # private var? how do we access this outside Sample?
What is the difference between the 3 variable declarations?
class Sample(object):
    x = 100
    _a = 1
    __b = 11
    def __init__(self, value):
        self.y = value
        self._c = 'private'
        self.__d = 'more private'
        z = 300
In this example:
x is class variable,
_a is private class variable (by naming convention),
__b is private class variable (mangled by interpreter),
y is instance variable,
_c is private instance variable (by naming convention),
__d is private instance variable (mangled by interpreter),
z is local variable within scope of __init__ method.
In case of single underscore in names, it's strictly a convention. It is still possible to access these variables. In case of double underscore names, they are mangled. It's still possible to circumvent that.
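A short sketch of how both kinds of name can still be reached from outside, using a trimmed-down Sample without the constructor argument (my simplification):

```python
class Sample(object):
    _a = 1
    __b = 11                          # stored as _Sample__b

    def __init__(self):
        self._c = "private"
        self.__d = "more private"     # stored as _Sample__d


s = Sample()
by_convention = (s._a, s._c)                  # nothing stops this
mangled = (Sample._Sample__b, s._Sample__d)   # the mangled names still work
has_plain_b = hasattr(Sample, "__b")          # the unmangled name does not exist
```

So the double underscore only changes the name under which the attribute is stored; it does not hide it.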
@vartec is spot on. Since you come from Java, however, some clarification is in order:
Java has public and private members. Access to private members is strictly forbidden, and this is enforced by the language.
Python only has public members. There is no way to enforce something like Java's private keyword.
If you want to declare that something is an implementation detail only and shouldn't be relied on by other code, you prefix it with a single underscore - _variable, _function(). This is a hint to other programmers that they shouldn't use that variable or that function, but it is not enforced.
It might seem like a feature has been omitted from Python, but the culture in Python is "everyone is a consenting adult". If you tell other programmers that something is "private" by prefixing it with an underscore, they will generally take the hint.
You seem to be getting the hang of it. The only one that you were completely wrong about is z = 300. This is a name that is local to the __init__ method. Python never inserts self for you in the same manner that C++ and Java will assume this where it can.
One thing to remember as you continue learning Python is that member functions can always be executed as class members. Consider the following:
>>> class Sample(object):
...     def __init__(self, value):
...         self.value = value
...     def get_value(self):
...         return self.value
...
>>> s = Sample(1)
>>> t = Sample(2)
>>> s.get_value()
1
>>> Sample.get_value(s)
1
>>> t.__class__.get_value(s)
1
The last three examples all call the member function of the s object. The last two use the fact that get_value is just an attribute of the Sample class that expects to receive an instance of Sample as the argument.