I found out that it is possible to assign to class variables after class definitions and that methods are technically class variables. So I tried the following, which appeared to work.
class Fruit():
    def __init__(self, name, price):
        self.name = name
        self.price = price

a = Fruit('apple', 5)
Fruit.__init__ = lambda self: None
b = Fruit()
Can something like this potentially break things? On the other hand, is there a practical situation where this can be useful?
Of course it can and will break things. Any other code that tries to initialize a Fruit with a name and a price will now raise an exception, as the replaced constructor doesn't accept those parameters.
In general, the only practical situation is mocking/patching for tests (or certain, very rare runtime cases where there is no other way). However, that patching is best done with a library to deal with it, e.g. the standard library's unittest.mock.
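As a rough sketch of what that looks like (reusing the Fruit class from the question; the variable names patched and restored are just for illustration), unittest.mock.patch.object swaps the attribute in for the duration of a with block and puts the original back afterwards:

```python
from unittest import mock


class Fruit:
    def __init__(self, name, price):
        self.name = name
        self.price = price


# Temporarily replace Fruit.__init__; the original is restored when the
# with block exits, even if an exception is raised inside it.
with mock.patch.object(Fruit, "__init__", lambda self: None):
    patched = Fruit()          # only works while the patch is active

restored = Fruit("apple", 5)   # the real constructor is back
```

Compared with assigning to Fruit.__init__ by hand, the change cannot leak into code that runs after the block.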
It is very dangerous to access instance variables directly from outside a class, for several reasons.
1- Changing the Name of an Instance Variable
The first problem with direct access is that changing the name of an instance variable will break any client code that uses the original name directly. If the developer renames an instance variable in the class from self.originalName to self.newName, then any client software that still uses the original name will break.
2- Changing an Instance Variable into a Calculation
A second situation where direct access is problematic is when the code of
a class needs to change to meet new requirements. Suppose that when
writing a class, you use an instance variable to represent a piece of data,
but the functionality changes so that you need an algorithm to compute a
value instead.
3- Validating Data
The third reason to avoid direct access when setting a value is that client
code can too easily set an instance variable to an invalid value. A better
approach is to call a method in the class, whose job is to set the value. As
the developer, you can include validation code in that method to ensure
that the value being set is appropriate.
So it is generally safer to use getter and setter methods whenever an instance variable needs to be accessed from outside the class.
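A minimal sketch of the validation point, using a made-up Thermostat class (the class, its limits, and its method names are assumptions for illustration, not from the book):

```python
class Thermostat:
    """Hypothetical class used only for illustration."""

    MIN_TARGET = 5
    MAX_TARGET = 35

    def __init__(self, target):
        self.set_target(target)

    def get_target(self):
        return self._target

    def set_target(self, value):
        # The validation lives in one place instead of in every caller.
        if not self.MIN_TARGET <= value <= self.MAX_TARGET:
            raise ValueError(
                f"target must be between {self.MIN_TARGET} and {self.MAX_TARGET}"
            )
        self._target = value


t = Thermostat(20)
t.set_target(22)
```

Client code that calls set_target cannot put the object into an invalid state, and the internal name _target can change later without affecting callers.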
But there are certain circumstances where it is safe to use direct access: when it is absolutely clear what the instance variable means, little or no validation of the data is needed, and there is no chance that the name will ever change. A good example of this is the Rect (rectangle) class in the pygame package. A rectangle in pygame is defined using four values (x, y, width, and height) like this:
oRectangle = pygame.Rect(10, 20, 300, 300)
After creating that rectangle object, using oRectangle.x, oRectangle.y, oRectangle.width, and oRectangle.height directly as variables seems acceptable.
Source: Object-Oriented Python by Irv Kalb
I am designing an RPG and would like to have the ability to attach classes to each other. What I'm looking to do is have say an Item class. The weapon class would inherit from it. A sword would be an instance of the weapon class. I want to then be able to attach properties to the sword. These properties would be other classes. For example I could attach the container class to it and the sword (only that instance of the sword) would become a container. I could also maybe attach something like an enchantment to that sword.
For a bonus it would be nice to be able to combine instances as well. So instead of having to have a fire_enchantment class I could just make it an instance of Enchantment and attach it to the sword instance.
I've googled around and haven't been able to find a design pattern that fits this. I recall studying one but can't remember what it was called (Was a few years ago)
I'm at a loss as to which design pattern allows this: combining multiple classes dynamically.
I think you seem to understand the idea of inheritance in python (e.g. class Subclass(Superclass): ) so I won't cover that here.
The classes you want to 'attach' can be treated as any other variable within the Weapon class.
class Enchantment(object):
    def __init__(self, name, type):
        self.name = name
        self.type = type
        # can define more member variables here, and set with setter methods

    # more Enchantment methods here...

class Weapon(object):
    def __init__(self, name, type):
        self.name = name
        self.type = type
        self.enchantments = []
        # more Weapon member variables here

    def add_enchantment(self, enchantment):
        # any logic you need to check when adding an enchantment
        self.enchantments.append(enchantment)
Then, wherever your game code is running, you could do:
sword = Weapon('My sword', 'sword')
fire_enchantment = Enchantment('Fireball', 'fire')
sword.add_enchantment(fire_enchantment)
You can then add methods on the Weapon class to do things with the enchantments/add certain logic.
The enchantment is still an instance of an object, so if you access it in the list (maybe by identifying it by its name, or looping through the list) all its methods and variables are accessible. You just need to build an interface to it via the Weapon class e.g. get_enchantment(self, name), or have other methods in the Weapon class interact with it (e.g. when you attack you might loop through the enchantments and see if they add any extra damage).
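To make that concrete, here is one way the lookup and the attack loop could be sketched. The bonus_damage and base_damage attributes and the attack_damage method are assumptions added for the example; they are not part of the classes above.

```python
class Enchantment:
    def __init__(self, name, type, bonus_damage=0):
        self.name = name
        self.type = type
        self.bonus_damage = bonus_damage  # assumed attribute, for illustration


class Weapon:
    def __init__(self, name, type, base_damage=1):
        self.name = name
        self.type = type
        self.base_damage = base_damage    # assumed attribute, for illustration
        self.enchantments = []

    def add_enchantment(self, enchantment):
        self.enchantments.append(enchantment)

    def get_enchantment(self, name):
        # Look an attached enchantment up by name; None if it is not attached.
        for enchantment in self.enchantments:
            if enchantment.name == name:
                return enchantment
        return None

    def attack_damage(self):
        # Every attached enchantment contributes its bonus to the attack.
        return self.base_damage + sum(e.bonus_damage for e in self.enchantments)


sword = Weapon('My sword', 'sword', base_damage=10)
sword.add_enchantment(Enchantment('Fireball', 'fire', bonus_damage=5))
```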
There's obviously design considerations about how you design your classes (the above was thrown together for example and doesn't include inheritance). For example you might only allow one enchantment per weapon, in which case you shouldn't use a list in the weapon object, but could just set self.enchantment = None in the constructor, and set self.enchantment = enchantment in the add_enchantment method.
The point I'm making is you can treat instances of Enchantment or other 'attachable' classes as variables. Just make sure you create an instance of the class e.g. fire_enchantment = Enchantment('Fireball', 'fire').
There's plenty of reading out there in terms of inheritance and OOP in general. Hope this helps!
Additional answer from OP
I think the Mixin pattern is what I was looking for. After digging around more I found this post which has an answer for dynamic mixins.
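For reference, one common way to apply a mixin to a single instance at runtime is to build a new class with type() and assign it to the instance's __class__. This is only a sketch of that general technique, not necessarily what the linked post does; ContainerMixin, attach, and put are names made up for the example.

```python
class Item:
    def __init__(self, name):
        self.name = name


class Weapon(Item):
    pass


class ContainerMixin:
    """Hypothetical mixin: lets an object hold other things."""

    def put(self, thing):
        # Create the storage lazily, because __init__ has already run
        # by the time the mixin is attached.
        if not hasattr(self, "contents"):
            self.contents = []
        self.contents.append(thing)


def attach(obj, mixin):
    """Give this one instance the mixin's methods by swapping its class."""
    base = obj.__class__
    obj.__class__ = type(f"{mixin.__name__}{base.__name__}", (mixin, base), {})
    return obj


sword = attach(Weapon("My sword"), ContainerMixin)
plain_sword = Weapon("Plain sword")
sword.put("gem")
```

Only sword becomes a container; plain_sword and the Weapon class itself are untouched. Plain composition (keeping a list of attached objects, as in the answer above) is usually easier to reason about than swapping __class__.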
I'm coming from the Java world and reading Bruce Eckel's Python 3 Patterns, Recipes and Idioms.
While reading about classes, it goes on to say that in Python there is no need to declare instance variables. You just use them in the constructor, and boom, they are there.
So for example:
class Simple:
    def __init__(self, s):
        print("inside the simple constructor")
        self.s = s

    def show(self):
        print(self.s)

    def showMsg(self, msg):
        print(msg + ':', self.show())
If that’s true, then any object of class Simple can just change the value of variable s outside of the class.
For example:
if __name__ == "__main__":
    x = Simple("constructor argument")
    x.s = "test15" # this changes the value
    x.show()
    x.showMsg("A message")
In Java, we have been taught about public/private/protected variables. Those keywords make sense because at times you want variables in a class that no one outside the class has access to.
Why is that not required in Python?
It's cultural. In Python, you don't write to other classes' instance or class variables. In Java, nothing prevents you from doing the same if you really want to - after all, you can always edit the source of the class itself to achieve the same effect. Python drops that pretence of security and encourages programmers to be responsible. In practice, this works very nicely.
If you want to emulate private variables for some reason, you can always use the __ prefix from PEP 8. Python mangles the names of variables like __foo so that they're not easily visible to code outside the namespace that contains them (although you can get around it if you're determined enough, just like you can get around Java's protections if you work at it).
By the same convention, the _ prefix means _variable should be used internally in the class (or module) only, even if you're not technically prevented from accessing it from somewhere else. You don't play around with another class's variables that look like __foo or _bar.
Private variables in Python are more or less a hack: the interpreter intentionally renames the variable.
class A:
    def __init__(self):
        self.__var = 123

    def printVar(self):
        print(self.__var)
Now, if you try to access __var outside the class definition, it will fail:
>>> x = A()
>>> x.__var # this raises AttributeError: 'A' object has no attribute '__var'
>>> x.printVar() # this gives back 123
But you can easily get away with this:
>>> x.__dict__ # this will show everything that is contained in object x
# which in this case is something like {'_A__var' : 123}
>>> x._A__var = 456 # you now know the masked name of private variables
>>> x.printVar() # this gives back 456
You probably know that methods in OOP are invoked like this: x.printVar() => A.printVar(x). If A.printVar() can access some field in x, this field can also be accessed outside A.printVar()... After all, functions are created for reusability, and there isn't any special power given to the statements inside.
As correctly mentioned by many of the comments above, let's not forget the main goal of Access Modifiers: To help users of code understand what is supposed to change and what is supposed not to. When you see a private field you don't mess around with it. So it's mostly syntactic sugar which is easily achieved in Python by the _ and __.
Python does not have private variables the way C++ or Java do. You can access any member variable at any time if you want to. However, you don't need private variables in Python, because in Python it is not bad to expose your classes' member variables. If you later need to encapsulate a member variable, you can do so with "@property" without breaking existing client code.
In Python, the single underscore "_" is used to indicate that a method or variable is not considered as part of the public API of a class and that this part of the API could change between different versions. You can use these methods and variables, but your code could break, if you use a newer version of this class.
The double underscore "__" does not mean a "private variable". You use it to define variables which are "class local" and which cannot be easily overridden by subclasses. It mangles the variable's name.
For example:
class A(object):
    def __init__(self):
        self.__foobar = None # Will be automatically mangled to self._A__foobar

class B(A):
    def __init__(self):
        self.__foobar = 1 # Will be automatically mangled to self._B__foobar
self.__foobar's name is automatically mangled to self._A__foobar in class A. In class B it is mangled to self._B__foobar. So every subclass can define its own variable __foobar without overriding its parent's variable(s). Nothing prevents you from accessing variables beginning with double underscores, but name mangling keeps you from accessing these variables and methods accidentally.
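A small variation shows the two mangled names living side by side on one object. The call to super().__init__() and the a_value method are added for this sketch so that both attributes actually get created and read:

```python
class A:
    def __init__(self):
        self.__foobar = None      # stored as _A__foobar

    def a_value(self):
        return self.__foobar      # always reads _A__foobar


class B(A):
    def __init__(self):
        super().__init__()        # added here so A's attribute exists too
        self.__foobar = 1         # stored as _B__foobar


b = B()
names = sorted(vars(b))
```

Here names holds both '_A__foobar' and '_B__foobar', and b.a_value() still returns A's own value rather than the one B assigned.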
I strongly recommend you watch Raymond Hettinger's Python's class development toolkit from PyCon 2013, which gives a good example of why and how you should use @property and "__"-instance variables.
If you have exposed public variables and you need to encapsulate them, then you can use @property. Therefore you can start with the simplest solution possible. You can leave member variables public unless you have a concrete reason not to. Here is an example:
class Distance:
    def __init__(self, meter):
        self.meter = meter

d = Distance(1.0)
print(d.meter)
# prints 1.0

class Distance:
    def __init__(self, meter):
        # Customer request: Distances must be stored in millimeters.
        # Public available internals must be changed.
        # This would break client code in C++.
        # This is why you never expose public variables in C++ or Java.
        # However, this is Python.
        self.millimeter = meter * 1000

    # In Python we have @property to the rescue.
    @property
    def meter(self):
        return self.millimeter * 0.001

    @meter.setter
    def meter(self, value):
        self.millimeter = value * 1000

d = Distance(1.0)
print(d.meter)
# prints 1.0
There is a variation of private variables in the underscore convention.
In [5]: class Test(object):
   ...:     def __private_method(self):
   ...:         return "Boo"
   ...:     def public_method(self):
   ...:         return self.__private_method()
   ...:

In [6]: x = Test()

In [7]: x.public_method()
Out[7]: 'Boo'

In [8]: x.__private_method()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-fa17ce05d8bc> in <module>()
----> 1 x.__private_method()

AttributeError: 'Test' object has no attribute '__private_method'
There are some subtle differences, but for the sake of programming pattern ideological purity, it's good enough.
There are examples out there of @private decorators that more closely implement the concept, but your mileage may vary. Arguably, one could also write a class definition that uses a metaclass.
As mentioned earlier, you can indicate that a variable or method is private by prefixing it with an underscore. If you don't feel like this is enough, you can always use the property decorator. Here's an example:
class Foo:
    def __init__(self, bar):
        self._bar = bar

    @property
    def bar(self):
        """Getter for '_bar'."""
        return self._bar
This way, someone or something that references bar is actually referencing the return value of the bar function rather than the variable itself, and therefore it can be accessed but not changed. However, if someone really wanted to, they could simply use _bar and assign a new value to it. There is no surefire way to prevent someone from accessing variables and methods that you wish to hide, as has been said repeatedly. However, using property is the clearest message you can send that a variable is not to be edited. property can also be used for more complex getter/setter/deleter access paths, as explained here: https://docs.python.org/3/library/functions.html#property
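A quick sketch of both halves of that claim, repeating the Foo class so the snippet runs on its own (the assignment_blocked flag is only there to record what happened):

```python
class Foo:
    def __init__(self, bar):
        self._bar = bar

    @property
    def bar(self):
        """Getter for '_bar'."""
        return self._bar


foo = Foo(1)

try:
    foo.bar = 2            # no setter was defined, so this is rejected
except AttributeError:
    assignment_blocked = True
else:
    assignment_blocked = False

foo._bar = 3               # nothing stops a determined caller
```

Assigning through the property fails, while writing to _bar directly still works, which is why the property is a signal of intent rather than a lock.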
Python has limited support for private identifiers, through a feature that automatically prepends the class name to any identifiers starting with two underscores. This is transparent to the programmer, for the most part, but the net effect is that any variables named this way can be used as private variables.
See here for more on that.
In general, Python's implementation of object orientation is a bit primitive compared to other languages. But I enjoy this, actually. It's a very conceptually simple implementation and fits well with the dynamic style of the language.
The only time I ever use private variables is when I need to do other things when writing to or reading from the variable and as such I need to force the use of a setter and/or getter.
Again this goes to culture, as already stated. I've been working on projects where reading and writing other classes' variables was a free-for-all. When one implementation became deprecated, it took a lot longer to identify all code paths that used that function. When use of setters and getters was forced, a debug statement could easily be written to identify that the deprecated method had been called and the code path that called it.
When you are on a project where anyone can write an extension, notifying users about deprecated methods that are to disappear in a few releases hence is vital to keep module breakage at a minimum upon upgrades.
So my answer is: if you and your colleagues maintain a simple code set, then protecting class variables is not always necessary. If you are writing an extensible system, then it becomes imperative, because changes to the core need to be caught by all extensions using the code.
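One way such a deprecation hook might look in Python is a property that emits a DeprecationWarning from the standard warnings module. The Plugin class and its timeout/time_out attributes are invented for this sketch:

```python
import warnings


class Plugin:
    """Hypothetical extension point, for illustration only."""

    def __init__(self):
        self.timeout = 30

    @property
    def time_out(self):
        # Old spelling kept for a few releases; stacklevel=2 makes the
        # warning point at the caller rather than at this getter.
        warnings.warn(
            "time_out is deprecated; use timeout",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.timeout


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = Plugin().time_out
```

Because access goes through the getter, every use of the old name is reported together with the code path that triggered it.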
"In java, we have been taught about public/private/protected variables"
"Why is that not required in python?"
For the same reason, it's not required in Java.
You're free to use -- or not use -- private and protected.
As a Python and Java programmer, I've found that private and protected are very, very important design concepts. But as a practical matter, in tens of thousands of lines of Java and Python, I've never actually used private or protected.
Why not?
Here's my question "protected from whom?"
Other programmers on my team? They have the source. What does protected mean when they can change it?
Other programmers on other teams? They work for the same company. They can -- with a phone call -- get the source.
Clients? It's work-for-hire programming (generally). The clients (generally) own the code.
So, who -- precisely -- am I protecting it from?
In Python 3, if you just want to "encapsulate" the class attributes, like in Java, you can just do the same thing like this:
class Simple:
    def __init__(self, str):
        print("inside the simple constructor")
        self.__s = str

    def show(self):
        print(self.__s)

    def showMsg(self, msg):
        print(msg + ':', self.show())
To instantiate this do:
ss = Simple("lol")
ss.show()
Note that print(ss.__s) will raise an AttributeError.
In practice, Python mangles the attribute name: inside the class, __s is stored as _Simple__s. That makes it behave much like a "private" attribute in Java. The attribute still exists on the object, but not under the name you wrote, so code outside the class cannot reach it as ss.__s.
But don't be afraid of it. It doesn't matter. It does the job too. ;)
Private and protected concepts are very important. But Python is just a tool for prototyping and rapid development with restricted resources available for development, and that is why some of the protection levels are not so strictly followed in Python. You can use "__" in a class member. It works properly, but it does not look good enough. Each access to such field contains these characters.
Also, you can notice that the Python OOP concept is not perfect. Smalltalk or Ruby are much closer to a pure OOP concept. Even C# or Java are closer.
Python is a very good tool. But it is a simplified OOP language, syntactically and conceptually. The main goal of Python's existence is to let developers write easily readable code with a high abstraction level very quickly.
Here's how I handle Python 3 class fields:
class MyClass:
    def __init__(self, public_read_variable, private_variable):
        self.public_read_variable_ = public_read_variable
        self.__private_variable = private_variable
I access the __private_variable with two underscores only inside MyClass methods.
I do read access of the public_read_variable_ with one underscore
outside the class, but never modify the variable:
my_class = MyClass("public", "private")
print(my_class.public_read_variable_) # OK
my_class.public_read_variable_ = 'another value' # NOT OK, don't do that.
So I’m new to Python but I have a background in C# and JavaScript. Python feels like a mix of the two in terms of features. JavaScript also struggles in this area, and the way around it is to create a closure. This prevents access to data you don’t want to expose by returning a different object.
def print_msg(msg):
    # This is the outer enclosing function
    def printer():
        # This is the nested function
        print(msg)
    return printer  # returns the nested function

# Now let's try calling this function.
# Output: Hello
another = print_msg("Hello")
another()
https://www.programiz.com/python-programming/closure
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Closures#emulating_private_methods_with_closures
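To show the "hidden data" part more directly, here is a small sketch along the lines of the MDN counter example, written in Python. The make_counter name is made up for illustration:

```python
def make_counter():
    count = 0                  # lives only in the enclosing scope

    def increment():
        nonlocal count         # rebind the enclosed variable
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()
```

There is no counter.count attribute to assign to, so callers can only go through increment. As with name mangling, this is not real security: the value is still reachable through counter.__closure__ for anyone determined to look.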
About sources (to change the access rights and thus bypass language encapsulation like Java or C++):
You don't always have the sources and even if you do, the sources are managed by a system that only allows certain programmers to access a source (in a professional context). Often, every programmer is responsible for certain classes and therefore knows what he can and cannot do. The source manager also locks the sources being modified and of course, manages the access rights of programmers.
So, from experience, I trust software more than humans. Convention is good, but multiple protections are better, such as access management (real private variables) plus source management.
I have been thinking about private class attributes and methods (called members from here on) since I started to develop a package that I want to publish. The thought behind it was never to make it impossible to overwrite these members, but to have a warning for those who touch them. I came up with a few solutions that might help. The first solution is used in one of my favorite Python books, Fluent Python.
Upsides of technique 1:
It is unlikely to be overwritten by accident.
It is easily understood and implemented.
It's easier to handle than a leading double underscore for instance attributes.
*The book uses the hash symbol, but you could also use integers converted to strings. In Python, writing klass.1 is a syntax error.
class Technique1:
    def __init__(self, name, value):
        setattr(self, f'private#{name}', value)
        setattr(self, f'1{name}', value)
Downsides of technique 1:
Methods are not easily protected with this technique, although it is possible.
Attribute lookups are only possible via getattr.
There is still no warning to the user.
Another solution I came across was to write __setattr__. Pros:
It is easily implemented and understood
It works with methods
Lookup is not affected
The user gets a warning or error
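class Demonstration:
    def __init__(self):
        self.a = 1

    def method(self):
        return None

    def __setattr__(self, name, value):
        # hasattr (rather than a truthiness test on getattr) also protects
        # names whose current value is falsy, such as 0 or None.
        if not hasattr(self, name):
            super().__setattr__(name, value)
        else:
            raise ValueError(f'Already reserved name: {name}')

d = Demonstration()
#d.a = 2         # would raise ValueError
d.method = None  # raises ValueError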
Cons:
You can still overwrite the class
To have variables not just constants, you need to map allowed input.
Subclasses can still overwrite methods
To prevent subclasses from overwriting methods you can use __init_subclass__:
class Demonstration:
    __protected = ['method']

    def method(self):
        return None

    def __init_subclass__(cls):
        protected_methods = Demonstration.__protected
        subclass_methods = dir(cls)
        for i in protected_methods:
            p = getattr(Demonstration, i)
            j = getattr(cls, i)
            if p is not j:
                raise ValueError(f'Protected method "{i}" was touched')
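To see the guard in action, here is a condensed, self-contained version of the same idea together with two subclasses. Fine and Bad are names made up for this check, and the loop is shortened slightly compared to the listing above:

```python
class Demonstration:
    __protected = ['method']

    def method(self):
        return None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for name in Demonstration.__protected:
            # A subclass that redefines the method gets a different function object.
            if getattr(cls, name) is not getattr(Demonstration, name):
                raise ValueError(f'Protected method "{name}" was touched')


class Fine(Demonstration):        # adds a method, overrides nothing: accepted
    def other(self):
        return 1


try:
    class Bad(Demonstration):     # overrides the protected method: rejected
        def method(self):
            return 42
except ValueError as exc:
    error = str(exc)
```

The check runs at class-creation time, so the error appears as soon as the offending subclass is defined, not when the method is first called.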
As you can see, there are ways to protect your class members, but none of them guarantees that users won't overwrite them anyway. This should just give you some ideas. In the end, you could also use a metaclass, but this might introduce new dangers. The techniques shown here are also very simple-minded, and you should definitely take a look at the documentation, where you can find useful features for these techniques and customize them to your needs.
I'm interested in hearing some discussion about class attributes in Python. For example, what is a good use case for class attributes? For the most part, I can not come up with a case where a class attribute is preferable to using a module level attribute. If this is true, then why have them around?
The problem I have with them, is that it is almost too easy to clobber a class attribute value by mistake, and then your "global" value has turned into a local instance attribute.
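For anyone who has not run into this, a minimal sketch of the clobbering (the Config class and its retries attribute are made up for illustration):

```python
class Config:
    retries = 3          # class attribute: one shared default


a = Config()
b = Config()

a.retries = 5            # creates an instance attribute on a only
Config.retries = 10      # changes the shared default
```

After this, a.retries is 5 because the instance attribute shadows the class attribute, while b.retries follows the class and is 10.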
Feel free to comment on how you would handle the following situations:
Constant values used by a class and/or sub-classes. This may include "magic number" dictionary keys or list indexes that will never change, but possibly need one-time initialization.
Default class attribute that is, on rare occasions, updated for a special instance of the class.
Global data structure used to represent an internal state of a class shared between all instances.
A class that initializes a number of default attributes, not influenced by constructor arguments.
Some Related Posts:
Difference Between Class and Instance Attributes
#4:
I never use class attributes to initialize default instance attributes (the ones you normally put in __init__). For example:
class Obj(object):
    def __init__(self):
        self.users = 0
and never:
class Obj(object):
    users = 0
Why? Because it's inconsistent: it doesn't do what you want when you assign anything but an immutable object:
class Obj(object):
    users = []
causes the users list to be shared across all objects, which in this case isn't wanted. It's confusing to split these into class attributes and assignments in __init__ depending on their type, so I always put them all in __init__, which I find clearer anyway.
As for the rest, I generally put class-specific values inside the class. This isn't so much because globals are "evil"--they're not so big a deal as in some languages, because they're still scoped to the module, unless the module itself is too big--but if external code wants to access them, it's handy to have all of the relevant values in one place. For example, in module.py:
class Obj(object):
    class Exception(Exception): pass
    ...
and then:
from module import Obj

try:
    o = Obj()
    o.go()
except Obj.Exception:  # use the class, since o is unbound if Obj() itself fails
    print("error")
Aside from allowing subclasses to change the value (which isn't always wanted anyway), it means I don't have to laboriously import exception names and a bunch of other stuff needed to use Obj. "from module import Obj, ObjException, ..." gets tiresome quickly.
what is a good use case for class attributes
Case 0. Class methods are just class attributes. This is not just a technical similarity - you can access and modify class methods at runtime by assigning callables to them.
Case 1. A module can easily define several classes. It's reasonable to encapsulate everything about class A into A... and everything about class B into B.... For example,
# module xxx
class X:
    MAX_THREADS = 100
    ...

# main program
from xxx import X
if nthreads < X.MAX_THREADS: ...
Case 2. This class has lots of default attributes which can be modified in an instance. Here the ability to leave an attribute as a 'global default' is a feature, not a bug.
class NiceDiff:
    """Formats time difference given in seconds into a form '15 minutes ago'."""
    magic = .249
    pattern = 'in {0}', 'right now', '{0} ago'
    divisions = 1
    # there are more default attributes
One creates instance of NiceDiff to use the existing or slightly modified formatting, but a localizer to a different language subclasses the class to implement some functions in a fundamentally different way and redefine constants:
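class Разница(NiceDiff): # NiceDiff localized to Russian
    '''Из разницы во времени, типа -300, делает конкретно '5 минут назад'.'''
    # The Russian docstring says: "Turns a time difference such as -300 into '5 minutes ago'."
    pattern = 'через {0}', 'прям щас', '{0} назад'  # 'in {0}', 'right now', '{0} ago'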
Your cases:
constants -- yes, I put them to class. It's strange to say self.CONSTANT = ..., so I don't see a big risk for clobbering them.
Default attribute -- mixed, as above may go to class, but may also go to __init__ depending on the semantics.
Global data structure --- goes to class if used only by the class, but may also go to module, in either case must be very well-documented.
Class attributes are often used to allow overriding defaults in subclasses. For example, BaseHTTPRequestHandler has class constants sys_version and server_version, the latter defaulting to "BaseHTTP/" + __version__. SimpleHTTPRequestHandler overrides server_version to "SimpleHTTP/" + __version__.
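The same mechanism is available to your own handlers. A minimal sketch, where MyHandler and the "MyServer/1.0" string are made up for illustration:

```python
from http.server import BaseHTTPRequestHandler


class MyHandler(BaseHTTPRequestHandler):
    # Overriding the class attribute changes the default for this subclass
    # only; sys_version is still inherited from the base class.
    server_version = "MyServer/1.0"   # hypothetical value
```

No __init__ or other method needs to be touched: code in the base class that reads self.server_version picks up the subclass's value through normal attribute lookup.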
Encapsulation is a good principle: when an attribute is inside the class it pertains to instead of being in the global scope, this gives additional information to people reading the code.
In your situations 1-4, I would thus avoid globals as much as I can, and prefer using class attributes, which allow one to benefit from encapsulation.
class Ball:
    a = []
    def __init__(self):
        pass
    def add(self,thing):
        self.a.append(thing)
    def size(self):
        print(len(self.a))

for i in range(3):
    foo = Ball()
    foo.add(1)
    foo.add(2)
    foo.size()
I would expect a return of :
2
2
2
But I get :
2
4
6
Why is this? I've found that by doing a=[] in the init, I can route around this behavior, but I'm less than clear why.
doh
I just figured out why.
In the above case, the a is a class attribute, not a data attribute - those are shared by all Balls(). Commenting out the a=[] and placing it into the init block means that it's a data attribute instead. (And, I couldn't access it then with foo.a, which I shouldn't do anyhow.) It seems like the class attributes act like static attributes of the class, they're shared by all instances.
Whoa.
One question though : CodeCompletion sucks like this. In the foo class, I can't do self.(variable), because it's not being defined automatically - it's being defined by a function. Can I define a class variable and replace it with a data variable?
What you probably want to do is:
class Ball:
    def __init__(self):
        self.a = []
If you use just a = [], it creates a local variable in the __init__ function, which disappears when the function returns. Assigning to self.a makes it an instance variable which is what you're after.
For a semi-related gotcha, see how you can change the value of default parameters for future callers.
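In case the link does not survive, the gotcha is roughly this: a default argument value is created once, when the function is defined, so a mutable default is shared between calls in much the same way the class-level list is shared between instances. The function names below are made up for the sketch:

```python
def append_item(item, bucket=[]):      # the list is created once, at def time
    bucket.append(item)
    return bucket


first = append_item(1)
second = append_item(2)                # same list as before, now [1, 2]


def append_item_safe(item, bucket=None):
    # The usual fix: use None as the default and build a fresh list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```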
"Can I define a class variable and replace it with a data variable?"
No. They're separate things. A class variable exists precisely once -- in the class.
You could -- to finesse code completion -- start with some class variables and then delete those lines of code after you've written your class. But every time you forget to do that nothing good will happen.
Better is to try a different IDE. Komodo Edit's code completions seem to be sensible.
If you have so many variables with such long names that code completion is actually helpful, perhaps you should make your classes smaller or use shorter names. Seriously.
I find that when you get to a place where code completion is more helpful than annoying, you've exceeded the "keep it all in my brain" complexity threshold. If the class won't fit in my brain, it's too complex.