Although I may be very confused as to what the property() function does, I'm trying to use it to create an attribute which is read-only. Ideally, I'd like to be able to refer to the attribute directly but not be allowed to assign to it. When experimenting, I got this very curious behavior:
>>> class Boo(object):
...     def __init__(self, x):
...         self.__x = x
...     def getx(self):
...         return self.__x
...     x = property(getx)
...
>>> b = Boo(1)
>>> b.__x = 2
>>> b.getx()
1
>>> b.__x
2
I'd like to add that when I used x and _x as the attribute names, reassigning the attribute caused the getter to return the changed value, i.e. both b.getx() and b.x/b._x gave me 2.
I realize that I'm also using x as the property name, but when I tried the following I got an AttributeError in my __init__():
>>> class Boo(object):
...     def __init__(self, x):
...         self.__x = x
...     def getx(self):
...         return self.__x
...     __x = property(getx)
...
>>> b = Boo(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __init__
AttributeError: can't set attribute
The problem here has nothing to do with property, but with double-underscore attributes, which are subject to what's called "Private name mangling".
>>> b = Boo(1)
>>> '__x' in dir(b)
False
>>> '_Boo__x' in dir(b)
True
So, when you do this:
>>> b.__x = 2
You're not changing the value of the attribute the getx function is looking at, you're creating a new attribute.
If you just use a name for the attribute that doesn't start with two underscores—such as _x—everything works as you intended.
As a general rule, use a single underscore for "advisory private"—as in, "users of this object probably shouldn't care about this value", and a double underscore only when you actually need mangling (because of complex inheritance issues that rarely come up).
What if you want "real private", like C++ or Java? You can't have it. No matter how well you hide or protect the attribute, someone can just monkeypatch the getx method or the x property. So, Python doesn't provide a way to hide attributes.
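As a sketch of the recommended single-underscore approach (using the decorator form of property; the name _x is just the usual convention, not anything the language enforces):

```python
class Boo(object):
    def __init__(self, x):
        self._x = x        # single underscore: no name mangling

    @property
    def x(self):           # no setter defined, so assigning to b.x fails
        return self._x

b = Boo(1)
assert b.x == 1

blocked = False
try:
    b.x = 2                # rejected by the read-only property
except AttributeError:
    blocked = True
assert blocked and b.x == 1

b._x = 2                   # the backing attribute is only "advisory private"
assert b.x == 2
```

The property blocks direct assignment, but as noted above, nothing stops a caller from writing to _x directly; Python privacy is advisory.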
Your problem is that using double underscore attribute names mangles the name. So when you are dealing with __x inside of your class definition, outside of the class it actually looks like _Boo__x. That is,
_ + (class name) + (double underscore attribute name)
To demonstrate,
>>> b = Boo(1)
>>> b.__x = 2
>>> b.getx()
1
>>> b.x # NOTE: same as calling getx
1
>>> b.__x # why didn't x return 2 if we changed it?
2
>>> b._Boo__x # because it's actually saved in this attribute
1
>>> b._Boo__x = 3 # setting it here then works
>>> b.x
3
>>> b.getx()
3
I really just wanted to comment on (rather than answer) your question. I think you will find the following informative:
>>> b = Boo(1)
>>> print b.__dict__
{'_Boo__x': 1}
>>> b.__x = 2
>>> print b.__dict__
{'__x': 2, '_Boo__x': 1}
This might provide a hint as to the behavior (which I do not understand well enough to explain).
It's because you don't have a setx function defined.
#!/usr/bin/python
class Boo(object):
    def __init__(self, initialize_x):
        self.x = initialize_x
    def getx(self):
        print
        print '\t\tgetx: returning the value of x', self.__x
        print
        return self.__x
    def setx(self, new_x):
        print
        print '\t\tsetx: setting x to new value', new_x
        print
        self.__x = new_x
    x = property(getx, setx)

print '1 ########'
print
print '\tinitializing Boo object with a default x of 20'
print
o = Boo(20)
print '\treading the value of x through property o.x'
t = o.x
print
print '2 ########'
print
print '\tsetting x\'s value through the property o.x'
o.x = 100
print
print '3 ########'
print
print '\treading the value of x through the property o.x'
t = o.x
When run produces:
1 ########
initializing Boo object with a default x of 20
setx: setting x to new value 20
reading the value of x through property o.x
getx: returning the value of x 20
2 ########
setting x's value through the property o.x
setx: setting x to new value 100
3 ########
reading the value of x through the property o.x
getx: returning the value of x 100
I use property as a decorator; it's convenient for computed attributes. Maybe for a read-only attribute it would be better to use the descriptor magic methods __set__ and __get__?
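For reference, the decorator form mentioned in the comment looks like this (Circle and area are illustrative names; property is itself built on the __get__/__set__ descriptor protocol, so a hand-written descriptor is only needed for more exotic behaviour):

```python
class Circle(object):
    def __init__(self, radius):
        self.radius = radius

    @property
    def area(self):
        # recomputed on every access; with no setter, the attribute is read-only
        return 3.14159 * self.radius ** 2

c = Circle(2)
assert abs(c.area - 12.56636) < 1e-5
c.radius = 3               # the computed value follows the underlying data
assert abs(c.area - 28.27431) < 1e-5
```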
Should I give my class members default values like this:
class Foo:
    num = 1

or like this?

class Foo:
    def __init__(self):
        self.num = 1
In this question I discovered that in both cases,
bar = Foo()
bar.num += 1
is a well-defined operation.
I understand that the first method will give me a class variable while the second one will not. However, if I do not require a class variable, but only need to set a default value for my instance variables, are both methods equally good? Or one of them more 'pythonic' than the other?
One thing I've noticed is that in the Django tutorial, they use the second method to declare Models. Personally I think the second method is more elegant, but I'd like to know what the 'standard' way is.
Extending bp's answer, I wanted to show you what he meant by immutable types.
First, this is okay:
>>> class TestB():
...     def __init__(self, attr=1):
...         self.attr = attr
...
>>> a = TestB()
>>> b = TestB()
>>> a.attr = 2
>>> a.attr
2
>>> b.attr
1
However, this only works for immutable (unchangeable) types. If the default value is mutable (meaning it can be modified in place), this happens instead:
>>> class Test():
...     def __init__(self, attr=[]):
...         self.attr = attr
...
>>> a = Test()
>>> b = Test()
>>> a.attr.append(1)
>>> a.attr
[1]
>>> b.attr
[1]
>>>
Note that both a and b have a shared attribute. This is often unwanted.
This is the Pythonic way of defining default values for instance variables, when the type is mutable:
>>> class TestC():
...     def __init__(self, attr=None):
...         if attr is None:
...             attr = []
...         self.attr = attr
...
>>> a = TestC()
>>> b = TestC()
>>> a.attr.append(1)
>>> a.attr
[1]
>>> b.attr
[]
The reason my first snippet of code works is that, with immutable types, Python creates a new object whenever one is needed. If you add 1 to 1, Python makes a new 2 for you, because the old 1 cannot be changed. The reason is mostly for hashing, I believe.
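The rebinding can be observed directly with id() (this demonstrates CPython behaviour; the exact id values vary from run to run):

```python
x = 1
old_id = id(x)
x += 1                # ints are immutable: x is rebound to a new object, 2
assert x == 2
assert id(x) != old_id

lst = []
old_id = id(lst)
lst += [1]            # lists are mutable: += mutates the same object in place
assert lst == [1]
assert id(lst) == old_id
```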
The two snippets do different things, so it's not a matter of taste but a matter of what's the right behaviour in your context. Python documentation explains the difference, but here are some examples:
Exhibit A
class Foo:
    def __init__(self):
        self.num = 1
This binds num to the Foo instances. A change to this field on one instance is not propagated to other instances.
Thus:
>>> foo1 = Foo()
>>> foo2 = Foo()
>>> foo1.num = 2
>>> foo2.num
1
Exhibit B
class Bar:
    num = 1
This binds num to the Bar class. Changes are propagated!
>>> bar1 = Bar()
>>> bar2 = Bar()
>>> bar1.num = 2 #this creates an INSTANCE variable that HIDES the propagation
>>> bar2.num
1
>>> Bar.num = 3
>>> bar2.num
3
>>> bar1.num
2
>>> bar1.__class__.num
3
Actual answer
If I do not require a class variable, but only need to set a default value for my instance variables, are both methods equally good? Or one of them more 'pythonic' than the other?
The code in exhibit B is plain wrong for this purpose: why would you bind a class-wide attribute when what you want is a per-instance default?
The code in exhibit A is okay.
If you want to give defaults for instance variables in your constructor I would however do this:
class Foo:
    def __init__(self, num=None):
        self.num = num if num is not None else 1
...or even:

class Foo:
    DEFAULT_NUM = 1
    def __init__(self, num=None):
        self.num = num if num is not None else Foo.DEFAULT_NUM
...or even: (preferable, but if and only if you are dealing with immutable types!)

class Foo:
    def __init__(self, num=1):
        self.num = num
This way you can do:
foo1 = Foo(4)
foo2 = Foo() #use default
Using class members to give default values works very well, just so long as you are careful only to do it with immutable values. If you try to do it with a list or a dict, that would be pretty deadly. It also works where the instance attribute is a reference to a class, just so long as the default value is None.
I've seen this technique used very successfully in repoze which is a framework that runs on top of Zope. The advantage here is not just that when your class is persisted to the database only the non-default attributes need to be saved, but also when you need to add a new field into the schema all the existing objects see the new field with its default value without any need to actually change the stored data.
I find it also works well in more general coding, but it's a style thing. Use whatever you are happiest with.
With dataclasses, a feature added in Python 3.7, there is now yet another (quite convenient) way to achieve setting default values on class instances. The decorator dataclass will automatically generate a few methods on your class, such as the constructor. As the documentation linked above notes, "[t]he member variables to use in these generated methods are defined using PEP 526 type annotations".
Considering OP's example, we could implement it like this:
from dataclasses import dataclass

@dataclass
class Foo:
    num: int = 0
When constructing an object of this class's type we could optionally overwrite the value.
print('Default val: {}'.format(Foo()))
# Default val: Foo(num=0)
print('Custom val: {}'.format(Foo(num=5)))
# Custom val: Foo(num=5)
Using class members for default values of instance variables is not a good idea, and it's the first time I've seen this idea mentioned at all. It works in your example, but it may fail in a lot of cases. E.g., if the value is mutable, mutating it on an unmodified instance will alter the default:
>>> class c:
...     l = []
...
>>> x = c()
>>> y = c()
>>> x.l
[]
>>> y.l
[]
>>> x.l.append(10)
>>> y.l
[10]
>>> c.l
[10]
You can also declare class variables as None. Since None is immutable there is no shared state to mutate, and later assignments simply create instance attributes. This is useful when you need a well-defined class and want to prevent AttributeErrors.
For example:
>>> class TestClass(object):
...     t = None
...
>>> test = TestClass()
>>> test.t
>>> test2 = TestClass()
>>> test.t = 'test'
>>> test.t
'test'
>>> test2.t
>>>
Also if you need defaults:
>>> class TestClassDefaults(object):
...     t = None
...     def __init__(self, t=None):
...         self.t = t
...
>>> test = TestClassDefaults()
>>> test.t
>>> test2 = TestClassDefaults([])
>>> test2.t
[]
>>> test.t
>>>
Of course still follow the info in the other answers about using mutable vs immutable types as the default in __init__.
I played around with overloading or masking classes in Python. Do the following code examples create equivalent classes?
class CustASample(object):
    def __init__(self):
        self.__class__.__name__ = "Sample"

    def doSomething(self):
        dummy = 1
and
class Sample(object):
    def doSomething(self):
        dummy = 1
EDIT: From the comments and the good answer by gs, it occurred to me that what I really wanted to ask was: what "attributes" make these classes differ?
Because
>>> dir(a) == dir(b)
True
and
>>> print Sample
<class '__main__.Sample'>
>>> print CustASample
<class '__main__.Sample'>
but
>>> Sample == CustASample
False
No, they are still different.
a = CustASample()
b = Sample()
a.__class__ is b.__class__
-> False
Here's how you could do it:
class A(object):
    def __init__(self):
        self.__class__ = B

class B(object):
    def bark(self):
        print "Wuff!"
a = A()
b = B()
a.__class__ is b.__class__
-> True
a.bark()
-> Wuff!
b.bark()
-> Wuff!
Usually you would do it in the __new__ method instead of in __init__:
class C(object):
    def __new__(cls):
        return A()
To answer your updated question:
>>> a = object()
>>> b = object()
>>> a == b
False
Why would a not be equal to b, since both are just plain objects without attributes?
Well, that answer is simple. The == operator invokes __eq__ if it's available, but unless you define it yourself it isn't. Instead, a is b gets used.
is compares the ids of the objects. (In CPython the memory address.) You can get the id of an object like this:
>>> id(a)
156808
Classes, too, are objects, not only instances. For example, you can get id(Sample). Try it and see that the two numbers differ, just as the classes' memory locations differ. They are not the same object. It's like asking whether [] is [].
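A short sketch of that point (the class names mirror the question; renaming a class changes how it prints, but does not merge it with another class object):

```python
class Sample(object):
    pass

class CustASample(object):
    pass

# Renaming only changes the displayed name, not which object it is.
CustASample.__name__ = "Sample"

assert Sample.__name__ == CustASample.__name__ == "Sample"
assert Sample is not CustASample          # two distinct class objects
assert id(Sample) != id(CustASample)      # living at different addresses
assert [] is not []                       # two fresh list objects, same idea
```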
EDIT: Too late and the explanation by gs is better.
The purpose of my question is to strengthen my knowledge base with Python and get a better picture of it, which includes knowing its faults and surprises. To keep things specific, I'm only interested in the CPython interpreter.
I'm looking for something similar to what I learned from my PHP landmines question, where some of the answers were well known to me but a couple were borderline horrifying.
Update:
Apparently one, maybe two, people are upset that I asked a question that's already partially answered outside of Stack Overflow. As some sort of compromise, here's the URL:
http://www.ferg.org/projects/python_gotchas.html
Note that one or two answers here already are original from what was written on the site referenced above.
Expressions in default arguments are calculated when the function is defined, not when it’s called.
Example: consider defaulting an argument to the current time:
>>> import time
>>> def report(when=time.time()):
...     print when
...
>>> report()
1210294387.19
>>> time.sleep(5)
>>> report()
1210294387.19
The when argument doesn't change. It is evaluated when you define the function. It won't change until the application is re-started.
Strategy: you won't trip over this if you default arguments to None and then do something useful when you see it:
>>> def report(when=None):
...     if when is None:
...         when = time.time()
...     print when
...
>>> report()
1210294762.29
>>> time.sleep(5)
>>> report()
1210294772.23
Exercise: to make sure you've understood: why is this happening?
>>> def spam(eggs=[]):
...     eggs.append("spam")
...     return eggs
...
>>> spam()
['spam']
>>> spam()
['spam', 'spam']
>>> spam()
['spam', 'spam', 'spam']
>>> spam()
['spam', 'spam', 'spam', 'spam']
You should be aware of how class variables are handled in Python. Consider the following class hierarchy:
class AAA(object):
    x = 1

class BBB(AAA):
    pass

class CCC(AAA):
    pass
Now, check the output of the following code:
>>> print AAA.x, BBB.x, CCC.x
1 1 1
>>> BBB.x = 2
>>> print AAA.x, BBB.x, CCC.x
1 2 1
>>> AAA.x = 3
>>> print AAA.x, BBB.x, CCC.x
3 2 3
Surprised? You won't be if you remember that class variables are internally handled as dictionaries of a class object. For read operations, if a variable name is not found in the dictionary of current class, the parent classes are searched for it. So, the following code again, but with explanations:
# AAA: {'x': 1}, BBB: {}, CCC: {}
>>> print AAA.x, BBB.x, CCC.x
1 1 1
>>> BBB.x = 2
# AAA: {'x': 1}, BBB: {'x': 2}, CCC: {}
>>> print AAA.x, BBB.x, CCC.x
1 2 1
>>> AAA.x = 3
# AAA: {'x': 3}, BBB: {'x': 2}, CCC: {}
>>> print AAA.x, BBB.x, CCC.x
3 2 3
Same goes for handling class variables in class instances (treat this example as a continuation of the one above):
>>> a = AAA()
# a: {}, AAA: {'x': 3}
>>> print a.x, AAA.x
3 3
>>> a.x = 4
# a: {'x': 4}, AAA: {'x': 3}
>>> print a.x, AAA.x
4 3
Loops and lambdas (or any closure, really): variables are captured by name and looked up when the function is called, not when it is defined.
funcs = []
for x in range(5):
    funcs.append(lambda: x)

[f() for f in funcs]
# output:
# [4, 4, 4, 4, 4]
A work around is either creating a separate function or passing the args by name:
funcs = []
for x in range(5):
    funcs.append(lambda x=x: x)

[f() for f in funcs]
# output:
# [0, 1, 2, 3, 4]
Dynamic binding makes typos in your variable names surprisingly hard to find. It's easy to spend half an hour fixing a trivial bug.
EDIT: an example...
for item in some_list:
    ...  # lots of code
    ...  # more code
    for tiem in some_other_list:
        process(item)  # oops!
One of the biggest surprises I ever had with Python is this one:
a = ([42],)
a[0] += [43, 44]
This works as one might expect, except that it raises a TypeError after updating the first entry of the tuple! So a will be ([42, 43, 44],) after executing the += statement, but there will be an exception anyway. If, on the other hand, you try this:
a = ([42],)
b = a[0]
b += [43, 44]
you won't get an error.
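The behaviour follows from how += desugars: a[0] += [43, 44] is roughly a[0] = a[0].__iadd__([43, 44]). The list's __iadd__ mutates it in place and returns it, and only the final store back into the tuple raises. A sketch:

```python
a = ([42],)
raised = False
try:
    a[0] += [43, 44]   # the list mutates in place, then the tuple item
                       # assignment raises TypeError
except TypeError:
    raised = True

assert raised
assert a == ([42, 43, 44],)   # the mutation happened despite the exception
```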
try:
    int("z")
except IndexError, ValueError:
    pass
The reason this doesn't work is that IndexError is the type of exception you're catching, and ValueError is the name of the variable you're assigning the exception to.
Correct code to catch multiple exceptions is:
try:
    int("z")
except (IndexError, ValueError):
    pass
There was a lot of discussion on hidden language features a while back: hidden-features-of-python. Where some pitfalls were mentioned (and some of the good stuff too).
Also you might want to check out Python Warts.
But for me, integer division's a gotcha:
>>> 5/2
2
You probably wanted:
>>> 5*1.0/2
2.5
If you really want this (C-like) behaviour, you should write:
>>> 5//2
2
As that will work with floats too (and it will work when you eventually go to Python 3):
>>> 5*1.0//2
2.0
GvR explains how integer division came to work how it does on the history of Python.
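Yet another option, not shown above, is the __future__ import, which makes / mean true division in Python 2 (and is simply the default behaviour in Python 3):

```python
from __future__ import division

assert 5 / 2 == 2.5     # true division, even for two ints
assert 5 // 2 == 2      # floor division when that's what you want
assert 5.0 // 2 == 2.0  # // works with floats as well
```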
Not including an __init__.py in your packages. That one still gets me sometimes.
List slicing has caused me a lot of grief. I actually consider the following behavior a bug.
Define a list x
>>> x = [10, 20, 30, 40, 50]
Access index 2:
>>> x[2]
30
As you expect.
Slice the list from index 2 and to the end of the list:
>>> x[2:]
[30, 40, 50]
As you expect.
Access index 7:
>>> x[7]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
Again, as you expect.
However, try to slice the list from index 7 until the end of the list:
>>> x[7:]
[]
???
The remedy is to put a lot of tests when using list slicing. I wish I'd just get an error instead. Much easier to debug.
The only gotcha/surprise I've dealt with is CPython's GIL. If for whatever reason you expect Python threads in CPython to run in parallel... well, they don't, and this is pretty well documented by the Python crowd and even Guido himself.
A long but thorough explanation of CPython threading and some of the things going on under the hood and why true concurrency with CPython isn't possible.
http://jessenoller.com/2009/02/01/python-threads-and-the-global-interpreter-lock/
James Dumay eloquently reminded me of another Python gotcha:
Not all of Python's “included batteries” are wonderful.
James’ specific example was the HTTP libraries: httplib, urllib, urllib2, urlparse, mimetools, and ftplib. Some of the functionality is duplicated, and some of the functionality you'd expect is completely absent, e.g. redirect handling. Frankly, it's horrible.
If I ever have to grab something via HTTP these days, I use the urlgrabber module forked from the Yum project.
Floats are not printed at full precision by default (without repr):
x = 1.0 / 3
y = 0.333333333333
print x #: 0.333333333333
print y #: 0.333333333333
print x == y #: False
repr prints too many digits:
print repr(x) #: 0.33333333333333331
print repr(y) #: 0.33333333333300003
print x == 0.3333333333333333 #: True
Unintentionally mixing oldstyle and newstyle classes can cause seemingly mysterious errors.
Say you have a simple class hierarchy consisting of superclass A and subclass B. When B is instantiated, A's constructor must be called first. The code below correctly does this:
class A(object):
    def __init__(self):
        self.a = 1

class B(A):
    def __init__(self):
        super(B, self).__init__()
        self.b = 1

b = B()
But if you forget to make A a newstyle class and define it like this:
class A:
    def __init__(self):
        self.a = 1
you get this traceback:
Traceback (most recent call last):
  File "AB.py", line 11, in <module>
    b = B()
  File "AB.py", line 7, in __init__
    super(B, self).__init__()
TypeError: super() argument 1 must be type, not classobj
Two other questions relating to this issue are 489269 and 770134
def f():
    x += 1

x = 42
f()
results in an UnboundLocalError, because local names are detected statically. A different example would be
def f():
    print x
    x = 43

x = 42
f()
You cannot use locals()['x'] = whatever to change local variable values as you might expect.
This works:
>>> x = 1
>>> x
1
>>> locals()['x'] = 2
>>> x
2
BUT:
>>> def test():
...     x = 1
...     print x
...     locals()['x'] = 2
...     print x  # *** prints 1, not 2 ***
...
>>> test()
1
1
This actually burnt me in an answer here on SO, since I had tested it outside a function and got the change I wanted. Afterwards, I found it mentioned and contrasted to the case of globals() in "Dive Into Python." See example 8.12. (Though it does not note that the change via locals() will work at the top level as I show above.)
x += [...] is not the same as x = x + [...] when x is a list:
>>> x = y = [1,2,3]
>>> x = x + [4]
>>> x == y
False
>>> x = y = [1,2,3]
>>> x += [4]
>>> x == y
True
One creates a new list while the other modifies in place
List repetition with nested lists
This caught me out today and wasted an hour of my time debugging:
>>> x = [[]]*5
>>> x[0].append(0)
# Expect x equals [[0], [], [], [], []]
>>> x
[[0], [0], [0], [0], [0]] # Oh dear
Explanation: Python list problem
Using class variables when you want instance variables. Most of the time this doesn't cause problems, but if it's a mutable value it causes surprises.
class Foo(object):
    x = {}
But:
>>> f1 = Foo()
>>> f2 = Foo()
>>> f1.x['a'] = 'b'
>>> f2.x
{'a': 'b'}
You almost always want instance variables, which require you to assign inside __init__:
class Foo(object):
    def __init__(self):
        self.x = {}
Python 2 has some surprising behaviour with comparisons:
>>> print x
0
>>> print y
1
>>> x < y
False
What's going on? repr() to the rescue:
>>> print "x: %r, y: %r" % (x, y)
x: '0', y: 1
If you assign to a variable inside a function, Python assumes that the variable is defined inside that function:
>>> x = 1
>>> def increase_x():
...     x += 1
...
>>> increase_x()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in increase_x
UnboundLocalError: local variable 'x' referenced before assignment
Use global x (or nonlocal x in Python 3) to declare you want to set a variable defined outside your function.
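A sketch of the global fix (nonlocal works the same way for enclosing function scopes in Python 3):

```python
x = 1

def increase_x():
    global x    # declare that we mean the module-level x
    x += 1

increase_x()
assert x == 2
```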
The values of range(end_val) are not only strictly smaller than end_val, but strictly smaller than int(end_val). For a float argument to range, this might be an unexpected result:
from future.builtins import range
list(range(2.89))
[0, 1]
Due to 'truthiness' this makes sense:
>>> bool(1)
True
but you might not expect it to go the other way:
>>> float(True)
1.0
This can be a gotcha if you're converting strings to numeric and your data has True/False values.
If you create a list of list this way:
arr = [[2]] * 5
print arr
[[2], [2], [2], [2], [2]]
Then this creates a list in which all the elements point to the same object! That can cause real confusion. Consider this:
arr[0][0] = 5
then if you print arr
print arr
[[5], [5], [5], [5], [5]]
The proper way of initializing the array is for example with a list comprehension:
arr = [[2] for _ in range(5)]
arr[0][0] = 5
print arr
[[5], [2], [2], [2], [2]]