Why does __dict__ not contain class members unless direct initialization is used? - python

I need to access the list of class members in class methods, specifically in the __init__() function, with the intention of initializing them en masse. I tried to use __dict__ and vars(self), but unfortunately they return an empty dict object unless I use direct initialization such as self.y=5. The questions are: why is it empty, how can I initialize the members at once, and is __dict__ essentially suitable for bulk initialization?
thank you
sample code is like this:
class P:
    def __init__(self):
        print("inside __init__() :", self.y, self.__dict__)
    x = 8
    y = 9

p = P()
print(" y is: ", p.y)
print("and __dict__ is:", p.__dict__)
output:
inside __init__() : 9 {}
y is: 9
and __dict__ is: {}
Python version: 3.8.5
Tested Operating systems: windows 10, MacOS 10.15.7 and Linux CentOS 7

Python class instances are dynamic - they don't have variables until you put them there (see note below). And they don't have any pre-knowledge of what those variables should be. Usually that's done by adding them one by one (e.g., self.foo = 1) in __init__ or other methods in the class. But if you have a different source of variables, you can do it by adding to self.__dict__. Suppose I have a row from a CSV file and I want a class that is initialized by that row. I could do
class P:
    cell_names = ('a', 'b', 'c')
    def __init__(self, row):
        print("before", self.__dict__)
        self.__dict__.update(zip(self.cell_names, row))
        print("after", self.__dict__)

p = P((1, 2, 3))
print("attributes", p.a, p.b, p.c)
Outputs
before {}
after {'a': 1, 'b': 2, 'c': 3}
attributes 1 2 3
There are other ways to initialize, of course. It depends on what the source of your "en masse" data is.
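For instance, here is a minimal sketch (the DEFAULTS dict and the class name P3 are just illustrative, not from the question) that loops over a plain dict with setattr; unlike writing to __dict__ directly, setattr goes through normal attribute assignment, so it also works with properties and __slots__:

class P3:
    DEFAULTS = {'x': 8, 'y': 9}
    def __init__(self):
        # each setattr call adds one entry to self.__dict__
        for name, value in self.DEFAULTS.items():
            setattr(self, name, value)

p = P3()
print(p.__dict__)  # {'x': 8, 'y': 9}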
If you want to add all of the keyword arguments you could
class P1:
    def __init__(self, **kw):
        self.__dict__.update(kw)

p = P1(a=1, b=3, c=3)
print("P1", p.a, p.b, p.c)
(Note: You can use __slots__ to predefine attributes in a class. These bypass the instance dict.)
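A tiny sketch of that note (illustrative, Python 3 syntax):

class Q:
    __slots__ = ('x', 'y')  # only these attribute names are allowed

q = Q()
q.x = 1
print(q.x)      # 1
# q.__dict__    # AttributeError: 'Q' object has no attribute '__dict__'
# q.z = 3       # AttributeError: 'Q' object has no attribute 'z'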

__dict__ only holds instance attributes. x and y in your examples are class attributes. self.x is an expression that could mean many different things; the actual result depends on which lookup succeeds first.
When you make an assignment like self.x = 5, then the value 5 is associated with the key x in self.__dict__.
When you try to get the value of self.x, the first thing that is tried is self.__dict__['x']. If that fails, though, it tries P.x. If that failed, it would check for x as an attribute in each ancestor of P in the MRO, until a value is found. If nothing is found, an AttributeError is raised. (This ignores the use of __getattr__ or __getattribute__, but is sufficient to explain how self.x can provide a value when self.__dict__ is empty.)
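A minimal sketch of that lookup order (the class names are illustrative, Python 3 syntax):

class Base:
    x = 'class attribute on Base'

class Child(Base):
    pass

c = Child()
print(c.x)          # found on Base via the MRO
c.x = 'instance attribute'
print(c.x)          # the instance dict now wins
print(c.__dict__)   # {'x': 'instance attribute'}
del c.x
print(c.x)          # falls back to the class attribute again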

Related

Assigning dictionary to a class object

What is the difference between the two class definitions below,
class my_dict1(dict):
    def __init__(self, data):
        self = data.copy()
        self.N = sum(self.values())
The above code results in AttributeError: 'dict' object has no attribute 'N', while the code below works fine:
class my_dict2(dict):
    def __init__(self, data):
        for k, v in data.items():
            self[k] = v
        self.N = sum(self.values())
For example,
d = {'a': 3, 'b': 5}
a = my_dict1(d) # results in attribute error
b = my_dict2(d) # works fine
By assigning to self itself, you rebind the name self to a completely different instance than the one you were originally dealing with, so it is no longer the "self". That instance will be of the broader type dict (because data is a dict), not of the narrower type my_dict1. You would need to write self["N"] in the first example for it to run without error, but note that even with this, in something like:
abc = my_dict1({})
abc will still not have the key "N", because a completely different instance in __init__ was given a value for the key "N". This shows you that there's no reasonable scenario where you want to assign self itself to something else.
In regards to my_dict2, prefer composition over inheritance if you want to use a particular dict as a representation of your domain. This means having data as an instance field. See the related C# question Why not inherit from List?, the core answer is still the same. It comes down to whether you want to extend the dict mechanism vs. having a business object based on it.
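A minimal sketch of that composition approach (class and attribute names are illustrative, not from the question; Python 3 syntax):

class Totals:
    """Holds a dict as a field instead of inheriting from dict."""
    def __init__(self, data):
        self.data = dict(data)
        self.N = sum(self.data.values())
    def __getitem__(self, key):
        # expose only the dict behaviour the domain object actually needs
        return self.data[key]

t = Totals({'a': 3, 'b': 5})
print(t.N)      # 8
print(t['a'])   # 3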

Python method changing self value (dict-inherited class) [duplicate]

I have a class (list of dicts) and I want it to sort itself:
class Table(list):
    …
    def sort(self, in_col_name):
        self = Table(sorted(self, key=lambda x: x[in_col_name]))
but it doesn't work at all. Why? How can I avoid it, other than sorting it externally, like:
new_table = Table(sorted(old_table, key=lambda x: x['col_name']))
Isn't it possible to manipulate the object itself? It's more meaningful to have:
class Table(list):
    pass
than:
class Table(object):
    l = []
    …
    def sort(self, in_col_name):
        self.l = sorted(self.l, key=lambda x: x[in_col_name])
which, I think, works.
And in general, isn't there any way in Python in which an object is able to change itself (not only an instance variable)?
You can't re-assign to self from within a method and expect it to change external references to the object.
self is just an argument that is passed to your function. It's a name that points to the instance the method was called on. "Assigning to self" is equivalent to:
def fn(a):
    a = 2

a = 1
fn(a)
# a is still equal to 1
Assigning to self changes what the self name points to (from one Table instance to a new Table instance here). But that's it. It just changes the name (in the scope of your method), and does not affect the underlying object, nor other names (references) that point to it.
Just sort in place using list.sort:
def sort(self, in_col_name):
    super(Table, self).sort(key=lambda x: x[in_col_name])
Python passes arguments by assignment: the parameter name is bound to the object the caller passed, so rebinding the parameter inside the function never has an effect outside of it. self is just the name you chose for one of the parameters.
I was intrigued by this question because I had never thought about this. I looked for the list.sort code, to see how it's done there, but apparently it's in C. I think I see what you're getting at: what if there is no super method to invoke? Then you can do something like this:
class Table(list):
    def pop_n(self, n):
        for _ in range(n):
            self.pop()
>>> a = Table(range(10))
>>> a.pop_n(3)
>>> print a
[0, 1, 2, 3, 4, 5, 6]
You can call self's methods, do index assignments to self and whatever else is implemented in its class (or that you implement yourself).
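For the original sorting question, another in-place option (a sketch, not taken from the answers above; the method name sort_by is illustrative) is slice assignment, which replaces the contents of the existing list object instead of rebinding self:

class Table(list):
    def sort_by(self, in_col_name):
        # self[:] = ... mutates this very list, so all external
        # references see the new order
        self[:] = sorted(self, key=lambda x: x[in_col_name])

t = Table([{'col': 2}, {'col': 1}])
t.sort_by('col')
print(t)  # [{'col': 1}, {'col': 2}]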

Calling classmethods through a dictionary

I'm working on a class describing an object that can be expressed in several "units", I'll say, to keep things simple. Let's say we're talking about length. (It's actually something more complicated.) What I would like is for the user to be able to input 1 and "inch", for example, and automatically get member variables in feet, meters, furlongs, what have you as well. I want the user to be able to input any of the units I am dealing in, and get member variables in all the other units. My thought was to do something like this:
class length:
    @classmethod
    def inch_to_foot(cls, inch):
        # etc.
    @classmethod
    def inch_to_meter(cls, inch):
        # etc.
I guess you get the idea. Then I would define a dictionary in the class:
from_to = {'inch': {'foot': inch_to_foot, 'meter': inch_to_meter, ...},
           'furlong': {'foot': furlong_to_foot, ...},
           # etc.
           }
So then I think I can write an __init__ method
def __init__(self, num, unit):
    cls = self.__class__
    setattr(self, unit, num)
    for k in cls.from_to[unit].keys():
        setattr(self, k, cls.from_to[unit][k](num))
But no go. I get the error "class method not callable". Any ideas how I can make this work? Any ideas for scrapping the whole thing and trying a different approach? Thanks.
If you move the from_to variable into __init__ and modify it to something like:
cls.from_to={'inch':{'foot':cls.inch_to_foot,'meter':cls.inch_to_meter, }}
then I think it works as you expect.
Unfortunately I can't answer why, because I haven't used classmethods much myself, but I think it is something to do with bound vs unbound methods. Anyway, if you print the functions stored in from_to in your code vs. the ones with my modification, you'll see they are different (mine are bound, yours are classmethod objects).
Hope that helps somewhat!
EDIT: I've thought about it a bit more, I think the problem is because you are storing a reference to the functions before they have been bound to the class (not surprising that the binding happens once the rest of the class has been parsed). My advice would be to forget about storing a dictionary of function references, but to store (in some representation of your choice) strings that indicate the units you can change between. For instance you might choose a similar format, such as:
from_to = {'inch':['foot','meter']}
and then look up the functions during __init__ using getattr
E.G.:
class length:
    from_to = {'inch': ['foot', 'meter']}
    def __init__(self, num, unit):
        if unit not in self.from_to:
            raise RuntimeError('unit %s not supported' % unit)
        cls = self.__class__
        setattr(self, unit, num)
        for k in cls.from_to[unit]:
            f = getattr(cls, '%s_to_%s' % (unit, k))
            setattr(self, k, f(num))
    @classmethod
    def inch_to_foot(cls, inch):
        return inch / 12.0
    @classmethod
    def inch_to_meter(cls, inch):
        return inch * 2.54 / 100

a = length(3, 'inch')
print a.meter
print a.foot
print length.inch_to_foot(3)
I don't think doing it with an __init__() method would be a good idea. I once saw an interesting way to do it in the Overriding the __new__ method section of the classic document titled Unifying types and classes in Python 2.2 by Guido van Rossum.
Here's some examples:
class inch_to_foot(float):
    "Convert from inch to feet"
    def __new__(cls, arg=0.0):
        return float.__new__(cls, float(arg) / 12)

class inch_to_meter(float):
    "Convert from inch to meter"
    def __new__(cls, arg=0.0):
        return float.__new__(cls, arg * 0.0254)

print inch_to_meter(5)  # 0.127
Here's a completely different answer that uses a metaclass and requires the conversion functions to be staticmethods rather than classmethods -- which it turns into properties based on the target unit's name. It searches for the names of any conversion functions itself, eliminating the need to manually define from_to-style tables.
One thing about this approach is that the conversion functions aren't even called unless references are made to the units associated with them. Another is that they're dynamic in the sense that the results returned will reflect the current value of the instance (unlike instances of three_pineapples' length class, which stores the results of calling them on the numeric value of the instance when it's initially constructed).
You've never said what version of Python you're using, so the following code is for Python 2.2 - 2.x.
import re

class MetaUnit(type):
    def __new__(metaclass, classname, bases, classdict):
        cls = type.__new__(metaclass, classname, bases, classdict)
        # add a constructor
        setattr(cls, '__init__',
                lambda self, value=0: setattr(self, '_value', value))
        # add a property for getting and setting the underlying value
        setattr(cls, 'value',
                property(lambda self: self._value,
                         lambda self, value: setattr(self, '_value', value)))
        # add an identity property that just returns the value unchanged
        unitname = classname.lower()  # lowercase classname becomes name of unit
        setattr(cls, unitname, property(lambda self: self._value))
        # find conversion methods and create properties that use them
        matcher = re.compile(unitname + r'''_to_(?P<target_unitname>\w+)''')
        for name in cls.__dict__.keys():
            match = matcher.match(name)
            if match:
                target_unitname = match.group('target_unitname').lower()
                fget = (lambda self, conversion_method=getattr(cls, name):
                            conversion_method(self._value))
                setattr(cls, target_unitname, property(fget))
        return cls
Sample usage:
scalar_conversion_staticmethod = (
    lambda scale_factor: staticmethod(lambda value: value * scale_factor))

class Inch(object):
    __metaclass__ = MetaUnit
    inch_to_foot = scalar_conversion_staticmethod(1./12.)
    inch_to_meter = scalar_conversion_staticmethod(0.0254)

a = Inch(3)
print a.inch   # 3
print a.meter  # 0.0762
print a.foot   # 0.25
a.value = 6
print a.inch   # 6
print a.meter  # 0.1524
print a.foot   # 0.5

Difference between Class variables and Instance variables

I have already read many answers here on Stack Exchange like Python - why use "self" in a class?
After reading these answers, I understand that instance variables are unique to each instance of the class while class variables are shared across all instances.
While playing around, I found that this code gives the output [1]:
class A:
    x = []
    def add(self):
        self.x.append(1)

x = A()
y = A()
x.add()
print "Y's x: ", y.x
However, this code gives 10 as the output, when in my opinion it should be 11:
class A:
    x = 10
    def add(self):
        self.x += 1

x = A()
y = A()
x.add()
print "Y's x: ", y.x
Why is the class variable not updated when I run x.add()? I am not very experienced in programming, so please excuse me.
Class variables can be shadowed by instance attributes. This means that when looking up an attribute, Python first looks in the instance, then in the class. Furthermore, setting an attribute on an object (e.g. self) always creates an instance attribute; it never changes the class variable.
This means that when, in your second example you do:
self.x += 1
which is (in this case, see the note at the end) equivalent to:
self.x = self.x + 1
what Python does is:
1. Look up self.x. At that point, self doesn't have the instance attribute x, so the class attribute A.x is found, with the value 10.
2. The RHS is evaluated, giving the result 11.
3. This result is assigned to a new instance attribute x of self.
So after that, when you look up x.x, you get the new instance attribute that was created in add(). When looking up y.x, you still get the class attribute. To change the class attribute, you'd have to use A.x += 1 explicitly; the fall-through to the class only happens when reading the value of an attribute.
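A minimal sketch of that difference (illustrative, Python 3 syntax):

class A:
    x = 10
    def add(self):
        self.x += 1        # reads A.x, then creates an instance attribute

a, b = A(), A()
a.add()
print(a.x, b.x, A.x)       # 11 10 10 -> only a has an instance attribute
A.x += 1                   # this really does change the class attribute
print(b.x, A.x)            # 11 11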
Your first example is a classical gotcha and the reason you shouldn't use class attributes as "default" values for instance attributes. When you call:
self.x.append(1)
there is no assignment to self.x taking place. (Changing the contents of a mutable object, like a list, is not the same as assignment.) Thus, no new instance attribute is added to x that would shadow it, and looking up x.x and y.x later on gives you the same list from the class attribute.
Note: In Python, x += y is not always equivalent to x = x + y. Python allows you to override the in-place operators separately from the normal ones for a type. This mostly makes sense for mutable objects, where the in-place version will directly change the contents without a reassignment of the LHS of the expression. However, immutable objects (such as numbers in your second example) do not override in-place operators. In that case, the statement does get evaluated as a regular addition and a reassignment, explaining the behaviour you see.
(I lifted the above from this SO answer, see there for more details.)
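A short Python 3 sketch of that note (illustrative, not from the answer):

nums = [1, 2]
alias = nums
nums += [3]        # list defines in-place __iadd__, so the list is mutated
print(alias)       # [1, 2, 3] -> the alias sees the change

n = 10
m = n
n += 1             # ints are immutable: this rebinds n to a new object
print(n, m)        # 11 10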

Reuse existing objects for immutable objects?

In Python, how is it possible to reuse existing equal immutable objects (like is done for str)? Can this be done just by defining a __hash__ method, or does it require more complicated measures?
If you want to create via the class constructor and have it return a previously created object then you will need to provide a __new__ method (because by the time you get to __init__ the object has already been created).
Here is a simple example - if the value used to initialise has been seen before then a previously created object is returned rather than a new one created:
class Cached(object):
    """Simple example of immutable object reuse."""
    def __init__(self, i):
        self.i = i
    def __new__(cls, i, _cache={}):
        try:
            return _cache[i]
        except KeyError:
            # you must call __new__ on the base class
            x = super(Cached, cls).__new__(cls)
            x.__init__(i)
            _cache[i] = x
            return x
Note that for this example you can use anything to initialise as long as it's hashable. And just to show that objects really are being reused:
>>> a = Cached(100)
>>> b = Cached(200)
>>> c = Cached(100)
>>> a is b
False
>>> a is c
True
There are two 'software engineering' solutions to this that don't require any low-level knowledge of Python. They apply in the following scenarios:
First Scenario: Objects of your class are 'equal' if they are constructed with the same constructor parameters, and equality won't change over time after construction. Solution: Use a factory that hashes the constructor parameters:
class MyClass:
    def __init__(self, someint, someotherint):
        self.a = someint
        self.b = someotherint

cachedict = {}

def construct_myobject(someint, someotherint):
    if (someint, someotherint) not in cachedict:
        cachedict[(someint, someotherint)] = MyClass(someint, someotherint)
    return cachedict[(someint, someotherint)]
This approach essentially limits the instances of your class to one unique object per distinct input pair. There are obvious drawbacks as well: not all types are easily hashable and so on.
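Illustrative usage of the factory (not part of the original answer):

p = construct_myobject(1, 2)
q = construct_myobject(1, 2)
print(p is q)  # True -> the cached instance is reused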
Second Scenario: Objects of your class are mutable and their 'equality' may change over time. Solution: define a class-level registry of equal instances:
class MyClass:
    registry = {}
    def __init__(self, someint, someotherint, third):
        MyClass.registry[id(self)] = (someint, someotherint)
        self.someint = someint
        self.someotherint = someotherint
        self.third = third
    def __eq__(self, other):
        return MyClass.registry[id(self)] == MyClass.registry[id(other)]
    def update(self, someint, someotherint):
        MyClass.registry[id(self)] = (someint, someotherint)
In this example, objects with the same someint, someotherint pair are equal, while the third parameter does not factor in. The trick is to keep the parameters in registry in sync. As an alternative to update, you could override __setattr__ (and __getattr__) for your class instead; this would ensure that any assignment foo.someint = y would be kept synced with your class-level dictionary. See an example here.
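Illustrative usage of the registry approach (not part of the original answer):

u = MyClass(1, 2, 'spam')
v = MyClass(1, 2, 'eggs')
print(u == v)      # True  -> third does not factor into equality
u.update(7, 8)
print(u == v)      # False -> equality changed after the update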
I believe you would have to keep a dict {args: object} of instances already created, then override the class' __new__ method to check in that dictionary, and return the relevant object if it already existed. Note that I haven't implemented or tested this idea. Of course, strings are handled at the C level.
