Inspect python class attributes - python

I need a way to inspect a class so I can safely identify which attributes are user-defined class attributes. The problem is that functions like dir(), inspect.getmembers() and friends return all class attributes, including the pre-defined ones like __class__, __doc__, __dict__ and __hash__. This is of course understandable, and one could argue that I could just make a list of named members to ignore, but unfortunately these pre-defined attributes are bound to change with different versions of Python, making my project vulnerable to changes in the Python project - and I don't like that.
Example:
>>> class A:
...     a=10
...     b=20
...     def __init__(self):
...         self.c=30
...
>>> dir(A)
['__doc__', '__init__', '__module__', 'a', 'b']
>>> get_user_attributes(A)
['a','b']
In the example above I want a safe way to retrieve only the user-defined class attributes ['a','b'], not 'c', as it is an instance attribute. So my question is: can anyone help me with the hypothetical function get_user_attributes(cls) above?
I have spent some time trying to solve the problem by parsing the class at the AST level, which would be very easy. But I can't find a way to convert already parsed objects to an AST node tree. I guess all AST info is discarded once a class has been compiled into bytecode.

Below is the hard way. Here's the easy way. Don't know why it didn't occur to me sooner.
import inspect
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    return [item
            for item in inspect.getmembers(cls)
            if item[0] not in boring]
Here's a start
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    attrs = {}
    bases = reversed(inspect.getmro(cls))
    for base in bases:
        if hasattr(base, '__dict__'):
            attrs.update(base.__dict__)
        elif hasattr(base, '__slots__'):
            if hasattr(base, base.__slots__[0]):
                # We're dealing with a non-string sequence or one char string
                for item in base.__slots__:
                    attrs[item] = getattr(base, item)
            else:
                # We're dealing with a single identifier as a string
                attrs[base.__slots__] = getattr(base, base.__slots__)
    for key in boring:
        attrs.pop(key, None) # remove the name held in key (not the literal 'key'); tolerate absent names
    return attrs
This should be fairly robust. Essentially, it works by getting the attributes that are on a default subclass of object to ignore. It then gets the mro of the class that's passed to it and traverses it in reverse order so that subclass keys can overwrite superclass keys. It returns a dictionary of key-value pairs. If you want a list of key, value tuples like in inspect.getmembers then just return either attrs.items() or list(attrs.items()) in Python 3.
If you don't actually want to traverse the mro and just want attributes defined directly on the subclass then it's easier:
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    attrs = {}
    if hasattr(cls, '__dict__'):
        attrs = cls.__dict__.copy()
    elif hasattr(cls, '__slots__'):
        if hasattr(cls, cls.__slots__[0]):
            # We're dealing with a non-string sequence or one char string
            for item in cls.__slots__:
                attrs[item] = getattr(cls, item)
        else:
            # We're dealing with a single identifier as a string
            attrs[cls.__slots__] = getattr(cls, cls.__slots__)
    for key in boring:
        attrs.pop(key, None) # remove the name held in key (not the literal 'key'); tolerate absent names
    return attrs
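As a rough Python 3 sketch of the same idea (not the answer's exact code): the helper below uses vars() in place of the __dict__/__slots__ branching and takes its baseline from an empty class statement, so that names a given interpreter version adds to every class body are filtered out as well. The _Empty class and the include_inherited flag are names made up for this illustration.

```python
class _Empty:
    """Hypothetical baseline class: everything it exposes counts as pre-defined."""


def get_user_attributes(cls, include_inherited=True):
    """Return a dict of class attributes that an empty class does not have."""
    boring = set(dir(_Empty))
    # Walk the MRO from object down to cls so subclasses overwrite base classes.
    classes = reversed(cls.__mro__) if include_inherited else [cls]
    attrs = {}
    for klass in classes:
        attrs.update(vars(klass))
    return {name: value for name, value in attrs.items() if name not in boring}


class A:
    a = 10
    b = 20

    def __init__(self):
        self.c = 30  # instance attribute, never stored on the class


class B(A):
    d = 40


print(sorted(get_user_attributes(A)))                           # prints ['a', 'b']
print(sorted(get_user_attributes(B, include_inherited=False)))  # prints ['d']
```

Overridden special methods such as __init__ are filtered out because the empty class exposes them too, while ordinary methods would be kept; filter on callable() as well if only data attributes are wanted.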

Double underscores on both ends of 'special attributes' have been a part of python before 2.0. It would be very unlikely that they would change that any time in the near future.
class Foo(object):
    a = 1
    b = 2
def get_attrs(klass):
    return [k for k in klass.__dict__.keys()
            if not k.startswith('__')
            and not k.endswith('__')]
print(get_attrs(Foo))
['a', 'b']

Thanks aaronasterling, you gave me the expression I needed :-)
My final class attribute inspector function looks like this:
def get_user_attributes(cls,exclude_methods=True):
    base_attrs = dir(type('dummy', (object,), {}))
    this_cls_attrs = dir(cls)
    res = []
    for attr in this_cls_attrs:
        if base_attrs.count(attr) or (callable(getattr(cls,attr)) and exclude_methods):
            continue
        res += [attr]
    return res
Either return class attribute variables only (exclude_methods=True) or also retrieve the methods.
My initial tests of the above function show that it supports both old and new-style Python classes.
/ Jakob

If you use new style classes, could you simply subtract the attributes of the parent class?
class A(object):
    a = 10
    b = 20
    #...
def get_attrs(Foo):
    return [k for k in dir(Foo) if k not in dir(super(Foo))]
Edit: Not quite. __dict__, __module__ and __weakref__ appear when inheriting from object, but aren't there in object itself. You could special-case these; I doubt they'd change very often.

Sorry for necro-bumping the thread. I'm surprised that there's still no simple function (or a library) to handle such common usage as of 2019.
I'd like to thank aaronasterling for the idea. Actually, set container provides a more straightforward way to express it:
class dummy: pass
def abridged_set_of_user_attributes(obj):
    return set(dir(obj))-set(dir(dummy))
def abridged_list_of_user_attributes(obj):
    return list(abridged_set_of_user_attributes(obj))
The original solution using a list comprehension actually involves two levels of looping, because its for loop is compounded with an in membership test that scans a list each time. Having only one for keyword makes it look like less work than it is.

This worked for me to include user-defined attributes with __ that might be found in cls.__dict__
import inspect
class A:
    __a = True
    def __init__(self, _a, b, c):
        self._a = _a
        self.b = b
        self.c = c
    def test(self):
        return False
cls = A(1, 2, 3)
members = inspect.getmembers(cls, predicate=lambda x: not inspect.ismethod(x))
attrs = set(dict(members).keys()).intersection(set(cls.__dict__.keys()))
__attrs = {m[0] for m in members if m[0].startswith(f'_{cls.__class__.__name__}')}
attrs.update(__attrs)
This will correctly yield: {'_A__a', '_a', 'b', 'c'}
You can strip the class-name prefix (built from cls.__class__.__name__) from the mangled names if you wish.

Related

How to tell a class method which variable should be processed

Sorry for the noob question. Let's say I have a class which holds 3 lists and a method to combine one of the lists into a string. How can I tell the method which list it should take? Or should I move the method out of the class into a function?
Here is what I mean:
class images():
    def __init__(self, lista, listb, listc):
        self.lista=lista
        self.listb=listb
        self.listc=listc
    def makelist(self):
        items = ""
        for item in self.whateverListIWant:
            items=items+item
        return items
test=images([1,2,3],["b"],["c"])
print (test.makelist(WhichListIWant))
You can do this
class images():
    def __init__(self, lista, listb, listc):
        self.lista=lista
        self.listb=listb
        self.listc=listc
    def makelist(self, param):
        items = ""
        for item in param:
            items= items + str(item)
        return items
test=images([1,2,3],["b"],["c"])
print (test.makelist(test.lista))
Be aware that a "class method" (indicated by the decorator @classmethod) is not what you have in your example. Yours is a standard object method that acts on the object you create, as highlighted by your use of "self" as the first parameter.
The object method acts on the object, and can refer to the data contained within self. A "class method" would act only on a class, and only use parameters available to that class.
Use of a class method would be something like this:
class ClassyImages():
    lista = []
    listb = []
    listc = []
    @classmethod
    def makelist(cls, which):
        if which == 'a':
            the_list = cls.lista
        elif which == 'b':
            the_list = cls.listb
        elif which == 'c':
            the_list = cls.listc
        else:
            raise ValueError("Invalid value " + str(which))
        return the_list[:] # returns a shallow copy of the list
And you would use it as follows:
ClassyImages.lista.extend([1,2,3])
ClassyImages.listb.append("b")
ClassyImages.listc.append("c")
print(ClassyImages.makelist("a"))
See the difference? In your example, you're using methods on the instance of an object, which you create. In this case, we're only using class-level variables, and never using an instance of an object.
However, the nice feature of the @classmethod decorator is that you can still use the class method on an object. So you can also do this:
my_classy_object_1 = ClassyImages()
my_classy_object_2 = ClassyImages()
print(my_classy_object_1.makelist("b")) # prints: ['b']
ClassyImages.listb.append("bb")
print(my_classy_object_1.makelist("b")) # prints: ['b', 'bb']
print(my_classy_object_2.makelist("b")) # prints: ['b', 'bb']
My answer glossed over what you may really be asking - how to tell it which list to look at. You can inject the list into the argument, as suggested by sjaymj64. Or you can pass a string or a number into the function, similar to how I did for the class-level makelist. It's largely up to you - provide whatever way seems most fitting and convenient for letting the logic of makelist choose which of its components to look at.
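To make the "pass a string" option concrete for the instance-based version, here is a minimal sketch. It assumes the three lists are stored in a dict keyed by a short label; the Images class name, the lists attribute and the 'a'/'b'/'c' labels are choices made for this illustration, not part of the original code.

```python
class Images:
    def __init__(self, lista, listb, listc):
        # Keep the lists in a dict so a label can select one of them.
        self.lists = {'a': lista, 'b': listb, 'c': listc}

    def makelist(self, which):
        """Join the items of the list selected by `which` into one string."""
        try:
            chosen = self.lists[which]
        except KeyError:
            raise ValueError("Invalid value " + str(which)) from None
        return "".join(str(item) for item in chosen)


test = Images([1, 2, 3], ["b"], ["c"])
print(test.makelist('a'))  # prints 123
```

The caller then never touches the attributes directly, and an unknown label fails loudly instead of silently picking the wrong list.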

Initiate subclasses from parent class

Suppose I have a list of inputs that will generate O objects, of the following form:
inps = [['A', 5], ['B', 2]]
and O has subclasses A and B. A and B are each instantiated with a single integer (5 or 2 in the example above) and have a method update(self, t), so I believe it makes sense to group them under an O superclass. I could complete the program with a loop:
Os = []
for inp in inps:
    if inp[0] == 'A':
        Os.append(A(inp[1]))
    elif inp[0] == 'B':
        Os.append(B(inp[1]))
and then at runtime,
for O in Os: O.update(t)
I'm wondering, however, if there is a more object oriented way to accomplish this. One way, I suppose, might be to make a fake "O constructor" outside of the O class:
def initO(inp):
    if inp[0] == 'A':
        return A(inp[1])
    elif inp[0] == 'B':
        return B(inp[1])
Os = [initO(inp) for inp in inps]
This is more elegant, in my opinion, and for all intents and purposes gives me the result I want; but it feels like a complete abuse of the class system in Python. Is there a better way to do this, perhaps by instantiating A and B from the O constructor?
EDIT: The ideal would be to be able to use
Os = [O(inp) for inp in inps]
while maintaining O as a superclass of A and B.
You could use a dict to map the names to the actual classes:
dct = {'A': A, 'B': B}
[dct[name](argument) for name, argument in inps]
Or if you don't want the list-comprehension:
dct = {'A': A, 'B': B}
Os = []
for inp in inps:
    cls = dct[inp[0]]
    Os.append(cls(inp[1]))
Although it is technically possible to perform call by name in Python, I strongly advise against it. The cleanest way is probably using a dictionary:
trans = { 'A' : A, 'B' : B }
def initO(inp):
    cons = trans.get(inp[0])
    if cons is not None:
        return cons(*inp[1:])
So here trans is a dictionary that maps names to classes (and thus to the corresponding constructors).
In the initO we perform a lookup, if the lookup succeeds, we call the constructor cons with the remaining arguments of inp.
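One caveat with the lookup above is that an unknown name silently yields None. A small self-contained sketch of the same dictionary dispatch that raises instead is shown below; the class bodies, the REGISTRY name and the init_o spelling are placeholders invented for this example.

```python
class O:
    def __init__(self, value):
        self.value = value

    def update(self, t):
        raise NotImplementedError


class A(O):
    def update(self, t):
        return self.value + t


class B(O):
    def update(self, t):
        return self.value * t


# Maps the names used in the input to the classes to instantiate.
REGISTRY = {'A': A, 'B': B}


def init_o(inp):
    name, *args = inp
    try:
        cons = REGISTRY[name]
    except KeyError:
        raise ValueError('unknown class name: %r' % (name,)) from None
    return cons(*args)


inps = [['A', 5], ['B', 2]]
Os = [init_o(inp) for inp in inps]
print([o.update(10) for o in Os])  # prints [15, 20]
```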
In case you really want to create a (direct) subclass from within a parent class you could use the special __subclasses__ method:
class O(object):
    def __init__(self, integer):
        self.value = integer
    @classmethod
    def get_subclass(cls, subclassname, value):
        # probably not a really good name for that method - I'm out of creativity...
        subcls = next(sub for sub in cls.__subclasses__() if sub.__name__ == subclassname)
        return subcls(value)
    def __repr__(self):
        return '{self.__class__.__name__}({self.value})'.format(self=self)
class A(O):
    pass
class B(O):
    pass
This acts like a factory:
>>> O.get_subclass('A', 1)
A(1)
Or as list-comprehension:
>>> [O.get_subclass(*inp) for inp in inps]
In case you want to optimize it and you know that you won't add subclasses while the program is running, you could put the subclasses in a dictionary that maps from __name__ to the subclass:
class O(object):
    __subs = {}
    def __init__(self, integer):
        self.value = integer
    @classmethod
    def get_subclass(cls, subclassname, value):
        if not cls.__subs:
            cls.__subs = {sub.__name__: sub for sub in cls.__subclasses__()}
        return cls.__subs[subclassname](value)
You could probably also use __new__ to implement that behavior or a metaclass but I think a classmethod may be more appropriate here because it's easy to understand and allows for more flexibility.
In case you not only want direct subclasses you might want to check this recipe to find even subclasses of your subclasses (I also implemented it in a 3rd party extension package of mine: iteration_utilities.itersubclasses).
Without knowing more about your A and B, it's hard to say. But this looks like a classic case for a switch in a language like C. Python doesn't have a switch statement, so a dict or dict-like construct is used instead.
If you're sure your inputs are clean, you can directly get your classes using the globals() function:
Os = [globals()[f](x) for (f, x) in inps]
If you want to sanitize, you can do something like this:
allowed = {'A', 'B'}
Os = [globals()[f](x) for (f, x) in inps if f in allowed]
This solution can also be changed if you prefer to have a fixed dictionary and sanitized inputs:
allowed = {'A', 'B'}
classname_to_class = {k: v for (k, v) in globals().items() if k in allowed}
# Now, you can have a dict mapping class names to classes without writing 'A': A, 'B': B ...
Alternately, if you can prefix all your class definitions, you could even do something like this:
classname_to_class = {k[13:]: v for (k, v) in globals().items() if k.startswith('SpecialPrefix')} # 13 is the length of 'SpecialPrefix'
This solution allows you to just name your classes with a prefix and have the dictionary automatically populate (after stripping out the special prefix if you so choose). These dictionaries are equivalent to trans and dct in the other solutions posted here, except without having to manually generate the dictionary.
Unlike the other solutions posted so far, these reduce the likelihood of a transcription error (and the amount of boilerplate code required) in cases where you have a lot more classes than A and B.
At the risk of drawing more negative fire... we can use metaclasses. This may or may not be suitable for your particular application. Every time you define a subclass of class O, you always have an up-to-date list (well, dict) of O's subclasses. Oh, and this is written for Python 2 (but can be ported to Python 3).
class OMetaclass(type):
    '''This metaclass adds a 'subclasses' attribute to its classes that
    maps subclass name to the class object.'''
    def __init__(cls, name, bases, dct):
        if not hasattr(cls, 'subclasses'):
            cls.subclasses = {}
        else:
            cls.subclasses[name] = cls
        super(OMetaclass, cls).__init__(name, bases, dct)
class O(object):
    __metaclass__ = OMetaclass # Python 2 metaclass syntax
### Now, define the rest of your subclasses of O as usual.
class A(O):
    def __init__(self, x): pass
class B(O):
    def __init__(self, x): pass
Now, you have a dictionary, O.subclasses, that contains all the subclasses of O. You can now just do this:
Os = [O.subclasses[cls](arg) for (cls, arg) in inps]
Now, you don't have to worry about weird prefixes for your classes and you won't need to change your code if you're subclassing O already, but you've introduced magic (metaclasses) that may make your program harder to grok.
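Since the answer notes that the code can be ported to Python 3, here is a hedged sketch of what such a port could look like. The only substantive change is that the metaclass is passed as a keyword in the class statement; the x attribute stored by A and B is added purely so the result can be inspected.

```python
class OMetaclass(type):
    """Adds a 'subclasses' dict that maps subclass name to class object."""

    def __init__(cls, name, bases, dct):
        if not hasattr(cls, 'subclasses'):
            cls.subclasses = {}           # first class created: the base O
        else:
            cls.subclasses[name] = cls    # every later class registers itself
        super().__init__(name, bases, dct)


class O(metaclass=OMetaclass):  # Python 3 spelling of __metaclass__
    pass


class A(O):
    def __init__(self, x):
        self.x = x


class B(O):
    def __init__(self, x):
        self.x = x


inps = [['A', 5], ['B', 2]]
Os = [O.subclasses[cls](arg) for (cls, arg) in inps]
print([(type(o).__name__, o.x) for o in Os])  # prints [('A', 5), ('B', 2)]
```

On Python 3.6 and later, the __init_subclass__ hook on O could fill the same registry without a metaclass; that is an alternative technique, not what the answer describes.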

Python assignment to self in constructor does not make object the same

I am making a constructor in Python. When called with an existing object as its input, it should set the "new" object to that same object. Here is a 10 line demonstration:
class A:
    def __init__(self, value):
        if isinstance(value, A):
            self = value
        else:
            self.attribute = value
a = A(1)
b = A(a) # a and b should be references to the same object
print("b is a", b is a) # this should be true: the identities should be the same
print("b == a", b == a) # this should be true: the values should be the same
I want the object A(a) constructed from the existing object a to be a. Why is it not? To be clear, I want A(a) to reference the same object as a, NOT a copy.
self, like any other argument, is among the local variables of a function or method. Assignment to the bare name of a local variable never affects anything outside of that function or method, it just locally rebinds that name.
As a comment rightly suggests, it's unclear why you wouldn't just do
b = a
Assuming you have a sound reason, what you need to override is not __init__, but rather __new__ (then take some precaution in __init__ to avoid double initialization). It's not an obvious course so I'll wait for you to explain what exactly you're trying to accomplish.
Added: having clarified the need I agree with the OP that a factory function (ideally, I suggest, as a class method) is better -- and clearer than __new__, which would work (it is a class method after all) but in a less-sharply-clear way.
So, I would code as follows:
class A(object):
    @classmethod
    def make(cls, value):
        if isinstance(value, cls): return value
        return cls(value)
    def __init__(self, value):
        self.attribute = value
Now,
a = A.make(1)
b = A.make(a)
accomplishes the OP's desires, polymorphically over the type of argument passed to A.make.
The only way to make it work exactly as you have it is to implement __new__, the constructor, rather than __init__, the initialiser (the behaviour can get rather complex if both are implemented). It would also be wise to implement __eq__ for equality comparison, although this will fall back to identity comparison. For example:
>>> class A(object):
...     def __new__(cls, value):
...         if isinstance(value, cls):
...             return value
...         inst = super(A, cls).__new__(cls)
...         inst.attribute = value
...         return inst
...     def __eq__(self, other):
...         return self.attribute == other.attribute
...
>>> a = A(1)
>>> b = A(a)
>>> a is b
True
>>> a == b
True
>>> a == A(1)
True # also equal to other instance with same attribute value
You should have a look at the data model documentation, which explains the various "magic methods" available and what they do. See e.g. __new__.
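To illustrate the double-initialization concern raised in this thread: when a class defines both __new__ and __init__, Python still calls __init__ on whatever instance of the class __new__ returns, including an existing one. The sketch below (Python 3 syntax) guards against that; the `value is self` check is just one possible guard, chosen for this example.

```python
class A:
    def __new__(cls, value):
        if isinstance(value, cls):
            return value              # hand back the existing object
        return super().__new__(cls)   # otherwise create a fresh instance

    def __init__(self, value):
        # __init__ also runs for A(a), because __new__ returned an A instance.
        if value is self:
            return                    # already initialized, leave it alone
        self.attribute = value


a = A(1)
b = A(a)
print(b is a)        # prints True
print(a.attribute)   # prints 1
```

Without the guard, A(a) would overwrite a.attribute with a itself.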
__init__ is an initializer, not a constructor. You would have to mess around with __new__ to do what you want, and it's probably not a good idea to go there.
Try
a = b = A(1)
instead.
If you call a constructor, it's going to create a new object. The simplest thing is to do what hacatu suggested and simply assign b to a's value. If not, perhaps you could have an if statement checking if the value passed in is equal to the object you want referenced and if it is, simply return that item before ever calling the constructor. I haven't tested so I'm not sure if it'd work.

str.format() with lazy dict?

I want to use str.format() and pass it a custom lazy dictionary.
str.format() should only access the key in the lazy dict it needs.
Is this possible?
Which interface needs to be implemented by the lazy_dict?
Update
This is not what I want:
'{0[a]}'.format(d)
I need something like this:
'{a}'.format(**d)
Need to run on Python2.7
When you write '{a}'.format(**d), the **d part transforms the "lazy" dict into a regular one. This is where all keys get accessed, and format() can't do anything about it.
You could craft some proxy objects to put in place of the elements, which do the "real" work only when they are converted to a string.
Something like
class LazyProxy(object):
    def __init__(self, prx):
        self.prx = prx
    def __format__(self, fmtspec):
        return format(self.prx(), fmtspec)
    def __repr__(self):
        return repr(self.prx())
    def __str__(self):
        return str(self.prx())
You can put these elements into a dict, such as
interd = {k: LazyProxy(lambda k=k: lazydict[k]) for k in lazydict.iterkeys()} # k=k binds each key at definition time
I didn't test this, but I think this fulfills your needs.
After the last edit, it now works with !r and !s as well.
You can use the __format__ method (available since Python 2.6, not only in Python 3). See the doc here.
If I understand your question correctly, you want to pass a custom dictionary, that would compute values only when needed. First, we're looking for implementation of __getitem__():
>>> class LazyDict(object):
...     def __init__(self, d):
...         self.d = d
...     def __getitem__(self, k):
...         print(k) # <-- tracks the needed keys
...         return self.d[k]
...
>>> d = LazyDict({'a': 19, 'b': 20})
>>> '{0[a]}'.format(d)
a
'19'
This shows that only key 'a' is accessed; 'b' is not, so you already have your lazy access.
But also, any object attribute is usable for str.format this way, and using the @property decorator, you can access function results:
class MyObject(object):
    def __init__(self):
        self.a = 19
        self.b = 20
    def __getitem__(self, var):
        return getattr(self, var)
    # this lets you access any attribute of your instance,
    # or even the result of a function if it is decorated with @property:
    @property
    def c(self):
        return 21
Example of usage:
>>> m = MyObject()
>>> '{0[c]}'.format(m)
'21'
But note that this also works; it makes the format string a little more specific, but avoids the need to implement __getitem__().
>>> '{0.c}'.format(m)
'21'
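For the '{a}' form the question asks about, one option that avoids the ** unpacking is string.Formatter.vformat from the standard library, which looks up only the keys that the format string names. The sketch below is written in Python 3 syntax, although vformat is also present in Python 2.7; the LazyDict class, its factories argument and the accessed list are made up for this illustration.

```python
import string


class LazyDict:
    """Hypothetical lazy mapping: each value is computed on first lookup."""

    def __init__(self, factories):
        self.factories = factories   # maps key -> zero-argument callable
        self.accessed = []           # records which keys were requested

    def __getitem__(self, key):
        self.accessed.append(key)
        return self.factories[key]()


lazy = LazyDict({'a': lambda: 19, 'b': lambda: 20})

# vformat takes the positional args and the mapping directly, so the
# mapping is never copied into a regular dict.
result = string.Formatter().vformat('{a}', (), lazy)
print(result)          # prints 19
print(lazy.accessed)   # prints ['a']
```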

How do I get the string representation of a variable in python?

I have a variable x in Python. How can I find the string 'x' from the variable? Here is my attempt:
def var(v,c):
    for key in c.keys():
        if c[key] == v:
            return key
def f():
    x = '321'
    print('Local var %s = %s' % (var(x,locals()),x))
x = '123'
print('Global var %s = %s' % (var(x,locals()),x))
f()
The results are:
Global var x = 123
Local var x = 321
The above recipe seems a bit un-pythonesque. Is there a better/shorter way to achieve the same result?
Q: I have a variable x in python. How can i find the string 'x' from the variable.
A: If I am understanding your question properly, you want to go from the value of a variable to its name. This is not really possible in Python.
In Python, there really isn't any such thing as a "variable". What Python really has are "names" which can have objects bound to them. It makes no difference to the object what names, if any, it might be bound to. It might be bound to dozens of different names, or none.
Consider this example:
foo = 1
bar = foo
baz = foo
Now, suppose you have the integer object with value 1, and you want to work backwards and find its name. What would you print? Three different names have that object bound to them, and all are equally valid.
print(bar is foo) # prints True
print(baz is foo) # prints True
In Python, a name is a way to access an object, so there is no way to work with names directly. You might be able to search through locals() to find the value and recover a name, but that is at best a parlor trick. And in my above example, which of foo, bar, and baz is the "correct" answer? They all refer to exactly the same object.
P.S. The above is a somewhat edited version of an answer I wrote before. I think I did a better job of wording things this time.
I believe the general form of what you want is repr() or the __repr__() method of an object.
With regard to __repr__():
Called by the repr() built-in function and by string conversions (reverse quotes) to compute the “official” string representation of an object.
See the docs here: object.__repr__(self)
stevenha has a great answer to this question. But, if you actually do want to poke around in the namespace dictionaries anyway, you can get all the names for a given value in a particular scope / namespace like this:
def foo1():
    x = 5
    y = 4
    z = x
    print(names_of1(x, locals()))
def names_of1(var, callers_namespace):
    return [name for (name, value) in callers_namespace.items() if var is value]
foo1() # prints ['x', 'z']
If you're working with a Python that has stack frame support (most do, CPython does), it isn't required that you pass the locals dict into the names_of function; the function can retrieve that dictionary from its caller's frame itself:
def foo2():
    xx = object()
    yy = object()
    zz = xx
    print(names_of2(xx))
def names_of2(var):
    import inspect
    callers_namespace = inspect.currentframe().f_back.f_locals
    return [name for (name, value) in callers_namespace.items() if var is value]
foo2() # ['xx', 'zz']
If you're working with a value type that you can assign a name attribute to, you can give it a name, and then use that:
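class SomeClass(object):
    pass
obj = SomeClass()
obj.name = 'obj'
class NamedInt(int):
    pass # no __slots__: Python 3 rejects non-empty __slots__ on int subclasses, and a plain subclass accepts new attributes anyway
x = NamedInt(321)
x.name = 'x'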
Finally, if you're working with class attributes and you want them to know their names (descriptors are the obvious use case), you can do cool tricks with metaclass programming like they do in the Django ORM and SQLAlchemy declarative-style table definitions:
class AutonamingType(type):
    def __init__(cls, name, bases, attrs):
        for (attrname, attrvalue) in attrs.items():
            if getattr(attrvalue, '__autoname__', False):
                attrvalue.name = attrname
        super(AutonamingType,cls).__init__(name, bases, attrs)
class NamedDescriptor(object):
    __autoname__ = True
    name = None
    def __get__(self, instance, instance_type):
        return self.name
class Foo(object):
    __metaclass__ = AutonamingType # Python 2 metaclass syntax
    bar = NamedDescriptor()
    baaz = NamedDescriptor()
lilfoo = Foo()
print(lilfoo.bar) # prints 'bar'
print(lilfoo.baaz) # prints 'baaz'
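On Python 3.6 and later, a descriptor can learn its own name without a metaclass through the __set_name__ hook, which is called when the owning class is created. This is a different mechanism from the metaclass shown above; the sketch reuses the same class names only for comparison.

```python
class NamedDescriptor:
    name = None

    def __set_name__(self, owner, name):
        # Called automatically with the attribute name used in the class body.
        self.name = name

    def __get__(self, instance, owner=None):
        return self.name


class Foo:
    bar = NamedDescriptor()
    baaz = NamedDescriptor()


lilfoo = Foo()
print(lilfoo.bar)   # prints bar
print(lilfoo.baaz)  # prints baaz
```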
There are three ways to get "the" string representation of an object in python:
1: str()
>>> foo={"a":"z","b":"y"}
>>> str(foo)
"{'a': 'z', 'b': 'y'}"
2: repr()
>>> foo={"a":"z","b":"y"}
>>> repr(foo)
"{'a': 'z', 'b': 'y'}"
3: string interpolation:
>>> foo={"a":"z","b":"y"}
>>> "%s" % (foo,)
"{'a': 'z', 'b': 'y'}"
In this case all three methods generated the same output. The difference is that str() calls dict.__str__(), while repr() calls dict.__repr__(). str() is used in string interpolation, while repr() is used by Python internally on each object in a list or dict when you print the list or dict.
As Tendayi Mawushe mentions above, the string produced by repr isn't necessarily human-readable.
Also, the default implementation of .__str__() is to call .__repr__(), so if the class does not have its own override of .__str__(), the value returned from .__repr__() is used.
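A small sketch of that fallback, using a made-up Point class that defines only __repr__:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return 'Point(%r, %r)' % (self.x, self.y)


p = Point(1, 2)
print(str(p))        # prints Point(1, 2), because __str__ falls back to __repr__
print('%s' % (p,))   # prints Point(1, 2), interpolation with %s uses str()
print([p])           # prints [Point(1, 2)], containers use repr() on their items
```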
