How to access call parameters and arguments with __getattr__ - python

I have the following code. Most of it may look awkward, confusing and/or circumstantial, but it is there to demonstrate the parts of a much larger code base where I have a problem. Please read it carefully.
# The following part is just to demonstrate the behavior AND CANNOT BE CHANGED UNDER ANY CIRCUMSTANCES
# Just define something so you can access something like derived.obj.foo(x)
class Basic(object):
    def foo(self, x=10):
        return x*x

class Derived(object):
    def info(self, x):
        return "Info of Derived: "+str(x)
    def set(self, obj):
        self.obj = obj
# The following piece of code might be changed, but I would rather not
class DeviceProxy(object):
    def __init__(self):
        # just to set up something that somewhat behaves as the real code in question
        self.proxy = Derived()
        self.proxy.set(Basic())

    # crucial part: I want any attribute access forwarded to the proxy object here,
    # without knowing beforehand what the names will be
    def __getattr__(self, attr):
        return getattr(self.proxy, attr)
# ======================================
# This is the only code I want to change to get things to work
# Original __getattr__ function
original = DeviceProxy.__getattr__

# wrapper for the __getattr__ function to log/print out any attribute/parameter/argument/...
def mygetattr(device, key):
    attr = original(device, key)
    if callable(attr):
        def wrapper(*args, **kw):
            print('%r called with %r and %r' % (attr, args, kw))
            return attr(*args, **kw)
        return wrapper
    else:
        print "not callable: ", attr
        return attr

DeviceProxy.__getattr__ = mygetattr
# make an instance of the DeviceProxy class and call the double-dotted function
dev = DeviceProxy()
print dev.info(1)
print dev.obj.foo(3)
What I want is to catch all method calls on DeviceProxy so that I can print all arguments/parameters and so on. In the given example, this works great when calling info(1): all of the information is printed.
But when I call the double-dotted function dev.obj.foo(3), I only get the message that the attribute is not a callable.
How can I modify the above code so I also get my information in the second case? Only the code below the === can be modified.

You have just a __getattr__ on dev and you want, from within this __getattr__, to have access to foo when you do dev.obj.foo. This isn't possible. The attribute accesses are not a "dotted function" that is accessed as a whole. The sequence of attribute accesses (the dots) is evaluated one at a time, left to right. At the time that you access dev.obj, there is no way to know that you will later access foo. The method dev.__getattr__ only knows what attributes you are accessing on dev, not what attributes of that result you may later access.
The only way to achieve what you want would be to also include some wrapping behavior in obj. You say you can't modify the "Basic"/"Derived" classes, so you can't do it that way. You could, in theory, have DeviceProxy.__getattr__ not return the actual value of the accessed attribute, but instead wrap that object in another proxy and return the proxy. However, that can get tricky and make your code harder to understand and debug, since you can wind up with tons of objects wrapped in thin proxies.
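For completeness, here is a sketch of that recursive-proxy idea. It is not a drop-in for the question's code: every non-callable attribute comes back wrapped in another proxy, which can be surprising for primitive values. The Basic/Derived classes below mirror the ones in the question.

```python
# Sketch of the recursive-proxy idea: every attribute fetched through the
# proxy is itself wrapped, so calls at any dotted depth get logged.
class LoggingProxy(object):
    def __init__(self, target):
        self._target = target  # found via normal lookup, so __getattr__ is not triggered

    def __getattr__(self, name):
        attr = getattr(self._target, name)
        if callable(attr):
            def wrapper(*args, **kw):
                print('%r called with %r and %r' % (name, args, kw))
                return attr(*args, **kw)
            return wrapper
        # wrap plain objects so deeper dotted access is also intercepted
        return LoggingProxy(attr)

# Demo classes mirroring the question's setup
class Basic(object):
    def foo(self, x=10):
        return x * x

class Derived(object):
    def info(self, x):
        return "Info of Derived: " + str(x)
    def set(self, obj):
        self.obj = obj

d = Derived()
d.set(Basic())
dev = LoggingProxy(d)
print(dev.info(1))     # single dot: logged
print(dev.obj.foo(3))  # double dot: obj is wrapped in another proxy, foo is logged
```

The trade-off is exactly the one described above: attribute values leak out as proxies, so code that expects the raw object (for example, `isinstance` checks) may break.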

Related

maximum recursion depth exceeded when using a class descriptor with get and set

I have been playing with my code and I ended up with this:
class TestProperty:
    def __init__(self, func):
        self.func = func
    def __set__(self, instance, value):
        setattr(instance, self.func.__name__, value)
    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            return getattr(instance, self.func.__name__)

class John(object):
    def __init__(self):
        pass

    @TestProperty
    def TestProp(self):
        print('This line won\'t be printed')

p = John()
p.TestProp = 99
print(p.TestProp)
I'm trying to understand the behavior of class descriptors when using them on methods instead of attributes. I am having a hard time understanding what's going on underneath, and it would be really nice if someone could shed some light on how this ends up as a recursion error.
My initial guess is something like this:
Method that is decorated with the descriptor is called
Calls either __set__ or __get__ depending on how we accessed it.
The descriptor attempts to set/get the value on the instance, which triggers the descriptor again, mapping it back to step 1 (error).
Can anyone explain to me in great detail how did this happen and how do I resolve this?
The code provided serves no purpose other than understanding the behavior of class descriptor.
Don't use getattr() and setattr(); you are triggering the descriptor again there! The descriptor handles all access to the TestProp name, using setattr() and getattr() just goes through the same path as p.TestProp would.
Set the attribute value directly in the instance.__dict__:
def __get__(self, instance, cls):
    if instance is None:
        return self
    try:
        return instance.__dict__[self.func.__name__]
    except KeyError:
        raise AttributeError(self.func.__name__)

def __set__(self, instance, value):
    instance.__dict__[self.func.__name__] = value
This works because you have a data descriptor; a data descriptor takes precedence over instance attributes. Access to p.TestProp continues to use the descriptor object on the class even though the name 'TestProp' exists in instance __dict__.
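Putting the corrected __get__/__set__ back into the question's classes gives this complete runnable version (a sketch reusing the question's names):

```python
# The data descriptor still intercepts every p.TestProp access, but it reads
# and writes the underlying instance.__dict__ slot directly instead of
# re-triggering itself via getattr()/setattr().
class TestProperty(object):
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, cls):
        if instance is None:
            return self
        try:
            return instance.__dict__[self.func.__name__]
        except KeyError:
            raise AttributeError(self.func.__name__)

    def __set__(self, instance, value):
        instance.__dict__[self.func.__name__] = value

class John(object):
    @TestProperty
    def TestProp(self):
        pass

p = John()
p.TestProp = 99   # goes through __set__, writes p.__dict__['TestProp']
print(p.TestProp) # goes through __get__, reads the slot -> 99
```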

How to replace/bypass a class property?

I would like to have a class with an attribute attr that, when accessed for the first time, runs a function and returns a value, and then becomes this value (its type changes, etc.).
A similar behavior can be obtained with:
class MyClass(object):
    @property
    def attr(self):
        try:
            return self._cached_result
        except AttributeError:
            result = ...
            self._cached_result = result
            return result

obj = MyClass()
print obj.attr # First calculation
print obj.attr # Cached result is used
However, .attr does not become the initial result when doing this. It would be more efficient if it did.
A difficulty is that after obj.attr is set up as a property, it cannot easily be set to something else, because infinite loops appear naturally. Thus, in the code above, the obj.attr property has no setter, so it cannot be directly modified. If a setter is defined, then replacing obj.attr inside this setter creates an infinite loop (the setter is accessed from within the setter). I also thought of first deleting the property with del self.attr so as to be able to do a regular self.attr = …, but this calls the property deleter (if any), which recreates the infinite-loop problem (modifications of self.attr anywhere generally go through the property rules).
So, is there a way to bypass the property mechanism and replace the bound property obj.attr by anything, from within MyClass.attr.__getter__?
This looks a bit like premature optimization: you want to skip a method call by making a descriptor change itself.
It's perfectly possible, but it would have to be justified.
To modify the descriptor from your property, you'd have to edit your class, which is probably not what you want.
I think a better way to implement this would be to:
not define obj.attr
override __getattr__: if the argument is "attr", set obj.attr = new_value; otherwise raise AttributeError
As soon as obj.attr is set, __getattr__ will not be called any more, as it is only called when the attribute does not exist. (__getattribute__ is the one that would get called all the time.)
The main difference from your initial proposal is that the first attribute access is slower, because of the method-call overhead of __getattr__, but after that it will be as fast as a regular __dict__ lookup.
Example:
class MyClass(object):
    def __getattr__(self, name):
        if name == 'attr':
            self.attr = ...
            return self.attr
        raise AttributeError(name)

obj = MyClass()
print obj.attr # First calculation
print obj.attr # Cached result is used
EDIT : Please see the other answer, especially if you use Python 3.6 or more.
For new-style classes, which utilize the descriptor protocol, you could do this by creating your own custom descriptor class whose __get__() method will be called at most one time. When that happens, the result is then cached by creating an instance attribute with the same name the class method has.
Here's what I mean.
from __future__ import print_function

class cached_property(object):
    """Descriptor class for making class methods lazily evaluated, caching the result."""
    def __init__(self, func):
        self.func = func

    def __get__(self, inst, cls):
        if inst is None:
            return self
        else:
            value = self.func(inst)
            setattr(inst, self.func.__name__, value)
            return value

class MyClass(object):
    @cached_property
    def attr(self):
        print('doing long calculation...', end='')
        result = 42
        return result
obj = MyClass()
print(obj.attr) # -> doing long calculation...42
print(obj.attr) # -> 42
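Why the second access is fast: cached_property defines no __set__, so it is a non-data descriptor, and the instance attribute created by setattr() shadows it on every later lookup. A self-contained sketch to observe this (Python 3; since 3.8 the standard library also ships functools.cached_property with essentially this behavior):

```python
# A non-data descriptor (no __set__) loses to an instance attribute of the
# same name, so after the first access Python finds the cached value in
# obj.__dict__ and never calls __get__ again.
class cached_property(object):
    def __init__(self, func):
        self.func = func

    def __get__(self, inst, cls):
        if inst is None:
            return self
        value = self.func(inst)
        setattr(inst, self.func.__name__, value)  # shadows the descriptor
        return value

class MyClass(object):
    @cached_property
    def attr(self):
        print('computing...')
        return 42

obj = MyClass()
print('attr' in obj.__dict__)  # False: nothing cached yet
obj.attr                       # prints 'computing...' exactly once
print('attr' in obj.__dict__)  # True: now a plain instance attribute
print(obj.attr)                # 42, no recomputation
```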

How to detect if a parameter is not used / not essential for a particular use case?

I have some working code (a library) of which, in some situations, I only need a small subset of the functionality.
Thinking of a simpler case, the code (library) is a class that takes a few parameters when initializing.
For my limited use case, many of those parameters are not vital, as they are not directly used in the internal calculation (some parameters are only used when I call particular methods of the object), while it is very hard to prepare those parameters properly.
So I am wondering if there is any easy way to know which parameters are essential without fully analyzing the library code (which is too complicated). For example, I might pass fake parameters to the API, and it would raise an exception only if they are actually used.
For example, I can pass in some_parameter = None for a some_parameter that I guess won't be used. Then whenever the library tries to access some_parameter.some_field, an exception is raised, so I can look further into the issue and replace it with the actual parameter. However, this would change the behavior of the library if the code itself accepts None as a parameter.
Is there any established approach to this problem? I don't mind false positives, as I can always look into the problem and manually check whether the library's use of the fake parameter is trivial.
For those suggestions on reading documentation and code, I don't have documentations! And the code is legacy code left by previous developers.
Update
@sapi:
Yes, I would like to use the proxy pattern/object; I will investigate this topic further.
"A virtual proxy is a placeholder for "expensive to create" objects. The real object is only created when a client first requests/accesses the object."
I am assuming all classes in question are new-style. This is always the case if you are using Python 3; in Python 2, they must extend from object. You can check a class with isinstance(MyClass, type). For the remainder of my answer, I will assume Python 3, since it was not specified. If you are using Python 2, make sure to extend from object where no other base class is specified.
If those conditions hold, you can write a descriptor that raises an exception whenever it is accessed:
class ParameterUsed(Exception):
    pass

class UsageDescriptor:
    def __init__(self, name):
        super(UsageDescriptor, self).__init__()
        self.name = name

    def __get__(self, instance, owner):
        raise ParameterUsed(self.name)

    def __set__(self, instance, value):
        # Ignore sets if the value is None.
        if value is not None:
            raise ParameterUsed(self.name)

    def __delete__(self, instance):
        # Ignore deletes.
        pass
I will assume we are using this class as an example:
class Example:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def use_a(self):
        print(self.a)

    def use_b(self):
        print(self.b)
If we want to see if a is used anywhere, extend the class and put an instance of our descriptor on the class:
class ExtExample(Example):
    a = UsageDescriptor('a')
Now if we were to try to use the class, we can see which methods use a:
>>> example = ExtExample(None, None)
>>> example.use_a()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ParameterUsed: a
>>> example.use_b()
None
Here, we can see that use_a tried to use a (raising an exception because it did), but use_b did not (it completed successfully).
This approach works more generally than sapi’s does: in particular, sapi’s approach will only detect an attribute being accessed on the object. But there are plenty of things you can do that do not access attributes on that object. This approach, rather than detecting attributes being accessed on that object, detects the object itself being accessed.
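A small sketch of that difference (the class and attribute names here are invented for illustration): the proxy from sapi's answer stays silent when the parameter is merely compared or passed around, while any attribute access on it raises.

```python
class ObjectUsedException(Exception):
    pass

class ErrorOnUseProxy(object):
    def __getattr__(self, name):
        raise ObjectUsedException('Tried to access %s' % name)

class Example(object):
    def __init__(self, b):
        self.b = b

    def passes_b_around(self):
        # "uses" b without touching any attribute on it: the proxy stays silent
        return self.b is None

    def reads_b_attribute(self):
        # attribute access on the proxy raises (some_field is an invented name)
        return self.b.some_field

ex = Example(ErrorOnUseProxy())
print(ex.passes_b_around())   # False -- this use goes undetected by the proxy
try:
    ex.reads_b_attribute()
except ObjectUsedException as e:
    print(e)                  # Tried to access some_field
```

The descriptor approach would flag both methods, because it fires on the lookup of the name itself, not on what is later done with the value.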
Depending on what you're looking to achieve, you may be able to pass in a proxy object which throws an exception when accessed.
For example:
class ObjectUsedException(Exception):
    pass

class ErrorOnUseProxy(object):
    def __getattr__(self, name):
        raise ObjectUsedException('Tried to access %s' % name)
Of course, that approach will fail in two pretty common situations:
if the library itself checks if the attribute exists (eg, to provide some default value)
if it's treated as a primitive (float, string etc), though you could modify this approach to take that into account
I believe the simplest and least intrusive way is to turn the parameters into properties:
import sys

class Foo(object):
    def __init__(self):
        pass

    @property
    def a(self):
        print >>sys.stderr, 'Accessing parameter a'
        return 1

bar = Foo()
print bar.a == 1
This will print True to stdout and Accessing parameter a to stderr. You would have to tweak it to allow the class to change the value.

Python - how do I force the use of a factory method to instantiate an object?

I have a set of related classes that all inherit from one base class. I would like to use a factory method to instantiate objects for these classes. I want to do this because then I can store the objects in a dictionary keyed by the class name before returning the object to the caller. Then if there is a request for an object of a particular class, I can check to see whether one already exists in my dictionary. If not, I'll instantiate it and add it to the dictionary. If so, then I'll return the existing object from the dictionary. This will essentially turn all the classes in my module into singletons.
I want to do this because the base class that they all inherit from does some automatic wrapping of the functions in the subclasses, and I don't want the functions to get wrapped more than once, which is what happens currently if two objects of the same class are created.
The only way I can think of doing this is to check the stacktrace in the __init__() method of the base class, which will always be called, and to throw an exception if the stacktrace does not show that the request to make the object is coming from the factory function.
Is this a good idea?
Edit: Here is the source code for my base class. I've been told that I need to figure out metaclasses to accomplish this more elegantly, but this is what I have for now. All Page objects use the same Selenium Webdriver instance, which is in the driver module imported at the top. This driver is very expensive to initialize -- it is initialized the first time a LoginPage is created. After it is initialized the initialize() method will return the existing driver instead of creating a new one. The idea is that the user must begin by creating a LoginPage. There will eventually be dozens of Page classes defined and they will be used by unit testing code to verify that the behavior of a website is correct.
from driver import get_driver, urlpath, initialize
from settings import urlpaths

class DriverPageMismatchException(Exception):
    pass

class URLVerifyingPage(object):
    # we add logic in __init__() to check the expected urlpath for the page
    # against the urlpath that the driver is showing - we only want the page's
    # methods to be invokable if the driver is actually at the appropriate page.
    # If the driver shows a different urlpath than the page is supposed to
    # have, the method should throw a DriverPageMismatchException
    def __init__(self):
        self.driver = get_driver()
        self._adjust_methods(self.__class__)

    def _adjust_methods(self, cls):
        for attr, val in cls.__dict__.iteritems():
            if callable(val) and not attr.startswith("_"):
                print "adjusting: " + str(attr) + " - " + str(val)
                setattr(
                    cls,
                    attr,
                    self._add_wrapper_to_confirm_page_matches_driver(val)
                )
        for base in cls.__bases__:
            if base.__name__ == 'URLVerifyingPage':
                break
            self._adjust_methods(base)

    def _add_wrapper_to_confirm_page_matches_driver(self, page_method):
        def _wrapper(self, *args, **kwargs):
            if urlpath() != urlpaths[self.__class__.__name__]:
                raise DriverPageMismatchException(
                    "path is '" + urlpath() +
                    "' but '" + urlpaths[self.__class__.__name__] + "' expected " +
                    "for " + self.__class__.__name__
                )
            return page_method(self, *args, **kwargs)
        return _wrapper

class LoginPage(URLVerifyingPage):
    def __init__(self, username=username, password=password, baseurl="http://example.com/"):
        self.username = username
        self.password = password
        self.driver = initialize(baseurl)
        super(LoginPage, self).__init__()

    def login(self):
        driver.find_element_by_id("username").clear()
        driver.find_element_by_id("username").send_keys(self.username)
        driver.find_element_by_id("password").clear()
        driver.find_element_by_id("password").send_keys(self.password)
        driver.find_element_by_id("login_button").click()
        return HomePage()

class HomePage(URLVerifyingPage):
    def some_method(self):
        ...
        return SomePage()

    def many_more_methods(self):
        ...
        return ManyMorePages()
It's no big deal if a page gets instantiated a handful of times -- the methods will just get wrapped a handful of times and a handful of unnecessary checks will take place, but everything will still work. But it would be bad if a page was instantiated dozens or hundreds or tens of thousands of times. I could just put a flag in the class definition for each page and check to see if the methods have already been wrapped, but I like the idea of keeping the class definitions pure and clean and shoving all the hocus-pocus into a deep corner of my system where no one can see it and it just works.
In Python, it's almost never worth trying to "force" anything. Whatever you come up with, someone can get around it by monkeypatching your class, copying and editing the source, fooling around with bytecode, etc.
So, just write your factory, and document that as the right way to get an instance of your class, and expect anyone who writes code using your classes to understand TOOWTDI, and not violate it unless she really knows what she's doing and is willing to figure out and deal with the consequences.
If you're just trying to prevent accidents, rather than intentional "misuse", that's a different story. In fact, it's just standard design-by-contract: check the invariant. Of course at this point, SillyBaseClass is already screwed up, and it's too late to repair it, and all you can do is assert, raise, log, or whatever else is appropriate. But that's what you want: it's a logic error in the application, and the only thing to do is get the programmer to fix it, so assert is probably exactly what you want.
So:
class SillyBaseClass:
    singletons = {}

class Foo(SillyBaseClass):
    def __init__(self):
        assert self.__class__ not in SillyBaseClass.singletons

def get_foo():
    if Foo not in SillyBaseClass.singletons:
        SillyBaseClass.singletons[Foo] = Foo()
    return SillyBaseClass.singletons[Foo]
If you really do want to stop things from getting this far, you can check the invariant earlier, in the __new__ method, but unless "SillyBaseClass got screwed up" is equivalent to "launch the nukes", why bother?
It sounds like you want to provide a __new__ implementation, something like:
class MySingletonBase(object):
    instance_cache = {}

    def __new__(cls, arg1, arg2):
        if cls in MySingletonBase.instance_cache:
            return MySingletonBase.instance_cache[cls]
        self = super(MySingletonBase, cls).__new__(cls)
        MySingletonBase.instance_cache[cls] = self
        return self
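A quick check of how such a __new__-based cache behaves (a sketch; the class names are invented, not from the question's code base). One caveat worth knowing for the question's use case: even when __new__ returns the cached instance, Python still calls __init__ on it, so any method wrapping done in __init__ would run again.

```python
class SingletonBase(object):
    instance_cache = {}

    def __new__(cls, *args, **kwargs):
        if cls not in SingletonBase.instance_cache:
            SingletonBase.instance_cache[cls] = super(SingletonBase, cls).__new__(cls)
        return SingletonBase.instance_cache[cls]

class Page(SingletonBase):
    def __init__(self):
        # runs on *every* Page() call, even when the cached instance is returned
        self.init_count = getattr(self, 'init_count', 0) + 1

a = Page()
b = Page()
print(a is b)        # True: one shared instance per class
print(b.init_count)  # 2: __init__ ran twice, so do the wrapping elsewhere
```

To wrap methods exactly once, moving the wrapping into __new__ (guarded by the cache check) or into a metaclass avoids the repeated __init__ problem.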
Rather than adding complex code to catch mistakes at runtime, I'd first try to use convention to guide users of your module to do the right thing on their own.
Give your classes "private" names (prefixed by an underscore), give them names that suggest they shouldn't be instantiated (eg _Internal...) and make your factory function "public".
That is, something like this:
class _InternalSubClassOne(_BaseClass):
    ...

class _InternalSubClassTwo(_BaseClass):
    ...

# An example factory function.
def new_object(arg):
    return _InternalSubClassOne() if arg == 'one' else _InternalSubClassTwo()
I'd also add docstrings or comments to each class, like "Don't instantiate this class by hand, use the factory method new_object."
You can also just nest the classes in the factory method, as described here:
https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Factory.html#preventing-direct-creation
Working example from mentioned source:
# Factory/shapefact1/NestedShapeFactory.py
import random

class Shape(object):
    types = []

def factory(type):
    class Circle(Shape):
        def draw(self): print("Circle.draw")
        def erase(self): print("Circle.erase")

    class Square(Shape):
        def draw(self): print("Square.draw")
        def erase(self): print("Square.erase")

    if type == "Circle": return Circle()
    if type == "Square": return Square()
    assert 0, "Bad shape creation: " + type

def shapeNameGen(n):
    for i in range(n):
        yield factory(random.choice(["Circle", "Square"]))

# Circle() # Not defined
for shape in shapeNameGen(7):
    shape.draw()
    shape.erase()
I'm not a fan of this solution; I just want to add it as one more option.

Can't make properties work within nested methods

I'm trying to create a class with some formatting options, but I can't figure out how to do it properly.
The code produces the following error:
AttributeError: 'NoneType' object has no attribute 'valuesonly'
class Testings(object):
    def format_as_values_only(self, somedata):
        buildstring = somedata.values()
        return buildstring

    def format_as_keys_only(self):
        pass

    def format_as_numbers(self):
        pass

    def get_data_method(self):
        self.data = {'2_testkey': '2_testvalue', "2_testkey2": "2_testvalue2"}

        @property
        def valuesonly(self):
            return format_as_values_only(self.data)

test = Testings()
print test.get_data_method().valuesonly
The important thing for me is to be able to get the formatters like: class.method.formatter or so...
Thanks a lot for any hints!
get_data_method has no return value, so the result of test.get_data_method() is None. That's why you're getting that exception.
If you really want to do something like test.get_data_method().valuesonly, either define the valuesonly property on Testings, and have get_data_method return self, or have get_data_method return some new object with the properties that you want defined.
You can't do things this way. Methods are just functions defined directly inside a class block. Your function is inside another function, so it's not a method. The property decorator is useless except in a class block.
But, more fundamentally, function definitions just create local names, the same as variable assignments or anything else. Your valuesonly function is not accessible at all from outside the get_data_method function, because nothing from within a function is accessible except its return value. What you have done is no different than:
def get_data_method(self):
    a = 2
. . . and then expecting to be able to access the local variable a from outside the function. It won't work. When you call get_data_method(), you get the value None, because get_data_method doesn't return anything. Anything you subsequently do with the result of get_data_method() is just operating on that same None value.
If you want to access things using the syntax you describe, you will need to make get_data_method return an object that has properties like valuesonly. In other words, write another class that provides a valuesonly property, and have get_data_method return an instance of that class. A rough outline (untested):
class DataMethodGetter(object):
    def __init__(self, parent):
        self.parent = parent

    @property
    def valuesonly(self):
        return self.parent.format_as_values_only(self.parent.data)

class Testings(object):
    # rest of class def here
    def get_data_method(self):
        self.data = {'blah': 'blah'}
        return DataMethodGetter(self)
However, you should think about why you want to do this. It's likely to be simpler to set it up to just call valuesonly directly on the Testing object, or to pass a flag to get_data_method, doing something like get_data_method(valuesonly=True).
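To make that flag alternative concrete, a minimal sketch reusing the question's names (the dict contents are the question's sample data; returning the raw dict when the flag is off is an assumption about what the asker wants):

```python
class Testings(object):
    def format_as_values_only(self, somedata):
        # list() so the result is a plain list in Python 3 as well
        return list(somedata.values())

    def get_data_method(self, valuesonly=False):
        self.data = {'2_testkey': '2_testvalue', '2_testkey2': '2_testvalue2'}
        if valuesonly:
            return self.format_as_values_only(self.data)
        return self.data

test = Testings()
print(test.get_data_method(valuesonly=True))  # ['2_testvalue', '2_testvalue2']
```

This keeps the formatters as ordinary methods on one class, avoiding the extra wrapper object entirely.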
