lazily providing the value of a string - python

I have a Python object that conceptually allows access to an array full of strings through iterators and getters. However, since calculating the exact value of each element in the array is really expensive, I am looking into returning a proxy object for the content of each slot in the array and then calculating the actual value on the fly, only when it is really needed.
Namely, I would like to write this:
bar = foo.get(10)  # just returns a proxy
baz = bar          # increases the proxy's reference count
l = [baz]          # increases the proxy's reference count again
print baz          # ooh, actually need the value: calculate it on the fly
v = '%s' % bar     # I need the value here again
if bar is None:    # I need the value here again
    print 'x'
if bar:            # I need the value here again
    print 'x'
for i in bar:      # I need the value here again
    print i
In C++, I would try to overload the dereferencing operator... Any idea ?
I understand that for each of these cases, I could overload specific python 'magic' functions (such as __str__ for print baz) but I wonder if:
this is going to actually cover all possible usecases (are there ways to access the content of a variable that does not involve using a python magic function)
there is a more generic way to do this

In python you'd return a custom type, and override the __str__() method to calculate the string representation at printing time.
class MyCustomType(object):
    def __str__(self):
        return "My string is really costly to produce"
Depending on your use-cases, you are still looking at the various hooks python provides:
Attribute access on custom classes can be hooked into with the __getattr__ method, or by using a property.
Accessing individual items in a sequence-like class (list, tuple, string) and mapping-type classes can be hooked into with __getitem__.
You'll have to decide, based on your use-case, what you need to hook into and at which point the expensive calculation becomes unavoidable. Python will let you hook into almost any point in an object's lifetime with ease.
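A minimal sketch of such a proxy, combining the hooks mentioned above. It is written for Python 3 (the `__nonzero__` alias keeps truth testing working on Python 2). `LazyString`, `expensive`, and `calls` are hypothetical names introduced for illustration, and the set of hooks shown is an assumption about which use cases matter, not a complete list.

```python
class LazyString(object):
    """Proxy that computes its string value on first use and caches it."""

    def __init__(self, compute):
        self._compute = compute   # zero-argument callable producing the string
        self._value = None
        self._done = False

    def _get(self):
        # Run the expensive computation at most once.
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value

    def __str__(self):            # str(proxy), print(proxy), '%s' % proxy
        return self._get()

    def __bool__(self):           # `if proxy:` on Python 3
        return bool(self._get())
    __nonzero__ = __bool__        # the same hook under its Python 2 name

    def __len__(self):
        return len(self._get())

    def __iter__(self):           # `for ch in proxy:`
        return iter(self._get())

    def __getitem__(self, index):
        return self._get()[index]

    def __getattr__(self, name):
        # Only called when normal lookup fails: forward public string
        # methods such as .upper() to the computed value.
        if name.startswith('_'):
            raise AttributeError(name)
        return getattr(self._get(), name)


calls = []                        # records how often the computation runs

def expensive():
    calls.append(1)
    return "costly"

bar = LazyString(expensive)
baz = bar                         # copying the reference computes nothing
items = [baz]
computed_before_use = len(calls)  # still 0 at this point
text = '%s' % bar                 # first real use triggers the computation
```

One case from the question cannot be hooked: `bar is None` compares object identity and never consults the proxy, so it is always false for a proxy object.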

Related

How to dynamically access object attributes in python without boxing?

getattr(dir,"__name__") is dir.__name__ evaluates to False - is there an alternative to getattr that would yield True ?
The __name__ attribute of built-in functions is implemented (on the CPython reference interpreter) as a property (technically, a get-set descriptor), not stored as an attribute in the form of a Python object.
Properties act like attributes, but call a function when the value is requested, and in this case, the function converts the C-style string name of the function to a Python str on demand. So each time you look up dir.__name__, you get freshly constructed str representing the data; as noted in the comments, this means there is no way to have an is check pass; even dir.__name__ is dir.__name__ returns False, because each lookup of __name__ returned a new str.
The language gives no guarantees of how __name__ is implemented, so you shouldn't be assuming it returns the same object each time. There are very few language guaranteed singletons (None, True, False, Ellipsis and NotImplemented are the biggies, and all classes have unique identities); assuming is will work with anything not in that set when it's not an object you controlled the creation of is a bad idea. If you want to check if the values are the same, test with ==, not is.
Update to address traversing an arbitrary graph of python objects without getting hung up by descriptors and other stuff (like __getattr__) that dynamically generate objects (and therefore shouldn't be invoked to describe the static graph):
The inspect.getattr_static function should let you "traverse an arbitrary graph of python objects reachable from a starting one while assuming as little possible about the types of objects and the implementation of their attributes" (as your comment requested). When the attribute is actually an attribute, it returns the value, but it doesn't trigger dynamic lookup for descriptors (like @property), __getattr__ or __getattribute__. So inspect.getattr_static(dir, '__name__') will return the getset_descriptor that CPython uses to implement __name__ without actually retrieving the string. On a different object where __name__ is a real attribute (e.g. the inspect module itself), it will return the attribute (inspect.getattr_static(inspect, '__name__') returns 'inspect').
While it's not perfect (some properties may actually be backed by real Python objects, not dynamically generated ones, that you can't otherwise access), it's at least a workable solution; you won't end up creating new objects by accident, and you won't end up in infinite loops of property lookup (e.g. every callable can have __call__ looked up on it forever, wrapping itself over and over as it goes), so you can at least arrive at a solution that mostly reflects the object graph accurately, and doesn't end up recursing to death.
Notably, it will preserve identity semantics properly. If two objects have the same attribute (by identity), the result will match as expected. If two objects share a descriptor (e.g. __name__ for all built-in functions, e.g. bin, dir), then it returns the descriptor itself, which will match on identity. And it does it all without needing to know up front if what you have is an attribute or descriptor.
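The behaviour described above can be checked with a short sketch on CPython 3. The variable names are only for illustration, and the descriptor type is printed rather than assumed, since it is an implementation detail of the interpreter.

```python
import inspect

# Normal lookup runs the descriptor, which builds a str from the C-level name.
name = dir.__name__

# Static lookup returns the descriptor object itself, without calling it.
dir_descr = inspect.getattr_static(dir, '__name__')
bin_descr = inspect.getattr_static(bin, '__name__')

# Both built-ins share one descriptor, so the identity check passes.
shared = dir_descr is bin_descr

# On a module, __name__ is a real attribute, so the stored value comes back.
module_name = inspect.getattr_static(inspect, '__name__')

print(type(dir_descr).__name__)
print(shared)
print(module_name)
```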

Passing object attribute (id) or full object as argument

Suppose there is an Object with id among other attributes.
Suppose there is a generic function that would only need object.id for process (ex: retrieving object value from database from id)
Does it cost more in Python 3 (memory-wise or performance) to do:
def my_function(self, id):
    do_stuff_with(id)
my_function(my_object.id)
or
def my_function(self, obj):
    do_stuff_with(obj.id)
my_function(my_object)
I imagine that answer depends on languages implementations, so I was wondering if there is an additional cost with object as argument in Python 3, since I feel it's more flexible (imagine that in the next implementation of this function you also need object.name, then you don't have to modify the signature)
In both cases, an object reference is passed to the function. I don't know how other implementations do this, but for CPython think of C pointers to Python objects, so the call uses the same amount of memory either way.
Also, the dot look-up doesn't make a difference in this case: you either perform the look-up for the id at the call site or you do it inside the function body (and, really, attribute look-ups shouldn't be a concern).
In the end, it boils down to what makes more sense: passing the object and letting the function do what it wants with it is better conceptually. It's also more flexible, as you stated, so pass the class instance.
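A small sketch of the point that only a reference is passed in either case. `Record`, `by_id`, and `by_object` are hypothetical names used for illustration; each function returns its argument so the caller can compare identities.

```python
class Record(object):
    def __init__(self, id, name):
        self.id = id
        self.name = name

def by_id(id):
    # Receives a reference to the attribute value; nothing is copied.
    return id

def by_object(obj):
    # Receives a reference to the whole instance; nothing is copied.
    return obj

my_object = Record(42, "example")

# In both cases the function sees the very same object the caller holds.
same_object = by_object(my_object) is my_object
same_id = by_id(my_object.id) is my_object.id
```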

Python/IronPython- how to return value of specific attribute of class when only class object referenced?

Ok, just trying to get into Python, specifically, I'm using the benefit of on-the-fly calculations from within a C# app calling the IronPython engine. I pass in to IP a list of C# objects of class C, and am going to do a bunch of calculations. There's a legacy (SQL) reference to a certain member of the class, call it CURRENTAMOUNT. There are other values of attributes that will be referenced in the calculations but I was wondering if there's a way to automatically return the value of CURRENTAMOUNT if only the object is referenced.
C# Class MyNumbersClass
public class MyNumbersClass
{
    decimal SomeVal_A;
    decimal SomeVal_B;
    decimal CurrentAmount;
}
... gather a List called NumbersList...
... instantiate IronPython engine E
... set scope S
... set IP list PNumbers to C# list NumbersList
S.SetVariable(PNumbers, NumbersList)
So now, a simple calculation in IronPython(IP) is to calculate the following:
PNumbers[10].CurrentAmount = PNumbers[0].SomeVal_A * (PNumbers[2].SomeVal_B / DaysInWeek) - PNumbers[5].CurrentAmount
How can I re-write this so CurrentAmount is the default attribute to set and get when none specified?
I.e., PNumbers[10] = PNumbers[0].SomeVal_A * (PNumbers[2].SomeVal_B / DaysInWeek) - PNumbers[5]
I have way over-simplified the example as all I can say is that the existing legacy attributes of the passed in C# class are VERY long and wordy, and CurrentAmount is the default attribute set and/or referenced and it is HALF of the size of the legacy name for the commonly used attribute.
I realize as I type this that there might be unintended consequences of what I'm asking - i.e., when I really DO want to just reference the object of the class as a whole.
Any ideas?
If you're always dealing with lists of items, you could create a class in C# that wraps the list and implements __getitem__ and __setitem__ (basically overloading the [] operator) to refer to CurrentAmount. You'll lose the ability to completely replace an element of that list from Python, but that might not matter in your case.
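The answer suggests writing the wrapper in C#. As a rough illustration of the same idea on the Python side, here is a sketch in plain Python. `AmountView` and the stand-in `Number` class are hypothetical, and the numeric values are made up for the example.

```python
class Number(object):
    """Stand-in for the C# class, with the three fields from the question."""
    def __init__(self, a, b, amount):
        self.SomeVal_A = a
        self.SomeVal_B = b
        self.CurrentAmount = amount


class AmountView(object):
    """Wraps a list so that [] reads and writes CurrentAmount."""
    def __init__(self, items):
        self._items = items

    def __getitem__(self, index):
        return self._items[index].CurrentAmount

    def __setitem__(self, index, value):
        # Updates the attribute; the element itself is never replaced.
        self._items[index].CurrentAmount = value

    def __len__(self):
        return len(self._items)


PNumbers = [Number(10, 0, 0), Number(0, 14, 5), Number(0, 0, 0)]
Amounts = AmountView(PNumbers)
DaysInWeek = 7

# Other attributes still go through the original list.
Amounts[2] = PNumbers[0].SomeVal_A * (PNumbers[1].SomeVal_B / DaysInWeek) - Amounts[1]
```

Keeping both the original list and the view around avoids the concern raised in the question: the whole object stays reachable through `PNumbers`, while `Amounts` gives the short form.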

Python and reference passing. Limitation?

I would like to do something like the following:
class Foo(object):
    def __init__(self):
        self.member = 10
        pass
def factory(foo):
    foo = Foo()
aTestFoo = None
factory(aTestFoo)
print aTestFoo.member
However it crashes with AttributeError: 'NoneType' object has no attribute 'member':
the object aTestFoo has not been modified inside the call of the function factory.
What is the pythonic way of performing that ? Is it a pattern to avoid ? If it is a current mistake, how is it called ?
In C++, in the function prototype, I would have added a reference to the pointer to be created in the factory... but maybe this is not the kind of things I should think about in Python.
In C#, there's the key word ref that allows to modify the reference itself, really close to the C++ way. I don't know in Java... and I do wonder in Python.
Python does not have pass by reference. One of the few things it shares with Java, by the way. Some people describe argument passing in Python as call by value (and define the values as references, where reference means not what it means in C++), some people describe it as pass by reference with reasoning I find quite questionable (they re-define the term to mean what Python calls a "reference", and end up with something which has nothing to do with what has been known as pass by reference for decades), others go for terms which are not as widely used and abused (popular examples are "{pass,call} by {object,sharing}"). See Call By Object on effbot.org for a rather extensive discussion on the definitions of the various terms, on history, and on the flaws in some of the arguments for the terms pass by reference and pass by value.
The short story, without naming it, goes like this:
Every variable, object attribute, collection item, etc. refers to an object.
Assignment, argument passing, etc. create another variable, object attribute, collection item, etc. which refers to the same object but has no knowledge which other variables, object attributes, collection items, etc. refer to that object.
Any variable, object attribute, collection item, etc. can be used to modify an object, and any other variable, object attribute, collection item, etc. can be used to observe that modification.
No variable, object attribute, collection item, etc. refers to another variable, object attribute, collection items, etc. and thus you can't emulate pass by reference (in the C++ sense) except by treating a mutable object/collection as your "namespace". This is excessively ugly, so don't use it when there's a much easier alternative (such as a return value, or exceptions, or multiple return values via iterable unpacking).
You may consider this like using pointers, but not pointers to pointers (but sometimes pointers to structures containing pointers) in C. And then passing those pointers by value. But don't read too much into this simile. Python's data model is significantly different from C's.
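The alternatives named in the list above can be sketched as follows. `make_pair`, `fill`, and `holder` are hypothetical names; the second half shows the mutable-container workaround only for comparison.

```python
def make_pair():
    # Multiple return values are packed into one tuple.
    return 10, "ten"

# Iterable unpacking binds each element to its own name in the caller.
number, label = make_pair()


def fill(namespace):
    # The workaround: mutate a shared container instead of rebinding a name.
    namespace["number"] = 10

holder = {}
fill(holder)   # the caller observes the change through its own reference
```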
You are making a mistake here because in Python
"We call the argument passing technique _call by sharing_, because the argument objects are shared between the caller and the called routine. This technique does not correspond to most traditional argument passing techniques (it is similar to argument passing in LISP). In particular it is not call by value because mutations of arguments performed by the called routine will be visible to the caller. And it is not call by reference because access is not given to the variables of the caller, but merely to certain objects."
in Python, the variables in the formal argument list are bound to the actual argument objects. the objects are shared between caller and callee; there are no "fresh locations" or extra "stores" involved. (which, of course, is why the CLU folks called this mechanism "call-by-sharing".)
and btw, Python functions don't run in an extended environment, either. function bodies have very limited access to the surrounding environment.
The Assignment Statements section of the Python docs might be interesting.
The = statement in Python acts differently depending on the situation, but in the case you present, it just binds the new object to a new local variable:
def factory(foo):
    # This makes a new instance of Foo
    # and binds it to a local variable `foo`
    foo = Foo()
# This binds `None` to a top-level variable `aTestFoo`
aTestFoo = None
# Call `factory` with first argument of `None`
factory(aTestFoo)
print aTestFoo.member
Although it can potentially be more confusing than helpful, the dis module can show you the byte-code representation of a function, which can reveal how Python works internally. Here is the disassembly of factory:
>>> import dis
>>> dis.dis(factory)
  4           0 LOAD_GLOBAL              0 (Foo)
              3 CALL_FUNCTION            0
              6 STORE_FAST               0 (foo)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE
What that says is, Python loads the global Foo class by name (0), and calls it (3, instantiation and calling are very similar), then stores the result in a local variable (6, see STORE_FAST). Then it loads the default return value None (9) and returns it (12).
What is the pythonic way of performing that ? Is it a pattern to avoid ? If it is a current mistake, how is it called ?
Factory functions are rarely necessary in Python. In the occasional case where they are necessary, you would just return the new instance from your factory (instead of trying to assign it to a passed-in variable):
class Foo(object):
    def __init__(self):
        self.member = 10
        pass
def factory():
    return Foo()
aTestFoo = factory()
print aTestFoo.member
Your factory method doesn't return anything - and by default it will have a return value of None. You assign aTestFoo to None, but never re-assign it - which is where your actual error is coming from.
Fixing these issues:
class Foo(object):
    def __init__(self):
        self.member = 10
        pass
def factory(obj):
    return obj()
aTestFoo = factory(Foo)
print aTestFoo.member
This should do what I think you are after, although such patterns are not that typical in Python (ie, factory methods).

Python achieve pointer like behaviour

I have to write a testing module and have c++-Background. That said, I am aware that there are no pointers in python but how do I achieve the following:
I have a test method which looks in pseudocode like this:
def check(self, obj, prop, value):
    if obj.prop <> value:  # this does not work,
        # getattr does not work either (object has no such method (interpreter output))
        # I am working with objects from InCyte's python interface
        # the supplied findProp method does not work either (I get
        # None for objects I can access on the shell with obj.prop,
        # and yes, I supply the method with a string 'prop')
        if self._autoadjust:
            print("Adjusting prop from x to y")
            obj.prop = value  # setattr does not work, see above
        else:
            print("Warning Value != expected value for obj")
Since I want to check many different objects in separate functions I would like to be able to keep the check method in place.
In general, how do I ensure that a function affects the passed object and does not create a copy?
myobj.size=5
resize(myobj,10)
print myobj.size #jython =python2.5 => print is not a function
I can't make resize a member method since the myobj implementation is out of reach, and I don't want to type myobj=resize(myobj, 10) everywhere
Also, how can I make it so that I can access those attributes in a function to which i pass the object and the attribute name?
getattr isn't a method, you need to call it like this
getattr(obj, prop)
similarly setattr is called like this
setattr(obj, prop, value)
In general how do I ensure that a function affects the passed object and does not create a copy?
Python is not C++, you never create copies unless you explicitly do so.
I can't make resize a member method since the myobj implementation is out of reach, and I don't want to type myobj=resize(myobj,10) everywhere
I don't get it. Why should it be out of reach? If you have the instance, you can invoke its methods.
In general, how do I ensure that a function affects the passed object
By writing code inside the function that affects the passed-in object, instead of re-assigning to the name.
and does not create a copy?
A copy is never created unless you ask for one.
Python "variables" are names for things. They don't store objects; they refer to objects. However, unlike C++ references, they can be made to refer to something else.
When you write
def change(parameter):
    parameter = 42
x = 23
change(x)
# x is still 23
The reason x is still 23 is not because a copy was made, because a copy wasn't made. The reason is that, inside the function, parameter starts out as a name for the passed-in integer object 23, and then the line parameter = 42 causes parameter to stop being a name for 23, and start being a name for 42.
If you do
def change(parameter):
    parameter.append(42)
x = [23]
change(x)
# now x is [23, 42]
The passed-in parameter changes, because .append on a list changes the actual list object.
I can't make resize a member method since the myobj implementation is out of reach
That doesn't matter. When Python compiles, there is no type-checking step, and there is no step to look up the implementation of a method to insert the call. All of that is handled when the code actually runs. The code will get to the point myobj.resize(), look for a resize attribute of whatever object myobj currently refers to (after all, it can't know ahead of time even what kind of object it's dealing with; variables don't have types in Python but instead objects do), and attempt to call it (throwing the appropriate exceptions if (a) the object turns out not to have that attribute; (b) the attribute turns out not to actually be a method or other sort of function).
Also, how can I make it so that I can access those attributes in a function to which i pass the object and the attribute name? / getattr does not work either
Certainly it works if you use it properly. It is not a method; it is a built-in top-level function. Same thing with setattr.
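Putting the pieces together, the check method from the question could look roughly like this. It is a sketch that assumes the objects support ordinary Python attribute access, which may not hold for every object coming from a vendor interface. `Checker` and `Thing` are hypothetical stand-ins.

```python
class Checker(object):
    def __init__(self, autoadjust=False):
        self._autoadjust = autoadjust

    def check(self, obj, prop, value):
        # prop is the attribute *name* as a string, e.g. "size".
        current = getattr(obj, prop)
        if current == value:
            return True
        if self._autoadjust:
            print("Adjusting %s from %r to %r" % (prop, current, value))
            setattr(obj, prop, value)   # mutates the caller's object in place
        else:
            print("Warning: %s is %r, expected %r" % (prop, current, value))
        return False


class Thing(object):
    """Stand-in for an object whose implementation is out of reach."""
    pass

thing = Thing()
thing.size = 5

first = Checker(autoadjust=True).check(thing, "size", 10)   # adjusts in place
second = Checker().check(thing, "size", 10)                 # now matches
```

Because the function works on the object it was handed rather than on a copy, the caller sees the new `size` without reassigning anything.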
