I'm working with a 3rd party package that defines a result object from an expensive method call on a source object: result_object = source_object.method(input_value).
I'd like to modify result_object inside a function call, something like this:
def modify_result(result_object, update_value):
    result_object = source_object.method(update_value)
Obviously the posted code won't work; it just creates a local result_object that gets discarded. Instead, I could:
make result_object nonlocal in the function, and modify it as above
extend the result_object class and add a modify_result method
something else?
A few clarifications. In this context, is result_object considered global? And more importantly, is there a preferred method to update result_object so other functions can access it?
It's not quite clear which operation is expensive and which is cheap in your example. Are you just showing the expensive operation?
In any case, it sounds like you have an expensive operation and a cheap one and you'd like to be able to use the cheap one where applicable. To do this, I would think you'd need an existing object to leverage, so I would suggest having an optional keyword argument to supply such an object, but making the return value the same result type regardless of whether it was supplied or not. Something like:
def process_data(new_data, existing_result=None):
    if existing_result is None:
        # create a new result object
        return make_result_via_expensive_op(new_data)
    else:
        # modify an existing result object
        existing_result.modify_via_cheap_op(new_data)
        return existing_result
I wouldn't recommend making it global. You can more easily pass around a reference and it's easier to follow the code.
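To see that the pattern above keeps a single shared object rather than creating a new one, here is a minimal runnable sketch. The Result class and both helper functions are hypothetical stand-ins for the third-party objects, which the question doesn't show.

```python
class Result:
    """Hypothetical stand-in for the third-party result object."""

    def __init__(self, data):
        self.history = [data]

    def modify_via_cheap_op(self, new_data):
        # Mutates this object in place instead of building a new one.
        self.history.append(new_data)


def make_result_via_expensive_op(new_data):
    # Hypothetical stand-in for source_object.method(input_value).
    return Result(new_data)


def process_data(new_data, existing_result=None):
    if existing_result is None:
        return make_result_via_expensive_op(new_data)
    existing_result.modify_via_cheap_op(new_data)
    return existing_result


result = process_data("first")
same = process_data("second", existing_result=result)
```

Because the second call mutates and returns the object it was given, every function holding a reference to result sees the update, and no global is needed.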
I would like to copy an existing function from an existing module in the following way:
def foo(a,b,c=1,d=3,*arg):
    return True
myClass.foo = lambda b,c,d,*arg : foo(my_value_of_a, b,c,d,*arg)
However, there are several problems with this approach namely:
I am doing this in a loop and I don't know the arguments of most functions
I am losing the default values - which I absolutely cannot
The __doc__ and other attributes would be nice to keep too
I tried to do something like this:
handler = getattr(mod,'foo')
handler.__defaults__ = tuple([my_value_of_a] + list(handler.__defaults__))
myClass.foo = handler
which is almost enough for my use case (just because I always modify the first argument only). The problem is that if I call mod.foo() it also has my_value_of_a as the default value for a!
I tried using the copy module to do a handler=deepcopy(handler) but even that didn't work and modifying the default values of handler also modifies the default values of the module function itself.
Any suggestions on how to do this in a "pythonic" way? I probably cannot use decorators either, since I'm looping over functions from external modules (several, actually).
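One possible approach, sketched here under assumptions rather than offered as the definitive answer, is functools.partial. It fixes the first argument in a new callable and leaves the original function and its __defaults__ untouched. (copy.deepcopy returns function objects unchanged, which is why editing handler.__defaults__ also changed mod.foo.) The names bind_first and MyClass are hypothetical, and this foo returns its arguments only so the effect is visible.

```python
import functools


def foo(a, b, c=1, d=3, *args):
    """Example function standing in for one taken from an external module."""
    return (a, b, c, d, args)


def bind_first(func, value):
    # partial stores `value` as the first positional argument of a new
    # callable; func itself, including func.__defaults__, is not modified.
    bound = functools.partial(func, value)
    # partial objects do not pick up __name__ or __doc__ on their own.
    functools.update_wrapper(bound, func)
    return bound


class MyClass:
    pass


# staticmethod keeps the instance from being passed in as an extra argument.
MyClass.foo = staticmethod(bind_first(foo, "my_value_of_a"))

via_class = MyClass.foo(2)
via_instance = MyClass().foo(2, d=10)
```

Since bind_first never inspects the signature, it can be applied in a loop over functions whose arguments are not known in advance, and the defaults for c and d keep working.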
So I have a class with a couple of methods defined as:
import cv2

class Recognizer(object):
    def __init__(self):
        self.image = None
        self.reduced_img = None

    def load_image(self, path):
        self.image = cv2.imread(path)
        return self.image
Say I wanna add a third method that uses a return value from load_image(). Should I define it like this:
    def shrink_image(self):
        self.reduced_img = cv2.resize(self.image, (300, 300))
        return self.reduced_img
Or should I define it like this:
    def shrink_image(self, path):
        reduced_img = cv2.resize(self.load_image(path), (300, 300))
        return reduced_img
What exactly is the difference between the two? I can see that I can access the fields set in __init__ from any method declared in the class, so I guess if I update those fields I would be able to access them for an instance at any given time.
Is there a consensus on which way is better?
What exactly is the difference between the two?
In Python, __init__ is the object's initializer (loosely, its constructor). It is invoked implicitly when you call the class, as in Recognizer().
The term "better" is vague. In the first example you save the image as an attribute on the object, which makes the object larger.
But in second example you are simply returning the data from the function, to be used by the caller.
So it's a matter of context and style.
A simple rule of thumb: if you are going to use reduced_img in the context of the Recognizer object, save it as an attribute on the object and access it via self. If only the caller uses reduced_img and Recognizer does not depend on it as state, then it's fine to just return it from the function.
In the second way the variable is scoped to the shrink_image function.
In the first way the variable is scoped to the object's lifetime, and having self.reduced_img set is a side effect of the method.
Seeing only your code sample, without seeing its clients, the second case is "better", because reduced_img isn't used anywhere else and it is unnecessary to bind it to the instance. There may well be a use case where you need to persist the result of the last shrink_image call, which would make the side effect necessary.
In general it is extremely helpful to minimize side effects. Side effects, especially ones that mutate state, can make reasoning about your program more difficult.
This is especially seen when you have multiple accessors to your object.
Imagine having the first shrink_image: you release your program, and a single client at a single call site calls shrink_image. Easy peasy. After the call, self.reduced_img will be the result.
Now imagine sharing the object between multiple call sites. This introduces a temporal-ish coupling: you can no longer assume what reduced_img is, and reading it before calling shrink_image may no longer give None, because there may be other callers!
Compare this to the second shrink_image: callers no longer share mutable state, and it's easier to reason about the state of the Recognizer instance across shrink_image calls.
Things get really nuts for the first example when concurrent calls are introduced. It goes from being difficult to reason about and potentially logically incorrect to being a synchronization and data race issue.
Without concurrent callers this isn't going to be an issue. But it's definitely a possibility: if you use this call in a web framework and share a single instance between multiple worker threads, you get this implicit concurrency and could be subject to race conditions :p
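The temporal coupling described above can be shown without OpenCV. This is a minimal sketch in which list slicing stands in for cv2.resize; StatefulShrinker and StatelessShrinker are hypothetical names for the two styles of shrink_image.

```python
class StatefulShrinker:
    """First style: the result is stored on the instance as a side effect."""

    def __init__(self):
        self.reduced = None

    def shrink(self, values, size):
        self.reduced = values[:size]
        return self.reduced


class StatelessShrinker:
    """Second style: the result is only returned to the caller."""

    def shrink(self, values, size):
        return values[:size]


shared = StatefulShrinker()
first = shared.shrink([1, 2, 3, 4], 2)  # call site A
shared.shrink([9, 8, 7, 6], 3)          # call site B, somewhere else
seen_by_a = shared.reduced              # A reads the attribute afterwards

stateless = StatelessShrinker()
kept_by_a = stateless.shrink([1, 2, 3, 4], 2)
stateless.shrink([9, 8, 7, 6], 3)       # cannot affect kept_by_a
```

Call site A's own return value is still correct, but the attribute on the shared instance now holds B's result, so what A reads depends on the order of the calls.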
I'd like to make a copy of a class, while updating all of its methods to refer to a new set of __globals__.
I was thinking something like below, however unlike types.FunctionType, the constructor for types.UnboundMethodType does not accept __globals__, any suggestions how to work around this?
import types

def copy_class(old_class, new_module):
    """Copies a class, updating __globals__ of all methods to point to new_module"""
    new_dict = {}
    for name, entry in old_class.__dict__.items():
        if isinstance(entry, types.UnboundMethodType):
            entry = types.UnboundMethodType(name, None, old_class.__class__, globals=new_module.__dict__)
        new_dict[name] = entry
    return type(old_class.__name__, old_class.__bases__, new_dict)
The __dict__ values are functions, not unbound methods. The unbound method objects only get created on attribute access. If you are seeing unbound method objects in the __dict__, something weird happened with your class object before this function got to it.
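Since the __dict__ values are plain functions, one way to get new globals is to rebuild each function with types.FunctionType, which does accept a globals dictionary. The following is a sketch for Python 3 (the question itself targets Python 2), not a complete solution. It overlays new_globals on a copy of each function's existing globals, which is an assumption about the desired behavior, and it leaves properties, classmethods, staticmethods and methods that use zero-argument super() untouched.

```python
import types

FOO = 1


class Foo:
    def method(self):
        return FOO


def copy_class(old_class, new_globals):
    """Copy a class, rebuilding each plain function with overlaid globals."""
    new_dict = {}
    for name, entry in vars(old_class).items():
        if name in ("__dict__", "__weakref__"):
            continue  # type() creates fresh ones for the new class
        if isinstance(entry, types.FunctionType):
            globs = dict(entry.__globals__)  # start from the old globals
            globs.update(new_globals)        # then apply the overrides
            rebuilt = types.FunctionType(
                entry.__code__, globs, entry.__name__,
                entry.__defaults__, entry.__closure__,
            )
            rebuilt.__kwdefaults__ = entry.__kwdefaults__
            rebuilt.__doc__ = entry.__doc__
            entry = rebuilt
        new_dict[name] = entry
    return type(old_class.__name__, old_class.__bases__, new_dict)


NewFoo = copy_class(Foo, {"FOO": 2})
```

The original class keeps reading FOO from its own module, while methods of the copy read the overridden value.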
I don't know about you, but I generally don't like to use types for anything other than type checking (which I don't do very often ;-). I'd much rather inspect...
I have to preface this code by saying that I hope you have a really good reason for wanting to do this ;-). To me, it seems like just subclassing and overriding class attributes should get the job done much more elegantly. However, if you really want to copy a class, why not just execute its source again in the new namespace?
I've put together the following simple modules:
# test.py
# Just some test data
FOO = 1

class Bar(object):
    def subclass_method(self):
        print('Hello World!')

class Foo(Bar):
    def method(self):
        return FOO
And then something to do the heavy lifting:
import sys
import inspect

def copy_class(cls, new_globals):
    # Re-run the class's source in a namespace that starts as a copy of the
    # original module's globals and is then overlaid with new_globals.
    source = inspect.getsource(cls)
    globs = {}
    globs.update(sys.modules[cls.__module__].__dict__)
    globs.update(new_globals)
    exec(source, globs)  # this call form works in both Python 2 and Python 3
    return globs[cls.__name__]

# Check that it works...
import test
NewFoo = copy_class(test.Foo, {'FOO': 2})
print(NewFoo().method())
NewFoo().subclass_method()
print(test.Foo().method())
test.Foo().subclass_method()
This has some desirable properties and some undesirable ones. First, it only works on classes that are inspectable. That's pretty much anything user-defined, so it's probably not too restrictive. It also might be a bit slower than solutions that don't involve re-parsing the source string. But again, it doesn't seem like this should be executed too frequently, so that's probably OK.
Now the "advantages"...
If a global is requested by a function but not supplied, this will use the global from the old namespace. If this behavior isn't desirable (i.e. you'd rather have the NameError), you can easily modify the function to remove it.
The "copy" doesn't inherit from the original. For most purposes, that probably doesn't matter, but it's a bit weird to have the copy of something inherit from the original ...
Some people might see the exec in here and immediately think "Oh no! exec!?!?! The world is about to end!!!". Frankly, that's a good default response. However, I argue that if you're copying a class that you plan to use later in the code, using exec is no less safe than copying it any other way (after all, the class's code has already been executed).
I have a function with way too much going on in it, so I've decided to split it up into smaller functions and call all my block functions inside a single function, e.g.
def main_function(self):
    time_subtraction(self)
    pay_calculation(self,todays_hours)
and
def time_subtraction(self):
    todays_hours = datetime.combine(datetime(1,1,1,0,0,0), single_object2) - datetime.combine(datetime(1,1,1,0,0,0),single_object)
    return todays_hours
So what I'm trying to accomplish here is to make todays_hours available to my main_function. I've read lots of documentation and other resources, but apparently I'm still struggling with this aspect.
EDIT--
This is not a method of the class. It's just a file where I have a lot of functions coded, and I import it where needed.
If you want to pass the return value of one function to another, you need to either nest the function calls:
pay_calculation(self, time_subtraction(self))
… or store the value so you can pass it:
hours = time_subtraction(self)
pay_calculation(self, hours)
As a side note, if these are methods in a class, you should be calling them as self.time_subtraction(), self.pay_calculation(hours), etc., not time_subtraction(self), etc. And if they aren't methods in a class, maybe they should be.
Often it makes sense for a function to take a Spam instance, and for a method of Spam to send self as the first argument, in which case this is all fine. But the fact that you've defined def time_subtraction(self): implies that's not what's going on here, and you're confused about methods vs. normal functions.
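Putting the "store the value so you can pass it" option together as plain module-level functions, a minimal sketch might look like this. The parameters start, end and hourly_rate and the pay formula are assumptions for illustration; the question doesn't show where single_object and single_object2 come from or what pay_calculation does.

```python
from datetime import date, datetime, time


def time_subtraction(start, end):
    # Put both times on the same dummy date so they can be subtracted.
    return datetime.combine(date.min, end) - datetime.combine(date.min, start)


def pay_calculation(hours_worked, hourly_rate):
    # hours_worked is a timedelta; convert it to hours before multiplying.
    return hours_worked.total_seconds() / 3600 * hourly_rate


def main_function(start, end, hourly_rate):
    todays_hours = time_subtraction(start, end)  # keep the returned value
    return pay_calculation(todays_hours, hourly_rate)


pay = main_function(time(9, 0), time(17, 30), 20.0)
```

The key point is that todays_hours only exists in main_function because the return value of time_subtraction is assigned to a name there.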
I have to write a testing module and have a C++ background. That said, I am aware that there are no pointers in Python, but how do I achieve the following?
I have a test method which looks in pseudocode like this:
def check(self,obj,prop,value):
    if obj.prop <> value:  # this does not work;
        # getattr does not work either (object has no such method, per the interpreter output)
        # I am working with objects from InCyte's python interface
        # the supplied findProp method does not work either (I get
        # None for objects I can access on the shell with obj.prop,
        # and yes, I supply the method with a string 'prop')
        if self._autoadjust:
            print("Adjusting prop from x to y")
            obj.prop = value  # setattr does not work, see above
        else:
            print("Warning Value != expected value for obj")
Since I want to check many different objects in separate functions I would like to be able to keep the check method in place.
In general, how do I ensure that a function affects the passed object and does not create a copy?
myobj.size=5
resize(myobj,10)
print myobj.size #jython =python2.5 => print is not a function
I can't make resize a member method since the myobj implementation is out of reach, and I don't want to type myobj=resize(myobj, 10) everywhere
Also, how can I make it so that I can access those attributes in a function to which i pass the object and the attribute name?
getattr isn't a method; it's a built-in function, so you need to call it like this:
getattr(obj, prop)
Similarly, setattr is called like this:
setattr(obj, prop, value)
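Applied to the check method from the question, that could look like the sketch below. Widget is a hypothetical stand-in class, and autoadjust is passed as a parameter instead of read from self._autoadjust. Whether objects from the InCyte interface behave like ordinary Python objects here is an assumption.

```python
class Widget:
    """Hypothetical object standing in for one from the external interface."""

    def __init__(self, size):
        self.size = size


def check(obj, prop, value, autoadjust=False):
    """Return True if obj.<prop> already equals value; optionally fix it."""
    current = getattr(obj, prop)  # prop is a string such as "size"
    if current == value:
        return True
    if autoadjust:
        print("Adjusting %s from %r to %r" % (prop, current, value))
        setattr(obj, prop, value)  # changes the caller's object in place
    else:
        print("Warning: %s is %r, expected %r" % (prop, current, value))
    return False


w = Widget(5)
matched = check(w, "size", 10, autoadjust=True)
```

After the call, w.size has the adjusted value even though check never returned the object.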
In general how do I ensure that a function affects the passed object and does not create a copy?
Python is not C++; copies are never created unless you explicitly make them.
I can't make resize a member method since the myobj implementation is out of reach, and I don't want to type myobj=resize(myobj, 10) everywhere
I don't get it. Why should it be out of reach? If you have the instance, you can invoke its methods.
In general, how do I ensure that a function affects the passed object
By writing code inside the function that affects the passed-in object, instead of re-assigning to the name.
and does not create a copy?
A copy is never created unless you ask for one.
Python "variables" are names for things. They don't store objects; they refer to objects. However, unlike C++ references, they can be made to refer to something else.
When you write
def change(parameter):
    parameter = 42

x = 23
change(x)
# x is still 23
The reason x is still 23 is not that a copy was made; no copy was made. The reason is that, inside the function, parameter starts out as a name for the passed-in integer object 23, and then the line parameter = 42 causes parameter to stop being a name for 23 and start being a name for 42.
If you do
def change(parameter):
    parameter.append(42)

x = [23]
change(x)
# now x is [23, 42]
The passed-in parameter changes, because .append on a list changes the actual list object.
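The same distinction applies to the resize(myobj, 10) case from the question. In this minimal sketch Box is a hypothetical class, and rebind is included only to contrast with resize.

```python
class Box:
    def __init__(self, size):
        self.size = size


def resize(obj, new_size):
    # Assigning to an attribute changes the object the caller passed in.
    obj.size = new_size


def rebind(obj, new_size):
    # Assigning to the bare name only makes the local name refer to a new
    # object; the caller's object is left alone.
    obj = Box(new_size)


myobj = Box(5)
resize(myobj, 10)
size_after_resize = myobj.size
rebind(myobj, 99)
size_after_rebind = myobj.size
```

So a plain function is enough; there is no need to write myobj = resize(myobj, 10) as long as the function mutates the object instead of reassigning the name.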
I can't make resize a member method since the myobj implementation is out of reach
That doesn't matter. When Python compiles, there is no type-checking step, and there is no step to look up the implementation of a method to insert the call. All of that is handled when the code actually runs. The code will get to the point myobj.resize(), look for a resize attribute of whatever object myobj currently refers to (after all, it can't know ahead of time even what kind of object it's dealing with; variables don't have types in Python but instead objects do), and attempt to call it (throwing the appropriate exceptions if (a) the object turns out not to have that attribute; (b) the attribute turns out not to actually be a method or other sort of function).
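The run-time lookup described above, including both failure modes, can be sketched like this. All three classes and the try_resize helper are hypothetical and exist only to trigger each outcome.

```python
class HasResize:
    def __init__(self):
        self.size = 5

    def resize(self, new_size):
        self.size = new_size


class NoResize:
    pass


class ResizeNotCallable:
    resize = 5  # an attribute named resize that is not a function


def try_resize(obj, new_size):
    # Nothing is checked ahead of time; the lookup happens when this runs.
    try:
        obj.resize(new_size)
    except AttributeError:
        return "no resize attribute"
    except TypeError:
        return "resize is not callable"
    return "resized"


outcomes = [try_resize(o, 10) for o in (HasResize(), NoResize(), ResizeNotCallable())]
```

try_resize accepts any object at all; which branch runs depends only on what the object turns out to have at the moment of the call.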
Also, how can I make it so that I can access those attributes in a function to which i pass the object and the attribute name? / getattr does not work either
Certainly it works if you use it properly. It is not a method; it is a built-in top-level function. Same thing with setattr.