Long story short, I need a shortcut for calling the same method on multiple objects that were made from multiple classes.
Now, all of the classes have the same parent class, and even though the method differs between the different classes, I figured that the methods having the same name would make it work. So I thought I might just be able to do something like this:
for object in listOfObjects:
    object.method()
It hasn't worked. It might very well be a misspelling on my part, but I can't find it. I think I could solve it by building a list that contains only the objects I need, but that would require a lot of coding, including changing other classes.
~~Skip to the last paragraph for pseudocode accurately describing what I need.~~
At this point I will go into more detail about what specifically I am doing. I hope it will better illustrate the scope of my question, and that the answer will be more broadly applicable. The more general form of the question is above, but this might help in answering it. Please be aware that once I get an answer I will edit the question to more closely represent what I need done, so that it can apply to a wide variety of problems.
I am working on a gravity simulator. Whereas most simulators make objects which interact with one another and represent full bodies where their center of gravity is the actual attraction point, I am attempting to write a program which will simulate the distribution of gravity across all given points within an object.
As such, each object (not in programming terms, in literal terms) is made up of a bunch of tiny objects (both literally and figuratively). Essentially, what I am trying to do is call each object's .gravity() method, which takes into account the gravity from all other objects in the simulation and then moves this particular object's position based on that input.
Now, either due to a syntactical bug (which I kinda doubt) or due to Python's limitations, I am unable to get all of the particles to behave properly all at once. The code snippet I posted before doesn't seem to be working.
tl;dr:
As such, I am wondering if there is a way (short of adding all objects to a list and then iterating through it) to simply call the .gravity() method on every object that has it. Basically, even though this is only rough pseudocode, this is what I want to do:
for OBJECT in ALL_OBJECTS:
    if OBJECT has .gravity():
        OBJECT.gravity()
You want the hasattr() function here:
for obj in all_objects:
    if hasattr(obj, 'gravity'):
        obj.gravity()
or, if the gravity method is defined by a specific parent class, you can test for that too:
for obj in all_objects:
    if isinstance(obj, Planet):
        obj.gravity()
You can also do it this way, which is arguably the more Pythonic approach:
for obj in all_objects:
    try:
        obj.gravity()
    except AttributeError:
        pass
You can also use getattr, setting its default argument to lambda: None:
for obj in all_objects:
    getattr(obj, 'gravity', lambda: None)()
Related
I'm encountering some behaviour which I can't explain.
I have some expensive functions that get called repeatedly; I have decorated them with @lru_cache(None) to help speed things up. My run times were still quite slow after doing that, so I was a little confused.
I then realised that some of these functions had custom objects as parameters. My understanding is that, by default, the hash for any custom object is based on its id. So my theory was that some of these expensive functions were being re-evaluated despite these arguments containing identical data. My objects are only used to group immutable data, so I'm comfortable with looking up the cached value where the data within those objects is the same.
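(As a quick illustration of that default behaviour, Point here is just a made-up example, not one of my classes:)

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

a = Point(1, 2)
b = Point(1, 2)
# With no __hash__ or __eq__ defined, hashing and equality are identity-based,
# so two instances holding identical data look completely unrelated.
print(hash(a) == hash(b))  # almost certainly False
print(a == b)              # False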
So based on my understanding of the lru_cache function, I added a __hash__ method to my objects, just doing something very crude for starters:
def __hash__(self):
    return hash(str(self.__dict__))
So my theory is that my program should now be much quicker, as the caching will now take place on some of these expensive functions where it wasn't before.
To my dismay, my program is vastly slower; possibly it's getting stuck somewhere as I have not even had the patience to let it finish. For context, without the custom __hash__ methods a test case ran in about 16s; after adding the __hash__ methods the same test case was still running after about 10 minutes.
I don't have a deep understanding of how lru_cache works, but I have had a look at the source code and as far as I can tell it will just use my __hash__ function when it encounters those objects as parameters. Based on the drastic increase in run time, my current theory is that this is somehow causing the program to get stuck somewhere, rather than the cache lookups actually taking that long for some reason. But I can't see any reason why that would happen.
This feels like a bit of a wild goose chase to me but I can't imagine I'm the first person to try this. Does anybody know why this might happen?
Thanks
Edit:
I ran an even smaller test case to check if the program is actually terminating; it is. The smaller test case took 2.5s to run without the custom __hash__ functions, and 40s with them.
I have to stress that nothing else is changing between these two runs. The only difference is that I am adding the __hash__ function described above to three classes which take a journey around my code. Therefore I think the only possible conclusion is that my __hash__ function is somehow hugely slower than the default that would otherwise be used by lru_cache. That is, unless implementing a custom __hash__ function has other (invisible) costs that I'm not aware of.
I'm still at a loss to explain this. These are quite large objects which contain a lot of data, so str(self.__dict__) will be a pretty long string (probably thousands of characters). However I don't believe that hashing should take appreciably longer for a longer string. Perhaps Python does huge amounts of hashing in the background in various places and this small difference can add up? It seems far-fetched to me but there don't seem to be many options - the only alternative I can see is some weird interaction with the lru_cache logic which leads to a big slow-down. I'll keep doing experiments but hopefully someone will know the answer!
Edit 2:
I followed Samwise's suggestion and benchmarked this __hash__ function and it does seem to be genuinely a lot slower, and given the number of calls I can believe that this is the entire reason for my issue. I'm guessing that the self.__dict__ part is the bottleneck but my intuition about this doesn't have the best track-record so far.
That still leaves me with the problem of trying to speed up my code, but at least I know what's going on now.
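(For anyone who wants to reproduce this kind of measurement, here is a rough timeit sketch; the Config class and its contents are made up rather than my real objects:)

import timeit

class Config:
    def __init__(self):
        # Stand-in for an object carrying a lot of data.
        self.data = {i: str(i) * 10 for i in range(1000)}

    def __hash__(self):
        return hash(str(self.__dict__))

c = Config()
# Compare the dict-stringifying hash against hashing a plain int.
print(timeit.timeit(lambda: hash(c), number=1_000))
print(timeit.timeit(lambda: hash(42), number=1_000))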
Edit 3:
For anyone else who encounters this problem in the future - I decided to just pre-compute a hash value in my initialiser for my objects and return that in my __hash__ function, and that has sped things up massively. This solution does depend on the object not being mutated after creation.
The answer to this question ended up being quite simple - str(self.__dict__) is actually a pretty slow thing to run on every function call. I'm not sure why I didn't think of that in the first place.
Ultimately, what I decided to do was add an attribute to my classes, _hash, and set it to hash(str(self.__dict__)) at the end of initialising a new object. Then my custom __hash__ method just returns the value of _hash, so lru_cache works for functions taking these objects as arguments without having to evaluate str(self.__dict__) on every call.
I should make it clear that this only works under the assumption that the object has its entire state defined at initialisation and doesn't get mutated over its lifetime - if it does, then the hash will go out of date and you'll end up getting hits from the cache that aren't appropriate.
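A minimal sketch of what this looks like; Record and its fields are made-up names, and it assumes the object is never mutated after __init__:

class Record:
    def __init__(self, a, b):
        self.a = a
        self.b = b
        # Precompute once; safe only because instances are never mutated later.
        self._hash = hash(str(self.__dict__))

    def __hash__(self):
        return self._hash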
The title of my question is really just an entrance to my real question:
If even the attributes and methods making up an object are themselves objects, and each of those is actually a dictionary under the hood, then I can't help but picture an infinite chain of dictionaries holding other dictionaries as values, since, I believe, there is no object in Python that has zero attributes and methods.
The only explanation I can think of is that built-in types and functions, which are the building blocks of all other objects and which are supposedly implemented as dictionaries too, have primitives and traditional C functions as ultimate values to end the chain. But that's only a speculation.
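A quick way to poke at this in a CPython session seems to support that guess:

n = 1
print(hasattr(n, '__dict__'))   # False: int instances carry no per-instance dict
print(type(int.__dict__))       # <class 'mappingproxy'>, a read-only mapping
print(int.__dict__['__add__'])  # a C-level slot wrapper, not another dictionary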
Does anyone have an explanation?
I need several very similar plotting functions in python that share many arguments, but differ in some and of course also differ slightly in what they do. This is what I came up with so far:
1. Obviously, just defining them one after the other and copying the code they share is a possibility, though not a very good one, I reckon.
2. One could also move the "shared" part of the code into helper functions and call these from inside the different plotting functions. This would make it tedious, though, to later add features that all functions should have.
3. And finally, I've also thought of implementing one "big" function, making possibly unneeded arguments optional and then deciding what to do in the function body based on additional arguments. This, I believe, would make it difficult to find out what really happens in a specific case, as one would face a forest of arguments.
I can rule out the first option, but I'm hard pressed to decide between the second and third. So I started wondering: is there another, maybe object-oriented, way? And if not, how does one decide between option two and three?
I hope this question is not too general and I guess it is not really python-specific, but since I am rather new to programming (I've never done OOP) and first thought about this now, I guess I will add the python tag.
EDIT:
As pointed out by many, this question is quite general and it was intended to be so, but I understand that this makes answering it rather difficult. So here's some info on the problem that caused me to ask:
I need to plot simulation data, so all the plotting problems have simulation parameters in common (location of files, physical parameters,...). I also want the figure design to be the same. But depending on the quantity, some plots will be 1D, some 2D, some should contain more than one figure, sometimes I need to normalize the data or take a logarithm before plotting it. The output format might also vary.
I hope this helps a bit.
How about something like this? You can create a Base class with a method foo that holds all the shared code. Then your different classes can inherit from Base, call super() in the method of interest, and extend the implementation with whatever extra functionality you need.
Here is an example of how it works. Note the difference I pointed out between how to use super in Python 2 and Python 3.
class Base:
    def foo(self, *args, **kwargs):
        print("foo stuff from Base")
        return "return something here"

class SomeClass(Base):
    def foo(self, *args, **kwargs):
        # Python 2:
        # x = super(SomeClass, self).foo(*args, **kwargs)
        # Python 3:
        x = super().foo(*args, **kwargs)
        print(x)
        print("SomeClass extension of foo")

s = SomeClass()
s.foo()
Output:
foo stuff from Base
return something here
SomeClass extension of foo
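Applied to the plotting problem in the question, the same super() pattern might look roughly like this; the class names and the matplotlib calls are just an illustrative sketch, not a full design:

import matplotlib.pyplot as plt

class BasePlot:
    def __init__(self, title):
        self.title = title

    def plot(self, data):
        # Shared setup lives in the base class: figure creation and common styling.
        fig, ax = plt.subplots()
        ax.set_title(self.title)
        return fig, ax

class LinePlot(BasePlot):
    def plot(self, data):
        fig, ax = super().plot(data)  # reuse the shared setup
        ax.plot(data)                 # 1D-specific drawing
        return fig, ax

class ImagePlot(BasePlot):
    def plot(self, data):
        fig, ax = super().plot(data)
        ax.imshow(data)               # 2D-specific drawing
        return fig, ax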
More information needs to be given to fully understand the context. But, in a general sense, I'd do a mix of all of them. Use helper functions for the "shared" parts, and use conditional statements too. Honestly, a lot of it comes down to what is easiest for you to do.
What is the most pythonic way to add several identical methods to several classes?
I could use a class decorator, but that seems to bring in a fair bit of complication, and it's harder to write and read than the other methods.
I could make a base class with all the methods and let the other classes inherit, but then for some of the classes I would be very tempted to allow multiple inheritance, which I have read frequently is to be avoided or minimized. Also, the "is-a" relationship does not apply.
I could also change them from being methods to make them stand-alone functions which just expect their values to supply the appropriate properties through duck-typing. This is in some ways clean, but it is less object oriented and makes it less clear when a function could be used on that type of object.
I could use delegation, but this requires all of the classes that want to call to have methods calling up to the helper classes methods. This would make the code base much longer than the other options and require adding a method to delegate every time I want to add a new function to the helper class.
I know giving one class an instance of the other as an attribute works nicely in some cases, but it does not always work cleanly and can make calls more complicated than they would be otherwise.
After playing around with it a bit, I am leaning towards inheritance even when it leads to multiple inheritance. But I hesitate due to numerous texts warning very strongly against ever allowing multiple inheritance and some (such as the Wikipedia entry) going so far as to say that inheritance just for code reuse such as this should be minimized.
This may be more clear with an example, so for a simplified example say we are dealing with numerous distinct classes which all have a location on an x, y grid. There are a lot of operations we might want to make methods of everything with an x, y location, such as a method to get the distance between two such entities, or the relative direction, midpoint between them, etc.
What would be the most pythonic way to give all such classes access to these methods that rely only on having x and y as attributes?
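For concreteness, the mixin version of the inheritance option I'm leaning towards might look something like this (the names are just for illustration):

import math

class XYMixin:
    """Assumes the concrete class provides x and y attributes."""
    def distance_to(self, other):
        return math.hypot(self.x - other.x, self.y - other.y)

    def midpoint(self, other):
        return ((self.x + other.x) / 2, (self.y + other.y) / 2)

class Ship(XYMixin):
    def __init__(self, x, y):
        self.x, self.y = x, y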
For your specific example, I would try to take advantage of duck-typing. Write plain simple functions that take objects which are assumed to have x and y attributes:
import math

def distance(a, b):
    """
    Returns the distance between `a` and `b`.
    `a` and `b` should have `x` and `y` attributes.
    """
    return math.sqrt((a.x - b.x)**2 + (a.y - b.y)**2)
Very simple. To make it clear how the function can be used, just document it.
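For example, any two objects exposing numeric x and y attributes will do; here a namedtuple stands in for your classes:

from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
print(distance(Point(0, 0), Point(3, 4)))  # -> 5.0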
Plain old functions are best for this problem. E.g. instead of this ...
class BaseGeoObject:
    def distanceFromZeroZero(self):
        return math.sqrt(self.x()**2 + self.y()**2)
    ...
... just have functions like this one:
import math

def distanceFromZeroZero(point):
    return math.sqrt(point.x()**2 + point.y()**2)
This is a good solution because it's also easy to test - it's not necessary to subclass just to exercise a specific function.
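For instance, a test can exercise the function with a throwaway stub instead of a real subclass (FakePoint here is hypothetical):

class FakePoint:
    def __init__(self, x, y):
        self._x, self._y = x, y
    def x(self):
        return self._x
    def y(self):
        return self._y

assert distanceFromZeroZero(FakePoint(3, 4)) == 5.0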
I'm writing a small piece of code in Python and am curious what other people think of this.
I have a few classes, each with a few methods, and am trying to determine what is "better": to pass objects through method calls, or to pass methods through method calls when only one method from an object is needed. Basically, should I do this:
def do_something(self, x, y, manipulator):
    self.my_value = manipulator.process(x, y)
or this
def do_the_same_thing_but_differently(self, x, y, manipulation):
    self.my_value = manipulation(x, y)
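To make the comparison concrete, here is a tiny runnable sketch of the second form; Doubler and Thing are made-up names, and note that a bound method can be passed straight in:

class Doubler:
    def process(self, x, y):
        return (x + y) * 2

class Thing:
    def do_the_same_thing_but_differently(self, x, y, manipulation):
        self.my_value = manipulation(x, y)

t = Thing()
t.do_the_same_thing_but_differently(1, 2, Doubler().process)  # pass a bound method
print(t.my_value)  # 6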
The way I see it, the second one is arguably "better" because it promotes even looser coupling/stronger cohesion between the manipulation and the other class. I'm curious to see some arguments for and against this approach for cases when only a single method is needed from an object.
EDIT: I removed the OOP wording because it was clearly upsetting. I was mostly referring to loose coupling and high cohesion.
The second solution may provide looser coupling because it is more "functional", not more "OOP". The first solution has the advantage that it works in languages like C++ which don't have closures (though one can get a similar effect using templates and pointer-to-member-functions); but in a language like Python, IMHO the 2nd alternative seems to be more "natural".
EDIT: you will find a very nice discussion of "functional vs. object oriented" techniques in the free book "Higher order Perl", available here:
http://hop.perl.plover.com/
(look into chapter 1, part 6). Though it is a Perl (and not a Python) book, the discussion there fits exactly to the question asked here, and the functional techniques described there can be applied to Python in a similar way.
I would say the second approach, because it definitely looks like a callback. Callbacks are heavily used with the Hollywood principle ("don't call us, we'll call you"), a paradigm that helps produce code with high cohesion and low coupling [Ref 2].
I would definitely go with the second approach.
Also consider that you could change the interface of whatever Manipulator class so that process is instead spelled __call__, and then it will work transparently with the second approach.
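A tiny sketch of that __call__ idea, with a made-up Manipulator and a free function standing in for the method in the question:

class Manipulator:
    def __call__(self, x, y):
        # formerly spelled `process`
        return x * y

def do_the_same_thing_but_differently(x, y, manipulation):
    return manipulation(x, y)

m = Manipulator()
print(do_the_same_thing_but_differently(3, 4, m))  # the instance itself is callable -> 12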