Nested method in a class in Python

I'm using a software package for optimization called CVX. It has these "atoms" which take a CVX expression and construct a new CVX expression. One example is the trace atom to compute the trace of a matrix.
I thought I would need code like the following to create a CVX variable (an n by n matrix) and compute its trace:
X = cvxpy.Variable((n,n))
tr = cvxpy.atoms.affine.trace.trace(X)
This does work, but the following also works:
X = cvxpy.Variable((n,n))
tr = cvxpy.trace(X)
Why does the second option work? In general, when there is a class with nested methods, how am I able to call an inner method directly in Python?

I wouldn't generalize this behavior; it is almost certainly done by design. Most likely the library developers didn't want users to have to spell out the whole hierarchy, so they gave you this shortcut. It's also possible (even likely) that .trace called directly on the cvxpy object operates on that object itself, whereas the deeper one acts on the cvxpy.atoms.affine.trace object.
Be very careful because the side effects may not be the same.
To answer your question very directly: the second option works because somebody thought to make their API easier to use, or it just happens to work the way you're expecting.
To your second question: nested methods aren't really what is going on here. cvxpy is a package whose atoms attribute is a subpackage, which in turn contains an affine subpackage with a trace module that defines the trace atom; the top-level cvxpy namespace simply exposes the same name, presumably by re-exporting it in its __init__.py.
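For what it's worth, this kind of shortcut is usually just a re-export. Here is a minimal sketch of the mechanism with made-up package and file names; only the final check uses names from the question.

# Hypothetical layout, not cvxpy's actual source:
#
#   mypkg/atoms/affine/trace.py   defines:   def trace(x): ...
#   mypkg/__init__.py             contains:  from mypkg.atoms.affine.trace import trace
#
# After that, both spellings name the same object:
#   import mypkg
#   mypkg.trace is mypkg.atoms.affine.trace.trace   # True

# You can check whether cvxpy does the same thing:
import cvxpy
print(cvxpy.trace is cvxpy.atoms.affine.trace.trace)  # True would confirm a plain re-export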

Related

How to track the "calling chain" from numpy to C implementation?

I have read the Numpy tutorial and API guide, and from that helpful documentation I learned how to extend Numpy with my own C code and how to call Numpy functions from C.
However, what I really want to know is: how can I track the calling chain from Python code down to the C implementation? In other words, how can I know which part of the C implementation corresponds to this simple numpy array addition?
x = np.array([1, 2, 3])
y = np.array([1, 2, 3])
print(x + y)
Can I use some tools like gdb to track its stack frame step by step?
Or can I recognize the corresponding code directly from a naming convention? (For example, if I want to find the code for addition, could I search for something like a function PyNumpyArrayAdd(...)?)
[EDIT] I found a very useful video on how to locate the C implementation of basic C-implemented functions and operator overloads like "+" and "-":
https://www.youtube.com/watch?v=mTWpBf1zewc
I got this from Andras Deak via the Numpy mailing list.
[EDIT2] There is another way to track all the functions called in Numpy, using gdb. It is very heavy-handed, because it displays every Numpy function that is called, including the trivial ones, and it can take some time.
First you need to download/clone the Numpy repository to your own workspace and compile it with the -g option, which attaches debug information for debugging.
Then open a terminal in the "path/to/numpy-main" directory where Numpy's setup.py lives, and run gdb.
If you want to know what functions in Numpy's C implementation are called in this single python statement:
y = np.exp(x)
you can set breakpoints on all the functions implemented by Numpy using this gdb python script provided by the first answer here:
Can gdb set break at every function inside a directory?
Once you load that Python script with source somename.py, you can run this command in gdb: rbreak-dir numpy/core/src
And you can set commands for each breakpoint:
commands 1-5004
> silent
> bt 1
> c
> end
(here 1-5004 is the range of the breakpoints that you want to run commands on)
Once a breakpoint is hit, this command runs, prints the first frame of the backtrace (i.e., the current function you are in), and then continues. In this way, you can track all the functions in Numpy that get called.
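A lighter variation, if you only care about one entry point rather than every function under numpy/core/src, is to let gdb's Python API print a single backtrace frame and continue. This is only a sketch: it assumes a debug build of Numpy, and the symbol name below is a guess that depends on your Numpy version.

# break_numpy.py -- load inside gdb with:  source break_numpy.py
import gdb

class TraceBreakpoint(gdb.Breakpoint):
    def stop(self):
        gdb.execute("bt 1")   # print only the innermost frame
        return False          # do not stop; let the program keep running

# One plausible entry point for array arithmetic; adjust to your Numpy version.
TraceBreakpoint("PyArray_GenericBinaryFunction")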
I hope these notes help future readers.
However, what I really want to know is: how can I track the calling chain from Python code down to the C implementation? In other words, how can I know which part of the C implementation corresponds to this simple numpy array addition?
AFAIK, there are two main ways to do that: using a debugger, or tracking the function down in the code (typically by looking at the wrapping part or by searching for keywords in numpy/core/src/XXX/). Numpy has different kinds of functions. Some focus on the CPython interaction part (e.g. type checking, array creation, generic iterators, etc.) and some focus on the computing part (doing the computation efficiently). Depending on what you want, different files need to be inspected. core/src/umath/loops.c.src is the place to look for the core computing functions that do basic independent math operations.
Can I use some tools like gdb to track its stack frame step by step?
Using a debugger is the common way to do this unless you are already familiar with the Numpy code base. You can try to find the Numpy entry-point function by reading the wrapper code, but I think that is a bit difficult because this part of the code is not very readable (many related parts are generated, certainly to ease development and avoid mistakes). The hard part with GDB is finding the first entry point into Numpy: the CPython interpreter function calls are hard to track because there are many of them (sometimes called recursively), the call stack is quite big, and it is far from clear (i.e. there is no obvious information about the actual statement/expression being executed). That being said, AFAIR, the entry point is often something like PyArray_XXX or array_XXX. You can also break on the first function that executes code inside the Numpy library.
Or can I recognize the corresponding code directly from a naming convention?
Some functions have a standardized name, typically PyArray_XXX. That being said, the core computing functions generally do not: their names are generated by a template system that parses comments and annotations and generates code from them. For adding two arrays, the main computing function should be something like #TYPE#_add#isa#, where #TYPE# is for example INT or LONG depending on your target platform. There is a special version (i.e. a specialization) for floating-point numbers that uses an optimized pair-wise summation for the sake of accuracy. This naming convention is quite common, though, so you can search for _add in the code, or for a begin repeat section with add as the kind parameter.
Related post: Numpy argmax source

Why are classes mostly instantiated through functions?

I've been using python for scientific purposes for some years now. I recently became more familiar with class writing, but I feel like I'm missing something regarding the standard way to instantiate classes.
Say I define a class MyClass.
class MyClass:
    def __init__(self):
        pass
Then I know that I can map x to an instance of MyClass simply with
x = MyClass()
This works well and exactly as I expect.
However, it seems to me that when I use code from standard libraries or from numpy or scipy, I don't create objects in the same way: as far as I know, I generally don't use the name of a class to instantiate it. From what I understand, I'd say that this implies that I use neither class methods nor the default constructor of a class, but rather other functions which are defined outside the class.
For example, numpy's random module uses a class Generator to generate random numbers. However, numpy explicitly recommends not to use the class constructor to get a Generator instance, and to use instead the default_rng function from the random module. So if I want to generate random numbers, I use
rng = numpy.random.default_rng()
to create a Generator instance. This is done without using explicitly the name of the class.
It seems to me that most of the code that I use is written in the latter way. Why is that so? Is it somehow considered bad practice to directly call default class constructors? Is it considered better practice to have separate functions in a module to create class instances? Is it only because some preprocessing must usually be done before creating an instance of a class? (I guess not, because in that case, why not do that in the initialization of the class?)
No, it is not bad practice to use the normal constructor, but sometimes it can be useful to have an alternative constructor.
Reasons for using a function as an alternative constructor to create an object:
(not a complete list and not in any order)
Decouple the creation of an object from its implementation.
Decoupling is often aimed for in OOP.
Hide complexity
The constructor could have many parameters, but often a default object is needed.
Easier to read/write and understand
numpy.random.default_rng() vs numpy.random.Generator(numpy.random.PCG64())
A factory that creates and returns a (different) object based on sometimes complex conditions.
e.g. Python's open() returns different objects for text files and for binary files (a quick check is sketched below).
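A quick check of that open() behaviour (run it as a script so __file__ exists):

# open() returns different wrapper types depending on the mode:
with open(__file__, "r") as f_text, open(__file__, "rb") as f_bin:
    print(type(f_text))  # <class '_io.TextIOWrapper'>
    print(type(f_bin))   # <class '_io.BufferedReader'>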
Where to implement these?
In some other languages, these would be implemented as class methods of the class they instantiate, or even of a new class.
This could be done in Python, too, but it is often shorter and more convenient to use if they are implemented as functions at module level.
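For instance, here is what the two placements can look like side by side. All names are invented for illustration and only loosely echo numpy's:

class Generator:
    def __init__(self, bit_generator):
        self.bit_generator = bit_generator

    @classmethod
    def default(cls):
        # alternative constructor as a classmethod, the style common in some other languages
        return cls(bit_generator="PCG64")

def default_rng():
    # alternative constructor as a module-level function,
    # the style numpy.random.default_rng() follows
    return Generator(bit_generator="PCG64")

rng_a = Generator.default()
rng_b = default_rng()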
I think calling np.array to create an np.ndarray is probably one of the most common cases of an object being created by calling another function. Here is an explanation of that:
What is the difference between ndarray and array in numpy?
I cannot answer for all cases in which we use a function to "wrap" the construction of an object, but I have used such functions to simplify object creation in many situations, and doing so results in cleaner code. I can speak to those situations.
For example, the underlying class definition may expose a lot of parameters. It may not make sense to ask the user to provide parameter values for all parameters of the class in 99.9% of cases (say). These "spurious" parameters may be fixed, or may be inferred from other parameter values in most such situations (e.g., parameter b is 2x parameter a in most cases). The code becomes unwieldy if, in those 99.9% of cases, you must explicitly provide values for such parameters, so a wrapper function is written to make it cleaner.
It is possible to use default parameters to deal with many such situations, but it may not make sense to push the inference of parameter values into the class's __init__ function itself. For example, while something like b = 2 * a if b is None else b seems reasonable to put in __init__, where a and b are parameters, it may not be so simple in practice (e.g., b may have a complex relationship with a, c, d, f, etc., or it may be a class object itself), or there may be 1000 such parameter inferences to be made. So it is logical to separate such "glue" code (which is a customization for ease of usage) into another function and keep the base code (which implements a specific functionality) clean and to the point.
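A contrived sketch of that kind of glue function (all names here are hypothetical):

class Model:
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

def make_model(a, c=1.0):
    # infer the rarely-customized parameter instead of pushing the rule into __init__
    b = 2 * a
    return Model(a, b, c)

m = make_model(3.0)  # most callers never have to think about b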
Do we want to write another class wrapper instead of a function wrapper? In this case, the new class wrapper would present a simplified interface. But writing a class wrapper in this situation is unnecessary, since a class implies many things, while a function implies just procedural execution.
Note that this happens mostly with library-type code, which has the largest number of use cases and where you want usage to be as easy as possible for most people. Such issues do not exist for most "user" code, where we simply write classes for a specific application. So in practice, when we write applications, we should create instances directly via constructors when possible.
There is also the popular Factory design pattern, which @ekhumoro referenced above and which is very similar to this. But based on the textbook definition, the Factory pattern seems to be restricted to super/sub classes (I could be wrong, and this might be useless semantics).

Defining tensorflow operations in python with attributes

I am trying to register a python function and its gradient as a tensorflow operation.
I found many useful examples e.g.:
Write Custom Python-Based Gradient Function for an Operation? (without C++ Implementation)
https://programtalk.com/python-examples/tensorflow.python.framework.function.Defun/
Nonetheless I would like to register attributes in the operation and use these attributes in the gradient definition by calling op.get_attr('attr_name').
Is this possible without going down to the C implementation?
Could you give me an example?
Unfortunately I don't believe it is possible to add attributes without using a C++ implementation of the operation. One feature that may help, though, is that you can define 'private' attributes by prepending an underscore to the name. I'm not sure whether this is well documented or what the long-term guarantees are, but you can try setting '_my_attr_name' and you should be able to retrieve it later.

Call the same method in all objects in Python?

Long story short, I need to find a shortcut to calling the same method in multiple objects which were made from multiple classes.
Now, all of the classes have the same parent class, and even though the method differs between the different classes, I figured that having the methods share the same name would work. So I thought I might just be able to do something like this:
for object in listOfObjects:
    object.method()
It hasn't worked. It might very well be a misspelling by me, but I can't find it. I think I could solve it by making a list that only adds the objects I need, but that would require a lot of coding, including changing other classes.
~~ skip to last paragraph for pseudo code accurately describing what I need~~
At this point, I will go into more detail about what specifically I am doing. I hope it better illustrates the scope of my question and makes the answer more broadly applicable. The more general form of the question is above, but this might help in answering it. Please be aware that once I get an answer I will change the question to more closely represent what I need done, so that it can apply to a wide variety of problems.
I am working on a gravity simulator. Whereas most simulators make objects which interact with one another and represent full bodies where their center of gravity is the actual attraction point, I am attempting to write a program which will simulate the distribution of gravity across all given points within an object.
As such, each object (not in programming terms, in literal terms) is made up of a bunch of tiny objects (both literally and figuratively). Essentially, what I am trying to do is call the object.gravity() method, which takes into account all of the gravity from all other objects in the simulation and then moves the position of this particular object based on that input.
Now, either due to a syntactical bug (which I kinda doubt) or due to Python's limitations, I am unable to get all of the particles to behave properly all at once. The code snippet I posted before doesn't seem to be working.
tl;dr:
As such, I am wondering if there is a way (save adding all objects to a list and then iterating through it) to simply call the .gravity() method on every object that has the method. Basically, even though this is sort of in list format, this is what I want to do:
for ALL_OBJECTS:
    if OBJECT has .gravity():
        OBJECT.gravity()
You want the hasattr() function here:
for obj in all_objects:
    if hasattr(obj, 'gravity'):
        obj.gravity()
or, if the gravity method is defined by a specific parent class, you can test for that too:
for obj in all_objects:
    if isinstance(obj, Planet):
        obj.gravity()
You can also do it this way, which is arguably the more Pythonic way to do it:
for obj in all_objects:
    try:
        obj.gravity()
    except AttributeError:
        pass
Use getattr with its default argument set to lambda: None:
for obj in all_objects:
    getattr(obj, 'gravity', lambda: None)()

Which is "better" practice? Passing object references or object method references in Python

I'm writing a small piece of code in Python and am curious what other people think of this.
I have a few classes, each with a few methods, and am trying to determine what is "better": to pass objects through method calls, or to pass methods through method calls when only one method from an object is needed. Basically, should I do this:
def do_something(self, x, y, manipulator):
    self.my_value = manipulator.process(x, y)
or this
def do_the_same_thing_but_differently(self, x, y, manipulation):
    self.my_value = manipulation(x, y)
The way I see it, the second one is arguably "better" because it promotes even looser coupling/stronger cohesion between the manipulation and the other class. I'm curious to see some arguments for and against this approach for cases when only a single method is needed from an object.
EDIT: I removed the OOP wording because it was clearly upsetting. I was mostly referring to loose coupling and high cohesion.
The second solution may provide looser coupling because it is more "functional", not more "OOP". The first solution has the advantage that it works in languages like C++ which don't have closures (though one can get a similar effect using templates and pointer-to-member functions); but in a language like Python, IMHO the second alternative seems more "natural".
EDIT: you will find a very nice discussion of "functional vs. object-oriented" techniques in the free book "Higher-Order Perl", available here:
http://hop.perl.plover.com/
(look at chapter 1, part 6). Though it is a Perl (and not a Python) book, the discussion there fits the question asked here exactly, and the functional techniques described there can be applied to Python in a similar way.
I would say the second approach, because it definitely looks like a callback. Callbacks are heavily used with the Hollywood principle ("don't call us, we will call you"), a paradigm that assists in developing code with high cohesion and low coupling [Ref 2].
I would definitely go with the second approach.
Also consider that you could change the interface of whatever Manipulator class so that process is instead spelled __call__, and then it will work transparently with the second approach.
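A small sketch of that idea (Manipulator and the method name are taken from the question; the rest is made up):

class Manipulator:
    def __call__(self, x, y):      # was: def process(self, x, y)
        return x + y

class Thing:
    def do_the_same_thing_but_differently(self, x, y, manipulation):
        self.my_value = manipulation(x, y)

thing = Thing()
thing.do_the_same_thing_but_differently(2, 3, Manipulator())       # the instance itself is now callable
thing.do_the_same_thing_but_differently(2, 3, lambda a, b: a * b)  # any other callable works too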
