I have read the tutorial and API guide of Numpy, and I learned how to extend Numpy with my own C code or how to use C to call Numpy function from this helpful documentation.
However, what I really want to know is: how could I track the calling chain from python code to C implementation? Or i.e. how could I know which part of its C implementation corresponds to this simple numpy array addition?
x = np.array([1, 2, 3])
y = np.array([1, 2, 3])
print(x + y)
Can I use some tools like gdb to track its stack frame step by step?
Or can I directly recognize the corresponding codes from variable naming policy? (like if I want to know the code about addition, I can search for something like function PyNumpyArrayAdd(...) )
[EDIT] I found a very useful video about how to point out the C implementation of these basic C-implemented function or operator overrides like "+" "-".
https://www.youtube.com/watch?v=mTWpBf1zewc
Got this from Andras Deak via Numpy mailing-list.
[EDIT2] There is another way to track all the functions called in Numpy using gdb. It's very heavy because it will display all the functions in Numpy that are called, including these trivial ones. And it might take some time.
First you need to download/clone the Numpy repository to your own working space and then compile it with -g option, which will attach debug informations for debugging.
Then you open a terminal in the "path/to/numpy-main" directory where the setup.py of Numpy lies, and then run gdb.
If you want to know what functions in Numpy's C implementation are called in this single python statement:
y = np.exp(x)
you can set breakpoints on all the functions implemented by Numpy using this gdb python script provided by the first answer here:
Can gdb set break at every function inside a directory?
Once you load this python script by source somename.py, you can run this command in gdb: rbreak-dir numpy/core/src
And you can set commands for each breakpoint:
commands 1-5004
> silent
> bt 1
> c
> end
(here 1-5004 is the range of the breakpoints that you want to run commands on)
Once a breakpoint is activated, this command will run and print the first layer of backtrace (which is the info of the current function you are in) and then continue. In this way, you can track all the functions in Numpy, and this is a pic from my own working environment (I took a snapshot since there are rules preventing copying any byte from working computer):
Hope my trials can help the future comers.
However, what I really want to know is: how could I track the calling chain from python code to C implementation? Or i.e. how could I know which part of its C implementation corresponds to this simple numpy array addition?
AFAIK, there is two main way to do that: using a debugger or by tracking the function in the code (typically by looking the wrapping part or by searching keywords in numpy/core/src/XXX/). Numpy has different kind of functions. Some are focusing more on the CPython interaction part (eg. type checking, array creation, generic iterators, etc.) and some are focusing on the computing part (doing the computation efficiently). Regarding what you want, different files needs to be inspected. core/src/umath/loops.c.src is the way to go for core computing functions doing basic independent math operations.
Can I use some tools like gdb to track its stack frame step by step?
Using a debugger is the common way to do unless you are familiar with the code of Numpy. You can try to find the Numpy entry point function by looking the wrapper code but I think it is a bit difficult as this part of the code is not very readable (many related parts are generated certainly to ease the development of avoid mistakes). The hard part with GDB is to find the first entry point of the function in Numpy (the CPython interpreter function calls are hard to track as they are many of them (sometime called recursively) and the call stack is quite big far from being clear (ie. there is no clear information about the actual statement/expression being executed). That being said, AFAIR, the entry point is often something like PyArray_XXX or array_XXX. You can also track the first function executing code of the Numpy library.
Or can I directly recognize the corresponding codes from variable naming policy?
Some functions have a standardized name like typically PyArray_XXX. That being said, core computing function generally does not. They have a name generated by a template system that parse comments and annotations and generate code based on that. For adding two array, the main computing function should be for example #TYPE#_add#isa# where #TYPE# is either INT or LONG regarding your target platform. There is a special version (ie. specialization) for floating-point numbers that makes use of an optimized pair-wise summation so sake of accuracy. This kind of naming convention is quite frequent though so you can search _add in the code or a begin repeat section with add as a kind parameter.
Related post: Numpy argmax source
Context: I am evaluating libraries for stereo correspondence. They almost universally fail to work at all until you get a handful of algorithm-dependent parameters set correctly.
Is there any sort of well-generalized tool to make the process of manually tuning tens of parameters to a badly documented C++ function until it works less painful?
I am looking for something like a combination of SWIG and the dynamic-reconfigure infrastructure from ROS, where you point it at a pure C++ function, and it generates a simple gui with sliders, check-boxes, etc... for the values of the inputs, and calls the function over-and-over so you can tune the parameters interactively.
It sounds like ROS's dynamic_reconfigure with the rqt_reconfigure GUI might be close to what you're looking for. Once you specify the parameters you want to change, the GUI will generate sliders/toggles/fields/etc. to change the parameters on the fly:
You still need to explicitly add the mapping from a ROS param to the algorithm's parameter (and update the algorithm in the dynamic_reconfigure callback), but having your parameters stored in the ROS parameter server can be beneficial in the long run:
parameters can be under version control very easily (stored as a yaml file).
you can save all parameters once you find a good solution (rosparam dump)
you can have different 'versions' of parameters for different applications.
other nodes can read the parameters if necessary
I am a meteorologist, and lately I am trying to investigate the possibility of building my one sondes.
In order to do that, I have the following work plan :
I would like to generate 3D models pyformex. An alternative is openSCAD. But I start with pyformex - to generate simple cylindrical sonde shapes with associated extra features, e.g. intake tube or such.
Next, I will like to split it in Meshes, using PyDistMesh; as well as prepare a raytraced point cloud model with Xrt.
In the third step, I would like to perform the CFD works.
Now, my questions :
Are there some other simple Python Libraries to generate 3D models? I would like a very simple system, where i can issue commands like p = Parallelogram (length, height, width), or p.position(x,y,z) etc. It would be nice to have built in mouse interaction - that is, a built in drawing component, which I can use to display the model, and rotate/ zoom/pan with mouse.
Any other mesh generation tools?
For this step, I would need a multiphysics system. I tried to use OpenFOAM, it is too huge (to hack through). I have taken a look at SU2, but it seems to focus more on aerospace engineering, than Fluid Dynamics (I would like to simulate the flight of the sonde - which is closer to aerospace engineering, as well as the state of the atmosphere). Fluidity seems to suit my needs better, but I dont find a python fork thereof. So are there some general purpose, not too bloated up, multiphysics python library for geophysical and general hydrodynamic simulations? I have taken a look a MOOSE, also dont find a python binding for it.
Scientific visualization : Are there some 3 or 4 (or may be higher dimensional) visualization libraries? I would prefer to issue simple commands as Plot instead of first generating a window / form, and then putting the graphs on it, if possible.
FINALLY, and most importantly, if the same can be done by C++ or Fortan, or some other language besides java, I would also consider using those.
Have a look at http://freecadweb.org/. This seems to be under active development. It is a fairly complete open source CAD package written in python. I believe it also has tools for meshing.
For cfd, you might want to consider openfoam - http://www.openfoam.com/. This is an open source cfd package with the obligatory steep learning curve. There seem to be some python libraries to be available that link to it, however I'm not sure how active these are:
http://openfoamwiki.net/index.php/Contrib/PyFoam
http://pythonflu.wikidot.com/
I am currently in a project where a lot of 3D printing designs need to be done. They are all parameterized, so I'd like to write a python code to generate those design files (in .STL format) for me. I was wondering that, is there a python package that can do this? Because currently I am all doing those by hand using SolidWorks.
Thanks!
Yes there is... It's called FreeCAD.
The assembly module is already in the devel version (as of 06/15/2014) and will be of production quality really soon for real assemblies!
http://freecadweb.org/
Yes, more than one.
In my humble experience, I tried many Open Source tools for parametric CAD modeling using Python (FreeCAD, Rhino-Grasshopper, Blender, Salome).
All of them are valid options and the best one is represented by your ability to either model or code.
I recently favour SALOME (www.salome-platform.org) because of the straight forward "dump study" option, the continue development and the good API documentation.
Particularly I did some 3d prints using the exportSTL command once I had a solid worthy of printing and it was ok.
Nevertheless, if you intend to work on surfaces rather than solids, I don't think you will find anything worthy Open Source (Rhino has a little price to pay).
There is also a new one ! called pymadcad
It's a library meant to do complete CAD stuff only with python scripts.
At contrary to FreeCAD, Pymadcad is natively dealing with triangular meshes so it makes it very easy to import/export .stl files.
There is a growing amount of surface generation functions (extrusion, revolution, tube, screw, smooth surface, ...). And there is also all the stuff to generate and deal with 3D primtives such as Lines, Arc, ...
Here is a brief look at the features
I'm learning python and I'm not sure of understanding the following statement : "The function (including its name) can capture our mental chunking, or abstraction, of the problem."
It's the part that is in bold that I don't understand the meaning in terms of programming. The quote comes from http://www.openbookproject.net/thinkcs/python/english3e/functions.html
How to think like a computer scientist, 3 edition.
Thank you !
Abstraction is a core concept in all of computer science. Without abstraction, we would still be programming in machine code or worse not have computers in the first place. So IMHO that's a really good question.
What is abstraction
Abstracting something means to give names to things, so that the name captures the core of what a function or a whole program does.
One example is given in the book you reference, where it says
Suppose we’re working with turtles, and a common operation we need is
to draw squares. “Draw a square” is an abstraction, or a mental chunk,
of a number of smaller steps. So let’s write a function to capture the
pattern of this “building block”:
Forget about the turtles for a moment and just think of drawing a square. If I tell you to draw a square (on paper), you immediately know what to do:
draw a square => draw a rectangle with all sides of the same length.
You can do this without further questions because you know by heart what a square is, without me telling you step by step. Here, the word square is the abstraction of "draw a rectangle with all sides of the same length".
Abstractions run deep
But wait, how do you know what a rectangle is? Well, that's another abstraction for the following:
rectangle => draw two lines parallel to each other, of the same length, and then add another two parallel lines perpendicular to the other two lines, again of the same length but possibly of different length than the first two.
Of course it goes on and on - lines, parallel, perpendicular, connecting are all abstractions of well-known concepts.
Now, imagine each time you want a rectangle or a square to be drawn you have to give the full definition of a rectangle, or explain lines, parallel lines, perpendicular lines and connecting lines -- it would take far too long to do so.
The real power of abstraction
That's the first power of abstractions: they make talking and getting things done much easier.
The second power of abstractions comes from the nice property of composability: once you have defined abstractions, you can compose two or more abstractions to form a new, larger abstraction: say you are tired of drawing squares, but you really want to draw a house. Assume we have already defined the triangle, so then we can define:
house => draw a square with a triangle on top of it
Next, you want a village:
village => draw multiple houses next to each other
Oh wait, we want a city -- and we have a new concept street:
city => draw many villages close to each other, fill empty spaces with more houses, but leave room for streets
street => (some definition of street)
and so on...
How does this all apply to programmming?
If in the course of planning your program (a process known as analysis and design), you find good abstractions to the problem you are trying to solve, your programs become shorter, hence easier to write and - maybe more importantly - easier to read. The way to do this is to try and grasp the major concepts that define your problems -- as in the (simplified) example of drawing a house, this was squares and triangles, to draw a village it was houses.
In programming, we define abstractions as functions (and some other constructs like classes and modules, but let's focus on functions for now). A function essentially names a set of single statements, so a function essentially is an abstraction -- see the examples in your book for details.
The beauty of it all
In programming, abstractions can make or break productivity. That's why often times, commonly used functions are collected into libraries which can be reused by others. This means you don't have to worry about the details, you only need to understand how to use the ready-made abstractions. Obviously that should make things easier for you, so you can work faster and thus be more productive:
Example:
Imagine there is a graphics library called "nicepic" that contains pre-defined functions for all abstractions discussed above: rectangles, squares, triangles, house, village.
Say you want to create a program based on the above abstractions that paints a nice picture of a house, all you have to write is this:
import nicepic
draw_house()
So that's just two lines of code to get something much more elaborate. Isn't that just wonderful?
A great way to understand abstraction is through abstract classes.
Say we are writing a program which models a house. The house is going to have several different rooms, which we will represent as objects. We define a class for a Bathroom, Kitchen, Living Room, Dining Room, etc.
However, all of these are Rooms, and thus share several properties (# of doors/windows, square feet, etc.) BUT, a Room can never exist on it's own...it's always going to be some type of room.
It then makes sense to create an abstract class called Room, which will contain the properties all rooms share, and then have the classes of Kitchen, Living Room, etc, inherit the abstract class Room.
The concept of a room is abstract and only exists in our head, because any room that actually exists isn't just a room; it's a bedroom or a living room or a classroom.
We want our code to thus represent our "mental chunking". It makes everything a lot neater and easier to deal with.
As defined on wikipedia: Abstraction_(computer_science)
Abstraction tries to factor out details from a common pattern so that
programmers can work close to the level of human thought, leaving out
details which matter in practice, but are not exigent to the problem being
solved.
Basically it is removing the details of the problem. For example, to draw a square requires several steps, but I just want a function that draws a square.
Let's say you write a function which receives a bunch of text as parameter, then reads credentials in a config file, then connects to a SMTP server using those credentials and sends a mail using that text.
The function should be named sendMail(text), not parseTextReadCredentialsInFileConnectToSmtpThenSend(text) because it is more easy to represent what it does this way, to yourself and when presenting the API to coworkers or users... even though the 2nd name is more accurate, the first is a better abstraction.
In a simple sentence, I can say: The essence of abstraction is to extract essential properties while omitting inessential details. But why should we omit inessential details? The key motivator is preventing the risk of change.
The best way to to describe something is to use examples:
A function is nothing more than a series of commands to get stuff done. Basically you can organize a chunk of code that does a single thing. That single thing can get re-used over and over and over through your program.
Now that your function does this thing, you should name it so that it's instantly identifiable as to what it does. Once you have named it you can re-use it all over the place by simply calling it's name.
def bark():
print "woof!"
Then to use that function you can just do something like:
bark();
What happens if we wanted this to bark 4 times? Well you could write bark(); 4 times.
bark();
bark();
bark();
bark();
Or you could modify your function to accept some type of input, to change how it works.
def bark(times):
i=0
while i < times:
i = i + 1
print "woof"
Then we could just call it once:
bark(4);
When we start talking about Object Oriented Programming (OOP) abstraction means something different. You'll discover that part later :)
Abstraction: is a very important concept both in hardware and software.
Importance: We the human can not remember all the things all the times. For example, if your friend speaks 30 random numbers quickly and asks you to add them all, it won't be possible for you. Reason? You might not be able to keep all those numbers in mind. You might write those numbers on a paper even then you will be adding right most digits one by one ignoring the left digits at one time and then ignoring the right most digits at the other time having added the right most ones.
It shows that at one time we the human can focus at some particular issue while ignoring those which are already solved and moving focus towards what are left to be solved.
Ignoring less important thing and focusing the most important (for the time being and in particular context) is called Abstraction
Here is how abstraction works in programming.
Below is the world's famous hello world program in C language:
//C hello world example hello.c
#include <stdio.h>
int main()
{
printf("Hello world\n");
return 0;
}
This is the simplest and usually the first computer program a programmer writes. When you compile and run this program on command prompt, the output may appear like this:
Here are the serious questions
Computer only understands the binary code how was it able to run your English like code? You may say that you compiled the code to binary using compiler. Did you write compiler to make your program work? No. You needed not to. You installed GNU C compiler on your Linux system and just used it by giving command:
gcc -o hello hello.c
And it converted your English like C language code to binary code and you could run that code by giving command:
./hello
So, for writing an application in C program, you never need to know how C compiler converts C language code to binary code. So you used GCC compiler as an abstraction.
Did you write code for main() and printf() functions? No. Both are already defined by someone else in C language. When we run a C program, it looks for main() function as starting point of program while printf() function prints output on computer screen and is already defined in stdio.h so we have to include it in program. If both the functions were not already written, we had to write them ourselves just to print two words and computers would be the most boring machines on earth ever. Here again you use abstraction i.e. you don't need to know how printf prints text on monitor and all you need to know is how to give input to printf function so that it shows the desired output.
I did not expand the answer to abstraction of operating system, kernel, firmware and hardware for the sake of simplicity.
Things to remember:
While doing programming, you can use abstraction in a variety of ways to make your program simple and easy.
Example 1: You can use a constant to abstract value of PI 3.14159 in your program because PI is easy to remember than 3.14159 for the rest of program
Example 2: You can write a function which returns square of a given number and then anyone, including you, can use that function by giving it input as parameters and getting a return value from it.
Example 3: In an Object-oriented programming (OOP), like Java, you may define an object which encapsulates data and methods and you can use that object by invoking its methods.
Example 4: Many applications provide you API which you use to interact with that application. When you use API methods, you never need to know how they are implemented. So abstraction is there.
Through all the examples, you can realize the importance of abstraction and how it is implemented in programming. One key thing to remember is that abstraction is contextual i.e. keeps on changing as per context