What is the best way to sample/profile a PyObjC application?

What is the best way to sample/profile a PyObjC application? - python

Sampling with Activity Monitor/Instruments/Shark will show stack traces full of C functions for the Python interpreter. I would be helpful to see the corresponding Python symbol names. Is there some DTrace magic that can do that? Python's cProfile module can be useful for profiling individual subtrees of Python calls, but not so much for getting a picture of what's going on with the whole application in response to user events.

The answer is "dtrace", but it won't work on sufficiently old macs.
http://tech.marshallfamily.com.au/archives/python-dtrace-on-os-x-leopard-part-1/
http://tech.marshallfamily.com.au/archives/python-dtrace-on-os-x-leopard-part-2/

Related

Tracingback a Python function

A way I usually use to understand a project architecture in C++ is to set a break point in a particular function using GDB, and then using the traceback feature, I can easily understand the function calls between the different classes. I am totally new to Python and I want to ask what are the best Technics used to understand the Python project,
I took a look at traceback but the problem is that it is only tracing the functions inside the same module, so if the caller is in a different module it won't be traced. Besides, the size of the stack is also limited, correct me if I am wrong.
Could you please share the Technics you are using based on your own experience.

I don't know where you got the idea that python's tracebacks would be limited to a single module or limited in size - when an exception occurs, you of course have the full stack trace available.
This being said, Python has a full step debugger in it's stdlib, which lets you inspect and navigate the whole call stack. And there are of course third-part step debuggers in various IDEs and custom shells or environments (ie IPython etc).
NB the inspect module might also be of interest to you - and not only for stack inspection ;-)

Control an existing application with Python

I have been programming with Python for some time and noticed that it's possible to interact, for example, with MS Excel files through the library XLWT.
Now I would like to know if it's possible to use Python to control other applications, such as the Calculator.exe which is on the standard Windows path C:\Windows\system32.
Is there any way to write a script with Python, let's say, to make the calculator opening and executing 9+3=? I usually like to write some code myself first and ask for help later, but here I've no clue even if it's possible and my researches on Internet have yield only this script to launch the program:
import subprocess
subprocess.call("C:\Windows\system32\calc.exe")
Any help, suggestion or even just "no, it's not possible" would be highly appreciated.

It will always depend on the cooperativity of the other program. If it allows being tweaked, it will offer an API for this (and hopefully a documentation telling you how to use it).
This is not really a Python question because it rather depends on how the API is written. If this API comes as a C library, you will have to write at least a bit of C code to access it via Python. If it is a way of calling the program (special options, etc.) then Python will have as much or less trouble providing these as any other programming language.

Using embedded C library in Python emulation

Short Question
Which would be easier to emulate (in Python) a complex (SAE J1939) communication stack from an existing embedded C library:
1) Full port - meaning manually convert all of the C functions to python modules
2) Wrap the stack in a Python wrapper - meaning call the real c code in Python
Background Information
I have already written small portions of this stack in Python, however they are very non-trival to implement with 100% coverage. Because of this very reason, we have recently purchased an off the shelf SAE J1939 stack for our embedded platforms. To clarify, I know that portions touching the hardware layer will have to be re-created and mapped to the PC's CAN drivers.
I am hoping to find someone here on SO that has or even looked into porting a 5k LOC C library to Python. If there are any C to Python tools that work well that would be helpful for me to look into as well.

My advice would be to wrap it.
Reasons for that:
if you convert function by function, you'll introduce new bugs (we're just human) and this kind of stuff is pretty hard to test
wrapping for python is done easily, using swig or even ctypes to load a dll on the fly, you'll find tons of tutorial
if your lib gets updated, you have less impact in the long term.
However, you need to
check that the license you purchase allows you to do that
know that having same implementation on embedded and PC side, it won't help tracking bugs
you might have a bit less portability than a full python implementation (anyway, not much of a point for you as your low layer needs to be rewritten per target)

Definitely wrap it. It might be as easy are running ctypesgen.py and then using it. Check this blog article about using ctypesgen to create a wrapper for libreadline http://wavetossed.blogspot.com/2011/07/asynchronous-gnu-readline.html in order to get access to the full API.

Why use Python interactive mode?

When I first started reading about Python, all of the tutorials have you use Python's Interactive Mode. It is difficult to save, write long programs, or edit your existing lines (for me at least). It seems like a far more difficult way of writing Python code than opening up a code.py file and running the interpreter on that file.
python code.py
I am coming from a Java background, so I have ingrained expectations of writing and compiling files for programs. I also know that a feature would not be so prominent in Python documentation if it were not somehow useful. So what am I missing?

Let's see:
If you want to know how something works, you can just try it. There is no need to write up a file. I almost always scratch write my programs in the interpreter before coding them. It's not just for things that you don't know how they work in the programming language. I never remember what the correct arguments to range are to create, for example, [-2, -1, 0, 1]. I don't need to. I just have to fire up the interpreter and try stuff until I figure out it is range(-2, 2) (did that just now, actually).
You can use it as a calculator.
Python is a very introspective programming language. If you want to know anything about an object, you can just do dir(object). If you use IPython, you can even do object.<TAB> and it will tab-complete the methods and attributes of that object. That's way faster than looking stuff up in documentation or even in code.
help(anything) for documentation. It's way faster than any web interface.
Again, you have to use IPython (highly recommended), but you can time stuff. %timeit func1() and %timeit func2() is a common idiom to determine what is faster.
How often have you wanted to write a program to use once, and then never again. The fastest way to do this is to just do it in the Python interpreter. Sure, you have to be careful writing loops or functions (they must have the correct syntax the first time), but most stuff is just line by line, and you can play around with it.
Debugging. You don't need to put selective print statements in code to see what variables are when you write it in the interpreter. You just have to type >>> a, and it will show what a is. Nice again to see if you constructed something correctly. The building Python debugger pdb also uses the intrepeter functionality, so you can not only see what a variable is when debugging, but you can also manipulate or even change it without halting debugging.
When people say that Python is faster to develop in, I guarantee that this is a big part of what they are talking about.
Commenters: anything I am forgetting?

REPL Loops (like Python's interactive mode) provide immediate feedback to the programmer. As such, you can rapidly write and test small pieces of code, and assemble those pieces into a larger program.

You're talking about running Python in the console by simply typing "python"? That's just for little tests and for practicing with the language. It's very useful when learning the language and testing out other modules.
Of course any real software project is written in .py files and later executed by the interpreter!

The Python interpreter is a least common denominator: you can run it on multiple platforms, and it acts the same way (modulo platform-specific modules), so it's pretty easy to get a newbie going with.
It's a lot easier to tell a newbie to launch the interpreter and "do this" than to have them open a file, type in some code, save it, make it executable, make sure python is in your PATH, or use a #! line, etc etc. Scrap all of that and just launch the interpreter. For simple examples, you can't beat it. It was never meant for long programs, so if you were using it for that, you probably missed the part of the tutorial that told you "longer scripts go in a file". :)

you use the interactive interpreter to test snippets of your code before you put them into your script.

As already mentioned, the Python interactive interpreter gives a quick and dirty way to test simple Python functions and/or code snippets.
I personally use the Python shell as a very quick way to perform simple Numerical operations (provided by the math module). I have my environment setup, so that the math module is automatically imported whenever I start a Python shell. In fact, its a good way to "market" Python to non-Pythoniasts. Show them how they can use Python as a neat scientific calculator, and for simple mathematical prototyping.

One thing I use interactive mode for that others haven't mentioned: To see if a module is installed. Just fire up Python and try to import the module; if it dies, then your PYTHONPATH is broke or the module is not installed.
This is a great first step for "Hey, it's not working on my machine" or "Which Python did that get installed in, anyway" bugs.

I find the interactive interpreter very, very good for testing quick code, or to show others the Power of Python. Sometimes I use the interpreter as a handy calculator, too. It's amazing what you can do in a very short amount of time.
Aside from the built-in console, I also have to recommend Pyshell. It has auto-completion, and a decent syntax highlighting. You can also edit multiple lines of code at once. Of course, it's not perfect, but certainly better than the default python console.

When coding in Java, you almost always will have the API open in some browser window. However with the python interpreter, you can always import any module that you are thinking about using and check what it offers. You can also test the behavior of new methods that you are unsure of, to eliminate the "Oh! so THAT's how it works" as a source of bugs.

Interactive mode makes it easy to test code snippets before incorporating them into a larger program. If you use IDLE there's syntax highlighting and argument pop-ups to help you out. It's also a quick way of checking that you've figured out how to use a module without having to write a test program.

Python compute cluster

Would it be possible to make a python cluster, by writing a telnet server, then telnet-ing the commands and output back-and-forth? Has anyone got a better idea for a python compute cluster?
PS. Preferably for python 3.x, if anyone knows how.

The Python wiki hosts a very comprehensive list of Python cluster computing libraries and tools. You might be especially interested in Parallel Python.
Edit: There is a new library that is IMHO especially good at clustering: execnet. It is small and simple. And it appears to have less bugs than, say, the standard multiprocessing module.

You can see most of the third-party packages available for Python 3 listed here; relevant to cluster computation is mpi4py -- most other distributed computing tools such as pyro are still Python-2 only, but MPI is a leading standard for cluster distributed computation and well looking into (I have no direct experience using mpi4py with Python 3, yet, but by hearsay I believe it's a good implementation).
The main alternative is Python's own built-in multiprocessing, which also scales up pretty well if you have no interest in interfacing existing nodes that respect the MPI standards but may not be coded in Python.
There is no real added value in rolling your own (as Atwood says, don't reinvent the wheel, unless your purpose is just to better understand wheels!-) -- use one of the solid, tested, widespread solutions, already tested, debugged and optimized on your behalf!-)

Look into these
http://www.parallelpython.com/
http://pyro.sourceforge.net/
I have used both and both are exellent for distributed computing
for more detailed list of options see
http://wiki.python.org/moin/ParallelProcessing
and if you want to auto execute something on remote machine , better alternative to telnet is ssh as in http://pydsh.sourceforge.net/

What kind of stuff do you want to do? You might want to check out hadoop. The backend, heavy lifting is done in java, but has a python interface, so you can write python scripts create and send the input, as well as process the results.

If you need to write administrative scripts, take a look at the ClusterShell Python library too, or/and its parallel shell clush. It's useful when dealing with node sets also (man nodeset).

I think IPython.parallel is the way to go. I've been using it extensively for the last year and a half. It allows you to work interactively with as many worker nodes as you want. If you are on AWS, StarCluster is a great way to get IPython.parallel up and running quickly and easily with as many EC2 nodes as you can afford. (It can also automatically install Hadoop, and a variety of other useful tools, if needed.) There are some tricks to using it. (For example, you don't want to send large amounts of data through the IPython.parallel interface itself. Better to distribute a script that will pull down chunks of data on each engine individually.) But overall, I've found it to be a remarkably easy way to do distributed processing (WAY better than Hadoop!)

"Would it be possible to make a python cluster"
Yes.
I love yes/no questions. Anything else you want to know?
(Note that Python 3 has few third-party libraries yet, so you may wanna stay with Python 2 at the moment.)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.