Reading a process's memory with Python 3 on OS X - python

I've been searching for a way to read a process's memory addresses in Python 3 on OS X, without finding a good and simple way to implement this (I'm new to Python 3 and OS X).
I'm aiming at real-time analysis (no pausing the process being analysed), and an efficient way to get those variables without having to dump the whole memory.
It seems like ptrace is not the right tool for this, and the Mach library of OS X (which gives access to the vm_read functions) doesn't seem to be available from Python 3.
I gather that I need to spawn a child process with something like the subprocess module in order to get the rights to actually read the memory.
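The closest I've got so far is a rough ctypes sketch against libSystem along these lines, but I have no idea if it's the right approach (completely untested; the pid and address are placeholders, and task_for_pid apparently needs root or special code-signing entitlements on OS X):
import ctypes
libsys = ctypes.CDLL("/usr/lib/libSystem.B.dylib")
pid = 1234              # target process id (placeholder)
address = 0x100000000   # address to read in the target (placeholder)
size = 64               # number of bytes to read
own_task = ctypes.c_uint32.in_dll(libsys, "mach_task_self_").value
target_task = ctypes.c_uint32(0)
kr = libsys.task_for_pid(own_task, pid, ctypes.byref(target_task))
if kr != 0:
    raise OSError("task_for_pid failed: %d" % kr)
data = ctypes.c_void_p(0)
count = ctypes.c_uint32(0)
kr = libsys.mach_vm_read(target_task, ctypes.c_uint64(address),
                         ctypes.c_uint64(size),
                         ctypes.byref(data), ctypes.byref(count))
if kr != 0:
    raise OSError("mach_vm_read failed: %d" % kr)
buf = ctypes.string_at(data.value, count.value)   # the bytes that were read
# a real implementation would also mach_vm_deallocate the returned buffer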
Does anyone have any clue on this issue?
Thanks!

Related

External executable crashes when being launched from Python script

I am currently getting an issue with an external executable crashing when it is launched from a Python script. So far I have tried various subprocess calls, as well as the more redundant methods such as os.system and os.startfile.
Now the exe doesn't have this issue when I run it normally from the command line or by double-clicking it in an Explorer window. I've looked around to see if other people have had a similar problem. As far as I can tell, the closest likely cause is that the child process hangs because its I/O exceeds the 65K pipe buffer. So I've tried using Popen without pipes, and I have also redirected stdout and stdin to temporary files to try to alleviate the problem, but unfortunately none of this has worked.
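For reference, the temporary-file variant I tried looked roughly like this (the executable path and argument are placeholders):
import subprocess
import tempfile
# stdout/stderr go to temporary files instead of PIPE, so the child can never
# block on a full 65K pipe buffer
with tempfile.TemporaryFile() as out, tempfile.TemporaryFile() as err:
    proc = subprocess.Popen([r"C:\path\to\tool.exe", "config.xml"],
                            stdout=out, stderr=err)
    returncode = proc.wait()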
What I eventually want to do is autorun this executable several times, with the variations provided by XML files. Everything else is pretty much in place, including the XML modifications that the executable requires. I have also tested the XML-modification portion of the code as a standalone script to make sure that it isn't the issue.
Due to the nature of the script I am a bit reluctant to put any actual code up on the net, as the company I work for is rather strict when it comes to showing code. I would ask my colleagues, but unfortunately I'm the only one here who has actually used Python.
Any help would be much appreciated.
Thanks.
As I've not had any response, I've gone down a different route with this. Rather than relying on the subprocess module to call the exe, I have moved that logic out into a batch file. The XMLs are still modified by the Python script and most of the logic is still handled in the script. It's not what I would ideally have liked from the program, but it will have to do.
Thanks to anybody who gave this some thought and tried to at least look for an alternative. Even if nobody answered.

Run daemon server or shell command?

I need to validate phone numbers and there is a very good python library that will do this. My stack however is Go and I'm really not looking forward to porting a very large library. Do you think it would be better to use the python library by running a shell command from within the Go codebase or by running a daemon that I then have to communicate with somehow?
Python, being an interpreted language, requires the system to load the interpreter each time a script is run from the command line.
On my particular system, after disk caching, it takes about 20 ms to execute a script that just does import string (which is plausible for your use case). If you're processing a lot of information and can't submit it all at once, you should consider setting up a daemon to avoid this kind of overhead.
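If you want to measure that start-up cost on your own machine, something along these lines gives a rough number:
import subprocess
import time
start = time.time()
subprocess.call(["python", "-c", "import string"])   # start a fresh interpreter, import, exit
print("interpreter start-up took about %.0f ms" % ((time.time() - start) * 1000))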
On the other hand, a daemon is more complex to write and test, so you should probably see if a script suits your needs before optimizing prematurely.
There's no answer to your question that fits every possible case. Ultimately, you always have to try it out and measure the performance with your own data on your own system.
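As a middle ground between a one-shot script and a full daemon, a long-running worker process that the Go program talks to over stdin/stdout pays the interpreter start-up cost only once. A minimal sketch, assuming the phonenumbers package and a hard-coded "US" default region:
import sys
import phonenumbers   # assumed library; swap in whatever you actually use
# Read one phone number per line on stdin, write one result per line on stdout.
for line in sys.stdin:
    number = line.strip()
    try:
        parsed = phonenumbers.parse(number, "US")
        valid = phonenumbers.is_valid_number(parsed)
    except phonenumbers.NumberParseException:
        valid = False
    sys.stdout.write("%s\t%s\n" % (number, valid))
    sys.stdout.flush()   # flush so the Go side isn't left waiting on a buffered result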

Python: subprocess vs native API

When both options are available: calling a command-line tool with subprocess (say, hg) or using a native Python API (say, the Mercurial API), is there a case where it's more favorable to use the former?
If you want to execute some third-party native code which you know is not stable and may crash with a segfault, then it is better to execute it as a subprocess: you will be able to handle the possible crashes safely from your Python process.
Also, if you want to call, from a long-running Python process, some code which is known to leak memory, leave files open, or hold on to other resources, then again it may be wise to run it as a subprocess. In that case the leaked memory and other resources are reclaimed by the operating system each time the subprocess exits, instead of accumulating.
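For example, something along these lines keeps a crash-prone hg invocation from taking the parent down (Python 3.5+; the arguments are only an illustration):
import subprocess
try:
    result = subprocess.run(["hg", "log", "-l", "5"],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            timeout=30)
except subprocess.TimeoutExpired:
    result = None
if result is not None and result.returncode < 0:
    # on Unix a negative return code means the child died on a signal,
    # e.g. -11 for SIGSEGV
    print("hg crashed with signal", -result.returncode)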
The only way I see myself using subprocess instead of a native Python API is if some option of the program is not exposed by the API.

Isolating pieces of code that segfault

I have a function call that sometimes produces a segfault. Right now I isolate this code by putting it in a separate Python file and doing Popen(['python', 'snippet.py']) inside of my main script. Is there a more elegant way of doing this on a Unix system?
Update 7/24
I found the following to be convenient:
import os
if not os.fork():
    # child: run the code that could segfault here
    os._exit(0)  # make sure the child exits instead of falling through into the parent's code
This has the advantage of not having to put all the initialization code in a separate file. Apparently there's also a multiprocessing module in Python 2.6 that supports inter-process communication.
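A sketch in the same spirit using multiprocessing (the body of risky() is a placeholder):
import multiprocessing
def risky():
    pass  # call the code that could segfault here
p = multiprocessing.Process(target=risky)
p.start()
p.join()
if p.exitcode < 0:
    # a negative exit code means the child was killed by a signal, e.g. -11 for SIGSEGV
    print("child crashed with signal", -p.exitcode)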
Why not find it and fix it?
strace python yourapp.py yourargs and options >trace.out 2>&1
Then, after it segfaults, try tail -50 trace.out and you will see what led up to the segfault. Typically it is a garbled pointer that points to memory addresses which are not part of the process. It is quite possible for a Python error to create such pointer problems if you are using something like ctypes to access shared libraries.

Writing a kernel-mode profiler for processes in Python

I would like to seek some guidance in writing a "process profiler" which runs in kernel mode. The reason I am asking for a kernel-mode profiler is that I run loads of applications and I do not want my profiler to be swapped out.
When I say "process profiler" I mean something that would monitor resource usage by the process, including the usage of threads and their statistics.
And I wish to write this in Python. Please point me to some modules or other helpful resources.
Please provide me with guidance/suggestions for doing this.
Thanks,
Edit: I would like to add that currently my interest is to write this only for Linux; however, after I build it I will have to support Windows.
It's going to be very difficult to do the process-monitoring part in Python, since the Python interpreter doesn't run in the kernel.
I suspect there are two easy approaches to this:
Use the /proc filesystem, if you have one (you don't mention your OS).
Use DTrace, if you have it (again, without knowing the OS, who knows).
Okay, following up after the edit.
First, there's no way you're going to be able to write code that runs in the kernel, in Python, and is portable between Linux and Windows. Or at least, if you were to, it would be a hack that would live in glory forever.
That said, though, if your purpose is to profile Python, there are a lot of Python tools available to get information from the Python interpreter at run time.
If instead your desire is to get process information from other processes in general, you're going to need to examine the options available to you in the various OS APIs. Linux has a /proc filesystem; that's a useful start. I suspect Windows has similar APIs, but I don't know them.
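As a concrete starting point on Linux, most of the per-process numbers are plain text files under /proc; a rough sketch (the pid is a placeholder):
pid = 1234
fields = {}
with open("/proc/%d/status" % pid) as f:
    for line in f:
        key, _, value = line.partition(":")
        fields[key] = value.strip()
print("name:   ", fields.get("Name"))
print("threads:", fields.get("Threads"))
print("rss:    ", fields.get("VmRSS"))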
If you have to write kernel code, you'll almost certainly need to write it in C or C++.
Don't try to get Python running in kernel space!
You would be much better off using an existing tool and getting it to spit out XML that can be sucked into Python. I wouldn't want to port the Python interpreter to kernel mode (it sounds like grim work).
The /proc option does sound good.
Here is some code that reads /proc information to determine memory usage and such. It should get you going:
http://www.pixelbeat.org/scripts/ps_mem.py reads memory information of processes using Python through /proc/smaps, as charlie suggested.
Some of your comments on other answers suggest that you are a relatively inexperienced programmer. Therefore I would strongly suggest that you stay away from kernel programming, as it is very hard even for experienced programmers.
Why would you want to write something that
is a very complex system (just look at existing profiling infrastructures and how complex they are)
cannot be done in Python (I don't know of any kernel that would allow execution of Python in kernel mode)
already exists (oprofile on Linux)
Have you looked at PSI? (http://www.psychofx.com/psi/)
"PSI is a Python module providing direct access to real-time system and process information. PSI is a Python C extension, providing the most efficient access to system information directly from system calls."
It might give you what you are looking for, or at least a starting point.
Edit 2014:
I'd recommend checking out psutil instead:
https://pypi.python.org/pypi/psutil
psutil is actively maintained and has some nifty process monitoring features. PSI seems to be somewhat dead (last release 2009).
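A minimal sampling sketch with psutil (the pid is a placeholder):
import psutil
p = psutil.Process(1234)            # placeholder pid
print(p.name())
print(p.cpu_percent(interval=1.0))  # CPU usage sampled over one second
print(p.memory_info().rss)          # resident set size in bytes
print(p.num_threads())
for t in p.threads():               # per-thread CPU times
    print(t.id, t.user_time, t.system_time)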
