I need to validate phone numbers and there is a very good python library that will do this. My stack however is Go and I'm really not looking forward to porting a very large library. Do you think it would be better to use the python library by running a shell command from within the Go codebase or by running a daemon that I then have to communicate with somehow?
Python, being an interpreted language, requires the system to load the interpreter each time a script is run from the command line, and to import whatever modules the script uses on top of that.
On my particular system, after disk caching, it takes the system 20ms to execute a script that does nothing but import string (which is plausible for your use case). If you're processing a lot of information and can't submit it all at once, you should consider setting up a daemon to avoid this kind of overhead.
On the other hand, a daemon is more complex to write and test, so you should probably see if a script suits your needs before optimizing prematurely.
There's no answer to your question that fits every possible case; ultimately, you have to measure the performance with your own data on your own system.
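If the daemon route wins, it doesn't have to be elaborate. Below is a minimal sketch of the idea in Python (the socket path is made up and validate() is only a stand-in for whatever library you end up calling): a long-lived process listens on a Unix socket, the Go side connects, writes one number per line, and reads back one verdict per line, so the interpreter start-up cost is paid only once.

import os
import socket

SOCKET_PATH = "/tmp/phone-validator.sock"  # hypothetical path, pick your own

def validate(number: str) -> bool:
    # Placeholder: call the real validation library here.
    return number.strip().lstrip("+").isdigit()

def serve():
    if os.path.exists(SOCKET_PATH):
        os.unlink(SOCKET_PATH)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(SOCKET_PATH)
    server.listen(1)
    while True:
        conn, _ = server.accept()
        with conn, conn.makefile("rw") as stream:
            for line in stream:
                stream.write("ok\n" if validate(line.strip()) else "bad\n")
                stream.flush()

if __name__ == "__main__":
    serve()

From Go you would dial the same socket with net.Dial("unix", ...) and speak the same line protocol; the shell-command alternative is simply exec'ing a one-shot script per batch of numbers.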
I want to do something kinda weird to automate some processes involving embedded processors and other hardware.
I would like to control an ARM processor via GDB to run some tests by monitoring and changing some variables. I want to control the whole test from Python so I can synchronize other instruments/hardware when I trigger certain events on the processor. I think arm-none-eabi-gdb-py would be very helpful in this, but how would that work in practice?
Do I start GDB from a Python subprocess, and then run Python from within GDB? It seems a bit convoluted, but Python for arm-none-eabi-gdb-py seems to require being run from within the GDB process. So I am a little confused how I would tie that into the outer 'test driver'.
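For what it's worth, one common pattern (a hedged sketch, with all file names and the target address made up) is exactly the layered setup you describe: the outer test driver is ordinary Python that launches arm-none-eabi-gdb-py as a subprocess, and the GDB-side logic lives in a separate script that GDB sources with -x and runs inside its embedded Python.

import subprocess

gdb_cmd = [
    "arm-none-eabi-gdb-py",
    "--batch",                    # exit when the scripts are done
    "-ex", "target remote :3333", # e.g. an OpenOCD or J-Link GDB server
    "-x", "monitor_test.py",      # executed by GDB's embedded Python
    "firmware.elf",
]

# The outer driver can arm other instruments before this call and check
# results afterwards; use Popen plus a pipe or socket if the GDB-side
# script needs to report events while the test is still running.
result = subprocess.run(gdb_cmd, capture_output=True, text=True)
print(result.stdout)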
I'm planning to build a WebApp that will need to execute scripts based on an argument that a user will provide in a text field or in the URL.
Possible solutions that I have found:
1. Create a lib directory in the root directory of the project, put the scripts there, and import them from the views.
2. Use the subprocess module to run the scripts directly, in the following way:
subprocess.call(['python', 'somescript.py', argument_1,...])
where argument_1 is whatever the end user provides.
I'm planning to build a WebApp that will need to execute scripts
Why should it "execute scripts"? Turn your "scripts" into proper modules, import the relevant functions and call them. The fact that Python can be used as a "scripting language" doesn't mean it isn't a proper programming language.
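A minimal sketch of what that looks like (module and function names are made up, and a Django-style view is assumed since you mention views): the former script becomes a function, and the view just calls it.

# myproject/lib/somescript.py
def process(argument_1):
    # whatever the script used to do with sys.argv[1]
    return argument_1.upper()

# views.py
from django.http import HttpResponse
from myproject.lib.somescript import process

def run_script_view(request):
    result = process(request.GET.get("q", ""))
    return HttpResponse(result)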
Approach (1) should be the default approach. Never subprocess unless you absolutely have to.
Disadvantages of subprocessing:
Depends on the underlying OS and, in your case, on Python itself (i.e. is the python command the same interpreter as the one running the original script?). See the sketch after this list.
Potentially harder to make safe.
Harder to pass values, return results and report errors.
Eats more memory and CPU (a side effect is that you can utilize all CPU cores, but since you are writing a web app you likely do that anyway).
Generally harder to code and maintain.
Advantages of subprocessing:
Isolates the runtime. This is useful if for example scripts are uploaded by users. You don't want them to mess with your application.
Related to 1: potentially easier to add scripts dynamically. Not that you should do that anyway. It also becomes harder when you have more than one server and need to keep them synchronized.
Well, you can run non-Python code that way. But that doesn't apply to your case.
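If you do end up shelling out anyway, here is a small sketch that softens points 1 and 3 of the disadvantages (names are illustrative, and somescript.py is assumed to print a JSON object and exit non-zero on failure): sys.executable guarantees the same interpreter, the argument list avoids the shell, and JSON gives you structured results and error reporting.

import json
import subprocess
import sys

def run_script(argument_1):
    proc = subprocess.run(
        [sys.executable, "somescript.py", str(argument_1)],
        capture_output=True, text=True, timeout=30,
    )
    if proc.returncode != 0:
        raise RuntimeError("somescript.py failed: " + proc.stderr.strip())
    return json.loads(proc.stdout)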
In case both options are available: to call a command-line tool with subprocess (say, hg) or to use the native Python API (say, the Mercurial API), is there a case where the former is more favorable?
If you want to execute some third-party native code which you know is not stable and may crash with a segfault, then it is better to execute it as a subprocess: you will be able to safely handle the possible crashes from your Python process.
Also, if you want to call several times some code which is known to leak memory, leave open files or other resources, from a long running Python process, then again it may be wise to run it as a subprocess. In this case the leaking memory or other resources will be reclaimed by the operating system for you each time the subprocess exits, and not accumulate.
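A minimal sketch of that isolation (flaky_tool and its argument are placeholders): run the unstable code as a child process and treat a signal death as an error, while the parent keeps running.

import subprocess

def run_flaky_tool(path):
    proc = subprocess.run(["flaky_tool", path], capture_output=True, text=True)
    if proc.returncode < 0:
        # On POSIX a negative return code means the child died from a signal
        # (e.g. -11 for SIGSEGV); the parent process is unaffected.
        raise RuntimeError("flaky_tool crashed with signal %d" % -proc.returncode)
    return proc.stdout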
The only case in which I see myself using subprocess instead of a native Python API is if some option of the program is not exposed by the API.
I need to send code to remote clients to be executed on them, but security is a concern for me right now. I don't want unsafe code to be executed there, so I would like to control what a program is doing. I mean, for example, know whether it is making connections, where it is connecting to, whether it is reading local files, etc. Is this possible with Python?
EDIT: I'm thinking of something similar to the Android permission system. I want to know what a piece of code will do and, if it does something different, stop it.
You could use a different Python runtime:
if you run your script using Jython, you can exploit Java's permission system
with PyPy's sandboxed version you can choose, from your controller script, what is allowed to run
There used to be a module in Python called Bastion, but that was deprecated as it wasn't that secure. There's also, I believe, something called RPython, but I don't know too much about that.
I would in this case use Pyro and write the code on the target server. That way you know clients can only execute code you have written and tested.
Edit: it's probably worth noting that Pyro also supports http://en.wikipedia.org/wiki/Privilege_separation, although I've not had to use it for that.
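A hedged sketch of that setup using the Pyro4 package (class and method names are made up): the tested code lives only on the server, and clients can call nothing beyond what you explicitly expose.

import Pyro4

@Pyro4.expose
class Tasks:
    def word_count(self, text):
        # code you wrote and tested; clients cannot ship their own code here
        return len(text.split())

daemon = Pyro4.Daemon(host="0.0.0.0")
uri = daemon.register(Tasks())
print("server ready, uri =", uri)   # hand this URI (or a name server entry) to clients
daemon.requestLoop()

# On a client:
#   tasks = Pyro4.Proxy(uri)
#   print(tasks.word_count("hello world"))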
I think you are looking for a sandboxed Python. There used to be an effort to implement this, but it was abandoned a couple of years ago.
The Sandboxed Python page on the Python wiki offers a nice overview of the possible options for your use case.
The most rigorous (but probably the slowest) way is to run Python on a bare OS in an emulator.
Depending on the OS you use, there are several ways of running programs with restrictions, but without the overhead of an emulator:
FreeBSD has a nice integrated solution in the form of jails.
These grew out of the chroot system call.
Linux-VServer aims to do more or less the same on Linux.
I would like to seek some guidance in writing a "process profiler" which runs in kernel mode. I am asking for a kernel-mode profiler because I run loads of applications and I do not want my profiler to be swapped out.
When I say "process profiler" I mean something that would monitor resource usage by a process, including its threads and their statistics.
I wish to write this in Python; please point me to some relevant modules or other helpful resources.
Any guidance or suggestions for doing it are appreciated.
Thanks,
Edit: I would like to add that currently my interest is to write this only for Linux; however, after I build it I will have to support Windows.
It's going to be very difficult to do the process-monitoring part in Python, since the Python interpreter doesn't run in the kernel.
I suspect there are two easy approaches to this:
Use the /proc filesystem if you have one (you don't mention your OS).
Use dtrace if you have dtrace (again, without knowing the OS, who knows).
Okay, following up after the edit.
First, there's no way you're going to be able to write code that runs in the kernel, in python, and is portable between Linux and Windows. Or at least if you were to, it would be a hack that would live in glory forever.
That said, though, if your purpose is to profile Python code, there are a lot of Python tools available to get information from the Python interpreter at run time.
If instead your desire is to get process information from other processes in general, you're going to need to examine the options available to you in the various OS APIs. Linux has a /proc filesystem; that's a useful start. I suspect Windows has similar APIs, but I don't know them.
If you have to write kernel code, you'll almost certainly need to write it in C or C++.
Don't try to get Python running in kernel space!
You would be much better off using an existing tool and getting it to spit out XML that can be sucked into Python. I wouldn't want to port the Python interpreter to kernel mode (it sounds grim just writing that).
The /proc option does sound good.
Here is some code that reads /proc information to determine memory usage and the like; it should get you going:
http://www.pixelbeat.org/scripts/ps_mem.py reads memory information of processes with Python through /proc/smaps, as Charlie suggested.
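For something much smaller, here's a sketch of the same idea by hand: pull per-process numbers straight out of /proc/<pid>/status on Linux, no kernel code involved.

def proc_status(pid):
    info = {}
    with open("/proc/%d/status" % pid) as f:
        for line in f:
            key, _, value = line.partition(":")
            info[key] = value.strip()
    return info

stats = proc_status(1)                       # e.g. init/systemd
print(stats.get("VmRSS"), stats.get("Threads"))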
Some of your comments on other answers suggest that you are a relatively inexperienced programmer. Therefore I would strongly suggest that you stay away from kernel programming, as it is very hard even for experienced programmers.
Why would you want to write something that
is a very complex system (just look at existing profiling infrastructures and how complex they are)
cannot be done in Python (I don't know of any kernel that would allow execution of Python in kernel mode)
already exists (oprofile on Linux)
Have you looked at PSI? (http://www.psychofx.com/psi/)
"PSI is a Python module providing direct access to real-time system and process information. PSI is a Python C extension, providing the most efficient access to system information directly from system calls."
It might give you what you are looking for, or at least a starting point.
Edit 2014:
I'd recommend checking out psutil instead:
https://pypi.python.org/pypi/psutil
psutil is actively maintained and has some nifty process monitoring features. PSI seems to be somewhat dead (last release 2009).
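A quick sketch of what that looks like with psutil: per-process memory and thread counts for everything on the system, with no kernel-mode code at all.

import psutil

for proc in psutil.process_iter(["pid", "name", "memory_info", "num_threads"]):
    mem = proc.info["memory_info"]
    rss_kib = mem.rss // 1024 if mem else 0    # mem can be None if access is denied
    print(proc.info["pid"], proc.info["name"],
          rss_kib, "KiB,", proc.info["num_threads"] or "?", "threads")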