Note that I'm constrained to use Python 2.6. I have a Python 2.6 application that uses a C++ multi-threaded API library built with boost-python. My use-case was simply to execute a Python function callback from a C++ boost thread but despite the many different attempts and researching all the available online resources I haven't found any way that works. All the proposed solutions revolve around a different combination of the functions: Py_Initialize*, PyEval_InitThreads, PyGILState_Ensure, PyGILState_Release but after trying all possible combinations nothing works in practice e.g.
Embedding Python in multi-threaded C++ applications with code here
PyEval_InitThreads in Python 3: How/when to call it? (the saga continues ad nauseum)
Therefore, this question: how can I start and run a Python thread from C++? I basically want to: create it, run it with a Python target function object and forget about it.
Is that possible?
Based on the text below from your question:
Run a Python thread from C++? I basically want to: create it, run it with a Python target function object and forget about it.
You may find useful to simply spawn a process using sytem:
system("python myscript.py")
And if you need to include arguments:
string args = "arg1 arg2 arg3 ... argn"
system("python myscript.py " + args)
You should call PyEval_InitThreads from your init routine (and leave the GIL held). Then, you can spawn a pthread (using boost), and in that thread, call PyGILState_Ensure, then call your python callback (PyObject_CallFunction?), release any returned value or deal with any error (PyErr_Print?), release the GIL with PyGILState_Release, and let the thread die.
void *my_thread(void *arg)
{
PyGILState_STATE gstate;
PyObject *result;
gstate = PyGILState_Ensure();
if ((result = PyObject_CallFunction(func, NULL))) {
Py_DECREF(result);
}
else {
PyErr_Print();
}
PyGILState_Release(gstate);
return NULL;
}
The answer to the OP How to call Python from a boost thread? also answers this OP or DP (Derived:)). It clearly demonstrates how to start a thread from C++ that callbacks to Python. I have tested it and works perfectly though needs adaptation for pre-C++11. It uses Boost Python, is indeed an all inclusive 5* answer and the example source code is here.
I'm returning 0 all over the place in a python script but would prefer something more semantic, something more readable. I don't like that magic number. Is there an idea in python similar to how in C you can return EXIT_SUCCESS instead of just 0?
I was unable to find it here:
https://docs.python.org/3.5/library/errno.html
I'm returning 0
return is not how you set your script's exit code in Python. If you want to exit with an exit code of 0, just let your script complete normally. The exit code will automatically be set to 0. If you want to exit with a different exit code, sys.exit is the tool to use.
If you're using return values of 0 or 1 within your code to indicate whether functions succeeded or failed, this is a bad idea. You should raise an appropriate exception if something goes wrong.
Since you discuss return here, it seems like you may be programming Python like C. Most Python functions should ideally be written to raise an exception if they fail, and the calling code can determine how to handle the exceptional conditions. For validation functions it's probably best to return True or False - not as literals, usually, but as the result of some expression like s.isdigit().
When talking about the return value of a process into its environment you caonnt use return because a module isn't a function, so a return statement at top level would be flagged as a syntax error. Instead you should use sys.exit.
Python might seem a little minimalist in this respect, but a call to sys.exit with no arguments defaults to success (i.e. return code zero). So the easiest way to simplify your program might be to stop coding it with an argument where you don't want to indicate failure!
As the documentation reminds us, integer arguments are passed back, and string arguments result in a return code of 1 and the string is printed to stderr.
The language doesn't, as far as I am aware, contain any constants (while it does have features specific to some environments, if it provided exit codes their values might need to be implementation- or platform-specific and the language developers prefer to avoid this where possible.
I have an Python 3 interpreter embedded into an C++ MPI application. This application loads a script and passes it to the interpreter.
When I execute the program on 1 process without the MPI launcher (simply calling ./myprogram), the script is executed properly and its "print" statements output to the terminal. When the script has an error, I print it on the C++ side using PyErr_Print().
However when I lauch the program through mpirun (even on a single process), I don't get any output from the "print" in the python code. I also don't get anything from PyErr_Print() when my script has errors.
I guess there is something in the way Python deals with standard output that do not match the way MPI (actuall Mpich here) deals with redirecting the processes' output to the launcher and finally to the terminal.
Any idea on how to solve this?
[edit, following the advice from this issue]
You need to flush_io() after each call to PyErr_Print, where flush_io could be this function:
void flush_io(void)
{
PyObject *type, *value, *traceback;
PyErr_Fetch(&type, &value, &traceback); // in Python/pythonrun.c, they save the traceback, let's do the the same
for (auto& s: {"stdout", "stderr"}) {
PyObject *f = PySys_GetObject(s);
if (f) PyObject_CallMethod(f, "flush", NULL);
else PyErr_Clear();
}
PyErr_Restore(type, value, traceback);
}
[below my old analysis, it still has some interesting info]
I ended up with the same issue (PyErr_Print not working from an mpirun). Tracing back (some gdb of python3 involved) and comparing the working thing (./myprogram) and non-working thing (mpirun -np 1 ./myprogram), I ended up in _io_TextIOWrapper_write_impl at ./Modules/_io/textio.c:1277 (python-3.6.0 by the way).
The only difference between the 2 runs is that self->line_buffering is 1 vs. 0 (at this point self represents sys.stderr).
Then, in pylifecycle.c:1128, we can see who decided this value:
if (isatty || Py_UnbufferedStdioFlag)
line_buffering = Py_True;
So it seems that MPI does something to stderr before launching the program, which makes it not a tty. I haven't investigated if there's an option in mpirun to keep the tty flag on stderr ... if someone knows, it'd be interesting (though on second thought mpi probably has good reasons to put his file descriptors in place of stdout&stderr, for its --output-filename for example).
With this info, I can come out with 3 solutions (the first 2 are quick-fixes, the 3rd is better):
1/ in the C code that starts the python interpreter, set the buffering flag before creating sys.stderr. The code becomes :
Py_UnbufferedStdioFlag = 1; // force line_buffering for _all_ I/O
Py_Initialize();
This brings Python's traceback back to screen in all situations; but will probably give catastrophic I/O ... so only an acceptable solution in debug mode.
2/ in the python (embedded) script, at the very beginning add this :
import sys
#sys.stderr.line_buffering = True # would be nice, but readonly attribute !
sys.stderr = open("error.log", 'w', buffering=1 )
The script then dumps the traceback to this file error.log.
I also tried adding a call to fflush(stderr) or fflush(NULL) right after the PyErr_Print() ... but this didn't work (because sys.stderr has its own internal buffering). That'd be a nice solution though.
3/ After a little more digging, I found the perfect function in
Python/pythonrun.c:57:static void flush_io(void);
It is in fact called after each PyErr_Print in this file.
Unfortunately it's static (only exists in that file, no reference to it in Python.h, at least in 3.6.0). I copied the function from this file to myprogram and it turns out to do exactly the job.
First a simple example to illustrate the situation. First the source .cpp that has been compiled into an executable:
#include <iostream>
using std::cout;
using std::endl;
int main()
{
cout << "Hello World!" << endl;
return 0;
}
This file has been made into an exe. Now in python it is called as follows:
import subprocess
import sys
sys.path.append('C:\Foo\Bar')
output = subprocess.Popen("hello_world.exe", stdout=subprocess.PIPE)
print output.stdout.read()
So this works fine for calling the executable and returning an output from the standard output. The question is, is there any feasible way to perform this to return potentially large arrays? From what I understand this is basically like returning and reading a text file, which would be quite slow! While I know SWIG or Cython are options for extending python with C++, I would find separate executable for each function more organized and modular!
TLDR: Can you return large arrays from executables back to python at a reasonable speed? Or is extending with Cython/SWIG/ctypes the only way to go?
You have two options.
subprocess.check_output or similar will return the entire program output as a string
subprocess.popen and incrementally read.
In the 1st scenario the output must fit in ram, otherwise very bad things will happen. In the second scenario you can set the buffer size and your subprocess should then halt until you read from the buffer at which point it will continue again.
cython/swig etc. etc. do nothing to reduce the memory footprint. If this is memory bound then you most likely want to use python and incrementally read/write to disk. Python is very efficient at passing around memory (i.e. it will not do eleventy seven memory copies unless you tell it to do that).
I have a list of file and there is only one in used in a time. So i wanna know which file is being used by specific program. Since i can use 'unlocker' to find out a file that are in used like this question have mentioned. But i want a programming way so that my program can help me find out. Is there any way?
Specially, the simple 'open' function in whatever r/w mode CAN access the using file and python won't throw any exception. I can tell which file being used only by 'unlocker'.
I have find out that the python 'open' function in w mode have took the access permission from that specific program, and the program then don't work so well. In this moment i open the unlocker and i can see two process accessing the file. Is there any 'weak' method that can only detect whether the file is being used?
I'm not sure which of these two you want to find:
Are there are any existing HANDLEs for a given file, like the handle and Process Explorer tools shows?
Are there any existing locked HANDLEs for a given file, like the Unlocker tool shows.
But either way, the answer is similar.
Obviously it's doable, or those tools couldn't do it, right? Unfortunately, there is nothing in the Python stdlib that can help. So, how do you do it?
You will need to access Windows APIs functions—through pywin32, ctypes, or otherwise.
There are two ways to go about it. The first, mucking about with the NT kernel object APIs, is much harder, and only really needed if you need to work with very old versions of Windows. The second, NtQuerySystemInformation, is the one you probably want.
The details are pretty complicated (and not well documented), but they're explained in a CodeProject sample program and on the Sysinternals forums.
If you don't understand the code on those pages, or how to call it from Python, you really shouldn't be doing this. If you get the gist, but have questions, or get stuck, of course you can ask for help here (or at the sysinternals forums or elsewhere).
However, even if you have used ctypes for Windows before, there are a few caveats you may not know about:
Many of these functions either aren't documented, or the documentation doesn't tell you which DLL to find them in. They will all be in either ntdll or kernel32; in some cases, however, the documented NtFooBar name is just an alias for the real ZwFooBar function, so if you don't find NtFooBar in either DLL, look for ZwFooBar.
At least on older versions of Windows, ZwQuerySystemInformation does not act as you'd expect: You cannot call it with a 0 SystemInformationLength, check the ReturnLength, allocate a buffer of that size, and try again. The only thing you can do is start with a buffer with enough room for, say, 8 handles, try that, see if you get an error STATUS_INFO_LENGTH_MISMATCH, increase that number 8, and try again until it succeeds (or fails with a different error). The code looks something like this (assuming you've defined the SYSTEM_HANDLE structure):
STATUS_INFO_LENGTH_MISMATCH = 0xc0000004
i = 8
while True:
class SYSTEM_HANDLE_INFORMATION(Structure):
_fields_ = [('HandleCount', c_ulong),
('Handles', SYSTEM_HANDLE * i)]
buf = SYSTEM_HANDLE_INFORMATION()
return_length = sizeof(buf)
rc = ntdll.ZwQuerySystemInformation(SystemHandleInformation,
buf, sizeof(buf),
byref(return_length))
if rc == STATUS_INFO_LENGTH_MISMATCH:
i += 8
continue
elif rc == 0:
return buf.Handles[:buf.HandleCount]
else:
raise SomeKindOfError(rc)
Finally, the documentation doesn't really explain this anywhere, but the way to get from a HANDLE that you know is a file to a pathname is a bit convoluted. Just using NtQueryObject(ObjectNameInformation) returns you a kernel object space pathname, which you then have to map to either a DOS pathname, a possibly-UNC normal NT pathname, or a \?\ pathname. Of course the first doesn't work files on network drives without a mapped drive letter; neither of the first two work for files with very long pathnames.
Of course there's a simpler alternative: Just drive handle, Unlocker, or some other command-line tool via subprocess.
Or, somewhere in between, build the CodeProject project linked above and just open its DLL via ctypes and call its GetOpenedFiles method, and it will do the hard work for you.
Since the project builds a normal WinDLL-style DLL, you can call it in the normal ctypes way. If you've never used ctypes, the examples in the docs show you almost everything you need to know, but I'll give some pseudocode to get you started.
First, we need to create an OF_CALLBACK type for the callback function you're going to write, as described in Callback functions. Since the prototype for OF_CALLBACK is defined in a .h file that I can't get to here, I'm just guessing at it; you'll have to look at the real version and translate it yourself. But your code is going to look something like this:
from ctypes import windll, c_int, WINFUNCTYPE
from ctypes.wintypes import LPCWSTR, UINT_PTR, HANDLE
# assuming int (* OF_CALLBACK)(int, HANDLE, int, LPCWSTR, UINT_PTR)
OF_CALLBACK = WINFUNCTYPE(c_int, HANDLE, c_int, LPWCSTR, UINT_PTR)
def my_callback(handle, namelen, name, context):
# do whatever you want with each handle, and whatever other args you get
my_of_callback = OF_CALLBACK(my_callback)
OpenFileFinder = windll.OpenFileFinder
# Might be GetOpenedFilesW, as with most Microsoft DLLs, as explained in docs
OpenFileFinder.GetOpenedFiles.argtypes = (LPCWSTR, c_int, OF_CALLBACK, UINT_PTR)
OpenFileFinder.GetOpenedFiles.restype = None
OpenFileFinder.GetOpenedFiles(ru'C:\Path\To\File\To\Check.txt', 0, my_of_callback, None)
It's quite possible that what you get in the callback is actually a pointer to some kind of structure or list of structures you're going to have to define—likely the same SYSTEM_HANDLE you'd need for calling the Nt/Zw functions directly, so let me show that as an example—if you get back a SYSTEM_HANDLE *, not a HANDLE, in the callback, it's as simple as this:
class SYSTEM_HANDLE(Structure):
_fields_ = [('dwProcessId', DWORD),
('bObjectType', BYTE),
('bFlags', BYTE),
('wValue', WORD),
('pAddress', PVOID),
('GrantedAccess', DWORD)]
# assuming int (* OF_CALLBACK)(int, SYSTEM_HANDLE *, int, LPCWSTR, UINT_PTR)
OF_CALLBACK = WINFUNCTYPE(c_int, POINTER(SYSTEM_HANDLE), c_int, LPWCSTR, UINT_PTR)
You could use a try/except block.
check if a file is open in Python
try:
f = open("file.txt", "r")
except IOError:
print "This file is already in use"