Recently I've been reading the CPython source code. I know how to use generators (next, send, etc.), but it's fun to understand them by reading the C code.
I found the code in Objects/genobject.c, and it's not that hard (but still not easy) to understand. I want to know how it really works and make sure I don't misunderstand generators in Python.
I know everything calls
static PyObject *
gen_send_ex(PyGenObject *gen, PyObject *arg, int exc)
and the result is returned from PyEval_EvalFrameEx, which appears to operate on a dynamic frame struct. Could I think of it as a stack or something similar?
OK, it looks like Python stores some context in memory (am I right?). It looks like every time we use yield, it creates a generator and stores the context in memory, although not all of the functions and variables.
I know that if I have a big loop or a lot of data to parse, yield is amazing: it saves a lot of memory and keeps the code simple. But some of my workmates like to use yield everywhere, just like return. That makes the code hard to read and understand, and Python stores context for functions that may never be called again. Is it bad practice?
So, the questions are:
How does PyEval_EvalFrameEx work?
What is the memory use of yield?
Is it bad practice to use yield everywhere?
Also, I found that if I have a generator, the function gen_send_ex is called twice. Why?
def test():
    while 1:
        yield 'test here'
test().next()
It calls gen_send_ex twice: the first time with no args and the second time with args, and then I get the result.
Thanks for your patience.
I read these articles:
This article explains how PyEval_EvalFrameEx works.
http://tech.blog.aknin.name/2010/09/02/pythons-innards-hello-ceval-c-2/
This article explains the frame struct in Python.
http://tech.blog.aknin.name/2010/07/22/pythons-innards-interpreter-stacks/
Both are very important here.
So let me answer my own question. I don't know if I'm right.
If I have misunderstood something or am completely wrong, please let me know.
If I have this code:
def gen():
    count = 0
    while count < 10:
        count += 1
        print 'call here'
        yield count
That is a very simple generator.
f = gen()
Every time we call gen(), Python creates a generator object.
PyObject *
PyGen_New(PyFrameObject *f)
{
    PyGenObject *gen = PyObject_GC_New(PyGenObject, &PyGen_Type);
    if (gen == NULL) {
        Py_DECREF(f);
        return NULL;
    }
    gen->gi_frame = f;
    Py_INCREF(f->f_code);
    gen->gi_code = (PyObject *)(f->f_code);
    gen->gi_running = 0;
    gen->gi_weakreflist = NULL;
    _PyObject_GC_TRACK(gen);
    return (PyObject *)gen;
}
We can see that it initializes a generator object and attaches a frame (gi_frame) to it.
Whatever we do, f.send() or f.next(), ends up calling gen_send_ex, as the code below shows:
static PyObject *
gen_iternext(PyGenObject *gen)
{
    return gen_send_ex(gen, NULL, 0);
}
static PyObject *
gen_send(PyGenObject *gen, PyObject *arg)
{
    return gen_send_ex(gen, arg, 0);
}
The only difference between the two functions is arg: send passes an argument, while next passes NULL.
The gen_send_ex code is below:
static PyObject *
gen_send_ex(PyGenObject *gen, PyObject *arg, int exc)
{
    PyThreadState *tstate = PyThreadState_GET();
    PyFrameObject *f = gen->gi_frame;
    PyObject *result;
    if (gen->gi_running) {
        fprintf(stderr, "gi init\n");
        PyErr_SetString(PyExc_ValueError,
                        "generator already executing");
        return NULL;
    }
    if (f==NULL || f->f_stacktop == NULL) {
        fprintf(stderr, "check stack\n");
        /* Only set exception if called from send() */
        if (arg && !exc)
            PyErr_SetNone(PyExc_StopIteration);
        return NULL;
    }
    if (f->f_lasti == -1) {
        fprintf(stderr, "f->f_lasti\n");
        if (arg && arg != Py_None) {
            fprintf(stderr, "something here\n");
            PyErr_SetString(PyExc_TypeError,
                            "can't send non-None value to a "
                            "just-started generator");
            return NULL;
        }
    } else {
        /* Push arg onto the frame's value stack */
        fprintf(stderr, "frame\n");
        if(arg) {
            /* fprintf arg */
        }
        result = arg ? arg : Py_None;
        Py_INCREF(result);
        *(f->f_stacktop++) = result;
    }
    fprintf(stderr, "here\n");
    /* Generators always return to their most recent caller, not
     * necessarily their creator. */
    Py_XINCREF(tstate->frame);
    assert(f->f_back == NULL);
    f->f_back = tstate->frame;
    gen->gi_running = 1;
    result = PyEval_EvalFrameEx(f, exc);
    gen->gi_running = 0;
    /* Don't keep the reference to f_back any longer than necessary. It
     * may keep a chain of frames alive or it could create a reference
     * cycle. */
    assert(f->f_back == tstate->frame);
    Py_CLEAR(f->f_back);
    /* If the generator just returned (as opposed to yielding), signal
     * that the generator is exhausted. */
    if (result == Py_None && f->f_stacktop == NULL) {
        fprintf(stderr, "here2\n");
        Py_DECREF(result);
        result = NULL;
        /* Set exception if not called by gen_iternext() */
        if (arg)
            PyErr_SetNone(PyExc_StopIteration);
    }
    if (!result || f->f_stacktop == NULL) {
        fprintf(stderr, "here3\n");
        /* generator can't be rerun, so release the frame */
        Py_DECREF(f);
        gen->gi_frame = NULL;
    }
    fprintf(stderr, "return result\n");
    return result;
}
It looks like the generator object is a controller of its own frame, which is called gi_frame.
I added some fprintf(...) calls, so let's run the code.
f.next()
f->f_lasti
here
call here
return result
1
So, first it goes into the f_lasti branch (f_lasti is an integer offset into the bytecode of the last instruction executed, initialized to -1). It is indeed -1, and since there are no args, the function carries on.
Then it reaches here, and the most important thing now is PyEval_EvalFrameEx. PyEval_EvalFrameEx implements CPython's evaluation loop; we can think of it as running all the code (in fact, the Python opcodes). It runs the line print 'call here', which prints the text.
When the code reaches yield, Python stores the context in the frame object (search for "call stack" for background), hands the value back, and gives up control.
After everything is done, it prints return result, and the value 1 is shown in the terminal.
The next time we run next(), it does not enter the f_lasti branch. It shows:
frame
here
call here
return result
2
We did not send an arg, so we still get the result from PyEval_EvalFrameEx, and the result is 2.
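The behaviour traced above can also be observed from pure Python. The following is a minimal sketch, assuming Python 3 (so next(g) replaces g.next()); the generator and the log list are illustrative names, not part of the original code. It shows the "just-started generator" check, the value pushed by send(), the locals kept alive in gi_frame between yields, and the frame being released once the generator is exhausted.

```python
import inspect

log = []  # records what each yield expression evaluated to


def gen():
    count = 0
    while count < 3:
        count += 1
        received = yield count  # value from send(), or None for next()
        log.append(received)


g = gen()
state_created = inspect.getgeneratorstate(g)  # generator has not run yet

# Sending a non-None value before the first yield is rejected,
# mirroring the f_lasti == -1 branch in gen_send_ex.
try:
    g.send("too early")
except TypeError as exc:
    early_send_error = str(exc)

first = next(g)                              # runs up to the first yield
saved_count = g.gi_frame.f_locals["count"]   # local state lives in the frame
second = g.send("hello")                     # "hello" becomes the yield result
third = next(g)                              # next() behaves like send(None)

# One more step lets the loop finish, so the generator returns.
try:
    next(g)
    exhausted = False
except StopIteration:
    exhausted = True

frame_released = g.gi_frame is None          # frame is dropped after exhaustion
state_closed = inspect.getgeneratorstate(g)
```

Only one frame per live generator is kept, which is why the cost of a suspended generator is roughly the size of its frame and locals rather than the size of the data it will eventually produce.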
I'll try to keep it as short as possible.
I'm making a python application which uses a C Extension (hw_timer).
This extension is in charge of creating a timer.
The Python application starts by calling hw_timer.StartTimer which instantiates a timer object.
After that, inside an infinite loop, the Python application keeps calling hw_timer.ResetTimer and hw_timer.StartTimer. ResetTimer effectively destroys the timer, and then a new one is created with StartTimer, and so on.
The code went through several changes because, no matter what, I kept getting segmentation faults whenever I tried to create a new timer.
However, the situation now is VERY strange.
Here's the Python code. You'll notice I've removed the infinite loop I described before, because I wanted to create a very simple scenario to showcase my issue:
if __name__ == "__main__":
    timerPtr = hw_timer.StartTimer(os.getpid()) #start timer returns a tuple
    print("timer,tid,pid -> {}".format(timerPtr))
    print("Result of reset timer: {}".format(hw_timer.ResetTimer(timerPtr[0])))
    print("---------")
    time.sleep(10)
    timerPtr=hw_timer.StartTimer(os.getpid())
    print("timer,tid,pid -> {}".format(timerPtr))
    print("Result of reset timer: {}".format(hw_timer.ResetTimer(timerPtr[0])))
    print("---------")
    time.sleep(10)
And this is the output
timer,tid,pid -> (16985424, 16598768, 45975)
Result of reset timer: (16985424, 16598768, 45975)
---------
timer,tid,pid -> (16598768, 15553760, 45975)
timer_delete: Invalid argument
As you can see, despite running the same code twice, the second time the system call timer_delete fails while ResetTimer is executing.
The following is the C code (I've removed as much Python.h-related content as possible to make it more readable):
#include <Python.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <time.h>
struct tmr_struct
{
timer_t* tid;
int pid;
};
typedef struct sigevent signal_event;
typedef struct tmr_struct tmr_data;
typedef struct itimerspec tmr_spec;
static const int TIMER_SECOND = 5;
//Method called after timer expiration
void signal_handler (int sv)
{
exit(1);
}
//Creation of a new timer_t object stored inside a tmr_data struct
static void _StartTimer(tmr_data* timer)
{
timer->tid = new timer_t();
signal_event sev;
signal(SIGUSR1, signal_handler);
sev.sigev_notify = SIGEV_THREAD_ID;
sev.sigev_notify_function = NULL;
sev._sigev_un._tid = timer->pid;
sev.sigev_signo = SIGUSR1;
sev.sigev_notify_attributes = NULL;
tmr_spec ts;
if (timer_create(CLOCK_REALTIME, &sev, timer->tid) == -1) {
perror("timer_create");
exit(1);
}
memset(&ts,0,sizeof(ts));
ts.it_value.tv_nsec = 0;
ts.it_value.tv_sec = TIMER_SECOND;
ts.it_interval.tv_nsec = 0;
ts.it_interval.tv_sec = 0;
if (timer_settime(*(timer->tid), 0, &ts, NULL) == -1) {
perror("timer_settime");
exit(1);
}
return;
}
//Start timer method called from the Python application
//Accepts the PID of the Python App and later passes it to _StartTimer using the tmr_data struct
static PyObject* StartTimer(PyObject* self, PyObject* args) {
int pid;
tmr_data* timer = new tmr_data();
if (!PyArg_ParseTuple(args, "i", &pid)) {
return NULL;
}
timer->pid = pid;
_StartTimer(timer);
return Py_BuildValue("iii", timer,timer->tid,timer->pid);
}
//Receives a pointer to a tmr_data struct object, deletes the timer contained inside it
static PyObject* ResetTimer(PyObject* self, PyObject* args)
{
tmr_data* timer;
long ptr_timer;
if (!PyArg_ParseTuple(args, "i", &ptr_timer)) {
return NULL;
}
timer = (tmr_data*)ptr_timer;
if(timer_delete(timer->tid) != 0)
{
perror("timer_delete");
exit(1);
}
return Py_BuildValue("iii", timer,timer->tid,timer->pid);
}
Now consider that the entire code of _StartTimer is correct: it's already been used in other parts of the project for pure C applications and, indeed, the timer does work here as well, at least the first time.
But still, this stuff is all over the place: originally I had ResetTimer calling _StartTimer, but whenever I'd call "timer = new tmr_data()" I would get segmentation fault, so I'm starting to think that the entire timer implementation might be somewhat tricky or prone to errors.
C++ function:
DLLENTRY int VTS_API
SetParamValue( const char *paramName, void *paramValue ) //will return zero(0) in the case of an error
{
rtwCAPI_ModelMappingInfo* mmi = &(rtmGetDataMapInfo(PSA_StandAlone_M).mmi);
int idx = getParamIdx( paramName );
if (idx<0) {
return 0; //Error
}
int retval = capi_ModifyModelParameter( mmi, idx, paramValue );
if (retval == 1 ) {
ParamUpdateConst();
}
return retval;
}
and my Python code:
import os
from ctypes import *
print(os.getcwd())
os.chdir(r"C:\MY_SECRET_DIR")
print(os.getcwd())
PSAdll=cdll.LoadLibrary(os.getcwd()+"\PSA_StandAlone_1.dll")
setParam=PSAdll.SetParamValue
setParam.restype=c_int
setParam.argtypes=[c_char_p, c_void_p]
z=setParam(b"LDOGenSpdSetpoint", int(20) )
returns
z=setParam(b"LDO_PSA_GenSpdSetpoint01", int(20) )
WindowsError: exception: access violation reading 0x00000020
Any idea what can help?
I already tried POINTER and byref(), but I am getting the same output
SetParamValue expects a pointer (void* or c_void_p) as the 2nd argument, but you're passing an int. The function will interpret it as a pointer (memory address), and when attempting to dereference it (to get its content), it will segfault (Access Violation), as the process doesn't have permissions on that address.
To fix the problem, pass the proper arguments to the function:
z = setParam(b"LDOGenSpdSetpoint", pointer(c_int(20)))
You can find more details in the Python 3 documentation: ctypes - A foreign function library for Python.
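To try the fix without the DLL, here is a minimal sketch in which a Python function wrapped with CFUNCTYPE stands in for SetParamValue. The stand-in (_fake_set_param) and the received list are hypothetical and exist only for illustration; the real function lives in the DLL and its behaviour may differ. The point is that the second argument must be the address of a C int, which the callee then dereferences.

```python
import ctypes

# Same signature as in the question: int SetParamValue(const char*, void*)
PROTO = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_char_p, ctypes.c_void_p)

received = []  # what the stand-in function read through the pointer


def _fake_set_param(name, value_ptr):
    # Hypothetical stand-in for the DLL function: value_ptr arrives as a
    # raw address (int) or None, and is dereferenced as an int.
    if not value_ptr:
        return 0  # the real function also returns 0 on error
    value = ctypes.cast(value_ptr, ctypes.POINTER(ctypes.c_int)).contents.value
    received.append((name, value))
    return 1


set_param = PROTO(_fake_set_param)

# Keep the c_int alive in a variable while the call is in progress,
# then pass a pointer to it rather than the bare integer 20.
setpoint = ctypes.c_int(20)
result = set_param(b"LDOGenSpdSetpoint", ctypes.pointer(setpoint))
```

Passing int(20) instead would hand the callee the address 20 (0x14), which is the kind of invalid address reported in the access violation.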
I was trying to reproduce the example from a previous SO post on a kernel above 4 (4.1):
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/netlink.h>
#include <net/netlink.h>
#include <net/net_namespace.h>
/* Protocol family, consistent in both kernel prog and user prog. */
#define MYPROTO NETLINK_USERSOCK
/* Multicast group, consistent in both kernel prog and user prog. */
#define MYGRP 31
static struct sock *nl_sk = NULL;
static void send_to_user(void)
{
struct sk_buff *skb;
struct nlmsghdr *nlh;
char *msg = "Hello from kernel";
int msg_size = strlen(msg) + 1;
int res;
pr_info("Creating skb.\n");
skb = nlmsg_new(NLMSG_ALIGN(msg_size + 1), GFP_KERNEL);
if (!skb) {
pr_err("Allocation failure.\n");
return;
}
nlh = nlmsg_put(skb, 0, 1, NLMSG_DONE, msg_size + 1, 0);
strcpy(nlmsg_data(nlh), msg);
pr_info("Sending skb.\n");
res = nlmsg_multicast(nl_sk, skb, 0, MYGRP, GFP_KERNEL);
if (res < 0)
pr_info("nlmsg_multicast() error: %d\n", res);
else
pr_info("Success.\n");
}
static int __init hello_init(void)
{
pr_info("Inserting hello module.\n");
nl_sk = netlink_kernel_create(&init_net, MYPROTO, NULL);
if (!nl_sk) {
pr_err("Error creating socket.\n");
return -10;
}
send_to_user();
netlink_kernel_release(nl_sk);
return 0;
}
static void __exit hello_exit(void)
{
pr_info("Exiting hello module.\n");
}
module_init(hello_init);
module_exit(hello_exit);
MODULE_LICENSE("GPL");
Compilation works fine, but when I insert the module, it returns:
nlmsg_multicast() error: -3
I don't even know where I can look up the error codes to learn what -3 means in this context (I searched here, but was unable to find anything useful regarding the error code).
Just to be sure, I'm also posting the userland code (Python):
EDITED due to a comment (but still not working):
#!/usr/bin/env python
import socket
import os
import time
sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM, socket.NETLINK_USERSOCK)
# 270 is SOL_NETLINK and 1 is NETLINK_ADD_MEMBERSHIP
sock.setsockopt(270, 1, 31)
while 1:
    try:
        print sock.recvfrom(1024)
    except socket.error, e:
        print 'Exception'
You forgot to bind the socket. :-)
I'm not very fluent with Python, so use this only as a starting point (between the socket and the setsockopt):
sock.bind((0, 0))
That prints me a bunch of garbage, among which I can see
Hello from kernel
By the way: when nlmsg_multicast() returns -ESRCH (the -3 you are seeing), it's usually (or maybe always) because there were no clients listening.
First open the client, then try to send the message from the kernel.
Otherwise, you can always ignore that error code if that makes sense for your use case.
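On the question of where to look up what -3 means: kernel functions conventionally return negated errno values, so the number can be decoded with the standard errno tables. A minimal sketch using Python's errno module follows; the helper name describe_kernel_error is made up for this example.

```python
import errno
import os


def describe_kernel_error(code):
    """Return (symbolic name, message) for a negative kernel-style return code."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


# The value printed by the module in the question.
name, message = describe_kernel_error(-3)
print(name, "-", message)
```

In the kernel sources the same numbers are defined in the errno headers (for example include/uapi/asm-generic/errno-base.h).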
I'm trying to create a Python callback that needs to be invoked when a C callback is called
from a DLL in a Windows environment. Please review the code below to understand the issue.
from ctypes import *
#---------qsort Callback-------------#
IntArray5 = c_int * 5
ia = IntArray5(5,1,7,33,99)
libc = cdll.msvcrt
qsort = libc.qsort
qsort.restype = None
CMPFUNC = CFUNCTYPE(c_int,POINTER(c_int),POINTER(c_int) )
test = 0
def py_cmp_func(a,b):
    #print 'py_cmp_func:',a[0],b[0]
    global test
    test = 10000
    return a[0]-b[0]
cmp_func = CMPFUNC(py_cmp_func)
qsort(ia, len(ia), sizeof(c_int), cmp_func)
print "global test=",test
for item in ia : print item
#----------Load DLL & Connect ------------#
gobiDLL = WinDLL("C:\LMS\QCWWAN2k.dll")
print 'Output of connect : ',gobiDLL.QCWWANConnect()
#----------SetByteTotalsCallback----------#
tx = POINTER(c_ulonglong)
rx = POINTER(c_ulonglong)
proto_callback = WINFUNCTYPE(c_void_p,tx,rx)
gtx = grx = 0 # Used to copy the response in the py_callback
def py_callback(t,r):
    sleep(10)
    print 'python callback ...'
    print "tx=",t,"rx=",r
    global gtx,grx
    gtx = 5000 # gtx = t
    grx = 2000 # grx = r
    #return 0
callback = proto_callback(py_callback)
gobiDLL.SetByteTotalsCallback.restype = c_ulong
gobiDLL.SetByteTotalsCallback.argtypes = [proto_callback,c_byte]
print "SetByteTotalsCallback = ",gobiDLL.SetByteTotalsCallback(callback, c_byte(256))
print "gtx = ",gtx
print "grx = ",grx
The DLL documents the prototype and the callback for the SetByteTotalsCallback() method as shown below.
Prototype :
ULONG QCWWANAPI2K SetSessionStateCallback( tFNSessionState pCallback );
Callback :
void ByteTotalsCallback( ULONGLONG txTotalBytes, ULONGLONG rxTotalBytes );
OUTPUT :
>>>
global test= 10000
1
5
7
33
99
Output of connect : 0
SetByteTotalsCallback = 0
gtx = 0
grx = 0
>>>>
The current problem is that the whole program runs properly,
but the Python callback does not get called at all. The call to
gobiDLL.SetByteTotalsCallback(callback, c_byte(256)) returns status 0, but the callback written
in Python is never called during the call.
Could you please point out what could help enter the Python callback?
The other sample, qsort(), is passed the pointer to the Python function and works wonderfully.
I'm at a loss to find the root cause of the issue here.
TIA,
Anthony
You can't. C/C++ functions can't access Python functions directly; that function prototype is probably expecting a pointer to a C function. Python will be passing it a pointer to its internal data structure for that particular function.
This is the time to build a C extension to python to wrap that DLL and expose it to Python. What you'd do is essentially have the C callback call the Python callback, since that can be done. To be clearer, what you want to achieve is:
| This side is C land i.e. "real" addresses
|
Python objects --> C extension -- register callback with --> DLL
| |
in the python | Calls callback
| |
interpreter <-------------- Callback in C extension <-------
|
The following is a very quick explanation of calling a Python function from C. You'll need to build this code with the same MSVC (or alternative tool) that was used to build your Python distribution; use depends.exe to find out which msvcXX.dll it is linked against.
Global state is generally considered bad, but for simplicity that's what I used:
static PyObject* pyfunc_event_handler = NULL;
static PyObject* pyfunc_event_args = NULL;
I then added a set handler function to make the process of setting the callback easier. However, you don't need to do that, you just need to
static PyObject* set_event_handler(PyObject *self, PyObject *args)
{
    PyObject *result = NULL;
    PyObject *temp;
The next line is the important one: I allow passing of two Python objects (the O arguments to PyArg_ParseTuple). One object contains the function, the other its parameters.
    if (PyArg_ParseTuple(args, "OO", &temp, &pyfunc_event_args)) {
        if (!PyCallable_Check(temp)) {
            PyErr_SetString(PyExc_TypeError, "parameter must be a function");
            return NULL;
        }
Sort out references. Python needs you to do this.
        Py_XINCREF(temp); /* Add a reference to new func */
        Py_XDECREF(pyfunc_event_handler); /* Dispose of previous callback */
        pyfunc_event_handler = temp; /* Remember new callback */
        /* Boilerplate to return "None" */
        Py_INCREF(Py_None);
        result = Py_None;
    }
    return result;
}
You can then call this elsewhere with:
PyObject* arglist = Py_BuildValue("(O)", pyfunc_event_args);
pyobjresult = PyObject_CallObject(pyfunc_event_handler, arglist);
Py_DECREF(arglist);
Don't forget the DECREF; Python needs it in order to garbage-collect the arglist.
From python, using this is as simple as:
set_event_handler(func, some_tuple)
Where func has matching parameters like so:
def func(obj):
    pass  # handle obj
Things you probably want to read up on:
LoadLibrary (load DLL from C).
GetProcAddress (find a function to call).
Extending Python with C or C++ from the Python docs.
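To make the control flow of the extension easier to follow, here is a pure-Python model of the same idea: one function stores the handler and its argument object, and another (standing in for the C callback that the DLL would invoke) calls the stored handler. This is only a sketch of the pattern; fire_event and the module-level variables are hypothetical names, and none of this replaces the C code above.

```python
_event_handler = None  # mirrors pyfunc_event_handler
_event_args = None     # mirrors pyfunc_event_args


def set_event_handler(func, args):
    """Remember a callable and the object to pass to it later."""
    global _event_handler, _event_args
    if not callable(func):
        raise TypeError("parameter must be a function")
    _event_handler = func
    _event_args = args


def fire_event():
    """Stand-in for the C callback: call the stored handler with one argument."""
    if _event_handler is None:
        return None
    return _event_handler(_event_args)


# Usage: the handler receives the stored object as its single parameter.
seen = []
set_event_handler(seen.append, ("tx", "rx"))
fire_event()
```

In the C version, the equivalent of fire_event() is the place where the reference counting and Py_BuildValue boilerplate shown earlier is needed.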
I have a C interface that looks like this (simplified):
extern bool Operation(void ** ppData);
extern float GetFieldValue(void* pData);
extern void Cleanup(void* p);
which is used as follows:
void * p = NULL;
float theAnswer = 0.0f;
if (Operation(&p))
{
    theAnswer = GetFieldValue(p);
    Cleanup(p);
}
You'll note that Operation() allocates the buffer p, that GetFieldValue queries p, and that Cleanup frees p. I don't have any control over the C interface -- that code is widely used elsewhere.
I'd like to call this code from Python via SWIG, but I was unable to find any good examples of how to pass a pointer to a pointer -- and retrieve its value.
I think the correct way to do this is by use of typemaps, so I defined an interface that would automatically dereference p for me on the C side:
%typemap(in) void** {
$1 = (void**)&($input);
}
However, I was unable to get the following python code to work:
import test
p = None
theAnswer = 0.0
if test.Operation(p):
    theAnswer = test.GetFieldValue(p)
    test.Cleanup(p)
After calling test.Operation(), p always kept its initial value of None.
Any help with figuring out the correct way to do this in SWIG would be much appreciated. Otherwise, I'm likely to just write a C++ wrapper around the C code that stops Python from having to deal with the pointer. And then wrap that wrapper with SWIG. Somebody stop me!
Edit:
Thanks to Jorenko, I now have the following SWIG interface:
%module Test
%typemap (in,numinputs=0) void** (void *temp)
{
    $1 = &temp;
}
%typemap (argout) void**
{
    PyObject *obj = PyCObject_FromVoidPtr(*$1, Cleanup);
    $result = PyTuple_Pack(2, $result, obj);
}
%{
extern bool Operation(void ** ppData);
extern float GetFieldValue(void *p);
extern void Cleanup(void *p);
%}
%inline
%{
    float gfv(void *p){ return GetFieldValue(p);}
%}
%typemap (in) void*
{
    if (PyCObject_Check($input))
    {
        $1 = PyCObject_AsVoidPtr($input);
    }
}
The python code that uses this SWIG interface is as follows:
import test
success, p = test.Operation()
if success:
    f = test.GetFieldValue(p) # This doesn't work
    f = test.gfv(p) # This works!
    test.Cleanup(p)
Oddly, in the Python code, test.GetFieldValue(p) returns gibberish, but test.gfv(p) returns the correct value. I've inserted debugging code into the typemap for void*, and both have the same value of p! Any ideas about that?
Update: I've decided to use ctypes. MUCH easier.
I agree with theller, you should use ctypes instead. It's always easier than thinking about typemaps.
But, if you're dead set on using swig, what you need to do is make a typemap for void** that RETURNS the newly allocated void*:
%typemap (in,numinputs=0) void** (void *temp)
{
    $1 = &temp;
}
%typemap (argout) void**
{
    /* PyCObject_FromVoidPtr takes a destructor as its second argument; NULL means none */
    PyObject *obj = PyCObject_FromVoidPtr(*$1, NULL);
    $result = PyTuple_Pack(2, $result, obj);
}
Then your python looks like:
import test
success, p = test.Operation()
theAnswer = 0.0
if success:
    theAnswer = test.GetFieldValue(p)
    test.Cleanup(p)
Edit:
I'd expect swig to handle a simple by-value void* arg gracefully on its own, but just in case, here's swig code to wrap the void* for GetFieldValue() and Cleanup():
%typemap (in) void*
{
    $1 = PyCObject_AsVoidPtr($input);
}
Would you be willing to use ctypes? Here is sample code that should work (although it is untested):
from ctypes import *
test = CDLL("mydll")
test.Operation.restype = c_bool
test.Operation.argtypes = [POINTER(c_void_p)]
test.GetFieldValue.restype = c_float
test.GetFieldValue.argtypes = [c_void_p]
test.Cleanup.restype = None
test.Cleanup.argtypes = [c_void_p]
if __name__ == "__main__":
    p = c_void_p()
    if test.Operation(byref(p)):
        theAnswer = test.GetFieldValue(p)
        test.Cleanup(p)