If I have the following function and the optional argument myobj is not passed, does myobj remain NULL or is it set to Py_None?
static PyObject * myfunc(PyObject * self, PyObject * args) {
PyObject * myobj = NULL;
if (!PyArg_ParseTuple(args, "|O", &myobj)) {
return NULL;
}
// ...
}
According to Parsing arguments and building values,
| Indicates that the remaining arguments in the Python argument list are optional. The C variables corresponding to optional arguments should be initialized to their default value — when an optional argument is not specified, PyArg_ParseTuple() does not touch the contents of the corresponding C variable(s).
Does this apply to PyObject *s? It's obviously a pointer that exists in C so one could say it's a C variable, but it's a pointer to a python object so one could also say it does not count as a C variable.
It will remain NULL. And of course a pointer to a struct is a C object.
I have a simple class below,
class MyClass(int):
    def __index__(self):
        return 1
According to operator.index documentation,
operator.index(a)
Return a converted to an integer. Equivalent to a.__index__()
But when I use operator.index with a MyClass instance, I get 100 instead of 1 (although I do get 1 if I call a.__index__() directly). Why is that?
>>> a = MyClass(100)
>>>
>>> import operator
>>> print(operator.index(a))
100
>>> print(a.__index__())
1
This actually appears to be a deep-rooted issue in cpython. If you look at the source code for operator.py, you can see the definition of index:
def index(a):
    "Same as a.__index__()."
    return a.__index__()
So...why is it not equivalent? It's literally calling __index__. Well, at the bottom of the source, there's the culprit:
try:
    from _operator import *
except ImportError:
    pass
else:
    from _operator import __doc__
It's overwriting the definitions with a native _operator module. In fact, if you comment this out (either by modifying the actual library or making your own fake operator.py* and importing that), it works. So, we can find the source code for the native _operator library, and look at the related part:
static PyObject *
_operator_index(PyObject *module, PyObject *a)
{
return PyNumber_Index(a);
}
So, it's a wrapper around the PyNumber_Index function. PyNumber_Index is a wrapper around _PyNumber_Index, so we can look at that:
PyObject *
_PyNumber_Index(PyObject *item)
{
PyObject *result = NULL;
if (item == NULL) {
return null_error();
}
if (PyLong_Check(item)) {
Py_INCREF(item);
return item;
}
if (!_PyIndex_Check(item)) {
PyErr_Format(PyExc_TypeError,
"'%.200s' object cannot be interpreted "
"as an integer", Py_TYPE(item)->tp_name);
return NULL;
}
result = Py_TYPE(item)->tp_as_number->nb_index(item);
if (!result || PyLong_CheckExact(result))
return result;
if (!PyLong_Check(result)) {
PyErr_Format(PyExc_TypeError,
"__index__ returned non-int (type %.200s)",
Py_TYPE(result)->tp_name);
Py_DECREF(result);
return NULL;
}
/* Issue #17576: warn if 'result' not of exact type int. */
if (PyErr_WarnFormat(PyExc_DeprecationWarning, 1,
"__index__ returned non-int (type %.200s). "
"The ability to return an instance of a strict subclass of int "
"is deprecated, and may be removed in a future version of Python.",
Py_TYPE(result)->tp_name)) {
Py_DECREF(result);
return NULL;
}
return result;
}
PyObject *
PyNumber_Index(PyObject *item)
{
PyObject *result = _PyNumber_Index(item);
if (result != NULL && !PyLong_CheckExact(result)) {
Py_SETREF(result, _PyLong_Copy((PyLongObject *)result));
}
return result;
}
You can see before it even calls nb_index (the C name for __index__), it calls PyLong_Check on the argument, and if it's true, it just returns the item with no modification. PyLong_Check is a macro that checks for long subtyping (int in python is a PyLong):
#define PyLong_Check(op) \
PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LONG_SUBCLASS)
#define PyLong_CheckExact(op) Py_IS_TYPE(op, &PyLong_Type)
So, basically, the takeaway is that for whatever reason, probably for speed, int subclasses don't get their __index__ method called, and instead just get _PyLong_Copy'd to the resulting return value, but only in the native _operator module, and not in the non-native operator.py. This conflict of implementation as well as inconsistency in documentation leads me to believe that this is an issue, either in the documentation or the implementation, and you may want to raise it as one.
It's likely a documentation and not an implementation issue, as cpython has a habit of sacrificing correctness for speed: (nan,) == (nan,) but nan != nan.
* You may have to name it something like fake_operator.py then import it with import fake_operator as operator
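The behaviour described above can be checked with a short experiment. This is a minimal sketch that assumes CPython with the compiled _operator module available; the class names IntSubclass and PlainIndex are made up for illustration.

```python
import operator


class IntSubclass(int):
    """An int subclass whose __index__ disagrees with its own value."""

    def __index__(self):
        return 1


class PlainIndex:
    """Not an int subclass, so __index__ is the only route to an integer."""

    def __index__(self):
        return 1


sub = IntSubclass(100)
plain = PlainIndex()

# For the int subclass, the native implementation uses the int value itself.
from_sub = operator.index(sub)
# Calling the method directly bypasses that shortcut.
direct = sub.__index__()
# For a non-int object, __index__ is actually consulted.
from_plain = operator.index(plain)
# Sequence indexing goes through the same machinery.
picked = ["zero", "one"][plain]

print(from_sub, direct, from_plain, picked)
```

Under those assumptions, only the int subclass has its __index__ ignored; the plain class behaves the way the documentation describes.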
This is because your type is an int subclass. __index__ will not be used because the instance is already an integer. That much is by design, and unlikely to be considered a bug in CPython. PyPy behaves the same.
In _operator.c:
static PyObject *
_operator_index(PyObject *module, PyObject *a)
/*[clinic end generated code: output=d972b0764ac305fc input=6f54d50ea64a579c]*/
{
return PyNumber_Index(a);
}
Note that the Python code in operator.py is generally not used; it is only a fallback for when the compiled _operator module is not available. That explains why the result differs from a.__index__().
In abstract.c, cropped after the relevant PyLong_Check part:
/* Return an exact Python int from the object item.
Raise TypeError if the result is not an int
or if the object cannot be interpreted as an index.
*/
PyObject *
PyNumber_Index(PyObject *item)
{
PyObject *result = _PyNumber_Index(item);
if (result != NULL && !PyLong_CheckExact(result)) {
Py_SETREF(result, _PyLong_Copy((PyLongObject *)result));
}
return result;
}
...
/* Return a Python int from the object item.
Can return an instance of int subclass.
Raise TypeError if the result is not an int
or if the object cannot be interpreted as an index.
*/
PyObject *
_PyNumber_Index(PyObject *item)
{
PyObject *result = NULL;
if (item == NULL) {
return null_error();
}
if (PyLong_Check(item)) {
Py_INCREF(item);
return item; /* <---- short-circuited here */
}
...
}
The documentation for operator.index is inaccurate, so this may be considered a minor documentation issue:
>>> import operator
>>> operator.index.__doc__
'Same as a.__index__()'
So, why isn't __index__ considered for integers? The probable answer is found in PEP 357, under the discussion section titled Speed:
Implementation should not slow down Python because integers and long integers used as indexes will complete in the same number of instructions. The only change will be that what used to generate an error will now be acceptable.
We do not want to slow down the most common case for slicing with integers, having to check for an nb_index slot every time.
Update
This answer is incorrect; I misread the documentation. See Aplet123's answer instead. Tl;dr the problem is actually that the C implementation doesn't match the documentation and Python implementation. The C implementation is more like a if isinstance(a, int) else a.__index__().
To prove it, try defining MyClass.__int__(). The outcome will be the same.
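As a rough illustration, the one-liner above can be written out and compared with the real function. This is only a sketch: index_approx is a hypothetical helper, not CPython's implementation, and it ignores the detail that PyNumber_Index may return a plain int copy rather than the subclass instance.

```python
import operator


def index_approx(a):
    # Hypothetical pure-Python approximation of the C behaviour:
    # ints (and int subclasses) short-circuit, everything else uses __index__.
    return a if isinstance(a, int) else a.__index__()


class MyClass(int):
    def __index__(self):
        return 1

    def __int__(self):
        return 2


a = MyClass(100)
native = operator.index(a)   # neither __index__ nor __int__ is consulted
approx = index_approx(a)     # the approximation agrees on the value

print(native, approx)
```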
Original answer
See the documentation for object.__index__():
object.__index__(self)
Called to implement operator.index(), and whenever Python needs to losslessly convert the numeric object to an integer object (such as in slicing, or in the built-in bin(), hex() and oct() functions). Presence of this method indicates that the numeric object is an integer type. Must return an integer.
If __int__(), __float__() and __complex__() are not defined then corresponding built-in functions int(), float() and complex() fall back to __index__().
(added bold)
a.__int__() exists, so its return value is used instead.
>>> a.__int__
<method-wrapper '__int__' of MyClass object at 0x7f2c5f0f4ec8>
>>> a.__int__()
100
I am working concurrently in C# and in Python.
Is there a difference, in terms of what is being created in memory, between passing a reference type in C#, and passing (by assignment) in Python? It seems in either case, if the variable is changed* in the function, it is changed in the outside scope as well.
(*) of course in Python it must be mutable for this to occur. An immutable object cannot be changed - but that is another topic.
Are we basically just talking different terminology for the same process, or is there a conceptual difference to be learned here, in terms of the underlying mechanism in memory?
First, all arguments are passed by value by default in C#. This has nothing to do with the type being a reference type or a value type, both behave exactly the same way.
Now, the question is, what is a variable? A variable is a placeholder for a value, nothing more. When a variable is passed by value, a copy of that value is made.
And what is the value stored in a variable? Well, if the type of the variable is a reference type, the value is basically the memory address of the object it's referencing. If it's a value type, then the value is the object itself.
So when you say:
It seems in either case, if the variable is changed* in the function, it is changed in the outside scope as well.
That is deeply wrong, because you seem to me to be mixing up the type of the argument with how it is passed along:
First example:
var a = new object();
Foo(a);
var isNull = ReferenceEquals(a, null); //false!
void Foo(object o) { o = null; }
Here, a reference-typed variable a is passed by value: a copy is made, and then inside Foo it's reassigned to null. a doesn't care that a copy is reassigned inside Foo; it will still point to the same object.
Things of course change if you pass the argument by reference:
var a = new object();
Foo(ref a);
var isNull = ReferenceEquals(a, null); //true!
void Foo(ref object o) { o = null; }
Now you are not making a copy of a named o, you are passing a itself with an alias named o.
Things behave exactly the same with value types:
var a = 1;
Foo(a);
var isZero = a == 0; //false!
void Foo(int i) { i = 0; }
And
var a = 1;
Foo(ref a);
var isZero = a == 0; //true!
void Foo(ref int i) { i = 0; }
The difference between value types and reference types when you pass them along by value is due to what the value of the variable is. Like we said before, reference-typed variables store the address, so even if you pass along a copy, the copy points to the same object, and any changes to the object are visible from both variables:
var ii = new List<int>();
Foo(ii);
var b = ii.Count == 1; //true!
void Foo(List<int> list) { list.Add(1); }
But with value types, the value is the object itself, so you are passing along a copy of the object, and you are therefore modifying a copy:
struct MutableStruct
{
public int I { get; set; }
}
var m = new MutableStruct();
Foo(m);
var b = m.I == 1; //false!
void Foo(MutableStruct mutableStruct) { mutableStruct.I = 1; }
Does this make things clearer?
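Since the question also asks about Python, here is a rough Python counterpart of the first example and the List<int> example. It is a sketch with made-up function names (rebind, mutate); Python has no equivalent of ref, so only the pass-by-value cases are shown.

```python
def rebind(o):
    # Rebinds only the local name; the caller's variable is untouched.
    o = None


def mutate(items):
    # Mutates the object that both the caller's name and the local name refer to.
    items.append(1)


a = object()
rebind(a)
still_bound = a is not None   # the caller still sees the original object

numbers = []
mutate(numbers)
was_mutated = numbers == [1]  # the change is visible from the caller

print(still_bound, was_mutated)
```

In both languages the callee receives a copy of the reference, which is why reassignment stays local while mutation of the shared object does not.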
How does one go about parsing a group of required but mutually exclusive arguments using the Python C API?
E.g. given the function definition
static PyObject* my_func(PyObject *self, PyObject *args, PyObject *kwargs) {
double a; // first argument, required
double b=0, c=0; // second argument, required but mutually exclusive, b is default keyword if no keyword is set
char d[] = "..."; // third argument, optional
// parse arguments
...
}
My idea here was to parse the input arguments twice, i.e. replacing ... above with:
static const char *kwList1[] = {"a","b","c","d"};
static const char *kwList2[] = {"a","b","d"};
int ret;
if (!(ret = PyArg_ParseTupleAndKeywords(args,kwargs,"d|dds",(char **)kwList1,&a,&b,&c,&d))) {
ret = PyArg_ParseTupleAndKeywords(args,kwargs,"d|ds",(char **)kwList2,&a,&b,&d);
}
if (!ret) return NULL;
// verify that one of, but not both, variables b and c are non-zero
...
However, the second call to PyArg_ParseTupleAndKeywords() returns 0 for valid input so I assume here that the variables args and kwargs have some attributes set by the first call to PyArg_ParseTupleAndKeywords() that causes the second call to fail (output python error is: TypeError: a float is required).
I'm aware that the above could be solved using the argparse Python module, but I would prefer a solution that uses the C API directly. One idea would be to first copy the input args and kwargs into two new PyObject variables and use these in the second call to PyArg_ParseTupleAndKeywords(); however, I can't find any API function to do so (I guess I would also need to know how to release the memory allocated for this).
It seems the issue was that the first call to PyArg_ParseTupleAndKeywords() set the error indicator, which caused the second call to fail. So the solution is to insert a call to PyErr_Clear() between the two calls to PyArg_ParseTupleAndKeywords(). In summary, the following code performs the task:
static PyObject* my_func(PyObject *self, PyObject *args, PyObject *kwargs) {
    double a; // first argument, required
    double b=0, c=0; // second argument, required but mutually exclusive, b is default keyword if no keyword is set
    const char *d = "..."; // third argument, optional; "s" stores a pointer, so d must be a pointer rather than an array
    // parse arguments (keyword lists must be NULL-terminated)
    static const char *kwList1[] = {"a","b","c","d",NULL};
    static const char *kwList2[] = {"a","b","d",NULL};
    int ret;
    if (!(ret = PyArg_ParseTupleAndKeywords(args,kwargs,"d|dds",(char **)kwList1,&a,&b,&c,&d))) {
        PyErr_Clear(); // discard the error from the first attempt before retrying
        ret = PyArg_ParseTupleAndKeywords(args,kwargs,"d|ds",(char **)kwList2,&a,&b,&d);
    }
    if (!ret) return NULL;
    // verify that one of, but not both, variables b and c are non-zero
    if (b==0 && c==0) {
        PyErr_SetString(PyExc_TypeError,"Required mutually exclusive arguments 'b' or 'c' (pos 2) not found (or input with value 0)");
        return NULL;
    } else if (b!=0 && c!=0) {
        PyErr_SetString(PyExc_TypeError,"Use of multiple mutually exclusive required arguments 'b' and 'c' (pos 2)");
        return NULL;
    }
    ...
}
Then again, this does not guard against calling the function with both b and c when one of them is 0 and the other is not. However, this is a minor problem.
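The limitation comes from using 0 as the marker for "not given". One way around it is a sentinel that no caller can pass by accident. The sketch below shows the idea in plain Python rather than in the C API; _MISSING and the tuple return value are hypothetical, chosen only to make the check visible. In C, a comparable approach would be to parse b and c with the "O" format into PyObject pointers initialised to NULL and test those for NULL before converting them.

```python
_MISSING = object()  # hypothetical sentinel meaning "argument not supplied"


def my_func(a, b=_MISSING, c=_MISSING):
    has_b = b is not _MISSING
    has_c = c is not _MISSING
    # Exactly one of b and c must be supplied, whatever its value is.
    if has_b == has_c:
        raise TypeError("exactly one of 'b' or 'c' is required")
    return ("b", b) if has_b else ("c", c)


# A value of 0 is now accepted, because presence is tracked separately.
print(my_func(1.0, 0.0))
print(my_func(1.0, c=0.0))
```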
I'm calling a C++ function from a DLL with ctypes.
The function definition is
int foo(strc *mystrc, int *varsize);
And the structure:
typedef struct
{
int type;
int count;
void *value;
} strc;
So what I tried in python was to define:
class strc(ctypes.Structure):
    _fields_ = [('type', ctypes.c_int),
                ('count', ctypes.c_int),
                ('value', ctypes.c_void_p)]
And to call the function as
varsize = ctypes.c_int()
mystrc = strc()
foo(ctypes.byref(mystrc), ctypes.byref(varsize))
I can perfectly call the function and retrieve all values except for the "value". It should be an array of variables indicated by the "type", have size "varsize" and be an array of "count" variables.
How can I retrieve what is indicated by the void pointer?
template<class T> void iterate_strc_value(const void* value, int size)
{
    const T* element = reinterpret_cast<const T*>(value);
    for(int offset = 0; offset != size; ++offset)
    {
        const T& current = *(element + offset); // the element at the (offset+1)th position
        // ... use current here
    }
}
switch(strc_var.type)
{
    case 0: iterate_strc_value<char>(strc_var.value, element_count); break;
    case 1: iterate_strc_value<int>(strc_var.value, element_count); break;
    case 2: iterate_strc_value<std::string>(strc_var.value, element_count); break;
    default: break; // invalid type -> maybe throw exception within python?!
}
The value pointer is the pointer that you also named value, while size specifies the number of elements in the array. The size of the type is not needed, as the size of your types is known at compile time.
Basically, the function just converts the void* to a pointer of your desired type and then uses pointer arithmetic to iterate over the array. This approach is general enough to iterate over any void* that points to an array, as long as you know its size.
You could also add a third argument as a callback which performs the given action on each element to keep specialized code outside of the template function.
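On the Python side, the same idea can be expressed with ctypes.cast. The following is a sketch under assumptions: the mapping from type codes to element types mirrors the switch above and is not confirmed by the question, read_values and TYPE_MAP are hypothetical names, and because the real DLL is not available a local array stands in for the data that foo would fill in.

```python
import ctypes


class strc(ctypes.Structure):
    _fields_ = [('type', ctypes.c_int),
                ('count', ctypes.c_int),
                ('value', ctypes.c_void_p)]


# Assumed mapping from the "type" code to a ctypes element type.
TYPE_MAP = {0: ctypes.c_char, 1: ctypes.c_int}


def read_values(s):
    """Reinterpret s.value as an array of s.count elements of the mapped type."""
    element_type = TYPE_MAP[s.type]
    ptr = ctypes.cast(s.value, ctypes.POINTER(element_type))
    return [ptr[i] for i in range(s.count)]


# Stand-in for what foo() would produce: an int array owned by Python.
backing = (ctypes.c_int * 3)(1, 20, 3)
mystrc = strc(type=1, count=3, value=ctypes.addressof(backing))

print(read_values(mystrc))
```

The backing array has to stay alive for as long as the pointer is read; with a real DLL, the lifetime of the memory behind value is decided by the library instead.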
Referring to http://mail.python.org/pipermail/python-dev/2009-June/090210.html
and http://dan.iel.fm/posts/python-c-extensions/
Here are the other places I searched regarding my question:
http://article.gmane.org/gmane.comp.python.general/424736
http://joyrex.spc.uchicago.edu/bookshelves/python/cookbook/pythoncook-CHP-16-SECT-3.html
http://docs.python.org/2/c-api/sequence.html#PySequence_Check
Python extension module with variable number of arguments
I am inexperienced in Python/C API.
I have the following code:
sm_int_list = (1,20,3)
c_int_array = (ctypes.c_int * len(sm_int_list))(*sm_int_list)
sm_str_tuple = ('some','text', 'here')
On the C extension side, I have done something like this:
static PyObject* stuff_here(PyObject *self, PyObject *args)
{
char* input;
int *i1, *i2;
char *s1, *s2;
// args = (('some','text', 'here'), [1,20,3], ('some','text', 'here'), [1,20,3])
PyArg_ParseTuple(args, "(s#:):#(i:)#(s#:):#(i:)#", &s1, &i1, &s2, &i2);
/*stuff*/
}
such that:
stuff.here(('some','text', 'here'), [1,20,3], ('some','text', 'here'), [1,20,3])
returns data in the same form as args after some computation.
I would like to know whether this PyArg_ParseTuple expression is the proper way to parse:
an array of strings of varying length
an array of integers
UPDATE NEW
Is this the correct way?:
static PyObject* stuff_here(PyObject *self, PyObject *args)
unsigned int tint[], cint[];
ttotal=0, ctotal=0;
char *tstr, *cstr;
int *t_counts, *c_counts;
Py_ssize_t size;
PyObject *t_str1, *t_int1, *c_str2, *c_int2; //the C var that takes in the py variable value
PyObject *tseq, cseq;
int t_seqlen=0, c_seqlen=0;
if (!PyArg_ParseTuple(args, "OOiOOi", &t_str1, &t_int1, &ttotal, &c_str2, &c_int2, &ctotal))
{
return NULL;
}
if (!PySequence_Check(tag_str1) && !PySequence_Check(cat_str2)) return NULL;
else:
{
//All things t
tseq = PySequence_Fast(t_str1, "iterable");
t_seqlen = PySequence_Fast_GET_SIZE(tseq);
t_counts = PySequence_Fast(t_int1);
//All things c
cseq = PySequence_Fast(c_str2);
c_seqlen = PySequence_Fast_GET_SIZE(cseq);
c_counts = PySequence_Fast(c_int2);
//Make c arrays of all things tag and cat
for (i=0; i<t_seqlen; i++)
{
tstr[i] = PySequence_Fast_GET_ITEM(tseq, i);
tcounts[i] = PySequence_Fast_GET_ITEM(t_counts, i);
}
for (i=0; i<c_seqlen; i++)
{
cstr[i] = PySequence_Fast_GET_ITEM(cseq, i);
ccounts[i] = PySequence_Fast_GET_ITEM(c_counts, i);
}
}
OR
PyArg_ParseTuple(args, "(s:)(i:)(s:)(i:)", &s1, &i1, &s2, &i2)
And then again while returning,
Py_BuildValue("sisi", arr_str1,arr_int1,arr_str2,arr_int2) ??
In fact, if someone could clarify the various PyArg_ParseTuple formats in detail, that would be of great benefit. The Python C API documentation, as I find it, is not exactly a tutorial on how to do things.
You can use PyArg_ParseTuple to parse a real tuple that has a fixed structure. In particular, the number of items in the subtuples cannot change.
As the 2.7.5 documentation says, your format "(s#:):#(i:)#(s#:):#(i:)#" is wrong, since : cannot occur inside nested parentheses. The format "(sss)(iii)(sss)(iii)", along with a total of 12 pointer arguments, should match your arguments. Likewise, for Py_BuildValue you can use the same format string (which creates 4 tuples within 1 tuple), or "(sss)[iii](sss)[iii]" if the type matters (this puts the integers in lists instead of tuples).