Pass by reference vs Pass by assignment? C# vs Python - python

I am working concurrently in C# and in Python.
Is there a difference, in terms of what is being created in memory, between passing a reference type in C#, and passing (by assignment) in Python? It seems in either case, if the variable is changed* in the function, it is changed in the outside scope as well.
(*) of course in Python it must be mutable for this to occur. An immutable object cannot be changed - but that is another topic.
Are we basically just talking different terminology for the same process, or is there a conceptual difference to be learned here, in terms of the underlying mechanism in memory?

First, all arguments are passed by value by default in C#. This has nothing to do with the type being a reference type or a value type; both behave exactly the same way.
Now, the question is, what is a variable? A variable is a placeholder for a value, nothing more. When a variable is passed by value, a copy of that value is made.
And what is the value stored in a variable? Well, if the type of the variable is a reference type, the value is basically the memory address of the object it's referencing. If it's a value type, then the value is the object itself.
So when you say:
It seems in either case, if the variable is changed* in the function, it is changed in the outside scope as well.
That is deeply wrong, because you seem to be mixing up the type of the argument with how it is passed along:
First example:
var a = new object();
Foo(a);
var isNull = ReferenceEquals(a, null); //false!
void Foo(object o) { o = null; }
Here, a reference-typed variable a is passed by value: a copy of the reference is made, and inside Foo that copy is reassigned to null. a doesn't care that a copy is reassigned inside Foo; it still points to the same object.
Things of course change if you pass the argument by reference:
var a = new object();
Foo(ref a);
var isNull = ReferenceEquals(a, null); //true!
void Foo(ref object o) { o = null; }
Now you are not making a copy of a named o, you are passing a itself with an alias named o.
Things behave exactly the same with value types:
var a = 1;
Foo(a);
var isZero = a == 0; //false!
void Foo(int i) { i = 0; }
And
var a = 1;
Foo(ref a);
var isZero = a == 0; //true!
void Foo(ref int i) { i = 0; }
The difference between value types and reference types when you pass them along by value is due to what the value of the variable is. Like we said before, reference-typed variables store the address, so even if you pass along a copy, the copy points to the same object, and any changes to the object are visible through both variables:
var ii = new List<int>();
Foo(ii);
var b = ii.Count == 1; //true!
void Foo(List<int> list) { list.Add(1); }
But with value types, the value is the object itself, so you are passing along a copy of the object, and you are therefore modifying a copy:
struct MutableStruct
{
public int I { get; set; }
}
var m = new MutableStruct();
Foo(m);
var b = m.I == 1; //false!
void Foo(MutableStruct mutableStruct) { mutableStruct.I = 1; }
Does this make things clearer?
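On the Python side, the same two cases can be sketched as follows. This is a minimal illustration, assuming the usual Python model in which a call copies the reference held by the argument name; the function names rebind and mutate are made up for the example. It mirrors the o = null and list.Add(1) examples above.

```python
def rebind(items):
    # Assigning to the parameter only changes the local name;
    # the caller's variable still refers to the original list.
    items = None


def mutate(items):
    # Calling a method through the copied reference changes the one
    # shared object, so the caller sees the new element.
    items.append(1)


a = [0]
rebind(a)
after_rebind = a is not None   # True: the caller's list is untouched

mutate(a)                      # a is now [0, 1]: same object, modified in place
```

Python has no counterpart to the C# ref keyword, so a function cannot rebind the caller's name; it can only mutate the object it was given or return a new value.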

Related

Equivalent of python walrus operator (:=) in c++11?

Recently I have been using the := operator in python quite a bit, in this way:
if my_object := SomeClass.function_that_returns_object():
# do something with this object if it exists
print(my_object.some_attribute)
The question
Is there any way to do this in c++11 without the use of stdlib?
for example in an arduino sketch if I wanted to use a method that may potentially return zero, such as:
if(char * data = myFile.readBytes(data, dataLen))
{
// do something
}
else
{
// do something else
}
Python's := assignment expression operator (aka, the "walrus" operator) returns the value of an assignment.
C++'s = assignment operator (both copy assignment and move assignment, as well as other assignment operators) does essentially the same thing, but in a different way. The result of an assignment is a reference to the object that was assigned to, allowing that object to be evaluated in further expressions.
So, the equivalent of:
if my_object := SomeClass.function_that_returns_object():
# do something with this object if it exists
print(my_object.some_attribute)
Would be just like you showed:
SomeType *my_object;
if ((my_object = SomeClass.function_that_returns_object())) {
// do something with this object if it exists
print(my_object->some_attribute);
}
If function_that_returns_object() returns a null pointer, the if evaluates my_object as false, otherwise it evaluates as true. The same can be done with other types, eg:
int value;
if ((value = SomeClass.function_that_returns_int()) == 12345) {
// do something with this value if it matches
}
Not exactly, no.
As is mentioned in other answers, the C++ = operator already does most of what you want. If you have an existing variable, then assignment to that variable returns a reference to it, so you can put that into an if condition:
Foo* a_pointer;
if (a_pointer = some_function()) {
//...
}
Here, the body of the if conditional will execute if some_function returns a non-null pointer, and a_pointer will be a copy of the pointer returned by some_function.
Unlike the walrus operator though, this has the limitation that a_pointer had to first be defined outside of the if condition.
C++17 adds something a bit closer, in that you can initialize a variable inside of the if condition with a special if-initializer syntax:
if (Foo* a_pointer = some_function(); a_pointer) {
//...
}
Note that the initializer still doesn't directly contribute to the truthiness of the if condition. It's only the expression after the ; that determines if the body of the if statement will execute. In this case, a_pointer is initialized to be the value returned by some_function in the initializer and then the condition part checks if a_pointer is truthy.
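For comparison, here is a small sketch of what the walrus operator does on the Python side (Python 3.8 or later). The find_config helper and the store dict are hypothetical names used only for illustration. It shows that the assignment expression yields the assigned value, and that the name stays bound after the if, whereas a variable declared in a C++17 if-initializer is scoped to the if statement.

```python
def find_config(store, key):
    # Hypothetical lookup: returns the stored value, or None when the key is missing.
    return store.get(key)


store = {"retries": 3}

# The parenthesised assignment expression yields the assigned value,
# so it can be compared or tested directly in the condition.
if (retries := find_config(store, "retries")) is not None:
    doubled = retries * 2

# The name bound by := belongs to the enclosing scope, so it is still usable here.
still_visible = retries
```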
According to the documentation, readBytes returns the number of bytes placed in the buffer (not a pointer), so I think you just need to do something like:
if(myFile.readBytes(data, dataLen))
{
// do something with data
}
else
{
// do something else
}

How to create an Enum object in Python C API?

I'm struggling how to create a python Enum object inside the Python C API. The enum class has assigned tp_base to PyEnum_Type, so it inherits Enum. But, I can't figure out a way to tell the Enum base class what items are in the enum. I want to allow iteration and lookup from Python using the __members__ attribute that every Python Enum provides.
Thank you,
Jelle
It is not straightforward at all. The Enum is a Python class using a Python metaclass. It is possible to create it in C but it will be just emulating the constructing Python code in C - the end result is the same and while it speeds up things slightly, you'll most probably run the code only once within each program run.
In any case it is possible, but it is not easy at all. I'll show how to do it in Python:
from enum import Enum
class Color(Enum):
RED = 1
GREEN = 2
BLUE = 3
print(Color)
print(Color.RED)
is the same as:
from enum import Enum
name = 'Color'
bases = (Enum,)
enum_meta = type(Enum)
namespace = enum_meta.__prepare__(name, bases)
namespace['RED'] = 1
namespace['GREEN'] = 2
namespace['BLUE'] = 3
Color = enum_meta(name, bases, namespace)
print(Color)
print(Color.RED)
The latter is the code that you need to translate into C.
Edited note: An answer on a very similar question details how enum.Enum has a functional interface that can be used instead. That is almost certainly the correct approach. I think my answer here is a useful alternative approach to be aware of, although it probably isn't the best solution to this problem.
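As a rough sketch of that functional interface in Python (the member names and values are just the running example from this page), the enum can be built from a list of (name, value) pairs without a class statement:

```python
from enum import Enum

# Functional API: class name first, then the members as (name, value) pairs.
Colour = Enum("Colour", [("RED", 1), ("GREEN", 2), ("BLUE", 3)])

member_names = list(Colour.__members__)   # names in definition order
red = Colour["RED"]                       # lookup by name
also_red = Colour(1)                      # lookup by value
```

From C, the same call could presumably be made by importing the enum module, fetching its Enum attribute, and calling it with the name and a list or dict of members; that translation is not shown here.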
I'm aware that this answer is slightly cheating, but this is exactly the kind of code that's better written in Python, and in the C API we still have access to the full Python interpreter. My reasoning for this is that the main reason to keep things entirely in C is performance, and it seems unlikely that creating enum objects will be performance critical.
I'll give three versions, essentially depending on the level of complexity.
First, the simplest case: the enum is entirely known and defined at compile-time. Here we simply set up an empty global dict, run the Python code, then extract the enum from the global dict:
PyObject* get_enum(void) {
const char str[] = "from enum import Enum\n"
"class Colour(Enum):\n"
" RED = 1\n"
" GREEN = 2\n"
" BLUE = 3\n"
"";
PyObject *global_dict=NULL, *should_be_none=NULL, *output=NULL;
global_dict = PyDict_New();
if (!global_dict) goto cleanup;
should_be_none = PyRun_String(str, Py_file_input, global_dict, global_dict);
if (!should_be_none) goto cleanup;
// extract Colour from global_dict
output = PyDict_GetItemString(global_dict, "Colour");
if (!output) {
// PyDict_GetItemString does not set exceptions
PyErr_SetString(PyExc_KeyError, "could not get 'Colour'");
} else {
Py_INCREF(output); // PyDict_GetItemString returns a borrowed reference
}
cleanup:
Py_XDECREF(global_dict);
Py_XDECREF(should_be_none);
return output;
}
Second, we might want to change what we define in C at runtime. For example, maybe the input parameters pick the enum values. Here, I'm going to use string formatting to insert the appropriate values into our string. There's a number of options here: sprintf, PyBytes_Format, the C++ standard library, using Python strings (perhaps with another call into Python code?). Pick whichever you're most comfortable with.
PyObject* get_enum_fmt(int red, int green, int blue) {
const char str[] = "from enum import Enum\n"
"class Colour(Enum):\n"
" RED = %d\n"
" GREEN = %d\n"
" BLUE = %d\n"
"";
PyObject *formatted_str=NULL, *global_dict=NULL, *should_be_none=NULL, *output=NULL;
formatted_str = PyBytes_FromFormat(str, red, green, blue);
if (!formatted_str) goto cleanup;
global_dict = PyDict_New();
if (!global_dict) goto cleanup;
should_be_none = PyRun_String(PyBytes_AsString(formatted_str), Py_file_input, global_dict, global_dict);
if (!should_be_none) goto cleanup;
// extract Colour from global_dict
output = PyDict_GetItemString(global_dict, "Colour");
if (!output) {
// PyDict_GetItemString does not set exceptions
PyErr_SetString(PyExc_KeyError, "could not get 'Colour'");
} else {
Py_INCREF(output); // PyDict_GetItemString returns a borrowed reference
}
cleanup:
Py_XDECREF(formatted_str);
Py_XDECREF(global_dict);
Py_XDECREF(should_be_none);
return output;
}
Obviously you can do as much or as little as you like with string formatting - I've just picked a simple example to show the point. The main differences from the previous version are the call to PyBytes_FromFormat to set up the string, and the call to PyBytes_AsString that gets the underlying char* out of the prepared bytes object.
Finally, we could prepare the enum attributes in a Python dict built from C and pass it in. This necessitates a bit of a change. Essentially I use @AnttiHaapala's lower-level Python code, but insert namespace.update(contents) after the call to __prepare__.
PyObject* get_enum_dict(const char* key1, int value1, const char* key2, int value2) {
const char str[] = "from enum import Enum\n"
"name = 'Colour'\n"
"bases = (Enum,)\n"
"enum_meta = type(Enum)\n"
"namespace = enum_meta.__prepare__(name, bases)\n"
"namespace.update(contents)\n"
"Colour = enum_meta(name, bases, namespace)\n";
PyObject *global_dict=NULL, *contents_dict=NULL, *value_as_object=NULL, *should_be_none=NULL, *output=NULL;
global_dict = PyDict_New();
if (!global_dict) goto cleanup;
// create and fill the contents dictionary
contents_dict = PyDict_New();
if (!contents_dict) goto cleanup;
value_as_object = PyLong_FromLong(value1);
if (!value_as_object) goto cleanup;
int set_item_result = PyDict_SetItemString(contents_dict, key1, value_as_object);
Py_CLEAR(value_as_object);
if (set_item_result!=0) goto cleanup;
value_as_object = PyLong_FromLong(value2);
if (!value_as_object) goto cleanup;
set_item_result = PyDict_SetItemString(contents_dict, key2, value_as_object);
Py_CLEAR(value_as_object);
if (set_item_result!=0) goto cleanup;
set_item_result = PyDict_SetItemString(global_dict, "contents", contents_dict);
if (set_item_result!=0) goto cleanup;
should_be_none = PyRun_String(str, Py_file_input, global_dict, global_dict);
if (!should_be_none) goto cleanup;
// extract Colour from global_dict
output = PyDict_GetItemString(global_dict, "Colour");
if (!output) {
// PyDict_GetItemString does not set exceptions
PyErr_SetString(PyExc_KeyError, "could not get 'Colour'");
} else {
Py_INCREF(output); // PyDict_GetItemString returns a borrowed reference
}
cleanup:
Py_XDECREF(contents_dict);
Py_XDECREF(global_dict);
Py_XDECREF(should_be_none);
return output;
}
Again, this presents a reasonably flexible way to get values from C into a generated enum.
For the sake of testing I used the following simple Cython wrapper - this is just presented for completeness to help people try these functions.
cdef extern from "cenum.c":
object get_enum()
object get_enum_fmt(int, int, int)
object get_enum_dict(char*, int, char*, int)
def py_get_enum():
return get_enum()
def py_get_enum_fmt(red, green, blue):
return get_enum_fmt(red, green, blue)
def py_get_enum_dict(key1, value1, key2, value2):
return get_enum_dict(key1, value1, key2, value2)
To reiterate: this answer is only partly in the C API, but the approach of calling Python from C is one that I've found productive at times for "run-once" code that would be tricky to write entirely in C.

operator.index with custom class instance

I have a simple class below,
class MyClass(int):
def __index__(self):
return 1
According to operator.index documentation,
operator.index(a)
Return a converted to an integer. Equivalent to a.__index__()
But when I use operator.index with a MyClass instance, I get 100 instead of 1 (I get 1 if I call a.__index__() directly). Why is that?
>>> a = MyClass(100)
>>>
>>> import operator
>>> print(operator.index(a))
100
>>> print(a.__index__())
1
This actually appears to be a deep-rooted issue in cpython. If you look at the source code for operator.py, you can see the definition of index:
def index(a):
"Same as a.__index__()."
return a.__index__()
So...why is it not equivalent? It's literally calling __index__. Well, at the bottom of the source, there's the culprit:
try:
from _operator import *
except ImportError:
pass
else:
from _operator import __doc__
It's overwriting the definitions with a native _operator module. In fact, if you comment this out (either by modifying the actual library or making your own fake operator.py* and importing that), it works. So, we can find the source code for the native _operator library, and look at the related part:
static PyObject *
_operator_index(PyObject *module, PyObject *a)
{
return PyNumber_Index(a);
}
So, it's a wrapper around the PyNumber_Index function. PyNumber_Index is a wrapper around _PyNumber_Index, so we can look at that:
PyObject *
_PyNumber_Index(PyObject *item)
{
PyObject *result = NULL;
if (item == NULL) {
return null_error();
}
if (PyLong_Check(item)) {
Py_INCREF(item);
return item;
}
if (!_PyIndex_Check(item)) {
PyErr_Format(PyExc_TypeError,
"'%.200s' object cannot be interpreted "
"as an integer", Py_TYPE(item)->tp_name);
return NULL;
}
result = Py_TYPE(item)->tp_as_number->nb_index(item);
if (!result || PyLong_CheckExact(result))
return result;
if (!PyLong_Check(result)) {
PyErr_Format(PyExc_TypeError,
"__index__ returned non-int (type %.200s)",
Py_TYPE(result)->tp_name);
Py_DECREF(result);
return NULL;
}
/* Issue #17576: warn if 'result' not of exact type int. */
if (PyErr_WarnFormat(PyExc_DeprecationWarning, 1,
"__index__ returned non-int (type %.200s). "
"The ability to return an instance of a strict subclass of int "
"is deprecated, and may be removed in a future version of Python.",
Py_TYPE(result)->tp_name)) {
Py_DECREF(result);
return NULL;
}
return result;
}
PyObject *
PyNumber_Index(PyObject *item)
{
PyObject *result = _PyNumber_Index(item);
if (result != NULL && !PyLong_CheckExact(result)) {
Py_SETREF(result, _PyLong_Copy((PyLongObject *)result));
}
return result;
}
You can see before it even calls nb_index (the C name for __index__), it calls PyLong_Check on the argument, and if it's true, it just returns the item with no modification. PyLong_Check is a macro that checks for long subtyping (int in python is a PyLong):
#define PyLong_Check(op) \
PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LONG_SUBCLASS)
#define PyLong_CheckExact(op) Py_IS_TYPE(op, &PyLong_Type)
So, basically, the takeaway is that for whatever reason, probably for speed, int subclasses don't get their __index__ method called, and instead just get _PyLong_Copy'd to the resulting return value, but only in the native _operator module, and not in the non-native operator.py. This conflict of implementation as well as inconsistency in documentation leads me to believe that this is an issue, either in the documentation or the implementation, and you may want to raise it as one.
It's likely a documentation and not an implementation issue, as cpython has a habit of sacrificing correctness for speed: (nan,) == (nan,) but nan != nan.
* You may have to name it something like fake_operator.py then import it with import fake_operator as operator
This is because your type is an int subclass. __index__ will not be used because the instance is already an integer. That much is by design, and unlikely to be considered a bug in CPython. PyPy behaves the same.
In _operator.c:
static PyObject *
_operator_index(PyObject *module, PyObject *a)
/*[clinic end generated code: output=d972b0764ac305fc input=6f54d50ea64a579c]*/
{
return PyNumber_Index(a);
}
Note that the Python code in operator.py is not generally used; it is only a fallback for the case where the compiled _operator module is not available. That explains why the result differs from a.__index__().
In abstract.c, cropped after the relevant PyLong_Check part:
/* Return an exact Python int from the object item.
Raise TypeError if the result is not an int
or if the object cannot be interpreted as an index.
*/
PyObject *
PyNumber_Index(PyObject *item)
{
PyObject *result = _PyNumber_Index(item);
if (result != NULL && !PyLong_CheckExact(result)) {
Py_SETREF(result, _PyLong_Copy((PyLongObject *)result));
}
return result;
}
...
/* Return a Python int from the object item.
Can return an instance of int subclass.
Raise TypeError if the result is not an int
or if the object cannot be interpreted as an index.
*/
PyObject *
_PyNumber_Index(PyObject *item)
{
PyObject *result = NULL;
if (item == NULL) {
return null_error();
}
if (PyLong_Check(item)) {
Py_INCREF(item);
return item; /* <---- short-circuited here */
}
...
}
The documentation for operator.index is inaccurate, so this may be considered a minor documentation issue:
>>> import operator
>>> operator.index.__doc__
'Same as a.__index__()'
So, why isn't __index__ considered for integers? The probable answer is found in PEP 357, under the discussion section titled Speed:
Implementation should not slow down Python because integers and long integers used as indexes will complete in the same number of instructions. The only change will be that what used to generate an error will now be acceptable.
We do not want to slow down the most common case for slicing with integers, having to check for an nb_index slot every time.
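The short-circuit described above can be observed from Python with a small experiment. This is a sketch assuming CPython with the compiled _operator module available; the class names are invented for the example. An int subclass has its __index__ override skipped by operator.index, while a class that is not an int subclass has it called.

```python
import operator


class IntSubclass(int):
    def __index__(self):
        return 1


class PlainIndexable:
    # Not an int subclass, so __index__ is the only way to get an integer.
    def __index__(self):
        return 1


from_subclass = operator.index(IntSubclass(100))   # the override is skipped
from_plain = operator.index(PlainIndexable())      # the override is called
direct_call = IntSubclass(100).__index__()         # an explicit call still runs it
```

Here from_subclass compares equal to 100, while from_plain and direct_call are both 1, matching the behaviour reported in the question.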
Update
This answer is incorrect; I misread the documentation. See Aplet123's answer instead. Tl;dr the problem is actually that the C implementation doesn't match the documentation and the Python implementation. The C implementation behaves more like the expression: a if isinstance(a, int) else a.__index__().
To prove it, try defining MyClass.__int__(). The outcome will be the same.
Original answer
See the documentation for object.__index__():
object.__index__(self)
Called to implement operator.index(), and whenever Python needs to losslessly convert the numeric object to an integer object (such as in slicing, or in the built-in bin(), hex() and oct() functions). Presence of this method indicates that the numeric object is an integer type. Must return an integer.
If __int__(), __float__() and __complex__() are not defined then corresponding built-in functions int(), float() and complex() fall back to __index__().
(added bold)
a.__int__() exists, so its return value is used instead.
>>> a.__int__
<method-wrapper '__int__' of MyClass object at 0x7f2c5f0f4ec8>
>>> a.__int__()
100

Replace variable name taking care of shadow variables

I'm parsing GLSL source code and I need to replace a global variable name with a new name. The problem is how to take care of shadowing variables. For example, in the following source code, I would like to replace the lines a = 1 and a = 3 but not a = 2, because of the local declaration.
int a; // To be replaced
void some_function(void)
{
a = 1; // To be replaced
{
int a;
a = 2; // To be kept
}
a = 3; // To be replaced
}
I'm using pyparsing but I have not found a solution (apart from using nestedExpr and re-parsing each block).
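One possible direction, sketched here with the standard re module rather than pyparsing, is to walk the tokens once while keeping a stack of brace scopes and recording where the name is redeclared. Everything in this sketch is an assumption made for illustration: the rename_global function, the small TYPES set, and the rule that a declaration is a type name directly followed by the identifier. It does not handle function parameters, for-loop declarations, or comma-separated declarations such as int b, a;.

```python
import re

# Comments first, then identifiers, braces, and finally any single character.
TOKEN = re.compile(r"//[^\n]*|/\*.*?\*/|[A-Za-z_]\w*|[{}]|.", re.DOTALL)
IDENT = re.compile(r"[A-Za-z_]\w*")
# Assumption: a small, incomplete set of GLSL type names used to spot declarations.
TYPES = {"int", "uint", "float", "bool", "vec2", "vec3", "vec4", "mat3", "mat4"}


def rename_global(source, old, new):
    scopes = []        # one set of shadowing names per open brace
    prev_word = None   # most recent identifier, used to recognise "int a"
    out = []
    for tok in TOKEN.findall(source):
        if tok == "{":
            scopes.append(set())
            prev_word = None
        elif tok == "}":
            if scopes:
                scopes.pop()
            prev_word = None
        elif IDENT.fullmatch(tok):
            word = tok
            if word == old:
                if prev_word in TYPES and scopes:
                    scopes[-1].add(old)      # local declaration shadows the global
                if not any(old in scope for scope in scopes):
                    tok = new                # no enclosing scope shadows it: rename
            prev_word = word
        elif not tok.isspace():
            prev_word = None                 # punctuation or a comment breaks "type name"
        out.append(tok)
    return "".join(out)


shader = """int a; // To be replaced
void some_function(void)
{
    a = 1; // To be replaced
    {
        int a;
        a = 2; // To be kept
    }
    a = 3; // To be replaced
}
"""
renamed = rename_global(shader, "a", "g_a")
```

On the example from the question, this renames the global declaration and the a = 1 and a = 3 lines, and leaves the inner int a; and a = 2 untouched. A real GLSL front end would need a proper grammar for declarations.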

Implementing nb_inplace_add results in returning a read-only buffer object

I'm writing an implementation of the in-place add operation. But, for some reason, I sometimes get a read-only buffer as the result (while I'm adding a custom extension class and an integer...).
The relevant code is:
static PyObject *
ModPoly_InPlaceAdd(PyObject *self, PyObject *other)
{
if (!ModPoly_Check(self)) {
//Since it's in-place addition the control flow should never
// enter here(I suppose)
if (!ModPoly_Check(other)) {
PyErr_SetString(PyExc_TypeError, "Neither argument is a ModPolynomial.");
return NULL;
}
return ModPoly_InPlaceAdd(other, self);
} else {
if (!PyInt_Check(other) && !PyLong_Check(other)) {
Py_INCREF(Py_NotImplemented);
return Py_NotImplemented;
}
}
ModPoly *Tself = (ModPoly *)self;
PyObject *tmp, *tmp2;
tmp = PyNumber_Add(Tself->ob_item[0], other);
tmp2 = PyNumber_Remainder(tmp, Tself->n_modulus);
Py_DECREF(tmp);
tmp = Tself->ob_item[0];
Tself->ob_item[0] = tmp2;
Py_DECREF(tmp);
return (PyObject *)Tself;
}
If, instead of returning (PyObject*)Tself (or simply "self"), I raise an exception, the original object gets updated correctly [checked using some printf]. If I use the Py_RETURN_NONE macro, it correctly turns the ModPoly into None (on the Python side).
What am I doing wrong? I'm returning a pointer to a ModPoly object, how can this become a buffer? And I don't see any operation on those pointers.
example usage:
>>> from algebra import polynomials
>>> pol = polynomials.ModPolynomial(3,17)
>>> pol += 5
>>> pol
<read-only buffer ptr 0xf31420, size 4 at 0xe6faf0>
I've tried changing the return line into:
printf("%d\n", (int)ModPoly_Check(self));
return self;
and it prints 1 when adding in-place (meaning that the value returned is of type ModPolynomial...)
According to the documentation, the inplace add operation for an object returns a new reference.
By returning self directly without calling Py_INCREF on it, you hand back a reference you do not own, so your object will be freed while it is still referenced. If some other object is then allocated in the same piece of memory, those references will now give you the new object. Calling Py_INCREF(self) before the return fixes this.
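The reference-count bug itself cannot be reproduced in pure Python, but the underlying rule can: the name on the left of += is rebound to whatever the in-place method returns. The sketch below uses a hypothetical pure-Python stand-in for the extension type (the ModPoly and ForgetfulModPoly classes and their attributes are invented for illustration) to show why returning None turned the variable into None in the question.

```python
class ModPoly:
    def __init__(self, constant, modulus):
        self.modulus = modulus
        self.constant = constant % modulus

    def __iadd__(self, other):
        if not isinstance(other, int):
            return NotImplemented
        self.constant = (self.constant + other) % self.modulus
        return self   # the name on the left of += is rebound to this value


class ForgetfulModPoly(ModPoly):
    def __iadd__(self, other):
        super().__iadd__(other)
        # No return statement: the method returns None.


pol = ModPoly(3, 17)
alias = pol
pol += 5              # pol is still the same object, updated in place

broken = ForgetfulModPoly(3, 17)
original = broken
broken += 5           # the object is updated, but the name is rebound to None
```

In Python the interpreter manages the reference returned by __iadd__; in the C slot that bookkeeping is the extension author's job, which is why the returned pointer has to be a new reference.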
