How does python extend work?

If I have a list:
a = [1,2,3,4]
and then add 5 elements using extend
a.extend(range(5,10))
I get
a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
How does Python do this? Does it create a new list and copy the elements across, or does it make 'a' bigger? I'm just concerned that using extend will gobble up memory. I'm also asking because there is a comment in some code I'm revising which says that extending by 10000 x 100 is quicker than doing it in one block of 1000000.

Python's documentation on it says:
Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L.
As to "how" it does it behind the scene, you really needn't concern yourself about it.
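A minimal sketch of the equivalence quoted from the documentation. The variable names and the id bookkeeping are only illustrative:

```python
# Two lists with the same starting contents.
a = [1, 2, 3, 4]
b = [1, 2, 3, 4]
a_id, b_id = id(a), id(b)

a.extend(range(5, 10))       # in-place extend
b[len(b):] = range(5, 10)    # slice assignment at the end of the list

print(a == b)                          # both now hold 1..9
print(id(a) == a_id, id(b) == b_id)    # neither name was rebound to a new list
```

Both forms accept any iterable and both modify the existing list object.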

L.extend(M) is amortized O(n) where n = len(M), so excessive copying is not usually a problem. It can become a problem when there is not enough spare capacity to extend into, so the existing elements have to be copied to a larger allocation. That matters when the list is large and you have a limit on how long an individual extend call may take.
That is the point at which you should look for a more efficient data structure for your problem. I find it is rarely a problem in practice.
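If you want to check the claim from the code comment on your own machine, a rough timing sketch could look like the following. The helper names are made up for this example, it reads "10000 x 100" as 100 chunks of 10000 elements, and the timings depend on the interpreter version and platform, so treat the numbers as a measurement rather than a rule:

```python
import timeit

def extend_in_chunks(chunks, chunk_size):
    """Build a list with one extend call per chunk."""
    out = []
    for start in range(0, chunks * chunk_size, chunk_size):
        out.extend(range(start, start + chunk_size))
    return out

def extend_once(total):
    """Build the same list with a single extend call."""
    out = []
    out.extend(range(total))
    return out

if __name__ == "__main__":
    # Same final contents either way; only the number of extend calls differs.
    chunked = timeit.timeit(lambda: extend_in_chunks(100, 10_000), number=5)
    single = timeit.timeit(lambda: extend_once(1_000_000), number=5)
    print(f"100 x 10000: {chunked:.3f}s   1 x 1000000: {single:.3f}s")
```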
Here is the relevant code from CPython. You can see that extra space is allocated when the list is extended, to avoid excessive copying:
static PyObject *
listextend(PyListObject *self, PyObject *b)
{
    PyObject *it;      /* iter(v) */
    Py_ssize_t m;      /* size of self */
    Py_ssize_t n;      /* guess for size of b */
    Py_ssize_t mn;     /* m + n */
    Py_ssize_t i;
    PyObject *(*iternext)(PyObject *);
    /* Special cases:
       1) lists and tuples which can use PySequence_Fast ops
       2) extending self to self requires making a copy first
    */
    if (PyList_CheckExact(b) || PyTuple_CheckExact(b) || (PyObject *)self == b) {
        PyObject **src, **dest;
        b = PySequence_Fast(b, "argument must be iterable");
        if (!b)
            return NULL;
        n = PySequence_Fast_GET_SIZE(b);
        if (n == 0) {
            /* short circuit when b is empty */
            Py_DECREF(b);
            Py_RETURN_NONE;
        }
        m = Py_SIZE(self);
        if (list_resize(self, m + n) == -1) {
            Py_DECREF(b);
            return NULL;
        }
        /* note that we may still have self == b here for the
         * situation a.extend(a), but the following code works
         * in that case too.  Just make sure to resize self
         * before calling PySequence_Fast_ITEMS.
         */
        /* populate the end of self with b's items */
        src = PySequence_Fast_ITEMS(b);
        dest = self->ob_item + m;
        for (i = 0; i < n; i++) {
            PyObject *o = src[i];
            Py_INCREF(o);
            dest[i] = o;
        }
        Py_DECREF(b);
        Py_RETURN_NONE;
    }
    it = PyObject_GetIter(b);
    if (it == NULL)
        return NULL;
    iternext = *it->ob_type->tp_iternext;
    /* Guess a result list size. */
    n = _PyObject_LengthHint(b, 8);
    if (n == -1) {
        Py_DECREF(it);
        return NULL;
    }
    m = Py_SIZE(self);
    mn = m + n;
    if (mn >= m) {
        /* Make room. */
        if (list_resize(self, mn) == -1)
            goto error;
        /* Make the list sane again. */
        Py_SIZE(self) = m;
    }
    /* Else m + n overflowed; on the chance that n lied, and there really
     * is enough room, ignore it.  If n was telling the truth, we'll
     * eventually run out of memory during the loop.
     */
    /* Run iterator to exhaustion. */
    for (;;) {
        PyObject *item = iternext(it);
        if (item == NULL) {
            if (PyErr_Occurred()) {
                if (PyErr_ExceptionMatches(PyExc_StopIteration))
                    PyErr_Clear();
                else
                    goto error;
            }
            break;
        }
        if (Py_SIZE(self) < self->allocated) {
            /* steals ref */
            PyList_SET_ITEM(self, Py_SIZE(self), item);
            ++Py_SIZE(self);
        }
        else {
            int status = app1(self, item);
            Py_DECREF(item);  /* append creates a new ref */
            if (status < 0)
                goto error;
        }
    }
    /* Cut back result list if initial guess was too large. */
    if (Py_SIZE(self) < self->allocated)
        list_resize(self, Py_SIZE(self));  /* shrinking can't fail */
    Py_DECREF(it);
    Py_RETURN_NONE;
  error:
    Py_DECREF(it);
    return NULL;
}
PyObject *
_PyList_Extend(PyListObject *self, PyObject *b)
{
    return listextend(self, b);
}

It works as if it were defined like this:
def extend(lst, iterable):
    for x in iterable:
        lst.append(x)
This mutates the list; it does not create a copy of it.
Depending on the underlying implementation, append and extend may trigger the list to copy its own data structures, but this is normal and nothing to worry about. For example, array-based implementations typically grow the underlying array exponentially and need to copy the list of elements when they do so.
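One way to observe this over-allocation is to watch sys.getsizeof while appending. This is only a sketch: the exact sizes and the number of reallocations are CPython implementation details and vary between versions, so the code counts size changes instead of assuming particular values:

```python
import sys

lst = []
last_size = sys.getsizeof(lst)
resizes = 0

for i in range(10_000):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last_size:      # the underlying array was reallocated
        resizes += 1
        last_size = size

# Far fewer reallocations than appends: most appends reuse spare capacity.
print(len(lst), resizes)
```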

How does python do this? does it create a new list and copy the elements across or does it make 'a' bigger?
>>> a = ['apples', 'bananas']
>>> b = a
>>> a is b
True
>>> c = ['apples', 'bananas']
>>> a is c
False
>>> a.extend(b)
>>> a
['apples', 'bananas', 'apples', 'bananas']
>>> b
['apples', 'bananas', 'apples', 'bananas']
>>> a is b
True
>>>

It does not create a new list object; it extends a. This is self-evident from the fact that you don't make an assignment. Python will not magically replace your objects with other objects. :-)
How the memory allocation happens inside the list object is implementation dependent.
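For contrast, here is a small sketch of the operations that do and do not create a new list object. The name alias is just a second reference used to watch the original object:

```python
a = ['apples', 'bananas']
alias = a                       # second name for the same list object

a.extend(['cherries'])          # mutates the existing list
print(alias is a)               # True

a += ['dates']                  # list += also extends in place
print(alias is a)               # True

a = a + ['elderberries']        # + builds a new list and rebinds the name a
print(alias is a)               # False
print(alias)                    # ['apples', 'bananas', 'cherries', 'dates']
```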

Related

How is python's float.__eq__ implemented in the language?

I know that the best way to compare two floats for equality is usually to use math.isclose(float_a, float_b). But I was curious to know how python does it if you simply do float_a == float_b.
I suppose it's implemented in C, but what is the logic behind it ?
Here is the source code for float object comparisons
Essentially. It looks super complex, but that complexity is mostly in handling the case where a float is compared to an int (int objects in Python are arbitrarily sized; they aren't C ints wrapped in a Python object).
But for the simple case of float and float:
static PyObject*
float_richcompare(PyObject *v, PyObject *w, int op)
{
    double i, j;
    int r = 0;
    assert(PyFloat_Check(v));
    i = PyFloat_AS_DOUBLE(v);
    /* Switch on the type of w.  Set i and j to doubles to be compared,
     * and op to the richcomp to use.
     */
    if (PyFloat_Check(w))
        j = PyFloat_AS_DOUBLE(w);
So it just creates two C doubles from the float objects, then (skipping all the int handling stuff):
 Compare:
    switch (op) {
    case Py_EQ:
        r = i == j;
        break;
    case Py_NE:
        r = i != j;
        break;
    case Py_LE:
        r = i <= j;
        break;
    case Py_GE:
        r = i >= j;
        break;
    case Py_LT:
        r = i < j;
        break;
    case Py_GT:
        r = i > j;
        break;
    }
    return PyBool_FromLong(r);
It just does a C-level == comparison, ultimately. So it does not do math.isclose(float_a, float_b) under the hood.
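A short sketch of what that exact comparison means in practice, including the float-versus-int case that the extra complexity in the C source deals with:

```python
import math

x = 0.1 + 0.2
print(x == 0.3)                 # False: == compares the two doubles exactly
print(math.isclose(x, 0.3))     # True: isclose applies a relative tolerance

big = 2**53                     # largest power of two where every int below it is exact as a double
print(float(big) == big)        # True
print(float(big) == big + 1)    # False: the int is not rounded to a double first
```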

Python C binding - get array from python to C++

As the title says: I would like to make a python binding in C++ that does some algebraic operations on some array. For this, I have to parse the python "array object" into C++ as a vector of double or integer or whatever the case may be.
I tried to do this but I face some issues. I've created a new python type and a class with the name Typer where I have this method that tries to get the elements of a python array, then compute the sum (as a first step).
static PyObject *Typer_vectorsum(Typer *self, PyObject *args)
{
    PyObject *retval;
    PyObject *list;
    if (!PyArg_ParseTuple(args, "O", &list))
        return NULL;
    double *arr;
    arr = (double *)malloc(sizeof(double) * PyTuple_Size(list));
    int length;
    length = PyTuple_Size(list);
    PyObject *item = NULL;
    for (int i = 0; i < length; ++i)
    {
        item = PyTuple_GetItem(list, i);
        if (!PyFloat_Check(item))
        {
            exit(1);
        }
        arr[i] = PyFloat_AsDouble(item);
    }
    double result = 0.0;
    for (int i = 0; i < length; ++i)
    {
        result += arr[i];
    }
    retval = PyFloat_FromDouble(result);
    free(arr);
    return retval;
}
In this method I parse the python array object into a C array (allocating the memory of the array with malloc). Then I add every element from the object to my C array and just compute the sum in the last for-loop.
If I build the project and then create a python test file, nothing happens (the file compiles without any issues but it is not printing anything).
y = example.Typer()  # Typer is the init
tuple = (1, 2, 3)
print(y.vectorsum(tuple))
Am I missing something? And also, Is there a nice and easy way of getting a python array object into C++, but as a std::vector instead of a classic C array?
Thank you in advance!
The tuple contains ints, not floats, so your PyFloat_Check fails. And no, there is no direct way from Python tuple to C array or C++ std::vector. The reason being that the tuple is an array of Python objects, not an array of C values such as doubles.
Here's your example with improved error checking, after which it should work:
PyObject *retval;
PyObject *list;
if (!PyArg_ParseTuple(args, "O!", &PyTuple_Type, &list))
    return NULL;
Py_ssize_t length = PyTuple_GET_SIZE(list);
double *arr = (double *)malloc(sizeof(double) * length);
if (arr == NULL && length > 0)
    return PyErr_NoMemory();
PyObject *item = NULL;
for (Py_ssize_t i = 0; i < length; ++i)
{
    item = PyTuple_GET_ITEM(list, i);
    /* PyFloat_AsDouble also accepts ints; on failure it returns -1.0 and sets an exception */
    arr[i] = PyFloat_AsDouble(item);
    if (arr[i] == -1. && PyErr_Occurred())
    {
        /* propagate the Python exception instead of killing the process with exit(1) */
        free(arr);
        return NULL;
    }
}
double result = 0.0;
for (Py_ssize_t i = 0; i < length; ++i)
{
    result += arr[i];
}
retval = PyFloat_FromDouble(result);
free(arr);
return retval;
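The same distinction can be seen from the Python side. The sketch below shows why a strict float check rejects the tuple (1, 2, 3), and uses the standard array module as one possible container that really does store C doubles; whether that fits your binding is an assumption, since the original code only deals with tuples:

```python
from array import array

values = (1, 2, 3)

# A strict type check, like PyFloat_Check in C, rejects every element here.
print([isinstance(v, float) for v in values])

# Converting, like PyFloat_AsDouble in C, accepts ints as well as floats.
print([float(v) for v in values])

# array('d') holds raw C doubles in one contiguous buffer instead of
# pointers to Python objects.
doubles = array('d', values)
print(doubles, doubles.itemsize, sum(doubles))
```

A contiguous buffer like this can be handed to C code without converting element by element, which is not possible with a tuple of Python objects.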

What is the time complexity of iterating through a deque in Python?

What is the time complexity of iterating, or more precisely each iteration through a deque from the collections library in Python?
An example is this:
elements = deque([1,2,3,4])
for element in elements:
    print(element)
Is each iteration a constant O(1) operation? Or does it do a linear O(n) operation to get to the element in each iteration?
There are many resources online for time complexity with all of the other deque methods like appendleft, append, popleft, pop. There doesn't seem to be any time complexity information about the iteration of a deque.
Thanks!
If your construction is something like:
elements = deque([1,2,3,4])
for i in range(len(elements)):
    print(elements[i])
You are not iterating over the deque; you are iterating over the range object and then indexing into the deque. This makes the loop quadratic, O(n^2), in the worst case, since each indexing operation elements[i] is O(n) for positions near the middle (indexed access is O(1) only at either end). However, actually iterating over the deque is linear time:
for x in elements:
    print(x)
Here's a quick, empirical test:
import timeit
import pandas as pd
from collections import deque
def build_deque(n):
    return deque(range(n))
def iter_index(d):
    for i in range(len(d)):
        d[i]
def iter_it(d):
    for x in d:
        x
r = range(100, 10001, 100)
index_runs = [timeit.timeit('iter_index(d)', 'from __main__ import build_deque, iter_index, iter_it; d = build_deque({})'.format(n), number=1000) for n in r]
it_runs = [timeit.timeit('iter_it(d)', 'from __main__ import build_deque, iter_index, iter_it; d = build_deque({})'.format(n), number=1000) for n in r]
df = pd.DataFrame({'index':index_runs, 'iter':it_runs}, index=r)
df.plot()
And the resulting plot:
Now, we can actually see how the iterator protocol is implemented for deque objects in CPython source code:
First, the deque object itself:
typedef struct BLOCK {
    struct BLOCK *leftlink;
    PyObject *data[BLOCKLEN];
    struct BLOCK *rightlink;
} block;
typedef struct {
    PyObject_VAR_HEAD
    block *leftblock;
    block *rightblock;
    Py_ssize_t leftindex;       /* 0 <= leftindex < BLOCKLEN */
    Py_ssize_t rightindex;      /* 0 <= rightindex < BLOCKLEN */
    size_t state;               /* incremented whenever the indices move */
    Py_ssize_t maxlen;
    PyObject *weakreflist;
} dequeobject;
So, as stated in the comments, a deque is a doubly-linked list of "block" nodes, where a block is essentially an array of python object pointers. Now for the iterator protocol:
typedef struct {
    PyObject_HEAD
    block *b;
    Py_ssize_t index;
    dequeobject *deque;
    size_t state;          /* state when the iterator is created */
    Py_ssize_t counter;    /* number of items remaining for iteration */
} dequeiterobject;
static PyTypeObject dequeiter_type;
static PyObject *
deque_iter(dequeobject *deque)
{
    dequeiterobject *it;
    it = PyObject_GC_New(dequeiterobject, &dequeiter_type);
    if (it == NULL)
        return NULL;
    it->b = deque->leftblock;
    it->index = deque->leftindex;
    Py_INCREF(deque);
    it->deque = deque;
    it->state = deque->state;
    it->counter = Py_SIZE(deque);
    PyObject_GC_Track(it);
    return (PyObject *)it;
}
// ...
static PyObject *
dequeiter_next(dequeiterobject *it)
{
    PyObject *item;
    if (it->deque->state != it->state) {
        it->counter = 0;
        PyErr_SetString(PyExc_RuntimeError,
                        "deque mutated during iteration");
        return NULL;
    }
    if (it->counter == 0)
        return NULL;
    assert (!(it->b == it->deque->rightblock &&
              it->index > it->deque->rightindex));
    item = it->b->data[it->index];
    it->index++;
    it->counter--;
    if (it->index == BLOCKLEN && it->counter > 0) {
        CHECK_NOT_END(it->b->rightlink);
        it->b = it->b->rightlink;
        it->index = 0;
    }
    Py_INCREF(item);
    return item;
}
As you can see, the iterator essentially keeps track of a block index, a pointer to a block, and a counter of the items remaining in the deque. It stops iterating when the counter reaches zero; otherwise, it grabs the element at the current index, increments the index, decrements the counter, and takes care of checking whether to move to the next block. In other words, a deque could be represented as a list of lists in Python, e.g. d = [[1,2,3],[4,5,6]], and it iterates like this:
for block in d:
    for x in block:
        ...
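A slightly closer Python model of dequeiter_next, as a sketch only. The function name, the tiny BLOCKLEN, and the use of a list of lists in place of linked blocks are all assumptions made for illustration; the real block size in CPython is different. Each step does a constant amount of work, which is why a full pass is linear:

```python
BLOCKLEN = 3  # illustrative value only, not CPython's block size

def iter_blocks(blocks, left_index, count):
    """Yield `count` items, starting at blocks[0][left_index]."""
    block_no = 0           # stands in for the block pointer it->b
    index = left_index     # it->index
    counter = count        # it->counter: items remaining
    while counter > 0:
        item = blocks[block_no][index]
        index += 1
        counter -= 1
        if index == BLOCKLEN and counter > 0:
            block_no += 1  # follow the rightlink to the next block
            index = 0
        yield item

# The first and last blocks are only partially filled, as in a real deque.
blocks = [[None, 1, 2], [3, 4, 5], [6, None, None]]
print(list(iter_blocks(blocks, left_index=1, count=6)))  # [1, 2, 3, 4, 5, 6]
```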

Python Generated String in C

I need to generate the following string in C:
$(python -c "print('\x90' * a + 'blablabla' + '\x90' * b + 'h\xef\xff\xbf')")
where a and b are arbitrary integers and blablabla represents an arbitrary string. I am attempting to do this by first creating
char str1[size];
and then doing:
for (int i = 0; i < a; i+=1) {
    strcat(str1, "\x90");
}
Next I use strcat again:
strcat(str1, "blablabla");
and I run the loop again, this time b times, to concatenate the next b x90 characters. Finally, I use strcat once more as follows:
strcat(str1, "h\xef\xff\xbf");
However, these two strings do not match. Is there a more efficient way of replicating the behaviour of python's * in C? Or am I missing something?
char str1[size];
Even assuming you calculated size correctly, I recommend using
char * str = malloc(size);
Either way, once you have the memory for the string, you have to initialize it first with
str[0]=0;
if you intend to use strcat.
for (int i = 0; i < a; i+=1) {
    strcat(str1, "\x90");
}
This is reasonable if "\x90" actually is a string (i.e. something composed of more than one character), that string is short (it is hard to give a firm limit, but about 16 bytes would be the most), and a is rather small[1]. Here, as John Coleman already suggested, memset is a better way to do it.
memset(str, '\x90', a);
Because you know the location where "blablabla" should be stored, just store it there using strcpy instead of strcat:
// strcat(str1, "blablabla");
strcpy(str + a, "blablabla");
However, you need the address of the character after "blablabla" (one way or the other). So I would not even do it that way but instead like this:
const char * add_str = "blablabla";
size_t sl = strlen(add_str);
memcpy(str + a, add_str, sl);
Then, instead of your second loop, use another memset:
memset(str + a + sl, '\x90', b);
Last but not least, instead of strcat again strcpy is better (here, memcpy doesn't help):
strcpy(str + a + sl + b, "h\xef\xff\xbf");
But you need its size for the size calculation at the beginning, so it is better to handle it like the blablabla string anyway (and remember the trailing '\0').
Finally, I would put all this code into a function like this:
char * gen_string(int a, int b) {
    const char * add_str_1 = "blablabla";
    size_t sl_1 = strlen(add_str_1);
    const char * add_str_2 = "h\xef\xff\xbf";
    size_t sl_2 = strlen(add_str_2);
    size_t size = a + sl_1 + b + sl_2 + 1;
    // The + 1 is important for the '\0' at the end
    char * str = malloc(size);
    if (!str) {
        return NULL;
    }
    memset(str, '\x90', a);
    memcpy(str + a, add_str_1, sl_1);
    memset(str + a + sl_1, '\x90', b);
    memcpy(str + a + sl_1 + b, add_str_2, sl_2);
    str[a + sl_1 + b + sl_2] = 0; // 0 is the same as '\0'
    return str;
}
Remember to free() the retval of gen_string at some point.
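To check the C output against a reference, the same layout can be sketched in Python with bytes. The function name gen_bytes and its default arguments are made up for this example; it mirrors the memset/memcpy steps of gen_string but has no trailing '\0', because Python bytes objects carry their own length:

```python
def gen_bytes(a, b, middle=b"blablabla", tail=b"h\xef\xff\xbf"):
    """Fill a preallocated buffer segment by segment, like gen_string does."""
    size = a + len(middle) + b + len(tail)
    buf = bytearray(size)                     # zero-filled, like calloc
    pos = 0
    buf[pos:pos + a] = b"\x90" * a            # memset(ptr, '\x90', a)
    pos += a
    buf[pos:pos + len(middle)] = middle       # memcpy(ptr, add_str_1, sl_1)
    pos += len(middle)
    buf[pos:pos + b] = b"\x90" * b            # memset(ptr, '\x90', b)
    pos += b
    buf[pos:pos + len(tail)] = tail           # memcpy(ptr, add_str_2, sl_2)
    return bytes(buf)

# Same result as the direct expression with the * operator.
print(gen_bytes(3, 2) == b"\x90" * 3 + b"blablabla" + b"\x90" * 2 + b"h\xef\xff\xbf")
```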
If the list of memset and memcpy calls gets longer, then I'd suggest doing it like this:
char * ptr = str;
memset(ptr, '\x90', a ); ptr += a;
memcpy(ptr, add_str_1, sl_1); ptr += sl_1;
memset(ptr, '\x90', b ); ptr += b;
memcpy(ptr, add_str_2, sl_2); ptr += sl_2;
*ptr = 0; // 0 is the same as '\0'
maybe even creating a macro for memset and memcpy:
#define MEMSET(c, l) do { memset(ptr, c, l); ptr += l; } while (0)
#define MEMCPY(s, l) do { memcpy(ptr, s, l); ptr += l; } while (0)
char * ptr = str;
MEMSET('\x90', a );
MEMCPY(add_str_1, sl_1);
MEMSET('\x90', b );
MEMCPY(add_str_2, sl_2);
*ptr = 0; // 0 is the same as '\0'
#undef MEMSET
#undef MEMCPY
For the justification of why I recommend doing it this way, I suggest you read the blog post Back to Basics (by one of the founders of Stack Overflow), which happens to be not only John Coleman's favorite blog post but mine as well. There you will learn that using strcat in a loop the way you first tried has quadratic run time, and hence why you should not use it that way.
[1] If a is big and/or the string that needs to be repeated is long, a better solution would be something like this:
const char * str_a = "\x90";
size_t sl_a = strlen(str_a);
char * ptr = str;
for (size_t i = 0; i < a; ++i) {
    strcpy(ptr, str_a);
    ptr += sl_a;
}
// then go on at address str + a * sl_a
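The quadratic cost of strcat in a loop, mentioned above, can be made concrete with a small counting model. This is only a sketch: strcat_scan_cost is a hypothetical helper that counts how many bytes strcat has to walk over to find the terminating '\0' before each append, which is exactly the work the pointer-based versions avoid:

```python
def strcat_scan_cost(n_appends, piece_len=1):
    """Total bytes scanned to find the end of the string across all appends."""
    scanned = 0
    length = 0
    for _ in range(n_appends):
        scanned += length      # strcat walks the whole existing string first
        length += piece_len    # then the appended piece makes it longer
    return scanned

n = 1000
print(strcat_scan_cost(n))     # 499500 bytes scanned, i.e. n * (n - 1) / 2
print(n)                       # versus only 1000 bytes actually written
```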
For individual 1-byte chars you can use memset to partially replicate the behavior of Python's *:
#include<stdio.h>
#include<string.h>
int main(void){
    char buffer[100];
    memset(buffer,'#',10);
    buffer[10] = '\0';
    printf("%s\n",buffer);
    memset(buffer, '*', 5);
    buffer[5] = '\0';
    printf("%s\n",buffer);
    return 0;
}
Output:
##########
*****
For a more robust solution, see this.

Parsing unsigned ints (uint32_t) in python's C api

If I write a function accepting a single unsigned integer (0 - 0xFFFFFFFF), I can use:
uint32_t myInt;
if(!PyArg_ParseTuple(args, "I", &myInt))
return NULL;
And then from python, I can pass an int or long.
But what if I get passed a list of integers?
uint32_t* myInts;
PyObject* pyMyInts;
PyArg_ParseTuple(args, "O", &pyMyInts);
if (PyList_Check(pyMyInts)) {
    size_t n = PyList_Size(pyMyInts);
    myInts = calloc(n, sizeof(*myInts));
    for(size_t i = 0; i < n; i++) {
        PyObject* item = PyList_GetItem(pyMyInts, i);
        // What function do I want here?
        if(!GetAUInt(item, &myInts[i]))
            return NULL;
    }
}
// cleanup calloc'd array on exit, etc
Specifically, my issue is with dealing with:
Lists containing a mixture of ints and longs
Detecting overflow when assigning to the uint32
You could create a tuple and use the same method you used for a single argument. On the C side, the tuple objects are not really immutable, so it wouldn't be too much trouble.
Also PyLong_AsUnsignedLong could work for you. It accepts int and long objects and raises an error otherwise. But if sizeof(long) is bigger than 4, you might need to check for an upper-bound overflow yourself.
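static int
GetAUInt(PyObject *pylong, uint32_t *myint) {
    static const unsigned long MAX = 0xffffffff;
    unsigned long l = PyLong_AsUnsignedLong(pylong);
    if (l == (unsigned long)-1 && PyErr_Occurred()) {
        /* keep the exception that PyLong_AsUnsignedLong already set */
        return 0;
    }
    if (l > MAX) {
        /* only reachable when unsigned long is wider than 32 bits */
        PyErr_SetString(PyExc_OverflowError, "can't convert to uint32_t");
        return 0;
    }
    *myint = (uint32_t) l;
    return 1;
}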
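For comparison, the same validation can be sketched in pure Python, for example to pre-check a list before it is handed to the extension. The name to_uint32_list is made up for this example and is not part of any API:

```python
import operator

UINT32_MAX = 0xFFFFFFFF

def to_uint32_list(values):
    """Return the values as ints, checking that each fits in a uint32_t."""
    out = []
    for v in values:
        n = operator.index(v)            # raises TypeError for floats, strings, ...
        if not 0 <= n <= UINT32_MAX:     # explicit lower and upper bound check
            raise OverflowError(f"{n} does not fit in uint32_t")
        out.append(n)
    return out

print(to_uint32_list([0, 1, UINT32_MAX]))
```

Negative numbers and values above 0xFFFFFFFF raise OverflowError, while non-integer items raise TypeError, which matches the two failure cases the C helper distinguishes.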
