I am passing a single element of a list to a function. I want to modify that element, and therefore, the list itself.
def ModList(element):
    element = 'TWO'
l = list();
l.append('one')
l.append('two')
l.append('three')
print l
ModList(l[1])
print l
But this method does not modify the list. It's like the element is passed by value. The output is:
['one', 'two', 'three']
['one', 'two', 'three']
I want the second element of the list to be 'TWO' after the function call:
['one', 'TWO', 'three']
Is this possible?
The explanations already here are correct. However, since I have wanted to abuse python in a similar fashion, I will submit this method as a workaround.
Indexing a list returns a reference to the object stored at that position; rebinding the name you assigned it to does not touch the list. Slicing a list returns a new list that holds references to the same objects, so assigning into the slice does not touch the original either. Consider this example:
>>> a = [1, 2, 3, 4]
>>> b = a[2]
>>> b
3
>>> c = a[2:3]
>>> c
[3]
>>> b=5
>>> c[0]=6
>>> a
[1, 2, 3, 4]
Neither rebinding b nor assigning into c, a new list sliced from a, is able to change values in a. There is no link back to a, despite their common origin.
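Strictly speaking, indexing hands back a reference to the stored object rather than a copy; what breaks the link is rebinding the name. The following is a minimal sketch of that distinction, assuming ordinary CPython list semantics. The names inner and outer are hypothetical and do not come from the question.

```python
# Indexing hands back the same object that the list holds, not a copy.
inner = [1, 2]                  # a mutable element (hypothetical example data)
outer = [inner, 'two']

b = outer[0]
same_object = b is outer[0]     # b and outer[0] are one and the same object

b.append(3)                     # mutating the shared object is visible via the list
after_mutation = list(outer[0])

b = 'something else'            # rebinding the name b leaves the list untouched
after_rebinding = outer[0]

# A slice is a new list, but it holds references to the same elements.
c = outer[0:1]
slice_is_new_list = c is not outer
slice_shares_elements = c[0] is outer[0]

print(same_object, after_mutation, after_rebinding)
print(slice_is_new_list, slice_shares_elements)
```

With immutable elements such as integers or strings there is nothing to mutate, which is why the association looks like a copy in practice.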
However, numpy arrays use a "raw-er" memory allocation and allow views of data to be returned. A view allows data to be represented in a different way while maintaining the association with the original data. A working example is therefore
>>> import numpy as np
>>> a = np.array([1, 2, 3, 4])
>>> a
array([1, 2, 3, 4])
>>> b = a[2]
>>> b
3
>>> b=5
>>> a
array([1, 2, 3, 4])
>>> c = a[2:3]
>>> c
array([3])
>>> c[0]=6
>>> a
array([1, 2, 6, 4])
>>>
While extracting a single element still gives a standalone value, the slice c is a view onto element 2 of a (which is now element 0 of c), so the change made through c changes a as well.
Numpy ndarrays have many different types, including a generic object type. This means that you can maintain this "by-reference" behavior for almost any type of data, not only numerical values.
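Applied to the original question, the idea might look like the sketch below. It assumes numpy is installed; the helper name mod_view is hypothetical. The function receives a one-element slice (a view) instead of the element itself.

```python
import numpy as np

# An object-dtype array can hold arbitrary Python objects, such as strings.
words = np.array(['one', 'two', 'three'], dtype=object)

def mod_view(view):
    """Hypothetical helper: overwrite the single slot exposed by the view."""
    view[0] = 'TWO'

# Pass a one-element slice (a view), not the element itself.
mod_view(words[1:2])

print(words.tolist())
```

Passing words[1] instead of words[1:2] would hand the function a plain string, and the array would stay unchanged.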
Python doesn't do pass by reference. Just do it explicitly:
l[1] = ModList(l[1])
Also, since this only changes one element, I'd suggest that ModList is a confusing name.
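As a runnable sketch of this return-and-assign pattern: the function name shout is a hypothetical replacement for ModList, and upper-casing is only an assumption about what the modification should be.

```python
def shout(element):
    """Hypothetical replacement for ModList: return a new value, mutate nothing."""
    return element.upper()

l = ['one', 'two', 'three']
l[1] = shout(l[1])   # the caller decides where the result is stored
print(l)
```

The function stays independent of lists entirely, so it works just as well on a dictionary value or a plain variable.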
Python passes object references by value, so assigning to the parameter inside ModList only rebinds a local name and cannot change the caller's list. What you could do instead is pass the list and the index into ModList and modify the element that way:
def ModList(theList, theIndex):
    theList[theIndex] = 'TWO'
ModList(l, 1)
In many cases you can also let the function both modify the list and return it. This can make the calling code more readable:
def ModList(theList, theIndex):
    theList[theIndex] = 'TWO'
    return theList
l = ModList(l, 1)
When concatenating two lists,
a = [0......, 10000000]
b = [0......, 10000000]
a = a + b
does the Python runtime allocate a bigger array and loop through both arrays and put the elements of a and b into the bigger array?
Or does it loop through the elements of b and append them to a and resize as necessary?
I am interested in the CPython implementation.
In CPython, two lists are concatenated in function list_concat.
You can see in the linked source code that that function allocates the space needed to fit both lists.
size = Py_SIZE(a) + Py_SIZE(b);
np = (PyListObject *) list_new_prealloc(size);
Then it copies the items from both lists to the new list.
for (i = 0; i < Py_SIZE(a); i++) {
    ...
}
...
for (i = 0; i < Py_SIZE(b); i++) {
    ...
}
You can find out by looking at the id of a before and after concatenating b:
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> id(a)
140025874463112
>>> a = a + b
>>> id(a)
140025874467144
Here, since the id is different, we see that the interpreter has created a new list and bound it to the name a. The old list is freed once nothing else refers to it (in CPython, as soon as its reference count drops to zero).
However, the behaviour can be different when using the augmented assignment operator +=:
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> id(a)
140025844068296
>>> a += b
>>> id(a)
140025844068296
Here, since the id is the same, we see that the interpreter has reused the same list object a and appended the values of b to it.
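The practical consequence shows up when a second name refers to the same list. This is a small sketch using the is operator instead of comparing ids by eye; the name alias is hypothetical.

```python
# a = a + [...] builds a new list and rebinds the name a.
a = [1, 2, 3]
alias = a                         # second name for the same list
a = a + [4]
plus_rebinds = a is not alias
alias_after_plus = list(alias)    # alias still sees the old contents

# a += [...] extends the existing list in place.
a = [1, 2, 3]
alias = a
a += [4]
iadd_keeps_object = a is alias
alias_after_iadd = list(alias)    # alias sees the appended value

print(plus_rebinds, alias_after_plus)
print(iadd_keeps_object, alias_after_iadd)
```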
For more detailed information, see these questions:
Why does += behave unexpectedly on lists?
Does list concatenation with the `+` operator always return a new `list` instance?
You can see the implementation in listobject.c::list_concat. Python will get the size of a and b and create a new list object of that size. It will then loop through the values of a and b, which are C pointers to python objects, increment their ref counts and add those pointers to the new list.
It will create a new list with a shallow copy of the items in the first list, followed by a shallow copy of the items in the second list. The + operator calls the object.__add__(self, other) method. For example, for the expression x + y, where x is an instance of a class that has an __add__() method, x.__add__(y) is called. You can read more in the documentation.
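To make "shallow copy" concrete, here is a minimal sketch with hypothetical data showing that the concatenated list holds the same element objects as its inputs, while the list structure itself is independent.

```python
first = [[1, 2], 'x']
second = [[3]]
combined = first + second

# The new list holds references to the very same element objects.
shares_items = combined[0] is first[0] and combined[2] is second[0]

first[0].append(99)                  # mutate an element shared by both lists
seen_through_combined = combined[0]

first.append('new')                  # change the structure of the original list
combined_length = len(combined)      # the concatenated list is not affected

print(shares_items, seen_through_combined, combined_length)
```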
I am trying to use the Cycle function from SymPy to simplify a list like so.
from sympy.combinatorics import Permutation, Cycle
Cycle(1,2,3)(3,4,5)(7)
And the output should be...
Cycle(1, 2, 4, 5, 3)(7)
However, when I try using...
a_list = [[1,2,3,4],[4,5,7],[3,4,2]]
b = Cycle(a_list)
print(b)
I get this error
'tuple' object is not callable
I know that I am inputting the wrong kind of variable into cycle, but could someone tell me what I can do with Cycle. It is a function that does exactly what I need, I just need to find a way to convert a list into a cycle friendly type. Thanks for your help.
I think you're looking for something like this:
a_list = [[1,2,3,4],[4,5,7],[3,4,2]]
b = Cycle()
for i in a_list:
    b = b(*tuple(i))
print(b)
To convert to a list, try b.list().
Explanation
tuple(i) converts [1,2,3,4] to (1,2,3,4)
Say you have a function foo. Running foo(1,2,3,4) is the same as running foo(*(1,2,3,4))
A simpler example:
a_list = [[1,2], [3,4]]
b = Cycle()
On the first iteration (i = [1,2]), calling b(*tuple(i)) is the same as calling b(1,2) which, because b = Cycle(), is really Cycle()(1,2) which is the same as Cycle(1,2) according to the docs.
On the second iteration (i = [3,4]), calling b(*tuple(i)) is really b(3,4) which is Cycle(1,2)(3,4)
Hopefully, that example makes some sense. It's a little confusing because there are so many parentheses. If you're still confused, you might want to run through the code step by step (maybe with a debugger) to help understand what happens.
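The argument unpacking used above can be checked in isolation, without SymPy. In this sketch foo is a hypothetical function that simply returns whatever positional arguments it receives.

```python
def foo(*args):
    """Hypothetical function that just reports the arguments it received."""
    return args

i = [1, 2, 3, 4]

direct = foo(1, 2, 3, 4)
unpacked = foo(*tuple(i))     # the form used in the loop above
also_unpacked = foo(*i)       # a list can be unpacked directly as well
not_unpacked = foo(i)         # without *, the whole list is ONE argument

print(direct, unpacked, also_unpacked, not_unpacked)
```

Since unpacking accepts any iterable, the tuple() conversion is not strictly required for the unpacking itself.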
Cycle (as described in the docstring) provides some subtle advantages to entry, but if all elements are present in the list of cycles that you have shown, then simply passing them to Permutation should suffice.
Remember that order matters:
>>> Cycle(1,3)(3,2)
Cycle(1, 2, 3)
>>> Cycle(2,3)(1,3)
Cycle(1, 3, 2)
>>> p = Permutation([[1,3],[3,2]])
>>> Cycle(p)
Cycle(1, 2, 3)
>>> p.list() == _.list() == [0, 2, 3, 1]
Note that the Permutation will allow you to explicitly ask for array_form or cyclic_form whereas Cycle only allows the list() method:
>>> p.array_form
[0, 2, 3, 1]
>>> p.cyclic_form
[[1, 2, 3]]
I found this line in the pip source:
sys.path[:] = glob.glob(os.path.join(WHEEL_DIR, "*.whl")) + sys.path
As I understand the line above is doing the same as below:
sys.path = glob.glob(os.path.join(WHEEL_DIR, "*.whl")) + sys.path
With one difference: in the first case sys.path still points to the same object in memory, while in the second case sys.path points to a new list created from the two existing ones.
Another thing is that the first case is about two times slower than the second:
>>> timeit('a[:] = a + [1,2]', setup='a=[]', number=20000)
2.111023200035561
>>> timeit('a = a + [1,2]', setup='a=[]', number=20000)
1.0290934000513516
The reason, I think, is that with slice assignment the references in a are copied to a new list and then copied back into the resized a.
So what are the benefits of using a slice assignment?
Assigning to a slice is useful if there are other references to the same list, and you want all references to pick up the changes.
So if you do something like:
bar = [1, 2, 3]
foo = bar
bar[:] = [5, 4, 3, 2, 1]
print(foo)
this will print [5, 4, 3, 2, 1]. If you instead do:
bar = [5, 4, 3, 2, 1]
print(foo)
the output will be [1, 2, 3].
With one difference: in the first case sys.path still points to the same object in memory, while in the second case sys.path points to a new list created from the two existing ones.
Right: That’s the whole point, you’re modifying the object behind the name instead of the name. Thus all other names referring to the same object also see the changes.
Another thing is that the first case is about two times slower than the second:
Not really. Slice assignment performs a copy. Performing a copy is an O(n) operation while performing a name assignment is O(1). In other words, the bigger the list, the slower the copy; whereas the name assignment always takes the same (short) time.
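One way to observe this is to time both statements on lists of different sizes. This is only a sketch: time_both is a hypothetical helper, and the absolute numbers depend on the machine, so none are shown here.

```python
from timeit import timeit

def time_both(n, number=200):
    """Return (slice_assignment_seconds, rebinding_seconds) for lists of n items."""
    setup = 'a = list(range(%d)); b = list(range(%d))' % (n, n)
    slice_time = timeit('a[:] = b', setup=setup, number=number)   # copies n references
    rebind_time = timeit('a = b', setup=setup, number=number)     # rebinds one name
    return slice_time, rebind_time

for n in (1000, 100000):
    slice_time, rebind_time = time_both(n)
    print(n, slice_time, rebind_time)
```

Unlike the benchmark in the question, this keeps the concatenation out of the timed statement, so only the copy and the rebinding are compared.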
Your assumptions are very good!
In Python, a variable is a name bound to an object in memory. That is what makes Python dynamically typed: the same variable can refer to a number and later be reassigned to a string, and so on.
As shown here, whenever you assign a new value to a variable, you are just pointing the name at a different object in memory:
>>> a = 1
>>> id(a)
10968800
>>> a = 1.0
>>> id(a)
140319774806136
>>> a = 'hello world'
>>> id(a)
140319773005552
(In CPython, the id is the object's address in memory.)
Now for your question: sys.path is a list, and a Python list is a mutable type, meaning the object itself can be changed in place, e.g.:
>>> l = []
>>> id(l)
140319772970184
>>> l.append(1)
>>> id(l)
140319772970184
>>> l.append(2)
>>> id(l)
140319772970184
Even though I modified the list by adding items, the name still points to the same object. In keeping with the rest of Python, a list's elements are also just pointers to objects elsewhere in memory (the elements aren't the objects themselves; they act like variables referring to them), as shown here:
>>> l
[1, 2]
>>> id(l[0])
10968800
>>> l[0] = 3
>>> id(l[0])
10968864
>>> id(l)
140319772970184
After reassigning l[0], the id of that element has changed, but once again the id of the list hasn't.
Assigning to an index only changes what that list element points to. Likewise, when I reassign l itself, I don't modify the list; I just change what the name l points to:
>>> id(l)
140319772970184
>>> l = [4, 5, 6]
>>> id(l)
140319765766728
But if I reassign all of l's indexes, then l stays the same object and only its elements point to different places:
>>> id(l)
140319765766728
>>> l[:] = [7, 8, 9]
>>> id(l)
140319765766728
That also explains why it is slower: Python is reassigning the elements of the list, not just pointing the name somewhere else.
One more little point if you are wondering about the part where the line finishes with
sys.path[:] = ... + sys.path
it follows the same idea: Python first evaluates the right-hand side of the = and only then performs the assignment. While the new list on the right is being built, sys.path is still the original list. Python reads all of its elements into the new list and then, because we used [:], replaces the contents of the original sys.path object with the elements of that new list.
As for why pip uses [:] instead of reassigning, I don't really know, but I believe the benefit is that the same object in memory is reused for sys.path.
Python itself also reuses objects for small integers, for example:
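The evaluation order can be checked with a stand-in list. In this sketch path, alias and wheels are hypothetical names, and the strings are placeholder data, not real sys.path entries.

```python
path = ['/usr/lib', '/opt/lib']   # hypothetical stand-in for sys.path
alias = path                      # another reference to the same list
wheels = ['a.whl', 'b.whl']       # hypothetical result of the glob call

# The right-hand side is built first (a new list), then copied into path.
path[:] = wheels + path

same_object = path is alias       # the list object was reused
print(alias)
```

Every holder of a reference to the original list, such as alias here, sees the prepended entries.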
>>> a = 1
>>> b = 1
>>> c = 1
>>> id(a)
10968800
>>> id(b)
10968800
>>> id(c)
10968800
a, b and c all point to the same object in memory even though each of them asked for a 1 to be created. CPython expects small numbers to be used a lot in programs (for example in for loops), so it creates them once and reuses them throughout.
(you might also find it being the case with filehandles that python will recycle instead of creating a new one.)
You are right: slice assignment does not rebind the name. A slice is itself an object in Python, and you can use slice objects both to get and to set items.
In [1]: a = [1, 2, 3, 4]
In [2]: a[slice(0, len(a), 2)]
Out[2]: [1, 3]
In [3]: a[slice(0, len(a), 2)] = 6, 6
In [4]: a[slice(0, len(a), 1)] = range(10)
In [5]: a
Out[5]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [6]: a[:] = range(4)
In [7]: a
Out[7]: [0, 1, 2, 3]
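One detail the session above hints at: a slice with a step other than 1 (an extended slice) must be assigned a sequence of exactly matching length, while a plain slice may resize the list. A minimal sketch with hypothetical variable names:

```python
a = [1, 2, 3, 4]
every_other = slice(0, len(a), 2)     # same as a[0:4:2]

a[every_other] = 6, 6                 # two slots, two values
after_extended = list(a)

try:
    a[every_other] = [7, 8, 9]        # wrong length for an extended slice
    error = None
except ValueError as exc:
    error = exc

a[slice(None)] = range(3)             # same as a[:] = range(3); may resize the list
after_plain = list(a)

print(after_extended, type(error).__name__, after_plain)
```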
I occasionally use numpy, and I'm trying to become smarter about how I vectorize operations. I'm reading some code and trying to understand the semantics of the following:
arr_1[:] = arr_2
In this case,
I understand that in arr[:, 0], we're selecting the first column of the array, but I'm confused about what the difference is between arr_1[:] = arr_2 and arr_1 = arr_2
Your question involves a mix of basic Python syntax, and numpy specific details. In many ways it is the same for lists, but not exactly.
arr[:, 0] returns the first column of arr (as a view); arr[:, 0] = 10 sets every value in that column to 10.
arr[:] returns a view of the whole array (alist[:], by contrast, returns a copy of a list). arr[:] = arr2 performs an in-place replacement, changing the values of arr to the values of arr2. The values of arr2 are broadcast and copied as needed.
arr=arr2 sets the object that the arr variable is pointing to. Now arr and arr2 point to the same thing (whether array, list or anything else).
arr[...]=arr2 also works when copying all the data
Play about with these actions in an interactive session. Try variations in the shape of arr2 to see how values get broadcast. Also check id(arr) to see the object that the variable points to, and arr.__array_interface__ to see the data buffer of the array. That helps you distinguish views from copies.
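A starting point for such a session might look like the sketch below. It assumes numpy is installed and uses np.shares_memory as a convenient alternative to reading __array_interface__ by hand; the variable names are hypothetical.

```python
import numpy as np

arr = np.zeros((2, 3))
alias = arr                              # second name for the same array

arr[:] = np.array([1.0, 2.0, 3.0])       # one row, broadcast across both rows
in_place = arr is alias                  # still the same array object
rows = arr.tolist()

view = arr[:, 0]                         # first column, as a view
view_shares = np.shares_memory(view, arr)
arr[:, 0] = 10                           # writes through to the shared buffer
first_column = view.tolist()

arr2 = np.ones((2, 3))
arr = arr2                               # rebinding: arr now names arr2's array
rebinding_detached = arr is arr2 and arr is not alias

print(in_place, rows, view_shares, first_column, rebinding_detached)
```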
arr_1[:] = ... changes the elements of the existing list object that arr_1 refers to.
arr_1 = ... makes the name arr_1 refer to a different list object.
The main difference is what happens if some other name also referred to the original list object. If that's the case, then the former updates the thing that both names refer to; while the latter changes what one name refers to while leaving the other referring to the original thing.
>>> a = [0]
>>> b = a
>>> a[:] = [1]
>>> print(b)
[1] <--- note, change reflected by a and b
>>> a = [2]
>>> print(b)
[1] <--- but now a points at something else, so no change to b
Perhaps it is best to understand by using id to examine the memory location of each variable.
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
>>> id(arr1)
4595568512
>>> id(arr2)
4595566192
# Slice assignment
arr1[:] = arr2
>>> arr1
array([4, 5, 6])
>>> id(arr1) # The object still points to the same memory location of `arr1`.
4595568512
# Reassignment.
arr1 = arr2
>>> id(arr1) # The object is now pointing to the object located to where `arr2` points.
4595566192
Using arr_1[:] = arr_2 is a shortcut for arr_1.__setitem__(slice(None, None), arr_2). The reason that is used instead of arr_1 = arr_2 is when you use __setitem__, you are modifying arr_1, whereas when you say arr_1 = arr_2, you are redefining arr_1. Using __setitem__, therefore, will modify other references to the arr_1 object rather than just redefining arr_1.
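The same translation applies to plain lists, so it can be tried without numpy. A minimal sketch, calling __setitem__ explicitly purely for illustration:

```python
a = [0]
b = a                                     # second reference to the same list

a.__setitem__(slice(None, None), [1])     # what a[:] = [1] is translated to
via_dunder = list(b)

a[:] = [2]                                # the usual spelling, same effect
via_syntax = list(b)

a = [3]                                   # plain rebinding: b is left alone
after_rebinding = list(b)

print(via_dunder, via_syntax, after_rebinding)
```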
I thought that if you assign a variable to another list, it's not copied, but points to the same location. That's what deepcopy() is for. But this doesn't seem to be true in Python 2.7: the list appears to be copied.
>>> a=[1,2,3]
>>> b=a
>>> b=b[1:]+b[:1]
>>> b
[2, 3, 1]
>>> a
[1, 2, 3]
>>>
>>> a=(1,2,3)
>>> b=a
>>> b=b[1:]+b[:1]
>>> a
(1, 2, 3)
>>> b
(2, 3, 1)
>>>
What am I missing?
This line changes what b points to:
b=b[1:]+b[:1]
List or tuple addition creates a new list or tuple, and the assignment operator makes b refer to that new list while leaving a referring to the original list or tuple.
Slicing a list or tuple also creates a new object, so that line creates three new objects - one for each slice, and then one for the sum. b = a + b would be a simpler example to demonstrate that addition creates a new object.
You will sometimes see c = b[:] as a way to shallow copy a list, making use of the fact that slicing creates a new object.
When you do b=b[1:]+b[:1], you first create a new object from the two slices of b and then make b refer to that object. The same applies to both the list and the tuple case.
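Since the question mentions deepcopy(), the sketch below contrasts plain assignment, a slice copy, slice concatenation and copy.deepcopy on a nested list. The data and the names shallow, deep and rotated are hypothetical.

```python
import copy

a = [[1], [2], [3]]
b = a                        # plain assignment: same object, no copy
shallow = a[:]               # new outer list, shared inner lists
deep = copy.deepcopy(a)      # new outer list and new inner lists
rotated = a[1:] + a[:1]      # new list built from two new slices

a[0].append('x')             # mutate an inner list through a

results = {
    'alias_is_same': b is a,
    'shallow_is_new': shallow is not a,
    'shallow_sees_inner_change': shallow[0] == [1, 'x'],
    'deep_unaffected': deep[0] == [1],
    'rotated_sees_inner_change': rotated[2] == [1, 'x'],
}
print(results)
```

Only the deep copy is insulated from the change; the slice copy and the rotated list are new lists that still share their elements with a.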