Difference between list() and [:] - python

If we have a list s, is there any difference between calling list(s) versus s[:]? It seems to me like they both create new list objects with the exact elements of s.

In both cases, they should create a (shallow) copy of the list.
Note that there is one corner case (which is hardly worth mentioning) where it might be different...
list = tuple # Don't ever do this!
list_copy = list(some_list) # Oops, actually it's a tuple ...
actually_list_copy = some_list[:]
With that said, nobody in their right mind should ever shadow the builtin list like that.
My advice, use whichever you feel is easier to read and works nicely in the current context.
list(...) makes it explicit that the output is a list and will make a list out of any iterable.
something[:] is a common idiom for "give me a shallow copy of this sequence, I don't really care what kind of sequence it is ...", but it doesn't work on arbitrary iterables.
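To make the word "shallow" concrete, here is a minimal sketch (the variable names are illustrative only). It shows that both spellings create a new outer list while the nested lists are still shared with the original, and that copy.deepcopy is the tool to reach for when the nested objects must be independent too.

```python
import copy

s = [[1, 2], [3, 4]]

by_call = list(s)    # new outer list
by_slice = s[:]      # new outer list

# Both copies are distinct objects that compare equal to s.
assert by_call is not s and by_slice is not s
assert by_call == s == by_slice

# The inner lists are shared, so mutating one is visible through every copy.
s[0].append(99)
print(by_call[0], by_slice[0])   # [1, 2, 99] [1, 2, 99]

# copy.deepcopy duplicates the nested lists as well.
deep = copy.deepcopy(s)
s[1].append(100)
print(deep[1])                   # [3, 4]
```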

list() is better - it's more readable. Other than that there is no difference.

The short answer is: use list(). It is also much easier to search for. Try googling "python [:]" and then "python list".
If s is a list then there is no difference, but will s always be a list? Or could it be a sequence or a generator?
In [1]: nums = 1, 2, 3
In [2]: nums
Out[2]: (1, 2, 3)
In [3]: nums[:]
Out[3]: (1, 2, 3)
In [4]: list(nums)
Out[4]: [1, 2, 3]
In [7]: strings = (str(x) for x in nums)
In [8]: strings
Out[8]: <generator object <genexpr> at 0x7f77be460550>
In [9]: strings[:]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-9-358af12435ff> in <module>()
----> 1 strings[:]
TypeError: 'generator' object has no attribute '__getitem__'
In [10]: list(strings)
Out[10]: ['1', '2', '3']

I didn't realize, but as ventsyv mentioned, s[:] and list(s) both create a copy of s.
Note that you can check whether two names refer to the same object with is, and id() returns an object's identity (its memory address in CPython), so you can see for yourself whether they are the same.
>>> s = [1,2,3]
>>> listed_s = list(s)
>>> id(s)
44056328
>>> id(listed_s) # different
44101840
>>> listed_s is s
False
>>> bracket_s = s[:]
>>> bracket_s is s
False
>>> id(bracket_s)
44123760
>>> z = s # points to the same object in memory
>>> z is s
True
>>> id(z)
44056328
For reference, the docstring of id() reads:
id(object) -> integer
Return the identity of an object. This is guaranteed to be unique among
simultaneously existing objects. (Hint: it's the object's memory address.)

Related

What's the point of assignment to slice?

I found this line in the pip source:
sys.path[:] = glob.glob(os.path.join(WHEEL_DIR, "*.whl")) + sys.path
As I understand it, the line above does the same as the one below:
sys.path = glob.glob(os.path.join(WHEEL_DIR, "*.whl")) + sys.path
With one difference: in the first case sys.path still points to the same object in memory, while in the second case sys.path points to a new list created from the two existing ones.
Another thing is that the first case is two times slower than the second:
>>> timeit('a[:] = a + [1,2]', setup='a=[]', number=20000)
2.111023200035561
>>> timeit('a = a + [1,2]', setup='a=[]', number=20000)
1.0290934000513516
The reason, I think, is that in the case of slice assignment the objects from a (references to objects) are copied to a new list and then copied back into the resized a.
So what are the benefits of using a slice assignment?
Assigning to a slice is useful if there are other references to the same list, and you want all references to pick up the changes.
So if you do something like:
bar = [1, 2, 3]
foo = bar
bar[:] = [5, 4, 3, 2, 1]
print(foo)
this will print [5, 4, 3, 2, 1]. If you instead do:
bar = [5, 4, 3, 2, 1]
print(foo)
the output will be [1, 2, 3].
"With one difference: in the first case sys.path still points to the same object in memory, while in the second case sys.path points to a new list created from the two existing ones."
Right. That’s the whole point: you’re modifying the object behind the name instead of the name. Thus all other names referring to the same object also see the changes.
"Another thing is that the first case is two times slower than the second:"
Not really. Slice assignment performs a copy, which is an O(n) operation, while binding a name is O(1). In other words, the bigger the list, the slower the copy, whereas the name assignment always takes the same (short) time.
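A common place where "modifying the object behind the name" matters is a function that should change the caller's list. The sketch below uses two hypothetical helper functions (they are not from pip or the standard library) to contrast rebinding a local name with slice assignment.

```python
def drop_negatives_rebind(values):
    # Rebinds the local name only; the caller's list is untouched.
    values = [v for v in values if v >= 0]


def drop_negatives_in_place(values):
    # Replaces the contents of the caller's list object.
    values[:] = [v for v in values if v >= 0]


data = [3, -1, 4, -1, 5]
alias = data

drop_negatives_rebind(data)
print(data)            # [3, -1, 4, -1, 5]

drop_negatives_in_place(data)
print(data)            # [3, 4, 5]
print(alias is data)   # True
```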
Your assumptions are very good!
In Python a variable is a name that has been set to point to an object in memory, which in essence is what lets Python be a dynamically typed language, i.e. the same variable can hold a number and later be reassigned to a string, and so on.
As shown here, whenever you assign a new value to a variable you are just pointing the name at a different object in memory:
>>> a = 1
>>> id(a)
10968800
>>> a = 1.0
>>> id(a)
140319774806136
>>> a = 'hello world'
>>> id(a)
140319773005552
(In CPython the id is the object's address in memory.)
Now for your question: sys.path is a list, and a Python list is a mutable type, meaning that the object itself can be changed in place, i.e.
>>> l = []
>>> id(l)
140319772970184
>>> l.append(1)
>>> id(l)
140319772970184
>>> l.append(2)
>>> id(l)
140319772970184
Even though I modified the list by adding items, the name still points to the same object. Following the nature of Python, a list's elements are also only pointers to other objects in memory (the elements aren't the objects themselves; they are like variables referring to the objects held there), as shown here:
>>> l
[1, 2]
>>> id(l[0])
10968800
>>> l[0] = 3
>>> id(l[0])
10968864
>>> id(l)
140319772970184
After reassigning l[0], the id of that element has changed, but once again the id of the list hasn't.
Assigning to an index only changes where that list element points. Likewise, when I reassign l I don't modify the list; I just change where the name l points:
>>> id(l)
140319772970184
>>> l = [4, 5, 6]
>>> id(l)
140319765766728
But if I assign to all of l's indexes, then l stays the same object and only its elements point to different places:
>>> id(l)
140319765766728
>>> l[:] = [7, 8, 9]
>>> id(l)
140319765766728
That also explains why it is slower: Python is reassigning the elements of the list, not just pointing the name somewhere else.
One more small point, in case you are wondering about the line ending with
sys.path[:] = ... + sys.path
It follows the same idea: Python first creates the object on the right side of the = and only then performs the assignment on the left. While the new list on the right side is being built, sys.path is still the original list; Python reads all of its elements and then, because we used [:], stores the elements of the newly created list into the original sys.path object.
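The two steps described above can be spelled out separately. This is only a sketch: search_path is a hypothetical stand-in for sys.path, and the paths are made up.

```python
search_path = ["/usr/lib", "/opt/lib"]   # hypothetical stand-in for sys.path
before = id(search_path)

# Step 1: the right-hand side builds a brand-new list from the old contents.
new_contents = ["/wheels/a.whl"] + search_path
assert new_contents is not search_path

# Step 2: slice assignment copies the new contents into the existing list.
search_path[:] = new_contents

print(search_path)   # ['/wheels/a.whl', '/usr/lib', '/opt/lib']
assert id(search_path) == before
```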
As for why pip uses [:] instead of rebinding, I don't really know, but I believe the benefit is that the same object in memory is reused for sys.path.
Python itself also reuses objects for small integers, for example
>>> a = 1
>>> b = 1
>>> c = 1
>>> id(a)
10968800
>>> id(b)
10968800
>>> id(c)
10968800
a, b and c all point to the same object in memory even though each assignment asked for a 1. CPython expects small numbers to be used a lot in programs (for example in for loops), so it creates them once and reuses them throughout.
(You might also find this with file handles, which Python may recycle instead of creating new ones.)
You are right: slice assignment does not rebind the name. A slice is itself an object in Python, and you can use it both to get and to set items.
In [1]: a = [1, 2, 3, 4]
In [2]: a[slice(0, len(a), 2)]
Out[2]: [1, 3]
In [3]: a[slice(0, len(a), 2)] = 6, 6
In [4]: a[slice(0, len(a), 1)] = range(10)
In [5]: a
Out[5]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [6]: a[:] = range(4)
In [7]: a
Out[7]: [0, 1, 2, 3]
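To see that the bracket syntax and slice objects are the same thing, the following sketch uses a small hypothetical class, ShowIndex, whose __getitem__ simply returns whatever index it receives.

```python
class ShowIndex:
    """Hypothetical helper that returns whatever index it is given."""

    def __getitem__(self, index):
        return index


show = ShowIndex()
print(show[:])        # slice(None, None, None)
print(show[1:10:2])   # slice(1, 10, 2)

a = [0, 1, 2, 3, 4, 5]
every_other = slice(0, None, 2)
assert a[every_other] == a[::2] == [0, 2, 4]

# slice.indices() resolves the None placeholders for a given length.
print(every_other.indices(len(a)))   # (0, 6, 2)
```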

Behaviour of map function in Python

I cannot understand how and why the first time a reference to an object is printed (which object is it referring to?), while the second time, when I use two variables, those variables get the results of the function instead of a reference.
>>> a = map(int,[1,2])
>>> a
<map object at 0x7f0b1142fa90>
>>> b,c = a
>>> b
1
>>> c
2
In Python 3, map (and other primitive combinators) returns an iterator object rather than a list (as it did in Python 2). In the first attempt you printed the iterator object itself, while in the second you unpacked it into two names, which forces the iterator to produce its elements. Consider:
>>> a = map(int,[1,2])
>>> a
<map object at 0x7ff6ddbfe748>
>>> list(a)
[1, 2]
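One consequence worth knowing is that such an iterator can only be consumed once. A minimal sketch, assuming Python 3 (the variable names are illustrative):

```python
numbers = map(int, ["1", "2"])

first, second = numbers   # unpacking pulls both items out of the iterator
print(first, second)      # 1 2

# The iterator is now exhausted, so a second pass yields nothing.
print(list(numbers))      # []

# Materialise the results once if they are needed more than once.
kept = list(map(int, ["1", "2"]))
print(kept, kept)         # [1, 2] [1, 2]
```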

Does slice operation allocate a new object always?

I am confused about the slice operation.
>>> s = "hello world"
>>> y = s[::]
>>> id(s)
4507906480
>>> id(y)
4507906480 # they are the same - no new object was created
>>> z = s[:2]
>>> z
'he'
>>> id(z)
4507835488 # z is a new object
What allocation rule does slice operation follow?
For most built-in types, slicing is always a shallow copy, in the sense that modifying the copy will not modify the original. This means that for immutable types, an object counts as a copy of itself. The copy module also uses this concept of "copy":
>>> import copy
>>> t = (1, 2, 3)
>>> copy.copy(t) is t
True
Objects are free to use whatever allocation strategy they choose, as long as they implement the semantics they document. y can be the same object as s, but z cannot, because s and z store different values.
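The difference between mutable and immutable types can be checked directly. In this sketch only the list assertions are guaranteed by the language; whether a full slice of a tuple or string returns the very same object is a CPython implementation detail, so that result is printed rather than asserted.

```python
lst = [1, 2, 3]
tup = (1, 2, 3)
text = "hello world"

# A list is mutable, so a full slice must be a separate object.
assert lst[:] is not lst
assert lst[:] == lst

# For immutable types only an equal value is promised. CPython happens
# to hand back the original object, but that is not guaranteed.
print(tup[:] is tup, text[:] is text)

# A partial slice holds a different value, so it cannot be the same object.
assert text[:2] == "he" and text[:2] is not text
```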

Python: these variables don't point to the same values. Why?

I thought that if you assign a variable to another list, it's not copied, but points to the same location. That's what deepcopy() is for. This does not seem to be true with Python 2.7: it's copied.
>>> a=[1,2,3]
>>> b=a
>>> b=b[1:]+b[:1]
>>> b
[2, 3, 1]
>>> a
[1, 2, 3]
>>>
>>> a=(1,2,3)
>>> b=a
>>> b=b[1:]+b[:1]
>>> a
(1, 2, 3)
>>> b
(2, 3, 1)
>>>
What am I missing?
This line changes what b points to:
b=b[1:]+b[:1]
List or tuple addition creates a new list or tuple, and the assignment operator makes b refer to that new list while leaving a referring to the original list or tuple.
Slicing a list or tuple also creates a new object, so that line creates three new objects - one for each slice, and then one for the sum. b = a + b would be a simpler example to demonstrate that addition creates a new object.
You will sometimes see c = b[:] as a way to shallow copy a list, making use of the fact that slicing creates a new object.
When you do b=b[1:]+b[:1] you first create a new object from the two slices of b and then rebind b to reference that object. The same holds for both the list and the tuple case.
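The following sketch separates the two effects: plain assignment shares one list, a mutation is visible through both names, and only the rebinding line makes b refer to a new object.

```python
a = [1, 2, 3]
b = a                  # no copy: both names refer to one list
assert b is a

b.append(4)            # mutation through b is visible through a
print(a)               # [1, 2, 3, 4]

b = b[1:] + b[:1]      # a new list is built, then b is rebound to it
print(a)               # [1, 2, 3, 4]
print(b)               # [2, 3, 4, 1]
assert b is not a
```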

Understanding the behavior of Python's set

The documentation for the built-in type set says:
class set([iterable])
Return a new set or frozenset object whose elements are taken from iterable. The elements of a set must be hashable.
That is all right but why does this work:
>>> l = range(10)
>>> s = set(l)
>>> s
set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
And this doesn't:
>>> s.add([10])
Traceback (most recent call last):
File "<pyshell#7>", line 1, in <module>
s.add([10])
TypeError: unhashable type: 'list'
Both are lists. Is some magic happening during the initialization?
When you initialize a set, you provide a list of values that must each be hashable.
s = set()
s.add([10])
is the same as
s = set([[10]])
which throws the same error that you're seeing right now.
In [13]: (2).__hash__
Out[13]: <method-wrapper '__hash__' of int object at 0x9f61d84>
In [14]: ([2]).__hash__ # nothing.
The thing is that set needs its items to be hashable, i.e. to implement the __hash__ magic method (the hash is used to place the item in the set's underlying hash table). list does not provide a usable __hash__, hence a list cannot be added to a set.
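Hashability can be probed by simply calling hash(). The helper below, is_hashable, is a hypothetical name introduced for this sketch; it also shows that a tuple is only hashable when everything inside it is.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable(2))          # True
print(is_hashable([2]))        # False
print(is_hashable((2, 3)))     # True
print(is_hashable((2, [3])))   # False
print(list.__hash__ is None)   # True
```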
In this line:
s.add([10])
You are trying to add a list to the set, rather than the elements of the list. If you want to add the elements of the list, use the update method.
Think of the constructor being something like:
class Set:
    def __init__(self, l):
        for elem in l:
            self.add(elem)
So there is nothing mysterious about the constructor accepting a list while add(element) rejects one: the constructor adds the list's elements, whereas add would have to store the list itself.
It behaves according to the documentation: set.add() adds a single element (and since you give it a list, it complains it is unhashable - since lists are no good as hash keys). If you want to add a list of elements, use set.update(). Example:
>>> s = set([1,2,3])
>>> s.add(5)
>>> s
set([1, 2, 3, 5])
>>> s.update([8])
>>> s
set([8, 1, 2, 3, 5])
s.add([10]) works as documented. An exception is raised because [10] is not hashable.
There is no magic happening during initialisation.
set([0,1,2,3,4,5,6,7,8,9]) has the same effect as set(range(10)) and set(xrange(10)) and set(foo()) where
def foo():
    for i in (9,8,7,6,5,4,3,2,1,0):
        yield i
In other words, the arg to set is an iterable, and each of the values obtained from the iterable must be hashable.
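Putting the pieces together, here is a short sketch of the options for getting a group of values into a set. It assumes Python 3, and the values are arbitrary.

```python
s = set(range(3))

s.update([10, 11])        # adds each element of the iterable
s.add((20, 21))           # adds one hashable element: the tuple itself
s.add(frozenset([30]))    # frozenset is the hashable counterpart of set

try:
    s.add([40])           # a list is unhashable, so this fails
except TypeError as exc:
    print("rejected:", type(exc).__name__)   # rejected: TypeError

print(10 in s, (20, 21) in s)                # True True
```

If a group of values has to be stored as a single member, converting it to a tuple or frozenset first is the usual way out.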
