Python mutable and immutable [duplicate] - python

I'm a bit confused about modifying tuple members. The following doesn't work:
>>> thing = (['a'],)
>>> thing[0] = ['b']
TypeError: 'tuple' object does not support item assignment
>>> thing
(['a'],)
But this does work:
>>> thing[0][0] = 'b'
>>> thing
(['b'],)
Also works:
>>> thing[0].append('c')
>>> thing
(['b', 'c'],)
Doesn't work, and works (huh?!):
>>> thing[0] += 'd'
TypeError: 'tuple' object does not support item assignment
>>> thing
(['b', 'c', 'd'],)
Seemingly equivalent to previous, but works:
>>> e = thing[0]
>>> e += 'e'
>>> thing
(['b', 'c', 'd', 'e'],)
So what exactly are the rules of the game, when you can and can't modify something inside a tuple? It seems to be more like prohibition of using the assignment operator for tuple members, but the last two cases are confusing me.

You can always modify a mutable value inside a tuple. The puzzling behavior you see with
>>> thing[0] += 'd'
is caused by +=. The += operator does in-place addition but also an assignment: the in-place addition works just fine, but the assignment fails since the tuple is immutable. Thinking of it like
>>> thing[0] = thing[0] + 'd'
explains this better. We can use the dis module from the standard library to look at the bytecode generated from both expressions. With += we get an INPLACE_ADD bytecode:
>>> import dis
>>> def f(some_list):
...     some_list += ["foo"]
...
>>> dis.dis(f)
2 0 LOAD_FAST 0 (some_list)
3 LOAD_CONST 1 ('foo')
6 BUILD_LIST 1
9 INPLACE_ADD
10 STORE_FAST 0 (some_list)
13 LOAD_CONST 0 (None)
16 RETURN_VALUE
With + we get a BINARY_ADD:
>>> def g(some_list):
...     some_list = some_list + ["foo"]
...
>>> dis.dis(g)
2 0 LOAD_FAST 0 (some_list)
3 LOAD_CONST 1 ('foo')
6 BUILD_LIST 1
9 BINARY_ADD
10 STORE_FAST 0 (some_list)
13 LOAD_CONST 0 (None)
16 RETURN_VALUE
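Notice that we get a STORE_FAST in both places, because the target in these functions is a local variable. When the target is a tuple item such as thing[0], the store instruction is STORE_SUBSCR instead, and that store is the bytecode that fails. The INPLACE_ADD that comes just before works fine.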
This explains why the "Doesn't work, and works" case leaves the modified list behind: the tuple already has a reference to the list:
>>> id(thing[0])
3074072428L
The list is then modified by the INPLACE_ADD and the store back into the tuple fails:
>>> thing[0] += 'd'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
So the tuple still has a reference to the same list, but the list has been modified in-place:
>>> id(thing[0])
3074072428L
>>> thing[0]
['b', 'c', 'd']
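A minimal Python 3 sketch of the same effect, in case it helps to see the two steps separately. The helper name augmented_assign is made up for illustration; it only approximates what the interpreter does for `container[index] += value`. The opcode listing at the end is informational, since opcode names differ between Python versions.

```python
import dis

def augmented_assign(container, index, value):
    """Roughly what `container[index] += value` does (hypothetical helper)."""
    item = container[index]
    result = item.__iadd__(value)   # step 1: the list is extended in place
    container[index] = result       # step 2: the store, which a tuple rejects

thing = (['b', 'c'],)
error = None
try:
    augmented_assign(thing, 0, 'd')
except TypeError as exc:
    error = str(exc)

# The list was extended even though the store failed.
print(thing)
print(error)

# List the opcodes the real statement compiles to on this interpreter.
code = compile("thing[0] += 'd'", "<demo>", "exec")
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)
```

The in-place step has already happened by the time the store raises, which is why the tuple shows the extended list afterwards.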

You can't modify the tuple, but you can modify the contents of things contained within the tuple. A tuple only stores references to its items, so the "thing" in the tuple is just a reference: the actual list (like a set, dict, or any other mutable object) is pointed to by that reference and can be modified without changing the reference itself.
 (  +  ,)      <--- your tuple (this can't be changed)
    |
    |
    v
  ['a']        <--- the list object your tuple references (this can be changed)
After thing[0][0] = 'b':
 (  +  ,)      <--- notice how the contents of this are still the same
    |
    |
    v
  ['b']        <--- but the contents of this have changed
After thing[0].append('c'):
 (  +  ,)      <--- notice how this is still the same
    |
    |
    v
 ['b','c']     <--- but this has changed again
The reason why += errors is that it's not completely equivalent to .append() - it actually does an addition and then an assignment (and the assignment fails), rather than merely appending in-place.

You cannot replace an element of a tuple, but you can replace the entire contents of the element. This will work:
thing[0][:] = ['b']
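A short sketch of why the slice assignment is allowed: it changes the contents of the list object, while the tuple keeps pointing at the very same object. The variable names inner and same_object are only for illustration.

```python
thing = (['a'],)
inner = thing[0]          # remember which list object the tuple refers to

# Slice assignment replaces the list's contents, not the tuple's item.
thing[0][:] = ['b']

same_object = thing[0] is inner
print(thing)
print(same_object)
```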

Related

Python append to json dict inline

>>> import json
>>> salaries = '{"Alfred" : 300, "Jane" : 400 }'
>>> sal = json.loads(salaries)["Hritik"]=0
>>> sal
0
>>> sal = json.loads(salaries)
>>> sal["Hritik"]=0
>>> sal
{'Alfred': 300, 'Jane': 400, 'Hritik': 0}
>>> type(json.loads(salaries))
<class 'dict'>
>>> type(sal)
<class 'dict'>
Why can't I append to the dict returned by json.loads inline, as I can with the dict sal?
Doesn't json.loads return just a dict, which should be the same as any other dict?
The assignment statement x = y = z implies that both x and y will take on the value of z.
As an example, look at the byte code for the assignment a = b = 2:
In [45]: import dis; dis.dis(compile('a = b = 2', '', 'exec'))
1 0 LOAD_CONST 0 (2)
3 DUP_TOP
4 STORE_NAME 0 (a)
7 STORE_NAME 1 (b)
10 LOAD_CONST 1 (None)
13 RETURN_VALUE
At offset 4, STORE_NAME binds a to 2; at offset 7, a second STORE_NAME binds b to the same value, 2.
So, with
sal = json.loads(salaries)["Hritik"] = 0
sal receives the value 0. The dict returned by json.loads is a temporary object: it does receive the new key, but since nothing else refers to it, it is discarded once the statement finishes.
In order to get this to work, you'll need to break this up into 2 parts, as you have already done.
sal = json.loads(salaries)
sal['Hritik'] = 0
Why can't I append to the dict returned by json.loads inline as I can do with the dict sal ?
You can, and you do, but then you just discard that dict. It doesn't have any effect on the salaries variable, and you didn't assign the dict to sal. You assigned 0 to sal.
When you assign sal = json.loads(salaries), that makes a new dict, unrelated to the first dict, and then you actually assign the new dict to sal. Modifications to this new dict are still visible when you view the dict through sal.
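A small sketch of both variants side by side. The names parsed and also_zero are introduced here for illustration only.

```python
import json

salaries = '{"Alfred": 300, "Jane": 400}'

# Chained assignment: 0 is bound to sal, then stored into a dict
# that nothing refers to, so that dict is thrown away.
sal = json.loads(salaries)["Hritik"] = 0

# Binding the dict to a name first keeps the modified dict reachable.
parsed = json.loads(salaries)
also_zero = parsed["Hritik"] = 0

print(sal)
print(parsed)
```

In the second variant the chained assignment still binds 0 to also_zero, but the dict survives because parsed refers to it.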

Global dictionaries don't need keyword global to modify them? [duplicate]

I wonder why I can change a global dictionary without the global keyword. Why is it mandatory for other types? Is there any logic behind this?
E.g. code:
#!/usr/bin/env python3
stringvar = "mod"
dictvar = {'key1': 1,
           'key2': 2}
def foo():
    dictvar['key1'] += 1
def bar():
    stringvar = "bar"
    print(stringvar)
print(dictvar)
foo()
print(dictvar)
print(stringvar)
bar()
print(stringvar)
Gives following results:
me@pc:~/$ ./globalDict.py
{'key2': 2, 'key1': 1}
{'key2': 2, 'key1': 2} # Dictionary value has been changed
mod
bar
mod
where I would expect:
me@pc:~/$ ./globalDict.py
{'key2': 2, 'key1': 1}
{'key2': 2, 'key1': 1} # I didn't use global, so dictionary remains the same
mod
bar
mod
The reason is that the line
stringvar = "bar"
is ambiguous: it could be referring to a global variable, or it could be creating a new local variable called stringvar. In this case, Python defaults to assuming it is a local variable unless the global keyword has already been used.
However, the line
dictvar['key1'] += 1
is entirely unambiguous. It can be referring only to the global variable dictvar, since dictvar must already exist for the statement not to throw an error.
This is not specific to dictionaries; the same is true for lists:
listvar = ["hello", "world"]
def listfoo():
    listvar[0] = "goodbye"
or other kinds of objects:
class MyClass:
    foo = 1
myclassvar = MyClass()
def myclassfoo():
    myclassvar.foo = 2
It's true whenever a mutating operation is used rather than a rebinding one.
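A compact sketch contrasting the two kinds of operation. All names (counter, label, and the three functions) are invented for this illustration.

```python
counter = {"hits": 0}
label = "old"

def mutate():
    counter["hits"] += 1   # mutates the global dict; no global statement needed

def rebind_local():
    label = "new"          # creates a local name; the global is untouched
    return label

def rebind_global():
    global label
    label = "new"          # rebinds the module-level name

mutate()
local_result = rebind_local()
label_after_local = label
rebind_global()
label_after_global = label

print(counter, local_result, label_after_local, label_after_global)
```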
You can modify any mutable object without using the global keyword.
This is possible in Python because global is only needed when you want to rebind a name that already exists in the global scope to a new object, or to define a new global variable.
But in the case of mutable objects you're not re-assigning anything, you're just modifying them in place, so Python simply loads them from the global scope and modifies them.
As docs say:
It would be impossible to assign to a global variable without global.
In [101]: dic = {}
In [102]: lis = []
In [103]: def func():
   .....:     dic['a'] = 'foo'
   .....:     lis.append('foo')  # but fails for lis += ['something']
   .....:
In [104]: func()
In [105]: dic, lis
Out[105]: ({'a': 'foo'}, ['foo'])
dis.dis:
In [121]: dis.dis(func)
2 0 LOAD_CONST 1 ('foo')
3 LOAD_GLOBAL 0 (dic) # the global object dic is loaded
6 LOAD_CONST 2 ('a')
9 STORE_SUBSCR # modify the same object
3 10 LOAD_GLOBAL 1 (lis) # the global object lis is loaded
13 LOAD_ATTR 2 (append)
16 LOAD_CONST 1 ('foo')
19 CALL_FUNCTION 1
22 POP_TOP
23 LOAD_CONST 0 (None)
26 RETURN_VALUE
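The comment about lis += ['something'] can be checked with a sketch like the one below. The function names are hypothetical; the point is that += counts as an assignment, so the name becomes local to the function unless it is declared global.

```python
lis = []

def append_ok():
    lis.append('foo')        # method call on the global list: works

def iadd_fails():
    lis += ['something']     # the assignment makes lis local to this function

def iadd_with_global():
    global lis
    lis += ['something']     # now the module-level name is used

append_ok()
try:
    iadd_fails()
    raised = False
except UnboundLocalError:
    raised = True
iadd_with_global()

print(raised, lis)
```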

Creating a list in Python- something sneaky going on?

Apologies if this doesn't make any sense, I'm very new to Python!
From testing in an interpreter, I can see that list() and [] both produce an empty list:
>>> list()
[]
>>> []
[]
From what I've learned so far, the only way to create an object is to call its constructor (__init__), but I don't see this happening when I just type []. So by executing [], is Python then mapping that to a call to list()?
Those two constructs are handled quite differently:
>>> import dis
>>> def f(): return []
...
>>> dis.dis(f)
1 0 BUILD_LIST 0
3 RETURN_VALUE
>>> def f(): return list()
...
>>> dis.dis(f)
1 0 LOAD_GLOBAL 0 (list)
3 CALL_FUNCTION 0
6 RETURN_VALUE
>>>
The [] form constructs a list using the opcode BUILD_LIST, whereas the list() form calls the list object's constructor.
No, Python does not call list(). If it did, you could change what type [] creates by assigning to list, which you can't:
>>> import __builtin__
>>> __builtin__.list = set
>>> list()
set([])
>>> []
[]
[] is syntax for creating a list. It's a builtin type and it has special syntax, just like dicts and strings and ints and floats and lots of other types.
Creating instances of types can also be done by calling the type, like list() -- which will in turn call the type's constructor and initializer for you. Calling the initializer (__init__) directly does not create a new instance of the type. Calling the constructor (__new__) does, but you should not be calling it directly.
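A sketch of the constructor/initializer distinction, using a made-up class Tracked that records which special methods run. It assumes Python 3.

```python
calls = []

class Tracked:
    def __new__(cls, *args):
        calls.append("__new__")          # creates the instance
        return super().__new__(cls)

    def __init__(self, value):
        calls.append("__init__")         # initializes an existing instance
        self.value = value

obj = Tracked(1)                         # calling the type runs both
calls_after_create = list(calls)

same_before = obj
obj.__init__(2)                          # re-initializes; no new instance
same_instance = obj is same_before

print(calls_after_create, calls, obj.value)
```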
I started learning Python yesterday...
I guess you would have to say it's internally mapped:
>>> a = []
>>> type(a)
<type 'list'>
>>> a = list()
>>> type(a)
<type 'list'>
What are the key differences between using list() and []?
The most obvious and visible key difference between list() and [] is the syntax. Putting the syntax aside for a minute, someone who's new to Python or has only intermediate exposure to it might argue that they're both lists or derive from the same class; that is true. That makes it all the more important to understand the key differences between the two, most of which are outlined below.
list() is a function and [] is literal syntax.
Let’s take a look at what happens when we call list() and [] respectively through the disassembler.
>>> import dis
>>> print(dis.dis(lambda: list()))
1 0 LOAD_GLOBAL 0 (list)
3 CALL_FUNCTION 0 (0 positional, 0 keyword pair)
6 RETURN_VALUE
None
>>> print(dis.dis(lambda: []))
1 0 BUILD_LIST 0
3 RETURN_VALUE
None
The output from the disassembler above shows that the literal syntax version doesn’t require a global lookup, denoted by the op code LOAD_GLOBAL or a function call, denoted by the op code CALL_FUNCTION.
As a result, literal syntax is faster than its counterpart. Let's take a second and look at the timings below.
>>> import timeit
>>> timeit.timeit('[]', number=10**4)
0.0014592369552701712
>>> timeit.timeit('list()', number=10**4)
0.0033833282068371773
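Both observations can be reproduced with a sketch like this one. The helper opnames is made up here; exact opcode names and timings depend on the Python version and the machine, so no particular numbers should be expected.

```python
import dis
import timeit

def opnames(func):
    """Return the opcode names of a function (illustrative helper)."""
    return [ins.opname for ins in dis.get_instructions(func)]

literal_ops = opnames(lambda: [])
call_ops = opnames(lambda: list())

literal_time = timeit.timeit('[]', number=10**4)
call_time = timeit.timeit('list()', number=10**4)

print(literal_ops)
print(call_ops)
print(literal_time, call_time)
```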
It's also worth pointing out that the literal syntax [] does not unpack values. An example of unpacking is shown below.
>>> list('abc') # unpacks value
['a', 'b', 'c']
>>> ['abc'] # value remains packed
['abc']
What’s a literal in python?
Literals are notations or a way of writing constant or raw variable values which python recognises as built-in types.
Sourced from my post on PythonRight - what's the difference between list and [].
In addition to the other answers, from the Language Reference:
A list display is a possibly empty series of expressions enclosed in square brackets:
list_display ::= "[" [expression_list | list_comprehension] "]"
...
A list display yields a new list object.
It does not explicitly mention how "yielding a new list object" is implemented. It could well be a call to the list() constructor, like you mentioned. Or maybe lists, being so elementary, get special treatment, and list() is actually mapped to something different entirely.
Either way, [] is certainly not mapped to a call to the constructor of the type named __builtins__.list, because redefining that type still causes [] to return an actual list, as other answerers have shown.
Python, like most programming languages, has something called literals, meaning that special syntax can be used to write out some of the most important types of values. Very little of this is strictly necessary, but being able to write literals makes Python easier to use.
>>> 0
0
>>> int()
0
>>> 5
5
>>> int('5') # I'm using a string literal here though!
5
>>> 0.0
0.0
>>> float()
0.0
>>> ""
''
>>> str()
''
>>> u""
u''
>>> unicode()
u''
>>> ()
()
>>> tuple()
()
>>> {}
{}
>>> dict()
{}
When we make our own types (classes), we create instances of them by calling the type, just as list() does for lists. Using a literal is sort of like syntactic sugar for calling list; in reality it does the same basic things behind the scenes.
Since:
class list(object):
    """
    list() -> new empty list
    list(iterable) -> new list initialized from iterable's items
    """
If the element you put in the list is itself an iterable (e.g. a str), list() and [] don't work the same way.
So
>>> a = ['ab']
>>> b = list('ab')
>>> a[0]
'ab'
>>> b[0]
'a'

What's the difference between dict() and {}?

So let's say I wanna make a dictionary. We'll call it d. But there are multiple ways to initialize a dictionary in Python! For example, I could do this:
d = {'hash': 'bang', 'slash': 'dot'}
Or I could do this:
d = dict(hash='bang', slash='dot')
Or this, curiously:
d = dict({'hash': 'bang', 'slash': 'dot'})
Or this:
d = dict([['hash', 'bang'], ['slash', 'dot']])
And a whole other multitude of ways with the dict() function. So obviously one of the things dict() provides is flexibility in syntax and initialization. But that's not what I'm asking about.
Say I were to make d just an empty dictionary. What goes on behind the scenes of the Python interpreter when I do d = {} versus d = dict()? Is it simply two ways to do the same thing? Does using {} have the additional call of dict()? Does one have (even negligible) more overhead than the other? While the question is really completely unimportant, it's a curiosity I would love to have answered.
>>> def f():
...     return {'a' : 1, 'b' : 2}
...
>>> def g():
...     return dict(a=1, b=2)
...
>>> g()
{'a': 1, 'b': 2}
>>> f()
{'a': 1, 'b': 2}
>>> import dis
>>> dis.dis(f)
2 0 BUILD_MAP 0
3 DUP_TOP
4 LOAD_CONST 1 ('a')
7 LOAD_CONST 2 (1)
10 ROT_THREE
11 STORE_SUBSCR
12 DUP_TOP
13 LOAD_CONST 3 ('b')
16 LOAD_CONST 4 (2)
19 ROT_THREE
20 STORE_SUBSCR
21 RETURN_VALUE
>>> dis.dis(g)
2 0 LOAD_GLOBAL 0 (dict)
3 LOAD_CONST 1 ('a')
6 LOAD_CONST 2 (1)
9 LOAD_CONST 3 ('b')
12 LOAD_CONST 4 (2)
15 CALL_FUNCTION 512
18 RETURN_VALUE
dict() is apparently some C built-in. A really smart or dedicated person (not me) could look at the interpreter source and tell you more. I just wanted to show off dis.dis. :)
As far as performance goes:
>>> from timeit import timeit
>>> timeit("a = {'a': 1, 'b': 2}")
0.424...
>>> timeit("a = dict(a = 1, b = 2)")
0.889...
@Jacob: There is a difference in how the objects are allocated, but they are not copy-on-write. Python allocates a fixed-size "free list" where it can quickly allocate dictionary objects (until it fills). Dictionaries allocated via the {} syntax (or a C call to PyDict_New) can come from this free list. When the dictionary is no longer referenced it gets returned to the free list and that memory block can be reused (though the fields are reset first).
This first dictionary gets immediately returned to the free list, and the next will reuse its memory space:
>>> id({})
340160
>>> id({1: 2})
340160
If you keep a reference, the next dictionary will come from the next free slot:
>>> x = {}
>>> id(x)
340160
>>> id({})
340016
But we can delete the reference to that dictionary and free its slot again:
>>> del x
>>> id({})
340160
Since the {} syntax is handled in byte-code it can use this optimization mentioned above. On the other hand dict() is handled like a regular class constructor and Python uses the generic memory allocator, which does not follow an easily predictable pattern like the free list above.
Also, looking at compile.c from Python 2.6, with the {} syntax it seems to pre-size the hashtable based on the number of items it's storing which is known at parse time.
Basically, {} is syntax and is handled on a language and bytecode level. dict() is just another builtin with a more flexible initialization syntax. Note that dict() was only added in the middle of 2.x series.
Update: thanks for the responses. Removed speculation about copy-on-write.
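One other observable difference between {} and dict is in the ids they report: two successive dict() calls give different ids, whereas two successive {} results can report the same id. Both forms create a new dictionary each time; matching ids only mean the first dictionary was freed before the second was created and its memory was reused (see mgood's answer for when and why):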
def dict1():
    return {'a':'b'}
def dict2():
    return dict(a='b')
print id(dict1()), id(dict1())
print id(dict2()), id(dict2())
produces:
$ ./mumble.py
11642752 11642752
11867168 11867456
I'm not suggesting you try to take advantage of this or not, it depends on the particular situation, just pointing it out. (It's also probably evident from the disassembly if you understand the opcodes).
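When reading those ids, it may help to keep in mind that an id is only unique among objects that are alive at the same time. A Python 3 sketch that keeps the results alive shows that both forms hand back distinct objects; the variable names are illustrative.

```python
def dict1():
    return {'a': 'b'}

def dict2():
    return dict(a='b')

# Keep every result referenced so no memory can be reused.
first, second = dict1(), dict1()
third, fourth = dict2(), dict2()

literal_distinct = first is not second
call_distinct = third is not fourth
equal_contents = first == second == third == fourth

print(literal_distinct, call_distinct, equal_contents)
```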
dict() is used when you want to create a dictionary from an iterable, like :
dict( generator which yields (key,value) pairs )
dict( list of (key,value) pairs )
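A short sketch of those two call forms, with sample pairs borrowed from the question above. The variable names are illustrative.

```python
pairs = [('hash', 'bang'), ('slash', 'dot')]

# From a list of (key, value) pairs.
from_list = dict(pairs)

# From a generator that yields (key, value) pairs.
from_generator = dict((key, value.upper()) for key, value in pairs)

print(from_list)
print(from_generator)
```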
Funny usage:
def func(**kwargs):
    for e in kwargs:
        print(e)
a = 'I want to be printed'
kwargs={a:True}
func(**kwargs)
a = 'I dont want to be printed'
kwargs=dict(a=True)
func(**kwargs)
output:
I want to be printed
a
To create an empty set you have to call set(), because there is no literal for it.
Empty curly braces, {}, always create an empty dict, never an empty set.
Let's go with an example:
print isinstance({},dict)
True
print isinstance({},set)
False
print isinstance(set(),set)
True
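The same check written for Python 3, as a sketch. It also shows that non-empty braces without colons do produce a set; the variable names are made up.

```python
empty_braces = {}
empty_set = set()
non_empty_set = {1, 2}

braces_is_dict = isinstance(empty_braces, dict)
braces_is_set = isinstance(empty_braces, set)
literal_is_set = isinstance(non_empty_set, set)

print(braces_is_dict, braces_is_set, isinstance(empty_set, set), literal_is_set)
```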
