I am trying these in the python shell and am getting quite confusing results.
>>> p = [1, 2, 3, 4, 5, 6, 7, 8]
>>> p
[1, 2, 3, 4, 5, 6, 7, 8]
>>> p[2:8:2]
[3, 5, 7]
>>> id(p[2:8:2])
37798416
>>> id(p[2:8:2])
37798416
>>> id(p[2:8:2])
50868392
Note how the id changed the 3rd time !
>>> id(p[2:8:2])
37798336
And changed again !
Question#1: How and why did that happen ?
Question#2:
>>> p[2:8:2] = [33,55,77]
>>> p
[1, 2, 33, 4, 55, 6, 77, 8]
How exactly does Python "store" p[2:8:2]? (Maybe "store" is not the right word, but I hope you get the idea.) It does not look like a distinct list from the original list (though it is made up of non-sequential immutable items from the original list), as changes to this list are reflected in the original list!
Slicing, with rare exception, makes brand new copies of whatever you're slicing. So all the id checks are telling you is that sometimes the new list reuses the memory from last time, and sometimes it uses a different bit of memory. The exact behavior is pure implementation detail. In CPython (the reference interpreter) id happens to correspond to memory addresses, so all you're seeing is a behavioral artifact of the allocator, not some deep meaning to slicing.
On your question #2: When used in an assignment context, slicing modifies the original sequence; it doesn't create a new list at all. Don't try to draw meaningful parallels between slicing (read oriented, makes new sequences) and slice assignment (write oriented, modifies existing sequences); the behaviors under the hood are different in almost every way.
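A minimal sketch of that difference, reusing the list from the question (the names s, before and same_object are only illustrative):

```python
p = [1, 2, 3, 4, 5, 6, 7, 8]

# Reading a slice builds a new, independent list.
s = p[2:8:2]
s[0] = 99                       # changes only the copy, p is untouched

# Assigning to a slice writes into the existing list instead.
before = id(p)
p[2:8:2] = [33, 55, 77]
same_object = id(p) == before   # p is still the very same list object

print(s, p, same_object)
```

The copy and the original diverge after the first step, while the slice assignment leaves the identity of p unchanged.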
For question 1:
The id of an object is guaranteed to both be unique and stay constant during the lifetime of that object. See here in the Python library docs:
id(object) - return the identity of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.
Since you're creating and destroying objects with your slicing, the id is actually following the rules.
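To see the lifetime rule at work, a small sketch (variable names are illustrative): if both slices are kept alive by binding them to names, their ids can no longer coincide.

```python
p = [1, 2, 3, 4, 5, 6, 7, 8]

# Keep both slices alive by binding them to names.
a = p[2:8:2]
b = p[2:8:2]

# Two objects that exist at the same time can never share an id.
distinct = id(a) != id(b)

print(a == b, a is b, distinct)   # equal values, different objects
```

In the question, each temporary list was discarded right after id() returned, which is why a later slice was free to reuse the same address.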
If you're using the reference (and I suspect most common) implementation, CPython, it simply gives you the memory address of the object. The source code can be found in Python/bltinmodule.c, simplified and annotated below:
static PyObject *builtin_id(PyModuleDef *self, PyObject *v) {
    PyObject *id = PyLong_FromVoidPtr(v); // Turn object address into
    return id;                            // long and return it.
}
That ensures that it's unique and the vagaries and order of memory allocation calls also explain why it can repeat and/or be different.
For question 2:
Assigning to the "slice" does not actually involve creating the sliced object and assigning to it. It simply sets certain values in the already existing object as specified by the slice notation to those given on the right hand side of the assignment.
More detail can be found in the sliceobject files in the CPython source code, specifically Objects/sliceobject.c and Include/sliceobject.h. These involve the creation of a PySliceObject which consists of a {start, stop, step} tuple.
When you apply this tuple to an object on the right hand side of an assignment, such as x = y[2:8:2], it uses the PySliceObject to create a new list x based on y, getting only the relevant elements.
When used on the left hand side, such as x[2:8:2] = [33,55,77], it uses the PySliceObject to decide which elements of x are set to the values on the right.
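The same slice object is also available from Python code through the built-in slice type, so both uses can be sketched without reading the C source (the names picked and bounds are illustrative):

```python
p = [1, 2, 3, 4, 5, 6, 7, 8]
s = slice(2, 8, 2)              # the object behind the 2:8:2 notation

picked = p[s]                   # same as p[2:8:2]: builds a new list
bounds = s.indices(len(p))      # (start, stop, step) resolved for this length

p[s] = [33, 55, 77]             # same as p[2:8:2] = [...]: writes into p

print(picked, bounds, p)
```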
Related
So if I have a list a and append a to it, I will get a list that contains a reference to itself.
>>> a = [1,2]
>>> a.append(a)
>>> a
[1, 2, [...]]
>>> a[-1][-1][-1]
[1, 2, [...]]
And this basically results in seemingly infinite recursions.
And not only in lists, dictionaries as well:
>>> b = {'a':1,'b':2}
>>> b['c'] = b
>>> b
{'a': 1, 'b': 2, 'c': {...}}
It could have been a good way to store the list in last element and modify other elements, but that wouldn't work as the change will be seen in every recursive reference.
I get why this happens, i.e. due to their mutability. However, I am interested in actual use-cases of this behavior. Can somebody enlighten me?
The use case is that Python is a dynamically typed language, where anything can reference anything, including itself.
List elements are references to other objects, just like variable names and attributes and the keys and values in dictionaries. The references are not typed, variables or lists are not restricted to only referencing, say, integers or floating point values. Every reference can reference any valid Python object. (Python is also strongly typed, in that the objects have a specific type that won't just change; strings remain strings, lists stay lists).
So, because Python is dynamically typed, the following:
foo = []
# ...
foo = False
is valid, because foo isn't restricted to a specific type of object, and the same goes for Python list objects.
The moment your language allows this, you have to account for recursive structures, because containers are allowed to reference themselves, directly or indirectly. The list representation takes this into account by not blowing up when you do this and ask for a string representation. It is instead showing you a [...] entry when there is a circular reference. This happens not just for direct references either, you can create an indirect reference too:
>>> foo = []
>>> bar = []
>>> foo.append(bar)
>>> bar.append(foo)
>>> foo
[[[...]]]
foo is the outermost [/] pair and the [...] entry. bar is the [/] pair in the middle.
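Your own container classes can get similar protection. A small sketch using reprlib.recursive_repr from the standard library (the Node class is a made-up example, not something from the answer above):

```python
from reprlib import recursive_repr

class Node:
    """Hypothetical container used only for illustration."""

    def __init__(self, name):
        self.name = name
        self.children = []

    @recursive_repr()   # returns '...' when __repr__ is re-entered for the same object
    def __repr__(self):
        return f"Node({self.name!r}, {self.children!r})"

root = Node("root")
root.children.append(root)   # direct self-reference
text = repr(root)            # terminates instead of recursing forever

print(text)
```

Without the decorator, repr(root) would keep calling itself through the children list until a RecursionError is raised.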
There are plenty of practical situations where you'd want a self-referencing (circular) structure. The built-in OrderedDict object uses a circular linked list to track item order, for example. This is not normally easily visible as there is a C-optimised version of the type, but we can force the Python interpreter to use the pure-Python version (you want to use a fresh interpreter, this is kind-of hackish):
>>> import sys
>>> class ImportFailedModule:
...     def __getattr__(self, name):
...         raise ImportError
...
>>> sys.modules["_collections"] = ImportFailedModule() # block the extension module from being loaded
>>> del sys.modules["collections"] # force a re-import
>>> from collections import OrderedDict
now we have a pure-python version we can introspect:
>>> od = OrderedDict()
>>> vars(od)
{'_OrderedDict__hardroot': <collections._Link object at 0x10a854e00>, '_OrderedDict__root': <weakproxy at 0x10a861130 to _Link at 0x10a854e00>, '_OrderedDict__map': {}}
Because this ordered dict is empty, the root references itself:
>>> od._OrderedDict__root.next is od._OrderedDict__root
True
just like a list can reference itself. Add a key or two and the linked list grows, but remains linked to itself, eventually:
>>> od["foo"] = "bar"
>>> od._OrderedDict__root.next is od._OrderedDict__root
False
>>> od._OrderedDict__root.next.next is od._OrderedDict__root
True
>>> od["spam"] = 42
>>> od._OrderedDict__root.next.next is od._OrderedDict__root
False
>>> od._OrderedDict__root.next.next.next is od._OrderedDict__root
True
The circular linked list makes it easy to alter the key ordering without having to rebuild the whole underlying hash table.
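From the outside, this reordering is exposed through OrderedDict.move_to_end. A short usage sketch (the variable names are illustrative):

```python
from collections import OrderedDict

od = OrderedDict()
od["foo"] = "bar"
od["spam"] = 42

od.move_to_end("foo")                # "foo" becomes the last key
after_move = list(od)

od.move_to_end("foo", last=False)    # and back to the front
restored = list(od)

print(after_move, restored)
```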
However, I am interested in actual use-cases of this behavior. Can somebody enlighten me?
I don't think there are many useful use-cases for this. The reason this is allowed is because there could be some actual use-cases for it and forbidding it would make the performance of these containers worse or increase their memory usage.
Python is dynamically typed and you can add any Python object to a list. That means one would need to make special precautions to forbid adding a list to itself. This is different from (most) typed-languages where this cannot happen because of the typing-system.
So in order to forbid such recursive data-structures one would either need to check on every addition/insertion/mutation if the newly added object already participates in a higher layer of the data-structure. That means in the worst case it has to check if the newly added element is anywhere where it could participate in a recursive data-structure. The problem here is that the same list can be referenced in multiple places and can be part of multiple data-structures already and data-structures such as list/dict can be (almost) arbitrarily deep. That detection would be either slow (e.g. linear search) or would take quite a bit of memory (lookup). So it's cheaper to simply allow it.
The reason why Python detects this when printing is that you don't want the interpreter entering an infinite loop, or get a RecursionError, or StackOverflow. That's why for some operations like printing (but also deepcopy) Python temporarily creates a lookup to detect these recursive data-structures and handles them appropriately.
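Both operations can be tried on the self-referencing list from the question. A minimal sketch (names are illustrative):

```python
import copy

a = [1, 2]
a.append(a)                 # a now contains itself

shown = repr(a)             # the repr marks the cycle instead of looping
b = copy.deepcopy(a)        # deepcopy tracks already-copied objects in a memo dict

print(shown)
print(b is not a, b[2] is b)
```

The copy is a new list whose last element refers to the copy itself, so the cyclic shape is preserved rather than unrolled.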
Consider building a state machine that parses a string of digits and checks whether the number is divisible by 25. You could model each node as a list with 10 outgoing edges (one per digit), with some of those edges pointing back to the node itself:
def canDiv25(s):
    # n0: start/other, n1: last digit 0 or 5, n2: last digit 2 or 7, n1g: accepting
    n0,n1,n1g,n2=[],[],[],[]
    n0.extend((n1,n0,n2,n0,n0,n1,n0,n2,n0,n0))
    n1.extend((n1g,n0,n2,n0,n0,n1,n0,n2,n0,n0))
    n1g.extend(n1)  # the accepting node has the same outgoing edges as n1
    n2.extend((n1,n0,n2,n0,n0,n1g,n0,n2,n0,n0))
    cn=n0
    for c in s:
        cn=cn[int(c)]  # follow the edge for this digit
    return cn is n1g
for i in range(144):
    print("%d %d"%(i,canDiv25(str(i))),end='\t')
While this state machine by itself has little practical use, it shows what is possible. Alternatively, you could have a simple adventure game where each room is represented as a dictionary: you can go NORTH, for example, and the room you arrive in of course has a back link going SOUTH. Game developers also sometimes simulate a tricky path in a dungeon by making the NORTH exit of a room point to the room itself.
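A rough sketch of the adventure-game idea, with made-up room names and a hypothetical walk helper:

```python
# Hypothetical rooms for illustration; each room is a dict of exits.
hall = {"name": "hall"}
cellar = {"name": "cellar"}
maze = {"name": "maze"}

hall["NORTH"] = cellar
cellar["SOUTH"] = hall       # back link: hall and cellar now reference each other
cellar["NORTH"] = maze
maze["SOUTH"] = cellar
maze["NORTH"] = maze         # tricky path: going north leaves you in the maze

def walk(room, moves):
    """Follow a sequence of directions and return the room reached."""
    for direction in moves:
        room = room[direction]
    return room

end = walk(hall, ["NORTH", "NORTH", "NORTH", "NORTH", "SOUTH", "SOUTH"])
print(end["name"])
```

Both the indirect cycle (hall and cellar) and the direct one (maze pointing at itself) are ordinary dictionary entries; nothing special is needed to create them.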
A very simple application of this would be a circular linked list where the last node in a list references the first node. These are useful for creating infinite resources, state machines or graphs in general.
def to_circular_list(items):
    head, *tail = items
    first = { "elem": head }
    current = first
    for item in tail:
        current['next'] = { "elem": item }
        current = current['next']
    current['next'] = first
    return first
to_circular_list([1, 2, 3, 4])
If it's not obvious how that relates to having a self-referencing object, think about what would happen if you only called to_circular_list([1]), you would end up with a data structure that looks like
item = {
    "elem": 1,
    "next": item
}
If the language didn't support this kind of direct self referencing, it would be impossible to use circular linked lists and many other concepts that rely on self references as a tool in Python.
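Note that the dictionary literal shown above is only a picture of the result: the name item does not exist yet while the literal is being evaluated, so the self-reference has to be added after the object is created. A minimal sketch (the walk generator is a hypothetical helper):

```python
from itertools import islice

item = {"elem": 1}
item["next"] = item          # the single node now points at itself

def walk(node):
    """Yield elements forever by following 'next' links."""
    while True:
        yield node["elem"]
        node = node["next"]

first_five = list(islice(walk(item), 5))
print(first_five)
```

Because the structure is circular, the traversal never runs out of nodes; islice is what bounds it here.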
The reason this is possible is simply because the syntax of Python doesn't prohibit it, much in the way any C or C++ object can contain a reference to itself. An example might be: https://www.geeksforgeeks.org/self-referential-structures/
As @MSeifert said, you will generally get a RecursionError at some point if you keep descending into the list from itself. Code that uses a pattern like this:
a = [1, 2]
a.append(a)
def loop(l):
    for item in l:
        if isinstance(item, list):
            loop(item)
        else:
            print(item)
will eventually crash once loop(a) is called, unless there is some terminating condition. (print(a) itself does not crash, because the list repr detects the cycle and shows [...].) However:
a = [1, 2]
while True:
    for item in a:
        print(item)
will run infinitely with the same expected output as the above. Very few recursive problems don't unravel into a simple while loop. For an example of recursive problems that do require a self-referential structure, look up Ackermann's function: http://mathworld.wolfram.com/AckermannFunction.html. This function could be modified to use a self-referential list.
There is certainly precedent for self-referential containers or tree structures, particularly in math, but on a computer they are all limited by the size of the call stack and CPU time, making it impractical to investigate them without some sort of constraint.
I am new to Python from R. I have recently spent a lot of time reading up on how everything in Python is an object, objects can call methods on themselves, methods are functions within a class, yada yada yada.
Here's what I don't understand. Take the following simple code:
mylist = [3, 1, 7]
If I want to know how many times the number 7 occurs, I can do:
mylist.count(7)
That, of course, returns 1. And if I want to save the count number to another variable:
seven_counts = mylist.count(7)
So far, so good. Other than the syntax, the behavior is similar to R. However, let's say I am thinking about adding a number to my list:
mylist.append(9)
Wait a minute, that method actually changed the variable itself! (i.e., "mylist" has been altered and now includes the number 9 as the fourth element in the list.) Assigning the result to a new variable (like I did with seven_counts) produces garbage:
newlist = mylist.append(9)
I find the inconsistency in this behavior a bit odd, and frankly undesirable. (Let's say I wanted to see what the result of the append looked like first and then have the option to decide whether or not I want to assign it to a new variable.)
My question is simple:
Is there a way to know in advance if calling a particular method will actually alter your variable (object)?
Aside from reading the documentation (which for some methods will include type annotations specifying the return value) or playing with the method in the interactive interpreter (including using help() to check the docstring for a type annotation), no, you can't know up front just by looking at the method.
That said, the behavior you're seeing is intentional. Python methods either return a new modified copy of the object or modify the object in place; at least among built-ins, they never do both (some methods mutate the object and return a non-None value, but it's never the object just mutated; the pop method of dict and list is an example of this case).
This either/or behavior is intentional; if they didn't obey this rule, you'd have had an even more confusing and hard to identify problem, namely, determining whether append mutated the value it was called on, or returned a new object. You definitely got back a list, but is it a new list or the same list? If it mutated the value it was called on, then
newlist = mylist.append(9)
is a little strange; newlist and mylist would be aliases to the same list (so why have both names?). You might not even notice for a while; you'd continue using newlist, thinking it was independent of mylist, only to look at mylist and discover it was all messed up. By having all such "modify in place" methods return None (or at least, not the original object), the error is discovered more quickly/easily; if you try and use newlist, mistakenly believing it to be a list, you'll immediately get TypeErrors or AttributeErrors.
Basically, the only way to know in advance is to read the documentation. For methods whose name indicates a modifying operation, you can check the return value and often get an idea as to whether they're mutating. It helps to know what types are mutable in the first place; list, dict, set and bytearray are all mutable, and the methods they have that their immutable counterparts (aside from dict, which has no immutable counterpart) lack tend to mutate the object in place.
The default tends to be to mutate the object in place simply because that's more efficient; if you have a 100,000 element list, a default behavior for append that made a new 100,001 element list and returned it would be extremely inefficient (and there would be no obvious way to avoid it). For immutable types (e.g. str, tuple, frozenset) this is unavoidable, and you can use those types if you want a guarantee that the object is never mutated in place, but it comes at a cost of unnecessary creation and destruction of objects that will slow down your code in most cases.
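A short sketch of the two families of operations on the list from the question (the names result, ordered, longer and sort_result are illustrative):

```python
mylist = [3, 1, 7]

# In-place methods mutate the list and return None.
result = mylist.append(9)

# Functions and operators that build a new object leave the original alone.
ordered = sorted(mylist)     # new sorted list
longer = mylist + [5]        # new list via concatenation

sort_result = mylist.sort()  # in-place counterpart of sorted(): returns None

print(result, ordered, longer, sort_result, mylist)
```

The sorted()/sort() pair is a handy reminder of the convention: the function returns a new list, the method changes the list it is called on and returns None.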
Just check out the docs:
>>> list.count.__doc__
'L.count(value) -> integer -- return number of occurrences of value'
>>> list.append.__doc__
'L.append(object) -> None -- append object to end'
There isn't really an easy way to tell, but:
immutable object --> no way of changing through method calls
So, for example, tuple has no methods which affect the tuple as it is unchangeable so methods can only return new instances.
And if you "wanted to see what the result of the append looked like first and then have the option to decide whether or not I want to assign it to a new variable" then you can concatenate the list with a new list with one element.
i.e.
>>> l = [1,2,3]
>>> k = l + [4]
>>> l
[1, 2, 3]
>>> k
[1, 2, 3, 4]
Not from merely your invocation (your method call). You can guarantee that the method won't change the object if you pass in only immutable objects, but some methods are defined to change the object -- and will either not be defined for the one you use, or will fault in execution.
In Real Life, you look at the method's documentation: that will tell you exactly what happens.
[I was about to include what Joe Iddon's answer covers ...]
An introductory Python textbook defined 'object reference' as follows, but I didn't understand:
An object reference is nothing more than a concrete representation of the object’s identity (the memory address where the object is stored).
The textbook tried illustrating this by using an arrow to show an object reference as some sort of relation going from a variable a to an object 1234 in the assignment statement a = 1234.
From what I gathered off of Wikipedia, the (object) reference of a = 1234 would be an association between a and 1234 where a was "pointing" to 1234 (feel free to clarify "reference vs. pointer"), but it has been a bit difficult to verify as (1) I'm teaching myself Python, (2) many search results talk about references for Java, and (3) not many search results are about object references.
So, what is an object reference in Python? Thanks for the help!
Whatever is associated with a variable name has to be stored in the program's memory somewhere. An easy way to think of this is that every byte of memory has an index-number. For simplicity's sake, let's imagine a simple computer where these index-numbers go from 0 (the first byte) upwards to however many bytes there are.
Say we have a sequence of 37 bytes, that a human might interpret as some words:
"The Owl and the Pussy-cat went to sea"
The computer is storing them in a contiguous block, starting at some index-position in memory. This index-position is most often called an "address". Obviously this address is absolutely just a number, the byte-number of the memory these letters are residing in.
#12000 The Owl and the Pussy-cat went to sea
So at address 12000 is a T, at 12001 an h, 12002 an e ... up to the last a at 12036.
I am labouring the point here because it's fundamental to every programming language. That 12000 is the "address" of this string. It's also a "reference" to its location. For most intents and purposes an address is a pointer is a reference. Different languages have differing syntactic handling of these, but essentially they're the same thing - dealing with a block of data at a given number.
Python and Java try to hide this addressing as much as possible, where languages like C are quite happy to expose pointers for exactly what they are.
The take-away from this, is that an object reference is the number of where the data is stored in memory. (As is a pointer.)
Now, most programming languages distinguish between simple types: characters and numbers, and complex types: strings, lists and other compound-types. This is where the reference to an object makes a difference.
So when performing operations on simple types, they are independent, they each have their own memory for storage. Imagine the following sequence in python:
>>> a = 3
>>> b = a
>>> b
3
>>> b = 4
>>> b
4
>>> a
3 # <-- original has not changed
The variables a and b do not share the memory where their values are stored. But with a complex type:
>>> s = [ 1, 2, 3 ]
>>> t = s
>>> t
[1, 2, 3]
>>> t[1] = 8
>>> t
[1, 8, 3]
>>> s
[1, 8, 3] # <-- original HAS changed
We assigned t to be s, but obviously in this case t is s - they share the same memory. Wait, what! Here we have found out that both s and t are a reference to the same object - they simply share (point to) the same address in memory.
One place Python differs from other languages is that its strings are immutable, so in examples like this one they behave like numbers:
>>> j = 'Pussycat'
>>> k = j
>>> k
'Pussycat'
>>> k = 'Owl'
>>> j
'Pussycat' # <-- Original has not changed
Whereas in C strings are definitely handled as complex types, and would behave like the Python list example.
The upshot of all this, is that when objects that are handled by reference are modified, all references-to this object "see" the change. So if the object is passed to a function that modifies it (i.e.: the content of memory holding the data is changed), the change is reflected outside that function too.
But a simple type cannot be modified in place. When one is passed to a function, an operation such as += only rebinds the function's local name to a new value, so the change is not seen in the original.
For example:
def fnA( my_list ):
    my_list.append( 'A' )
a_list = [ 'B' ]
fnA( a_list )
print( str( a_list ) )
['B', 'A'] # <-- a_list was changed inside the function
But:
def fnB( number ):
    number += 1
x = 3
fnB( x )
print( x )
3 # <-- x was NOT changed inside the function
So keeping in mind that the memory of "objects" that are used by reference is shared by all copies, and memory of simple types is not, it's fairly obvious that the two types operate differently.
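One way to check this behaviour yourself is to compare object identities with the is operator. The sketch below (function and variable names are made up) suggests that what matters in practice is whether code mutates an object in place or merely rebinds a name:

```python
def mutate(values):
    values.append("A")       # changes the object the caller also sees

def rebind(values):
    values = values + ["A"]  # builds a new list and rebinds only the local name

shared = ["B"]
alias = shared               # a second name for the same list
mutate(shared)
rebind(shared)               # no visible effect outside the function

n = 3
m = n                        # m and n name the same int object
m += 1                       # ints are immutable: m is rebound, n is untouched

print(shared, alias is shared, n, m)
```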
Objects are things. Generally, they're what you see on the right hand side of an equation.
Variable names (often just called "names") are references to the actual object. When a name is on the right hand side of an equation [1], the object that it references is automatically looked up and used in the equation. The result of the expression on the right hand side is an object. The name on the left hand side of the equation becomes a reference to this (possibly new) object.
Note, you can have object references that aren't explicit names if you are working with container objects (like lists or dictionaries):
a = [] # the name a is a reference to a list.
a.append(12345) # the container list holds a reference to an integer object
In a similar way, multiple names can refer to the same object:
a = []
b = a
We can demonstrate that they are the same object by looking at the id of a and b and noting that they are the same. Or, we can look at the "side-effects" of mutating the object referenced by a or b (if we mutate one, we mutate both because they reference the same object).
a.append(1)
print(a, b) # look mom, both are [1]!
[1] More accurately, when a name is used in an expression.
In python, strictly speaking, the language has only naming references to the objects, that behave as labels. The assignment operator only binds to the name. The objects will stay in the memory until they are garbage collected
Ok, first things first.
Remember, there are two types of objects in python.
Mutable : Whose values can be changed. Eg: dictionaries, lists and user defined objects(unless defined immutable)
Immutable : Whose values can't be changed. Eg: tuples, numbers, booleans and strings.
Now, when python says PASS BY OBJECT REFERENCE, just remember that
If the underlying object is mutable, then any modifications done will persist.
and,
If the underlying object is immutable, then any modifications done will not persist.
>>> d
{1: 1, 2: 2, 3: 3}
>>> lst = [d, d]
>>> c=lst[0]
>>> c[1]=5
>>> lst
[{1: 5, 2: 2, 3: 3}, {1: 5, 2: 2, 3: 3}]
When lst = [d, d], are lst[0] and lst[1] both references to the memory block of d, instead of creating two memory blocks and copying the content of d to them respectively?
When c=lst[0], is c just a reference to the memory occupied by lst[0], instead of creating a new memory block and copy the content from lst[0]?
In Python, when is a reference created to point to an existing memory block, and when is a new memory block allocated and then copy?
This language feature of Python is different from C. What is the name of this language feature?
Thanks.
All variables (and other containers, such as dictionaries, lists, and object attributes) hold references to objects. Memory allocation occurs when the object is instantiated. Simple assignment always creates another reference to the existing object. For example, if you have:
a = [1, 2, 3]
b = a
Then b and a point to the same object, a list. You can verify this using the is operator:
print(b is a) # True
If you change a, then b changes too, because they are two names for the same object.
a.append(4)
print(b[3] == 4) # True
print(b[3] is a[3]) # also True
If you want to create a copy, you must do so explicitly. Here are some ways of doing this:
For lists, use a slice: b = a[:].
For many types, you can use the type name to copy an existing object of that type: b = list(a). When creating your own classes, this is a good approach to take if you need copy functionality.
The copy module has functions that can be used to copy objects (either shallowly or deeply).
For immutable types, such as strings, numbers, and tuples, there is never any need to make a copy. You can only "change" these kinds of values by referencing different ones.
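The difference between the shallow and deep options listed above only shows up with nested objects. A minimal sketch (names are illustrative):

```python
import copy

a = [[1, 2], [3]]

shallow = a[:]               # list(a) and copy.copy(a) behave the same way
deep = copy.deepcopy(a)      # also copies the nested lists

a.append([4])                # invisible to both copies
a[0].append(99)              # visible through the shallow copy only

print(shallow, deep)
```

A shallow copy is a new outer list holding references to the same inner lists; a deep copy duplicates the inner lists as well.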
The best way of describing this is probably "everything's an object." In C, "primitive" types like integers are treated differently from arrays. In Python, they are not: all values are stored as references to objects—even integers.
This paragraph from the Python tutorial should help clear things up for you:
Objects have individuality, and multiple names (in multiple scopes)
can be bound to the same object. This is known as aliasing in other
languages. This is usually not appreciated on a first glance at
Python, and can be safely ignored when dealing with immutable basic
types (numbers, strings, tuples). However, aliasing has a possibly
surprising effect on the semantics of Python code involving mutable
objects such as lists, dictionaries, and most other types. This is
usually used to the benefit of the program, since aliases behave like
pointers in some respects. For example, passing an object is cheap
since only a pointer is passed by the implementation; and if a
function modifies an object passed as an argument, the caller will see
the change — this eliminates the need for two different argument
passing mechanisms as in Pascal.
To answer your individual questions in more detail:
When lst = [d, d], are lst[0] and lst[1] both references to the memory block of d, instead of creating two memory blocks and copy the content of d to them respectively?
No. They don't refer to the memory block of d. lst[0] and lst[1] are aliasing the same object as d, at that point in time. Proof: If you assign d to a new object after initializing the list, lst[0] and lst[1] will be unchanged. If you mutate the object aliased by d, then the mutation is visible through lst[0] and lst[1], because they alias the same object.
When c=lst[0], is c just a reference to the memory occupied by lst[0], instead of creating a new memory block and copy the content from lst[0]?
Again no. It's not a reference to the memory occupied by lst[0]. Proof: if you assign lst[0] to a new object, c will be unchanged. If you modify a mutable object (like the dictionary that lst[0] points to) you will see the change in c, because c is referring to the same object, the original dictionary.
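Both proofs can be run directly. A small sketch using the objects from the question (original and all_same are extra names added only so identity can still be checked after d is rebound):

```python
d = {1: 1, 2: 2, 3: 3}
lst = [d, d]
c = lst[0]
original = d                 # extra name so we can keep checking identity

all_same = lst[0] is d and lst[1] is d and c is d

d = {"other": True}          # rebinding d does not touch the list
lst[0] = "replaced"          # rebinding lst[0] does not touch c

c[1] = 5                     # mutating the dict is visible via every alias

print(all_same, lst, c)
```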
In Python, when is a reference created to point to an existing memory block, and when is a new memory block allocated and then copy?
Python doesn't really work with "memory blocks" in the same way that C does. It is an abstraction away from that. Whenever you create a new object, and assign it to a variable, you've obviously got memory allocated for that object. But you will never work with that memory directly, you work with references to the objects in that memory.
Those references are the values that get assigned to symbolic names, AKA variables, AKA aliases. "pass-by-reference" is a concept from pointer-based languages like C and C++, and does not apply to Python. There is a blog post which I believe covers this topic the best.
It is often argued whether Python is pass-by-value, pass-by-reference, or pass-by-object-reference. The truth is that it doesn't matter how you think of it, as long as you understand that the entire language specification is just an abstraction for working with names and objects. Java and Ruby have similar execution models, but the Java docs call it pass-by-value while the Ruby docs call it pass-by-reference. The Python docs remain neutral on the subject, so it's best not to speculate and just see things for what they are.
This language feature of Python is different from C. What is the name of this language feature?
Associating names with objects is known as name binding. Allowing multiple names (in potentially multiple scopes) to be bound to the same object is known as aliasing. You can read more about aliasing in the Python tutorial and on Wikipedia.
It might also be helpful to read the execution model documentation, which covers name binding and scopes in more detail.
In short; Python is pass-by-reference. Objects are created and memory allocated upon their construction. Referencing objects does not allocate more memory unless you are either creating new objects or expanding existing objects (list.append())
This post Is Python pass-by-reference or pass-by-value covers it very well.
As a side note: if you are worried about how memory is allocated in a managed programming language like Python, then you're probably using the wrong language and/or prematurely optimizing. Also, how memory is managed in Python is implementation specific, as there are many implementations of Python: CPython (what you are probably using), Jython, IronPython, PyPy, MicroPython, etc.
There is a lot of confusion with python names in the web and documentation doesn't seem to be that clear about names. Below are several things I read about python names.
names are references to objects (where are they? heap?) and what name holds is an address. (like Java).
names in python are like C++ references ( int& b), which means that a name is another alias for a memory location; i.e. for int a, a is a memory location. int& b = a means that b is another name for the same memory location
names are very similar to automatically dereferenced pointers variables in C.
Which of the above statements is/are correct?
Do Python names contain some kind of address in them or is it just a name to a memory location (like C++ & references)?
Where are python names stored, Stack or heap?
EDIT:
Check out the below lines from http://etutorials.org/Programming/Python.+Text+processing/Appendix+A.+A+Selective+and+Impressionistic+Short+Review+of+Python/A.2+Namespaces+and+Bindings/#
Whenever a (possibly qualified) name occurs on the right side of an assignment, or on a line by itself, the name is dereferenced to the object itself. If a name has not been bound inside some accessible scope, it cannot be dereferenced; attempting to do so raises a NameError exception. If the name is followed by left and right parentheses (possibly with comma-separated expressions between them), the object is invoked/called after it is dereferenced. Exactly what happens upon invocation can be controlled and overridden for Python objects; but in general, invoking a function or method runs some code, and invoking a class creates an instance. For example:
pkg.subpkg.func() # invoke a function from a namespace
x = y # deref 'y' and bind same object to 'x'
This makes sense. I just want to cross-check how true it is. Comments and answers please.
names are references to objects
Yes. You shouldn't care where the objects live if you just want to understand Python variables' semantics; they're somewhere in memory and Python implementations manage memory for you. How they do that depends on the implementation (CPython, Jython, PyPy...).
names in python are like C++ references
Not exactly. Reassigning a reference in C++ actually reassigns the memory location referenced, e.g. after
int i = 0;
int &r = i;
r = 1;
it is true that i == 1. You can't do this in Python except by using a mutable container object. The closest you can get to the C++ reference behavior is
i = [0] # single-element list
r = i # r is now another reference to the object referenced by i
r[0] = 1 # sets i[0]
are very similar to automatically dereferenced pointers variables in C
No, because then they'd be similar to C++ references in the above regard.
Does Python names contain some kind of address in them or is it just a name to a memory location?
The former is closer to the truth, assuming a straightforward implementation (again, PyPy might do things differently than CPython). In any case, Python variables are not storage locations, but rather labels/names for objects that may live anywhere in memory.
Every object in a Python process has an identity that can be obtained using the id function, which in CPython returns its memory address. You can check whether two variables (or expressions more generally) reference the same object by checking their id, or more directly by using is:
>>> i = [1, 2]
>>> j = i # a new reference
>>> i is j # same identity?
True
>>> j = [1, 2] # a new list
>>> i == j # same value?
True
>>> i is j # same identity?
False
Python names are, well, names. You have objects and names, that's it.
Creating an object, say [3, 4, 5] creates an object somewhere on the heap. You don't need to know how. Now you can put names to target this object, by assigning it to names:
x = [3, 4, 5]
That is, the assignment operator assigns names rather than values. x isn't [3, 4, 5], no, it's simply a name pointing to the [3, 4, 5] object. So doing this:
x = 1
Doesn't change the original [3, 4, 5] object; instead it assigns the object 1 to the name x. Also note that most expressions, like [3, 4, 5] but also 8 + 3, create temporaries. Unless you assign a name to that temporary it will immediately die. Apart from implementation details (CPython caches small integers, for example), there is no mechanism that keeps unreferenced objects alive and caches them. For example, this fails:
>>> x = [3, 4, 5]
>>> x is [3, 4, 5] # must be some object, right? no!
False
However, that's merely assignment (which is not overloadable in Python). In fact, objects and names in Python very well behave like automatically dereferencing pointers, except that they are automatically reference counted and die after they're not referenced anymore (in CPython, at least) and that they do not automatically dereference on assignment.
Thanks to this memory model, for example the C++ way of overloading index operations doesn't work. Instead, Python uses __setitem__ and __getitem__ because you can't return anything that's "assignable". Furthermore, the operators +=, *=, etc... work by creating temporaries and assigning that temporary back to the name.
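One nuance that may be worth checking for yourself: for mutable types that define __iadd__, such as list, augmented assignment can modify the existing object in place before rebinding the name, while immutable types such as tuple always produce a new object. A small sketch (names are illustrative):

```python
x = [1, 2]
alias = x
x += [3]                     # list.__iadd__ extends in place, then rebinds x to the same list
list_same = x is alias

t = (1, 2)
alias_t = t
t += (3,)                    # tuples are immutable: a new tuple is created and bound to t
tuple_same = t is alias_t

print(alias, list_same, alias_t, tuple_same)
```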
Python objects are stored on a heap and are garbage collected via reference counting.
Variables are references to objects like in Java, and thus point 1 applies. I am not familiar with either C++ or automatically dereferenced pointer variables in C, to make a call on those.
Ultimately, it's the python interpreter that does the looking up of items in the interpreter structures, which usually are python lists and dictionaries and other such abstract containers; namespaces use dict (a hash table) for example, where the names and values are pointers to other python objects. These are managed explicitly by the mapping protocol.
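The dict-backed nature of namespaces can be observed from Python code. A minimal sketch, using types.SimpleNamespace as a stand-in namespace (chosen here only for illustration):

```python
import types

ns = types.SimpleNamespace()     # stand-in namespace, used only for illustration
ns.x = [3, 4, 5]                 # binds the name "x" inside ns

mapping = vars(ns)               # the dict that backs the namespace
same = mapping["x"] is ns.x      # the dict value is a reference to the same list

mapping["y"] = ns.x              # adding a key creates a second name for the object

print(sorted(mapping), same, ns.y is ns.x)
```

The list is never copied: the namespace only stores names mapped to references to it.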
To the python programmer, this is all hidden; you don't need to know where your objects live, just that they are still alive as long as you have something referencing them. You pass around these references when coding in python.