As we know, an integer (int) is immutable in Python. We cannot change the value of an int object in place (without changing its reference).
I want to understand how the garbage collector works in the following scenario.
from ctypes import c_long

a = 1000
while a < 1010:
    print(id(a), c_long.from_address(id(a)))
    a = a + 1
When I run the above code, I get the following output:
140548768404176 c_long(1)
140548768406160 c_long(1)
140548768404176 c_long(1)
140548768406160 c_long(1)
140548768404176 c_long(1)
140548768406160 c_long(1)
140548768404176 c_long(1)
140548768406160 c_long(1)
140548768404176 c_long(1)
140548768406160 c_long(1)
Note: c_long.from_address(id(...)) is used to read the reference count of an object (in CPython, the reference count is the first field stored at the object's address).
I am aware that when an object's reference count drops to 0, the object is deallocated and that freed space can be reused.
I want to understand how the memory block gets reused immediately, and why only these two blocks alternate.
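Here is a minimal sketch of the same observation without ctypes (the exact addresses will differ from run to run); it only assumes the reference-counting behaviour described above:
a = 1000
while a < 1010:
    old = id(a)
    a = a + 1            # a new int object is created; the old one then loses its last reference
    print(old, id(a))    # the just-freed block tends to be reused, so only two addresses alternate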
My code runs perfectly with no errors from the Python shell, but the VS Code IDE highlights geeks[i] = None as an error and gives the above (title) as the problem/error.
Python:
geeks = [6, 0, 4, 1]
i = 0
while i < len(geeks):
    geeks[i] = None
    i += 1
geeks = [x for x in geeks if x is not None]
print(geeks)
The aim of the code is to remove all the elements from the list.
I know various ways this could be done, but I was curious why this happens in VS Code and how to solve it.
Is there any problem that could occur later?
The code runs fine, but the VS Code IDE shows it as an error.
The variable geeks has type list[int], as deduced by Pylance's static analysis.
The other error message you get is more descriptive of the reason:
Argument of type "None" cannot be assigned to parameter "__o"
of type "Iterable[int]" in function "__setitem__"
"__iter__" is not present
Adding a None to this list is not allowed.
If you give geeks the correct type, the error goes away:
from typing import List, Union
geeks: List[Union[int, None]] = [6, 0, 4, 1]
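As an optional alternative (assuming Python 3.10 or newer), the same annotation can be written with the built-in union syntax instead of typing.Union:
geeks: list[int | None] = [6, 0, 4, 1]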
Why does CPython (no clue about other Python implementations) have the following behavior?
tuple1 = ()
tuple2 = ()
dict1 = {}
dict2 = {}
list1 = []
list2 = []
# makes sense, tuples are immutable
assert(id(tuple1) == id(tuple2))
# also makes sense dicts are mutable
assert(id(dict1) != id(dict2))
# lists are mutable too
assert(id(list1) != id(list2))
assert(id(()) == id(()))
# why no assertion error on this?
assert(id({}) == id({}))
# or this?
assert(id([]) == id([]))
I have a few ideas about why it might, but I can't find a concrete reason.
EDIT
To further prove Glenn's and Thomas' point:
>>> id([])
4330909912
>>> x = []
>>> id(x)
4330909912
>>> id([])
4334243440
When you call id({}), Python creates a dict and passes it to the id function. The id function takes its id (its memory location) and throws away the dict. The dict is destroyed. When you do it twice in quick succession (without any other dicts being created in the meantime), the dict Python creates the second time happens to use the same block of memory as the first time. (CPython's memory allocator makes that a lot more likely than it sounds.) Since (in CPython) id uses the memory location as the object id, the id of the two objects is the same. This obviously doesn't happen if you assign the dict to a variable and then get its id(), because the dicts are alive at the same time, so their ids have to be different.
Mutability does not directly come into play, but the caching of tuples and strings by code objects does. Within the same code object (a function, class body, or module body) the same literals (integers, strings, and certain tuples) are re-used. Mutable objects can never be re-used; they are always created at runtime.
In short, an object's id is only unique for the lifetime of the object. After the object is destroyed, or before it is created, something else can have the same id.
CPython reclaims objects as soon as their reference count drops to zero, so the second [] is created after the first [] has already been collected. As a result, most of the time it ends up in the same memory location.
This shows what's happening very clearly (the output is likely to be different in other implementations of Python):
class A:
    def __init__(self): print("a")
    def __del__(self): print("b")

# a a b b False
print(A() is A())
# a b a b True
print(id(A()) == id(A()))
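To complement this, a minimal sketch (CPython-specific; the exact result is not guaranteed by the language): two objects that are alive at the same time can never share an id, while a dead object's address can be reused immediately.
d = {}
print(id(d) == id({}))   # False: d is still alive, so the temporary dict must live at a different address
print(id({}) == id({}))  # typically True: each temporary dict is freed before the next one is created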
Consider the following Python code snippet (produces the same result on both Python 2 and 3):
import sys
REF = sys.getrefcount

class Foo(object):
    pass

def function(foo):
    print( REF(foo) )

class Class(object):
    def __init__(self, foo):
        print( REF(foo) )

print( REF(Foo()) )  # output: 1, ok
function(Foo())      # output: 3, ok
Class(Foo())         # output: 4, what ???
Having searched Google so far, every site I've reached gives only the following cases in which the reference count of an object is incremented by one (a quick demonstration follows the list):
assignment operator
argument passing
appending an object to a list (object's reference count will be increased).
(https://rushter.com/blog/python-garbage-collector/ as a concrete source)
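Here is a small sketch of the first and third cases using sys.getrefcount (names chosen only for illustration); keep in mind that getrefcount itself reports one extra reference for its own argument:
import sys

class Foo(object):
    pass

obj = Foo()
print(sys.getrefcount(obj))  # baseline: the name obj plus getrefcount's own argument

alias = obj                  # assignment: one more reference
print(sys.getrefcount(obj))

container = [obj]            # storing the object in a list: one more reference
print(sys.getrefcount(obj))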
If I understand correctly, in the call function(Foo()), each of the following steps increments the reference count of Foo() (an object created on the fly) by 1:
Foo() is passed to function as an argument => +1 ref
It is assigned to variable foo inside function => +1 ref
It is passed to sys.getrefcount as an argument => +1 ref
Thus 3 references, as expected.
If so, how can the last line claim that the reference count of the object is 4? The code calls the constructor of a class, which is also a function, so the reference count of Foo() should be 3 as well!
So is there any mystery in Python's refcounting here? Are there any extra cases in which the refcount is also incremented?
BACKGROUND: I want to detect whether the object passed to the function/constructor was created on the fly, i.e. outside of the caller's control, so I can do some memory saving: reuse an existing object instead of creating a new one.
I know they practically do the same thing, but if you were to, let's say, do something like...
curpop = this_other_ndarray
i = 0
while i < 20:
    curpop[:] = select(curpop, parameter, parameter1)
    # stuff
    # more stuff
    curpop[:] = some_stuff_I_did
    i += 1
So the above code is just saying: before I enter a generational loop, I take an initial generation of populations from 'this other ndarray'.
Then I plan on changing that array over and over, and every time I restart the loop I will select only some elements from it but assign the result back to the same name. Is this okay to do in Python 3?
Is the assignment
'array[:] = some of itself'
versus
'array = some of itself'
different at all?
These are two totally different things.
The first is simple assignment.
foo = bar
This assignment statement merely says that the name on the left-hand side now refers to the same object as the name on the right-hand side. These statements do not modify either object.
Objects are neither created nor necessarily destroyed. If you lose the last name of an object, however, you will have lost the object. The CPython runtime uses reference counting as a memory management strategy, and will automatically reclaim objects that have a zero reference count.
In Python, variables act simply like object names that you can create, destroy, and change what they reference. Think of them like name-tags.
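For instance, a minimal sketch of the name-tag idea (plain Python, no NumPy needed):
bar = [1, 2, 3]
foo = bar          # both names now refer to the same list object
foo.append(4)
print(bar)         # [1, 2, 3, 4] -- there is only one list
print(foo is bar)  # True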
Now, a statement like:
foo[:] = bar
Is actually a method call. It can be translated to:
foo.__setitem__(slice(None, None, None), bar)
Observe:
>>> class Foo:
...     def __setitem__(self, key, value):
...         print("Key:", key, "Value:", value)
...
>>> class Bar: pass
...
>>> foo = Foo()
>>> bar = Bar()
>>> foo[:] = bar
Key: slice(None, None, None) Value: <__main__.Bar object at 0x104aa5c50>
So, really, the types of the objects control the ultimate effect of this statement. In the case of numpy.ndarray objects, slice-based assignment works like list slice assignment in that it mutates the array object in place, with a few more caveats, such as broadcasting, thrown into the mix. See the relevant docs:
https://docs.scipy.org/doc/numpy-1.13.0/user/basics.indexing.html#assigning-values-to-indexed-arrays
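To make the difference concrete for arrays, here is a small sketch (assuming NumPy is installed; the variable names are just for illustration):
import numpy as np

a = np.array([1, 2, 3, 4])
view = a             # a second name for the same array object

a[:] = a * 2         # in-place slice assignment: the existing array is mutated
print(view)          # [2 4 6 8] -- view sees the change
print(a is view)     # True

a = a * 2            # plain assignment: a brand new array is bound to the name a
print(view)          # still [2 4 6 8] -- view keeps the old object
print(a is view)     # False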
In many cases
curpop[:] = iterable_value_as_tuple_string_dictionary_and_list_etc
does the same thing as
curpop = iterable_value_as_tuple_string_dictionary_and_list_etc
Of course, assigning a string first (or at any later step) removes the ability to use [:] in subsequent steps to assign something again, because str does not support item assignment.
Note that
curpop[:] = notiterable_value is not the same as curpop = notiterable_value
since the first assigns notiterable_value to each element of curpop (by broadcasting, in the NumPy case), while the second binds the name curpop to notiterable_value itself.
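A short sketch illustrating those two points (assuming NumPy; the commented line would raise an error if uncommented):
import numpy as np

curpop = np.zeros(3)
curpop[:] = 5            # broadcasts the scalar: every element becomes 5.0
print(curpop)            # [5. 5. 5.]

curpop = "abc"           # rebinds the name to a string...
# curpop[:] = [1, 2]     # ...which would now fail: str does not support item assignment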