Piggybacking on this question, say i have a container for a weakreference:
import weakref

class Foo(object):
    a = lambda *_: None

    def __init__(self, a):
        self.a = weakref.ref(a, self._remove)

    def _remove(self, *args):
        self.__del__(self)

class Bar(object):
    pass
>>> bar = Bar()
>>> foo = Foo(bar)
>>> del bar
>>> foo
<__main__.Foo object at 0x...>
I thought of storing the Foo instance in a static WeakKeyDictionary container, with the a attribute as a key, and using weakref.proxy of the instance everywhere--but that seems...inefficient. What's the best way to make it so that the Foo instance deletes itself when its reference to a dies?
You can't. I just spent some time digging through the Python source and the ctypes documentation, trying (somewhat ironically) to show how one might really delete an object (i.e., Py_DECREF it until it is deallocated), before I gave up. The point is, you don't really want to do this. Python manages its own memory for a reason. Sure, it gives you access to things like weak references, but in no case will Python break a strong reference.
What you are proposing is to have an object reach into the environments of every bit of code loaded into the Python interpreter to rip out any references to itself. weakref has to rip out references too, but it only has to remove the references from the weakref object; it doesn't have to touch the object holding a reference to the weakref. To remove a reference in the way you propose would be at least invasive and most likely impossible.
To see why it would be impossible, consider how one might write a Python module in C that defines a type. Each instance of the object is going to hold some PyObject pointers to things it cares about. Some of these might be exposed to Python through properties, while others might remain internal. Suppose one of these internal references referenced one of your Foo objects. For it to 'delete' itself, it would have to reach into our C type and NULL out the reference. But to Python code, the C struct defining the object is opaque. If you dug into it with ctypes, you could inspect the bytes, but who's to know whether some sequence of bytes is a pointer to your object or an int that just happens to have the same value as the address of your object? You can't, at least without knowing implementation details of that type. And you can't handle every case, because someone can add another case just by importing another module written in C. You can't anticipate everything.
So what can you do? If you're dead set on doing something like this, you can mimic weakref's interface. Basically, make a new class that holds a reference to your class; to avoid ambiguity, I'll call this a fakeref. When it's called, it returns the instance of your class. Your class holds weak references [1] to all of its fakerefs. Whenever your Foo class wants to delete itself, it loops over the fakerefs, setting their references to the Foo to None. Voilà: your class can 'delete' itself as desired, and all of the fakerefs will now return None. But just as with weakrefs, storing the result of a call will make it a strong reference again, and your class will not be able to delete itself in the manner you desire.
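For concreteness, here is a minimal sketch of that idea; the names fakeref, SelfDeleting and kill() are my own, not part of any library:
import weakref

class fakeref(object):
    """Callable that mimics weakref.ref but can be severed manually."""
    def __init__(self, obj):
        self._obj = obj          # strong reference until severed

    def __call__(self):
        return self._obj

class SelfDeleting(object):
    def __init__(self):
        self._fakerefs = []

    def ref(self):
        r = fakeref(self)
        # Hold the fakerefs themselves weakly so they can still be
        # collected normally when their holders drop them.
        self._fakerefs.append(weakref.ref(r))
        return r

    def kill(self):
        # 'Delete' ourselves by severing every live fakeref.
        for wr in self._fakerefs:
            r = wr()
            if r is not None:
                r._obj = None
        self._fakerefs = []
Calling kill() makes every outstanding fakeref return None, just like a dead weakref; but any holder that already stored the result of a call still keeps the object alive.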
All this said, I don't think you've presented a good enough case for why this is necessary. All you've said is that "there's no reason for it to stay in memory". Well, there is: it needs to be there for the objects that reference it. If, at some point in time, it becomes useless, then your objects shouldn't be holding a reference to it. When the objects referencing it don't care about it any more, they should remove those references. Then Python will clean it up with no further intervention on your part.
[1] If you don't want to rely on weak references, your fakeref can implement __del__ and remove itself from the Foo instance it holds a reference to (if that reference is not None).
While trying to use introspection to navigate from strings to classes via some of the suggestions in Convert string to Python class object? I noticed that the given approaches won't work to get at a class in scope local to a function. Consider the following code:
import sys

def f():
    class LocalClass:
        pass
    print LocalClass
    print 'LocalClass' in dir(sys.modules[__name__])

f()
which gives output
__main__.LocalClass
False
I'm a bit confused as to why LocalClass seems to belong to the main module according to the class object itself, and yet not accessible through sys.modules. Can someone give an explanation?
And is there a way to generate a class from a string, even if that class is only in non-global scope?
In the function f, LocalClass is indeed local. You can see this by trying __main__.LocalClass and seeing that AttributeError: 'module' object has no attribute 'LocalClass' is raised.
As for why the class prints as __main__.LocalClass: by default, a class's repr is built from <cls.__module__>.<cls.__name__>.
The reason dir isn't finding it is that dir only looks at the names defined in the scope you hand it. LocalClass is local to f, so it won't show up if you are looking in the main module.
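To illustrate (a small Python 2 variant of the question's code), the class does show up in the function's own locals():
import sys

def f():
    class LocalClass:
        pass
    print 'LocalClass' in locals()                    # True
    print 'LocalClass' in dir(sys.modules[__name__])  # False

f()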
There are several ways to create a class from a string.
The first, and the easiest to understand, is to use exec. You shouldn't just go around using exec for random things, though, so I wouldn't recommend this method.
The second method is to use the type function. Its help page gives the signature type(name, bases, dict). This means you can create a class called LocalClass, subclassing object, with the attribute foo set to "bar", by doing type("LocalClass", (object,), {"foo": "bar"}) and catching the returned class in a variable. You can make the class global by doing globals()["LocalClass"] = ...
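For instance (the attribute foo and its value "bar" are just the placeholders from the sentence above):
# Build the class dynamically and keep a reference to it.
LocalClass = type("LocalClass", (object,), {"foo": "bar"})
print LocalClass().foo          # bar
# ...or inject it into the module's global namespace by name:
globals()["LocalClass"] = type("LocalClass", (object,), {"foo": "bar"})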
PS: An easier (though not necessarily prettier) way to get the main module is to do import __main__. This can be used in any module, but I would generally advise against it unless you know what you are doing, because in general Python people don't like you doing this sort of thing.
EDIT: After looking at the linked question, you don't want to dynamically create a new class but to retrieve a variable given its name. All the answers in the linked question will do that; I'll leave it up to you to decide which one you prefer.
EDIT2: LocalClass.__module__ is the same as __main__ because that is the module in which you defined the class. If you had defined it in a module Foo that was imported by __main__ (and not actually run standalone), you would find that __module__ would be "Foo". Even though LocalClass was defined in __main__, it won't automatically go into the global table just because it is a class: in Python, as you might already know, (almost) everything is an object. The dir function searches for the names defined in a scope. As you are looking in the main scope, it is nearly equivalent to looking at __dict__ or globals(), with some slight differences. Because LocalClass is local, it isn't defined in the global context. If, however, you called locals() while inside the function f, you would find that LocalClass does appear in that list.
I am fairly new to OOP concepts in Python, so I wanted to know: is the functionality of Python's self in any way similar to that of the this keyword in C++/C#?
self and this serve the same purpose, except that self must be received explicitly as the first parameter.
Python is a dynamic language, so you can add members to your class at runtime. Using self explicitly lets you make clear whether you are working with a local name, an instance attribute, or a class attribute.
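For example, a small illustration of my own showing the three kinds of name:
class Counter(object):
    total = 0                     # class attribute, shared by all instances

    def bump(self):
        step = 1                  # local name, gone when the method returns
        self.count = getattr(self, 'count', 0) + step   # instance attribute
        Counter.total += step                           # class attribute

c = Counter()
c.bump()
print c.count, Counter.total      # 1 1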
As in C++, you can pass the instance explicitly. In the following code, #1 and #2 are actually the same. So you can use methods as normal functions with no ambiguity.
class Foo:
    def call(self):
        pass

foo = Foo()
foo.call()     #1
Foo.call(foo)  #2
From PEP 20: "Explicit is better than implicit."
Note that self is not a keyword; you can name it whatever you wish, it is just a convention.
Yes, they implement the same concept. They provide a handle to the instance of the class on which the method was executed, or, in other words, the instance through which the method was called.
Probably someone smarter will come along to point out the real differences, but for everyday use Python's self is basically equivalent to C++'s *this.
However, self in Python is used much more explicitly: it appears explicitly in method declarations, and calls to other methods of the same instance must be made explicitly through self.
I.e.:
def do_more_fun(self):
    # haha
    pass

def method1(self, other_arg):
    self.do_more_fun()
This in C++ would look more like:
void do_more_fun() {
    // haha
}

void method1(other_arg) {
    do_more_fun();
    // this->do_more_fun();  // can also be called explicitly through `this`
}
Also, as juanchopanza pointed out, this is a keyword in C++, so you cannot use any other name for it. This goes hand in hand with the other difference: you cannot omit passing this to a C++ method; the only way to do that is to make the method static. The same holds for Python, but under a different convention: in Python the first argument is always implicitly bound to the instance, so you can name it anything you like. To avoid receiving it, and thus make a static method in Python, you need to use the @staticmethod decorator.
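A small sketch of my own illustrating both points, the arbitrary first-argument name and @staticmethod:
class Greeter(object):
    def hello(this):              # 'self' is only a convention; any name works
        print "hello from", this

    @staticmethod
    def version():                # no instance argument is passed at all
        return "1.0"

g = Greeter()
g.hello()
print Greeter.version()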
Let's say I have an id of a Python object, which I retrieved by doing id(thing). How do I find thing again by the id number I was given?
If the object is still there, this can be done by ctypes:
import ctypes
a = "hello world"
print ctypes.cast(id(a), ctypes.py_object).value
output:
hello world
If you don't know whether the object is still there, this is a recipe for undefined behavior and weird crashes or worse, so be careful.
You'll probably want to consider implementing it another way. Are you aware of the weakref module?
The Python weakref module lets you keep references, dictionary references, and proxies to objects without having those references count toward the reference count. They're like symbolic links.
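A minimal sketch of how a weak reference behaves (CPython collects the object as soon as the last strong reference disappears):
import weakref

class Thing(object):
    pass

t = Thing()
r = weakref.ref(t)      # does not keep t alive
print r() is t          # True: call the ref to get the object back

del t                   # last strong reference gone; CPython collects it now
print r()               # None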
You can use the gc module to get all the objects currently tracked by the Python garbage collector.
import gc

def objects_by_id(id_):
    for obj in gc.get_objects():
        if id(obj) == id_:
            return obj
    raise Exception("Not found")
Short answer: you can't.
Long answer: you can maintain a dict mapping IDs to objects, or look the ID up by exhaustive search of gc.get_objects(), but this creates one of two problems: either the dict's reference keeps the object alive and prevents GC, or (if it's a WeakValueDictionary, or you use gc.get_objects()) the object may be deallocated and its ID reused for a completely different object.
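For example, a sketch of the WeakValueDictionary variant; the registry and the Tracked class here are hypothetical, and this only works for objects that support weak references:
import weakref

_registry = weakref.WeakValueDictionary()

class Tracked(object):
    pass

def remember(obj):
    # Store the object under its id without keeping it alive.
    _registry[id(obj)] = obj
    return id(obj)

def recall(id_):
    # Returns None once the object has been collected. Beware: the same id
    # can later be reused by an unrelated object that was never registered,
    # so treat a hit as best-effort only.
    return _registry.get(id_)

t = Tracked()
key = remember(t)
print recall(key) is t   # True
del t
print recall(key)        # None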
Basically, if you're trying to do this, you probably need to do something differently.
Just mentioning this module for completeness. This code by Bill Bumgarner includes a C extension to do what you want without looping through every object in existence.
The code for the function is quite straightforward. Every Python object is represented in C by a pointer to a PyObject struct. Because id(x) is just the memory address of this struct, we can retrieve the Python object just by treating x as a pointer to a PyObject, then calling Py_INCREF to tell the garbage collector that we're creating a new reference to the object.
static PyObject *
di_di(PyObject *self, PyObject *args)
{
    PyObject *obj;

    if (!PyArg_ParseTuple(args, "l:di", &obj))
        return NULL;

    Py_INCREF(obj);
    return obj;
}
If the original object no longer exists then the result is undefined. It may crash, but it could also return a reference to a new object that's taken the location of the old one in memory.
The eGenix mxTools library does provide such a function, although it is marked as "expert-only": mx.Tools.makeref(id)
This will do:
a = 0
id_a = id(a)
variables = {**locals(), **globals()}
for var in variables:
    exec('var_id = id(%s)' % var)
    if var_id == id_a:
        exec('the_variable = %s' % var)

print(the_variable)
print(id(the_variable))
But I suggest implementing a more decent way.
As I write it, it seems almost surreal to me that I'm actually experiencing this problem.
I have a list of objects. Each of these objects is an instance of an Individual class that I wrote.
Thus, conventional wisdom says that isinstance(myObj, Individual) should return True. However, this was not the case. So I thought that there was a bug in my programming, and printed type(myObj), which to my surprise printed instance and myObj.__class__ gave me Individual!
>>> type(pop[0])
<type 'instance'>
>>> isinstance(pop[0], Individual) # with all the proper imports
False
>>> pop[0].__class__
Genetic.individual.Individual
I'm stumped! What gives?
EDIT: My Individual class
from itertools import count

class Individual:
    ID = count()

    def __init__(self, chromosomes):
        # managed as a list, as order is used to identify chromosomal
        # functions (i.e. chromosome i encodes functionality f)
        self.chromosomes = chromosomes[:]
        self.id = self.ID.next()

    # other methods
This error indicates that the Individual class somehow got created twice. You created pop[0] with one version of Individual, and are checking isinstance against the other one. Although they are pretty much identical, Python doesn't know that, and isinstance fails. To verify this, check whether pop[0].__class__ is Individual evaluates to False.
Normally classes don't get created twice (unless you use reload) because modules are imported only once, and all class objects effectively remain singletons. However, using packages and relative imports can leave a trap that leads to a module being imported twice. This happens when a script (started with python bla, as opposed to being imported from another module with import bla) contains a relative import. When running the script, Python doesn't know that its imports refer to the Genetic package, so it processes them as absolute, creating a top-level individual module with its own individual.Individual class. Other modules correctly import the Genetic package, which ends up importing Genetic.individual, resulting in the creation of the doppelganger, Genetic.individual.Individual.
To fix the problem, make sure that your script only uses absolute imports, such as import Genetic.individual even if a relative import like import individual appears to work just fine. And if you want to save on typing, use import Genetic.individual as individual. Also note that despite your use of old-style classes, isinstance should still work, since it predates new-style classes. Having said that, it would be highly advisable to switch to new-style classes.
You need to use new-style classes, which inherit from object:
class ClassName(object):
    pass
From your example, you are using old-style classes, which are declared without inheriting from object:
class Classname:
    pass
EDIT: As @user4815162342 said,
>>> type(pop[0])
<type 'instance'>
is caused by using an old-style class, but this is not the cause of your issues with isinstance. You should instead make sure you don't create the class in more than one place, or if you do, use distinct names. Importing it more than once should not be an issue.
I am writing a moderate-sized (a few KLOC) PyQt app. I started out writing it in nice modules for ease of comprehension but I am foundering on the rules of Python namespaces. At several points it is important to instantiate just one object of a class as a resource for other code.
For example: an object that represents Aspell attached as a subprocess, offering a check(word) method. Another example: the app features a single QTextEdit and other code needs to call on methods of this singular object, e.g. "if theEditWidget.document().isEmpty()..."
No matter where I instantiate such an object, it can only be referenced from code in that module and no other. So e.g. the code of the edit widget can't call on the Aspell gateway object unless the Aspell object is created in the same module. Fine except it is also needed from other modules.
In this question the bunch class is offered, but it seems to me a bunch has exactly the same problem: it's a unique object that can only be used in the module where it's created. Or am I completely missing the boat here?
OK, as suggested elsewhere, this seems like a simple answer to my problem. I just tested the following:
junk_main.py:
import junk_A
singularResource = junk_A.thing()
import junk_B
junk_B.handle = singularResource
print junk_B.look()
junk_A.py:
class thing():
    def __init__(self):
        self.member = 99

junk_B.py:

def look():
    return handle.member
When I run junk_main it prints 99. So the main code can inject names into modules just by assignment. I am trying to think of reasons this is a bad idea.
You can access objects in a module with the . operator, just as you would access its functions. So, for example:
# Module a.py
a = 3
>>> import a
>>> print a.a
3
This is a trivial example, but you might want to do something like:
# Module EditWidget.py
theEditWidget = EditWidget()
...
# Another module
import EditWidget
if EditWidget.theEditWidget.document().isEmpty():
Or...
from EditWidget import *
if theEditWidget.document().isEmpty():
If you do go the from EditWidget import * route, you can define a list named __all__ in your modules containing the names (as strings) of all the objects you want your module to export to *. So if you wanted only theEditWidget to be exported, you could do:
# Module EditWidget.py
__all__ = ["theEditWidget"]
theEditWidget = EditWidget()
...
It turns out the answer is simpler than I thought. As I noted in the question, the main module can add names to an imported module. And any code can add members to an object. So the simple way to create an inter-module communication area is to create a very basic object in the main module, say IMC (for inter-module communicator), and assign to it, as members, anything that should be available to other modules:
IMC.special = A.thingy()
IMC.important_global_constant = 0x0001
etc. After importing any module, just assign IMC to it:
import B
B.IMC = IMC
Now, this is probably not the greatest idea from a software design standpoint. If you just limit IMC to holding named constants, it acts like a C header file. If it's just there to give access to singular resources, it's like an extern declaration. But because of Python's liberal rules, code in any module can modify or add members to IMC. Used in an undisciplined way, "who changed that?" could become a debugging issue. If there are multiple processes, race conditions are a danger.
At several points it is important to instantiate just one object of a class as a resource for other code.
Instead of trying to create some sort of singleton factory, can you not create the single-use object somewhere between the main point of entry for the program and instantiating the object that needs it? The single-use object can just be passed as a parameter to the other object. Logically, then, you won't create the single-use object more than once.
For example:
def main(...):
    aspell_instance = ...
    myapp = MyAppClass(aspell_instance)
or...
class SomeWidget(...):
    def __init__(self, edit_widget):
        self.edit_widget = edit_widget

    def onSomeEvent(self, ...):
        if self.edit_widget.document().isEmpty():
            ....
I don't know if that's clear enough, or if it's applicable to your situation. But to be honest, the only time I've found I can't do this is in a CherryPy-based webserver, where the points of entry were pretty much everywhere.