Hierarchy / Flyweight / Instancing Problem in Python - python

Here is the problem I am trying to solve, (I have simplified the actual problem, but this should give you all the relevant information). I have a hierarchy like so:
1.A
1.B
1.C
2.A
3.D
4.B
5.F
(This is hard to illustrate - each number is the parent, each letter is the child).
Creating an instance of the 'letter' objects is expensive (IO, database costs, etc), so should only be done once.
The hierarchy needs to be easy to navigate.
Children in the hierarchy need to have just one parent.
Modifying the contents of the letter objects should be possible directly from the objects in the hierarchy.
There needs to be a central store containing all of the 'letter' objects (and only those in the hierarchy).
'letter' and 'number' objects need to be possible to create from a constructor (such as Letter(**kwargs) ).
It is perfectly acceptable to expect that when a letter is changed from the hierarchy, all other occurrences of that letter will reflect the same change.
Hope this isn't too abstract to illustrate the problem.
What would be the best way of solving this? (Then I'll post my solution)
Here's an example script:
one = Number('one')
a = Letter('a')
one.addChild(a)
two = Number('two')
a = Letter('a')
two.addChild(a)
for child in one:
    child.method1()
for child in two:
    print '%s' % child.method2()

A basic approach will use builtin data types. If I get your drift, the Letter object should be created by a factory with a dict cache to keep previously generated Letter objects. The factory will create only one Letter object for each key.
A Number object can be a sub-class of list that will hold the Letter objects, so that append() can be used to add a child. A list is easy to navigate.
A crude outline of a caching factory:
>>> class Letters(object):
...     def __init__(self):
...         self.cache = {}
...     def create(self, v):
...         l = self.cache.get(v, None)
...         if l:
...             return l
...         l = self.cache[v] = Letter(v)
...         return l
>>> factory=Letters()
>>> factory.cache
{}
>>> factory.create('a')
<__main__.Letter object at 0x00EF2950>
>>> factory.create('a')
<__main__.Letter object at 0x00EF2950>
>>>
To fulfill requirement 6 (constructor), here is a more contrived example of a caching constructor, using __new__. This is similar to Recipe 413717: Caching object creation.
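class Letter(object):
    cache = {}
    def __new__(cls, v):
        o = cls.cache.get(v, None)
        if o is not None:
            return o
        else:
            o = cls.cache[v] = object.__new__(cls)
            return o
    def __init__(self, v):
        # __init__ runs on every Letter(v) call, even when __new__ returned a
        # cached instance, so initialise only once to keep refcount intact.
        if hasattr(self, 'refcount'):
            return
        self.v = v
        self.refcount = 0
    def addAsChild(self, chain):
        if self.refcount > 0:
            return False
        self.refcount += 1
        chain.append(self)
        return True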
Testing the cache functionality
>>> l1 = Letter('a')
>>> l2 = Letter('a')
>>> l1 is l2
True
>>>
For enforcing a single parent, you'll need a method on Letter objects (not Number) - with a reference counter. When called to perform the addition it will refuse addition if the counter is greater than zero.
l1.addAsChild(num4)
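Putting the pieces above together, a minimal sketch might look like the following. It is written for Python 3 and uses a hypothetical Number class that subclasses list, as suggested earlier; the names add_child, key and parent are assumptions made for this illustration and are not part of the original answer.

```python
class Letter:
    """Flyweight: one shared instance per key, created on first use."""
    _cache = {}

    def __new__(cls, key):
        inst = cls._cache.get(key)
        if inst is None:
            inst = cls._cache[key] = super().__new__(cls)
            inst.key = key
            inst.parent = None  # assumption: track the single parent directly
        return inst


class Number(list):
    """A parent node; its children are the Letter objects it holds."""

    def __init__(self, name):
        super().__init__()
        self.name = name

    def add_child(self, letter):
        # Enforce "one parent per child": refuse letters that are already owned.
        if letter.parent is not None:
            return False
        letter.parent = self
        self.append(letter)
        return True


one, two = Number('one'), Number('two')
a = Letter('a')
print(Letter('a') is a)            # True: same cached instance
print(one.add_child(a))            # True
print(two.add_child(Letter('a')))  # False: 'a' already belongs to 'one'
```

Storing the parent on the letter, rather than a bare counter, is a design choice of this sketch: it makes the hierarchy navigable upwards as well as downwards.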

Related

The zen of python applied to methods in classes

The Zen of python tells us:
There should be one-- and preferably only one --obvious way to do it.
This is difficult to put in practice when it comes to the following situation.
A class receives a list of documents.
The output is a dictionary per document with a variety of key/value pairs.
Every pair depends on a previously calculated one, or even on key/value pairs from other dictionaries in the list.
This is a very simplified example of such a class.
What is the “obvious” way to go? Every method adds a key/value pair to each of the dictionaries.
class T():
    def __init__(self, mylist):
        # build list of dicts
        self.J = [{str(i): mylist[i]} for i in range(len(mylist))]
        # enhancement 1: upper
        self.method1()
        # enhancement 2: lower
        self.J = self.static2(self.J)
    def method1(self):
        newdict = []
        for i, mydict in enumerate(self.J):
            mydict['up'] = mydict[str(i)].upper()
            newdict.append(mydict)
        self.J = newdict
    @staticmethod
    def static2(alist):
        J = []
        for i, mydict in enumerate(alist):
            mydict['down'] = mydict[str(i)].lower()
            J.append(mydict)
        return J
    @property
    def propmethod(self):
        J = []
        for i, mydict in enumerate(self.J):
            mydict['prop'] = mydict[str(i)].title()
            J.append(mydict)
        return J
    # more methods extracting info out of every doc in the list
    # ...
self.method1() is simply run and a new key/value pair is added to every dict.
The static method static2 can also be used,
and so can the property.
Out of the three ways I discard @property because I am not adding another attribute.
From the other two, which one would you choose?
Remember the class will be composed of tens of these methods that do not add attributes. They only update (add key/value pairs to) dictionaries in a list.
I cannot see the difference between method1 and static2.
thx.
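One way to see why method1 and static2 look equivalent: both mutate the very same dict objects in place, so the only practical difference is whether the method reads its input from self or from an argument. The sketch below illustrates that under the assumption that the decorators in the question are @staticmethod and @property; the class name Demo and the method names are made up for the illustration, and it is written for Python 3.

```python
class Demo:
    def __init__(self, docs):
        self.J = [{str(i): doc} for i, doc in enumerate(docs)]

    def add_upper(self):
        # Instance method: reaches the data through self.
        for i, d in enumerate(self.J):
            d['up'] = d[str(i)].upper()

    @staticmethod
    def add_lower(dicts):
        # Static method: works on whatever list it is handed.
        for i, d in enumerate(dicts):
            d['down'] = d[str(i)].lower()
        return dicts


t = Demo(['Ab', 'Cd'])
t.add_upper()
same_list = Demo.add_lower(t.J)
print(same_list is t.J)  # True: the dicts were updated in place
print(t.J[0])            # {'0': 'Ab', 'up': 'AB', 'down': 'ab'}
```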

Initiate subclasses from parent class

Suppose I have a list of inputs that will generate O objects, of the following form:
inps = [['A', 5], ['B', 2]]
and O has subclasses A and B. A and B each are initiated with a single integer --
5 or 2 in the example above -- and have a method update(self, t), so I believe it makes sense to group them under an O superclass. I could complete the program with a loop:
Os = []
for inp in inps:
    if inp[0] == 'A':
        Os.append(A(inp[1]))
    elif inp[0] == 'B':
        Os.append(B(inp[1]))
and then at runtime,
for O in Os: O.update(t)
I'm wondering, however, if there is a more object oriented way to accomplish this. One way, I suppose, might be to make a fake "O constructor" outside of the O class:
def initO(inp):
    if inp[0] == 'A':
        return A(inp[1])
    elif inp[0] == 'B':
        return B(inp[1])
Os = [initO(inp) for inp in inps]
This is more elegant, in my opinion, and for all intents and purposes gives me the result I want; but it feels like a complete abuse of the class system in Python. Is there a better way to do this, perhaps by initiating A and B from the O constructor?
EDIT: The ideal would be to be able to use
Os = [O(inp) for inp in inps]
while maintaining O as a superclass of A and B.
You could use a dict to map the names to the actual classes:
dct = {'A': A, 'B': B}
[dct[name](argument) for name, argument in inps]
Or if you don't want the list-comprehension:
dct = {'A': A, 'B': B}
Os = []
for inp in inps:
    cls = dct[inp[0]]
    Os.append(cls(inp[1]))
Although it is technically possible to perform call by name in Python, I strongly advise against it. The cleanest way is probably using a dictionary:
trans = { 'A' : A, 'B' : B }
def initO(inp):
    cons = trans.get(inp[0])
    if cons is not None:
        return cons(*inp[1:])
So here trans is a dictionary that maps names to classes (and thus to the corresponding constructors).
In the initO we perform a lookup, if the lookup succeeds, we call the constructor cons with the remaining arguments of inp.
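For a self-contained illustration of the dictionary approach, the sketch below defines stand-in O, A and B classes (their update bodies are invented purely for the demonstration) and raises an explicit error for unknown names instead of silently returning None. It is written for Python 3.

```python
class O:
    def __init__(self, value):
        self.value = value

    def update(self, t):
        raise NotImplementedError


class A(O):
    def update(self, t):
        self.value += t      # placeholder behaviour


class B(O):
    def update(self, t):
        self.value *= t      # placeholder behaviour


trans = {'A': A, 'B': B}


def initO(inp):
    name, *args = inp
    try:
        cons = trans[name]
    except KeyError:
        raise ValueError('unknown class name: %r' % name)
    return cons(*args)


inps = [['A', 5], ['B', 2]]
Os = [initO(inp) for inp in inps]
for o in Os:
    o.update(3)
print([(type(o).__name__, o.value) for o in Os])  # [('A', 8), ('B', 6)]
```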
In case you really want to create a (direct) subclass from within a parent class you could use the special __subclasses__ method:
class O(object):
    def __init__(self, integer):
        self.value = integer
    @classmethod
    def get_subclass(cls, subclassname, value):
        # probably not a really good name for that method - I'm out of creativity...
        subcls = next(sub for sub in cls.__subclasses__() if sub.__name__ == subclassname)
        return subcls(value)
    def __repr__(self):
        return '{self.__class__.__name__}({self.value})'.format(self=self)
class A(O):
    pass
class B(O):
    pass
This acts like a factory:
>>> O.get_subclass('A', 1)
A(1)
Or as list-comprehension:
>>> [O.get_subclass(*inp) for inp in inps]
In case you want to optimize it and you know that you won't add subclasses while the program is running, you could put the subclasses in a dictionary that maps from __name__ to the subclass:
class O(object):
    __subs = {}
    def __init__(self, integer):
        self.value = integer
    @classmethod
    def get_subclass(cls, subclassname, value):
        if not cls.__subs:
            cls.__subs = {sub.__name__: sub for sub in cls.__subclasses__()}
        return cls.__subs[subclassname](value)
You could probably also use __new__ to implement that behavior or a metaclass but I think a classmethod may be more appropriate here because it's easy to understand and allows for more flexibility.
In case you not only want direct subclasses you might want to check this recipe to find even subclasses of your subclasses (I also implemented it in a 3rd party extension package of mine: iteration_utilities.itersubclasses).
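A recursive walk over __subclasses__() is one way to reach indirect subclasses as well. The helper below is a hypothetical sketch: the name iter_subclasses is made up here and is not the API of the package mentioned above. It is written for Python 3.

```python
def iter_subclasses(cls, _seen=None):
    """Yield every direct and indirect subclass of cls exactly once."""
    if _seen is None:
        _seen = set()
    for sub in cls.__subclasses__():
        if sub not in _seen:
            _seen.add(sub)
            yield sub
            yield from iter_subclasses(sub, _seen)


class O: pass
class A(O): pass
class B(O): pass
class A1(A): pass   # grandchild of O

by_name = {sub.__name__: sub for sub in iter_subclasses(O)}
print(sorted(by_name))  # ['A', 'A1', 'B']
```

The _seen set guards against yielding a class twice when multiple inheritance makes it reachable through more than one path.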
Without knowing more about your A and B, it's hard to say. But this looks like a classic case for a switch in a language like C. Python doesn't have a switch statement, so a dict or dict-like construct is used instead.
If you're sure your inputs are clean, you can directly get your classes using the globals() function:
Os = [globals()[f](x) for (f, x) in inps]
If you want to sanitize, you can do something like this:
allowed = {'A', 'B'}
Os = [globals()[f](x) for (f, x) in inps if f in allowed]
This solution can also be changed if you prefer to have a fixed dictionary and sanitized inputs:
allowed = {'A', 'B'}
classname_to_class = {k: v for (k, v) in globals().iteritems() if k in allowed}
# Now, you can have a dict mapping class names to classes without writing 'A': A, 'B': B ...
Alternately, if you can prefix all your class definitions, you could even do something like this:
classname_to_class = {k[13:]: v for (k, v) in globals().iteritems() if k.startswith('SpecialPrefix')} # 13: is the length of 'SpecialPrefix'
This solution allows you to just name your classes with a prefix and have the dictionary automatically populate (after stripping out the special prefix if you so choose). These dictionaries are equivalent to trans and dct in the other solutions posted here, except without having to manually generate the dictionary.
Unlike the other solutions posted so far, these reduce the likelihood of a transcription error (and the amount of boilerplate code required) in cases where you have a lot more classes than A and B.
At the risk of drawing more negative fire... we can use metaclasses. This may or may not be suitable for your particular application. Every time you define a subclass of class O, you always have an up-to-date list (well, dict) of O's subclasses. Oh, and this is written for Python 2 (but can be ported to Python 3).
class OMetaclass(type):
    '''This metaclass adds a 'subclasses' attribute to its classes that
    maps subclass name to the class object.'''
    def __init__(cls, name, bases, dct):
        if not hasattr(cls, 'subclasses'):
            cls.subclasses = {}
        else:
            cls.subclasses[name] = cls
        super(OMetaclass, cls).__init__(name, bases, dct)
class O(object):
    __metaclass__ = OMetaclass
### Now, define the rest of your subclasses of O as usual.
class A(O):
    def __init__(self, x): pass
class B(O):
    def __init__(self, x): pass
Now, you have a dictionary, O.subclasses, that contains all the subclasses of O. You can now just do this:
Os = [O.subclasses[cls](arg) for (cls, arg) in inps]
Now, you don't have to worry about weird prefixes for your classes and you won't need to change your code if you're subclassing O already, but you've introduced magic (metaclasses) that may make your program harder to grok.
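For reference, a Python 3 port of the same idea could look like the sketch below. The only structural change is the metaclass=... keyword in the class statement, which replaces the __metaclass__ attribute; the x attribute on O is added here just so the result can be printed.

```python
class OMetaclass(type):
    """Registers every subclass of the root class under its name."""

    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        if not hasattr(cls, 'subclasses'):
            cls.subclasses = {}          # first class created: the root
        else:
            cls.subclasses[name] = cls   # any later class: a subclass


class O(metaclass=OMetaclass):   # Python 3 spelling of __metaclass__
    def __init__(self, x):
        self.x = x


class A(O): pass
class B(O): pass

inps = [['A', 5], ['B', 2]]
Os = [O.subclasses[name](arg) for name, arg in inps]
print([(type(o).__name__, o.x) for o in Os])  # [('A', 5), ('B', 2)]
```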

Removing list item if not in another list - python

Here is my situation.
I have a list of Person objects.
import uuid
class Person():
    def __init__(self, name="", age=0):  # the default for age is missing in the original; 0 is a placeholder
        self.name = name
        self.uid = str( uuid.uuid4( ) )
        self.age = age
My UI contains a treeview displaying these items. In some cases users can have more than one instance of the same person if they want. I highlight those in bold to let the user know it's the same person.
THE PROBLEM
When a user deletes a tree node I then need to know if I should remove the actual object from the list. However if another instance of that object is being used then I shouldn't delete it.
My thoughts for a solution.
Before the delete operation takes place, which removes just the treenode items, I would collect all persons being used in the ui.
Next I would proceed with deleting the treeview items.
Next, take another collection of the objects being used in the UI.
Lastly, compare the two lists and delete the persons not appearing in the second list.
If I go with this solution, would it be best to do a test like
for p in reversed(original_list):
    if p not in new_list:
        original_list.remove(p)
Or should I collect the uid numbers instead to do the comparisons, rather than the entire object?
The lists could be rather large.
Here is the code with my first attempt at handling the remove operation. It saves out a JSON file when you close the app.
https://gist.github.com/JokerMartini/4a78b3c5db1dff8b7ed8
This is my function doing the deleting.
def delete_treewidet_items(self, ctrl):
    global NODES
    root = self.treeWidget.invisibleRootItem()
    # delete treewidget items from gui
    for item in self.treeWidget.selectedItems():
        (item.parent() or root).removeChild(item)
    # collect all uids used in GUI
    uids_used = self.get_used_uids( root=self.treeWidget.invisibleRootItem() )
    for n in reversed(NODES):
        if n.uid not in uids_used:
            NODES.remove(n)
You have not really posted enough code but from what I can gather:
import collections
import uuid
class Person():
    def __init__(self, name="", age=69):
        self.name = name
        self.uid = str( uuid.uuid4( ) )
        self.age = age
    def __eq__(self, other):
        return isinstance(other, Person) and self.uid == other.uid
    def __ne__(self, other): return not self == other # you need this (in Python 2)
    def __hash__(self):
        return hash(self.uid)
# UI --------------------------------------------------------------------------
persons_count = collections.defaultdict(int) # belongs to your UI class
your_list_of_persons = [] # should be a set
def add_to_ui(person):
    persons_count[person] += 1
    # add it to the UI
def remove_from_ui(person):
    persons_count[person] -= 1
    if not persons_count[person]: your_list_of_persons.remove(person)
    # remove from UI
So basically:
before the delete operation takes place, which removes just the treenode items, I would collect all persons being used in the ui.
No - you have this info always available as a module variable in your ui - the persons_count above. This way you don't have to copy lists around.
What remains is the code that creates the persons; your list (which contains distinct persons, so it should be a set) should be updated there. If this is done in add_to_ui (which makes sense), you should modify it as:
def add_to_ui(name, age):
    p = Person(name, age)
    set_of_persons.add(p) # if already there won't be re-added and it's O(1)
    persons_count[p] += 1
    # add it to the UI
To take this a step further - you don't really need your original list, since that is just persons_count.keys(). You just have to modify:
def add_to_ui(name, age):
    p = Person(name, age)
    persons_count[p] += 1
    # add it to the UI
def remove_from_ui(person):
    persons_count[person] -= 1
    if not persons_count[person]: del persons_count[person]
    # remove from UI
So you get the picture
EDIT: here is delete from my latest iteration:
def delete_tree_nodes_clicked(self):
    root = self.treeWidget.invisibleRootItem()
    # delete treewidget items from gui
    for item in self.treeWidget.selectedItems():
        (item.parent() or root).removeChild(item)
        self.highlighted.discard(item)
        persons_count[item.person] -= 1
        if not persons_count[item.person]: del persons_count[item.person]
I have posted my solution (a rewrite of the code linked to the first question) in: https://github.com/Utumno/so_34104763/commits/master. It's a nice exercise in refactoring - have a look at the commit messages. In particular I introduce the dict here: https://github.com/Utumno/so_34104763/commit/074b7e659282a9896ea11bbef770464d07e865b7
Could use more work but it's a step towards the right direction I think - should be faster too in most operations and conserve memory
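Stripped of the UI, the counting idea can be sketched as follows. This is written for Python 3; the function names mirror the answer, while the simplified Person class and the boolean return value of remove_from_ui are assumptions made for the demonstration.

```python
import collections
import uuid


class Person:
    def __init__(self, name=""):
        self.name = name
        self.uid = str(uuid.uuid4())

    def __eq__(self, other):
        return isinstance(other, Person) and self.uid == other.uid

    def __hash__(self):
        return hash(self.uid)


# person -> number of tree nodes currently showing that person
persons_count = collections.defaultdict(int)


def add_to_ui(person):
    persons_count[person] += 1


def remove_from_ui(person):
    """Drop one node; return True when the person is no longer shown anywhere."""
    persons_count[person] -= 1
    if persons_count[person] <= 0:
        del persons_count[person]
        return True
    return False


john = Person('John')
add_to_ui(john)
add_to_ui(john)               # second node for the same person
print(remove_from_ui(john))   # False: one node still uses it
print(remove_from_ui(john))   # True: safe to delete the object itself
```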
Not worrying too much about runtime or size of lists, you could use set-operations:
for p in set(original_list) - set(new_list):
    original_list.remove(p)
Or filter the list:
new_original_list = [p for p in original_list if p in new_list]
But then again, why look at the whole list - when one item (or even a non-leaf node in a tree) is deleted, you know which item was deleted, so you could restrict your search to just that one.
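If the whole list does need to be filtered and it is large, collecting the uids into a set first keeps each membership test O(1) on average instead of scanning a list. A small sketch, using a hypothetical Node stand-in for the objects held in NODES:

```python
class Node:
    def __init__(self, uid):
        self.uid = uid


NODES = [Node('a'), Node('b'), Node('c')]
uids_used = {'a', 'c'}    # e.g. collected from the tree after the delete

# Slice assignment filters in place, so other references to NODES stay valid.
NODES[:] = [n for n in NODES if n.uid in uids_used]
print([n.uid for n in NODES])  # ['a', 'c']
```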
You can compare objects using:
object identity
object equality
To compare object identity you should use the built-in function id() or the is operator (which compares the same identity that id() reports). From the docs:
id function
Return the “identity” of an object. This is an integer (or long
integer) which is guaranteed to be unique and constant for this object
during its lifetime. Two objects with non-overlapping lifetimes may
have the same id() value.
is operator
The operators is and is not test for object identity: x is y is true
if and only if x and y are the same object. x is not y yields the
inverse truth value.
Example:
>>> p1 = Person('John')
>>> p2 = Person('Billy')
>>> id(p1) == id(p2)
False
>>> p1 is p2
False
To compare object equality you use the == operator. The == operator uses the __eq__ method to test for equality. If the class does not define such a method, it falls back to comparing the identity of the objects.
So for:
Or should I collect the uid numbers instead to do the comparisons
rather than the entire object?
you would be doing the same thing, since you have not defined __eq__ in your class.
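The fallback to identity can be seen in a short sketch. The class names Plain and ByUid are made up for this illustration; the uid of the second object is overwritten by hand only to force two distinct objects to carry the same uid.

```python
import uuid


class Plain:
    def __init__(self):
        self.uid = str(uuid.uuid4())


class ByUid(Plain):
    def __eq__(self, other):
        return isinstance(other, ByUid) and self.uid == other.uid

    def __hash__(self):
        return hash(self.uid)


p = Plain()
q = Plain()
q.uid = p.uid
print(p == q)        # False: no __eq__, so == compares identity

a = ByUid()
b = ByUid()
b.uid = a.uid
print(a == b)        # True: __eq__ compares the uid
print(a is b)        # False: still two distinct objects
```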
When filtering a list, do not modify it while you are iterating over it. Guess what will be printed:
>>> a = [1, 2, 3]
>>> b = [1, 2]
>>> for item in a:
...     if item in b:
...         a.remove(item)
>>> a
[2, 3]
If you want to do this safely, iterate over the list from the back, like this:
>>> a = [1, 2, 3]
>>> b = [1, 2]
>>> for i in xrange(len(a) - 1, -1, -1):
...     if a[i] in b:
...         a.pop(i)
2
1
>>> a
[3]
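On Python 3, where xrange no longer exists, the same backwards walk can be written with reversed(range(...)). A short sketch, alongside the non-mutating alternative of building a new list (the names original and kept are illustrative):

```python
a = [1, 2, 3]
b = [1, 2]

# Walk the indices from the end so removals never shift unvisited items.
for i in reversed(range(len(a))):
    if a[i] in b:
        a.pop(i)
print(a)  # [3]

# Alternative: leave the original alone and build the filtered list.
original = [1, 2, 3]
kept = [item for item in original if item not in b]
print(kept)  # [3]
```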

python: need a deepcopy equivalent breaking all shared identity

Due to some constraints I need to create a fresh copy of an object along with fresh copies of all its attributes, and of the attributes of its attributes, and so on recursively.
Existing deepcopy() is recursive, but when multiple objects within the tree being copied have the same starting identity, they also have the same ending identity (even though their ending identities don't match their starting identities).
For the following case:
class A:
    def __init__(self, x):
        self.x = x
v = A(1)
o = [v, v]
copy.deepcopy does the following:
dc_o = copy.deepcopy(o)
assert dc_o[0] is not o[0] # new identity from the original
assert dc_o[0] is dc_o[1] # but maintains identity within the copied tree
assert dc_o[0] == dc_o[1] # ...as well as value
But, what I need is:
r_dc_o = recursive_deepcopy(o)
assert r_dc_o[0] is not o[0] # new identity from the original
assert r_dc_o[0] is not r_dc_o[1] # also new identity from elsewhere inside copy
assert r_dc_o[0] == r_dc_o[1] # while maintaining the same value
How can I do this?
Fully automating a recursive deepcopy in a way that didn't memoize objects would be extremely dangerous -- it would mean you couldn't have any kind of objects with internal references preserved in a way that would make those references useful after the copy operation (think about objects with a "parent" link, or objects that link to a shared registry or similar resource). That said, if you really wanted to do this (and you shouldn't -- it will break a great many objects passed through the operation), you can accomplish it by constructing a memo dictionary that ignored attempts at adding keys, and passing that as a second argument to deepcopy().
So, here we are:
import copy
class baddict(dict):
    def __setitem__(self, k, v):
        pass  # silently drop every attempt to memoize a copy
class A:
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        return self.x == other.x
v = A(1)
o = [v, v]
r_dc_o = copy.deepcopy(o, baddict())
assert r_dc_o[0] is not r_dc_o[1]
assert r_dc_o[0] == r_dc_o[1]
I'd suggest thinking about why you need this behavior, and trying to come up with a better way to accomplish it. Even a baddict implementation that looked at the value and skipped memoizing only if values were instances of a specific class would be safer than what we're doing here.
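A sketch of that safer variant: a memo dict that refuses to remember copies of one chosen class only, so everything else keeps the normal deepcopy behaviour. The class name SelectiveMemo is made up for this illustration, and the approach relies on how copy.deepcopy happens to use its memo argument in CPython, so treat it as an experiment rather than a supported API.

```python
import copy


class A:
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, A) and self.x == other.x


class SelectiveMemo(dict):
    """Memo that skips recording copies of instances of `unshared`."""

    def __init__(self, unshared):
        super().__init__()
        self.unshared = unshared

    def __setitem__(self, key, value):
        # Only copies that are NOT instances of the chosen class are memoized.
        if not isinstance(value, self.unshared):
            super().__setitem__(key, value)


v = A(1)
shared_list = [0]
o = [v, v, shared_list, shared_list]
c = copy.deepcopy(o, SelectiveMemo(A))
print(c[0] is not c[1])   # True: A instances are copied separately
print(c[2] is c[3])       # True: other shared objects stay shared
```

Note that only the A instances themselves lose their shared identity here; objects they refer to may still be memoized as usual.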

Inspect python class attributes

I need a way to inspect a class so I can safely identify which attributes are user-defined class attributes. The problem is that functions like dir(), inspect.getmembers() and friends return all class attributes including the pre-defined ones like: __class__, __doc__, __dict__, __hash__. This is of course understandable, and one could argue that I could just make a list of named members to ignore, but unfortunately these pre-defined attributes are bound to change with different versions of Python, therefore making my project vulnerable to changes in the Python project - and I don't like that.
example:
>>> class A:
...     a=10
...     b=20
...     def __init__(self):
...         self.c=30
>>> dir(A)
['__doc__', '__init__', '__module__', 'a', 'b']
>>> get_user_attributes(A)
['a','b']
In the example above I want a safe way to retrieve only the user-defined class attributes ['a','b'], not 'c', as it is an instance attribute. So my question is... Can anyone help me with the above hypothetical function get_user_attributes(cls)?
I have spent some time trying to solve the problem by parsing the class at the AST level, which would be very easy. But I can't find a way to convert already parsed objects to an AST node tree. I guess all AST info is discarded once a class has been compiled into bytecode.
Below is the hard way. Here's the easy way. Don't know why it didn't occur to me sooner.
import inspect
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    return [item
            for item in inspect.getmembers(cls)
            if item[0] not in boring]
Here's a start
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    attrs = {}
    bases = reversed(inspect.getmro(cls))
    for base in bases:
        if hasattr(base, '__dict__'):
            attrs.update(base.__dict__)
        elif hasattr(base, '__slots__'):
            if hasattr(base, base.__slots__[0]):
                # We're dealing with a non-string sequence or one char string
                for item in base.__slots__:
                    attrs[item] = getattr(base, item)
            else:
                # We're dealing with a single identifier as a string
                attrs[base.__slots__] = getattr(base, base.__slots__)
    for key in boring:
        attrs.pop(key, None)  # pop with a default, in case a boring key is absent
    return attrs
This should be fairly robust. Essentially, it works by getting the attributes that are on a default subclass of object to ignore. It then gets the mro of the class that's passed to it and traverses it in reverse order so that subclass keys can overwrite superclass keys. It returns a dictionary of key-value pairs. If you want a list of key, value tuples like in inspect.getmembers then just return either attrs.items() or list(attrs.items()) in Python 3.
If you don't actually want to traverse the mro and just want attributes defined directly on the subclass then it's easier:
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    attrs = {}
    if hasattr(cls, '__dict__'):
        attrs = cls.__dict__.copy()
    elif hasattr(cls, '__slots__'):
        if hasattr(cls, cls.__slots__[0]):
            # We're dealing with a non-string sequence or one char string
            for item in cls.__slots__:
                attrs[item] = getattr(cls, item)
        else:
            # We're dealing with a single identifier as a string
            attrs[cls.__slots__] = getattr(cls, cls.__slots__)
    for key in boring:
        attrs.pop(key, None)  # not every boring key is defined directly on cls
    return attrs
Double underscores on both ends of 'special attributes' have been a part of Python since before 2.0. It is very unlikely that this will change any time in the near future.
class Foo(object):
    a = 1
    b = 2
def get_attrs(klass):
    return [k for k in klass.__dict__.keys()
            if not k.startswith('__')
            and not k.endswith('__')]
print get_attrs(Foo)
['a', 'b']
Thanks aaronasterling, you gave me the expression I needed :-)
My final class attribute inspector function looks like this:
def get_user_attributes(cls,exclude_methods=True):
    base_attrs = dir(type('dummy', (object,), {}))
    this_cls_attrs = dir(cls)
    res = []
    for attr in this_cls_attrs:
        if base_attrs.count(attr) or (callable(getattr(cls,attr)) and exclude_methods):
            continue
        res += [attr]
    return res
It either returns class attribute variables only (exclude_methods=True) or also retrieves the methods.
My initial tests of the above function show that it supports both old-style and new-style Python classes.
/ Jakob
If you use new style classes, could you simply subtract the attributes of the parent class?
class A(object):
    a = 10
    b = 20
    #...
def get_attrs(Foo):
    return [k for k in dir(Foo) if k not in dir(super(Foo))]
Edit: Not quite. __dict__,__module__ and __weakref__ appear when inheriting from object, but aren't there in object itself. You could special case these--I doubt they'd change very often.
Sorry for necro-bumping the thread. I'm surprised that there's still no simple function (or a library) to handle such common usage as of 2019.
I'd like to thank aaronasterling for the idea. Actually, set container provides a more straightforward way to express it:
class dummy: pass
def abridged_set_of_user_attributes(obj):
    return set(dir(obj))-set(dir(dummy))
def abridged_list_of_user_attributes(obj):
    return list(abridged_set_of_user_attributes(obj))
The original solution using a list comprehension actually involves two levels of looping, because the in membership test inside it scans the whole list on every iteration; having only one for keyword makes it look like less work than it is.
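A runnable Python 3 sketch of the set-difference idea, combined with the option to drop methods from the earlier answer. The names _Baseline, user_class_attributes and include_callables are made up for this illustration. The baseline is deliberately created with a class statement, on the assumption that it then carries the same automatically added attributes as any other class defined that way.

```python
class _Baseline:
    """Empty class: whatever it has, every class has."""


def user_class_attributes(cls, include_callables=False):
    # Names present on cls but not on an empty class are user-defined.
    names = set(dir(cls)) - set(dir(_Baseline))
    if not include_callables:
        names = {n for n in names if not callable(getattr(cls, n))}
    return sorted(names)


class A:
    a = 10
    b = 20

    def __init__(self):
        self.c = 30   # instance attribute: never appears in dir(A)

    def method(self):
        pass


print(user_class_attributes(A))                          # ['a', 'b']
print(user_class_attributes(A, include_callables=True))  # ['a', 'b', 'method']
```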
This worked for me to include user-defined attributes with __ that might be found in cls.__dict__
import inspect
class A:
    __a = True
    def __init__(self, _a, b, c):
        self._a = _a
        self.b = b
        self.c = c
    def test(self):
        return False
cls = A(1, 2, 3)
members = inspect.getmembers(cls, predicate=lambda x: not inspect.ismethod(x))
attrs = set(dict(members).keys()).intersection(set(cls.__dict__.keys()))
__attrs = {m[0] for m in members if m[0].startswith(f'_{cls.__class__.__name__}')}
attrs.update(__attrs)
This will correctly yield: {'_A__a', '_a', 'b', 'c'}
You can post-process the names to strip the cls.__class__.__name__ prefix if you wish.
