I was wondering what would be the best way, performance-wise, to pass shared arguments to threads (e.g. an input Queue).
I used to pass them as arguments to the __init__ function, because that's what I saw in most of the examples on the internet.
But I was wondering whether it would be faster to set them as class variables. Is there a reason not to do that?
Here is what I mean:
import threading

class Worker(threading.Thread):
    def __init__(self, in_q):
        super(Worker, self).__init__()
        self.in_q = in_q
or:
import threading
import Queue

class Worker(threading.Thread):
    in_q = None
    def __init__(self):
        ...
    ...

def main():
    Worker.in_q = Queue.Queue()
Class attributes are sometimes called "static" for a reason: they are part of the static model structure and say something about the class itself, whereas instance attributes say something about an object at runtime. A shared input queue is runtime state, so the static approach does not fit your case.
For example, at some point you may want to have two separate groups of workers running in parallel but consuming different queues. The design with static attributes will prevent you from doing that. Basically, a class attribute here is a slightly disguised global variable, with the same drawbacks (implicit coupling, encapsulation leakage, etc.), as the sketch below shows.
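Here is a minimal sketch of the instance-attribute design running two worker groups side by side (Python 2, matching the question's Queue.Queue; the run body is illustrative):
import threading
import Queue

class Worker(threading.Thread):
    def __init__(self, in_q):
        super(Worker, self).__init__()
        self.in_q = in_q

    def run(self):
        while True:
            item = self.in_q.get()
            # ... process item ...
            self.in_q.task_done()

# Two independent worker groups with separate queues -- impossible
# if in_q were a class attribute on Worker:
fast_q, slow_q = Queue.Queue(), Queue.Queue()
fast_workers = [Worker(fast_q) for _ in range(4)]
slow_workers = [Worker(slow_q) for _ in range(2)]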
Overview
I need to duplicate a whole inheritance tree of classes. Simply deep-copying the class objects does not work; a proper factory pattern involves a huge amount of code changes; I'm not sure how to use metaclasses to accomplish this.
Background
The software I work on implements support for specialized external hardware, connected to the host computer via USB. Many years ago, it was assumed that there would only ever be one type of hardware in use at a time; consequently, the hardware object is used as a singleton. Over the years, secondary classes were configured based on the currently active hardware class.
At the moment, it is impossible to use this library with two types of hardware simultaneously, since the classobjects cannot be configured for both at the same time.
In recent years, we have avoided this issue by creating one Python process per hardware, but this is becoming untenable.
Here is an extremely simplified example of the architecture:
# ----------
# Hardware classes
class HwBase():
    def customizeComponent(self, compDict):
        compDict['ComponentBase'].hardware = self

class HwA(HwBase):
    def customizeComponent(self, compDict):
        super().customizeComponent(compDict)
        compDict['AnotherComponent'].prop.configure(1, 2, 3)

class HwB(HwBase):
    def customizeComponent(self, compDict):
        super().customizeComponent(compDict)
        compDict['AnotherComponent'].prop.configure(4, 5, 6)
# ----------
# Property classes
class SpecialProperty(property):
    def __init__(self, fvalidate):
        self.fvalidate = fvalidate
        # handle fset, fget, etc. here.
        # super().__init__()
# ----------
# Component classes
class ComponentBase():
    hardware = None

    def validateProp(self, val):
        return val < self.maxVal
    prop = SpecialProperty(fvalidate=validateProp)

class SomeComponent(ComponentBase):
    """Users directly instantiate and use this component via an interactive shell.
    This component does complex operations with the hardware attribute."""
    def validateThing(self, val):
        return isinstance(val, ComponentBase)
    thing = SpecialProperty(fvalidate=validateThing)

class AnotherComponent(ComponentBase):
    """Users directly instantiate and use this component via an interactive shell.
    This component does complex operations with the hardware attribute."""
    maxVal = 15
# ----------
# Initialization
def initialize():
    """This is only called once per Python instance."""
    #activeCls = HwA
    activeCls = HwB
    allComponents = {
        'ComponentBase': ComponentBase,
        'SomeComponent': SomeComponent,
        'AnotherComponent': AnotherComponent
    }
    hwInstance = activeCls()
    hwInstance.customizeComponent(allComponents)
    return allComponents
components = initialize()
# ----------
# User code goes here
someInstance1 = components['SomeComponent']()
someInstance2 = components['SomeComponent']()
someInstance1.prop = 10
someInstance2.prop = 10
The overarching goal would be to interact with both HwA and HwB at the same time. Since most interactions are done via components instead of the Hw objects themselves, I believe the solution involves having multiple versions of the components, e.g.: two separate inheritance trees, for a total of 6 final components, one tree/set configured for each hardware. This is what I need help with.
Potential solutions
Consider that I have tens of different hardware types to configure for. Furthermore, there are hundreds of different leaf component classes, with many extra base and mixin classes.
Move all configuration steps into the component's __init__ method
Not possible due to the use of properties; these need to be set on the class.
Deepcopy the classobjects
Copy all classobjects, swap in the appropriate __bases__. Mutable class variables need to be handled carefully. However, I'm not sure how to deal with the properties, since class-body references inside the property objects (such as fvalidate) need to be updated to those of the copied class.
This requires a significant amount of manual intervention to work. Not impossible, but prone to breaking in the long term.
Factory pattern
Wrap all component definitions in a factory function:
def SomeComponentFactory(hw):
    class SomeComponent(cache[hw].ComponentBase):
        pass
    return SomeComponent
and have some sort of component cache which would handle creating all classobjects during initialize()
This is what I consider the most architecturally correct option available. Since the class body is re-executed on every factory call, the attributes of the properties will reference the appropriate class object.
Downside: huge code footprint. I am familiar with doing codebase-wide changes via sed or python scripts, but this would be quite a lot.
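To make the idea concrete, here is a hedged sketch of such a cache-driven factory for the simplified example above (the cache name and structure are my assumptions, and it relies on SpecialProperty.configure from the original code):
component_cache = {}

def make_components(hw_cls):
    """Build a fresh, independent component tree for one hardware class."""
    class ComponentBase():
        hardware = None
        def validateProp(self, val):
            return val < self.maxVal
        prop = SpecialProperty(fvalidate=validateProp)

    class AnotherComponent(ComponentBase):
        maxVal = 15

    components = {'ComponentBase': ComponentBase,
                  'AnotherComponent': AnotherComponent}
    hw_cls().customizeComponent(components)
    component_cache[hw_cls] = components
    return components

# Each hardware type gets its own, independently configured tree:
components_a = make_components(HwA)
components_b = make_components(HwB)
assert components_a['ComponentBase'] is not components_b['ComponentBase']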
Add metaclasses on components
I am not sure how to proceed for this. Based on the python data model (py3.7), the following happens at class creation (which happens right after the class definition indentation ends):
1. MRO entries are resolved;
2. the appropriate metaclass is determined;
3. the class namespace is prepared;
4. the class body is executed;
5. the class object is created.
I would need to redo these steps after the class has been defined (like a factory function!), but I'm not sure how to redo step 4. Specifically, the Python documentation states in section 3.3.3.5 that the class body is executed as with a "special" form of the exec() builtin. How can I re-exec the class body with a different set of locals/globals? Even if I access the class body's code with inspect shenanigans, I'm not sure I'll be able to reproduce the module environment properly.
Even if I mess with __prepare__ and __new__, I don't see how I can fix the cross-references introduced in the class code block regarding the property instantiation.
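For what it's worth, the standard library does expose these creation steps through types.new_class, which reruns steps 1-5 with a callable standing in for the class body. It does not solve the hard part identified above (recovering the original body's source), but a hand-written body callback shows the mechanism (a sketch, reusing SpecialProperty from the example):
import types

def body(ns):
    # Runs in place of the original class body (step 4); ns is the
    # prepared class namespace (step 3).
    def validateProp(self, val):
        return val < self.maxVal
    ns['validateProp'] = validateProp
    ns['prop'] = SpecialProperty(fvalidate=validateProp)

# Each call reruns the creation steps, so each class gets fresh properties:
ComponentBaseA = types.new_class('ComponentBase', (), exec_body=body)
ComponentBaseB = types.new_class('ComponentBase', (), exec_body=body)
assert ComponentBaseA.prop is not ComponentBaseB.prop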
Components as metaclasses
A metaclass is a class factory, just like a class is an object factory. SomeComponent and AnotherComponent could be declared as metaclasses, then get instantiated with the Hw object during initialize():
SomeComponent = SomeComponentMeta(hw)
This is similar to the factory pattern, but would also require quite a few code changes: a lot of class code would have to be moved to the metaclass __init__.
I'd have to spend a lot more time here to properly understand what you need, but if your "TL;DR" of executing the class body with different globals/nonlocals is the bottom line, the factory approach is a very clean and readable way, as you had considered.
At first glance, I don't think a metaclass would be a good approach here - although it could be used to customize your special properties (on my first read, I could not figure out what they actually do, and how they should differ between your final classes). If a function used as a class factory can specialize your properties, it would work nonetheless.
If what you need is for the properties to be independent for HwA and HwB - as in accessing a different list object in HwA than is accessed in HwB - then yes, a metaclass could take care of that by automatically recreating any properties when creating a subclass (so that the property objects themselves are not shared with the super-classes and across the hierarchy).
If that is what you need, leave a comment; I can write some proof-of-concept code.
Anyway, it is possible to create a metaclass that, upon creating a subclass, will look through the hierarchy for all SpecialProperty instances and create new instances of those for the subclass - so that a base value set on a superclass remains valid for the subclasses, but when configuration runs, each class has an independent configuration. (As it turns out, no metaclass is needed: we are covered by __init_subclass__.)
Another thing to take care of is that subclasses of property cannot simply be copied with Python's copy.copy (tested empirically), so we need a way to create reliable copies of those. I include one function below, but it might need to be improved to work with the actual SpecialProperty class.
from copy import copy

def copy_property(prop):
    cls = prop.__class__
    new_prop = cls.__new__(cls)
    # Initialize the attributes that can't be set from Python code, in place:
    property.__init__(new_prop, prop.fget, prop.fset, prop.fdel)
    if hasattr(prop, "__dict__"):  # only exists for subclasses of property
        # Possible adaptation needed: it may be that for some attributes of
        # SpecialProperty, a deepcopy would be needed.
        # But for the given example attribute of "fvalidate" a simple copy is better:
        new_prop.__dict__ = copy(prop.__dict__)
    return new_prop
# Python 3.6 introduced `__init_subclass__` which is called at subclass _creation_
# time. With it, the logic can be inserted in ComponentBase and there is no need for
# a metaclass.
class ComponentBase():
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for attrname in dir(cls):
            attr = getattr(cls, attrname)
            if not isinstance(attr, SpecialProperty):
                continue
            new_prop = copy_property(attr)
            setattr(cls, attrname, new_prop)

    hardware = None
    ...
As you see, there are some workarounds that had to be done because your project opted for subclassing property. I am leaving this remark here as a reminder that unless property fits one's exact needs, it is cleaner to write a new class implementing the descriptor protocol - just by implementing __set__, __get__ and __delete__ directly.
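For illustration, a minimal sketch of such a descriptor (the class name, storage scheme and failure behaviour are my choices, not from the question):
class ValidatedAttribute:
    """Plain descriptor replacing the property subclass (sketch)."""
    def __init__(self, fvalidate):
        self.fvalidate = fvalidate

    def __set_name__(self, owner, name):
        self.name = name  # called automatically at class creation (3.6+)

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not self.fvalidate(instance, value):
            raise ValueError("invalid value for {}: {!r}".format(self.name, value))
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        del instance.__dict__[self.name]
Instances of a plain class like this also copy cleanly with copy.copy, which is exactly what the property subclass made difficult.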
I am writing some code that is an upside-down triangle of inheritance. I have a base Linux class that has a CLIENT attribute which holds a connection. I have several logically separated APIs (kvm, yum, gdb, dhcp, etc.) that use CLIENT. I only want the user to create a single instance of the Linux class, but still be able to call all the methods from the parent classes, while maintaining the nice logical code separation among the parents:
class Linux(
SSHClient,
yum.Yum,
kvm.Kvm,
ldap.Ldap,
lshw.Lshw,
packet_capture.Tcpdump,
tc.TrafficControl,
networking.Networking,
gdb.Gdb,
dhcp.Dhcp,
httputil.Http,
scp.Scp,
fileutils.FileUtils):
I made a little example:
class Dad(object):
    def __init__(self):
        raise NotImplementedError("Create Baby instead")

    def dadCallBaby(self):
        print('sup {}'.format(self.babyName))

class Mom(object):
    def __init__(self):
        raise NotImplementedError("Create Baby instead")

    def momCallBaby(self):
        print('goochi goo {}'.format(self.babyName))

class Baby(Mom, Dad):
    def __init__(self, name):
        self.babyName = name

    def greeting(self):
        self.momCallBaby()
        self.dadCallBaby()

x = Baby('Joe')
x.greeting()
What is this pattern called? Is this duck typing? And is there a better option?
There's really no such thing as "child-only attributes".
The attribute babyName is just stored in each object's namespace, and looked up there. Python doesn't care that it happened to be stored by Baby.__init__. And in fact, you can store the same attribute on a Mom that isn't a Baby and it will work the same way:
class NotABaby(Mom):
    def __init__(self): pass

mom = NotABaby()
mom.babyName = 'Me?'
mom.momCallBaby()
Also, it's hard to suggest a better way to do what you're doing, because what you're doing is inherently confusing and probably shouldn't be done.
Inheritance normally means subtyping—that is, Baby should only be a subclass of Mom if every Baby instance is usable as a Mom.1
But a baby is not a mom and a dad.2 A baby has a mom and a dad. And the way to represent that is by giving Baby attributes for its mom and dad:
class Baby(object):
    def __init__(self, mom, dad, name):
        self.mom, self.dad, self.name = mom, dad, name

    def greeting(self):
        self.mom.momCallBaby(self.name)
        self.dad.dadCallBaby(self.name)
Notice that, e.g., this means that the same woman can be the mom of two babies. Since that's also true of the real-life thing you're modeling here, that's a sign that you're modeling things correctly.
Your "real" example is a little less clear, but I suspect the same thing is going on there.
The only reason you want to use inheritance, as far as I can tell, is:
I only want the user to need to create a single instance of Linux class
You don't need, or want, inheritance for that:
class Linux(object):
    def __init__(self):
        self.ssh_client = SSHClient()
        self.yum = yum.Yum()
        # etc.
… but be able to call all the methods from the Parent classes
If yum.Yum, ldap.Ldap and dhcp.Dhcp all have methods named lookup, which one would be called by Linux.lookup?
What you probably want is to just leave the attributes as public attributes, and use them explicitly:
system = Linux()
print(system.yum.lookup(package))
print(system.ldap.lookup(name))
print(system.dhcp.lookup(reservation))
Or you'll want to provide a "Linux API" that wraps all the underlying APIs:
def lookup_package(self, package):
    return self.yum.lookup(package)

def lookup_ldap_name(self, name):
    return self.ldap.lookup(name)

def lookup_reservation(self, reservation):
    return self.dhcp.lookup(reservation)
If you really do want to just forward every method of all of your different components, and you're sure that none of them conflict with each other, and there are way too many to write out manually, you can always do it programmatically, by iterating all of the classes, iterating inspect.getmembers of each one, filtering out the ones that start with _ or aren't unbound methods, creating a proxy function, and setattr-ing it onto Linux.
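A hedged sketch of that programmatic route (Python 3, assuming the composition-based Linux from above and that no public method names collide):
import inspect

def add_proxies(target_cls, attr_name, component_cls):
    """Attach forwarding methods for component_cls's public methods."""
    for name, _ in inspect.getmembers(component_cls, inspect.isfunction):
        if name.startswith('_') or hasattr(target_cls, name):
            continue  # skip private names and anything that would conflict
        def proxy(self, *args, _name=name, _attr=attr_name, **kwargs):
            return getattr(getattr(self, _attr), _name)(*args, **kwargs)
        proxy.__name__ = name
        setattr(target_cls, name, proxy)

add_proxies(Linux, 'yum', yum.Yum)
add_proxies(Linux, 'ldap', ldap.Ldap)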
Or, alternatively (probably not as good an idea in this case, but very commonly useful in cases that aren't that different), you can proxy dynamically, at method lookup time, by implementing a __getattr__ method (and, often, a __dir__ method).
I think one of these two kinds of proxying may be what you're really after here.
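And a sketch of the dynamic variant (again assuming the component attributes from the earlier example):
class Linux(object):
    def __init__(self):
        # order defines lookup priority if names ever overlap
        self._components = [yum.Yum(), ldap.Ldap(), dhcp.Dhcp()]

    def __getattr__(self, name):
        # only called when normal attribute lookup fails
        for component in self._components:
            if hasattr(component, name):
                return getattr(component, name)
        raise AttributeError(name)

    def __dir__(self):
        names = set(super().__dir__())
        for component in self._components:
            names.update(n for n in dir(component) if not n.startswith('_'))
        return sorted(names)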
1. There are some cases where you want to inherit for reasons other than subtyping. For example, you inherit a mixin class to get implementations for a bunch of methods. The question of whether your class is usable wherever that mixin's instances are usable doesn't really make sense, because the mixin isn't usable anywhere (except as a base class). But the subtyping is still the standard that you're bending there.
2. If it is, call Child Protective Services. And also call Professor X, because that shouldn't be physically possible.
I have a library with one parent and a dozen children:
# mylib1.py:
#
class Foo(object):
    def __init__(self, a):
        self.a = a

class FooChild(Foo):
    def __init__(self, a, b):
        super(FooChild, self).__init__(a)
        self.b = b
# more children here...
Now I want to extend that library with a simple (but a bit specific, for use in another approach) method. So I would like to change the parent class and use its children.
# mylib2.py:
#
import mylib1
def fooMethod(self):
    print 'a={}, b={}'.format(self.a, self.b)

setattr(mylib1.Foo, 'fooMethod', fooMethod)
And now I can use it like this:
# way ONE:
import mylib2
fc = mylib2.mylib1.FooChild(3, 4)
fc.fooMethod()
or like this:
# way TWO:
# order doesn't matter here:
import mylib1
import mylib2
fc = mylib1.FooChild(3, 4)
fc.fooMethod()
So, my questions are:
Is this a good thing?
How should this be done in a better way?
A common approach is to use a mixin, as sketched below.
If you want, you can even add mixins dynamically; see "How do I dynamically add mixins as base classes without getting MRO errors?".
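For example, a mixin-based mylib2 might look like this (a sketch; the mixin name is made up):
# mylib2.py, mixin version:
import mylib1

class FooMethodMixin(object):
    def fooMethod(self):
        print('a={}, b={}'.format(self.a, self.b))

class FooChild(FooMethodMixin, mylib1.FooChild):
    """mylib1.FooChild plus the extra method, without monkey patching."""
    pass

fc = FooChild(3, 4)
fc.fooMethod()  # a=3, b=4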
There is a general rule in programming that you should avoid depending on global state. In other words, your globals should, if possible, be constant. Classes are (mostly) globals.
Your approach is called monkey patching, and unless you have a really, really good reason for it, you should avoid it, because it violates the above rule.
Imagine you have two separate modules and both of them use this approach. One of them sets Foo.fooMethod to some method, the other to another. Then you somehow switch control between these modules. The result is that it becomes hard to determine which fooMethod is used where, which means hard-to-debug problems.
There are people (e.g. Brandon Craig Rhodes) who believe that patching is bad even in tests.
What I would suggest is to set an attribute when instantiating instances of your Foo() class (and its children) that controls the behaviour of fooMethod. Then the behaviour of the method depends on how you instantiated the object, not on global state.
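A minimal sketch of that idea (the flag name and the two behaviours are invented for illustration):
class Foo(object):
    def __init__(self, a, foo_style='plain'):
        self.a = a
        self.foo_style = foo_style  # chosen per instance, not globally

    def fooMethod(self):
        if self.foo_style == 'verbose':
            print('Foo instance with a={}'.format(self.a))
        else:
            print('a={}'.format(self.a))

f = Foo(3, foo_style='verbose')
f.fooMethod()  # behaviour decided at instantiation, no global patching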
I am trying to implement thread-safe code but have run into a simple problem. I searched and could not find a solution.
Let me show some abstract code to describe the problem:
import threading
class A(object):
    sharedLock = threading.Lock()
    shared = 0

    @classmethod
    def getIncremented(cls):
        with cls.sharedLock:
            cls.shared += 1
            return cls.shared

class B(A):
    pass

class C(A):
    @classmethod
    def getIncremented(cls):
        with cls.sharedLock:
            cls.shared += B.getIncremented()
            return cls.shared
I want to define class A so that it can be inherited by many child classes, for example for enumerations or lazy variables; the specific use doesn't matter. I have already finished the single-threaded version and now want to update it for multiple threads.
This code gives the following results, as it should:
id(A.sharedLock) 11694384
id(B.sharedLock) 11694384
id(C.sharedLock) 11694384
This means that the lock in class A is the same lock as in class B, which is bad: the first entry into class B will also lock class A and class C. If C then uses B, that leads to deadlock.
I could use RLock, but that feels like an invalid programming pattern, and I am not sure it would not produce a more serious deadlock.
How can I change the sharedLock value during class initialization to a new lock, so that id(A.sharedLock) != id(B.sharedLock), and likewise for A and C, and for B and C?
How can I hook class initialization in Python, generically, to change some class variables?
This question is not too complex, but I do not know what to do with it.
I want to inherit the parent's shared variables, but not the parent's shared locks.
You must not do this. It makes access to the shared variables not thread-safe.
sharedLock protects the shared variable. If the same shared variable can be modified in a recursive call, then you need RLock(). Here "shared" means shared among all subclasses.
It looks like you want a standalone function (or a static method) instead of the classmethod:
from threading import Lock

def getIncremented(_lock=Lock(), _shared=[0]):
    with _lock:
        _shared[0] += 1
        return _shared[0]
Thus all classes use the same shared variable (and the corresponding lock).
If you want each class to have its own shared variable (here shared means shared among instances of this particular class), then don't use cls.shared, which may traverse the ancestor classes to find it.
To hint that subclasses shouldn't use a variable directly, you could use the syntax for a private variable:
class A:
    __shared = 0
    __lock = Lock()
Due to name mangling, __shared inside A's code always refers to _A__shared; if a subclass's overriding method mentions __shared, it gets its own mangled name instead of touching A's by accident.
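A hedged sketch of that layout, where each class carries its own mangled counter and lock (the explicit class references avoid picking up an ancestor's state through cls):
from threading import Lock

class A(object):
    __shared = 0          # mangled to _A__shared
    __lock = Lock()       # mangled to _A__lock

    @classmethod
    def getIncremented(cls):
        with A.__lock:
            A.__shared += 1
            return A.__shared

class B(A):
    __shared = 0          # mangled to _B__shared: independent of A's
    __lock = Lock()

    @classmethod
    def getIncremented(cls):
        with B.__lock:
            B.__shared += 1
            return B.__shared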
As you noticed, if you expose shared locks as class attributes, the locks are shared by subclasses.
You could hack around this by redefining the lock on each subclass:
class B(A):
    sharedLock = threading.Lock()
You could even use metaclasses to achieve this (please don't). It seems to me that you're approaching the program from the wrong angle.
This task is easier if you assign locks explicitly to instances (not classes).
import threading

class A(object):
    def __init__(self, lock):
        self.sharedLock = lock

my_lock = threading.Lock()
a = A(my_lock)
Of course, you run into the "problem" of having to explicitly pass the lock for each instance. This is traditionally solved using a factory pattern, but in Python you can simply use functions properly:
from functools import partial

A_with_mylock = partial(A, my_lock)
a2 = A_with_mylock()
Here is a solution: this allows a separate lock per class, since it is done at the class-construction level (a metaclass). Thank you for all the hints that helped me achieve this code; it looks very nice.
It could also be a mangled variable, but that requires hardcoding '_A__lock', which can be problematic; I have not tested it.
import threading
class MetaA(type):
    def __new__(cls, name, bases, clsDict):
        # change <type> behavior: give every new class its own lock
        clsDict['_lock'] = threading.Lock()
        return super(MetaA, cls).__new__(cls, name, bases, clsDict)

class A(object):
    __metaclass__ = MetaA

    @classmethod
    def getLock(cls):
        return cls._lock

class B(A):
    pass

print 'id(A.getLock())', id(A.getLock())
print 'id(B.getLock())', id(B.getLock())
print A.getLock() == B.getLock()
I have the sense that this must be kind of a dumb question (newbie here), so I'm open to an answer of the sort "This is ass-backwards, don't do it, please try this: [proper way]".
I'm using Python 2.7.5.
General Form of the Problem
This causes infinite recursion unless Thesaurus (an app-wide singleton) skips calling Baseclass.__init__():
class Baseclass():
    def __init__(self):
        thes = Thesaurus()
        #do stuff

class Thesaurus(Baseclass):
    def __init__(self):
        Baseclass.__init__(self)
        #do stuff
My Specific Case
I have a base class that virtually every other class in my app extends (just some basic conventions for functionality within the app; perhaps it should just be an interface). This base class is meant to house a singleton of a Thesaurus class that grants some flexibility with user input by inferring some synonyms (i.e. {'yes': 'yep', 'ok'}).
But since the subclass calls the superclass's __init__(), which in turn creates another instance of the subclass, loops ensue. Not calling the superclass's __init__() works just fine, but I'm concerned that's merely a lucky coincidence, and that my Thesaurus class may eventually be modified to require its parent's __init__().
Advice?
Well, I'll stop looking at your code and just base my answer on what you say:
I have a base class that virtually every other class in my app extends (just some basic conventions for functionality within the app; perhaps should just be an interface).
this would be ThesaurusBase in the code below
This base class is meant to house a singleton of a Thesaurus class that grants some flexibility with user input by inferring some synonyms (ie. {'yes':'yep', 'ok'}).
That would be ThesaurusSingleton, which you could call by a better name and make actually useful.
class ThesaurusBase():
    def __init__(self, singleton=None):
        self.singleton = singleton

    def mymethod1(self):
        raise NotImplementedError

    def mymethod2(self):
        raise NotImplementedError

class ThesaurusSingleton(ThesaurusBase):
    def mymethod1(self):
        return "meaw!"

class Thesaurus(ThesaurusBase):
    def __init__(self, singleton=None):
        ThesaurusBase.__init__(self, singleton)

    def mymethod1(self):
        return "quack!"

    def mymethod2(self):
        return "\\_o<"
now you can create your objects as follows:
singleton = ThesaurusSingleton()
thesaurus = Thesaurus(singleton)
edit:
Basically, what I've done here is build a "Base" class that is just an interface defining an expected behavior for all its children classes. The class ThesaurusSingleton (I know that's a terrible name) also implements that interface, because you said it had to, and I did not want to discuss your design; you may always have good reasons for weird constraints.
And finally, do you really need to instantiate your singleton inside the class that is defining the singleton object? Though there may be some hackish way to do so, there's often a better design that avoids the "hackish" part.
What I think is that however you create your singleton, you should better do it explicitly. That's in the "Zen of python": explicit is better than implicit. Why? because then people reading your code (and that might be you in six months) will be able to understand what's happening and what you were thinking when you wrote that code. If you try to make things more implicit (like using sophisticated meta classes and weird self-inheritance) you may wonder what this code does in less than three weeks!
I'm not telling you to avoid those kinds of options, just to only use sophisticated stuff when you're out of simple ones!
Based on what you said I think the solution I gave can be a starting point. But as you focus on some obscure, yet not very useful hackish stuff instead of talking about your design, I can't be sure if my example is that appropriate, and hint you on the design.
edit2:
There's another way to achieve what you say you want (but be sure that's really the design you want). You may want to use a class method that acts on the class itself (instead of its instances) and thus enables you to store a class-wide instance of itself:
>>> class ThesaurusBase:
...     @classmethod
...     def initClassWide(cls):
...         cls._shared = cls()
...
>>> class T(ThesaurusBase):
...     def foo(self):
...         print self._shared
...
>>> ThesaurusBase.initClassWide()
>>> t = T()
>>> t.foo()
<__main__.ThesaurusBase instance at 0x7ff299a7def0>
and you can call the initClassWide method at the module level where you declare ThesaurusBase, so whenever you import that module, it will have the singleton loaded (the import mechanism ensures that Python modules are run only once).
The short answer is:
do not instantiate an instance of a subclass from the superclass constructor
The longer answer:
If the motive for trying this is the fact that Thesaurus is a singleton, then you'll be better off exposing the singleton through a static method on the Thesaurus class and calling that method whenever you need the singleton, as sketched below.
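A minimal sketch of that (assuming Thesaurus no longer inherits from Baseclass, which is what breaks the cycle):
class Thesaurus(object):
    _instance = None  # the app-wide singleton, created lazily

    @staticmethod
    def instance():
        if Thesaurus._instance is None:
            Thesaurus._instance = Thesaurus()
        return Thesaurus._instance

class Baseclass(object):
    def __init__(self):
        self.thes = Thesaurus.instance()  # look up, never construct directly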