I have a series of functions that serve to classify data. Each function is passed the same input. The goal of this system is to be able to drop in new classification functions at will without having to adjust anything.
To do this, I make use of a classes_in_module function lifted from here. Then, every classifier in one Python file will be run on each input.
However, I am finding that implementing the classifier as either a class or a function is kludgy. Classes mean instantiating and executing, while functions lack clean introspection to allow me to query the name or use inheritance to define common values.
Here is an example. First, the class implementation:
class AbstractClassifier(object):
    @property
    def name(self):
        return self.__class__.__name__

class ClassifierA(AbstractClassifier):
    def __init__(self, data):
        self.data = data

    def run(self):
        return 1
This can then be used in this fashion, assuming that classifier_list is the output of classes_in_module on a file containing ClassifierA among others:
result = []
for classifier in classifier_list:
    c = classifier(data)
    result.append(c.run())
However, this seems a bit silly. This class is obviously static, and doesn't really need to maintain its own state, as it is used once and discarded. The classifier is really a function, but then I lose the ability to have a shared name property -- I would have to use the ugly introspection technique sys._getframe().f_code.co_name and replicate that code for each classifier function. And any other shared properties between classifiers would also be lost.
What do you think? Should I just accept this mis-use of classes? Or is there a better way?
Functions can have member data. You can also find the name of a function using its __name__ attribute (func_name is the Python 2-only spelling of the same thing).
def classifier(data):
    return 1

classifier.name = classifier.__name__
print(classifier.name)  # classifier
If you want multiple functions to behave the same way, you can use a decorator.
function_tracker = []

def add_attributes(function):
    # __name__ works on Python 2 and 3; func_name exists only on Python 2.
    function.name = function.__name__
    function.id = len(function_tracker)
    function_tracker.append(function)
    return function

@add_attributes
def classifier(data):
    return 1

print(classifier.name, classifier.id)  # classifier 0
Would this work to avoid classes in your specific case?
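As a rough sketch of how the decorator approach could drive the whole system, the following registers two classifiers and runs each one on the same input, collecting the results by name. The names `register`, `is_long`, `has_digits` and `classify` are made up for the example and do not come from the question.

```python
# Registry of classifier functions, filled in by the decorator below.
classifiers = []

def register(function):
    """Attach a name attribute and add the function to the registry."""
    function.name = function.__name__
    classifiers.append(function)
    return function

# Hypothetical classifiers, for illustration only.
@register
def is_long(data):
    return len(data) > 5

@register
def has_digits(data):
    return any(ch.isdigit() for ch in data)

def classify(data):
    """Run every registered classifier on the same input."""
    return {f.name: f(data) for f in classifiers}

results = classify("abc123x")
print(results)
```

Adding a new classifier then only means writing one more decorated function; nothing else has to change.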
If you don't need several instances of the class (and it seems you don't) make one instance of the class and change the run to __call__:
class AbstractClassifier(object):
    @property
    def name(self):
        return self.__class__.__name__

class ClassifierA(AbstractClassifier):
    def __call__(self, data):
        return 1

ClassifierA = ClassifierA()  # see below for alternatives
and then in your other code:
result = []
for classifier in classifier_list:
    result.append(classifier(data))
Instead of having ClassifierA = ClassifierA() (which isn't very elegant), one could do:
classifier_list = [c() for c in (ClassifierA, ClassifierB, ...)]
This method allows you to keep your classes handy should you need to create more instances of them; if you don't ever need to have more than one instance you could use a decorator to IAYG (instantiate as you go ;) :
def instantiate(cls):
    return cls()

@instantiate
class ClassifierZ(object):
    def __call__(self, data):
        return some_classification
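A self-contained sketch of the instantiate decorator in use, assuming a trivial AbstractClassifier base like the one shown earlier. The even/odd logic in `ClassifierZ` is only a placeholder for real classification code.

```python
def instantiate(cls):
    """Class decorator: replace the class with a single instance of it."""
    return cls()

class AbstractClassifier(object):
    @property
    def name(self):
        return self.__class__.__name__

@instantiate
class ClassifierZ(AbstractClassifier):
    def __call__(self, data):
        # Placeholder logic, for illustration only.
        return "even" if len(data) % 2 == 0 else "odd"

# ClassifierZ is now an instance, so it is called like a function,
# while the inherited name property still works.
print(ClassifierZ.name)     # ClassifierZ
print(ClassifierZ("abcd"))  # even
```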
To use a class instance as a function:
class ClassifierA(AbstractClassifier):
    def __init__(self, data):
        self.data = data

    def __call__(self):
        return 1

result = []
for classifier in classifier_list:
    c = classifier(data)
    result.append(c())
Or to just use functions:
classifier_list = []

def my_decorator(func):
    classifier_list.append(func)
    return func

@my_decorator
def classifier_a(data):
    return 1

result = []
for classifier in classifier_list:
    c = classifier(data)
    result.append(c)
Related
I have a class that contains a list of "features" wherein each feature is the name of a function found within that class. It's a way of controlling which features are added to a timeseries dataframe dynamically. I have a function in another class which is meant to take an existing feature and retrieve its future state. So based on the way the current structure is set up, I need to dynamically add a function by appending "_after" to the end of its name.
I've got my decorator set up so far that the features list is updated with the new function name, but I don't know how to declare an additional function from within the decorator. Ultimately I don't even need to wrap the original function, I just need to create a new one using the naming convention of the old one.
In this contrived example, Dog should in the end have two functions: bark() and bark_after(). The Features list should contain ['bark', 'bark_after']. I'd also like to not have to pass Features in explicitly to the decorator.
class Dog:
    Features = ['bark']

    def __init__(self, name):
        self.name = name

    def gets_future_value(*args):
        def decorator(function):
            new_function = f'{function.__name__}_after'
            args[0].append(new_function)
            return function
        return decorator

    @gets_future_value(Features)
    def bark(self):
        print('Bark!')

d = Dog('Mr. Barkington')
print(d.Features)
d.bark()
d.bark_after()
I think what you need is a class decorator:
import inspect

def add_after_methods(cls):
    features = set(cls.Features)  # For fast membership testing.
    isfunction = inspect.isfunction

    def isfeature(member):
        """Return True if the member is a Python function and a class feature."""
        return isfunction(member) and member.__name__ in features

    def make_after_func(after_func_name):
        # Bind the name per function. A def placed directly in the loop would
        # look the name up at call time and only ever see the last value.
        def after_func(*args, **kwargs):
            print(f'in {after_func_name}()')
        return after_func

    # Create any needed _after functions.
    for name, member in inspect.getmembers(cls, isfeature):
        after_func_name = name + '_after'
        setattr(cls, after_func_name, make_after_func(after_func_name))
        cls.Features.append(after_func_name)

    return cls

@add_after_methods
class Dog:
    Features = ['bark']

    def __init__(self, name):
        self.name = name

    def bark(self):
        print('Bark!')

d = Dog('Mr. Barkington')
print(d.Features)  # -> ['bark', 'bark_after']
d.bark()  # -> Bark!
d.bark_after()  # -> in bark_after()
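A minimal sketch of the same class-decorator idea with more than one feature. It iterates over the Features list directly instead of using inspect, and builds each generated method through a small factory so that every method keeps its own name. The second feature, `fetch`, and the returned strings are made up for the example.

```python
def add_after_methods(cls):
    """Add a '<feature>_after' method for every name listed in cls.Features."""

    def make_after(name):
        # The factory gives each generated method its own `name`.
        def after_func(self, *args, **kwargs):
            return f'in {name}_after()'
        after_func.__name__ = f'{name}_after'
        return after_func

    for name in list(cls.Features):  # iterate over a copy; the list grows below
        setattr(cls, f'{name}_after', make_after(name))
        cls.Features.append(f'{name}_after')
    return cls

@add_after_methods
class Dog:
    Features = ['bark', 'fetch']  # 'fetch' is a hypothetical second feature

    def bark(self):
        return 'Bark!'

    def fetch(self):
        return 'Fetching!'

d = Dog()
print(Dog.Features)     # ['bark', 'fetch', 'bark_after', 'fetch_after']
print(d.bark_after())   # in bark_after()
print(d.fetch_after())  # in fetch_after()
```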
I understand that in Python variables point to objects, so when you assign one to another they both point to the same object. What I'd like to do is make one variable change when the other one does. In this case I am working on a GUI, so I have a label with an attribute for its text. I'd like that attribute to be equal to an attribute in another class. At the moment I am doing it with an intermediate function, but it feels like there should be a more elegant way. My approach is effectively similar to the below:
class Label():
    def __init__(self):
        self.text = None
        self.gettext = None

    def display(self):
        if callable(self.gettext):
            self.text = self.gettext()
        else:
            self.text = self.gettext
        print(str(self.text))

class Anotherclass():
    def __init__(self):
        self.anattribute = "avaluethatchanges"

mylabel = Label()
myclass = Anotherclass()

def gettheattribute():
    return myclass.anattribute

mylabel.gettext = gettheattribute
There will be lots of labels linked to lots of different classes. So what I would like to be able to do is just:
mylabel.gettext = myclass.anattribute
However, when myclass.anattribute gets changed, mylabel.gettext doesn't. I understand why, but is there another way of writing it so that it does, without creating the function?
Many thanks
EDIT: - Both classes will be used in other applications where one or the other might not exist so I can't hard code the relationship between them within the classes themselves.
The first thing I would say is that it's somewhat of an antipattern to duplicate the storage of data in two places, since it violates the DRY principle of software development.
Generally, with GUI designs like this, there's the concept of MVC, or Model, View, Controller.
It's a pretty large topic, but the general idea is that you create a model object to store all your data, and then all the other parts of the GUI -- the many Views that display the data, and the Controllers that change the data -- all look at the model, so that the data is only stored and updated in one place.
GUI elements are designed to accept a model, and refreshes are either manually triggered or driven by some type of Listen/Callback/Event system that automatically refreshes the Views when the model changes. The specific way to handle that depends on the GUI framework you are using.
One simple way to implement this would be to create a model class that both classes share and use python properties and a callback registry to trigger updates.
class Model():
    def __init__(self, text):
        self._text = text
        self._callbacks = []

    def on_text_changed(self, callback):
        self._callbacks.append(callback)

    @property
    def text(self):
        return self._text

    @text.setter
    def text(self, value):
        self._text = value
        # Notify every registered listener that the text changed.
        for callback in self._callbacks:
            callback()
Then both other classes would need something like this
class Label():
    def __init__(self, model):
        self.model = model
        self.model.on_text_changed(self.refresh)

    def refresh(self):
        print(self.text)

    @property
    def text(self):
        return self.model.text

    @text.setter
    def text(self, value):
        self.model.text = value
Then you would create them like this
model = Model('The text')
label = Label(model)
another_class = AnotherClass(model)
label.text = 'This will update text on all classes'
another_class.text = 'So will this'
model.text = "And so will this."
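Putting the pieces together, a minimal runnable sketch of this callback approach might look like the following. `Label.shown` and `AnotherClass.rename` are names invented for the example: the first records what a real label would draw, and the second stands in for whatever code changes the data.

```python
class Model:
    """Holds the shared text and notifies listeners when it changes."""
    def __init__(self, text):
        self._text = text
        self._callbacks = []

    def on_text_changed(self, callback):
        self._callbacks.append(callback)

    @property
    def text(self):
        return self._text

    @text.setter
    def text(self, value):
        self._text = value
        for callback in self._callbacks:
            callback()

class Label:
    def __init__(self, model):
        self.model = model
        self.shown = []  # records what would be drawn on screen
        model.on_text_changed(self.refresh)

    def refresh(self):
        self.shown.append(self.model.text)

class AnotherClass:
    # Hypothetical second class; it only needs a reference to the model.
    def __init__(self, model):
        self.model = model

    def rename(self, value):
        self.model.text = value

model = Model('The text')
label = Label(model)
other = AnotherClass(model)

other.rename('changed by AnotherClass')
model.text = 'changed directly'
print(label.shown)  # ['changed by AnotherClass', 'changed directly']
```

Neither Label nor AnotherClass knows about the other; both depend only on the model, which fits the requirement that either class may be absent.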
Sounds like this might be a good use case for a property. Properties let you have getter/setters that work seamlessly like attributes. From the docs
[a property is] a succinct way of building a data descriptor that triggers function calls upon access to an attribute
...
The property() builtin helps whenever a user interface has granted attribute access and then subsequent changes require the intervention of a method.
mylabel = Label()

class MyClass(object):
    def __init__(self, some_label):
        self._anattribute = None
        self.label = some_label

    @property
    def anattribute(self):
        return self._anattribute

    @anattribute.setter
    def anattribute(self, value):
        self._anattribute = value  # set the underlying value
        # do something else, too
        self.label.text = self._anattribute
So...
mylabel = Label()
myinstance = MyClass(mylabel)
myinstance.anattribute = 'foo'
mylabel.text == 'foo' # True
Storing self._anattribute is not strictly necessary, either. You could have the getter/setter access/modify self.label.text directly, if applicable.
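A sketch of that last variant, where the property keeps no private copy and reads and writes the label's text directly. The `Label` class here is a bare stand-in for a GUI label, assumed only to have a text attribute.

```python
class Label(object):
    # Stand-in for a GUI label; only the text attribute matters here.
    def __init__(self):
        self.text = None

class MyClass(object):
    def __init__(self, some_label):
        self.label = some_label

    @property
    def anattribute(self):
        # No private copy: read straight from the label.
        return self.label.text

    @anattribute.setter
    def anattribute(self, value):
        self.label.text = value

mylabel = Label()
myinstance = MyClass(mylabel)

myinstance.anattribute = 'foo'
print(mylabel.text)            # foo

mylabel.text = 'bar'
print(myinstance.anattribute)  # bar
```

Because there is only one stored value, the two can never drift apart, at the cost of MyClass requiring a label to exist.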
class MyClass:
    def __init__(self, shared_dict):  # use some mutable datatype (a dict works well)
        self.shared = shared_dict

    def __getattr__(self, item):
        return self.shared.get(item)

data = {'a': 'hello', 'b': [1, 2, 3]}
c = MyClass(data)
print(c.a)
data['a'] = 'world!'
print(c.a)
I guess ... this doesn't make much sense from a use-case standpoint, really. There is almost guaranteed to be a better way to do whatever it is you are actually trying to do (probably one that involves notifying subscribers and updating variables).
see it in action here https://repl.it/repls/TestyGruesomeNumbers
Sounds like you have some GUI element that you want to tie to an underlying model. Riffing on 0x5453's very good advice, what about the following:
class Label():
    def __init__(self, text_source, source_attribute):
        self.text_source = text_source
        self.source_attribute = source_attribute

    @property
    def text_from_source(self):
        # Look the attribute up on every access so changes are picked up.
        return getattr(self.text_source, self.source_attribute)

    def display(self):
        print(str(self.text_from_source))

class Anotherclass():
    def __init__(self):
        self.anattribute = "avaluethatchanges"
>>> A = Anotherclass()
>>> L = Label(A, "anattribute")
>>> L.display()
avaluethatchanges
>>> A.anattribute = 3.1415
>>> L.display()
3.1415
This does not let you change the attribute from within the label, but I'd prefer it that way.
I currently have the following two ways:
class Venue:
    store = Database.store()
    ids = [vid for vid in store.find(Venue.id, Venue.type == "Y")]

    def __init__(self):
        self.a = 1
        self.b = 2
OR
class Venue:
    @classmethod
    def set_venue_ids(cls):
        store = Database.store()
        cls.ids = [vid for vid in store.find(Venue.id, Venue.type == "Y")]

    def __init__(self):
        self.a = 1
        self.b = 2
And before using/instantiating the class I would call:
Venue.set_venue_ids()
What would be the correct way of achieving this?
If it's the first way, what would I do if the instantiation of the attribute required more complex logic that could be done more simply through the use of a function?
Or is there an entirely different way to structure my code to accomplish what I'm trying to do?
From a purely technical POV, a class is an instance of its metaclass, so the metaclass initializer is an obvious candidate for class attribute initialization (at least when you have anything a bit complex).
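A minimal sketch of the metaclass-initializer idea, in Python 3 syntax. `VenueMeta` and `load_venue_ids` are hypothetical names, and the function returns fixed data in place of a real database query.

```python
def load_venue_ids():
    # Placeholder for the real database query; returns fixed data here.
    return [1, 2, 3]

class VenueMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Runs once, when the class object itself is created.
        cls.ids = load_venue_ids()

class Venue(metaclass=VenueMeta):
    def __init__(self):
        self.a = 1
        self.b = 2

print(Venue.ids)  # [1, 2, 3]
```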
Now, given the canonical lifetime of a class object (usually the whole process), I would definitely not use an attribute here: if anyone adds or removes venues from your database while your process is running, your ids attribute will get out of sync. Why not use a classmethod instead, to make sure your data is always up to date?
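A sketch of the classmethod alternative, with an in-memory `FakeStore` standing in for the database. `FakeStore` and its `find_ids` method are assumptions made for the example, not part of any real ORM.

```python
class FakeStore:
    """Stand-in for the database; `venues` maps id -> type."""
    venues = {1: "Y", 2: "N", 3: "Y"}

    def find_ids(self, venue_type):
        return [vid for vid, t in sorted(self.venues.items()) if t == venue_type]

class Venue:
    @classmethod
    def ids(cls):
        # Query on every call so the result never goes stale.
        return FakeStore().find_ids("Y")

before = Venue.ids()
FakeStore.venues[4] = "Y"  # someone adds a venue while the process runs
after = Venue.ids()
print(before, after)  # [1, 3] [1, 3, 4]
```

The trade-off is one query per call; if that is too expensive, some form of caching with an explicit refresh would be needed.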
Oh, and yes, another way to construct your Venue.ids (or any other class attribute requiring non-trivial code) without having complex code at the class top level polluting the class namespace (did you notice that in your first example store becomes a class attribute too, as well as vid if using Python 2.x?) is to put the code in a plain function and call that function from within your class statement's body, i.e.:
def list_venue_ids():
    store = Database.store()
    # I assume `store.find()` returns some iterator (not a `list`)
    # if it does return a list, you could just
    # `return store.find(...)`.
    return list(store.find(Venue.id, Venue.type == "Y"))

class Venue(object):
    ids = list_venue_ids()

    def __init__(self):
        self.a = 1
        self.b = 2
If I'm creating a class that needs to store properties, when is it appropriate to use a @property decorator and when should I simply define them in __init__?
The reasons I can think of:
Say I have a class like
class Apple:
    def __init__(self):
        self.foodType = "fruit"
        self.edible = True
        self.color = "red"
This works fine. In this case, it's pretty clear to me that I shouldn't write the class as:
class Apple:
    @property
    def foodType(self):
        return "fruit"

    @property
    def edible(self):
        return True

    @property
    def color(self):
        return "red"
But say I have a more complicated class, which has slower methods (say, fetching data over the internet).
I could implement this assigning attributes in __init__:
import requests

class Apple:
    def __init__(self):
        self.wikipedia_url = "https://en.wikipedia.org/wiki/Apple"
        # Fetched once, at instantiation time.
        self.wikipedia_article_content = requests.get(self.wikipedia_url).text
or I could implement this with @property:
class Apple:
    def __init__(self):
        self.wikipedia_url = "https://en.wikipedia.org/wiki/Apple"

    @property
    def wikipedia_article_content(self):
        # Fetched again on every attribute access.
        return requests.get(self.wikipedia_url).text
In this case, the latter is about 50,000 times faster to instantiate. However, I could argue that if I were fetching wikipedia_article_content multiple times, the former is faster:
a = Apple()
a.wikipedia_article_content
a.wikipedia_article_content
a.wikipedia_article_content
In which case, the former is ~3 times faster because it has one third the number of requests.
My question
Is the only difference between assigning properties in these two ways the ones I've thought of? What else does @property allow me to do other than save time (in some cases)? For properties that take some time to assign, is there a "right way" to assign them?
Using a property allows for more complex behavior, such as fetching the article content only when it has changed and only after a certain time period has passed.
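As a rough illustration of the time-based part of that idea, here is a property that re-fetches only once its cached value is older than a fixed age. The `Article` class, `MAX_AGE`, and the injectable `fetch` and `clock` callables are all assumptions made so the sketch runs without a network; detecting whether the content actually changed is left out.

```python
import time

class Article:
    MAX_AGE = 60.0  # seconds; arbitrary value for the example

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch    # callable that returns the content
        self._clock = clock    # injectable so the example is testable
        self._content = None
        self._fetched_at = None

    @property
    def content(self):
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self.MAX_AGE
        if stale:
            self._content = self._fetch()
            self._fetched_at = now
        return self._content

calls = []

def fake_fetch():
    # Stands in for requests.get(...).text and counts how often it runs.
    calls.append(1)
    return "article text v%d" % len(calls)

fake_time = [0.0]
article = Article(fake_fetch, clock=lambda: fake_time[0])

first = article.content   # fetches
second = article.content  # served from the cache
fake_time[0] = 61.0       # pretend 61 seconds have passed
third = article.content   # cache expired, fetches again
print(first, second, third, len(calls))
```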
Yes, I would suggest using property for those attributes. If you want to make it lazy or cached, you can subclass property.
This is just an implementation of a lazy property. It does some operations inside the property and returns the result. This result is saved on the instance under another name, and each subsequent access of the property just returns the saved result.
class LazyProperty(property):
    def __init__(self, *args, **kwargs):
        # Let property set everything up
        super(LazyProperty, self).__init__(*args, **kwargs)
        # We need a name to save the cached result. If the property is called
        # "test" save the result as "_test".
        self._key = '_{0}'.format(self.fget.__name__)

    def __get__(self, obj, owner=None):
        # Called on the class not the instance
        if obj is None:
            return self
        # Value is already fetched so just return the stored value
        elif self._key in obj.__dict__:
            return obj.__dict__[self._key]
        # Value is not fetched, so fetch, save and return it
        else:
            val = self.fget(obj)
            obj.__dict__[self._key] = val
            return val
This allows you to calculate the value once and then always return it:
class Test:
    def __init__(self):
        pass

    @LazyProperty
    def test(self):
        print('Doing some very slow stuff.')
        return 100
This is how it would work (obviously you need to adapt it for your case):
>>> a = Test()
>>> a._test # The property hasn't been called so there is no result saved yet.
AttributeError: 'Test' object has no attribute '_test'
>>> a.test # First property access will evaluate the code you have in your property
Doing some very slow stuff.
100
>>> a.test # Accessing the property again will give you the saved result
100
>>> a._test # Or access the saved result directly
100
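On Python 3.8 and later, the standard library offers functools.cached_property, which gives similar compute-once behavior without a custom subclass. A short sketch, where the `calls` counter exists only to show that the body runs a single time:

```python
from functools import cached_property

class Test:
    def __init__(self):
        self.calls = 0

    @cached_property
    def test(self):
        self.calls += 1  # counts how often the body actually runs
        return 100

a = Test()
print(a.test, a.test, a.calls)  # 100 100 1
```

Unlike the LazyProperty above, cached_property stores the result in the instance's __dict__ under the same name as the property ("test" rather than "_test").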
I want to share a variable value between a function defined within a python class and an externally defined function. So in the code below, when the internalCompute() function is called, self.data is updated. How can I access this updated data value inside a function that is defined outside the class, i.e inside the report function?
Note:
I would like to avoid using global variables as much as possible.
class Compute(object):
    def __init__(self):
        self.data = 0

    def internalCompute(self):
        self.data = 5

def externalCompute():
    q = Compute()
    q.internalCompute()

def report():
    # access the updated variable self.data from Compute class
    print("You entered report")

externalCompute()
report()
It's a good idea to avoid magic globals, but your report() function must have some way to know where to look. If it's ok to pass it the object doing the computation, i.e. q, it can simply print out q.data. If not, then you can arrange for q to save its data in the class itself-- obviously this means that the class can only be instantiated once, so I would go with the first option.
You can't do that unless you instantiate the class somewhere.
Currently in your implementation you instantiate the Compute class here:
def externalCompute():
    q = Compute()
    q.internalCompute()
But as soon as the function finishes, q goes out of scope and is destroyed, so you lose all the information that the instance contains. To do what you want, the Compute instance must not be local to a function (or your function needs to return the instance so as to preserve its state).
Usually you would do that by having one "main" block in your Python file, in this way:
class Compute(object):
    def __init__(self):
        self.data = 0

    def internalCompute(self):
        self.data = 5

def externalCompute(q):
    q.internalCompute()

def report(q):
    # access the updated variable self.data from Compute class
    data = q.data
    print("You entered report")

if __name__ == '__main__':
    q = Compute()
    externalCompute(q)
    report(q)
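The other option mentioned above, returning the instance from the function so its state survives, could be sketched like this. Having `report` return a string instead of printing is a choice made only for the example.

```python
class Compute(object):
    def __init__(self):
        self.data = 0

    def internalCompute(self):
        self.data = 5

def externalCompute():
    q = Compute()
    q.internalCompute()
    return q  # hand the instance back so its state survives the call

def report(q):
    return "data is %d" % q.data

q = externalCompute()
print(report(q))  # data is 5
```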
You actually have access to the data through q.data, you just have to return it.
Change your code to reflect that fact:
class Compute(object):
    def __init__(self):
        self.data = 0

    def internalCompute(self):
        self.data = 5

def externalCompute():
    q = Compute()
    q.internalCompute()
    return q.data

def report():
    print(externalCompute())

report()  # 5
If you don't like this approach, you have only a few other options:
Global variable.
Instantiating another class.
Updating the same class you instantiated.
Database.
Pickle.