Python file organization for chainable functions - python

I'm writing a Python3.x framework that has generators and filters. I have a compact syntax for chaining the output of generators and filters into filters, but file organization feels inelegant. Here's what I mean.
Assume Renderer is the super class for both generators and filters:
# file: renderer.py -- defines the common superclass used by generators and filters
class Renderer(object):
    def render(self):
        # Every subclass provides a `render` method that emits some data...
        pass
# file: gen1.py -- defines a generator
from renderer import Renderer
class Gen1(Renderer):
    def __init__(self):
        super(Gen1, self).__init__()
    def render(self):
        ...  # emit some data
# file: filt1.py -- defines a filter that takes any Renderer object as an input
from renderer import Renderer
class Filt1(Renderer):
    def __init__(self, input, param):
        super(Filt1, self).__init__()
        self._input = input
        self._param = param
    def render(self):
        ...  # call self._input.render() to fetch and act on data before emitting it
# file: filt2.py -- defines a filter that takes any Renderer object as an input
from renderer import Renderer
class Filt2(Renderer):
    def __init__(self, input):
        super(Filt2, self).__init__()
        self._input = input
    def render(self):
        ...  # call self._input.render() to fetch and act on data before emitting it
# file: render_module.py -- a module file to bring in all the components
from renderer import Renderer
from gen1 import Gen1
from filt1 import Filt1
from filt2 import Filt2
What I'd like
What I'd like is for a user of the platform to be able to write code like this, which chains the output of gen1 into filt1, and the output of filt1 into filt2:
import render_module as rm
chain = rm.Gen1().filt1(123).filt2()
chain.render()
What I've done
What I've done is add the following to renderer.py. This works, but see "The Problem to Solve" below.
class Renderer(object):
    def render(self):
        # emit some data...
        pass
    def filt1(self, param):
        return rm.Filt1(self, param)
    def filt2(self):
        return rm.Filt2(self)
import render_module as rm # at end of file to avoid a circular dependency
The Problem to Solve
It feels wrong to pollute the common superclass with specific mentions of each subclass. The clearest sign of a code smell is the import statement at the end of renderer.py.
But I haven't figured out a better way to refactor and organize the files. What's the pythonic approach out of this conundrum?
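One possible direction, offered only as a sketch and not as an established answer: let each filter attach its own chain method to Renderer through a registration decorator, so the superclass never names a subclass. The `chainable` decorator and the toy `render` bodies below are hypothetical, and everything is kept in one file so the example runs on its own.

```python
class Renderer:
    """Common base class; it knows nothing about concrete filters."""

    def render(self):
        raise NotImplementedError

    @classmethod
    def chainable(cls, method_name):
        """Hypothetical decorator: expose a filter class as a chain method."""
        def decorator(filter_cls):
            def chain_method(self, *args, **kwargs):
                # `self` becomes the input of the newly created filter
                return filter_cls(self, *args, **kwargs)
            setattr(cls, method_name, chain_method)
            return filter_cls
        return decorator


class Gen1(Renderer):
    def render(self):
        return [1, 2, 3]  # toy data standing in for real output


@Renderer.chainable("filt1")
class Filt1(Renderer):
    def __init__(self, input, param):
        self._input = input
        self._param = param

    def render(self):
        return [x * self._param for x in self._input.render()]


@Renderer.chainable("filt2")
class Filt2(Renderer):
    def __init__(self, input):
        self._input = input

    def render(self):
        return [x + 1 for x in self._input.render()]


chain = Gen1().filt1(10).filt2()
result = chain.render()
print(result)
```

With one file per class, the filter modules still have to be imported somewhere (render_module.py already does that) before their chain methods exist on Renderer.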

Related

Register classes in different files to a Class factory

I am trying to register classes that live in different files with a factory class. The factory class has a dictionary called "registry" which maps a user-defined name to the registered class. If my factory class and the registering classes are in the same .py file, everything works as expected. But the moment I move the registering classes into their own .py files and import the factory class to apply the register decorator (as described in the question and article below), the "registry" dictionary stays empty, which means that the classes are not getting registered.
The way I am registering these classes is via a decorator. My code looks very much like what we see here:
Registering classes to factory with classes in different files (my question is a duplicate of this, but bumping this question to the top)
https://medium.com/@geoffreykoh/implementing-the-factory-pattern-via-dynamic-registry-and-python-decorators-479fc1537bbe
I would like to know:
Why does keeping them in the same file work while splitting them out doesn't?
How can I make the separate file approach work?
Hopefully the code samples in the articles clarify what I am trying to do and struggling with.
I'm currently exploring a similar problem, and I think I may have found a solution. It is a bit of a 'hack' though, so take it with a grain of salt.
Why does keeping them in the same file work while splitting them out doesn't?
In order to make your classes self-register in the factory while keeping their definition in single .py files, we have to somehow force the loading of the classes in the .py files.
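To make the point concrete, here is a minimal sketch showing that a registering decorator only runs when the module containing the decorated class is actually imported. The module names `demo_factory` and `demo_sandal` are made up for illustration and are written to a temporary directory so the example is self-contained.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Write two tiny modules (hypothetical names) to a temporary directory.
tmp_dir = Path(tempfile.mkdtemp())
(tmp_dir / "demo_factory.py").write_text(textwrap.dedent('''
    registry = {}

    def register(name):
        def inner_wrapper(wrapped_class):
            registry[name] = wrapped_class
            return wrapped_class
        return inner_wrapper
'''))
(tmp_dir / "demo_sandal.py").write_text(textwrap.dedent('''
    from demo_factory import register

    @register("Sandal")
    class Sandal:
        pass
'''))
sys.path.insert(0, str(tmp_dir))
importlib.invalidate_caches()

demo_factory = importlib.import_module("demo_factory")
before = dict(demo_factory.registry)    # nothing has imported demo_sandal yet

importlib.import_module("demo_sandal")  # the decorator runs during this import
after = dict(demo_factory.registry)

print(sorted(before), sorted(after))
```

In the single-file layout the decorated classes are executed as part of loading the factory module itself, which is why the registry is never empty there.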
How can I make the separate file approach work?
In my case, I came across this problem when trying to implement a 'Simple Factory' with self-registering subclasses, to avoid having to modify the typical 'if/else' idiom in the factory's get() method.
I'll use a simple example, starting with the decorator method you've mentioned.
Example with decorators
Let's say we have a ShoeFactory as shown below, in which we register different 'classes' of shoes:
# file shoe.py
class ShoeFactory:
    _shoe_classes = {}
    @classmethod
    def get(cls, shoe_type:str):
        try:
            return cls._shoe_classes[shoe_type]()
        except KeyError:
            raise ValueError(f"unknown product type : {shoe_type}")
    @classmethod
    def register(cls, shoe_type:str):
        def inner_wrapper(wrapped_class):
            cls._shoe_classes[shoe_type] = wrapped_class
            return wrapped_class
        return inner_wrapper
Examples of shoe classes:
# file sandal.py
from shoe import ShoeFactory
@ShoeFactory.register('Sandal')
class Sandal:
    def __init__(self):
        print("i'm a sandal")
# file croc.py
from shoe import ShoeFactory
@ShoeFactory.register('Croc')
class Croc:
    def __init__(self):
        print("i'm a croc")
In order to make Sandal self-register in the ShoeFactory while keeping its definition in its own .py file, we have to somehow force the loading of the module that defines the Sandal class.
I've done this in 3 steps:
Keeping all class implementations in a specific folder, e.g., structuring the files as follows:
.
├─ shoe.py          # file with the ShoeFactory class
└─ shoes/
   ├─ __init__.py
   ├─ croc.py
   └─ sandal.py
Adding the following statement to the end of the shoe.py file, which will take care of loading and registering each individual class:
from shoes import *
Add a piece of code like the snippet below to your __init__.py within the shoes/ folder, to dynamically load all classes [1]:
from inspect import isclass
from pkgutil import iter_modules
from pathlib import Path
from importlib import import_module
# iterate through the modules in the current package
package_dir = Path(__file__).resolve().parent
# iter_modules is documented to take a list of path strings, so convert the Path
for (_, module_name, _) in iter_modules([str(package_dir)]):
    # import the module and iterate through its attributes
    module = import_module(f"{__name__}.{module_name}")
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if isclass(attribute):
            # Add the class to this package's variables
            globals()[attribute_name] = attribute
If we follow this approach, I get the following results when running some test code as follows:
# file shoe_test.py
from shoe import ShoeFactory
if __name__ == "__main__":
    croc = ShoeFactory.get('Croc')
    sandal = ShoeFactory.get('Sandal')
$ python shoe_test.py
i'm a croc
i'm a sandal
Example with __init_subclass__()
I've personally followed a slightly different approach for my simple factory design, which does not use decorators.
I've defined a RegistrableShoe base class, and then used an __init_subclass__() approach to do the self-registering ([1] item 49, [2]).
I think the idea is that when Python executes the definition of a subclass of RegistrableShoe, the __init_subclass__() method is run, which in turn registers the subclass in the factory.
This approach requires the following changes when compared to the example above:
Added a RegistrableShoe base class to the shoe.py file, and re-factored ShoeFactory a bit:
# file shoe.py
class RegistrableShoe():
    def __init_subclass__(cls, shoe_type:str):
        ShoeFactory.register(shoe_type, shoe_class=cls)
class ShoeFactory:
    _shoe_classes = {}
    @classmethod
    def get(cls, shoe_type:str):
        try:
            return cls._shoe_classes[shoe_type]()
        except KeyError:
            raise ValueError(f"unknown product type : {shoe_type}")
    @classmethod
    def register(cls, shoe_type:str, shoe_class:RegistrableShoe):
        cls._shoe_classes[shoe_type] = shoe_class
from shoes import *
Changed the concrete shoe classes to derive from the RegistrableShoe base class and pass a shoe_type parameter:
# file croc.py
from shoe import RegistrableShoe
class Croc(RegistrableShoe, shoe_type='Croc'):
    def __init__(self):
        print("i'm a croc")
# file sandal.py
from shoe import RegistrableShoe
class Sandal(RegistrableShoe, shoe_type='Sandal'):
    def __init__(self):
        print("i'm a sandal")
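The timing of __init_subclass__() can be seen in a single file. This is only a sketch: the module-level `registry` dict and the `Clog` class are illustrative stand-ins, not part of the code above. It also forwards extra keyword arguments to `super().__init_subclass__()`, which the Python documentation recommends for cooperative use.

```python
registry = {}  # illustrative stand-in for ShoeFactory._shoe_classes


class RegistrableShoe:
    def __init_subclass__(cls, shoe_type, **kwargs):
        # Called once for every subclass, at the moment its class
        # statement finishes executing.
        super().__init_subclass__(**kwargs)
        registry[shoe_type] = cls


names_before = list(registry)


class Clog(RegistrableShoe, shoe_type="Clog"):
    pass


names_after = list(registry)
print(names_before, names_after)
```

No instance of Clog is created here. Executing the class statement alone performs the registration, which is also why the module holding the class still has to be imported.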

How to verify when an unknown object created by the code under test was called as expected (pytest) (unittest)

I have some code that creates instances from a list of classes that is passed to it. This cannot change, as the list of classes passed to it has been designed to be dynamic and chosen at runtime (through configuration files). Initialising those classes must be done by the code under test, as it depends on factors only the code under test knows how to control (i.e. it will set specific initialisation args). I've tested the code quite extensively by running it and manually trawling through reams of output. Now that I've proven the concept to myself, I'm at the point where I need to add some proper unit tests. The following example demonstrates what I am trying to test:
I would like to test the run method of the Foo class defined below:
# foo.py
class Foo:
    def __init__(self, stuff):
        self._stuff = stuff
    def run():
        for thing in self._stuff:
            stuff = stuff()
            stuff.run()
Where one (or more) files would contain the class definitions for stuff to run, for example:
# classes.py
class Abc:
    def run(self):
        print("Abc.run()", self)
class Ced:
    def run(self):
        print("Ced.run()", self)
class Def:
    def run(self):
        print("Def.run()", self)
And finally, an example of how it would tie together:
>>> from foo import Foo
>>> from classes import Abc, Ced, Def
>>> f = Foo([Abc, Ced, Def])
>>> f.run()
Abc.run() <classes.Abc object at 0x7f7469f9f9a0>
Ced.run() <classes.Ced object at 0x7f7469f9f9a1>
Def.run() <classes.Def object at 0x7f7469f9f9a2>
Where the list of stuff to run defines the object classes (NOT instances), as the instances only have a short lifespan; they're created by Foo.run() and die when (or rather, sometime soon after) the function completes. However, I'm finding it very tricky to come up with a clear method to test this code.
I want to prove that the run method of each of the classes in the list of stuff to run was called. However, from the test, I do not have visibility on the Abc instance which the run method creates, therefore, how can it be verified? I can't patch the import as the code under test does not explicitly import the class (after all, it doesn't care what class it is). For example:
# test.py
from foo import Foo
class FakeStuff:
    def run(self):
        self.run_called = True
def test_foo_runs_all_stuff():
    under_test = Foo([FakeStuff])
    under_test.run()
    # How to verify that FakeStuff.run() was called?
    assert <SOMETHING>.run_called, "FakeStuff.run() was not called"
It seems that you correctly realise that you can pass anything into Foo(), so you should be able to log something in FakeStuff.run():
class Foo:
    def __init__(self, stuff):
        self._stuff = stuff
    def run(self):
        for thing in self._stuff:
            stuff = thing()
            stuff.run()
class FakeStuff:
    run_called = 0  # class-level counter shared by all instances
    def run(self):
        FakeStuff.run_called += 1
def test_foo_runs_all_stuff():
    under_test = Foo([FakeStuff, FakeStuff])
    under_test.run()
    # FakeStuff.run() must have been called once per list entry
    assert FakeStuff.run_called == 2, "FakeStuff.run() was not called"
Note that I have modified your original Foo to what I think you meant. Please correct me if I'm wrong.
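An alternative sketch, assuming the standard-library unittest.mock is acceptable in your tests: pass mock objects in place of the classes. Calling a MagicMock returns its `return_value`, which plays the role of the instance that Foo creates, so the call to run() can be asserted without any shared counter. The Foo below is the corrected version from this answer.

```python
from unittest.mock import MagicMock


class Foo:
    def __init__(self, stuff):
        self._stuff = stuff

    def run(self):
        for thing in self._stuff:
            instance = thing()
            instance.run()


def test_foo_runs_all_stuff():
    # Each mock stands in for one class in the configured list.
    fake_a = MagicMock(name="FakeA")
    fake_b = MagicMock(name="FakeB")

    Foo([fake_a, fake_b]).run()

    for fake_cls in (fake_a, fake_b):
        # The "class" was instantiated exactly once, with no arguments...
        fake_cls.assert_called_once_with()
        # ...and run() was called exactly once on the resulting "instance".
        fake_cls.return_value.run.assert_called_once_with()


test_foo_runs_all_stuff()
```

Because every mock records its own calls, the test does not depend on state left over from other tests, unlike a class-level counter.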

Can I add methods into def __init__?

Using python 2.7.6, I have been trying to write a class that can extract pieces of xml data from a couple of xml files within a given zip file. I want to be able to use any of the methods in any order once I am working with the class, so wanted the unzip stage to be behind the scenes, in the class.
It is the first time I have really tried to make real use of a class as I am quite new to python, so I am learning as I go.
I defined methods to unzip the data to memory and was using those methods in other methods - then realised it would be horribly inefficient when using multiple methods. Since the unzipping step is necessary for any method in the class, is there a way to build it into the init definition so it is only done once when the class is first created?
Example of what I currently have:
class XMLzip(object):
    def __init__(self, xzipfile):
        self.xzipfile = xzipfile
    def extract_xml1(self):
        # extract the xmlfile to a variable
        pass
    def extract_xml2(self):
        # extract xmlfile2 to a variable
        pass
    def do_stuff(self):
        self.extract_xml1()
        # ...
    def do_somethingelse(self):
        self.extract_xml1()
Is there a way to do something like I have shown below? And if so, what is it called - my searches haven't been very effective.
class XMLzip(object):
    def __init__(self, xzipfile):
        self.xzipfile = xzipfile
        def extract_xml1():
            # extract it here
            pass
        def extract_xml2():
            # extract it here
            pass
    # Now carry on with normal methods
    def do_stuff(self):
        ...
In __init__ you can do whatever you need in order to initialize your instance. In this case, it looks like what you need is something like this:
class XMLzip(object):
    def __init__(self, xzipfile):
        self.xzipfile = xzipfile
        self.xml1 = None  # extract xml1 here
        self.xml2 = None  # extract xml2 here
    def do_stuff(self):
        ...
If you want to do the extraction only once, do it in __init__ and save the result in an additional attribute on the instance.
I suspect that the extraction procedure is very similar for both files, so you can make it a single function, either inside or outside your class (that is up to your preference), and give it additional arguments to handle the specifics, for example something like this.
The outside version:
def extract_xml_from_zip(zip_file, this_xml):
    # extract the requested xml file from the given zip_file
    result = None  # placeholder for the extracted content
    return result
class XMLzip(object):
    def __init__(self, xzipfile):
        self.xzipfile = xzipfile
        self.xml1 = extract_xml_from_zip(xzipfile, "xml1")
        self.xml2 = extract_xml_from_zip(xzipfile, "xml2")
    def do_stuff(self):
        ...
The inside version:
class XMLzip(object):
    def __init__(self, xzipfile):
        self.xzipfile = xzipfile
        self.xml1 = self.extract_xml_from_zip("xml1")
        self.xml2 = self.extract_xml_from_zip("xml2")
    def extract_xml_from_zip(self, this_xml):
        # extract the requested xml file from the zip_file in self.xzipfile
        result = None  # placeholder for the extracted content
        return result
    def do_stuff(self):
        ...
You can call any method you have defined in your class in your initializer.
Demo:
>>> class Foo(object):
...     def __init__(self):
...         self.some_method()
...     def some_method(self):
...         print('hi')
...
>>> f = Foo()
hi
I take from your question that you need to extract the files only once. Leave your class as is and use your extract methods in __init__ and set the required attributes/variables for the extracted content.
For example
def __init__(self, xzipfile):
    self.xzipfile = xzipfile
    self.extract1 = self.extract_xml1()
    self.extract2 = self.extract_xml2()
This of course requires your extract methods to have a return value, don't forget that.
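Putting the answers together, here is a minimal runnable sketch of extracting once in __init__. It is written for Python 3 (the question uses 2.7), and the member names "one.xml" and "two.xml", the `_extract` helper, and the two accessor methods are assumptions for illustration only. The archive is built in memory so the example needs no real files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


class XMLzip(object):
    def __init__(self, xzipfile):
        self.xzipfile = xzipfile
        # Open the archive once and keep the parsed documents on the instance.
        with zipfile.ZipFile(xzipfile) as archive:
            self.xml1 = self._extract(archive, "one.xml")
            self.xml2 = self._extract(archive, "two.xml")

    @staticmethod
    def _extract(archive, member_name):
        # Read one member into memory and parse it into an Element.
        return ET.fromstring(archive.read(member_name))

    # The remaining methods can be called in any order; none of them unzips.
    def first_tag(self):
        return self.xml1.tag

    def second_text(self):
        return self.xml2.findtext("item")


# Build a small archive in memory so the sketch is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("one.xml", "<root1/>")
    archive.writestr("two.xml", "<root2><item>hello</item></root2>")
buffer.seek(0)

xz = XMLzip(buffer)
print(xz.first_tag(), xz.second_text())
```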

What kind of design pattern am I looking for and how do I implement this in python

I am trying to make my code slightly more generic. Basically, what I am looking for is this.
I wish to write an API interface MyAPI:
class MyAPI(object):
    def __init__(self):
        pass
    def upload(self):
        pass
    def download(self):
        pass
class MyAPIEx(object):
    def upload(self):
        # specific implementation
        pass
class MyAPIEx2(object):
    def upload(self):
        # specific implementation
        pass
# Actual usage ...
def use_api():
    obj = MyAPI()
    obj.upload()
So what I want is that, based on a configuration, I should be able to call the upload function of either MyAPIEx or MyAPIEx2. What is the exact design pattern I am looking for, and how do I implement it in Python?
You are looking for Factory method (or any other implementation of a factory).
It's really hard to say what pattern you are using without more info. The way to instantiate MyAPI is indeed a Factory, as @Darhazer mentioned, but it sounds more like you're interested in knowing about the pattern used for the MyAPI class hierarchy, and without more info we can't say.
I made some code improvements below, look for the comments with the word IMPROVEMENT.
class MyAPI(object):
    def __init__(self):
        pass
    def upload(self):
        # IMPROVEMENT making this function abstract
        # This is how I do it, but you can find other ways searching on google
        raise NotImplementedError("upload function not implemented")
    def download(self):
        # IMPROVEMENT making this function abstract
        # This is how I do it, but you can find other ways searching on google
        raise NotImplementedError("download function not implemented")
# IMPROVEMENT Notice that I changed object to MyAPI to inherit from it
class MyAPIEx(MyAPI):
    def upload(self):
        # specific implementation
        pass
# IMPROVEMENT Notice that I changed object to MyAPI to inherit from it
class MyAPIEx2(MyAPI):
    def upload(self):
        # specific implementation
        pass
# IMPROVEMENT changed use_api() to get_api(), which is a factory,
# call it to get the MyAPI implementation
def get_api(configDict):
    if 'MyAPIEx' in configDict:
        return MyAPIEx()
    elif 'MyAPIEx2' in configDict:
        return MyAPIEx2()
    else:
        # some sort of an error
        raise ValueError("no known API implementation in configDict")
# Actual usage ...
# IMPROVEMENT, create a config dictionary to be used in the factory
configDict = dict()
# fill in the config accordingly
obj = get_api(configDict)
obj.upload()
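As a variation on the same idea, here is a sketch that uses the standard-library abc module for the abstract method and a dictionary lookup instead of the if/elif chain. The config key "api", its values "ex" and "ex2", and the returned strings are all assumptions made up for this example.

```python
from abc import ABC, abstractmethod


class MyAPI(ABC):
    @abstractmethod
    def upload(self):
        """Upload something; concrete subclasses must override this."""


class MyAPIEx(MyAPI):
    def upload(self):
        return "uploaded via MyAPIEx"


class MyAPIEx2(MyAPI):
    def upload(self):
        return "uploaded via MyAPIEx2"


# Hypothetical mapping from a configuration value to an implementation.
API_CLASSES = {"ex": MyAPIEx, "ex2": MyAPIEx2}


def get_api(config):
    """Factory: pick the implementation named by config["api"]."""
    name = config.get("api")
    try:
        return API_CLASSES[name]()
    except KeyError:
        raise ValueError("unknown api %r" % (name,)) from None


obj = get_api({"api": "ex2"})
message = obj.upload()
print(message)
```

With abc, trying to instantiate MyAPI directly raises TypeError, so a missing override is caught when the object is created instead of when upload() is first called.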

pygtk gtk.Builder.connect_signals onto multiple objects?

I am updating some code from using libglade to GtkBuilder, which is supposed to be the way of the future.
With gtk.glade, you could call glade_xml.signal_autoconnect(...) repeatedly to connect signals onto objects of different classes corresponding to different windows in the program. However Builder.connect_signals seems to work only once, and (therefore) to give warnings about any handlers that aren't defined in the first class that's passed in.
I realize I can connect them manually but this seems a bit laborious. (Or for that matter I could use some getattr hackery to let it connect them through a proxy to all the objects...)
Is it a bug there's no function to hook up handlers across multiple objects? Or am I missing something?
Someone else has a similar problem http://www.gtkforums.com/about1514.html which I assume means this can't be done.
Here's what I currently have. Feel free to use it, or to suggest something better:
class HandlerFinder(object):
    """Searches for handler implementations across multiple objects.
    """
    # See <http://stackoverflow.com/questions/4637792> for why this is
    # necessary.
    def __init__(self, backing_objects):
        self.backing_objects = backing_objects
    def __getattr__(self, name):
        for o in self.backing_objects:
            if hasattr(o, name):
                return getattr(o, name)
        else:
            raise AttributeError("%r not found on any of %r"
                                 % (name, self.backing_objects))
I have been looking for a solution to this for some time and found that it can be done by passing a dict of all the handlers to connect_signals.
The inspect module can extract methods using
inspect.getmembers(instance, predicate=inspect.ismethod)
These can then be concatenated into a dictionary using d.update(d3), watching out for duplicate functions such as on_delete.
Example code:
import inspect
...
handlers = {}
for c in [win2, win3, win4, self]: # self is the main window
    methods = inspect.getmembers(c, predicate=inspect.ismethod)
    handlers.update(methods)
builder.connect_signals(handlers)
This will not pick up alias method names declared using @alias. For an example of how to do that, see the code for Builder.py, at def dict_from_callback_obj.
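The merging step can be tried without GTK. The sketch below uses two plain stand-in classes and a hypothetical `collect_handlers` helper that raises on a duplicate name instead of silently overwriting it, which is one way to "watch out" for clashes such as on_delete.

```python
import inspect


def collect_handlers(objects):
    """Merge the bound methods of several objects into one name -> callable dict."""
    handlers = {}
    for obj in objects:
        for name, method in inspect.getmembers(obj, predicate=inspect.ismethod):
            if name.startswith("_"):
                continue  # skip __init__ and other private methods
            if name in handlers:
                raise ValueError("duplicate handler name: %s" % name)
            handlers[name] = method
    return handlers


# Stand-ins for the window classes; real ones would hold GTK widgets.
class MainWindow:
    def on_quit_clicked(self, *args):
        return "quit"


class AboutDialog:
    def on_about_close(self, *args):
        return "closed"


handlers = collect_handlers([MainWindow(), AboutDialog()])
print(sorted(handlers))
```

The resulting dictionary has the shape that is passed to builder.connect_signals(handlers) in the example above.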
I'm only a novice but this is what I do, maybe it can inspire;-)
I instantiate the major components from a 'control' and pass the builder object so that the instantiated object can make use of any of the builder objects (mainwindow in example) or add to the builder (aboutDialog example). I also pass a dictionary (dic) where each component adds "signals" to it.
Then the 'connect_signals(dic)' is executed.
Of course I need to do some manual signal connecting when I need to pass user arguments to the callback method, but those are few.
# modules/control.py
class Control:
    def __init__(self):
        # Load the builder obj
        guibuilder = gtk.Builder()
        guibuilder.add_from_file("gui/mainwindow.ui")
        # Create a dictionary to store signals from loaded components
        dic = {}
        # Instantiate the components...
        aboutdialog = modules.aboutdialog.AboutDialog(guibuilder, dic)
        mainwin = modules.mainwindow.MainWindow(guibuilder, dic, self)
        ...
        guibuilder.connect_signals(dic)
        del dic
# modules/aboutdialog.py
class AboutDialog:
    def __init__(self, builder, dic):
        dic["on_OpenAboutWindow_activate"] = self.on_OpenAboutWindow_activate
        self.builder = builder
    def on_OpenAboutWindow_activate(self, menu_item):
        self.builder.add_from_file("gui/aboutdialog.ui")
        self.aboutdialog = self.builder.get_object("aboutdialog")
        self.aboutdialog.run()
        self.aboutdialog.destroy()
# modules/mainwindow.py
class MainWindow:
    def __init__(self, builder, dic, controller):
        self.control = controller
        # get gui xml and/or signals
        dic["on_file_new_activate"] = self.control.newFile
        dic["on_file_open_activate"] = self.control.openFile
        dic["on_file_save_activate"] = self.control.saveFile
        dic["on_file_close_activate"] = self.control.closeFile
        ...
        # get needed gui objects
        self.mainWindow = builder.get_object("mainWindow")
        ...
Edit: alternative to auto attach signals to callbacks:
Untested code
# Both functions are meant to be methods of the same class
def start_element(self, name, attrs):
    if name == "signal":
        if attrs.get("handler"):
            handler = attrs["handler"]
            # Insert code to verify if handler is part of the collection
            # we want.
            self.handlerList.append(handler)
def extractSignals(self, uiFile):
    import xml.parsers.expat
    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = self.start_element
    p.ParseFile(uiFile)  # expects an open file object
# Then, inside another method of that class:
self.handlerList = []
self.extractSignals(uiFile)
for handler in self.handlerList:
    # look up the "<handler>_cb" method without resorting to eval
    dic[handler] = getattr(self, handler + "_cb")
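The handler-extraction idea in the untested snippet above can be checked in isolation. This is a sketch under assumptions: the UI definition is a made-up, minimal GtkBuilder-style document held in a string, and `extract_signal_handlers` is a hypothetical helper name. Only the standard-library expat parser is used, so no GTK installation is needed.

```python
import xml.parsers.expat

# Made-up UI definition, just large enough to contain two <signal> elements.
UI_XML = """<?xml version="1.0"?>
<interface>
  <object class="GtkWindow" id="window">
    <signal name="destroy" handler="on_window_destroy"/>
    <child>
      <object class="GtkButton" id="buttonQuit">
        <signal name="clicked" handler="on_buttonQuit_clicked"/>
      </object>
    </child>
  </object>
</interface>
"""


def extract_signal_handlers(ui_xml):
    """Return the handler names of all <signal> elements, in document order."""
    handlers = []

    def start_element(name, attrs):
        # attrs is a dict mapping attribute names to their values
        if name == "signal" and attrs.get("handler"):
            handlers.append(attrs["handler"])

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(ui_xml, True)  # True marks this as the final chunk
    return handlers


handler_names = extract_signal_handlers(UI_XML)
print(handler_names)
```

Each extracted name could then be mapped to a callable, for example with getattr on the object that implements the callbacks, before the dictionary is handed to connect_signals.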
builder.connect_signals({
    "on_window_destroy": gtk.main_quit,
    "on_buttonQuit_clicked": gtk.main_quit,
})
