Is there a reasonably natural way of converting a Python function into a standalone script? Something like:
def f():
    # some long and involved computation
    ...

script = function_to_script(f)  # now script is some sort of closure,
                                # which can be run in a separate process
                                # or even shipped over the network to a
                                # different host
and NOT like:
script = open("script.py", "wt")
script.write("#!/usr/bin/env python")
...
You can turn any object into a callable by defining the __call__ method on it (see here). Hence, if you want to compartmentalize some state along with the computation, then as long as everything the class holds, top to bottom, can be pickled, the object itself can be pickled.
class MyPickledFunction(object):
    def __init__(self, *state):
        self.__state = state

    def __call__(self, *args, **kwargs):
        # stuff in here
        ...
That's the easy cheater way. Why pickling? Anything that can be pickled can be sent to another process without fear. You're forming a "poor man's closure" by using an object like this.
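For illustration, here is a minimal sketch of that idea (the Adder class and the numbers are my own stand-ins, not from the original post): the callable object survives a round trip through pickle, so multiprocessing can ship it to worker processes.

import pickle
from multiprocessing import Pool

class Adder(object):
    """A 'poor man's closure': state captured in __init__, work done in __call__."""
    def __init__(self, offset):
        self.offset = offset

    def __call__(self, x):
        return x + self.offset

if __name__ == "__main__":
    add_ten = Adder(10)

    # The instance survives pickling...
    restored = pickle.loads(pickle.dumps(add_ten))
    print(restored(5))  # 15

    # ...so multiprocessing can send it to other processes.
    with Pool(2) as pool:
        print(pool.map(add_ten, [1, 2, 3]))  # [11, 12, 13]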
(There's a nice post about the "marshal" library here on SO if you want to truly pickle a function.)
Related
I have a Python class that requires some data in order to be initialized. This data is usually obtained using a function from another module, which makes calls to an API. One of the parameters my class' initializer takes is the same ID that can be used to obtain the resource with the API.
Calling the API from inside the initializer, and obtaining the data it needs would make for shorter (and cleaner?) initialization. But I am concerned this could make the class harder to test, and introduce a dependency deep inside the code.
I'm trying to devise the best way to implement this in a maintainable and testable way.
Would it be bad to call the API module directly from within the initializer, and obtain the data it needs to complete initialization? Or is it better to just call the API from outside and pass the data to the initializer?
The "normal" way(1) is the pass the dependent function, module, or class, into the constructor itself.
Then, in your production code, pass in the real thing. In your test code, pass in a dummy one that will behave exactly as you desire for the specific test case.
That's actually a half-way measure between the two things you posit.
In other words, something like:
def do_something_with(something_generator):
    something = something_generator.get()
    print(something)

# Real code.
do_something_with(ProductionGenerator())

# Test code.
class TestGenerator:
    def get(self):
        return 42

do_something_with(TestGenerator())
If you're reluctant to always pass in the dependency, you can get around that with a default value and by creating it inside the function when it isn't given:
def do_something(something_generator=None):
    if something_generator is None:
        local_gen = ProductionGenerator()
    else:
        local_gen = something_generator
    something = local_gen.get()
    print(something)

# Real code.
do_something()

# Test code.
class TestGenerator:
    def get(self):
        return 42

do_something(TestGenerator())
(1) Defined, of course, as the way I do it :-)
I have a function foo that takes a parameter stuff
Stuff can be something in a database, and I'd like to create a function that takes a stuff_id, gets the stuff from the DB, and executes foo.
Here's my attempt to solve it:
1/ Create a second function with suffix from_stuff_id
def foo(stuff):
    # do something
    ...

def foo_from_stuff_id(stuff_id):
    stuff = get_stuff(stuff_id)
    foo(stuff)
2/ Modify the first function
def foo(stuff=None, stuff_id=None):
    if stuff_id:
        stuff = get_stuff(stuff_id)
    # do something
    ...
I don't like either way.
What's the most Pythonic way to do it?
Assuming foo is the main component of your application, go with your first way. Each function should have a single purpose. The moment you combine multiple purposes into a single function, you can easily get lost in long streams of code.
If, however, some other function can also provide stuff, then go with the second.
The only thing I would add is make sure you add docstrings (PEP-257) to each function to explain in words the role of the function. If necessary, you can also add comments to your code.
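For example, a quick sketch of the first approach with docstrings added (names taken from the question, bodies are placeholders):

def foo(stuff):
    """Do something with an already-loaded stuff object."""
    # do something
    ...

def foo_from_stuff_id(stuff_id):
    """Look stuff up by its ID, then hand it to foo()."""
    stuff = get_stuff(stuff_id)
    foo(stuff)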
I'm not a big fan of type overloading in Python, but this is one of the cases where I might go for it if there's really a need:
def foo(stuff):
    if isinstance(stuff, int):
        stuff = get_stuff(stuff)
    ...
With type annotations it would look like this:
from typing import Union

def foo(stuff: Union[int, Stuff]):
    if isinstance(stuff, int):
        stuff = get_stuff(stuff)
    ...
It basically depends on how you've defined all these functions. If you're importing get_stuff from another module, the second approach is more Pythonic, because from an OOP perspective you create functions for one particular purpose, and since get_stuff is already defined you don't need to wrap it in yet another function.
If get_stuff is not defined in another module, then it depends on whether you are using classes or not. If you're using a class and you want to use all these pieces together, you can add a method for accessing or connecting to the database and use that method within other methods like foo.
Example:
from some_module import get_stuff

class MyClass:
    def __init__(self, *args, **kwargs):
        # ...
        self.stuff_id = kwargs['stuff_id']

    def foo(self):
        stuff = get_stuff(self.stuff_id)
        # do stuff
Or, if the functionality of foo depends on the existence of stuff, you can keep stuff on the instance and simply check its validity:
class MyClass:
    def __init__(self, *args, **kwargs):
        # ...
        _stuff_id = kwargs['stuff_id']
        self.stuff = get_stuff(_stuff_id)  # can return None

    def foo(self):
        if self.stuff:
            ...  # do stuff
        else:
            ...  # do other stuff
Or another neat design pattern for such situations might be using a dispatcher function (or method in class) that delegates the execution to different functions based on the state of stuff.
def delegator(stuff, stuff_id):
    if stuff:  # or other condition
        foo(stuff)
    else:
        get_stuff(stuff_id)
Suppose I have a python class with a large overhead
class some_class:
    def __init__(self):
        self.overhead = large_overhead

    # Get new data
    def read_new_data(self, data):
        self.new_data = data

    def do_something(self):
        # DO SOMETHING.
        ...
Suppose I want to have it listen to the output of another program, or multiple programs, and I have a way to maintain this steady stream of inputs. How do I avoid initiating a new instance every time, given the overhead? Do I create a new script and package the class to keep it 'live'? And if so, how do I capture the output of the programs if they cannot be in direct communication with the script I'm running, without going through intermediate storage like SQL or a file?
You can use a class variable:
class some_class:
    overhead = large_overhead

    # Get new data
    def read_new_data(self, data):
        self.new_data = data

    def do_something(self):
        # DO SOMETHING.
        ...
Now overhead is only evaluated once, when the class is defined, and you can use self.overhead within any instance of the class.
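A small sketch to make that concrete (the list comprehension is just a stand-in for large_overhead): the attribute is built once at class-definition time and shared by every instance.

class some_class:
    # Stand-in for large_overhead; evaluated once, when the class is defined.
    overhead = [x ** 2 for x in range(1_000_000)]

    def read_new_data(self, data):
        self.new_data = data

a = some_class()
b = some_class()
print(a.overhead is b.overhead)  # True: both instances share the same object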
Lacking specifics... Use asyncio to set up listeners/watchers and register your object's methods as callbacks for when the data comes in, then run the whole thing in an event loop.
While that was easy to say and pretty abstract, I'm sure I would have a pretty steep learning curve to implement that, especially considering I'd want to implement some testing infrastructure. But it seems pretty straightforward.
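For what it's worth, a minimal sketch of that suggestion (the TCP transport, port, and names are assumptions, not from the question): the expensive object is built once, and an asyncio server feeds every incoming line to it as a callback.

import asyncio

class SomeClass:
    def __init__(self):
        self.overhead = "large overhead, built once"  # stand-in for the real setup

    def do_something(self, data):
        print("processing:", data)

async def handle_client(reader, writer, worker):
    # Called for each program that connects; every line it sends becomes a callback.
    async for line in reader:
        worker.do_something(line.decode().rstrip())
    writer.close()

async def main():
    worker = SomeClass()  # single instance, overhead paid once
    server = await asyncio.start_server(
        lambda r, w: handle_client(r, w, worker), "127.0.0.1", 8888)
    async with server:
        await server.serve_forever()

if __name__ == "__main__":
    asyncio.run(main())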
I am writing a Python app which will use a config file, so I am delegating the control of the config file to a dedicated module, configmanager, and within it a class, ConfigManager.
Whenever a method within ConfigManager that changes my config file in some way is run, I will need to get the latest version of the file from disk. Of course, in the spirit of DRY, I should delegate the opening of the config file to its own function.
However, I feel as though explicitly calling a method to get and return the config file in each function that edits it is not very "clean".
Is there a recommended way in Python to run a method, and make a value available to other methods in a class, whenever and before a method is run in that class?
In other words:
I create ConfigManager.edit_config().
Whenever ConfigManager.edit_config() is called, another function ConfigManager.get_config_file() is run.
ConfigManager.get_config_file() makes a value available to the method ConfigManager.edit_config().
And ConfigManager.edit_config() now runs, having access to the value given by ConfigManager.get_config_file().
I expect to have many versions of edit_config() methods in ConfigManager, hence the desire to DRY my code.
Is there a recommended way of accomplishing something like this? Or should I just create a function to get the config file, and manually call it each time?
The natural way to have "ConfigManager.get_config_file() makes a value available to the method ConfigManager.edit_config()" is to have get_config_file() return that value.
Just call get_config_file() within edit_config().
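In code, that plain approach would look something like this (a sketch; the path handling and the actual editing are placeholders):

class ConfigManager:
    def __init__(self, path):
        self.path = path

    def get_config_file(self):
        # Re-read the file from disk so callers always see the latest version.
        with open(self.path) as f:
            return f.read()

    def edit_config(self, key, value):
        config = self.get_config_file()
        # ... modify `config` using key/value and write it back ...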
If there are going to be many versions of edit_config(), then a decorator might be the way to go:
def config_editor(func):
    def wrapped(self, *args, **kwargs):
        config_file = self.get_config_file()
        return func(self, config_file, *args, **kwargs)
    return wrapped

class ConfigManager:
    # ...

    @config_editor
    def edit_config1(self, config_file, arg1):
        ...

    @config_editor
    def edit_config2(self, config_file, arg1, arg2):
        ...

mgr = ConfigManager()
mgr.edit_config1(arg1)
I don't actually like this:
Firstly, the declaration of edit_config1 takes one more argument than the actual usage needs (because the decorator supplies the additional argument).
Secondly, it doesn't actually save all that much boilerplate over:
def edit_config3(self, arg1):
    config_file = self.get_config_file()
In conclusion, I don't think the decorators save enough repetition to be worth it.
Since you get something from disk, you open a file, so you could use the class with Python's with statement.
You should look at context managers. With one, you can implement the functionality you want each time someone accesses the config file through the __enter__ method and (if needed) implement the clean-up for when the resource is no longer needed with the __exit__ method.
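A rough sketch of that idea (the plain-text read/write is a placeholder for whatever format the config file actually uses):

class ConfigManager:
    def __init__(self, path):
        self.path = path
        self.config = None

    def __enter__(self):
        # Runs on entry to the `with` block: load the latest version from disk.
        with open(self.path) as f:
            self.config = f.read()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the block finishes: write the (possibly edited) config back.
        with open(self.path, "w") as f:
            f.write(self.config)

# Usage: each edit automatically starts from the freshest copy on disk.
with ConfigManager("app.cfg") as mgr:
    mgr.config += "\n# new setting"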
I am working with the Python canmatrix library (well, presently my Python3 fork) which provides a set of classes for an in-memory description of CAN network messages as well as scripts for importing and exporting to and from on-disk representations (various standard CAN description file formats).
I am writing a PyQt application using the canmatrix library and would like to add some minor additional functionality to the bottom-level Signal class. Note that a CanMatrix organizes its member Frames, which in turn organize its member Signals. The whole structure is created by an import script which reads a file. I would like to retain the import script and sub-member finder functions of each layer, but add an extra 'value' member to the Signal class as well as getters/setters that can trigger Qt signals (not related to the canmatrix Signal objects).
It seems that standard inheritance approaches would require me to subclass every class in the library and override every function which creates the library Signal to use mine instead. Ditto for the import functions. This just seems horribly excessive to add non-intrusive functionality to a library.
I have tried inheriting and replacing the library class with my inherited one (with and without the pass-through constructor) but the import still creates library classes, not mine. I forget if I copied this from this other answer or not, but it's the same structure as referenced there.
class Signal(QObject, canmatrix.Signal):
    _my_signal = pyqtSignal(int)

    def __init__(self, *args, **kwargs):
        canmatrix.Signal.__init__(self, *args, **kwargs)
        # TODO: what about QObject
        print('boo')

    def connect(self, target):
        self._my_signal.connect(target)

    def set_value(self, value):
        self._my_value = value
        self._my_signal.emit(value)

canmatrix.Signal = Signal

print('overwritten')
Is there a direct error in my attempt here?
Am I doing this all wrong and need to go find some (other) design pattern?
My next attempt involved shadowing each instance of the library class. For any instance of the library class that I want to add the functionality to I must construct one of my objects which will associate itself with the library-class object. Then, with an extra layer, I can get from either object to the other.
class Signal(QObject):
    _my_signal = pyqtSignal(int)

    def __init__(self, signal):
        signal.signal = self
        self.signal = signal
        # TODO: what about QObject parameters
        QObject.__init__(self)
        self.value = None

    def connect(self, target):
        self._my_signal.connect(target)

    def set_value(self, value):
        self.value = value
        self._my_signal.emit(value)
The extra layer is annoying (library_signal.signal.set_value() rather than library_signal.set_value()) and the mutual references seem like they may keep both objects from ever getting cleaned up.
This does run and function, but I suspect there's still a better way.