This is my first post here so please let me know if I'm doing it wrong. I tried looking for an existing answer, but wasn't sure what to search for.
Consider the following simple example, a Python module called mymath.py which uses only built-in Python operations and modules. This custom module is portable, so anyone can execute the code without installing anything other than stock Python.
# mymath.py
import sys

def minus(a, b):
    return a - b

def mult(a, b):
    return a * b

def div(a, b):
    return a / b

def plus(a, b):
    return a + b

def sum_series(int_list):
    sum = 0
    for i in int_list:
        sum = plus(sum, i)
    return sum

def main():
    my_list = [2, 4, 6]
    value = sum_series(my_list)
    sys.stdout.write("your list total = {}".format(value))
Notice that main() only calls sum_series(), which in turn calls plus(). The other functions may be required elsewhere in this fictional code base, but we're only concerned with main().
Now, I would like to copy only the relevant source code to another object as a text string. In other words, gather main() and all its dependencies (recursively), resulting in a string of executable code.
My current solution:
import inspect
import mymath
# copy the source code into the callback
source = inspect.getsource(mymath)
# then append a call to the main function
source += "\nmain()"
This works, producing a local copy of the module as a string that can run main() without requiring an import of mymath. The problem is that the knob (the Nuke property that holds this callback; see below) is now bloated with all the extra unused functions, although it does pick up any changes I make to mymath.py when I rerun this solution.
So, the question is - is there a way to do the equivalent of:
source = getFunctionSourceRecursively(mymath.main)
source += "\nmain()"
resulting in source =
# mymath.py
import sys

def plus(a, b):
    return a + b

def sum_series(int_list):
    sum = 0
    for i in int_list:
        sum = plus(sum, i)
    return sum

def main():
    my_list = [2, 4, 6]
    sys.stdout.write("your list total = {}".format(sum_series(my_list)))

main()
So, basically "source" now contains only the relevant code and is portable, no longer requiring people offsite to have mymath installed.
If you're curious, my real-world case involves using The Foundry Nuke (compositing application) which has an internal callback system that can run code when callback events are triggered on a knob (property). I want to be able to share these saved Nuke files (.nk or .nknc) with offsite clients, without requiring them to modify their system.
You might try informal interfaces (a.k.a. protocols). While protocols work fine in many cases, there are situations where informal interfaces or duck typing in general can cause confusion. For example, an Addition class and a Multiplication class might both provide a mathFunc() method, but they aren't the same kind of thing even though they implement the same interface/protocol. Abstract Base Classes (ABCs) can help solve this issue.
The concept behind ABCs is simple: you define base classes which are abstract in nature, and mark certain methods on them as abstract. Any class deriving from these base classes is then forced to implement those methods. And since we're using base classes, if we see that an object has our class as a base class, we can say that this object implements the interface. That is, we can now use types to tell whether an object implements a certain interface.
from abc import ABC, abstractmethod

class MathClass(ABC):
    @abstractmethod
    def mathFunc(self):
        pass
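To illustrate (a minimal sketch; the Addition class is a made-up example):

class Addition(MathClass):
    def mathFunc(self):
        return "addition"

add = Addition()                   # works: mathFunc is implemented
print(isinstance(add, MathClass))  # True, so the interface is satisfied
# MathClass() itself would raise TypeError: it's abstract.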
I've tried to develop a "module expander" tool for Python 3, but I have some issues.
The idea is the following: for a given Python script main.py, the tool generates a functionally equivalent Python script expanded_main.py, by replacing each import statement with the actual code of the imported module; this assumes that the Python source code of the imported module is accessible. To do the job the right way, I'm using Python's builtin ast module as well as astor, a third-party tool that can dump an AST back into Python source. The motivation for this import expander is to be able to compile a script into one single bytecode chunk, so the Python VM does not have to take care of importing modules (this could be useful for MicroPython, for instance).
The simplest case is the statement:
from my_module1 import *
To transform this, my tool looks for a file my_module1.py and replaces the import statement with the content of this file. Then, expanded_main.py can access any name defined in my_module1, as if the module had been imported the normal way. I don't care about subtle side effects that may reveal the trick. Also, to simplify, I treat from my_module1 import a, b, c like the previous import (with asterisk). So far so good.
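For context, the overall cycle looks roughly like this (a sketch; the file names are assumptions):

import ast
import astor

with open("main.py") as f:
    tree = ast.parse(f.read())
# ... walk tree.body and replace Import / ImportFrom nodes
# with the parsed body of the imported module ...
with open("expanded_main.py", "w") as f:
    f.write(astor.to_source(tree))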
Now here is my point. How could you handle this flavor of import:
import my_module2
My first idea was to mimic this by creating a class having the same name as the module and copying the content of the Python file into it, indented:
class my_module2:
    # content of my_module2.py
    …
This actually works in many cases but, sadly, I discovered that it has several glitches: one of them is that it fails for functions whose body refers to a global variable defined in the module. For example, consider the following two Python files:
# my_module2.py
g = "Hello"

def greetings():
    print(g + " World!")
and
# main.py
import my_module2
print(my_module2.g)
my_module2.greetings()
At execution, main.py prints "Hello" and "Hello World!". Now, my expander tool will generate this:
# expanded_main.py
class my_module2:
    g = "Hello"

    def greetings():
        print(g + " World!")

print(my_module2.g)
my_module2.greetings()
At execution of expanded_main.py, the first print statement is OK ("Hello") but the greetings function raises an exception: NameError: name 'g' is not defined.
What actually happens is that:
- in the module my_module2, g is a global variable;
- in the class my_module2, g is a class variable, which must be referred to as my_module2.g.
Other similar side effects happen when you define functions, classes, … in my_module2.py and want to refer to them in other functions, classes, … of the same my_module2.py.
Any idea how these problems could be solved?
Apart from classes, are there other Python constructs that allow one to mimic a module?
Final note: I'm aware that the tool should take care (1) of nested imports (recursion) and (2) of possible multiple imports of the same module. I don't expect to discuss these topics here.
You can execute the source code of a module in the scope of a function, specifically an instance method. The attributes can then be made available by defining __getattr__ on the corresponding class and keeping a copy of the initial function's locals(). Here is some sample code:
class Importer:
    def __init__(self):
        g = "Hello"
        def greetings():
            print(g + " World!")
        self._attributes = locals()

    def __getattr__(self, item):
        return self._attributes[item]

module1 = Importer()
print(module1.g)
module1.greetings()
Nested imports are handled naturally by replacing them the same way with an instance of Importer. Duplicate imports shouldn't be a problem either.
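To generalize beyond a hand-written __init__, the same idea can exec the module's source with a fresh dict as its globals; functions defined that way share that dict, which sidesteps the class-variable scoping problem. A sketch (ModuleExpander is a hypothetical name; the expander would embed the module source as a string literal):

class ModuleExpander:
    def __init__(self, source):
        namespace = {}
        # functions defined by exec get `namespace` as their globals,
        # so module-level names like g resolve correctly
        exec(source, namespace)
        self._attributes = namespace

    def __getattr__(self, item):
        try:
            return self._attributes[item]
        except KeyError:
            raise AttributeError(item)

my_module2 = ModuleExpander(
    'g = "Hello"\ndef greetings():\n    print(g + " World!")')
print(my_module2.g)     # Hello
my_module2.greetings()  # Hello World!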
A proper Python module will list all its public symbols in a list called __all__. Managing that list can be tedious, since you'll have to list each symbol twice. Surely there are better ways, probably using decorators, so one would merely annotate the exported symbols as @export.
How would you write such a decorator? I'm certain there are different ways, so I'd like to see several answers with enough information that users can compare the approaches against one another.
In Is it a good practice to add names to __all__ using a decorator?, Ed L suggests the following, to be included in some utility library:
import sys

def export(fn):
    """Use a decorator to avoid retyping function/class names.

    * Based on an idea by Duncan Booth:
      http://groups.google.com/group/comp.lang.python/msg/11cbb03e09611b8a
    * Improved via a suggestion by Dave Angel:
      http://groups.google.com/group/comp.lang.python/msg/3d400fb22d8a42e1
    """
    mod = sys.modules[fn.__module__]
    if hasattr(mod, '__all__'):
        name = fn.__name__
        all_ = mod.__all__
        if name not in all_:
            all_.append(name)
    else:
        mod.__all__ = [fn.__name__]
    return fn
We've adapted the name to match the other examples. With this in a local utility library, you'd simply write
from .utility import export
and then start using @export. Just one line of idiomatic Python, you can't get much simpler than this. On the downside, the decorator does require access to the module, via the __module__ property and the sys.modules cache, both of which may be problematic in some of the more esoteric setups (like custom import machinery, or wrapping functions from another module to create functions in this module).
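For instance, usage might look like this (module and function names invented for illustration):

# shapes.py
from .utility import export

@export
def area(w, h):
    return w * h

def _helper():  # not decorated, so not added to __all__
    pass

# `from shapes import *` now only pulls in area().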
The python part of the atpublic package by Barry Warsaw does something similar to this. It offers some keyword-based syntax, too, but the decorator variant relies on the same patterns used above.
This great answer by Aaron Hall suggests something very similar, with two more lines of code as it doesn't use __dict__.setdefault. It might be preferable if manipulating the module __dict__ is problematic for some reason.
You could simply declare the decorator at the module level like this:
__all__ = []

def export(obj):
    __all__.append(obj.__name__)
    return obj
This is perfect if you only use this in a single module. At 4 lines of code (plus probably some empty lines for typical formatting practices) it's not overly expensive to repeat this in different modules, but it does feel like code duplication in those cases.
You could define the following in some utility library:
def exporter():
    all = []
    def decorator(obj):
        all.append(obj.__name__)
        return obj
    return decorator, all

export, __all__ = exporter()
export(exporter)
# possibly some other utilities, decorated with @export as well
Then inside your public library you'd do something like this:
from . import utility
export, __all__ = utility.exporter()
# start using @export
Using the library takes two lines of code here. It combines the definition of __all__ and the decorator, so people searching for one of them will find the other, helping readers to quickly understand your code. The above will also work in exotic environments, where the module may not be available from the sys.modules cache or where the __module__ property has been tampered with.
https://github.com/russianidiot/public.py has yet another implementation of such a decorator. Its core file is currently 160 lines long! The crucial point appears to be that it uses the inspect module to obtain the appropriate module from the current call stack.
This is not a decorator approach, but provides the level of efficiency I think you're after.
https://pypi.org/project/auto-all/
You can use the two functions provided with the package to "start" and "end" capturing the module objects that you want included in the __all__ variable.
from auto_all import start_all, end_all

# Imports outside the start and end functions won't be externally available.
from pathlib import Path

def a_private_function():
    print("This is a private function.")

# Start defining externally accessible objects
start_all(globals())

def a_public_function():
    print("This is a public function.")

# Stop defining externally accessible objects
end_all(globals())
The functions in the package are trivial (a few lines), so could be copied into your code if you want to avoid external dependencies.
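As a rough idea of how such a pair of functions could work, here is a sketch of the approach (not the actual auto_all source):

def start_all(globals_dict):
    # remember which names existed before the public section
    globals_dict['_existing_names'] = set(globals_dict)

def end_all(globals_dict):
    existing = globals_dict.pop('_existing_names')
    # everything defined since start_all() becomes public
    globals_dict['__all__'] = [name for name in globals_dict
                               if name not in existing]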
While the other variants are technically correct to a certain extent, one might also want to be sure that:
- if the target module already has __all__ declared, it is handled correctly;
- the target appears in __all__ only once:
# utils.py
import sys
from typing import Any

def export(target: Any) -> Any:
    """
    Mark a module-level object as exported.
    Simplifies tracking of objects available via wildcard imports.
    """
    mod = sys.modules[target.__module__]
    __all__ = getattr(mod, '__all__', None)
    if __all__ is None:
        __all__ = []
        setattr(mod, '__all__', __all__)
    elif not isinstance(__all__, list):
        __all__ = list(__all__)
        setattr(mod, '__all__', __all__)
    target_name = target.__name__
    if target_name not in __all__:
        __all__.append(target_name)
    return target
I have a module I am using which uses RealClass, so it is an internal dependency I don't have access to.
I want to be able to create a FakeClass which replaces the functionality of the RealClass for testing. I don't want to replace individual methods but the entire class.
I looked at stubble which seems to be what I want but I was wondering if mox or any of the other mocking frameworks have this functionality? Or what would you suggest to use? Maybe fudge, monkey-patching? Just looking for best practices with this stuff. Also any useful examples would be awesome.
Pseudo code:
from module import RealClass
class FakeClass:
    # methods from RealClass overridden here
    ...

class Test(unittest.TestCase):
    def setUp(self): ...
    def tearDown(self): ...
    def test1(self):
        stub(RealClass, FakeClass)  # something like this, but really just want the functionality
        # classThatUsesRealClass will now use FakeClass
UPDATE:
Here's one way I found to do it. It isn't perfect but it works.
Example:
fake = FakeClass()
stub = stubout.StubOutForTesting()
stub.Set(RealClass, 'method_1', fake.method_1)
stub.Set(RealClass, 'method_2', fake.method_2)
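For what it's worth, the standard library's unittest.mock (Python 3.3+, available as the mock package on older versions) can now do this class-level swap directly; a sketch with placeholder module paths:

import unittest
from unittest import mock

class Test(unittest.TestCase):
    def test1(self):
        # inside the block, the RealClass attribute of `module`
        # is replaced, so module.RealClass resolves to FakeClass
        with mock.patch('module.RealClass', FakeClass):
            obj = module.ClassThatUsesRealClass()
            # ... assert against FakeClass behaviour ...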
I think you want opinions/experiences so I'm just giving my 2 cents.
As you noticed, there are a few Python testing tools/classes/frameworks, but most of the time, given the simplicity/dynamism/openness of Python, you will limit yourself to ad-hoc test cases involving stubbing at the interface level, plus a bit of unittest... until you start using the frameworks.
There is nothing pejorative about monkey-patching, especially when it comes to performing testing/stubbing:
#!/usr/bin/env python
# minimal example of library code

class Class:
    """ a class """
    def method(self, arg):
        """ a method that does real work """
        print("pouet %s" % arg)

#!/usr/bin/env python
# minimal example for stub and tests, overriding/wrapping one method
from Class import Class

Class._real_method = Class.method

def mymethod(self, arg):
    # do what you want
    print("called stub")
    # in case you want to call the real function...
    self._real_method(arg)

Class.method = mymethod

# ...
e = Class()
e.method("pouet")
Namespaces will allow you to patch stuff inside of imported modules inside of imported modules...
Note that the above method does not work with classes in C modules.
For them you can use a wrapper class that filters on class member names using getattr/setattr, and returns the redefined members from the wrapper class.
#!/usr/bin/env python
# Stupid minimal example replacing the sys module
# (not very useful / optimal, it's just an example of patching)
import sys

class SysWrap():
    real = sys
    def __getattr__(self, attr):
        if attr == 'stderr':
            class StdErr():
                def write(self, txt):
                    print("[err: %s]" % txt)
            return StdErr()
        print("Getattr %s" % attr)
        return getattr(SysWrap.real, attr)

sys = SysWrap()
# use the real stdout
sys.stdout.write("pouet")
# use fake stderr
sys.stderr.write("pouet")
Once you get tired of performing ad-hoc testing, you'll find higher-level tools such as the ones you mentioned (stubble, fudge) useful, but to enjoy them and use them efficiently you first have to see the problems they solve and accept all the automatic stuff they do under the hood.
It is probable that some ad-hoc monkey-patching will remain; it's just easier to understand, and all the tools have some limitations.
Tools empower you but you have to deeply understand them to use them efficiently.
An important aspect when deciding whether to use a tool or not is that when you transmit a chunk of code, you transmit the whole environment (including testing tools).
The next guy might not be as smart as you and skip the testing because your testing tool is too complex for him.
Generally you want to avoid using a lot of dependencies in your software.
In the end, I think that nobody will bother you if you just use unittest and ad-hoc tests/monkey-patching, provided your stuff works.
Your code might not be that complex anyway.
I am writing a moderate-sized (a few KLOC) PyQt app. I started out writing it in nice modules for ease of comprehension but I am foundering on the rules of Python namespaces. At several points it is important to instantiate just one object of a class as a resource for other code.
For example: an object that represents Aspell attached as a subprocess, offering a check(word) method. Another example: the app features a single QTextEdit and other code needs to call on methods of this singular object, e.g. "if theEditWidget.document().isEmpty()..."
No matter where I instantiate such an object, it can only be referenced from code in that module and no other. So e.g. the code of the edit widget can't call on the Aspell gateway object unless the Aspell object is created in the same module. Fine except it is also needed from other modules.
In this question the bunch class is offered, but it seems to me a bunch has exactly the same problem: it's a unique object that can only be used in the module where it's created. Or am I completely missing the boat here?
OK, as suggested elsewhere, this seems like a simple answer to my problem. I just tested the following:
junk_main.py:
import junk_A
singularResource = junk_A.thing()
import junk_B
junk_B.handle = singularResource
print junk_B.look()
junk_A.py:
class thing():
    def __init__(self):
        self.member = 99
junk_B.py:
def look():
    return handle.member
When I run junk_main it prints 99. So the main code can inject names into modules just by assignment. I am trying to think of reasons this is a bad idea.
You can access objects in a module with the . operator just like with a function. So, for example:
# Module a.py
a = 3
>>> import a
>>> print a.a
3
This is a trivial example, but you might want to do something like:
# Module EditWidget.py
theEditWidget = EditWidget()
...
# Another module
import EditWidget
if EditWidget.theEditWidget.document().isEmpty():
Or...
from EditWidget import *
if theEditWidget.document().isEmpty():
If you do go the from EditWidget import * route, you can even define a list named __all__ in your modules with the names (as strings) of all the objects you want your module to export to *. So if you wanted only theEditWidget to be exported, you could do:
# Module EditWidget.py
__all__ = ["theEditWidget"]
theEditWidget = EditWidget()
...
It turns out the answer is simpler than I thought. As I noted in the question, the main module can add names to an imported module. And any code can add members to an object. So the simple way to create an inter-module communication area is to create a very basic object in the main, say IMC (for inter-module communicator) and assign to it as members, anything that should be available to other modules:
IMC.special = A.thingy()
IMC.important_global_constant = 0x0001
etc. After importing any module, just assign IMC to it:
import B
B.IMC = IMC
Now, this is probably not the greatest idea from a software design standpoint. If you just limit IMC to holding named constants, it acts like a C header file. If it's just there to give access to singular resources, it acts like an extern reference. But because of Python's liberal rules, code in any module can modify or add members to IMC. Used in an undisciplined way, "who changed that?" could become a debugging issue. If there are multiple processes, race conditions are a danger.
At several points it is important to instantiate just one object of a class as a resource for other code.
Instead of trying to create some sort of singleton factory, can you not create the single-use object somewhere between the main point of entry for the program and instantiating the object that needs it? The single-use object can just be passed as a parameter to the other object. Logically, then, you won't create the single-use object more than once.
For example:
def main(...):
    aspell_instance = ...
    myapp = MyAppClass(aspell_instance)
or...
class SomeWidget(...):
    def __init__(self, edit_widget):
        self.edit_widget = edit_widget

    def onSomeEvent(self, ...):
        if self.edit_widget.document().isEmpty():
            ....
I don't know if that's clear enough, or if it's applicable to your situation. But to be honest, the only time I've found I can't do this is in a CherryPy-based webserver, where the points of entry were pretty much everywhere.
I am doing some heavy commandline stuff (not really web based) and am new to Python, so I was wondering how to set up my files/folders/etc. Are there "header" files where I can keep all the DB connection stuff?
How/where do I define classes and objects?
Just to give you an example of a typical Python module's source, here's something with some explanation. This is a file named "Dims.py". This is not the whole file, just some parts to give an idea what's going on.
#!/usr/bin/env python
This is the standard first line telling the shell how to execute this file. Saying /usr/bin/env python instead of /usr/bin/python tells the shell to find Python via the user's PATH; the desired Python may well be in ~/bin or /usr/local/bin.
"""Library for dealing with lengths and locations."""
If the first thing in the file is a string, it is the docstring for the module. A docstring is a string that appears immediately after the start of an item, which can be accessed via its __doc__ property. In this case, since it is the module's docstring, if a user imports this file with import Dims, then Dims.__doc__ will return this string.
# Units
MM_BASIC = 1500000
MILS_BASIC = 38100
IN_BASIC = MILS_BASIC * 1000
There are a lot of good guidelines for formatting and naming conventions in a document known as PEP (Python Enhancement Proposal) 8. These are module-level variables (constants, really) so they are written in all caps with underscores. No, I don't follow all the rules; old habits die hard. Since you're starting fresh, follow PEP 8 unless you can't.
_SCALING = 1
_SCALES = {
    "mm_basic": MM_BASIC,
    "mm": MM_BASIC,
    "mils_basic": MILS_BASIC,
    "mil": MILS_BASIC,
    "mils": MILS_BASIC,
    "basic": 1,
    1: 1
}
These module-level variables have leading underscores in their names. This gives them a limited amount of "privacy": a wildcard import (from Dims import *) will not pull in _SCALING. However, if you need to mess with it, you can still access it explicitly as Dims._SCALING, or say from Dims import _SCALING as scaling.
def UnitsToScale(units=None):
    """Scales the given units to the current scaling."""
    if units is None:
        return _SCALING
    elif units not in _SCALES:
        raise ValueError("unrecognized units: '%s'." % units)
    return _SCALES[units]
UnitsToScale is a module-level function. Note the docstring and the use of default values and exceptions. No spaces around the = in default value declarations.
class Length(object):
    """A length. Makes unit conversions easier.

    The basic, mm, and mils properties can be used to get or set the length
    in the desired units.

    >>> x = Length(mils=1000)
    >>> x.mils
    1000.0
    >>> x.mm
    25.399999999999999
    >>> x.basic
    38100000L
    >>> x.mils = 100
    >>> x.mm
    2.54
    """
The class declaration. Note that the docstring contains things that look like interactive Python commands. These are called doctests: they are test code embedded in the docstring. More on this later.
    def __init__(self, unscaled=0, basic=None, mm=None, mils=None, units=None):
        """Constructs a Length.

        Default constructor creates a length of 0.

        >>> Length()
        Length(basic=0)

        Length(<float>) or Length(<string>) creates a length with the given
        value at the current scale factor.

        >>> Length(1500)
        Length(basic=1500)
        >>> Length("1500")
        Length(basic=1500)
        """
        # Straight copy
        if isinstance(unscaled, Length):
            self._x = unscaled._x
            return
        # rest omitted
This is the initializer. Unlike C++, you only get one, but you can use default arguments to make it look like several different constructors are available.
    def _GetBasic(self): return self._x
    def _SetBasic(self, x): self._x = x
    basic = property(_GetBasic, _SetBasic, doc="""
        This returns the length in basic units.""")
This is a property. It allows you to have getter/setter functions while using the same syntax as you would for accessing any other data member, in this case, myLength.basic = 10 does the same thing as myLength._SetBasic(10). Because you can do this, you should not write getter/setter functions for your data members by default. Just operate directly on the data members. If you need to have getter/setter functions later, you can convert the data member to a property and your module's users won't need to change their code. Note that the docstring is on the property, not the getter/setter functions.
If you have a property that is read-only, you can use property as a decorator to declare it. For example, if the above property was to be read-only, I would write:
    @property
    def basic(self):
        """This returns the length in basic units."""
        return self._x
Note that the name of the property is the name of the getter method. You can also use decorators to declare setter methods in Python 2.6 or later.
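For example, a read/write version using the decorator syntax might look like this (a sketch following the same Length class):

    @property
    def basic(self):
        """The length in basic units."""
        return self._x

    @basic.setter
    def basic(self, x):
        self._x = x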
    def __mul__(self, other):
        """Multiplies a Length by a scalar.

        >>> Length(10)*10
        Length(basic=100)
        >>> 10*Length(10)
        Length(basic=100)
        """
        if type(other) not in _NumericTypes:
            return NotImplemented
        return Length(basic=self._x * other)
This overrides the * operator. Note that you can return the special value NotImplemented to tell Python that this operation isn't implemented (in this case, if you try to multiply by a non-numeric type like a string).
__rmul__ = __mul__
Since code is just a value like anything else, you can assign the code of one method to another. This line tells Python that the something * Length operation uses the same code as Length * something. Don't Repeat Yourself.
Now that the class is declared, I can get back to module code. In this case, I have some code that I want to run only if this file is executed by itself, not if it's imported as a module. So I use the following test:
if __name__ == "__main__":
Then the code in the if is executed only if this is being run directly. In this file, I have the code:
import doctest
doctest.testmod()
This goes through all the docstrings in the module and looks for lines that look like Python prompts with commands after them. The lines following are assumed to be the output of the command. If the commands output something else, the test is considered to have failed and the actual output is printed. Read the doctest module documentation for all the details.
One final note about doctests: They're useful, but they're not the most versatile or thorough tests available. For those, you'll want to read up on unittests (the unittest module).
Each python source file is a module. There are no "header" files. The basic idea is that when you import "foo" it'll load the code from "foo.py" (or a previously compiled version of it). You can then access the stuff from the foo module by saying foo.whatever.
There seem to be two ways for arranging things in Python code. Some projects use a flat layout, where all of the modules are at the top-level. Others use a hierarchy. You can import foo/bar/baz.py by importing "foo.bar.baz". The big gotcha with hierarchical layout is to have __init__.py in the appropriate directories (it can even be empty, but it should exist).
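For example, a hierarchical layout might look like this (names invented for illustration):

foo/
    __init__.py
    bar/
        __init__.py
        baz.py

With that in place, import foo.bar.baz works from the directory containing foo/.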
Classes are defined like this:
class MyClass(object):
def __init__(self, x):
self.x = x
def printX(self):
print self.x
To create an instance:
z = MyClass(5)
You can organize it in whatever way makes the most sense for your application. I don't exactly know what you're doing so I can't be certain what the best organization would be for you, but you can pretty much split it up as you see fit and just import what you need.
You can define classes in any file, and you can define as many classes as you would like in a script (unlike Java). There are no official header files (not like C or C++), but you can use config files to store info about connecting to a DB, whatever, and use configparser (a standard library module) to organize them.
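For example, DB settings could live in an INI file read with configparser (a sketch; the file name and section are assumptions, and on Python 2 the module is spelled ConfigParser):

import configparser

config = configparser.ConfigParser()
config.read("settings.ini")
# assumes settings.ini contains a [database] section
db_host = config["database"]["host"]
db_name = config["database"]["name"]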
It makes sense to keep like things in the same file, so if you have a GUI, you might have one file for the interface, and if you have a CLI, you might keep that in a file by itself. It's less important how your files are organized and more important how the source is organized into classes and functions.
This would be the place to look for that: http://docs.python.org/reference/.
First of all, compile and install pip: http://pypi.python.org/pypi/pip. It is like Ubuntu's apt-get. You run it via a Terminal by typing in pip install package-name. It has a database of packages, so you can install/uninstall stuff quite easily with it.
As for importing and "header" files, from what I can tell, if you run import foo, Python looks for foo.py in the current folder. If it's not there, it looks for eggs (folders unzipped in the Python module directory) and imports those.
As for defining classes and objects, here's a basic example:
class foo(foobar2):  # extends the class 'foobar2'
    def __init__(self, the, list, of, args=True):
        # the arguments get passed to __init__, so you can create
        # a foo() object with three positional arguments
        self.var = 'foo'

    def bar(self, args):
        self.var = 'bar'

    def foobar(self):  # even with no other arguments, never leave out self
        print self.var

foobar = foo('the', 'class', 'args')  # This is how you initialize it!
Read more on this in the Python Reference, but my only tip is to never forget the self argument in class functions. It will save you a lot of debugging headaches...
Good luck!
There's no fixed structure for Python programs, but you can take a Django project as an example. A Django project consists of one settings.py module, where global settings (like your example with DB connection properties) are stored, plus pluggable applications. Each application has its own models.py module, which stores database models and, possibly, other domain-specific objects. All the rest is up to you.
Note that this advice is not specific to Python. In C/C++ you probably used a similar structure and kept settings in XML. Just forget about headers and put settings in a plain .py file, that's all.