I want to unify two slightly different scripts.
My idea was to keep the common part in a file (modX.py) and create two other files to give two different entry points (A.py and B.py). The common part will be called through an 'import'.
from modX import *
Now, I don't see how I can have specific parts in modX. One idea was to test a variable having different values in A.py and B.py.
In modX.py :
if 'is_A' in globals():
    my_string = "spam"
else:
    my_string = "eggs"
A.py :
is_A = True
from modX import *
print("I love {}".format(my_string))
How can my_string get "spam" ?
Although putting all the common part in a function can be more pythonic, I would avoid refactoring modX.py too much if I can.
I've never figured out a way to pass arguments to a module on import, although it would be very useful. However, there are ways to work around the limitation that make use of the fact that module objects are cached in the sys.modules dictionary when they're first imported and can be replaced with an instance of a class. Note that attributes assigned to the class instance (self) effectively become the module's attributes after it's stored in sys.modules.
Here's how that could be used in your example:
modX.py
import sys
class MyModule(object):
    def __init__(self, arg=None):
        if arg == 'is_A':
            self.my_string = 'spam'
        else:
            self.my_string = 'eggs'

        def called_by(arg):  # nested function - no self
            import sys
            # Replace module entry with new instance of MyModule
            sys.modules[__name__] = MyModule(arg)
        self.called_by = called_by
# Replace module entry in sys.modules[__name__] with a default instance of
# MyModule (and create an additional reference to original module so it's not
# deleted)
_ref, sys.modules[__name__] = sys.modules[__name__], MyModule()
del sys # clean-up namespace (optional)
A.py
from modX import *
called_by('is_A') # changes modX
from modX import * # do it again to get modified version
print("I love {}".format(my_string)) # -> I love spam
B.py
from modX import *
print("I love {}".format(my_string)) # -> I love eggs
It does not work because globals() inside modX is a different namespace from globals() inside A.
You can do it like this:
In modX.py :
def foo():
    if 'is_A' in globals():
        return "spam"
    else:
        return "eggs"
A.py :
import modX  # needed so that the name modX is bound in A.py
from modX import *
modX.is_A = True
print("I love {}".format(foo()))
You can also try the inspect module; maybe you will find a way with that. Tell us if it helps.
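To see why the flag has to be set on the module rather than in the importing script, here is a minimal sketch. It builds a stand-in for modX in memory with types.ModuleType instead of a real file, which is an assumption made only to keep the example self-contained.

```python
import sys
import types

# Hypothetical in-memory stand-in for modX.py (illustration only).
MODX_SOURCE = '''
def foo():
    # globals() here is modX's own namespace, not the importer's
    return "spam" if "is_A" in globals() else "eggs"
'''

modX = types.ModuleType("modX")
exec(MODX_SOURCE, modX.__dict__)
sys.modules["modX"] = modX

is_A = True              # lives in the importing namespace only
before = modX.foo()      # modX cannot see the importer's is_A
modX.is_A = True         # set the flag inside modX's namespace instead
after = modX.foo()

print(before, after)
```

The first call falls through to the else branch because the importer's is_A never reaches modX; only the attribute set on the module object changes the result.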
Problem
Consider the following layout:
package/
    main.py
    math_helpers/
        mymath.py
        __init__.py
mymath.py contains:
import math
def foo():
    pass
In main.py I want to be able to use code from mymath.py like so:
import math_helpers
math_helpers.foo()
In order to do so, __init__.py contains:
from .mymath import *
However, modules imported in mymath.py are now in the math_helpers namespace, e.g. math_helpers.math is accessible.
Current approach
I'm adding the following at the end of mymath.py.
import types
__all__ = [name for name, thing in globals().items()
if not (name.startswith('_') or isinstance(thing, types.ModuleType))]
This seems to work, but is it the correct approach?
On the one hand there are many good reasons not to do star imports, but on the other hand, python is for consenting adults.
__all__ is the recommended approach to determining what shows up in a star import. Your approach is correct, and you can further sanitize the namespace when finished:
import types
__all__ = [name for name, thing in globals().items()
if not (name.startswith('_') or isinstance(thing, types.ModuleType))]
del types
While less recommended, you can also sanitize elements directly out of the module, so that they don't show up at all. This will be a problem if you need to use them in a function defined in the module, since every function object has a __globals__ reference that is bound to its parent module's __dict__. But if you only import math_helpers to call math_helpers.foo(), and don't require a persistent reference to it elsewhere in the module, you can simply unlink it at the end:
del math_helpers
Long Version
A module import runs the code of the module in the namespace of the module's __dict__. Any names that are bound at the top level, whether by class definition, function definition, direct assignment, or other means, live in that dictionary. Sometimes, it is desirable to clean up intermediate variables, as I suggested doing with types.
Let's say your module looks like this:
test_module.py
import math
import numpy as np
def x(n):
    return math.sqrt(n)

class A(np.ndarray):
    pass

import types
__all__ = [name for name, thing in globals().items()
           if not (name.startswith('_') or isinstance(thing, types.ModuleType))]
In this case, __all__ will be ['x', 'A']. However, the module itself will contain the following names: 'math', 'np', 'x', 'A', 'types', '__all__'.
If you run del types at the end, it will remove that name from the namespace. Clearly this is safe because types is not referenced anywhere once __all__ has been constructed.
Similarly, if you wanted to remove np by adding del np, that would be OK. The class A is fully constructed by the end of the module code, so it does not require the global name np to reference its parent class.
Not so with math. If you were to do del math at the end of the module code, the function x would not work. If you import your module, you can see that x.__globals__ is the module's __dict__:
import test_module
test_module.__dict__ is test_module.x.__globals__
If you delete math from the module dictionary and call test_module.x, you will get
NameError: name 'math' is not defined
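The failure can be reproduced with a small sketch. The module is created in memory here (an assumption for illustration; a real test_module.py on disk behaves the same way), and numpy is left out so that only the standard library is needed.

```python
import types

# Hypothetical in-memory stand-in for test_module.py, without the numpy part.
SOURCE = '''
import math

def x(n):
    return math.sqrt(n)
'''

test_module = types.ModuleType("test_module")
exec(SOURCE, test_module.__dict__)

# The function looks its globals up in the module's own dictionary.
same_dict = test_module.__dict__ is test_module.x.__globals__
ok = test_module.x(9.0)

del test_module.math          # same effect as `del math` at the end of the module
try:
    test_module.x(9.0)
    error = None
except NameError as exc:
    error = str(exc)
```

The lookup of math happens each time x runs, which is why removing the name after the function was defined still breaks it.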
So under some very special circumstances you may be able to sanitize the namespace of mymath.py, but that is not the recommended approach, as it only applies to certain cases.
In conclusion, stick to using __all__.
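As a rough check that the __all__ filter does what is claimed, the following sketch runs the snippet inside an in-memory module and then performs a star import into a scratch namespace. The module name mymath_demo and the in-memory setup are assumptions for illustration.

```python
import sys
import types

# Hypothetical stand-in for mymath.py, ending with the __all__ filter from above.
SOURCE = '''
import math

def foo():
    return math.pi

import types
__all__ = [name for name, thing in globals().items()
           if not (name.startswith('_') or isinstance(thing, types.ModuleType))]
del types
'''

mymath_demo = types.ModuleType("mymath_demo")
exec(SOURCE, mymath_demo.__dict__)
sys.modules["mymath_demo"] = mymath_demo

# Simulate `from mymath_demo import *` in a fresh namespace.
namespace = {}
exec("from mymath_demo import *", namespace)
star_imported = sorted(n for n in namespace if not n.startswith("__"))

exported = sorted(mymath_demo.__all__)
```

Note that math is still an attribute of the module itself; __all__ only controls what a star import copies into the importer.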
A Story That's Sort of Relevant
One time, I had two modules that implemented similar functionality, but for different types of end users. There were a couple of functions that I wanted to copy out of module a into module b. The problem was that I wanted the functions to work as if they had been defined in module b. Unfortunately, they depended on a constant that was defined in a. b defined its own version of the constant. For example:
a.py
value = 1
def x():
    return value
b.py
from a import x
value = 2
I wanted b.x to access b.value instead of a.value. I pulled that off by adding the following to b.py (based on https://stackoverflow.com/a/13503277/2988730):
import functools, types
x = functools.update_wrapper(types.FunctionType(x.__code__, globals(), x.__name__, x.__defaults__, x.__closure__), x)
x.__kwdefaults__ = x.__wrapped__.__kwdefaults__
x.__module__ = __name__
del functools, types
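The rebinding trick can be tried in isolation. This sketch uses two plain dictionaries as stand-ins for the global namespaces of a and b (an assumption, to avoid needing two files); the names a_globals, b_globals, a_x and b_x are hypothetical.

```python
import functools
import types

# Stand-in for module a: defines value and x in its own globals.
a_globals = {"__name__": "a", "value": 1}
exec("def x():\n    return value\n", a_globals)
a_x = a_globals["x"]

# Stand-in for module b: has its own value.
b_globals = {"__name__": "b", "value": 2}

# Build a new function from the same code object, bound to b's globals.
b_x = functools.update_wrapper(
    types.FunctionType(a_x.__code__, b_globals, a_x.__name__,
                       a_x.__defaults__, a_x.__closure__),
    a_x)
b_x.__kwdefaults__ = a_x.__kwdefaults__
b_x.__module__ = "b"

from_a = a_x()   # reads value from a's globals
from_b = b_x()   # same code, reads value from b's globals
```

Both functions share one code object; only the dictionary used for global lookups differs.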
Why am I telling you all this? Well, you can make a version of your module that does not have any stray names in your namespace. You won't be able to see changes to global variables in your functions though. This is just an exercise in pushing python beyond its normal usage. I strongly recommend against doing this, but here is a sample module that effectively freezes its __dict__ as far as the functions are concerned. This has the same members as test_module above, but with no modules in the global namespace:
import math
import numpy as np

def x(n):
    return math.sqrt(n)

class A(np.ndarray):
    pass

import functools, types, sys

def wrap(obj):
    """ Written this way to be able to handle classes """
    for name in dir(obj):
        if name.startswith('_'):
            continue
        thing = getattr(obj, name)
        if isinstance(thing, types.FunctionType) and thing.__module__ == __name__:
            # rebuild the function so that it looks up globals in the frozen copy d
            setattr(obj, name,
                    functools.update_wrapper(types.FunctionType(thing.__code__, d, thing.__name__, thing.__defaults__, thing.__closure__), thing))
            getattr(obj, name).__kwdefaults__ = thing.__kwdefaults__
        elif isinstance(thing, type) and thing.__module__ == __name__:
            wrap(thing)

d = globals().copy()
wrap(sys.modules[__name__])
del d, wrap, sys, math, np, functools, types
So yeah, please don't ever do this! But if you do, stick it in a utility class somewhere.
I have written the following code to modify the behavior of a method of one class
import mymodule
mymodule.MyClass.f = mydecorator(mymodule.MyClass.f)
mymodule.MyClass.f(x) # call the modified function
This works for my purposes, but: what have I modified exactly? Is mymodule.MyClass a copy of the original class living inside the current module? Does it in any way affect the original class? How does import work exactly?
When you modify an imported module, you modify the cached instance. Thus your changes will affect all other modules that import the modified module.
https://docs.python.org/3/reference/import.html#the-module-cache
UPDATE:
You can test it.
change_sys.py:
import sys
# Let's change a module
sys.t = 3
main.py:
# the order of imported modules doesn't matter
# they both use cached sys
import sys
import change_sys
print(sys.t)
Output for python ./main.py:
3
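The same behaviour can be checked in a single file. This is only a sketch: it uses the standard json module as an arbitrary example, and extra_flag is a made-up attribute name.

```python
import importlib
import sys

import json as first
second = importlib.import_module("json")

# Every import of the same module returns the one object cached in sys.modules.
same_object = first is second is sys.modules["json"]

first.extra_flag = 3          # hypothetical attribute, added through one reference
seen_elsewhere = second.extra_flag   # visible through the other reference
del first.extra_flag          # clean up the shared module again
```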
It depends. In normal use cases everything should be ok. But one can imagine special cases where it can lead to weird results:
a.py:
import c
x = c.C()
def disp():
    return x.foo()
b.py:
import c
def change():
    c.C.foo = (lambda self: "bar at " + str(self))
c.py:
class C:
    def foo(self):
        return "foo at " + str(self)
Now in the top-level script (or interactive interpreter) I write:
import a
import b
a.disp()
b.change()
a.disp()
Output will be:
'foo at <c.C object at 0x0000013E4A65D080>'
'bar at <c.C object at 0x0000013E4A65D080>'
It may be what you want, but the change has been done in b module and it does affect a module.
I have a system that collects all classes that derive from certain base classes and stores them in a dictionary. I want to avoid having to specify which classes are available (I would like to discover them programmatically), so I have used a from ModuleName import * statement. The user is then directed to place all tests to be collected in the ModuleName module. However, I cannot find a way to programmatically determine what symbols were imported with that import statement. I have tried using dir() and __dict__ as indicated in the following example, but to no avail. How does one programmatically find symbols imported in this manner (with import *)? I am unable to find them with the above methods.
testTypeFigureOuterrer.py:
from testType1 import *
from testType2 import *
class TestFigureOuterrer(object):
    def __init__(self):
        self.existingTests = {'type1':{},'type2':{}}
    def findAndSortTests(self):
        for symbol in dir(): # Also tried: dir(self) and __dict__
            try:
                thing = self.__getattribute__(symbol)
            except AttributeError:
                continue
            if issubclass(thing,TestType1):
                self.existingTests['type1'].update( dict(symbol,thing) )
            elif issubclass(thing,TestType3):
                self.existingTests['type2'].update( dict(symbol,thing) )
            else:
                continue
if __name__ == "__main__":
    testFigureOuterrer = TestFigureOuterrer()
    testFigureOuterrer.findAndSortTests()
testType1.py:
class TestType1(object):
    pass
class TestA(TestType1):
    pass
class TestB(TestType1):
    pass
testType2.py:
class TestType2:
    pass
class TestC(TestType2):
    pass
class TestD(TestType2):
    pass
Since you know the imports yourself, you should just import the module manually again, and then check the contents of the module. If an __all__ property is defined, its contents are imported as names when you do from module import *. Otherwise, just use all its members:
def getImportedNames (module):
    names = module.__all__ if hasattr(module, '__all__') else dir(module)
    return [name for name in names if not name.startswith('_')]
This has the benefit that you do not need to go through the globals, and filter everything out. And since you know the modules you import from at design time, you can also check them directly.
from testType1 import *
from testType2 import *
import testType1, testType2
print(getImportedNames(testType1))
print(getImportedNames(testType2))
Alternatively, you can also look up the module by its module name from sys.modules, so you don’t actually need the extra import:
import sys
def getImportedNames (moduleName):
    module = sys.modules[moduleName]
    names = module.__all__ if hasattr(module, '__all__') else dir(module)
    return [name for name in names if not name.startswith('_')]
print(getImportedNames('testType1'))
print(getImportedNames('testType2'))
Take a look at this SO answer, which describes how to determine the names of loaded classes. You can get the names of all the classes defined within the context of the module:
import sys, inspect
clsmembers = inspect.getmembers(sys.modules['testType1'], inspect.isclass)
which is now defined as
[('TestA', testType1.TestA),
('TestB', testType1.TestB),
('TestType1', testType1.TestType1)]
You can also replace testType1 with __name__ when you're within the function of interest.
Don't use the * form of import. This dumps the imported names into your script's global namespace. Not only could they clobber some important bit of data by using the same name, you don't have any easy way to fish out the names you just imported. (Easiest way is probably to take a snapshot of globals().keys() before and after.)
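The snapshot idea mentioned above could look roughly like this. The sketch performs the star import inside a scratch dictionary via exec so that it can run anywhere, and it builds testType1 in memory; both are assumptions made for illustration.

```python
import sys
import types

# Hypothetical in-memory stand-in for testType1.py.
testType1 = types.ModuleType("testType1")
exec('''
class TestType1(object):
    pass

class TestA(TestType1):
    pass

class TestB(TestType1):
    pass
''', testType1.__dict__)
sys.modules["testType1"] = testType1

namespace = {}
exec("pass", namespace)                 # seeds __builtins__ before the snapshot
before = set(namespace)
exec("from testType1 import *", namespace)
imported = sorted(set(namespace) - before)   # names added by the star import
```

This only works reliably when nothing else rebinds names between the two snapshots, which is one more reason the answer prefers importing the module itself.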
Instead, import just the module:
import testType1
import testType2
Now you can easily get a list of what's in each module:
tests = dir(testType1)
And access each using getattr() on the module object:
for testname in tests:
    test = getattr(testType1, testname)
    if callable(test):
        # do something with it
        pass
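Putting the pieces together for the original goal (collecting subclasses of a base class from a module), one possible sketch follows. The helper name collect_subclasses is hypothetical, and the module is again created in memory rather than read from testType1.py.

```python
import inspect
import types

# Hypothetical in-memory stand-in for testType1.py.
testType1 = types.ModuleType("testType1")
exec('''
class TestType1(object):
    pass

class TestA(TestType1):
    pass

class TestB(TestType1):
    pass
''', testType1.__dict__)

def collect_subclasses(module, base):
    """Map class name -> class for every strict subclass of base in module."""
    return {name: obj
            for name, obj in inspect.getmembers(module, inspect.isclass)
            if issubclass(obj, base) and obj is not base}

found = collect_subclasses(testType1, testType1.TestType1)
names = sorted(found)
```

Filtering with inspect.isclass first avoids the TypeError that issubclass raises when it is handed something that is not a class.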
I've literally been trying to understand Python imports for about a year now, and I've all but given up programming in Python because it just seems too obfuscated. I come from a C background, and I assumed that import worked like #include, yet if I try to import something, I invariably get errors.
If I have two files like this:
foo.py:
a = 1
bar.py:
import foo
print foo.a
input()
WHY do I need to reference the module name? Why not just be able to write import foo, print a? What is the point of this confusion? Why not just run the code and have stuff defined for you as if you wrote it in one big file? Why can't it work like C's #include directive where it basically copies and pastes your code? I don't have import problems in C.
To do what you want, you can use (not recommended, read further for explanation):
from foo import *
This will import everything to your current namespace, and you will be able to call print a.
However, the issue with this approach is the following. Consider the case when you have two modules, moduleA and moduleB, each having a function named GetSomeValue().
When you do:
from moduleA import *
from moduleB import *
you have a namespace resolution issue*, because what function are you actually calling with GetSomeValue(), the moduleA.GetSomeValue() or the moduleB.GetSomeValue()?
In addition to this, you can use the Import As feature:
from moduleA import GetSomeValue as AGetSomeValue
from moduleB import GetSomeValue as BGetSomeValue
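Or alias the modules themselves (import moduleA.GetSomeValue as AGetSomeValue would fail, because that form only accepts modules and submodules, not functions):
import moduleA as A
import moduleB as B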
This approach resolves the conflict manually.
I am sure you can appreciate from these examples the need for explicit referencing.
* Python has its namespace resolution mechanisms, this is just a simplification for the purpose of the explanation.
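What actually happens with two star imports can be observed with a short sketch. The two modules are created in memory here, and their return values ("from A", "from B") are invented for the demonstration.

```python
import sys
import types

# Hypothetical in-memory stand-ins for moduleA.py and moduleB.py.
for mod_name, result in (("moduleA", "from A"), ("moduleB", "from B")):
    module = types.ModuleType(mod_name)
    exec("def GetSomeValue():\n    return {!r}\n".format(result), module.__dict__)
    sys.modules[mod_name] = module

namespace = {}
exec("from moduleA import *\nfrom moduleB import *", namespace)
winner = namespace["GetSomeValue"]()      # the later star import rebinds the name

exec("import moduleA\nimport moduleB", namespace)
explicit = (namespace["moduleA"].GetSomeValue(),
            namespace["moduleB"].GetSomeValue())
```

No error is raised in the star-import case: the second import silently replaces the first binding, while the qualified names keep both functions reachable.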
Imagine you have a function in your module which chooses some object from a list:
def choice(somelist):
...
Now imagine further that, either in that function or elsewhere in your module, you are using randint from the random library:
a = randint(1, x)
Therefore we
import random
Your suggestion, that import random should do what from random import * does now, means that we would have two different functions called choice, as random includes one too. Only one will be accessible, but you have introduced ambiguity as to what choice() actually refers to elsewhere in your code.
This is why it is bad practice to import everything; either import what you need:
from random import randint
...
a = randint(1, x)
or the whole module:
import random
...
a = random.randint(1, x)
This has two benefits:
You minimise the risks of overlapping names (now and in future additions to your imported modules); and
When someone else reads your code, they can easily see where external functions come from.
There are a few good reasons. The module provides a sort of namespace for the objects in it, which allows you to use simple names without fear of collisions -- coming from a C background you have surely seen libraries with long, ugly function names to avoid colliding with anybody else.
Also, modules themselves are also objects. When a module is imported in more than one place in a python program, each actually gets the same reference. That way, changing foo.a changes it for everybody, not just the local module. This is in contrast to C where including a header is basically a copy+paste operation into the source file (obviously you can still share variables, but the mechanism is a bit different).
As mentioned, you can say from foo import * or better from foo import a, but understand that the underlying behavior is actually different, because you are taking a and binding it to your local module.
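The difference between the two forms shows up when the module attribute is rebound later. A minimal sketch, assuming an in-memory module named foo with a single attribute a:

```python
import sys
import types

# Hypothetical in-memory stand-in for foo.py containing `a = 1`.
foo = types.ModuleType("foo")
foo.a = 1
sys.modules["foo"] = foo

namespace = {}
exec("from foo import a\nimport foo", namespace)

foo.a = 2                          # rebind the attribute on the shared module

local_a = namespace["a"]           # still the object bound at import time
via_module = namespace["foo"].a    # looked up on the module, so sees the new value
```

from foo import a creates a separate name in the importing namespace, so later rebinding foo.a does not affect it.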
If you use something often, you can always use the from syntax to import it directly, or you can rename the module to something shorter, for example
import itertools as it
When you do import foo, a new module is created inside the current namespace named foo.
So, to use anything inside foo; you have to address it via the module.
However, if you use from foo import something, you don't have to prepend the module name, since it will load something from the module and assign it to the name something. (Not a recommended practice)
import importlib.util

# works like C's #include, you always call it with include(<path>, __name__)
def include(file, module_name):
    spec = importlib.util.spec_from_file_location(module_name, file)
    mod = importlib.util.module_from_spec(spec)
    # spec.loader.exec_module(mod)
    o = spec.loader.get_code(module_name)
    # note: globals() is the namespace of the module that defines include
    exec(o, globals())
For example:
#### file a.py ####
a = 1
#### file b.py ####
b = 2
if __name__ == "__main__":
    print("Hi, this is b.py")
#### file main.py ####
# assuming you have `include` in scope
include("a.py", __name__)
print(a)
include("b.py", __name__)
print(b)
the output will be:
1
Hi, this is b.py
2
Please excuse the vague title. If anyone has a suggestion, please let me know! Also please retag with more appropriate tags!
The Problem
I want to have an instance of an imported class be able to view things in the scope (globals, locals) of the importer. Since I'm not sure of the exact mechanism at work here, I can describe it much better with snippets than words.
## File 1
def f1(): print "go f1!"
class C1(object):
    def do_eval(self,x): # maybe this should be do_evil, given what happens
        print "evaling"
        eval(x)
        eval(x,globals(),locals())
Then run this code from an interactive session; there will be lots of NameErrors.
## interactive
class C2(object):
    def do_eval(self,x): # maybe this should be do_evil, given what happens
        print "evaling"
        eval(x)
        eval(x,globals(),locals())
def f2():
    print "go f2!"
from file1 import C1
import file1
C1().do_eval('file1.f1()')
C1().do_eval('f1()')
C1().do_eval('f2()')
file1.C1().do_eval('file1.f1()')
file1.C1().do_eval('f1()')
file1.C1().do_eval('f2()')
C2().do_eval('f2()')
C2().do_eval('file1.f1()')
C2().do_eval('f1()')
Is there a common idiom / pattern for this sort of task? Am I barking up the wrong tree entirely?
In this example, you can simply hand over functions as objects to the methods in C1:
>>> class C1(object):
...     def eval(self, x):
...         x()
...
>>> def f2(): print "go f2"
...
>>> c = C1()
>>> c.eval(f2)
go f2
In Python, you can pass functions and classes to other methods and invoke/create them there.
If you want to actually evaluate a code string, you have to specify the environment, as already mentioned by Thomas.
Your module from above, slightly changed:
## File 1
def f1(): print "go f1!"
class C1(object):
    def do_eval(self, x, e_globals = globals(), e_locals = locals()):
        eval(x, e_globals, e_locals)
Now, in the interactive interpreter:
>>> def f2():
...     print "go f2!"
...
>>> from file1 import * # 1
>>> C1().do_eval("f2()") # 2
NameError: name 'f2' is not defined
>>> C1().do_eval("f2()", globals(), locals()) #3
go f2!
>>> C1().do_eval("f1()", globals(), locals()) #4
go f1!
Some annotations
1. Here, we insert all objects from file1 into this module's namespace
2. f2 is not in the namespace of file1, therefore we get a NameError
3. Now we pass the environment explicitly, and the code can be evaluated
4. f1 is in the namespace of this module, because we imported it
Edit: Added code sample on how to explicitly pass environment for eval.
Functions are always executed in the scope they are defined in, as are methods and class bodies. They are never executed in another scope. Because importing is just another assignment statement, and everything in Python is a reference, the functions, classes and modules don't even know where they are imported to.
You can do two things: explicitly pass the 'environment' you want them to use, or use stack hackery to access their caller's namespace. The former is vastly preferred over the latter, as it's not as implementation-dependent and fragile as the latter.
You may wish to look at the string.Template class, which tries to do something similar.
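The preferred option, passing the environment explicitly, might be sketched as follows. It is written for Python 3 (the snippets above are Python 2), and the env parameter is a name chosen for this illustration.

```python
class C1(object):
    def do_eval(self, expression, env=None):
        # Evaluate only against the names the caller chose to hand over.
        return eval(expression, dict(env or {}))

def f2():
    return "go f2!"

# The caller decides which names are visible to the evaluated string.
result = C1().do_eval("f2()", {"f2": f2})

# Without an environment, the name is simply unknown to the evaluated code.
try:
    C1().do_eval("f2()")
    missing = None
except NameError as exc:
    missing = str(exc)
```

Copying the mapping with dict() keeps eval from inserting __builtins__ into the caller's own dictionary.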