Monkeypatching hardcoded global configuration loaded from a .py file - python

A co-worker has a library that uses a hard-coded configuration defined in its own file. For instance:
constants.py:
API_URL="http://example.com/bogus"
Throughout the rest of the library, the configuration is accessed in the following manner:
from constants import API_URL
As you can imagine, this is not very flexible and causes problems during testing. If I want to change the configuration, I have to modify constants.py, which is in source code management.
Naturally, I'd much rather load the configuration from a JSON or YAML file. I can read the configuration into an object with no problems. Is there a way I can override the constants.py module without breaking the code, so that each global, e.g. API_URL is replaced by a value provided by my file?
I was thinking that after each from constants import ... I could add something like this:
from constants import *  # existing configuration import
import constants         # needed so we can reach the module object itself
import json

new_config = json.load(open('config.json'))  # load my config file into a dictionary
constants.__dict__.update(new_config)        # override any constants with what I've loaded
The problem with this, of course, is that it's not very "DRY" and looks like it might be brittle.
Does anyone have a suggestion for doing this more cleanly? Thanks!
EDIT: it looks like my approach doesn't work anyway. I guess from ... import * copies the values from the module into the current module's global scope?
DOUBLE EDIT: no, it does work; I'm just confused. But rather than doing this in X different files I'd like to have it work transparently if possible.

from module import <name> creates a reference in the importing module's global namespace to the imported object. If that object is immutable, that means you now have to monkeypatch the value in every module that imported it.
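A minimal sketch of that binding behaviour, assuming a hypothetical client.py that does from constants import API_URL:

import constants
import client                       # hypothetical: it ran "from constants import API_URL"

constants.API_URL = 'http://localhost:8000'
print(constants.API_URL)            # the patched value
print(client.API_URL)               # still the original: client holds its own binding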
Your only hope is to be the first to import constants and monkeypatch the names in that module. Subsequent imports will then use your monkeypatched values.
To patch the original module early, the following is enough:
import constants
for name, value in new_config.items():  # use .iteritems() on Python 2
    setattr(constants, name, value)
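For instance, a minimal sketch of an entry point that patches first, assuming the config.json from the question and a hypothetical mylib package built on constants:

import json
import constants

# patch before anything else has a chance to "from constants import ..."
with open('config.json') as f:
    for name, value in json.load(f).items():
        setattr(constants, name, value)

import mylib                        # every later import now sees the patched values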

Related

How to get the source file a Python class/function/variable is imported from?

Let's say I have imported two modules like this:
from module0 import hello_func
from directory.module1 import hello_var
Where in module0:
def hello_func(): return "hello from module0"
And module1:
hello_var = "hello from module1"
How can I know from which file is each object being imported?
I tried checking the locals() function, but nothing in there gives a reference to the file...
Actually, you kind of answered your question yourself:
Let's say I have imported two modules
(insert "from xxx import *" here)
How can I know from which file is each object being imported?
One of the reasons for NOT using wildcard imports is precisely to make it clear where names are imported from (the other one being to avoid one imported name shadowing a previously imported one - something that tends to break your code in the most unexpected - and sometimes quite hard to spot - ways).
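The shadowing hazard is easy to reproduce with two real standard-library modules that share names, math and cmath:

from math import *     # binds sqrt (the float version)
from cmath import *    # also binds sqrt - silently shadows the math version

print(sqrt(4))         # (2+0j): the cmath sqrt won, perhaps unexpectedly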
Note that in your edited question:
from module0 import hello_func
from directory.module1 import hello_var
you already have a much better idea where a name comes from. Not the exact file paths yet, but at least the name of the package/module.
And that's one of the main reasons why one should NOT use wildcard imports.
Now if you want to know the exact files path, you have two distinct cases.
Some objects keep track of where they were created (mostly modules, classes, functions, etc - cf the list of types supported by inspect.getfile()), and then, well, you already know the answer (use inspect.getfile() xD).
But most types won't (because there's no reason for it). In this case, you have to know which module they were imported from and call inspect.getfile() on the module itself. And if you used wildcard imports, you will have to manually inspect all the modules you imported from to find out which one defined this name. Enjoy. Especially if one of those modules also used wildcard imports...
One question, please: where do they keep these traces, and what do the traces look like?
Modules keep it in their __file__ attribute. Functions and classes keep their module's name in their __module__ attribute, and from this you can retrieve the module from the sys.modules dict (a cache of all modules already imported in the current process), which will give you the file.
I never had a need to search this info for tracebacks, frames, code objects etc so you'll have to check it yourself I'm afraid ;-)
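A minimal sketch of that lookup, reusing module0 and hello_func from the question:

import sys
import inspect
from module0 import hello_func

mod = sys.modules[hello_func.__module__]   # __module__ names the defining module
print(mod.__file__)                        # path to module0.py
print(inspect.getfile(hello_func))         # same answer, obtained directly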
You can define in each module a constant with its path; something like this should work:
import os
FILE_PATH = os.path.abspath(__file__)
When you import that module you can access its location like this:
import module
print(module.FILE_PATH)
Another solution using the inspect and os modules.
import module0
import os
import inspect
print(os.path.abspath(inspect.getfile(module0.hello_func)))
If you are looking for the absolute path of the directory containing the running script, this should work:
import os
abs_path = os.path.dirname(os.path.abspath(__file__))
print(abs_path)

Best practice for nested Python module imports with * [duplicate]

I have a file, myfile.py, which imports Class1 from file.py, and file.py contains imports of different classes from file2.py, file3.py, file4.py.
In my myfile.py, can I access these classes, or do I need to import file2.py, file3.py, etc. again?
Does Python automatically add all the imports included in the file I imported, and can I use them automatically?
Best practice is to import every module that defines identifiers you need, and use those identifiers as qualified by the module's name; I recommend using from only when what you're importing is a module from within a package. The question has often been discussed on SO.
Importing a module, say moda, from many modules (say modb, modc, modd, ...) that need one or more of the identifiers moda defines, does not slow you down: moda's bytecode is loaded (and possibly built from its sources, if needed) only once, the first time moda is imported anywhere, then all other imports of the module use a fast path involving a cache (a dict mapping module names to module objects that is accessible as sys.modules in case of need... if you first import sys, of course!-).
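The cache is easy to observe, for example with the standard math module:

import sys
import math                      # first import: loads the module and caches it
print('math' in sys.modules)     # True - later imports just hit this cache
import math                      # fast path: rebinds the cached module object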
Python doesn't automatically introduce anything into the namespace of myfile.py, but you can access everything that is in the namespaces of all the other modules.
That is to say, if in file1.py you did from file2 import SomeClass and in myfile.py you did import file1, then you can access it within myfile as file1.SomeClass. If in file1.py you did import file2 and in myfile.py you did import file1, then you can access the class from within myfile as file1.file2.SomeClass. (These aren't generally the best ways to do it, especially not the second example.)
This is easily tested.
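For instance, a minimal sketch using the file names from the question:

# file2.py
class SomeClass:
    pass

# file1.py
from file2 import SomeClass

# myfile.py
import file1
obj = file1.SomeClass()   # works: SomeClass is a name in file1's namespace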
In the myfile module, you can either do from file import ClassFromFile2 or from file2 import ClassFromFile2 to access ClassFromFile2, assuming that the class is also imported in file.
This technique is often used to simplify the API a bit. For example, a db.py module might import various things from the modules mysqldb, sqlalchemy and some other helpers. Then, everything can be accessed via the db module.
If you are using a wildcard import, then yes: a wildcard import creates new aliases in your current namespace for the contents of the imported module. If not, you need to go through the namespace of the module you have imported, as usual.

Should I define __all__ even if I prefix hidden functions and variables with underscores in modules?

From the perspective of an external user of the module, are both necessary?
From my understanding, correctly prefixing hidden functions with an underscore essentially does the same thing as explicitly defining __all__, but I keep seeing developers do both in their code. Why is that?
When importing from a module with from modulename import * names starting with underscores are indeed skipped.
However, a module rarely contains only public API objects. Usually you've made imports to support the code, and those names are global in the module too. Without __all__, those names would be part of a wildcard import as well.
In other words, unless you want to 'export' os in the following example you should use __all__:
import os
from .implementation import some_other_api_call
_module_path = os.path.dirname(os.path.abspath(__file__))
_template = open(os.path.join(_module_path, 'templates/foo_template.txt')).read()
VERSION = '1.0.0'
def make_bar(baz, ham, spam):
    return _template.format(baz, ham, spam)
__all__ = ['some_other_api_call', 'make_bar']
because without the __all__ list, Python cannot distinguish between some_other_api_call and os here and divine which one should not be imported when using from ... import *.
You could work around this by renaming all your imports, so import os as _os, but that'd just make your code less readable.
And an explicit export list is always nice. Explicit is better than implicit, as the Zen of Python tells you.
I also use __all__: it explicitly tells module users what you intend to export. Searching the module for names is tedious, even if you are careful to do, e.g., import os as _os, etc. A wise man once wrote "explicit is better than implicit" ;-)
Defining __all__ overrides the default behaviour, and there actually might be more than one reason to define it.
When importing a module, you might want from mod import * to import only a minimal number of things. Even if you prefix everything correctly, there could be reasons not to import everything.
Another problem I once had was when defining a gettext shortcut: the translation function was named _, which would not get imported. Even though it is prefixed "in theory", I still wanted it to be exported.
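A minimal sketch of that case, using the standard gettext module and a hypothetical greet() helper:

import gettext

_ = gettext.gettext        # underscore name: "from mod import *" would skip it

def greet():
    return _("Hello")

__all__ = ['_', 'greet']   # forces the underscore name into the wildcard import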
One other reason, as stated above, is importing a module that itself imports a lot of things. Since Python cannot tell the difference between symbols created by imports and symbols defined in the module itself, it will automatically re-export everything that can be re-exported. For that reason, it can be wise to explicitly limit the exports to the things you want to export.
With that in mind, you might want to have some prefixed symbols exported by default. Usually you don't need to define __all__; whenever you need it to do something unusual, it may make sense to do it.

Python module: how to prevent importing modules called by the new module

I am new to Python and I am creating a module to re-use some code.
My module (impy.py) looks like this (it has one function so far)...
import numpy as np
def read_image(fname):
    ....
and it is stored in the following directory:
custom_modules/
    __init__.py
    impy.py
As you can see it uses the module numpy. The problem is that when I import it from another script, like this...
import custom_modules.impy as im
and I type im., I get completions for not only the function read_image() but also the module np.
How can I make available only the functions I am writing in my module, and not the modules that my module is calling (numpy in this case)?
Thank you very much for your help.
I've got a proposition that could perhaps address the following concern: "I do not want to mix class/module attributes with class/module imports", since IDLE also proposes access to imported modules within a class or module.
It simply consists of using the conventional name that coders normally don't want to access and IDEs don't propose: a name starting with an underscore. This is also known as the "weak « internal use » indicator", as described in PEP 8 / Naming styles.
class C(object):
    import numpy as _np  # <-- here

    def __init__(self):
        pass  # whatever we need

    def do(self, arg):
        pass  # something useful
Now, in IDLE, auto-completion will only propose the do function; the imported module is not proposed.
By the way, you should change the title of your question: you do not want to avoid imports in your imported modules (that would make them unusable), so it should rather be "how to prevent the IDE from showing imported modules of an imported module" or something similar.
You could import numpy inside your function
def read_image(fname):
    import numpy as np
    ....
making it locally available to the read_image code, but not globally available.
One warning though: this might cause a small performance hit. The import statement runs each time read_image is called; the module itself is cached in sys.modules after the first load, so numpy is not fully re-imported every time, but the lookup and rebinding still add overhead - especially if you call read_image many times.
If you really want to hide it, then I suggest creating a new directory such that your structure looks like this:
custom_modules/
    __init__.py
    impy/
        __init__.py
        impy.py
and let the new impy/__init__.py contain a relative import (the bare from impy import read_image only worked under Python 2's implicit relative imports):
from .impy import read_image
This way, you can control what ends up in the custom_modules.impy namespace.
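A short usage sketch under that layout (photo.png is just a placeholder name):

import custom_modules.impy as im

im.read_image('photo.png')   # exposed by impy/__init__.py
# im.np would raise AttributeError: the numpy alias stays inside impy.impy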

How should I perform imports in a python module without polluting its namespace?

I am developing a Python package for dealing with some scientific data. There are multiple frequently-used classes and functions from other modules and packages, including numpy, that I need in virtually every function defined in any module of the package.
What would be the Pythonic way to deal with them? I have considered multiple variants, but each has its own drawbacks.
Import the classes at module-level with from foreignmodule import Class1, Class2, function1, function2
Then the imported functions and classes are easily accessible from every function. On the other hand, they pollute the module namespace, making dir(package.module) and help(package.module) cluttered with imported names.
Import the classes at function-level with from foreignmodule import Class1, Class2, function1, function2
The functions and classes are easily accessible and do not pollute the module, but imports from up to a dozen modules in every function look like a lot of duplicate code.
Import the modules at module-level with import foreignmodule
There is not much pollution, but that is offset by the need to prepend the module name to every function or class call.
Use some artificial workaround like using a function body for all these manipulations and returning only the objects to be exported... like this
def _export():
    from foreignmodule import Class1, Class2, function1, function2
    def myfunc(x):
        return function1(x, function2(x))
    return myfunc

myfunc = _export()
del _export
This manages to solve both problems, module namespace pollution and ease of use for functions... but it seems to be not Pythonic at all.
So what solution is the most Pythonic? Is there another good solution I overlooked?
Go ahead and do your usual from W import X, Y, Z and then use the __all__ special symbol to define what actual symbols you intend people to import from your module:
__all__ = ('MyClass1', 'MyClass2', 'myvar1', …)
This defines the symbols that will be imported into a user's module if they import * from your module.
In general, Python programmers should not be using dir() to figure out how to use your module, and if they are doing so it might indicate a problem somewhere else. They should be reading your documentation or typing help(yourmodule) to figure out how to use your library. Or they could browse the source code themselves, in which case (a) the difference between things you import and things you define is quite clear, and (b) they will see the __all__ declaration and know which toys they should be playing with.
If you try to support dir() in a situation like this for a task for which it was not designed, you will have to place annoying limitations on your own code, as I hope is clear from the other answers here. My advice: don't do it! Take a look at the Standard Library for guidance: it does from … import … whenever code clarity and conciseness require it, and provides (1) informative docstrings, (2) full documentation, and (3) readable code, so that no one ever has to run dir() on a module and try to tell the imports apart from the stuff actually defined in the module.
One technique I've seen used, including in the standard library, is to use import module as _module or from module import var as _var, i.e. assigning imported modules/variables to names starting with an underscore.
The effect is that other code, following the usual Python convention, treats those members as private. This applies even for code that doesn't look at __all__, such as IPython's autocomplete function.
An example from Python 3.3's random module:
from warnings import warn as _warn
from types import MethodType as _MethodType, BuiltinMethodType as _BuiltinMethodType
from math import log as _log, exp as _exp, pi as _pi, e as _e, ceil as _ceil
from math import sqrt as _sqrt, acos as _acos, cos as _cos, sin as _sin
from os import urandom as _urandom
from collections.abc import Set as _Set, Sequence as _Sequence
from hashlib import sha512 as _sha512
Another technique is to perform imports in function scope, so that they become local variables:
"""Some module"""
# imports conventionally go here
def some_function(arg):
"Do something with arg."
import re # Regular expressions solve everything
...
The main rationale for doing this is that it is effectively lazy, delaying the importing of a module's dependencies until they are actually used. Suppose one function in the module depends on a particular huge library. Importing the library at the top of the file would mean that importing the module would load the entire library. This way, importing the module can be quick, and only client code that actually calls that function incurs the cost of loading the library. Further, if the dependency library is not available, client code that doesn't need the dependent feature can still import the module and call the other functions. The disadvantage is that using function-level imports obscures what your code's dependencies are.
Example from Python 3.3's os.py:
def get_exec_path(env=None):
    """[...]"""
    # Use a local import instead of a global import to limit the number of
    # modules loaded at startup: the os module is always loaded at startup by
    # Python. It may also avoid a bootstrap issue.
    import warnings
Import the module as a whole: import foreignmodule. What you claim as a drawback is actually a benefit. Namely, prepending the module name makes your code easier to maintain and makes it more self-documenting.
Six months from now when you look at a line of code like foo = Bar(baz) you may ask yourself which module Bar came from, but with foo = cleverlib.Bar it is much less of a mystery.
Of course, the fewer imports you have, the less of a problem this is. For small programs with few dependencies it really doesn't matter all that much.
When you find yourself asking questions like this, ask yourself what makes the code easier to understand, rather than what makes the code easier to write. You write it once but you read it a lot.
For this situation I would go with an all_imports.py file which had all the
from foreignmodule import .....
from anothermodule import .....
and then in your working modules
import all_imports as fgn # or whatever you want to prepend
...
something = fgn.Class1()
Another thing to be aware of:
__all__ = ['func1', 'func2', 'this', 'that']
Now, any functions/classes/variables/etc. that are in your module but not in your module's __all__ will not show up in help() and won't be imported by from mymodule import *. See Making python imports more structured? for more info.
I would compromise and just pick a short alias for the foreign module:
import foreignmodule as fm
It saves you completely from the pollution (probably the bigger issue) and at least reduces the prepending burden.
I know this is an old question. It may not be 'Pythonic', but the cleanest way I've discovered for exporting only certain module definitions is, really as you've found, to wrap the module in a function. But instead of returning the names to export them, you can simply globalize them (global thus, in essence, becomes a kind of 'export' keyword):
def module():
    global MyPublicClass, ExportedModule

    import somemodule as ExportedModule
    import anothermodule as PrivateModule

    class MyPublicClass:
        def __init__(self):
            pass

    class MyPrivateClass:
        def __init__(self):
            pass

module()
del module
I know it's not much different than your original conclusion, but frankly to me this seems to be the cleanest option. The other advantage is, you can group any number of modules written this way into a single file, and their private terms won't overlap:
def module():
    global A
    i, j, k = 1, 2, 3
    class A:
        pass

module()
del module

def module():
    global B
    i, j, k = 7, 8, 9  # doesn't overwrite the previous declarations
    class B:
        pass

module()
del module
Though, keep in mind their public definitions will, of course, overlap.
