Monkey patching and dispatching - Python

I have the following situation: a module called enthought.chaco2, but many existing imports of the form from enthought.chaco.api import ...
What's the quickest way to add chaco.api and make it dispatch to the correct module?
I tried a few things, for example:
import enthought.chaco2 as c2
import enthought
enthought.chaco = c2
but it doesn't work. I might have to create a real module and add it to the path; is that the only way?

What is the behavior you're looking for?
You could use from enthought.chaco import api as ChacoApi and then address any content from the module through ChacoApi, like ChacoApi.foo() or chaco_class = ChacoApi.MyClass().
You could use (though it's not recommended) from enthought.chaco.api import * and have all the content of the module added to your base namespace.
You could add an __all__ variable to chaco's __init__.py file and have the previous example (with the *) import only the names you listed in __all__.
Or you could explicitly import whatever content you need, the way you do right now, which is perfectly fine in my opinion...
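For the dispatching itself, the attribute assignment above fails because from enthought.chaco.api import ... goes through the sys.modules cache rather than package attributes. A minimal sketch of registering the alias there (assuming enthought.chaco2 really exposes an api submodule):
import sys
import enthought
import enthought.chaco2
import enthought.chaco2.api

# Register both the package and its submodule under the old names so the
# import machinery resolves them; aliasing only the package would let api
# be imported a second time under the new name.
sys.modules['enthought.chaco'] = enthought.chaco2
sys.modules['enthought.chaco.api'] = enthought.chaco2.api

# Keep the attribute assignment too, so enthought.chaco attribute access works.
enthought.chaco = enthought.chaco2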

How to get source file which python class/function/variable is imported from?

Let's say I have imported two modules like this:
from module0 import hello_func
from directory.module1 import hello_var
Where module0 contains:
def hello_func():
    return "hello from module0"
And module1 contains:
hello_var = "hello from module1"
How can I know from which file is each object being imported?
I tried checking the locals() function, but nothing there gives a reference to the file...
Actually, you kind of answered your question yourself:
Let's say I have imported two modules
(insert "from xxx import *" here)
How can I know from which file is each object being imported?
One of the reasons NOT to use wildcard imports is precisely to make it clear where names are imported from (the other being to avoid having one imported name shadow a previously imported one - something that tends to break your code in the most unexpected - and sometimes quite hard to spot - ways).
Note that in your edited question:
from module0 import hello_func
from directory.module1 import hello_var
you already have a much better idea where a name comes from. Not the exact file paths yet, but at least the name of the package/module.
And that's one of the main reasons why one should NOT use wildcard imports.
Now if you want to know the exact files path, you have two distinct cases.
Some objects keep track of where they were created (mostly modules, classes, functions, etc. - cf. the list of types supported by inspect.getfile()), and then, well, you already know the answer (use inspect.getfile() xD).
But most types won't (because there's no reason for it). In that case, you have to know which module they were imported from and call inspect.getfile() on the module itself. And if you used wildcard imports, you will have to manually inspect all the modules you imported from to find out which one defined the name. Enjoy. Especially if one of those modules also used wildcard imports...
One question, please: where do they keep track of this, and what does that information look like?
Modules keep it in their __file__ attribute. Functions and classes keep their module's name in their __module__ attribute, and from this you can retrieve the module from the sys.modules dict (a cache of all modules already imported in the current process), which will give you the file.
I never had a need to search this info for tracebacks, frames, code objects etc so you'll have to check it yourself I'm afraid ;-)
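Put together, a minimal sketch of that lookup - a hypothetical helper, usable for functions and classes (plain values like hello_var carry no __module__, as noted above):
import sys

def source_file(obj):
    # Resolve the defining module via __module__, then read its __file__.
    return sys.modules[obj.__module__].__file__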
You can define in each module a constant with its path; something like this should work:
import os
FILE_PATH = os.path.abspath(__file__)
When you import that module you can access its location like this:
import module
print(module.FILE_PATH)
Another solution, using the inspect and os modules:
import module0
import os
import inspect
print(os.path.abspath(inspect.getfile(module0.hello_func)))
If you are looking for the absolute path of the directory containing the running script, this should work:
import os
abs_path = os.path.dirname(os.path.abspath(__file__))
print(abs_path)

Monkeypatching hardcoded global configuration loaded from a .py file

A co-worker has a library that uses a hard-coded configuration defined in its own file. For instance:
constants.py:
API_URL="http://example.com/bogus"
Throughout the rest of the library, the configuration is accessed in the following manner.
from constants import API_URL
As you can imagine, this is not very flexible and causes problems during testing. If I want to change the configuration, I have to modify constants.py, which is in source code management.
Naturally, I'd much rather load the configuration from a JSON or YAML file. I can read the configuration into an object with no problems. Is there a way I can override the constants.py module without breaking the code, so that each global, e.g. API_URL is replaced by a value provided by my file?
I was thinking that after each from constants import ... I could add something like this:
from constants import * # existing configuration import
import constants # also needed, to reach the module object itself
import json
new_config = json.load(open('config.json')) # load my config file into a dictionary
constants.__dict__.update(new_config) # override any constants with what I've loaded
The problem with this, of course, is that it's not very "DRY" and looks like it might be brittle.
Does anyone have a suggestion for doing this more cleanly? Thanks!
EDIT: looks like my approach doesn't work anyway. I guess "from constants import *" copies the values from the module into the current module's global scope?
DOUBLE EDIT: no, it does work; I'm just confused. But rather than doing this in X different files I'd like to have it work transparently if possible.
from module import <name> creates a reference in the importing module's global namespace to the imported object. If that object is immutable, that means you now have to monkeypatch the value in every module that imported it.
Your only hope is to be the first to import constants and monkeypatch the names in that module. Subsequent imports will then use your monkeypatched values.
To patch the original module early, the following is enough:
import constants
for name, value in new_config.items(): # .iteritems() on Python 2
    setattr(constants, name, value)
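Putting it together, a minimal sketch of the early patch (assuming a config.json next to your entry point; this must run before anything else does from constants import ...):
import json
import constants # be the first importer, so later imports see the patched values

with open('config.json') as f:
    new_config = json.load(f)

for name, value in new_config.items():
    setattr(constants, name, value)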

In Python, is it a bad idea for a directory module's __init__.py to do this?

I have a package with several subdirectories containing __init__.py files. These files perform checks and initialization jobs.
Now I also have, in some of these folders, a file that isn't supposed to be referenced directly by an import, as it relies on the sanity checks performed in its respective __init__.py. Let's call these "hidden modules" - as a convention I use a leading underscore to make them obvious.
Is it a bad idea to do the following inside my __init__.py (with an _implementation.py located in the same folder):
import os, sys
if sanity_check_successful:
    from ._implementation import *
    __all__ = sys.modules[__name__ + "._implementation"].__all__
The idea should be clear: I am trying to provide meaningful error information at each respective module level in the package whenever a sanity check fails.
Is this a bad idea, i.e. copying the __all__ list over from the "hidden module"? If so, why, and are there better alternatives?
Bonus points: is there a more concise way of writing these two lines?
from ._implementation import *
__all__ = sys.modules[__name__ + "._implementation"].__all__
In particular, it bothers me that I have to use the string "._implementation" in one place and as a module name in another.
There are simpler ways to set __all__ from the ._implementation submodule; setting __all__ this way is otherwise fine:
from ._implementation import *
from ._implementation import __all__
This simply imports __all__ and binds it to the local module namespace __all__.
Even simpler than Martijn Pieters' answer:
If you include '__all__' in the __all__ list of ._implementation, it will be automatically imported by your single from ._implementation import * line.
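That is, a sketch of _implementation.py (foo and bar are placeholder names):
# _implementation.py
__all__ = ['foo', 'bar', '__all__'] # listing '__all__' makes the star import copy it too

def foo():
    pass

def bar():
    pass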
As to your main question, whether this is a bad idea…
Well, it depends on what you're doing. This basically makes your package look exactly like its _implementation module. If that's all you're doing, that's fine… but in that case, why not just move _implementation into __init__ in the first place?
If you're trying to merge multiple modules into one, you probably want to add all of their __all__ lists into a single one. The stdlib has examples of this, like collections, and the usual pattern is:
from collections.abc import *
import collections.abc
__all__ += collections.abc.__all__
That may seem a little verbose, but it's certainly clear.
From your edited question, I think what you're doing is reasonable in exactly the same way that collections is, and the clearest and most idiomatic solution is to do the equivalent, but with = instead of += (since you're just copying one list instead of adding multiple lists together).
But, since this:
import foo
bar = foo.bar
… is pretty much equivalent (as in close enough for your use case) to:
from foo import bar
… Martijn Pieters' answer is an obvious simplification:
from ._implementation import *
from ._implementation import __all__
So, I'd do that.

Module name different than directory name?

Let's assume I have a python package called bestpackage.
Convention dictates that bestpackage would also be a directory on sys.path containing an __init__.py, which tells the interpreter it can be imported from.
Is there any way I can set a variable for the package name, so the directory could be named something different from the name I import it with? Is there any way to make the namespacing not care about the directory name and honor some other config instead?
My super trendy client-side devs are just so much in love with these sexy something.otherthing.js project names for one of our smaller side projects!
EDIT:
To clarify, the main purpose of my question was to allow my client-side guys to keep naming the directories in their "projects" folder (the one we all have added to our paths) using their existing convention (some.app.js), even though in some cases they are in fact Python packages that will be on the path and referenced by import statements internally. I realize this is in practice a pretty horrible thing, so I ask more out of curiosity. A big part of the problem here is circumventing the fact that the . in the directory name (and thereby the assumed package name) implies attribute access. It doesn't really surprise me that this can't be worked around; I was just curious whether it was possible deeper in the "magic" behind import.
There are some great responses here, but all rely on doing a classical import of some kind, where the attribute accessor . will clash with the directory names.
A directory with an __init__.py file is called a package.
And no, the package name is always the same as the directory name. That's how Python discovers packages: it matches the name against directory names found on the search path, and if there is an __init__.py file in that directory, it has found a match and imports the __init__.py file contained there.
You can always import something into your local module namespace under a shorter, easier-to-use name using the from module import something or the import module as alias syntax:
from something.otherthing.js import foo
from something.otherthing import js as bar
import something.otherthing.js as hamspam
There is one solution which needs one initial import somewhere:
>>> import sys
>>> sys.modules['blinot_existing_blubb'] = sys
>>> import blinot_existing_blubb
>>> blinot_existing_blubb
<module 'sys' (built-in)>
Without a change to the import mechanism, you cannot import a package under another name. This is intended, I think, to make Python easier to understand.
However if you want to change the import mechanism I recommend this: Getting the Most Out of Python Imports
Well, first I would say that Python is not Java/JavaScript/C/C++/Cobol/YourFavoriteLanguageThatIsntPython. Of course, in the real world, some of us have to answer to bosses who disagree. So if all you want is some indirection, use smoke and mirrors, as long as they don't pay too much attention to what's under the hood. Write your module the Python way, then provide an API on the side in the dot-heavy style your coworkers want. Ex:
pythonic_module.py
def func_1():
    pass

def func_2():
    pass

def func_3():
    pass

def func_4():
    pass
indirection
/dotty_api_1/__init__.py
from pythonic_module import func_1 as foo, func_2 as bar
/dotty_api_2/__init__.py
from pythonic_module import func_3 as foo, func_4 as bar
Now they can dot to their hearts' content, but you can write things the Pythonic way under the hood.
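Usage then looks like this (the dotty_api_* names above stand in for whatever dotted style they want):
import dotty_api_1
import dotty_api_2

dotty_api_1.foo() # really pythonic_module.func_1
dotty_api_2.bar() # really pythonic_module.func_4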
Actually, yes!
You could do a canonical import Whatever or newmodulename = __import__("Whatever").
Python keeps track of your modules, and you can inspect that by doing:
import sys
print(sys.modules)
See this article for more details.
But maybe that's not your problem? Let's guess: you have a module in a different path, which your current project can't access because it's not on sys.path?
Well then, just add:
import sys
sys.path.append('path_to_the_other_package_or_module_directory')
prior to your import statement, or see this SO post for a more permanent solution.
I was looking for this to happen with setup.py at sdist and install time, rather than at runtime, and found the package_dir directive:
https://docs.python.org/3.5/distutils/setupscript.html#listing-whole-packages
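For illustration, a minimal setup.py sketch (names hypothetical) mapping the importable package name onto a differently named source directory:
from distutils.core import setup

setup(
    name='bestpackage',
    version='0.1',
    packages=['bestpackage'],
    # the importable package 'bestpackage' lives in a directory with a different name
    package_dir={'bestpackage': 'something.otherthing.js'},
)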

Why is "import *" bad?

It is recommended not to use import * in Python.
Can anyone please share the reasons for that, so that I can avoid doing it next time?
Because it puts a lot of stuff into your namespace (it might shadow some other object from a previous import, and you won't know about it).
Because you don't know exactly what is imported and can't easily find from which module a certain thing was imported (readability).
Because you can't use cool tools like pyflakes to statically detect errors in your code.
According to the Zen of Python:
Explicit is better than implicit.
... can't argue with that, surely?
You don't pass **locals() to functions, do you?
Since Python lacks an "include" statement, and the self parameter is explicit, and scoping rules are quite simple, it's usually very easy to point a finger at a variable and tell where that object comes from -- without reading other modules and without any kind of IDE (which are limited in the way of introspection anyway, by the fact the language is very dynamic).
The import * breaks all that.
Also, it has a concrete possibility of hiding bugs.
import os, sys, foo, sqlalchemy, mystuff
from bar import *
Now, if the bar module has any attributes named "os", "mystuff", etc., they will override the explicitly imported ones, possibly pointing to very different things. Defining __all__ in bar is often wise - it states what will implicitly be imported - but it's still hard to trace where objects come from without reading and parsing the bar module and following its imports. A network of import * is the first thing I fix when I take ownership of a project.
Don't misunderstand me: if import * were missing, I would cry for it. But it has to be used carefully. A good use case is to provide a facade interface over another module, as sketched below.
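For instance, a facade package's __init__.py might look like this (module names hypothetical; assumes both submodules define __all__):
# mypackage/__init__.py - a facade over internal modules
from . import _core, _extras
from ._core import *
from ._extras import *

__all__ = list(_core.__all__) + list(_extras.__all__)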
Likewise, the use of conditional import statements, or imports inside function/class namespaces, requires a bit of discipline.
I think that in medium-to-big projects, or small ones with several contributors, a minimum of hygiene is needed in terms of static analysis - running at least pyflakes, or better yet a properly configured pylint - to catch several kinds of bugs before they happen.
Of course, since this is Python - feel free to break rules, and to explore - but be wary of projects that could grow tenfold: if the source code lacks discipline, it will be a problem.
That is because you are polluting the namespace. You will import all the functions and classes into your own namespace, which may clash with the functions you define yourself.
Furthermore, I think using a qualified name is clearer for maintenance; you see on the code line itself where a function comes from, so you can check the docs much more easily.
In module foo:
def myFunc():
    print(1)
In your code:
from foo import *
def doThis():
    myFunc() # Which myFunc is called?

def myFunc():
    print(2)
It is OK to do from ... import * in an interactive session.
Say you have the following code in a module called foo:
from xml.etree import ElementTree as etree
and then in your own module you have:
from lxml import etree
from foo import *
You now have a difficult-to-debug module that looks like it has lxml's etree in it, but really has ElementTree instead.
I understand the valid points people make here. However, I do have one argument that, sometimes, a "star import" may not be a bad practice:
When I want to structure my code in such a way that all the constants go into a module called const.py:
If I do import const, then for every constant I have to refer to it as const.SOMETHING, which is probably not the most convenient way.
If I do from const import SOMETHING_A, SOMETHING_B ..., then obviously it's way too verbose and defeats the purpose of the structuring.
Thus I feel that in this case, doing from const import * may be a better choice.
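For illustration, a hypothetical const.py and the three styles side by side:
# const.py
SOMETHING_A = 1
SOMETHING_B = 2

# option 1: qualified access
import const
print(const.SOMETHING_A)

# option 2: explicit but verbose
from const import SOMETHING_A, SOMETHING_B

# option 3: the star import argued for here
from const import *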
http://docs.python.org/tutorial/modules.html
Note that in general the practice of importing * from a module or package is frowned upon, since it often causes poorly readable code.
These are all good answers. I'm going to add that when teaching new people to code in Python, dealing with import * is very difficult. Even if you or they didn't write the code, it's still a stumbling block.
I teach children (about 8 years old) to program in Python to manipulate Minecraft. I like to give them a helpful coding environment to work with (Atom Editor) and teach REPL-driven development (via bpython). In Atom I find that the hints/completion work just as effectively as in bpython. Luckily, unlike some other static analysis tools, Atom is not fooled by import *.
However, let's take this example... In this wrapper they from local_module import * a bunch of modules, including this list of blocks. Let's ignore the risk of namespace collisions. By doing from mcpi.block import * they make this entire list of obscure block types something you have to go look at to know what is available. If they had instead used from mcpi import block, then you could type walls = block. and an autocomplete list would pop up.
It is a very BAD practice for two reasons:
Code Readability
Risk of overriding the variables/functions etc
For point 1:
Let's see an example of this:
from module1 import *
from module2 import *
from module3 import *
a = b + c - d
Here, on seeing the code, no one will have any idea which modules b, c, and d actually belong to.
On the other hand, if you do it like this:
# v v will know that these are from module1
from module1 import b, c # way 1
import module2 # way 2
a = b + c - module2.d
# ^ will know it is from module2
It is much cleaner for you, and the new person joining your team will also have a better idea of what's going on.
For point 2: Let's say both module1 and module2 have a variable b. When I do:
from module1 import *
from module2 import *
print(b) # will print the value from module2
Here the value from module1 is lost. It will be hard to debug why the code is not working, even though b is declared in module1 and I wrote the code expecting it to use module1.b.
If you have the same variable names in different modules and you do not want to import the entire modules, you may even do:
from module1 import b as mod1b
from module2 import b as mod2b
As a test, I created a module test.py with two functions, A and B, which respectively print "A 1" and "B 1". After importing test.py with:
import test
...I can run the two functions as test.A() and test.B(), and "test" shows up as a module in the namespace, so if I edit test.py I can reload it with:
import importlib
importlib.reload(test)
But if I do the following:
from test import *
there is no reference to "test" in the namespace, so there is no way to reload it after an edit (as far as I can tell), which is a problem in an interactive session. Whereas either of the following:
import test
import test as tt
will add "test" or "tt" (respectively) as module names in the namespace, which will allow re-loading.
If I do:
from test import *
the names "A" and "B" show up in the namespace as functions. If I edit test.py, and repeat the above command, the modified versions of the functions do not get reloaded.
And the following command elicits an error message:
importlib.reload(test) # Error - name 'test' is not defined
If someone knows how to reload a module loaded with "from module import *", please post. Otherwise, this would be another reason to avoid the form:
from module import *
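One workaround sketch, relying on the fact that the module object is still cached in sys.modules even after a star import:
import importlib
import test # re-acquires the cached module object; the source is not re-executed
importlib.reload(test) # reload the edited source
from test import * # re-bind the star-imported names to the reloaded versions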
As suggested in the docs, you should (almost) never use import * in production code.
While importing * from a module is bad, importing * from a package is probably even worse.
By default, from package import * imports whatever names are defined by the package's __init__.py, including any submodules of the package that were loaded by previous import statements.
If a package’s __init__.py code defines a list named __all__, it is taken to be the list of submodule names that should be imported when from package import * is encountered.
Now consider this example (assuming there's no __all__ defined in sound/effects/__init__.py):
# anywhere in the code before import *
import sound.effects.echo
import sound.effects.surround
# in your module
from sound.effects import *
The last statement will import the echo and surround modules into the current namespace (possibly overriding previous definitions) because they are defined in the sound.effects package when the import statement is executed.
