I have a package with a few modules; each module defines a class (or a few classes). I need to get the list of all modules within the package. Is there an API for this in Python?
Here is the file structure:
\pkg\
\pkg\__init__.py
\pkg\module1.py -> defines Class1
\pkg\module2.py -> defines Class2
\pkg\module3.py -> defines Class3 and Class31
From within module1 I need to get the list of modules within pkg, and then import all the classes defined in these modules.
Update 1:
OK, after considering the answers and comments below, I figured out that it's not that easy to achieve. For the code I proposed below to work, all the modules have to be explicitly imported beforehand.
So now the new concept:
How do I get the list of modules within a package without loading the modules? Using the Python API, i.e. without listing all the files in the package folder?
Thanks
ak
One ad-hoc approach is to list the files in the directory, import each file dynamically with __import__, and then list the classes of the resulting module.
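A minimal sketch of that approach, using pkgutil, importlib, and inspect rather than raw file listing (pkg, module1, etc. are the example names from the question):

import importlib
import inspect
import pkgutil

import pkg

for mod_info in pkgutil.iter_modules(pkg.__path__):
    module = importlib.import_module('pkg.' + mod_info.name)
    # keep only the classes defined in this module, not ones it imported
    classes = [obj for name, obj in inspect.getmembers(module, inspect.isclass)
               if obj.__module__ == module.__name__]
    print(module.__name__, classes)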
OK, this was actually pretty straightforward:

import types

import pkg

# collect the submodules of pkg; note that they must already have been
# imported for them to show up in pkg's namespace
sub_modules = (
    pkg.__dict__.get(a) for a in dir(pkg)
    if isinstance(pkg.__dict__.get(a), types.ModuleType)
)
for m in sub_modules:
    # Base is my common base class (defined elsewhere, not shown above);
    # type(Base) is its metaclass, so this matches classes of that family
    for c in (
        m.__dict__.get(a) for a in dir(m)
        if isinstance(m.__dict__.get(a), type(Base))
    ):
        pass  # c is what I needed
Hi, I'm building my own package and I have a question about __all__.
Is there any neat way to define __all__, other than explicitly typing each and every function in the module?
I find it very tedious...
I'm trying to write some code that wraps frequently used libraries such as numpy, pytorch, and os. The problem is that the libraries I used to create my modules also get imported when I import my package.
I want to import every function / class that I defined, but I don't want the third-party libraries that I used in the process to get imported.
I use from .submodule import * in my __init__.py so that I can access my functions inside the submodule directly (just like we can access functions directly from the top-level package, like np.sum() or torch.sum()).
My submodule has a lot of functions, and I want to import all of them into __init__.py, except for the third-party packages that I used.
I see that __all__ defines what to import when from package import * is called.
For example,
utils.py
__all__ = ['a']
def a():
    pass

def b():
    pass
__init__.py
from .utils import *
and
>>> import package
>>> package.a()
>>> package.b()
AttributeError: module 'package' has no attribute 'b'
What I want is something like
__all__ = Some_neat_fancy_method()
I tried locals() and dir(), but got lost along the way.
Any suggestions?
As others have pointed out, the whole point of __all__ is to explicitly specify what gets exposed to star-imports. By default, every name that doesn't start with an underscore is. If you really want to specify what doesn't get exposed instead, you can do a little trick: include everything in __all__ and then remove the entries you want to exclude.
For example:
def _exclude(exclusions: list) -> list:
    import types
    # add everything as long as it's not a module and not prefixed with _
    functions = [name for name, function in globals().items()
                 if not (name.startswith('_') or isinstance(function, types.ModuleType))]
    # remove the exclusions from the functions
    for exclusion in exclusions:
        if exclusion in functions:
            functions.remove(exclusion)
    del types  # deleting types from scope, introduced from the import
    return functions

# the _ prefix is important, to not add these to the __all__
_exclusions = ["function1", "function2"]
__all__ = _exclude(_exclusions)
You can of course repurpose this to simply include everything that's not a module or prefixed with _, but that serves little use, since everything public is included in star-imports anyway if you don't define __all__; so I thought it was better to include the exclusion idea. This way you can simply tell it to exclude specific functions.
Is there any neat way to define __all__, other than explicitly typing each and every function in the module?
Not built-in, no. But defining __all__ by hand is basically the entire point; if you want to include everything, you can just not define __all__ at all:
If __all__ is not defined, the statement from sound.effects import * [...] ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package.
The entire point of __all__ is to restrict what gets "exported" by star-imports. There's no real way for Python to know that except by having you tell it, for each symbol, whether it should be there or not.
One easy workaround is to alias all of your imports with a leading underscore. Anything with a leading underscore is excluded from from x import * style imports.
import numpy as _np
import pandas as _pd
def my_fn():
    ...
What I mean to ask is:
TL;DR: how do I make my package's help include all of the underlying docstrings?
I have created a package. That package has all the proper __init__.py files and all the proper docstrings (module, function, class, and method level docstrings). However, when I run help(mypackage), the only help provided is the help defined in the top-level __init__.py module.
Often package-level help does not include all of the underlying docstrings, but sometimes it does.
I want to make sure that I am embedding all of the underlying docstrings.
For instance, within the numpy package all underlying docstrings are available in the help at the command prompt, even though they are not provided at the top-level __init__.py.
I.e., I can type
>>> help(numpy)
and see all of the documentation, including documentation defined outside of the dunder init module.
However, many other packages, including popular ones like pandas, do not capture all of the underlying documentation.
I.e., typing
>>> help(pandas)
only provides me the documentation defined in __init__.py.
I want to create package-level documentation mirroring how numpy does it.
I have tried to look through numpy to see how it is performing this magic, with no luck. I have performed Google searches, but it seems there is no way to phrase this question and get any decent links back.
numpy shows you documentation for classes and functions defined outside the __init__.py module because their names are added to the __all__ variable in __init__.py. Try commenting out lines 169-173 of numpy's __init__.py (don't forget to uncomment them afterwards!):
#__all__.extend(['__version__', 'show_config'])
#__all__.extend(core.__all__)
#__all__.extend(_mat.__all__)
#__all__.extend(lib.__all__)
#__all__.extend(['linalg', 'fft', 'random', 'ctypeslib', 'ma'])
After doing this, the output of help(numpy) will be very limited.
Let's also reproduce this behaviour. Starting from /some/path, create a folder named folder with a file named file.py inside it, with the following content:
class Class:
    """Class docstring"""
And __init__.py:
from .file import *
Now let's see the help:
/some/path$ python3.5
>>> import folder
>>> help(folder)
Help on package folder:

NAME
    folder

PACKAGE CONTENTS
    file

FILE
    /some/path/folder/__init__.py
And now add this line to __init__.py:
__all__ = ['Class']
After reimporting folder, the output of help(folder) will contain information about Class, including your docstring:
Help on package folder:

NAME
    folder

PACKAGE CONTENTS
    file

CLASSES
    builtins.object
        folder.file.Class

    class Class(builtins.object)
     |  Class docstring
     |
     |  Data descriptors defined here:
     |
     |  __dict__
     |      dictionary for instance variables (if defined)
     |
     |  __weakref__
     |      list of weak references to the object (if defined)

DATA
    __all__ = ['Class']

FILE
    /some/path/folder/__init__.py
I have a file, myfile.py, which imports Class1 from file.py, and file.py contains imports of different classes from file2.py, file3.py, and file4.py.
In myfile.py, can I access these classes, or do I need to import file2.py, file3.py, etc. again?
Does Python automatically add all the imports included in the file I imported, and can I use them automatically?
Best practice is to import every module that defines identifiers you need, and use those identifiers as qualified by the module's name; I recommend using from only when what you're importing is a module from within a package. The question has often been discussed on SO.
Importing a module, say moda, from many modules (say modb, modc, modd, ...) that need one or more of the identifiers moda defines does not slow you down: moda's bytecode is loaded (and possibly built from its sources, if needed) only once, the first time moda is imported anywhere. All other imports of the module then use a fast path involving a cache (a dict mapping module names to module objects, accessible as sys.modules in case of need... if you first import sys, of course!).
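A tiny demonstration of that cache, using the standard json module:

import sys

import json                       # first import: loads and caches the module
print('json' in sys.modules)      # True
import json as j2                 # subsequent imports hit the cache
print(j2 is sys.modules['json'])  # True: the very same module object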
Python doesn't automatically introduce anything into the namespace of myfile.py, but you can access everything that is in the namespaces of all the other modules.
That is to say, if in file1.py you did from file2 import SomeClass and in myfile.py you did import file1, then you can access it within myfile as file1.SomeClass. If in file1.py you did import file2 and in myfile.py you did import file1, then you can access the class from within myfile as file1.file2.SomeClass. (These aren't generally the best ways to do it, especially not the second example.)
This is easily tested.
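For instance, with these three files (names are hypothetical, mirroring the paragraph above):

file2.py
class SomeClass:
    pass

file1.py
import file2
from file2 import SomeClass

myfile.py
import file1
print(file1.SomeClass)        # works: file1 imported the name directly
print(file1.file2.SomeClass)  # also works, via file1's import of file2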
In the myfile module, you can either do from file import ClassFromFile2 or from file2 import ClassFromFile2 to access ClassFromFile2, assuming that the class is also imported in file.
This technique is often used to simplify the API a bit. For example, a db.py module might import various things from the modules mysqldb, sqlalchemy and some other helpers. Then everything can be accessed via the db module, as in the sketch below.
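A hedged sketch of that pattern; the helper module name is made up for illustration:

db.py
import sqlalchemy                # third-party, reachable as db.sqlalchemy
from db_helpers import connect   # hypothetical internal helper module

Client code then only touches db:

import db
engine = db.sqlalchemy.create_engine('sqlite://')
connection = db.connect()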
If you are using a wildcard import, then yes: a wildcard import creates new names in your current namespace for the contents of the imported module. If not, you need to go through the namespace of the module you imported, as usual.
The package I am documenting consists of a set of *.py files, most containing one class, with a couple of files being genuine modules with functions defined. I do not need to expose the fact that each class is in a module, so I have added suitable from statements in the __init__.py file, e.g.
from base import Base
so that the user can use the import pkg command and does not then have to specify the module that contains the class:
import pkg
class MyBase(pkg.Base):  # instead of pkg.base.Base
    ...
The problem is that Sphinx insists on documenting the class as pkg.base.Base. I have tried setting add_module_names = False in conf.py. However, this results in Sphinx showing the class as simply Base instead of pkg.Base. Additionally, this also ruins the documentation of the couple of *.py files that are genuine modules.
How do I make Sphinx show a class as pkg.Base?
And how do I set the add_module_names directive selectively for each *.py file?
Here is a way to accomplish what the OP asks for:
Add an __all__ list in pkg/__init__.py:
from base import Base # Or use 'from base import *'
__all__ = ["Base"]
Use .. automodule:: pkg in the .rst file.
Sphinx will now output documentation where the class name is shown as pkg.Base instead of pkg.base.Base.
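For reference, the .rst file would then contain something like this minimal sketch:

.. automodule:: pkg
   :members: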
I've incorporated the answers I found in a scalable-ish form factor:
my_project/
    __init__.py
    mess.py
mess.py:
class MyClass:
    pass

class MyOtherClass(MyClass):
    pass
__init__.py:
from .mess import MyClass, MyOtherClass

__all_exports = [MyClass, MyOtherClass]

for e in __all_exports:
    e.__module__ = __name__

__all__ = [e.__name__ for e in __all_exports]
This seems to have worked pretty well for me.
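A quick check of the effect (assuming the my_project layout above):

import my_project

# the classes now report the package, not the submodule, as their home
print(my_project.MyClass.__module__)  # 'my_project', not 'my_project.mess'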
I would like to provide a more generalized approach.
The variable __all__ is filled based on dir(). The submodule's name (here mypackage) and all built-in attributes (those starting with __) are ignored.
from .mypackage import *

__all__ = []
for v in dir():
    if not v.startswith('__') and v != 'mypackage':
        __all__.append(v)
Short answer: you shouldn't. Just point Sphinx at the directory of your code. Sphinx documents the code and shows the module hierarchy. How the module will finally be imported is purely in the hands of the developer, not a responsibility of the documentation tool.
A Python namespace package can be spread over many directories, and zip files or custom importers. What's the correct way to iterate over all the importable submodules of a namespace package?
Here is a way that works well for me. Create a new submodule all.py, say, in one of the packages in the namespace.
If you write
import mynamespace.all
you are given the object for the mynamespace module. This object contains all of the loaded modules in the namespace, irrespective of where they were loaded, since there is only one instance of mynamespace around.
So, just load all the packages in the namespace in all.py!
# all.py
from pkgutil import iter_modules

# import this module's namespace (= parent) package
pkg = __import__(__package__)

# iterate over all modules on the package's paths, prefixing the returned
# module names with "<namespace>.", and import each module by name
for m in iter_modules(pkg.__path__, __package__ + '.'):
    __import__(m.name)
Or in a one-liner that keeps the all module empty, if you care about that sort of thing:
# all.py
(lambda: [__import__(_.name) for _ in __import__('pkgutil').iter_modules(__import__(__package__).__path__, __package__ + '.')])() # noqa
After importing the all module from your namespace, you then actually receive a fully populated namespace module:
import mynamespace.all
mynamespace.mymodule1 # works
mynamespace.mymodule2 # works
...
Of course, you can use the same mechanism to enumerate or otherwise process the modules in the namespace, if you do not want to import them immediately.
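For example, to enumerate the submodule names without importing them, a sketch assuming the same mynamespace layout:

from pkgutil import iter_modules

import mynamespace

names = [m.name for m in iter_modules(mynamespace.__path__,
                                      mynamespace.__name__ + '.')]
print(names)  # e.g. ['mynamespace.mymodule1', 'mynamespace.mymodule2']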
Please read import confusion.
It very clearly distinguishes all the different ways you can import packages and their submodules, and in the process answers your question. When you need a certain submodule from a package, it's often much more convenient to write from io.drivers import zip than import io.drivers.zip, since the former lets you refer to the module simply as zip instead of by its full name.
As for from modname import *: this provides an easy way to import all the items from a module into the current namespace; however, it should be used sparingly.