Multi-layer package in Python

I have a Python package with subpackages in it. This explanation may sound strange, so I'll include my package's structure:
package/
    __init__.py
    subpackage1/
        __init__.py
        file1.py
    subpackage2/
        __init__.py
        file2.py
(I'm simplifying it for easier understanding.)
The __init__.py on the top level looks like this:
__all__ = ["subpackage1", "subpackage2"]
And, for some reason, when importing the package, it doesn't recognise anything from file1.py or file2.py. Any ideas how to fix it?
If you need more details, here's the project on GitHub: https://github.com/Retr0MrWave/mathModule. The directory I called package is mathmodule_pkg in the actual project.

Filling the __all__ field with names does not make imports possible; it merely serves as a hint of what you mean to make importable. This hint is picked up by star-imports to restrict what is imported, and IDEs like PyCharm also use it to get an idea of what is and isn't exposed - but that's about it.
If you want to enable top-level imports of your nested classes and functions, you need to:
1. import them into the top-level __init__.py
2. bind them to names that can be used for the import
3. optionally, reference said names in __all__ to make the API nice and obvious
Using the project you're referencing as an example, this is what it would look like:
mathmodule_pkg/__init__.py
import mathmodule_pkg.calculus.DerrivativeAndIntegral                # 1
integral = mathmodule_pkg.calculus.DerrivativeAndIntegral.integral   # 2
__all__ = ['integral']                                               # 3
Using the very common form of from some.package import some_name we can combine steps 1 and 2 and reduce the potential for bugs when re-binding the name:
from mathmodule_pkg.calculus.DerrivativeAndIntegral import integral # 1 and 2
__all__ = ['integral'] # 3
Using either form, after installing your package the following will be possible:
>>> from mathmodule_pkg import integral
>>> integral(...)


How to automate object initialization in Python 3.8+

The Situation
I'm currently working on a small but very expandable project, where I have the following structure:
/
|- main.py
|- services
   |- __init__.py
   |- service1.py
   |- service2.py
   |- ...
Every one of these services creates an object; all of them take exactly the same arguments, and all of them are used in the same way. The difference between them is internal: each does some (for this question unimportant) thing in a different way.
This is roughly how my code currently handles it:
main.py
from services import *

someObject = {}  # content doesn't matter, it's always the same
serv_arr = []    # an array to hold all services
serv_arr.append(service1.service1(someObject))
serv_arr.append(service2.service2(someObject))
...
for service in serv_arr:
    # this function always has the same name and return type in each service
    service.do_something()
The Question
My specific question is:
Is there a way to automate the creation of serv_arr with a loop, such that if I add service100.py and service101.py to the services package, I don't have to go back into main.py and add them manually, but instead it automatically loads whatever it needs?
First off, you should really avoid using the from xxx import * pattern, as it clutters the global namespace.
You could add a list of available services to services/__init__.py, something like this perhaps:
# services/__init__.py
from .service1 import service1
from .service2 import service2
...
services = [service1, service2, ...]
__all__ = ['services']
If that's still too manual for you, you could iterate over the directory and use importlib to import the services by their paths.
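For illustration, here's a minimal sketch of that approach, assuming each serviceN.py defines a class named after the module (as in your example):

# services/__init__.py -- hypothetical dynamic variant
import importlib
import pkgutil

services = []
for mod_info in pkgutil.iter_modules(__path__):
    # import each submodule of this package by name
    module = importlib.import_module(f'.{mod_info.name}', __name__)
    # assumes each module defines a class named after itself, e.g. service1.service1
    cls = getattr(module, mod_info.name, None)
    if cls is not None:
        services.append(cls)

__all__ = ['services']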
However, I can't help but think this problem is indicative of a bad design. You might want to consider using something like the Factory Pattern to instantiate the various services, rather than having a large number of separate modules. As it is, if you wanted to make a small change to all of the services, you'll have a lot of tedious work ahead of you to do so.
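A minimal sketch of what such a factory might look like (module and class names are hypothetical):

# services/factory.py -- hypothetical
from .service1 import service1
from .service2 import service2

_REGISTRY = {
    'service1': service1,
    'service2': service2,
}

def create_service(name, arg):
    # look the service class up by name and instantiate it
    return _REGISTRY[name](arg)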
Okay, building on this idea from Austin Philp's answer:
# services/__init__.py
from .service1 import service1
from .service2 import service2
...
services = [service1, service2, ...]
__all__ = ['services']
and on the idea of specifically exposed methods and modules from the Factory Pattern mentioned in this answer, I came up with a very hacky solution that works without cluttering the global namespace (another thing criticized by @Austin Philp).
The Solution
I got the idea to implement a function in each module that does nothing but create an instance of the class defined in that module, and to mention each module in services/__init__.py:
# services/__init__.py
from .service1 import service1
from .service2 import service2
__all__ = ["service1", "service2", ...]

# services/service1.py
class service1(object):
    def __init__(self, input):
        ...
    ...

def create_instance(input):
    return service1(input)  # create the object and return it
Then in main.py, I simply do this (it is extremely hacky, but it works):
# main.py
import services
import sys

# use __all__ to get the module names
for name in services.__all__:
    service = sys.modules[f'services.{name}'].create_instance(input)
    # do whatever with service
This way I can just happily do whatever is needed without cluttering the global namespace, while still iterating over, or even individually calling, the modules. The only thing I have to edit to add or remove a module is the __all__ variable inside services/__init__.py. It even removes the need for the serv_arr array, because services.__all__ already has all the names I am interested in, with the same length as the number of modules used.

Make methods available at a higher level in a package

Consider the following package structure:
foo/                  # package name
    spam/             # module name
        __init__.py
        eggs.py       # contains "bar" method
        exceptions.py # contains "BarException" class
Now in order to call the bar method, we have to do
import spam
spam.eggs.bar()
I'd like to lose eggs.
Now I know it is possible to import ... as (and from ... import), but is there no way to make methods available higher up in a tree?
Things I do not want to resort to:
- lots of from ... import ...
- putting my eggs.py code in __init__.py instead
- starred imports
- long names like spam.exceptions.BarException (possibly longer)
An example would be to have exceptions.py where I define my exception classes.
Whenever I want to make them available to a user, I wouldn't want them to use spam.exceptions.BarException, but rather be able to use spam.BarException.
Goal:
import spam

try:
    spam.bar()  # in this case throws BarException
except spam.BarException:
    pass
Note that, contrary to your comments, the top foo is not the package name, it's just a directory that's (presumably) on your sys.path somewhere, and spam is not the module name but the package name, and eggs is the module name. So:
foo/                  # directory package is in
    spam/             # package name
        __init__.py
        eggs.py       # contains "bar" method
        exceptions.py # contains "BarException" class
The key to what you want to do is this:
Any global names in spam/__init__.py are members of the spam package. It doesn't matter whether they were actually defined in __init__.py, or imported from somewhere else.
So, if you want to make the spam.eggs.bar function available as spam.bar, all you have to do is add this line to spam/__init__.py:
from .eggs import bar
If you have an __all__ attribute in spam/__init__.py to define the public attributes of spam, you will want to add bar to that list:
__all__ = ['other', 'stuff', 'directly', 'in', 'spam', 'bar']
If you want to re-export everything public from spam.eggs as a public part of spam, you can just do this:
from .eggs import *
__all__ = ['other', 'stuff', 'directly', 'in', 'spam'] + eggs.__all__
And of course you can extend this to more than one child module:
from .eggs import *
from .exceptions import *
__all__ = (['other', 'stuff', 'directly', 'in', 'spam'] +
           eggs.__all__ +
           exceptions.__all__)
This is common in the stdlib, and in popular third-party packages. For a good example, see the source to asyncio/__init__.py from Python 3.4.
However, it's only really common in this exact case: you want your users to be able to treat your package as if it were a simple, flat module, but it actually has some internal structure (either because the implementation would be too complicated otherwise, or because occasionally users will need that structure). If you're pulling in names from a grandchild, sibling, or parent instead of a child, you're probably abusing the idiom (or at least you should stop and convince yourself that you're not).
In your __init__.py, you can import things from other modules in the package. If in __init__.py you do from .eggs import bar, then someone can do import spam and access spam.bar. If in __init__.py you do from .exceptions import BarException, then someone can do import spam and then do spam.BarException.
However, you should be wary of going too far with this. Using nesting in packages and modules has a purpose, namely to create separate namespaces. Explicitly importing a few common things from a submodule to the top level is fine, but if you start trying to implicitly make everything available at the top level, you set yourself up for name collisions down the road (e.g., if one module defines something called Blah and then later another module also does so, without realizing they will collide when they're both imported to the top level).
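A quick illustration of that collision (module names hypothetical):

# mypkg/__init__.py
from .mod_a import *   # mod_a defines Blah
from .mod_b import *   # mod_b also defines Blah, silently shadowing mod_a's

# after "import mypkg", mypkg.Blah is mod_b.Blah;
# mod_a.Blah is no longer reachable at the top level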
"Forcing" users to use from is not an onerous requirement. If cumbersome imports are required to use your library, that may be a sign that your package/module structure is too cumbersome, and you should combine some things rather than splitting them up into separate directories/files.
Incidentally, the file structure you have indicated in your post has some problems. The top-level foo as you have shown is not a package, since it doesn't have an __init__.py. The second-level spam is not a module, since it is not a file. In your example, spam is a package, and it has inside it a module called eggs (in the file eggs.py); the top-level foo directory has no status in the Python packaging system.

Approach to Python package contents that are both part of the 'public interface' and 'used internally'?

I am in the midst of refactoring some single-file Python modules into multi-file packages, and I am encountering the same problem pattern repeatedly: I have objects that are part of the public interface of the package, but must also be used internally by submodules of the package.
mypackage/
    __init__.py # <--- Contains object 'cssURL'
    views.py    # <--- Needs to use object 'cssURL'
In this case, it's important that clients of mypackage have access to mypackage.cssURL. However, my submodule, views.py, also needs it, but has no access to the contents of __init__.py. Sure, I can create another submodule like so:
mypackage/
    __init__.py
    views.py
    style.py    # <--- New home for 'cssURL'
However, if I did this every time, it seems like it would multiply the number of submodules exceedingly. Moreover, clients must now refer to mypackage.cssURL as mypackage.style.cssURL, or else I must create a synonym in __init__.py like this:
from . import style
cssURL = style.cssURL
I think I am doing something wrong. Is there a better way to handle these kinds of package members that are both part of the public interface and used internally?
You can refer to the current package as .:
# views.py
from . import cssURL
I would structure it as follows:
/mypackage
    __init__.py
        from .style import cssURL
        ...
    style.py
        cssURL = '...'  # or whatever
        ...
    views.py
        from .style import cssURL
        ...
If other modules within the same package need them, I wouldn't define names in __init__.py; just create an alias there for external consumers to use.
As far as I know, the preferred way is to create a "synonym" in __init__.py with "from .style import cssURL"; cf. the source for the json module.
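For reference, the stdlib json package does exactly this; its __init__.py re-exports names from the submodules, roughly like so:

# json/__init__.py (excerpt, roughly)
from .decoder import JSONDecoder, JSONDecodeError
from .encoder import JSONEncoder

so users can write json.JSONDecoder instead of json.decoder.JSONDecoder.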

Make available only a subset of functions in a Python package

I am currently creating a package for Python, but I would like to give the user access to only a specific set of functions defined in this package. Let's say that the file structure is as follows:
my_package/
    __init__.py
    modules/
        __init__.py
        functions.py
In functions.py, there are several functions as below (those are silly examples):
def myfunction(x):
    return my_subfunction1(x) + my_subfunction2(x)

def my_subfunction1(x):
    return x

def my_subfunction2(x):
    return 2*x
I want the user to be able to import my_package and directly access myfunction, but NOT my_subfunction1 and my_subfunction2. For example, let's say that only myfunction is useful for the user, whereas the sub-functions are only intermediate computations.
import my_package

a = my_package.myfunction(1)      # should return 3
b = my_package.my_subfunction1(1) # should raise an error, function does not exist
I can think of two ways of solving my problem by adding the following lines to the __init__.py file inside my_package/:
1. from .modules.functions import myfunction
2. from .modules.functions import *, renaming the subfunctions with a leading underscore to exclude them from the starred import, i.e. _my_subfunction1 and _my_subfunction2
Both of these tricks seem to work well so far.
My question is thus: Is this the correct, "Pythonic" way to do it? Which one is better? If neither is the right way, how should I rewrite it?
Thanks for your help.
I believe you should take a look at the __all__ variable.
In your case, just set, in your __init__.py:
__all__ = ['myfunction']
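Note that, as pointed out in the first answer above, __all__ by itself does not create the binding; you still need to import the name into my_package/__init__.py. A minimal sketch combining the two:

# my_package/__init__.py
from .modules.functions import myfunction

# the subfunctions stay reachable only via my_package.modules.functions
__all__ = ['myfunction']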

How to bring docstrings into package scope with import * in __init__.py?

I have a Python package in which the implementation is split (for maintainability) into two internal submodules. From the user's point of view the package should appear as one unit, though, so in the package's __init__.py both submodules are imported with import *, as follows:
# filesystem layout:
mypkg/
    __init__.py
    subA.py # defines class A
    subB.py # defines class B
and
# __init__.py
from .subA import *
from .subB import *
This works as intended from the package functionality point of view:
>>> import mypkg
>>> a = mypkg.A() # works
>>> b = mypkg.B() # works
and if looking up inline help for these classes directly, everything is also good:
>>> help(mypkg.A) # works
>>> help(mypkg.subA.A) # also works
The problem is that if I just look up the help for the top-level package, cf.
>>> help(mypkg)
then the classes and functions from the submodules do not "voluntarily" appear at all (although variables from them do appear in the DATA section). Is this expected/correct behaviour, and is there a way to bypass it so that the users do not have to know about the submodules that exist for implementation/maintenance convenience only?
The best solution I know of is just to add the relevant documented objects (classes, functions, data) to __all__ in your __init__.py.
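For example, a sketch of what that looks like here, assuming subA defines A and subB defines B:

# mypkg/__init__.py
from .subA import *
from .subB import *

# listing the re-exported classes makes help(mypkg) document them
__all__ = ['A', 'B']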
