How to structure a Python package so others can add modules easily - python

I have a component-oriented Python (3.3) project. I have a base Component class. I want others to be able to add modules that contain subclasses of Component just by copying their .py file(s) into some folder, without having to edit anything. Running the main program should then simply import all the .py files found in that folder. All accesses from my main program to these subclasses are via Component.__subclasses__(), not by explicit name. I am not especially worried about name clashes between code in different user-written modules, but of course I would like to avoid it if possible without screwing up the simple drop-file-into-folder inclusion.
How do I structure a package to achieve this?

I would structure the package like this:
myPackage
+-- __init__.py
+-- Component.py
+-- user_defined_packages
    +-- __init__.py  # 1
    +-- example.py
Ideas:
- Let the users drop their files into a separate folder so that your code and theirs do not get mixed up.
- The __init__.py file in user_defined_packages can import all the submodules as soon as user_defined_packages is imported. It must print all errors.
__init__.py # 1
import os
import traceback
import sys
def import_submodules():
    def import_submodule(name):
        module_name = __name__ + '.' + name
        try:
            __import__(module_name)
        except:
            traceback.print_exc()  # no error should pass silently
        else:
            module = sys.modules[module_name]
            globals()[name] = module  # easier access
    directory = os.path.dirname(__file__)
    for path_name in os.listdir(directory):
        path = os.path.join(directory, path_name)
        if path_name.startswith('_'):
            # __pycache__, __init__.py and others
            continue
        if os.path.isdir(path):
            import_submodule(path_name)
        if os.path.isfile(path):
            name, extension = os.path.splitext(path_name)
            if extension in ('.py', '.pyw'):
                import_submodule(name)
import_submodules()
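The same idea can also be written with pkgutil.iter_modules and importlib.import_module, which let the import system enumerate the submodules instead of listing the directory by hand. The following is only a minimal sketch under stated assumptions: the package name demo_plugins, the files alpha.py and broken.py, and the helper import_all_submodules are all hypothetical and exist only for this demonstration, which builds a throwaway package in a temporary directory.

```python
import importlib
import pkgutil
import sys
import tempfile
import traceback
from pathlib import Path


def import_all_submodules(package_name):
    """Import every direct submodule of an importable package.

    Returns a dict mapping submodule name -> module. Import errors are
    printed rather than raised, so one broken plugin does not stop the rest.
    """
    package = importlib.import_module(package_name)
    loaded = {}
    for info in pkgutil.iter_modules(package.__path__):
        if info.name.startswith('_'):
            continue
        try:
            loaded[info.name] = importlib.import_module(
                f'{package_name}.{info.name}')
        except Exception:
            traceback.print_exc()
    return loaded


# Demo: build a throwaway package on disk (all names are hypothetical).
root = Path(tempfile.mkdtemp())
plugins = root / 'demo_plugins'
plugins.mkdir()
(plugins / '__init__.py').write_text('class Component:\n    pass\n')
(plugins / 'alpha.py').write_text(
    'from demo_plugins import Component\n\n'
    'class Alpha(Component):\n    pass\n')
(plugins / 'broken.py').write_text('raise RuntimeError("bad plugin")\n')

sys.path.insert(0, str(root))
importlib.invalidate_caches()
loaded = import_all_submodules('demo_plugins')

# The main program only needs the base class to find the plugins.
component_cls = sys.modules['demo_plugins'].Component
subclass_names = sorted(cls.__name__ for cls in component_cls.__subclasses__())
print(sorted(loaded), subclass_names)
```

The traceback for broken.py goes to stderr, while the working plugin is still registered through Component.__subclasses__().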

Related

Package __init__.py import all subfiles, but only load one from another script?

I have created a package with the following file structure:
- package
    - __init__.py
    - load.py
    - train.py
    - test.py
My __init__.py file is simply an import of classes for these files:
from package.load import Load
from package.train import Train
from package.test import Test
Most of the time, I want to load all three, however on occasion I only want to load one of these classes specifically. For example in an ad hoc script (outside of the package) I want to be able to call only the Load class like so:
from package import Load
While all of the above works in this design, I have an issue: the dependencies of train/test are also loaded when I import Load as shown above. How can I set up the __init__.py file so that I can make the same import call without loading the dependencies of train/test?
Additional explanation:
Why I am doing this: I want some people to be able to use the Load class, which only uses base Python, whereas the Train/Test files include specialized dependencies that users of just the Load class won't want to use or even install.
Here's a way to do something very close to what you want. Instead of unconditionally importing all the package's classes in your __init__.py, you can define a function in it to explicitly import any of the ones desired (or all of them if none are specified).
__init__.py:
from pathlib import Path
import sys
print(f'In {Path(__file__).name}')
package_name = Path(__file__).parent.name
package_prefix = package_name + '.'
class_to_module_map = {'Load': 'load', 'Train': 'train', 'Test': 'test'}
def import_classes(*class_names):
    namespace = sys._getframe(1).f_globals  # Caller's globals.
    if not class_names:
        class_names = class_to_module_map.keys()  # Import them all.
    for class_name in class_names:
        module = class_to_module_map[class_name]
        temp = __import__(package_prefix+module, globals(), locals(), [class_name])
        namespace[class_name] = getattr(temp, class_name)  # Add to caller's namespace.
For testing purposes, here's what I put in the load.py script:
(I also put something similar in the other two modules in order to verify whether or not they were getting imported.)
load.py:
from pathlib import Path
print(f'In {Path(__file__).name}')
class Load: pass
And finally here's an example of using it to import only the Load class:
ad_hoc.py:
from my_package import import_classes
#from my_package import Load
import_classes('Load')
test = Load()
print(test)
Along with the output produced:
In __init__.py
In load.py
<my_package.load.Load object at 0x001FE4A8>
Inside a folder oranges\, this is our __init__.py file:
__all__ = []
from pathlib import Path
from importlib import import_module
from sys import modules
package = modules[__name__]
initfile = Path(__file__)
for entry in initfile.parent.iterdir():
    is_file = entry.is_file()
    is_pyfile = entry.name.endswith('.py')
    is_not_initpy = (entry != initfile)
    if is_file and is_pyfile and is_not_initpy:
        module_name = entry.name.removesuffix('.py')
        module_path = __name__ + '.' + module_name
        module = import_module(module_path)
        setattr(package, module_name, module)
        __all__.append(module_name)
When we do from oranges import *, the code inside oranges\__init__.py cycles through the *.py files inside oranges\ (except __init__.py), and for each .py file does the following:
- imports the .py file as a module into the variable module using Python's importlib.import_module
- sets the module as an attribute of the oranges package using Python's setattr
- finally, appends the module's name to the __all__ list
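Neither approach above keeps the plain from package import Load call working while leaving train/test unloaded. One alternative, not taken from the answers, is a module-level __getattr__ (PEP 562, Python 3.7+), which imports a submodule only when one of its names is first requested. This is a minimal sketch under stated assumptions: the package name lazy_demo_pkg, the mapping _class_to_module, and the missing dependency a_missing_heavy_dependency are hypothetical, and the package is written to a temporary directory purely for the demonstration.

```python
import sys
import tempfile
from pathlib import Path

# Source of the hypothetical package's __init__.py.
INIT_SOURCE = '''
import importlib

_class_to_module = {"Load": "load", "Train": "train"}

def __getattr__(name):
    # Called only when `name` is not already defined in the package.
    try:
        module_name = _class_to_module[name]
    except KeyError:
        raise AttributeError(
            f"module {__name__!r} has no attribute {name!r}") from None
    module = importlib.import_module(f"{__name__}.{module_name}")
    value = getattr(module, name)
    globals()[name] = value  # cache it so later lookups skip __getattr__
    return value
'''

root = Path(tempfile.mkdtemp())
pkg = root / 'lazy_demo_pkg'
pkg.mkdir()
(pkg / '__init__.py').write_text(INIT_SOURCE)
(pkg / 'load.py').write_text('class Load:\n    pass\n')
# train.py needs a dependency that is not installed; it must never be imported.
(pkg / 'train.py').write_text(
    'import a_missing_heavy_dependency\n\nclass Train:\n    pass\n')

sys.path.insert(0, str(root))
from lazy_demo_pkg import Load  # triggers __getattr__('Load') only

train_imported = 'lazy_demo_pkg.train' in sys.modules
print(Load.__name__, train_imported)
```

With this layout, from lazy_demo_pkg import Train would still fail for users who lack the dependency, but only at the point where they actually ask for Train.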

Importing class in same directory

I have a project named AsyncDownloaderTest with main.py and AsyncDownloader.py in the same directory. I have just started learning Python, but the issue seems to be with the import.
main.py
from .AsyncDownloader import AsyncDownloader
ad = AsyncDownloader()
ad.setSourceCSV("https://people.sc.fsu.edu/~jburkardt/data/csv/grades.csv","First name")
print(ad.printURLs)
AsyncDownloader.py
import pandas as pd
class AsyncDownloader:
    """Download files asynchronously"""
    __urls = None
    def setSourceCSV(self, source_path, column_name):
        self.source_path = source_path
        self.column_name = column_name
        # TODO Check if path is a valid csv
        # TODO Store the urls found in column in a list
        my_csv = pd.read_csv(source_path, usecols=[column_name], chunksize=10)
        for chunk in my_csv:
            AsyncDownloader.urls += chunk.column_name
    def printURLs(self):
        print(AsyncDownloader.urls)
I am getting the following error
ModuleNotFoundError: No module named '__main__.AsyncDownloader'; '__main__' is not a package
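Do you have an __init__.py in the same directory as AsyncDownloader.py? The directory needs one if it is meant to be imported as a package.
__init__.py is a (possibly empty) file that marks the directory as a package, so that the modules in it can be imported through the package name.
The error itself comes from the leading . in from .AsyncDownloader: main.py is being run as a script, so its module name is __main__ and a relative import has no parent package to resolve against. Drop the leading . and the import should work, because the script's own directory is on sys.path. If you like, you can instead make the import absolute through the enclosing package by changing it to: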
from enclosing_folder.AsyncDownloader import AsyncDownloader
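The difference between running a file as a script and running it as a module inside its package can be checked with a small experiment. This is only a sketch: the package name demo_project and the module downloader.py are hypothetical stand-ins for the project in the question, created in a temporary directory.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Build a tiny package whose main.py uses a relative import.
root = Path(tempfile.mkdtemp())
pkg = root / 'demo_project'  # hypothetical package name
pkg.mkdir()
(pkg / '__init__.py').write_text('')
(pkg / 'downloader.py').write_text('class Downloader:\n    pass\n')
(pkg / 'main.py').write_text(
    'from .downloader import Downloader\n'
    'print("imported", Downloader.__name__)\n')

# 1. Run as a plain script: the relative import has no parent package.
as_script = subprocess.run(
    [sys.executable, str(pkg / 'main.py')],
    capture_output=True, text=True)

# 2. Run as a module of the package, from the parent directory.
as_module = subprocess.run(
    [sys.executable, '-m', 'demo_project.main'],
    cwd=str(root), capture_output=True, text=True)

print(as_script.returncode != 0, as_module.stdout.strip())
```

The first run fails with an import error, while the second succeeds, because only with -m does Python know which package main.py belongs to.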

Dynamic importing modules results in an ImportError

So I've been trying to make a simple program that dynamically imports the modules in a folder whose names start with a certain prefix. I cd to the directory with os and run module = __import__(module_name) inside a for loop that iterates over the file names, assigning each one to the variable module_name.
My only problem is I get hit with a:
ImportError: No module named "module_name"
(showing the name stored in the variable, which I've given as a string). The file exists, it's in the directory mentioned, and import works fine from within that directory. But even a normal import doesn't work for modules in the directory I cd into. The code looks as follows. I'm sorry if this is an obvious question.
import os
class Book():
    def __init__(self):
        self.name = "Book of Imps"
        self.moduleNames = []
    # configure path
    def initialize(self):
        path = os.getcwd() + '/Imp-pit'
        os.chdir(path)
        cwd = os.walk(os.getcwd())
        x, y, z = next(cwd)
        # Build Modules
        for name in z:
            if name[:3] == 'Imp':
                module_name = name[:len(name) - 3]
                module = __import__(module_name)
def start_sim():
    s = Book()
    s.initialize()
if __name__ == '__main__':
    start_sim()
I don't think the interpreter alters sys.path when you simply change the current directory with os.chdir. You'll have to insert the path into the sys.path list manually (after import sys) for this to work, i.e.:
sys.path.insert(0, path)
Python generally searches sys.path when looking for modules, so it will find it if you specify it there.
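An additional note: prefer importlib.import_module over __import__. For a simple top-level name the call looks the same, but for a dotted name import_module returns the submodule you asked for, whereas __import__ returns the top-level package; import_module is also the one the documentation recommends.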
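Putting both suggestions together, a loader for the question's use case might look like the following. It is a minimal sketch, not the asker's code: the helper import_modules_from and the file names ImpRed.py, ImpBlue.py and notes.py are hypothetical, and a temporary folder stands in for Imp-pit.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_modules_from(directory, prefix='Imp'):
    """Import every `<prefix>*.py` file in `directory`; return name -> module."""
    directory = str(Path(directory).resolve())
    if directory not in sys.path:
        sys.path.insert(0, directory)  # make the folder importable
    importlib.invalidate_caches()      # notice files created after startup
    modules = {}
    for path in sorted(Path(directory).glob(f'{prefix}*.py')):
        # path.stem is the file name without '.py', i.e. the module name.
        modules[path.stem] = importlib.import_module(path.stem)
    return modules


# Demo with a throwaway folder standing in for 'Imp-pit'.
folder = Path(tempfile.mkdtemp())
(folder / 'ImpRed.py').write_text('COLOUR = "red"\n')
(folder / 'ImpBlue.py').write_text('COLOUR = "blue"\n')
(folder / 'notes.py').write_text('COLOUR = "ignored"\n')

imps = import_modules_from(folder)
print({name: module.COLOUR for name, module in imps.items()})
```

No os.chdir is needed here: the import system looks at sys.path, not at the current working directory of a running script.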
You should use the try-except construct, e.g.:
try:
    import csv
except ImportError:
    raise ImportError('<error message>')
If I understood you correctly, then:
try:
    module = __import__(module_name)
except ImportError:
    raise ImportError('<error message>')

Manually import a package and sub-modules in python given their name

I have this problem and do not know how to solve it efficiently.
I've this file structure
Neither the number of folders nor their names are given; it is all unknown.
app/
    __init__.py
    modules/
        __init__.py
        ModuleA/
            __init__.py
            file.py
            otherfile.py
            config.ini
        ModuleB/
            __init__.py
            file.py
            otherfile.py
            config.ini
        ModuleC/
            __init__.py
            file.py
            otherfile.py
            config.ini
        (arbitrary number of modules with the same structure)
As you can see, app is the main package of my app, but I need an efficient way to import the modules folder and its contents.
My current solution:
from app import modules as mods
def load_modules_from_packages(self, pkg=mods):
    pkgname = pkg.__name__
    pkgpath = dirname(pkg.__file__)
    for loader, name, ispkg in pkgutil.walk_packages(pkg.__path__, pkgname + '.'):
        if ispkg is True:
            __import__(name, globals(), locals(), [], 0)
        elif ispkg is False:
            __import__(name, globals(), locals(), [], 0)
This works because pkgutil iterates over the structure and yields the names in dot notation, so the import works well.
But now I also want to load the information in the config file whenever I am in one of the module folders (the ones with their own __init__.py and config.ini).
I want to do this to recreate the structure of the modules package and output it as a JSON representation for another purpose.
My other solution does not work:
def load_modules_from_packages(directory):
    dir_path = dirname(directory.__file__)
    dir_name = directory.__name__
    for filename in glob.glob(dir_path + '/**/*.ini', recursive=True):
        plugin = {}
        plugin['name'] = filename.split('/')[-2]
        plugin['path'] = dirname(filename)
        plugin['config_file'] = filename
        for pyname in glob.glob(dirname(filename) + '/**/*.py', recursive=True):
            # pyname is a file path, not a dotted module name, so this import fails
            importlib.import_module(pyname)
I can't use the solution posted in this thread:
How to import a module given the full path?
since I do not know the module name, as pointed out (without a solution) in the comments.
spec = importlib.util.spec_from_file_location('what.ever', 'foo.py')
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
I know 'foo.py', but I can't figure out 'what.ever' the way pkgutil.walk_packages does.
In fact, modules imported this way have the wrong package and name entries. With this approach I can't work out where I am in the file structure in order to build the modules dictionary and the related modules (for the JSON output).
Any help?
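One possible way to recover the dotted name (the 'what.ever' part) is to compute it from the file's path relative to the package directory and then hand it to importlib.import_module. The sketch below is only an illustration under stated assumptions: the helpers dotted_name and collect_plugins, the package name demo_app, and the shape of the resulting dictionary are all hypothetical; the layout mirrors the tree in the question and is created in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def dotted_name(py_file, package_dir, package_name):
    """Turn a file path below `package_dir` into a dotted module name."""
    relative = Path(py_file).relative_to(package_dir).with_suffix('')
    # A package's __init__.py is imported under the package's own name.
    parts = [part for part in relative.parts if part != '__init__']
    return '.'.join([package_name, *parts])


def collect_plugins(package):
    """Map each folder holding a config.ini to its imported module names."""
    package_dir = Path(package.__file__).parent
    plugins = {}
    for config in sorted(package_dir.rglob('config.ini')):
        names = [dotted_name(f, package_dir, package.__name__)
                 for f in sorted(config.parent.rglob('*.py'))]
        for name in names:
            importlib.import_module(name)
        plugins[config.parent.name] = {
            'config_file': str(config),
            'modules': names,
        }
    return plugins


# Demo layout (hypothetical names mirroring the question's tree).
root = Path(tempfile.mkdtemp())
mod_a = root / 'demo_app' / 'modules' / 'ModuleA'
mod_a.mkdir(parents=True)
for init_dir in (root / 'demo_app', root / 'demo_app' / 'modules', mod_a):
    (init_dir / '__init__.py').write_text('')
(mod_a / 'file.py').write_text('VALUE = 1\n')
(mod_a / 'config.ini').write_text('[plugin]\nname = A\n')

sys.path.insert(0, str(root))
mods = importlib.import_module('demo_app.modules')
result = collect_plugins(mods)
print(result['ModuleA']['modules'])
```

Because the modules are imported under their real dotted names, their __name__ and __package__ attributes match their position in the tree, and the returned dictionary can be passed to json.dumps as it stands.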

Include .pyd module files in py2exe compilation

I'm trying to compile a Python script. On executing the exe I got:
C:\Python27\dist>visualn.exe
Traceback (most recent call last):
  File "visualn.py", line 19, in <module>
  File "MMTK\__init__.pyc", line 39, in <module>
  File "Scientific\Geometry\__init__.pyc", line 30, in <module>
  File "Scientific\Geometry\VectorModule.pyc", line 9, in <module>
  File "Scientific\N.pyc", line 1, in <module>
ImportError: No module named Scientific_numerics_package_id
I can see the file Scientific_numerics_package_id.pyd at the location "C:\Python27\Lib\site-packages\Scientific\win32". I want to include this module file into the compilation. I tried to copy the above file in the "dist" folder but no good. Any idea?
Update:
Here is the script:
from MMTK import *
from MMTK.Proteins import Protein
from Scientific.Visualization import VRML2; visualization_module = VRML2
protein = Protein('3CLN.pdb')
center, inertia = protein.centerAndMomentOfInertia()
distance_away = 8.0
front_cam = visualization_module.Camera(position= [center[0],center[1],center[2]+distance_away],description="Front")
right_cam = visualization_module.Camera(position=[center[0]+distance_away,center[1],center[2]],orientation=(Vector(0, 1, 0),3.14159*0.5),description="Right")
back_cam = visualization_module.Camera(position=[center[0],center[1],center[2]-distance_away],orientation=(Vector(0, 1, 0),3.14159),description="Back")
left_cam = visualization_module.Camera(position=[center[0]-distance_away,center[1],center[2]],orientation=(Vector(0, 1, 0),3.14159*1.5),description="Left")
model_name = 'vdw'
graphics = protein.graphicsObjects(graphics_module = visualization_module,model=model_name)
visualization_module.Scene(graphics, cameras=[front_cam,right_cam,back_cam,left_cam]).view()
Py2exe lets you specify additional Python modules (both .py and .pyd) via the includes option:
setup(
    ...
    options={"py2exe": {"includes": ["Scientific.win32.Scientific_numerics_package_id"]}}
)
EDIT. The above should work if Python is able to
import Scientific.win32.Scientific_numerics_package_id
There is a way to work around this type of issue that I have used a number of times. To add extra files to the py2exe result, you can extend the media collector with a custom version of it. The following code is an example:
import glob
import os
from distutils import log  # assumed source of `log`; the original snippet did not show this import
from py2exe.build_exe import py2exe as build_exe
def get_py2exe_extension():
    """Return an extension class of py2exe."""
    class MediaCollector(build_exe):
        """Extension that copies Scientific_numerics_package_id missing data."""
        def _add_module_data(self, module_name):
            """Add the data from a given path."""
            # Create the media subdir where the
            # Python files are collected.
            media = module_name.replace('.', os.path.sep)
            full = os.path.join(self.collect_dir, media)
            if not os.path.exists(full):
                self.mkpath(full)
            # Copy the media files to the collection dir.
            # Also add the copied file to the list of compiled
            # files so it will be included in zipfile.
            module = __import__(module_name, None, None, [''])
            for path in module.__path__:
                for f in glob.glob(path + '/*'):  # does not like os.path.sep
                    log.info('Copying file %s', f)
                    name = os.path.basename(f)
                    if not os.path.isdir(f):
                        self.copy_file(f, os.path.join(full, name))
                        self.compiled_files.append(os.path.join(media, name))
                    else:
                        self.copy_tree(f, os.path.join(full, name))
        def copy_extensions(self, extensions):
            """Copy the missing extensions."""
            build_exe.copy_extensions(self, extensions)
            for module in ['Scientific_numerics_package_id',]:
                self._add_module_data(module)
    return MediaCollector
I'm not sure which module Scientific_numerics_package_id is, so I've assumed that you can import it like that. The copy_extensions method goes through the names of the modules you are having problems with and copies all their data into the collection directory for you. Once you have that, to use the new media collector you just have to do something like the following:
cmdclass['py2exe'] = get_py2exe_extension()
so that the correct extension is used. You might need to tweak the code a little, but this should be a good starting point for what you need.
I encountered a similar problem with py2exe, and the only solution I could find was to use another tool to convert Python to an exe: PyInstaller.
It is a very easy tool to use and, more importantly, it works!
UPDATE
As I understand from your comments below, running your script from the command line does not work either, due to the import error. (My recommendation is to first check your code from the command line, and then try to convert it to an EXE.)
It looks like a PYTHONPATH problem.
PYTHONPATH is a list of paths (similar to the Windows PATH) that Python uses to find imported modules.
If your script runs from your IDE, that means the module search path is set correctly in the IDE, so all imported modules are found.
To add a directory to the search path at run time (the effect PYTHONPATH has at startup), you can use:
import sys
sys.path.append(pathname)
or use the following code, which adds all folders under the path parameter to the search path:
import os
import sys
def add_tree_to_pythonpath(path):
    """
    Function: add_tree_to_pythonpath
    Description: Go over each directory in path and add it to PYTHONPATH
    Parameters: path - Parent path to start from
    Return: None
    """
    # Go over each directory and file in path
    for f in os.listdir(path):
        if f == ".bzr" or f.lower() == "dll":
            # Ignore bzr and dll directories (optional to NOT include specific folders)
            continue
        pathname = os.path.join(path, f)
        if os.path.isdir(pathname) == True:
            # Add path to PYTHONPATH
            sys.path.append(pathname)
            # It is a directory, recurse into it
            add_tree_to_pythonpath(pathname)
        else:
            continue
def startup():
    """
    Function: startup
    Description: Startup actions needed before call to main function
    Parameters: None
    Return: None
    """
    parent_path = os.path.normpath(os.path.join(os.getcwd(), ".."))
    parent_path = os.path.normpath(os.path.join(parent_path, ".."))
    # Go over each directory in parent_path and add it to PYTHONPATH
    add_tree_to_pythonpath(parent_path)
    # Start the program
    main()
startup()
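The recursive helper above can also be written with os.walk, which handles the recursion itself. This is a minimal sketch, not the answerer's code: the function name add_tree_to_sys_path and the demo folder names are hypothetical, and unlike the original it compares the skipped names case-insensitively for both .bzr and dll.

```python
import os
import sys
import tempfile


def add_tree_to_sys_path(root, skip=('.bzr', 'dll')):
    """Append every directory below `root` to sys.path, skipping some names."""
    added = []
    for current, dirnames, _filenames in os.walk(root):
        # Prune skipped folders in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d.lower() not in skip]
        for d in dirnames:
            path = os.path.join(current, d)
            if path not in sys.path:
                sys.path.append(path)
                added.append(path)
    return added


# Demo on a throwaway tree.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'pkg', 'sub'))
os.makedirs(os.path.join(root, 'DLL', 'hidden'))
added = add_tree_to_sys_path(root)
print([os.path.relpath(p, root) for p in added])
```

Adding every subdirectory to the search path makes name clashes between modules more likely, so it is usually safer to add only the few directories that actually contain importable top-level modules or packages.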
The ImportError was resolved by following the suggestions from "Gil.I" and "Janne Karila": setting the Python path and using the includes option. But before that I had to create an __init__.py file in the win32 folder of both modules.
BTW I still got another error for the above script - link
