Python 3 project structure for single vs. multiple modules - python

In a python project, I have the following directory structure
├── cooccurrence
│   ├── cooccurrence.py
│   └── __init__.py
├── README.md
└── tests
    ├── __init__.py
    └── test_cooccurrence.py
This forces the code in my test source files to carry a rather ceremonial import line:
from cooccurrence.cooccurrence import CoCreate
How would I simplify this overall setup if I only needed a single module, and conversely, what project structure should I have to manage multiple modules under the same package?
To test, I simply use python -m unittest discover -v, and a solution that can also seamlessly enable using the project within PyCharm would be much appreciated.

You can import names in __init__.py so they become available at the package level. For example, in cooccurrence/__init__.py:
from .cooccurrence import CoCreate
and then in your test file:
from cooccurrence import CoCreate
This is the Pythonic way of doing it.

Put the following line in cooccurrence/__init__.py:
from cooccurrence import *
[Note]:
Tested on Python 2.7. On Python 3 the import must be relative: from .cooccurrence import *
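Either variant can be sanity-checked end to end by rebuilding the layout in a throwaway directory. The temp-dir scaffolding below exists only for the demonstration; the one line that matters is the re-export written into __init__.py:

```python
import os
import sys
import tempfile

# Recreate the question's package in a temporary directory.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "cooccurrence")
os.mkdir(pkg)
with open(os.path.join(pkg, "cooccurrence.py"), "w") as f:
    f.write("class CoCreate:\n    pass\n")
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    # Python 3 re-export: note the leading dot.
    f.write("from .cooccurrence import CoCreate\n")

sys.path.insert(0, root)
from cooccurrence import CoCreate  # the short import the question wants
print(CoCreate.__name__)  # CoCreate
```

This is exactly what python -m unittest discover (and PyCharm) will see, as long as the project root is on sys.path.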


Pylint disagrees with VSCode and python in imports

I can't find a way to structure the code so that both pylint and normal execution (within VSCode or from the command line) are happy.
There are some similar questions but none seems to apply to my project structure with a src directory under which there will be multiple packages. Here's the simplified project structure:
.
├── README.md
├── src
│   ├── rssita
│   │   ├── __init__.py
│   │   ├── feeds.py
│   │   ├── rssita.py
│   │   └── termcolors.py
│   └── zanotherpackage
│       ├── __init__.py
│       └── anothermodule.py
└── tests
    ├── __init__.py
    └── test_feeds.py
From what I understand, rssita is one of my packages (because of the __init__.py file), with some modules under it, among which the rssita.py file contains the following imports:
from feeds import RSS_FEEDS
from termcolors import PC
The rssita.py code shown above runs fine both from within VSCode and from the command line (python src/rssita/rssita.py, run from the project root), but pylint (both within VSCode and on the command line, as pylint src or pylint src/rssita) flags the two imports as not found.
If I modify the code as follows:
from rssita.feeds import RSS_FEEDS
from rssita.termcolors import PC
pylint is then happy, but the code no longer runs, since the imports are not found.
What's the cleanest fix for this?
As far as I'm concerned pylint is right; your setup / PYTHONPATH is screwed up: in Python 3, all imports are absolute by default, so
from feeds import RSS_FEEDS
from termcolors import PC
should look for top-level packages called feeds and termcolors which I don't think exist.
python src/rssita/rssita.py
That really ain't the correct invocation; it's going to set up a really weird PYTHONPATH in order to run a random script.
The correct imports should be package-relative:
from .feeds import RSS_FEEDS
from .termcolors import PC
Furthermore, if you intend to run a package, it should either be a runnable package with a __main__ module:
python -m rssita
or you should run the sub-package as a module:
python -m rssita.rssita
Because you're using an src layout, you'll either need to create a pyproject.toml so you can use an editable install, or you'll have to set PYTHONPATH=src before you run the command. This ensures the packages are visible at the top level of the PYTHONPATH, and thus correctly importable. I'm not a specialist in the interaction of src layouts and runnable packages, though, so there may be better solutions.
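For the editable-install route, a minimal pyproject.toml might look like this (the name, version, and setuptools backend are assumptions; the src mapping is the point):

```toml
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "rssita"
version = "0.0.1"

[tool.setuptools.packages.find]
where = ["src"]
```

After pip install -e . from the project root, python -m rssita.rssita runs from any directory, and pylint can resolve from rssita.feeds import RSS_FEEDS.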

Relative Repo Path in __init__.py?

I'll share a problem that we are kinda having at work. We have a code repo and there are files spread all around (currently trying to get everyone to clean up). The issue is that with every file that gets created we are having to make a RELATIVE_REPO_PATH (RRP) in order to import other files in our repo. The repo is hosted using mercurial, and all users have a clone of the repo on their local machines. They each have the ability to push and pull updates as needed. This means that the RRP cannot be hardcoded in anywhere. Below I'll show an example of our structure, and also show what we currently do to get a RRP.
An example of our structure.
.
├── __init__.py
├── misc_functions
│   ├── __init__.py
│   ├── myTest.py
│   └── __pycache__
│       └── __init__.cpython-37.pyc
└── Test
    ├── __init__.py
    ├── pythonfile1 (copy).py
    └── pythonfile1.py
Here is what we currently do at work
import os
import sys

if not hasattr(sys, 'frozen'):
    RELATIVE_REPO_PATH = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
    if __name__ == '__main__':
        sys.path.append(RELATIVE_REPO_PATH)
        sys.path.pop(0)
else:
    RELATIVE_REPO_PATH = os.path.dirname(sys.executable)
As you can see, the deeper a file sits below the repo root, the more nested os.path.dirname calls it needs. We have tried this with while loops; however, a lot of our files import each other and it became a mess.
Is there any way to just add a simple snippet of code to the top-level __init__.py, or to all the __init__.py files, that would eliminate the constant copy-pasting of this code?
Just to clarify, everything that we do works just fine. We want to know whether there is a way to set up a global RRP in the __init__.py that will work anywhere there is an __init__.py.
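For what it's worth, one way to cut this down to a single definition is to compute the repo path once, in the top-level __init__.py, and have every other file import it. A minimal sketch, where the package name repo and the temp-dir scaffolding exist only for the demonstration:

```python
import os
import sys
import tempfile

# Fake a tiny repo: a top-level package with one sub-package.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "repo")
os.makedirs(os.path.join(pkg, "misc_functions"))

# The ONLY place the path is computed: the top-level __init__.py.
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("import os\n"
            "REPO_PATH = os.path.dirname(os.path.realpath(__file__))\n")

open(os.path.join(pkg, "misc_functions", "__init__.py"), "w").close()
with open(os.path.join(pkg, "misc_functions", "myTest.py"), "w") as f:
    # Every other file imports the value instead of re-deriving it.
    f.write("from repo import REPO_PATH\n")

sys.path.insert(0, root)
import repo.misc_functions.myTest   # imports cleanly at any depth
from repo import REPO_PATH
print(os.path.isdir(REPO_PATH))  # True
```

This still assumes the repo's parent directory is importable (here via sys.path.insert); it only removes the per-file dirname chains.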

How to set up a PyPI package without the dot notation in absolute imports - python3

Most "professional" python modules can be imported like so:
from pythonfile import class
And that is how I usually import my own classes that are in my local file system/current directory.
But after looking at several posts and python.org documents, I still can't get my python module (on Pypi.org) to import like that.
The only way I can get my module to import is like so:
from example.example import Example
(Example is a class within example.py)
How to I get my module to import like so:
from example import Example
My module on PyPI has a folder structure like this and installs fine via pip:
example/
├── example/
│   ├── example.py
│   ├── __init__.py
│   └── __pycache__
│       └── __init__.cpython-36.pyc
├── __init__.py
├── LICENSE
├── README.md
└── setup.py
I've omitted the build/, dist/, and egg-info directories for clarity.
Should I add import statements to one of the __init__.py files?
I want developers to be able to install the package via pip, and then use a simple
from example import Example and not from example.example import Example
Thank you.
If you have a look at some other big packages, you'll see that their top-level __init__.py usually contains something like
from .modulex import *
from .moduley import *
The leading dot makes the import relative: Python looks inside the current package instead of searching the rest of sys.path.
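A quick way to see the star re-export in action, with __all__ controlling what leaks out (the package and class names here are made up for the demo):

```python
import os
import sys
import tempfile

# Build a throwaway package whose __init__.py does `from .modulex import *`.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "mypkg")
os.mkdir(pkg)
with open(os.path.join(pkg, "modulex.py"), "w") as f:
    f.write("__all__ = ['Example']\n"
            "class Example: pass\n"
            "class _Hidden: pass\n")
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("from .modulex import *\n")

sys.path.insert(0, root)
import mypkg
print(hasattr(mypkg, "Example"))  # True: re-exported at package level
print(hasattr(mypkg, "_Hidden"))  # False: not listed in __all__
```

So users get from mypkg import Example instead of from mypkg.modulex import Example, and __all__ keeps private names out of the package namespace.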

How to package a library whose import produces side effects in Python?

I am working on a python library (not mine) which looks like this:
.
├── README.md
├── setup.py
└── library
    ├── __init__.py
    ├── core.py
    ├── a.py
    └── b.py
The file __init__.py makes use of core.py, which itself uses a.py and b.py. The important thing to note is that import library has some side effects which are deliberately intended.
However, I would like to give the user the possibility to use functions of core.py without there being any side effects. Unfortunately, as you know, import library.core or from library import core will execute __init__.py (where side effects occur) anyway.
Do you know how I could reorganize my package and the setup.py to solve this problem?
I thought to something like this:
.
├── README.md
├── setup.py
├── library_core
│   ├── __init__.py
│   ├── core.py
│   ├── a.py
│   └── b.py
└── library
    └── __init__.py  # Import library_core and apply side effects
I would update setup.py with packages = ['library', 'library_core']. That way, importing library does not change anything, but users could import library_core without any side effects. This would also avoid duplicating code, and everything would stay in the same repository.
Unfortunately, this does not work because I do not have the ability to import library_core from library since they are not in the same place in the file tree.
I'd recommend that you stop relying on side effects and require the user to explicitly trigger them by calling a documented function. Otherwise you are fighting a losing battle: the default is currently to trigger the side effects, and then you have to undo them if the user doesn't want them.
Using two packages seems to be the best way.
Two adjacent packages work only if the whole library is installed (with python setup.py install, for example). This complicates development considerably, unit tests in particular: import library fails because library_core cannot be found unless it is installed.
So, the best solution is to simply make a sub-package, and specify within the setup.py where library_core is located thanks to the package_dir option.
The files tree would look like this:
.
├── README.md
├── setup.py
└── library
    ├── __init__.py
    └── core
        ├── __init__.py
        ├── a.py
        └── b.py
And in setup.py:
setup(
    name='library',
    packages=['library', 'library.core', 'library_core'],
    package_dir={'library_core': 'library/core'},
    ...
)

Python Not Finding Module

Given the following python project, created in PyDev:
├── algorithms
│   ├── __init__.py
│   └── neighborhood
│       ├── __init__.py
│       ├── neighbor
│       │   ├── connector.py
│       │   ├── __init__.py
│       │   ├── manager.py
│       │   └── references.py
│       ├── neighborhood.py
│       ├── tests
│       │   ├── fixtures
│       │   │   └── neighborhood
│       │   └── __init__.py
│       └── web
│           ├── __init__.py
│           └── service.py
├── configuration
│   ├── Config.py
│   └── __init__.py
├── __init__.py
└── webtrack
    ├── teste.py
    ├── .gitignore
    ├── __init__.py
    └── manager
        ├── Data.py
        ├── ImportFile.py
        └── __init__.py
We've been trying with no success to import modules from one folder to another, such as:
from algorithms.neighborhood.neighbor.connector import NeighborhoodConnector
Which yields the result:
Traceback (most recent call last):
  File "teste.py", line 49, in <module>
    from algorithms.neighborhood.neighbor.connector import NeighborhoodConnector
ImportError: No module named algorithms.neighborhood.neighbor.connector
We tried to append its path to the sys.path variable but with no success.
We also tried to use os.walk to insert all paths into the PATH variable, but we still get the same error, even though we checked that PATH does contain the path to the modules.
We are using Python 2.7 on Linux Ubuntu 13.10.
Is there anything we could be doing wrong?
Thanks in advance,
Getting imports right when running a script that lives within a package is tricky. You can read this section of the (sadly deferred) PEP 395 for a description of a bunch of ways that don't work to run such a script.
Given a file system hierarchy like:
top_level/
    my_package/
        __init__.py
        sub_package/
            __init__.py
            module_a.py
            module_b.py
            sub_sub_package/
                __init__.py
                module_c.py
        scripts/
            __init__.py
            my_script.py
            script_subpackage/
                __init__.py
                script_module.py
There are only a few ways to make running my_script.py work right.
The first would be to put the top_level folder into the PYTHONPATH environment variable, or use a .pth file to achieve the same thing. Or, once the interpreter is running, insert that folder into sys.path (but this can get ugly).
Note that you're adding top_level to the path, not my_package! I suspect this is what you've got messed up in your current attempts at this solution. It's very easy to get wrong.
Then, absolute imports like import my_package.sub_package.module_a will mostly work correctly. (Just don't try importing my_package.scripts.my_script itself while it is running as the __main__ module, or you'll get a weird duplicate copy of the module.)
However, absolute imports will always be more verbose than relative imports, since you always need to specify the full path, even if you're importing a sibling module (or "niece" module, like module_c from module_a). With absolute imports, the way to get module_c is always the big, ugly mouthful of code from my_package.sub_package.sub_sub_package import module_c regardless of what module is doing the importing.
For that reason, using relative imports is often more elegant. Alas, they're hard to get to work from a script. The only ways are:
Run my_script from the top_level folder with the -m flag (e.g. python -m my_package.scripts.my_script) and never by filename.
It won't work if you're in a different folder, or if you use a different method to run the script (like pressing F5 in an IDE). This is somewhat inflexible, but there's not really any way to make it easier (until PEP 395 gets undeferred and implemented).
Set up sys.path like for absolute imports (e.g. add top_level to PYTHONPATH or something), then use a PEP 366 __package__ string to tell Python what the expected package of your script is. That is, in my_script.py you'd want to put something like this above all your relative imports:
if __name__ == "__main__" and __package__ is None:
    __package__ = "my_package.scripts"
This will require updating if you reorganize your file organization and move the script to a different package (but that's probably less work than updating lots of absolute imports).
Once you've implemented one of those solutions, your imports can get simpler. Importing module_c from module_a becomes from .sub_sub_package import module_c. In my_script, relative imports like from ..sub_package import module_a will just work.
I know this is an old post but still I am going to post my solution.
Had a similar issue. I just added the path with the following lines before importing the package:
import os
import sys

sys.path.append(os.path.join(os.path.dirname(__file__), '..'))
from lib import create_graph
The way imports work is slightly different in Python 2 and 3. First, Python 3 and the sane way (which you seem to expect): in Python 3, all imports are absolute and resolved against the folders in sys.path (see here for more about the module search path). Python doesn't use $PATH, by the way.
So you can import anything from anywhere without worrying too much.
In Python 2, imports can be implicitly relative as well as absolute. The documentation about packages contains an example layout and some import statements which might be useful for you.
The section "Intra-package References" contains information about how to import between packages.
From all the above, I think that your sys.path is wrong. Make sure the folder which contains algorithms (i.e. not algorithms itself but its parent) is in sys.path.
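To make the "parent folder on sys.path" rule concrete, here is a small self-contained sketch (the temp-dir scaffolding and module contents are illustrative, not the asker's real code):

```python
import os
import sys
import tempfile

# Recreate a slice of the question's layout in a throwaway folder.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "algorithms", "neighborhood")
os.makedirs(pkg)
open(os.path.join(root, "algorithms", "__init__.py"), "w").close()
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "neighborhood.py"), "w") as f:
    f.write("VALUE = 42\n")

# Add the PARENT of `algorithms` to sys.path, not `algorithms` itself.
sys.path.insert(0, root)
from algorithms.neighborhood.neighborhood import VALUE
print(VALUE)  # 42
```

Adding root (the parent) makes algorithms importable as a top-level package; adding algorithms itself would only expose its children, which is why the dotted import then fails.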
Just set __package__ = None in every .py file. It will set up the package hierarchy automatically.
After that you may freely use absolute module names for import.
from algorithms.neighborhood.neighbor.connector import NeighborhoodConnector
