Is it good practice to develop python applications as packages - python

I'm developing a rather complex desktop application using the wxPython framework. At this point, the app already contains dozens of modules: libraries, UI modules, util modules.
Project looks like this:
MyApp/
    __init__.py -- empty
    main.py
    util/
        __init__.py -- empty
        lib1/
            __init__.py
        lib2/
            __init__.py
    gui/
        __init__.py -- empty
        window1.py
Unfortunately, with the current project structure I cannot use absolute imports, because python MyApp/main.py fails with an error like ImportError: No module named MyApp.gui
To overcome this, I'd like to make MyApp an executable package:
my_app/
    __init__.py -- empty
    __main__.py
    util/
        __init__.py -- empty
        lib1/
            __init__.py
        lib2/
            __init__.py
    gui/
        __init__.py -- empty
        window1.py
Now application can be started using python -m my_app
Everything seems OK so far… but I am full of doubts, because no one seems to use this approach. If you take a look at the demo that comes with wxPython, you'll see that it's a mostly flat project.
I'm definitely not the smartest one, so I must be missing some simple and obvious reason why no one uses this approach.
Maybe I should just stick with subfolders or a flat project structure? Maybe absolute imports aren't worth such changes?

If I were you, I would look at some of the wxPython applications that are out there and see how they do it.
Editra - a text file editor + more
Chandler project
Phatch
Or even the wxPython demo package.

Putting most of it in a package namespace is the best way to go, since you can also take advantage of byte-code caching and setuptools/Distribute can easily install it. Then you just provide a simple top-level script to load the main module and run it.
Something like:
#!/usr/bin/python
import sys
from MyApp import main
main.main(sys.argv)
Just name that something like myapp and install in /usr/local/bin (or somewhere on PATH). All it does is import the main module and run the main function (or class).
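To make the launcher idea concrete, here is a rough sketch that builds a throwaway copy of such a layout in a temporary directory and runs the launcher. The contents of main.py and window1.py are made up for illustration, and the launcher simply sits next to the package as a stand-in for an installed script plus an installed MyApp.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    pkg = root / "MyApp"
    (pkg / "gui").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "gui" / "__init__.py").write_text("")
    # Hypothetical module contents, only here so there is something to import.
    (pkg / "gui" / "window1.py").write_text("TITLE = 'window1'\n")
    # main.py can use an absolute import because MyApp is imported as a
    # package by the launcher instead of being run as a script itself.
    (pkg / "main.py").write_text(
        "from MyApp.gui import window1\n"
        "def main(argv):\n"
        "    print('opened', window1.TITLE)\n"
    )
    # The launcher lives outside the package; its directory ends up first on
    # sys.path, which is what makes "from MyApp import main" resolvable here.
    launcher = root / "myapp"
    launcher.write_text(
        "import sys\n"
        "from MyApp import main\n"
        "main.main(sys.argv)\n"
    )
    result = subprocess.run(
        [sys.executable, str(launcher)],
        capture_output=True, text=True, check=True,
    )
    launcher_output = result.stdout.strip()

print(launcher_output)
```

Once the package is really installed, the launcher no longer has to sit next to it, since MyApp is then found in site-packages.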

You seem to understand why it's a good thing to have it under a single package. So I think I'll just go over this briefly:
You get to organize your project better. If you have multiple modules responsible for different things, you don't fear having a conflict with some other library. So it will basically act as a simple namespace.
You get the benefit of updating via easy_install or whatever. I'm not sure it's really a big plus though.
It can be easier to extend with plugins, whether voluntarily allowing them, or just leaving place for some tweaking on the user-side.
I'll just give you some examples which use this approach, I think mainly for the plugins approach:
Exaile: music player, which allows you to add plugins through this structure, and notice also the gui is in a separate package. Not sure why, but it definitely makes the separation of UI (GTK, but does not matter) clear.
I'll just advertise myself here :) wxpos: it's one of my projects, if you want to take a look at my approach with wxPython as an example.
There are more than I can think of, I'm sure I've seen it somewhere else too.

Related

Best way to organize a python project to easily access all of the modules

I know this question has been asked in a lot of places, but none of them really seem to answer my question.
I am creating a standalone application in python, my current project structure is like so:
Project/
    utils/
        log.py
    api/
        get_data_from_api.py
    main.py
I would like to have a set up similar to django, in which I can refer to any file by using a Project.<package> syntax.
For example, if I wanted to access my log module from my get_data_from_api module, I could do something like this:
get_data_from_api.py
import Project.utils.log
# Rest of code goes here...
However, I can't seem to get that to work, even when I added an __init__.py file in the root directory.
I read somewhere that I should modify my PYTHONPATH, but I would like to prevent that, if possible. Additionally, django seemed to pull it off, as I couldn't find any PYTHONPATH modification code in there.
I really appreciate the help!
Post Note: Where would tests fit in this file structure? I would like them to be separate, but also have access to the entire project really easily.
You need to add __init__.py to every directory that you want to treat as a Python package, not just to the root directory.
Post Note: Where would tests fit in this file structure? I would like them to be separate, but also have access to the entire project really easily.
You could place your tests in the Project directory itself, parallel to api and utils.
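As a quick illustration of the __init__.py advice, the sketch below recreates the question's layout in a temporary directory, puts an __init__.py in every package directory, and imports the log module from the api module. The function names info and fetch are invented for the example, and the sys.path line only stands in for "the directory that contains Project is importable" (which is what you get when the script you run lives there).

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for package_dir in ("Project", "Project/utils", "Project/api"):
        (root / package_dir).mkdir()
        # One __init__.py per directory, not only in the root.
        (root / package_dir / "__init__.py").write_text("")
    (root / "Project/utils/log.py").write_text(
        "def info(message):\n"
        "    return 'INFO: ' + message\n"
    )
    (root / "Project/api/get_data_from_api.py").write_text(
        "import Project.utils.log\n"
        "def fetch():\n"
        "    return Project.utils.log.info('fetched')\n"
    )
    # Make the directory that CONTAINS Project importable for this demo.
    sys.path.insert(0, str(root))
    try:
        api = importlib.import_module("Project.api.get_data_from_api")
        message = api.fetch()
    finally:
        sys.path.remove(str(root))

print(message)
```

On Python 3.3+ the import may also succeed without the __init__.py files (namespace packages), but having them keeps the layout unambiguous for tools.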
I ended up using a file structure like the following:
src/
main.py
project/
api/
get_data_from_api.py
util/
log.py
test/
That way, because I am running the main.py file, from anywhere in the program, I can just do import project.<package>.<module>. For example, I can just do this:
get_data_from_api.py
import project.util.log
# Rest of code goes here...
And everything works!
If I'm completely shooting in the dark please let me know, I'd much rather be wrong now than hundreds of hours into the project!
It may be helpful if you look at these three tutorials. It may not completely solve what you imagined, but it will certainly give you an idea or background on how to do it.
How to structure a Flask-RESTPlus web service for production builds - GitHub Code
Structuring Your Project - GitHub Code
Clean architectures in Python GitHub Code
They have already written about Django, so you can see the details at: Best practice for Django project working directory structure.

Pythonic way for managing and using user-created shared libraries

tl;dr I have a directory of common files outside of my various project directories. What is the pythonic way of using/importing these common files inside my projects, and for building them into an output directory.
Background:
I'm in school and taking a data structures class that uses Python as the language. I'm learning the language as I take the class, but I'm having some issues trying to maintain a shared code base.
In all of the other languages I've used, both compiled and interpreted, there has been a fairly intuitive way of being able to keep shared modules separate from the code that is using them so that updating a shared module doesn't require updates to the calling code.
This is how I initially had my directory structure organized.
/.../Projects
    Assignment_1
        __init__.py
        classA.py
        classB.py
    Assignment_2
        __init__.py
        classC.py
    (etc)
After realizing that much of the functionality of classA and classB would be required later on, I reorganized to this:
/.../Projects
    Common
        Sorters
            __init__.py
            BubbleSort.py
            MergeSort.py
        __init__.py
        SimpleProfiler.py
    Assignment_1
        __init__.py
        main.py
    Assignment_2
        __init__.py
        main.py
My issue is that I can't find a good way of importing things like SimpleProfiler or MergeSort from main.py. Right now I'm manually copying all of the Common files into each assignment, which is bad.
I understand that one possible solution is to update the path to include the common folder from within each main.py file, but I've also read that this is very hacky and isn't encouraged.
Another Stack Overflow answer to a similar question suggested that the user structure everything under one large project. I tried this but still couldn't import modules from one sibling into another sibling.
My other issue is how to package everything together when submitting the assignment. In other languages it was easy to implement a build script that would scan the main project for any imports, then copy (flatten) those imported files into a single output directory which I could then compress and submit for grading. I'm using PyCharm, but can't seem to find a way to reference the imports as part of the build process. Is there any kind of script for this? Whatever the solution is, I need to be able to submit the project in such a way that all the instructor has to do is call a single python file (such as main.py)
This issue isn't unique to a school setting, but seems universal to most programming projects. So, what is the pythonic way of managing a shared code base and for building that shared code into a final project?
[Disclaimer: I think it is better to use PYTHONPATH environment variable]
I think of two very similar alternatives:
/.../Projects
    Common
        Sorters
            __init__.py
            BubbleSort.py
            MergeSort.py
        __init__.py
        SimpleProfiler.py
    assignment_1.py
    assignment_2.py
From assignment_1.py, the following import then works: from Common.Sorters.BubbleSort import bubble_sort. This is because, by default, Python puts the directory of the script being run on its module search path, as if it were part of PYTHONPATH. This assumes that you are invoking the assignment_* scripts directly.
The other alternative would be:
/.../Projects
    Common
        Sorters
            __init__.py
            BubbleSort.py
            MergeSort.py
        __init__.py
        SimpleProfiler.py
    Assignment_1
        __init__.py
        __main__.py
    Assignment_2
        __init__.py
        __main__.py
And invoking the assignments like so: python -m Assignment_1 (from the Projects folder). By default, "executing" a package like that runs its __main__.py code. (This is not a rigorous explanation, although the official one is a bit short.)
It works for much the same reason as before: with -m, the Python interpreter puts the current directory on its module search path.
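The second alternative can be tried out without touching a real project. The sketch below rebuilds a cut-down version of that tree in a temporary directory and runs python -m Assignment_1 from the Projects folder. The bubble_sort body and the list being sorted are placeholders I made up, not the asker's code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    projects = Path(tmp)
    (projects / "Common" / "Sorters").mkdir(parents=True)
    (projects / "Assignment_1").mkdir()
    for pkg in ("Common", "Common/Sorters", "Assignment_1"):
        (projects / pkg / "__init__.py").write_text("")
    # Placeholder implementation of the shared module.
    (projects / "Common/Sorters/BubbleSort.py").write_text(
        "def bubble_sort(items):\n"
        "    items = list(items)\n"
        "    for end in range(len(items) - 1, 0, -1):\n"
        "        for i in range(end):\n"
        "            if items[i] > items[i + 1]:\n"
        "                items[i], items[i + 1] = items[i + 1], items[i]\n"
        "    return items\n"
    )
    # __main__.py is what "python -m Assignment_1" executes.
    (projects / "Assignment_1/__main__.py").write_text(
        "from Common.Sorters.BubbleSort import bubble_sort\n"
        "print(bubble_sort([3, 1, 2]))\n"
    )
    # cwd is the Projects folder, so Common and Assignment_1 are both found.
    result = subprocess.run(
        [sys.executable, "-m", "Assignment_1"],
        cwd=projects, capture_output=True, text=True, check=True,
    )
    sorted_output = result.stdout.strip()

print(sorted_output)
```

Running the same command from inside Assignment_1 instead of Projects would not find Common, which is why the working directory matters here.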
Try setting PYTHONPATH environment variable to your directory.
Python searches for imported files along sys.path. The first entry in sys.path is the directory of the script being run (or the current directory when using python -m or an interactive session). The directories listed in PYTHONPATH are where Python looks next.
On the minimum end, make a PyCharm run configuration that sets your PYTHONPATH before executing and include the other directory. That way you don't need to do a sys.path call in your code.
Closer to the "perfect" end, make your other directory into a Python package with a setup.py. Then, using the interpreter from your project, do "python path/to/other/dir/setup.py develop" to bring your separately-developed package into the consuming project.
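For the PYTHONPATH suggestion, this is roughly what a run configuration does for you: it starts the interpreter with PYTHONPATH pointing at the shared directory. A small sketch, using a temporary directory and an invented profile function in SimpleProfiler.py:

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    common = root / "Common"
    assignment = root / "Assignment_1"
    common.mkdir()
    assignment.mkdir()
    # Hypothetical shared module living outside the assignment folder.
    (common / "SimpleProfiler.py").write_text(
        "def profile(label):\n"
        "    return 'profiled ' + label\n"
    )
    (assignment / "main.py").write_text(
        "import SimpleProfiler\n"
        "print(SimpleProfiler.profile('main'))\n"
    )
    # Equivalent of setting PYTHONPATH in the shell or in the IDE's run
    # configuration; main.py itself contains no sys.path manipulation.
    env = dict(os.environ, PYTHONPATH=str(common))
    result = subprocess.run(
        [sys.executable, str(assignment / "main.py")],
        env=env, capture_output=True, text=True, check=True,
    )
    profiler_output = result.stdout.strip()

print(profiler_output)
```

Note that with Common itself on the path, the import is SimpleProfiler rather than Common.SimpleProfiler; put the parent of Common on PYTHONPATH if you prefer the longer form.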

Python __init__.py proved to be irrelevant in version 3.4?

I am running python 3.4 on the main.py file in the same directory.
/root directory is not in python path. It is simply the current directory that the script is executing in. All pycache folders were deleted after each test
So why exactly is __init__.py important? I thought it was necessary as stated in this post:
What is __init__.py for?
If you remove the __init__.py file, Python will no longer look for submodules inside that directory, so attempts to import the module will fail.
Right now, it seems to me that __init__.py is nothing more than an optional constructor where we do housekeeping and other optional things like specifying the __all__ variable, etc. But not a critical item to have.
Image showing the results of the test:
Can someone explain the discrepancy or what is the cause of this issue?
As confusing as it may be, although the basics will work without __init__.py files, you should probably still use them. Many external tools, as well as package-related functions in the standard library, will not work as expected without them. More words of wisdom here (as well as a misleading accepted answer): Is __init__.py not required for packages in Python 3.3+.
Found Answer
In essence, __init__.py is not strictly needed on Python 3.3+ (PEP 420 namespace packages); it remains for legacy reasons and for optional housekeeping tasks that you may or may not want. However, it is important to keep in mind that packages with and without it behave slightly differently, which matters once you are building something more complex.
Please refer to the following links for additional reading material:
https://www.python.org/dev/peps/pep-0420/#namespace-packages-today
How do I create a namespace package in Python?
What's the difference between a Python module and a Python package?
https://softwareengineering.stackexchange.com/questions/276888/python-namespace-vs-module-with-underscores
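One way to see the difference for yourself is to import one directory that has an __init__.py and one that doesn't, then compare the resulting package objects. This is only a sketch with throwaway directory names (with_init, without_init) created in a temporary folder:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "with_init").mkdir()
    (root / "with_init" / "__init__.py").write_text("")
    (root / "with_init" / "mod.py").write_text("VALUE = 1\n")
    (root / "without_init").mkdir()
    (root / "without_init" / "mod.py").write_text("VALUE = 2\n")

    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        regular = importlib.import_module("with_init")
        namespace = importlib.import_module("without_init")
        # Submodules import fine in both cases on Python 3.3+.
        values = (
            importlib.import_module("with_init.mod").VALUE,
            importlib.import_module("without_init.mod").VALUE,
        )
    finally:
        sys.path.remove(str(root))

# A regular package points at its __init__.py; a namespace package has no
# file of its own (the attribute is None or missing, depending on version).
regular_file = getattr(regular, "__file__", None)
namespace_file = getattr(namespace, "__file__", None)
print(values)
print(regular_file is not None, namespace_file is None)
```

Tools that rely on a package having a file of its own are the ones likely to be confused by the second kind.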

python module / package sharing

I have a flask app that uses functions from custom modules.
My File hierarchy is like so:
__init__.py
ec2/__init__.py
citrixlb/__init__.py
So far in the root __init__.py I have a from ec2 import * clause to load my module.
Now I'm adding a new 'feature' called citrixlb.
Both of the __init__.py files in citrixlb and ec2 use some of the same functions to do their task.
I was thinking of doing something like:
__init__.py
common/__init__.py
ec2/__init__.py
citrixlb/__init__.py
If I do the above, and move all common functions to common/__init__.py, how would ec2/__init__.py and citrixlb/__init__.py get access to the functions in common/__init__.py?
The reason is that:
I would like to keep the root __init__.py as sparse as possible.
I wish to be able to run the __init__.py in citrixlb and ec2 as standalone scripts.
I also wish to be able to continue to add functionality by adding newdir/__init__.py.
If I do the above,and move all common functions to common/__init__.py, how would ec2/__init__.py and citrixlb/__init__.py get access to the functions in common/__init__.py?
This is exactly what explicit relative imports were designed for:
from .. import common
Or, if you insist on using import *:
from ..common import *
You can do this with absolute import instead. Assuming your top-level package is named mything:
from mything import common
from mything.common import *
But in this case, I think you're better with the relative version. It's not just more concise and easier to read, it's more robust (if you rename mything, or reorganize its structure, or embed this whole package inside a larger package…). But you may want to read the rationales for the two different features in PEP 328 to decide which one seems more compelling to you here.
One thing:
I wish to be able to run the __init__.py in citrixlb and ec2 as standalone scripts.
That, you can't do. Running modules inside a package as a top-level script is not supposed to work. Sometimes you get away with it. Once you're importing from a sibling or a parent, you definitely will not get away with it.
The right way to do it is either:
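python -m mything.ec2 instead of python mything/ec2/__init__.py (since ec2 is a package, this needs a mything/ec2/__main__.py, because -m on a package runs its __main__ submodule).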
Write a trivial ec2 script at the top level, that just does something like from mything.ec2 import main; main().
The latter is a common enough pattern that, if you're building a setuptools distribution, it can build the ec2 script for you automatically. And automatically make it still work even if ec2 ends up in /usr/local/bin while the mything package is in your site-packages. See console_scripts for more details.
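Here is a small experiment showing both halves of the point above: the relative import works when the package is run with -m, and fails when __init__.py is run directly as a script. The layout is built in a temporary directory, and the function names shared and main are placeholders for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for pkg in ("mything", "mything/common", "mything/ec2"):
        (root / pkg).mkdir()
    (root / "mything/__init__.py").write_text("")
    (root / "mything/common/__init__.py").write_text(
        "def shared():\n"
        "    return 'shared helper'\n"
    )
    (root / "mything/ec2/__init__.py").write_text(
        "from ..common import shared\n"
        "def main():\n"
        "    print('ec2 using', shared())\n"
    )
    # -m on a package executes its __main__.py submodule.
    (root / "mything/ec2/__main__.py").write_text(
        "from . import main\n"
        "main()\n"
    )
    as_module = subprocess.run(
        [sys.executable, "-m", "mything.ec2"],
        cwd=root, capture_output=True, text=True,
    )
    # Running the file directly loses the package context, so the
    # relative import has no parent package to resolve against.
    as_script = subprocess.run(
        [sys.executable, "mything/ec2/__init__.py"],
        cwd=root, capture_output=True, text=True,
    )

print(as_module.stdout.strip())
print("direct run failed:", as_script.returncode != 0)
```

The stderr of the direct run contains the relative-import error, which is the "you definitely will not get away with it" case.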

Packaging a single python file along with an "extras" package

I currently have a project called "peewee" which consists of a single python file, peewee.py. There is also a module "tests.py" containing unit tests. This has been great, people that want to use the library can just grab a single file and run with it.
I've lately wanted to add some extras, but am not sure how to do this to make the namespacing right. If you look in the root of my project, it is something like:
peewee.py
tests.py
I want to add the following:
extras/__init__.py
extras/foo.py
extras/bar.py
And this is the tricky part. I want to have it such that folks using the single file can still do this, but if you want the extras you can have them, too. I want the extras to be namespaced such that:
from peewee.extras import foo
from peewee.extras.bar import Baz
My setup.py looks a bit like this:
setup(
name='peewee',
packages=['extras'],
py_modules=['peewee'],
# ... etc ...
)
But this doesn't quite work. Any help would be greatly appreciated!
Setting Up a Package
As @ThomasK said, the easiest way to do this would be with a package. If you name your package peewee, then you can edit the top-level __init__.py file to allow users to continue to use your package in the same way they have previously.
First, directory structure for your package and subfolders:
peewee/
    __init__.py
    peewee.py
    extras/
        __init__.py
        foo.py
        bar.py
The __init__.py file
Next, you need to add a few lines to the top-level __init__.py.
You could go for a quick-and-dirty method and just include:
from peewee.peewee import *
which would put everything in peewee.py in the top-level namespace of your package. Or, you could take the more traditional alternative and explicitly import only those methods that should be at the top level.
from peewee.peewee import function1, class1, ...
and, for backwards compatibility, you could explicitly set the __all__ attribute of your module to include only peewee
__all__ = ['peewee']
which will let people continue to use from peewee import * if they really need to.
Writing a setup.py file
Finally, you'll have to set up some install scripts and such too. Zed Shaw's Learn Python The Hard Way exercise 46 has a simple and clear project skeleton that you should use.
The most important part is the setup.py file. The example page isn't too long and Zed's put a lot of work into making a really great book, so I'm not going to repost it here (though the entire book is available for free). You can also read the longer instructions/documentation for writing a setup.py file for distutils, however LPTHW will give you something that will do everything you want quickly and easily.
Total package directory structure
Note that your final directory structure will actually be a bit bigger (the name of peewee-pkg doesn't matter, bin is optional--the names of the subfolders matter)
peewee-pkg/
    setup.py
    bin
    peewee/
        __init__.py
        peewee.py
        extras/
            __init__.py
            foo.py
            bar.py
Installing and using
After that, you could actually post your package to PyPI if you like, or you can distribute it to users directly. All they would need to do is run:
python setup.py install
and everything will be available to them.
Importing post-install
Finally, if you do specific imports in the peewee/__init__.py file as described earlier, your users will still be able to do:
from peewee import function1, class1, ...
But now they can also use import peewee.extras to get to the extras functions (or import peewee.extras.foo as foo or from peewee.extras.foo import extra_awesome), etc. And, as you asked in your question, they will also be able to do:
from peewee.extras import foo
And then access foo's functions as if the file were in the current directory.
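If you want to check the re-export idea before restructuring anything, something like the following works as a dry run. It builds a miniature peewee package in a temporary directory and imports it from a separate interpreter. Model and the body of extra_awesome are stand-ins I made up; only the directory layout follows the answer.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "peewee" / "extras").mkdir(parents=True)
    # The original single-file module, now living inside the package.
    (root / "peewee/peewee.py").write_text(
        "class Model:\n"
        "    name = 'core model'\n"
    )
    # Re-export the public names so "from peewee import Model" keeps working.
    (root / "peewee/__init__.py").write_text(
        "from peewee.peewee import Model\n"
    )
    (root / "peewee/extras/__init__.py").write_text("")
    (root / "peewee/extras/foo.py").write_text(
        "def extra_awesome():\n"
        "    return 'extra feature'\n"
    )
    client_code = (
        "from peewee import Model\n"
        "from peewee.extras import foo\n"
        "print(Model.name, '/', foo.extra_awesome())\n"
    )
    # A fresh interpreter started in the temp directory picks up this local
    # package first, independent of anything installed.
    result = subprocess.run(
        [sys.executable, "-c", client_code],
        cwd=root, capture_output=True, text=True, check=True,
    )
    client_output = result.stdout.strip()

print(client_output)
```

Old code that imports names from peewee directly and new code that reaches into peewee.extras can then coexist.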
Useful note for developing a package
On your computer, you should run:
python setup.py develop
which will install the package to your path just like using python setup.py install; however, develop installs a link to your source directory instead of copying the files, so that every time you make changes, they will be immediately available to the system for testing.
