Import statements, a billion and one - Python

My apologies if this post is a bit inappropriate, but I'm desperate for comments on a programming problem.
I have been frustrated for years about what I believe is a Python design flaw, and many others agree. It's about how import statements work, particularly code like:
from ..mypackage import mymodule
from ..mypackage.mymodule import mymethod
from .. import mypackage
and similar statements.
Only the simplest, most contrived, canonical cases actually work. Anything else results in the error message:
ImportError: attempted relative import with no known parent package
I use sys.path.append() as a workaround, but that should not be necessary, and it is not portable.
It seems the issue revolves around where Python thinks the importing module is in the file system at the time it attempts to execute the import statements. My opinion is that Python should be able to figure out whether it is in a package and exactly where in the file hierarchy that is. The import statements should work as expected whether the importing module is called from another module, run from an interpreter, or run from PyCharm, IDLE, Spyder, or in some other way.
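For illustration, a minimal sketch of the failing setup (the package and module names here are hypothetical). Given a layout like:
myproject/
    mypackage/
        __init__.py
        mymodule.py
        subpkg/
            __init__.py
            runner.py    # contains: from ..mymodule import mymethod
running the file directly, for example python mypackage/subpkg/runner.py, fails with the ImportError above, because Python then treats runner.py as a top-level script with no parent package. Running it from the myproject directory as a module keeps the package context intact:
python -m mypackage.subpkg.runner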
There is an SO post, Relative imports for the billionth time, which addresses this problem. The main article plus 15 answers and 36 comments indicate that this issue has been around for a long time. Many very smart people have offered exotic explanations and proposed cumbersome solutions to an issue that should not exist, and yet the powers that control the development of Python have not moved on the matter. It should not be necessary to be knee-deep in Python internals in order to write a simple application. That's the whole idea of a high-level language; this is not C++.
Someone reading this must have influence with Python developers. Please use it.
Any comments, please.

Related

Visual Studio Python recommendations from module are incomplete

I followed a tutorial for websockets in Python and stumbled across the issue that Pylance does not recommend the functions related to the class that I have imported from a module.
My editor: (screenshot)
The tutorial: (screenshot)
The code itself runs without any issue, so the import seems to work, but I don't receive the recommendations in VS Code. What is the reason for this, or how could I debug something like this?
Thanks to wjandrea I found something I personally had not stumbled across before.
So a classic mistake happened to me when following a tutorial: I was working on newer versions than the 1.5-year-old video used. Unfortunately the part talking about versions was in another part...
Long story short, in the meantime a bigger change landed in the websockets module: the functions in question are now imported lazily. Which makes sense, to reduce startup time in case you run a websocket server with the module.
A little info about lazy imports (for me it was the first time I had heard about this clever feature).
In case anybody else stumbles across this: I am currently on Python 3.10.7 and talking about websockets 10.3!
Back to the issue.
Pylance obviously can't make any recommendations, since functions like websockets.connect(uri, ...) are only loaded if they are used at runtime by the default websockets module, so tools for code recommendations inside the editor do not know they are there.
I took a glance inside the module, and with the indirect hint from wjandrea about the lazy imports inside __init__.py, the dictionary listed there made much more sense! Based on this I could trace back the Python scripts I need for my functions, or rather that Pylance needs to create those handy recommendations for me inside VS Code (or any other IDE).
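For anyone curious, here is a minimal sketch of the lazy-import pattern (module-level __getattr__, PEP 562); the package and submodule names are illustrative, not the actual websockets source:
# mypackage/__init__.py
import importlib

# Map exported names to the submodules that actually define them.
_lazy_exports = {
    "connect": ".client",
    "serve": ".server",
}

def __getattr__(name):
    # The submodule is imported only the first time the name is looked up.
    if name in _lazy_exports:
        module = importlib.import_module(_lazy_exports[name], __name__)
        return getattr(module, name)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
Because nothing is imported until the attribute is first accessed at runtime, a static analyser looking at the package sees only the dictionary, not the functions behind it.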
For now I just manually import the desired script, so I have a bit more guidance while writing. Since startup time in my current project is not crucial, I'll let the manual imports stay, or I'll just switch between the import variations depending on whether I am currently developing or the code is going into production.

What issues with future 3rd-party packages could be expected if removing unneeded Python3 core modules

I have an environment with some extreme constraints that require me to reduce the size of a planned Python 3.8.1 installation. The OS is not connected to the internet, and a user will never open an interactive shell or attach a debugger.
There are of course lots of ways to do this, and one of the ways I am exploring is to remove some core modules, for example python3-email. I am concerned that there are 3rd-party packages that future developers may include in their apps that have unused but required dependencies on core Python features. For example, if python3-email is missing, what 3rd-party packages might not work that one would expect to? If a developer decides to use a logging package that contains an unreferenced EmailLogger class in a referenced module, it will break, simply because import email appears at the top.
Do package design requirements or guidelines exist that address this?
It's an interesting question, but it is too broad to be cleanly answered here. In short, the Python standard library is expected to always be there, even though it is sometimes broken up into multiple parts (on Debian, for example). But you say it yourself: you don't know what your requirements are, since you don't know yet what future packages will run on this interpreter. This is impossible to answer in full. One thing you could do is run something like modulefinder on the future code before letting it run on that constrained Python interpreter.
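A rough sketch of that approach (the entry-point file name app.py is hypothetical):
from modulefinder import ModuleFinder

finder = ModuleFinder()
finder.run_script("app.py")  # the future application's entry point

# Every module the script would import, directly or transitively.
for name in sorted(finder.modules):
    print(name)

# Imports that could not be resolved; useful for spotting gaps.
print("missing:", sorted(finder.badmodules))
Any stdlib module that shows up in that report had better still be present in the trimmed installation.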
I was able to get to a solution. The issue was best described to me as cascading imports. It is possible to stop a module from being loaded by adding an entry to sys.modules. For example, when importing the asyncio module, the ssl and _ssl modules will be loaded, even though they are not needed outside of ssl. This can be stopped with the following code. It can be verified both by seeing that the Python process is 3 MB smaller and by using module load hooks to watch each module as it loads:
import importhook
import sys

# Block ssl (and, transitively, _ssl) from being imported.
sys.modules['ssl'] = None

@importhook.on_import(importhook.ANY_MODULE)
def on_any_import(module):
    # Log every module as it loads and verify ssl never appears.
    print(module.__spec__.name)
    assert module.__spec__.name not in ['ssl', '_ssl']

import asyncio
For my original question about 3rd-party design guidelines, some recommend placing import statements within the class or function rather than at the module level; however, this is not routinely done.

Testing all functions in a module for errors

This may be a dumb question, but I'm charging boldly ahead anyway.
I have a library of about a dozen Python modules I maintain for general use. Recently, after advice found here on SO, I changed all of the modules so they are imported in the import x as y style instead of from x import *. This solved several problems and made the code easier to manage.
However, there was an unintended side effect of this. Many of the modules use Python builtin modules like sys or os to do whatever, and the way the code was previously set up, if I typed import sys in module x, and used from x import * in module y, I didn't have to import sys in module y. As a result, I took this for granted quite a lot (terrible practice, I know). When I switched to import x, this caused a lot of broken functions, as you can imagine.
So here's the main issue: because Python is an interpreted language, errors about missing modules in a function won't show up until a function is actually run. And because this is just a general-use library, some of these errors could persist for months undetected, or longer.
I'm completely prepared to write a unit test for each module (if __name__ == "__main__" and all that), but I wanted to ask first: is there an automated way of checking every function in a module for import/syntax errors, or any other error that is not dependent on input? Things that a compiler would catch in C or another language. A brief Google and SO search didn't turn up anything. Any suggestions are welcome and appreciated.
Yes. PyFlakes will warn you about those most basic errors, and you should make sure it's integrated with your favorite text editor, so it tells you about missing imports or unused imports whenever you save the file.
PyFlakes will, amongst other things, tell you about
syntax errors
undefined names
missing imports
unused imports
To run PyFlakes on all the files in a directory, you can just do:
pyflakes /path/to/dir
One big advantage that PyFlakes has over more advanced linting tools like PyLint is that it does static analysis, which means it doesn't need to import your code (which can be a pain if you've got some complex dependencies). It just analyses the abstract syntax tree of your Python source, and therefore catches the most basic of errors, those that usually prevent your script from having even a chance of running.
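For example, given a file like this (a made-up snippet showing the kind of mistakes it catches):
# broken.py
import os  # imported but never used

def log_args():
    print(sys.argv)  # sys is used but never imported
running pyflakes broken.py reports something like "'os' imported but unused" and "undefined name 'sys'", without ever executing the file.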
I should also mention that there is a related tool, flake8, which combines PyFlakes with PEP8 convention checks and McCabe code complexity analysis.
There are PyFlakes integrations for every editor (or IDE) I know of. Here are just a couple (in no particular order):
Sublime Text
vim
emacs
TextMate
Eclipse / PyDev

How to deal with third party Python imports when sharing a script with other people on Git?

I made a Python module (https://github.com/Yannbane/Tick.py) and a Python program (https://github.com/Yannbane/Flatland.py). The program imports the module, and without it, it cannot work. I have intended for people to download both of these files before they can run the program, but, I am concerned about this a bit.
In the program, I've added these lines:
sys.path.append("/home/bane/Tick.py")
import tick
"/home/bane/Tick.py" is the path to my local repo of the module that needs to be included, but this will obviously be different to other people! How can I solve this situation better?
What was suggested by @Lattyware is a viable option. However, it's not uncommon to have core dependencies bundled with the main program (Django and PyDev do this, for example). This works fine, especially if the main code is tweaked against a specific version of the library.
In order to avoid the troubles mentioned by Lattyware when it comes to code maintenance, you should look into git submodules, which allow precisely this kind of layout, keeping code versioning sane.
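A rough sketch of that setup (the destination path tick is illustrative):
git submodule add https://github.com/Yannbane/Tick.py tick
git submodule update --init
Anyone who then clones the program's repository with git clone --recursive gets the module at the exact pinned version.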
From the structure of your directory it seems that both files live in the same directory. This might be a tell-tale sign that they are two modules of the same package. In that case you should simply add an empty file called __init__.py to the directory, and then your import could work by:
import bane.tick
or
from bane import tick
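That assumes a layout along these lines (the package name bane is taken from the path in the question and is illustrative):
bane/
    __init__.py
    tick.py
    flatland.py
with the directory containing bane/ on sys.path, for example because the program is launched from there.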
Oh, and yes... you should use lower case for module names (it's worth taking an in-depth look at PEP 8 if you are going to code in Python! :)
HTH!
You might want to try submitting your module to the Python Package Index; that way people can easily install it (pip install tick) into their path, and you can just import it without having to add it to the Python path.
Otherwise, I would suggest simply telling people to download the module as well, and place it in a subdirectory of the program. If you really feel that is too much effort, you could place a copy of the module into the repository for the program (of course, that means ensuring you keep both versions up-to-date, which is a bit of a pain, although I imagine it may be possible just to use a symlink).
It's also worth noting that your repo name is a bit misleading; capitalisation is often important, so you might want to call the repo tick.py to match the module and Python naming conventions.

Python directory invocation

I have a directory with several Python modules in it. Each module is mutually exclusive of all the others, but after lots of trial and error I have concluded that each module chokes when using the multiprocessing functionality in Python. I have used the join() function on each process, and it's just not working like I want.
What I am really looking for is the ability to drop new mutually exclusive python modules in to the directory and have them invoked when the directory is launched. Does anyone know how to do this?
It sounds to me like you are asking about plugin architecture and sandboxing. Does that sound right?
The plugin component has been done and written about elsewhere; SO has code examples of basic ways to import all the files.
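A minimal sketch of that import-everything approach, assuming the drop-in modules live in a package directory called plugins:
import importlib
import pkgutil

import plugins  # the package directory holding the drop-in modules

# Import every module found inside the plugins package.
for info in pkgutil.iter_modules(plugins.__path__):
    module = importlib.import_module(f"plugins.{info.name}")
    print("loaded", module.__name__)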
The sandbox part is going to be harder. Have a look at RestrictedPython and the Restricted Execution docs, and the generally older but nevertheless helpful discussion of sandboxing.
If you aren't worried about untrusted code but rather want to isolate errors, you could just wrap each module in a generic try/except that handles all errors. This would make debugging harder but would ensure that an error in one module didn't bring down the whole system.
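Something like the following, where main() is a hypothetical entry point each module is assumed to expose:
import importlib
import traceback

def run_module_safely(name):
    # An error in one module is logged instead of killing the whole run.
    try:
        module = importlib.import_module(name)
        module.main()
    except Exception:
        traceback.print_exc()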
If you aren't worried about untrusted code but do need to have each file totally isolated, then you might be best off looking into various systems of interprocess communication. I've actually had some luck using Redis for this (which sounds ridiculous but has actually been very easy and effective).
Anyway hopefully some of that helps you. Without more information it's hard to provide more than general thoughts and a guide to better googling.
