I'm writing a Python program that can parse a binary protocol consisting of messages. Each message has an identifier, followed by type-specific data. I've structured my app so that each message-type is represented by a (sub)class, located in its own module inside my package:
messages/
__init__.py
MessageTypeOne.py
MessageTypeTwo.py
...
From my main file (which is inside the same package, but I don't think it matters) I would like to do the equivalent of
import package.*
That is, I would like all module types to be loaded, but not imported in to the local namespace (i.e. not what from package import * would do). I prefer not to list the message types explicitly (simply adding a file should be enough), but using something similar to the __all__ construct from from bla import * would be acceptable.
I've found a way to accomplish this by looping over os.listdir(__path__), and importlib.import_module()'ing each found file, but this feels overly hacky... Is there a more elegant way to do this?
Update:
Depending on the usage (e.g. decoding for logging or sending a single message), I don't always want to import every message type, so statically importing them in __init__.py is not desirable
I would probably have __init__.py do the importing:
# in __init__.py
from . import MessageTypeOne
from . import MessageTypeTwo
...
If you don't want to do that, you can use __import__:
__import__('package', fromlist=['*'])
This performs the same initialization as from package import * would, but without actually binding any names in the local namespace. Note that it won't initialize any submodules that from package import * wouldn't, so you still need to configure the __all__ list in __init__.py.
Related
I currently have a module I created that has a number of functions.
It's getting quite large so I figured I should make it into a package and split the functions up to make it more manageable.
I'm just testing out how this all works before I do this for real so apologies if it seems a bit tenuous.
I've created a folder called pack_test and in it I have:
__init__.py
foo.py
bar.py
__init__.py contains:
__all__ = ['foo', 'bar']
from . import *
import subprocess
from os import environ
In the console I can write import pack_test as pt and this is fine, no errors.
pt. and two tabs shows me that I can see pt.bar, pt.environ, pt.foo and pt.subprocess in there.
All good so far.
If I want to reference subprocess or environ in foo.py or bar.py how do I do it in there?
If in bar.py I have a function which just does return subprocess.call('ls') it errors saying NameError: name 'subprocess' is not defined. There must be something I'm missing which enables me to reference subprocess from the level above? Presumably, once I can get the syntax from that I can also just call environ in a similar way?
The alternative as I could see it would be to have import subprocess in both foo.py and bar.py but then this seems a bit odd to me to have it appear across multiple files when I could have it the once at a higher level, particularly if I went on to have a large number of files rather than just 2 in this example.
TL;DR:
__init__.py :
import foo
import bar
__all__ = ["foo", "bar"]
foo.py:
import subprocess
from os import environ
# your code here
bar.py
import subprocess
from os import environ
# your code here
There must be something I'm missing which enables me to reference subprocess from the level above?
Nope, this is the expected behaviour.
import loads a module (if it isn't already), caches it in sys.modules (idem), and bind the imported names in the current namespace. Each Python module has (or "is") it's own namespace (there's no real "global" namespace). IOW, you have to import what you need in each module, ie if foo.py needs subprocess, it must explicitely import it.
This can seem a bit tedious at first but in the long run it really helps wrt/ maintainability - you just have to read the imports at the top of your module (pep 08: always put all imports at the beginning of the module) to know where a name comes from.
Also you should not use star imports (aka wild card imports aka from xxx import *) anywhere else than in your python shell (and even then...) - it's a maintainance time bomb. Not only because you don't know where each name comes from, but also because it's a sure way to rebind an already import name. Imagine that your foo module defines function "func". Somewhere you have "from foo import *; from bar import *", then later in the code a call to func. Now someone edits bar.py and adds a (distinct) "func" function, and suddenly you call fails, because you're not calling the expected "func". Now enjoy debugging this... And real-life examples are usually a bit more complex than this.
So if you fancy your mental sanity, don't be lazy, don't try to be smart either, just do the simple obvious thing: explicitely import the names you're interested in at the top of your modules.
(been here, done that etc)
You could create modules.py containing
import subprocess
import os
Then in foo.py or any of your files just have.
from modules import *
Your import statements in your files are then static and just update modules.py when you want to add an additional module accessible to them all.
In Python 2.7, I'm getting
'module' has no attribute
, and/or
'name' is not defined
errors when I try to split up a large python file.
(I have already read similar posts and the Python modules documentation)
Say you have a python file that is structured like this:
<imports>
<50 global variables defined>
<100 lengthy functions that each use most or all of the globals
defined above, and also call each other>
<main() that calls some of the functions and uses the globals>
So I can easily categorize groups of functions together, create a python file for each group, and put them there. The problem is whenever I try to call any of them from the main python file, I get the errors listed above. I think the problem is related to circular dependencies. Since all of the functions rely on the globals, and each other, they are circularly dependent.
If I have main_file.py, group_of_functions_1.py, and group_of_functions_2.py,
main_file.py will have:
import group_of_functions_1.py
import group_of_functions_2.py
and group_of_functions_1.py will have
import main_file.py
import group_of_functions_2.py
and group_of_functions_2.py will have
import main_file.py
import group_of_functions_1.py
Regardless of whether I use "import package_x" or "from package_x import *" the problem remains.
If I take the route of getting rid of the globals, then most of the functions will have 50 parameters they will be passing around which then also need to be returned
What is the right way to clean this up?
(I have already read similar posts and the Python modules documentation)
One of the sources of your errors is likely the following:
import group_of_functions_1.py
import group_of_functions_2.py
When importing, you don't add .py to the end of the module name. Do this instead:
import group_of_functions_1
import group_of_functions_2
I'm currently coding an app which is basically structured that way :
main.py
+ Package1
+--- Class1.py
+--- Apps
+ Package2
+--- Class1.py
+--- Apps
So I have two questions :
First, inside both packages, there are modules needed by all Apps, eg : re. Is there a way I can import a module for the whole package at once, instead of importing it in every file that needs it ?
And, as you can see, Class1 is used in both packages. Is there a good way to share it between both packages to avoid code duplication ?
I would strongly recommend against doing this: by separating the imports from the module that uses the functionality, you make it more difficult to track dependencies between modules.
If you really want to do it though, one option would be to create a new module called common_imports (for example) and have it do the imports you are after.
Then in your other modules, add the following:
from common_imports import *
This should give you all the public names from that module (including all the imports).
To answer your second question, if your two modules named Class1.py are in fact the same, then you should not copy it to both packages. Place it in a package which will contain only code which is common to both, and then import it. It is absolutely not necessary to copy the file and try to maintain each change in both copies.
Q1:
You Must find a way to import your package, thus you have two choices:
(Please correct me if I'm wrong or not thorough)
1. Look at James' solution, which you need to define a class, put all the modules inside and finally import them to your Sub-classes
2. Basically import nothing to your Main class, but instead, import only once to your Sub-classes
For example:(inside A_1 subclass)
import re
def functionThatUseRe(input):
pass
Then inside your main class, just do
try:
from YourPackage import A_1 #windows
except:
import A_1 #MAC OSX
A_1.functionThatUseRe("")
And you completely avoided importing modules multiple times
Q2: put your class1.py in the same directory with your main class, or move it to another folder, in Package1(&2).Apps
import Class1
Start using the code from there
I have a file structure like this:
dir_a
__init__.py
mod_1.py
mod_2.py
dir_b
__init__.py
my_mod.py
I want to dynamically import every module in dir_a in my_mod.py, and in order to do that, I have the following inside the __init__.py of dir_a:
import os
import glob
__all__ = [os.path.basename(f)[:-3] for f in glob.glob(os.path.dirname(os.path.abspath(__file__)) + "/*.py")]
But when I do:
from dir_a import *
I get ImportError: No module named dir_a
I wonder what causes this, is it because the parent directory containing dir_a and dir_b has to be in PYTHONPATH? In addition, the above approach is dynamic only in the sense that my_mod.py picks up everything when it is run, but if a new module is added to dir_a during its run this new module won't be picked up. So is it possible to implement a truly dynamic importing mechanism?
The answer to the first part of your question is a trivial yes: you cannot import dir_a when the directory containing dir_a isn't in the search path.
For the second part, I don't believe that is possible in general - you could listen for 'file created' OS signals in dir_a, see whether the created file is a .py, and add it to dir_a.__all__ - but that is complicated, probably expensive, and doesn't retroactively add the new thing to the global namespace, since from foo import * only sees what is in foo.__all__ at import time. And changing that would be error-prone - it allows your namespace to change at any time based on an unpredictable external event. Say you do this:
from dir_a import *
bar = 5
And then a bar.py gets added to dir_a. You can't write a "truly dynamic importer" without considering that situation, and deciding what you want to have happen.
I am importing a lot of different scripts, so at the top of my file it gets cluttered with import statements, i.e.:
from somewhere.fileA import ...
from somewhere.fileB import ...
from somewhere.fileC import ...
...
Is there a way to move all of these somewhere else and then all I have to do is import that file instead so it's just one clean import?
I strongly advise against what you want to do. You are doing the global include file mistake again. Although only one module is importing all your modules (as opposed to all modules importing the global one), the remaining point is that if there's a valid reason for all those modules to be collected under a common name, fine. If there's no reason, then they should be kept as separate includes. The reason is documentation. If I open your file, and see only one import, I don't get any information about what is imported and where it comes from. If on the other hand, I have the list of imports, I know at a glance what is needed and what not.
Also, there's another important error I assume you are doing. When you say
from somewhere.fileA import ...
from somewhere.fileB import ...
from somewhere.fileC import ...
I assume you are importing, for example, a class, like this
from somewhere.fileA import MyClass
this is wrong. This alternative solution is much better
from somewhere import fileA
<later>
a=fileA.MyClass()
Why? two reasons: first, namespacing. If you have two modules having a class named MyClass, you would have a clash. Second, documentation. Suppose you use the first option, and I find in your code the following line
a=MyClass()
now I have no idea where this MyClass comes from, and I will have to grep around all your files in order to find it. Having it qualified with the module name allows me to immediately understand where it comes from, and immediately find, via a /search, where stuff coming from the fileA module is used in your program.
Final note: when you say "fileA" you are doing a mistake. There are modules (or packages), not files. Modules map to files, and packages map to directories, but they may also map to egg files, and you may even create a module having no file at all. This is naming of concepts, and it's a lateral issue.
Of course there is; just create a file called myimports.py in the same directory where your main file is and put your imports there. Then you can simply use from myimports import * in your main script.