Python namespaces: How to make unique objects accessible in other modules?

I am writing a moderate-sized (a few KLOC) PyQt app. I started out writing it in nice modules for ease of comprehension but I am foundering on the rules of Python namespaces. At several points it is important to instantiate just one object of a class as a resource for other code.
For example: an object that represents Aspell attached as a subprocess, offering a check(word) method. Another example: the app features a single QTextEdit and other code needs to call on methods of this singular object, e.g. "if theEditWidget.document().isEmpty()..."
No matter where I instantiate such an object, it can only be referenced from code in that module and no other. So e.g. the code of the edit widget can't call on the Aspell gateway object unless the Aspell object is created in the same module. Fine except it is also needed from other modules.
In this question the bunch class is offered, but it seems to me a bunch has exactly the same problem: it's a unique object that can only be used in the module where it's created. Or am I completely missing the boat here?
OK, as suggested elsewhere, this seems like a simple answer to my problem. I just tested the following:
junk_main.py:
import junk_A
singularResource = junk_A.thing()
import junk_B
junk_B.handle = singularResource
print junk_B.look()
junk_A.py:
class thing():
    def __init__(self):
        self.member = 99
junk_B.py:
def look():
    return handle.member
When I run junk_main it prints 99. So the main code can inject names into modules just by assignment. I am trying to think of reasons this is a bad idea.

You can access names defined in a module with the . operator, just as you would access an attribute of an object. For example:
# Module a.py
a = 3
>>> import a
>>> print a.a
3
This is a trivial example, but you might want to do something like:
# Module EditWidget.py
theEditWidget = EditWidget()
...
# Another module
import EditWidget
if EditWidget.theEditWidget.document().isEmpty():
Or...
from EditWidget import *
if theEditWidget.document().isEmpty():
If you do go the from EditWidget import * route, you can even define a list named __all__ in your modules with a list of the names (as strings) of all the objects you want your module to export to *. So if you wanted only theEditWidget to be exported, you could do:
# Module EditWidget.py
__all__ = ["theEditWidget"]
theEditWidget = EditWidget()
...

It turns out the answer is simpler than I thought. As I noted in the question, the main module can add names to an imported module. And any code can add members to an object. So the simple way to create an inter-module communication area is to create a very basic object in the main, say IMC (for inter-module communicator) and assign to it as members, anything that should be available to other modules:
IMC.special = A.thingy()
IMC.important_global_constant = 0x0001
etc. After importing any module, just assign IMC to it:
import B
B.IMC = IMC
Now, this is probably not the greatest idea from a software design standpoint. If you just limit IMC to holding named constants, it acts like a C header file. If it's just there to give access to singular resources, it's like an extern declaration. But because of Python's liberal rules, code in any module can modify or add members to IMC. Used in an undisciplined way, "who changed that" could become a debugging issue. If there are multiple processes, race conditions are a danger.
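For concreteness, here is a self-contained sketch of the pattern (module B is simulated with types.ModuleType purely so the snippet runs standalone; a real program would import its own modules):
import types

class _Communicator(object):
    """A bare object used purely as an attribute bag."""
    pass

IMC = _Communicator()
IMC.important_global_constant = 0x0001

# Stand-in for "import B"; a real program would import its own module here.
B = types.ModuleType("B")
B.IMC = IMC

# Module-level code in B can now reach the shared resources through its IMC name.
assert B.IMC.important_global_constant == 0x0001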

At several points it is important to instantiate just one object of a class as a resource for other code.
Instead of trying to create some sort of singleton factory, can you not create the single-use object somewhere between the main point of entry for the program and instantiating the object that needs it? The single-use object can just be passed as a parameter to the other object. Logically, then, you won't create the single-use object more than once.
For example:
def main(...):
    aspell_instance = ...
    myapp = MyAppClass(aspell_instance)
or...
class SomeWidget(...):
    def __init__(self, edit_widget):
        self.edit_widget = edit_widget
    def onSomeEvent(self, ...):
        if self.edit_widget.document().isEmpty():
            ....
I don't know if that's clear enough, or if it's applicable to your situation. But to be honest, the only time I've found I can't do this is in a CherryPy-based webserver, where the points of entry were pretty much everywhere.


Python global variable in import * [duplicate]

I've run into a bit of a wall importing modules in a Python script. I'll do my best to describe the error, why I run into it, and why I'm trying this particular approach to solve my problem (which I will describe in a second):
Let's suppose I have a module in which I've defined some utility functions/classes, which refer to entities defined in the namespace into which this auxiliary module will be imported (let "a" be such an entity):
module1:
def f():
    print a
And then I have the main program, where "a" is defined, into which I want to import those utilities:
import module1
a=3
module1.f()
Executing the program will trigger the following error:
Traceback (most recent call last):
File "Z:\Python\main.py", line 10, in <module>
module1.f()
File "Z:\Python\module1.py", line 3, in f
print a
NameError: global name 'a' is not defined
Similar questions have been asked in the past (two days ago, d'uh) and several solutions have been suggested, however I don't really think these fit my requirements. Here's my particular context:
I'm trying to make a Python program which connects to a MySQL database server and displays/modifies data with a GUI. For cleanliness' sake, I've defined a bunch of auxiliary/utility MySQL-related functions in a separate file. However, they all have a common variable, which I had originally defined inside the utilities module, and which is the cursor object from the MySQLdb module.
I later realised that the cursor object (which is used to communicate with the db server) should be defined in the main module, so that both the main module and anything that is imported into it can access that object.
End result would be something like this:
utilities_module.py:
def utility_1(args):
    ...  # code which references a variable named "cur"

def utility_n(args):
    ...  # etcetera
And my main module:
program.py:
import MySQLdb, Tkinter
db = MySQLdb.connect(...)  # connection params elided
cur = db.cursor()  # cur is defined!
from utilities_module import *
And then, as soon as I try to call any of the utilities functions, it triggers the aforementioned "global name not defined" error.
A particular suggestion was to have a "from program import cur" statement in the utilities file, such as this:
utilities_module.py:
from program import cur
#rest of function definitions
program.py:
import Tkinter, MySQLdb
db = MySQLdb.connect(...)  # connection params elided
cur = db.cursor()  # cur is defined!
from utilities_module import *
But that's cyclic import or something like that and, bottom line, it crashes too. So my question is:
How in hell can I make the "cur" object, defined in the main module, visible to those auxiliary functions which are imported into it?
Thanks for your time and my deepest apologies if the solution has been posted elsewhere. I just can't find the answer myself and I've got no more tricks in my book.
Globals in Python are global to a module, not across all modules. (Many people are confused by this, because in, say, C, a global is the same across all implementation files unless you explicitly make it static.)
There are different ways to solve this, depending on your actual use case.
Before even going down this path, ask yourself whether this really needs to be global. Maybe you really want a class, with f as an instance method, rather than just a free function? Then you could do something like this:
import module1
thingy1 = module1.Thingy(a=3)
thingy1.f()
If you really do want a global, but it's just there to be used by module1, set it in that module.
import module1
module1.a=3
module1.f()
On the other hand, if a is shared by a whole lot of modules, put it somewhere else, and have everyone import it:
import shared_stuff
import module1
shared_stuff.a = 3
module1.f()
… and, in module1.py:
import shared_stuff
def f():
    print shared_stuff.a
Don't use a from import unless the variable is intended to be a constant. from shared_stuff import a would create a new a variable initialized to whatever shared_stuff.a referred to at the time of the import, and this new a variable would not be affected by assignments to shared_stuff.a.
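A quick self-contained demonstration of that pitfall (shared_stuff is simulated in-process purely so the snippet runs on its own; in real code it would be a separate shared_stuff.py file):
import sys
import types

# Simulate shared_stuff.py so this snippet is self-contained.
shared_stuff = types.ModuleType("shared_stuff")
shared_stuff.a = 1
sys.modules["shared_stuff"] = shared_stuff

from shared_stuff import a   # snapshots the value of shared_stuff.a right now
shared_stuff.a = 3           # rebinds the module attribute...
print(a)                     # ...but the local name still prints 1
print(shared_stuff.a)        # prints 3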
Or, in the rare case that you really do need it to be truly global everywhere, like a builtin, add it to the builtin module. The exact details differ between Python 2.x and 3.x. In 3.x, it works like this:
import builtins
import module1
builtins.a = 3
module1.f()
As a workaround, you could consider setting environment variables in the outer layer, like this.
main.py:
import os
os.environ['MYVAL'] = str(myintvariable)
mymodule.py:
import os
myval = None
if 'MYVAL' in os.environ:
    myval = os.environ['MYVAL']
As an extra precaution, handle the case when MYVAL is not defined inside the module.
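Also note that environment variables only carry strings, so non-string values must be converted on the way in and parsed back on the way out. A minimal sketch:
import os

os.environ['MYVAL'] = str(42)          # only strings survive the environment

myval = None
if 'MYVAL' in os.environ:
    myval = int(os.environ['MYVAL'])   # parse back to the original type
print(myval)                           # 42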
This post is just an observation about Python behaviour I encountered. Maybe the advice you read above doesn't work for you if you did the same thing I did below.
Namely, I have a module which contains global/shared variables (as suggested above):
#sharedstuff.py
globaltimes_randomnode=[]
globalist_randomnode=[]
Then I had the main module which imports the shared stuff with:
import sharedstuff as shared
and some other modules that actually populated these arrays. These are called by the main module. When exiting these other modules I can clearly see that the arrays are populated. But when reading them back in the main module, they were empty. This was rather strange for me (well, I am new to Python). However, when I change the way I import the sharedstuff.py in the main module to:
from sharedstuff import *
it worked (the arrays were populated).
Just sayin'
A function uses the globals of the module it's defined in. Instead of setting a = 3, for example, you should be setting module1.a = 3. So, if you want cur available as a global in utilities_module, set utilities_module.cur.
A better solution: don't use globals. Pass the variables you need into the functions that need them, or create a class to bundle all the data together, and pass it when initializing the instance.
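For the cur example, a minimal sketch of the parameter-passing version (names are hypothetical; execute/fetchall are standard DB-API cursor calls):
def utility_1(cur, query):
    # The cursor arrives as an explicit parameter instead of a global.
    cur.execute(query)
    return cur.fetchall()

# In the main program:
# rows = utility_1(db.cursor(), "SELECT ...")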
The easiest solution to this particular problem would have been to add another function within the module that would have stored the cursor in a variable global to the module. Then all the other functions could use it as well.
module1:
cursor = None

def setCursor(cur):
    global cursor
    cursor = cur

def method(some, args):
    global cursor
    do_stuff(cursor, some, args)
main program:
import module1
cursor = get_a_cursor()
module1.setCursor(cursor)
module1.method()
Since globals are module specific, you can add the following function to all imported modules, and then use it to:
Add singular variables (in dictionary format) as globals for those modules
Transfer your main module's globals to them
addglobals = lambda x: globals().update(x)
Then all you need to pass on current globals is:
import module
module.addglobals(globals())
Since I haven't seen it in the answers above, I thought I would add my simple workaround, which is just to add a global_dict argument to the function requiring the calling module's globals, and then pass the dict into the function when calling; e.g:
# external_module
def imported_function(global_dict=None):
    print(global_dict["a"])
# calling_module
a = 12
from external_module import imported_function
imported_function(global_dict=globals())
# output: 12
The OOP way of doing this would be to make your module a class instead of a set of unbound methods. Then you could use __init__ or a setter method to set the variables from the caller for use in the module methods.
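A minimal sketch of that idea (class and method names are hypothetical; execute/fetchall are standard DB-API cursor calls):
class Utilities(object):
    """Module-as-a-class: shared state lives on the instance, not in globals."""
    def __init__(self, cur=None):
        self.cur = cur              # set at construction time...

    def set_cursor(self, cur):      # ...or later, via a setter method
        self.cur = cur

    def utility_1(self, query):
        self.cur.execute(query)     # every method reaches the cursor via self
        return self.cur.fetchall()

# In the main program:
# utils = Utilities(db.cursor())
# rows = utils.utility_1("SELECT ...")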
Update
To test the theory, I created a module and put it on pypi. It all worked perfectly.
pip install superglobals
Short answer
This works fine in Python 2 or 3:
import inspect

def superglobals():
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals
save as superglobals.py and employ in another module thusly:
from superglobals import *
superglobals()['var'] = value
Extended Answer
You can add some extra functions to make things more attractive.
def superglobals():
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals

def getglobal(key, default=None):
    """
    getglobal(key[, default]) -> value

    Return the value for key if key is in the global dictionary, else default.
    """
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals.get(key, default)

def setglobal(key, value):
    _globals = superglobals()
    _globals[key] = value

def defaultglobal(key, value):
    """
    defaultglobal(key, value)

    Set the value of global variable `key` if it is not otherwise set.
    """
    _globals = superglobals()
    if key not in _globals:
        _globals[key] = value
Then use thusly:
from superglobals import *
setglobal('test', 123)
defaultglobal('test', 456)
assert(getglobal('test') == 123)
Justification
The "python purity league" answers that litter this question are perfectly correct, but in some environments (such as IDAPython) which is basically single threaded with a large globally instantiated API, it just doesn't matter as much.
It's still bad form and a bad practice to encourage, but sometimes it's just easier. Especially when the code you are writing isn't going to have a very long life.

Should I always use the most pythonic way to import modules?

I am making a tiny framework for games with pygame, on which I wish to implement basic code to quickly start new projects. It will be a module whose user just creates a folder with subfolders for sprite classes, maps, levels, etc.
My question is, how should my framework module load these client modules? I was considering designing it so the developer could just pass the names of the directories to the main object, like:
game = Game()
game.scenarios = 'scenarios'
Then game will append 'scenarios' to sys.path and use __import__(). I've tested and it works.
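Roughly, the mechanism I tested looks like this (a sketch; the names and layout are illustrative, and it assumes 'scenarios' sits next to the main script and contains an __init__.py):
import sys

class Game(object):
    def load_package(self, dirname):
        # Put the directory on sys.path so its contents can be found,
        # then import the package itself by name.
        if dirname not in sys.path:
            sys.path.append(dirname)
        return __import__(dirname)

# game = Game()
# game.scenarios = game.load_package('scenarios')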
But then I researched a little more to see if there was already some autoloader in Python, so I could avoid rewriting it, and I found this question Python modules autoloader?
Basically, it is not recommended to use an autoloader in Python, since "explicit is better than implicit" and "Readability counts".
That way, I think, I should compel the user of my module to manually import each of his/her modules, and pass these to the game instance, like:
from framework import Game
import scenarios
#many other imports
game = Game()
game.scenarios = scenarios
#so many other game.whatever = whatever
But this doesn't look good to me, and it isn't as comfortable. See, I am used to working with PHP, and I love the way it works with its autoloader.
So, does the first example have some probability of crashing or causing trouble, or is it just not 'pythonic'?
note: this is NOT a web application
I wouldn't consider letting a library import things from my current path or module good style. Instead I would only expect a library to import from two places:
Absolute imports from the global module space, like things you have installed using pip. If a library does this, those packages must also be listed in its install_requires=[] list
Relative imports from inside itself. Nowadays these are explicitly imported from .:
from . import bla
from .bla import blubb
This means that passing an object or module local to my current scope must always happen explicitly:
from . import scenarios
import framework
scenarios.sprites # attribute exists
game = framework.Game(scenarios=scenarios)
This allows you to do things like mock the scenarios module:
import types
import framework
# a SimpleNamespace looks like a module, as they both have attributes
scenarios = types.SimpleNamespace(sprites='a', textures='b')
scenarios.sprites # attribute exists
game = framework.Game(scenarios=scenarios)
Also you can implement a framework.utils.Scenario() class that implements a certain interface to provide sprites, maps, etc. The reason being: sprites and maps are usually saved in separate files. What you absolutely do not want to do is look at the scenarios module's __file__ attribute and start guessing around in its files. Instead, implement a method that provides a unified interface to them.
class Scenario():
    def __init__(self):
        ...
    def sprites(self):
        # Optionally load files from some default location.
        # If no such thing as a default location exists, raise NotImplementedError.
        ...
And your user-specific scenarios will derive from it and optionally override the loading methods:
import framework.utils

class Scenario(framework.utils.Scenario):
    def __init__(self):
        ...
    def sprites(self):
        # this method *must* load files from its own location;
        # accessing __file__ is OK here
        ...
What you can also do is have framework ship its own framework.contrib.scenarios module that is used in case no scenarios= keyword arg was passed (e.g. for a square default map and some colorful default textures):
from . import contrib

class Game():
    def __init__(self, ..., scenarios=None, ...):
        if scenarios is None:
            scenarios = contrib.scenarios
        self.scenarios = scenarios

Export decorator that manages __all__

A proper Python module will list all its public symbols in a list called __all__. Managing that list can be tedious, since you'll have to list each symbol twice. Surely there are better ways, probably using decorators, so one would merely annotate the exported symbols as @export.
How would you write such a decorator? I'm certain there are different ways, so I'd like to see several answers with enough information that users can compare the approaches against one another.
In Is it a good practice to add names to __all__ using a decorator?, Ed L suggests the following, to be included in some utility library:
import sys

def export(fn):
    """Use a decorator to avoid retyping function/class names.

    * Based on an idea by Duncan Booth:
      http://groups.google.com/group/comp.lang.python/msg/11cbb03e09611b8a
    * Improved via a suggestion by Dave Angel:
      http://groups.google.com/group/comp.lang.python/msg/3d400fb22d8a42e1
    """
    mod = sys.modules[fn.__module__]
    if hasattr(mod, '__all__'):
        name = fn.__name__
        all_ = mod.__all__
        if name not in all_:
            all_.append(name)
    else:
        mod.__all__ = [fn.__name__]
    return fn
We've adapted the name to match the other examples. With this in a local utility library, you'd simply write
from .utility import export
and then start using @export. Just one line of idiomatic Python, you can't get much simpler than this. On the downside, the decorator does require access to the module object, via the __module__ attribute and the sys.modules cache, both of which may be problematic in some of the more esoteric setups (like custom import machinery, or wrapping functions from another module to create functions in this module).
The python part of the atpublic package by Barry Warsaw does something similar to this. It offers some keyword-based syntax, too, but the decorator variant relies on the same patterns used above.
This great answer by Aaron Hall suggests something very similar, with two more lines of code as it doesn't use __dict__.setdefault. It might be preferable if manipulating the module __dict__ is problematic for some reason.
You could simply declare the decorator at the module level like this:
__all__ = []

def export(obj):
    __all__.append(obj.__name__)
    return obj
This is perfect if you only use this in a single module. At 4 lines of code (plus probably some empty lines for typical formatting practices) it's not overly expensive to repeat this in different modules, but it does feel like code duplication in those cases.
You could define the following in some utility library:
def exporter():
    all = []
    def decorator(obj):
        all.append(obj.__name__)
        return obj
    return decorator, all

export, __all__ = exporter()
export(exporter)
# possibly some other utilities, decorated with @export as well
Then inside your public library you'd do something like this:
from . import utility
export, __all__ = utility.exporter()
# start using @export
Using the library takes two lines of code here. It combines the definition of __all__ and the decorator. So people searching for one of them will find the other, thus helping readers to quickly understand your code. The above will also work in exotic environments, where the module may not be available from the sys.modules cache or where the __module__ property has been tampered with or some such.
https://github.com/russianidiot/public.py has yet another implementation of such a decorator. Its core file is currently 160 lines long! The crucial points appear to be the fact that it uses the inspect module to obtain the appropriate module based on the current call stack.
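The call-stack trick itself fits in a few lines; here is a minimal sketch of the technique (an illustration of the idea, not public.py's actual code; it assumes __all__, if present, is a list):
import inspect

def export(obj):
    # Grab the caller's frame (the module where @export is applied)
    # and append the decorated name to that module's __all__.
    caller_globals = inspect.stack()[1][0].f_globals
    caller_globals.setdefault('__all__', []).append(obj.__name__)
    return obj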
This is not a decorator approach, but provides the level of efficiency I think you're after.
https://pypi.org/project/auto-all/
You can use the two functions provided with the package to "start" and "end" capturing the module objects that you want included in the __all__ variable.
from auto_all import start_all, end_all
# Imports outside the start and end functions won't be externally available.
from pathlib import Path
def a_private_function():
    print("This is a private function.")
# Start defining externally accessible objects
start_all(globals())
def a_public_function():
    print("This is a public function.")
# Stop defining externally accessible objects
end_all(globals())
The functions in the package are trivial (a few lines), so could be copied into your code if you want to avoid external dependencies.
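As a rough idea of how such start/end functions might work (an assumption for illustration, not auto_all's actual source):
def start_all(globs):
    # Remember which names already exist before the public section starts.
    globs['_names_before'] = set(globs)

def end_all(globs):
    # Everything defined since start_all() becomes public.
    globs['__all__'] = [name for name in globs
                        if name not in globs['_names_before']
                        and name != '_names_before']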
While other variants are technically correct to a certain extent, one might also want to make sure that:
if the target module already has __all__ declared, it is handled correctly;
target appears in __all__ only once:
# utils.py
import sys
from typing import Any

def export(target: Any) -> Any:
    """
    Mark a module-level object as exported.
    Simplifies tracking of objects available via wildcard imports.
    """
    mod = sys.modules[target.__module__]
    __all__ = getattr(mod, '__all__', None)
    if __all__ is None:
        __all__ = []
        setattr(mod, '__all__', __all__)
    elif not isinstance(__all__, list):
        __all__ = list(__all__)
        setattr(mod, '__all__', __all__)
    target_name = target.__name__
    if target_name not in __all__:
        __all__.append(target_name)
    return target

Proper use of `isinstance(obj, class)`

As I write it, it seems almost surreal to me that I'm actually experiencing this problem.
I have a list of objects. Each of these objects is an instance of an Individual class that I wrote.
Thus, conventional wisdom says that isinstance(myObj, Individual) should return True. However, this was not the case. So I thought that there was a bug in my programming, and printed type(myObj), which to my surprise printed instance and myObj.__class__ gave me Individual!
>>> type(pop[0])
<type 'instance'>
>>> isinstance(pop[0], Individual) # with all the proper imports
False
>>> pop[0].__class__
Genetic.individual.Individual
I'm stumped! What gives?
EDIT: My Individual class
from itertools import count

class Individual:
    ID = count()
    def __init__(self, chromosomes):
        # managed as a list, as order is used to identify chromosomal
        # functions (i.e. chromosome i encodes functionality f)
        self.chromosomes = chromosomes[:]
        self.id = self.ID.next()
    # other methods
This error indicates that the Individual class somehow got created twice. You created pop[0] with one version of Individual, and are checking for instance membership against the other one. Although they are pretty much identical, Python doesn't know that, and isinstance fails. To verify this, check whether pop[0].__class__ is Individual evaluates to false.
Normally classes don't get created twice (unless you use reload) because modules are imported only once, and all class objects effectively remain singletons. However, using packages and relative imports can leave a trap that leads to a module being imported twice. This happens when a script (started with python bla, as opposed to being imported from another module with import bla) contains a relative import. When running the script, Python doesn't know that its imports refer to the Genetic package, so it processes its imports as absolute, creating a top-level individual module with its own individual.Individual class. Another module then correctly imports the Genetic package, which ends up importing Genetic.individual, resulting in the creation of the doppelganger, Genetic.individual.Individual.
To fix the problem, make sure that your script only uses absolute imports, such as import Genetic.individual even if a relative import like import individual appears to work just fine. And if you want to save on typing, use import Genetic.individual as individual. Also note that despite your use of old-style classes, isinstance should still work, since it predates new-style classes. Having said that, it would be highly advisable to switch to new-style classes.
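One quick way to confirm the two-copies diagnosis from the interpreter (a hypothetical session using the question's names; the exact module strings depend on how the script was started):
>>> pop[0].__class__.__module__
'individual'
>>> Individual.__module__
'Genetic.individual'
>>> pop[0].__class__ is Individual
False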
You need to use new-style classes that inherit from
class ClassName(object):
    pass
From your example, you are using old-style classes that inherit from
class Classname:
    pass
EDIT: As @user4815162342 said,
>>> type(pop[0])
<type 'instance'>
is caused by using an old-style class, but this is not the cause of your issues with isinstance. You should instead make sure you don't create the class in more than one place, or if you do, use distinct names. Importing it more than once should not be an issue.

circular import dependencies in a package with inheritances

I have basically the following setup in my package:
thing.py:
from otherthing import *

class Thing(Base):
    def action(self):
        ...do something with Otherthing()...
subthing.py:
from thing import *

class Subthing(Thing):
    pass
otherthing.py:
from subthing import *

class Otherthing(Base):
    def action(self):
        ...do something with Subthing()...
If I put all objects into one file, it will work, but that file would just become way too big and it'll be harder to maintain. How do I solve this problem?
This is treading into the dreaded Python circular imports argument but, IMHO, you can have an excellent design and still need circular references.
So, try this approach:
thing.py:
class Thing(Base):
    def action(self):
        ...do something with otherthing.Otherthing()...

import otherthing
subthing.py:
import thing

class Subthing(thing.Thing):
    pass
otherthing.py:
class Otherthing(Base):
    def action(self):
        ...do something with subthing.Subthing()...

import subthing
There are a couple of things going on here. First, some background.
Due to the way importing works in Python, a module that is in the process of being imported (but has not been fully parsed yet) will be considered already imported when future import statements in other modules referencing that module are evaluated. So, you can end up with a reference to a symbol on a module that is still in the middle of being parsed - and if the parsing hasn't made it down to the symbol you need yet, it will not be found and will throw an exception.
One way to deal with this is to use "tail imports". The purpose of this technique is to define any symbols that other modules referring to this one might need before potentially triggering the import of those other modules.
Another way to deal with circular references is to move from from based imports to a normal import. How does this help? When you have a from style import, the target module will be imported and then the symbol referenced in the from statement will be looked up on the module object right at that moment.
With a normal import statement, the lookup of the reference is delayed until something does an actual attribute reference on the module. This can usually be pushed down into a function or method which should not normally be executed until all of your importing is complete.
The case where these two techniques don't work is when you have circular references in your class hierarchy. The import has to come before the subclass definition and the attribute representing the super class must be there when the class statement is hit. The best you can do is use a normal import, reference the super class via the module and hope you can rearrange enough of the rest of your code to make it work.
If you are still stuck at that point, another technique that can help is to use accessor functions to mediate the access between one module and another. For instance, if you have class A in one module and want to reference it from another module but can't due to a circular reference, you can sometimes create a third module with a function in it that just returns a reference to class A. If you generalize this into a suite of accessor functions, this doesn't end up as much of a hack as it sounds.
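For example, a minimal sketch of such an accessor module (file and class names are hypothetical):
# accessors.py -- a third module with no top-level imports of its own
def get_thing_class():
    # The import runs only when somebody calls the accessor,
    # long after all modules have finished importing.
    import thing
    return thing.Thing

# otherthing.py can then do:
# from accessors import get_thing_class
# Thing = get_thing_class()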
If all else fails, you can move import statements into your functions and methods - but I usually leave that as the very last resort.
--- EDIT ---
Just wanted to add something new I discovered recently. In a "class" statement, the super class is actually a Python expression. So, you can do something like this:
>>> b = lambda: object
>>> class A(b()):
...     pass
...
>>> a = A()
>>> a
<__main__.A object at 0x1fbdad0>
>>> a.__class__.__mro__
(<class '__main__.A'>, <type 'object'>)
>>>
This allows you to define and import an accessor function to get access to a class from another class definition.
Stop writing circular imports. It's simple. thing cannot possibly depend on everything that's in otherthing.
1) search for other questions exactly like yours.
2) read those answers.
3) rewrite otherthing so that thing depends on part of otherthing, not all of otherthing.
