Reloading module causes different results

During testing I added a reload command to my test cases so I could change code in several different places without having to reload everything manually and I noticed the reloads seemed to affect the results of the tests.
Here's what I did:
import mymodule
import mymodule.rules as rules

def testcase():
    reload(mymodule)
    reload(rules)
    # The rest of the test case
Everything works fine like this, or when both reloads are commented out, but when I comment out the second reload the results of the test are different. Is there something that happens during the reload process that I'm not aware of that necessitates reloading all scripts from a module once the module is reloaded? Is there some other explanation?
I'm not sure if this is relevant, but rules is a separate script inside the package that includes this line:
from mymodule import Rule

The information in your question is rather vague, and your terminology rather non-standard. From
rules is a separate script inside mymodule.
I infer that mymodule is actually a package, and it seems it does not automatically import rules upon import. This implies that after executing
import mymodule
there will be no mymodule.rules, but after executing
import mymodule.rules as rules
the module rules will be imported into the namespace of mymodule. (Side note: The latter code line is usually written as from mymodule import rules.)
After executing the first reload() statement, you will get a fresh copy of mymodule, which will not contain mymodule.rules; this will only be recreated after the second reload() statement.
I had to do a lot of guessing for this answer, so I might have gotten it wrong. The reload() statement has lots of subtleties, as can be seen in its documentation, so it's better to use it only if you are closely familiar with Python's import machinery.
(Another side note: If rules.py resides inside the package mymodule, as your setup seems to be, you should use a relative import there. Instead of
from mymodule import Rule
you should do
from . import Rule
I also recommend from __future__ import absolute_import for more transparent import rules.)
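One mechanism that can produce this kind of difference can be reproduced in a few lines. This is a minimal sketch under assumptions: the package layout (Rule defined in mymodule/__init__.py, rules.py containing only the from-import) is a guess at the asker's setup, the files are written to a temporary directory, and importlib.reload is the Python 3 spelling of the reload() builtin used in the question.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk (assumed layout, not the asker's real code):
#   mymodule/__init__.py  defines Rule
#   mymodule/rules.py     does "from mymodule import Rule"
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "mymodule")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("class Rule(object):\n    pass\n")
with open(os.path.join(pkg_dir, "rules.py"), "w") as f:
    f.write("from mymodule import Rule\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import mymodule
import mymodule.rules as rules

# Right after import, both names refer to the same class object.
same_before = rules.Rule is mymodule.Rule

# Reloading the package re-executes __init__.py and creates a new Rule class,
# but rules still holds a reference to the old one.
importlib.reload(mymodule)
same_after_first = rules.Rule is mymodule.Rule

# Reloading rules re-runs its from-import and picks up the new class.
importlib.reload(rules)
same_after_second = rules.Rule is mymodule.Rule
```

If the test compares or type-checks Rule instances, the window between the two reloads is where results could diverge, since two distinct Rule classes exist at that point.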

I'm not sure exactly what is causing your problem, but I think you may be misusing reload().
According to the docs, reload() will
Reload a previously imported module.
However, if you're running this in a testcase, there won't be any changes to the module between when you import it and when you reload it, right? In order for there to be changes, I think you would have to be changing those files as the test case runs, which is probably not a good idea.

Related

How to define a function which imports modules

I am trying to define a function which imports modules and place that function in a module of my own so that when I am working on a certain type of project all I need to type is:
from user import *
setup()
#setup is the function which imports the modules
However, whenever I try this, it simply doesn't work. Trying to call upon the modules defined in setup after I have run the function only results in an error saying that the modules aren't installed.
Here is the code in my module:
def setup():
    import keyboard, win32api, win32con
Let me know if there is any more information I can provide, and thanks for any help.

It's generally a good idea to explicitly import names into your module where you need them, so you can see where things come from. Explicit is better than implicit. But for interactive sessions, it can sometimes be useful to import loads of things at once, so...
Your problem is that your setup function imports these modules into its own local namespace, which isn't available outside the function. But you can do something much simpler. If your user module just contained:
import keyboard, win32api, win32con
Then in your interactive session you could do:
>>> from user import *
These modules should then be available in your session's namespace.
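The scoping behaviour described here can be checked directly. The sketch below uses the standard-library module colorsys as a stand-in for keyboard, win32api and win32con (an assumption, so that it runs anywhere); setup_global is a hypothetical variant added only for comparison.

```python
import sys

def setup():
    # Binds the name "colorsys" only in setup()'s local scope.
    import colorsys

setup()
loaded = "colorsys" in sys.modules    # the module itself was loaded...
visible = "colorsys" in globals()     # ...but no module-level name was bound

def setup_global():
    # Hypothetical variant: "global" makes the import bind a module-level name.
    global colorsys
    import colorsys

setup_global()
visible_now = "colorsys" in globals()
```

Note that the global variant binds the name in the module where setup_global is defined, not in the caller's namespace, so it would still not help after a from user import * in another file. A plain module-level import in user, as suggested above, avoids the issue.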

I think you are having a scope problem: if setup is defined in some other module, the import is valid in that module only (or perhaps only inside the function itself; that would need to be tested).
As a general matter an "import everything possibly needed" policy is something I would consider wrong. Your code ought only to import what it really needs. Dependencies are better reduced to a minimum and explicit.

Circular imports hell

Python is an extremely elegant language. Well, except... except imports. I still can't get it to work the way that seems natural to me.
I have a class MyObjectA which is in file mypackage/myobjecta.py. This object uses some utility functions which are in mypackage/utils.py. So in my first lines in myobjecta.py I write:
from mypackage.utils import util_func1, util_func2
But some of the utility functions create and return new instances of MyObjectA. So I need to write in utils.py:
from mypackage.myobjecta import MyObjectA
Well, no I can't. This is a circular import and Python will refuse to do that.
There are many questions here regarding this issue, but none seems to give a satisfactory answer. From what I can read in all the answers:
1) Reorganize your modules, you are doing it wrong! But I do not know how better to organize my modules even in such a simple case as I presented.
2) Try just import ... rather than from ... import ... (personally I hate to write and potentially refactor all the full name qualifiers; I love to see what exactly I am importing into the module from the outside world). Would that help? I am not sure, there are still circular imports.
3) Do hacks like importing something in the inner scope of a function body just one line before you use something from the other module.
I am still hoping there is solution number 4) which would be Pythonic in the sense of being functional and elegant and simple and working. Or is there not?
Note: I am primarily a C++ programmer, and the example above is so easily solved by including the corresponding headers that I can't believe it is not possible in Python.

There is nothing hackish about importing something in a function body, it's an absolutely valid pattern:
def some_function():
    import logging
    do_some_logging()
Usually, ImportErrors are raised only because of the way import evaluates the top-level statements of the entire file when it runs.
If you do not have a logical circular dependency, nothing is impossible in Python.
There is a way around it if you positively want your imports on top:
From David Beazley's excellent talk Modules and Packages: Live and Let Die! - PyCon 2015, 1:54:00, here is a way to deal with circular imports in Python:
try:
    from images.serializers import SimplifiedImageSerializer
except ImportError:
    import sys
    SimplifiedImageSerializer = sys.modules[__package__ + '.SimplifiedImageSerializer']
This tries to import SimplifiedImageSerializer and, if an ImportError is raised (due to a circular import or to it not existing), pulls it from the import cache (sys.modules).
PS: You have to read this entire post in David Beazley's voice.

Don't import mypackage.utils to your main module, it already exists in mypackage.myobjecta. Once you import mypackage.myobjecta the code from that module is being executed and you don't need to import anything to your current module, because mypackage.myobjecta is already complete.

What you want isn't possible. There's no way for Python to know in which order it needs to execute the top-level code in order to do what you ask.
Assume you import utils first. Python will begin by evaluating the first statement, from mypackage.myobjecta import MyObjectA, which requires executing the top level of the myobjecta module. Python must then execute from mypackage.utils import util_func1, util_func2, but it can't do that until it resolves the myobjecta import.
Instead of recursing infinitely, Python resolves this situation by allowing the innermost import to complete without finishing. Thus, the utils import completes without executing the rest of the file, and your import statement fails because util_func1 doesn't exist yet.
The reason import myobjecta works is that it allows the symbols to be resolved later, after the body of every module has executed. Personally, I've run into a lot of confusion even with this kind of circular import, and so I don't recommend using them at all.
If you really want to use a circular import anyway, and you want them to be "from" imports, I think the only way it can reliably work is this: Define all symbols used by another module before importing from that module. In this case, your definitions for util_func1 and util_func2 must be before your from mypackage.myobjecta import MyObjectA statement in utils, and the definition of MyObjectA must be before from mypackage.utils import util_func1, util_func2 in myobjecta.
Compiled languages like C# can handle situations like this because the top level is a collection of definitions, not instructions. They don't have to create every class and every function in the order given. They can work things out in whatever order is required to avoid any cycles. (C++ does it by duplicating information in prototypes, which I personally feel is a rather hacky solution, but that's also not how Python works.)
The advantage of a system like Python is that it's highly dynamic. Yes you can define a class or a function differently based on something you only know at runtime. Or modify a class after it's been created. Or try to import dependencies and go without them if they're not available. If you don't feel these things are worth the inconvenience of adhering to a strict dependency tree, that's totally reasonable, and maybe you'd be better served by a compiled language.
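The difference between the two import styles can be reproduced with two throwaway packages. This is a sketch under assumptions: the package names circ_from and circ_plain and the make_package helper are hypothetical, and the files are written to a temporary directory so the example is self-contained.

```python
import importlib
import os
import sys
import tempfile
import textwrap

def make_package(name, files):
    """Write a small package to a temp dir and make it importable (demo helper)."""
    root = tempfile.mkdtemp()
    pkg = os.path.join(root, name)
    os.mkdir(pkg)
    open(os.path.join(pkg, "__init__.py"), "w").close()
    for fname, src in files.items():
        with open(os.path.join(pkg, fname), "w") as f:
            f.write(textwrap.dedent(src))
    sys.path.insert(0, root)
    importlib.invalidate_caches()

# Variant 1: both modules use "from ... import ..." at the top of the file.
make_package("circ_from", {
    "myobjecta.py": """
        from circ_from.utils import util_func1
        class MyObjectA(object):
            pass
    """,
    "utils.py": """
        from circ_from.myobjecta import MyObjectA
        def util_func1():
            return MyObjectA()
    """,
})
try:
    import circ_from.myobjecta
    from_import_ok = True
except ImportError:
    # MyObjectA does not exist yet when utils asks for it.
    from_import_ok = False

# Variant 2: utils imports the module and looks MyObjectA up at call time.
make_package("circ_plain", {
    "myobjecta.py": """
        from circ_plain.utils import util_func1
        class MyObjectA(object):
            pass
    """,
    "utils.py": """
        import circ_plain.myobjecta
        def util_func1():
            return circ_plain.myobjecta.MyObjectA()
    """,
})
import circ_plain.myobjecta
obj = circ_plain.myobjecta.util_func1()
```

In the second variant the attribute lookup is deferred until util_func1 is called, by which time both module bodies have finished executing.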

Pythonistas frown upon importing from a function. Pythonistas usually frown upon global variables. Yet, I saw both and don't think the projects that used them were any worse than others done by some strict Pythonistas. The feature does exist, not going into a long argument over its utility.
There's an alternative to the problem of importing from a function: when you import from the top of a file (or the bottom, really), this import will take some time (some small time, but some time), but Python will cache the entire file and if another file needs the same import, Python can retrieve the module quickly without importing. Whereas, if you import from a function, things get complicated: Python will have to process the import line each time you call the function, which might, in a tiny way, slow your program down.
A solution to this is to cache the module independently. Okay, this uses imports inside function bodies AND global variables. Wow!
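_MODULEA = None

def util1():
    # Without this declaration the import below would make _MODULEA a local
    # name and the "is None" test would raise UnboundLocalError.
    global _MODULEA
    if _MODULEA is None:
        from mymodule import modulea as _MODULEA
    obj = _MODULEA.ClassYouWant
    return obj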
I saw this strategy adopted with a project using a flat API. Whether you like it or not (and I'm not sure about that myself), it works and is fast, because the import line is executed only once (when the function first executes). Still, I would recommend restructuring: problems with circular imports show a problem in structure, usually, and this is always worth fixing. I do agree, though, it would be nice if Python provided more useful errors when this kind of situation happens.
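For comparison, a plain import inside a function is already served from Python's module cache after the first call. The sketch below checks this; the module name lazy_demo_mod is hypothetical and is written to a temporary directory just for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module whose body creates a new object each time it executes.
root = tempfile.mkdtemp()
with open(os.path.join(root, "lazy_demo_mod.py"), "w") as f:
    f.write("VALUE = object()\n")
sys.path.insert(0, root)
importlib.invalidate_caches()

def get_value():
    # The import statement runs on every call, but after the first call it
    # only looks the module up in sys.modules.
    import lazy_demo_mod
    return lazy_demo_mod.VALUE

first = get_value()
second = get_value()

# If the module body had executed twice, VALUE would be a different object.
body_ran_once = first is second
cached = "lazy_demo_mod" in sys.modules
```

So the per-call cost of a function-level import is the cache lookup and name binding, not a re-execution of the module; the manual global cache above mainly avoids that lookup.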

Save A Reloaded Python Module For Testing Purposes

I have a Python module that I am testing, and because of the way that the module works (it does some initialization upon import) have been reloading the module during each unittest that is testing the initialization. The reload is done in the setUp method, so all tests are actually reloading the module, which is fine.
This all works great if I am only running tests in that file during any given Python session because I never required a reference to the previous instance of the module. But when I use Pydev or unittest's discover I get errors as seen here because other tests which import this module have lost their reference to objects in the module since they were imported before all of the reloading business in my tests.
There are similar questions around SO like this one, but those all deal with updating objects after reloads have occurred. What I would like to do is save the state of the module after the initial import, run my tests that do all of the reloading, and then in the test tearDown to put the initial reference to the module back so that tests that run downstream that use the module still have the correct reference. Note that I am not making any changes to the module, I am only reloading it to test some initialization pieces that it does.
There are also some solutions that include hooks in the module code which I am not interested in. I don't want to ask developers to push things into the codebase just so tests can run. I am using Python 2.6 and unittest. I see that some projects exist like process-isolation, and while I am not sure if that does entirely what I am asking for, it does not work for Python 2.6 and I don't want to add new packages to our stack if possible. Stub code follows:
import unittest
import mypackage.mymodule

saved_module = mypackage.mymodule

class SomeTestThatReloads(unittest.TestCase):
    def setUp(self):
        reload(mypackage.mymodule)

    def tearDown(self):
        # What to do here with saved_module?
        pass

    def test_initialization(self):
        # testing scenario code
        pass

Unfortunately, there is no simple way to do that. If your module's initialization has side effects (and by the looks of it it does -- hooks, etc.), there is no automated way to undo them, short of entirely restarting the Python process.
Similarly, if anything in your code imports something from your module rather than the module itself (e.g. from my_package.my_module import some_object instead of import my_package.my_module), reloading the module won't do anything to the imported objects (some_object will refer to whatever my_package.my_module.some_object referred to when the import statement was executed, regardless of what you reload and what's on the disk).
The problem this all comes down to is that Python's module system works by executing the modules (which is full of side effects, the definition of classes/functions/variables being only one of many) and then exposing the top-level variables they created, and the Python VM itself treats modules as one big chunk of global state with no isolation.
Therefore, the general solution to your problem is to restart a new Python process after each test (which sucks :( ).
If your modules' initialization side effects are limited, you can try running your tests with Nose instead of Unittest (the tests are compatible, you don't have to rewrite anything), whose Isolate plugin attempts to do what you want: http://nose.readthedocs.org/en/latest/plugins/isolate.html
But it's not guaranteed to work in the general case, because of what I said above.
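A different technique from the ones discussed above, offered only as a hedged sketch: instead of reloading in place, each test can import a fresh copy of the module by temporarily removing it from sys.modules, and tearDown can put the original module object back. The module name fresh_demo_mod and its TOKEN attribute are hypothetical stand-ins, and the code uses Python 3 rather than the 2.6 mentioned in the question.

```python
import importlib
import io
import os
import sys
import tempfile
import unittest

# Hypothetical stand-in for the module under test: importing it performs some
# "initialization" (here, creating a new TOKEN object).
root = tempfile.mkdtemp()
with open(os.path.join(root, "fresh_demo_mod.py"), "w") as f:
    f.write("TOKEN = object()\n")
sys.path.insert(0, root)
importlib.invalidate_caches()

import fresh_demo_mod
saved_module = fresh_demo_mod

class SomeTestThatReloads(unittest.TestCase):
    def setUp(self):
        # Drop the cached entry so the next import re-executes the module body
        # into a brand-new module object, leaving saved_module untouched.
        sys.modules.pop("fresh_demo_mod", None)
        self.fresh = importlib.import_module("fresh_demo_mod")

    def tearDown(self):
        # Restore the original module object for tests that run later.
        sys.modules["fresh_demo_mod"] = saved_module

    def test_initialization(self):
        self.assertIsNot(self.fresh, saved_module)
        self.assertIsNot(self.fresh.TOKEN, saved_module.TOKEN)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(SomeTestThatReloads)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

This does not undo side effects the initialization has outside the module, which is the limitation the answer describes. For a submodule such as mypackage.mymodule, the attribute on the parent package is also rebound by the fresh import and would need restoring as well.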

How to properly handle a circular module dependency in Python?

Trying to find a good and proper pattern to handle a circular module dependency in Python. Usually, the solution is to remove it (through refactoring); however, in this particular case we would really like to have the functionality that requires the circular import.
EDIT: According to answers below, the usual angle of attack for this kind of issue would be a refactor. However, for the sake of this question, assume that is not an option (for whatever reason).
The problem:
The logging module requires the configuration module for some of its configuration data. However, for some of the configuration functions I would really like to use the custom logging functions that are defined in the logging module. Obviously, importing the logging module in configuration raises an error.
The possible solutions we can think of:
Don't do it. As I said before, this is not a good option, unless all other possibilities are ugly and bad.
Monkey-patch the module. This doesn't sound too bad: load the logging module dynamically into configuration after the initial import, and before any of its functions are actually used. This implies defining global, per-module variables, though.
Dependency injection. I've read about and run into dependency injection alternatives (particularly in the Java Enterprise space) and they remove some of this headache; however, they may be too complicated to use and manage, which is something we'd like to avoid. I don't know what the landscape looks like for this in Python, though.
What is a good way to enable this functionality?
Thanks very much!

As already said, some refactoring is probably needed. Judging by the names, it may be fine for a logging module to use configuration. But a configuration module should hold configuration parameters, so the question arises: why is configuration logging at all?
Chances are that the parts of the code under configuration that use logging do not belong in the configuration module: it seems to be doing some kind of processing and logging either results or errors.
Without inner knowledge, and using only common sense, a "configuration" module should be something simple without much processing and it should be a leaf in the import tree.
Hope it helps!

Will this work for you?
# MODULE a (file a.py)
import b
HELLO = "Hello"

# MODULE b (file b.py)
try:
    import a
    # All the code for b goes here, for example:
    print("b done", a.HELLO)
except:
    if hasattr(a, 'HELLO'):
        raise
    else:
        pass
Now I can do an import b. When the circular import (caused by the import b statement in a) throws an exception, it gets caught and discarded. Of course your entire module b will have to be indented one extra block level, and you have to have inside knowledge of where the variable HELLO is declared in a.
If you don't want to modify b.py by inserting the try:except: logic, you can move the whole b source to a new file, call it c.py, and make a simple file b.py like this:
# new Module b.py
try:
    import a  # bound here so the except clause below can inspect it
    from c import *
    print("b done", a.HELLO)
except:
    if hasattr(a, "HELLO"):
        raise
    else:
        pass

# The c.py file is now a copy of the original b.py:
import a
# All the code from the original b, for example:
print("b done", a.HELLO)
This will import the entire namespace from c to b, and paper over the circular import as well.
I realize this is gross, so don't tell anyone about it.

A cyclic module dependency is usually a code smell.
It indicates that part of the code should be re-factored so that it is external to both modules.

So if I'm reading your use case right, logging accesses configuration to get configuration data. However, configuration has some functions that, when called, require that stuff from logging be imported in configuration.
If that is the case (that is, configuration doesn't really need logging until you start calling functions), the answer is simple: in configuration, place all the imports from logging at the bottom of the file, after all the class, function and constant definitions.
Python reads things from top to bottom: when it comes across an import statement in configuration, it runs it, but at this point, configuration already exists as a module that can be imported, even if it's not fully initialized yet: it only has the attributes that were declared before the import statement was run.
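The bottom-of-file placement can be tried out with two small modules. This is a sketch under assumptions: the names demo_logging and demo_configuration, the LEVEL constant and the describe function are all hypothetical (the asker's real modules are not shown), and the files are written to a temporary directory.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()

# Hypothetical logging module: needs a value from configuration at import time.
with open(os.path.join(root, "demo_logging.py"), "w") as f:
    f.write(
        "from demo_configuration import LEVEL\n"
        "def log(msg):\n"
        "    return '[%s] %s' % (LEVEL, msg)\n"
    )

# Hypothetical configuration module: only needs logging when describe() is
# called, so its import sits at the bottom, after LEVEL and describe exist.
with open(os.path.join(root, "demo_configuration.py"), "w") as f:
    f.write(
        "LEVEL = 'INFO'\n"
        "def describe():\n"
        "    return demo_logging.log('level is ' + LEVEL)\n"
        "import demo_logging\n"
    )

sys.path.insert(0, root)
importlib.invalidate_caches()

# Importing configuration first triggers the circular import; it succeeds
# because LEVEL is already defined when demo_logging asks for it.
import demo_configuration
message = demo_configuration.describe()
```

Had the import of demo_logging been the first line of demo_configuration, the from-import of LEVEL would fail in this import order, because LEVEL would not be defined yet.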
I do agree with the others though, that circular imports are usually a code smell.

how to test if one python module has been imported?

How to test if a module has been imported in python?
for example I need the basics:
if not has_imported("sys"):
    import sys
also
if not has_imported("sys.path"):
    from sys import path
Thanks!
Rgs.
Thanks for all of your comments:
The code has been pasted here:
auto import all sub modules in a folder then invoke same name functions - python runtime inspect related

If you want to optimize by not importing things twice, save yourself the hassle because Python already takes care of this.
If you need this to avoid NameErrors or something: Fix your sloppy coding - make sure you don't need this, i.e. define (import) everything before you ever use it (in the case of imports: once, at startup, at module level).
In case you do have a good reason: sys.modules is a dictionary containing all modules already imported somewhere. But it only contains modules, and because of the way from <module> import <variable> works (import the whole module as usual, extract the things you import from it), from sys import path would only add sys to sys.modules (if it wasn't already imported on startup). from pkg import module adds pkg.module as you probably expect.
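A minimal sketch of the has_imported helper from the question, built on sys.modules as described above. The helper name comes from the question; the choice of fractions as the example module is an assumption made only so the snippet is runnable.

```python
import sys

def has_imported(name):
    """Return True if a module called `name` is already in the import cache."""
    return name in sys.modules

# A from-import loads the whole module but binds only the requested name.
from fractions import Fraction

sys_loaded = has_imported("sys")
fractions_loaded = has_imported("fractions")

# sys.path is an attribute of sys (a list), not a module, so it never
# appears as a key in sys.modules.
attr_is_module = has_imported("sys.path")

missing = has_imported("no_such_module_for_demo")
```

The check only says whether the module has been loaded somewhere in the process, not whether a name for it is bound in the current namespace.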

I feel the answer that has been accepted is not fully correct.
Python still has overhead when importing the same module multiple times. Python handles it without giving you an error, sure, but that doesn't mean it won't slow down your script. As you will see from the URL below, there is significant overhead when importing a module multiple times.
For example, in a situation where you may not need a certain module except under a particular condition, if that module is large or has a high overhead then there is reason to import only on condition. That does not explicitly mean you are a sloppy coder either.
https://wiki.python.org/moin/PythonSpeed/PerformanceTips#Import_Statement_Overhead

from sys import modules

module_name = 'm'  # name of the module to check
try:
    module = modules[module_name]
except KeyError:
    module = __import__(module_name)
This is my solution for changing code at runtime!
