Python import type detection - python

Can a Python module detect whether it has been imported with import module or from module import *? Something like
if __something__=='something':
    print 'Directly imported with "import ' + __name__ + '"'
else:
    print 'Imported with "from ' + __name__ + ' import *"'
Thank you.

No, it's not possible to detect this from within the module's code. Upon the first import, the module body is executed and a new module object is inserted in sys.modules. Only after this, the requested names are inserted into the namespace of the importing module.
Upon later imports, the module body isn't even executed. So if a module is first imported as
import module
and a second time as
from module import name
it has no chance to do anything at all during the second import. In particular, it cannot check how it is imported.
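A minimal Python 3 sketch of this behaviour. The module name counted_demo_mod is made up for the demonstration, and the module is written to a temporary directory only so that the snippet is self-contained.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

# Write a throwaway (hypothetical) module whose body announces each execution.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "counted_demo_mod.py"), "w") as f:
    f.write('print("module body executed")\nname = "value"\n')
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    import counted_demo_mod                # first import: the body runs
    from counted_demo_mod import name      # later import: served from sys.modules

body_runs = captured.getvalue().count("module body executed")
print(body_runs)  # number of times the module body ran
```

The second statement only copies name out of the already-loaded module object, so the module's code never gets a chance to observe it.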

While Sven's answer is probably the correct one, and this might seem a bit obvious, it is what I was really looking for when I stumbled upon this question.
This module will at least know that you passed an input argument to it. This allows unit testing of just this specific script, without the unit test being performed in the module that imported it.
import sys
def myfunction(blah):
    return "something like: " + blah
noargs=len(sys.argv)
if noargs>1:
    for i in range(noargs-1):
        print myfunction(sys.argv[i+1])
However, it doesn't really help you, Emilio, if you have no input arguments. :)
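For the narrower question of "run as a script" versus "imported at all", the usual idiom is to check __name__. A rough sketch that reuses myfunction from this answer; the main helper is an addition for illustration only.

```python
import sys

def myfunction(blah):
    return "something like: " + blah

def main(argv):
    # Hypothetical helper: apply myfunction to every command-line argument.
    return [myfunction(arg) for arg in argv[1:]]

if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    for line in main(sys.argv):
        print(line)
```

This still cannot tell import module apart from from module import *; it only distinguishes direct execution from any kind of import.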

Related

import -force mymodule for python 3.6+?

Is there a way to do this in python 3.6+?
import -force mymodule
I just want a single python command that both:
(1) loads the module for the first time, and
(2) forces a reload of the module if it is already loaded, without barfing.
(This is not a duplicate question because I'm asking for something different. What I want is a single function call that will do items (1) and (2) above. I don't want to decide manually whether to issue "import" or "imp.reload". I just want Python code for a single function that can detect which case is appropriate and proceed automatically, (1) or (2).)
I'm thinking that it's something like this:
def import_force(m):
    import sys
    if m not in sys.modules:
        import m
    else:
        import importlib
        importlib.reload(m)
Except, I can't figure out how to pass a module name as a parameter. It just gives me an error: no module named 'm'.
There is one missing step that you semi-corrected in your new answer, which is that you need to assign the new module in every scope that uses it. The easiest way is to return the module object and bind it to the name you want outside your function. Your original implementation was 90% correct:
import sys, importlib
def import_force(m):
    if m not in sys.modules:
        return __import__(m)
    else:
        return importlib.reload(sys.modules[m])
Now you can use this function from the command line to replace import, e.g.:
my_module = import_force('my_module')
Any time you find yourself using exec to perform a task for which there is so much well defined machinery already available, you have code smell. There is also no reason to re-import sys and importlib every time.
This function should do what you want:
import importlib
import sys
def import_force(name):
    needs_reload = name in sys.modules
    module = importlib.import_module(name)
    if needs_reload:
        module = importlib.reload(module)
    return module
# Usage example:
os = import_force('os')
An alternative approach is to write your own import hooks, which I won't describe.
However please note that this is an anti-pattern and I would discourage the practice of reloading modules at every import.
If this is for debugging purposes, then I would suggest using one of the many auto-reloader solutions available online: they watch your Python files for changes, and when you make modifications they automatically re-import the modules.
Your function didn't work for two reasons:
The import keyword does not resolve variables, so import m does not mean "import the module whose name is in the variable m", but rather "import the module named m".
importlib.reload wants a module object, not a module name.
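Both points can be checked in a few lines. This is only a sketch; json is an arbitrary standard-library module picked as the example.

```python
import importlib
import sys

name = "json"  # arbitrary standard-library module used as the example

# import_module accepts the module name as a string held in a variable.
module = importlib.import_module(name)
assert module is sys.modules[name]

# reload needs the module object itself; passing the name raises TypeError.
try:
    importlib.reload(name)
except TypeError as exc:
    reload_error = exc
else:
    reload_error = None

# Passing the module object works and returns that same object.
reloaded = importlib.reload(module)
print(type(reload_error).__name__, reloaded is module)
```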
import sys
import importlib
# importing with a sledgehammer... simple, effective, and it always works
def import_force(name):
    module = importlib.import_module(name)
    module = importlib.reload(module)
    return module
# assuming mymodule.py is in the current directory
mymodule = import_force("mymodule")
It's possible, but a little bit tricky to code correctly the first time...
import sys
import importlib
def import_force(modstr):
    if modstr not in sys.modules:
        print("IMPORT " + modstr)
        cmd = "globals()['%s'] = importlib.import_module('%s')" % (modstr, modstr)
        exec(cmd)
    else:
        print("RELOAD " + modstr)
        cmd = "globals()['%s'] = importlib.reload(%s)" % (modstr, modstr)
        exec(cmd)
If you have a module file in your current directory called "mymodule.py", then use it like this:
Py> import_force("mymodule")
Version 2.0:
def import_force(modstr):
    if modstr not in sys.modules:
        print("IMPORT " + modstr)
        globals()[modstr] = importlib.import_module(modstr)
    else:
        print("RELOAD " + modstr)
        globals()[modstr] = importlib.reload(sys.modules[modstr])

Reversing from module import *

I have a codebase where I'm cleaning up some messy decisions by the previous developer. Frequently, he has done something like:
from scipy import *
from numpy import *
...This, of course, pollutes the name space and makes it difficult to tell where an attribute in the module is originally from.
Is there any way to have Python analyze and fix this for me? Has anyone made a utility for this? If not, how might a utility like this be made?
I think PurityLake's and Martijn Pieters's assisted-manual solutions are probably the best way to go. But it's not impossible to do this programmatically.
First, you need to get a list of all names that exist in the module's dictionary that might be used in the code. I'm assuming your code isn't directly calling any dunder functions, etc.
Then, you need to iterate through them, using inspect.getmodule() to find out which module each object was originally defined in. And I'm assuming that you're not using anything that's been doubly from foo import *-ed. Make a list of all of the names that were defined in the numpy and scipy modules.
Now you can take that output and just replace each foo with numpy.foo.
So, putting it together, something like this:
import inspect
import os
import sys
for modname in sys.argv[1:]:
    with open(modname + '.py') as srcfile:
        src = srcfile.read()
    src = src.replace('from numpy import *', 'import numpy')
    src = src.replace('from scipy import *', 'import scipy')
    mod = __import__(modname)
    for name in dir(mod):
        original_mod = inspect.getmodule(getattr(mod, name))
        if original_mod is None:
            continue  # e.g. plain constants have no defining module
        # Compare the top-level package, since most names live in submodules.
        package = original_mod.__name__.split('.')[0]
        if package == 'numpy':
            src = src.replace(name, 'numpy.'+name)
        elif package == 'scipy':
            src = src.replace(name, 'scipy.'+name)
    # The temporary file must be opened for writing.
    with open(modname + '.tmp', 'w') as dstfile:
        dstfile.write(src)
    os.rename(modname + '.py', modname + '.bak')
    os.rename(modname + '.tmp', modname + '.py')
If either of the assumptions is wrong, it's not hard to change the code. Also, you might want to use tempfile.NamedTemporaryFile and other improvements to make sure you don't accidentally overwrite things with temporary files. (I just didn't want to deal with the headache of writing something cross-platform; if you're not running on Windows, it's easy.) And add in some error handling, obviously, and probably some reporting.
Yes. Remove the imports and run a linter on the module.
I recommend using flake8, although it may also create a lot of noise about style errors.
Merely removing the imports and trying to run the code is probably not going to be enough, as many name errors won't be raised until you run just the right line of code with just the right input. A linter will instead analyze the code by parsing and will detect potential NameErrors without having to run the code.
This all presumes that there are no reliable unit tests, or that the tests do not provide enough coverage.
In this case, where there are multiple from module import * lines, it gets a little more painful in that you need to figure out for each and every missing name what module supplied that name. That will require manual work, but you can simply import the module in a python interpreter and test if the missing name is defined on that module:
>>> import scipy, numpy
>>> 'loadtxt' in dir(numpy)
True
You do need to take into account that, in this specific case, there is overlap between the numpy and scipy modules; for any name defined in both modules, the module imported last wins.
Note that leaving any from module import * line in place means the linter will not be able to detect what names might raise NameErrors!
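The manual lookup can be scripted once the linter has produced a list of missing names. A rough sketch: find_providers is a made-up helper, and math and cmath stand in for numpy and scipy so the snippet runs without third-party packages (they overlap in a similar way). It uses hasattr, so it ignores any __all__ restrictions a module may declare.

```python
import importlib

def find_providers(names, module_names):
    """Map each name to the star-imported modules that define it, in import order."""
    modules = [(m, importlib.import_module(m)) for m in module_names]
    return {
        name: [m for m, mod in modules if hasattr(mod, name)]
        for name in names
    }

# Standard-library stand-ins for the star-imported modules, in import order.
providers = find_providers(["sqrt", "floor", "phase"], ["math", "cmath"])
for name, found in providers.items():
    # With star imports, the last module listed is the one that wins.
    print(name, "->", found)
```

A name listed under more than one module is exactly the overlap case described above, so those are the ones worth checking by hand.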
I've now made a small utility for doing this which I call 'dedazzler'. It will find lines that are 'from module import *', and then expand the 'dir' of the target modules, replacing the lines.
After running it, you still need to run a linter. Here's the particularly interesting part of the code:
import re
import sys
import time
star_match = re.compile(r'from\s(?P<module>[\.\w]+)\simport\s[*]')
now = str(time.time())
error = lambda x: sys.stderr.write(x + '\n')
def replace_imports(lines):
    """
    Iterates through lines in a Python file, looks for 'from module import *'
    statements, and attempts to fix them.
    """
    for line_num, line in enumerate(lines):
        match = star_match.search(line)
        if match:
            newline = import_generator(match.groupdict()['module'])
            if newline:
                lines[line_num] = newline
    return lines
def import_generator(modulename):
    try:
        prop_depth = modulename.split('.')[1:]
        # __import__ returns the top-level package, so walk down to the submodule.
        namespace = __import__(modulename)
        for prop in prop_depth:
            namespace = getattr(namespace, prop)
    except ImportError:
        error("Couldn't import module '%s'!" % modulename)
        return
    directory = [ name for name in dir(namespace) if not name.startswith('_') ]
    return "from %s import %s\n"% (modulename, ', '.join(directory))
I'm maintaining this in a more useful stand-alone utility form here:
https://github.com/USGM/dedazzler/
OK, this is what I think you could do: break the program. Remove the imports and note the errors that are raised. Then import only the modules that you want. This may take a while, but it is the only way I know of doing this. I will be happily surprised if someone knows of a tool to help.
EDIT:
Ah yes, a linter, I hadn't thought of that.

exec statement python and importing module

I'm trying to import a module using an exec statement, but it fails.
code.py
def test(jobname):
    print jobname
    exec ('import ' + jobname)
if __name__ == '__main__':
    test('c:/python27/test1.py')
Error:
Syntax error:
import:c:\python27 est1.py
You probably mean execfile(jobname). And import does not work with filenames. It works with package names. Any good tutorial would cover that. Another issue would be the \t being interpreted as a tab character, but here it is not the case because you are using forward slashes, not backslashes...
Somehow, I think you must be calling
test('c:\python27\test1.py')
instead of
test('c:/python27/test1.py')
The backslash in front of the t is being interpreted as a tab character. Thus the error
import:c:\python27 est1.py
Notice the missing t.
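A quick way to see the escape at work. This is a small sketch; the first backslash is doubled only to avoid an invalid-escape warning on recent Python versions, while the \t is left as it would have been typed.

```python
# "\t" inside an ordinary string literal is a tab character.
broken = "c:\\python27\test1.py"
# A raw string keeps every backslash as written.
raw = r"c:\python27\test1.py"
# Forward slashes avoid the problem entirely.
forward = "c:/python27/test1.py"

print("\t" in broken)   # True
print("\t" in raw)      # False
print(repr(broken))     # shows the tab as \t in place of the backslash and t
```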
Secondly, the import command expects a module name, not a path. For importing, use __import__, not exec or execfile. execfile has been removed from Python 3, so for future compatibility, you may not want to use it in Python 2. exec can be used instead, but there are problems with using exec.
Assuming c:\python27 is in your PYTHONPATH, you could do something like this:
def test(jobname):
    print jobname
    __import__(jobname)
if __name__ == '__main__':
    test('test1')
import sys
def test(jobname):
    print jobname
    a = jobname.split('/')
    b = "/".join(a[0:-1])   # directory part of the path
    c = a[-1][0:-3]         # file name without the '.py' suffix
    sys.path.append(b)
    exec ('import ' + c)
if __name__ == '__main__':
    test('c:/python27/test1.py')
Try this code. The directory containing your module must be added to the sys.path list.
Im trying to import a module using exec statement
Don't do that.
First, do you really need to import a module programmatically? If you tell us what you're actually trying to accomplish, I'm willing to bet we can find the square hole for your square peg, instead of teaching you how to force it into a round hole.
If you do ever need to do this, use the imp module; that's what it's for.
Especially if you want to import a module by path instead of by module name, which is impossible to do with the import statement (and exec isn't going to help you with that).
Here's an example:
import imp
def test(jobname):
    print jobname
    with open(jobname, 'r') as f:
        job = imp.load_module('test', f, jobname, ('.py', 'U', 1))
Of course this doesn't do the same thing that import test1 would do if it were on your sys.path. The module will be at sys.modules['test'] instead of sys.modules['test1'], and in local variable job instead of global variable test1, and it'll reload instead of doing nothing if you've already loaded it. But anyone who has a good reason for doing this kind of thing had better know how to deal with all of those issues.
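On Python 3 the imp module is deprecated, and it was removed in Python 3.12; importlib.util covers the same use case. A minimal sketch, assuming Python 3.5 or later. The helper name load_module_from_path and the module name job are made up for illustration, and a throwaway file stands in for c:/python27/test1.py.

```python
import importlib.util
import os
import sys
import tempfile

def load_module_from_path(module_name, path):
    """Load the file at `path` as a module registered under `module_name`."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module   # register before executing the body
    spec.loader.exec_module(module)
    return module

# Throwaway file used only for the demonstration.
tmpdir = tempfile.mkdtemp()
job_path = os.path.join(tmpdir, "test1.py")
with open(job_path, "w") as f:
    f.write("ANSWER = 42\n")

job = load_module_from_path("job", job_path)
print(job.ANSWER)
```

As with the imp version, the module ends up under whatever name you pass in, not under the file's own name.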

Tool to help eliminate wildcard imports

I'm refactoring and eliminating wildcard imports on some fairly monolithic code.
Pylint seems to do a great job of listing all the unused imports that come along with a wildcard import, but what I wish it did was provide a list of used imports so I can quickly replace the wildcard import. Any quick ways of doing this? I'm about to go parse the output of Pylint and do a set.difference() on this and the dir() of the imported module. But I bet there's some tool/procedure I'm not aware of.
NB: pylint does not recommend a set of used imports. When changing this, you have to be aware of other modules importing the code you are modifying, which could use symbols which belong to the namespace of the module you are refactoring only because you have unused imports.
I recommend the following procedure to refactor from foo import *:
in an interactive shell, type:
import re
import foo as module # XXX use the correct module name here!
module_name = module.__name__
import_line = 'from %s import (%%s)' % module_name
length = len(import_line) - 3
print import_line % (',\n' + length * ' ').join([a for a in dir(module)
                                                 if not re.match('__.*[^_]{2}', a)])
replace the from foo import * line with the one printed above
run pylint, and remove the unused imports flagged by pylint
run pylint again on the whole code base, looking for imports of non-existent symbols
run your unit tests
repeat with from bar import *
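The generate-then-prune steps above can be approximated in one pass by intersecting the names a file uses with the public names of the wildcard-imported module. This is a rough Python 3 sketch, not a replacement for running pylint afterwards: explicit_import_line is a made-up helper, math stands in for foo, and a name that the file redefines locally would be reported as a false positive.

```python
import ast
import importlib

def explicit_import_line(source, module_name):
    """Build 'from module import a, b' from the names that `source` actually uses."""
    module = importlib.import_module(module_name)
    # Prefer __all__ when the module declares it, as a star import would.
    public = set(getattr(module, "__all__", None)
                 or [n for n in dir(module) if not n.startswith("_")])
    used = {node.id for node in ast.walk(ast.parse(source))
            if isinstance(node, ast.Name)}
    needed = sorted(used & public)
    if not needed:
        return None
    return "from %s import %s" % (module_name, ", ".join(needed))

sample = "from math import *\nprint(sqrt(16) + floor(pi))\n"
print(explicit_import_line(sample, "math"))  # from math import floor, pi, sqrt
```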
Here's dewildcard, a very simple tool based on Alex's initial ideas:
https://github.com/quentinsf/dewildcard
This is an old question, but I wrote something that does this based on autoflake.
See here: https://github.com/fake-name/autoflake/blob/master/autostar.py
It works the opposite way dewildcard does, in that it attempts to fully qualify all uses of wildcard items.
E.g.
from os.path import *
Is converted to
import os.path
and all uses of <func> are prefixed with the proper module path, i.e. os.path.<func>.

How to prevent a module from being imported twice?

When writing Python modules, is there a way to prevent a module from being imported twice by client code? Something like what C/C++ header files do:
#ifndef XXX
#define XXX
...
#endif
Thanks very much!
Python modules aren't imported multiple times. Just running import two times will not reload the module. If you want it to be reloaded, you have to use the reload function (importlib.reload in Python 3). Here's a demo.
foo.py is a module with the single line
print("Hello, I am being imported")
And here is a screen transcript of multiple import attempts.
>>> import foo
Hello, I am being imported
>>> import foo # Will not print the statement
>>> reload(foo) # Will print it again
Hello, I am being imported
Imports are cached, and only run once. Additional imports only cost the lookup time in sys.modules.
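The built-in reload used in the transcript exists only in Python 2. A Python 3 version of the same experiment might look like the sketch below; the module name foo_reload_demo and the temporary directory are assumptions made to keep it self-contained.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

# Create a throwaway module equivalent to foo.py from the demo.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "foo_reload_demo.py"), "w") as f:
    f.write('print("Hello, I am being imported")\n')
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

output = io.StringIO()
with contextlib.redirect_stdout(output):
    import foo_reload_demo                        # prints the greeting
    import foo_reload_demo                        # silent: found in sys.modules
    same = importlib.reload(foo_reload_demo)      # prints the greeting again

hello_count = output.getvalue().count("Hello")
print(hello_count, same is sys.modules["foo_reload_demo"])
```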
As specified in other answers, Python generally doesn't reload a module when encountering a second import statement for it. Instead, it returns its cached version from sys.modules without executing any of its code.
However there are several pitfalls worth noting:
Importing the main module as an ordinary module effectively creates two instances of the same module under different names.
This occurs because during program startup the main module is set up with the name __main__. Thus, when importing it as an ordinary module, Python doesn't detect it in sys.modules and imports it again, but with its proper name the second time around.
Consider the file /tmp/a.py with the following content:
# /tmp/a.py
import sys
print "%s executing as %s, recognized as %s in sys.modules" % (__file__, __name__, sys.modules[__name__])
import b
Another file /tmp/b.py has a single import statement for a.py (import a).
Executing /tmp/a.py results in the following output:
root@machine:/tmp$ python a.py
a.py executing as __main__, recognized as <module '__main__' from 'a.py'> in sys.modules
/tmp/a.pyc executing as a, recognized as <module 'a' from '/tmp/a.pyc'> in sys.modules
Therefore, it is best to keep the main module fairly minimal and export most of its functionality to an external module, as advised here.
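The same pitfall can be reproduced on Python 3 without touching /tmp by hand. This sketch writes simplified versions of a.py and b.py to a temporary directory and runs a.py in a child interpreter; the simplified file contents are an assumption and differ from the listing above only in what they print.

```python
import os
import subprocess
import sys
import tempfile

tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "a.py"), "w") as f:
    f.write('print("executing as " + __name__)\nimport b\n')
with open(os.path.join(tmpdir, "b.py"), "w") as f:
    f.write("import a\n")

# Run a.py as the main script; b then imports it a second time as module "a".
result = subprocess.run(
    [sys.executable, "a.py"],
    cwd=tmpdir,
    stdout=subprocess.PIPE,
    universal_newlines=True,
    check=True,
)
lines = result.stdout.splitlines()
print(lines)
```

The body of a.py runs twice, once under the name __main__ and once under the name a, which is the duplication described above.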
This answer specifies two more possible scenarios:
Slightly different import statements utilizing different entries in sys.path leading to the same module.
Attempting another import of a module after a previous one failed halfway through.
