Python __getattr__ executed multiple times - python

I've been trying to implement the __getattr__ function as in the following example:
PEP 562 -- Module __getattr__ and __dir__
And I don't get why this simple piece of code:
# lib.py
def __getattr__(name):
    print(name)

# main.py
from lib import test
outputs:
__path__
test
test
What is __path__? Why is it sent to __getattr__? Why is test sent twice?

TL;DR: the first "test" printed is a side effect of how "from ... import" is implemented: the import machinery checks whether lib already has a test attribute. The second "test" comes from the subsequent access of the dynamic attribute on the module itself.
Knowing that importlib is implemented in Python code, modify your lib.py slightly to also dump a trace:
# lib.py
from traceback import print_stack

def __getattr__(name):
    print_stack()
    print(name)
    print("-" * 80)
This gives the hint to pinpoint the library location in importlib which triggers double attribute access:
$ python3 main.py
  File "main.py", line 3, in <module>
    from lib import test
  File "<frozen importlib._bootstrap>", line 1019, in _handle_fromlist
  File "/private/tmp/lib.py", line 5, in __getattr__
    print_stack()
__path__
--------------------------------------------------------------------------------
  File "main.py", line 3, in <module>
    from lib import test
  File "<frozen importlib._bootstrap>", line 1032, in _handle_fromlist
  File "/private/tmp/lib.py", line 5, in __getattr__
    print_stack()
test
--------------------------------------------------------------------------------
  File "main.py", line 3, in <module>
    from lib import test
  File "/private/tmp/lib.py", line 5, in __getattr__
    print_stack()
test
--------------------------------------------------------------------------------
Now we can find the answer easily by RTFS (below I use Python v3.7.6; if you run a different version, check out the matching tag in git). Look at importlib._bootstrap._handle_fromlist at the indicated line numbers.
_handle_fromlist is a helper intended to load package submodules in a from import. Step 1 is to see if the module is a package at all:
if hasattr(module, '__path__'):
The __path__ access comes there, on line 1019. Because your __getattr__ returns None for all inputs, hasattr returns True here, so your module looks like a package, and the code continues on. (If hasattr had returned False, _handle_fromlist would abort at this point.)
The "fromlist" here will have the name you requested, ["test"], so we go into the for-loop with x="test" and on line 1032 there is the "extra" invocation:
elif not hasattr(module, x):
from lib import test will only attempt to load a lib.test submodule if lib does not already have a test attribute. This check is testing whether the attribute exists, to see if _handle_fromlist needs to attempt to load a submodule.
Should you return different values for the first and second invocation of __getattr__ with name "test", then the second value returned is the one which will actually be received within main.py.

Related

Nosetests: AttributeError using @patch when running Nosetests on multiple files

I am running nosetests across multiple files and getting an error related to the importing of a specific file. I'm not actually sure what the error is related to; I think it is either a problem with the import or a problem with the patching of it. The error itself looks like this:
(I'm getting one of these errors for each test function that uses an @patch decorator)
Error
Traceback (most recent call last):
File "/home/user/Documents/venvs/migration/local/lib/python2.7/site-packages/unittest2/case.py", line 67, in testPartExecutor
yield
File "/home/user/Documents/venvs/migration/local/lib/python2.7/site-packages/unittest2/case.py", line 625, in run
testMethod()
File "/home/user/Documents/venvs/migration/local/lib/python2.7/site-packages/mock/mock.py", line 1297, in patched
arg = patching.__enter__()
File "/home/user/Documents/venvs/migration/local/lib/python2.7/site-packages/mock/mock.py", line 1353, in __enter__
self.target = self.getter()
File "/home/user/Documents/venvs/migration/local/lib/python2.7/site-packages/mock/mock.py", line 1523, in <lambda>
getter = lambda: _importer(target)
File "/home/user/Documents/venvs/migration/local/lib/python2.7/site-packages/mock/mock.py", line 1210, in _importer
thing = _dot_lookup(thing, comp, import_path)
File "/home/user/Documents/venvs/migration/local/lib/python2.7/site-packages/mock/mock.py", line 1200, in _dot_lookup
return getattr(thing, comp)
AttributeError: 'module' object has no attribute 'utils'
The package structure looks like this:
my_package
    - my_module
        - __init__.py
        - utils.py
        - other.py
    - tests
        - test_utils.py
        - test_other.py
The nosetests command:
nosetests -e unit --with-coverage --cover-package=my_package --cover-erase --cover-xml --with-xunit tests --nocapture
So the weird thing is, if I run nosetests only on the utils test class itself, it runs fine, all imports work and all patches work, no errors, all tests pass.
Here's what the test_utils.py file looks like:
import unittest2
from mock import patch

from my_module.utils import *


class TestBusinessProcess(unittest2.TestCase):

    @patch('my_module.utils.something')
    def test_some_utils_function(self, something_mock):
        # test implementation..
        # this function will throw:
        # AttributeError: 'module' object has no attribute 'utils'
        # when running whole tests folder and not on individual test file
        pass

    @patch('my_module.utils.something_else')
    def test_some_other_utils_function(self, something_else_mock):
        # test implementation..
        # same as above
        pass
An example of a test in the other test file that has no issues when run either way:
import unittest2
from mock import patch

from my_module.other import *


class TestBusinessProcess(unittest2.TestCase):

    @patch('my_module.other.something')
    def test_some_function(self, something_mock):
        # test implementation..
        # no issues!
        pass

    @patch('my_module.other.something_else')
    def test_some_other_function(self, something_else_mock):
        # test implementation..
        # no issues!
        pass
Any help greatly appreciated.
I still have no idea what was wrong, but it seemed to be something to do with the importing.
Anyway, this workaround solved the problem, but not sure why exactly.
The __init__.py in my_module was empty initially. Then I edited it to expose the individual functions of utils.py:
# __init__.py
from utils import test_some_utils_function, test_some_other_utils_function

__all__ = [
    "test_some_utils_function",
    "test_some_other_utils_function"
]
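The traceback shows patch resolving its target through _importer and _dot_lookup. A rough sketch of that kind of lookup, written from the traceback rather than copied from mock, can help check which module object a dotted name actually resolves to. Here resolve_dotted is a hypothetical helper and json.decoder.JSONDecoder is only a stand-in target:

```python
import importlib
import json


def resolve_dotted(target):
    """Walk a dotted name: import the first component, then getattr the rest."""
    components = target.split(".")
    import_path = components.pop(0)
    thing = importlib.import_module(import_path)
    for comp in components:
        import_path += "." + comp
        try:
            thing = getattr(thing, comp)
        except AttributeError:
            # Not yet an attribute: try importing it as a submodule, then retry
            importlib.import_module(import_path)
            thing = getattr(thing, comp)
    return thing


resolved = resolve_dotted("json.decoder.JSONDecoder")
print(resolved is json.decoder.JSONDecoder)
# The __file__ of the top-level module shows which file the name maps to
print(importlib.import_module("json").__file__)
```

Printing the __file__ of my_module in the failing run, in the same way, would show whether the my_module found there is the one you expect.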

Parallel Python: Passing a function written in another module to 'submit'

I am using the Parallel Python module (pp), and want to submit a job to a worker. However, the function that I want to execute is in another module (written with Cython), and I don't know how to import the function name to the new worker. The method suggested here, i.e. importing the module "walkerc" inside the function, cannot work since walk itself is defined in walkerc (the compiled file "walkerc.so").
import pp
from walkerc import walk
# Other stuff here
ser = pp.Server()
# Some more definitions
ser.submit(walk, (it, params))
ser.submit(walk, (1000, params), modules = ("walkerc",), globals = globals())
Both statements above fail; I get the following error:
Traceback (most recent call last):
File "", line 1, in
ser.submit(walk, (1000, params), modules = ("walkerc",), globals = globals())
File "/usr/lib/python2.7/site-packages/pp.py", line 458, in submit
sfunc = self.__dumpsfunc((func, ) + depfuncs, modules)
File "/usr/lib/python2.7/site-packages/pp.py", line 629, in __dumpsfunc
sources = [self.__get_source(func) for func in funcs]
File "/usr/lib/python2.7/site-packages/pp.py", line 696, in __get_source
sourcelines = inspect.getsourcelines(func)[0]
File "/usr/lib/python2.7/inspect.py", line 690, in getsourcelines
lines, lnum = findsource(object)
File "/usr/lib/python2.7/inspect.py", line 526, in findsource
file = getfile(object)
File "/usr/lib/python2.7/inspect.py", line 420, in getfile
'function, traceback, frame, or code object'.format(object))
TypeError: <built-in function walk> is not a module, class, method, function, traceback, frame, or code object
The function 'walk' itself is imported properly within the main program, it is the process of submitting it to a new worker that is problematic.
How can I specify the function name 'walk' properly?
I do not want to define 'walk' in the same file as which I have called it because I have modified it in Cython and want to have better performance. Is there an alternative?
Try renaming your walk function to something else, mywalk for example. As the exception text suggests, your environment seems to have a built-in function that goes by the name walk, so the inspect module gets confused.
I can successfully pass my imported walk function like this on my system, no conflict here and nothing more needed, the function gets executed using the given argument:
import pp
from walkerc import walk
pps = pp.Server()
pps.submit(walk, args=(1,))
But passing dir, which is a built-in function for sure:
pps.submit(dir)
I get the exact same error as you do:
Traceback (most recent call last):
File "parallel.py", line 9, in
pps.submit(dir)
...
File ".../lib/python2.7/inspect.py", line 420, in getfile
'function, traceback, frame, or code object'.format(object))
TypeError: <built-in function dir> is not a module, class, method, function, traceback, frame, or code object
Update after the below discussion:
So the problem here is that Python treats the members that come from C extensions as built-ins. The code above works with the regular Python module, but I was able to replicate the OP's error when importing and passing the function from a C extension.
Therefore I wrapped the C extension function call inside a normal Python function, which does the trick. Note that the import of walkerc has moved into the wrapping function, so that the function can construct its own context when dispatched.
import pp

def walk(n):
    import walkerc
    return walkerc.walk(n)

def print_callback(result):
    print('callback: ', result)

pps = pp.Server()
job = pps.submit(walk, args=(1,), callback=print_callback)
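The difference between the two kinds of callables can be checked without pp or Cython. The sketch below uses math.sqrt as a stand-in for a function coming from a C extension (an assumption; walkerc.walk is not available here), and source_lookup_error is a hypothetical helper:

```python
import inspect
import math


def wrapped_sqrt(x):
    # Plain Python wrapper; the import sits inside so the function carries its own context
    import math
    return math.sqrt(x)


def source_lookup_error(func):
    """Return the TypeError raised by inspect for func, or None if there is none."""
    try:
        inspect.getsourcefile(func)
    except TypeError as exc:
        return exc
    return None


# A C-implemented function is reported as a built-in, the wrapper as a function
print(inspect.isbuiltin(math.sqrt), inspect.isfunction(math.sqrt))
print(inspect.isbuiltin(wrapped_sqrt), inspect.isfunction(wrapped_sqrt))
print(source_lookup_error(math.sqrt) is not None)
print(source_lookup_error(wrapped_sqrt) is None)
```

Only the wrapper is something the inspect module can trace back to a source file, which is what pp relies on in the traceback above.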

Why can't I use inspect.getsource() to view the source for list?

I tried to retrieve the source code for the list class using the inspect module, without success:
>>> import inspect
>>> inspect.getsource(list)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/inspect.py", line 701, in getsource
lines, lnum = getsourcelines(object)
File "/usr/lib/python2.7/inspect.py", line 690, in getsourcelines
lines, lnum = findsource(object)
File "/usr/lib/python2.7/inspect.py", line 526, in findsource
file = getfile(object)
File "/usr/lib/python2.7/inspect.py", line 408, in getfile
raise TypeError('{!r} is a built-in class'.format(object))
TypeError: <module '__builtin__' (built-in)> is a built-in class
I don't understand why this didn't work - the documentation for inspect.getsource() says that
An IOError is raised if the source code cannot be retrieved.
... but doesn't explain why that might happen (and in any case I got a TypeError, not an IOError).
Is there some other way I can programmatically retrieve the source for an object in cases like this? If not, how can I find the source for myself?
While inspect.getsource() can retrieve the source code for objects written in Python, list is written in C, so there's no Python source for getsource() to retrieve.
If you're comfortable reading C, you can find the complete source code for Python at its official GitHub repo. For example, the source of list for various releases can be found at:
https://github.com/python/cpython/blob/master/Objects/listobject.c (latest development version)
https://github.com/python/cpython/blob/3.6/Objects/listobject.c
https://github.com/python/cpython/blob/2.7/Objects/listobject.c
... and so on.
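For code that has to cope with both kinds of objects, a small guard is one option. This is a sketch only: try_getsource is a hypothetical helper, and json.dumps is just an example of a function written in Python:

```python
import inspect
import json


def try_getsource(obj):
    """Return the Python source of obj, or None when inspect cannot provide it."""
    try:
        return inspect.getsource(obj)
    except (TypeError, OSError):
        # TypeError: built-in (C) object; OSError/IOError: source file not available
        return None


print(try_getsource(list))
source = try_getsource(json.dumps)
print(source.splitlines()[0] if source else "no source available")
```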

Meaning of absolute/relative paths in python stack trace

I ran top_level_script.py and got an exception with a stack trace like:
File "top_level_script.py", line 114, in main
…
File "top_level_script.py", line 91, in func1
...
File "top_level_script.py", line 68, in func2
**kwargs)
File "/home/max/.../cccc/ffff/mmmm.py", line 69, in some_func
obj = SomeClass(…)
File "mmm/ttt/bbb/core.py", line 17, in __init__
File "/home/max/.../pppp/pppp.py", line 474, in func
...
File "/home/max/.../pppp/pppp.py", line 355, in some_func
...
Notice that mmm/ttt/bbb/core.py has a relative path while the frame above and below it have absolute paths. Also, there is no print out of line 17, in __init__, and the code being called was "old". I just changed it, but old code was getting called. Hence the exception.
I still find Python's import mechanics confusing at times. Can anyone explain what's going on with core.py and what the significance is, if any, of the relative path shown in that frame?
After some tinkering, my hypothesis was that python was somehow calling the .pyc (hence no source shown in the line below). After tinkering with the file (i.e. changing and saving it), I now get:
File "top_level_script.py", line 114, in main
…
File "top_level_script.py", line 91, in func1
...
File "top_level_script.py", line 68, in func2
**kwargs)
File "/home/max/.../cccc/ffff/mmmm.py", line 69, in some_func
obj = SomeClass(…)
File "/home/max/.../mmm/ttt/bbb/core.py", line 17, in __init__
...
File "/home/max/.../pppp/pppp.py", line 474, in func
...
File "/home/max/.../pppp/pppp.py", line 355, in some_func
...
Now, I can't reproduce the effect but I am still curious if anyone knows what may have happened.
In general, Python is being transparent about how it understands the name of the file.
At startup, Python initializes the variable sys.path from, among other things, the environment variable PYTHONPATH, and every import searches the entries of sys.path.
Path components in sys.path can be absolute or relative. A common relative path name is . (the current working directory).
If, while performing the import, the entry found in sys.path is a relative path, then the file name that appears in the stack trace will also be relative. I also think that if a Python program uses a relative import, then that too appears as a relative file name.
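To see how your own interpreter records file names, you can import the same module once through a relative sys.path entry and once through an absolute one. This is a sketch under assumptions: the directory name relative_dir, the module demo_mod and the helper filename_seen_by_traceback are all made up, and newer interpreters may normalize relative entries, so the code prints the result instead of assuming it:

```python
import importlib
import os
import sys
import tempfile


def filename_seen_by_traceback(path_entry, module_name):
    """Import module_name via path_entry and return the file name stored in its code."""
    sys.path.insert(0, path_entry)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(module_name)
        return module.where.__code__.co_filename
    finally:
        # Undo the changes so the second import starts from a clean state
        sys.path.remove(path_entry)
        sys.modules.pop(module_name, None)


with tempfile.TemporaryDirectory() as root:
    pkg_dir = os.path.join(root, "relative_dir")
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, "demo_mod.py"), "w") as fh:
        fh.write("def where():\n    pass\n")
    old_cwd = os.getcwd()
    os.chdir(root)
    try:
        relative_name = filename_seen_by_traceback("relative_dir", "demo_mod")
        absolute_name = filename_seen_by_traceback(pkg_dir, "demo_mod")
    finally:
        os.chdir(old_cwd)

print(relative_name)
print(absolute_name)
```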

Can't import django packages with nosegae

I am trying to get started with nosegae, but I can't get it to pass even the simplest of cases when using Django.
When running without the --without-sandbox flag, both of the following tests fail:
def test_import_django():
    import django

def test_import_django_http():
    import django.http
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\nose-1.1.2-py2.7.egg\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "C:\Users\User\Desktop\TDD_GAE\myproj\tests.py", line 2, in test_import_django
    import django
  File "C:\Python27\lib\site-packages\nosegae-0.1.9-py2.7.egg\nosegae.py", line 207, in find_module
    return super(HookMixin, self).find_module(fullname, path)
  File "C:\Program Files (x86)\Google\google_appengine\google\appengine\tools\dev_appserver.py", line 1505, in Decorate
    return func(self, *args, **kwargs)
  File "C:\Program Files (x86)\Google\google_appengine\google\appengine\tools\dev_appserver.py", line 1998, in find_module
    search_path)
  File "C:\Program Files (x86)\Google\google_appengine\google\appengine\tools\dev_appserver.py", line 1505, in Decorate
    return func(self, *args, **kwargs)
  File "C:\Program Files (x86)\Google\google_appengine\google\appengine\tools\dev_appserver.py", line 2119, in FindModuleRestricted
    result = self.FindPathHook(submodule, submodule_fullname, path_entry)
  File "C:\Program Files (x86)\Google\google_appengine\google\appengine\tools\dev_appserver.py", line 2219, in FindPathHook
    return self._imp.find_module(submodule, [path_entry])
However, if I do use --without-sandbox, at least the first test passes:
myproj.tests.test_import_django ... ok
myproj.tests.test_import_django_http ... ERROR
======================================================================
ERROR: myproj.tests.test_import_django_http
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\nose-1.1.2-py2.7.egg\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "C:\Users\User\Desktop\TDD_GAE\myproj\tests.py", line 5, in test_import_django_http
    import django.http
  File "C:\Program Files (x86)\Google\google_appengine\lib\django_1_2\django\http\__init__.py", line 9, in <module>
    from mod_python.util import parse_qsl
  File "C:\Python27\lib\site-packages\nosegae-0.1.9-py2.7.egg\nosegae.py", line 199, in find_module
    mod_path = self.find_mod_path(fullname)
  File "C:\Python27\lib\site-packages\nosegae-0.1.9-py2.7.egg\nosegae.py", line 251, in find_mod_path
    _sf, path, _desc= self._imp.find_module(top, None)
AttributeError: 'str' object has no attribute 'find_module'
Has anyone encountered this and knows how I can get past it?
Edit
It seems that the issue is recursive imports:
def test_import_pdb():
    import pdb
    pdb.set_trace()
Part of the stack trace is:
File "C:\Python27\lib\pdb.py", line 72, in __init__
import readline
Notice that an import in the __init__ of django.http is also part of the stack trace.
Read https://docs.djangoproject.com/en/dev/topics/testing/ about Django testing.
As far as I know, it's better to use the unittest or doctest support shipped with Django, as it has several improvements for Django-specific testing, such as form field output testing and some database features. However, it's not essential, and if you want to continue using nose, I think you missed the Django environment setup:
from django.test.utils import setup_test_environment
setup_test_environment()
These lines are needed to run your tests outside of ./manage.py test
UPD
Yeah, my previous thoughts were wrong. I dug into the sources of nose and nose-gae, and here is what I think: check the HardenedModulesHook definition in your nose version, because in the trunk of nose I found the following:
class HardenedModulesHook(object):
    ...
    def __init__(self,
                 module_dict,
                 imp_module=imp,
                 os_module=os,
                 dummy_thread_module=dummy_thread,
                 pickle_module=pickle):
        ...
This leads to the following: when the noseGAE plugin's begin() method is executed, it calls self._install_hook(dev_appserver.HardenedModulesHook), which declares a mixed-in hook class and creates its instance with self.hook = Hook(sys.modules, self._path). HardenedModulesHook.__init__ is therefore called with the mysterious '_path' as its second argument, whereas in nose this argument should be the 'imp' module by default. That produces the exception you got:
_sf, path, _desc= self._imp.find_module(top, None)
AttributeError: 'str' object has no attribute 'find_module'
So I think it might be a problem with nose-gae :(
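The mismatch can be reproduced in isolation. In the sketch below FakeHook and FakeImp are hypothetical stand-ins (not the real HardenedModulesHook or the imp module), and only the argument order matters: a string passed as the second positional argument ends up where an imp-like object is expected.

```python
class FakeImp(object):
    """Hypothetical stand-in for the imp module."""

    def find_module(self, name, path):
        return (None, "/fake/" + name, None)


class FakeHook(object):
    """Hypothetical stand-in for the hook; the second argument must be imp-like."""

    def __init__(self, module_dict, imp_module=None):
        self._module_dict = module_dict
        self._imp = imp_module

    def find_mod_path(self, top):
        _sf, path, _desc = self._imp.find_module(top, None)
        return path


message = None
broken = FakeHook({}, "/some/path")  # a path string lands in the imp_module slot
try:
    broken.find_mod_path("mod_python")
except AttributeError as exc:
    message = str(exc)
print(message)

working = FakeHook({}, imp_module=FakeImp())
print(working.find_mod_path("mod_python"))
```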
