Python App Engine import issues after app is cached

I'm using a modified version of juno (http://github.com/breily/juno/) in Google App Engine. The problem I'm having is I have code like this:
import juno
import pprint

@get('/')
def home(web):
    pprint.pprint("test")

def main():
    run()

if __name__ == '__main__':
    main()
The first time I start the app up in the dev environment it works fine. The second time and every time after that it can't find pprint. I get this error:
AttributeError: 'NoneType' object has no attribute 'pprint'
If I set the import inside the function it works every time:
@get('/')
def home(web):
    import pprint
    pprint.pprint("test")
So it seems like the function is being cached, but for some reason the imports are not included when that cached version is used. I tried removing the main() function at the bottom to see if that would stop the script from being cached, but I get the same problem.
Earlier tonight this code was working fine, I'm not sure what could have changed to cause this. Any insight is appreciated.

I would leave it that way. I saw a SlideShare presentation that Google put out about App Engine optimization which said you can get better performance by keeping imports inside your methods, so they are not imported unless necessary.

Is it possible you are reassigning the name pprint somewhere? The only two ways I know of for a module-level name (like the one you get from the import statement) to become None are if you either assign it yourself (pprint = None) or at interpreter shutdown, when Python's cleanup assigns all module-level names to None as it shuts things down.
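For example, a minimal sketch (not from the original app) that reproduces the same error by rebinding the module-level name:

import pprint

pprint = None  # the module-level name now shadows the imported module

def home(web):
    pprint.pprint("test")  # AttributeError: 'NoneType' object has no attribute 'pprint'

home(None)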

Related

Passing imports to a script from inside an external method

I have kind of a tricky question that is difficult to even describe.
Suppose I have this script, which we will call master:
# in master.py
import slave as slv

def import_func():
    import time

slv.method(import_func)
I want to make sure method in slave.py, which looks like this:
# in slave.py
def method(import_func):
    import_func()
    time.sleep(10)
actually runs as if I had imported the time package. Currently it does not work, I believe because the import exists only in the scope of import_func().
Keep in mind that the rules of the game are:
I cannot import anything in slave.py outside method
I need to pass the imports which method needs through import_func() in master.py
the procedure must work for a variable number of imports inside method. In other words, method cannot know how many imports it will receive but needs to work nonetheless.
the procedure needs to work for any import possible. So options like pyforest are not suitable.
I know it can theoretically be done through importlib, but I would prefer a more straightforward idea, because if we have a lot of imports with different 'as' labels it would become extremely tedious and convoluted with importlib.
I know it is kind of a quirky question but I'd really like to know if it is possible. Thanks
What you can do is this in the master file:
# in master.py
import slave as slv

def import_func():
    import time
    return time

slv.method(import_func)
Now use the returned time value in the slave file:
# in slave.py
def method(import_func):
    time = import_func()
    time.sleep(10)
Why would you have to do this? It's because of scoping. When import_func() is called in slave.py, import time binds the name time only inside import_func's local namespace. When the function terminates, that local name is gone; the module itself stays cached in sys.modules, but nothing in slave.py refers to it anymore.
By returning time from import_func(), you bind the module to a name inside method(), so it remains accessible after import_func() finishes executing.
Now, what if you need to import more modules? Simple: return a list with multiple modules inside, or maybe a dictionary for simpler access. That's one way of doing it.
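For instance, a minimal sketch of the dictionary variant without importlib (the module choices here are illustrative, not from the original code):

# in master.py
import slave as slv

def import_func():
    import time
    import math
    return {"time": time, "math": math}

slv.method(import_func)

# in slave.py
def method(import_func):
    mods = import_func()
    mods["time"].sleep(1)
    print(mods["math"].pi)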
[Edit] Using a dictionary and importlib to pass multiple imports to slave.py:
master.py:
import slave as slv
import importlib

def master_import(packname, imports={}):
    imports[packname] = importlib.import_module(packname)

def import_func():
    imports = {}
    master_import('time', imports)
    return imports

slv.method(import_func)
slave.py:
# in slave.py
def method(import_func):
    imports = import_func()
    imports['time'].sleep(10)
This way, you can import literally any module you want on the master.py side, using the master_import() function, and pass it to the slave script.
Check this answer on how to use importlib.

Shared global state in a qt5 python 3.6 program

I am attempting to use a module named "global_data" to save global state information without success. The code is getting large already, so I'll try to post only the bare essentials.
from view import cube_control
from ioserver.ioserver import IOServer
from manager import config_manager, global_data

if __name__ == "__main__":
    # sets up initial data
    config_manager.init_manager()
    # modifies data
    io = IOServer()
    # verify global data modified from IOServer.__init__
    global_data.test()  # success
    # start pyqt GUI
    cube_control.start_view()
So far so good. However in the last line cube_control.start_view() it enters this code:
# inside cube_control.py
def start_view():
    # verify global data modified from IOServer.__init__
    global_data.test()  # fail ?!?!
    app = QApplication(sys.argv)
    w = MainWindow()
    sys.exit(app.exec_())
Running global_data.test() in this case fails. Printing the entire global state reveals it has now somehow reverted back to the data set up by config_manager.init_manager().
How is this possible?
While Qt is running I have a scheduler called every 10 seconds, also reporting a failed test.
However, once the Qt GUI is stopped (by clicking "x") and I run the test from the console, it succeeds again.
Inside the global_data module I've attempted to store the data in a dict inside both a simple python object as well as a ZODB in-memory database:
# inside global_data
import ZODB

state = {
    "units": {}
}

db = ZODB.DB(None)  # creates an in-memory db

def test(identity="no-id"):
    con = db.open()
    r = con.root()
    print("test online: ", r["units"]["local-test"]["online"], identity)
    con.close()
Both have the exact same problem. Above, the test is only shown using the db.
The reason I attempted to use a db is that I understand threads can create a completely new global dictionary. However, the first 2 tests are in the same thread. The cyclic one is in its own thread and could potentially create such a problem...?
File organization
If it helps my program is organized with the following structure:
There is also a "view" folder with some qt5 GUI files.
The IOServer attempts to connect to a bunch of OPC-UA servers using the opcua module. No threads are manually started there, although I suppose the opcua module does to stay connected.
global_data id()
I attempted to also print(id(global_data)) together with the tests and found that the ID is the same in IOServer AND the top-level code, but changes inside cube_control.py's start_view. Shouldn't these always refer to the same module?
I'm still not sure what exactly happened, but apparently this was solved by removing the __init__.py file inside the folder named manager. Now all imports of the module named "global_data" point to the same ID.
How using an __init__.py file caused a second instance of the same module remains a mystery.
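One way to check for this kind of duplicate-module problem is to drop a small diagnostic next to each global_data.test() call; if the same file shows up in sys.modules under two different names, the two call sites are holding two separate module objects, each with its own state. A minimal sketch (not from the original code):

import sys

print(id(global_data), global_data.__name__)

# list every loaded module that was created from global_data.py
for name, mod in list(sys.modules.items()):
    if (getattr(mod, "__file__", "") or "").endswith("global_data.py"):
        print(name, id(mod))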

Access part of module from pytest

I have an issue accessing part of imported module from the pytest.
Here is branch with code referenced below: https://github.com/asvc/snapshotr/tree/develop
In particular, when running this test, it works as expected for test_correct_installation() but test_script_name_checking() fails with AttributeError.
import main as ss
import os

class TestInit:
    def test_correct_installation(self):
        assert os.path.exists(ss.snapr_path)
        assert os.path.isfile(ss.snapr_path + "/main/markup.py")
        assert os.path.isfile(ss.snapr_path + "/main/scandir.py")

    def test_script_name_checking(self):
        assert ss.ssPanel.check_script('blah') is None  # Here it fails
Link to the main which is being tested
What I'm trying to do is to "extract" an isolated piece of code, run it with known data, and compare the result to some reference. It seems like the extraction part doesn't work quite well; best practices for such cases would be greatly appreciated.
Traceback:
AttributeError: 'module' object has no attribute 'ssPanel'
I have tried a small hack in the test_init.py:
class dummy():
    pass

nuke = dummy()
nuke.GUI = True
But it (obviously) doesn't work as nuke.GUI is being redefined in __init__.py upon every launch.
This is quite a complex situation. When you import main in test_init.py, it will import main/__init__.py and execute all the code in it. This will cause nuke to be imported, and if nuke.GUI is False, ssPanel will never be defined, as you can see.
The problem is that you can't fake a dummy nuke in the test script. It won't work, because by the time the test is running, the real nuke has already been imported.
My suggestion would be to separate ssPanel into another Python file. Then in __init__.py we can do:
if nuke.GUI:
    from sspanel import ssPanel
And in test scripts, we can also easily import it using:
from main.sspanel import ssPanel

Test if code is executed from within a py.test session

I'd like to connect to a different database if my code is running under py.test. Is there a function to call or an environment variable that I can test that will tell me if I'm running under a py.test session? What's the best way to handle this?
A simpler solution I came to:
import sys

if "pytest" in sys.modules:
    ...
The pytest runner will always load the pytest module, making it available in sys.modules.
Of course, this solution only works if the code you're trying to test does not use pytest itself.
There's also another way documented in the manual:
https://docs.pytest.org/en/latest/example/simple.html#pytest-current-test-environment-variable
Pytest will set the following environment variable PYTEST_CURRENT_TEST.
Checking the existence of said variable should reliably allow one to detect if code is being executed from within the umbrella of pytest.
import os

if "PYTEST_CURRENT_TEST" in os.environ:
    # We are running under pytest, act accordingly...
    ...
Note
This method works only when an actual test is being run.
This detection will not work when modules are imported during pytest collection.
A solution came from RTFM, although not in an obvious place. The manual also had an error in code, corrected below.
Detect if running from within a pytest run
Usually it is a bad idea to make application code behave differently if called from a test. But if you absolutely must find out if your application code is running from a test you can do something like this:
# content of conftest.py
def pytest_configure(config):
    import sys
    sys._called_from_test = True

def pytest_unconfigure(config):
    import sys  # This was missing from the manual
    del sys._called_from_test
and then check for the sys._called_from_test flag:
if hasattr(sys, '_called_from_test'):
    # called from within a test run
    ...
else:
    # called "normally"
    ...
accordingly in your application. It's also a good idea to use your own application module rather than sys for holding the flag.
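A minimal sketch of that last suggestion, assuming your application package is called myapp (the name is illustrative):

# content of conftest.py
def pytest_configure(config):
    import myapp
    myapp.called_from_test = True

# somewhere in application code
import myapp

if getattr(myapp, "called_from_test", False):
    ...  # connect to the test database
else:
    ...  # connect to the real database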
Working with pytest==4.3.1, the methods above failed for me, so I just went old school and checked with:
import os
import sys

script_name = os.path.basename(sys.argv[0])
if script_name in ['pytest', 'py.test']:
    print('Running with pytest!')
While the hack explained in the other answer (http://pytest.org/latest/example/simple.html#detect-if-running-from-within-a-pytest-run) does indeed work, you could probably design the code in such a way that you would not need to do this.
If you design the code to take the database to connect to as an argument somehow, via a connection or something else, then you can simply inject a different argument when you're running the tests than when the application drives this. Your code will end up with less global state and will be more modular and reusable. So to me it sounds like an example where testing drives you to design the code better.
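A minimal sketch of that idea using sqlite3 (the function name and paths are illustrative, not from the original question):

import sqlite3

def fetch_rows(db_path):
    # the caller decides which database to connect to
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT)")
        return conn.execute("SELECT name FROM items").fetchall()
    finally:
        conn.close()

# application code passes the real database file
rows = fetch_rows("app.db")

# test code injects a throwaway in-memory database instead
def test_fetch_rows():
    assert fetch_rows(":memory:") == []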
This could be done by setting an environment variable inside the testing code. For example, given a project
conftest.py
mypkg/
    __init__.py
    app.py
tests/
    test_app.py
In test_app.py you can add
import os
os.environ['PYTEST_RUNNING'] = 'true'
And then you can check inside app.py:
import os

if os.environ.get('PYTEST_RUNNING', '') == 'true':
    print('pytest is running')

Python: intercept a class loading action

Summary: when a certain python module is imported, I want to be able to intercept this action, and instead of loading the required class, I want to load another class of my choice.
Reason: I am working on some legacy code. I need to write some unit test code before I start some enhancement/refactoring. The code imports a certain module which will fail in a unit test setting, however. (Because of database server dependency)
Pseudo Code:
from LegacyDataLoader import load_me_data

...

def do_something():
    data = load_me_data()
So, ideally, when Python executes the import line above in a unit test, an alternative class, say MockDataLoader, is loaded instead.
I am still using 2.4.3. I suppose there is an import hook I can manipulate.
Edit
Thanks a lot for the answers so far. They are all very helpful.
One particular type of suggestion is about manipulation of PYTHONPATH. It does not work in my case, so I will elaborate on my particular situation here.
The original codebase is organised in this way
./dir1/myapp/database/LegacyDataLoader.py
./dir1/myapp/database/Other.py
./dir1/myapp/database/__init__.py
./dir1/myapp/__init__.py
My goal is to enhance the Other class in the Other module. But since it is legacy code, I do not feel comfortable working on it without strapping a test suite around it first.
Now I introduce this unit test code
./unit_test/test.py
The content is simply:
from myapp.database.Other import Other

def test1():
    o = Other()
    o.do_something()

if __name__ == "__main__":
    test1()
When the CI server runs the above test, the test fails. It is because class Other uses LegacyDataLoader, and LegacyDataLoader cannot establish a database connection to the db server from the CI box.
Now let's add a fake class as suggested:
./unit_test_fake/myapp/database/LegacyDataLoader.py
./unit_test_fake/myapp/database/__init__.py
./unit_test_fake/myapp/__init__.py
Modify the PYTHONPATH to
export PYTHONPATH=unit_test_fake:dir1:unit_test
Now the test fails for another reason
File "unit_test/test.py", line 1, in <module>
from myapp.database.Other import Other
ImportError: No module named Other
It has something to do with the way Python resolves classes/attributes in a module.
You can intercept import and from ... import statements by defining your own __import__ function and assigning it to __builtin__.__import__ (make sure to save the previous value, since your override will no doubt want to delegate to it; and you'll need to import __builtin__ to get the builtin-objects module).
For example (Py2.4 specific, since that's what you're asking about), save in aim.py the following:
import __builtin__

realimp = __builtin__.__import__

def my_import(name, globals={}, locals={}, fromlist=[]):
    print 'importing', name, fromlist
    return realimp(name, globals, locals, fromlist)

__builtin__.__import__ = my_import

from os import path
and now:
$ python2.4 aim.py
importing os ('path',)
So this lets you intercept any specific import request you want, and alter the imported module[s] as you wish before you return them -- see the specs here. This is the kind of "hook" you're looking for, right?
There are cleaner ways to do this, but I'll assume that you can't modify the file containing from LegacyDataLoader import load_me_data.
The simplest thing to do is probably to create a new directory called testing_shims, and create a LegacyDataLoader.py file in it. In that file, define whatever fake load_me_data you like. When running the unit tests, put testing_shims into your PYTHONPATH environment variable as the first directory. Alternately, you can modify your test runner to insert testing_shims as the first value in sys.path.
This way, your file will be found when importing LegacyDataLoader, and your code will be loaded instead of the real code.
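A minimal sketch of the sys.path variant (testing_shims follows the layout described above; module_under_test stands in for whatever module does the real import):

# in the test runner or at the top of the test module
import sys
sys.path.insert(0, "testing_shims")  # the fake LegacyDataLoader.py is found first

import module_under_test  # its "from LegacyDataLoader import load_me_data" now picks up the fake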
The import statement just grabs stuff from sys.modules if a matching name is found there, so the simplest thing is to make sure you insert your own module into sys.modules under the target name before anything else tries to import the real thing.
# in test code
import sys
import MockDataLoader
sys.modules['LegacyDataLoader'] = MockDataLoader
import module_under_test
There are a handful of variations on the theme, but that basic approach should work fine to do what you describe in the question. A slightly simpler approach would be this, using just a mock function to replace the one in question:
# in test code
import module_under_test
def mock_load_me_data():
# do mock stuff here
module_under_test.load_me_data = mock_load_me_data
That simply replaces the appropriate name right in the module itself, so when you invoke the code under test, presumably do_something() in your question, it calls your mock routine.
Well, if the import fails by raising an exception, you could put it in a try...except block:
try:
    from LegacyDataLoader import load_me_data
except:  # put error that occurs here, so as not to mask actual problems
    from MockDataLoader import load_me_data
Is that what you're looking for? If it fails, but doesn't raise an exception, you could have it run the unit test with a special command line tag, like --unittest, like this:
import sys

if "--unittest" in sys.argv:
    from MockDataLoader import load_me_data
else:
    from LegacyDataLoader import load_me_data
