Lambda Handler to invoke Python script directly

I'm not a Python developer but I got stuck with taking over the maintenance of a Python 3.9 project that currently runs as a cronjob on an EC2 instance. I need to get it running from AWS Lambda. Currently the crontab on the EC2 instance invokes a cronjob/run.py script that looks a little like this:
import os
import sys
from dotenv import load_dotenv
load_dotenv()
sync_events = get_sync_events()
# lots more stuff down here
The important thing here is that there is no __main__ method invoked. The crontab just treats this Python source file like a script and executes it from top to bottom.
My understanding is that the Lambda Handler needs a main method to be invoked. So I need a way to run the existing cronjob/run.py (that again, has no main entry point) from inside the Lambda Handler, somehow:
def lambda_handler(event, context):
    try:
        # run everything that's in cronjob/run.py right here
        pass
    except Exception as e:
        raise e
if __name__ == '__main__':
    lambda_handler(None, None)
So my question: do I need my Lambda Handler to have a __main__ method like the above, or is it possible to configure my Lambda to just call cronjob/run.py directly? If not, what are the best options here? Thanks in advance!

do I need my Lambda Handler to have a main method
No, you don't.
If you just want to run run.py with lambda, you can keep things simple and just use:
import os
import sys
from dotenv import load_dotenv
def main(event, context):
    load_dotenv()
    sync_events = get_sync_events()
    # lots more stuff down here
and configure the lambda function to have run.main as the handler.
The name of the handler function, in this case main, can be anything, but it must have event and context as arguments.
You can find more information on lambda handler here: https://docs.aws.amazon.com/lambda/latest/dg/python-handler.html
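If run.py has to stay exactly as it is, another option might be a thin handler that executes the file the way the crontab did. The sketch below uses the standard library's runpy.run_path for that. The location of cronjob/run.py inside the deployment package and the returned dictionary are assumptions made for illustration, not details from the original project.

```python
import os
import runpy


def run_script(path):
    """Execute a top-level script as if it had been started with `python path`."""
    # run_name="__main__" makes any `if __name__ == '__main__':` guard fire too.
    return runpy.run_path(path, run_name="__main__")


def lambda_handler(event, context):
    # LAMBDA_TASK_ROOT is set by the Lambda runtime to the directory holding the
    # function code; fall back to the current directory for local runs.
    root = os.environ.get("LAMBDA_TASK_ROOT", ".")
    # Assumption: the package keeps the script at cronjob/run.py.
    run_script(os.path.join(root, "cronjob", "run.py"))
    return {"status": "ok"}
```

With this layout the handler setting would name this file and lambda_handler, and the script itself would not need to change. Note that the script's module-level code runs again on every invocation, which matches the cron behaviour.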

Related

How to mock a function within a python script tested with pytest-console-scripts?

I need to test some legacy code, among which there are a number of Python scripts.
By script I mean Python code not within a class or module, just within a unique file and executed with python script.py
Here is a example oldscript.py:
import socket
def testfunction():
    return socket.gethostname()
def unmockable():
    return "somedata"
if __name__ == '__main__':
    result = testfunction()
    result = unmockable()
    print(result)
I'm using pytest-console-scripts to test this, as its "inprocess" launcher makes it possible to actually mock some things.
AFAIU, there's no way to mock any call made within a Python script when it is run with subprocess.
pytest-console-scripts makes this possible, and indeed mocks to external functions work.
Here's a test case for the above :
import socket
from pytest_console_scripts import ScriptRunner
from pytest_mock import MockerFixture
class TestOldScript:
    def test_success(self, script_runner: ScriptRunner, mocker: MockerFixture) -> None:
        mocker.patch('socket.gethostname', return_value="myhostname")
        mocker.patch('oldscript.unmockable', return_value="mocked!", autospec=True)
        ret = script_runner.run('oldscript.py', print_result=True, shell=True)
        socket.gethostname.assert_called_with()
        assert ret.success
        assert ret.stdout == 'mocked!'
        assert ret.stderr is None
This is failing as unmockable cannot be mocked this way.
The call to socket.gethostname() can be successfully mocked, but can the unmockable function be mocked? That is my issue.
Would there be another strategy to test such Python scripts and be able to mock internal functions?

The problem here is that when the script is executed, oldscript.py is not imported under the name oldscript; it runs as __main__ (that's why the condition of the if at the bottom of the script is true). Your code successfully patches oldscript.unmockable, but the script is calling __main__.unmockable, and that one is indeed unmockable.
I see two ways to get around this:
You can split the code that you would like to mock into another module that's imported by the main script. For example if you split oldscript.py into two files like this:
lib.py:
def unmockable():
    return "somedata"
oldscript.py:
import socket
import lib
def testfunction():
    return socket.gethostname()
if __name__ == '__main__':
    result = testfunction()
    print('testfunction:', result)
    result = lib.unmockable()
    print('unmockable:', result)
then you can mock lib.unmockable in the test and everything works as expected.
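The effect of the split can be reproduced without pytest. The following is a minimal sketch, assuming the standard library's runpy as a stand-in for the in-process script runner and unittest.mock in place of pytest-mock. The helper module is called oldscript_lib here (a hypothetical name, chosen only to avoid clashing with an existing lib module), and both files are written to a temporary directory so the example is self-contained.

```python
import runpy
import sys
import tempfile
from pathlib import Path
from unittest import mock

# Hypothetical helper module holding the function we want to mock.
LIB_SOURCE = 'def unmockable():\n    return "somedata"\n'

# The script looks the function up on the imported module at call time.
SCRIPT_SOURCE = (
    "import oldscript_lib\n"
    "if __name__ == '__main__':\n"
    "    result = oldscript_lib.unmockable()\n"
)


def run_script_with_mock():
    """Run the script in-process with oldscript_lib.unmockable patched."""
    workdir = tempfile.mkdtemp()
    Path(workdir, "oldscript_lib.py").write_text(LIB_SOURCE)
    script = Path(workdir, "oldscript.py")
    script.write_text(SCRIPT_SOURCE)
    sys.path.insert(0, workdir)
    try:
        # The patch targets the helper module, not __main__, so it is visible
        # to the script even though the script itself runs as __main__.
        with mock.patch("oldscript_lib.unmockable", return_value="mocked!"):
            script_globals = runpy.run_path(str(script), run_name="__main__")
    finally:
        sys.path.remove(workdir)
        sys.modules.pop("oldscript_lib", None)
    return script_globals["result"]
```

The patch works because the script resolves unmockable through the oldscript_lib module object, which is the same object the test patched.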
Another approach is to use a console_scripts entry point in setup.py. This is a more sophisticated approach that would be a good fit for Python packages that have a setup.py and are installed (e.g. via pip).
When you set up your script to be installed and called this way, it becomes available in the PATH as e.g. oldscript and then you can call it from tests with:
script_runner.run('oldscript') # Without the .py
These installed console scripts are imported and executed using a wrapper that setup.py creates, so oldscript.py would be imported as oldscript and again mocking will work.

Test if code is executed from within a py.test session

I'd like to connect to a different database if my code is running under py.test. Is there a function to call or an environment variable that I can test that will tell me if I'm running under a py.test session? What's the best way to handle this?

A simpler solution I came to:
import sys
if "pytest" in sys.modules:
    ...
Pytest runner will always load the pytest module, making it available in sys.modules.
Of course, this solution only works if the code you're trying to test does not use pytest itself.

There's also another way documented in the manual:
https://docs.pytest.org/en/latest/example/simple.html#pytest-current-test-environment-variable
Pytest will set the following environment variable PYTEST_CURRENT_TEST.
Checking the existence of said variable should reliably allow one to detect if code is being executed from within the umbrella of pytest.
import os
if "PYTEST_CURRENT_TEST" in os.environ:
    # We are running under pytest, act accordingly...
    ...
Note
This method works only when an actual test is being run.
This detection will not work when modules are imported during pytest collection.

A solution came from RTFM, although not in an obvious place. The manual also had an error in code, corrected below.
Detect if running from within a pytest run
Usually it is a bad idea to make application code behave differently
if called from a test. But if you absolutely must find out if your
application code is running from a test you can do something like
this:
# content of conftest.py
def pytest_configure(config):
    import sys
    sys._called_from_test = True
def pytest_unconfigure(config):
    import sys  # This was missing from the manual
    del sys._called_from_test
and then check for the sys._called_from_test flag:
if hasattr(sys, '_called_from_test'):
    # called from within a test run
    pass
else:
    # called "normally"
    pass
accordingly in your application. It’s also a good idea to use your own
application module rather than sys for handling flag.

Working with pytest==4.3.1, the methods above failed, so I just went old school and checked with:
import os
import sys
script_name = os.path.basename(sys.argv[0])
if script_name in ['pytest', 'py.test']:
    print('Running with pytest!')

While the hack explained in the other answer (http://pytest.org/latest/example/simple.html#detect-if-running-from-within-a-pytest-run) does indeed work, you could probably design the code in such a way you would not need to do this.
If you design the code to take the database to connect to as an argument somehow, via a connection or something else, then you can simply inject a different argument when you're running the tests than when the application drives this. Your code will end up with less global state and be more modular and reusable. So to me it sounds like an example where testing drives you to design the code better.
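A minimal sketch of that idea, assuming the application talks to its database through a DB-API connection object. The users table, the count_users function, and the in-memory SQLite database used by the test helper are all hypothetical and stand in for whatever the real application does.

```python
import sqlite3


def count_users(conn):
    """Application code: it uses whatever connection it is handed."""
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


def make_test_connection():
    """Test helper: a throwaway in-memory database with known contents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("ann",), ("bob",)])
    return conn
```

A test would call count_users(make_test_connection()), while the application passes its production connection. Neither code path needs to ask whether pytest is running.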

This could be done by setting an environment variable inside the testing code. For example, given a project
conftest.py
mypkg/
    __init__.py
    app.py
tests/
    test_app.py
In test_app.py you can add
import os
os.environ['PYTEST_RUNNING'] = 'true'
And then you can check inside app.py:
import os
if os.environ.get('PYTEST_RUNNING', '') == 'true':
    print('pytest is running')

Error with custom GAE task queue

I am writing my first "serious" application with AppEngine and have run into some problems with the task queue.
I have read and reproduced the example code that is given in the appengine docs.
When I try to add a Task to a custom queue, though, it doesn't work for me the way it seems to work for others.
What I do is:
from google.appengine.api import taskqueue
class EnterQueueHandler(AppHandler):
    def get(self):
        # some code
        pass
    def post(self):
        key = self.request.get("value")
        task = Task(url='/queue', params={'key':key})
        task.add("testqueue")
        self.redirect("/enterqueue")
And then I have a handler set for "/queue" that does stuff.
The problem is that this throws the following error:
NameError: global name 'Task' is not defined
Why is that? It seems to me I am missing something basic, but I can't figure out what. It says in the docs that the Task-Class is provided by the taskqueue module.
By now I have figured out that it works if I replace the two task-related lines in the code above with the following:
taskqueue.add(queue_name="testqueue", url="/queue", params={"key":key})
But I would like to understand why the other method doesn't work nonetheless. It would be very nice if someone could help me out here.

From the documentation
Task is provided by the google.appengine.api.taskqueue module.
Since you have already imported
from google.appengine.api import taskqueue
You can replace this line:
task = Task(url='/queue', params={'key':key})
with
task = taskqueue.Task(url='/queue', params={'key':key})

I think the reason it does not work is that "Task" is not imported. Below is an example that I use all the time successfully. It looks just like yours, but my import is different.
from google.appengine.api.taskqueue import Task
task = Task(
    url=url,
    method=method,
    payload=payload,
    params=params,
    countdown=0
)
task.add(queue_name=queue)

Python: intercept a class loading action

Summary: when a certain python module is imported, I want to be able to intercept this action, and instead of loading the required class, I want to load another class of my choice.
Reason: I am working on some legacy code. I need to write some unit test code before I start some enhancement/refactoring. The code imports a certain module which will fail in a unit test setting, however. (Because of database server dependency)
Pseudo code:
from LegacyDataLoader import load_me_data
...
def do_something():
    data = load_me_data()
So, ideally, when Python executes the import line above in a unit test, an alternative class, say MockDataLoader, is loaded instead.
I am still using 2.4.3. I suppose there is an import hook I can manipulate.
Edit
Thanks a lot for the answers so far. They are all very helpful.
One particular type of suggestion is about manipulation of PYTHONPATH. It does not work in my case. So I will elaborate my particular situation here.
The original codebase is organised in this way
./dir1/myapp/database/LegacyDataLoader.py
./dir1/myapp/database/Other.py
./dir1/myapp/database/__init__.py
./dir1/myapp/__init__.py
My goal is to enhance the Other class in the Other module. But since it is legacy code, I do not feel comfortable working on it without strapping a test suite around it first.
Now I introduce this unit test code
./unit_test/test.py
The content is simply:
from myapp.database.Other import Other
def test1():
    o = Other()
    o.do_something()
if __name__ == "__main__":
    test1()
When the CI server runs the above test, the test fails. This is because class Other uses LegacyDataLoader, and LegacyDataLoader cannot establish a database connection to the db server from the CI box.
Now let's add a fake class as suggested:
./unit_test_fake/myapp/database/LegacyDataLoader.py
./unit_test_fake/myapp/database/__init__.py
./unit_test_fake/myapp/__init__.py
Modify the PYTHONPATH to
export PYTHONPATH=unit_test_fake:dir1:unit_test
Now the test fails for another reason
  File "unit_test/test.py", line 1, in <module>
    from myapp.database.Other import Other
ImportError: No module named Other
It has something to do with the way python resolves classes/attributes in a module

You can intercept import and from ... import statements by defining your own __import__ function and assigning it to __builtin__.__import__ (make sure to save the previous value, since your override will no doubt want to delegate to it; and you'll need to import __builtin__ to get the builtin-objects module).
For example (Py2.4 specific, since that's what you're asking about), save in aim.py the following:
import __builtin__
realimp = __builtin__.__import__
def my_import(name, globals={}, locals={}, fromlist=[]):
    print 'importing', name, fromlist
    return realimp(name, globals, locals, fromlist)
__builtin__.__import__ = my_import
from os import path
and now:
$ python2.4 aim.py
importing os ('path',)
So this lets you intercept any specific import request you want, and alter the imported module[s] as you wish before you return them. This is the kind of "hook" you're looking for, right?

There are cleaner ways to do this, but I'll assume that you can't modify the file containing from LegacyDataLoader import load_me_data.
The simplest thing to do is probably to create a new directory called testing_shims, and create LegacyDataLoader.py file in it. In that file, define whatever fake load_me_data you like. When running the unit tests, put testing_shims into your PYTHONPATH environment variable as the first directory. Alternately, you can modify your test runner to insert testing_shims as the first value in sys.path.
This way, your file will be found when importing LegacyDataLoader, and your code will be loaded instead of the real code.

The import statement just grabs stuff from sys.modules if a matching name is found there, so the simplest thing is to make sure you insert your own module into sys.modules under the target name before anything else tries to import the real thing.
# in test code
import sys
import MockDataLoader
sys.modules['LegacyDataLoader'] = MockDataLoader
import module_under_test
There are a handful of variations on the theme, but that basic approach should work fine to do what you describe in the question. A slightly simpler approach would be this, using just a mock function to replace the one in question:
# in test code
import module_under_test
def mock_load_me_data():
    # do mock stuff here
    pass
module_under_test.load_me_data = mock_load_me_data
That simply replaces the appropriate name right in the module itself, so when you invoke the code under test, presumably do_something() in your question, it calls your mock routine.
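When no separate MockDataLoader file exists, the sys.modules approach can also be sketched with a module object built on the fly. This example is written for Python 3 rather than the 2.4.3 mentioned in the question, and the install_fake_module helper and the fake return value are assumptions made for illustration only.

```python
import sys
import types


def install_fake_module(name, **attributes):
    """Register a stand-in module under `name` before anything imports the real one."""
    fake = types.ModuleType(name)
    for attr_name, value in attributes.items():
        setattr(fake, attr_name, value)
    sys.modules[name] = fake
    return fake


def fake_load_me_data():
    # Hypothetical canned data; a real test would return whatever shape it needs.
    return ["fake row"]


install_fake_module("LegacyDataLoader", load_me_data=fake_load_me_data)

# The import statement now finds the stand-in in sys.modules and never
# touches the real LegacyDataLoader (or its database connection).
from LegacyDataLoader import load_me_data
```

The fake has to be installed before the module under test is imported for the first time, since that import is what binds load_me_data in the module under test.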

Well, if the import fails by raising an exception, you could put it in a try...except block:
try:
    from LegacyDataLoader import load_me_data
except: # put error that occurs here, so as not to mask actual problems
    from MockDataLoader import load_me_data
Is that what you're looking for? If it fails, but doesn't raise an exception, you could have it run the unit test with a special command line tag, like --unittest, like this:
import sys
if "--unittest" in sys.argv:
    from MockDataLoader import load_me_data
else:
    from LegacyDataLoader import load_me_data

Python App Engine import issues after app is cached

I'm using a modified version of juno (http://github.com/breily/juno/) in Google App Engine. The problem I'm having is I have code like this:
import juno
import pprint
#get('/')
def home(web):
    pprint.pprint("test")
def main():
    run()
if __name__ == '__main__':
    main()
The first time I start the app up in the dev environment it works fine. The second time and every time after that it can't find pprint. I get this error:
AttributeError: 'NoneType' object has no attribute 'pprint'
If I set the import inside the function it works every time:
#get('/')
def home(web):
    import pprint
    pprint.pprint("test")
So it seems like it is caching the function but for some reason the imports are not being included when it uses that cache. I tried removing the main() function at the bottom to see if that would remove the caching of this script but I get the same problem.
Earlier tonight this code was working fine, I'm not sure what could have changed to cause this. Any insight is appreciated.

I would leave it that way. I saw a slideshare that Google put out about App Engine optimization that said you can get better performance by keeping imports inside of the methods, so they are not imported unless necessary.

Is it possible you are reassigning the name pprint somewhere? The only two ways I know of for a module-level name (like what you get from the import statement) to become None are if you assign it yourself (pprint = None), or at interpreter shutdown, when Python's cleanup sets all module-level names to None as it shuts things down.
