Python testing with an imported module

I'm fairly new to using mock and testing in general. This is my first attempt to mock a whole imported module. So for example I have
try:
    import redis
except:
    redis = None
Then later on in the code I check for redis
if redis is None:
    return
How can I set a mock object or class to the redis namespace so I don't have to install redis on my CI server?

Names are just names, and you can assign anything to the 'redis' name at file/global scope, using either import or plain old assignment.
Like so:
import mock_redis as redis
...or so:
def mock_redis(): pass
redis = mock_redis
BTW, your exception clause should be narrowed to handle only ImportError.
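Putting the answer together, a minimal sketch (the MagicMock fallback is my assumption, not code from the question):

try:
    import redis
except ImportError:
    from unittest import mock
    redis = mock.MagicMock()  # hypothetical stand-in so tests run without redis installed

Note that with a MagicMock bound to the name, the later if redis is None guard never triggers, so the rest of the code exercises the mock instead of returning early.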

Mocking global variable which is a module

I am trying to write a test for a small module which makes use of boto3. There are a few global imports that I want to mock, but it seems the act of mocking the module means the module-level code runs and fails.
I do wonder if this highlights the fact the code should be refactored but I am not totally sure on this, the code is deployed as an AWS lambda function and the global variables are created outside the handler function.
Here are the global variables, defined outside the handler:
QUEUE_NAME = os.getenv("QUEUE_NAME", None)
TABLE_NAME = os.getenv("TABLE_NAME", None)
sqs_client = boto3.resource("sqs")
dynamo_client = boto3.client("dynamodb")
queue = sqs_client.get_queue_by_name(QueueName=QUEUE_NAME)
When I try to mock the sqs_client object I get an error; similarly, I get that error if I just import the module into my test module. Here is the stub of the failing test:
@mock.patch.dict(os.environ, {"QUEUE_NAME": "QUEUE1", "TABLE_NAME": "TABLE1"})
@mock.patch("queue_processor.sqs_client.get_queue_by_name", return_value=None)
@mock.patch("queue_processor.dynamo_client.get_item", return_value=None)
def test_queue_entry(mocker):
    ...
The error appears to be because the mock is not being used, so a real call is made to AWS:
E botocore.errorfactory.QueueDoesNotExist: An error occurred (AWS.SimpleQueueService.NonExistentQueue) when calling the GetQueueUrl operation: The specified queue does not exist for this wsdl version.
My question is twofold: how do I avoid this error, in which the SQS client tries to call AWS rather than using the mock? And is this actually something I want to fix, or would the code be better refactored to avoid global variables?
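One way to sidestep the import-time call, sketched here as an assumption about the module rather than a confirmed fix (queue_processor and the names below mirror the question), is to create the queue lazily inside the handler path:

# queue_processor.py, lazy-initialization sketch
import os
import boto3

QUEUE_NAME = os.getenv("QUEUE_NAME", None)
TABLE_NAME = os.getenv("TABLE_NAME", None)

_queue = None

def _get_queue():
    # nothing touches AWS at import time; the queue is looked up on first use
    global _queue
    if _queue is None:
        _queue = boto3.resource("sqs").get_queue_by_name(QueueName=QUEUE_NAME)
    return _queue

def handler(event, context):
    queue = _get_queue()
    ...

A test can then patch queue_processor._get_queue directly and never hit the network, while warm Lambda invocations still reuse the cached queue object.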

How to set environment variable in pytest

I have a lambda handler that uses an environment variable. How can I set that value using pytest? I'm getting the error:
tests/test_kinesis.py:3: in <module>
from runner import kinesis
runner/kinesis.py:6: in <module>
DATA_ENGINEERING_BUCKET = os.environ["BUCKET"]
../../../../../.pyenv/versions/3.8.8/lib/python3.8/os.py:675: in __getitem__
raise KeyError(key) from None
E KeyError: 'BUCKET'
I tried setting it in the test like this:
class TestHandler(unittest.TestCase):
    @mock_s3
    @mock_lambda
    def test_handler(monkeypatch):
        monkeypatch.setenv("BUCKET", "test-bucket")
        actual = kinesis.handler(kinesis_stream_event, "")
        expected = {"statusCode": 200, "body": "OK"}
        assert actual == expected
DATA_ENGINEERING_BUCKET = os.environ["BUCKET"]

def handler(event, context):
    ...
You're getting the failure before your monkeypatch is able to run. The loading of the environment variable will happen when the runner module is first imported.
If this is a module you own, I'd recommend modifying the code to use a default value if DATA_ENGINEERING_BUCKET isn't set. Then you can set its value to whatever you want at runtime with module.DATA_ENGINEERING_BUCKET = "my_bucket".
DATA_ENGINEERING_BUCKET = os.environ.get("BUCKET", default="default_bucket")
If you can't modify that file then things are more complicated.
I looked into creating a global fixture that monkeypatches the environment and loads the module once, before any tests load, and received a pytest error about using function-level fixtures within a session-level fixture. Which makes sense: monkeypatch really isn't intended to fake things long term. You can stick the module load into your test after the monkeypatch, but that will generate a lot of boilerplate.
What eventually worked was creating a fixture that provides the module in lieu of importing it. The fixture sets os.environ to the desired value, loads the module, resets os.environ to its original value, then yields the module. Any test that needs this module can request the fixture to gain access to it within its scope. A word of caution: because test files are imported before fixtures are run, any test file that doesn't use the fixture and imports the module normally will raise a KeyError and cause pytest to crash before running any tests.
conftest.py
import os
import pytest

@pytest.fixture(scope='session')
def kinesis():
    old_environ = os.environ
    os.environ = {'BUCKET': 'test-bucket'}
    import kinesis
    os.environ = old_environ
    yield kinesis
tests.py
# Do NOT import kinesis in any test file. Rely on the fixture.
class TestHandler(unittest.TestCase):
    @mock_s3
    @mock_lambda
    def test_handler(kinesis):
        actual = kinesis.handler(kinesis_stream_event, "")
        expected = {"statusCode": 200, "body": "OK"}
        assert actual == expected
A potentially simpler method
os.environ is a dictionary of environment variables that is created when os first loads. If you want a single value for every test then you just need to add the value you want to it before loading any test modules. If you put os.environ['BUCKET'] = 'test-bucket' at the top of conftest.py you will set the environment variable for the rest of the test session. Then as long as the first import of the module happens afterwards you won't have a key error. The big downside to this approach is that unless you know to look in conftest.py or grep the code it will be difficult to determine where the environment variable is getting set when troubleshooting.
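For instance, a minimal conftest.py along those lines (the bucket name is the example value from above):

# conftest.py
import os

# set before pytest imports any test modules, so the module-level
# os.environ["BUCKET"] lookup succeeds
os.environ['BUCKET'] = 'test-bucket'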

How to mock an imported object with pytest-mock or magicmock

I am trying to understand the mock/monkeypatch/pytest-mock capabilities.
Let me know if this is possible. If not, could you please suggest how I can test this code?
My code structure:
/
./app
../__init__.py
../some_module1
.../__init__.py
../some_module2
.../__init__.py
./tests
../test_db.py
The /app/__init__.py is where my application (a Flask application if it helps) is started along with initializing a database connection object to a MongoDB database:
# ...
def create_app():
    # ...
    return app

db_conn = DB()
some_module1 and some_module2 import the db_conn object and use it as part of their functions:
## some_module1/__init__.py
from app import db_conn
...
db = db_conn.db_name2.db_collection2

def some_func1():
    data = db.find()
    # check and do something with data
    return boolean_result
...
## some_module2/__init__.py
from app import db_conn
...
db = db_conn.db_name1.db_collection1

def some_func2():
    data = db.find()
    # check and do something with data
    return boolean_result
...
In my tests, I want to test if my code works properly based on the data received from the database.
I want to mock the database, more specifically the db_conn object since I don't want to use a real database (which would be a lot of work setting up the environment and maintaining it).
Any suggestions on how I can emulate the db_conn?
I've been exploring pytest-mock and MagicMock but I don't know how to mock the db_conn in my test.
To answer the initial question "How to mock an imported object with pytest-mock or magicmock", you can do:
from unittest import mock  # because unittest's mock works great with pytest

def test_some_func1():
    with mock.patch('some_module1.db', mock.MagicMock(return_value=...)) as magicmock:
        result = some_func1(...)
        assert ...  # e.g. inspect different fields of magicmock
        assert expected == result

# or alternatively use the decorator form
@mock.patch('some_module2.db', mock.MagicMock(return_value=...))
def test_some_func2():
    result = some_func2(...)
Note that you do not patch the actual source of db.
For your other use case
I want to mock the database (using a mongo database), more specifically the "db_conn" object
you similarly follow the hints of the link above:
mock.patch('some_module1.db_conn', mock.MagicMock(return_value=...))
Given that, you will notice in your tests that db from `db = db_conn.db_name2.db_collection2` will create another mock object. Calls to that object will be recorded as well, so you will be able to trace the history of calls and value assignments.
Furthermore, see an example of how to patch MongoDB.
For testing Flask apps, see the Flask documentation. There is also a nice explanation elsewhere that uses DB connections.
As a general hint, as @MikeMajara mentioned, separate your code into smaller functions that are easy to test. In the tradition of TDD: write tests first, implement later, and refactor (especially DRY!).
I believe you are right not to test against a real database, because it's not unit testing anymore if you are using external dependencies.
It is possible to specify return_value for Mock or MagicMock objects and to customize it (even different return values on each call).
from unittest.mock import Mock, patch
from app import db_conn

@patch('app.db_conn.find')
def test_some_func1(db_con_mock):
    ...
    assert ...
Keep in mind that in each patch you should specify the import path of db_conn: the path where db_conn is used (I assume it's a different path in each test), not where it is defined.
Separation of concerns. Build methods that do one, and only one, thing; even more so if you are going with TDD. In your example some_func2 does more than one. You could refactor as follows:
def get_object_from_db():
    return db.find()

def check_condition_on_object(obj):
    # check something to do with the object
    # and return True or False accordingly
    ...

def some_func2():
    obj = get_object_from_db()
    return check_condition_on_object(obj)
With this approach you could easily test get_object_from_db and check_condition_on_object separately. This will improve readability, help avoid bugs, and make them easier to detect if they appear at some point.
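As a sketch of the payoff (module and function names follow the example above; the fake document is made up), the seam can now be patched in one line:

from unittest import mock
from some_module2 import some_func2

def test_some_func2_checks_condition():
    # patch the seam instead of the database itself
    with mock.patch('some_module2.get_object_from_db', return_value={'field': 1}):
        assert some_func2() in (True, False)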
About "mocking an imported object". You might be trying to mock an object with a library that is meant for a more advance case than yours. These libraries provide you with a bunch of methods surrounding test environment out of the box that you might not need. From the looks, you just want to populate an object with mock data, and/or interact with a mocked db_connection instance. So...
To populate, I would simplify: you know the condition you want to test, and you want to check that the result for a given object is the expected one. Just build yourself a test_object_provider.py that returns your known cases for true|false. No need to make things more complex.
To use a fake MongoDB connection you can try mongomock (although ideally you would test Mongo against a real instance in a separate test).
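A minimal mongomock sketch under the question's layout (the patched path and the seeded document are assumptions):

import mongomock
from unittest import mock
from some_module1 import some_func1

def test_some_func1_with_fake_mongo():
    fake_collection = mongomock.MongoClient().db_name2.db_collection2
    fake_collection.insert_one({'some_field': 'some_value'})  # seed known data
    with mock.patch('some_module1.db', fake_collection):
        result = some_func1()
    assert result in (True, False)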

Python tornado global variables

I have a Python Tornado application. I want to have variables which are shared across multiple files. Previously I declared and initialized them in a Python file named global.py and imported it into other files. This was a good idea until some of my variables needed to be queried from the database; then every time I imported global.py to get just one value, all of the queries ran and slowed down my application.
The next step was to define my variables in the Tornado start.py like this:
class RepublishanApplication(tornado.web.Application):
    def __init__(self):
        ##################################################
        # conn = pymongo.Connection("localhost", 27017)
        self.Countries = GlobalDefined.Countries
        self.Countries_rev = GlobalDefined.Countries_rev
        self.Languages = GlobalDefined.Languages
        self.Categories = GlobalDefined.Categories
        self.Categories_rev = GlobalDefined.Categories_rev
        self.NewsAgencies = GlobalDefined.NewsAgencies
        self.NewsAgencies_rev = GlobalDefined.NewsAgencies_rev
        self.SharedConnections = SharedConnections
I can access these variables in handlers like this:
self.application.Countries
It's working well, but the problem is that I can access these variables only in handler classes; if I want to access them elsewhere, I have to pass them to functions. I think that's not a good idea. Do you have any suggestions for accessing these variables everywhere without having to pass the application instance to all of my functions, or even another way to help me?
Putting your global variables in a globals.py file is a fine way to accomplish this. If you use PyMongo to query values from MongoDB when globals.py is imported, that work is only done the first time globals.py is imported in a process. Other imports of globals.py get the module from the sys.modules cache.
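A minimal sketch of that pattern (the database, collection, and variable names are assumptions based on the question):

# globals.py: this query runs only on the first import in a process
import pymongo

_client = pymongo.MongoClient("localhost", 27017)
Countries = {c["code"]: c["name"] for c in _client.mydb.countries.find()}

Every later import of globals.py anywhere in the application is served from the sys.modules cache and does not rerun the query.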

Assign a variable into `g` once and only once for application in Flask

I want to save an object which is the result of an expensive function.
The expensive function should only be processed once before any request.
I checked the Flask documentation and considered using g to save the result, with the @app.before_first_request decorator to ensure the assignment happens only once.
My code is like this:
@app.before_first_request
def before_first_request():
    g.rec = take_long_time_to_do()

@app.route('/test/')
def test():
    return render_template('index.html', var_rec=g.rec)
However, this code doesn't work well. It works only the first time /test is requested. When I access "myapplication/test" a second time, g.rec doesn't exist, which throws an exception.
Does anyone have ideas about how to assign a global variable into g when initializing the application?
g is the global object for that request only. Have you considered using a caching mechanism?
> pip search flask | grep "cache" | sort
Flask-Cache - Adds cache support to your Flask application
Flask-Cache-PyLibMC - PyLibMC cache for Flask-Cache, supports multiple operations
Flask-Memsessions - A simple extension to drop in memcached session support for Flask.
Then you can store the result of take_long_time_to_do() there and retrieve it if it exists.
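If you'd rather not add a caching extension for a single value, a stdlib sketch with functools.lru_cache gives the same compute-once behavior (take_long_time_to_do is the function from the question; the Flask imports from its code are assumed):

from functools import lru_cache

@lru_cache(maxsize=None)
def get_rec():
    # computed once per process; later calls return the cached result
    return take_long_time_to_do()

@app.route('/test/')
def test():
    return render_template('index.html', var_rec=get_rec())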
Can you try putting that function call in the __init__.py of your project package? I usually use __init__.py to initialize the app = Flask(__name__) etc. So let's say you can try the following:
# __init__.py:
from flask import Flask
from somemodule import take_long_time_to_do
app = Flask(__name__)
rec = take_long_time_to_do()
Then you can use the rec variable in any views as long as you import it:
# views.py
from myproject import rec

@app.route('/test/')
def test():
    return render_template('index.html', var_rec=rec)
Caching is generally the way to do it. But assuming you have a one-off value that doesn't time out and you don't want to introduce caching infrastructure for some reason, you could also compute it at app startup as a module global, use it, and/or import it where needed.
If the value is needed in templates, either:
add it to templates via a context processor (see the sketch below), or
add it to the app config before initialization.
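A context processor sketch for the first option (rec stands in for the value computed at startup; take_long_time_to_do is from the question):

rec = take_long_time_to_do()  # module-level, computed once at startup

@app.context_processor
def inject_rec():
    # every rendered template can now reference var_rec
    return {'var_rec': rec}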
