How do I confirm entities are saved with GAE's Eventual Consistency? - python

I'm trying to create tests to verify that my entities are being saved in the database.
When I put breakpoints in the post function, I can see that the customer count changes after the record is saved.
I read https://cloud.google.com/appengine/docs/python/tools/localunittesting#Python_Writing_High_Replication_Datastore_tests
From what I understood, the tests were failing because of Eventual Consistency and the way to get around that was to change the PseudoRandomHRConsistencyPolicy settings.
policy = datastore_stub_util.PseudoRandomHRConsistencyPolicy(probability=1)
And when I ran the test again I got the same error.
What am I doing wrong with creating these tests?
> /Users/Bryan/work/GoogleAppEngine/dermalfillersecrets/main.py(137)post()
-> customer.put()
(Pdb) l
134 query = Customer.query()
135 orig_customer_count = query.count()
136 import pdb; pdb.set_trace()
137 -> customer.put()
138 import pdb; pdb.set_trace()
139 query_params = {'leadbook_name': leadbook_name}
140 self.redirect('/?' + urllib.urlencode(query_params))
141
142 config = {}
(Pdb) orig_customer_count
5
(Pdb) c
> /Users/Bryan/work/GoogleAppEngine/dermalfillersecrets/main.py(139)post()
-> query_params = {'leadbook_name': leadbook_name}
(Pdb) l
134 query = Customer.query()
135 orig_customer_count = query.count()
136 import pdb; pdb.set_trace()
137 customer.put()
138 import pdb; pdb.set_trace()
139 -> query_params = {'leadbook_name': leadbook_name}
140 self.redirect('/?' + urllib.urlencode(query_params))
141
142 config = {}
143 config['webapp2_extras.sessions'] = {
144 'secret_key': 'my-super-secret-key',
(Pdb) query.count()
6
The entities also show up in the Datastore Viewer.
However, my test keeps failing.
$ nosetests --with-gae
F
======================================================================
FAIL: test_guest_can_submit_contact_info (dermalfillersecrets.functional_tests.NewVisitorTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/Bryan/work/GoogleAppEngine/dermalfillersecrets/functional_tests.py", line 80, in test_guest_can_submit_contact_info
    self.assertNotEqual(orig_custs, query.count())
AssertionError: 0 == 0
This is the functional_tests.py file contents:
import os, sys
sys.path.append("/usr/local/google_appengine")
sys.path.append("/usr/local/google_appengine/lib/yaml/lib")
sys.path.append("/usr/local/google_appengine/lib/webapp2-2.5.2")
sys.path.append("/usr/local/google_appengine/lib/django-1.5")
sys.path.append("/usr/local/google_appengine/lib/cherrypy")
sys.path.append("/usr/local/google_appengine/lib/concurrent")
sys.path.append("/usr/local/google_appengine/lib/docker")
sys.path.append("/usr/local/google_appengine/lib/requests")
sys.path.append("/usr/local/google_appengine/lib/websocket")
sys.path.append("/usr/local/google_appengine/lib/fancy_urllib")
sys.path.append("/usr/local/google_appengine/lib/antlr3")

import unittest
from selenium import webdriver
from google.appengine.api import memcache
from google.appengine.ext import db
from google.appengine.ext import testbed
import dev_appserver
from google.appengine.tools.devappserver2 import devappserver2

class NewVisitorTest(unittest.TestCase):

    def setUp(self):
        self.testbed = testbed.Testbed()
        self.testbed.activate()
        #self.testbed.setup_env(app_id='dermalfillersecrets')
        self.testbed.init_user_stub()
        ####################################################
        # this sets testbed to imitate strong consistency
        from google.appengine.datastore import datastore_stub_util
        policy = datastore_stub_util.PseudoRandomHRConsistencyPolicy(probability=1)
        self.testbed.init_datastore_v3_stub(consistency_policy=policy)
        self.testbed.init_memcache_stub()
        ####################################################
        # setup the dev_appserver
        APP_CONFIGS = ['app.yaml']
        self.browser = webdriver.Firefox()
        self.browser.implicitly_wait(3)

    def tearDown(self):
        self.browser.quit()
        self.testbed.deactivate()

    def test_guest_can_submit_contact_info(self):
        from main import Customer
        query = Customer.query()
        orig_custs = query.count()
        self.browser.get('http://localhost:8080')
        self.browser.find_element_by_name('id_name').send_keys("Kallie Wheelock")
        self.browser.find_element_by_name('id_street').send_keys("123 main st")
        self.browser.find_element_by_name('id_phone').send_keys('(404)555-1212')
        self.browser.find_element_by_name('id_zip').send_keys("30306")
        self.browser.find_element_by_name('submit').submit()
        # this should return 1 more record
        #import pdb; pdb.set_trace()
        query = Customer.query()
        self.assertNotEqual(orig_custs, query.count())
        assert(Customer.query(Customer.name == "Kallie Wheelock").get())
        # Delete the Customer record
        Customer.query(Customer.name == "Kallie Wheelock").delete()

The PseudoRandomHRConsistencyPolicy is not helping you here because your Selenium test submits a live HTML form, and the subsequent datastore update happens on the server, outside the scope of your policy.
What you are doing here is end-to-end testing, not unit testing per se. So your Selenium test should account for the real-world scenario and wait for a predefined period of time before comparing the counts.
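For example, the count assertion could be retried for a few seconds instead of being made immediately after the form submit. This generic polling helper is not from the original post, just one way to sketch the idea:

```python
import time

def wait_for(predicate, timeout=5, interval=0.5):
    """Poll predicate() until it returns truthy or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return bool(predicate())

# In the test, replace the immediate assertion with something like:
#     self.assertTrue(wait_for(lambda: Customer.query().count() > orig_custs))
```

This keeps the test fast when the write is visible quickly, while still tolerating server-side lag.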

There's nothing wrong with strong/eventual consistency here, but the design of your tests is wrong. Why are you trying to manage dev_appserver in your tests yourself? Why are you trying to remove entities at the end of the test? Each test should be isolated from the others and start from an empty datastore, with any necessary initialization.
Please use the latest version of the NoseGAE plugin. Here are two simple tests demonstrating strong/eventual consistency:
import unittest

from google.appengine.ext import ndb
from google.appengine.datastore import datastore_stub_util


class Foo(ndb.Model):
    pass


class TestEventualConsistency(unittest.TestCase):
    nosegae_datastore_v3 = True
    nosegae_datastore_v3_kwargs = {
        'consistency_policy': datastore_stub_util.PseudoRandomHRConsistencyPolicy(
            probability=0)}

    def test_eventual_consistency(self):
        self.assertEqual(Foo.query().count(), 0)
        Foo().put()
        self.assertEqual(Foo.query().count(), 0)


class TestStrongConsistency(unittest.TestCase):
    nosegae_datastore_v3 = True
    nosegae_datastore_v3_kwargs = {
        'consistency_policy': datastore_stub_util.PseudoRandomHRConsistencyPolicy(
            probability=1)}

    def test_strong_consistency(self):
        self.assertEqual(Foo.query().count(), 0)
        Foo().put()
        self.assertEqual(Foo.query().count(), 1)
Notice that I don't have anything about GAE paths, dev_appserver, etc.
You can still control the testbed yourself, but it is better to configure it with the nosegae_* attributes (read about this in the plugin documentation).
And as I recall, it will work even if you programmatically fill your HTML form, though that's not a unit test anymore.

Try using ancestor queries to get strong consistency instead of eventual consistency. From the docs:
Ancestor queries allow you to make strongly consistent queries to the datastore...
If this does not work, the next thing I would try is to not reuse the query object, but to create a new one the second time.
If this does not work either, my guess is that something else is wrong. I am not familiar with browser tests, but I have used webtest with great success for testing web endpoints and have not had any consistency issues while unit testing.
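The ancestor-query suggestion can be sketched as follows, assuming the ndb Customer model from the question and a hypothetical parent key; both the handler's put() and the test's query must use the same parent for the guarantee to apply:

```python
from google.appengine.ext import ndb

# Hypothetical parent entity shared by writer and reader.
root = ndb.Key('CustomerRoot', 'default')

# In the handler: write the entity under the parent.
customer = Customer(parent=root, name="Kallie Wheelock")
customer.put()

# In the test: ancestor queries are strongly consistent, so the
# new entity is visible immediately.
count = Customer.query(ancestor=root).count()
```

Note that this changes the entity group layout, so existing entities written without a parent would not be found by the ancestor query.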

Queries are eventually consistent (unless an ancestor is set), but a get operation is always strongly consistent.
If your objective is simply to test the code that writes an entity, you can insert an entity in the test and check that you can retrieve it by its key.
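That key-based check might look something like this inside the poster's testbed-based test class (a sketch; the name field is assumed from the question):

```python
from main import Customer

def test_put_then_get_by_key(self):
    # put() returns the entity's key, and key.get() is strongly
    # consistent, so no consistency-policy tricks are needed here.
    key = Customer(name="Kallie Wheelock").put()
    fetched = key.get()
    self.assertIsNotNone(fetched)
    self.assertEqual(fetched.name, "Kallie Wheelock")
```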

Related

Mocking Azure BlobServiceClient in Python

I am trying to write a unit test that will test azure.storage.blob.BlobServiceClient class and its methods. Below is my code
A fixture in the conftest.py
@pytest.fixture
def mock_BlobServiceClient(mocker):
    azure_ContainerClient = mocker.patch("azure.storage.blob.ContainerClient", mocker.MagicMock())
    azure_BlobServiceClient = mocker.patch("azure_module.BlobServiceClient", mocker.MagicMock())
    azure_BlobServiceClient.from_connection_string.return_value
    azure_BlobServiceClient.get_container_client.return_value = azure_ContainerClient
    azure_ContainerClient.list_blob_names.return_value = "test"
    azure_ContainerClient.get_container_client.list_blobs.return_value = ["test"]
    yield azure_BlobServiceClient
Contents of the test file
from azure_module import AzureBlob

def test_AzureBlob(mock_BlobServiceClient):
    azure_blob = AzureBlob()
    # This assertion passes
    mock_BlobServiceClient.from_connection_string.assert_called_once_with("testconnectionstring")
    # This assertion fails
    mock_BlobServiceClient.get_container_client.assert_called()
Contents of the azure_module.py
from azure.storage.blob import BlobServiceClient
import os

class AzureBlob:
    def __init__(self) -> None:
        """Initialize the azure blob"""
        self.azure_blob_obj = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])
        self.azure_container = self.azure_blob_obj.get_container_client(os.environ["AZURE_CONTAINER_NAME"])
My test fails when I execute it, with the error message below:
> mock_BlobServiceClient.get_container_client.assert_called()
E AssertionError: Expected 'get_container_client' to have been called.
I am not sure why it says that the get_container_client wasn't called when it was called during the AzureBlob's initialization.
Any help is very much appreciated.
Update 1
I believe this is a bug in unittest's MagicMock itself. Per Michael Delgado's suggestion, I pared the code down to a bare minimum to identify the issue, and I concluded that MagicMock was causing the problem. Below are my findings:
conftest.py
@pytest.fixture
def mock_Blob(mocker):
    yield mocker.patch("module.BlobServiceClient")
test_azureblob.py
def test_AzureBlob(mock_Blob):
    azure_blob = AzureBlob()
    print(mock_Blob)
    print(mock_Blob.mock_calls)
    print(mock_Blob.from_connection_string.mock_calls)
    print(mock_Blob.from_connection_string.get_container_client.mock_calls)
    assert False  # <- Intentional fail
After running the test, I got the following results.
$ pytest -vv
.
.
.
------------------------------------------------------------------------------------------- Captured stdout call -------------------------------------------------------------------------------------------
<MagicMock name='BlobServiceClient' id='140704187870944'>
[call.from_connection_string('AZURE_STORAGE_CONNECTION_STRING'),
call.from_connection_string().get_container_client('AZURE_CONTAINER_NAME')]
[call('AZURE_STORAGE_CONNECTION_STRING'),
call().get_container_client('AZURE_CONTAINER_NAME')]
[]
.
.
.
The prints clearly show that get_container_client was seen being called, but the mocked method did not register it at its level. That led me to conclude that MagicMock has a bug, which I will report to the developers for further investigation.
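For what it's worth, the mock_calls output above is consistent with MagicMock's documented behavior rather than a bug: AzureBlob calls get_container_client on the object returned by from_connection_string, so the call is recorded on from_connection_string.return_value, not on the parent mock. A minimal demonstration:

```python
from unittest.mock import MagicMock

svc = MagicMock(name="BlobServiceClient")
blob = svc.from_connection_string("conn-string")  # this IS svc.from_connection_string.return_value
blob.get_container_client("container")

print(svc.get_container_client.called)  # False -- never called on the parent mock
print(svc.from_connection_string.return_value.get_container_client.called)  # True

# So the failing assertion would pass if written against the return_value chain:
#     mock_BlobServiceClient.from_connection_string.return_value.get_container_client.assert_called()
```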

Run python script like a service with Twisted

I would like to run this script as an automatic service that runs every minute, every day, with Twisted (I first tried to make it a daemon, but that seemed too difficult and I didn't find good tutorials for it; I already tried crontab, but that's not what I'm looking for).
Has anyone done this with Twisted? I can't find a tutorial for my kind of script (getting data from one db table and putting it in another table of the same db). I also have to keep the logs in a file, but that will not be the most difficult part.
from twisted.enterprise import adbapi
from twisted.internet import task
import logging
from datetime import datetime
from twisted.internet import reactor
from twisted.internet.defer import inlineCallbacks

"""
Test DB : This File do database connection and basic operation.
"""

log = logging.getLogger("Test DB")

dbpool = adbapi.ConnectionPool("MySQLdb", db="xxxx", user="guza", passwd="vQsx7gbblal8aiICbTKP", host="192.168.15.01")

class MetersCount():

    def getTime(self):
        log.info("Get Current Time from System.")
        time = str(datetime.now()).split('.')[0]
        return time

    def getTotalMeters(self):
        log.info("Select operation in Database.")
        getMetersQuery = """ SELECT count(met_id) as totalMeters FROM meters WHERE DATE(met_last_heard) = DATE(NOW()) """
        return dbpool.runQuery(getMetersQuery).addCallback(self.getResult)

    def getResult(self, result):
        print("Receive Result : ")
        print(result)
        # general purpose method to receive result from defer.
        return result

    def insertMetersCount(self, meters_count):
        log.info("Insert operation in Database.")
        insertMetersQuery = """ INSERT INTO meter_count (mec_datetime, mec_count) VALUES (NOW(), %s)"""
        return dbpool.runQuery(insertMetersQuery, [meters_count])

    def checkDB(self):
        d = self.getTotalMeters()
        d.addCallback(self.insertMetersCount)
        return d

a = MetersCount()
a.checkDB()
reactor.run()
If you want to run a function once a minute, have a look at LoopingCall. It takes a function and runs it at intervals until told to stop.
You would use it something like this (which I haven't tested):
from twisted.internet.task import LoopingCall
looper = LoopingCall(a.checkDB)
looper.start(60)
The documentation is at the link.

Celery task does not timeout

I would like to verify that setting time limits for celery tasks work.
I currently have my configuration looking like this:
CELERYD_TASK_SOFT_TIME_LIMIT = 30
CELERYD_TASK_TIME_LIMIT = 120
task_soft_time_limit = 29
task_time_limit = 44
I am overloading the timeout parameters because it appears there is a name change coming, and I just want to be sure that I hit at least one timeout.
But when I run a test in the debugger, the celery app.conf dictionary looks like this:
(Pdb) app.conf['task_time_limit'] == None
True
(Pdb) app.conf['task_soft_time_limit'] == None
True
(Pdb) app.conf['CELERYD_TASK_SOFT_TIME_LIMIT']
30
(Pdb) app.conf['CELERYD_TASK_TIME_LIMIT']
120
I've written a test which I believe would trigger the timeout but no error is ever raised:
@app.task(soft_time_limit=15)
def time_out_task():
    import time
    count = 0
    #import pdb; pdb.set_trace()
    while count < 1000:
        time.sleep(1)
        count += 1
        print(count)
My questions are as follows:
What are the canonical settings I should use for a soft and a hard time limit?
How can I execute a task in a test in a way that proves the time limits are in place?
Thanks
I solved the issue by changing the way I was testing and by changing the way I was importing the Celery configuration.
Initially, I was setting the configuration by importing a Django settings object:
app = Celery('groot')
app.config_from_object('django.conf:settings', namespace='CELERY')
But this was ignoring the settings with the CELERYD_... prefix, so I switched to the new lowercase notation and called the following method:
app.conf.update(
    task_soft_time_limit=30,
    task_time_limit=120,
)
I also changed from testing this in the Django test environment to spinning up an actual Celery worker and sending the task to the worker.
If someone would supply a solution for how to test the timeout settings in the unit test it would be much appreciated.
In Django settings, CELERY_TASK_TIME_LIMIT is working for me:
CELERY_TASK_TIME_LIMIT = 60
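That works because app.config_from_object('django.conf:settings', namespace='CELERY') maps CELERY_-prefixed Django settings onto the new lowercase option names. A settings.py sketch (the values here are only examples):

```python
# settings.py (Django) -- picked up via namespace='CELERY'
CELERY_TASK_SOFT_TIME_LIMIT = 30   # becomes app.conf.task_soft_time_limit
CELERY_TASK_TIME_LIMIT = 120       # becomes app.conf.task_time_limit
```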

Django + Pytest + Selenium

I recently switched from Django's TestCase classes to the third-party pytest system. This sped my test suite up significantly (by a factor of 5), and it has been a great experience overall.
I do have issues with Selenium though. I've made a simple fixture to include the browser in my tests:
@pytest.yield_fixture
def browser(live_server, transactional_db, admin_user):
    driver_ = webdriver.Firefox()
    driver_.server_url = live_server.url
    driver_.implicitly_wait(3)
    yield driver_
    driver_.quit()
But for some reason, the database is not properly reset between tests. I have a test similar to:
class TestCase:
    def test_some_unittest(db):
        # Create some items
        # ...

    def test_with_selenium(browser):
        # The items from the above testcase exist in this testcase
The objects created in test_some_unittest are present in test_with_selenium. I'm not really sure how to solve this.
Switching from django.test.TestCase in favour of pytest means employing the pytest-django plugin, and your tests should look like this:
class TestSomething(object):
    def setup_method(self, method):
        pass

    @pytest.mark.django_db
    def test_something_with_dg(self):
        assert True
That above all means no inheritance from django.test.TestCase (which is a derivation of the Python std unittest framework).
The @pytest.mark.django_db marker means your test case will run in a transaction which will be rolled back once the test case is over.
The first occurrence of the django_db marker will also trigger Django migrations.
Beware of using database calls in special pytest methods such as setup_method; it is unsupported and otherwise problematic:
django-pytest setup_method database issue
def _django_db_fixture_helper(transactional, request, _django_cursor_wrapper):
    if is_django_unittest(request):
        return

    if transactional:
        _django_cursor_wrapper.enable()

        def flushdb():
            """Flush the database and close database connections"""
            # Django does this by default *before* each test
            # instead of after.
            from django.db import connections
            from django.core.management import call_command

            for db in connections:
                call_command('flush', verbosity=0,
                             interactive=False, database=db)
            for conn in connections.all():
                conn.close()

        request.addfinalizer(_django_cursor_wrapper.disable)
        request.addfinalizer(flushdb)
    else:
        if 'live_server' in request.funcargnames:
            return
        from django.test import TestCase

        _django_cursor_wrapper.enable()
        _django_cursor_wrapper._is_transactional = False
        case = TestCase(methodName='__init__')
        case._pre_setup()
        request.addfinalizer(_django_cursor_wrapper.disable)
        request.addfinalizer(case._post_teardown)
As I see it, you use pytest-django (which is fine).
From this code of it, it doesn't flush the db if it's a non-transactional db.
So in your 'other' tests you'd have to use transactional_db, and then it will be isolated as you wanted.
Your code will then look like:
class TestCase:
    def test_some_unittest(transactional_db):
        # Create some items
        # ...

    def test_with_selenium(browser):
        # The items from the above testcase do not exist in this testcase
However, an improvement to pytest-django could be to perform the flush before the yield of the fixture value, not after it, which makes much more sense. It's not so important what is in the teardown; it's important that the setup is correct.
As a side suggestion, for the browser fixture you can just use the pytest-splinter plugin.

Gearman + SQLAlchemy - keep losing MySQL thread

I have a python script that sets up several gearman workers. They call into some methods on SQLAlchemy models I have that are also used by a Pylons app.
Everything works fine for an hour or two, then the MySQL thread gets lost and all queries fail. I cannot figure out why the thread is getting lost (I get the same results on 3 different servers) when I am defining such a low value for pool_recycle. Also, why wouldn't a new connection be created?
Any ideas of things to investigate?
import gearman
import json
import ConfigParser
import sys
from sqlalchemy import create_engine

class JSONDataEncoder(gearman.DataEncoder):
    @classmethod
    def encode(cls, encodable_object):
        return json.dumps(encodable_object)
    @classmethod
    def decode(cls, decodable_string):
        return json.loads(decodable_string)

# get the ini path and load the gearman server ips:ports
try:
    ini_file = sys.argv[1]
    lib_path = sys.argv[2]
except Exception:
    raise Exception("ini file path or anypy lib path not set")

# get the config
config = ConfigParser.ConfigParser()
config.read(ini_file)
sqlachemy_url = config.get('app:main', 'sqlalchemy.url')
gearman_servers = config.get('app:main', 'gearman.mysql_servers').split(",")

# add anypy include path
sys.path.append(lib_path)
from mypylonsapp.model.user import User, init_model
from mypylonsapp.model.gearman import task_rates

# sqlalchemy setup, recycle connection every hour
engine = create_engine(sqlachemy_url, pool_recycle=3600)
init_model(engine)

# Gearman Worker Setup
gm_worker = gearman.GearmanWorker(gearman_servers)
gm_worker.data_encoder = JSONDataEncoder()

# register the workers
gm_worker.register_task('login', User.login_gearman_worker)
gm_worker.register_task('rates', task_rates)

# work
gm_worker.work()
I've seen this across the board for Ruby, PHP, and Python regardless of the DB library used. I couldn't find how to fix this the "right" way, which is to use mysql_ping, but there is a SQLAlchemy solution, as explained better here: http://groups.google.com/group/sqlalchemy/browse_thread/thread/9412808e695168ea/c31f5c967c135be0
As someone in that thread points out, setting the recycle option equal to True is equivalent to setting it to 1. A better solution might be to find your MySQL connection timeout value and set the recycle threshold to 80% of it.
You can get that value from a live server by looking up this variable: http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_connect_timeout
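A sketch of that calculation follows. The 3600-second timeout is hypothetical (use whatever SHOW VARIABLES reports on your server), and a sqlite URL is used here only so the snippet runs without a MySQL server:

```python
from sqlalchemy import create_engine

mysql_timeout = 3600                # hypothetical value read from the server
recycle = int(mysql_timeout * 0.8)  # recycle at ~80% of the server timeout

# In the poster's script this URL would be sqlachemy_url from the ini file.
engine = create_engine("sqlite://", pool_recycle=recycle)
print(recycle)  # -> 2880
```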
Edit:
It took me a bit to find the authoritative documentation on using pool_recycle:
http://www.sqlalchemy.org/docs/05/reference/sqlalchemy/connections.html?highlight=pool_recycle
