I would like to verify that setting time limits for Celery tasks works.
I currently have my configuration looking like this:
CELERYD_TASK_SOFT_TIME_LIMIT = 30
CELERYD_TASK_TIME_LIMIT = 120
task_soft_time_limit = 29
task_time_limit = 44
I am overloading the timeout parameters because it appears there is a name change coming, and I just want to be sure that I hit at least one timeout.
But when I run a test in the debugger, the Celery app.conf dictionary looks like this:
(Pdb) app.conf['task_time_limit'] == None
True
(Pdb) app.conf['task_soft_time_limit'] == None
True
(Pdb) app.conf['CELERYD_TASK_SOFT_TIME_LIMIT']
30
(Pdb) app.conf['CELERYD_TASK_TIME_LIMIT']
120
I've written a test which I believe would trigger the timeout but no error is ever raised:
@app.task(soft_time_limit=15)
def time_out_task():
    import time
    count = 0
    # import pdb; pdb.set_trace()
    while count < 1000:
        time.sleep(1)
        count += 1
        print(count)
My questions are as follows:
What are the canonical settings I should use for a soft and a hard time limit?
How can I execute a task in a test that proves to me the time limits are in place?
Thanks
I solved the issue by changing the way I was testing and by changing the way I was importing the Celery configuration.
Initially, I was setting the configuration by importing a Django settings object:
app = Celery('groot')
app.config_from_object('django.conf:settings', namespace='CELERY')
But this was ignoring the settings with the CELERYD_... prefix. So I switched to the new lowercase setting names and called the following method:
app.conf.update(
    task_soft_time_limit=30,
    task_time_limit=120,
)
I also changed from testing this in the Django test environment to spinning up an actual Celery worker and sending the task to the worker.
If someone would supply a solution for how to test the timeout settings in the unit test it would be much appreciated.
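Time limits are enforced by a worker process, so they never fire when tasks run eagerly inside a test; that is why spinning up a real worker was necessary. If you only want to exercise the soft-limit mechanism itself in a unit test, a stdlib-only sketch can simulate how the worker delivers it (Unix only; `SoftTimeLimitExceeded` here is a local stand-in for `celery.exceptions.SoftTimeLimitExceeded`, not the real class):

```python
import signal
import time

class SoftTimeLimitExceeded(Exception):
    """Local stand-in for celery.exceptions.SoftTimeLimitExceeded."""

def run_with_soft_limit(func, seconds):
    # A Celery worker delivers the soft limit by raising an exception
    # inside the running task; signal.alarm gives the same effect
    # in-process (Unix only, main thread only).
    def on_alarm(signum, frame):
        raise SoftTimeLimitExceeded()
    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(seconds)
    try:
        func()
        return "completed"
    except SoftTimeLimitExceeded:
        return "soft limit hit"
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)

print(run_with_soft_limit(lambda: time.sleep(3), 1))  # → soft limit hit
```

For a true end-to-end check, start a real worker (as the answer above did) and assert that the queued task raises `celery.exceptions.SoftTimeLimitExceeded`.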
In Django settings, CELERY_TASK_TIME_LIMIT is working for me:
CELERY_TASK_TIME_LIMIT = 60
After reading the documentation on Output Caching based on a file target, I figured this workflow should be an example of output caching:
from time import sleep
from prefect import Flow, task
from prefect.engine.results import LocalResult
@task(target="func_task_target.txt", checkpoint=True,
      result=LocalResult(dir="~/.prefect"))
def func_task():
    sleep(5)
    return 99

with Flow("Test-cache") as flow:
    func_task()

if __name__ == '__main__':
    flow.run()
I would expect func_task to run one time, get cached, and then use the cached value next time I run the flow. However, it seems that func_task runs each time.
Where am I going wrong? Or have I misunderstood the documentation?
Try setting the environment variable PREFECT__FLOWS__CHECKPOINTING to true:
import os
os.environ["PREFECT__FLOWS__CHECKPOINTING"] = "true"
You can also change the results directory:
os.environ["PREFECT__HOME_DIR"] = "path to dir"
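For what it's worth, the file-target behavior being tested is essentially "skip the task if the target file already exists". A stdlib sketch of that idea (the `file_target` decorator here is hypothetical, not Prefect's API) shows the caching effect the question expected:

```python
import os
import pickle
import tempfile

def file_target(path):
    # Toy version of Prefect's file-target caching: if the target file
    # exists, load and return its contents instead of running the task.
    def decorator(func):
        def wrapper(*args, **kwargs):
            if os.path.exists(path):
                with open(path, "rb") as f:
                    return pickle.load(f)   # cache hit: skip the task body
            value = func(*args, **kwargs)
            with open(path, "wb") as f:
                pickle.dump(value, f)       # cache miss: run, then store
            return value
        return wrapper
    return decorator

target = os.path.join(tempfile.mkdtemp(), "func_task_target.pkl")
calls = []

@file_target(target)
def func_task():
    calls.append(1)   # record how many times the body actually ran
    return 99

print(func_task(), func_task(), len(calls))  # → 99 99 1
```

The second call returns the pickled value without running the body, which is the behavior checkpointing enables in Prefect.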
I'm new to Python and I want to know the Pythonic way to vary behavior between my test environment and production at run time.
My use case is that I'm using a decorator that needs different parameters in test and production runs.
It looks like this:
# buffer_time should be 0 in test and, say, 5000 in prod
@retry(stop_max_attempt_number=7, wait_fixed=buffer_time)
def wait_until_volume_status_is_in_use(self, volume):
    if volume.status != 'in use':
        log_fail('Volume status is ... %s' % volume.status)
        volume.update()
    return volume.status
One solution is to use OS environment variables.
At the top of a file, I can write something like this:
# some variation of this
buffer_time = 0 if os.environ['MODE'] == 'TEST' else 5000
class Guy(object):
    # Body where buffer_time is used in decorator
Another solution is to use a settings file
# settings.py
def init():
    global RETRY_BUFFER
    RETRY_BUFFER = 5000
# __init__.py
import settings
settings.init()
# test file
from my_module import settings
settings.RETRY_BUFFER = 0
from my_module.class_file import MyKlass
# Do Tests
# class file
import settings
buffer_time = settings.RETRY_BUFFER
class Guy(object):
    # Body where buffer_time is used in decorator
Ultimately, my problem with both solutions is that they both rely on shared state.
I would like to know what is the standard way to accomplish this.
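One way to avoid the shared state in both approaches is to resolve the parameter at call time rather than at decoration time, and let the test inject its own getter. A sketch with a hypothetical `retry_lazy` decorator (this is not the `retrying` library's API, just an illustration of the pattern):

```python
import functools
import time

def retry_lazy(stop_max_attempt_number, wait_fixed_getter):
    # Unlike retrying.retry, the wait time is looked up on every call,
    # so a test can pass `lambda: 0` without touching module-level state.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait_ms = wait_fixed_getter()
            for attempt in range(stop_max_attempt_number):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == stop_max_attempt_number - 1:
                        raise          # out of attempts: propagate
                    time.sleep(wait_ms / 1000.0)
        return wrapper
    return decorator

attempts = []

@retry_lazy(stop_max_attempt_number=3, wait_fixed_getter=lambda: 0)
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("not ready")
    return "in use"

print(flaky())  # retries twice, then succeeds → in use
```

In production, the getter would read your real config (e.g. return 5000); in tests it returns 0, and nothing is mutated globally.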
I'm trying to create tests to verify that my entities are being saved in the database.
When I put breakpoints in the post function, I can see that the customer count changes after the record is saved.
I read https://cloud.google.com/appengine/docs/python/tools/localunittesting#Python_Writing_High_Replication_Datastore_tests
From what I understood, the tests were failing because of Eventual Consistency and the way to get around that was to change the PseudoRandomHRConsistencyPolicy settings.
policy = datastore_stub_util.PseudoRandomHRConsistencyPolicy(probability=1)
And when I ran the test again I got the same error.
What am I doing wrong with creating these tests?
> /Users/Bryan/work/GoogleAppEngine/dermalfillersecrets/main.py(137)post()
-> customer.put()
(Pdb) l
134 query = Customer.query()
135 orig_customer_count = query.count()
136 import pdb; pdb.set_trace()
137 -> customer.put()
138 import pdb; pdb.set_trace()
139 query_params = {'leadbook_name': leadbook_name}
140 self.redirect('/?' + urllib.urlencode(query_params))
141
142 config = {}
(Pdb) orig_customer_count
5
(Pdb) c
> /Users/Bryan/work/GoogleAppEngine/dermalfillersecrets/main.py(139)post()
-> query_params = {'leadbook_name': leadbook_name}
(Pdb) l
134 query = Customer.query()
135 orig_customer_count = query.count()
136 import pdb; pdb.set_trace()
137 customer.put()
138 import pdb; pdb.set_trace()
139 -> query_params = {'leadbook_name': leadbook_name}
140 self.redirect('/?' + urllib.urlencode(query_params))
141
142 config = {}
143 config['webapp2_extras.sessions'] = {
144 'secret_key': 'my-super-secret-key',
(Pdb) query.count()
6
The entities also show up in the Datastore Viewer.
However, my test keeps failing.
$ nosetests --with-gae
F
======================================================================
FAIL: test_guest_can_submit_contact_info (dermalfillersecrets.functional_tests.NewVisitorTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/Bryan/work/GoogleAppEngine/dermalfillersecrets/functional_tests.py", line 80, in test_guest_can_submit_contact_info
self.assertNotEqual(orig_custs, query.count())
AssertionError: 0 == 0
This is the functional_tests.py file contents:
import os, sys
sys.path.append("/usr/local/google_appengine")
sys.path.append("/usr/local/google_appengine/lib/yaml/lib")
sys.path.append("/usr/local/google_appengine/lib/webapp2-2.5.2")
sys.path.append("/usr/local/google_appengine/lib/django-1.5")
sys.path.append("/usr/local/google_appengine/lib/cherrypy")
sys.path.append("/usr/local/google_appengine/lib/concurrent")
sys.path.append("/usr/local/google_appengine/lib/docker")
sys.path.append("/usr/local/google_appengine/lib/requests")
sys.path.append("/usr/local/google_appengine/lib/websocket")
sys.path.append("/usr/local/google_appengine/lib/fancy_urllib")
sys.path.append("/usr/local/google_appengine/lib/antlr3")
import unittest
from selenium import webdriver
from google.appengine.api import memcache
from google.appengine.ext import db
from google.appengine.ext import testbed
import dev_appserver
from google.appengine.tools.devappserver2 import devappserver2
class NewVisitorTest(unittest.TestCase):

    def setUp(self):
        self.testbed = testbed.Testbed()
        self.testbed.activate()
        # self.testbed.setup_env(app_id='dermalfillersecrets')
        self.testbed.init_user_stub()
        ####################################################
        # this sets testbed to imitate strong consistency
        from google.appengine.datastore import datastore_stub_util
        policy = datastore_stub_util.PseudoRandomHRConsistencyPolicy(probability=1)
        self.testbed.init_datastore_v3_stub(consistency_policy=policy)
        self.testbed.init_memcache_stub()
        ####################################################
        # setup the dev_appserver
        APP_CONFIGS = ['app.yaml']
        self.browser = webdriver.Firefox()
        self.browser.implicitly_wait(3)

    def tearDown(self):
        self.browser.quit()
        self.testbed.deactivate()

    def test_guest_can_submit_contact_info(self):
        from main import Customer
        query = Customer.query()
        orig_custs = query.count()
        self.browser.get('http://localhost:8080')
        self.browser.find_element_by_name('id_name').send_keys("Kallie Wheelock")
        self.browser.find_element_by_name('id_street').send_keys("123 main st")
        self.browser.find_element_by_name('id_phone').send_keys('(404)555-1212')
        self.browser.find_element_by_name('id_zip').send_keys("30306")
        self.browser.find_element_by_name('submit').submit()
        # this should return 1 more record
        # import pdb; pdb.set_trace()
        query = Customer.query()
        self.assertNotEqual(orig_custs, query.count())
        assert(Customer.query(Customer.name == "Kallie Wheelock").get())
        # Delete the Customer record
        Customer.query(Customer.name == "Kallie Wheelock").delete()
The PseudoRandomHRConsistencyPolicy is not helping you here because your Selenium test submits a live HTML form, and the subsequent datastore update happens on the server, outside the scope of your policy.
What you are testing here is end-to-end behavior, not a unit test per se. Your Selenium test should account for this real-world scenario and wait for a predefined period before comparing the counts.
There's nothing wrong with strong/eventual consistency, but the design of your tests is wrong. Why are you trying to manage dev_appserver in your tests yourself? Why are you deleting entities at the end of the test? Each test should be isolated from the others and start from an empty datastore, with whatever initialization it needs.
Please use the latest version of the NoseGAE plugin. Here are two simple tests illustrating strong vs. eventual consistency:
import unittest
from google.appengine.ext import ndb
from google.appengine.datastore import datastore_stub_util

class Foo(ndb.Model):
    pass

class TestEventualConsistency(unittest.TestCase):
    nosegae_datastore_v3 = True
    nosegae_datastore_v3_kwargs = {
        'consistency_policy': datastore_stub_util.PseudoRandomHRConsistencyPolicy(
            probability=0)}

    def test_eventual_consistency(self):
        self.assertEqual(Foo.query().count(), 0)
        Foo().put()
        self.assertEqual(Foo.query().count(), 0)

class TestStrongConsistency(unittest.TestCase):
    nosegae_datastore_v3 = True
    nosegae_datastore_v3_kwargs = {
        'consistency_policy': datastore_stub_util.PseudoRandomHRConsistencyPolicy(
            probability=1)}

    def test_strong_consistency(self):
        self.assertEqual(Foo.query().count(), 0)
        Foo().put()
        self.assertEqual(Foo.query().count(), 1)
Notice that I don't have anything about GAE paths, dev_appserver, etc.
You can still control the testbed yourself, but it's better to configure it with the nosegae_* attributes (read about this in the plugin documentation).
And as I recall, this will work even if you programmatically fill your HTML form, though it's no longer a unit test at that point.
Try using ancestor queries to get strong consistency instead of eventual consistency. From the docs:
Ancestor queries allow you to make strongly consistent queries to the datastore...
If this does not work, the next thing I would try is not reusing the query object, but creating a new one the second time.
If this does not work either, my guess is that something else is wrong. I am not familiar with browser tests, but I have used webtest with great success for testing web endpoints and have not had any consistency issues while unit testing.
Queries are eventually consistent (unless an ancestor is set), but a get operation is always consistent.
If your objective is to simply test the code for writing an entity, you can insert an entity in this test and check if you can retrieve this entity using its key.
I want to use pyinotify to watch changes on the filesystem. If a file has changed, I want to update my database file accordingly (re-read tags, other information...)
I put the following code in my app's signals.py
import pyinotify
....
# create filesystem watcher in seperate thread
wm = pyinotify.WatchManager()
notifier = pyinotify.ThreadedNotifier(wm, ProcessInotifyEvent())
# notifier.setDaemon(True)
notifier.start()
mask = pyinotify.IN_CLOSE_WRITE | pyinotify.IN_CREATE | pyinotify.IN_MOVED_TO | pyinotify.IN_MOVED_FROM
dbgprint("Adding path to WatchManager:", settings.MUSIC_PATH)
wdd = wm.add_watch(settings.MUSIC_PATH, mask, rec=True, auto_add=True)
def connect_all():
    """
    to be called from models.py
    """
    rescan_start.connect(rescan_start_callback)
    upload_done.connect(upload_done_callback)
    ....
This works great when Django is run with `./manage.py runserver`. However, when run as `./manage.py runfcgi`, Django won't start. There is no error message; it just hangs and won't daemonize, probably at the line `notifier.start()`.
When I run `./manage.py runfcgi method=threaded` and enable the line `notifier.setDaemon(True)`, the notifier thread is stopped (`isAlive()` returns False).
What is the correct way to start long-running threads alongside Django when it is run as FCGI? Is it even possible?
Well, duh. Never start your own endless thread alongside Django. I use Celery, which is better suited to running such background work.
I have a python script that sets up several gearman workers. They call into some methods on SQLAlchemy models I have that are also used by a Pylons app.
Everything works fine for an hour or two, then the MySQL connection gets lost and all queries fail. I cannot figure out why the connection is being lost (I get the same results on 3 different servers) when I am setting such a low value for pool_recycle. Also, why wouldn't a new connection be created?
Any ideas of things to investigate?
import gearman
import json
import ConfigParser
import sys
from sqlalchemy import create_engine
class JSONDataEncoder(gearman.DataEncoder):
    @classmethod
    def encode(cls, encodable_object):
        return json.dumps(encodable_object)

    @classmethod
    def decode(cls, decodable_string):
        return json.loads(decodable_string)
# get the ini path and load the gearman server ips:ports
try:
    ini_file = sys.argv[1]
    lib_path = sys.argv[2]
except Exception:
    raise Exception("ini file path or anypy lib path not set")
# get the config
config = ConfigParser.ConfigParser()
config.read(ini_file)
sqlalchemy_url = config.get('app:main', 'sqlalchemy.url')
gearman_servers = config.get('app:main', 'gearman.mysql_servers').split(",")
# add anypy include path
sys.path.append(lib_path)
from mypylonsapp.model.user import User, init_model
from mypylonsapp.model.gearman import task_rates
# sqlalchemy setup, recycle connection every hour
engine = create_engine(sqlalchemy_url, pool_recycle=3600)
init_model(engine)
# Gearman Worker Setup
gm_worker = gearman.GearmanWorker(gearman_servers)
gm_worker.data_encoder = JSONDataEncoder()
# register the workers
gm_worker.register_task('login', User.login_gearman_worker)
gm_worker.register_task('rates', task_rates)
# work
gm_worker.work()
I've seen this across the board for Ruby, PHP, and Python, regardless of the DB library used. I couldn't find how to fix this the "right" way, which is to use mysql_ping, but there is a SQLAlchemy solution, as explained better here: http://groups.google.com/group/sqlalchemy/browse_thread/thread/9412808e695168ea/c31f5c967c135be0
As someone in that thread points out, setting the recycle option to True is equivalent to setting it to 1. A better solution is to find your MySQL connection timeout value and set the recycle threshold to 80% of it.
You can get that value from a live server by looking up this variable: http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_connect_timeout
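As a concrete instance of the 80% rule (assuming MySQL's default `wait_timeout` of 28800 seconds, i.e. 8 hours; the helper name is just for illustration):

```python
def recycle_seconds(wait_timeout_seconds, safety_factor=0.8):
    # Recycle pooled connections well before the server's timeout
    # closes them; 80% is the margin suggested above.
    return int(wait_timeout_seconds * safety_factor)

# MySQL's default wait_timeout is 28800 seconds (8 hours):
print(recycle_seconds(28800))  # → 23040
```

The result would then be passed as `create_engine(url, pool_recycle=23040)` instead of a guessed constant.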
Edit:
It took me a bit to find the authoritative documentation on using pool_recycle:
http://www.sqlalchemy.org/docs/05/reference/sqlalchemy/connections.html?highlight=pool_recycle