gevent and postgres: Asynchronous connection failed - python

I'm using gevent to handle API I/O on a Django-based web system.
I've monkey-patched the socket module using:
import gevent.monkey; gevent.monkey.patch_socket()
and patched psycopg2 using:
import psycogreen.gevent; psycogreen.gevent.patch_psycopg()
Nonetheless, certain Django calls such as Model.save() are failing with the error "Asynchronous Connection Failed". Do I need to do something else to make postgres greenlet-safe in the Django environment? Is there something else I'm missing?
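For reference, the initialization order is roughly the following (a sketch; the patches run before Django or psycopg2 get imported, and patch_all() would be the broader alternative to patch_socket()):
import gevent.monkey
gevent.monkey.patch_socket()      # or gevent.monkey.patch_all() for broader coverage

import psycogreen.gevent
psycogreen.gevent.patch_psycopg() # makes psycopg2 yield to the gevent hub

# Django (and anything that imports psycopg2) is only imported after the patches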

There is an article on this problem; unfortunately it's in Russian. Let me quote the final part:
All the connections are stored in django.db.connections, which is an instance of django.db.utils.ConnectionHandler. Every time the ORM is about to issue a query, it requests a DB connection by calling connections['default']. In turn, ConnectionHandler.__getitem__ checks whether there is a connection in ConnectionHandler._connections and creates a new one if it is empty.
All opened connections should be closed after use. There is a request_finished signal, which is fired by django.http.HttpResponseBase.close. Django closes DB connections at the very last moment, when nobody could use them anymore, and that seems reasonable.
Yet there is a tricky part in how ConnectionHandler stores DB connections. It uses threading.local, which becomes gevent.local.local after monkeypatching. Declared once, this structure behaves as if it were unique in every greenlet. The view *some_view* started its work in one greenlet, and now we've got a connection in *ConnectionHandler._connections*. Then we create a few more greenlets, each of which sees an empty *ConnectionHandler._connections* and gets its own connection from the pool. Once those new greenlets are done, the contents of their local() are gone, and the DB connections are gone with them without being returned to the pool. At some point, the pool becomes empty.
When developing with Django+gevent you should always keep that in mind and close the DB connection by calling django.db.close_connection. It should be called on exception as well; you can use a decorator for that, something like:
class autoclose(object):
    def __init__(self, f=None):
        self.f = f

    def __call__(self, *args, **kwargs):
        with self:
            return self.f(*args, **kwargs)

    def __enter__(self):
        pass

    def __exit__(self, exc_type, exc_value, tb):
        from django.db import close_connection
        close_connection()
        return exc_type is None
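Usage is then just a matter of wrapping any code that touches the ORM from its own greenlet; MyModel and the function names below are hypothetical:
@autoclose
def refresh_cache():                       # hypothetical job run in a greenlet
    return list(MyModel.objects.all())     # MyModel is a placeholder model

# or as a context manager around ad-hoc ORM work
def worker():
    with autoclose():
        MyModel.objects.create(name="example")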

Related

Python - FTP Connection Pool

I have a generic multiprocess script that could run any task in a multiprocess set up. I inject the task as a command line argument and use getattr to call the functions in the injected code.
taskModule = importlib.import_module(taskFile.replace(".py", ""))
taskContext = getattr(taskModule, 'init')()
response = pool.map_async(getattr(taskModule, 'run'), inputList)
The init() function creates all relevant variables for the task to execute and returns them as a dict object - the taskContext. inputList is a list of dict objects, each dict containing both the taskContext object as well as the specific item to be processed, so that each process gets a unique item to process along with a copy of the context required by the task.
One of those tasks is meant for FTP and the taskContext in that case contains information on the FTP server along with other details. The run function in the FTP task pretty much opens a connection using the context variables, uploads the required files and closes it, and this works perfectly.
However, I think it'd be good if I can set up a connection pool with multiple FTP connections at the start, as part of the init() function when the context is created, and then use them in an as-available fashion within the run method, very similar to a DB connection pool that prevents the need to open and close connections to the database every time.
Is this even feasible? If so, what's the best way to go about doing it?
I put together a connection_pool module as part of a proof of concept. I'm not sure how robust it is.
I added connection closing to this which is a bugfix of this.
I was able to set up connection pooling of FTP and SFTP connections transferring a few thousand files over 10-20 threads.
You can install my version from conda:
conda install -c jmeppley connectionpool
Creating an FTP pool looks something like this:
# this is from snakemake
import ftplib
from functools import partial

import ftputil
import ftputil.session
from connectionpool import ConnectionPool  # from the connectionpool package installed above; import path assumed

def connect(*args_to_use, **kwargs_to_use):
    ftp_base_class = (
        ftplib.FTP_TLS if kwargs_to_use["encrypt_data_channel"] else ftplib.FTP
    )
    ftp_session_factory = ftputil.session.session_factory(
        base_class=ftp_base_class,
        port=kwargs_to_use["port"],
        encrypt_data_channel=kwargs_to_use["encrypt_data_channel"],
        debug_level=None,
    )
    return ftputil.FTPHost(
        kwargs_to_use["host"],
        kwargs_to_use["username"],
        kwargs_to_use["password"],
        session_factory=ftp_session_factory,
    )

# a function to create a pool using the connect() method
def create_ftp_pool(pool_size, *args_to_use, **kwargs_to_use):
    create_callback = partial(connect, *args_to_use, **kwargs_to_use)
    connection_pool = ConnectionPool(create_callback,
                                     close=lambda c: c.close(),
                                     max_size=pool_size)
    return connection_pool

# create a pool with the arguments you'd use to create a connection
pool_size = 10
ftp_pool = create_ftp_pool(pool_size, host=...)

# use item() as a context manager
with ftp_pool.item() as connection:
    ...

SQLAlchemy session and connection relationship

Do queries executed with the same SQLAlchemy session object use the same underlying connection? If not, is there a way to ensure this?
Some background: I have a need to use MySQL's named lock feature, i.e. GET_LOCK() and RELEASE_LOCK() functions. As far as the MySQL server is concerned, only the connection that obtained the lock can release it - so I have to make sure that I either execute these two commands within the same connection or the connection dies to ensure the lock is released.
To make things nicer, I have created a "locked" context like so:
from contextlib import contextmanager

@contextmanager
def mysql_named_lock(session, name, timeout):
    """Get a named mysql lock on a session"""
    lock = session.execute("SELECT GET_LOCK(:name, :timeout)",
                           {"name": name, "timeout": timeout}).scalar()
    if lock:
        try:
            yield session
        finally:
            session.execute("SELECT RELEASE_LOCK(:name)", {"name": name})
    else:
        e = "Could not obtain named lock {} within {} seconds".format(
            name, timeout)
        raise RuntimeError(e)

def my_critical_section(session):
    with mysql_named_lock(session, __name__, 10) as lockedsession:
        thing = lockedsession.query(MyStuff).one()
    return thing
I want to make sure that the two execute calls in mysql_named_lock happen on the same underlying connection or the connection is closed.
Can I assume this would "just work" or is there anything I need to be aware of here?
it will "just work" if (a) your session is a scoped_session and (b) you are using it in a non-concurrent fashion (same pid / thread). If you're too paranoid, make sure (assert) you're using the same connection ID via
session.connection().connection.thread_id()
also, there is no point to pass session as an argument. Init it once, somewhere in your application’s global scope, then call anywhere in a code, you will get the same connection ID.
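For example, a paranoid check along those lines, using the attribute chain shown above (MyStuff and the session come from the question):
conn_id_at_lock = session.connection().connection.thread_id()
with mysql_named_lock(session, __name__, 10) as lockedsession:
    thing = lockedsession.query(MyStuff).one()
    # the lock and the query should run on the same MySQL connection
    assert lockedsession.connection().connection.thread_id() == conn_id_at_lock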

Locking with Tornado and multiple instances

I'm fairly new to Python and Tornado, so please forgive me if I'm overcomplicating a long-solved problem, but I didn't find much else out there.
I'm running multiple Tornado instances (multiple instances per server, multiple servers) for an application and have some tasks that only one instance should perform, such as scheduling certain events in the application. Instead of running a dedicated instance that performs this task, I'd like to have an opportunistic approach where the first instance that tries gets to do the job.
Part of my solution is a database based locking mechanism (MongoDB findAndUpdate). The code below seems to work just fine but I'd like to get some advice if this is a good solution or if there are ready-made locking and task distribution solutions out there for Tornado.
This is the decorator that acquires the lock when entering the function and releases it afterwards:
def locking(fn):
    @tornado.gen.engine
    def wrapped(wself, *args, **kwargs):
        @tornado.gen.engine
        def wrapped_callback(*cargs, **ckwargs):
            logging.info("release lock")
            yield tornado.gen.Task(lock.release_lock)
            logging.info("release lock done")
            original_callback(*cargs, **ckwargs)

        logging.info("acquire lock")
        yield tornado.gen.Task(model.SchedulerLock.initialize_lock, area_id=wself.area_id)
        lock = yield tornado.gen.Task(model.SchedulerLock.acquire_lock, area_id=wself.area_id)
        if lock:
            logging.info("acquire lock done")
            original_callback = kwargs['callback']
            kwargs['callback'] = wrapped_callback
            fn(wself, *args, **kwargs)
        else:
            logging.info("acquire lock not possible, postponed")
            ioloop = tornado.ioloop.IOLoop.instance()
            ioloop.add_timeout(datetime.timedelta(seconds=2), functools.partial(wrapped, wself, *args, **kwargs))
    return wrapped
The acquire_lock method returns the lock object or False
Any thoughts on this? I know that the lock is only half of the solution, as I also need a mechanism that ensures that a one-off task only gets done once. However, this can be achieved very similarly. Is there anything that solves the problem more elegantly?

Monitoring gevent exceptions in jobs

I'm building an application using gevent. My app is getting rather big now, as there are a lot of jobs being spawned and destroyed. Now I've noticed that when one of these jobs crashes, my entire application just keeps running (if the exception came from a non-main greenlet), which is fine. But the problem is that I have to look at my console to see the error. So some part of my application can "die" without me being instantly aware of it, while the app keeps running.
Littering my app with try/except blocks does not seem like a clean solution.
Maybe a custom spawn function which does some error reporting?
What is the proper way to monitor gevent jobs/greenlets? catch exceptions?
In my case I listen for events from a few different sources, and I should deal with each one differently.
There are about 5 extremely important jobs: the webserver greenlet, websocket greenlet,
database greenlet, alarms greenlet, and zmq greenlet. If any of those dies, my application should die completely. Other jobs that die are not that important. For example, it is possible that the websocket greenlet dies due to some raised exception while the rest of the application keeps running as if nothing happened. The app is then completely useless and dangerous and should just crash hard.
I think the cleanest way would be to catch the exception you consider fatal and do sys.exit() (you'll need gevent 1.0 since before that SystemExit did not exit the process).
Another way is to use link_exception, which would be called if the greenlet died with an exception.
spawn(important_greenlet).link_exception(lambda *args: sys.exit("important_greenlet died"))
Note, that you also need gevent 1.0 for this to work.
If on 0.13.6, do something like this to kill the process:
gevent.get_hub().parent.throw(SystemExit())
You want to greenlet.link_exception() all of your greenlets to a janitor function.
The janitor function will be passed any greenlet that dies; it can inspect that greenlet's .exception attribute to see what happened and, if necessary, do something about it.
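A minimal sketch of that pattern (important_job and FatalError are placeholders for your own greenlet function and fatal-error type; as noted above, sys.exit() from a greenlet needs gevent 1.0+):
import sys
import gevent

class FatalError(Exception):
    """Placeholder for whatever you consider fatal."""

def important_job():
    # placeholder for one of your critical greenlets
    raise FatalError("something went badly wrong")

def janitor(dead_greenlet):
    # link_exception passes the greenlet that died; .exception holds the error
    print("greenlet %r died: %r" % (dead_greenlet, dead_greenlet.exception))
    if isinstance(dead_greenlet.exception, FatalError):
        sys.exit("important greenlet died")

g = gevent.spawn(important_job)
g.link_exception(janitor)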
As @Denis and @lvo said, link_exception is OK, but I think there is a better way to do it, without changing your current code that spawns greenlets.
Generally, whenever an exception is thrown in a greenlet, the _report_error method (in gevent.greenlet.Greenlet) will be called for that greenlet. It does some work such as calling all the link functions and, finally, calling self.parent.handle_error with the exc_info from the current stack. The self.parent here is the global Hub object, which means all the exceptions raised in any greenlet are always centralized into one method for handling. By default, Hub.handle_error distinguishes the exception type, ignores some types and prints the others (which is what we always see in the console).
By patching the Hub.handle_error method, we can easily register our own error handlers and never lose an error again. I wrote a helper function to make it happen:
from gevent.hub import Hub

IGNORE_ERROR = Hub.SYSTEM_ERROR + Hub.NOT_ERROR

def register_error_handler(error_handler):
    Hub._origin_handle_error = Hub.handle_error

    def custom_handle_error(self, context, type, value, tb):
        if not issubclass(type, IGNORE_ERROR):
            # print 'Got error from greenlet:', context, type, value, tb
            error_handler(context, (type, value, tb))
        self._origin_handle_error(context, type, value, tb)

    Hub.handle_error = custom_handle_error
To use it, just call it before the event loop is initialized:
def gevent_error_handler(context, exc_info):
    """Here goes your custom error handling logic"""
    e = exc_info[1]
    if isinstance(e, SomeError):
        # do some notify things
        pass
    sentry_client.captureException(exc_info=exc_info)

register_error_handler(gevent_error_handler)
This solution has been tested under gevent 1.0.2 and 1.1b3; we use it to send greenlet error information to Sentry (an exception tracking system), and it has worked pretty well so far.
The main issue with greenlet.link_exception() is that it does not give any traceback information, which can be really important to log.
For logging with the traceback, I use a decorator to spawn jobs which routes the job call through a simple logging wrapper:
from functools import wraps
import logging

import gevent

log = logging.getLogger(__name__)

# note: 'async' became a reserved word in Python 3.7+, so rename this decorator there
def async(wrapped):
    def log_exc(func):
        @wraps(wrapped)
        def wrapper(*args, **kwargs):
            try:
                func(*args, **kwargs)
            except Exception:
                log.exception('%s', func)
        return wrapper

    @wraps(wrapped)
    def wrapper(*args, **kwargs):
        greenlet = gevent.spawn(log_exc(wrapped), *args, **kwargs)
    return wrapper
Of course, you can add a link_exception call to manage the jobs as well (which I did not need).
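For instance, the inner wrapper of the decorator above could also wire up link_exception (a sketch; log_exc, wrapped and log are the names defined above):
@wraps(wrapped)
def wrapper(*args, **kwargs):
    greenlet = gevent.spawn(log_exc(wrapped), *args, **kwargs)
    # additionally report the death itself, on top of the logged traceback
    greenlet.link_exception(lambda g: log.error('%s died: %r', wrapped, g.exception))
    return greenlet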

SQLAlchemy session management in long-running process

Scenario:
A .NET-based application server (Wonderware IAS/System Platform) hosts automation objects that communicate with various equipment on the factory floor.
CPython is hosted inside this application server (using Python for .NET).
The automation objects have scripting functionality built-in (using a custom, .NET-based language). These scripts call Python functions.
The Python functions are part of a system to track Work-In-Progress on the factory floor. The purpose of the system is to track the produced widgets along the process, ensure that the widgets go through the process in the correct order, and check that certain conditions are met along the process. The widget production history and widget state is stored in a relational database, this is where SQLAlchemy plays its part.
For example, when a widget passes a scanner, the automation software triggers the following script (written in the application server's custom scripting language):
' widget_id and scanner_id provided by automation object
' ExecFunction() takes care of calling a CPython function
retval = ExecFunction("WidgetScanned", widget_id, scanner_id);
' if the python function raises an Exception, ErrorOccured will be true
' in this case, any errors should cause the production line to stop.
if (retval.ErrorOccured) then
    ProductionLine.Running = False;
    InformationBoard.DisplayText = "ERROR: " + retval.Exception.Message;
    InformationBoard.SoundAlarm = True
end if;
The script calls the WidgetScanned python function:
# pywip/functions.py
from pywip.database import session
from pywip.model import Widget, WidgetHistoryItem
from pywip import validation, StatusMessage
from datetime import datetime

def WidgetScanned(widget_id, scanner_id):
    widget = session.query(Widget).get(widget_id)
    validation.validate_widget_passed_scanner(widget, scanner_id)  # raises exception on error
    widget.history.append(WidgetHistoryItem(timestamp=datetime.now(), action=u"SCANNED", scanner_id=scanner_id))
    widget.last_scanner = scanner_id
    widget.last_update = datetime.now()
    return StatusMessage("OK")

# ... there are a dozen similar functions
My question is: How do I best manage SQLAlchemy sessions in this scenario? The application server is a long-running process, typically running months between restarts. The application server is single-threaded.
Currently, I do it the following way:
I apply a decorator to the functions I make available to the application server:
# pywip/iasfunctions.py
from pywip import functions
from pywip.database import session

def ias_session_handling(func):
    def _ias_session_handling(*args, **kwargs):
        try:
            retval = func(*args, **kwargs)
            session.commit()
            return retval
        except:
            session.rollback()
            raise
    return _ias_session_handling

# ... actually I populate this module with decorated versions of all the functions in pywip.functions dynamically
WidgetScanned = ias_session_handling(functions.WidgetScanned)
Question: Is the decorator above suitable for handling sessions in a long-running process? Should I call session.remove()?
The SQLAlchemy session object is a scoped session:
# pywip/database.py
from sqlalchemy.orm import scoped_session, sessionmaker
session = scoped_session(sessionmaker())
I want to keep the session management out of the basic functions. For two reasons:
There is another family of functions, sequence functions. The sequence functions call several of the basic functions. One sequence function should equal one database transaction.
I need to be able to use the library from other environments. a) From a TurboGears web application. In that case, session management is done by TurboGears. b) From an IPython shell. In that case, commit/rollback will be explicit.
(I am truly sorry for the long question. But I felt I needed to explain the scenario. Perhaps not necessary?)
The described decorator is suitable for long-running applications, but you can run into trouble if you accidentally share objects between requests. To make such errors appear earlier and avoid corrupting anything, it is better to discard the session with session.remove():
try:
    try:
        retval = func(*args, **kwargs)
        session.commit()
        return retval
    except:
        session.rollback()
        raise
finally:
    session.remove()
Or, if you can, use the with context manager:
try:
    with session.registry().transaction:
        return func(*args, **kwargs)
finally:
    session.remove()
By the way, you might want to use .with_lockmode('update') on the query so your validate doesn't run on stale data.
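Putting the pieces together, the decorator from the question with the suggested session.remove() could look roughly like this (a sketch under the same scoped_session setup):
# pywip/iasfunctions.py (sketch)
from pywip.database import session

def ias_session_handling(func):
    def _ias_session_handling(*args, **kwargs):
        try:
            try:
                retval = func(*args, **kwargs)
                session.commit()
                return retval
            except:
                session.rollback()
                raise
        finally:
            # discard the session so no stale objects leak into the next call
            session.remove()
    return _ias_session_handling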
Ask your Wonderware administrator to give you access to the Wonderware Historian; you can track tag values pretty easily via MSSQL queries over SQLAlchemy, which you can poll every so often.
Another option is to use the ArchestrA toolkit to listen for the internal tag updates and have a server deployed as a platform in the galaxy which you can listen from.
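A minimal polling sketch along those lines, assuming SQLAlchemy with a pyodbc MSSQL driver; the server name, credentials, and the v_AnalogHistory view are assumptions about a typical Historian Runtime database and need to be checked against your installation:
import time
import sqlalchemy as sa

# connection string is a placeholder; "Runtime" is the usual Historian database name
engine = sa.create_engine(
    "mssql+pyodbc://user:password@HISTORIAN_SERVER/Runtime?driver=ODBC+Driver+17+for+SQL+Server"
)

# the exact view/columns depend on your Historian schema; these are assumed
TAG_QUERY = sa.text("SELECT DateTime, Value FROM v_AnalogHistory WHERE TagName = :tag")

def poll_tag(tag, interval=5.0):
    """Poll the Historian for values of a tag every `interval` seconds."""
    while True:
        with engine.connect() as conn:
            for row in conn.execute(TAG_QUERY, {"tag": tag}):
                print(tag, row.DateTime, row.Value)
        time.sleep(interval)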
