Django access to database from another process - python

I create a new Process from a Django app. Can I create a new record in the database from this process?
My code throws exception:
django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet.
UPD_1
from datetime import datetime
from multiprocessing import Process, Value

def post(self, request):
    v = Value('b', True)
    proc = Process(target=start, args=(v, request.user,
                   request.data['stock'], request.data['pair'], '1111'))
    proc.start()

def start(v, user, stock_exchange, pair, msg):
    MyModel.objects.create(user=user, stock_exchange=stock_exchange, pair=pair,
                           date=datetime.now(), message=msg)

You need to initialise the project first. You don't usually have to do this when going through manage.py, because it does it automatically, but a new process won't have had this done for it. So you need to put something like the following at the top of your code:
import django
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
django.setup()
myproject.settings needs to be importable from wherever this code is running, so if it isn't, you may need to add its directory to sys.path first.
Once this is done you can access your project's models, and use them to access your database, just like you normally would.
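For instance, a minimal sketch of the process target living in its own module with the setup call added (the settings path and app name are placeholders, not from the question):
# worker.py -- "myproject.settings" and "myapp" are placeholder names
import os
import django

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
django.setup()  # must run before any model import in the new process

from datetime import datetime
from myapp.models import MyModel

def start(v, user, stock_exchange, pair, msg):
    # once the app registry is populated, the ORM works as usual
    MyModel.objects.create(user=user, stock_exchange=stock_exchange, pair=pair,
                           date=datetime.now(), message=msg)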

I was having a similar problem (also starting a process from a view), and what eventually helped me solve it was this answer.
The solution indicated is to close your DB connections just before you fork the new Process, so Django can recreate the connection when a query is needed in the new process. Adapted to your code it would be:
def post(self, request):
    v = Value('b', True)

    # close db connections here
    from django import db
    db.connections.close_all()

    # create and fork your process
    proc = Process(target=start, args=(v, request.user,
                   request.data['stock'], request.data['pair'], '1111'))
    proc.start()
Calling django.setup() did not help in my case; after reading the linked answer, that is probably because forked processes already share the file descriptors, etc. of their parent process (so Django is already set up).
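The same idea can be wrapped in a tiny helper so every worker is forked with a clean slate (a sketch, not part of the original answer):
from multiprocessing import Process
from django import db

def spawn(target, *args):
    # the parent's database sockets must not be inherited by the child;
    # close them first, and Django reconnects lazily on the next query
    db.connections.close_all()
    proc = Process(target=target, args=args)
    proc.start()
    return proc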

Related

Flask SqlAlchemy/Alembic migration sends invalid charset to PyMysql

I've spent 3+ hours on this on 18 of the last 21 days. Please, someone, tell me what I'm misunderstanding!
TL;DR: My code is repeatedly sending the db charset as a string to PyMysql, while it expects an object with an attribute called "encoding".
Background
This is Python code running in a Docker container. A second container houses the database. The database address is stored in a .env variable called ENGINE_URL:
ENGINE_URL=mysql+pymysql://root:#database/starfinder_development?charset=utf8
I'm firing off Alembic and Flask-Alembic commands using click commands in the CLI. All of the methods below are used in CLI commands.
Models / Database Setup (works)
from flask import Flask
from flask_sqlalchemy import SQLAlchemy  # assumed source of SQLAlchemy below

flask_app = Flask(__name__)
db_engine = SQLAlchemy(flask_app)
from my_application import models

def create_database():
    db_engine.create_all()
At this point I can open up the database container and use the MySQL CLI to see that all of my models have now been converted into tables with columns and relationships.
Attempt 1: Alembic
Create Revision Files with Alembic (works)
from alembic.config import Config

def main():  # fires prior to any CLI command
    filepath = os.path.join(os.path.dirname(__file__), "alembic.ini")
    alembic_config = Config(file_=filepath)
    alembic_config.set_main_option("sqlalchemy.url", ENGINE_URL)
    alembic_config.set_main_option("script_location", SCRIPT_LOCATION)
    migrate_cli(obj={"alembic_config": alembic_config})

def revision(ctx, message):
    alembic_config = ctx.obj["alembic_config"]
    alembic_command.revision(alembic_config, message)
At this point I have a migration file that was created exactly as expected. Then I need to upgrade the database using that migration...
Running Migrations with Alembic (fails)
def upgrade(ctx, migration_revision):
    alembic_config = ctx.obj["alembic_config"]
    migration_revision = migration_revision.lower()
    _dispatch_alembic_cmd(alembic_config, "upgrade", revision=migration_revision)
Firing this off with cli_command upgrade head causes a failure, which I've included at the bottom because its stack trace is identical to that of my second attempt.
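For comparison, a sketch of the same upgrade step going through the public Alembic API directly (assuming _dispatch_alembic_cmd is just a thin wrapper around it):
from alembic import command as alembic_command

def upgrade(ctx, migration_revision):
    alembic_config = ctx.obj["alembic_config"]
    # alembic.command.upgrade(config, revision) is the documented entry point
    alembic_command.upgrade(alembic_config, migration_revision.lower())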
Attempt 2: Flask-Alembic
This attempt finds me completely rewriting my main and revision commands, but it doesn't get as far as using upgrade.
Create Revision Files with Flask-Alembic (fails)
def main():  # fires prior to any CLI command
    alembic_config = Alembic()
    alembic_config.init_app(flask_app)
    migrate_cli(obj={"alembic_config": alembic_config})

def revision(ctx, message):
    with flask_app.app_context():
        alembic_config = ctx.obj["alembic_config"]
        print(alembic_config.revision(message))
This results in an error that is identical to the error from my previous attempt.
The stack trace in both cases:
(Identical failure using alembic upgrade & flask-alembic revision)
File "/Users/MyUser/.pyenv/versions/3.6.2/envs/sf/lib/python3.6/site-packages/pymysql/connections.py", line 678, in __init__
self.encoding = charset_by_name(self.charset).encoding
AttributeError: 'NoneType' object has no attribute 'encoding'
In response, I went into the above file and added a print at line 677, immediately prior to the error:
print(self.charset)
utf8
Note: If I modify my ENGINE_URL to use a different ?charset=xxx, that change is reflected here.
So now I'm stumped
PyMysql expects self.charset to have an attribute encoding, but self.charset is simply a string. How can I change this to behave as expected?
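One diagnostic (not from the original post) is to query PyMySQL's charset registry directly; charset_by_name() returns a Charset object with an .encoding attribute, or None for an unrecognised name, which is exactly what produces the AttributeError above:
from pymysql.charset import charset_by_name

cs = charset_by_name("utf8")   # the value printed above
print(cs)                      # expected: a Charset instance, not None
print(cs.encoding if cs else "charset name not recognised by this PyMySQL")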
Help?
A valid answer would be an alternative process, though the "most correct" answer would be to help me resolve the charset/encoding problem.
My primary goal here is simply to get migrations working on my flask app.

django DoesNotExist matching query does not exist with postgres only [duplicate]

This question already has answers here:
Django related objects are missing from celery task (race condition?)
(3 answers)
Closed 5 years ago.
The assets Django app I'm working on runs well with SQLite, but I am facing performance issues with deletes/updates of large sets of records, so I am making the transition to a PostgreSQL database.
To do so, I am starting fresh by updating theapp/settings.py to configure PostgreSQL, starting with a fresh db and deleting the assets/migrations/ directory. I am then running:
./manage.py makemigrations assets
./manage.py migrate --run-syncdb
./manage.py createsuperuser
I have a function called within a registered post_create signal. It runs a scan when a Scan object is created. Within the class assets.models.Scan:
@classmethod
def post_create(cls, sender, instance, created, *args, **kwargs):
    if not created:
        return
    from celery.result import AsyncResult
    # get the domains for the project, from scan
    print("debug: task = tasks.populate_endpoints.delay({})".format(instance.pk))
    task = tasks.populate_endpoints.delay(instance.pk)
The offending code:
from celery import shared_task
....
import datetime

@shared_task
def populate_endpoints(scan_pk):
    from .models import Scan, Project
    from anotherapp.plugins.sensual import subdomains

    scan = Scan.objects.get(pk=scan_pk)  # <<<<<<<< django no like
    new_entries_count = 0
    project = Project.objects.get(id=scan.project.id)
    ....
The resulting DoesNotExist exception:
debug: task = tasks.populate_endpoints.delay(2)
[2017-09-14 23:18:34,950: ERROR/ForkPoolWorker-8] Task assets.tasks.populate_endpoints[4555d329-2873-4184-be60-55e44c46a858] raised unexpected: DoesNotExist('Scan matching query does not exist.',)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/celery/app/trace.py", line 374, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/app/trace.py", line 629, in __protected_call__
return self.run(*args, **kwargs)
File "/usr/src/app/theapp/assets/tasks.py", line 12, in populate_endpoints
scan = Scan.objects.get(pk=scan_pk)
Interacting through ./manage.py shell, however, indicates that the Scan object with pk == 2 exists:
>>> from assets.models import Scan
>>> Scan.objects.all()
<QuerySet [<Scan: ACME Web Test Scan>]>
>>> s = Scan.objects.all().first()
>>> s.pk
2
My only guess is that at the time the post_create function is called, the Scan object still does not exist in the PostgreSQL database, despite save() having been called.
SQLite does not exhibit this problem.
Also, I haven't found a relevant, related problem on stackoverflow as the DoesNotExist exception looks to be fairly generic and caused by many things.
Any ideas on this would be much appreciated.
This is a well-known problem resulting from transactions and isolation level: sometimes the transaction has not yet been committed when the task executes, and if your isolation level is READ COMMITTED then you indeed cannot read that record from another process. Django 1.9 introduced the on_commit hook as a solution.
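Concretely, a sketch of the signal handler from the question with the dispatch deferred via transaction.on_commit (Django >= 1.9):
from django.db import transaction

@classmethod
def post_create(cls, sender, instance, created, *args, **kwargs):
    if not created:
        return
    # only enqueue the task once the row is actually visible to other processes
    transaction.on_commit(lambda: tasks.populate_endpoints.delay(instance.pk))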
NB: technically this question is a duplicate of Django related objects are missing from celery task (race condition?), but the accepted answer there uses django-transaction-hooks, which has since been merged into Django.

Django autoreload: add watched file

When source files in my project change, django server reloads. I want to extend this to non-Python source files. I use native SQL queries, which are stored in separate files (eg. big_select.sql), and I want the server to reload when these files change.
I use Django on Windows.
I have tried adding a .py extension, which didn't work.
Django>=2.2
The autoreloading was given a major overhaul (thanks to @Glenn, who pointed out the incoming changes in this comment!), so you no longer have to use undocumented Django features and append files to _cached_filenames. Instead, register a custom signal listener that listens for the autoreload start signal:
# apps.py
from django.apps import AppConfig
from django.utils.autoreload import autoreload_started


def my_watchdog(sender, **kwargs):
    sender.watch_file('/tmp/foo.bar')
    # to listen to multiple files, use watch_dir, e.g.
    # sender.watch_dir('/tmp/', '*.bar')


class EggsConfig(AppConfig):
    name = 'eggs'

    def ready(self):
        autoreload_started.connect(my_watchdog)
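Adapted to the question's use case, the same handler could watch the raw SQL files instead (a sketch; the directory is a placeholder):
def my_watchdog(sender, **kwargs):
    # reload whenever any .sql file under the (placeholder) queries directory changes
    sender.watch_dir('/path/to/myproject/sql', '*.sql')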
Django<2.2
Django stores the watched filepaths in the django.utils.autoreload._cached_filenames list, so adding to or removing items from it will force Django to start or stop watching files.
As for your problem, this is a (somewhat hacky) solution. For demo purposes, I adapted apps.py so the file starts being watched right after Django initializes, but feel free to put the code wherever you want. First of all, create the file, as Django can only watch files that already exist:
$ touch /tmp/foo.bar
In your django app:
# apps.py
from django.apps import AppConfig
...
import django.utils.autoreload


class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        ...
        django.utils.autoreload._cached_filenames.append('/tmp/foo.bar')
Now start the server and, in another console, modify the watched file:
$ echo baz >> /tmp/foo.bar
The server should trigger an autoreload now.
The accepted answer did not work for me in Django 3.0.7, probably due to changes since then.
I came up with the following after going through autoreload:
import os
from pathlib import Path

from django.utils.autoreload import autoreload_started


# Watch .conf files
def watch_extra_files(sender, *args, **kwargs):
    watch = sender.extra_files.add
    # List of file paths to watch
    watch_list = [
        FILE1,
        FILE2,
        FILE3,
        FILE4,
    ]
    for file in watch_list:
        if os.path.exists(file):  # personal use case
            watch(Path(file))


autoreload_started.connect(watch_extra_files)
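If you prefer registering this from an app rather than at module import time, the handler can be connected in AppConfig.ready(), mirroring the first answer (a sketch; names are placeholders):
from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        from django.utils.autoreload import autoreload_started
        autoreload_started.connect(watch_extra_files)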

What is the correct way to start endless threads when django is run as fcgi?

I want to use pyinotify to watch changes on the filesystem. If a file has changed, I want to update my database file accordingly (re-read tags, other information...)
I put the following code in my app's signals.py
import pyinotify
....

# create filesystem watcher in separate thread
wm = pyinotify.WatchManager()
notifier = pyinotify.ThreadedNotifier(wm, ProcessInotifyEvent())
# notifier.setDaemon(True)
notifier.start()

mask = pyinotify.IN_CLOSE_WRITE | pyinotify.IN_CREATE | pyinotify.IN_MOVED_TO | pyinotify.IN_MOVED_FROM
dbgprint("Adding path to WatchManager:", settings.MUSIC_PATH)
wdd = wm.add_watch(settings.MUSIC_PATH, mask, rec=True, auto_add=True)

def connect_all():
    """
    to be called from models.py
    """
    rescan_start.connect(rescan_start_callback)
    upload_done.connect(upload_done_callback)
    ....
This works great when Django is run with ./manage.py runserver. However, when run as ./manage.py runfcgi, Django won't start. There is no error message; it just hangs and won't daemonize, probably at the line notifier.start().
When I run ./manage.py runfcgi method=threaded and enable the line notifier.setDaemon(True), the notifier thread is stopped (isAlive() == False).
What is the correct way to start endless threads together with Django when it is run as FCGI? Is it even possible?
Well, duh. Never start your own endless thread alongside Django. I use celery, which handles running this kind of background work a bit better.
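As a rough sketch of that alternative (the model and tag-reading helper here are hypothetical, not from the answer): keep the inotify loop out of the web process entirely and hand the per-file work to a Celery task.
from celery import shared_task

@shared_task
def process_changed_file(path):
    # hypothetical model and helper; the point is that the long-lived
    # watcher runs in its own process and only enqueues small jobs
    from myapp.models import Track
    from myapp.tags import read_tags

    tags = read_tags(path)  # returns a dict of tag fields (assumption)
    Track.objects.update_or_create(path=path, defaults=tags)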

Gearman + SQLAlchemy - keep losing MySQL thread

I have a python script that sets up several gearman workers. They call into some methods on SQLAlchemy models I have that are also used by a Pylons app.
Everything works fine for an hour or two, then the MySQL thread gets lost and all queries fail. I cannot figure out why the thread is getting lost (I get the same results on 3 different servers) when I am defining such a low value for pool_recycle. Also, why wouldn't a new connection be created?
Any ideas of things to investigate?
import gearman
import json
import ConfigParser
import sys
from sqlalchemy import create_engine


class JSONDataEncoder(gearman.DataEncoder):
    @classmethod
    def encode(cls, encodable_object):
        return json.dumps(encodable_object)

    @classmethod
    def decode(cls, decodable_string):
        return json.loads(decodable_string)


# get the ini path and load the gearman server ips:ports
try:
    ini_file = sys.argv[1]
    lib_path = sys.argv[2]
except Exception:
    raise Exception("ini file path or anypy lib path not set")

# get the config
config = ConfigParser.ConfigParser()
config.read(ini_file)
sqlachemy_url = config.get('app:main', 'sqlalchemy.url')
gearman_servers = config.get('app:main', 'gearman.mysql_servers').split(",")

# add anypy include path
sys.path.append(lib_path)
from mypylonsapp.model.user import User, init_model
from mypylonsapp.model.gearman import task_rates

# sqlalchemy setup, recycle connection every hour
engine = create_engine(sqlachemy_url, pool_recycle=3600)
init_model(engine)

# Gearman Worker Setup
gm_worker = gearman.GearmanWorker(gearman_servers)
gm_worker.data_encoder = JSONDataEncoder()

# register the workers
gm_worker.register_task('login', User.login_gearman_worker)
gm_worker.register_task('rates', task_rates)

# work
gm_worker.work()
I've seen this across the board for Ruby, PHP, and Python, regardless of the DB library used. I couldn't find how to fix this the "right" way, which is to use mysql_ping, but there is a SQLAlchemy solution, explained better here: http://groups.google.com/group/sqlalchemy/browse_thread/thread/9412808e695168ea/c31f5c967c135be0
As someone in that thread points out, setting the recycle option to equal True is equivalent to setting it to 1. A better solution might be to find your MySQL connection timeout value and set the recycle threshold to 80% of it.
You can get that value from a live server by looking up this variable: http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_connect_timeout
Edit:
It took me a bit to find the authoritative documentation on using pool_recycle:
http://www.sqlalchemy.org/docs/05/reference/sqlalchemy/connections.html?highlight=pool_recycle
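A sketch of that tuning applied to the script above, using its variable names (the timeout figure is a placeholder; read the real value from your MySQL server):
from sqlalchemy import create_engine

mysql_timeout = 3600  # placeholder: use the value reported by your server
engine = create_engine(sqlachemy_url, pool_recycle=int(mysql_timeout * 0.8))
init_model(engine)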
