Django how to debug a frozen save operation on a queryset object - python

I have the following code in a Django project (within the create method of a Django Rest Framework serializer)
def create(self, validated_data):
    <...>
    log.info("\n\n\n")
    log.info(f"django model: {self.Meta.model}")
    log.info("CREATING CASE NOW .....")
    case = self.Meta.model(**kwargs)
    log.info(f"Case to be saved: {case}")
    case.save()
    log.info(f"Case object Created: {case}")
When I'm posting to the endpoint, it's just freezing up completely on .save(). Here's example output:
2020-06-15 02:47:46,008 - serializers - INFO ===> django model: <class 'citator.models.InternalCase'>
2020-06-15 02:47:46,008 - serializers - INFO ===> django model: <class 'citator.models.InternalCase'>
2020-06-15 02:47:46,009 - serializers - INFO ===> CREATING CASE NOW .....
2020-06-15 02:47:46,009 - serializers - INFO ===> CREATING CASE NOW .....
2020-06-15 02:47:46,010 - serializers - INFO ===> Case to be saved: seychelles8698
2020-06-15 02:47:46,010 - serializers - INFO ===> Case to be saved: seychelles8698
No error is thrown and the connection isn't broken. How can I debug this? Is there a way to get logging from the save method?

The issue is likely unrelated to the use of Django REST Framework serializers, as the code that hangs simply creates a new model instance and saves it. You did not specify how kwargs is defined, but the most likely candidate is that it gets stuck talking to the database.
To debug the code, you should learn how to step through it. There are a number of options depending on your preferences.
Visual studio code
Install the debugpy package.
Run python3 -m debugpy --listen localhost:12345 --pid <pid_of_django_process>
Run the "Python: Remote Attach" command.
CLI
Before the line case.save() do
import pdb; pdb.set_trace()
This assumes you are running the Django server interactively and not e.g. through gunicorn. You will get a debug console right before the save line. When the console appears, type 'c' and press enter to continue execution. Then press Ctrl+C when the process appears stuck. Type bt to find out what goes on in the process.
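In context, the breakpoint sits directly above the hanging call. A sketch based on the snippet from the question (kwargs is whatever the asker builds earlier in create()):
def create(self, validated_data):
    # <...>
    case = self.Meta.model(**kwargs)
    import pdb; pdb.set_trace()  # type 'c' + Enter to continue, Ctrl+C when it hangs, then 'bt'
    case.save()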
Native code
If the stack trace points to native code, you could switch over to gdb. To debug this, make sure to exit any Python debugger or restart the process without a debugger, then run
gdb -p <pid_of_django>
when the process appears stuck. Then type 'bt' and press enter to get a native traceback of what is going on. This should help you identify e.g. a database client acting up.

It is very probable that Django is waiting for a response from the database server and that this is a configuration problem, not a problem in the Python code where it froze. It is better to check and exclude this possibility before debugging anything. For example, it is possible that a table is locked, or that an updated row is locked by another frozen process, while the database's timeout for waiting on the lock is long and Django's timeout for the database response is very long or infinite.
You can confirm this if a similar save operation takes an abnormally long time in another database client, preferably your favorite database manager.
Waiting for a socket response is excluded if you see CPU activity for the stuck Python process.
It may be easier to explore if you can reproduce the problem on the command line with python manage.py shell or python manage.py runserver --nothreading --noreload. Then you can press Ctrl+C and, maybe after some time, Ctrl+C again. If you are lucky, you kill the process and will see a KeyboardInterrupt with a traceback. That helps you identify whether the process was waiting for something other than a database server socket response.
Another possible cause in Django could be custom callback code connected to a pre_save or post_save signal.
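A quick, hedged way to check for that from python manage.py shell is to ask the signals whether anything is listening for the model (InternalCase is taken from the log output in the question):
from django.db.models.signals import pre_save, post_save
from citator.models import InternalCase

# True means some receiver (possibly a slow or blocking one) runs on save().
print("pre_save listeners:", pre_save.has_listeners(InternalCase))
print("post_save listeners:", post_save.has_listeners(InternalCase))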
Instead of plain python manage.py ... you can run python -m pdb manage.py ... and optionally set a breakpoint, or simply press "c" and Enter (continue). The process will run and will not be killed after an exception, but will stay in pdb (the built-in Python debugger).

Related

Local Django server unresponsive when adding function scheduling

I am trying to schedule a function to run everyday using the Schedule
library.
My local Django server however hangs and becomes unresponsive during the system check after saving the schedules to my code. It is only when I remove the schedule code that the system check passes and the server runs without problems.
I have copied the example directly from the documentation and the server is not returning any errors, so I am unsure what the problem is.
views.py
....
def test_task():
    user = User.objects.get(pk=1)
    user.task_complete = True
    user.save()

schedule.every(10).minutes.do(test_task)
while True:
    schedule.run_pending()
    time.sleep(1)
....
Terminal output (hangs here)
chaudim#TD app_one % python3 manage.py runserver
Watching for file changes with StatReloader
Performing system checks...
Django loads (imports) files based on its settings.
When you put this while loop in global scope, it is executed on import. It runs the while loop until it's done, and it's never done. You can add a print statement there if you want to see for yourself whether that's the root cause.
Normally people use periodic tasks from Celery, but that might be overkill for your needs.
I'd rather advise creating a management command, so you could run python manage.py test_task, and at the OS level add a cron job that runs this command every 10 minutes.
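A minimal sketch of such a command, assuming a file at app_one/management/commands/test_task.py (the task_complete field comes from the question; the user model lookup is an assumption):
from django.contrib.auth import get_user_model
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "Run the task that was previously scheduled in views.py."

    def handle(self, *args, **options):
        user = get_user_model().objects.get(pk=1)
        user.task_complete = True  # assumes the user model has this field, as in the question
        user.save()
        self.stdout.write(self.style.SUCCESS("test_task done"))
A crontab entry along the lines of */10 * * * * /path/to/python /path/to/manage.py test_task then runs it every 10 minutes without blocking the web server process.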

Calling Python CLI program from django views and displaying the results back asynchronously?

I have a Django project with a web interface where you can upload files; after the upload succeeds, it calls the CLI version of the software to process them and returns the result after successful execution.
Here is a bit of the snippet I use in my views.py:
from cliproject.main import clirunner
# Some code for file upload and saving
clirunner()
This runs the command-line Python script main.py, which lives inside the cliproject/ directory; it does its work and saves the output.
The problem is that this whole process is synchronous at the moment. Hence, the user's page does not finish loading from the time they upload the file in the UI until the Python CLI script has processed it behind the scenes.
The flow is as follows:
Django UI
| (User upload files)
views.py gets request and saves it somewhere
| (views run clirunner() to give python cli program control)
cliproject runs
| (After doing the stuff which is intended, it saves the output file)
views.py resumes
| (Reads the output file)
Django UI displays the output file
So we can see the problem here: I am calling a different CLI program from views.py to do the work I want, but it happens synchronously.
What I need is to make the process asynchronous, and I want to show something like a loading bar to notify users that the CLI program is executing in the background, along with its status.
After the CLI program is done executing, the loading bar will reach 100% and the Django UI will asynchronously display the output.
I tried Celery. But I could not figure out how to make this loading bar work based on the python cli script. Any ideas?
I have a thought. You need:
A) To launch the task asynchronously
B) To be able to get the value of its current status.
Here's an idea:
1) Make the task a manage.py command that you can invoke using threads or have a Celery task call.
2) As the task executes, have it write its current completion state using a Django model to your DB of choice. (The step above is meant to simplify using the DB. You can always write directly if you need to do so.)
3) Pass the task id (assigned by you or generated by Celery, stored in an indexed db column) to the template context and use an AJAX call to ping a view that returns the percentage complete from a database lookup, then update the loading bar from there.
This way, your view submits and launches the task, it takes care of marking its own work, and then the other view just makes a quick db query to find out where it is.
Edited to add: You could also use the cache backend and write to a key in something like memcached, redis, etc. to avoid pings on your relational database.
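A rough sketch of the database-backed variant; the model and view names here are made up for illustration, not part of the question:
from django.db import models
from django.http import JsonResponse


class TaskProgress(models.Model):
    task_id = models.CharField(max_length=64, unique=True)
    percent = models.PositiveSmallIntegerField(default=0)


def task_progress(request, task_id):
    # Polled via AJAX from the template; the background task updates
    # `percent` (or a cache key) as it works through the upload.
    row = TaskProgress.objects.filter(task_id=task_id).first()
    return JsonResponse({"percent": row.percent if row else 0})
The front end then just polls this view every second or two and moves the loading bar accordingly.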

django daemon fails to read database updates

I have written a small daemon script which checks the status field of one model (Foo); if it is 0 it updates another model (Bar), and if it is 1 it does nothing.
The daemon is running fine, and it updates as expected if the model's status is 0.
Please find the link to the script from here: click
But once we start the daemon script, it is unable to read database changes made after that point.
That is, after the daemon has started running, even if an object is created in model Foo with status 0, the daemon is not able to read it, and consequently the model Bar is not updated as expected.
How could I resolve this issue?
The fundamental problem was that on subsequent daemon checks, the database state appeared the same as when the daemon started.
I got a hint for the solution from this link.
Before each ORM query, I reset the connection, and now the ORM reads the updated database.
def reset_database_connection():
    # Force Django to open a fresh connection (and transaction) on the next
    # ORM query; newer Django versions use db.connections.close_all() instead.
    from django import db
    db.close_connection()
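With that helper in place, the daemon loop calls it before every poll. A sketch only; Foo and Bar come from the question, while the import path, field names and sleep interval are guesses:
import time

from myapp.models import Foo  # hypothetical app path


def run_daemon():
    while True:
        reset_database_connection()  # fresh connection, so newly created rows are visible
        for foo in Foo.objects.filter(status=0):
            # ... update Bar here, as the linked daemon script does ...
            foo.status = 1
            foo.save()
        time.sleep(5)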

Pyramid: restart the apps in a exception view

The command that I've been using is:
pserve development.ini --reload
and every time I hit an error like SQLAlchemy's "IntegrityError" or something else,
I have to kill pserve and type the command again to restart the app.
Is there a way I can restart the app in an exception view, like this?
from pyramid.response import Response
from pyramid.view import view_config

@view_config(context=Exception)
def error_view(exc, request):
    # restart the waitress or apache...
    return Response("Sorry there was an error, wait seconds, we will fix it soon.")
Restarting your server is not a sensible response to an IntegrityError. This is something that is expected to happen, and you need to handle it. Restarting the server really makes no sense in the context of anything other than development.
If you run into exceptions in development, fix the code and save the file and the --reload will automatically restart your server for you.
If you have to restart the application after an exception (supposedly because nothing works after an exception otherwise), it suggests your requests try to re-use the same transaction; in other words, your application is not configured properly.
You should be using a session configured with ZopeTransactionExtension, as Pyramid's scaffolds generate.
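For reference, this is roughly what the older alchemy scaffold generates; if your session is set up differently, that would explain requests re-using a broken transaction:
from sqlalchemy.orm import scoped_session, sessionmaker
from zope.sqlalchemy import ZopeTransactionExtension

# The extension ties the session to the transaction manager (pyramid_tm),
# so each request gets its own transaction and failures are rolled back.
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))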
If you show us some code we may be able to pinpoint the exact cause of the problem.

GAE SDK 1.6.4 dev_appserver datastore flush

Hoping to get a comment from the GAE python team on this one.
Has something changed between 1.6.3, 1.6.4 with regards to the way the dev_appserver datastore is flushed to disk on app exit?
I'm using django-nonrel, and on 1.6.3, and before, I used to be able to load up a python shell:
python manage.py shell (manage.py is provided by django-nonrel)
I could then import my models and access the datastore, modify entities, and quit.
On 1.6.4, I'm able to do that, but when I quit, changes are not saved to the datastore. When I run django-nonrel as a WSGI app, it saves properly, and I see a message on exit ("Applying all pending transactions and saving the datastore").
Thanks to dragonx for his solution and info.
I run my dev server from Eclipse, and I was amazed to see my data not being saved after upgrading to 1.6.4.
I added a flush of the datastore after every web request. To do that, I implemented a base class for all request handlers and overrode dispatch:
import os

import webapp2

developmentServer = False
if os.environ.get('SERVER_SOFTWARE', '').startswith('Development'):
    developmentServer = True


class BaseRequestHandler(webapp2.RequestHandler):
    def dispatch(self):
        retValue = super(BaseRequestHandler, self).dispatch()
        if developmentServer:
            from google.appengine.tools import dev_appserver
            dev_appserver.TearDownStubs()
        return retValue
Informing users about a change in behavior like that in the release notes would have saved me two days of searching for what went wrong in my upgrade.
It looks like there have been some changes. I've been able to hack around the problem with the following:
from google.appengine.tools import dev_appserver
import atexit
atexit.register(dev_appserver.TearDownStubs)
This ensures the datastore is flushed on exit.
Before 1.6.4, we saved the datastore after every write. This method does not work when simulating the transactional model found in the High Replication Datastore (you would lose the last couple of writes). It is also horribly inefficient. We changed it so the datastore dev stub flushes all writes and saves its state on shutdown.
Following the code:
https://bitbucket.org/wkornewald/djangoappengine/src/60c2b3339a9f/management/commands/runserver.py#cl-154
http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/tools/dev_appserver_main.py#683
It looks like manage.py should work if the server is shut down cleanly (with a TERM signal or a KeyboardInterrupt).

Categories

Resources