Using Django 2.2
I have two models, Job and Operation, where Operation has a foreign key relation to Job (ie, a job can have 0 or more operations).
I want people to enter info about jobs and operations together via the Django Admin interface, using inlines. (This means that on the Admin site, on the job create/edit page, a user can add one or more operations "inline" without leaving the page.)
Job has a few fields computed from its related operations. Rather than simply make them computed @property properties, I want them to be regular database fields that get updated via signals when the operations change. It looks something like this:
    from django.db import models
    from django.db.models.signals import post_save
    from django.dispatch import receiver

    class Job(models.Model):
        name = models.CharField(...)

        def compute_fields(self):
            qs = self.operations.filter(...)  # get data from operations
            self.name = ...  # set properties using that data
            self.save()
        ...

    class Operation(models.Model):
        job = models.ForeignKey(Job, related_name="operations", on_delete=models.CASCADE)
        ...

    @receiver(post_save, sender=Operation)
    def update_job_on_operation_save(sender, instance, **kwargs):
        """Update job fields when an operation is saved"""
        instance.job.compute_fields()
Here is the issue: If someone is editing the Job form on the Django Admin, and they add multiple operations inline before hitting save, then the receiver function gets called multiple times simultaneously. I am a little worried about a race condition, as well as the inefficiency of each signal leading Job to recompute some properties and save to the database.
It would be better, perhaps, to attach the receiver to Job, so the function only gets called once, but if someone were to edit an Operation outside of the Job form that should also trigger a recompute.
Is it possible to set up similar post_save receivers for both Job and Operation and say "ignore the operation receiver if the operation was edited inline as part of the job form"? Are there alternative solutions?
Related
Is there any way to pass additional parameters to an instance I'm saving to the DB, so that I can access them after the instance is saved?
Brief example of my case:
I'm using Django's signals as triggers for events, like sending a confirmation email, that are executed by other processes, such as workers.
I want to specify which instances should trigger an event, and when: sometimes I want created/updated records to trigger a series of events, and sometimes I want them to be processed silently or to trigger some other action.
One solution is to save the desired behaviour for a specific instance in a model field such as a JSONField and recover it in post_save, but this seems a very ugly way of handling the problem.
I'm using the post_save signal as verification that the instance was correctly saved in the DB, because I don't want to trigger an event only for something to go wrong a moment later while saving the instance.
Instances are saved through Django Forms, backend routines and REST framework serializers.
One solution is to use an arbitrary model instance attribute (not field) to store the desired state. For example:
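    def my_view(request):
        ...
        instance._send_message = True if ... else False
        instance.save()

    @receiver(post_save, sender=MyModel)
    def my_handler(sender, instance, **kwargs):
        # The attribute only exists on instances where a caller set it,
        # so read it with a default instead of accessing it directly.
        if getattr(instance, "_send_message", False):
            ...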
I have a fairly complex django application that has been in production for over a year.
The application holds data from different customers. The data is obviously in the same table, separated by customer_id.
Recently the client has started to ask questions about data segregation. Since the app is sold on a per user basis and holds sensitive information, customers have been asking if and how we maintain data segregation per customer, and are there any security measures that we take to prevent data leakages (ie. data from one customer being accessed by another customer).
We apply our filters in the view endpoints, but sooner or later a developer on the team might forget to include a filter in an ORM query and cause a data leak.
So we came up with the idea to implement default filters on our models. Basically whenever a developer writes:
Some_Model.objects.all()
essentially they will execute:
Some_Model.objects.filter(customer_id = request.user.customer_id)
We plan to achieve this by overriding the objects property on each model to point to a manager with a filtered queryset. Something like this:
    class Allowed_Some_Model_Manager(models.Manager):
        def get_queryset(self):
            return super(Allowed_Some_Model_Manager, self).get_queryset().filter(
                customer_id=request.user.customer_id
                # the problem is that request.user is not available in models.py
            )

    class Some_Model(models.Model):
        name = models.CharField(max_length=50)
        customer = models.ForeignKey(Customer, on_delete=models.CASCADE)
        objects = Allowed_Some_Model_Manager()
        all_objects = models.Manager()  # use this if we want all objects
However our problem is that request.user is not available in models.py.
I have found several ways to solve this.
Option 1 involves passing request.user to the manager each time. However, since I am dealing with thousands of lines of old code, I don't want to go and change all of our ORM queries.
Option 2 involves using threading.local() to store request.user in thread-local data.
Something like this: https://djangosnippets.org/snippets/2179/
There is a module that seems to be doing this: https://github.com/Alir3z4/django-crequest
However, a lot of people seem to be against this idea... Namely these two discussions:
django get_current_user() middleware - strange error message which goes away if source code is "changed" , which leads to an automatic server restart
Django custom managers - how do I return only objects created by the logged-in user?
So that brings me to Option 3 which I came up with, and I can not find anybody else using it. Use the python builtins module to pass the user from the middleware to the model.
    # middleware.py
    import builtins

    def process_request(self, request):
        if request.user.id:
            builtins.django_user = request.user

    # models.py
    import builtins

    class Allowed_Some_Model_Manager(models.Manager):
        def get_queryset(self):
            if 'django_user' in vars(builtins):
                return super(Allowed_Some_Model_Manager, self).get_queryset().filter(
                    customer_id=django_user.customer_id
                )
            else:
                return super(Allowed_Some_Model_Manager, self).get_queryset()
I have tested the code and it is working on my local django server and on Apache with mod_wsgi. But I really want to hear if there are any pitfalls of this approach. I have never used builtins module before, and I am not sure if I understand how it works, and what is the use-case for it.
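One pitfall can be checked without Django: attributes set on the builtins module are shared by the whole process, so under a multi-threaded server two concurrent requests would overwrite each other's user. The sketch below is only an illustration, with plain threads standing in for requests (handle_request and the user names are made up for the demo). It stores a value both on builtins and in a threading.local() and compares what each "request" reads back.

```python
import builtins
import threading

_local = threading.local()
barrier = threading.Barrier(2)
seen = {}  # user name -> (value read from builtins, value read from the thread-local)


def handle_request(user):
    # "Middleware" step: store the current user in both places.
    builtins.django_user = user
    _local.user = user
    # Wait until the other "request" has stored its user as well.
    barrier.wait()
    # "Model manager" step: read the user back.
    seen[user] = (builtins.django_user, _local.user)


threads = [threading.Thread(target=handle_request, args=(name,)) for name in ("alice", "bob")]
for t in threads:
    t.start()
for t in threads:
    t.join()

via_builtins = {pair[0] for pair in seen.values()}
via_local = {pair[1] for pair in seen.values()}
print(via_builtins)  # a single name: both requests see whichever user was written last
print(via_local)     # both names: each thread keeps its own value
del builtins.django_user  # clean up the process-wide attribute
```

Note also that the middleware in the question only assigns when request.user.id is truthy, so a request without a logged-in user would still see the value left behind by an earlier request in the same process.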
I have two models, one of which uses data from the other model to populate its own fields. The issue is that when the first model is updated, the second model does not also update its own fields. I have to go in and actually edit/save the 2nd model for its fields to update.
Something like this:
models.py:
    class ModelA(models.Model):
        ...

    class ModelB(models.Model):
        count_number_of_model_A = models.IntegerField()

        def save(self, *args, **kwargs):
            self.count_number_of_model_A = ModelA.objects.all().count()
            super(ModelB, self).save(*args, **kwargs)
(this is a simplified version of what I'm trying to do)
Now I want the field "count_number_of_model_A" in ModelB to update every time ModelA is altered. Right now, it only refreshes if I actually modify+save ModelB.
I think the answer is to use signals (maybe?). I'm trying to set up a signal so that ModelB updates whenever a new object is created in ModelA. I have the following:
    @receiver(post_save, sender=ModelA)
    def update_sends(sender, **kwargs):
        if kwargs.get('created', False):
            # some code here to refresh ModelB??
            pass
The signal is functioning properly, as if I put in something like ModelB.objects.filter(some filter).update(some field), those changes are reflected when I go in and create a new ModelA object. But the whole model itself does not update, and the field in question that I'm after ("count_number_of_model_A") does not refresh.
Any help?
Just use:
    for model_b in ModelB.objects.filter(<some_filter>):
        model_b.save()
But you should be aware that this pulls all the (filtered) objects into Django, does something with them there, and saves them back to the database one by one. This is much slower than using query expressions. Those take a little more work to set up, but they run much faster, especially as the database grows.
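To make the difference concrete, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in for the ORM (the table and column names are invented for the illustration). The loop mirrors calling save() on each object; the single UPDATE is roughly the kind of statement a queryset update() sends to the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model_a (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE model_b (id INTEGER PRIMARY KEY, count_a INTEGER DEFAULT 0)")
conn.executemany("INSERT INTO model_a (id) VALUES (?)", [(i,) for i in range(1, 4)])
conn.executemany("INSERT INTO model_b (id) VALUES (?)", [(i,) for i in range(1, 6)])

# Row by row: fetch every row into Python, then write each one back (one statement per row).
count_a = conn.execute("SELECT COUNT(*) FROM model_a").fetchone()[0]
for (row_id,) in conn.execute("SELECT id FROM model_b").fetchall():
    conn.execute("UPDATE model_b SET count_a = ? WHERE id = ?", (count_a, row_id))

# Set-based: a fourth model_a row is added, then a single statement lets the
# database compute the count and update every model_b row at once.
conn.execute("INSERT INTO model_a (id) VALUES (4)")
conn.execute("UPDATE model_b SET count_a = (SELECT COUNT(*) FROM model_a)")

counts = [row[0] for row in conn.execute("SELECT count_a FROM model_b ORDER BY id")]
print(counts)  # [4, 4, 4, 4, 4]
```

In Django terms the second form corresponds to the ModelB.objects.filter(...).update(...) call that the question already found to work from the signal handler. Keep in mind that update() does not call save() on the instances and does not send post_save signals.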
I'm writing an application in Django (which I'm very new to) where the admin area will be exposed to 'customers' of the application, not just staff/superusers, because of the nature of the application and the way Django automatically generates forms in the admin area with so little code.
As such I need a robust and manageable way to handle authentication and data separation, so that only data created by a user is seen by that user.
At the moment I'm just using the default admin package and changing permissions for 'client users' to filter what data they can see (I only want them to see data they've created) using code like the below:
    class MyModelAdmin(admin.ModelAdmin):
        def get_queryset(self, request):
            qs = super(MyModelAdmin, self).get_queryset(request)
            return qs.filter(user=request.user)

        def save_model(self, request, obj, form, change):
            # field not editable in admin area so handle it here...
            obj.user = request.user
            obj.save()
However as the application scales, I can see ensuring this type of data filtering becoming difficult to manage, for example if there are chains of foreign keys on certain tables(A->B B->C C->D), and to filter the table at the end of the chain I need to do various JOINs to get the rows which relate to the current user.
A couple of solutions I'm pondering are creating a separate admin app per user, but this feels like overkill and even more unmanageable.
Or just adding the user column to every Model where data filtering by user is required to make it easier to filter.
Any thoughts on the best approach to take?
First off, from experience, you're better off offering editing and creating functionality to your users in an actual django app, using Views. Generic views make this very easy. Once you let your users into the admin, they will get used to it and it's hard to get them to leave.
Additionally you should use contrib.auth.Group together with django-guardian to keep track of object-level permissions instead of implementing it yourself. It pays off in the long run.
If you want to build this yourself, however, you have more than one sensible choice:
owner on root objects in the ForeignKey pyramid
owner on every model
To realize the first option, you should implement two methods on every model down the ForeignKey chain:
    def get_parent(self):
        """Should return the object that should be queried for ownership information"""
        pass

    def owned_by(self, user):
        """Should return a boolean indicating whether `user` owns the object, by querying `self.get_parent().owned_by(user)`"""
        pass
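A minimal sketch of how these two methods could chain, using plain Python classes as stand-ins for models. Project, Task, Comment and the OwnedThroughParent mixin are hypothetical names, and the parent attributes stand in for ForeignKey fields.

```python
class Project:
    """Root of the chain: stores the owner directly."""

    def __init__(self, owner):
        self.owner = owner

    def get_parent(self):
        return None  # nothing above the root

    def owned_by(self, user):
        return self.owner == user


class OwnedThroughParent:
    """Mixin for objects further down the chain: delegate to the parent."""

    def get_parent(self):
        raise NotImplementedError

    def owned_by(self, user):
        return self.get_parent().owned_by(user)


class Task(OwnedThroughParent):
    def __init__(self, project):
        self.project = project

    def get_parent(self):
        return self.project


class Comment(OwnedThroughParent):
    def __init__(self, task):
        self.task = task

    def get_parent(self):
        return self.task


comment = Comment(Task(Project(owner="peter")))
print(comment.owned_by("peter"))  # True
print(comment.owned_by("alex"))   # False
```

With real models, each hop in get_parent() follows a ForeignKey, so a deep chain means one extra lookup per level unless the related objects are fetched together.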
However, as you stated, this incurs many JOINs if your schema is sufficiently complex.
I would advise you to store the information about the owner in every model; everything else is a maintenance nightmare in my experience.
Instead of adding the field manually to every model, you should use inheritance. However, Django provides poor built-in support for inheritance with relations: an abstract base model cannot define a models.ForeignKey, so you're stuck with table-based inheritance.
Table-based inheritance brings a problem of its own. Consider these models:
    from django.conf import settings
    from django.db import models

    class Base(models.Model):
        owner = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)

    class ChildA(Base):
        name = models.CharField(max_length=5)

    class ChildB(Base):
        location = models.CharField(max_length=5)
It is easy to find the owner of a given instance of ChildA or ChildB:
>>> obj = ChildA.objects.create(owner=Peter, name="alex")
>>> obj.owner
Peter
However it is non trivial to find all objects owned by a particular user:
>>> Base.objects.filter(owner=Peter)
<Base-object at 0xffffff>
The default manager returns a Base object, and doesn't contain information about whether it is a ChildA or ChildB instance, which can be troublesome.
To circumvent this, I recommend a polymorphic approach with django-polymorphic or django-model-utils, which is more lightweight. They both provide means to retrieve the child classes for a given Base model in the queries.
See my answer here for more information on polymorphism in django.
These also incur JOINs, but at least the complexity is manageable.
Having something like
created_by
created_date
modified_by
modified_date
Would be a very common pattern for a lot of tables.
1) You can set created date automatically (but not others) in model.py with
created_date = models.DateTimeField(auto_now_add=True, editable=False)
2) You could do created/modified dates (but not by/user as don't have request context) in model.py with
    def save(self, *args, **kwargs):
        if self.id:
            self.modified_date = datetime.now()
        else:
            self.created_date = datetime.now()
        super(MyModel, self).save(*args, **kwargs)
3) You could set the created/modified date and user in admin.py, but this doesn't deal with non-admin updates
    def save_model(self, request, obj, form, change):
        if change:
            obj.modified_by = request.user
            obj.modified_date = datetime.now()
        else:
            obj.created_by = request.user
            obj.created_date = datetime.now()
        obj.save()
4) And the final place would be in the view.py which can do all 4, but doesn't cover admin updates.
So realistically have to have logic spread out, at a minimum repeated in 3 & 4 (or a method on the model called from both, which will be missed)
What's a better way? (I've been working with Python/Django for a couple of days, so I could easily be missing something obvious.)
Can you do something like @login_required, e.g. @audit_changes?
Can you get access to the request and current user in the model and centralise logic there?
The create/modification dates can be handled by Django now, so they can be implemented like:
    class BaseModel(models.Model):
        created_date = models.DateTimeField(auto_now_add=True)
        modified_date = models.DateTimeField(auto_now=True)

        class Meta:
            abstract = True
By adding this to an abstract model base class, it can easily be added to all models of the application.
Storing the user is harder, since the request.user is not available. As SeanOC mentioned, this is a separation of concerns between the web request, and model layer. Either you pass this field all the time, or store request.user in a threadlocal. Django CMS does this for their permission system.
    class CurrentUserMiddleware(object):
        def process_request(self, request):
            set_current_user(getattr(request, 'user', None))
And the user tracking happens elsewhere:
    from threading import local

    _thread_locals = local()

    def set_current_user(user):
        _thread_locals.user = user

    def get_current_user():
        return getattr(_thread_locals, 'user', None)
For non-web environments (e.g. management commands), you'd have to call set_current_user at the start of the script.
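For scripts, one way to avoid leaving a stale user behind is to wrap the calls in a context manager. This is a sketch only: as_user is a hypothetical helper, not part of Django or Django CMS, and the block repeats the two thread-local functions so that it runs on its own.

```python
from contextlib import contextmanager
from threading import local

_thread_locals = local()


def set_current_user(user):
    _thread_locals.user = user


def get_current_user():
    return getattr(_thread_locals, "user", None)


@contextmanager
def as_user(user):
    """Set the current user for the duration of the block, then restore the previous one."""
    previous = get_current_user()
    set_current_user(user)
    try:
        yield user
    finally:
        set_current_user(previous)


with as_user("import-script"):
    inside = get_current_user()
after = get_current_user()
print(inside, after)  # import-script None
```

The same restore-on-exit idea applies to the middleware: if the thread-local is never reset at the end of a request, a worker thread that is reused for the next request can still see the previous user.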
For timestamped models you probably want to look at django-model-utils or django-extensions. They each include abstract base classes which automatically handle created and last-modified timestamps. You can either use these tools directly or look at how they solved the problem and come up with your own solution.
As for your other questions:
Can you do something like @login_required, e.g. @audit_changes?
Potentially yes, but you'd have to be very careful to keep things thread-safe. What you could do is have your @audit_changes decorator set a flag in a thread-local to enable auditing. Then, either in the save method of your models or in a signal handler, you could check for that flag and record your audit info if it has been set.
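A sketch of that idea in plain Python. The names audit_changes, auditing_enabled and save are hypothetical stand-ins; in a real project the check would live in a model's save method or in a signal handler.

```python
import functools
import threading

_audit_state = threading.local()
audit_log = []


def auditing_enabled():
    return getattr(_audit_state, "enabled", False)


def audit_changes(view):
    """Enable auditing for the current thread while the wrapped view runs."""
    @functools.wraps(view)
    def wrapper(*args, **kwargs):
        previous = auditing_enabled()
        _audit_state.enabled = True
        try:
            return view(*args, **kwargs)
        finally:
            _audit_state.enabled = previous  # never leak the flag to the next request
    return wrapper


def save(obj):
    # Stand-in for Model.save(): record audit info only when the flag is set.
    if auditing_enabled():
        audit_log.append(obj)


@audit_changes
def audited_view():
    save("edited in audited view")


def plain_view():
    save("edited in plain view")


audited_view()
plain_view()
print(audit_log)  # ['edited in audited view']
```

Resetting the flag in the finally clause is the part that keeps this safe when worker threads are reused between requests.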
Can you get access to the request and current user in the model and centralise logic there?
Yes, but you'll be making a tradeoff. As you've touched on a little bit, there is a very clear and intentional separation of concerns between Django's ORM and its request/authentication handling bits. There are two ways to get information from the request (the current user) to the ORM (your models): you can manually manage updating the creator/modifier information on your objects, or you can set up a mechanism that handles that maintenance work automatically.
If you take the manual approach (passing the information through method calls from the request in the view to the ORM), there will be more code to maintain and test, but you keep the separation of concerns in place. You will also be in much better shape if you ever have to work with your objects outside of the request/response cycle (e.g. cron scripts, delayed tasks, the interactive shell).
If you are OK with breaking down that separation of concerns, you could set a thread-local with the current user in a middleware and then look at that thread-local in the save method of your model. Conversely, you'll have less code to deal with, but a much harder time if you ever want to work with your objects outside of the request/response cycle. Additionally, you will have to be very careful to keep everything thread-safe with the more automated approach.
Can you import the User model object and call get_current()?
Also, I think you can call views in the admin.py.