I have a database model that is being updated based on changes in remote data (via an HTML scraper).
I want to maintain a field called changed: a timestamp recording when the model's values last changed from what they were previously (note that this is different from auto_now, since auto_now fields are updated every time a model's save method is called).
Here is my question:
In a model's save method, is there a straightforward way to detect if a model instance's current values are different from the values in the database? Or, are there any alternative methods to easily maintain a changed timestamp?
If you save your instance through a form, you can check form.has_changed().
http://code.activestate.com/pypm/django-dirtyfields/
Tracks dirty/changed fields on a django model instance.
Sounds to me like what you want is Signals: http://docs.djangoproject.com/en/1.2/topics/signals/
You could use a post_save signal to update a related field in another model to store the previous value. Then on the next go-round you'd have something to compare.
You might try computing a checksum of the record values when you save them. Then, when you read the record later, recompute the checksum and see whether it has changed. The crc32 function in Python's standard zlib module might be a good fit. (I'm not sure what kind of performance this would have, so you may want to investigate that.)
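To make the checksum idea concrete, here is a minimal, framework-free sketch. The record is a plain dict standing in for a model row, and the names record_checksum and update_changed are made up for illustration. Keep in mind that CRC32 is only 32 bits wide, so two different sets of values can occasionally produce the same checksum.

```python
import zlib
from datetime import datetime, timezone


def record_checksum(values):
    """Return a CRC32 checksum over a dict of field values."""
    # Sort the keys so that field order does not affect the result;
    # repr() gives a stable text form for simple values (str, int, None).
    payload = "|".join(f"{name}={values[name]!r}" for name in sorted(values))
    return zlib.crc32(payload.encode("utf-8"))


def update_changed(record, new_values, now=None):
    """Store new_values in record; bump 'changed' only if the checksum differs.

    Returns True if the record was treated as changed.
    """
    new_sum = record_checksum(new_values)
    if new_sum == record.get("checksum"):
        return False
    record["values"] = dict(new_values)
    record["checksum"] = new_sum
    record["changed"] = now or datetime.now(timezone.utc)
    return True
```

In a Django model you would store the checksum in its own column and do the comparison inside save(), which avoids fetching the old row just to compare it.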
This library also tracks FK lookups.
https://github.com/mmilkin/django_dirty_bits
Using: Django 1.10, django-reversion 2.0.8.
My question is how to show a nice list of changes made to a given model instance. By that I mean that the user can quickly see a list of all the changes (new values for fields) across all revisions. They don't need to see all the fields, only the new values of the changed ones.
So I found that a good tool for storing changes is django-reversion. However, I cannot find a solution for my problem which as I mentioned is to show a nice change-log history for a given model instance.
I found a solution that can compare two revisions, django-reversion-compare, but that is not what I am looking for. Maybe there is a better tool for that?
The task is to quickly show the user what was changed, by whom, and when. The model is simple and doesn't store a lot of data. It does, however, store foreign keys.
I was also looking to do the same, and after reading up a few SO posts, docs etc., it seems I had to roughly choose the solution from one of the following 3 approaches:
1) Fetch the existing model instance before saving the new model instance. Compare each field. Put the changed field in reversion.set_comment('(all changes here)'). Continue with saving the model instance.
2) Save a copy of the old fields separately in model's __init__() and later compare the new fields with them (in model's save()) to track what changed. Put the changed fields in reversion.set_comment('(all changes here)'). Continue with saving the model instance. (This approach will save a DB lookup)
3) Generate a diff using django-reversion's low-level API and integrate with the Admin somehow
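Approach (2) can be sketched without any framework. The class below is a plain-Python stand-in for a model, and TRACKED_FIELDS, changed_fields and the comment format are all made up for illustration. In a real Django model, the snapshot would be taken in __init__() after calling super().__init__(), and the comment would be passed to reversion.set_comment() before calling the parent save().

```python
class TrackedRecord:
    """Plain-Python stand-in for a model that remembers its original values."""

    # Hypothetical list of fields we want to watch for changes.
    TRACKED_FIELDS = ("title", "owner_id")

    def __init__(self, **values):
        for name in self.TRACKED_FIELDS:
            setattr(self, name, values.get(name))
        # Remember the values as they were when the instance was loaded.
        self._original = self._snapshot()

    def _snapshot(self):
        return {name: getattr(self, name) for name in self.TRACKED_FIELDS}

    def changed_fields(self):
        """Return {field: (old, new)} for every tracked field that differs."""
        current = self._snapshot()
        return {
            name: (old, current[name])
            for name, old in self._original.items()
            if old != current[name]
        }

    def save(self):
        changes = self.changed_fields()
        comment = ", ".join(
            f"{name}: {old!r} -> {new!r}"
            for name, (old, new) in sorted(changes.items())
        )
        # A real model would call reversion.set_comment(comment) here and
        # then the parent save(); this sketch only resets the snapshot.
        self._original = self._snapshot()
        return comment
```

Note that for foreign keys this compares the raw id (owner_id here), so the related object itself is not loaded or compared.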
I ended up using django-reversion-compare, which worked great for me, showing the edits wiki-style (it may be using (3) above anyway).
django-reversion's developer also confirmed (3) as the better option, since it also avoids a race condition.
If you would like to explore different options, this is a great SO post with lots of good ideas with their pros/cons.
(I am also on Django 1.10)
I have a pre-save signal for one of my models. This pre-save signal does some background API activity to syndicate new and updated objects to service providers, which return meaningless data for us to store as references in place of the original data.
The new and update methods are different in the API.
Ideally, if a user were to perform an update they would be clearing the meaningless data from a field and typing over it. My signal would need to know which fields were updated to send changes for just those fields, as sending all fields in an update would send meaningless references as the raw data in addition to the updates.
The pre-save signal has the argument update_fields. I searched for some details and found that this argument may include all fields when an update is performed.
Regarding update_fields, since the docs have little information on it:
When creating an object, does anything get passed to update_fields?
When updating an object, do all fields get passed to update_fields, or just the ones that were updated?
Are there any other suggestions on how to tackle this? I know post_save has the created argument, but I'd prefer to operate on the data before it's saved.
When creating an object, does anything get passed to update_fields?
No.
When updating an object, do all fields get passed to update_fields, or just the ones that were updated?
Depends who is calling the save() method. By default, Django doesn't set update_fields. Unless your code calls save() with the update_fields argument set, it will rewrite all the fields in the database and the pre_save signal will see update_fields=None.
My signal would need to know which fields were updated to send changes for just those fields.
Unless you are controlling what calls the save() method on the object, you will not get this information using update_fields. The purpose of that argument is not to let you track which fields have changed - rather it is to facilitate efficient writing of data when you know that only certain columns in the database need to be written.
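To illustrate what a receiver can and cannot learn from update_fields, here is a rough sketch. The receiver has the keyword arguments pre_save sends (sender, instance, update_fields), but everything else is hypothetical: SYNDICATED_FIELDS, the returned tuple, and FakeListing, which only stands in for a model so the snippet runs without Django. Checking instance.pk is None to detect a create is also an assumption that holds for auto-generated primary keys only.

```python
def syndicate_on_pre_save(sender, instance, update_fields=None, **kwargs):
    """Hypothetical pre_save receiver: decide what would be sent to the API.

    Django ignores the return value for saving purposes; it is returned
    here only so that the decision can be inspected.
    """
    # Assumption: an unsaved instance with an auto primary key has pk None.
    action = "create" if instance.pk is None else "update"
    if update_fields is None:
        # Plain save(): Django gives no hint, so every field may have changed.
        names = list(sender.SYNDICATED_FIELDS)
    else:
        # The caller passed save(update_fields=[...]): only those are written.
        names = [n for n in sender.SYNDICATED_FIELDS if n in update_fields]
    return action, names


class FakeListing:
    """Stand-in for a model class, used only to exercise the receiver."""

    SYNDICATED_FIELDS = ("title", "description")

    def __init__(self, pk=None):
        self.pk = pk
```

With a real model the function would be registered with pre_save.connect(syndicate_on_pre_save, sender=YourModel). If the code that calls save() is not yours, the update_fields=None branch is the one you will hit, and you will need one of the change-tracking approaches instead.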
I didn't read the doc enough before starting this, my mistake.
I have:
from google.appengine.ext import db

class A(db.Model):
    date = db.DateTimeProperty(auto_now_add=True)
I would prefer auto_now=True instead. Can I just change it? I know that a change won't affect existing data (i.e., it won't magically change the date of objects in the datastore to their last update date).
But what will happen to entities that were created with auto_now_add=True? Is a model transformation like that permitted? Or will this just affect new objects?
I can reformulate my questions if I am not clear, don't hesitate to ask
This is not a model transformation. auto_now and auto_now_add are applied entirely in the Python db client, not at the datastore level. You can change it whenever you like, and all entities that you modify after making that change (as long as you're using the new code) will update the date field when put() is called.
I am using 0.97-pre-SVN-unknown release of Django.
I have a model for which I have not given any primary_key. Django, consequently, automatically provides an AutoField that is called "id". Everything's fine with that. But now, I have to change the "verbose_name" of that AutoField to something other than "id". I cannot override the "id" field the usual way, because that would require dropping/resetting the entire model and its data (which is strictly not an option). I cannot find another way around it. Is what I want even possible to achieve? If you can suggest any alternatives that would get me what I want without having to drop the model/table, I'd be happy.
Hmm... what about explicitly writing the id field in the model definition? For example:
from django.db import models

class Entry(models.Model):
    # An AutoField has to be declared as the primary key.
    id = models.AutoField(primary_key=True, verbose_name="custom name")
    # and other fields...
It doesn't require any underlying database changes.
Look into the command-line options for manage.py; there's a command to dump all of the model data to JSON, and another command to load it back in from JSON. You can export all of your model data, add your new field to the model, then import your data back in. Just make sure that you set the db_column option to 'id' so you don't break your existing data.
Edit: Specifically, you want the commands dumpdata and loaddata.
I'm having trouble wrapping my head around this. Right now I have some models that look kind of like this:
from django.db import models

class Review(models.Model):
    # ...fields...
    overall_score = models.FloatField(blank=True)

class Score(models.Model):
    review = models.ForeignKey(Review)
    question = models.TextField()
    grade = models.IntegerField()
A Review has several "scores"; the overall_score is the average of the scores. When a review or a score is saved, I need to recalculate the overall_score average. Right now I'm using an overridden save method. Would there be any benefits to using Django's signal dispatcher?
Save/delete signals are generally favourable in situations where you need to make changes which aren't completely specific to the model in question, or could be applied to models which have something in common, or could be configured for use across models.
One common task in overridden save methods is automated generation of slugs from some text field in a model. That's an example of something which, if you needed to implement it for a number of models, would benefit from using a pre_save signal, where the signal handler could take the name of the slug field and the name of the field to generate the slug from. Once you have something like that in place, any enhanced functionality you put in place will also apply to all models - e.g. looking up the slug you're about to add for the type of model in question, to ensure uniqueness.
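A rough sketch of such a configurable handler is below. make_slug_handler and slugify_text are names invented for this example (Django ships its own slugify in django.utils.text, which you would normally use instead), and the uniqueness lookup mentioned above is left out.

```python
import re


def slugify_text(text):
    """Very small slugifier: lowercase, non-alphanumeric runs become '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def make_slug_handler(source_field, slug_field):
    """Build a pre_save-style handler for any model with these two fields."""

    def handler(sender, instance, **kwargs):
        # Only fill in the slug when it has not been set already.
        if not getattr(instance, slug_field, ""):
            source = getattr(instance, source_field)
            setattr(instance, slug_field, slugify_text(source))

    return handler
```

Each model would then be hooked up with something like pre_save.connect(handler, sender=Article). Since Django holds receivers by weak reference by default, keep a reference to the returned handler (or pass weak=False), otherwise a closure like this can be garbage-collected and silently stop firing.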
Reusable applications often benefit from the use of signals - if the functionality they provide can be applied to any model, they generally (unless it's unavoidable) won't want users to have to directly modify their models in order to benefit from it.
With django-mptt, for example, I used the pre_save signal to manage a set of fields which describe a tree structure for the model which is about to be created or updated and the pre_delete signal to remove tree structure details for the object being deleted and its entire sub-tree of objects before it and they are deleted. Due to the use of signals, users don't have to add or modify save or delete methods on their models to have this management done for them, they just have to let django-mptt know which models they want it to manage.
You asked:
Would there be any benefits to using Django's signal dispatcher?
I found this in the django docs:
Overridden model methods are not called on bulk operations
Note that the delete() method for an object is not necessarily called
when deleting objects in bulk using a QuerySet or as a result of a
cascading delete. To ensure customized delete logic gets executed, you
can use pre_delete and/or post_delete signals.
Unfortunately, there isn’t a workaround when creating or updating
objects in bulk, since none of save(), pre_save, and post_save are
called.
From: Overriding predefined model methods
Small addition from Django docs about bulk delete (.delete() method on QuerySet objects):
Keep in mind that this will, whenever possible, be executed purely in
SQL, and so the delete() methods of individual object instances will
not necessarily be called during the process. If you’ve provided a
custom delete() method on a model class and want to ensure that it is
called, you will need to “manually” delete instances of that model
(e.g., by iterating over a QuerySet and calling delete() on each
object individually) rather than using the bulk delete() method of a
QuerySet.
https://docs.djangoproject.com/en/1.11/topics/db/queries/#deleting-objects
And bulk update (.update() method on QuerySet objects):
Finally, realize that update() does an update at the SQL level and,
thus, does not call any save() methods on your models, nor does it
emit the pre_save or post_save signals (which are a consequence of
calling Model.save()). If you want to update a bunch of records for a
model that has a custom save() method, loop over them and call save()
https://docs.djangoproject.com/en/2.1/ref/models/querysets/#update
If you use signals, you'll be able to update the Review score each time a related Score model gets saved. But if you don't need such functionality, I don't see any reason to put this into a signal; it's pretty model-specific stuff.
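For what it's worth, the signal version could look roughly like the sketch below. It assumes the default reverse accessor score_set on Review (which is what Django generates for the ForeignKey in the question when no related_name is given) and that a review with no scores should get 0.0, which is my own choice since overall_score is not nullable. It averages in Python to keep the example framework-free; with real models an aggregate query would do the same job in the database.

```python
def recalculate_overall_score(sender, instance, **kwargs):
    """post_save/post_delete-style receiver for Score instances."""
    review = instance.review
    # score_set is assumed to be the default reverse accessor for Score.review.
    grades = [score.grade for score in review.score_set.all()]
    # Assumption: a review without scores gets 0.0 rather than NULL.
    review.overall_score = sum(grades) / len(grades) if grades else 0.0
    review.save()
```

It would be connected with post_save.connect(recalculate_overall_score, sender=Score), and with post_delete as well if removed scores should also trigger a recalculation.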
It is a kind of denormalisation. Look at this pretty solution: an in-place composition field definition.