I am in the process of transitioning to NDB, and I am using two model sets: one based on the plain old google.appengine.ext.db and one based on the new google.appengine.ext.ndb.
I would like to use the NDB-based models for read-only access and keep the caching that is built into NDB, while storing changes through the old models (and signalling to NDB when its caches need to be updated).
How can I flush/clear the cache for a specific model instance in NDB while saving changes through the old db-based model?
I would recommend just disabling the cache for those model classes that you have in duplicate; better safe than sorry. This is easily done by putting
_use_memcache = False
_use_cache = False
inside each ndb.Model subclass (either before or after the property declarations). Docs for this are here: https://developers.google.com/appengine/docs/python/ndb/cache#policy_functions (look for the table towards the end).
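For illustration, a minimal sketch of what that looks like on one of the duplicated models (the model name and properties here are just placeholders):

from google.appengine.ext import ndb

class Article(ndb.Model):
    # Skip both NDB caches for this class; reads always go to the datastore.
    _use_cache = False      # in-context (per-request) cache
    _use_memcache = False   # memcache
    title = ndb.StringProperty()
    body = ndb.TextProperty()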
If you really want to clear the cache only when you write an entity using an old db.Model subclass, instead of the above you can try the following (assume ent is a db.Model subclass instance):
ndbkey = ndb.Key.from_old_key(ent.key())
ndbkey.delete(use_datastore=False)
This deletes the entity from memcache and from the context cache but does not delete it from the datastore. Be aware, however, that when you then read it back using its NDB key (or even when it comes back as a query result), it will appear to be deleted until the current HTTP request handler finishes, and NDB will not use memcache for it for about 30 seconds.
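Putting it together, a rough sketch of a write path that stores through the old db model and evicts the matching NDB cache entries (ent is assumed to be a db.Model instance):

from google.appengine.ext import ndb

def save_old_and_flush_ndb(ent):
    # Write through the old db API.
    ent.put()
    # Evict the matching NDB entry from memcache and the context cache,
    # leaving the datastore copy that was just written untouched.
    ndb.Key.from_old_key(ent.key()).delete(use_datastore=False)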
I want to create a new model that uses the https://developer.microsoft.com/en-us/graph/graph-explorer API as a data source, because I want to have additional info on the user.
Using a computed property on the model does not work, as it would query the API for each instance in the set.
So I want to relate the model to a new model that has the API as its data source.
I could not find anything on this topic, besides maybe abusing the from_db() method, if that even works.
It appears that what you're trying to do is to cache data from an external API that relates to, and augments/enriches, your user model data. If so, you can use a custom user model (instead of Django's default; this is a highly recommended practice anyway) and then store the external API data in serialized form in a TextField attribute of your custom user model (let's call it user_extras; you can write model methods that serialize and deserialize this field for convenient access in your views).
The key challenge then is how to keep user_extras fresh, without doing something terrible performance-wise or hitting some constraint like API call limits. As you said, we can't do API queries in computed properties. (At least not synchronously.) One option then is to have a batch job/background task that regularly goes through your user database to update the user_extras in some controlled, predictable fashion.
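A rough sketch of that idea (the field and method names here are illustrative, not a prescribed API):

import json

from django.contrib.auth.models import AbstractUser
from django.db import models

class User(AbstractUser):
    # Point AUTH_USER_MODEL at this class in settings.
    # Serialized snapshot of the data fetched from the external API.
    user_extras = models.TextField(blank=True, default='')

    def get_extras(self):
        return json.loads(self.user_extras) if self.user_extras else {}

    def set_extras(self, data):
        self.user_extras = json.dumps(data)

A periodic management command or scheduled task can then loop over users in batches and refresh user_extras from the external API.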
Context:
I maintain legacy Django code. In many cases my code receives multiple model objects, each of which has a ForeignKey or a manually cached property that represents the same data entity. The data entities referred to in this way do not change. However, not all objects I receive have ever accessed those ForeignKey fields or cached properties, so the data may not be present on those objects, though it will be lazy-loaded on first access.
I do not easily have access to/control over the code that declares the models.
I want to find any one of the objects I have which has a primed cache, so that I can avoid hitting the database to retrieve the data I'm after (if none have it, I'll do it if I have to). This is because the fetch of that data is frequent enough that it causes performance issues.
Django 1.6, Python 2.7.
Problem
I can interrogate our manually cached fields and say internal_is_cached(instance, 'fieldname') without running a query. However, I cannot do this with ForeignKey fields.
Say I have a model class in Django, Foo, like so:
class Foo(models.Model):
    bar = models.ForeignKey('BarModel')
Question
If I get an instance of the model Foo from somewhere, but I do not know if bar has ever been called on it, or if it has been eagerly fetched, how do I determine if reading instance.bar will query the database or not?
In other words, I want to externally determine if an arbitrary model has its internal cache primed for a given ForeignKey, with zero other knowledge about the state or source of that model.
What I've Tried
I tried model caching using the Django cache to make an "end run" around the issue. The fetches of the related data are frequent enough that they caused unsustainable load on our caching systems.
I tried various solutions from this question. They work well for modified models, but do not seem to work for models which haven't been mutated--I'm interested in the "lazy load" state of a model, not the "pending modification" state. Many of those solutions are also inapplicable since they require changing model inheritance or behavior, which I'd like to avoid if possible (politics).
This question looked promising, but it requires control over the model initial-reader process. The model objects my code receives could come from anywhere.
Doing the reverse of the after-the-fact cache priming described in [this writeup] works for me in testing. However, it relies on the _default_manager internal attribute of model objects, which is known to be an inaccurate field reference for some of our (highly customized) model objects in production. Some of them are quite weird, and I'd prefer to stick to documented (or at least stable and not frequently bypassed) APIs if possible.
Thanks @Daniel Roseman for the clarification.
With Django version <= 1.6 (working solution in your case)
You can check if your foo_instance has a _bar_cache attribute:
hasattr(foo_instance, "_bar_cache")
As explained here.
With Django version >= 2.0
The cached fields are now stored in the fields_cache dict in the _state attribute:
foo_instance._state.fields_cache["bar"]
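A version-tolerant check built on the internals named above might look like this (a sketch; both attribute names are Django internals and could change):

def fk_is_cached(instance, field_name):
    # Older Django versions: a primed ForeignKey leaves a per-field cache
    # attribute on the instance, e.g. foo._bar_cache for a field named "bar".
    if hasattr(instance, '_%s_cache' % field_name):
        return True
    # Django 2.0+: cached related objects live in instance._state.fields_cache.
    fields_cache = getattr(getattr(instance, '_state', None), 'fields_cache', {})
    return field_name in fields_cache

# e.g. if fk_is_cached(foo_instance, 'bar'): foo_instance.bar will not hit the DB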
In my project I have many models across multiple apps, all of which inherit from one abstract model. I would like to create a model that holds the history of changes for every one of my models, so that every model has its own history model. Each model would also have a one-to-many relation to its history model. All history models would be the same, except for the foreign key to their respective model.
My problem is that I do not want to write all the history models manually. Instead I would like the history model to be created automatically for every model, so I don't have to write all that boilerplate code. Can this be achieved?
There is a widely used Django package with a nice API that I believe solves this exact problem: django-reversion. I recommend using it if it fits your needs rather than building a custom solution.
Object version control is usually better solved by serializing your objects and storing the serialization every time they are edited (e.g. in the json format).
You may also want to keep track of when objects are deleted.
This way, you only need to store a reference to the serialized object. Versions of all objects can live in the same database table and reference their "source" object using Django's generic relations.
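For illustration, a minimal sketch of such a version table using generic relations (model and field names are just placeholders):

from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType
from django.db import models

class ObjectVersion(models.Model):
    # Generic reference back to the "source" object this version belongs to.
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    object_id = models.PositiveIntegerField()
    source = GenericForeignKey('content_type', 'object_id')

    serialized = models.TextField()  # e.g. output of django.core.serializers ("json")
    is_deletion = models.BooleanField(default=False)  # also track deletions
    created_at = models.DateTimeField(auto_now_add=True)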
You can also create your classes dynamically with type().
There are many ways to do it, but you can do something like the following:
class SomeParentClass:
    pass

NewClass = type('NewClass', (SomeParentClass,), {'new_method': lambda self: 'foo'})

new_class_instance = NewClass()
print(new_class_instance.new_method())
So you can create models dynamically, each with a different name, inheriting from a different class, with new methods, and so on.
You can then use globals()[variable_name_to_store_class] to assign newly created classes to a dynamic variable name.
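Applied to the question, a rough sketch of generating one history model per concrete model with type() (all names here are illustrative):

from django.db import models

def make_history_model(model_cls):
    attrs = {
        '__module__': model_cls.__module__,  # required when building models with type()
        'source': models.ForeignKey(model_cls, related_name='history_entries',
                                    on_delete=models.CASCADE),
        'changed_at': models.DateTimeField(auto_now_add=True),
        'snapshot': models.TextField(),  # e.g. a serialized copy of the instance
    }
    history_cls = type('%sHistory' % model_cls.__name__, (models.Model,), attrs)
    # Make the class addressable under a dynamic name, as mentioned above.
    globals()['%sHistory' % model_cls.__name__] = history_cls
    return history_cls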
Hope it's relevant to your problem.
I have a model with relations to 3 different models.
Now I know that if I use
object.delete(), then the child objects will also get deleted.
The problem is that all of my model classes have a database column called DELETED, which I want to set to 1 whenever someone deletes an object.
I can override the delete() method in a BaseModel class and have the custom delete update that field to 1. But the problem is:
if I do it that way, then I have to manually go through all the cascading relationships and call delete() on every object.
Is there any way that, by just calling object.delete(), it automatically traverses the child objects as well?
Please look at Django: How can I find which of my models refer to a model.
You can use a Collector to get references to all the necessary items using collect(). This is the code Django uses to implement the CASCADE behavior. Once you have collected all the references, for each of those items you can update the DELETED column.
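A rough sketch of that approach (it relies on Django internals and on the DELETED column from the question, and may not cover every case, e.g. fast-delete querysets):

from django.db import router
from django.db.models.deletion import Collector

def soft_delete(obj):
    # Gather obj plus everything a cascading delete would remove with it.
    collector = Collector(using=router.db_for_write(obj.__class__, instance=obj))
    collector.collect([obj])
    # collector.data maps each model class to the set of collected instances.
    for model_cls, instances in collector.data.items():
        pks = [instance.pk for instance in instances]
        model_cls._default_manager.filter(pk__in=pks).update(DELETED=1)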
More info in the code.
Good luck.
I'm having trouble wrapping my head around this. Right now I have some models that look kind of like this:
class Review(models.Model):
    ...fields...
    overall_score = models.FloatField(blank=True)

class Score(models.Model):
    review = models.ForeignKey(Review)
    question = models.TextField()
    grade = models.IntegerField()
A Review has several "scores"; the overall_score is the average of the scores. When a review or a score is saved, I need to recalculate the overall_score average. Right now I'm using an overridden save method. Would there be any benefits to using Django's signal dispatcher?
Save/delete signals are generally favourable in situations where you need to make changes which aren't completely specific to the model in question, or could be applied to models which have something in common, or could be configured for use across models.
One common task in overridden save methods is automated generation of slugs from some text field in a model. That's an example of something which, if you needed to implement it for a number of models, would benefit from using a pre_save signal, where the signal handler could take the name of the slug field and the name of the field to generate the slug from. Once you have something like that in place, any enhanced functionality you put in place will also apply to all models - e.g. looking up the slug you're about to add for the type of model in question, to ensure uniqueness.
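For illustration, a minimal pre_save sketch of that reusable slug idea (the helper and field names are hypothetical):

from django.db.models.signals import pre_save
from django.utils.text import slugify

def add_auto_slug(model_cls, slug_field='slug', from_field='title'):
    # One configurable handler instead of an overridden save() on every model.
    def handler(sender, instance, **kwargs):
        if not getattr(instance, slug_field):
            setattr(instance, slug_field, slugify(getattr(instance, from_field)))
    pre_save.connect(handler, sender=model_cls, weak=False)

# e.g. add_auto_slug(Article); add_auto_slug(Category, from_field='name')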
Reusable applications often benefit from the use of signals - if the functionality they provide can be applied to any model, they generally (unless it's unavoidable) won't want users to have to directly modify their models in order to benefit from it.
With django-mptt, for example, I used the pre_save signal to manage a set of fields which describe a tree structure for the model which is about to be created or updated, and the pre_delete signal to remove tree structure details for the object being deleted and its entire sub-tree of objects before they are deleted. Due to the use of signals, users don't have to add or modify save or delete methods on their models to have this management done for them; they just have to let django-mptt know which models they want it to manage.
You asked:
Would there be any benefits to using Django's signal dispatcher?
I found this in the django docs:
Overridden model methods are not called on bulk operations
Note that the delete() method for an object is not necessarily called when deleting objects in bulk using a QuerySet or as a result of a cascading delete. To ensure customized delete logic gets executed, you can use pre_delete and/or post_delete signals.

Unfortunately, there isn't a workaround when creating or updating objects in bulk, since none of save(), pre_save, and post_save are called.
From: Overriding predefined model methods
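For example, a minimal pre_delete hook that still runs during bulk and cascading deletes (the handler body is illustrative, using the Score model from the question):

from django.db.models.signals import pre_delete
from django.dispatch import receiver

@receiver(pre_delete, sender=Score)
def log_score_removal(sender, instance, **kwargs):
    # Runs for QuerySet.delete() and cascades, unlike an overridden delete().
    print('Removing score %s for review %s' % (instance.pk, instance.review_id))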
Small addition from Django docs about bulk delete (.delete() method on QuerySet objects):
Keep in mind that this will, whenever possible, be executed purely in SQL, and so the delete() methods of individual object instances will not necessarily be called during the process. If you’ve provided a custom delete() method on a model class and want to ensure that it is called, you will need to “manually” delete instances of that model (e.g., by iterating over a QuerySet and calling delete() on each object individually) rather than using the bulk delete() method of a QuerySet.
https://docs.djangoproject.com/en/1.11/topics/db/queries/#deleting-objects
And bulk update (.update() method on QuerySet objects):
Finally, realize that update() does an update at the SQL level and, thus, does not call any save() methods on your models, nor does it emit the pre_save or post_save signals (which are a consequence of calling Model.save()). If you want to update a bunch of records for a model that has a custom save() method, loop over them and call save().
https://docs.djangoproject.com/en/2.1/ref/models/querysets/#update
If you use signals, you'll be able to update the Review score each time a related Score gets saved (see the sketch below). But if you don't need such functionality, I don't see any reason to put this into a signal; that's pretty model-related stuff.
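A minimal sketch of that signal approach, assuming the Review/Score models from the question:

from django.db.models import Avg
from django.db.models.signals import post_delete, post_save
from django.dispatch import receiver

@receiver(post_save, sender=Score)
@receiver(post_delete, sender=Score)
def update_overall_score(sender, instance, **kwargs):
    # Recalculate the denormalised average whenever a Score changes.
    review = instance.review
    review.overall_score = review.score_set.aggregate(avg=Avg('grade'))['avg'] or 0
    review.save(update_fields=['overall_score'])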
It is a kind of denormalisation. Look at this pretty solution: in-place composition field definition.