I am using the Django non-rel version with a MongoDB backend. I am interested in tracking the changes that occur on model instances, e.g. when someone creates, edits, or deletes a model instance. Since the backing database is Mongo, the models have an associated "_id" field in their respective collections.
Now I want to extract the "_id" of the document on which the modification operation took place. The idea is to write this "_id" to another database so someone can pick it up from there and know which object was updated.
I thought about overriding the save() method of Django's models.Model, since all my models derive from it. However, the Mongo "_id" field is obviously not present there yet, since the Mongo insert has not taken place at that point.
Is there any possibility of a pseudo post-save() method that is called after the save operation into Mongo has completed? Can django/django-toolbox/pymongo provide such a combination?
After some deep digging into the Django model internals I was able to solve the problem. The save() method in turn calls the save_base() method, which stores the returned result, the id in the case of Mongo, in self.id. This "_id" field can then be picked up by overriding the save() method on the model, as in the sketch below.
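For reference, a minimal sketch of that override under a django-nonrel/django-mongodb-engine setup; write_change_id is a hypothetical hook standing in for whatever writes the id to the second database:

from django.db import models

class TrackedModel(models.Model):
    class Meta:
        abstract = True

    def save(self, *args, **kwargs):
        super(TrackedModel, self).save(*args, **kwargs)
        # save_base() has run by this point, so self.id holds the Mongo "_id".
        write_change_id(self.id)  # hypothetical: record the _id in the other db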
Related
I'm developing a Django application that uses MongoDB for part of it. My database schema keeps changing (some fields are deleted/added continuously), so how can I integrate it with Django in a way that accepts the changes without altering the old data and without affecting the search queries?
I have searched the available libraries, and I found the following:
mongoengine and django-mongodb-engine: mongoengine has not been supported or updated for a while now, and django-mongodb-engine requires a forked, old django-nonrel package. This matches the question: Use pymongo in django directly
djongo: Initially it works fine with the most up-to-date versions of Python and Django; it accepts changes to my database models without migrations, and the Django admin panel works fine. Later on, after applying some changes, I faced an issue when querying the database or listing the data in the admin panel: the old data fails to fit the new model if the change includes deleted fields.
pymongo: The disadvantage is that I cannot use Django models or the admin panel and I have to build my own database abstraction layer, but the advantage is the higher control I will have over the database. It would be like the first solution in Use pymongo in django directly; I can then build layers for the different database structures I will have (see the sketch below).
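A minimal sketch of what that pymongo layer could look like, assuming a local MongoDB instance; the database and collection names are illustrative:

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')
tests = client['mydb']['test']

def all_tests():
    # Raw documents come back as plain dicts, so documents written under
    # an older schema (missing or extra fields) load without errors.
    return list(tests.find())

def find_tests(**filters):
    return list(tests.find(filters))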
Using djongo
Let's say that I have a model called Test:
models.py
from djongo import models
class Test(models.Model):
    x = models.CharField(max_length=100)
    y = models.CharField(max_length=100)
I have created a new object as below:
{
    _id: ObjectId("..."),
    x: "x1",
    y: "y1"
}
Then, I have removed the y field and added a new field called z, then I have created a new object, so it is created as below:
{
    _id: ObjectId("..."),
    x: "x2",
    z: "z2"
}
Now, I want to extract all the collection data as below:
python manage.py shell
>>> Test.objects.all()
Error: field "y" does not exist in the model
>>> Test.objects.filter(z="z2")
Error: field "y" does not exist in the model
I can understand that the old documents cannot be mapped to the changed model, but I want the old fields to be ignored, at least without errors, exactly like querying MongoDB directly.
Given my requirements, is djongo the wrong approach? Is there any workaround to handle that issue? If not, how can I do this properly using pymongo? I expect to add or delete collection fields at any time, and to extract all the data at any time without errors.
Context:
I maintain legacy Django code. In many cases my code receives multiple model objects, each of which has a ForeignKey or a manually cached property that represents the same data entity. The data entities referred to in this way do not change. However, not all the objects I receive have ever accessed those ForeignKey fields or cached properties, so the data may not be present on those objects, though it will be lazy-loaded on first access.
I do not easily have access to/control over the code that declares the models.
I want to find any one of the objects I have whose cache is primed, so that I can avoid hitting the database to retrieve the data I'm after (if none has it, I'll hit the database if I have to). This matters because the fetch of that data is frequent enough that it causes performance issues.
Django 1.6, Python 2.7.
Problem
I can interrogate our manually cached fields with internal_is_cached(instance, 'fieldname') without running a query. However, I cannot do this with ForeignKey fields.
Say I have a model class in Django, Foo, like so:
class Foo(models.Model):
    bar = models.ForeignKey('BarModel')
Question
If I get an instance of the model Foo from somewhere, but I do not know if bar has ever been called on it, or if it has been eagerly fetched, how do I determine if reading instance.bar will query the database or not?
In other words, I want to externally determine if an arbitrary model has its internal cache primed for a given ForeignKey, with zero other knowledge about the state or source of that model.
What I've Tried
I tried model caching using the Django cache to make an "end run" around the issue. The fetches of the related data are frequent enough that they caused unsustainable load on our caching systems.
I tried various solutions from this question. They work well for modified models, but do not seem to work for models which haven't been mutated; I'm interested in the "lazy load" state of a model, not the "pending modification" state. Many of those solutions are also inapplicable since they require changing model inheritance or behavior, which I'd like to avoid if possible (politics).
This question looked promising, but it requires control over the model initial-reader process. The model objects my code receives could come from anywhere.
Doing the reverse of the after-the-fact cache priming described in [this writeup] works for me in testing. However, it relies on the _default_manager internal attribute of model objects, which is known to be an inaccurate field reference for some of our (highly customized) model objects in production. Some of them are quite weird, and I'd prefer to stick to documented (or at least stable and not frequently bypassed) APIs if possible.
Thanks @Daniel Roseman for the clarification.
With Django version < 2.0 (a working solution in your Django 1.6 case)
You can check if your foo_instance has a _bar_cache attribute:
hasattr(foo_instance, "_bar_cache")
As explained here.
With Django version >= 2.0
The cached fields are now stored in the fields_cache dict on the _state attribute:
foo_instance._state.fields_cache["bar"]
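For code that has to run across versions, a version-agnostic helper is possible; a sketch relying on the two behaviours above:

def fk_is_cached(instance, field_name):
    state = getattr(instance, '_state', None)
    if state is not None and hasattr(state, 'fields_cache'):
        # Django >= 2.0: per-instance caches live in _state.fields_cache.
        return field_name in state.fields_cache
    # Older Django (including 1.6): a primed FK is stored as _<name>_cache.
    return hasattr(instance, '_%s_cache' % field_name)

fk_is_cached(foo_instance, 'bar')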
I am using Django 1.11; in one of my models I have added actions that run when the model is saved.
However, I don't want these actions to run when only part of the model is saved.
I know that update_fields=('some_field',) can be used to specify which fields must be saved.
But when the object has been fetched from the database using only() or defer(), I don't see any information about the updated fields in the save() method; update_fields is empty.
Hence my question: how can I get the fields saved by Django when only some fields have been fetched?
When you use defer() or only() to load an instance, the get_deferred_fields() method returns the set of field names that have not been loaded; you should be able to use this to work out which ones will be saved.
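A sketch of working that out, assuming an instance fetched with only(); MyModel and its field names are illustrative:

instance = MyModel.objects.only('name').get(pk=1)

deferred = instance.get_deferred_fields()  # attnames that were NOT loaded
all_fields = {f.attname for f in instance._meta.concrete_fields}
loaded = all_fields - deferred  # the fields save() will actually write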
In a Django model query, I want to understand the sequence of execution. Consider the query Blog.objects.get(name='palm').
Here, where is Blog defined? Is it the same as the Blog class in models.py?
What is objects? I can't find anything related to it in the Django source files. If Blog is a class, then what is the type of objects?
I'd like a development-side view of this. Can anyone explain how Django makes this possible?
Every non-abstract Django model class has an attribute objects attached to it (unless, of course, you explicitly remove or replace it).
objects is a Manager. It is an object with a lot of methods for constructing queries that are then sent to the database to fetch/store data.
So you first access the objects manager of the Blog class, then you call .get(name='palm') on it. Django translates this into a query whose exact form depends on the database system you use. For instance, on MySQL it will look like:
SELECT name, some, other columns
FROM app_blog
WHERE name = 'palm'
The database will respond with zero, one, or more rows. If no row or more than one row is found, Django raises a DoesNotExist or MultipleObjectsReturned error respectively. Otherwise it loads the data into a Blog object (by deserializing the columns into Python objects).
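A sketch of handling those three outcomes in calling code:

try:
    blog = Blog.objects.get(name='palm')
except Blog.DoesNotExist:
    blog = None   # zero rows matched
except Blog.MultipleObjectsReturned:
    raise         # more than one row matched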
I have a model that has relations with 3 different models.
Now I know that if I use object.delete(), then the child objects will also get deleted.
The problem is that across all my model classes I have a database column called DELETED, which I want to set to 1 whenever someone deletes an object.
I can override the delete method in a BaseModel class that all my models inherit from, updating the field to 1. But the problem is that, done that way, I have to manually traverse all the cascading relationships and call delete on every object myself.
Is there any way that, by just calling object.delete(), it automatically traverses through the child objects as well?
Please look at Django: How can I find which of my models refer to a model.
You can use a Collector to gather all references to all the necessary items using collect(). This is the code Django uses to simulate the CASCADE behavior. Once you have collected all the references, you can update the DELETED column for each of those items, as in the sketch below.
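A minimal sketch of that approach, assuming every involved model has the DELETED column described above (Collector is an internal API, so verify against your Django version):

from django.db import router
from django.db.models.deletion import Collector

def soft_delete(instance):
    # Gather everything a real CASCADE delete would touch.
    using = router.db_for_write(instance.__class__, instance=instance)
    collector = Collector(using=using)
    collector.collect([instance])
    # Flag each collected object instead of deleting it.
    for model, instances in collector.data.items():
        pks = [obj.pk for obj in instances]
        model.objects.filter(pk__in=pks).update(DELETED=1)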
More info in the code.
Good luck.