I'm struggling with a caching issue in Django. So far I've only seen it when running the test suite. The problem is that sometimes (it seems to always happen on the second invocation of the code), Django does not update its cache, or the cache becomes inconsistent.
The extracted code, with some debugging output, is:
class Source(models.Model):
    name = models.CharField(max_length = 50)
    quality = models.IntegerField(default = 0)

class Reference(models.Model):
    url = models.URLField()
    source = models.ForeignKey(Source)

    class Meta:
        ordering = ['-source__quality']

class Issue(models.Model):
    references = models.ManyToManyField(Reference)
    master = models.ForeignKey(Reference, related_name = 'mastered_issue_set')

def auto_create(instance):
    issue = Issue.objects.create(master = instance)
    print issue.references.count(), issue.references.all()
    issue.references.add(instance)
    print issue.references.count(), issue.references.all()
On the first invocation I correctly get the following output:
0 []
1 [<Reference: test>]
However, on the second call to auto_create, Django thinks there is one reference, but it does not give it to me:
0 []
1 []
This behavior of course breaks further code. Any idea what could be going wrong here, or at least how to debug it?
PS: It looks like the ordering on the Reference class is causing this, but it is still unclear to me why.
I was not able to reproduce this with sqlite3. Could it be that the Reference instance passed in is not saved? The following ran without a hiccup:
def auto_create(instance):
    issue = Issue.objects.create(master = instance)
    print issue.references.count(), issue.references.all()
    assert issue.references.count() == 0, "initial ref count is not null"
    assert len(issue.references.all()) == 0, "initial ref array is not empty"
    issue.references.add(instance)
    print issue.references.count(), issue.references.all()
    assert issue.references.count() == 1, "ref count is not incremented"
    assert len(issue.references.all()) == 1, "initial ref array is not populated"

def test_auto():
    s = Source()
    s.save()
    r = Reference(source=s)
    r.save()
    auto_create(r)
In the end I found what was causing this problem. It was my own caching code rather than Django's.
I had a custom Source manager in place, which returned and cached a standard source:
class SourceManager(models.Manager):
    url_source = None

    def get_generic(self):
        if self.url_source is None:
            self.url_source, created = self.get_or_create(name = 'URL', quality = 0)
        return self.url_source

class Source(models.Model):
    name = models.CharField(max_length = 50)
    quality = models.IntegerField(default = 0)
    objects = SourceManager()
This works perfectly fine in the application: once the source is created, the manager remembers it for the rest of its existence, as sources do not change over their lifetime. In tests, however, the sources go away, because each test runs in a single transaction that is then rolled back.
What I find strange is that models.ForeignKey did not complain about being given a nonexistent object; the error only appeared later, while sorting by source__quality, because the underlying JOIN could not find a matching Source row.
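The failure mode is easy to reproduce outside Django: a cache stored on a class survives across "transactions", so after a rollback it hands back a reference to a row that no longer exists. A minimal sketch, where FakeDB, SourceManager, and get_generic are all invented stand-ins for this illustration:

```python
# Illustrative sketch of a class-level cache outliving a rolled-back transaction.

class FakeDB:
    """Stands in for the database; rollback() wipes uncommitted rows."""
    def __init__(self):
        self.rows = {}

    def rollback(self):
        self.rows.clear()

db = FakeDB()

class SourceManager:
    url_source = None  # class attribute: shared across all "tests"

    def get_generic(self):
        if SourceManager.url_source is None:
            db.rows[1] = {'name': 'URL'}   # row created inside the test
            SourceManager.url_source = 1   # cached primary key
        return SourceManager.url_source

mgr = SourceManager()

# First "test": the row exists and the cache points at it.
pk = mgr.get_generic()
assert pk in db.rows

# The test framework rolls the transaction back...
db.rollback()

# Second "test": the cache still returns the old pk, but the row is gone.
pk = mgr.get_generic()
assert pk == 1
assert pk not in db.rows  # dangling reference -- the bug described above
```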
Related
I am writing tests for a website I'm working on, and I'm representing the models with factoryboy Factory objects.
However, I've run into some behavior I found somewhat confusing, and I was wondering if anyone here would be so kind as to explain it to me.
I'm running a test that contains the following model:
STATUS = (
    ('CALCULATING', 'CALCULATING'),
    ('PENDING', 'PENDING'),
    ('BUSY', 'BUSY'),
    ('SUCCESS', 'SUCCESS'),
    ('FAILED', 'FAILED'),
)

class SchoolImport(models.Model):
    date = models.DateTimeField(auto_now_add=True)
    status = models.CharField(
        verbose_name=_('status'), choices=STATUS,
        max_length=50, default='CALCULATING'
    )
For which I've created the following factory. As you can see, the status is set to its default value, which I found more realistic than having a randomly selected value:
class SchoolImportFactory(factory.DjangoModelFactory):
    class Meta:
        model = models.SchoolImport

    status = 'CALCULATING'
    school = factory.SubFactory(SchoolFactory)

    @factory.lazy_attribute
    def date(self):
        return timezone.now() - datetime.timedelta(days=10)
Below you'll see both a (simplified) version of the function being tested and the test itself. (I've currently commented out all other code on my laptop, so the function you see below is an accurate representation.)
The gist of it is that the function receives an id value, which it uses to fetch a SchoolImport object from the database and change its status. The function will be run in celery and thus returns nothing.
When I step through this test in the debugger, I can see that the value is changed correctly. However, the test's final assertion fails, as self.school_import.status is still equal to CALCULATING.
# app/utils.py
def process_template_objects(school_import_pk):
    school_import = models.SchoolImport.objects.get(id=school_import_pk)
    school_import.status = 'BUSY'
    school_import.save()
# app/tests/test_utils.py
class Test_process_template_objects_function(TestCase):
    def setUp(self):
        self.school = SchoolFactory()
        self.school_import = SchoolImportFactory(
            school=self.school
        )

    def test_function_alters_school_import_status(self):
        self.assertEqual(
            self.school_import.status, 'CALCULATING'
        )
        utils.process_template_objects(self.school_import.id)
        self.assertNotEqual(
            self.school_import.status, 'CALCULATING'
        )
When I run this test under a debugger (with a breakpoint just before the failing assertion) and evaluate SchoolImport.objects.get(id=self.school_import.id).status, it does return the correct BUSY value.
So although the object represented by the factory instance is being updated correctly, the changes are not reflected in the factory instance itself.
Though I realize I'm probably doing something wrong here / encountering expected behavior, I was wondering how people who write tests using factoryboy get around this behavior, or if perhaps there is a way to 'refresh' the factoryboy instance to reflect changes to the model instance.
The issue comes from the fact that, in process_template_objects, you work with a different instance of the SchoolImport object than the one in the test.
If you run:
a = models.SchoolImport.objects.get(pk=1)
b = models.SchoolImport.objects.get(pk=1)
assert a == b  # True: both refer to the same row in the database
assert a is b  # Fails: two different Python objects, each with its own memory

a.status = 'SUCCESS'
a.save()
assert a.status == 'SUCCESS'  # True: it was indeed changed on this Python object
assert b.status == 'SUCCESS'  # Fails: the 'b' object hasn't seen the change
In order to fix this, you should refetch the instance from the database after calling process_template_objects:
utils.process_template_objects(self.school_import.id)
self.school_import.refresh_from_db()
See https://docs.djangoproject.com/en/2.2/ref/models/instances/#refreshing-objects-from-database for a more detailed explanation!
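The underlying behavior is plain Python object identity, nothing Django-specific: each get() builds a fresh object from the row data. A sketch with invented names (FakeModel, ROWS) that mimics the fetch/save/refresh cycle:

```python
# Minimal sketch: each fetch materializes an independent copy of the row.
ROWS = {1: {'status': 'CALCULATING'}}

class FakeModel:
    def __init__(self, pk):
        self.pk = pk
        self.status = ROWS[pk]['status']  # snapshot taken at fetch time

    def save(self):
        ROWS[self.pk]['status'] = self.status

    def refresh_from_db(self):
        self.status = ROWS[self.pk]['status']

a = FakeModel(1)   # e.g. the instance inside process_template_objects
b = FakeModel(1)   # e.g. self.school_import in the test

a.status = 'BUSY'
a.save()

assert b.status == 'CALCULATING'  # b still holds its old snapshot
b.refresh_from_db()
assert b.status == 'BUSY'         # now it sees the saved change
```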
If you delete a field from a model instance, accessing it again reloads the value from the database.
obj = MyModel.objects.first()
del obj.field
obj.field # Loads the field from the database
See https://docs.djangoproject.com/en/2.2/ref/models/instances/#refreshing-objects-from-database
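The reload-on-access mechanism can be mimicked with `__getattr__`, which Python only calls for attributes missing from the instance dict. This is a toy model for illustration, not Django's actual implementation:

```python
# Toy illustration: deleting the attribute removes it from the instance
# dict, so the next access falls through to __getattr__ and "reloads" it.
ROWS = {1: {'field': 'fresh value'}}

class LazyModel:
    def __init__(self, pk):
        self.pk = pk
        self.field = 'stale value'

    def __getattr__(self, name):
        # Called only when `name` is not found in self.__dict__.
        value = ROWS[self.pk][name]
        setattr(self, name, value)  # cache it back on the instance
        return value

obj = LazyModel(1)
assert obj.field == 'stale value'
del obj.field                      # drop the cached copy
assert obj.field == 'fresh value'  # reloaded from the "database"
```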
I'm using Django's ManyToManyField for one of my models.
class Requirement(models.Model):
    name = models.CharField(max_length=200)

class Course(models.Model):
    requirements = models.ManyToManyField(Requirement)
I want to be able to assign requirements to my courses. To do that, I take a course that is already saved (or that I have just saved) and run the following:
c = Course.objects.get(title="STACK 100")
req = Requirement.objects.get(name="XYZ")
c.requirements.add(req)
While this works when I do it through the Django manage.py shell, it does not work when I do it programmatically in a script. I work with other models in this script and that all works fine. I even know it successfully retrieves the course and the requirement, as I check both. I can't figure out what the problem is!
EDIT:
What I mean by "not working" is that the requirements field of the course remains empty. For example, if I do c.requirements.all(), I get an empty list. However, if I take the same approach through the shell, the list is populated. The script is a crawler that uses BeautifulSoup to crawl a website. I try to add requirements to courses in the following function:
def create_model_object(self, course_dict, req, sem):
    semester = Semester.objects.get(season=sem)
    # Checks if the course already exists in the database
    existing_matches = Course.objects.filter(number=course_dict["number"])
    if len(existing_matches) > 0:
        existing_course = existing_matches[0]
        if sem == "spring":
            existing_course.spring = semester
        else:
            existing_course.fall = semester
        existing_course.save()
        c = existing_course
    # Creates a new Course in the database
    else:
        if sem == "spring":
            new_course = Course(title=course_dict["title"],
                                spring=semester)
        else:
            new_course = Course(title=course_dict["title"],
                                fall=semester)
        new_course.save()
        c = new_course
    curr_req = Requirement.objects.get(name=req)
    c.requirements.add(curr_req)
    print(c.id)
EDIT 2:
After stepping into the function, this is what I found:
def __get__(self, instance, instance_type=None):
    if instance is None:
        return self

    rel_model = self.related.related_model
    manager = self.related_manager_cls(
        model=rel_model,
        query_field_name=self.related.field.name,
        prefetch_cache_name=self.related.field.related_query_name(),
        instance=instance,
        symmetrical=False,
        source_field_name=self.related.field.m2m_reverse_field_name(),
        target_field_name=self.related.field.m2m_field_name(),
        reverse=True,
        through=self.related.field.rel.through,
    )
    return manager
And according to my debugger, manager is of type planner(my project name).Course.None.
Check the last value of the join table's primary-key sequence and update it if it has fallen behind; Django does not throw an error when the pk already exists.
ALTER SEQUENCE schema.table_id_seq RESTART <last_number + 1>;
I think you need to call c.save() after c.requirements.add(curr_req)
In models.py I have...
class Siteinfo(models.Model):
    url = models.CharField(max_length=100)
    description = models.TextField()

class Makesite(models.Model):
    sitename = models.CharField(max_length=100, unique = True)
    siteinfo = models.ManyToManyField(Siteinfo)
    ref_id = models.ManyToManyField(RefID)

    def __unicode__(self):
        return u'%s' % (self.sitename)
I'm trying to delete an instance of description and replace it with another instance, while still having it associated with the same url and still belonging to the same many-to-many set under a given site name.
So group1 is the site name. To create the relation I have:
url = request.POST['url']
description = request.POST['description']
datsite = Makesite.objects.get(sitename=site)
datsite.siteinfo.add(Siteinfo.objects.create(url=url,description=description))
But then when I try to delete and replace the description with this bit of code, it also deletes the url.
name = Makesite.objects.get(sitename=site).siteinfo.values_list('description',flat=True)[0]
Makesite.objects.get(sitename=site).siteinfo.get(description=name).delete()
I guess I could try to write some code that could get around this problem but I'd rather find a way to just delete one and add another instance in its place.
Just to be picky, you should be using forms for processing user input.
It sounds like you want to be updating an instance, not deleting and adding one nearly exactly the same.
site_info = Makesite.objects.get(sitename=site).siteinfo.get(description=name)
site_info.description = "new description"
site_info.save()
Or, more simply:
site_info = Siteinfo.objects.get(makesite__sitename=site, description=name) # only 1 query
site_info.description = "new description"
site_info.save()
In my Django project, all entities deleted by the user must be soft deleted by setting the deleted_at property to the current datetime. My model looks like this: Trip <-> TripDestination <-> Destination (many to many relation). In other words, a Trip can have multiple destinations.
When I delete a Trip, the SoftDeleteManager filters out all the deleted trips. However, if I request all the destinations of a trip (fetched with get_object_or_404(Trip, pk = id)), I also get the deleted ones (i.e. TripDestination models both with and without deleted_at set). I really don't understand why, since all my models inherit from LifeTimeTrackingModel and use the SoftDeleteManager.
Can someone please help me understand why the SoftDeleteManager isn't working for the n:m relation?
class SoftDeleteManager(models.Manager):
    def get_query_set(self):
        query_set = super(SoftDeleteManager, self).get_query_set()
        return query_set.filter(deleted_at__isnull = True)

class LifeTimeTrackingModel(models.Model):
    created_at = models.DateTimeField(auto_now_add = True)
    updated_at = models.DateTimeField(auto_now = True)
    deleted_at = models.DateTimeField(null = True)

    objects = SoftDeleteManager()
    all_objects = models.Manager()

    class Meta:
        abstract = True

class Destination(LifeTimeTrackingModel):
    city_name = models.CharField(max_length = 45)

class Trip(LifeTimeTrackingModel):
    name = models.CharField(max_length = 250)
    destinations = models.ManyToManyField(Destination, through = 'TripDestination')

class TripDestination(LifeTimeTrackingModel):
    trip = models.ForeignKey(Trip)
    destination = models.ForeignKey(Destination)
Resolution
I filed bug 17746 in the Django bug database. Thanks to Caspar for his help on this.
It looks like this behaviour comes from the ManyToManyField using its own manager, which the Related objects reference mentions, because when I create some instances of my own and soft-delete them using your model code (via the manage.py shell), everything works as intended.
Unfortunately the documentation doesn't mention how to override that manager. I spent about 15 minutes searching through the ManyToManyField source but haven't tracked down where it instantiates its manager (looking in django/db/models/fields/related.py).
To get the behaviour you are after, you should specify use_for_related_fields = True on your SoftDeleteManager class as specified by the documentation on controlling automatic managers:
class SoftDeleteManager(models.Manager):
    use_for_related_fields = True

    def get_query_set(self):
        query_set = super(SoftDeleteManager, self).get_query_set()
        return query_set.filter(deleted_at__isnull = True)
This works as expected: I'm able to define a Trip with 2 Destinations, each through a TripDestination, and if I set a Destination's deleted_at value to datetime.datetime.now() then that Destination no longer appears in the list given by mytrip.destinations.all(), which is what you are after near as I can tell.
However, the docs also specifically say do not filter the query set by overriding get_query_set() on a manager used for related fields, so if you run into problems later, bear this in mind as a possible cause.
To enable filtering by the deleted_at field of the Destination and Trip models, setting use_for_related_fields = True on the SoftDeleteManager class is enough. As per Caspar's answer, this no longer returns deleted Destinations for trip_object.destinations.all().
However, from your comments we can see you would also like to filter out Destinations that are linked to a Trip via a TripDestination object with a set deleted_at field, i.e. a soft delete on the through instance.
Let's clarify the way managers work. Related managers are the managers of the remote model, not of the through model:
trip_object.destinations.some_method() calls the default Destination manager.
destination_object.trip_set.some_method() calls the default Trip manager.
The TripDestination manager is not called at any time.
You can reach it with trip_object.destinations.through.objects.some_method(), if you really want to. Now, what I would do is add an instance method Trip.get_destinations, and a similar Destination.get_trips, that filters out deleted connections.
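Stripped of the ORM, the filtering such an instance method needs boils down to walking the through records and skipping the soft-deleted links. A plain-Python sketch with illustrative data:

```python
# Plain-data sketch of Trip.get_destinations: keep only destinations whose
# through record (TripDestination) still has deleted_at unset.
trip_destinations = [
    {'trip': 1, 'destination': 'Paris', 'deleted_at': None},
    {'trip': 1, 'destination': 'Tokyo', 'deleted_at': '2012-02-20'},
    {'trip': 2, 'destination': 'Lima',  'deleted_at': None},
]

def get_destinations(trip_id):
    """Return destinations of a trip whose link is not soft-deleted."""
    return [td['destination']
            for td in trip_destinations
            if td['trip'] == trip_id and td['deleted_at'] is None]

assert get_destinations(1) == ['Paris']  # Tokyo's link is soft-deleted
```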
If you insist on using the manager to do the filtering it gets more complicated:
class DestinationManager(models.Manager):
    use_for_related_fields = True

    def get_query_set(self):
        query_set = super(DestinationManager, self).get_query_set()
        if hasattr(self, "through"):
            through_objects = self.through.objects.filter(
                destination_id=query_set.filter(**self.core_filters).get().id,
                trip_id=self._fk_val,
                deleted_at__isnull=True)
            query_set = query_set.filter(
                id__in=through_objects.values("destination_id"))
        return query_set.filter(deleted_at__isnull = True)
The same would have to be done for the TripManager, as the two would differ. You may want to check the performance, and look at django/db/models/fields/related.py for reference.
Modifying the get_queryset method of the default manager may hamper the ability to back up the database, and the documentation discourages it. Writing a Trip.get_destinations method is the alternative.
I have searched around for an answer to this but can't find one. When using a ForeignKey, I consistently get an error telling me that 'Boards' object has no attribute 'boards_set'. I am a bit new to Django/Python, so I'm sure there is a simple answer here, but I haven't been able to find it so far. Here's some code (to store varied Boards for use in a game, each of which should have a number of Hexes associated with it):
Models:
class Boards(models.Model):
    boardnum = models.IntegerField(unique=True)
    boardsize = models.IntegerField(default=11)
    hexside = models.IntegerField(default=25)
    datecreated = models.DateTimeField(auto_now_add = True)

class Hexes(models.Model):
    boardnum = models.ForeignKey(Boards, null = True)
    col = models.IntegerField()
    row = models.IntegerField()
    cost = models.IntegerField(default=1)
Code (this works):
newboard, createb = Boards.objects.get_or_create(boardnum=boardn)
createb returns True.
Code (this immediately follows the above, and does not work):
try:
    hx = newboard.boards_set.create(col=c, row=r)
except Exception, err:
    print "error:", err
    traceback.print_exc()
Both "err" and "traceback.print_exc()" give: AttributeError: 'Boards' object has no attribute 'boards_set'
I get the same error if I first create the Hexes record with a get_or_create and then try a newboard.boards_set.add() on it.
Any ideas? All suggestions appreciated.
The name that Django uses for a reverse foreign key manager is the lowercased name of the model that contains the foreign key, not the name of the model the manager lives on.
In your case, it will be:
newboard.hexes_set.create(col=c,row=r)
I find it useful to use the manage.py shell command to import your models and inspect them (with dir, etc) to check out all the available attributes.
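As a quick sanity check of the naming rule: unless you set related_name on the ForeignKey, the default reverse accessor is the lowercased model name plus "_set". A tiny illustrative helper (not Django code, just the naming convention):

```python
def default_related_accessor(model_name):
    """Default reverse-manager name: lowercased model name + '_set'
    (overridden by related_name on the ForeignKey, if given)."""
    return model_name.lower() + '_set'

# The FK lives on Hexes, so Boards instances get a hexes_set manager.
assert default_related_accessor('Hexes') == 'hexes_set'
# Likewise, an Issue linked from Reference would expose reference_set.
assert default_related_accessor('Reference') == 'reference_set'
```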