Environment
Python 3.8.10
SQLAlchemy 1.3.22
Problem
I am having a problem similar to the one described here, but couldn't find a solution yet: How to prevent related object persistence in sqlalchemy?
I have two models with a one to many relationship:
class A(db.Model):
    b = db.relationship("B", back_populates="a", cascade="all,delete")

class B(db.Model):
    a_id = db.Column('A_ID', db.Integer, db.ForeignKey('A.ID'), nullable=False)
    a = db.relationship('A', back_populates="b")
The thing is that at a certain point I need to modify a certain field of B. To do so, I need to access the A object related to the current B object to do some checks. But A is a class that uses translations, and we need to keep that in mind.
What translate does is overwrite the translatable fields of A in the current instance, without storing them in the database (we keep the translations in a different table for backwards compatibility). I need to do that translation to run the checks and get the correct value I should set on B. The problem is as follows:
# some_service.py
def get_by_id(a_id):
    a = A.get_by_id(a_id)
    if a:
        return translate(a)

# file_y.py
def get_value_to_update_b(a_id):
    a = some_service.get_by_id(a_id)
    # do some stuff without saving anything to A

# file_x.py
b = B.get_by_id(id)  # At this point, b.a stores the original A object
value = file_y.get_value_to_update_b(b.a_id)  # After this executes, b.a points to the translated A instead of the original, so when B is saved, the translated A is saved too
b.value = value
session.add(b)
session.commit()
As you can see, the problem is that when I translate A to do the checks, the reference from B gets updated to A_translated, so when I save B, A is saved too with the translated values, which is incorrect.
I have already tried modifying the cascade attribute of the relationships (to None and merge), making a copy of the object A before translating, and some other options. Changing the whole translation process is also a possibility, but I would rather keep it as a last resort, as this issue is urgent for us.
Is there anything else I could do to prevent A from being saved when B is saved? If you have any questions about the process (I know it can be a bit messy), I will gladly answer them. Thank you very much.
In the end I took another approach. I couldn't find a solution for the simultaneous storing of A and B, but I changed the checks so that the translation didn't overwrite the original instance.
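For illustration, a minimal sketch of that idea (an assumption, not the original code): run translate over a plain, non-mapped copy of A, so the session-bound instance that b.a points to is never touched. This assumes translate only reads and writes attributes:

from types import SimpleNamespace
from sqlalchemy import inspect

def get_translated_copy(a):
    # Copy the mapped column values into a throwaway object the session
    # never sees, so nothing about it can be flushed on commit.
    data = {attr.key: getattr(a, attr.key)
            for attr in inspect(a).mapper.column_attrs}
    return translate(SimpleNamespace(**data))  # hypothetical usage of translate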
Related
I'm using SQLAlchemy with Python (connecting to a MySQL database) and am having a little design issue, which has presented itself as a problem with a backref.
So the situation is this. I have a SearchGroup which contains TargetObjects and SearchObjects. These are both many to many relationships, and so the SearchGroup table comes with two association tables, one for each. The SearchObject is the same type for any SearchGroup, but the TargetObject varies. So far so good. The whole idea here is that a SearchObject is simply a string with a few other variables, and a SearchGroup compares them all to a given string and then, if there's a match, supplies the target objects.
Now for some code: the declaration of these three classes, with the parent logic hidden for brevity:
class AssocTable_GroupCMClassesGrades_Grades(AssociationTable_Group_TargetObjectsParent, med.DeclarativeBase):
    __tablename__ = 'AssocTable_GroupCMClassesGrades_Grades'
    _groupsTableName = 'SearchGroup_CMClasses_Grades'
    _targetObjectsTableName = 'Grades'

class AssocTable_GroupCMClassesGrades_SearchObjects(AssociationTable_Group_SearchObjectsParent, med.DeclarativeBase):
    __tablename__ = 'AssocTable_GroupCMClassesGrades_SearchObjects'
    _groupsTableName = 'SearchGroup_CMClasses_Grades'
    _searchObjectsTableName = 'SearchObjects'

class SearchGroup_CMClasses_Grades(SearchObjectGroupParent, med.DeclarativeBase):
    __tablename__ = 'SearchGroup_CMClasses_Grades'
    targetAssociatedTargetObjectTableName = 'AssocTable_GroupCMClassesGrades_Grades'
    targetAssociatedSearchObjectTableName = 'AssocTable_GroupCMClassesGrades_SearchObjects'
    targetClassName = 'Grade'
    myClassName = 'SearchGroup_CMClasses_Grades'
    searchObjectClassName = 'SearchObject'
    searchObjectChildrenBackRefName = 'Groups'
The top two are the association tables and the bottom is the main class. The strings are used to set up various foreign keys and relationships and such.
Let's look at a specific example, which is crucial to the question:
@declared_attr
def searchObject_childen(cls):
    return relationship(f'{cls.searchObjectClassName}', secondary=f'{cls.targetAssociatedSearchObjectTableName}', backref=f'{cls.searchObjectChildrenBackRefName}')
This is inside the SearchObjectGroupParent class and, as you can see, is for the 'children' of the SearchGroup, which are SearchObjects.
So now to the problem.
That all works rather well, except for one thing. If I could direct your attention back to the large bit of code above, and to this line:
searchObjectChildrenBackRefName = 'Groups'
This, as seen in the second posted piece of code (the declared_attr one), sets up a backref: a property on the target class - it creates that property and then populates it. I'm not an expert at this by any means, so I won't pretend to be. The point is this: if I create another SearchObjectGroupParent-derived class like the one above, with its association tables, I can't put another 'Groups' property into SearchObject - in fact it will throw an error telling me as much:
sqlalchemy.exc.ArgumentError: Error creating backref 'Groups' on relationship 'SearchGroup_CMClasses_Grades.searchObject_childen': property of that name exists on mapper 'mapped class SearchObject->SearchObjects'
There is a rather unsatisfying way to solve this, which is to simply change that name each time, but then SearchObject won't have a common list of SearchGroups. In fact it will contain a 'Groups' property for every SearchGroup. This will work, but it will be messy and I'd rather not do it. What I would like is to say 'okay, if this backref already exists, just use that one'. I don't know if that's possible, but I think such a thing would solve my problem.
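One workaround in the spirit of that 'unsatisfying' option (a sketch, not from the original post): keep a distinct backref name per group class, and aggregate them into one combined list with an ordinary Python property on SearchObject. The backref names and columns below are assumptions:

from sqlalchemy import Column, Integer

class SearchObject(med.DeclarativeBase):
    __tablename__ = 'SearchObjects'
    id = Column(Integer, primary_key=True)
    # ... other columns as before ...

    @property
    def all_groups(self):
        # collect every per-class backref ('Groups01', 'Groups02', ...)
        # into one list; the names here are assumed examples
        groups = []
        for name in ('Groups01', 'Groups02'):
            groups.extend(getattr(self, name, []))
        return groups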
Edit: I thought an image might help explain better:
Figure 1: what I have now:
The more of these objects derived from SearchObjectGroupParent I have, the messier it will be (SearchObject will contain Groups03, Groups04, Groups05, etc.).
Figure 2: what I want:
Assume I have in models.py something like:
class ModelA(models.Model):
    # many fields, including
    relatives = models.ManyToManyField(Person)

# also, ModelA is a foreign key in other models:
class SomeOtherModel(models.Model):
    mya = models.ForeignKey(ModelA)

# now we produce two classes with multi-table inheritance,
# WITHOUT any additional fields
class InheritA1(ModelA):
    pass

class InheritA2(ModelA):
    pass
So as I understand it, this will create tables for ModelA, InheritA1 and InheritA2; each instance of ModelA will get a row in the ModelA-table only, while each instance of InheritA1 will have a row both in the ModelA-table (basically containing all the data) and another in the InheritA1-table (only containing a PK and a OneToOne key pointing to the ModelA-table row), etc. Django queries for ModelA-objects will give all objects, queries for InheritA1-objects only the InheritA1 ones, etc.
So now I have an InheritA1-object a1 and want to make it into an InheritA2-object, without changing the corresponding ModelA-object. So previously the parent OneToOne key of a1 points to the ModelA-row with key 3, say (and the ForeignKey of some SomeOtherModel-object is set to 3). In the end, I want an InheritA2-object a2 pointing to the same, unchanged ModelA-row (and the object a1 removed).
It seems that Django doesn't offer any such move-class functionality.
Can I safely implement the corresponding SQL operations myself?
Or will things go horribly wrong? I.e., can I just execute the SQL commands that:

1. create a new row in the InheritA2-table, setting the parent OneToOne key to the one of a1, and
2. remove the a1 row in the InheritA1-table (see the sketch below)?
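For concreteness, a hedged sketch of those two steps as raw SQL through Django (untested; the table and column names are assumptions following Django's default naming, where multi-table inheritance adds a modela_ptr_id column to each child table):

from django.db import connection

def switch_a1_to_a2(parent_pk):
    with connection.cursor() as cursor:
        # 1. create the InheritA2 child row pointing at the existing parent
        cursor.execute(
            "INSERT INTO myapp_inherita2 (modela_ptr_id) VALUES (%s)",
            [parent_pk],
        )
        # 2. remove the InheritA1 child row
        cursor.execute(
            "DELETE FROM myapp_inherita1 WHERE modela_ptr_id = %s",
            [parent_pk],
        )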
It seems I cannot do this from non-SQL Django without automatically creating a ModelA-row. Well, for step 1, maybe I can create a temporary object x that way, then let p be the parent of x, then change the parent OneToOne key of x to point to the one of a1, and then delete the object p? But for step 2, I do not think it is possible in non-SQL Django to remove an instance of a child while keeping the parent object.
Alternatively, is there a good django way to copy instances in django and change references to them?
I.e., I could create a new InheritA2 object y, copy all the properties of a1 into the new object, then go through the database, find all ManyToMany and ForeignKey entries that point to the parent of a1, change them to point to the parent of y instead, and then delete a1.
That could be done in non-SQL Django; the downside is that it seems wasteful performance-wise (which would be of no great concern for my project), and that it might also not be so "stable". I.e., if I change ModelA or other models related to it, the code that copies everything might break? (Or is there a good way to do something like:
for all models M in my project:
    for all keys k used in M:
        if k is a descendant of a ManyToMany or Foreign or ... key:
            if k points to ModelA:
                for all instances x of M:
                    if x.k == a1:
                        x.k = y
The first four lines seem rather dubious.
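For what it's worth, here is a hedged sketch of that loop using Django's model introspection (untested, and it assumes a newer Django with apps.get_models() and _meta.get_fields()):

from django.apps import apps

def repoint_references(old_parent, new_parent):
    # Walk every installed model and every forward relational field
    # that targets ModelA (reverse accessors and implicit parent-link
    # fields are auto-created, so the first check skips them).
    for model in apps.get_models():
        for field in model._meta.get_fields():
            if field.auto_created or not field.is_relation:
                continue
            if field.related_model is not ModelA:
                continue
            if field.many_to_many:
                for obj in model.objects.filter(**{field.name: old_parent}):
                    manager = getattr(obj, field.name)
                    manager.remove(old_parent)
                    manager.add(new_parent)
            else:  # ForeignKey / OneToOneField
                model.objects.filter(**{field.name: old_parent}).update(
                    **{field.name: new_parent}
                )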
Remarks:
Copying without changing the instance can be done in a stable, simple, standard way, see e.g. here, but then we are still stuck in the same child class (and still have to modify ForeignKeys etc.).
Changing the class by just declaring it in the standard Python way, see here, is not an option for me, as nobody seems to know whether it will horribly break Django.
If you plan on changing the children but not the parent, then maybe you could use OneToOneField instead of direct inheritance.
class ModelA(models.Model):
    pass

class InheritA1(models.Model):
    a = models.OneToOneField(ModelA, primary_key=True)

class InheritA2(models.Model):
    a = models.OneToOneField(ModelA, primary_key=True)
It gives you the same 3 tables in the database. (One difference is that the pk fields of InheritA1 and InheritA2 will be the same id as the parent's.)
For changing from InheritA1 to InheritA2 you would delete the one child instance (this does not affect the parent instance) and then create the new child instance, pointing it to the parent instance, as in the sketch below.
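A minimal sketch of that switch, assuming the models above and an existing InheritA1 instance a1:

# keep a reference to the shared parent row
parent = a1.a

# deleting the child removes only the InheritA1 row; the parent it
# points to is not cascaded away
a1.delete()

# create the new child pointing at the same parent
a2 = InheritA2.objects.create(a=parent)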
Note that you could even end up with a parent instance that has children from both models, so you would have to check for that in your views to prevent it.
Let me know if this helps you, even if the answer is a bit late.
I have a datastore entity with several properties. Each property is updated using a separate method. However, every so often I find that a method overwrites a property it is not modifying with an old value (Null).
For example:
class SomeModel(ndb.Model):
    property1 = ndb.StringProperty()
    property2 = ndb.StringProperty()

def method1(self, entity_key_urlsafe):
    data1 = ndb.Key(urlsafe=entity_key_urlsafe).get()
    data1.property1 = "1"
    data1.put()
The data1 entity now has property1 with a value of "1".
def method2(self, entity_key_urlsafe):
    data1 = ndb.Key(urlsafe=entity_key_urlsafe).get()
    data1.property2 = "2"
    data1.put()
The data1 entity now has property2 with a value of "2".
However, if these methods are run too closely in succession, method2 seems to overwrite property1 with its initial value (Null).
To get around this issue I've been using the deferred library, but it's not reliable (deferred tasks seem to disappear every now and then) or predictable (the _countdown time seems to be guidance at best) enough.
My question is: is there a way to retrieve and modify only one property of a datastore entity without overwriting the rest when you call data1.put()? I.e., in the case of method2, could I write only to property2 without overwriting property1?
The way to prevent such overwrites is to make sure your updates are done inside transactions. With NDB this is really easy - just attach the @ndb.transactional decorator to your methods:
@ndb.transactional
def method1(self, entity_key_urlsafe):
    data1 = ndb.Key(urlsafe=entity_key_urlsafe).get()
    data1.property1 = "1"
    data1.put()
The documentation on transactions with NDB doesn't give as much background as the (older) DB version, so to familiarise yourself fully with the limitations and options, you should read both.
I say no.
I have never seen a reference to such a feature, nor a trick or a hack for it.
I also think that it would be quite difficult for such an operation to exist.
When you perform .put() on an entity, the whole entity is serialised and then written. (An entity is an instance of the model class that you can save to or retrieve from the Datastore.)
Imagine if you had a date property with auto_now=True. What would have to happen then? Which of the two saves should update that property?
Though your problem seems to be different: one of your methods commits first, and the other nullifies its value because it retrieved an outdated copy of the entity, not the expected one.
@Greg's answer talks about transactions. You might want to take a look at them.
Transactions are meant for concurrent requests, not so much for quick succession. Imagine two users pressing the save button to increase a counter at the same time; that is where transactions work.
@ndb.transactional
def increase_counter(entity_key_urlsafe):
    entity = ndb.Key(urlsafe=entity_key_urlsafe).get()
    entity.counter += 1
    entity.put()
Transactions will ensure that the counter is correct.
The first one to try to commit the above transaction will succeed, and the later ones will have to retry if retries are enabled (3 by default).
Succession, though, is something different. That said, @Greg and I advise you to change your logic towards using transactions if the problem you want to solve is something like the counter example.
I have the following model:
One
    name (Char)

Many
    one (ForeignKey, blank=True, null=True)
    title (Char)
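As Django code, the models would look roughly like this (a sketch; the field types and max_length are assumptions, and on_delete=models.SET_NULL, required in newer Django versions, makes the nulling described below automatic):

class One(models.Model):
    name = models.CharField(max_length=100)

class Many(models.Model):
    # SET_NULL detaches related Many rows instead of deleting them
    one = models.ForeignKey(One, blank=True, null=True,
                            on_delete=models.SET_NULL)
    title = models.CharField(max_length=100)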
I want to delete a One instance, and all related objects should lose their relation to the One instance. At the moment my code looks like this:
one = One.objects.get(<some criterion>)
more = Many.objects.filter(one=one)
for m in more:
    m.one = None
    m.save()
# and finally:
one.delete()
What does the code do? It finds the object that should be deleted, then searches for related objects, sets their ForeignKey to None, and finally deletes the One instance. But somewhere in that process it also manages to kill all the related objects (the Many instances).
My question is: Why are those related objects deleted and how do I prevent this?
The code given is correct. My problem when asking the question was a typo in my implementation. Shame on me.
Well... there is still a bit that could be improved:
more = Many.objects.filter(one=one)
for m in more:
    m.one = None
    m.save()
# and finally:
one.delete()
can be written as:
for m in one.many_set.all():
    m.one = None
    m.save()

one.delete()
which is equivalent to:
one.many_set.clear()
one.delete()
You can use update() in the first place:
Many.objects.filter(one=one).update(one=None)
I think that Django deletes related objects at the application level (without ON DELETE CASCADE in the DBMS). So your objects are probably in some kind of cache and Django still thinks they are related to the one object.
Try listing the related objects before you delete:
print one.many_set
one.delete()
If you still have any objects in this set, you should probably fetch one from the DB again and then delete. Or you can use delete() on a queryset directly:
One.objects.filter(<criteria>).delete()
I have a model, Match, with two foreign keys:
class Match(models.Model):
    winner = models.ForeignKey(Player)
    loser = models.ForeignKey(Player)
When I loop over Match I find that each model instance uses a unique object for the foreign key. This ends up biting me because it introduces inconsistency; here is an example:
>>> def print_elo(match_list):
...     for match in match_list:
...         print match.winner.id, match.winner.elo
...         print match.loser.id, match.loser.elo
...
>>> print_elo(teacher_match_list)
4 1192.0000000000
2 1192.0000000000
5 1208.0000000000
2 1192.0000000000
5 1208.0000000000
4 1192.0000000000
>>> teacher_match_list[0].winner.elo = 3000
>>> print_elo(teacher_match_list)
4 3000 # Object 4
2 1192.0000000000
5 1208.0000000000
2 1192.0000000000
5 1208.0000000000
4 1192.0000000000 # Object 4
>>>
I solved this problem like so:
def unify_refrences(match_list):
    """Makes each unique reference to a model instance non-unique.

    In cases where multiple model instances are being used, django creates a
    new object for each model instance, even if that means creating the same
    instance twice. If one of these objects has its state changed, any other
    object referencing the same model instance will not be updated. This
    method ensures that state changes are seen. It makes sure that variables
    which hold objects pointing to the same model all hold the same object.

    Visually this means that a list of [var1, var2] whose internals look like so:
        var1 --> object1 --> model1
        var2 --> object2 --> model1
    Will result in the internals being changed so that:
        var1 --> object1 --> model1
        var2 ------^
    """
    match_dict = {}
    for match in match_list:
        try:
            match.winner = match_dict[match.winner.id]
        except KeyError:
            match_dict[match.winner.id] = match.winner
        try:
            match.loser = match_dict[match.loser.id]
        except KeyError:
            match_dict[match.loser.id] = match.loser
My question: is there a way to solve this problem more elegantly through the use of QuerySets, without needing to call save at any point? If not, I'd like to make the solution more generic: how can you get a list of the foreign keys on a model instance, or do you have a better generic solution to my problem?
Please correct me if you think I don't understand why this is happening.
This is because, as far as I can tell, there's no global cache of model instances, so each query creates new instances, and your lists of related objects are created lazily using separate queries.
You might find that select_related() is smart enough to solve the problem in this case. Instead of code like:
match = Match.objects.filter(...).get()
use:
match = Match.objects.select_related().filter(...).get()
That creates all the attribute instances at once and may be smart enough to re-use instances. Otherwise, you are going to need some kind of explicit cache (which is what your solution does).
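To make the asker's cache generic, something along these lines might work (a hedged, untested sketch; it uses the older Django _meta API, where relational fields expose .rel, and newer versions use field.is_relation instead):

def unify_references_generic(instances):
    # Map (model class, pk) -> the first in-memory object seen for that
    # row, then repoint every ForeignKey attribute at the cached object.
    cache = {}
    for obj in instances:
        for field in obj._meta.fields:
            if not field.rel:  # skip non-relational fields
                continue
            related = getattr(obj, field.name)
            if related is None:
                continue
            key = (related.__class__, related.pk)
            setattr(obj, field.name, cache.setdefault(key, related))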
Warning: I am surprised by this kind of behaviour myself and am not an expert on this. I found this post while searching for information on this kind of issue in my own code. I'm just sharing what I think is happening as I try to understand...
You might want to check out django-idmapper. It defines a SharedMemoryModel so that there is only one copy of each instance in the interpreter.
Uh, are you using get_or_create() for the Player records? If not, then you are probably creating new instances of identical (or near identical) Player records on every match. This can lead to tears and/or insanity.
I've found a good answer to a similar question, take a look here:
Django Models: preserve object identity over foreign-key following