I have the following code in models.py:
class Order(ndb.Model):
    created_at = ndb.DateTimeProperty(auto_now_add=True)
    updated_at = ndb.DateTimeProperty(auto_now=True)
    name = ndb.StringProperty()
    due_dates = ndb.DateProperty(repeated=True)

class Task(ndb.Model):
    created_at = ndb.DateTimeProperty(auto_now_add=True)
    updated_at = ndb.DateTimeProperty(auto_now=True)
    order = ndb.KeyProperty(required=True)
    order_updated_at = ndb.DateTimeProperty(required=True)
    ...
When an order is created, 6 tasks will be created. Currently, I have the following method:
def _post_put_hook(self, future):
    # Deleting old tasks
    tbd = Task.query(Task.order == self.key).fetch(keys_only=True)
    ndb.delete_multi(tbd)
    # Generating new tasks
    for entry in self.entries:
        pt = entry.producetype.get()
        # Now create Tasks and store them into the database
        Task(order=self.key,
             order_updated_at=self.updated_at,
             order_entry_serial=entry.serial,
             date=dt_sowing,
             action=TaskAction.SOWING).put()
Now I am changing the way Order and Task are created. I want to create Tasks when an Order is created, AND I want to delete an Order's Tasks when the Order is modified.
Unfortunately, ndb's API states:
The Datastore API does not distinguish between creating a new entity
and updating an existing one. If the object's key represents an entity
that already exists, the put() method overwrites the existing entity.
You can use a transaction to test whether an entity with a given key
exists before creating one. See also the Model.get_or_insert() method.
I don't really understand how Model.get_or_insert can be applied in my scenario.
Note that I can't use _pre_put_hook because my Tasks need to reference their Order via its key.
Ignore get_or_insert(); it returns an entity in any case and doesn't help you. You need to check whether the tasks already exist in the datastore. I would wrap a get() or get_multi() call in a try/except: if the entities exist, delete them; otherwise create 6 new Task entities with put_multi().
edit: you need timestamps to check for pre-existence. Look at DateTimeProperty and its auto_now_add/auto_now options.
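For illustration, a minimal sketch of that approach inside _post_put_hook, assuming the models above (the remaining Task fields from your snippet, such as serial, date and action, are omitted here):

def _post_put_hook(self, future):
    # If tasks already exist, this put() was a modification; otherwise it was a creation
    existing = Task.query(Task.order == self.key).fetch(keys_only=True)
    if existing:
        # The Order was modified: delete its tasks
        ndb.delete_multi(existing)
    else:
        # The Order was just created: generate its 6 tasks in one batch
        tasks = [Task(order=self.key, order_updated_at=self.updated_at)
                 for _ in range(6)]
        ndb.put_multi(tasks)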
Related
I have a foreign key relationship in my Django (v3) models:
class Example(models.Model):
    title = models.CharField(max_length=200) # this is irrelevant for the question here
    not_before = models.DateTimeField(auto_now_add=True)
    ...

class ExampleItem(models.Model):
    myParent = models.ForeignKey(Example, on_delete=models.CASCADE)
    execution_date = models.DateTimeField(auto_now_add=True)
    ....
Can I have code running/triggered whenever an ExampleItem is "added to the list of items in an Example instance"? What I would like to do is run some checks and, depending on the concrete Example instance possibly alter the ExampleItem before saving it.
To illustrate:
Let's say the Example's not_before date dictates that the ExampleItem's execution_date must not be before not_before. I would like to check whether the to-be-saved ExampleItem's execution_date violates this condition. If so, I would want to either change the execution_date to make it valid or throw an exception (whichever is easier). The same is true for a duplicate execution_date (i.e. if the respective Example already has an ExampleItem with the same execution_date).
So, in a view, I have code like the following:
def doit(request, example_id):
    # get the relevant `Example` object
    example = get_object_or_404(Example, pk=example_id)
    # create a new `ExampleItem`
    itm = ExampleItem()
    # set the item's parent
    itm.myParent = example # <- this should trigger my validation code!
    itm.save() # <- (or this???)
The thing is, this view is not the only way to create new ExampleItems; I also have an API, for example, that can do the same (not to mention that a user could potentially add ExampleItems manually via the REPL). Preferably the validation code should not be duplicated in all the places where new ExampleItems can be created.
I was looking into Signals (Django docs), specifically pre_save and post_save (of ExampleItem), but I think pre_save is too early while post_save is too late... Also m2m_changed looks interesting, but I do not have a many-to-many relationship.
What would be the best/correct way to handle these requirements? They seem to be rather common, I imagine. Do I have to restructure my model?
The obvious solution here is to put this code in the ExampleItem.save() method - just beware that Model.save() is not invoked by some queryset bulk operations.
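A minimal sketch of that override, added to the ExampleItem model from the question; raising ValidationError (rather than silently adjusting the date) is an illustrative choice, not part of the original answer:

from django.core.exceptions import ValidationError
from django.utils import timezone

class ExampleItem(models.Model):
    myParent = models.ForeignKey(Example, on_delete=models.CASCADE)
    execution_date = models.DateTimeField(auto_now_add=True)

    def save(self, *args, **kwargs):
        # auto_now_add only fills execution_date on insert, so fall back to now()
        date = self.execution_date or timezone.now()
        # Rule 1: the item may not run before the parent's not_before
        if date < self.myParent.not_before:
            raise ValidationError("execution_date is before the parent's not_before")
        # Rule 2: no duplicate execution_date within the same Example
        if self.myParent.exampleitem_set.filter(
                execution_date=date).exclude(pk=self.pk).exists():
            raise ValidationError("duplicate execution_date for this Example")
        super().save(*args, **kwargs)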
Using signal handlers on your own app's models is actually an antipattern - the goal of signals is to allow your app to hook into other apps' lifecycle without having to change those other apps' code.
Also (unrelated, but): you can populate your newly created model instances directly via their initializers, i.e.:
itm = ExampleItem(myParent=example)
itm.save()
and you can even save them directly:
# creates a new instance, populate it AND save it
itm = ExampleItem.objects.create(myParent=example)
This will still invoke your model's save method so it's safe for your use case.
I have two models:
class User(models.Model):
    # stuff

class Task(models.Model):
    # stuff
When assigning new tasks to users, I want to avoid assigning tasks the worker has gotten before. The approach I'm trying to take is to add a third model, Assignment that tracks tasks assigned to a given user:
class Assignment(models.Model):
    worker = models.ForeignKey(User)
    task = models.ForeignKey(Task)
How, though, do I go about forming a query based on this? My thought was to start by filtering Assignments by the user being assigned to, but I'm getting stuck after that point.
def get_tasks(worker):
    previous_assignments = Assignment.objects.filter(worker=worker)
    # then something like...
    assignable_tasks = Task.objects.exclude(pk__in=previous_assignments.task) # clearly wrong
Is there a way to access the task ids inside the previous_assignments queryset? I'm not sure whether this is the best approach, or if I'm just missing how to move past this step.
Edit: It occurs to me that I could populate an empty set with Tasks by looping through those Assignment objects and then use an __in exclude argument like above... Any better approaches than that?
In order to exclude with pk__in, you would need a list or queryset of task ids to exclude. For example you could do:
previous_assignments = Assignment.objects.filter(worker=worker).values_list('task_id')
assignable_tasks = Task.objects.exclude(pk__in=previous_assignments)
However you don't need to do this. You can use the double underscore notation to follow the relationship from assignment to worker:
assignable_tasks = Task.objects.exclude(assignment__worker=worker)
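Putting it together, your helper collapses to a single line (a sketch, assuming the models from the question):

def get_tasks(worker):
    # One query: exclude every task that already has an assignment for this worker
    return Task.objects.exclude(assignment__worker=worker)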
Note that you could use a many-to-many field instead, and Django will take care of creating the joining table for you:
class Task(models.Model):
    users = models.ManyToManyField(User)
In this case, your query becomes:
assignable_tasks = Task.objects.exclude(users=worker)
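Recording that a worker received a task is then a one-liner as well (a usage sketch; task and worker are assumed to be existing instances):

# Remember that this worker has been given this task
task.users.add(worker)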
I have a model which is an instance for the existence of an item (a ticket), and on each creation of a ticket I create an instance of another model, a record. Each record keeps track of who made a change to the ticket and what they did with it; it basically keeps a record of what has happened with it. I want the ticket's creator and creation date to be defined as the creator and creation date of the first record which points to it (the first of the many in a many-to-one relation).
As is, I have a function which does this very simply:
def created_by(self):
    records = Record.objects.filter(ticket=self.id).order_by('created_on')
    return records[0].created_by
However I run into an issue with this when trying to sort a collection of tickets (which will logically most often be sorted by creation date): I cannot sort by a function using Django's query filters.
I don't really want to store the redundant data in the database, and I'd rather have the records than not, so that all date items related to the ticket can be seen in the records. Ideas on how to make it so I can sort and search by this first record? (I also need to search by the creator, because some users can only see their own tickets, some can see all, and some can see subsets.)
Thanks!
Assuming the Record ticket field is a Foreign key to the Ticket model:
class Record(models.Model):
    ....
    create_time = models.DateTimeField()
    ticket = models.ForeignKey(Ticket, related_name='records')
You can replace the manager (objects) of the Ticket model and override its get_queryset method:

from django.db.models import Min

class TicketManager(models.Manager):
    def get_queryset(self):
        return super(TicketManager, self).get_queryset().annotate(
            create_time=Min('records__create_time')).order_by('create_time')

class Ticket(models.Model):
    .....
    objects = TicketManager()
Now every query like Ticket.objects.all() or Ticket.objects.filter(...) will be sorted by the create time.
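Because the annotation behaves like a regular field, you can also filter on it (a usage sketch; the cutoff date is just an illustrative assumption):

import datetime

# Tickets arrive ordered by their first record's create_time,
# and the annotated field works in filters too
recent = Ticket.objects.filter(create_time__gte=datetime.date(2020, 1, 1))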
There is a models.py with 4 models.
The standard variant is:
class Main(models.Model):
    stuff = models.IntegerField()

class Second(models.Model):
    nested = models.ForeignKey(Main)
    stuff = models.IntegerField()

class Third(models.Model):
    nested = models.ForeignKey(Second)
    stuff = models.IntegerField()

class Last(models.Model):
    nested = models.ForeignKey(Third)
    stuff = models.IntegerField()
and there is another variant of Last model:
class Last(models.Model):
    nested1 = models.ForeignKey(Main)
    nested2 = models.ForeignKey(Second)
    nested = models.ForeignKey(Third)
    stuff = models.IntegerField()
Will that approach save some database load?
The information in nested1 and nested2 will duplicate fields in Second and Third, and it may even become outdated (fortunately not in my case, as the data will not be changed, only new data is added).
But from my thoughts it may save database load when I'm looking up all Last records for a certain Main record, or when I'm looking up only Main.id for a specific Last item.
Am I right?
Will it really save the load, or is there a better practice?
It all depends how you access the data. By default Django will make another call to the database when you access a foreign key. So if you want to make less calls to the database, you can use select_related to prefetch the models in foreign keys.
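A minimal sketch of both lookups with select_related, assuming the original (non-denormalized) models; some_pk and main are placeholders:

# Fetch a Last row and its whole foreign-key chain in one query
last = (Last.objects
        .select_related('nested__nested__nested')
        .get(pk=some_pk))
main_id = last.nested.nested.nested.id  # no additional queries

# All Last records for a given Main, without denormalized columns
lasts = Last.objects.filter(nested__nested__nested=main)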
I'm new to Python. I'm trying to figure out how to emulate an existing application I've coded using PHP and MS-SQL, and re-create the basic back-end functionality on Google App Engine.
One of the things I'm trying to do is emulate the current behaviour of certain tables I have in MS-SQL: an Insert/Delete/Update trigger inserts a copy of the current (pre-change) record into an audit table and stamps it with a date and time. I'm then able to query this audit table at a later date to examine the history of changes that the record went through.
I've found the following code here on stackoverflow:
class HistoryEventFieldLevel(db.Model):
    # parent, you don't have to define this
    date = db.DateProperty()
    model = db.StringProperty()
    property = db.StringProperty() # Name of changed property
    action = db.StringProperty(choices=(['insert', 'update', 'delete']))
    old = db.StringProperty() # Old value for field, empty on insert
    new = db.StringProperty() # New value for field, empty on delete
However, I'm unsure how this code can be applied to all objects in my new database.
Should I create get() and put() functions for each of my objects, and then in the put() function create a child object of this class and set its particular properties?
This is certainly possible, albeit somewhat tricky. Here are a few tips to get you started:
Overriding the class's put() method isn't sufficient, since entities can also be stored by calling db.put(), which won't call any methods on the class being written.
You can get around this by monkeypatching the SDK to call pre/post call hooks, as documented in my blog post here.
Alternately, you can do this at a lower level by implementing RPC hooks, documented in another blog post here.
Storing the audit record as a child entity of the modified entity is a good idea, and means you can do it transactionally, though that would require further, more difficult changes.
You don't need a record per field. Entities have a natural serialization format, Protocol Buffers, and you can simply store the entity as an encoded Protocol Buffer in the audit record. If you're operating at the model level, use model_to_protobuf to convert a model into a Protocol Buffer (a sketch follows after these tips).
All of the above are far more easily applied to storing the record after it's modified, rather than before it was changed. This shouldn't be an issue, though - if you need the record before it was modified, you can just go back one entry in the audit log.
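For illustration, a minimal sketch of the last two tips combined, using the old db API; AuditRecord and write_audit are names invented here, not part of the SDK:

from google.appengine.ext import db

class AuditRecord(db.Model):
    # The audited entity, serialized as an encoded Protocol Buffer
    snapshot = db.BlobProperty()
    date = db.DateTimeProperty(auto_now_add=True)

def write_audit(entity):
    # Store the snapshot as a child of the modified entity,
    # so both writes can share a transaction
    AuditRecord(parent=entity.key(),
                snapshot=db.model_to_protobuf(entity).Encode()).put()

The snapshot can later be decoded with db.model_from_protobuf when examining the history.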
I am a bit out of touch with GAE and don't have the SDK with me to test this out, so here are some guidelines to give you a hint of what you may do.
Create a metaclass AuditMeta which you set on any models you want audited.
AuditMeta, while creating a new model class, should copy the class under a new name with "_audit" appended, and should copy the attributes too, which becomes a bit tricky on GAE as attributes are themselves descriptors.
Add a put method to each such class, and on put create an audit object for that class and save it; that way, for each row in tableA you will have a history in tableA_audit.
e.g. a plain Python example (without GAE):
import new

class AuditedModel(object):
    def put(self):
        print "saving", self, self.date
        audit = self._audit_class()
        audit.date = self.date
        print "saving audit", audit, audit.date

class AuditMeta(type):
    def __new__(self, name, baseclasses, _dict):
        # create the model class, derived from AuditedModel
        klass = type.__new__(self, name, (AuditedModel,) + baseclasses, _dict)
        # create an audit class, a copy of klass
        # we need to copy attributes properly instead of just passing them like this
        auditKlass = new.classobj(name + "_audit", baseclasses, _dict)
        klass._audit_class = auditKlass
        return klass

class MyModel(object):
    __metaclass__ = AuditMeta
    date = "XXX"

# create an object
a = MyModel()
a.put()
output:
saving <__main__.MyModel object at 0x957aaec> XXX
saving audit <__main__.MyModel_audit object at 0x957ab8c> XXX
Read the audit trail code (only 200 lines) to see how they do it for Django.