Django: ORM Design Issue - python

Scenario: Vehicle Testing - (Vehicles are booked into a test cell and then tested)
For example purposes I have created two simplified models:
from django.db import models

class Vehicle(models.Model):
    registration = models.CharField(unique=True, max_length=10)
    tyre_pressure = models.IntegerField()

class Booking(models.Model):
    STATUS_CHOICES = (('Booked', 'Booked'), ('Complete', 'Complete'))
    booking_datetime = models.DateTimeField(auto_now=True)
    # SET_NULL requires the column to be nullable
    vehicle = models.ForeignKey(Vehicle, null=True, on_delete=models.SET_NULL)
    status = models.CharField(max_length=20, choices=STATUS_CHOICES)
When a booking is created/booked, it creates a Booking object with the status Booked and the booking has an assigned vehicle.
Once the test has completed, the Booking object changes its status to Complete. The vehicle is now vacant for the next test.
Issue
The tyre_pressure for the vehicle (for example) may change later on after the Booking completed (Maybe for a different test) and cause the test to have incorrect data for that particular time.
I want the Booking record to reflect the vehicle state at that particular time of the booking rather than the present value.
What I've considered
When a vehicle is edited/updated it creates a new, unique id. This would conflict with the unique=True parameter in the registration field.
Include all vehicle fields inside Booking Model for given test instead of Vehicle object.
Possibly create a new VehicleConfiguration model. Contains all properties except the unique key. New record is created on every change to vehicle and historical changes are stored with date changed. Like so:
class Vehicle(models.Model):
    registration = models.CharField(unique=True, max_length=10)
    vehicle_config = models.ForeignKey('VehicleConfiguration', on_delete=models.CASCADE)

class VehicleConfiguration(models.Model):
    tyre_pressure = models.IntegerField()
    [...]

This is a very common pattern, where you need a frozen state of a changeable model at a certain point in time. There are various ways to deal with it. In e-commerce, you see this pattern when a Product in a catalogue is purchased: it cannot be a foreign key in the Order, because a copy needs to be made with the price at that time, and even the name at that time. The Order references PurchasedItem instead of Product, and the two share a number of fields.
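To make the "frozen copy" idea concrete, here is a minimal sketch in plain Python rather than Django models, so it runs without a database. The names VehicleSnapshot and snapshot are hypothetical and only illustrate copying the mutable fields at booking time.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Vehicle:
    registration: str
    tyre_pressure: int


@dataclass(frozen=True)
class VehicleSnapshot:
    # Hypothetical frozen copy of the fields that may change later.
    registration: str
    tyre_pressure: int
    taken_at: datetime


def snapshot(vehicle: Vehicle) -> VehicleSnapshot:
    # Copy the current values instead of keeping a reference to the vehicle.
    return VehicleSnapshot(vehicle.registration, vehicle.tyre_pressure, datetime.now())


car = Vehicle("AB12 CDE", 32)
booked_state = snapshot(car)  # taken when the booking is made
car.tyre_pressure = 28        # later change for a different test
```

After the later change, booked_state still carries the pressure recorded at booking time, which is the behaviour the question asks for. In Django the snapshot would presumably be its own model or a set of fields on Booking.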
However, in this case you may be able to separate test data from vehicle:
class Booking(models.Model):
    # on_delete is required since Django 2.0; the policies chosen here are assumptions
    vehicle = models.ForeignKey('Vehicle', related_name='bookings', on_delete=models.CASCADE)
    # ManyToMany if one booking can result in multiple tests
    test_results = models.ForeignKey('Test', null=True, on_delete=models.SET_NULL)
    ...  # financial data

class Vehicle(models.Model):
    ...  # static vehicle data, like make, license plate, owner, etc

class Test(models.Model):
    test_time = models.DateTimeField(auto_now_add=True)
    tyre_pressure = models.IntegerField()
So in this case, the booking ties the vehicle and test together, and the test is assumed to be static. The vehicle's tyre pressure would be a proxy property for the latest test:
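class Vehicle(models.Model):
    ...

    @property
    def tyre_pressure(self) -> int:
        # Follow the latest completed booking to its test result
        latest_booking = self.bookings.filter(
            status='Complete', test_results__isnull=False
        ).latest('test_results__test_time')
        return latest_booking.test_results.tyre_pressure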

I would create an abstract model for vehicle and booking to inherit shared properties from. Pretty much your second bullet point idea.

Related

Django/Python: Best Practice/Advice on handling external IDs for Multiple Multi-directional External APIs

So this is more of a conceptual question, and I am really looking for someone to just help point me in the right direction. I am building a middleware platform where I will be pulling data in from inbound channels, manipulating it, and then pushing it out the other door to outbound channels. I will need to store the external id for each of these records, but the kicker is, records will be pulled from multiple sources, and then pushed to multiple sources. A single record in my system will need to be tied to any number of external ids.
A quick model to work with:
class record(models.Model):
    #id
    Name = models.CharField(max_length=255, help_text="")
    Description = models.CharField(max_length=255, help_text="")
    category_id = models.ForeignKey('category', on_delete=models.CASCADE)

class category(models.Model):
    #id
    name = models.CharField(max_length=255, help_text="")
    description = models.CharField(max_length=255, help_text="")

class channel(models.Model):
    #id
    name = models.CharField(max_length=255, help_text="")
    inbound = models.BooleanField()
    outbound = models.BooleanField()
Obviously, I cannot add a new field to every model every time I add a new integration, that would be soooo 90s. The obvious would be to create another model to simply store the channel and record id with the unique id, and maybe this is the answer.
class external_ref(models.Model):
    model_name = models.CharField(max_length=255)
    internal_id = models.IntegerField()
    external_id = models.IntegerField()
    channel_id = models.IntegerField()

    class Meta:
        # channel_id is included so one record can hold an external id per channel
        unique_together = ('model_name', 'internal_id', 'channel_id',)
While my example holds just 4 models, I will be integrating records from 10-20 different models, so something I could implement at a global level would be optimal. Other things I have considered:
Overwriting the base model class to create a new "parent" class that also holds an alphanumeric representation of every record in the db as unique.
Creating an abstract model to do the same.
Possibly storing a json reference with channel : external_id that I could ping on every record to see if it has an external reference.
I'm really an open book on this, and the internet has become increasingly overwhelming to sift through. Any best practices or advice would be much appreciated. Thanks in advance.
I have this exact issue, and yes, there is not much information on the web about using Django this way. Here's what I'm doing; I haven't used it long enough to determine if it's "the best" way.
I have a class IngestedModel which tracks the source of the incoming objects as well as their external ids. This is also where you would put a unique_together constraint (on external_id and source)
class RawObject(TimeStampedModel):
    """
    A Table to keep track of all objects ingested into the database and where they came from
    """
    data = models.JSONField()
    source = models.ForeignKey(Source, on_delete=models.PROTECT)

class IngestedModel(models.Model):
    external_id = models.CharField(max_length=50)
    source = models.ForeignKey(Source, on_delete=models.CASCADE)  # 1 or 0
    raw_objects = models.ManyToManyField(RawObject, blank=True)

    class Meta:
        abstract = True
Then every model that is created from ingested data inherits from this IngestedModel. That way you know its source, and you can use each external object for more than one internal object and vice versa.
class Customer(IngestedModel):
    ...
class Order(IngestedModel):
    ...
etc.
Now this means there is no "IngestedModel" table but that every model has a field for source, external_id and a reference to a raw object (many to many). This feels more compositional rather than inherited - no child tables which seems better to me. I would also love to hear feedback on the "right" way to do this.
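The lookup behaviour this design is after can be sketched without Django. The following is only an illustration in plain Python: ExternalRefRegistry and the channel names are hypothetical, and the uniqueness check on (source, external_id) stands in for the unique_together constraint mentioned above.

```python
class ExternalRefRegistry:
    """In-memory stand-in for an external-reference table (hypothetical helper)."""

    def __init__(self):
        self._by_external = {}  # (source, external_id) -> (model_name, internal_id)
        self._by_internal = {}  # (model_name, internal_id) -> {(source, external_id), ...}

    def link(self, model_name, internal_id, source, external_id):
        record = (model_name, internal_id)
        owner = self._by_external.get((source, external_id))
        # Emulates a unique constraint on (source, external_id).
        if owner is not None and owner != record:
            raise ValueError(f"{source}:{external_id} already belongs to {owner}")
        self._by_external[(source, external_id)] = record
        self._by_internal.setdefault(record, set()).add((source, external_id))

    def resolve(self, source, external_id):
        # Returns the internal record for an external id, or None if unknown.
        return self._by_external.get((source, external_id))

    def external_ids(self, model_name, internal_id):
        return sorted(self._by_internal.get((model_name, internal_id), set()))


refs = ExternalRefRegistry()
refs.link("Customer", 1, "inbound_a", "A-100")
refs.link("Customer", 1, "outbound_b", "9001")
```

Here one internal record is tied to any number of external ids, and an incoming id can be resolved back to the internal record, which is what the original question needs from whichever table layout is chosen.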

What's the use of Intermediate models in Django?

Why do we use an intermediate model?
Can't we just use Many to many relationship without intermediate model?
M2M relationships require intermediate tables (also referred to as junction tables).
Django abstracts this away by automagically creating this intermediate table for you, unless you need to add custom fields to it. If you do, you can define the intermediate model yourself and pass it to the ManyToManyField with the through parameter.
(Image: a quick picture of why the table is required. Source: https://www.geeksforgeeks.org/intermediate-fields-in-django-python/)
Let's say you have two models which have a Many-to-Many relationship, like Customer and Product. One customer can buy many products and a product can be bought by many customers.
But you can have some data that belongs to neither of them but is important to the transaction, like quantity or date.
Quantity and date are the intermediary data which are stored in intermediary models.
from django.db import models

class Item(models.Model):
    name = models.CharField(max_length=128)
    price = models.DecimalField(max_digits=5, decimal_places=2)

    def __str__(self):
        return self.name

class Customer(models.Model):
    name = models.CharField(max_length=128)
    age = models.IntegerField()
    items_purchased = models.ManyToManyField(Item, through='Purchase')

    def __str__(self):
        return self.name

class Purchase(models.Model):
    item = models.ForeignKey(Item, on_delete=models.CASCADE)
    customer = models.ForeignKey(Customer, on_delete=models.CASCADE)
    date_purchased = models.DateField()
    quantity_purchased = models.IntegerField()
When a product is bought, it goes through the Purchase model: the customer buys quantity_purchased units of item on date_purchased.
The Purchase model is the Intermediate model.
Django documentation says:
...if you want to manually specify the intermediary table, you can use
the through option to specify the Django model that represents the
intermediate table that you want to use.
In this case we have this line in the Customer model, which defines the intermediary model in through = 'Purchase'
items_purchased = models.ManyToManyField(Item, through = 'Purchase')
Let's now use the example from the Django Documentation.
You have a database of musicians with a Many-to-Many relationship with the bands they belong to: a musician can be part of many bands, and a band can have many musicians.
What data do you want to keep?
For musicians (person): name and instrument they play
For the bands: name and style.
from django.db import models

class Person(models.Model):
    name = models.CharField(max_length=128)
    age = models.IntegerField()

class Group(models.Model):
    name = models.CharField(max_length=128)
    style = models.CharField(max_length=128)
    person = models.ForeignKey(Person, on_delete=models.CASCADE)
But, wouldn't you think that knowing when the person joined the band is important? What model would be the natural place to add a date_joined field? It makes no sense to add it to Person or Group, because it's not an intrinsic field for each of them, but it's related to an action: joining the band.
So you make a small, but important adjustment. You create an intermediate model that will relate the Person, the Group with the Membership status (which includes the date_joined).
The new version is like this:
from django.db import models

class Person(models.Model):
    name = models.CharField(max_length=128)
    age = models.IntegerField()

class Group(models.Model):
    name = models.CharField(max_length=128)
    style = models.CharField(max_length=128)
    members = models.ManyToManyField(Person, through='Membership')

class Membership(models.Model):
    person = models.ForeignKey(Person, on_delete=models.CASCADE)
    group = models.ForeignKey(Group, on_delete=models.CASCADE)
    date_joined = models.DateField()
The changes are:
You added a new class called Membership which reflects the membership status.
In the Group model you added members = models.ManyToManyField(Person, through='Membership'). With this you relate Person and Group with Membership, thanks to through.
Something important to clarify.
An intermediate model (in relational database terms, an associative entity) is always needed in a Many-to-Many (M2M) relationship.
A relational database requires the implementation of a base relation
(or base table) to resolve many-to-many relationships. A base relation
representing this kind of entity is called, informally, an associative
table... that can contain references to columns from the same or different database tables within the same database.
An associative (or junction) table maps two or more tables together by
referencing the primary keys of each data table. In effect, it
contains a number of foreign keys, each in a many-to-one relationship
from the junction table to the individual data tables. The PK of the
associative table is typically composed of the FK columns themselves. (source)
Django will create the intermediate model, even when you don't explicitly define it with through.
Behind the scenes, Django creates an intermediary join table to
represent the many-to-many relationship. By default, this table name
is generated using the name of the many-to-many field and the name of
the table for the model that contains it.
Django will automatically generate a table to manage many-to-many
relationships. However, if you want to manually specify the
intermediary table, you can use the through option to specify the
Django model that represents the intermediate table that you want to
use.
The most common use for this option is when you want to associate extra data with a many-to-many relationship. (source)
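To see what the junction table looks like at the database level, here is a small sketch using Python's built-in sqlite3 module instead of Django. The table and column names (person, band, membership) and the sample rows are made up for illustration; they are not what Django would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE band (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Junction table: one row per (person, band) pair, plus the extra data.
    CREATE TABLE membership (
        person_id INTEGER NOT NULL REFERENCES person(id),
        band_id INTEGER NOT NULL REFERENCES band(id),
        date_joined TEXT NOT NULL,
        PRIMARY KEY (person_id, band_id)
    );
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO band VALUES (1, 'Band A'), (2, 'Band B');
    INSERT INTO membership VALUES
        (1, 1, '2020-01-01'), (2, 1, '2020-02-01'), (2, 2, '2021-03-01');
""")

# Resolve the many-to-many relationship by joining through the junction table.
rows = conn.execute("""
    SELECT person.name, band.name, membership.date_joined
    FROM membership
    JOIN person ON person.id = membership.person_id
    JOIN band ON band.id = membership.band_id
    ORDER BY person.name, band.name
""").fetchall()
```

The date_joined column has no natural home on person or band, which is the same reasoning that puts date_joined on the Membership model above.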

Correctly defining this data relation in Django models

I'm working on a Django project, where I have amongst others, two models that have a relationship.
The first model describes a dish in general. It has a name and some other basic information, for instance:
class dish(models.Model):
    name = models.CharField(max_length=100)
    short_desc = models.CharField(max_length=255)
    vegetarian = models.BooleanField(default=False)
    vegan = models.BooleanField(default=False)
The second model is related to the dish, I assume in form of a one-to-one relationship. This model contains the preparation and the ingredients. This data may change over time for the dish (e.g. preparation text is adjusted). Old versions of this text are still stored, but not connected to the dish. So the dish gets a new field, which points to the current preparation text.
preparation = models.???(???)
So, whenever the preparation description is changed a new entry is created for the preparation and the dish's reference to the preparation is updated.
The preparation itself looks like this:
class preparation(models.Model):
    prep_test = models.TextField()
    ingredients = models.TextField()
    last_update = models.DateTimeField()
As stated before, I believe that a one-to-one relation would be reasonable between the dish and the preparation.
Is my assumption with the one-to-one relation correct and if so, how do I correctly define it?
If you have multiple preparations for the dish, you don't have a one-to-one relationship by definition.
The way to define this is a ForeignKey from Preparation to Dish. (Note, Python style is that classes start with an upper case letter.)
class Preparation(models.Model):
    ...
    dish = models.ForeignKey('Dish', on_delete=models.CASCADE)
Now you can do my_dish.preparation_set.latest('last_update') to get the latest preparation for a dish. If you add an inner Meta class to Preparation and define get_latest_by = 'last_update', you can leave out the parameter to the latest() call.
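The "keep every version, read the newest" idea can be sketched in plain Python as follows. This is only an analogy for the ForeignKey plus latest() approach; the dataclasses and the latest_preparation helper are hypothetical and not part of Django.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Preparation:
    prep_text: str
    last_update: datetime


@dataclass
class Dish:
    name: str
    # Old versions stay in the list; nothing is overwritten.
    preparations: list = field(default_factory=list)

    def latest_preparation(self) -> Preparation:
        # Plays the role of preparation_set.latest('last_update').
        return max(self.preparations, key=lambda p: p.last_update)


soup = Dish("Tomato soup")
soup.preparations.append(Preparation("Boil for 10 minutes.", datetime(2020, 1, 1)))
soup.preparations.append(Preparation("Simmer for 20 minutes.", datetime(2021, 6, 1)))
current = soup.latest_preparation()
```

Note that max() raises ValueError on an empty list, much as latest() raises DoesNotExist when a dish has no preparations yet, so callers need to handle that case either way.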
Make sure the relations are correct; otherwise you end up with repeated tuples in your models, which is not good practice and makes your database very heavy. Here is how I would define the relations:
class dish(models.Model):
    name = models.CharField(max_length=100)
    short_desc = models.CharField(max_length=255)
    vegetarian = models.BooleanField(default=False)
    vegan = models.BooleanField(default=False)

class Ingredients(models.Model):
    name = models.CharField(max_length=100)
    dish = models.ForeignKey(dish, on_delete=models.CASCADE)

class preparation(models.Model):
    prep_test = models.TextField()
    last_update = models.DateTimeField()
    dish = models.OneToOneField(dish, on_delete=models.CASCADE)
Why not make a one-to-many relation from dish to preparation?
A dish can have multiple preparations but only one active one. You can pick the latest one based on last_update = models.DateTimeField().
Your model will look like:
class preparation(models.Model):
    dish = models.ForeignKey(dish, on_delete=models.CASCADE)
    ...

Django: Model with varying fields (Entity-Attribute-Value model)

I have the following Django model to store sparse product data in a relational database. I apologize for any wrong relationships in the code below (ForeignKey and/or ManyToMany might be wrongly placed, I am just playing around with Django for now).
class ProdCategory(models.Model):
    category = models.CharField(max_length=32, primary_key=True)

class ProdFields(models.Model):
    categoryid = models.ForeignKey(ProdCategory, on_delete=models.CASCADE)
    field = models.CharField(max_length=32)

class Product(models.Model):
    name = models.CharField(max_length=20)
    stock = models.IntegerField()
    price = models.FloatField()

class ProdData(models.Model):
    prodid = models.ManyToManyField(Product)
    fieldid = models.ManyToManyField(ProdFields)
    value = models.CharField(max_length=128)
The idea is to store the name, stock and price for each product in one table and the information for each product in the (id, value) format in another table.
I do know, a priori, the fields that each product category should have. For instance, a product of type Desktop should have, among others, memory size and storage size as fields, whereas another product of category Monitor should have resolution and screen size as fields.
My question is: How do I guarantee, in Django, that each product contains only the fields for its category? More precisely, when specifying a product of category Monitor, how to assure that only resolution and screen size are fields in the ProdData table?
I found a similar question Django: Advice on designing a model with varying fields, but there was no answer on how to assure the above.
Thank you in advance.
Django is an excellent framework, but it is still just an abstraction over a relational database.
What you are asking isn't efficiently possible in a relational database, so it will be tough to do in Django. Primarily, because at some point your code will need to be converted to tables.
There are basically 2 ways you can do this:
A product class with a ManyToMany relation to an attribute table:
class Product(models.Model):
    name = models.CharField(max_length=20)
    stock = models.IntegerField()
    price = models.FloatField()
    product_type = models.CharField(max_length=20)  # e.g. Monitor, Desktop, etc.
    attributes = models.ManyToManyField('ProductAttribute')

class ProductAttribute(models.Model):
    property = models.CharField(max_length=20)  # e.g. "resolution"
    value = models.CharField(max_length=20)  # e.g. "1080p"
But, your logic around certain classes of objects having certain properties will be lost.
Use inheritance. Django is just Python, and inheritance is certainly possible - in fact it's encouraged:
class Product(models.Model):
    name = models.CharField(max_length=20)
    stock = models.IntegerField()
    price = models.FloatField()

class Desktop(Product):
    memory_size = models.CharField(max_length=20)
    storage_size = models.CharField(max_length=20)

class Monitor(Product):
    resolution = models.CharField(max_length=20)
    screen_size = models.CharField(max_length=20)
Then you can do queries on all products - Product.objects.all() - or just Monitors - Monitor.objects.all() - and so on. This hard-codes the possible products in code, so a new product type requires a database migration, but it also gives you the ability to embed your business logic in the models themselves.
There are trade-offs to both these approaches which you need to decide, so picking is up to you.
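If the attribute-table approach is chosen, the "only fields for its category" rule has to be enforced in application code. A minimal sketch of such a check in plain Python follows; ALLOWED_FIELDS and validate_product_data are hypothetical names, and in Django a check like this would presumably be called from a model's clean() method or a form.

```python
# Hypothetical mapping of category -> allowed field names, known a priori.
ALLOWED_FIELDS = {
    "Desktop": {"memory size", "storage size"},
    "Monitor": {"resolution", "screen size"},
}


def validate_product_data(category, data):
    """Return data unchanged if every key is allowed for the category."""
    allowed = ALLOWED_FIELDS.get(category)
    if allowed is None:
        raise ValueError(f"Unknown category: {category!r}")
    unexpected = set(data) - allowed
    if unexpected:
        raise ValueError(
            f"Fields not allowed for {category}: {sorted(unexpected)}"
        )
    return data


monitor_data = validate_product_data("Monitor", {"resolution": "1080p"})
```

With the inheritance approach this check is unnecessary, because the database schema itself only offers the columns that belong to each product type.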

Filtering object from one model accordingly to the field of another

I am making a new section on a website where existing customers (Customer model) can choose to appear on.
New users are not required to have an account from the main site (Customer) and can just create an account for the new section (NewSecUser model)
class Customer(models.Model):
    name = models.CharField(max_length=50)
    #[...]
    is_visible_on_new_section = models.BooleanField(default=False)

class NewSecUser(models.Model):
    name = models.CharField(max_length=50)
    #[...]
    customer_id = models.IntegerField(null=True)
    # customer_id refers to the id of a Customer model object
    # its value is different from null only when a Customer chooses to appear
    # on the new section
How would one use exclude() to filter-out NewSecUser objects where Customer objects have an id equal to NewSecUser.customer_id and is_visible_on_new_section set to False?
Basically something similar to an SQL JOIN (with new_sec_user.customer_id=customer.id) I believe.
I know it would have been much easier with customer_id being a foreign key but I did not choose this.
Customer.objects.filter(
    id__in=NewSecUser.objects.filter(customer_id__isnull=False).values_list('customer_id', flat=True),
    is_visible_on_new_section=True,
)
or something very similar
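The JOIN-like exclusion the question describes can be checked at the SQL level with Python's built-in sqlite3 module. This is only a sketch: the table names and sample rows are made up, and it is not the SQL Django would emit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        is_visible_on_new_section INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE new_sec_user (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        customer_id INTEGER  -- plain integer, not a foreign key
    );
    INSERT INTO customer VALUES (1, 'visible', 1), (2, 'hidden', 0);
    INSERT INTO new_sec_user VALUES
        (10, 'linked-visible', 1), (11, 'linked-hidden', 2), (12, 'standalone', NULL);
""")

# Keep users with no customer, or whose customer is not among the hidden ones.
# The explicit IS NULL branch matters: NULL NOT IN (...) is never true in SQL.
kept = [name for (name,) in conn.execute("""
    SELECT name FROM new_sec_user
    WHERE customer_id IS NULL
       OR customer_id NOT IN (
            SELECT id FROM customer WHERE is_visible_on_new_section = 0
       )
    ORDER BY id
""")]
```

Assuming the models from the question, a roughly equivalent ORM expression would be NewSecUser.objects.exclude(customer_id__in=Customer.objects.filter(is_visible_on_new_section=False).values('id')), which returns NewSecUser objects rather than Customer objects.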
