Django model with filterable attributes - python

I've got two models. One represents a piece of equipment, the other represents a possible attribute the equipment has. Semantically, this might look like:
Equipment: tractor, Attributes: wheels, towing
Equipment: lawnmower, Attributes: wheels, blades
Equipment: hedgetrimmer, Attributes: blades
I want to make queries like:
wheels = Attributes.objects.get(name='wheels')
blades = Attributes.objects.get(name='blades')
Equipment.objects.filter(has_attribute=wheels) \
    .exclude(has_attribute=blades)
How can I create Django models to do this?
This seems simple, but I'm just too dense to see the right solution.
One solution that popped into my head is to encode the list of Attribute IDs in a delimited string like |109|14|3 and test for attributes using Equipment.objects.filter(attributes_contains='|%d|' % id) -- but this seems really wrong.

Your second example is pretty close, but you need to understand how the QuerySet API works across relationships (i.e. joins).
class Attribute(models.Model):
    name = models.CharField(max_length=20)

class Equipment(models.Model):
    name = models.CharField(max_length=20)
    attributes = models.ManyToManyField(Attribute)

equips = Equipment.objects.filter(
    attributes__name='wheels').exclude(attributes__name='blades')
You can use Q objects in your QuerySet to do more interesting queries.
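For instance, a Q-object query over these models might look like this (a sketch, not from the original answer; the attribute names come from the question, and distinct() is added to avoid duplicate rows from the join):

from django.db.models import Q

# equipment that has wheels or towing, but never blades
Equipment.objects.filter(
    Q(attributes__name='wheels') | Q(attributes__name='towing')
).exclude(attributes__name='blades').distinct()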
And keep in mind you can always dump the SQL for a QuerySet like this:
print(equips.query)
Sometimes you'll want to see the exact SQL being generated to make sure you're using the API correctly.

Django annotate value based on another model field

I have these two models, Cases and Specialties, just like this:
class Case(models.Model):
    ...
    judge = models.CharField()
    ...

class Specialty(models.Model):
    name = models.CharField()
    sys_num = models.IntegerField()
I know this sounds like a really weird structure, but try to bear with me:
The judge field in the Case model refers to the sys_num value of a Specialty instance (judge is a CharField, but it will always carry an integer, and each Specialty instance has a unique sys_num). So I can get the Specialty name related to a specific Case instance using something like this:
my_pk = # some number here...
my_case_judge = Case.objects.get(pk=my_pk).judge
my_specialty_name = Specialty.objects.get(sys_num=my_case_judge).name
I know this sounds really weird, but I can't change the underlying schema of the tables; I can only work around it with SQL and Django's ORM.
My problem is: I want to annotate the Specialty names onto a queryset of Cases that has already called values().
I only managed to get it working using Case and When, but it's not dynamic: if I add more Specialty instances I'll have to manually alter the code.
cases.annotate(
    specialty=Case(
        When(judge=0, then=Value('name 0 goes here')),
        When(judge=1, then=Value('name 1 goes here')),
        When(judge=2, then=Value('name 2 goes here')),
        When(judge=3, then=Value('name 3 goes here')),
        ...
Can this be done dynamically? I looked through Django's query reference docs but couldn't produce a working solution with the tools specified there.
You can do this with a subquery expression:
from django.db.models import OuterRef, Subquery

Case.objects.annotate(
    specialty=Subquery(
        Specialty.objects.filter(sys_num=OuterRef('judge')).values('name')[:1]
    )
)
For some databases, casting might even be necessary:
from django.db.models import IntegerField, OuterRef, Subquery
from django.db.models.functions import Cast

Case.objects.annotate(
    specialty=Subquery(
        Specialty.objects.filter(
            sys_num=Cast(OuterRef('judge'), output_field=IntegerField())
        ).values('name')[:1]
    )
)
But the modeling is very bad. Usually it is better to work with a ForeignKey: it guarantees that judge can only point to a valid Specialty (referential integrity), it creates an index on the field, and it also makes the Django ORM more effective, since it allows more advanced querying with (relatively) small queries.
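For illustration, the ForeignKey version the answer recommends might look like this (a sketch, only applicable if the schema can be changed; the field options and on_delete choice are assumptions):

from django.db import models
from django.db.models import F

class Specialty(models.Model):
    name = models.CharField(max_length=100)

class Case(models.Model):
    # judge now points at a Specialty row instead of storing its sys_num as text
    judge = models.ForeignKey(Specialty, on_delete=models.PROTECT)

# the annotation collapses to a plain related lookup
cases = Case.objects.annotate(specialty=F('judge__name'))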

Django ORM - Filter Related Objects

I apologize if this is a duplicate, but I was unable to find any other SO posts that address this matter. I have models like so:
class Person(models.Model):
    pass

class Interest(models.Model):
    person = models.ForeignKey(Person, related_name='interests')
    is_cool = models.BooleanField()
I know that I can find all people who have cool interests like so:
Person.objects.filter(interests__is_cool=True)
However, what I really want is to get only their cool interests when I get the Person object. I know that I could always pluck the related queryset out and operate on it, like so:
interests = person.interests.filter(is_cool=True)
but I cannot assign it back to the person instance since the relationship is reversed. To summarize, the goal is to use the ORM directly to filter the Interest objects being returned in the person.interests queryset.
One possibility is to define a method or property on the Person model:

# on the Person model
def cool_interests(self):
    return self.interests.filter(is_cool=True)
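Hypothetical usage, assuming at least one Person exists:

person = Person.objects.first()
cool = person.cool_interests()  # QuerySet of this person's Interests with is_cool=True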

ProgrammingError: column "product" is of type product[] but expression is of type text[] enum postgres

I would like to save an array of enums.
I have the following:
CREATE TABLE public.campaign
(
    id integer NOT NULL,
    product product[]
)
product is an enum.
In Django I defined it like this:
PRODUCT = (
    ('car', 'car'),
    ('truck', 'truck')
)

class Campaign(models.Model):
    product = ArrayField(models.CharField(null=True, choices=PRODUCT))
However, when I write the following:
campaign = Campaign(id=5, product=["car", "truck"])
campaign.save()
I get the following error:
ProgrammingError: column "product" is of type product[] but expression is of type text[]
LINE 1: ..."product" = ARRAY['car...
Note
I saw this answer, but I don't use SQLAlchemy and would rather not use it if not needed.
EDITED
I tried @Roman Konoval's suggestion below, like this:
class PRODUCT(Enum):
    CAR = 'car'
    TRUCK = 'truck'

class Campaign(models.Model):
    product = ArrayField(EnumField(PRODUCT, max_length=10))

and with:

campaign = Campaign(id=5, product=[CAR, TRUCK])
campaign.save()
However, I still get the same error; I can see that Django is translating it to a list of strings.
If I write the following directly in the psql console:
INSERT INTO campaign ("product") VALUES ('{car,truck}'::product[])
it works just fine
There are two fundamental problems here.
Don't use Enums
If you continue to use enum, your next question here on Stack Overflow will be "how do I add a new entry to an enum?". Django does not support the enum type out of the box (thank heavens), so you have to use third party libraries for this. Your mileage will vary with how complete the library is.
An enum value occupies four bytes on disk. The length of an enum
value's textual label is limited by the NAMEDATALEN setting compiled
into PostgreSQL; in standard builds this means at most 63 bytes.
If you are thinking that you are saving space on disk by using enum, the above quote from the manual shows that it's an illusion.
See this Q&A for more on advantages and disadvantages of enum. But generally the disadvantages outweigh the advantages.
Don't use Arrays
Tip: Arrays are not sets; searching for specific array elements can be
a sign of database misdesign. Consider using a separate table with a
row for each item that would be an array element. This will be easier
to search, and is likely to scale better for a large number of
elements.
Source: https://www.postgresql.org/docs/9.6/static/arrays.html
If you are going to search for a campaign that deals with Cars or Trucks you are going to have to do a lot of hard work. So will the database.
The correct design
The correct design is the one suggested in the PostgreSQL arrays documentation page: create a related table. This is the standard Django way as well.
class Campaign(models.Model):
    name = models.CharField(max_length=20)

class Product(models.Model):
    name = models.CharField(max_length=20)
    campaign = models.ForeignKey(Campaign, on_delete=models.CASCADE)
This makes your code simpler, doesn't require any extra storage, doesn't require third party libraries, and best of all, the vast API of Django's related models becomes available to you.
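For example (a sketch assuming the related-table design above; product and product_set are Django's default reverse names for the Product foreign key):

# campaigns that involve cars
car_campaigns = Campaign.objects.filter(product__name='car').distinct()

# all product names attached to one campaign
campaign = Campaign.objects.get(pk=5)
names = campaign.product_set.values_list('name', flat=True)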
The definition of the product field is incorrect: it specifies an array of CharFields, but in reality it is an array of enums. Django does not support the enum type out of the box, so you can try this extension to define the field correctly:
class Product(Enum):
    ProductA = 'a'
    ...

class Campaign(models.Model):
    product = ArrayField(EnumField(Product, max_length=<whatever>))
Try this:
def django2psql(s):
    return '{' + ','.join(s) + '}'

campaign = Campaign(id=5, product=django2psql(["car", "truck"]))
I think you may have to subclass CharField to get it to report the correct db_type. There may be more problems than this, but you can give this a try:
class Product(models.CharField):
    def db_type(self, connection):
        return 'product'

PRODUCT = (
    ('car', 'car'),
    ('truck', 'truck')
)

class Campaign(models.Model):
    product = ArrayField(Product(null=True, choices=PRODUCT))

Django creating multiple tables/model classes from same base class with factory function

I have been trying to figure out the best way to automate creating multiple SQL tables based on separate but identical models, all based on the same base class. I'm basically creating pseudo message boards or walls with different Groups, and I wanted each Group to have its own db_table of Posts, each Post containing the user id, timestamp, etc.
My first thought was to have one base class of Posts and just include a field for Group name, but I thought this would be bad practice. My rationale was that one table containing every Post for all Groups would get really big (in theory anyway) and slow down filtering, and also that the extra field for group name would in the long run be a waste of memory when I could have separate tables per group and skip this field.
I've also considered using a ForeignKey with a Many-to-One relationship, but as far as I can tell this has the same drawbacks. Am I wrong to think that? Or are these size concerns not really an issue?
So my next idea was to make Posts an abstract class, and then create subclasses based on each Group. This is ultimately what I did. However, I found myself having to copy and paste the code over and over and change the class name each time. This felt very un-Pythonic to me. It was something like:
class Posts(models.Model):
    timestamp = models.DateTimeField(auto_now_add=True, unique=False)
    user_id = ...
    # etc.

    class Meta:
        abstract = True

class GroupA(Posts):
    class Meta(Posts.Meta):
        db_table = 'groupa_board'

class GroupB(Posts):
    class Meta(Posts.Meta):
        db_table = 'groupb_board'

class GroupC...etc.
What I really was looking for was a factory function to do this for me. I tried this sort of thing:
def makeBoard(group):
    class Board(Posts):
        class Meta(Posts.Meta):
            db_table = group
    return Board  # note: I tried with and without this line
And then I ran a simple for loop using a list of groups.
for group in groups:
    makeBoard(group)
I found myself hitting a RuntimeError: conflicting models in application, and I probably deserved it. So then I figured what I need is something like:
def makeBoard(group):
    # 'group' here is a variable, not the class name
    class group(Posts):
        class Meta(Posts.Meta):
            # maybe issues here too, but the table name is not that
            # important if the class name works
            db_table = '%s' % group
    return group
But I couldn't figure out how to make this work! Is there a way to pass a variable from a list to a class name?
Anyway, if you're still with me, I appreciate it. I've been on Stack Overflow all day, and while I've found guides for creating abstract base classes and subclasses to solve similar issues, I didn't see a way to create a function to do this for me. I ultimately punted here and just made a subclass for each group by hand. If there is a way to automate this process, I'd love to hear it.
Also, if I'm being stupid for not just going with one db table containing every post, I'd like to know that too, and why! Or if there's a better way to implement this kind of system altogether. I apologize if this has been answered before, I really couldn't find it.
Thank you!
Using a single table would not be bad practice. The extra memory is minimal; on modern systems that shouldn't be a problem. You shouldn't worry about performance either: premature optimization (not including the actual system design) is considered bad practice, but if you run into performance problems you can always specify an index on the group column:
group = models.CharField(max_length=100, db_index=True)
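With that in place, pulling one group's wall stays a single indexed query (a sketch; the group value 'groupa' is hypothetical):

# all posts on the "groupa" wall, newest first
groupa_posts = Posts.objects.filter(group='groupa').order_by('-timestamp')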
That's not to say that it is the best option, or that your method isn't good. Also, it is entirely possible to dynamically create models using the type() built-in function. The only difference between dynamically creating models and creating other classes is that you must explicitly pass the __module__ attribute. You can create subclasses of Posts in the following way:
def fabric(names, baseclass=Posts):
    for name in names:
        class Meta:
            db_table = '%s_table' % name.lower()
        attrs = {'__module__': baseclass.__module__, 'Meta': Meta}
        # specify any other class attributes here, e.g. extra fields:
        attrs.update({'my_field': models.CharField(max_length=100)})
        newclass = type(str(name), (baseclass,), attrs)
        globals()[name] = newclass

fabric(['GroupA', 'GroupB', 'GroupC', etc...])
Put that code in your models.py after your Posts class, and all classes will be created for you. They can be used in any way normal classes can be used: Django doesn't even know you dynamically created this class. Though your Meta class doesn't inherit from Posts.Meta, your meta settings should still be preserved.
Tested with Django 1.4.
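For illustration, a generated class can then be used exactly like a hand-written model (a sketch; it assumes the fabric call above has already run in models.py, and the app.models import path is hypothetical):

from app.models import GroupA

# behaves like any manually declared Posts subclass, backed by 'groupa_table'
recent = GroupA.objects.order_by('-timestamp')[:10]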
Try something like this:
import app.models as group_models
from django.db.models.base import ModelBase

def fabric(group):
    for item in dir(group_models):
        c = getattr(group_models, item)
        if type(c) is ModelBase:
            if c._meta.db_table == '%s_table' % group:
                return c
    return None
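Hypothetical usage, assuming the db_table naming convention from the previous answer ('<group>_table'):

GroupA = fabric('groupa')  # returns the model whose table is 'groupa_table', or None
if GroupA is not None:
    posts = GroupA.objects.all()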

Creating a multi-model in Django

I'd like to create a directed graph in Django, but each node could be a separate model, with separate fields, etc.
Here's what I've got so far:
from bannergraph.apps.banners.models import *
class Node(models.Model):
    uuid = UUIDField(db_index=True, auto=True)

    class Meta:
        abstract = True

class FirstNode(Node):
    field_name = models.CharField(max_length=100)
    next_node = UUIDField()

class SecondNode(Node):
    is_something = models.BooleanField(default=False)
    first_choice = UUIDField()
    second_choice = UUIDField()
(obviously FirstNode and SecondNode are placeholders for the more domain-specific models, but hopefully you get the point.)
So what I'd like to do is query all the subclasses at once, returning all of the ones that match. I'm not quite sure how to do this efficiently.
Things I've tried:
Iterating over the subclasses with queries - I don't like this, as it could get quite heavy with the number of queries.
Making Node concrete. Apparently I have to still check for each subclass, which goes back to #1.
Things I've considered:
Making Node the class, and sticking a JSON blob in it. I don't like this.
Storing pointers in an external table or system. This would mean 2 queries per UUID, where I'd ideally want to have 1, but it would probably do OK in a pinch.
So, am I approaching this wrong, or forgetting about some neat feature of Django? I'd rather not use a schemaless DB if I don't have to (the Django admin is almost essential for this project). Any ideas?
The InheritanceManager from django-model-utils is what you are looking for.
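For select_subclasses() to work, Node has to be a concrete base (multi-table inheritance, not abstract = True) with the manager attached; a minimal sketch, assuming django-model-utils is installed:

from model_utils.managers import InheritanceManager

class Node(models.Model):
    uuid = UUIDField(db_index=True, auto=True)  # UUIDField as in the question's wildcard import

    objects = InheritanceManager()
    # note: no Meta.abstract here, because select_subclasses() needs a concrete base class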
You can iterate over all your Nodes with:
nodes = Node.objects.filter(foo="bar").select_subclasses()
for node in nodes:
    # logic
