I have a model in SQLAlchemy of which one column is an enum. Wanting to stay as close to vanilla SQLAlchemy and the Python3 stdlib as possible, I've defined the enum against the stdlib's enum.Enum class, and then fed that to SQLAlchemy using its sqlalchemy.Enum class (as recommended somewhere in the SQLAlchemy documentation.)
class TaxonRank(enum.Enum):
domain = "domain"
kingdom = "kingdom"
phylum = "phylum"
class_ = "class"
order = "order"
family = "family"
genus = "genus"
species = "species"
And in the model:
rank = sqlalchemy.Column(sqlalchemy.Enum(TaxonRank), name = "rank", nullable = False)
This works well, except for forcing me to use class_ instead of class for one of the enum values (naturally to avoid conflict with the Python keyword; it's illegal syntax to attempt to access TaxonRank.class.)
I don't really mind using class_, but the issue I'm having is that class_ is the value that ends up getting stored in the database. This, in turn, is causing me issues with my CRUD API, wherein I allow the user to do things like "filter on rank where rank ends with ss." Naturally this doesn't match anything because the value actually ends with ss_!
For record display I've been putting in some hacky case-by-case translation to always show the user class in place of class_. Doing something similar with sorting and filtering, however, is more tricky because I do both of those at the SQL level.
So my question: is there a good way around this mild annoyance? I don't really care about accessing TaxonRank.class_ in my Python, but perhaps there's a way to subclass the stdlib's enum.Enum to force the string representation of the class_ attribute (and thus the value that actually gets stored in the database) to the desired class?
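(Worth noting why this happens: by default sqlalchemy.Enum persists the member's name, here class_, rather than its value. Newer SQLAlchemy versions accept a values_callable argument that persists the values instead, which would already give you "class" in the database. A minimal sketch, assuming that parameter is available in your SQLAlchemy version:

rank = sqlalchemy.Column(
    sqlalchemy.Enum(TaxonRank, values_callable=lambda e: [m.value for m in e]),
    name="rank",
    nullable=False,
)
)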
Thanks to Sergey Shubin for pointing out to me an alternative form for defining an enum.Enum.
TaxonRank = enum.Enum("TaxonRank", [
("domain", "domain"),
("kingdom", "kingdom"),
("phylum", "phylum"),
("class", "class"),
("order", "order"),
("family", "family"),
("genus", "genus"),
("species", "species")
])
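With this functional form, class is a perfectly legal member name; it just can't be reached with attribute syntax. A quick illustration of the lookups that do work:

TaxonRank["class"]            # member access by name
getattr(TaxonRank, "class")   # same member, via getattr
TaxonRank("class")            # lookup by value

Since the member's name is now "class", that is also the string SQLAlchemy stores by default.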
I have been working on an interface for a Russian and English database. I am using PostgreSQL, but it will probably work for any other database's enumeration type. This is the solution:
In mymodel.py:
from enum import Enum
from sqlalchemy import Column, Integer, Text
from sqlalchemy.dialects.postgresql import ENUM
from .meta import Base
class NounVar(Enum):
abstract = 1
proper = 2
concrete = 3
    collective = 4
compound = 5
class Nouns(Base):
__tablename__ = 'nouns'
id = Column(Integer, primary_key=True)
name = Column(Text)
runame = Column(Text)
variety = Column("variety", ENUM(NounVar, name='variety_enum'))
And then further in default.py:
from .models.mymodel import Nouns, NounVar

class somecontainer():
    def somecallable(self):
        page = Nouns(
            name="word",
            runame="слово",
            variety=NounVar.concrete)
        self.request.dbsession.add(page)
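For completeness, querying by the enum column should then look something like this (a rough sketch based on the model above, inside the same container):

concrete_nouns = (
    self.request.dbsession.query(Nouns)
    .filter(Nouns.variety == NounVar.concrete)
    .all()
)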
I hope it works for you.
Might be a bit of an inelegant question title, but hopefully this skeleton setup explains things a little more clearly:
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import backref, relationship

Base = declarative_base()

class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
name = Column(String)
class Number(Base):
__tablename__ = 'number'
id = Column(Integer, primary_key=True)
users_id = Column(Integer, ForeignKey('user.id'))
user = relationship('User', backref=backref('numbers'))
value = Column(String)
joe = User(name='Joe')
joe.numbers = [
# Here we need to know that the class we want is named "Number".
# However, in some contexts (think abstract base classes or mixins) we might
# not necessarily know that, or have a way to import/reference it.
Number(value='212-555-1234'),
Number(value='201-555-1111'),
Number(value='917-555-8989')]
Basically there is a table of Users, and each User can have an arbitrary number of Numbers associated with it.
Is there a clean way, through the attributes of User alone, to find a reference to the Number class (and be able to create instances from it) without importing Number directly? The best I've come up with, with considerable influence from this question, is:
from sqlalchemy.orm import object_mapper
number_class = object_mapper(joe).relationships['numbers'].mapper.class_
joe.numbers = [number_class(value='212-555-1234') ...]
... but this seems rather obtuse, and I'm not fully comfortable relying on it.
The most valid reason I can think to want to be able to do this is in the case of mixins -- if there were some base class that needed the ability to append new numbers to a user without concrete knowledge of what class to use.
There are a few ways to do this, but I'd argue that the easiest (and clean enough) is to store what you need on the User class, because your User class is already implementation-bound to the Number class: it imports and uses Number when creating the relationship. So you could add a User.add_number() method that takes the arguments for a new number, creates the Number, and stores it on self.
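A minimal sketch of that suggestion (add_number is an illustrative name, not something from the original code):

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    def add_number(self, value):
        # Number is available at module level, so create it here and
        # attach it to the 'numbers' backref collection.
        number = Number(value=value)
        self.numbers.append(number)
        return number

joe = User(name='Joe')
joe.add_number('212-555-1234')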
I would like to save an array of enums.
I have the following:
CREATE TABLE public.campaign
(
id integer NOT NULL,
product product[]
)
product is an enum.
In Django I defined it like this:
PRODUCT = (
('car', 'car'),
('truck', 'truck')
)
class Campaign(models.Model):
product = ArrayField(models.CharField(null=True, choices=PRODUCT))
However, when I write the following:
campaign = Campaign(id=5, product=["car", "truck"])
campaign.save()
I get the following error:
ProgrammingError: column "product" is of type product[] but expression is of type text[]
LINE 1: ..."product" = ARRAY['car...
Note
I saw this answer, but I don't use sqlalchemy and would rather not use it if not needed.
EDITED
I tried @Roman Konoval's suggestion below like this:
class PRODUCT(Enum):
CAR = 'car'
TRUCK = 'truck'
class Campaign(models.Model):
product = ArrayField(EnumField(PRODUCT, max_length=10))
and with:
campaign = Campaign(id=5, product=[CAR, TRUCK])
campaign.save()
However, I still get the same error;
I see that Django is translating it to a list of strings.
If I write the following directly in the psql console:
INSERT INTO campaign ("product") VALUES ('{car,truck}'::product[])
it works just fine
There are two fundamental problems here.
Don't use Enums
If you continue to use enum, your next question here on Stack Overflow will be "how do I add a new entry to an enum?". Django does not support the enum type out of the box (thank heavens), so you have to use third-party libraries for this. Your mileage will vary with how complete the library is.
An enum value occupies four bytes on disk. The length of an enum
value's textual label is limited by the NAMEDATALEN setting compiled
into PostgreSQL; in standard builds this means at most 63 bytes.
If you are thinking that you are saving space on disk by using enum, the above quote from the manual shows that it's an illusion.
See this Q&A for more on advantages and disadvantages of enum. But generally the disadvantages outweigh the advantages.
Don't use Arrays
Tip: Arrays are not sets; searching for specific array elements can be
a sign of database misdesign. Consider using a separate table with a
row for each item that would be an array element. This will be easier
to search, and is likely to scale better for a large number of
elements.
Source: https://www.postgresql.org/docs/9.6/static/arrays.html
If you are going to search for a campaign that deals with Cars or Trucks you are going to have to do a lot of hard work. So will the database.
The correct design
The correct design is the one suggested in the postgresql arrays documentation page. Create a related table. This is the standard django way as well.
class Campaign(models.Model):
name = models.CharField(max_length=20)
class Product(models.Model):
    name = models.CharField(max_length=20)
    campaign = models.ForeignKey(Campaign, on_delete=models.CASCADE)
This makes your code simpler. It doesn't require any extra storage or third-party libraries. And best of all, the vast API of Django's related models becomes available to you.
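With that design, filtering campaigns by product becomes a plain ORM query. A short example (field names follow the models above; the reverse names product and product_set are Django's defaults):

# campaigns that have at least one related product named "car"
Campaign.objects.filter(product__name='car').distinct()

# all products attached to one campaign
campaign = Campaign.objects.first()
campaign.product_set.all()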
The definition of the product field is incorrect: it specifies an array of CharFields, but in reality it is an array of enums. Django does not support the enum type natively, so you can try this extension to define the type correctly:
class Product(Enum):
ProductA = 'a'
...
class Campaign(models.Model):
product = ArrayField(EnumField(Product, max_length=<whatever>))
Try this:
def django2psql(s):
return '{'+','.join(s) + '}'
campaign = Campaign(id=5, product=django2psql(["car", "truck"]))
I think you may have to subclass CharField to get it to report the correct db_type. There may be more problems than this but you can give this a try:
class Product(models.CharField):
def db_type(self, connection):
return 'product'
PRODUCT = (
('car', 'car'),
('truck', 'truck')
)
class Campaign(models.Model):
product = ArrayField(Product(null=True, choices=PRODUCT))
For example, using Flask-SQLAlchemy and jsontools to serialize to JSON like shown -here-, and given a model like this:
class Engine(db.Model):
__tablename__ = "engines"
id = db.Column(db.Integer, primary_key=True)
this = db.Column(db.String(10))
that = db.Column(db.String(10))
parts = db.relationship("Part")
schema = ["id"
, "this"
, "that"
, "parts"
]
def __json__(self):
return self.schema
class Part(db.Model):
__tablename__ = "parts"
id = db.Column(db.Integer, primary_key=True)
engine_id = db.Column(db.Integer, db.ForeignKey("engines.id"))
code = db.Column(db.String(10))
def __json__(self):
return ["id", "code"]
How do I change the schema attribute before the query so that it takes effect on the returned data?
enginelist = db.session.query(Engine).all()
return enginelist
So far, I have succeeded with subclassing and single-table inheritance like so:
class Engine_smallschema(Engine):
__mapper_args__ = {'polymorphic_identity': 'smallschema'}
schema = ["id"
, "this"
, "that"
]
and
enginelist = db.session.query(Engine_smallschema).all()
return enginelist
...but it seems there should be a better way that doesn't need subclassing (and I'm not sure subclassing is wise here). I've tried various things, such as setting an attribute or calling a method to set an internal variable. The problem is that the query then doesn't accept the instance object I give it, and I don't know SQLAlchemy well enough yet to know whether queries can be executed against pre-made instances of these classes.
I can also loop through the returned objects, setting a new schema, and get the wanted JSON, but this isn't a solution for me because it launches new queries (I usually request the small dataset first).
Any other ideas?
The JSON serialization takes place in Flask, not in SQLAlchemy. Thus, the __json__ function is not consulted until after you return from your view function. This therefore has nothing to do with SQLAlchemy; it has to do with the custom encoding function, which presumably you can change.
I would actually suggest not attempting to do it this way if you have different sets of attributes you want to serialize for a model. Setting a magic attribute on an instance that affects how it's serialized violates the principle of least surprise. Instead, you can, for example, make a Serializer class that you can initialize with the list of fields you want to be serialized, then pass your Engine to it to produce a dict that can be readily converted to JSON.
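For example, a bare-bones version of that Serializer idea might look like this (the names here are illustrative, not an existing API):

class Serializer:
    def __init__(self, fields):
        self.fields = fields

    def serialize(self, obj):
        # Build a plain dict limited to the requested fields;
        # nested objects would need their own serializer.
        return {field: getattr(obj, field) for field in self.fields}

small = Serializer(["id", "this", "that"])
payload = [small.serialize(e) for e in db.session.query(Engine).all()]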
If you insist on doing it your way, you can probably just do this:
for e in enginelist:
e.__json__ = lambda: ["id", "this", "that"]
Of course, you can change __json__ to be a property instead if you want to avoid the lambda.
Say I have a Thing class that is related to some other classes, Foo and Bar.
class Thing(Base):
FooKey = Column('FooKey', Integer,
ForeignKey('FooTable.FooKey'), primary_key=True)
BarKey = Column('BarKey', Integer, ForeignKey('BarTable.BarKey'), primary_key=True)
foo = db.relationship('Foo')
bar = db.relationship('Bar')
I want to get a list of the classes/tables related to Thing created by my relationship()s, e.g. [Foo, Bar]. Is there any way to do this?
This is a closely related question:
SQLAlchemy, Flask: get relationships from a db.Model. That identifies the string names of the relationships, but not the target classes.
Context:
I'm building unit tests for my declarative base mapping of a SQL database. A lot of dev work is going into it and I want robust checks in place.
Using the Mapper as described in that other question gets you on the right path. As mentioned in the docs [0], you will get a bunch of sqlalchemy.orm.relationships.RelationshipProperty objects, and then you can use class_ on the mapper associated with each RelationshipProperty to get to the class:
from sqlalchemy.inspection import inspect
rels = inspect(Thing).relationships
clss = [rel.mapper.class_ for rel in rels]
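Since the context is unit tests, a quick check built on that might look like the following (assuming Foo and Bar are importable in the test module):

assert set(clss) == {Foo, Bar}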
I have the following code:
class ArchaeologicalRecord(Base, ObservableMixin, ConcurrentMixin):
author_id = Column(Integer, ForeignKey('authors.id'))
author = relationship('Author', backref=backref('record'))
horizont_id = Column(Integer, ForeignKey('horizonts.id'))
horizont = relationship('Horizont', backref=backref('record'))
.....
somefield_id = Column(Integer, ForeignKey('somefields.id'))
somefield = relationship('SomeModel', backref=backref('record'))
At the moment I have one entry (an Author, a Horizont, or any other entity related to the archaeological record), and I want to ensure that no record holds a reference to it. But I hate writing a lot of code for each case and want to do it in the most generic way.
So, actually I have:
an instance of ArchaeologicalRecord
an instance of a child entity, for example Horizont
(from the previous) its class definition.
How can I check whether any ArchaeologicalRecord contains (or does not contain) a reference to Horizont (or any other child entity) without writing a great chunk of copy-pasted code?
Are you asking how to find orphaned authors, horizonts, somefields, etc.?
Assuming all your relations are many-to-one (ArchaeologicalRecord-to-Author), you could try something like:
from sqlalchemy.orm.properties import RelationshipProperty
from sqlalchemy.orm import class_mapper
session = ... # However you setup the session
# ArchaelogicalRecord will have various properties defined,
# some of these are RelationshipProperties, which hold the info you want
for rp in class_mapper(ArchaeologicalRecord).iterate_properties:
if not isinstance(rp, RelationshipProperty):
continue
query = session.query(rp.mapper.class_)\
.filter(~getattr(rp.mapper.class_, rp.backref[0]).any())
orphans = query.all()
if orphans:
# Do something...
        print(rp.mapper.class_)
        print(orphans)
This will fail when rp.backref is None (i.e. where you've defined a relationship without a backref) - in this case you'd probably have to construct the query a bit more manually, but the RelationshipProperty, and its .mapper and .mapper.class_ attributes, should get you all the info you need to do this in a generic way.
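For the no-backref case, here is a hedged sketch of that more manual query, built from the relationship's column pairs (it assumes a simple single-column, many-to-one join):

from sqlalchemy import exists

# (FK column on ArchaeologicalRecord, referenced column on the child table)
local_col, remote_col = rp.local_remote_pairs[0]
orphans = (
    session.query(rp.mapper.class_)
    .filter(~exists().where(local_col == remote_col))
    .all()
)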