I have some Python classes that relate to one another; they attempt to mimic a GraphQL schema (the schema itself is not relevant, I post here the base case to reproduce the issue).
The GraphQL schema looks like this:
type User {
  name: String
  orders: [Order]
}
type Order {
  key: String
  user: User
}
From a schema-design point of view there is nothing wrong here; the schema is valid, and I already have a database running with these relationships (it just means: a user may have several orders, and an order has exactly one user who created it).
It's on the Python side that things get a little messy.
I would expect the following code to work:
file: models/Model.py
import attr

@attr.s
class Model:
    pass  # Model internal workings not relevant to the example
file: models/User.py
from typing import List
import attr
from . import Model

@attr.s
class User(Model):
    name: str = 'Name'
    orders: List[Order] = attr.ib(factory=list)
file: models/Order.py
import attr
from . import Model

@attr.s
class Order(Model):
    key: str = 'abc'
    user: User = attr.ib(factory=User)
then I can do things like this:
file: main.py
import models as m

# "with" is a Python keyword, so the query parameter is spelled with_ here
user = m.User.query(name='John', with_='orders')
user.name    # "John"
user.orders  # [m.Order(key='1'), m.Order(key='2'), m.Order(key='3'), ...]

order = m.Order.query(key='1', with_='user')
order.key    # "1"
order.user   # m.User(name="John")
This code does not work due to the circular dependency: User needs the Order type to be defined first, and Order requires User.
The workaround I found was to late-import the models using importlib:
# current solution:
# using importlib to import dynamically
from typing import List
import attr
from helpers import convert_to, list_convert_to
# Note: "convert_to" receives a class name and returns a function to instantiate it dynamically

@attr.s
class Model:
    pass

@attr.s
class User(Model):
    name: str = 'Name'
    orders: List[Model] = attr.ib(factory=list_convert_to('Order'))

@attr.s
class Order(Model):
    key: str = 'abc'
    user: Model = attr.ib(factory=convert_to('User'))
This solution works, but it loses the ability to know the field types beforehand, and I think it is slower when building complex relations (hundreds of items with objects several levels deep).
That is why I am looking for better ways to solve this problem. Any ideas?
Assuming you're using Python 3.7 or later, the following line will make it work:
from __future__ import annotations
It also allows you to refer to a class while defining it. E.g.
class C:
    @classmethod
    def factory(cls) -> C:
        ...
works now.
If your classes are defined in multiple files and you get a circular dependency due to that, you can guard the imports using
from typing import TYPE_CHECKING
# ...
if TYPE_CHECKING:
    from module import User
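Putting the pieces together for the attrs models above, here is a single-file sketch. It uses stdlib dataclasses instead of attrs (an assumption, to keep the example dependency-free; @attr.s with auto_attribs=True behaves the same way for this purpose):

```python
from __future__ import annotations  # every annotation is now stored as a string

from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    name: str = 'Name'
    # Order is defined further down; with postponed annotations this line
    # stores the string "List[Order]" instead of evaluating it immediately,
    # so no NameError is raised here
    orders: List[Order] = field(default_factory=list)


@dataclass
class Order:
    key: str = 'abc'
    user: User = field(default_factory=User)


user = User(name='John', orders=[Order(key='1')])
```

The same trick works across files: keep the type-only references guarded by TYPE_CHECKING, and resolve actual instances at runtime however your query layer does it.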
While attempting to name a Pydantic field schema, I received the following error:
NameError: Field name "schema" shadows a BaseModel attribute; use a different field name with "alias='schema'".
Following the documentation, I attempted to use an alias to avoid the clash. See code below:
from pydantic import StrictStr, Field
from pydantic.main import BaseModel
class CreateStreamPayload(BaseModel):
    name: StrictStr
    _schema: dict[str, str] = Field(alias='schema')
Upon trying to instantiate CreateStreamPayload in the following way:
a = CreateStreamPayload(name="joe",
                        _schema={"name": "a name"})
The resulting instance has only a value for name, nothing else.
a.dict()
{'name': 'joe'}
This makes absolutely no sense to me, can someone please explain what is happening?
Many thanks
From the documentation:
Class variables which begin with an underscore and attributes annotated with typing.ClassVar will be automatically excluded from the model.
In general, append a trailing underscore to avoid such conflicts, since leading underscores are treated as dunder (magic) or private members: _schema ➡ schema_
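A minimal sketch of the trailing-underscore fix, assuming pydantic v1-style behavior to match the .dict() call above (the payload is populated via the alias, which is the externally visible name):

```python
from typing import Dict

from pydantic import Field, StrictStr
from pydantic.main import BaseModel


class CreateStreamPayload(BaseModel):
    name: StrictStr
    # trailing underscore: not treated as private, and no clash with
    # the BaseModel.schema() method
    schema_: Dict[str, str] = Field(alias='schema')


# populate via the alias "schema"
a = CreateStreamPayload(name='joe', **{'schema': {'name': 'a name'}})
```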
I have a model A and want to make subclasses of it.
class A(models.Model):
    type = models.ForeignKey(Type, on_delete=models.CASCADE)
    data = models.JSONField()

    def compute(self):
        pass

class B(A):
    def compute(self):
        df = self.go_get_data()
        self.data = self.process(df)

class C(A):
    def compute(self):
        df = self.go_get_other_data()
        self.data = self.process_another_way(df)
# ... other subclasses of A
B and C should not have their own tables, so I decided to use the proxy attribute of Meta. However, I want there to be a table of all the implemented proxies.
In particular, I want to keep a record of the name and description of each subclass.
For example, for B, the name would be "B" and the description would be the docstring for B.
So I made another model:
class Type(models.Model):
    # The name of the class
    name = models.CharField(max_length=255)
    # The docstring of the class
    desc = models.TextField()
    # A unique identifier, different from the Django ID,
    # that allows for smoothly changing the name of the class
    identifier = models.IntegerField()
Now, I want it so when I create an A, I can only choose between the different subclasses of A.
Hence the Type table should always be up-to-date.
For example, if I want to unit-test the behavior of B, I'll need to use the corresponding Type instance to create an instance of B, so that Type instance already needs to be in the database.
Looking over on the Django website, I see two ways to achieve this: fixtures and data migrations.
Fixtures aren't dynamic enough for my use case, since the attributes literally come from the code. That leaves me with data migrations.
I tried writing one, that goes something like this:
def update_results(apps, schema_editor):
    A = apps.get_model("app", "A")
    Type = apps.get_model("app", "Type")
    subclasses = get_all_subclasses(A)
    for cls in subclasses:
        id = cls.get_identifier()
        Type.objects.update_or_create(
            identifier=id,
            defaults=dict(name=cls.__name__, desc=cls.__doc__)
        )

class Migration(migrations.Migration):
    operations = [
        RunPython(update_results)
    ]
    # ... other stuff
The problem is, I don't see how to store the identifier within the class, so that the Django Model instance can recover it.
So far, here is what I have tried:
I have tried using the fairly new __init_subclass__ construct of Python. So my code now looks like:
class A:
    def __init_subclass__(cls, identifier=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if identifier is None:
            raise ValueError()
        cls.identifier = identifier
        Type.objects.update_or_create(
            identifier=identifier,
            defaults=dict(name=cls.__name__, desc=cls.__doc__)
        )
    # ... the rest of A

# The identifier should never change, so that even if the
# name of the class changes, we still know which subclass is referred to
class B(A, identifier=3):
    # ... the rest of B
But this update_or_create fails when the database is new (e.g. during unit tests), because the Type table does not exist.
When I have this problem in development (we're still in early stages, so deleting the DB is still sensible), I have to comment out the update_or_create in __init_subclass__, migrate, and then put it back in.
Of course, this solution is also not great because __init_subclass__ is run way more than necessary. Ideally this machinery would only happen at migration.
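The hook itself can be exercised in isolation by swapping the Type table for an in-memory dict (a hypothetical registry, just to show the mechanics without a database):

```python
# hypothetical stand-in for the Type table
REGISTRY = {}


class A:
    def __init_subclass__(cls, identifier=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if identifier is None:
            raise ValueError('every subclass of A needs an identifier')
        cls.identifier = identifier
        # the real code calls Type.objects.update_or_create here
        REGISTRY[identifier] = {'name': cls.__name__, 'desc': cls.__doc__}


class B(A, identifier=3):
    """Fetches data and processes it."""
```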
So there you have it! I hope the problem statement makes sense.
Thanks for reading this far and I look forward to hearing from you; even if you have other things to do, I wish you a good rest of your day :)
With a little help from Django-expert friends, I solved this with the post_migrate signal.
I removed the update_or_create call from __init_subclass__, and in project/app/apps.py I added:
from django.apps import AppConfig
from django.db.models.signals import post_migrate

def get_all_subclasses(cls):
    """Get all subclasses of a class, recursively.

    Used to get a list of all the implemented As.
    """
    all_subclasses = []
    for subclass in cls.__subclasses__():
        all_subclasses.append(subclass)
        all_subclasses.extend(get_all_subclasses(subclass))
    return all_subclasses

def update_As(sender=None, **kwargs):
    """Get a list of all implemented As and write them in the database.

    More precisely, each model is used to instantiate a Type, which will be used to identify As.
    """
    from app.models import A, Type
    subclasses = get_all_subclasses(A)
    for cls in subclasses:
        id = cls.identifier
        Type.objects.update_or_create(identifier=id, defaults=dict(name=cls.__name__, desc=cls.__doc__))

class MyAppConfig(AppConfig):
    default_auto_field = "django.db.models.BigAutoField"
    name = "app"

    def ready(self):
        post_migrate.connect(update_As, sender=self)
Hope this is helpful for future Django coders in need!
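The recursive walk itself can be checked with plain classes, no Django required:

```python
def get_all_subclasses(cls):
    """Get all subclasses of a class, recursively."""
    all_subclasses = []
    for subclass in cls.__subclasses__():
        all_subclasses.append(subclass)
        all_subclasses.extend(get_all_subclasses(subclass))
    return all_subclasses


class A:
    pass

class B(A):
    pass

class C(A):
    pass

class D(B):  # a subclass of a subclass is found as well
    pass

names = [cls.__name__ for cls in get_all_subclasses(A)]  # ['B', 'D', 'C']
```

Note the depth-first order: B is visited, then B's own subclasses, then C.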
I would love to use a schema that looks something like the following in FastAPI:
from __future__ import annotations
from typing import List
from pydantic import BaseModel
class Project(BaseModel):
    members: List[User]

class User(BaseModel):
    projects: List[Project]

Project.update_forward_refs()
but in order to keep my project structure clean, I would of course like to define these in separate files. How could I do this without creating a circular reference?
With the code above, the schema generation in FastAPI works fine; I just don't know how to separate it into different files. In a later step I would then use @property getters in subclasses instead of plain attributes to define these objects. But for the OpenAPI doc generation, I need this combined, I think.
There are three placements where a circular dependency may work in Python:
Top of module: import package.module
Bottom of module: from package.module import attribute
Top of function: both forms work
In your situation, the second case, "bottom of module", will help, because you need to call the update_forward_refs function to resolve pydantic's postponed annotations, like this:
# project.py
from typing import List
from pydantic import BaseModel

class Project(BaseModel):
    members: "List[User]"

from user import User
Project.update_forward_refs()

# user.py
from typing import List
from pydantic import BaseModel

class User(BaseModel):
    projects: "List[Project]"

from project import Project
User.update_forward_refs()
Nonetheless, I would strongly discourage you from intentionally introducing circular dependencies.
Just put all your schema imports at the bottom of the file, after all classes, and call update_forward_refs().
# 1/4
from __future__ import annotations  # this is important to have at the top
from pydantic import BaseModel

# 2/4
class A(BaseModel):
    my_x: X  # a pydantic schema from another file

class B(BaseModel):
    my_y: Y  # a pydantic schema from another file

class C(BaseModel):
    my_z: int

# 3/4
from myapp.schemas.x import X  # related schemas we import after all classes
from myapp.schemas.y import Y

# 4/4
A.update_forward_refs()  # tell the system that A has a related pydantic schema
B.update_forward_refs()  # tell the system that B has a related pydantic schema
# for C we don't need it, because C has just an integer field.
NOTE:
Do this in every file that has schema imports.
That will enable you to make any combination without circular import problems.
NOTE 2:
People usually put the imports and update_forward_refs() right after each class, and then report that it doesn't work. That is usually because, in a complex app, you do not know which import triggers which class and when. If you put them at the bottom, you can be sure that every class has been 'scanned' and is visible to the others.
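Condensed into one runnable file, the ordering looks like this. X stands in for the schema that would normally be imported at the bottom, and the try/except covers the differing update_forward_refs signatures across pydantic versions (v1 accepts the namespace as keyword arguments, newer versions resolve from the calling scope and reject keywords):

```python
from pydantic import BaseModel


class A(BaseModel):
    my_x: "X"  # forward reference; X is only defined further down


class X(BaseModel):
    value: int


# bottom of the file: every class is defined, so resolve the string references
try:
    A.update_forward_refs(X=X)  # pydantic v1 style
except TypeError:
    A.update_forward_refs()     # newer pydantic rejects the keyword form

a = A(my_x=X(value=1))
```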
To me, the other answers don't solve this satisfactorily, because they ignore the locals of the modules. Here is a straightforward way to make it work with separate files:
user.py
from typing import TYPE_CHECKING, List
from pydantic import BaseModel

if TYPE_CHECKING:
    from project import Project

class User(BaseModel):
    projects: List['Project']
project.py
from typing import TYPE_CHECKING, List
from pydantic import BaseModel

if TYPE_CHECKING:
    from user import User

class Project(BaseModel):
    members: List['User']
main.py
from project import Project
from user import User

# Update the references that are still strings
Project.update_forward_refs(User=User)
User.update_forward_refs(Project=Project)

# Example: Projects into User and Users into Project
Project(
    members=[
        User(
            projects=[
                Project(members=[])
            ]
        )
    ]
)
This works if you run main.py. If you are building a package, you may put that content in an __init__.py file that sits high enough in the structure to avoid the circular import.
Note how we passed User=User and Project=Project to update_forward_refs. This is because the module scopes where these classes live don't have references to each other (if they did, there would be a circular import). Therefore we pass them in main.py when updating the references, since there we don't have the circular import problem.
NOTE: About type checking
If the if TYPE_CHECKING: pattern is unfamiliar: these are blocks that are never True at runtime but are used by static analysis tools (IDEs) to resolve the types. The blocks are not needed for the example to work, but they are highly recommended; without them it is hard to read the code, find out where these classes are actually defined, and fully utilize code analysis tools.
If I want to split the models/schema into separate files, I create extra files for a ProjectBase model and a UserBase model, so that the Project and User models can inherit from them. I do it like this:
# project_base.py
from pydantic import BaseModel

class ProjectBase(BaseModel):
    id: int
    title: str

    class Config:
        orm_mode = True

# user_base.py
from pydantic import BaseModel

class UserBase(BaseModel):
    id: int
    title: str

    class Config:
        orm_mode = True

# project.py
from typing import List
from .project_base import ProjectBase
from .user_base import UserBase

class Project(ProjectBase):
    members: List[UserBase] = []

# user.py
from typing import List
from .project_base import ProjectBase
from .user_base import UserBase

class User(UserBase):
    projects: List[ProjectBase] = []
Note: for this method, orm_mode must be set in ProjectBase and UserBase, so that Project and User can read from ORM objects even when the input is not a dict.
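Collapsed into one file (the field values are made up, and the Config block is omitted for brevity), the layout above avoids forward references entirely, because no class ever refers to a class defined after it:

```python
from typing import List

from pydantic import BaseModel


class ProjectBase(BaseModel):
    id: int
    title: str


class UserBase(BaseModel):
    id: int
    title: str


class Project(ProjectBase):
    # nesting only the *base* classes: nothing to forward-reference
    members: List[UserBase] = []


class User(UserBase):
    projects: List[ProjectBase] = []


u = User(id=1, title='joe', projects=[ProjectBase(id=2, title='demo')])
```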
I have 3 marshmallow Schemas with Nested fields that form a dependency cycle/triangle. If I use the boilerplate from two-way nesting, I seem to have no problem.
from marshmallow import Schema, fields
class A(Schema):
id = fields.Integer()
b = fields.Nested('B')
class B(Schema):
id = fields.Integer()
c = fields.Nested('C')
class C(Schema):
id = fields.Integer()
a = fields.Nested('A')
However, I have my own, thin subclass of fields.Nested that looks something like the following:
from marshmallow import fields

class NestedRelationship(fields.Nested):
    def __init__(self, nested, include_data=True, **kwargs):
        super(NestedRelationship, self).__init__(nested, **kwargs)
        self.schema.is_relationship = True
        self.schema.include_relationship_data = include_data
When I change each Schema to use NestedRelationship instead of the native Nested type, I get:
marshmallow.exceptions.RegistryError: Class with name 'B' was not found. You may need to import the class.
NestedRelationship is a relatively thin subclass and I am surprised at the difference in behavior. Am I doing something wrong here? Am I not calling super appropriately?
The problem is with your extra code that accesses self.schema. When the A.b field is defined, it tries to resolve the schema named 'B', which has not been registered yet. marshmallow.fields.Nested, by contrast, does not try to resolve the schema at construction time and therefore does not have this problem.
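The difference can be sketched without marshmallow at all (every name here is hypothetical): a reference resolved eagerly in __init__ fails when its target is registered later, while a lazily resolved one does not, which is exactly why Nested defers resolution and the subclass's __init__ breaks:

```python
REGISTRY = {}  # hypothetical stand-in for marshmallow's class registry


class EagerRef:
    def __init__(self, name):
        # resolves immediately: fails if the target is registered later
        self.target = REGISTRY[name]


class LazyRef:
    def __init__(self, name):
        self.name = name  # only remember the name

    @property
    def target(self):
        # resolve on first access, after every class has been registered
        return REGISTRY[self.name]


ref = LazyRef('B')        # fine, even though 'B' is not registered yet
REGISTRY['B'] = object()  # 'B' becomes available later
resolved = ref.target     # the lazy lookup succeeds now

try:
    EagerRef('C')         # 'C' was never registered
    eager_failed = False
except KeyError:
    eager_failed = True
```

A fix along these lines is to move the is_relationship bookkeeping out of __init__ and perform it the first time self.schema is actually accessed.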
I have two apps, say app1 and app2, and I have models in them.
# app1/models.py
from app2.models import SecondModel

class FirstModel(models.Model):
    first_field = models.ManyToManyField(SecondModel, blank=True)  # or a ForeignKey

# app2/models.py
from app1.models import FirstModel

class SecondModel(models.Model):
    second_field = models.ForeignKey(FirstModel)
When I do this I get an import error:
Could not import name 'FirstModel'
Why is this happening?
The error is because you have a circular import: it's not possible for both modules to import from each other.
In this case, you don't need to import the models into each app at all. Remove the imports and use a string 'app_label.ModelName' reference instead.
# app1.models.py
class FirstModel(models.Model):
    first_field = models.ManyToManyField('app2.SecondModel')

# app2.models.py
class SecondModel(models.Model):
    second_field = models.ForeignKey('app1.FirstModel', on_delete=models.CASCADE)
There may also be a name conflict here: you define FirstModel in one models.py and then import it while that module is only partially initialized, which, from the code above, could be the problem. Also, the import error generally means there is no FirstModel defined in the module you are importing it from.
However, a more generic way of doing FKs without an import is:

class FkModel(models.Model):
    relationship = models.ManyToManyField('appName.modelName')

where appName is the app from which you are importing the model, and modelName is the model to which you are creating the relationship. This helps when you are trying to do something like the following.
Let's say your app is named 'app' and you are trying to create a many-to-many relationship from a first model to a second model whose class is declared after the first one, e.g.

class Model1(models.Model):
    first_field = models.ManyToManyField('app.Model2')

class Model2(models.Model):
    name = models.CharField(max_length=256)

That is, just put your appName.modelName inside strings :)
Also, a small note on your ManyToManyField() declaration: you don't need blank=True on a many-to-many. Under the hood, the database creates a third table just to store the many-to-many relationships.
Hope it helps
//mouse.