How to import a Pydantic model into SQLModel? - python

I generated a Pydantic model and would like to import it into SQLModel. Since said model does not inherit from the SQLModel class, it is not registered in the metadata which is why
SQLModel.metadata.create_all(engine)
just ignores it.
In this discussion I found a way to manually add models:
SQLModel.metadata.tables["hero"].create(engine)
But doing so throws a KeyError for me.
SQLModel.metadata.tables["sopro"].create(engine)
KeyError: 'sopro'
My motivation for tackling the problem this way is that I want to generate an SQLModel from a simple dictionary like this:
model_dict = {"feature_a": int, "feature_b": str}
And in this SO answer, I found a working approach. Thank you very much in advance for your help!

As far as I know, it is not possible to simply convert an existing Pydantic model to an SQLModel at runtime. (At least as of now.)
There are a lot of things that happen during model definition. There is a custom meta class involved, so there is no way that you can simply substitute a regular Pydantic model class for a real SQLModel class, short of manually monkeypatching all the missing pieces.
That being said, you clarified that your actual motivation was to be able to dynamically create an SQLModel class at runtime from a dictionary of field definitions. Luckily, this is in fact possible. All you need to do is utilize the Pydantic create_model function and pass the correct __base__ and __cls_kwargs__ arguments:
from pydantic import create_model
from sqlmodel import SQLModel

field_definitions = {
    # your field definitions here
}

Hero = create_model(
    "Hero",
    __base__=SQLModel,
    __cls_kwargs__={"table": True},
    **field_definitions,
)
With that, SQLModel.metadata.create_all(engine) should create the corresponding database table according to your field definitions.
See this question for more details.
Be sure to use the correct form for the field definitions, as the example you gave would not be valid. As the documentation says, you need to define fields in the form of 2-tuples (or just a default value):
model_dict = {
    "feature_a": (int, ...),
    "feature_b": (str, ...),
    "feature_c": 3.14,
}
Hope this helps.

Related

Pydantic does not validate the key/values of dict fields

I have the following simple data model:
from typing import Dict
from pydantic import BaseModel

class TableModel(BaseModel):
    table: Dict[str, str]
I want to add multiple tables like this:
tables = TableModel(table={'T1': 'Tea'})
print(tables) # table={'T1': 'Tea'}
tables.table['T2'] = 'coffee'
tables.table.update({'T3': 'Milk'})
print(tables) # table={'T1': 'Tea', 'T2': 'coffee', 'T3': 'Milk'}
So far everything is working as expected. However, the next piece of code does not raise any error:
tables.table[1] = 2
print(tables) # table={'T1': 'Tea', 'T2': 'coffee', 'T3': 'Milk', 1: 2}
I also tried changing the table field name to __root__, but I see the same behavior with that change as well.
Adding validate_assignment = True to the model Config does not help either.
How can I get the model to validate the dict fields? Am I missing something basic here?
There are actually two distinct issues here that I'll address separately.
Mutating a dict on a Pydantic model
Observed behavior
from typing import Dict
from pydantic import BaseModel

class TableModel(BaseModel):
    table: Dict[str, str]

    class Config:
        validate_assignment = True

instance = TableModel(table={"a": "b"})
instance.table[1] = object()
print(instance)
Output: table={'a': 'b', 1: <object object at 0x7f7c427d65a0>}
Both key and value type clearly don't match our annotation of table. So, why does the assignment instance.table[1] = object() not cause a validation error?
Explanation
The reason is rather simple: There is no mechanism to enforce validation here. You need to understand what happens here from the point of view of the model.
A model can validate attribute assignment (if you configure validate_assignment = True). It does so by hooking into the __setattr__ method and running the value through the appropriate field validator(s).
But in that example above, we never called BaseModel.__setattr__. Instead, we called the __getattribute__ method that BaseModel inherits from object to access the value of instance.table. That returned the dictionary object ({"a": "b"}). And then we called the dict.__setitem__ method on that dictionary and added a key-value-pair of 1: object() to it.
The dictionary is just a regular old dictionary without any validation logic. And the mutation of that dictionary is completely obscure to the Pydantic model. It has no way of knowing that after accessing the object currently assigned to the table field, we changed something inside that object.
Validation would only be triggered, if we actually assigned a new object to the table field of the model. But that is not what happens here.
If we instead tried to do instance.table = {1: object()}, we would get a validation error because now we are actually setting the table attribute and trying to assign a value to it.
Possible workaround
Depending on how you intend to use the model, you could ensure that changes to the table dictionary always happen "outside" of the model and are followed by a re-assignment of the form instance.table = .... I would say that is probably the most practical option. In general, re-parsing (subsets of) the data should restore consistency after you mutate values:
tables.table[1] = 2
tables = TableModel.parse_obj(tables.dict())
Note that due to the implicit type coercion explained in the next section, 1 and 2 may simply be coerced to "1" and "2" here; a value that cannot be coerced to str will reliably cause a validation error.
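As a concrete sketch of this workaround, using object() as a value that cannot be coerced to str (for the reason explained in the next section):

```python
from typing import Dict
from pydantic import BaseModel, ValidationError

class TableModel(BaseModel):
    table: Dict[str, str]

tables = TableModel(table={"T1": "Tea"})
tables.table[1] = object()  # silently accepted: this is just a dict mutation

try:
    # Re-parsing the exported data runs full validation again.
    tables = TableModel.parse_obj(tables.dict())
except ValidationError:
    print("re-parsing caught the invalid entry")
```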
Another option might be to play around and define your own subtype of Dict and add validation logic there, but I am not sure how much "reinventing the wheel" that might entail.
The most sophisticated option could maybe be a descriptor-based approach, where instead of just calling __getattribute__, a custom descriptor intercepts the attribute access and triggers the assignment validation. But that is just an idea. I have not tried this and don't know if that might break other Pydantic magic.
Implicit type coercion
Observed behavior
from typing import Dict
from pydantic import BaseModel

class TableModel(BaseModel):
    table: Dict[str, str]

instance = TableModel(table={1: 2})
print(instance)
Output: table={'1': '2'}
Explanation
This is very easily explained. This is expected behavior and was put in place by choice. The idea is that if we can "simply" coerce a value to the specified type, we want to do that. Although you defined both the key and value type as str, passing an int for each is no big deal because the default string validator can just do str(1) and str(2) respectively.
Thus, instead of raising a validation error, the table value ends up as {"1": "2"}.
Possible workaround
If you do not want this implicit coercion to happen, Pydantic provides strict types that you can use in the annotation. In this case you could use table: Dict[StrictStr, StrictStr]. The previous example would then indeed raise a validation error.
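For example (StrictTableModel is just an illustrative name):

```python
from typing import Dict
from pydantic import BaseModel, StrictStr, ValidationError

class StrictTableModel(BaseModel):
    # StrictStr rejects anything that is not already a str.
    table: Dict[StrictStr, StrictStr]

try:
    StrictTableModel(table={1: 2})
except ValidationError:
    print("int keys/values are no longer coerced")
```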

Pydantic field does not take value

While attempting to name a Pydantic field schema, I received the following error:
NameError: Field name "schema" shadows a BaseModel attribute; use a different field name with "alias='schema'".
Following the documentation, I attempted to use an alias to avoid the clash. See code below:
from pydantic import StrictStr, Field
from pydantic.main import BaseModel

class CreateStreamPayload(BaseModel):
    name: StrictStr
    _schema: dict[str, str] = Field(alias='schema')
Upon trying to instantiate CreateStreamPayload in the following way:
a = CreateStreamPayload(name="joe",
                        _schema={"name": "a name"})
The resulting instance has only a value for name, nothing else.
a.dict()
{'name': 'joe'}
This makes absolutely no sense to me, can someone please explain what is happening?
Many thanks
From the documentation:
Class variables which begin with an underscore and attributes annotated with typing.ClassVar will be automatically excluded from the model.
In general, append an underscore to avoid such conflicts, since leading underscores are treated as either dunder (magic) or private members: _schema ➡ schema_
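A sketch with the trailing underscore and the alias in place; note that instantiation then happens via the alias schema, not the field name:

```python
from typing import Dict
from pydantic import BaseModel, Field, StrictStr

class CreateStreamPayload(BaseModel):
    name: StrictStr
    # schema_ avoids shadowing BaseModel.schema; the alias restores
    # the external name "schema" for input and export.
    schema_: Dict[str, str] = Field(alias="schema")

a = CreateStreamPayload(name="joe", schema={"name": "a name"})
print(a.dict(by_alias=True))  # {'name': 'joe', 'schema': {'name': 'a name'}}
```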

Generate a pydantic model from pydantic object

Is it possible to create a pydantic model from an instance of another pydantic model, so that the values are maintained? Something like this:
from typing import Optional
from pydantic import create_model, BaseModel, Field

class ExampleModel(BaseModel):
    some_text: str
    optional_number: Optional[float]

instance = ExampleModel(some_text="foo")
dynamic_Model = create_model("Parameters", __config__=instance.Config)
dyn_instance = dynamic_Model()

print(instance)
print(dyn_instance)  # this has no attributes, so it's an empty line
print("Is it equal? " + str(dyn_instance == instance))  # can this be true?
If you wonder about the use case: I want to build a web app with Streamlit and Streamlit-pydantic. The latter renders a UI input mask from a pydantic model like this:
instance_of_pydantic_model=sp.pydantic_form(model=pydanticModel, key='some key')
See it in action
In a multi-page application, this leads to the problem that the input mask no longer displays any of the user's input after switching to another page and back.
If you use the create_model function properly, it works:
dynamic_Model = create_model("Parameters", **vars(instance))
With Streamlit-pydantic the input mask then stays consistent, even for optional fields, which are now populated with a value.
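To sketch why this works: vars(instance) maps field names to their current values, and create_model turns those values into fields whose defaults are the instance's values. An equivalent, more explicit variant (shown here with an instance where all fields are set) builds (type, default) tuples from the instance:

```python
from typing import Optional
from pydantic import BaseModel, create_model

class ExampleModel(BaseModel):
    some_text: str
    optional_number: Optional[float] = None

instance = ExampleModel(some_text="foo", optional_number=3.14)

# Each current value becomes the default of the corresponding new field.
field_definitions = {
    name: (type(value), value) for name, value in vars(instance).items()
}
dynamic_Model = create_model("Parameters", **field_definitions)

dyn_instance = dynamic_Model()
print(dyn_instance)  # some_text='foo' optional_number=3.14
```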

How to limit choices for pydantic using Enum

I have the following Enum options:
from enum import Enum

class ModeEnum(str, Enum):
    """ mode """
    map = "map"
    cluster = "cluster"
    region = "region"
This enum is used in two Pydantic data structures.
In one data structure I need all Enum options.
In the other data structure I need to exclude region.
If I use custom validation for this and try to enter some other value, the standard validation error message states that all three values are allowed.
So what is the best solution in this situation?
P.S.
I use map as a member name in ModeEnum. Is that bad? I can't imagine a situation where it could override the built-in map function, but still, is it OK?
It's a little bit of a hack, but if you mark your validator with pre=True, you can force it to run first and then raise a custom error listing the allowed values.
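A sketch of that approach; RestrictedQuery and the error message are made up for illustration (in Pydantic v2 the equivalent would be a field_validator with mode="before"):

```python
from enum import Enum
from pydantic import BaseModel, validator

class ModeEnum(str, Enum):
    map = "map"
    cluster = "cluster"
    region = "region"

class RestrictedQuery(BaseModel):
    mode: ModeEnum

    @validator("mode", pre=True)
    def forbid_region(cls, v):
        # Runs before the enum check, so we control the error message.
        allowed = {ModeEnum.map.value, ModeEnum.cluster.value}
        if v not in allowed:
            raise ValueError(f"mode must be one of {sorted(allowed)}")
        return v
```

The other data structure, which allows all three options, simply annotates its field with ModeEnum and needs no validator.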

How to define a django model field with the same name as a Python keyword

I need to define a Django model field with the name in, which is a Python language keyword. This is a syntax error:
class MyModel(models.Model):
    in = jsonfield.JSONField()
How can I make this work?
The reason I need this name is when I use django-rest-framework's ModelSerializer class, field name is used as the key for serialization output, and I thought it might be easier to manipulate django's Model class instead of ModelSerializer class to get the output I want.
Generally speaking, you don't. Avoid the use of keywords in your identifiers. The general Python convention is to add an underscore to such names; here that'd be in_:
class MyModel(models.Model):
    in_ = jsonfield.JSONField()
However, Django prohibits names ending in an underscore because the trailing underscore clashes with its filter naming conventions, so you have to come up with a different name still; pick one that describes your case. I picked contained_in rather than in, as a guess at what you want to do here:
class MyModel(models.Model):
    contained_in = jsonfield.JSONField()
If you are trying to match an existing database schema, use the db_column attribute:
class MyModel(models.Model):
    contained_in = jsonfield.JSONField(db_column='in')
If you want to be stubborn, in normal classes you could use setattr() after creating the class to use a string instead of an identifier:
class Foo:
    pass

setattr(Foo, 'in', 'some value')
but you'll have to use setattr(), getattr(), delattr() and/or vars() everywhere in your code to be able to access this.
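To make that inconvenience concrete (plain Python, no Django involved):

```python
class Foo:
    pass

setattr(Foo, 'in', 'some value')

# Foo.in is a SyntaxError, so every access has to go through
# getattr()/setattr()/delattr() or vars():
print(getattr(Foo, 'in'))  # some value
setattr(Foo, 'in', 'other value')
assert 'in' in vars(Foo)
delattr(Foo, 'in')
```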
In Django you'll have the added complication that a models.Model subclass uses a metaclass to parse your class members into other structures, and adding an extra field with setattr() doesn't work without (a lot of) extra work to re-do what the metaclass does. You could instead use the field's contribute_to_class() method, calling it after the class has been prepared by Django (technique taken from this blog post):
from django.db.models.signals import class_prepared

def add_field(sender, **kwargs):
    if sender.__name__ == "MyModel":
        field = jsonfield.JSONField('in')
        field.contribute_to_class(sender, 'in')

class_prepared.connect(add_field)
but you have to make sure this hook is registered before you create your model class.
There is no way to make it work, and it's a bad idea anyway. Choose a different name.
If, for some reason, you want to have column name that matches some reserved keyword, use db_column argument for that field.
in_something = models.CharField(db_column='in', max_length=100)
You mentioned the use of django rest framework. Here's how to make it work on the serializer layer. The keyword used is from. to is just an example of a non-keyword if you want it mapped to a different name.
from django.db import models
from rest_framework import serializers

SP_FIELD_MAP = {
    'from': 'sender'
}

# would be in models.py
class Transaction(models.Model):
    recipient = models.CharField(max_length=16)
    sender = models.CharField(max_length=64)

# would be in serializers.py
class TransactionSerializer(serializers.ModelSerializer):
    to = serializers.CharField(source='recipient')

    class Meta:
        model = Transaction
        fields = ('id', 'to', 'from')
        # `from` is a python keyword hence this
        extra_kwargs = {'from': {'source': 'sender'}}

    def build_field(self, field_name, info, model_class, nested_depth):
        # Catches python keywords like `from` and maps them to the proper field
        field_name = SP_FIELD_MAP.get(field_name, field_name)
        return super(TransactionSerializer, self).build_field(
            field_name, info, model_class, nested_depth)
Tested on CharField using POST and GET methods only but I don't see how it won't work on other methods. You might need special stuff for other field types though. I suggest going into the source. There's tons of fun stuff going on in DRF's source.
You should give all your variables descriptive names that clearly state what they are used for, and where possible it should be easy to ascertain what type of variable it is.
in, to me, would appear at first glance to be a boolean, so in order to use this variable in my own extension of the code I'd need to find other usages of it before I knew how I could use it.
Therefore, simply don't try to hack something together just to get this terrible variable name into your model; doing so offers you no value, and it's not really any quicker to type since IntelliSense is available in most places. Figure out what "in" relates to and then formulate a proper, descriptive name.
