I am using the DataStax Cassandra Python driver's object mapper to define Cassandra table columns at run time (my requirements call for this).
Table and column names and column types are resolved at run time.
I am trying to define a cassandra cqlengine model at runtime using type() to define the class.
It looks like the Model class defined in the Python driver has a metaclass applied to it:
@six.add_metaclass(ModelMetaClass)
class Model(BaseModel):
    ...
Is there even a way to define models using type()?
I am seeing the following error while defining a Model class:
from cassandra.cqlengine.models import Model
from cassandra.cqlengine import columns as Columns

attributes_dict = {
    'test_id': Columns.Text(primary_key=True),
    'test_col1': Columns.Text()
}

RunTimeModel = type ('NewModelName', tuple(Model), attributes_dict)
Error:
RunTimeModel = type ('NewModelName', tuple(Model), attributes_dict)
TypeError: 'ModelMetaClass' object is not iterable
I'll stay away from the rest, but to answer the question about the error: you have a simple mistake in trying to construct a tuple from a non-sequence argument; tuple(Model) attempts to iterate over the class itself, which its metaclass does not support. Instead, you might use the tuple literal notation:
RunTimeModel = type ('NewModelName', (Model,), attributes_dict)
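For completeness, a sketch of how the corrected snippet could be put to use; the table name here is hypothetical, and sync_table would only work once a connection is registered:

from cassandra.cqlengine import columns as Columns
from cassandra.cqlengine.models import Model
from cassandra.cqlengine.management import sync_table

# Resolved at run time in the real application; hard-coded here for illustration.
attributes_dict = {
    '__table_name__': 'runtime_table',  # hypothetical table name
    'test_id': Columns.Text(primary_key=True),
    'test_col1': Columns.Text(),
}

# A one-element tuple literal (Model,) rather than tuple(Model).
RunTimeModel = type('NewModelName', (Model,), attributes_dict)

# sync_table(RunTimeModel)  # creates the table once a connection exists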
I have the following simple data model:
from typing import Dict
from pydantic import BaseModel
class TableModel(BaseModel):
    table: Dict[str, str]
I want to add multiple tables like this:
tables = TableModel(table={'T1': 'Tea'})
print(tables) # table={'T1': 'Tea'}
tables.table['T2'] = 'coffee'
tables.table.update({'T3': 'Milk'})
print(tables) # table={'T1': 'Tea', 'T2': 'coffee', 'T3': 'Milk'}
So far everything is working as expected. However, the next piece of code does not raise any error:
tables.table[1] = 2
print(tables) # table={'T1': 'Tea', 'T2': 'coffee', 'T3': 'Milk', 1: 2}
I changed the tables field name to __root__; with this change I see the same behavior.
I also added validate_assignment = True in the model Config, but that does not help either.
How can I get the model to validate the dict fields? Am I missing something basic here?
There are actually two distinct issues here that I'll address separately.
Mutating a dict on a Pydantic model
Observed behavior
from typing import Dict
from pydantic import BaseModel

class TableModel(BaseModel):
    table: Dict[str, str]

    class Config:
        validate_assignment = True

instance = TableModel(table={"a": "b"})
instance.table[1] = object()
print(instance)
Output: table={'a': 'b', 1: <object object at 0x7f7c427d65a0>}
Both key and value type clearly don't match our annotation of table. So, why does the assignment instance.table[1] = object() not cause a validation error?
Explanation
The reason is rather simple: There is no mechanism to enforce validation here. You need to understand what happens here from the point of view of the model.
A model can validate attribute assignment (if you configure validate_assignment = True). It does so by hooking into the __setattr__ method and running the value through the appropriate field validator(s).
But in the example above, we never called BaseModel.__setattr__. Instead, we called the __getattribute__ method that BaseModel inherits from object to access the value of instance.table. That returned the dictionary object ({"a": "b"}). And then we called the dict.__setitem__ method on that dictionary and added a key-value pair of 1: object() to it.
The dictionary is just a regular old dictionary without any validation logic. And the mutation of that dictionary is completely obscure to the Pydantic model. It has no way of knowing that after accessing the object currently assigned to the table field, we changed something inside that object.
Validation would only be triggered if we actually assigned a new object to the table field of the model. But that is not what happens here.
If we instead tried to do instance.table = {1: object()}, we would get a validation error because now we are actually setting the table attribute and trying to assign a value to it.
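For comparison, a quick demonstration of that assignment path, continuing the example above (the exact error text may differ between Pydantic versions):

from pydantic import ValidationError

try:
    instance.table = {1: object()}
except ValidationError as exc:
    print(exc)  # the object() value fails the str validator, so validation fires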
Possible workaround
Depending on how you intend to use the model, you could ensure that changes to the table dictionary always happen "outside" of the model and are followed by a re-assignment of the form instance.table = .... I would say that is probably the most practical option. In general, re-parsing (subsets of) the data should ensure consistency if you mutated values. Something like this should work (i.e. cause an error):
tables.table[1] = 2
tables = TableModel.parse_obj(tables.dict())
Another option might be to play around and define your own subtype of Dict and add validation logic there, but I am not sure how much "reinventing the wheel" that might entail.
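For what it's worth, here is a rough sketch of that idea against Pydantic v1; StrDict is a hypothetical name, and this has not been tested against all of Pydantic's machinery:

from pydantic import BaseModel

class StrDict(dict):
    # A dict that rejects non-str keys and values on mutation.
    def __setitem__(self, key, value):
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError(f'expected str keys and values, got {key!r}: {value!r}')
        super().__setitem__(key, value)

    @classmethod
    def __get_validators__(cls):
        # Pydantic v1 hook for custom field types.
        yield cls.validate

    @classmethod
    def validate(cls, v):
        out = cls()
        for key, value in dict(v).items():
            out[key] = value  # routed through __setitem__ above
        return out

class TableModel(BaseModel):
    table: StrDict

instance = TableModel(table={'a': 'b'})
instance.table[1] = object()  # raises TypeError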
The most sophisticated option could maybe be a descriptor-based approach, where instead of just calling __getattribute__, a custom descriptor intercepts the attribute access and triggers the assignment validation. But that is just an idea. I have not tried this and don't know if that might break other Pydantic magic.
Implicit type coercion
Observed behavior
from typing import Dict
from pydantic import BaseModel

class TableModel(BaseModel):
    table: Dict[str, str]

instance = TableModel(table={1: 2})
print(instance)
Output: table={'1': '2'}
Explanation
This one is easily explained: it is expected behavior and was put in place by choice. The idea is that if we can "simply" coerce a value to the specified type, we want to do that. Although you defined both the key and value type as str, passing an int for each is no big deal, because the default string validator can just do str(1) and str(2) respectively.
Thus, instead of raising a validation error, the table value ends up as {"1": "2"}.
Possible workaround
If you do not want this implicit coercion to happen, there are strict types that you can use for the annotation. In this case you could use table: Dict[StrictStr, StrictStr]. Then the previous example would indeed raise a validation error.
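For example (StrictTableModel is just an illustrative name):

from typing import Dict
from pydantic import BaseModel, StrictStr, ValidationError

class StrictTableModel(BaseModel):
    table: Dict[StrictStr, StrictStr]

try:
    StrictTableModel(table={1: 2})
except ValidationError as exc:
    print(exc)  # complains that str is expected for both key and value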
Cannot insert a pydantic model into a pandas DataFrame.
I cannot figure out what it is about classes that inherit from pydantic's BaseModel that means they cannot be inserted into a DataFrame, whereas other classes can.
For example:
@dataclasses.dataclass
class Test:
    name: str

df = pd.DataFrame()
inst = Test(name='Brian')
df.at[0, 'test'] = inst
print(df)
will output
                 test
0  Test(name='Brian')
(the above also works for non-dataclasses)
whereas for a pydantic model
class Test(BaseModel):
    name: str
pandas seems to interpret the model as list_like
and fails with the error:
TypeError: object of type 'Test' has no len()
This goes for pandas insert, loc, iloc, at, and iat.
Whilst it's not the most obvious use case for a DataFrame, I am working with a codebase that expects to be able to insert and retrieve classes as objects from DataFrame fields, and I am currently attempting to change some classes to use Pydantic.
Why does pandas assume pydantic models are list_like and is there a way around this?
UPDATE:
I found a way to make this work, although it is not ideal...
Pandas uses a method is_list_like - one can find it in pandas._libs.lib.pyx, but as it is written in Pyrex it is hard to tell why a dataclass is not list_like and a pydantic class is; someone may be able to enlighten me on that one.
Anyway, to get around this I noticed pandas checks whether the value's ndim attribute is greater than 0, with a default of 1 if the attribute is missing. If this condition fails, the value is set using the _setitem_single_column method of the pandas _iLocIndexer class.
Therefore, setting ndim = 0 on the pydantic class allows it to be set as a field value in a pandas df.
class Test(BaseModel):
    name: str
    ndim = 0
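With that class attribute in place, the insertion from the top of the question goes through, e.g.:

import pandas as pd
from pydantic import BaseModel

class Test(BaseModel):
    name: str
    ndim = 0  # plain class attribute; pandas checks it before treating the value as list_like

df = pd.DataFrame()
df.at[0, 'test'] = Test(name='Brian')
print(df)  # the model instance now sits in the 'test' cell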
I am not able to figure out how I can access a dictionary's keys using pydantic model attributes instead of using get directly on the dictionary.
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

def some_function(data: Person):
    abc = data.name
    print(abc)

person = {'name': 'tom', 'age': 12}
some_function(person)
I get: AttributeError: 'dict' object has no attribute 'name'
Basically, I need to pass a dict to a function that will receive it as a Pydantic model type, and then I want to be able to use the . (dot) operator to access the contents of the passed dict. I have seen something similar implemented, but I have no idea why it is not working for me.
Is this possible?
A type annotation alone does not convert or validate anything at run time; you need to convert your dict to the pydantic class first,
some_function(Person(**person))
or
some_function(Person.parse_obj(person))
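Putting it together, both variants then print the attribute as expected:

from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

def some_function(data: Person):
    print(data.name)

person = {'name': 'tom', 'age': 12}
some_function(Person(**person))          # tom
some_function(Person.parse_obj(person))  # tom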
So I've got this model:
class Action(models.Model):
    d_changes = ArrayField(models.FloatField(), default=list(), verbose_name='D Changes')
    w_changes = ArrayField(models.FloatField(), default=list(), verbose_name='A Changes')
And when I want to create a migration or a fixture I always receive the following warning for both fields:
backend.Action.d_changes: (postgres.E003) ArrayField default should be a callable instead of an instance so that it's not shared between all field instances.
HINT: Use a callable instead, e.g., use `list` instead of `[]`.
For my migrations it's not such a big deal, since everything still works fine. But when I try to create a fixture of my db, this bit always ends up at the very top of my .json file:
System check identified some issues:
WARNINGS:
[33;1mbackend.Action.d_changes: (postgres.E003) ArrayField default should be a callable instead of an instance so that it's not shared between all field instances.
HINT: Use a callable instead, e.g., use `list` instead of `[]`.[0m
[33;1mbackend.Action.w_changes: (postgres.E003) ArrayField default should be a callable instead of an instance so that it's not shared between all field instances.
HINT: Use a callable instead, e.g., use `list` instead of `[]`.[0m
Which breaks my .json file and thus I cannot use loaddata, as I always receive a DeserializationError(), so I have to manually remove that part.
So what exactly is wrong with the model fields? I mean I'm literally using default=list() which is a callable?
Thanks for the help :)
You have to do this:
class Action(models.Model):
    d_changes = ArrayField(models.FloatField(), default=list, verbose_name='D Changes')
    w_changes = ArrayField(models.FloatField(), default=list, verbose_name='A Changes')
list() is not a callable, list is. list() has already been called and evaluates to a list instance, which would then be shared by every field instance; passing list itself lets Django call it to build a fresh default each time.
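A plain-Python sketch of what that warning is guarding against:

shared = list()         # default=list(): evaluated once, yielding a single list object

row1_changes = shared   # every new row would receive this very object
row2_changes = shared
row1_changes.append(1.5)
print(row2_changes)     # [1.5]  the 'default' leaked into another row

factory = list          # default=list: Django calls the factory per instance
row3_changes = factory()
row4_changes = factory()
row3_changes.append(1.5)
print(row4_changes)     # []  independent lists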
How do I get all column names for an existing model in SQLAlchemy that is defined using a Base class with some columns already defined there? I am looking for a @classmethod, not an instance method.
The model is not yet bound to a table; I have no __table__ attribute on the model definition yet.
I can loop through the dict and look at the types, but surely there is a better way? I cannot use inspect; I get the error: 'No inspection system is available for object of type <class 'sqlalchemy.orm.decl_api.DeclarativeMeta'>'
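No definitive answer here, but a sketch of the loop-through-the-dict approach mentioned above, assuming SQLAlchemy 1.4-style declarative classes (all names are illustrative):

from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class BaseWithColumns(Base):
    __abstract__ = True  # shared columns, not bound to any table
    id = Column(Integer, primary_key=True)

class User(BaseWithColumns):
    __abstract__ = True  # also not yet bound, as in the question
    name = Column(String)

    @classmethod
    def column_names(cls):
        # Walk the MRO and collect attributes that are Column instances.
        return [key
                for klass in cls.__mro__
                for key, value in vars(klass).items()
                if isinstance(value, Column)]

print(User.column_names())  # ['name', 'id']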