Traitlets: Best way for a "Dict of Instance"? - python

In my code, I need a Dict of Instance (for example a list of Parameter keyed by name). Currently I've solved this by using a regular Dict-traitlet as the incoming property (parameters) and then having a function that "translates" these into instances of the Parameter class.
Is there a better way to do this than:
import traitlets as t
import traitlets.config as tc

class Parameter(tc.Configurable):
    name = t.Unicode().tag(config=True)
    description = t.Unicode(allow_none=True).tag(config=True)
    value = t.Any(default_value=None).tag(config=True)

class Job(tc.Configurable):
    parameters = t.Dict(allow_none=True).tag(config=True)
    _parameter_map = t.Dict()

    def init_parameters(self):
        self._parameter_map.clear()
        for name, configuration in self.parameters.items():
            configuration['name'] = name
            parameter = Parameter(**configuration, parent=self)
            self._parameter_map[name] = parameter
And then this:
c.Job.parameters = {
    "parameter1": {
        "description": "The first parameter",
        "value": True
    }
}
It works, and logic dictates that, since traitlets are configured by class name, this is the only way, but I just wanted to be sure.

Related

Pydantic Model Structure for Largely Similar Objects?

I wonder if anyone might have a suggestion for a better way to build up a Pydantic model for this case?
The data set I am working with (JSON) is mostly the same structure throughout, but with some differences only down at the lowest levels of the tree. ie:
// data.json
{
    "FirstItem": {
        "Name": "first item",
        "Data": {
            "attr_1": "a",
            "attr_2": "b"
        }
    },
    "SecondItem": {
        "Name": "second item",
        "Data": {
            "attr_3": "d",
            "attr_4": "e"
        }
    },
    ...
}
So I am wondering, is there a suggested method for building a Pydantic model that uses a standard 'Item' (in this case, it would have 'Name' and 'Data'), but then change the 'Data' on a case-by-case basis?
I have a working example, but it feels quite verbose?
working example:
from pydantic import BaseModel

class FirstItemData(BaseModel):
    attr_1: str
    attr_2: str

class FirstItem(BaseModel):
    Name: str
    Data: FirstItemData  # <--- The unique part

class SecondItemData(BaseModel):
    attr_3: str
    attr_4: str

class SecondItem(BaseModel):
    Name: str
    Data: SecondItemData

class Example(BaseModel):
    FirstItem: FirstItem
    SecondItem: SecondItem

o = Example.parse_file("data.json")
The above does work, but building the Item 'holder' each time (the part with 'Name' and 'Data') feels redundant. Is there a way to specify a generic 'container' structure, and then swap out the 'Data'? Something like:
class GenericContainer(BaseModel):
    Name: str
    Data: ????

class Example(BaseModel):
    FirstItem: GenericContainer(Data = FirstItemData)
    SecondItem: GenericContainer(Data = SecondItemData)
or something of that sort? In this case I have several dozen of these unique 'Items' (only unique in their 'Data' part) and it doesn't seem correct to create 2 classes for each one? Does it?
I do realize that using the type Dict in place of the detailed 'Data' does work to load in the data, but it comes in as a dict instead of an object, which is not ideal in this case.
Any thoughts or suggestions are much appreciated. Thanks!
Based on the comment from Hernán Alarcón, I wanted to try this, and I believe it should work. Perhaps it will be useful to someone.
# Uses the pydantic v1 API (pydantic.generics, parse_obj).
from pydantic import BaseModel
from pydantic.generics import GenericModel
from typing import Generic, TypeVar, Optional

class FirstItemData(BaseModel):
    attr_1: str
    attr_2: str

class SecondItemData(BaseModel):
    attr_3: str
    attr_4: str

TypeX = TypeVar('TypeX')

class GenericContainer(GenericModel, Generic[TypeX]):
    Name: str
    Data: TypeX

class ItemBag(BaseModel):
    FirstItem: Optional[GenericContainer[FirstItemData]]
    SecondItem: Optional[GenericContainer[SecondItemData]]

# some tests
one_bag = ItemBag(FirstItem = {"Name":"My first item", "Data":{"attr_1":"test1", "attr_2":"test2"}})
another_bag = ItemBag(FirstItem = {"Name":"My first item", "Data":{"attr_1":"test1", "attr_2":"test2"}}, SecondItem = {"Name":"My first item", "Data":{"attr_3":"test3", "attr_4":"test4"}})
# failing tests to slightly check validation (each of these raises a ValidationError)
one_failing_bag = ItemBag(FirstItem = {"Name":"My first item", "Data":{"attr_3":"test1", "attr_42":"test2"}})
another_failing_bag = ItemBag(SecondItem = {"Name":"My second item", "Data":{"attr_3":"test3", "attr_42":"test2"}})
# the parsing way
parsed_bag = ItemBag.parse_obj({"FirstItem":{"Name":"My first item", "Data":{"attr_1":"test1", "attr_2":"test2"}}, "SecondItem": {"Name":"My first item", "Data":{"attr_3":"test3", "attr_4":"test4"}}})
So it works, but I am not sure I'd choose genericity over readability.
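For comparison, the same "generic container" shape can be sketched with only the standard library. This is an illustrative sketch, not part of the answer above: the names Container and from_dict are hypothetical, and unlike pydantic it only checks that the expected keys are present, not that the values have the declared types.

```python
from dataclasses import dataclass
from typing import Generic, Type, TypeVar

T = TypeVar("T")


@dataclass
class FirstItemData:
    attr_1: str
    attr_2: str


@dataclass
class SecondItemData:
    attr_3: str
    attr_4: str


@dataclass
class Container(Generic[T]):
    """Hypothetical generic holder: 'Name' plus a swappable 'Data' payload."""
    Name: str
    Data: T

    @classmethod
    def from_dict(cls, raw: dict, data_type: Type[T]) -> "Container[T]":
        # data_type(**...) raises TypeError on missing or unexpected keys.
        # Field types are NOT validated here.
        return cls(Name=raw["Name"], Data=data_type(**raw["Data"]))


raw = {
    "FirstItem": {"Name": "first item", "Data": {"attr_1": "a", "attr_2": "b"}},
    "SecondItem": {"Name": "second item", "Data": {"attr_3": "d", "attr_4": "e"}},
}

first = Container.from_dict(raw["FirstItem"], FirstItemData)
second = Container.from_dict(raw["SecondItem"], SecondItemData)
print(first.Data.attr_1, second.Data.attr_4)
```

The payload arrives as an object (first.Data.attr_1) rather than a dict, at the cost of passing the payload class explicitly for each item.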

How to create a dataclass with optional fields that outputs field in json only if the field is not None

I am unclear about how to use a @dataclass to convert a mongo doc into a Python dataclass. My NoSQL documents may or may not contain some of the fields. I only want to output a field (using asdict) from the dataclass if that field was present in the mongo document.
Is there a way to create a field that will be output with dataclasses.asdict only if it exists in the mongo doc?
I have tried using post_init but have not figured out a solution.
import json
from dataclasses import InitVar, asdict, dataclass

# in this example I want to output the 'author' field ONLY if it is present in the mongo document
@dataclass
class StoryTitle:
    _id: str
    title: str
    author: InitVar[str] = None
    dateOfPub: int = None

    def __post_init__(self, author):
        print(f'__post_init__ got called....with {author}')
        if author is not None:
            self.newauthor = author
            print(f'self.author is now {self.newauthor}')

# foo and bar approximate documents in mongodb
foo = dict(_id='b23435xx3e4qq', title = 'goldielocks and the big bears', author='mary', dateOfPub = 220415)
newFoo = StoryTitle(**foo)
json_foo = json.dumps(asdict(newFoo))
print(json_foo)
bar = dict(_id='b23435xx3e4qq', title = 'War and Peace', dateOfPub = 220415)
newBar = StoryTitle(**bar)
json_bar = json.dumps(asdict(newBar))
print(json_bar)
My output json does not (of course) have the 'author' field. Anyone know how to accomplish this? I suppose I could just create my own asdict method ...
Unfortunately, the dataclasses.asdict helper function doesn't offer a way to exclude fields with default or un-initialized values. However, the dataclass-wizard library does.
The dataclass-wizard is a (de)serialization library I've created, which is built on top of the dataclasses module. Its only dependency outside the standard library is the typing-extensions module, needed for compatibility with earlier Python versions.
To skip dataclass fields with default or un-initialized values during serialization, for example with asdict, the dataclass-wizard provides the skip_defaults option. However, there is also a minor issue I noted with your code above. If we set None as the default for the author field, we can't distinguish between a null value and the case where the author field is absent when de-serializing the JSON data.
So in the example below, I've created a CustomNull object similar to the None singleton in Python. The name and implementation don't matter much; here we use it as a sentinel object to determine whether a value for author was passed in. If it is not present in the input data when from_dict is called, we simply exclude it when serializing data with to_dict or asdict, as shown below.
from __future__ import annotations  # can be removed in Python 3.10+
from dataclasses import dataclass
from dataclass_wizard import JSONWizard

# create our own custom `NoneType` class
class CustomNullType:
    # these methods are not really needed, but useful to have.
    def __repr__(self):
        return '<null>'

    def __bool__(self):
        return False

# this is analogous to the builtin `None = NoneType()`
CustomNull = CustomNullType()

# in this example I want to output the 'author' field ONLY if it is present in the mongo document
@dataclass
class StoryTitle(JSONWizard):
    class _(JSONWizard.Meta):
        # skip default values for dataclass fields when `to_dict` is called
        skip_defaults = True

    _id: str
    title: str
    # note: we could also define it like
    #   author: str | None = None
    # however, using that approach we won't know if the value is
    # populated as a `null` when de-serializing the json data.
    author: str | None = CustomNull
    # by default, the `dataclass-wizard` library uses regex to case transform
    # json fields to snake case, and caches the field name for next time.
    # dateOfPub: int = None
    date_of_pub: int = None

# foo and bar approximate documents in mongodb
foo = dict(_id='b23435xx3e4qq', title='goldielocks and the big bears', author='mary', dateOfPub=220415)
new_foo = StoryTitle.from_dict(foo)
json_foo = new_foo.to_json()
print(json_foo)
bar = dict(_id='b23435xx3e4qq', title='War and Peace', dateOfPub=220415)
new_bar = StoryTitle.from_dict(bar)
json_bar = new_bar.to_json()
print(json_bar)
# lastly, we try de-serializing with `author=null`. the `author` field should still
# be populated when serializing the instance, as it was present in input data.
bar = dict(_id='b23435xx3e4qq', title='War and Peace', dateOfPub=220415, author=None)
new_bar = StoryTitle.from_dict(bar)
json_bar = new_bar.to_json()
print(json_bar)
Output:
{"_id": "b23435xx3e4qq", "title": "goldielocks and the big bears", "author": "mary", "dateOfPub": 220415}
{"_id": "b23435xx3e4qq", "title": "War and Peace", "dateOfPub": 220415}
{"_id": "b23435xx3e4qq", "title": "War and Peace", "author": null, "dateOfPub": 220415}
Note: the dataclass-wizard can be installed with pip:
$ pip install dataclass-wizard
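If adding a dependency is not an option, the same sentinel idea can be sketched with the standard library alone by passing a dict_factory to dataclasses.asdict. This is a minimal sketch under stated assumptions, not the dataclass-wizard implementation: the names _Missing, MISSING and drop_missing are hypothetical, and the field names are kept exactly as in the mongo document (no case conversion).

```python
import json
from dataclasses import asdict, dataclass
from typing import Union


class _Missing:
    """Hypothetical sentinel type: marks a field absent from the source document."""

    def __repr__(self):
        return "<missing>"


MISSING = _Missing()


def drop_missing(items):
    # asdict() deep-copies field values, so the sentinel is compared by type
    # rather than by identity.
    return {key: value for key, value in items if not isinstance(value, _Missing)}


@dataclass
class StoryTitle:
    _id: str
    title: str
    author: Union[str, None, _Missing] = MISSING
    dateOfPub: Union[int, None, _Missing] = MISSING


foo = StoryTitle(_id="b23435xx3e4qq", title="goldielocks and the big bears",
                 author="mary", dateOfPub=220415)
bar = StoryTitle(_id="b23435xx3e4qq", title="War and Peace", dateOfPub=220415)
baz = StoryTitle(_id="b23435xx3e4qq", title="War and Peace", author=None)

json_foo = json.dumps(asdict(foo, dict_factory=drop_missing))
json_bar = json.dumps(asdict(bar, dict_factory=drop_missing))
json_baz = json.dumps(asdict(baz, dict_factory=drop_missing))
print(json_foo)
print(json_bar)
print(json_baz)
```

An explicit author=None still serializes as null, while a field that was never supplied is left out of the output entirely.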

How to init a value in a class with a simple setting and getting attributes?

The class I have been using looks simple, like this:
class Transaction(dict):
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
and then passing in:
transaction = Transaction({"to": "0x000", "from": "0x001", "timestamp": 1234})
The result can of course be used as transaction.to. However, transaction.from does not work, because from is a reserved keyword in Python.
So I am curious: using that simple class, is there a way to reassign from in the class to be something like
self.sender = dict.from
I have been trying with __init__, but with no luck.
I have also written the class with just an __init__ that assigns all values using self, but without a getter the class is not iterable.
What I have been doing looks like this:
# given data - {"to": "0x000", "from": "0x001", "timestamp": 1234}
item["sender"] = item["from"]
transaction = Transaction(item)
and then I refer to it as transaction.sender.
If I understand correctly, your end goal is to have a class that can be instantiated from a dict and expose the keys as attributes. I'm inferring that the dict can only contain certain keys since you're talking about mapping "from" to sender. In that case, I would do this completely differently: don't subclass dict, instead have an alternate constructor that can handle the dict. I'd keep a "normal" constructor mostly for the sake of the repr.
For example:
class Transaction:
    def __init__(self, to, from_, timestamp):
        self.to = to
        self.from_ = from_
        self.timestamp = timestamp

    @classmethod
    def from_dict(cls, d):
        return cls(d['to'], d['from'], d['timestamp'])

    def __repr__(self):
        """Show construction."""
        r = '{}({!r}, {!r}, {!r})'.format(
            type(self).__name__,
            self.to,
            self.from_,
            self.timestamp)
        return r

transaction = Transaction.from_dict({"to": "0x000", "from": "0x001", "timestamp": 1234})
print(transaction)  # -> Transaction('0x000', '0x001', 1234)
print(transaction.from_)  # -> 0x001
Here I'm using the trailing underscore convention covered in PEP 8:
single_trailing_underscore_: used by convention to avoid conflicts with Python keyword, e.g.
tkinter.Toplevel(master, class_='ClassName')
By the way, if it's useful, the keyword module contains the names of all Python keywords.
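As a small illustration of that last point, keyword.iskeyword can drive the trailing-underscore renaming automatically instead of hard-coding "from". This is only a sketch: the helper name safe_attr_names is hypothetical, and SimpleNamespace stands in for whatever class you actually use.

```python
import keyword
from types import SimpleNamespace


def safe_attr_names(d):
    """Append '_' to any key that is a Python keyword (e.g. 'from' -> 'from_')."""
    return {(key + "_" if keyword.iskeyword(key) else key): value
            for key, value in d.items()}


item = {"to": "0x000", "from": "0x001", "timestamp": 1234}
renamed = safe_attr_names(item)
transaction = SimpleNamespace(**renamed)

print(renamed)
print(transaction.from_)
```

Keys that are not keywords, such as "to" and "timestamp", pass through unchanged.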

Python: mapping between class and json

I am getting data via a REST interface and I want to store that data in a class object.
My class could look like this:
class Foo:
    firstname = ''
    lastname = ''
    street = ''
    number = ''
and the JSON may look like this:
[
    {
        "fname": "Carl",
        "lname": "any name",
        "address": ["carls street", 12]
    }
]
What's the easiest way to map between the json and my class?
My problem is: I want to have a class with a different structure than the json.
I want the names of the attributes to be more meaningful.
Of course I know that I could simply write a to_json method and a from_json method which does what I want.
The thing is: I have a lot of those classes and I am looking for more declarative way to write the code.
e.g. in Java I probably would use mapstruct.
Thanks for your help!
Use a dict for the json input. Use **kwargs in an __init__ method in your class and map the variables accordingly.
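One way to keep such a mapping declarative is a per-class table of small extractor functions, applied by a single shared helper. The sketch below is only an illustration under assumptions: FOO_MAPPING and from_record are hypothetical names, Foo is rewritten as a dataclass, and the address is assumed to always be a [street, number] pair as in the question.

```python
import json
from dataclasses import dataclass


@dataclass
class Foo:
    firstname: str
    lastname: str
    street: str
    number: int


# Hypothetical declarative mapping: attribute name -> how to read it from a record.
FOO_MAPPING = {
    "firstname": lambda rec: rec["fname"],
    "lastname": lambda rec: rec["lname"],
    "street": lambda rec: rec["address"][0],
    "number": lambda rec: rec["address"][1],
}


def from_record(cls, mapping, record):
    """Build cls from one decoded JSON object using the given mapping table."""
    return cls(**{attr: extract(record) for attr, extract in mapping.items()})


payload = '[{"fname": "Carl", "lname": "any name", "address": ["carls street", 12]}]'
people = [from_record(Foo, FOO_MAPPING, rec) for rec in json.loads(payload)]
print(people[0])
```

Each additional class then needs only its own mapping table; from_record itself is shared.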
I had a similar problem, and I solved it by using @classmethod
import json

class Robot():
    def __init__(self, x, y):
        self.type = "new-robot"
        self.x = x
        self.y = y

    @classmethod
    def create_robot(cls, sdict):
        if sdict["type"] == "new-robot":
            position = sdict["position"]
            return cls(position['x'], position['y'])
        else:
            raise Exception("Unable to create a new robot!!!")

if __name__ == '__main__':
    input_string = '{"type": "new-robot", "position": {"x": 3, "y": 3}}'
    cmd = json.loads(input_string)
    bot = Robot.create_robot(cmd)
    print(bot.type)
Perhaps you could use two classes, one directly aligned with the JSON (your source class) and the other having the actual structure you need. Then you could map between them using the ObjectMapper class (https://pypi.org/project/object-mapper/). This is very close to the MapStruct library for Java.
ObjectMapper is a class for automatic object mapping. It helps you to create objects between project layers (data layer, service layer, view) in a simple, transparent way.

XML to Python Class to C Struct

I need some advice on two questions: does something already exist for this, and which modules should I use to develop it?
I have some structures that come from an XML file. I want to represent them as Python classes (maybe using a factory to create a class per structure), but I want these classes to have a function that emits the structure as a C struct.
From my research, ctypes seems to be the recommended way to represent the structures as Python classes, but I don't see any method that will emit C structs for the creation of a header file.
From OP's comment, I think the minimal solution is a set of helper functions instead of classes. The xmltodict library makes it easy to turn the XML data into nested dictionaries, more or less like JSON. A set of helpers that parse the contents and generate appropriate C-struct strings is all that's really needed. If you can work with dictionaries like:
example = {
    "name": "my_struct",
    "members": [
        {
            "name": "intmember",
            "ctype": "int"
        },
        {
            "name": "floatmember",
            "ctype": "float"
        }
    ]
}
You can do something like:
from string import Template

# `typedef struct name { ... } name;` is the valid C form.
struct_template_string = '''
typedef struct $structname {
$defs
} $structname;
'''
struct_template = Template(struct_template_string)
member_template = Template("    $ctype $name;")

def spec_to_struct(spec_dict):
    structname = spec_dict['name']
    member_data = spec_dict['members']
    members = [member_template.substitute(d) for d in member_data]
    return struct_template.substitute(structname=structname, defs="\n".join(members))
Which will produce something like:
typedef struct my_struct {
    int intmember;
    float floatmember;
} my_struct;
I'd try to get it working with basic functions first before trying to build up a class scaffold. It would be pretty easy to hide the details in a class using property descriptors:
class data_property(object):
    """Descriptor that reads a key from a dict-based instance, optionally wrapping it."""
    def __init__(self, path, wrapper=None):
        self.path = path
        self.wrapper = wrapper

    def __get__(self, instance, owner):
        result = instance[self.path]
        if self.wrapper:
            # Wrap each element of a list; wrap a single dict directly.
            if isinstance(result, (list, tuple)):
                return [self.wrapper(**i) for i in result]
            return self.wrapper(**result)
        return result

class MemberWrapper(dict):
    name = data_property('name')
    type = data_property('ctype')

class StructWrapper(dict):
    name = data_property('name')
    members = data_property('members', MemberWrapper)

test = StructWrapper(**example)
print(test.name)
print(test.members)
for member in test.members:
    print(member.type, member.name)
# my_struct
# [{'name': 'intmember', 'ctype': 'int'}, {'name': 'floatmember', 'ctype': 'float'}]
# int intmember
# float floatmember
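To connect this back to the XML input, here is a rough end-to-end sketch that uses the standard library's xml.etree.ElementTree in place of xmltodict. The XML layout (a struct element with member children carrying name and ctype attributes) is purely an assumption for illustration, as are the helper names; the real file will need its own parsing rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real element and attribute names may differ.
XML_SOURCE = """
<struct name="my_struct">
    <member name="intmember" ctype="int"/>
    <member name="floatmember" ctype="float"/>
</struct>
"""


def xml_to_spec(xml_text):
    """Turn the assumed XML layout into a {'name': ..., 'members': [...]} dict."""
    root = ET.fromstring(xml_text.strip())
    return {
        "name": root.get("name"),
        "members": [
            {"name": member.get("name"), "ctype": member.get("ctype")}
            for member in root.findall("member")
        ],
    }


def spec_to_struct(spec):
    """Emit a C typedef for the spec as a string suitable for a header file."""
    lines = ["typedef struct %s {" % spec["name"]]
    lines += ["    %s %s;" % (m["ctype"], m["name"]) for m in spec["members"]]
    lines.append("} %s;" % spec["name"])
    return "\n".join(lines)


spec = xml_to_spec(XML_SOURCE)
header_text = spec_to_struct(spec)
print(header_text)
```

Keeping the parsing step and the emitting step as separate functions means either one can be replaced later (for example by xmltodict or by a class wrapper) without touching the other.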
