Python: How to json serialize class reference, not the class instance - python

I want to serialize the class itself rather than an object instance.
For example, if I do this:

    class Foo:
        pass

    json.dumps(Foo)

it throws an error saying that Foo is not JSON serializable.
Is it even possible to do this in Python?
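For reference, one common workaround is to serialize a *reference* to the class (its module and qualified name) and re-import it when loading. This is a minimal sketch; the helper names are illustrative, not from any library:

    import importlib
    import json

    class Foo:
        pass

    # A class object itself is not JSON serializable, but its import path is.
    def class_ref(cls):
        return f"{cls.__module__}.{cls.__qualname__}"

    def resolve(ref):
        module_name, _, qualname = ref.rpartition(".")
        obj = importlib.import_module(module_name)
        for part in qualname.split("."):
            obj = getattr(obj, part)
        return obj

    payload = json.dumps({"class": class_ref(Foo)})
    # Resolving works for any importable class, e.g. a stdlib one:
    decoder_cls = resolve("json.JSONDecoder")

Note that resolving only works if the receiving program can import the module in question, which is exactly the limitation the answer below discusses.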

A bit late, but you can't actually serialize a class directly. Serializing is the act of turning an object into a string for storage and transmission. A class declaration, in your case Foo, is already text: Foo.py (or wherever you put it).
Serialization libraries usually keep a reference to the object's class, so that the same program can save and load an object and still know that it's a Foo instance. But if I write my own little app that does not contain a Foo class, then deserializing your data will give me something that is pretty much useless.
Depending on what you want to do, there are alternatives.
If your goal is to save or transmit methods instead of data (a class declaration is usually composed of methods, unlike a class instance, which is composed of information held in properties and variables), then you could serialize functions or dynamic objects containing methods. Those would be understood by any program deserializing the data, and those functions could be run correctly. Be aware that it is actually a security risk to deserialize data without validating it, as you could then transmit damaging methods to an unsuspecting program.
If your goal is to transmit data types, then you might have to consider transmitting your entire program instead. As stated, the goal of serialization is to save or transmit data. If you want to save or transmit functionality, then consider that your code, saved on your hard drive, already is a string representation of functionality; you could transmit your entire program. There are also libraries out there to convert your code into executables, in case the target computer does not support Python.
A third alternative, and half the reason I ended up finding this question on Stack Overflow, is to structure your data as database entries. This only applies if your goal is to save or transmit an object as a "shape", or a structure, and not as a bunch of functions and actions. In other words, this assumes your goal is to transmit the fact that a Contact class is composed of a name, a phone number and an address.
For the last option, you, in a way, want to build a class of classes:
    class Model:
        def __init__(self, class_name, fields):
            self.class_name = class_name
            self.fields = fields

    class ModelInstance:
        def __init__(self, model, values):
            self.fields = {}
            # add values to the dict with fields as keys, based on the model
            ...
When you instantiate a "model", you are actually creating the model of an object. Several tricks would then be needed to correctly implement a full alternative instantiation system on top of this; at that point, some research will be required.
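To make the idea concrete, here is a minimal runnable sketch of this "class of classes" pattern, using the Contact example from above; all names here are illustrative:

    import json

    class Model:
        """Describes the 'shape' of a record: a name plus a list of field names."""
        def __init__(self, class_name, fields):
            self.class_name = class_name
            self.fields = fields

    class ModelInstance:
        """A record whose structure is dictated by a Model."""
        def __init__(self, model, values):
            self.model = model
            # add values to the dict with fields as keys, based on the model
            self.fields = dict(zip(model.fields, values))

        def to_json(self):
            # Both the structure and the data now serialize cleanly.
            return json.dumps({"class": self.model.class_name, "data": self.fields})

    contact_model = Model("Contact", ["name", "phone", "address"])
    alice = ModelInstance(contact_model, ["Alice", "555-0100", "1 Main St"])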
I understand this post is a bit old but I thought answering what I know while searching for what I don't is the entire point of Stack Overflow!

Related

How to use a list of arguments with flask_smorest/marshmallow

I am trying to insert a collection of objects in a Flask API. We use marshmallow for deserializing. My endpoint looks like this:

    @blp.arguments(SomeSchemas, location='json', as_kwargs=True)
    @blp.response(200, SomeSchemas)
    def post(self, some_schemas: SomeSchemas) -> dict:
The schema is a simple one like this:

    class SomeSchema(ma.Schema):
        a = ma.fields.String()
        b = ma.fields.Integer()

    class SomeSchemas(ma.Schema):
        schemas = ma.fields.List(ma.fields.Nested(SomeSchema))
When I post to the endpoint, I do get a list of the correct data, but it comes in the form of dicts instead of being correctly translated into objects.
I have also tried explicitly using a list of objects (List[SomeSchema], SomeSchema(many=True), etc.), but I cannot seem to figure it out.
I assume this is a very common use case (providing a list of arguments) and that I am missing an obvious solution, but I can't seem to find any reference on how to do this correctly. To be clear, I am looking for the correct way to call the endpoint with a list (or some other collection type, it does not matter) and have that list be correctly deserialized, with each element getting the correct object type.
Disclaimer: flask-smorest maintainer speaking.
I don't think the issue is related to the fact that the input is a list.
IIUC, your problem is that you're getting dicts, rather than objects, injected into the view function. This is the default marshmallow behaviour. It can be overridden in marshmallow by using a post_load hook to actually instantiate the object.
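For reference, a minimal sketch of that post_load approach; the SomeObject dataclass here is an invented stand-in for whatever domain object you want instantiated:

    from dataclasses import dataclass
    from marshmallow import Schema, fields, post_load

    @dataclass
    class SomeObject:
        a: str
        b: int

    class SomeSchema(Schema):
        a = fields.String()
        b = fields.Integer()

        @post_load
        def make_object(self, data, **kwargs):
            # Runs after validation; the return value replaces the dict.
            return SomeObject(**data)

    class SomeSchemas(Schema):
        schemas = fields.List(fields.Nested(SomeSchema))

    result = SomeSchemas().load({"schemas": [{"a": "x", "b": 1}]})
    # result["schemas"] is now a list of SomeObject instances, not dicts.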
I generally don't do that. In practice I find it better to instantiate objects in the view function. For instance, in a PUT resource, I prefer to instantiate the existing item from the DB and then update it with the new data. In that case it is better to have the new data as a dict than as an object.
There may not be a single truth, here, it could be a matter of opinion, but while the idea of having the object instantiated in the decorator and passed to the view is appealing, it might be a bit of an abusive shortcut.
I realize this answer is of the "no idea but you shouldn't do it anyway" type. Just saying maybe you shouldn't struggle to achieve this.
This said, I'd be surprised if it worked with non-list / non-nested inputs and I'd wonder why it doesn't work specifically in this case.

dictionary | class | namedtuple from YAML

I have a large-ish YAML file (~40 lines) that I'm loading using PyYAML. This is of course parsed into a large-ish dictionary plus a couple of arrays.
My question is: how best to manage the data. I can of course leave it in the output dictionary and work through the data. But I was wondering whether it's better to wrap the data in a class or use a namedtuple to hold it.
Any first-hand experience with that?
Whether you post-process the data structure into a class or not primarily has to do with how you are using that data. The same applies to the decision whether to use a tag or not and load (some of) the data from the YAML file into a specific instance of a class that way.
The primary advantage of using a class in both cases (post-processing, tagging) is that you can do additional consistency tests during initialisation, tests that are not done on the key-value pairs of a dict or on the items of a list.
A class also allows you to provide methods to check values before they are set, e.g. to make sure they are of the right type.
Whether that overhead is necessary depends on the project, on who is using and/or updating the data, and on how long the project and its data are going to live (i.e. are you still going to understand the data and its implicit structure a year from now?). These are all issues with which a well designed (and documented) class can help, at the cost of some extra work up-front.
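As an illustration of that trade-off, here is a small sketch (field names invented) of wrapping the dict that yaml.safe_load would return in a class that validates on initialisation:

    # 'raw' stands in for the dict returned by yaml.safe_load("config.yaml")
    raw = {"host": "db.example.com", "port": 5432, "replicas": ["a", "b"]}

    class Config:
        def __init__(self, data):
            # Consistency checks that a plain dict would never perform.
            if not isinstance(data.get("port"), int):
                raise TypeError("port must be an integer")
            if not data.get("host"):
                raise ValueError("host is required")
            self.host = data["host"]
            self.port = data["port"]
            self.replicas = list(data.get("replicas", []))

    config = Config(raw)

A plain dict would happily carry a missing host or a string port until some distant piece of code trips over it; the class surfaces the problem at load time.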

How can i implement structure array like matlab in python?

How can I implement a structure array like MATLAB's in Python?
MATLAB code:

    cluster.c=[]
    cluster.indiv=[]
Although you can do this in Python (as I explain below), it might not be the best or most pythonic approach. For other users that have to look at your code (including yourself in 3 months) this syntax is extremely confusing. Think for example about how this deals with name conflicts, undefined values and iterating over properties.
Instead, consider storing the data in a data structure that is better suited for this, such as a dictionary. Then you can just store everything in

    cluster = {'c': [], 'indiv': []}
Imitating MATLAB in a bad way:
You can assign arbitrary attributes to instances of most user-defined classes in Python.
If you need an object just for data storage, then you can define a custom class without any functionality in the following way:

    class CustomStruct:
        pass

Then you can have

    struct = CustomStruct()
    struct.c = []

and change or request attributes of the instance in this way.
Better approach:
If you really want to store these things as properties of an object, then it might be best to define the variables in the init of that class.
    class BetterStruct:
        def __init__(self):
            self.c = []
            self.indiv = []

In this way, users looking at your code can immediately understand the expected values, and you can guarantee that they are initialised in a proper fashion.
Allowing data control
If you want to verify the data when it is stored, or if it has to be calculated only once the user requests it (instead of being stored constantly), then consider using Python property decorators.
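A short sketch of that last point, using a property to validate c on assignment (the class name and checks are just examples):

    class GuardedStruct:
        def __init__(self):
            self._c = []

        @property
        def c(self):
            # Could also compute the value lazily here instead of storing it.
            return self._c

        @c.setter
        def c(self, value):
            # Verify the data before it is stored.
            if not isinstance(value, list):
                raise TypeError("c must be a list")
            self._c = value

    s = GuardedStruct()
    s.c = [1, 2, 3]   # accepted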

Store reference to non-NDB object in an NDB model

As a caveat: I am an utter novice here. I wouldn't be surprised to learn a) this is already answered, but I can't find it because I lack the vocabulary to describe my problem or b) my question is basically silly to begin with, because what I want to do is silly.
Is there some way to store a reference to a class instance that is defined and kept in active memory, rather than stored in NDB? I'm trying to write an app that would help manage a number of characters/guilds in an MMO. I have a class, CharacterClass, that includes properties such as armor, name, etc., which I define in main.py as a plain Python object; I then define the properties for each of the classes in the game. Each Character, which would be stored in Datastore, would have a property charClass, which would be a reference to one of those instances of CharacterClass. In theory I would be able to do things like

    if character.charClass.armor == "Cloth":

while storing the potentially hundreds of unique characters and their specific data in Datastore, but without creating a copy of "Cloth" for every cloth-armor character, or querying Datastore for what kind of armor a mage wears thousands of times a day.
I don't know what kind of NDB property to use in Character to store the reference to the applicable CharacterClass. Or if that's the right way to do it, even. Thanks for taking the time to puzzle through my confused question.
A string is all you need. You just need to fetch the class based on the string value. You could create a custom property that automatically instantiates the class on reference.
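A sketch of that idea outside NDB (all names invented): keep the shared CharacterClass instances in an in-memory registry keyed by string, and store only the string per character:

    class CharacterClass:
        def __init__(self, name, armor):
            self.name = name
            self.armor = armor

    # Shared, in-memory definitions; each Character entity stores only the key.
    CHARACTER_CLASSES = {
        "Mage": CharacterClass("Mage", "Cloth"),
        "Warrior": CharacterClass("Warrior", "Plate"),
    }

    class Character:
        def __init__(self, name, char_class_name):
            self.name = name
            # In NDB this string would be the stored StringProperty.
            self.char_class_name = char_class_name

        @property
        def char_class(self):
            # Looked up on reference, so no copy of "Cloth" per character.
            return CHARACTER_CLASSES[self.char_class_name]

    merlin = Character("Merlin", "Mage")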
However, I have a feeling that hard-coding the values in code might be a bit unwieldy. Maybe your character class instances should be datastore entities as well. That means you can adjust these parameters without deploying new code.
If you want these objects in memory then you can pre-cache them on warmup.

Does a JsonProperty deserialize only upon access?

In Google App Engine NDB, there is a property type JsonProperty which takes a Python list or dictionary and serializes it automatically.
The structure of my model depends on the answer to this question, so I want to know when exactly an object is deserialized? For example:
    # a User model has a property "dictionary" which is of type JsonProperty

    # will the following deserialize the dictionary?
    obj = User.get_by_id(someid)

    # or will it not get deserialized until I actually access the dictionary?
    val = obj.dictionary['value']
ndb.JsonProperty follows the docs and does things the same way you would when defining a custom property: it defines make_value_from_datastore and get_value_for_datastore methods.
The documentation doesn't tell you when these methods get called, because it's up to the db implementation inside App Engine to decide when to call them.
However, it's pretty likely they're going to get called whenever the model has to access the database. For example, from the documentation for get_value_for_datastore:
A property class can override this to use a different data type for the datastore than for the model instance, or to perform other data conversion just prior to storing the model instance.
If you really need to verify what's going on, you can provide your own subclass of JsonProperty like this:
    import os
    from google.appengine.ext import ndb

    class LoggingJsonProperty(ndb.JsonProperty):
        def make_value_from_datastore(self, value):
            # open() does not expand '~', so expand it explicitly
            with open(os.path.expanduser('~/test.log'), 'a') as logfile:
                logfile.write('make_value_from_datastore called\n')
            return super(LoggingJsonProperty, self).make_value_from_datastore(value)
You can log the JSON string, the backtrace, etc. if you want. And obviously you can use a standard logging function instead of sticking things in a separate log. But this should be enough to see what's happening.
Another option, of course, is to read the code, which I believe is in appengine/ext/db/__init__.py.
Since it's not documented, the details could change from one version to the next, so you'll have to either re-run your tests or re-read the code each time you upgrade, if you need to be 100% sure.
The correct answer is that it does indeed load the item lazily, upon access:
https://groups.google.com/forum/?fromgroups=#!topic/appengine-ndb-discuss/GaUSM7y4XhQ
