Does a JsonProperty deserialize only upon access? - python

In Google App Engine NDB, there is a property type JsonProperty which takes a Python list or dictionary and serializes it automatically.
The structure of my model depends on the answer to this question, so I want to know exactly when an object is deserialized. For example:
# a User model has a property "dictionary" which is of type JsonProperty
# will the following deserialize the dictionary?
object = User.get_by_id(someid)
# or will it not get deserialized until I actually access the dictionary?
val = object.dictionary['value']

ndb.JsonProperty follows the docs and does things the same way you would when defining a custom property: it defines make_value_from_datastore and get_value_for_datastore methods.
The documentation doesn't tell you when these methods get called, because that's up to the datastore implementation inside App Engine.
However, it's pretty likely they're going to get called whenever the model has to access the database. For example, from the documentation for get_value_for_datastore:
A property class can override this to use a different data type for the datastore than for the model instance, or to perform other data conversion just prior to storing the model instance.
If you really need to verify what's going on, you can provide your own subclass of JsonProperty like this:
import os

class LoggingJsonProperty(ndb.JsonProperty):
    def make_value_from_datastore(self, value):
        with open(os.path.expanduser('~/test.log'), 'a') as logfile:
            logfile.write('make_value_from_datastore called\n')
        return super(LoggingJsonProperty, self).make_value_from_datastore(value)
You can log the JSON string, the backtrace, etc. if you want. And obviously you can use a standard logging function instead of sticking things in a separate log. But this should be enough to see what's happening.
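For example, here is a rough sketch of wiring the subclass into a model; User and someid are placeholders from the question, and whether (and when) the hook fires is exactly what you'd be testing:

from google.appengine.ext import ndb

class User(ndb.Model):
    dictionary = LoggingJsonProperty()

user = User.get_by_id(someid)      # if loading is lazy, no log entry yet
val = user.dictionary['value']     # the log entry should appear on first access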
Another option, of course, is to read the code, which I believe is in appengine/ext/db/__init__.py.
Since it's not documented, the details could change from one version to the next, so you'll have to either re-run your tests or re-read the code each time you upgrade, if you need to be 100% sure.

The correct answer is that it does indeed load the item lazily, upon access:
https://groups.google.com/forum/?fromgroups=#!topic/appengine-ndb-discuss/GaUSM7y4XhQ

How to use a list of arguments with flask_smorest/marshmallow

I am trying to insert a collection of objects in a flask api. We use marshmallow for deserializing. My endpoint looks like this:
@blp.arguments(SomeSchemas, location='json', as_kwargs=True)
@blp.response(200, SomeSchemas)
def post(self, some_schemas: SomeSchemas) -> dict:
The schema is a simple schema like this:
class SomeSchema(ma.Schema):
    a = ma.fields.String()
    b = ma.fields.Integer()

class SomeSchemas(ma.Schema):
    schemas = ma.fields.List(ma.fields.Nested(SomeSchema))
When I post to the endpoint, I do get a list of the correct data, but it comes in the form of dicts instead of being correctly translated into objects.
I have also tried explicitly using a list of objects (List[SomeSchema], SomeSchema(many=True), etc.), but I cannot seem to figure it out.
I assume this is a very common use case (providing a list of arguments) and that I am missing an obvious solution, but I can't seem to find any reference on how to do this correctly. To be clear, I am looking for the correct way to call the endpoint with a list (or some other collection type, it does not matter) and have that list be correctly deserialized into the correct object type.
Disclaimer: flask-smorest maintainer speaking.
I don't think the issue is related to the fact that the input is a list.
IIUC, your problem is that you're getting dicts, rather than objects, injected into the view function. This is the default marshmallow behaviour. It can be overridden in marshmallow by using a post_load hook to actually instantiate the object.
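For illustration, a minimal sketch of such a hook, with a hypothetical SomeObject domain class standing in for whatever you want injected:

from marshmallow import Schema, fields, post_load

class SomeObject:
    def __init__(self, a, b):
        self.a = a
        self.b = b

class SomeSchema(Schema):
    a = fields.String()
    b = fields.Integer()

    @post_load
    def make_object(self, data, **kwargs):
        # return an instance instead of the default dict
        return SomeObject(**data)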
I generally don't do that. In practice, I find it better to instantiate objects in the view function. For instance, in a PUT resource, I prefer to instantiate the existing item from the DB and then update it with the new data. In this case it is better to have the new data as a dict than as an object.
There may not be a single truth here; it could be a matter of opinion. But while the idea of having the object instantiated in the decorator and passed to the view is appealing, it might be a bit of an abusive shortcut.
I realize this answer is of the "no idea but you shouldn't do it anyway" type. Just saying maybe you shouldn't struggle to achieve this.
That said, I'd be surprised if this worked with non-list / non-nested inputs but failed here, and I'd wonder why it doesn't work specifically in this case.

Python: How to json serialize class reference, not the class instance

I want to serialize the class itself rather than a object instance.
For example if I do this
import json

class Foo:
    pass

json.dumps(Foo)
It would throw an error saying Foo is not JSON serializable.
Is it even possible to do this with python?
A bit late, but you can't actually directly serialize a class. Serializing is the act of turning an object into a string for storage and transmission. A class declaration is already a text file; in your case, Foo lives in Foo.py (or wherever you put it).
Serialization libraries usually keep a reference to the object's class, so that the same program can save and load an object and still know that it's a Foo instance. But if I write my own little app that does not contain a Foo class, then unserializing your data will give me something that is pretty much useless.
Depending on what you want to do, there are alternatives.
If your goal is to save or transmit methods instead of data (a class declaration usually consists of methods, unlike a class instance, which is composed of data held in properties and variables), then you could serialize functions or dynamic objects containing methods. Those would be understood by any program unserializing the data, and those functions could be run correctly. Note that it is a security risk to unserialize data without fact-checking it, as you could transmit damaging methods to an unsuspecting program.
If your goal is to transmit data types, then you might have to consider transmitting your entire program instead. As stated, the goal of serialization is to save or transmit data. If you want to save or transmit functionality, consider that your code, saved on your hard drive, already is a string representation of that functionality; you could transmit the program itself. There are also libraries out there to convert your code into executables, in case the target computer does not support Python.
A third alternative, and half the reason I ended up finding this question on Stack Overflow, is to structure your data as database entries. This only applies if your goal is to save or transmit an object as a "shape" or structure, not as a bunch of functions and actions. In other words, this assumes your goal is to transmit the fact that a Contact class is composed of a name, a phone number and an address.
For the last option, you, in a way, want to build a class of classes:
class Model:
    def __init__(self, class_name, fields):
        self.class_name = class_name
        self.fields = fields

class ModelInstance:
    def __init__(self, model, values):
        self.fields = {}
        # add values to the dict with fields as keys, based on the model
        ...

When you instantiate a Model, you are actually creating the model of an object. Correctly implementing an alternative instantiation system on top of this takes a few more tricks; at that point, some research will be required.
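As a rough illustration of where this leads (the Contact example is hypothetical), such a model serializes cleanly because it is plain data:

import json

contact_model = Model('Contact', ['name', 'phone', 'address'])
serialized = json.dumps(contact_model.__dict__)
# '{"class_name": "Contact", "fields": ["name", "phone", "address"]}'
restored = Model(**json.loads(serialized))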
I understand this post is a bit old but I thought answering what I know while searching for what I don't is the entire point of Stack Overflow!

What would be the equivalent of Pythons "pickle" in nodejs

One of Python's features is the pickle module, which allows you to store almost any object and restore it exactly to its original form. One common usage is to take a fully instantiated object and pickle it for later use. In my case I have an AMQP Message object that is not serializable and I want to be able to store it in a session store and retrieve it, which I can do with pickle. The primary difference is that I need to call a method on the object; I am not just looking for the data.
But this project is in nodejs and it seems like with all of node's low-level libraries there must be some way to save this object, so that it could persist between web calls.
The use case is that a web page picks up a RabbitMQ message and displays the info derived from it. I don't want to acknowledge the message until it has been acted on. I would normally just save the data in session state, but that's not an option unless I can somehow save the object in its original form.
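For reference, this is the Python behaviour I'm describing, with a hypothetical Message class standing in for the AMQP message:

import pickle

class Message:
    def __init__(self, body):
        self.body = body

    def ack(self):
        print('acknowledged:', self.body)

stored = pickle.dumps(Message('hello'))   # e.g. stash the bytes in a session store
restored = pickle.loads(stored)           # later, in another request
restored.ack()                            # methods still work: pickle finds the class by name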
See the pickle-js project: https://code.google.com/p/pickle-js/
Also, from findbestopensource.com:
pickle.js is a JavaScript implementation of the Python pickle format. It supports pickles containing a cross-language subset of the primitive types. Key differences between pickle.js and pickle.py:
- text pickles only
- some types are lossily converted (e.g. int)
- some types are not supported (e.g. class)
More information available here: http://www.findbestopensource.com/product/pickle-js
As far as I am aware, there isn't an equivalent to pickle in JavaScript (or in the standard node libraries).
Check out https://github.com/carlos8f/hydration to see if it fits your needs. I'm not sure it's as complete as pickle but it's pretty terrific.
Disclaimer: The module author and I are coworkers.

Django - Correct way to load a large default value in a model

I have a field called schema in a django model that usually contains a rather large json string. There is a default value (around 2000 characters) that I would like added when any new instance is created.
I find it rather unclean to dump the whole thing in the models.py in a variable. What is the best way to load this default value in my schema (which is a TextField)?
Example:
class LevelSchema(models.Model):
    user = models.ForeignKey(to=User)
    active = models.BooleanField(default=False)
    schema = models.TextField()  # Need a default value for this
I thought about this a bit. If I am using a json file to store the default value somewhere, what is the best place to put it? Obviously it is preferable if it is in the same app in a folder.
The text is rather massive; it would span half the file, as it is formatted JSON which I would like to keep editable in the future. I prefer loading it from a file (like fixtures); I just want to know if there is a method already present in Django.
In Django, you have two options:
1. Listen for the post_save signal and, when created is true, set the default value of the object by reading the file (see the sketch below).
2. Set the default to a callable (a function), and in that method read the file (making sure you close it after) and return its contents.
You can also stick the data in some k/v store (like redis or memcache) for faster access. It would also be better since you won't be constantly opening and closing files.
Finally, the most restrictive option would be to set up a trigger on the database that does the populating for you. You would have to store the json in the database somewhere. Added benefit to this approach is that you can write a django front end to update the json. Downside is it will restrict your application to those database that you decide to support with your trigger.
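A minimal sketch of the post_save option, assuming the LevelSchema model from the question and a hypothetical schema_default.json living next to the app code:

import os

from django.db.models.signals import post_save
from django.dispatch import receiver

from .models import LevelSchema  # the model from the question

# hypothetical location for the default schema file
SCHEMA_PATH = os.path.join(os.path.dirname(__file__), 'schema_default.json')

@receiver(post_save, sender=LevelSchema)
def populate_default_schema(sender, instance, created, **kwargs):
    # only on first save, and only if no schema was supplied
    if created and not instance.schema:
        with open(SCHEMA_PATH) as f:
            instance.schema = f.read()
        instance.save(update_fields=['schema'])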
I find the use of a variable not particularly unclean. But you could "abuse" the fact that the default argument, which all fields support, can be a callable. So you could do this "crazy" thing:

def get_default_json():
    with open('mylargevalue.json') as f:
        return f.read()
and then on your field:
schema = models.TextField(default=get_default_json)
I haven't tried anything like it, but I suppose it could work.

SQLAlchemy Custom Properties

I've got a table called "Projects" which has a mapped column "project". What I'm wanting to be able to do is to define my own property on my mapped class called "project" that performs some manipulation of the project value before returning it. This will of course create an infinite loop when I try to reference the row value. So my question is whether there's a way of setting up my table mapper to use an alias for the project column, perhaps _project. Is there any easy way of doing this?
I worked it out myself in the end. You can specify an alternative name when calling orm.mapper:
orm.mapper(MappedClass, table, properties={'_project': table.c.project})
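With the column mapped to _project, a plain Python property on the class can then do the manipulation; a rough sketch, where strip() stands in for whatever transformation is needed and table is the Table object from the question:

class MappedClass(object):
    @property
    def project(self):
        # manipulate the stored value before returning it
        return self._project.strip()

orm.mapper(MappedClass, table, properties={'_project': table.c.project})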
Have you checked the synonym feature of SQLAlchemy?
http://www.sqlalchemy.org/docs/05/reference/ext/declarative.html#defining-synonyms
http://www.sqlalchemy.org/docs/05/mappers.html#synonyms
I use it pretty often to provide a proper setter/getter public API for properties with a complicated underlying data structure, or in cases where additional functionality or validation is needed.
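A sketch in the declarative style those docs describe; the table and column names are assumptions based on the question, and strip() is again just a placeholder manipulation:

from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import synonym

Base = declarative_base()

class Project(Base):
    __tablename__ = 'projects'

    id = Column(Integer, primary_key=True)
    _project = Column('project', String)

    def _get_project(self):
        # manipulate the stored value before returning it
        return self._project.strip()

    def _set_project(self, value):
        self._project = value

    project = synonym('_project', descriptor=property(_get_project, _set_project))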
