Serializing a JSON object effectively - Python

This is a bit of a puzzle. I have these pseudo-models:
class Country(models.Model):
    name = models.CharField(unique=True)

class Region(models.Model):
    name = models.CharField(unique=True)
    country = models.ForeignKey(Country)

class SubRegion(models.Model):
    name = models.CharField(unique=True)
    region = models.ForeignKey(Region)

class Estate(models.Model):
    name = models.CharField(unique=True)
    sub_region = models.ForeignKey(SubRegion)
I am trying to JSON-serialize their data as shown below, but I'm not sure how to do this efficiently (avoiding too many database queries). Suggestions are appreciated.
{
    "CountryX": {
        "RegionX": {
            "SubRegionX": [
                "EstateX",
                "EstateY",
                "EstateZ"
            ],
            "SubRegionY": [ ... ]
        },
        "RegionY": { ... }
    },
    "CountryY": { ... }
}

I haven't tested this, but it should give you the idea. Start with the innermost object, use select_related to traverse the hierarchy, then loop over the innermost objects, adding the keys for the hierarchy as needed.
Just a note of warning, if there are Countries/Regions/Subregions without any estates, they won't be included in the JSON. If that's not OK, you'll need to query each of the models separately.
import json

data = {}
for e in Estate.objects.select_related("sub_region__region__country"):
    sub, region, country = e.sub_region, e.sub_region.region, e.sub_region.region.country
    if country.name not in data:
        data[country.name] = {}
    if region.name not in data[country.name]:
        data[country.name][region.name] = {}
    if sub.name not in data[country.name][region.name]:
        data[country.name][region.name][sub.name] = []
    data[country.name][region.name][sub.name].append(e.name)
json_data = json.dumps(data)
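A slightly tighter variant of the same loop, using dict.setdefault to replace the three membership checks (also untested, just a sketch of the same idea):

data = {}
for e in Estate.objects.select_related("sub_region__region__country"):
    sub = e.sub_region
    # setdefault returns the existing nested dict/list, or inserts an empty one
    region_dict = data.setdefault(sub.region.country.name, {})
    sub_dict = region_dict.setdefault(sub.region.name, {})
    sub_dict.setdefault(sub.name, []).append(e.name)

json_data = json.dumps(data)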

This suggestion might not be exactly what you were looking for, but I've used it in a couple of situations where I needed quick-and-dirty JSON of an app's data.
Check out ./manage.py dumpdata app_name (or app_name.model_name). This gives you JSON for all of the data in all of the tables of that app (or that model). The format may be a little different from what you had in mind, but it includes all of the PK and class info necessary to maintain ForeignKey relationships, and it spits the objects out in the order necessary to create a referenced object before the objects that reference it. Very handy.
If you want to invoke it from inside a script, look at django/core/management/commands/dumpdata.py to see how they do it.
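For example, a minimal sketch using call_command (my_app is just a placeholder app label):

from io import StringIO

from django.core.management import call_command

# Capture dumpdata's JSON output in a string buffer instead of stdout
buf = StringIO()
call_command("dumpdata", "my_app", format="json", indent=2, stdout=buf)
json_output = buf.getvalue()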

Related

Python flask Graphene: Mapping fields with API response

I'm building a GraphQL API using Python Flask and Graphene.
Basically, my JSON file data looks like the following:
{
    "address": {
        "streetAddress": "301",
        "#city": "Los Angeles",
        "state": "CA"
    }
}
And my Graphene schema looks as follows:
class Address(ObjectType):
    streetAddress = String()
    city = String()
    state = String()

    class Meta:
        exclude_fields = ('#city',)

class Common(ObjectType):
    data = Field(Address)

    def resolve_data(self, info):
        data = open("address.json", "r")
        data_mod = json.loads(data.read())["address"]
        return data_mod
So I am trying to map this #city JSON key to my schema field called city.
One of the articles I read mentioned that the Meta class can be used to exclude the original field name, like this:
class Meta:
    exclude_fields = ('#city',)
Still, it didn't work. I am using a common schema that fetches the JSON data into the Address schema fields with one resolver. Can someone suggest a way to map these kinds of fields to Graphene schema fields?
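One possible approach (an untested sketch): since resolve_data returns a plain dict, Graphene's default resolver looks each field up by its own name, so "#city" is never matched. A per-field resolver on Address can pull the value out of the dict explicitly:

from graphene import ObjectType, String

class Address(ObjectType):
    streetAddress = String()
    city = String()
    state = String()

    def resolve_city(parent, info):
        # parent is the dict returned by Common.resolve_data;
        # look the value up under the original "#city" key
        return parent.get("#city")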

Store Subtitles in a Database

I'm working on a project that uses AI to recognise the speech of an audio file. The output of this AI is a huge JSON object with tons of values. I'll remove some keys, and the final structure will look as follows.
{
    "text": "<recognised text>",
    "language": "<detected language>",
    "segments": [
        {"startTimestamp": "00:00:00", "endTimestamp": "00:00:10", "text": "<some text>"},
        {"startTimestamp": "00:00:10", "endTimestamp": "00:00:17", "text": "<some text>"},
        {"startTimestamp": "00:00:17", "endTimestamp": "00:00:26", "text": "<some text>"},
        { ... },
        { ... }
    ]
}
Now, I wish to store this trimmed object in an SQL database because I want to be able to edit it manually. I'll create a React application to edit segments, delete segments, etc. Additionally, I want the React application to save the information every 5 seconds using an AJAX call.
Now, I don't understand how I should store this object in the SQL database. Initially, I thought I would store the whole object as a string in a database. Whenever a change is made to the object, the React application sends a JSON object, the backend sanitizes it and replaces the old stringified object in the database with the new sanitized one. That way updating and deletion are easy, but searching could be an issue. I'm wondering if there are better approaches.
Could someone guide me on this?
Tech Stack
Frontend - React
Backend - Django 3.2.15
Database - PostgreSQL
Thank you
Now, I don't understand how I should store this object in the SQL database. Initially, I thought I would store the whole object as a string in a database.
If the data has a clear structure, you should not store it as a JSON blob in a relational database. While relational databases have some JSON support nowadays, it is still not very effective: normally it means you cannot efficiently filter, aggregate, and manipulate the data, nor can you enforce referential integrity.
You can work with two models that look like:
from django.db import models
from django.db.models import F, Q


class Subtitle(models.Model):
    text = models.CharField(max_length=128)
    language = models.CharField(max_length=128)


class Segment(models.Model):
    startTimestamp = models.DurationField()
    endTimestamp = models.DurationField()
    subtitle = models.ForeignKey(
        Subtitle, on_delete=models.CASCADE, related_name='segments'
    )
    text = models.CharField(max_length=512)

    class Meta:
        ordering = ('subtitle', 'startTimestamp', 'endTimestamp')
        constraints = [
            models.CheckConstraint(
                # a segment must start before it ends
                check=Q(startTimestamp__lt=F('endTimestamp')),
                name='start_before_end',
            )
        ]
This will also guarantee, for example, that the startTimestamp comes before the endTimestamp, and that these fields store durations (and not "foo", for example).
You can convert to and from JSON with Django REST Framework serializers [drf-doc]:
from rest_framework import serializers


class SegmentSerializer(serializers.ModelSerializer):
    class Meta:
        model = Segment
        fields = ['startTimestamp', 'endTimestamp', 'text']


class SubtitleSerializer(serializers.ModelSerializer):
    segments = SegmentSerializer(many=True)

    class Meta:
        model = Subtitle
        fields = ['text', 'language', 'segments']
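For example, reading a stored subtitle back out as JSON might look like this (a sketch; accepting nested writes from incoming JSON would also need a custom create() on SubtitleSerializer, not shown here):

import json

# Serialize one Subtitle (with its related segments) to a plain dict,
# then to a JSON string.
subtitle = Subtitle.objects.prefetch_related('segments').first()
payload = SubtitleSerializer(subtitle).data
json_string = json.dumps(payload)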

Django Model load choices from a json file

I am writing a Django User model which contains a mobile_country_code field. This field needs to be populated from a list of ISD codes, pre-populated in a JSON file. What is the most Pythonic way to do this?
My current implementation, which is working:
json_data/countries.json
[
    ...
    {
        "name": "Malaysia",
        "dial_code": "+60",
        "code": "MY"
    },
    ...
]
project/app/models.py
import json, os

class User(models.Model):
    with open(os.path.dirname(__file__) + '/json_data/countries.json') as f:
        countries_json = json.load(f)

    COUNTRIES_ISD_CODES = [
        (str(country["dial_code"]), str(country["name"]))
        for country in countries_json
    ]

    mobile_country_code = models.CharField(
        choices=COUNTRIES_ISD_CODES,
        help_text="Country ISD code loaded from JSON file",
    )
Other possible options are listed below. Which one is better to use?
Using the model's __init__ method to create COUNTRIES_ISD_CODES
Importing a library method, like:
from library import import_countries_isd_codes

class User(models.Model):
    mobile_country_code = models.CharField(choices=import_countries_isd_codes())
Try this. Since Django choices only take tuples:
def jsonDjangoTupple(jsonData):
    """
    Django only takes tuples (actual value, human readable name),
    so we need to repack the JSON into a dictionary of tuples.
    """
    dicOfTupple = dict()
    for key, valueList in jsonData.items():
        dicOfTupple[str(key)] = [(value, value) for value in valueList]
    return dicOfTupple

# jsonData is the dict loaded from your JSON file
json_data = jsonDjangoTupple(jsonData)
In the model, pass something like:
class User(models.Model):
    code = models.CharField(max_length=120, choices=json_data['code'])
This works when there's more than one value per key.
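For the countries.json structure shown in the question (a list of objects rather than a dict of lists), a helper along the lines of option 2 could look roughly like this (a sketch; the function name and file path mirror the ones in the question):

import json
import os

def import_countries_isd_codes():
    """Load countries.json once and return Django-style (value, label) choices."""
    path = os.path.join(os.path.dirname(__file__), 'json_data', 'countries.json')
    with open(path) as f:
        countries = json.load(f)
    return [(country["dial_code"], country["name"]) for country in countries]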

Google App Engine return object with reference set

Hi, I am kind of trying to get the concept behind the Datastore as a NoSQL database. What I am trying to fetch is a list of objects which have been "referenced" by another, like this:
class Person(db.Model):
    name = db.StringProperty(required=True)

class Contact(db.Model):
    name = db.StringProperty(required=True)
    email = db.StringProperty()
    trader = db.ReferenceProperty(Person)
This works fine, and they get saved when I use person.put() without any problem. But when I try to retrieve it and encode it as JSON, it never shows me the contacts as a list; in fact, it totally ignores them.
persons_query = Person.all()
persons = persons_query.fetch(50)
data = json.encode(persons)
I would expect each person to have a collection of Contact, but it doesn't. Any ideas on how to solve this problem?
To make it clearer, currently I am getting something like this:
[
    {
        "name": "John Doe"
    }
]
I would like it to be:
[
    {
        "name": "John Doe",
        "contacts": [{"name": "Alex", "email": "alex@gmail.com"}]
    }
]
Edit
Thanks all, you were right: I needed to fetch the collection of contacts. There was only one issue, which is that when a Contact was being encoded, it recursively tried to encode its trader object, and then that trader's contacts, and so on.
So I got an obvious recursion error; the solution was to remove the trader object from the Contact when it is being encoded.
Make a custom toJson function in your class:
import json

class Person(db.Model):
    name = db.StringProperty(required=True)

    def toJson(self):
        # contact_set is the default collection name for the reverse reference query
        contacts = [{"name": c.name, "email": c.email} for c in self.contact_set]
        d = {"name": self.name, "contacts": contacts}
        return json.dumps(d)

class Contact(db.Model):
    name = db.StringProperty(required=True)
    email = db.StringProperty()
    trader = db.ReferenceProperty(Person)
Then you can do the following:
persons_query = Person.all()
persons = persons_query.fetch(50)
data = [p.toJson() for p in persons]
To fetch all the contacts, you will need to write a custom JSON encoder which fetches the reverse side of the reference property.
ReferenceProperties automatically get a reverse query. From the docs "collection_name is the name of the property to give to the referenced model class. The value of the property is a Query for all entities that reference the entity. If no collection_name is set, then modelname_set (with the name of the referenced model in lowercase letters and _set added) is used."
So you would add a method to resolve the reverse reference set query.
class Person(db.Model):
    name = db.StringProperty(required=True)

    def contacts(self):
        return self.contact_set.fetch(50)  # should be smarter than that
Then use it in your custom json encoder.
If you want to find all the contacts that include a person you will need to issue a query for it.
contacts = Contact.all().filter("trader =", person)
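Putting it together, a rough sketch of such an encoder, building plain dicts per person and leaving the trader back-reference out of each contact to avoid the recursion mentioned in the edit above:

import json

def person_to_dict(person):
    """Build a JSON-serializable dict for a Person and its contacts."""
    return {
        "name": person.name,
        "contacts": [
            # omit the trader back-reference so encoding doesn't recurse
            {"name": c.name, "email": c.email}
            for c in person.contacts()
        ],
    }

persons = Person.all().fetch(50)
data = json.dumps([person_to_dict(p) for p in persons])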

Python Serialization (to JSON) issue

I'm a bit of a newbie in Python, so go easy on me. I'm writing an AJAX handler in Django. Everything's been pretty straightforward until now; I've been banging my head against this bit for the better part of a day. I'd like to return a JSON string that contains a dict that contains a queryset:
#
# models.py
#
class Project(models.Model):
    unique_name = models.CharField(max_length=32, unique=True)
    title = models.CharField(max_length=255, blank=True)
    description = models.TextField('project description', blank=True)
    project_date = models.DateField('Project completion date')
    published = models.BooleanField()

class ProjectImage(models.Model):
    project = models.ForeignKey('Project', related_name='images')
    image = models.ImageField(upload_to=get_image_path)
    title = models.CharField(max_length=255)
    sort_metric = models.IntegerField()
#
# views.py
#
...
projects = Project.Project.objects.filter(published=True)
...
response_dict = {
    'success': True,
    'maxGroups': 5,  # the result of some analysis on the projects queryset
    'projects': projects,
}
# This one works if I remove 'projects' from the dict
# response = json.dumps(response_dict)
# This one works only on projects
# response = serializers.serialize('json', response_dict, relations=('images'))
return HttpResponse(response, mimetype='application/javascript')
I've commented out the two serialization lines, because:
The first one seems to only work with 'simple' dicts and since projects is included in my dict, it fails with [<Project: Project object>] is not JSON serializable
The second one seems to only work with querysets/models, and since the 'outer' part of my dict is non-model, it complains that 'str' object has no attribute '_meta'. Note that I am using the wadofstuff serializer with the understanding that it would resolve the OneToMany relationship defined in my model. But even when I get this working by serializing only projects, I do not get any of my ProjectImages in the output.
QUESTION 1: What is the best way to serialize the whole response_dict? Surely, I'm not the first person to want to do this, right?
QUESTION 2: Why am I unable to get the ManyToOne relationship to work?
Many thanks for your help.
UPDATE: Just found this one: Django JSON Serialization with Mixed Django models and a Dictionary and it looked promising, but I get 'QuerySet' object has no attribute '_meta' =(
You can't serialize a Python object like that. There is a section in the Django documentation on what to do:
https://docs.djangoproject.com/en/dev/topics/serialization/#id2
Here is the key part to look at:
json_serializer.serialize(queryset, ensure_ascii=False, stream=response)
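For the outer dict (question 1), one common workaround is to serialize the queryset first, parse it back into plain Python data, and embed that in the dict before the final json.dumps. A sketch (note the serializer output keeps Django's model/pk/fields layout rather than flat field dicts):

import json

from django.core import serializers

projects = Project.objects.filter(published=True)

response_dict = {
    'success': True,
    'maxGroups': 5,
    # serialize the queryset to a JSON string, then back to plain lists/dicts
    'projects': json.loads(serializers.serialize('json', projects)),
}
response = json.dumps(response_dict)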
