Importing data from JSON to DB questions - python

I want to import data that is in regular JSON format into a (SQLite) db within Django. It looks like fixtures are the recommended way of doing that. I'm not very clear on the steps I need to take. Specifically, do I need to create a mapping of which field should go into which model (assuming my model is already defined in models.py), etc.? The Django example looks like this:
[
  {
    "model": "myapp.person",
    "pk": 1,
    "fields": {
      "first_name": "John",
      "last_name": "Lennon"
    }
  },
  {
    "model": "myapp.person",
    "pk": 2,
    "fields": {
      "first_name": "Paul",
      "last_name": "McCartney"
    }
  }
]
However my data isn't in this format. Do I need to edit the JSON file to look like this? Should I just abort and import json instead?

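json.loads is all that would be needed. Something like
import json
records = json.loads(your_json)
This parses the JSON into Python objects: a dict for a JSON object, or a list of dicts for a JSON array. From there you can create the model instances, provided each dict's keys match the model's field names:
for record in records:
    YourModel.objects.create(**record)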
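If you would rather stay with fixtures, the file does have to follow the model/pk/fields layout shown in the question, but it can be generated instead of edited by hand. A minimal sketch, assuming the source file is a JSON array of flat objects whose keys already match the model's field names; to_fixture and the sample records are hypothetical and only here for illustration:

```python
import json


def to_fixture(records, model_label, start_pk=1):
    """Wrap plain dicts in the model/pk/fields layout used by Django fixtures."""
    return [
        {"model": model_label, "pk": pk, "fields": dict(record)}
        for pk, record in enumerate(records, start=start_pk)
    ]


# Hypothetical input: plain JSON whose keys match the model's field names.
plain_json = (
    '[{"first_name": "John", "last_name": "Lennon"},'
    ' {"first_name": "Paul", "last_name": "McCartney"}]'
)

fixture = to_fixture(json.loads(plain_json), "myapp.person")
fixture_json = json.dumps(fixture, indent=2)
print(fixture_json)
```

Saving that output in the app's fixtures directory would let you load it with manage.py loaddata.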

Related

How to parse a JSON schema file and create new Python classes dynamically with many column constraints?

I am using the SQLAlchemy 1.4 ORM with PostgreSQL 13 and Python 3.7.
EDITED FOR CLARITY AND REFINEMENT:
To stand up the project and test it out, these 3 files are working well:
base.py --> setting up SQLAlchemy Engine and database session
models.py --> a User class is defined with a number of fields
inserts.py --> Creating instances of the User class, adding and committing them to database
This all works well provided models.py has a hardcoded Class already defined (such as the User Class).
I have a schema.json file that defines database schema. The file is very large with many nested dictionaries.
The goal is to parse the json file and use the given schema to create Python Classes in models.py (or whichever file) automatically.
An example of schema.json:
{
    "groupings": {
        "imaging": {
            "owner": { "type": "uuid", "required": true, "index": true },
            "tags": { "type": "text", "index": true },
            "filename": { "type": "text" }
        },
        "user": {
            "email": { "type": "text", "required": true, "unique": true },
            "name": { "type": "text" },
            "role": {
                "type": "text",
                "required": true,
                "values": [
                    "admin",
                    "customer"
                ],
                "index": true
            },
            "date_last_logged": { "type": "timestamptz" }
        }
    },
    "auths": {
        "boilerplate": {
            "owner": ["read", "update", "delete"],
            "org_account": [],
            "customer": ["create", "read", "update", "delete"]
        },
        "loggers": {
            "owner": [],
            "customer": []
        }
    }
}
The models' Classes need to be created on the fly by parsing the json because the schema might change in the future and manually hardcoding 100+ classes each time doesn't scale.
I have spent time researching this but have not been able to find a completely successful solution. Currently this is how I am handling the parsing and dynamic table creation.
I have this function create_class(table_data) which gets passed an already-parsed dictionary containing all the table names, column names, column constraints. The trouble is, I cannot figure out how to create the table with its constraints. Currently, running this code will commit the table to the database but in terms of columns, it only takes what it inherited from Base (automatically generated PK ID).
All of the column names and constraints written into the constraint_dict are ignored.
The line #db_session.add(MyTableClass) is commented out because it errors with "sqlalchemy.orm.exc.UnmappedInstanceError: Class 'sqlalchemy.orm.decl_api.DeclarativeMeta' is not mapped; was a class (__main__.MyTableClass) supplied where an instance was required?"
I think this must have something to do with the order of operations - I am creating an instance of a class before the class itself has been committed. I realise this further confuses things, as I'm not actually calling MyTableClass.
def create_class(table_data):
    constraint_dict = {'__tablename__': 'myTableName'}
    for k, v in table_data.items():
        if 'table' in k:
            constraint_dict['__tablename__'] = v
        else:
            constraint_dict[k] = f'Column({v})'
    MyTableClass = type('MyTableClass', (Base,), constraint_dict)
    Base.metadata.create_all(bind=engine)
    #db_session.add(MyTableClass)
    db_session.commit()
    db_session.close()
    return
I'm not quite sure what to take a look at to complete this last step of getting all columns and their constraints committed to the database. Any help is greatly appreciated!
This does not answer your question directly, but rather poses a different strategy. If you expect the json data to change frequently, you could just consider creating a simple model with an id and data column, essentially using postgres as a json document store.
from sqlalchemy.dialects.postgresql import JSONB
# db is assumed to be a Flask-SQLAlchemy style object exposing Model and Column
class Schema(db.Model):
    id = db.Column(db.Integer(), nullable=False, primary_key=True)
    data = db.Column(JSONB)
SQLAlchemy: postgres dialect JSONB type
The JSONB data type is preferable to the JSON data type in postgres because the binary representation is more efficient to search through, though JSONB does take slightly longer to insert than JSON. You can read more about the distinction between the JSON and JSONB data types in the postgres docs.
This SO post explains how you can use SQLAlchemy to perform JSON operations in postgres: sqlalchemy filter by json field
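To come back to the approach in the question: the columns are most likely ignored because f'Column({v})' produces a plain string, and the declarative base only maps real Column objects. The commented-out db_session.add(MyTableClass) fails for a separate reason: a session stores instances, not classes, and create_all on its own is what creates the table. Below is a minimal sketch of building a class from one parsed schema entry. It assumes the schema keys mean what their names suggest (required means NOT NULL, plus unique and index); column_kwargs, build_model and the type mapping are hypothetical names, and the "values" list is not handled.

```python
def column_kwargs(spec):
    """Translate one schema entry into keyword arguments for sqlalchemy.Column."""
    return {
        "nullable": not spec.get("required", False),
        "unique": spec.get("unique", False),
        "index": spec.get("index", False),
    }


def build_model(base, class_name, table_name, columns):
    """Create a mapped class on the fly from {column_name: spec} pairs."""
    # Imported here so column_kwargs stays usable without SQLAlchemy installed.
    from sqlalchemy import Column, DateTime, Integer, Text
    from sqlalchemy.dialects.postgresql import UUID

    # Assumed mapping from the schema's type names to SQLAlchemy types.
    type_factories = {
        "text": Text,
        "uuid": lambda: UUID(as_uuid=True),
        "timestamptz": lambda: DateTime(timezone=True),
    }
    attrs = {
        "__tablename__": table_name,
        # Drop this line if your Base already supplies the primary key.
        "id": Column(Integer, primary_key=True),
    }
    for name, spec in columns.items():
        # A real Column object, not a string, is what the mapper picks up.
        attrs[name] = Column(type_factories[spec["type"]](), **column_kwargs(spec))
    return type(class_name, (base,), attrs)


# Part of the "user" entry from schema.json in the question.
user_spec = {
    "email": {"type": "text", "required": True, "unique": True},
    "name": {"type": "text"},
}
print(column_kwargs(user_spec["email"]))
```

With that in place, something like User = build_model(Base, 'User', 'user', user_spec) followed by Base.metadata.create_all(bind=engine) should create the table with its columns. Rows are then added as instances, for example db_session.add(User(email='a@example.com')).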

Django getting values from postgres JSON field

I have a simple model like:
class MyModel(models.Model):
    data = JSONField()
The JSONField data is in the following structure:
{
    "name": "Brian",
    "skills": [
        {"id": 4, "name": "First aid"},
        {"id": 5, "name": "Second aid"}
    ]
}
I'd like to create a query that gets a list of MyModels filtered by the id of the skill inside the data.
I've tried a few different avenues here, and can do the work in Python but I'm pretty sure there's a way to do this in Django; I think my SQL isn't good enough to figure it out.
Cheers in advance.
Try this:
>>> MyModel.objects.filter(data__skills__contains=[{'id':4}, {'id':5}])
More about JSON filtering: https://docs.djangoproject.com/en/3.1/topics/db/queries/#querying-jsonfield
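Note that on PostgreSQL the contains lookup is a containment test: every element of the list you pass must be matched by some element of the stored array, so the query above only returns rows that have both skill 4 and skill 5. To filter by a single skill id, pass a one-element list such as [{'id': 4}]. The pure-Python sketch below imitates that rule so you can see what would match; json_contains is a hypothetical helper, not a Django API, and it skips some edge cases of the real operator.

```python
def json_contains(stored, wanted):
    """Rough imitation of jsonb containment (stored @> wanted)."""
    if isinstance(wanted, dict):
        # Every key/value pair in wanted must be contained in stored.
        return isinstance(stored, dict) and all(
            key in stored and json_contains(stored[key], value)
            for key, value in wanted.items()
        )
    if isinstance(wanted, list):
        # Every wanted element must be contained in some stored element.
        return isinstance(stored, list) and all(
            any(json_contains(item, candidate) for item in stored)
            for candidate in wanted
        )
    return stored == wanted


# The JSONField value from the question.
data = {
    "name": "Brian",
    "skills": [
        {"id": 4, "name": "First aid"},
        {"id": 5, "name": "Second aid"},
    ],
}

print(json_contains(data["skills"], [{"id": 4}]))
print(json_contains(data["skills"], [{"id": 4}, {"id": 5}]))
print(json_contains(data["skills"], [{"id": 4}, {"id": 6}]))
```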

Query results of django rest framework

It seems to me that there should be an automatic way to query the results of a Django Rest Framework call and operate it like a dictionary (or something similar). Am I missing something, or is that not possible?
i.e.,
if the call to http://localhost:8000/api/1/roles/
yields
{"count": 2, "next": null, "previous": null, "results": [{"user": {"username": "smithb", "first_name": "Bob", "last_name": "Smith"}, "role_type": 2, "item": 1}, {"user": {"username": "jjones", "first_name": "Jane", "last_name": "Jones"}, "role_type": 2, "item": 1}]}
I would think something akin to http://localhost:8000/api/1/roles/0/user/username should return smithb.
Does this functionality exist or do I need to build it myself?
It appears to be something you will have to build yourself. That said, Django makes this kind of thing very easy. In your URLconf you can specify parts of the URL path to pass to the view: capture the values with a regex and they are passed into your view function as keyword arguments.
Urls:
url(regex=r'^user/api/1/roles/(?P<usernumber>\w{1,50})/(?P<username>\w{1,50})/$', view='views.someApiFunction')
A request for http://domain/user/api/1/roles/0/username/ matches this pattern.
View:
from django.http import HttpResponse
def someApiFunction(request, usernumber=None, username=None):
    # usernumber and username hold the captured URL segments
    return HttpResponse(username)
Some additional Resources:
https://docs.djangoproject.com/en/1.7/intro/tutorial03/#writing-more-views
Capturing url parameters in request.GET
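Whatever the URL pattern looks like, the view still has to walk the response data using the captured path segments. A minimal sketch of that lookup in plain Python, assuming numeric segments index into lists and all other segments are dictionary keys; resolve_path is a hypothetical helper, and the payload is the one from the question:

```python
def resolve_path(payload, path):
    """Follow slash-separated segments through nested dicts and lists."""
    current = payload
    for segment in path.strip("/").split("/"):
        if isinstance(current, list):
            current = current[int(segment)]  # list position, e.g. "0"
        else:
            current = current[segment]  # dictionary key, e.g. "user"
    return current


payload = {
    "count": 2,
    "next": None,
    "previous": None,
    "results": [
        {"user": {"username": "smithb", "first_name": "Bob", "last_name": "Smith"},
         "role_type": 2, "item": 1},
        {"user": {"username": "jjones", "first_name": "Jane", "last_name": "Jones"},
         "role_type": 2, "item": 1},
    ],
}

print(resolve_path(payload["results"], "0/user/username"))
```

A real view would also want to catch KeyError, IndexError and ValueError and turn them into a 404 response.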

Django: formatting JSON serialization

I have the following Django view:
def company(request):
    company_list = Company.objects.all()
    output = serializers.serialize('json', company_list, fields=('name','phonenumber','email','companylogo'))
    return HttpResponse(output, content_type="application/json")
It results in the following:
[{"pk": 1, "model": "test.company", "fields": {"companylogo": null, "phonenumber": "741.999.5554", "name": "Remax", "email": "home#remax.co.il"}}, {"pk": 4, "model": "test.company", "fields": {"companylogo": null, "phonenumber": "641-7778889", "name": "remixa", "email": "a#aa.com"}}, {"pk": 2, "model": "test.company", "fields": {"companylogo": null, "phonenumber": "658-2233444", "name": "remix", "email": "b#bbb.com"}}, {"pk": 7, "model": "test.company", "fields": {"companylogo": null, "phonenumber": "996-7778880", "name": "remix", "email": "a#aba.com"}}]
My questions:
1. Can I control the order of the fields?
2. Can I change the names of the fields?
3. I was expecting to see the result indented in the browser, i.e. instead of one long line, something like this:
[
    {
        "pk": 1,
        "model": "test.company",
        "fields":
        {
            "companylogo": null,
            "phonenumber": "741.999.5554",
            "name": "Remax",
            "email": "home#remax.co.il"
        }
    },
    {
        "pk": 4,
        "model": "test.company",
        "fields":
        {
            "companylogo": null,
            "phonenumber": "641-7778889",
            "name": "remixa",
            "email": "a#aa.com"
        }
    },
    ....
    }
]
You can get pretty-printed output this way:
return JsonResponse(your_json_dict, json_dumps_params={'indent': 2})
See the Django docs, as the first comment says.
Python (unrelated to Django and starting with 2.6) has a built in json library that can accomplish the indentation you require. If you are looking for something quick and dirty for debug purposes you could do something like this:
from django.http import HttpResponse
from django.core import serializers
import json
def company(request, pretty=False):
    company_list = Company.objects.all()
    output = serializers.serialize('json', company_list, fields=('name','phonenumber','email','companylogo'))
    if pretty:
        # Round-trip through the json module to re-emit the data with indentation
        output = json.dumps(json.loads(output), indent=4)
    return HttpResponse(output, content_type="application/json")
But this is a performance issue if the Company table is large. I recommend taking Dan R's advice and using a browser plugin to parse and render the JSON, or coming up with some other client-side solution. I have a script that takes in a JSON file and does exactly the same thing as the code above: it reads in the JSON and prints it out with indent=4.
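A script along those lines might look like the sketch below, using only the standard library; pretty_json is a hypothetical name, and the sample string stands in for the contents of a file:

```python
import json


def pretty_json(text, indent=4):
    """Parse a JSON string and re-emit it with indentation."""
    return json.dumps(json.loads(text), indent=indent)


# Stand-in for the contents of a JSON file.
sample = '[{"pk": 1, "model": "test.company", "fields": {"name": "Remax"}}]'
print(pretty_json(sample))
```

For a file on disk, python -m json.tool followed by the file name does the same from the command line.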
As for sorting your output, you can just use the order_by method on your query set:
def company(request):
    company_list = Company.objects.order_by("sorting_field")
    ...
And if you always want that model sorted that way, you can use the ordering meta-class option:
class Company(models.Model):
    class Meta:
        ordering = ["sorting_field"]
    ...
As a final note, if your intent is to expose your models through a web service, I highly recommend taking a look at tastypie. It may help you in the long run, as it provides many other convenient features towards that end.
With Django 1.7 I can get nicely indented JSON by using the indent parameter of the serializer. For instance, in a command that dumps data from my database:
self.stdout.write(serializers.serialize("json",
                                        records,
                                        indent=2))
The indent parameter has been in Django since version 1.5. The output I get looks like this:
[
  {
    "fields": {
      "type": "something",
      "word": "something else",
    },
    "model": "whatever",
    "pk": 887060
  },
  {
    "fields": {
      "type": "something more",
      "word": "and another thing",
    },
    "model": "whatever",
    "pk": 887061
  },
  ...
To order your records then you'd have to do what Kevin suggested and use order_by, or whatever method you want to order the records you pass to the serializer. For instance, I use itertools.chain to order different querysets that return instances of different models.
The serializer does not support ordering fields according to your preferences, or renaming them. You have to write your own code to do this or use an external tool.
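One way to write that code yourself is to post-process the serializer's output: parse it, rebuild each record with the keys in the order and under the names you want, and dump it again. A minimal sketch, assuming Python 3.7+ (where dicts keep insertion order); reshape and the FIELD_NAMES mapping are hypothetical, and the input mimics the serializer output from the question:

```python
import json

# Output order and new names: serializer field -> name to expose (assumed).
FIELD_NAMES = {"name": "company", "phonenumber": "phone", "email": "email"}


def reshape(serialized, field_names=FIELD_NAMES, indent=2):
    """Flatten serializer output, renaming and ordering fields per field_names."""
    records = []
    for item in json.loads(serialized):
        record = {"id": item["pk"]}
        for old, new in field_names.items():
            record[new] = item["fields"].get(old)
        records.append(record)
    return json.dumps(records, indent=indent)


serialized = (
    '[{"pk": 1, "model": "test.company", "fields": {"companylogo": null, '
    '"phonenumber": "741.999.5554", "name": "Remax", "email": "home#remax.co.il"}}]'
)
print(reshape(serialized))
```

In the view, the string returned by serializers.serialize would be passed in as serialized and the result returned in an HttpResponse.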
JSON doesn't have indentation, it's simply structured data. Browsers or other tools may format the JSON so that it looks nice but by default it's not there. It's also not part of the JSON as the formatting is just how it looks on the screen. JSON is often processed by other code or services so they don't care about indentation, as long as the data is structured correctly.

tastypie PUT error unauthorized 401 when edit related fields

I'm trying to use tastypie's PUT method to edit objects.
Let's say my object has this structure:
{
    "id": 38,
    "media": [],
    "name": "tesdr",
    "resource_uri": "/api/v2/group/38/",
    "status": 7,
    "user_name": null,
    "users": []
}
where media and users are many-to-many related fields. When I edit the group and PUT it without any change to the m2m fields, everything works fine.
But when I try to PUT something like this:
{
    "id": 38,
    "media": [
        "/api/v2/media/70/"
    ],
    "name": "testgpat",
    "resource_uri": "/api/v2/group/40/",
    "status": 6,
    "user_name": null,
    "users": []
}
tastypie returns a 401 HTTP error. What is the solution? Any ideas?
OK, I just solved the problem: the many-to-many field has to be defined in both of the resources that take part in the relation.
Thanks all! :D
