How to load json data with get_or_create? - python

I am a self-taught programmer, new to Python and Django, and I would like to optimize my code.
My problem is that I want to call get_or_create with some loaded JSON data. Each dictionary entry maps directly to a field on my model. Example:
data = json.load(file)
Person.objects.get_or_create(
    firstname=data['firstname'],
    surname=data['surname'],
    gender=data['gender'],
    datebirth=data['datebirth'],
)
Is there a way to automatically map the JSON properties to my model fields instead of typing every property one by one?

What you might want to do is unpack the dictionary into keyword arguments with the ** operator (see the Python docs on unpacking argument lists).
Say your model is Person:
p = Person(**data_dict)
p.save()
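The same unpacking should also work with get_or_create, which returns an (object, created) tuple. The sketch below is a minimal illustration, not the answerer's code: PERSON_FIELDS and model_kwargs are hypothetical names, and the sample JSON is made up. It filters the loaded dictionary down to known field names first, so that a stray key in the file does not reach the model. The Django call itself is left as a comment so the snippet runs without a project.

```python
import json

# Hypothetical list of the model's field names (an assumption for this sketch).
# In a Django project you could derive it from Person._meta.get_fields().
PERSON_FIELDS = {"firstname", "surname", "gender", "datebirth"}


def model_kwargs(data, allowed=PERSON_FIELDS):
    """Keep only the keys that correspond to model fields."""
    return {key: value for key, value in data.items() if key in allowed}


# Made-up sample input with one key the model does not know about.
raw = (
    '{"firstname": "Ada", "surname": "Lovelace", "gender": "F", '
    '"datebirth": "1815-12-10", "extra": 1}'
)
data = json.loads(raw)
kwargs = model_kwargs(data)

# With Django available, the call from the question would become:
# person, created = Person.objects.get_or_create(**kwargs)
print(kwargs)
```

Filtering is optional; if the JSON keys are guaranteed to match the model fields exactly, passing **data directly is enough.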

You can run the following in the Python shell to load the JSON and inspect it:
import json
data = json.loads(source)  # source is the JSON text as a string
print(json.dumps(data, indent=2))

Related

Understanding jsonify() function. Maybe it's used for displaying the contacts?

I have to build, in Python, an address book of contacts using a JSON file. So, I have to define a class Contact whose only attributes are the name, surname and mail. In the solution there is a function that I never would have thought of: jsonify(self). I don't understand what it does and why I need it. Can someone help me figure it out?
def jsonify(self):
    contact = {'name': self.name, 'surname': self.surname, 'mail': self.mail}
    return contact
Using the json library, you can convert objects into JSON really easily, but it needs to know how to go about converting them. The library doesn't know how to interpret your custom class:
>>> import json
>>> contact = Contact("Hello", "World", "hello#world.com")
>>> contact_json = json.dumps(contact)
TypeError: Object of type 'Contact' is not JSON serializable
(json.dumps(obj) converts its input to JSON and returns it as a string. You can also use json.dump(obj, file_handle) to write it to a file.)
A dictionary is a known type in Python, so the json library knows how to convert it into json format:
>>> import json
>>> contact = Contact("Hello", "World", "hello#world.com")
>>> contact_json = json.dumps(contact.jsonify(), indent=4)
>>> print(contact_json)
{
    "name": "Hello",
    "surname": "World",
    "mail": "hello#world.com"
}
Using that jsonify method, you're converting the fields from your class into something that the json library can understand and knows how to translate.
This is a quick and easy way to serialise your object into JSON, but isn't necessarily the best way to do it - ideally you'd tell the JSON library how to interpret your class (see this related question: How to make a class JSON serializable).
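One way to tell the json library how to interpret a class is to subclass json.JSONEncoder and override its default() method. The sketch below is only an illustration of that idea: the Contact class is reconstructed from the question, and ContactEncoder is a name invented here.

```python
import json


class Contact:
    """Contact class as described in the question (reconstructed for this sketch)."""

    def __init__(self, name, surname, mail):
        self.name = name
        self.surname = surname
        self.mail = mail


class ContactEncoder(json.JSONEncoder):
    """Hypothetical encoder that knows how to serialise Contact objects."""

    def default(self, obj):
        # default() is only called for objects json cannot serialise itself.
        if isinstance(obj, Contact):
            return {"name": obj.name, "surname": obj.surname, "mail": obj.mail}
        # Fall back to the base class, which raises TypeError for unknown types.
        return super().default(obj)


contact = Contact("Hello", "World", "hello#world.com")
contact_json = json.dumps(contact, cls=ContactEncoder)
print(contact_json)
```

With this approach the calling code no longer needs to remember to call jsonify() on each object, and a list of contacts can be passed to json.dumps directly.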
Edit: Seeing the comment discussion - I'm assuming here you have an understanding of Python data structures and classes. If this is not the case it's worth reading up on those first.

How to read and assign variables from an API return that's formatted as Dictionary-List-Dictionary?

So I'm trying to learn Python here, and would appreciate any help you guys could give me. I've written a bit of code that asks one of my favorite websites for some information, and the api call returns an answer in a dictionary. In this dictionary is a list. In that list is a dictionary. This seems crazy to me, but hell, I'm a newbie.
I'm trying to assign the answers to variables, but always get various error messages depending on how I write my {},[], or (). Regardless, I can't get it to work. How do I read this return? Thanks in advance.
{
    "answer": [
        {
            "widgets": 16,
            "widgets_available": 16,
            "widgets_missing": 7,
            "widget_flatprice": "156",
            "widget_averages": 15,
            "widget_cost": 125,
            "widget_profit": "31",
            "widget": "90.59"
        }
    ],
    "result": true
}
Edited because I put in the wrong sample code.
You need to show your code, but the de-facto way of doing this is by using the requests module, like this:
import requests
url = 'http://www.example.com/api/v1/something'
r = requests.get(url)
data = r.json()  # converts the returned json into a Python dictionary
for item in data['answer']:
    print(item['widgets'])
Assuming that you are not using the requests library (see Burhan's answer), you would use the json module like so:
import json
# Triple quotes are needed because the string spans several lines
data = '''{"answer":
[{"widgets":16,
"widgets_available":16,
"widgets_missing":7,
"widget_flatprice":"156",
"widget_averages":15,
"widget_cost":125,
"widget_profit":"31",
"widget":"90.59"}],
"result":true}'''
data = json.loads(data)
# Now you can use it as you wish
data['answer'] # and so on...
First, note that to access a dictionary value you need to use ["key"], not {}. See the Python documentation on dictionary syntax.
Here is a step by step walkthrough on how to build and access a similar data structure:
First create the main dictionary:
t1 = {"a":0, "b":1}
You can access each element with:
t1["a"] # it'll return a 0
Now let's add the internal list:
t1["a"] = ["x",7,3.14]
and access it using:
t1["a"][2] # it'll return 3.14
Now create the internal dictionary:
t1["a"][2] = {'w1':7,'w2':8,'w3':9}
And access it:
t1["a"][2]['w3'] # it'll return 9
Hope it helped you.
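Applied to the response from the question, the same dictionary, list, dictionary indexing might look like the sketch below. The variable names are invented for illustration, and the float() conversions assume you want numbers for the values the API returns as strings.

```python
import json

# The API response from the question, as a JSON string.
raw = '''{"answer": [{"widgets": 16,
                      "widgets_available": 16,
                      "widgets_missing": 7,
                      "widget_flatprice": "156",
                      "widget_averages": 15,
                      "widget_cost": 125,
                      "widget_profit": "31",
                      "widget": "90.59"}],
          "result": true}'''

data = json.loads(raw)

# data is a dict, data["answer"] is a list, and its first element is a dict.
first = data["answer"][0]

widgets = first["widgets"]
widget_cost = first["widget_cost"]
# Some values arrive as strings, so convert them if you need numbers.
flat_price = float(first["widget_flatprice"])
profit = float(first["widget_profit"])

print(widgets, widget_cost, flat_price, profit, data["result"])
```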

Is parsing a json naively into a Python class or struct secure?

Some background first: I have a few rather simple data structures which are persisted as json files on disk. These json files are shared between applications of different languages and different environments (like web frontend and data manipulation tools).
For each of the files I want to create a Python "POPO" (Plain Old Python Object), and a corresponding data mapper class for each item should implement some simple CRUD like behavior (e.g. save will serialize the class and store as json file on disk).
I think a simple mapper (which only knows about basic types) will work. However, I'm concerned about security. Some of the JSON files will be generated by a web frontend, so there is a possible security risk if a user feeds me some bad JSON.
Finally, here is the simple mapping code (found at How to convert JSON data into a Python object):
class User(object):
    def __init__(self, name, username):
        self.name = name
        self.username = username
import json
j = json.loads(your_json)
u = User(**j)
What possible security issues do you see?
NB: I'm new to Python.
Edit: Thanks all for your comments. I've found that one of my JSON files contains two arrays, each holding a map. Unfortunately, the mapping looks like it will get cumbersome as I get more of these.
I'm extending the question to mapping a json input to a recordtype. The original code is from here: https://stackoverflow.com/a/15882054/1708349.
Since I need mutable objects, I'd change it to use a namedlist instead of a namedtuple:
import json
from namedlist import namedlist
data = '{"name": "John Smith", "hometown": {"name": "New York", "id": 123}}'
# Parse JSON into an object with attributes corresponding to dict keys.
x = json.loads(data, object_hook=lambda d: namedlist('X', d.keys())(*d.values()))
print(x.name, x.hometown.name, x.hometown.id)
Is it still safe?
There's not much wrong that can happen in the first case. You're limiting what arguments can be provided and it's easy to add validation/conversion right after loading from JSON.
The second example is a bit worse. Packing things into records like this will not help you in any way. You don't inherit any methods, because each type you define is new. You can't compare values easily, because dicts are not ordered. You don't know if you have all arguments handled, or if there is some extra data, which can lead to hidden problems later.
So in summary: with User(**data), you're pretty safe. With namedlist there's room for ambiguity and you don't really gain anything compared to the bare parsed JSON.
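To illustrate how the constructor limits what can be provided: unpacking a dictionary that contains a key the constructor does not accept raises a TypeError instead of silently setting an attribute. This is a minimal sketch; the is_admin key is a made-up example of unwanted input.

```python
import json


class User:
    def __init__(self, name, username):
        self.name = name
        self.username = username


# Hypothetical payload carrying a key that __init__ does not accept.
payload = json.loads('{"name": "a", "username": "b", "is_admin": true}')

try:
    User(**payload)
    rejected = False
except TypeError as exc:
    # Python refuses unexpected keyword arguments.
    rejected = True
    print("rejected:", exc)
```

A missing key fails the same way, so the set of accepted fields is exactly the constructor's signature.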
If you blindly accept a user's JSON input without a sanity check, you risk becoming a victim of JSON injection.
See a detailed explanation of JSON injection attacks here: https://www.acunetix.com/blog/web-security-zone/what-are-json-injections/
Besides the security vulnerability, parsing JSON into a Python object this way is not type safe.
With your example User class, I assume you expect both fields, name and username, to be strings. What if the JSON input looks like this:
{
    "name": "my name",
    "username": 1
}
j = json.loads(your_json)
u = User(**j)
type(u.username) # int
You now have an object with a field of an unexpected type.
One way to ensure type safety is to validate the input JSON against a JSON Schema. More about JSON Schema: https://json-schema.org/
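As a rough illustration of the kind of check a schema would perform, here is a hand-rolled stand-in that uses only the standard library. It is not JSON Schema itself, and EXPECTED_TYPES and load_user are names invented for this sketch.

```python
import json


class User:
    def __init__(self, name, username):
        self.name = name
        self.username = username


# Hypothetical description of the expected fields and their types.
EXPECTED_TYPES = {"name": str, "username": str}


def load_user(raw):
    """Parse raw JSON and build a User only if keys and types match."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != set(EXPECTED_TYPES):
        raise ValueError("unexpected or missing keys: %s" % sorted(data))
    for key, expected in EXPECTED_TYPES.items():
        if not isinstance(data[key], expected):
            raise ValueError("%s must be of type %s" % (key, expected.__name__))
    return User(**data)


user = load_user('{"name": "my name", "username": "me"}')
print(user.name, user.username)

try:
    load_user('{"name": "my name", "username": 1}')
except ValueError as exc:
    print("rejected:", exc)
```

For anything beyond a couple of flat fields, a real schema validator is likely to scale better than checks written by hand.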

How to use loaded data to add a new value in an ItemLoader?

I have started a scraping project, and I have a small problem with ItemLoader.
Suppose I have some ItemLoader in a scraper:
l = ScraperProductLoader(item=ScraperProduct(), selector=node)
l.add_xpath('sku', 'id/text()')
I would like to add a URL to the item loader based on the sku I have provided:
l.add_value('url', '?????')
...However, based on the documentation, I don't see a clear way to do this.
Options I have considered:
Input processor: Add a string, and pass the sku as the context somehow
Handle separately: Create the URL without using the item loader
How can I use loaded data to add a new value in an ItemLoader?
You can use the get_output_value() method:
get_output_value(field_name)
Return the collected values parsed using the output processor, for the given field. This method doesn’t populate or modify the item at all.
l.add_value('url', 'http://domain.com/' + l.get_output_value('sku'))

Django/ python validate JSON

What is the best way to validate JSON data in Django/Python?
Is it best to create a bunch of classes, like the Django FormMixin classes, that can validate the data/parameters being passed in?
What's the best DRY way of doing this? Are there existing apps that I can leverage?
I'd like to take in JSON data and perform some actions/updates on my model instances as a result. The data I'm taking in is not user generated (it consists of IDs and flags, no text), so I don't want to use Forms.
I just instantiate a model object from the json data and call full_clean() on the model to validate: https://docs.djangoproject.com/en/dev/ref/models/instances/#django.db.models.Model.full_clean
m = myModel(**jsondata)
m.full_clean()
validictory validates JSON against a JSON Schema. It works. Of course, you then need to define your schema in JSON, which may be a little much for what you want to do, but it does have its place.
I would recommend a Python library named DictShield for this: https://github.com/j2labs/dictshield
DictShield is a database-agnostic modeling system. It provides a way to model, validate and reshape data easily.
There is even a sample for doing JSON validation:
Validating User Input
Let's say we get this JSON string from a user.
{"bio": "Python, Erlang and guitars!", "secret": "e8b5d682452313a6142c10b045a9a135", "name": "J2D2"}
We might write some server code that looks like this:
json_string = request.get_arg('data')
user_input = json.loads(json_string)
user.validate(**user_input)
