I have to build, in Python, an address book of contacts using a JSON file. To do so, I have to define a class Contact whose only attributes are the name, surname and mail. The solution contains a method I never would have thought of: jsonify(self). I don't understand what it does or why I need it. Can someone help me figure it out?
def jsonify(self):
    contact = {'name': self.name, 'surname': self.surname, 'mail': self.mail}
    return contact
Using the json library, you can convert objects into JSON really easily, but it needs to know how to go about converting them. The library doesn't know how to interpret your custom class:
>>> import json
>>> contact = Contact("Hello", "World", "hello#world.com")
>>> contact_json = json.dumps(contact)
TypeError: Object of type 'Contact' is not JSON serializable
(json.dumps(obj) converts its input to JSON and returns it as a string. You can also do json.dump(obj, file_handle) to save it to a file).
A dictionary is a known type in Python, so the json library knows how to convert it into json format:
>>> import json
>>> contact = Contact("Hello", "World", "hello#world.com")
>>> contact_json = json.dumps(contact.jsonify(), indent=4)
>>> print(contact_json)
{
    "name": "Hello",
    "surname": "World",
    "mail": "hello#world.com"
}
Using that jsonify method, you're converting the fields from your class into something that the json library can understand and knows how to translate.
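To tie this back to the address-book task, here is a minimal sketch of saving and reloading contacts through jsonify. The save_contacts and load_contacts helpers, the Contact constructor and the file layout are my assumptions for illustration, not part of the original solution:

```python
import json
import os
import tempfile


class Contact:
    def __init__(self, name, surname, mail):
        self.name = name
        self.surname = surname
        self.mail = mail

    def jsonify(self):
        # Plain dict that the json module knows how to serialise.
        return {'name': self.name, 'surname': self.surname, 'mail': self.mail}


def save_contacts(contacts, path):
    # Hypothetical helper: write the contacts as a JSON array of objects.
    with open(path, 'w') as f:
        json.dump([c.jsonify() for c in contacts], f, indent=4)


def load_contacts(path):
    # Hypothetical helper: rebuild Contact objects from the stored dicts.
    with open(path) as f:
        return [Contact(**d) for d in json.load(f)]


path = os.path.join(tempfile.mkdtemp(), 'contacts.json')
save_contacts([Contact("Hello", "World", "hello#world.com")], path)
loaded = load_contacts(path)
print(loaded[0].jsonify())
```

Contact(**d) works here only because the dict keys returned by jsonify match the constructor's parameter names.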
This is a quick and easy way to serialise your object into JSON, but isn't necessarily the best way to do it - ideally you'd tell the JSON library how to interpret your class (see this related question: How to make a class JSON serializable).
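One way to "tell the JSON library how to interpret your class" is to subclass json.JSONEncoder and override its default method. This is only a sketch: the ContactEncoder name and the Contact constructor are assumed for illustration.

```python
import json


class Contact:
    def __init__(self, name, surname, mail):
        self.name = name
        self.surname = surname
        self.mail = mail


class ContactEncoder(json.JSONEncoder):
    def default(self, o):
        # default() is called only for objects json cannot serialise itself.
        if isinstance(o, Contact):
            return {'name': o.name, 'surname': o.surname, 'mail': o.mail}
        # Let the base class raise TypeError for anything else.
        return super().default(o)


contact = Contact("Hello", "World", "hello#world.com")
contact_json = json.dumps(contact, cls=ContactEncoder)
print(contact_json)
```

With this in place you pass the object itself to json.dumps, and the conversion logic lives in one place instead of being called at every dump site.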
Edit: Seeing the comment discussion - I'm assuming here you have an understanding of Python data structures and classes. If this is not the case it's worth reading up on those first.
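While studying Python, I have been following an excellent Corey Schafer tutorial on Flask, in which he does this (I have extracted and condensed the code for obvious reasons):
from itsdangerous import TimedJSONWebSignatureSerializer as Serializer # assumed import (itsdangerous 1.1.x); it was not shown in the condensed snippet
from folder_app import app # kept so that the structure and code match the original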
s = Serializer(app.config['SECRET_KEY'], 1800) # key, seconds
token = s.dumps({'user_id': 1}).decode('utf-8')
s = Serializer(app.config['SECRET_KEY'])
user_id = s.loads(token)['user_id'] # This is where I have the doubt
print(user_id)
print(type(s.loads(token)))
The code works. My problem is that although s.loads(token) is, as you can see, a dict, I expected to see something like s.loads({token['user_id']}) or s.loads(token['user_id']). That is, it is a dict, but it is not used the way I expected. My doubt is whether this comes from one of those broader concepts people call "pythonic" (which I have not come across so far), or whether it is specific to this case. Incidentally, at https://itsdangerous.palletsprojects.com/en/1.1.x/jws/ the signature appears as loads(self, s, salt=None, return_header=False), with the arguments inside the parentheses. I hope it is clear what my doubt is :)
I know this is not an answer per se, but it adds to my comment. Here is an example of how the loads function of the json module works: https://docs.python.org/3/library/json.html#json.loads. It takes a JSON string and returns the corresponding Python object, here a dict. Your Serializer is doing something similar: it takes the token string and returns the data it represents as a dict.
I am assuming s.dumps is similar to json.dumps, which gives you the JSON string representation of a Python dictionary.
import json
my_dict = json.loads('{"user_id": "Mane", "name": "Joe"}')
my_dict['user_id']
So you could just do json.loads('{"user_id": "Mane", "name": "Joe"}')['user_id'] which is just chaining the operations.
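To make the parallel with the Serializer concrete, here is a toy stand-in built only from the standard library. ToySerializer is a made-up class for illustration; it is not how itsdangerous is implemented and should not be used for real tokens. The point is only that loads() receives the whole token string, and the subscript is applied to the dict it returns:

```python
import base64
import hashlib
import hmac
import json


class ToySerializer:
    """Illustrative stand-in only; NOT itsdangerous and not for production."""

    def __init__(self, secret_key):
        self.secret_key = secret_key.encode('utf-8')

    def _sign(self, payload):
        digest = hmac.new(self.secret_key, payload, hashlib.sha256).hexdigest()
        return digest.encode('ascii')

    def dumps(self, obj):
        # Serialise to JSON, encode, and append a signature after a dot.
        payload = base64.urlsafe_b64encode(json.dumps(obj).encode('utf-8'))
        return payload + b'.' + self._sign(payload)

    def loads(self, token):
        # Takes the token STRING; returns the decoded Python object.
        payload, _, signature = token.encode('utf-8').rpartition(b'.')
        if not hmac.compare_digest(signature, self._sign(payload)):
            raise ValueError('bad signature')
        return json.loads(base64.urlsafe_b64decode(payload).decode('utf-8'))


s = ToySerializer('not-a-real-secret')
token = s.dumps({'user_id': 1}).decode('utf-8')  # one opaque string
data = s.loads(token)                            # a dict again
user_id = data['user_id']                        # same as s.loads(token)['user_id']
print(type(data), user_id)
```

So token itself is never a dict; token['user_id'] would fail because you cannot index a string with a key. The dict only exists after loads() has run.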
I want to filter the JSON down to the entries whose operatingSystem is linux, and I have a problem with it. I don't know how a dictionary represents this part of the JSON:
'' : {
and I don't know how to match this part with a wildcard:
"DQ578CGN99KG6ECF" : {
Could anyone help me, please?
import json
import urllib2
response=urllib2.urlopen('https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json')
url=response.read()
urlj=json.loads(url)
filterx=[x for x in urlj if x['??']['??']["attributes"]["operatingSystem"] == 'linux']
I'm not sure about the wildcard representation. I'll look into it and get back to you. Meanwhile, I have already worked with this json before so I can tell you how to access the information you need.
The information you need can be obtained as follows:
for each_product in urlj['products']:
    if urlj['products'][each_product]['attributes']['operatingSystem'] == "linux":
        pass  # your code here
If you need pricing information from the json you need to take the product code string and look into the priceDimensions field for it. Look at the sample json and code accordingly.
https://aws.amazon.com/blogs/aws/new-aws-price-list-api/
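Here is a small sketch of both steps on a tiny hand-made dict. The sample data, the SKU names and the two helper functions are made up for illustration; the nesting (products by SKU, then terms, OnDemand, SKU, term code, priceDimensions, rate code, pricePerUnit) is how I remember the offer file being laid out, so check it against the real JSON. Iterating over the dict takes the place of a wildcard, since every SKU key is visited without naming it:

```python
# Abbreviated, made-up sample that mimics the shape of the offer file.
sample = {
    "products": {
        "SKU_A": {"sku": "SKU_A", "attributes": {"operatingSystem": "Linux"}},
        "SKU_B": {"sku": "SKU_B", "attributes": {"operatingSystem": "Windows"}},
        "SKU_C": {"sku": "SKU_C", "attributes": {}},  # no operatingSystem at all
    },
    "terms": {
        "OnDemand": {
            "SKU_A": {
                "SKU_A.TERM1": {
                    "priceDimensions": {
                        "SKU_A.TERM1.RATE1": {"pricePerUnit": {"USD": "0.0100"}}
                    }
                }
            }
        }
    },
}


def linux_skus(offer):
    # Case-insensitive match; .get() avoids a KeyError for products
    # that have no operatingSystem attribute.
    return [
        sku
        for sku, product in offer["products"].items()
        if product.get("attributes", {}).get("operatingSystem", "").lower() == "linux"
    ]


def on_demand_prices(offer, sku):
    # Walk every term and every price dimension below the given SKU.
    prices = []
    for term in offer["terms"]["OnDemand"].get(sku, {}).values():
        for dimension in term["priceDimensions"].values():
            prices.append(dimension["pricePerUnit"]["USD"])
    return prices


for sku in linux_skus(sample):
    print(sku, on_demand_prices(sample, sku))
```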
Some background first: I have a few rather simple data structures which are persisted as json files on disk. These json files are shared between applications of different languages and different environments (like web frontend and data manipulation tools).
For each of the files I want to create a Python "POPO" (Plain Old Python Object), and a corresponding data mapper class for each item should implement some simple CRUD like behavior (e.g. save will serialize the class and store as json file on disk).
I think a simple mapper (which only knows about basic types) will work. However, I'm concerned about security. Some of the JSON files will be generated by a web frontend, so there is a possible security risk if a user feeds me some bad JSON.
Finally, here is the simple mapping code (found at How to convert JSON data into a Python object):
class User(object):
    def __init__(self, name, username):
        self.name = name
        self.username = username
import json
j = json.loads(your_json)
u = User(**j)
What possible security issues do you see?
NB: I'm new to Python.
Edit: Thanks all for your comments. I've found out that I have one json where I have 2 arrays, each having a map. Unfortunately this starts to look like it gets cumbersome when I get more of these.
I'm extending the question to mapping a json input to a recordtype. The original code is from here: https://stackoverflow.com/a/15882054/1708349.
Since I need mutable objects, I'd change it to use a namedlist instead of a namedtuple:
import json
from namedlist import namedlist
data = '{"name": "John Smith", "hometown": {"name": "New York", "id": 123}}'
# Parse JSON into an object with attributes corresponding to dict keys.
x = json.loads(data, object_hook=lambda d: namedlist('X', d.keys())(*d.values()))
print x.name, x.hometown.name, x.hometown.id
Is it still safe?
There's not much wrong that can happen in the first case. You're limiting what arguments can be provided and it's easy to add validation/conversion right after loading from JSON.
The second example is a bit worse. Packing things into records like this will not help you in any way. You don't inherit any methods, because each type you define is new. You can't compare values easily, because dicts are not ordered. You don't know if you have all arguments handled, or if there is some extra data, which can lead to hidden problems later.
So in summary: with User(**data), you're pretty safe. With namedlist there's space for ambiguity and you don't really gain anything. (compared to bare, parsed json)
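As a sketch of what "validation right after loading" could look like for the User case: user_from_json is a hypothetical helper, and the rule that both fields must be strings is my assumption about what the application expects.

```python
import json


class User(object):
    def __init__(self, name, username):
        self.name = name
        self.username = username


def user_from_json(text):
    # Hypothetical helper: parse, validate, then construct.
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError('expected a JSON object')
    expected = {'name', 'username'}
    if set(data) != expected:
        # Reject both missing and unexpected keys explicitly.
        raise ValueError('bad keys: %r' % sorted(set(data) ^ expected))
    for key in expected:
        if not isinstance(data[key], str):
            raise ValueError('%s must be a string' % key)
    return User(**data)


u = user_from_json('{"name": "John Smith", "username": "jsmith"}')
print(u.name, u.username)
```

Without the explicit key check, User(**data) would still raise a TypeError on extra or missing keys, but the error message would be less helpful and wrong value types would pass through unnoticed.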
If you blindly accept a user's JSON input without a sanity check, you are at risk of becoming a victim of JSON injection.
See a detailed explanation of the JSON injection attack here: https://www.acunetix.com/blog/web-security-zone/what-are-json-injections/
Besides the security vulnerability, parsing JSON into a Python object this way is not type safe.
With your example User class, I assume you expect both fields, name and username, to be strings. What if the JSON input looks like this:
{
"name": "my name",
"username": 1
}
j = json.loads(your_json)
u = User(**j)
type(u.username) # int
You now have an object with an unexpected type.
One way to ensure type safety is to validate the input JSON against a JSON Schema. More about JSON Schema: https://json-schema.org/
I've written some code that converts a JSON object to an iCalendar (.ics) object and now I am trying to test it. The problem is that I can't figure out how to create a generic JSON object to use as the parameter. Some of my attempts are as follows:
# 1
obj_json = u'sample json data in string form'
obj = json.loads(obj_json)
# 2
# I'm not sure about this very first line. My supervisor told me to put it in but he
# has a very heavy accent so I definitely could have heard him incorrectly.
input.json
with open('input.json') as f:
    obj = json.loads(f.read())
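Try:
import json
# Note: a leading zero (0123) is an octal literal in Python 2 and a syntax error in Python 3, so a plain 123 is used here.
some_dict = {'id': 123, 'text': 'A dummy text'}
dummy_json = json.dumps(some_dict)
Now feed your dummy JSON to your function, i.e.
'{"id": 123, "text": "A dummy text"}'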
You can do dumps with a string object too.
See pnv's answer, but you probably don't need to dump it. Just use a dictionary, as pnv did, and pass that into whatever you need to. Unless you are about to pass your json object over the wire to something, I don't know why you'd want to dump it.
I would've added this as a comment, but no rep, yet. :)
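A quick sketch of why the dump step is usually optional in a test: for plain dicts of strings and numbers, a round trip through a JSON string gives back an equal dict, so the dict can be handed over directly. The sample data is made up.

```python
import json

some_dict = {'id': 123, 'text': 'A dummy text'}

# Route 1: go through a JSON string, as with data read from a file or socket.
from_string = json.loads(json.dumps(some_dict))

# Route 2: skip the string and hand the dict to the code under test.
direct = some_dict

print(from_string == direct)
```

The round trip is not always an identity. For example, tuples come back as lists and non-string keys come back as strings, so going through dumps and loads is still worthwhile when the test data contains those.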
I am caching some JSON data, and in storage it is represented as a JSON-encoded string. No work is performed on the JSON by the server before sending it to the client, other than collation of multiple cached objects, like this:
def get_cached_items():
    item1 = cache.get(1)
    item2 = cache.get(2)
    return json.dumps(dict(item1=item1, item2=item2, msg="123"))
There may be other items included with the return value, in this case represented by msg="123".
The issue is that the cached items are double-escaped. It would behoove the library to allow a pass-through of the string without escaping it.
I have looked at the documentation for json.dumps default argument, as it seems to be the place where one would address this, and searched on google/SO but found no useful results.
It would be unfortunate, from a performance perspective, if I had to decode the JSON of each cached item to send it to the browser. It would be unfortunate from a complexity perspective to not be able to use json.dumps.
My inclination is to write a class that stores the cached string, so that when the default handler encounters an instance of this class it uses the string without escaping it. I have yet to figure out how to achieve this though, and I would be grateful for thoughts and assistance.
EDIT For clarity, here is an example of the proposed default technique:
class RawJSON(object):
    def __init__(self, str):
        self.str = str
class JSONEncoderWithRaw(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, RawJSON):
            return o.str # but avoid call to `encode_basestring` (or ASCII equiv.)
        return super(JSONEncoderWithRaw, self).default(o)
Here is a degenerate example of the above:
>>> class M():
...     str = ''
...
>>> m = M()
>>> m.str = json.dumps(dict(x=123))
>>> json.dumps(dict(a=m), default=lambda o: o.str)
'{"a": "{\\"x\\": 123}"}'
The desired output would include the unescaped string m.str, being:
'{"a": {"x": 123}}'
It would be good if the json module did not encode/escape the return value of the default function, or if that could be avoided. In the absence of a way to do this via the default parameter, one may have to achieve the objective by overriding the encode and iterencode methods of JSONEncoder, which brings challenges in terms of complexity, interoperability, and performance.
A quick-n-dirty way is to patch json.encoder.encode_basestring*() functions:
import json
class RawJson(unicode):
    pass
# patch json.encoder module
for name in ['encode_basestring', 'encode_basestring_ascii']:
    def encode(o, _encode=getattr(json.encoder, name)):
        return o if isinstance(o, RawJson) else _encode(o)
    setattr(json.encoder, name, encode)
print(json.dumps([1, RawJson(u'["abc", 2]'), u'["def", 3]']))
# -> [1, ["abc", 2], "[\"def\", 3]"]
If you are caching JSON strings, you need to first decode them to python structures; there is no way for json.dumps() to distinguish between normal strings and strings that are really JSON-encoded structures:
return json.dumps({'item1': json.loads(item1), 'item2': json.loads(item2), 'msg': "123"})
Unfortunately, there is no option to include already-converted JSON data in this; the default function is expected to return Python values. You extract data from whatever object that is passed in and return a value that can be converted to JSON, not a value that is already JSON itself.
The only other approach I can see is to insert "template" values, then use string replacement techniques to manipulate the JSON output to replace the templates with your actual cached data:
json_data = json.dumps({'item1': '==item1==', 'item2': '==item2==', 'msg': "123"})
return json_data.replace('"==item1=="', item1).replace('"==item2=="', item2)
A third option is to cache item1 and item2 in non-serialized form, as a Python structure instead of a JSON string.
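The first two approaches can be compared side by side in a self-contained sketch. The values of item1 and item2 are stand-ins for whatever the cache returns:

```python
import json

# Stand-ins for the cached values: JSON strings, not Python objects.
item1 = '{"x": 123}'
item2 = '["abc", 2]'

# Option 1: decode the cached strings, then encode everything once.
decoded = json.dumps({
    'item1': json.loads(item1),
    'item2': json.loads(item2),
    'msg': '123',
})

# Option 2: encode placeholders, then splice the cached JSON in as text.
template = json.dumps({'item1': '==item1==', 'item2': '==item2==', 'msg': '123'})
spliced = template.replace('"==item1=="', item1).replace('"==item2=="', item2)

print(spliced)
print(json.loads(decoded) == json.loads(spliced))
```

The splicing variant avoids the decode step, but it relies on the placeholder text never occurring in the real data and on the cached strings being valid JSON, since nothing checks them before they reach the client.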
You can use the better-maintained simplejson instead of json; it provides this functionality.
import simplejson as json
from simplejson.encoder import RawJSON
print(json.dumps([1, RawJSON(u'["abc", 2]'), u'["def", 3]']))
# -> [1, ["abc", 2], "[\"def\", 3]"]
You get simplicity of code, plus all the C optimisations of simplejson.