Decoding json data to Python dictionary - python

I am currently trying to create a dictionary from a json formatted server response:
{"id": null,{"version": "1.1","result": "9QtirjtH9b","error": null}}
Therefore I am using json.loads(). But I always get the following error:
ValueError: Expecting property name: line 1 column 12 (char 12)
I know that this means that there is an error in the json syntax and I found some threads (like this one) here at stackoverflow, but they did not include an answer that solved my problem.
However, I was not sure if the null value within the json response causes the error, so I had a closer look at the json.org Reference Manual and it seems to be a valid syntax. Any ideas?

It's not valid. The outer object needs a property name for the second element; raw values are not valid in an object.
{"id": null, "somename":{"version": "1.1","result": "9QtirjtH9b","error": null}}

The problem here is the lack of a key for the nested object, not the null. You'd need to find a way to fix that syntax or parse it yourself.
If we make a few assumptions about the syntax, you should be able to use a regular expression to fix the JSON data before decoding:
import re
from itertools import count
def _gen_id(match, count=count()):
return '{1}"generated_id_{0}":{2}'.format(next(count), *match.groups())
_no_key = re.compile(r'(,)({)')
def fix_json(json_data):
return _no_key.sub(_gen_id, json_data)
This assumes that any ,{ combo indicates the location of a missing key, and generates one to insert there. That is a reasonable assumption to make, but may break things if you have string data with exactly that sequence.
Demo:
>>> json_data = '{"id": null,{"version": "1.1","result": "9QtirjtH9b","error": null}}'
>>> fix_json(json_data)
'{"id": null,"generated_id_0":{"version": "1.1","result": "9QtirjtH9b","error": null}}'
>>> json.loads(fix_json(json_data))
{u'id': None, u'generated_id_1': {u'version': u'1.1', u'result': u'9QtirjtH9b', u'error': None}}

Related

JSON Parsing with python from Rethink database [Python]

Im trying to retrieve data from a database named RethinkDB, they output JSON when called with r.db("Databasename").table("tablename").insert([{ "id or primary key": line}]).run(), when doing so it outputs [{'id': 'ValueInRowOfid\n'}] and I want to parse that to just the value eg. "ValueInRowOfid". Ive tried with JSON in Python, but I always end up with the typeerror: list indices must be integers or slices, not str, and Ive been told that it is because the Database outputs invalid JSON format. My question is how can a JSON format be invalid (I cant see what is invalid with the output) and also what would be the best way to parse it so that the value "ValueInRowOfid" is left in a Operator eg. Value = ("ValueInRowOfid").
This part imports the modules used and connects to RethinkDB:
import json
from rethinkdb import RethinkDB
r = RethinkDB()
r.connect( "localhost", 28015).repl()
This part is getting the output/value and my trial at parsing it:
getvalue = r.db("Databasename").table("tablename").sample(1).run() # gets a single row/value from the table
print(getvalue) # If I print that, it will show as [{'id': 'ValueInRowOfid\n'}]
dumper = json.dumps(getvalue) # I cant use `json.loads(dumper)` as JSON object must be str. Which the output of the database isnt (The output is a list)
parsevalue = json.loads(dumper) # After `json.dumps(getvalue)` I can now load it, but I cant use the loaded JSON.
print(parsevalue["id"]) # When doing this it now says that the list is a str and it needs to be an integers or slices. Quite frustrating for me as it is opposing it self eg. It first wants str and now it cant use str
print(parsevalue{'id'}) # I also tried to shuffle it around as seen here, but still the same result
I know this is janky and is very hard to comprehend this level of stupidity that I might be on. As I dont know if it is the most simple problem or something that just isnt possible (Which it should or else I cant use my data in the database.)
Thank you for reading this through and not jumping straight into the comments and say that I have to read the JSON documentation, because I have and I havent found a single piece that could help me.
I tried reading the documentation and watching tutorials about JSON and JSON parsing. I also looked for others whom have had the same problems as me and couldnt find.
It looks like it's returning a dictionary ({}) inside a list ([]) of one element.
Try:
getvalue = r.db("Databasename").table("tablename").sample(1).run()
print(getvalue[0]['id'])

Parsing unquoted JavaScript object literal as JSON using YAML & JSON modules

I have scraped a JavaScript object which I can't parse as JSON because it has unquoted keys.
I found a solution here which says to load the object as a Python data structure using the PyYaml library, and then write it back out as valid JSON:
https://stackoverflow.com/a/31030022/10601287
This would be a great solution for me, however yaml.load(js_obj) causes the keys & values to merge together as a key, and causes the value to default to 'None'. This is my code snippet:
import yaml
yaml_obj = yaml.safe_load(js_obj)
print(yaml_obj)
Example of the JavaScript Object before loaded as YAML (in reality it is much bigger than this):
{
path:"1/83656/83659/83669/83670",
is_active:!0,
level:4,
children_count:0,
product_count:59,
parent_id:83669,
name:"Red Wine",
position:1,
id:83670,
include_in_menu:1,
url_key:"red-wine-83670",
url_path:"liquor/wine/red-wine.html",
_score:null,
slug:"red-wine-83670"
}
After yaml.load(js_obj):
{
'path:"1/83656/83659/83669/83670"': None,
'is_active:!0': None,
'level:4': None,
'children_count:0': None,
'product_count:59': None,
'parent_id:83669': None,
'name:"Red Wine"': None,
'position:1': None,
'id:83670': None,
'include_in_menu:1': None,
'url_key:"red-wine-83670"': None,
'url_path:"liquor/wine/red-wine.html"': None,
'_score:null': None,
'slug:"red-wine-83670"': None
}
Any advice would be greatly appreciated.
YAML requires the colon in a mapping to be followed by at least one space character, so your input isn't valid YAML either. If the format is as simple as your example indicates, you could preprocess it into YAML by searching for a word at the beginning of a line followed by a colon and inserting a space after the colon. (Or you could insert quotes around the word to make it JSON, but you're going have a problem with is_active:!0,, because !0 isn't a JSON value.)
So you could try something like:
import re
first_word = re.compile(r"^\s*[_a-zA-Z]\w*:")
# ...
yaml_obj = yaml.load(first_word.replace(r"\g<0> ", js_obj))
Of course, if the input is less regular, that could fail horribly.

What kind of data structure is this? Python

Studying Python, I am following an excellent Corey Schafer tutorial on Flask, he does this (I have extracted and summarized it for obvious reasons):
from folder_app import app # I did it to follow the structure and that the code is equal to the original
s = Serializer(app.config['SECRET_KEY'], 1800) # key, seconds
token = s.dumps({'user_id': 1}).decode('utf-8')
s = Serializer(app.config['SECRET_KEY'])
user_id = s.loads(token)['user_id'] # This is where I have the doubt
print(user_id)
print(type(s.loads(token)))
The code works, the problem I have is that although as you can see (s.loads (token)) is a dict, I expected to see something like this s.loads ({token ['user_id']}), or s.loads (token ['user_id']) or something like that. That is, it is a dict but it does not seem so. And my doubt goes in the sense if this comes from a greater concept of those they call "pythonic" (which I have not seen so far), or is something that only happens particularly as in this case. Incidentally, https://itsdangerous.palletsprojects.com/en/1.1.x/jws/ this appears: loads (self, s, salt = None, return_header = False) the arguments are in parentheses. I hope it is clear what my doubt is :)
I know this is not answer per say but just to add to my comment. This is an example of how the loads function works on dictionaries with the json module. https://docs.python.org/3/library/json.html#json.loads. What it does is take a json string and return the dictionary type object in Python. Your Serializer is doing something similar. It takes the token string and represents it as an object like dict
The s.dumps I am assuming is similar to json.dumps which gives you the json string representation of python dictionary.
import json
my_dict = json.loads('{"user_id": "Mane", "name": "Joe"}')
my_dict['user_id']
So you could just do json.loads('{"user_id": "Mane", "name": "Joe"}')['user_id'] which is just chaining the operations.

Expecting value: line 1 column 1 (char 0) python

I'm newbie in python and trying to parse data in my application using these lines of codes
json_str = request.body.decode('utf-8')
py_str = json.loads(json_str)
But I'm getting this error on json.loads
Expecting value: line 1 column 1 (char 0)
this is json formatted data that I send from angular app (Updated)
Object { ClientTypeId: 6, ClientName: "asdasd", ClientId: 0, PhoneNo: "123", FaxNo: "123", NTN: "1238", GSTNumber: "1982", OfficialAddress: "sads", MailingAddress: "asdasd", RegStartDate: "17-Aug-2016", 15 moreā€¦ }
these are the values that I get in json_str
ClientTypeId=5&ClientName=asdasd&ClientId=0&PhoneNo=123&FaxNo=123&NTN=123&GSTNumber=12&OfficialAddress=adkjh&MailingAddress=adjh&RegStartDate=09-Aug-2016&RegEndDate=16-Aug-2016&Status=1&CanCreateUser=true&UserQuotaFor=11&UserQuotaType=9&MaxUsers=132123&ApplyUserCharges=true&ApplyReportCharges=true&EmailInvoice=true&BillingType=1&UserCharges=132&ReportCharges=123&MonthlyCharges=123&BillingDate=16-Aug-2016&UserSessionId=324
I don't know what's wrong in it.. can anyone mention what's the mistake is??
Your data is not JSON-formatted, not even the one you included in your updated answer. Your data is a JavaScript-object, not an encoded string. Please note the "N" in JSON: Notation -- it is a format inspired from how data is written in JavaScript code, but runtime JavaScript data is not represented in JSON. The "JSON" you pasted is how your browser represents the object to you, it is not proper JSON (that would be {"ClientTypeId": 6, ...} -- note the quotes around the property name).
When sending this data to the server, you have to encode it. You think you are sending it JSON-encoded, but you aren't. You are sending it "web form encoded" (data of type application/x-www-form-urlencoded).
Now either you have to learn how to send the data in JSON format from Angular, or use the correct parsing routine in Python: urllib.parse.parse_qs. Depending on the library you are using, there might be a convenience method to access the data as well, as this is a common use case.

urllib's urlencode returning weird encoded results

I'm trying to use Facebook's REST api, and am encoding a JSON string/dictionary using urllib.urlencode. The result I get however, is different from the correct encoded result (as displayed by pasting the dictionary in the attachment field here http://developers.facebook.com/docs/reference/rest/stream.publish/). I was wondering if anyone could offer any help.
Thanks.
EDIT:
I'm trying to encode the following dictionary:
{"media": [{"type":"flash", "swfsrc":"http://shopperspoll.webfactional.com/media/flashFile.swf", "height": '100', "width": '100', "expanded_width":"160", "expanded_height":"120", "imgsrc":"http://shopperspoll.webfactional.com/media/laptop1.jpg"}]}
This is the encoded string using urllib.urlencode:
"media=%5B%7B%27swfsrc%27%3A+%27http%3A%2F%2Fshopperspoll.webfactional.com%2Fmedia%2FflashFile.swf%27%2C+%27height%27%3A+%27100%27%2C+%27width%27%3A+%27100%27%2C+%27expanded_width%27%3A+%27160%27%2C+%27imgsrc%27%3A+%27http%3A%2F%2Fshopperspoll.webfactional.com%2Fmedia%2Flaptop1.jpg%27%2C+%27expanded_height%27%3A+%27120%27%2C+%27type%27%3A+%27flash%27%7D%5D"
It's not letting me copy the result being thrown out from the facebook rest documentation link, but on copying the above dictionary in the attachment field, the result is different.
urllib.encode isn't meant for urlencoding a single value (as functions of the same name are in many languages), but for encoding a dict of separate values. For example, if I had the dict {"a": 1, "b": 2} it would produce the string "a=1&b=2".
First, you want to encode your dict as JSON.
data = {"media": [{"type":"flash", "swfsrc":"http://shopperspoll.webfactional.com/media/flashFile.swf", "height": '100', "width": '100', "expanded_width":"160", "expanded_height":"120", "imgsrc":"http://shopperspoll.webfactional.com/media/laptop1.jpg"}]}
import json
json_encoded = json.dumps(data)
You can then either use urllib.encode to create a complete query string
import urllib
urllib.encode({"access_token": example, "attachment": json_encoded})
# produces a long string in the form "access_token=...&attachment=..."
or use urllib.quote to just encode your attachment parameter
urllib.quote(json_encoded)
# produces just the part following "&attachment="

Categories

Resources