How to load unicode json in Python? - python

I've got some simple json formatted in unicode which I want to load using the usual python json.loads():
>>> er.rates
u"{u'sell': u'1.3477', u'buy': u'1.3588'}"
>>> json.loads(er.rates)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 381, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
So I tried using ensure_ascii=False:
>>> json.loads(er.rates, ensure_ascii=False)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 351, in loads
return cls(encoding=encoding, **kw).decode(s)
TypeError: __init__() got an unexpected keyword argument 'ensure_ascii'
Does anybody know how I can load this unicode json?

That is not json. It is a string representation of a python dict, which is something quite different.
You can use ast.literal_eval to load it.

Many API data providers return unicode string which is easily rendered in a browser. Unicode string (even when it looks like json) and json are not the same thing from a 'computer' perspective.
If you have a unicode (json-like) string you should be able to use json.loads(<your unicode json like string>)

Related

Python: ValueError: Expecting , delimiter: line 1 when loading json

sample_json_obj = '{"foo":{"IdentityDocuments":[{"bar":"{\"baz\":\"qux\"}"}]}}'
This is syntactically accurate for a JSON string.
However when I try to load into a variable for further processing,
json.loads(sample_json_obj)
I get the following stacktrace
Traceback (most recent call last):
File "test.py", line 5, in <module>
json.loads(sample_json_obj)
File "/usr/local/Cellar/python#2/2.7.17_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "/usr/local/Cellar/python#2/2.7.17_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/Cellar/python#2/2.7.17_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 380, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 1 column 40 (char 39)
I read through a few StackOverflow posts trying to fix this and one of the popular ones suggest mutating the string to the following format
sample_json_obj = r'{"foo":{"IdentityDocuments":[{"bar":"{\"baz\":\"qux\"}"}]}}'
which works as expected but my only problem is that since I am getting this JSON object directly from a query and I am not able to add the r in front of it i.e convert it to a raw string although I have tried several other stack posts including trying formatting, none worked.
Can someone please help me?
Would be great if someone could help me without modifying the first line i.e
sample_json_obj = '{"foo":{"IdentityDocuments":[{"bar":"{\"baz\":\"qux\"}"}]}}'

What's the most Pythonic way to parse out this value from a JSON-like blob?

See below. Given a well-known Google URL, I'm trying to retrieve data from that URL. That data will provide me another Google URL from which I can retrieve a list of JWKs.
>>> import requests, json
>>> open_id_config_url = 'https://ggp.sandbox.google.com/.well-known/openid-configuration'
>>> response = requests.get(open_id_config_url)
>>> r.status_code
200
>>> response.text
u'{\n "issuer": "https://www.stadia.com",\n "jwks_uri": "https://www.googleapis.com/service_accounts/v1/jwk/stadia-jwt#system.gserviceaccount.com",\n "claims_supported": [\n "iss",\n "aud",\n "sub",\n "iat",\n "exp",\n "s_env",\n "s_app_id",\n "s_gamer_tag",\n "s_purchase_country",\n "s_current_country",\n "s_session_id",\n "s_instance_ip",\n "s_restrict_text_chat",\n "s_restrict_voice_chat",\n "s_restrict_multiplayer",\n "s_restrict_stream_connect",\n ],\n "id_token_signing_alg_values_supported": [\n "RS256"\n ],\n}'
Above I have successfully retrieved the data from the first URL. I can see the entry jwks_uri contains the second URL I need. But when I try to convert that blob of text to a python dictionary, it fails.
>>> response.json()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/saqib.ali/saqib-env-99/lib/python2.7/site-packages/requests/models.py", line 889, in json
self.content.decode(encoding), **kwargs
File "/usr/local/Cellar/python#2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "/usr/local/Cellar/python#2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/Cellar/python#2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
>>> json.loads(response.text)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/Cellar/python#2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "/usr/local/Cellar/python#2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/Cellar/python#2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
The only way I can get out the JWKs URL is by doing this ugly regular expression parsing:
>>> re.compile('(?<="jwks_uri": ")[^"]+').findall(response.text)[0]
u'https://www.googleapis.com/service_accounts/v1/jwk/stadia-jwt#system.gserviceaccount.com'
Is there a cleaner, more Pythonic way to extract this string?
I really wish Google would send back a string that could be cleanly JSON-ified.
The returned json string is incorrect because last item of the dictionary ends with ,, which json cannot parse.
": [\n "RS256"\n ],\n}'
^^^
But ast.literal_eval can do that (as python parsing accepts lists/dicts that end with a comma). As long as you don't have booleans or null values, it is possible and pythonic
>>> ast.literal_eval(response.text)["jwks_uri"]
'https://www.googleapis.com/service_accounts/v1/jwk/stadia-jwt#system.gserviceaccount.com'
Your JSON is invalid because it has an extra comma after the last value in the claims_supported array.
I wouldn't necessarily recommend it, but you could use the similarity of JSON and Python syntax to parse this directly, since Python is much less picky:
ast.literal_eval(response.tezt)
As suggested in this answer use yaml to parse json. It will tolerate the trailing comma as well as other deviations from the json standard.
import yaml
d = yaml.load(response.text)

Unable to parse json string using python load or loads

I am a beginner with python. I am reading from an aerospike db, like this -
(key, metadata, record) = client.get(key)
print('aero ')
aerojson = json.load(record)
print(record)
Output is -
{'expiresIn': 1535873246092}
I am trying to parse the result set (so as to read expiresIn attribute), but am getting following error -
Traceback (most recent call last):
File "Sandeepan-oauth_token_cache_random_sanity.py", line 29, in <module>
aerojson = json.load(record)
File "/root/miniconda2/lib/python2.7/json/__init__.py", line 287, in load
return loads(fp.read(),
AttributeError: 'dict' object has no attribute 'read'
If I change to json.loads(), I get -
Traceback (most recent call last):
File "Sandeepan-oauth_token_cache_random_sanity.py", line 29, in <module>
aerojson = json.loads(record)
File "/root/miniconda2/lib/python2.7/json/__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "/root/miniconda2/lib/python2.7/json/decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer
Please suggest proper documentation with examples.

Python - Having trouble creating a dictionary from a JSON

I'm trying to take the JSON from a twitter get_user query and turn it into a Python object that I can extract data from (twitter handle, location, screen name, etc.)
Here is what I created. I am not sure why it doesn't work.
api = tweepy.API(auth,parser=tweepy.parsers.JSONParser())
user = api.search_users('google.com')
t_dict = json.loads(user)
pprint(t_dict)
Error:
Traceback (most recent call last):
File "Get_User_By_URL.py", line 23, in <module>
t_dict = json.loads(user)
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer
api.search_users is already returning a python object. It isn't a json string that needs to be parsed. According to tweetpy documentation search_users actually returns a list of users. So the following is possible:
for user in api.search_users('google.com'):
print user.screen_name

Decoding JSON with Python

Why do I get
ValueError: No JSON object could be decoded
from this code:
import urllib.request,json
n = urllib.request.urlopen("http://graph.facebook.com/55")
d = json.loads(str(n.readall()))
The full error:
Traceback (most recent call last):
File "<pyshell#41>", line 1, in <module>
d= json.loads(str(n.readall()))
File "C:\Python33\lib\json\__init__.py", line 309, in loads
return _default_decoder.decode(s)
File "C:\Python33\lib\json\decoder.py", line 352, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Python33\lib\json\decoder.py", line 370, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
The output of str(n.readall()):
'b\'{"id":"55","name":"Jillian Copeland","first_name":"Jillian","last_name":"Copeland","username":"JCoMD","gender":"female","locale":"en_US"}\''
Maybe the b is throwing it off?
If that is the issue, how do I convert the binary stream from the readall to a string and not have that b?
I am trying to learn a little python so please keep that in mind.
I am using Python 3.3 in Windows.
I believe that this is an exact duplicate of this question, but sadly there's no accepted answer.
On my end, this works:
import urllib.request,json
n = urllib.request.urlopen("http://graph.facebook.com/55")
d= json.loads(n.readall().decode('utf-8'))

Categories

Resources