Loading JSON with json.loads in Python gives JSONDecodeError because of &quot; - python

I am trying to turn JavaScript from a website into a JSON structure with Python's json.loads(), but it raises a JSONDecodeError. The JavaScript contains objects that are quoted, and when json.loads() runs, it turns &quot; into a " (double quote), which produces bad JSON.
This is a very small example of the JavaScript:
{"key":{"hascookie":"yes"}, "cookiestatus":234, "widget":null, "player":"{&quot;source&quot;:true,&quot;country&quot;}"}
There is a lot of JSON, and it's minified.
I am loading it like this:
j = 'JSON text'
result = json.loads(j)
Is the solution to prevent the loads() function from unquoting the JSON, so that the &quot; is left as is?

Here is a possible solution (if I understand the problem correctly...):
import json
s = '{&quot;source&quot;:true,&quot;country&quot;:&quot;GB&quot;}'
class MyDecoder(json.JSONDecoder):
    def decode(self, s):
        # Turn the HTML entity back into a real double quote before decoding.
        s = s.replace('&quot;', '"')
        return json.JSONDecoder.decode(self, s)
decoded = json.loads(s, cls=MyDecoder)
print(decoded)
The output is the parsed Python dict: {'source': True, 'country': 'GB'}
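If the quoted objects in the page source are HTML-escaped, meaning the inner quotes appear as &quot; entities (an assumption about the page, as is the sample string below), another option is to parse the outer document first and unescape only the embedded string afterwards. A minimal sketch using the standard library's html.unescape:

```python
import html
import json

# Hypothetical sample: the inner object is stored as a string whose quotes
# are HTML entities, so the outer document is still valid JSON.
raw = ('{"key":{"hascookie":"yes"},"cookiestatus":234,"widget":null,'
       '"player":"{&quot;source&quot;:true,&quot;country&quot;:&quot;GB&quot;}"}')

outer = json.loads(raw)                        # parse the outer document as-is
player_text = html.unescape(outer["player"])   # turn &quot; into "
player = json.loads(player_text)               # parse the embedded document

print(player["country"])
```

Unescaping the whole text before the first json.loads() call would place bare double quotes inside a JSON string, which is the situation that raises JSONDecodeError.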

Related

Passing a list of identifiers into an API parameter in Python [duplicate]

I have been trying to figure out how to use python-requests to send a request whose URL looks like:
http://example.com/api/add.json?name='hello'&data[]='hello'&data[]='world'
Normally I can build a dictionary and do:
data = {'name': 'hello', 'data': 'world'}
response = requests.get('http://example.com/api/add.json', params=data)
That works fine for almost everything I do. However, I have hit the URL structure above, and I am not sure how to produce it in Python without building strings manually. I can do that, but would rather not.
Is there something in the requests library I am missing, or some Python feature I am unaware of?
Also, what do you call that type of parameter, so I can search for it more effectively?
All you need to do is put the values in a list and use a key that ends in []:
data = {'name': 'hello', 'data[]': ['hello', 'world']}
response = requests.get('http://example.com/api/add.json', params=data)
What you are doing is already correct. The resulting URL is the same as the one you expect.
>>> payload = {'name': 'hello', 'data': 'hello'}
>>> r = requests.get("http://example.com/api/params", params=payload)
You can see the resulting URL:
>>> print(r.url)
http://example.com/api/params?name=hello&data=hello
According to the URL format, encoding the query string uses the following rules:
Letters (A–Z and a–z), digits (0–9) and the characters ., -, ~ and _ are left as-is
SPACE is encoded as + or %20
All other characters are encoded as a %HH hex representation, with any non-ASCII characters first encoded as UTF-8 (or another specified encoding)
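These rules can be checked with the standard library. The sketch below uses urllib.parse.quote and quote_plus from Python 3; the sample strings are only illustrations:

```python
from urllib.parse import quote, quote_plus

encoded_key = quote("data[]")        # square brackets are percent-encoded
space_as_hex = quote("a b")          # quote() encodes a space as %20
space_as_plus = quote_plus("a b")    # quote_plus() encodes a space as +
non_ascii = quote("é")               # UTF-8 bytes first, then %HH for each byte

print(encoded_key, space_as_hex, space_as_plus, non_ascii)
```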
So data[] will not appear as written; it is percent-encoded according to these rules.
If you try to build a URL like:
http://example.com/api/add.json?name='hello'&data[]='hello'&data[]='world'
the output will be:
>>> payload = {'name': 'hello', "data[]": 'hello','data[]':'world'}
>>> r = requests.get("http://example.com/api/params", params=payload)
>>> r.url
u'http://example.com/api/params?data%5B%5D=world&name=hello'
This is because a duplicated dictionary key keeps only its last value, and data[] is encoded as data%5B%5D.
If data%5B%5D is not a problem (that is, if the server is able to parse it correctly), then you can go ahead with it.
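The duplicate-key behaviour comes from Python itself rather than from requests, which a short check makes visible. This is only a sketch of the dictionary semantics; no request is sent:

```python
# A dict literal that repeats a key keeps only the last value for that key.
payload = {'name': 'hello', 'data[]': 'hello', 'data[]': 'world'}

key_count = len(payload)
last_value = payload['data[]']
print(key_count, last_value)

# To keep both values, store them under one key as a list instead.
as_list = {'name': 'hello', 'data[]': ['hello', 'world']}
```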
One solution, if using the requests module is not compulsory, is the urllib/urllib2 combination (Python 2):
import urllib
import urllib2
payload = [('name', 'hello'), ('data[]', ('hello', 'world'))]
params = urllib.urlencode(payload, doseq=True)
sampleRequest = urllib2.Request('http://example.com/api/add.json?' + params)
response = urllib2.urlopen(sampleRequest)
It's a little more verbose and uses the doseq(uence) trick to encode the URL parameters, but I used it before I knew about the requests module.
For the requests module, the answer provided by @Tomer should work.
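In Python 3 the same doseq trick lives in urllib.parse. A minimal sketch that only builds the URL and does not send anything:

```python
from urllib.parse import parse_qs, urlencode

payload = [('name', 'hello'), ('data[]', ('hello', 'world'))]

# doseq=True expands a sequence value into one key=value pair per element.
query = urlencode(payload, doseq=True)
url = 'http://example.com/api/add.json?' + query
print(url)

# Decoding the query string again recovers both values under the same key.
decoded = parse_qs(query)
```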
Some API servers expect a JSON array as a value in the URL query string. The params argument of requests doesn't create a JSON array as a parameter value.
The way I fixed this for a similar problem was to use urllib.parse.urlencode to encode the query string, append it to the URL, and pass the result to requests,
e.g.
from urllib.parse import urlencode
query_str = urlencode(params)
url = "?" + query_str
response = requests.get(url, params={}, headers=headers)
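To make the JSON-array case concrete, the value can be serialized with json.dumps before it is URL-encoded. The parameter name ids below is hypothetical, and whether a given server accepts this shape is an assumption:

```python
import json
from urllib.parse import parse_qs, urlencode

# "ids" is a made-up parameter that the server is assumed to read as a JSON array.
params = {'name': 'hello', 'ids': json.dumps([1, 2, 3])}
query_str = urlencode(params)
print(query_str)

# What a server would see after decoding the query string.
received = parse_qs(query_str)
ids = json.loads(received['ids'][0])
```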
The solution is simply to use the well-known urlencode function:
>>> import urllib.parse
>>> params = {'q': 'Python URL encoding', 'as_sitesearch': 'www.urlencoder.io'}
>>> urllib.parse.urlencode(params)
'q=Python+URL+encoding&as_sitesearch=www.urlencoder.io'

Convert a string from json to Python: Problem

I am trying to convert a JSON string, which I got from a website, to a Python string, but it doesn't work.
import json
y = json.loads({"title":null,"icon":null,"iphoneURL":null,"splitURL":null,"splitPercent":null,"expiresAt":null,"expiredURL":null,"clicksLimit":null,"source":"api","integrationGA":null,"integrationFB":null,"integrationAdroll":null,"integrationGTM":null,"id":,"originalURL":"http://instagramm.com","DomainId":,"archived":false,"path":"","cloaking":null,"redirectType":null,"createdAt":"2021-08-03T17:02:20.935Z","OwnerId":,"updatedAt":"2021-08-03T17:02:20.935Z","secureShortURL":"","idString":"lnk_X1P_RGeQY","shortURL":"","duplicate":false})
print(y["originalURL"])
As far as I can tell, I am missing a ' at the front and back of the JSON string, but I have no idea how to add them.
Sorry for my bad English and Python skills.
Edit: I've tried adding ' with a comma, with + and with .join('"""')
Edit2:
import requests
url = "https://api.short.io/links/xxxxxx"
wurl = "youtube.com"
response = requests.request("POST", url, json={"allowDuplicates": False, "domain": "1cr0.short.gy", "originalURL": wurl}, headers={"Accept": "application/json", "Content-Type": "application/json", "Authorization": "xxxxxxxxxxxxxxxxxxxxxx"})
print(response.text)
You need to make the JSON string an actual string. Try putting single quotes around it, like so:
import json
y = json.loads('{"title":null,"icon":null,"iphoneURL":null,"splitURL":null,"splitPercent":null,"expiresAt":null,"expiredURL":null,"clicksLimit":null,"source":"api","integrationGA":null,"integrationFB":null,"integrationAdroll":null,"integrationGTM":null,"id":793212684,"originalURL":"http://instagramm.com","DomainId":226909,"archived":false,"path":"5n316HkujOzP","cloaking":null,"redirectType":null,"createdAt":"2021-08-03T17:02:20.935Z","OwnerId":221852,"updatedAt":"2021-08-03T17:02:20.935Z","secureShortURL":"https://1cr0.short.gy/5n316HkujOzP","idString":"lnk_X1P_RGeQY","shortURL":"https://1cr0.short.gy/5n316HkujOzP","duplicate":false}')
print(y["originalURL"])
You forgot the quotes around the value assigned to y that make it a string. I recommend an IDE, so it can point out things like this.
y = json.loads('{"title":null,"icon":null,"iphoneURL":null,"splitURL":null,"splitPercent":null,"expiresAt":null,"expiredURL":null,"clicksLimit":null,"source":"api","integrationGA":null,"integrationFB":null,"integrationAdroll":null,"integrationGTM":null,"id":793212684,"originalURL":"http://instagramm.com","DomainId":226909,"archived":false,"path":"5n316HkujOzP","cloaking":null,"redirectType":null,"createdAt":"2021-08-03T17:02:20.935Z","OwnerId":221852,"updatedAt":"2021-08-03T17:02:20.935Z","secureShortURL":"https://1cr0.short.gy/5n316HkujOzP","idString":"lnk_X1P_RGeQY","shortURL":"https://1cr0.short.gy/5n316HkujOzP","duplicate":false}')
Please pass your json data in a string.
Try:
import json
y = json.loads('{"title":null,"icon":null,"iphoneURL":null,"splitURL":null,"splitPercent":null,"expiresAt":null,"expiredURL":null,"clicksLimit":null,"source":"api","integrationGA":null,"integrationFB":null,"integrationAdroll":null,"integrationGTM":null,"id":793212684,"originalURL":"http://instagramm.com","DomainId":226909,"archived":false,"path":"5n316HkujOzP","cloaking":null,"redirectType":null,"createdAt":"2021-08-03T17:02:20.935Z","OwnerId":221852,"updatedAt":"2021-08-03T17:02:20.935Z","secureShortURL":"https://1cr0.short.gy/5n316HkujOzP","idString":"lnk_X1P_RGeQY","shortURL":"https://1cr0.short.gy/5n316HkujOzP","duplicate":false}')
print(y["originalURL"])
Output:
http://instagramm.com
You have to pass a JSON string to json.loads(). You are passing the JSON as is (not as a string).
From the Docs:
json.loads(s) - Deserialize s (a str, bytes or bytearray instance containing a JSON document)
import json
y = json.loads('{"title":null,"icon":null,"iphoneURL":null,"splitURL":null,"splitPercent":null,"expiresAt":null,"expiredURL":null,"clicksLimit":null,"source":"api","integrationGA":null,"integrationFB":null,"integrationAdroll":null,"integrationGTM":null,"id":793212684,"originalURL":"http://instagramm.com","DomainId":226909,"archived":false,"path":"5n316HkujOzP","cloaking":null,"redirectType":null,"createdAt":"2021-08-03T17:02:20.935Z","OwnerId":221852,"updatedAt":"2021-08-03T17:02:20.935Z","secureShortURL":"https://1cr0.short.gy/5n316HkujOzP","idString":"lnk_X1P_RGeQY","shortURL":"https://1cr0.short.gy/5n316HkujOzP","duplicate":false}')
print(y["originalURL"])
http://instagramm.com
Try This:
import json
y = '''{"title":null,"icon":null,"iphoneURL":null,"splitURL":null,"splitPercent":null,"expiresAt":null,"expiredURL":null,"clicksLimit":null,"source":"api","integrationGA":null,"integrationFB":null,"integrationAdroll":null,"integrationGTM":null,"id":793212684,"originalURL":"http://instagramm.com","DomainId":226909,"archived":false,"path":"5n316HkujOzP","cloaking":null,"redirectType":null,"createdAt":"2021-08-03T17:02:20.935Z","OwnerId":221852,"updatedAt":"2021-08-03T17:02:20.935Z","secureShortURL":"https://1cr0.short.gy/5n316HkujOzP","idString":"lnk_X1P_RGeQY","shortURL":"https://1cr0.short.gy/5n316HkujOzP","duplicate":false}'''
y = json.loads(y)
print(y["originalURL"])
I just converted the JSON to a string before loading it.
print(response.text)
You can save the value that this code prints into a variable:
var = str(response.text)
y = json.loads(var)
print(y["originalURL"])
This casts the value to a string before storing it in the variable.
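The difference between passing text and passing an already-built object can be seen without any network call. The sketch below uses a shortened, hypothetical version of the payload:

```python
import json

# A str containing a JSON document is what json.loads() expects.
text = '{"originalURL": "http://instagramm.com", "duplicate": false}'
y = json.loads(text)
print(y["originalURL"])

# Passing a dict (or any other non-text object) raises TypeError.
try:
    json.loads({"originalURL": "http://instagramm.com"})
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)
```

With requests, response.text is already a str, so the extra str() call is not required, and response.json() parses the body directly.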

python unable to parse JSON Data

I am unable to parse JSON data using Python.
A web page URL returns JSON data:
import requests
import json
BASE_URL = "https://www.codechef.com/api/ratings/all"
data = {'page': page, 'sortBy':'global_rank', 'order':'asc', 'itemsPerPage':'40' }
r = requests.get(BASE_URL, data = data)
receivedData = (r.text)
print ((receivedData))
When I printed this, I got a large block of text, and when I validated it using https://jsonlint.com/ it showed VALID JSON.
Later I used:
import requests
import json
BASE_URL = "https://www.codechef.com/api/ratings/all"
data = {'page': page, 'sortBy':'global_rank', 'order':'asc', 'itemsPerPage':'40' }
r = requests.get(BASE_URL, data = data)
receivedData = (r.text)
print (json.loads(receivedData))
When I validated the large printed text using https://jsonlint.com/ this time, it showed INVALID JSON.
Even if I don't print and directly use the data. It is working properly. So I am sure even internally it is not loading correctly.
Is Python unable to parse the text as JSON properly?
In short, json.loads converts JSON (an object, an array, whatever) into a Python object, in this case a dictionary. When you print that, Python prints its own representation of the dictionary, which uses single quotes.
Effectively your code can be expanded:
some_dictionary = json.loads(a_string_which_is_a_json_object)
print(some_dictionary)
To make sure that what you print is JSON-safe, you would need to re-encode it with json.dumps.
When you use Python's json.loads(text), it returns a Python dictionary. When you print that dictionary, it is not in JSON format.
If you want JSON output, you should use json.dumps(json_object).
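A small round trip shows the difference between the two printed forms. The sample document is hypothetical:

```python
import json

text = '{"rank": 1, "name": "alice", "active": true, "team": null}'
obj = json.loads(text)        # a Python dict

as_repr = str(obj)            # what print(obj) shows: single quotes, True, None
as_json = json.dumps(obj)     # valid JSON again: double quotes, true, null

print(as_repr)
print(as_json)

# The repr form is not JSON, so feeding it back to json.loads() fails.
try:
    json.loads(as_repr)
    repr_is_json = True
except json.JSONDecodeError:
    repr_is_json = False
```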

Python: How to prevent python dictionary from putting quotes around my json?

I am using requests to create a post request on a contractor's API. I have a JSON variable inputJSON that undergoes formatting like so:
def dolayoutCalc(inputJSON):
    inputJSON = ast.literal_eval(inputJSON)
    inputJSON = json.dumps(inputJSON)
    url='http://xxyy.com/API'
    payload = {'Project': inputJSON, 'x':y, 'z':f}
    headers = {'content-type': 'application/json', 'Accept': 'text/plain'}
    r = requests.post(url, data=json.dumps(payload), headers=headers)
My issue arises when I define payload={'Project':inputJSON, 'x':y, 'z':f}
What ends up happening is that Python places a pair of quotes around the inputJSON structure. The API I am hitting is not able to handle this. It needs the Project value to be exactly the inputJSON value, just without the quotes.
What can I do to prevent python from placing quotes around my inputJSON object? Or is there a way to use requests library to handle such POST request situation?
inputJSON gets quotes around it because it's a string. Calling json.dumps() on something always returns a string, and when that string is serialized to JSON again it gets quotes around it, e.g.:
>>> import json
>>> json.dumps('this is a string')
'"this is a string"'
I'm with AKS in that you should be able to remove this line:
inputJSON = json.dumps(inputJSON)
From your description, inputJSON sounds like a Python literal (e.g. {'blah': True} instead of {"blah": true}). So you've used the ast module to convert it into a Python value, and then in the final json.dumps() it should be converted to JSON along with everything else.
Example:
>>> import ast
>>> import json
>>> input = "{'a_var': True}" # A string that looks like a Python literal
>>> input = ast.literal_eval(input) # Convert to a Python dict
>>> print(input)
{'a_var': True}
>>> payload = {'Project': input} # Add to payload as a dict
>>> print(json.dumps(payload)) # In the payload as JSON without quotes
{"Project": {"a_var": true}}
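To see why the extra json.dumps() call causes the quotes, the two payloads can be compared side by side. This is a sketch with a hypothetical input string and no HTTP request:

```python
import ast
import json

input_json = "{'a_var': True}"            # a string that looks like a Python literal
project = ast.literal_eval(input_json)    # now a Python dict

# Serializing twice: the inner dict becomes a JSON string inside the payload.
double_encoded = json.dumps({'Project': json.dumps(project)})

# Serializing once: the inner dict stays a nested JSON object.
nested = json.dumps({'Project': project})

print(double_encoded)
print(nested)
```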

JSON read from sparkcore to python

I have searched the web but couldn't find a suitable answer so I will try and ask here.
I am experimenting with a spark core and parsing data through JSON. I have already managed to read the data and print it with the following code:
import urllib, json
from pprint import pprint
url = "https://api.spark.io/v1/devices/mycore/result?access_token=accesstoken"
response = urllib.urlopen(url);
data = json.loads(response.read())
pprint(data)
And now I am trying to print the value I am sending with this code:
data["result"]["data1"]
I found the above in another topic, but I am probably too inexperienced to apply it properly to my own code.
This is what Python prints:
{u'cmd': u'VarReturn',
 u'coreInfo': {u'connected': True,
               u'deviceID': u'1111111111111111111',
               u'last_app': u'',
               u'last_handshake_at': u'2015-03-09T12:28:20.271Z',
               u'last_heard': u'2015-03-09T12:56:42.780Z'},
 u'name': u'result',
 u'result': u'{"data1":2869}'}
The error I get says: TypeError: string indices must be integers
I used the example code from this topic:
https://community.spark.io/t/example-logging-and-graphing-data-from-your-spark-core-using-google/2929
I hope I am clear, can anyone enlighten me?
Try printing data["result"]. From the output you have provided, it should be '{"data1":2869}', which is another JSON document.
Try something like this:
import urllib, json
from pprint import pprint
url = "https://api.spark.io/v1/devices/mycore/result?access_token=accesstoken"
response = urllib.urlopen(url);
data = json.loads(response.read())
pprint(data)
new_data = json.loads(data["result"])
print new_data["data1"]
The content of data["result"] is a Unicode string. The string contains something that looks like a JSON document or a Python dictionary (note the single quotes around the whole construction):
>>> data["result"]
u'{"data1":2869}'
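The same two-step parse can be reproduced offline in Python 3. The response body below is a shortened, hypothetical copy of the output shown in the question:

```python
import json

# The "result" field holds JSON text, not a nested object.
body = '{"cmd": "VarReturn", "name": "result", "result": "{\\"data1\\":2869}"}'

data = json.loads(body)
result_is_text = isinstance(data["result"], str)

# Indexing the string with a key reproduces the TypeError from the question.
try:
    data["result"]["data1"]
    indexing_failed = False
except TypeError:
    indexing_failed = True

# A second json.loads() turns the embedded text into a dict.
new_data = json.loads(data["result"])
print(new_data["data1"])
```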
