Python API gets invalid json - python

I started using Python a few days ago and can't work out the JSON format.
I use requests to fetch JSON data from an API. However, the JSON I save is malformed (a JSON validator reports errors).
import requests
webpage = 'https://parser-api.com/parser/arbitr_api/run.php'
API = 'cant post it'  # API key withheld
output_results = []
cases = ['А65-22925/2017']
for i in cases:
    params = {'key':API, 'CaseNumber':i}
    results = requests.get(webpage, params = params)
    output_results.append(results.text)
print (output_results)
with open ('file_name_case.json', 'w', encoding='utf8') as wr:
    wr.write (str(output_results))
This is a snippet of the output I get, which is wrong:
['{"Cases":[{"CaseId":"998ecaef-3da8-45ab-9f56-90bfc3375e11","CaseNumber":"\\u041065-22925\\/2017","CaseType":"\\u0410","Thirds":[],"Plaintiffs":[{"Name":"\\u041e\\u041e\\u041e \\"\\u0412\\u0420-\\u041f\\u043b\\u0430\\u0441\\u0442\\", \\u0433.\\u041a\\u0430\\u0437\\u0430\\u043d\\u044c","Address":"421001, \\u0420\\u043e\\u0441\\u0441\\u0438\\u044f, \\u0433.\\u041a\\u0430\\u0437\\u0430\\u043d\\u044c, \\u0420\\u0422, \\u0443\\u043b.\\u0421.\\u0425\\u0430\\u043a\\u0438\\u043c\\u0430, \\u0434.60, \\u043e\\u0444\\u0438\\u0441 164","Id":"dc26df83-6361-4de0-bc93-8c20ae0a4417"}],"Respondents":[{"Name":"\\u0424\\u0435\\u0434\\u0435\\u0440\\u0430\\u043b\\u044c\\u043d\\u0430\\u044f \\u0422\\u0430\\u043c\\u043e\\u0436\\u0435\\u043d\\u043d\\u0430\\u044f \\u0441\\u043b\\u0443\\u0436\\u0431\\u0430 \\u041f\\u0440\\u0438\\u0432\\u043e\\u043b\\u0436\\u0441\\u043a\\u043e\\u0435 \\u0442\\u0430\\u043c\\u043e\\u0436\\u0435\\u043d\\u043d\\u043e\\u0435 \\u0443\\u043f\\u0440\\u0430\\u0432\\u043b\\u0435\\u043d\\u0438\\u0435 \\u0422\\u0430\\u0442\\u0430\\u0440\\u0441\\u0442\\u0430\\u043d\\u0441\\u043a\\u0430\\u044f \\u0442\\u0430\\u043c\\u043e\\u0436\\u043d\\u044f, \\u0433.\\u041a\\u0430\\u0437\\u0430\\u043d\\u044c","Address":"420094, \\u0420\\u043e\\u0441\\u0441\\u0438\\u044f, \\u0433.\\u041a\\u0430\\u0437\\u0430\\u043d\\u044c, \\u0420\\u0422, \\u0443\\u043b.\\u041a\\u043e\\u0440\\u043e\\u043b\\u0435\\u043d\\u043a\\u043e, \\u0434.56","Id":"4b21e3e9-9d0c-42ce-bbec-4e1615e34698"}]...
The correct format is supposed to look like this:
{"Cases":[{"CaseId":"998ecaef-3da8-45ab-9f56-90bfc3375e11","CaseNumber":"\u041065-22925\/2017","CaseType":"\u0410","Thirds":[],"Plaintiffs":[{"Name":"\u041e\u041e\u041e \"\u0412\u0420-\u041f\u043b\u0430\u0441\u0442\", \u0433.\u041a\u0430\u0437\u0430\u043d\u044c","Address":"421001, \u0420\u043e\u0441\u0441\u0438\u044f, \u0433.\u041a\u0430\u0437\u0430\u043d\u044c, \u0420\u0422, \u0443\u043b.\u0421.\u0425\u0430\u043a\u0438\u043c\u0430, \u0434.60, \u043e\u0444\u0438\u0441 164","Id":"dc26df83-6361-4de0-bc93-8c20ae0a4417"}],"Respondents":[{"Name":"\u0424\u0435\u0434\u0435\u0440\u0430\u043b\u044c\u043d\u0430\u044f \u0422\u0430\u043c\u043e\u0436\u0435\u043d\u043d\u0430\u044f \u0441\u043b\u0443\u0436\u0431\u0430 \u041f\u0440\u0438\u0432\u043e\u043b\u0436\u0441\u043a\u043e\u0435 \u0442\u0430\u043c\u043e\u0436\u0435\u043d\u043d\u043e\u0435 \u0443\u043f\u0440\u0430\u0432\u043b\u0435\u043d\u0438\u0435 \u0422\u0430\u0442\u0430\u0440\u0441\u0442\u0430\u043d\u0441\u043a\u0430\u044f \u0442\u0430\u043c\u043e\u0436\u043d\u044f, \u0433.\u041a\u0430\u0437\u0430\u043d\u044c","Address":"420094, \u0420\u043e\u0441\u0441\u0438\u044f, \u0433.\u041a\u0430\u0437\u0430\u043d\u044c, \u0420\u0422, \u0443\u043b.\u041a\u043e\u0440\u043e\u043b\u0435\u043d\u043a\u043e, \u0434.56","Id":"4b21e3e9-9d0c-42ce-bbec-4e1615e34698"}]
Please help

You can do something like this:
import json
with open('file_name_case.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False, indent=4)

Although Python literal syntax resembles JSON in many ways, calling str() on a Python object does not produce valid JSON. You need to use the json module to write JSON.
Instead of taking results.text directly, use results.json() to decode the JSON response from the server.
That gives you a Python list of dicts rather than a Python list of strings containing JSON.
Then, as in kup's answer, you can convert back to JSON:
output_results = []
for idx in cases:
    params = {'key': API, 'CaseNumber': idx}
    results = requests.get(webpage, params=params)
    output_results.append(results.json())
with open('file_name_case.json', 'w') as fobj:
    json.dump(output_results, fobj)
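To see why the saved file in the question fails validation, here is a minimal sketch that needs no network access. The raw_text value is a made-up stand-in for results.text, shortened from the response shown in the question; everything else uses only the standard json module.

```python
import json

# Hypothetical stand-in for results.text: one JSON document held as a string.
raw_text = '{"CaseNumber": "\\u041065-22925\\/2017", "Thirds": []}'

# What the question did: str() of a list of strings yields a Python repr,
# with single quotes around each item and every backslash doubled.
as_repr = str([raw_text])

# What the answers suggest: decode first, then serialise with the json module.
decoded = json.loads(raw_text)  # the same decoding results.json() performs
as_json = json.dumps([decoded], ensure_ascii=False)

# Only the json.dumps output can be parsed back as JSON.
try:
    json.loads(as_repr)
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False

print(as_repr)
print(as_json)
print(repr_is_valid_json)
```

The first printed line shows the same doubled backslashes and outer single quotes as the broken output in the question, while the second is a JSON array that a validator should accept.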

Related

JSON output includes literal \n rather than line breaks

How can the JSON output be formatted so that it doesn't include the literal \n text and instead shows real line breaks, as intended? This is what the saved output file looks like:
But this is how it looks when I use print, which is what it should look like:
import requests
import json
def get_all_time_entries():
    url_address = "***"
    headers = {
        "Authorization": "***",
        "api-version": "2020-01-31"
    }
    # find out total number of pages
    r = requests.get(url=url_address, headers=headers).json()
    total_pages = 605
    # results will be appended to this list
    all_time_entries = []
    # loop through all pages and return JSON object
    for page in range(1, total_pages):
        url = "***"+str(page)
        response = requests.get(url=url, headers=headers).json()
        all_time_entries.append(response)
        page += 1
    # prettify JSON
    data = json.dumps(all_time_entries, sort_keys=True, indent=4)
    return data
#print(get_all_time_entries())
with open('appointmentsHistory.json', 'w', encoding='utf-8') as f:
    # note that I use dump method, not dumps
    json.dump(get_all_time_entries(), f, sort_keys=True, indent=4)
json.dumps() transforms the all_time_entries list into a JSON string, and then json.dump() writes the JSON representation of that string to the file. The data is therefore encoded twice, which is why the line breaks appear as literal \n.
To resolve this, remove json.dumps() from the get_all_time_entries() function and return the list itself. json.dump() will take the list directly and serialise it to JSON for you.
import requests
import json
def get_all_time_entries():
    url_address = "***"
    headers = {
        "Authorization": "***",
        "api-version": "2020-01-31"
    }
    # find out total number of pages
    r = requests.get(url=url_address, headers=headers).json()
    total_pages = 605
    # results will be appended to this list
    all_time_entries = []
    # loop through all pages and collect each decoded response
    for page in range(1, total_pages):
        url = "***"+str(page)
        response = requests.get(url=url, headers=headers).json()
        all_time_entries.append(response)
    # return the Python list itself, not a JSON string
    return all_time_entries
with open('appointmentsHistory.json', 'w', encoding='utf-8') as f:
    # json.dump serialises the list and writes it to the file in one step
    json.dump(get_all_time_entries(), f, sort_keys=True, indent=4)
json.dump() takes a Python object; you are passing it a string that is already JSON.
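The double encoding can be reproduced without calling any API. In this small sketch, entries is a made-up sample standing in for all_time_entries, and json.dumps is used in place of json.dump so the result can be inspected as a string.

```python
import json

# Hypothetical sample standing in for all_time_entries.
entries = [{"id": 1, "note": "first"}]

# First encoding: a pretty-printed JSON string with real line breaks.
pretty = json.dumps(entries, sort_keys=True, indent=4)

# Second encoding: what json.dump(pretty, f) effectively writes.
# The whole text becomes one quoted JSON string, so each line break
# is escaped as a backslash followed by the letter n.
double_encoded = json.dumps(pretty)

# Encoding the list only once keeps the real line breaks.
single_encoded = json.dumps(entries, sort_keys=True, indent=4)

print(double_encoded)
print(single_encoded)
```

Decoding double_encoded twice gets the original list back, which is a quick way to confirm that a file was encoded twice.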

JSONDecodeError: Extra data: Python

I am loading json from files using the code:
from json import loads
file = 'file_name'
obj_list = []
with open(file) as f:
    for json_obj in f:
        obj_list.append(loads(json_obj))
I get the error:
JSONDecodeError: Extra data: line 1 column 21 (char 20)
All my files look like this but much larger.
{"some":"property2"}{"some":"property"}{"some":"property3"}
Is there a way to parse this in python for a large number of files?
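Your JSON is not valid: several objects are written back to back with no separator and no enclosing array. A valid file would look something like this:
[{"some": "property2"}, {"some": "property"}, {"some": "property3"}]
import json
with open(file, 'r') as f:
    # Insert the missing commas between objects, then wrap everything in [].
    # Caution: this simple replace also alters any "}{" inside a string value.
    json_str = '[' + f.read().replace('}{', '},{') + ']'
obj_list = json.loads(json_str)
This reads the content, adds the missing commas and the surrounding [] to make it valid JSON, and then loads it with the json package.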
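Another option that avoids editing the text at all is json.JSONDecoder.raw_decode, which parses one JSON value and reports where it stopped. The sketch below is one possible approach, not the answerer's method; the helper name iter_json_objects is made up for illustration.

```python
import json

def iter_json_objects(text):
    """Yield each JSON value from a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Parse one value starting at pos; the returned index is where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

sample = '{"some":"property2"}{"some":"property"}{"some":"property3"}'
obj_list = list(iter_json_objects(sample))
print(obj_list)
```

For a file, pass f.read() to the helper. Because the decoder tracks string boundaries itself, a value that happens to contain "}{" is handled correctly.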

Python JSONDecodeError: Expecting value: line 1 column 1

I got the error JSONDecodeError: Expecting value: line 1 column 1 (char 0), but I don't understand why.
Here is my code:
import json
import urllib.request
url = "apiurl"
data = json.loads(url)
# Open the URL as Browser, not as python urllib
page = urllib.request.Request(url,headers={'User-Agent': 'Mozilla/5.0'})
infile = urllib.request.urlopen(page).read()
data = infile.decode('ISO-8859-1') # Read the content as string decoded with ISO-8859-1
command_obj = {x['command']: x for x in data}
with open('new_command.json', 'w') as f:
    json.dump(command_obj, f, indent=2)
With this function, I'm just trying to fetch data from an API and modify its format. Thanks for your help.
You're trying to read the URL itself (and not its content) as JSON:
data = json.loads(url)
... instead you want to read the content returned from the API as JSON:
# Open the URL as Browser, not as python urllib
page = urllib.request.Request(url,headers={'User-Agent': 'Mozilla/5.0'})
infile = urllib.request.urlopen(page).read()
data = infile.decode('ISO-8859-1')
# avoid re-using `data` variable name
json_data = json.loads(data)
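However, be aware that JSON exchanged between systems is expected to be encoded as UTF-8 (RFC 8259 requires it outside closed ecosystems), so decoding the response as ISO-8859-1 / latin-1 can garble any non-ASCII characters.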
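The error and the fix can both be reproduced offline. In the sketch below, body is a made-up response standing in for what urlopen(...).read() returns; the real API's structure is not shown in the question, so the "command" key is only assumed from the dict comprehension there.

```python
import json

url = "apiurl"  # placeholder from the question, not a real endpoint

# Parsing the URL string itself reproduces the error from the question.
try:
    json.loads(url)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = str(exc)
print(error_message)

# Hypothetical response body, as bytes, the way urlopen(...).read() returns it.
body = '[{"command": "start"}, {"command": "stop"}]'.encode("utf-8")

# json.loads accepts bytes directly (Python 3.6+), so no manual decode is needed
# when the body is UTF-8.
json_data = json.loads(body)

# Build the lookup from the parsed list, not from the undecoded text.
command_obj = {item["command"]: item for item in json_data}
print(command_obj)
```

Note that the comprehension has to iterate over the parsed data; iterating over the decoded string would walk it character by character.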

Python duplicates data from JSON response

When I open python shell and run my code below it gives exactly 250 entries. But when I run it in the shell it gives me 500.
import requests
import json
url = 'https://www.cefconnect.com/api/v3/pricinghistory/DPG/1Y'
json_data = requests.get(url).json()
price_data = json_data['Data']
for i in price_data['PriceHistory']:
    print (i['Data'])
This is a sample from the JSON that I'm trying to manipulate:
{"Data":
{"Period":"1Y",
"PriceHistory":
[{"NAVData":19.31000,
"DiscountData":-14.19,
"Data":16.57000,
"DataDate":"2017-02-14T00:00:00",
"DataDateJs":"2017/02/14",
"DataDateDisplay":"2/14/2017"},
{"NAVData":19.33000,
"DiscountData":-14.49,
"Data":16.53000,
"DataDate":"2017-02-15T00:00:00",
"DataDateJs":"2017/02/15",
"DataDateDisplay":"2/15/2017"},
{"NAVData":19.26000,
"DiscountData":-14.38,
"Data":16.49000,
"DataDate":"2017-02-16T00:00:00",
"DataDateJs":"2017/02/16",
"DataDateDisplay":"2/16/2017"},
{"NAVData":19.18000,
"DiscountData":-14.18,
"Data":16.46000,
"DataDate":"2017-02-17T00:00:00",
"DataDateJs":"2017/02/17",
"DataDateDisplay":"2/17/2017"},
{"NAVData":19.31000,"DiscountData":-
Somehow it duplicates the loop entries.

Trying to Parse SOAP Response in Python

I'm struggling to find a way to parse the data that I'm getting back from a SOAP response. I'm only familiar with Python (v3.4), but relatively new to it. I'm using suds-jurko to pull the data from a 3rd party SOAP server. The response comes back in the form of "ArrayOfXmlNode". I've tried using ElementTree in different ways to parse the data, but I either get no information or I get "TypeError: invalid file: (ArrayOfXmlNode)" errors. Googling how to handle the ArrayOfXMLNode type response has gotten me nowhere.
The first part of the SOAP response is:
(ArrayOfXmlNode){
XmlNode[] =
(XmlNode){
Hl =
(Hl){
ID = "22437790"
Name = "Cameron"
SpeciesID = "1"
Sex = "Male"
PrimaryBreed = "German Shepherd"
SecondaryBreed = "Mix"
SN = ""
Age = "35"
OnHold = "No"
Location = "Foster Home"
BehaviorResult = ""
Photo = "http://sms.petpoint.com/sms/photos/615/123.jpg"
}
},
I've tried iterating through the data with code similar to:
from suds.client import Client
url = 'http://qag.petpoint.com/webservices/AdoptableSearch.asmx?WSDL'
client = Client(url)
result = client.service.adoptableSearchExtended('nunya', 0, 'A', 'All', 'N')
tree = result[0]
for node in tree:
    pet_info = []
    pet_info.extend(node)
print(pet_info)
The code above gives me the entire response in "result[0]". Below that I try to create a list from the data, but I only get the very last node (a node being one set of information, from ID to Photo). Attempts to modify this approach give me either everything, nothing, or only the last node.
So then I tried to make use of ElementTree with simple code to test it out, but only get the "invalid file" errors.
import xml.etree.ElementTree as ET
from suds.client import Client
url = 'http://qag.petpoint.com/webservices/AdoptableSearch.asmx?WSDL'
client = Client(url)
result = client.service.adoptableSearchExtended('nunya', 0, 'A', 'All', 'N')
pet_info = ET.parse(result)
print(pet_info)
The result:
Traceback (most recent call last):
File "D:\Python\Eclipse Workspace\KivyTest\src\root\nested\Parse.py", line 11, in <module>
pet_info = ET.parse(result)
File "D:\Programs\Python34\lib\xml\etree\ElementTree.py", line 1186, in parse
tree.parse(source, parser)
File "D:\Programs\Python34\lib\xml\etree\ElementTree.py", line 587, in parse
source = open(source, "rb")
TypeError: invalid file: (ArrayOfXmlNode){
XmlNode[] =
(XmlNode){
Hl =
(Hl){
ID = "20840097"
Name = "Daisy"
SpeciesID = "1"
Sex = "Female"
PrimaryBreed = "Terrier, Pit Bull"
SecondaryBreed = ""
SN = ""
Age = "42"
OnHold = "No"
Location = "Dog Adoption"
BehaviorResult = ""
Photo = "http://sms.petpoint.com/sms/photos/615/40f428de-c015-4334-9101-89c707383817.jpg"
}
},
Can someone get me pointed in the right direction?
I had a similar problem parsing data from a web service using Python 3.4 and suds-jurko. I was able to solve the issue using the code in this post, https://stackoverflow.com/a/34844428/5874347. I used the fastest_object_to_dict function to convert the web service response into a dictionary. From there you can parse the data ...
1. Add the fastest_object_to_dict function to the top of your file
2. Make your web service call
3. Create a new variable to save the dictionary response to
result = client.service.adoptableSearchExtended('nunya', 0, 'A', 'All', 'N')
ParsedResponse = fastest_object_to_dict(result)
Your data will now be in the form of a dictionary, you can parse the dictionary on the python side as needed or send it back to your ajax call via json, and parse it with javascript.
To send it back as json
import json
import sys
sys.stdout.write("content-type: text/json\r\n\r\n")
sys.stdout.write(json.dumps(ParsedResponse))
Please try this:
result[0][0]
which will give you the first element of the array (ArrayOfXmlNode).
Similarly, try this:
result[0][1][2]
which will give you the third element of element result[0][1].
Hopefully, this offers an alternative solution.
If you are using Python, you can convert the SOAP result to JSON by way of its XML.
For that, the SOAP result needs to be raw XML output, which you can get with the retxml=True option of the suds library.
I needed this result as JSON output as well, and I ended up solving it this way:
import json
import xmltodict
# Parse the XML result into a dict
data_dict = xmltodict.parse(soap_response)
# Dump the dict into a JSON string
json_data = json.dumps(data_dict)
# Load the JSON string back into plain Python objects
# (named result so the json module is not shadowed)
result = json.loads(json_data)
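On the "invalid file" error in the question: ET.parse expects a file name or file object, whereas ET.fromstring parses XML text that is already in memory. Once the raw XML is available, the standard library alone may be enough. The XML below is a made-up sample modelled on the fields printed in the question; the real service's element names, nesting, and namespaces may well differ.

```python
import xml.etree.ElementTree as ET

# Made-up XML modelled on the fields printed in the question; the real
# response may use different element names, nesting, or XML namespaces.
xml_text = """
<ArrayOfXmlNode>
  <XmlNode><Hl><ID>22437790</ID><Name>Cameron</Name><Sex>Male</Sex></Hl></XmlNode>
  <XmlNode><Hl><ID>20840097</ID><Name>Daisy</Name><Sex>Female</Sex></Hl></XmlNode>
</ArrayOfXmlNode>
""".strip()

# fromstring parses text held in memory; parse would try to open it as a file.
root = ET.fromstring(xml_text)

# Create the list once, outside the loop, so earlier entries are not discarded.
pets = []
for hl in root.iter("Hl"):
    pets.append({child.tag: child.text for child in hl})

print(pets)
```

Creating the list before the loop also addresses the "only the last node" symptom, which appears when the list is re-created on every iteration.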
