Python JSON decoder error with unicode characters in request content

Using the requests library to execute an HTTP GET that returns a JSON response, I get this error when the response string contains a Unicode character:
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 20 (char 19)
When I execute the same HTTP request with Postman, the JSON output is:
{ "value": "VILLE D\u0019ANAUNIA" }
My python code is:
data = requests.get(uri, headers=HEADERS).text
json_data = json.loads(data)
Can I remove or replace all Unicode chars before executing conversion with json.loads(...)?

It is likely caused by a RIGHT SINGLE QUOTATION MARK, U+2019 (’). For reasons I cannot guess, the high-order byte has been dropped, leaving you with a control character, which must be escaped in a valid JSON string.
So the correct approach is to check what exactly the API returns. If it does return a '\u0019' control character, you should contact the API owner, because the problem lies there.
As a workaround, you can limit the damage on your side by filtering out non-ASCII and control characters:
data = requests.get(uri, headers=HEADERS).text
data = ''.join((i for i in data if 0x20 <= ord(i) < 127)) # filter out unwanted chars
json_data = json.loads(data)
You should get {'value': 'VILLE DANAUNIA'}
Alternatively, you can replace all unwanted characters with spaces:
data = requests.get(uri, headers=HEADERS).text
data = ''.join((i if 0x20 <= ord(i) < 127 else ' ' for i in data))
json_data = json.loads(data)
You would get {'value': 'VILLE D ANAUNIA'}
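If the stray character really is a truncated U+2019, as suggested above, another option is to restore it instead of dropping it. This is a minimal sketch, assuming the response body contains a raw U+0019 wherever the apostrophe should be; the string raw stands in for the API response, and repair is a hypothetical helper, not part of any library:

```python
import json


def repair(text):
    """Map the raw control character U+0019 back to U+2019.

    Assumption: every U+0019 in the data is a damaged U+2019.
    """
    return text.replace('\x19', '\u2019')


# Stand-in for requests.get(uri, headers=HEADERS).text
raw = '{ "value": "VILLE D\x19ANAUNIA" }'

json_data = json.loads(repair(raw))
print(json_data)
```

This only makes sense if all damaged characters came from U+2019; other truncated code points would need their own mapping.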

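The code below works on Python 2.7, where '\u0019' inside a plain (byte) string literal is left as six literal characters, which json.loads accepts as a valid JSON escape: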
import json
d = json.loads('{ "value": "VILLE D\u0019ANAUNIA" }')
print(d)
On Python 3.7 the same literal contains the actual control character, so strict=False is needed:
import json
d = json.loads('{ "value": "VILLE D\u0019ANAUNIA" }', strict=False)
print(d)
Output (from Python 2.7; Python 3 prints the same dictionary without the u prefixes):
{u'value': u'VILLE D\x19ANAUNIA'}
Another point is that requests can decode the JSON response for you:
r = requests.get('https://api.github.com/events')
r.json()
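To see what strict=False changes, the sketch below parses the same text both ways. It assumes the response body holds a raw U+0019 character (written as \x19 here); raw is a stand-in for the real response text.

```python
import json

# Stand-in for the response text, with a raw control character inside the string
raw = '{ "value": "VILLE D\x19ANAUNIA" }'

try:
    json.loads(raw)  # strict=True is the default and rejects raw control characters
    strict_error = None
except json.JSONDecodeError as exc:
    strict_error = exc

# strict=False allows control characters inside JSON strings
lenient = json.loads(raw, strict=False)

print(strict_error)
print(lenient)
```

With requests, keyword arguments given to Response.json() are passed on to the JSON decoder, so r.json(strict=False) should behave the same way.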

Related

How to send accented characters with diacritics in HTTP request-payload?

I need to send special characters, such as accented characters with diacritics (e.g. o-acute, ó), via an API.
This is my test code:
import string
import http.client
import datetime
import json

def apiSendFarmacia(idatencion, articulo, deviceid):
    ## API PAYLOAD
    now = datetime.datetime.now()
    conn = http.client.HTTPSConnection("apimocha.com")
    payload = json.dumps({
        "idatencion": idatencion,
        "articulo": articulo,
        "timestamp": str(now),
        "deviceId": deviceid
    }).encode("utf-8")
    headers = {
        'Content-Type': 'application/json'
    }
    conn.request("POST",
                 "/xxxx/api2",  # "/my/api/path" #"/alexa"
                 payload,
                 headers)
    res = conn.getresponse()
    httpcode = res.status
    data = res.read()
    return httpcode  # , data.decode("utf-8")
    ## API PAYLOAD
When I execute the function with some special characters:
apiSendFarmacia(2222,"solución",2222)
the mock API receives the following JSON payload, with \u00f3 instead of ó:
{
"idatencion": 2222,
"articulo": "soluci\u00f3n",
"timestamp": "2022-12-07 14:52:24.878976",
"deviceId": 2222
}
I was expecting the special character with its accent-mark ó to show in the mock API.

When you print it, it appears as the special character:
>>> print('soluci\u00f3n')
solución
\u00f3 is the escape sequence for the Unicode character 'LATIN SMALL LETTER O WITH ACUTE' (U+00F3); 00f3 is its code point in hexadecimal.
The \ is an escape signal that tells the interpreter to treat the following character(s) specially. For example, \n is interpreted as a newline, \t as a tab, and \u00f3 as the Unicode character ó.
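Both forms are valid JSON and decode to the same string, so the receiving side should see ó once it parses the payload. If the literal character is wanted in the request body itself, json.dumps accepts ensure_ascii=False. A minimal sketch comparing the two; the dictionary is a shortened version of the payload from the question:

```python
import json

record = {"articulo": "solución"}

# Default: non-ASCII characters are written as \uXXXX escapes
escaped = json.dumps(record)

# ensure_ascii=False keeps the characters as they are
literal = json.dumps(record, ensure_ascii=False)

print(escaped)
print(literal)

# The literal form must then be encoded explicitly, e.g. as UTF-8
body = literal.encode("utf-8")
```

Either body can be sent with the Content-Type: application/json header; a JSON parser on the other end treats them identically.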

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 2 (char 3)

I know this question has already been answered, but I don't know where the error is in my case.
This is my code:
import json
json_data = """
{
'position1': '516, 440',
'position2': '971, 443',
'position3': '1186, 439',
'position4': '1402, 441',
'position5': '1630, 449',
'position6': '299, 681',
'position7': '518, 684',
'position8': '736, 691',
'position9': '739, 431'
}
"""
data = json.loads(json_data)
print(data)
I'm not really used to working with JSON files, so please don't blame me if it's a really dumb mistake.

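The triple quotes """ are not the real problem: JSON requires double quotes around keys and string values, and your text uses single quotes. If the data starts out in Python anyway, define it as a dictionary and serialize it with json.dumps(), which emits valid JSON: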
import json
json_data = {
'position1': '516, 440',
'position2': '971, 443',
'position3': '1186, 439',
'position4': '1402, 441',
'position5': '1630, 449',
'position6': '299, 681',
'position7': '518, 684',
'position8': '736, 691',
'position9': '739, 431'
}
data = json.dumps(json_data)
print(data)

If you keep the triple-quoted string, replacing the single quotes with double quotes before parsing would work:
json_data = json_data.replace("'", '"')
data = json.loads(json_data)
print(data)
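Replacing every single quote breaks as soon as a value itself contains an apostrophe. Because the text in the question happens to be a valid Python dict literal, one alternative is ast.literal_eval, which parses it without touching the quotes. A sketch under that assumption, using a shortened copy of the data:

```python
import ast
import json

# Shortened copy of the single-quoted text from the question
text = """
{
'position1': '516, 440',
'position2': '971, 443'
}
"""

# literal_eval only evaluates Python literals, not arbitrary code
data = ast.literal_eval(text.strip())

# Serialize back to real JSON (double-quoted) if that is what is needed
as_json = json.dumps(data)
print(as_json)
```

This works only for text that follows Python literal syntax; genuine JSON with true, false, or null would still need json.loads.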

Payload requests - string concatenate layout error

I have code that reads a list of numbers from a file and passes each one as a parameter to a request:
import requests

def search(number):
    url = "http://localhost:8080/sistem/checkNumberStatus"
    payload = '{\n\"SessionName\":\"POC\",\n\"celFull\":\"'+number+'\"\n}'
    headers = {
        'Content-Type': 'application/json'
    }
    response = requests.request("POST", url, headers=headers, data=payload)

object = open('numbers_wpp.txt', 'r')
for number in object:
    search(number)
But when I print the JSON payload, the output is:
{
"SessionName":"POC",
"celFull":"5512997708936
"
}
{
"SessionName":"POC",
"celFull":"5512997709337
"
}
{
"SessionName":"POC",
"celFull":"5512992161195"
}
When reading the file with 3 numbers, the quotes around the celFull value close correctly only in the last iteration (the last number); in the first two, a line break appears before the closing quote. This break causes errors in the queries.

If numbers_wpp.txt is a file that has each number on a different line, the following code would iterate over the file line by line.
for number in object:
    search(number)  # <--- number is actually a whole line of your file
Since each number is on its own line, every line except the last ends with a '\n'.
So
123
456
Is actually
123\n456
So number is actually 123\n. This causes your payload to have a \n before the quote closes.
You could fix this by calling number.strip() which strips whitespace from either end of the string.
Alternatively, consider not handcrafting the JSON string and letting the requests library build it for you by passing a dict through the json parameter:
import requests

def search(number):
    url = "http://localhost:8080/sistem/checkNumberStatus"
    payload = {"SessionName": "POC", "celFull": number}  # <-- Python dict
    # json= serializes the dict and sets the Content-Type header
    response = requests.post(url, json=payload)

with open('numbers_wpp.txt', 'r') as numbers_file:
    for line in numbers_file:
        number = line.strip()  # <-- drop the trailing newline
        search(number)
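The effect of stripping the newline can be checked without sending any request. The sketch below builds the payload with json.dumps; build_payload is a hypothetical helper, and the list stands in for the lines read from numbers_wpp.txt:

```python
import json


def build_payload(line):
    """Return the JSON body for one line of the input file."""
    number = line.strip()  # remove the trailing '\n' and other surrounding whitespace
    return json.dumps({"SessionName": "POC", "celFull": number})


# Stand-in for iterating over the file: all but the last line end with '\n'
lines = ["5512997708936\n", "5512997709337\n", "5512992161195"]

payloads = [build_payload(line) for line in lines]
for payload in payloads:
    print(payload)
```

Each payload is a single line of valid JSON, with the closing quote directly after the number.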

Convert website JSON data to CSV python, returns JSON decode error

I am trying to convert the JSON at this URL: https://wtrl.racing/assets/js/miscttt170620.php?wtrlid=63, which I have saved in a file, to a CSV using this code:
json_data_file = open('TTT json', 'r')
content = json.load(json_data_file)
csv_results = csv.writer(open("TTT_results.csv.csv", "w", newline=''))
for item in content:
    print(item)
    csv_results.writerow(item)
This returns: json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 489351 (char 489350), which is the '2' in this section of JSON "ll": 43.529}, {"aa":
I'm bemused as to why this would be.

Looks like the json is malformed. You need to contact the person that generated this json to fix it.
See the entries:
"cc": "Nick "Lionel" Berry(TriTalk)"
"cc": ""Sherpa" Dave (R&K Hyenas)"
The inner quotes are not properly escaped. The entries need to be:
"cc": "Nick \"Lionel\" Berry(TriTalk)"
"cc": "\"Sherpa\" Dave (R&K Hyenas)"
Alternatively, you may be requesting the wrong content type when accessing the data. Try downloading it with an Accept header that asks for a JSON response rather than an HTML payload:
import requests
url = 'https://wtrl.racing/assets/js/miscttt170620.php?wtrlid=63'
headers = {'Accept': 'application/json'}
response = requests.get(url, headers=headers)
data = response.json()
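The difference between unescaped and escaped inner quotes can be reproduced with a one-entry document. A minimal sketch; both strings are cut down from the first entry quoted in this answer:

```python
import json

# Inner quotes left unescaped: the string ends early at the quote before Lionel
bad = '{"cc": "Nick "Lionel" Berry(TriTalk)"}'

# Inner quotes escaped with a backslash (raw string keeps the backslashes)
good = r'{"cc": "Nick \"Lionel\" Berry(TriTalk)"}'

try:
    json.loads(bad)
    bad_error = None
except json.JSONDecodeError as exc:
    bad_error = exc

parsed = json.loads(good)

print(bad_error)
print(parsed["cc"])
```

The parser reports the problem at the first character after the prematurely closed string, which is why the error position in the question points into the middle of the data.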

Regarding Json load and dump

I am trying to substitute a value using safe_substitute. Before this, I convert a dictionary with json.dumps and then substitute it. Once the substitution is done, I call json.loads and pass the result as a parameter to another utility. json.loads raises an error at that point. Below is the code:
account_id={'ABC123', user_id='testing'}
var1 = {'account':account_id, 'user':user_id}
response = json.dumps(var1)
payload = Template.(test_template).safe_substitute(var1=var1)
output = json.loads(payload)
I get an error when it reaches json.loads:
Expecting "," delimiter: line 1 column 448 (char 447)

It seems to be a syntax error. Try it like this:
account_id='ABC123'
user_id='testing'
var1 = {'account':account_id, 'user':user_id}
response = json.dumps(var1)
print(response)
# out: {"account": "ABC123", "user": "testing"}
output = json.loads(response)
print(output)
# out: {'account': 'ABC123', 'user': 'testing'}
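If the template step is still needed, the value substituted into it has to be the JSON text from json.dumps, not the dict itself; a dict renders with single quotes, which json.loads rejects. A minimal sketch using string.Template; the template text test_template is a made-up placeholder, since the original template is not shown in the question:

```python
import json
from string import Template

# Hypothetical template; the real one is not shown in the question
test_template = '{"request": $var1}'

account_id = 'ABC123'
user_id = 'testing'
var1 = {'account': account_id, 'user': user_id}

# Substitute the serialized JSON text, not the Python dict
payload = Template(test_template).safe_substitute(var1=json.dumps(var1))
output = json.loads(payload)

print(payload)
print(output)
```

Passing var1=var1 instead would insert {'account': 'ABC123', ...} with single quotes into the template, and json.loads would fail on the result.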
