I'm going over some URLs and can fetch most of the data I need from the Imgur API. However, when it finds an image that was posted before but has since been removed, it still returns a successful response (status code 200), and when I use
j1 = json.loads(r_positive.text)
I get this error:
http://imgur.com/gallery/cJPSzbu.json
<Response [200]>
Traceback (most recent call last):
File "image_poller_multiple.py", line 61, in <module>
j1 = json.loads(r_positive.text)
File "/usr/lib/python2.7/json/__init__.py", line 326, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
How can I "fetch" the error inside the j1 variable instead? I'd like to use a conditional structure to solve the problem and keep my program from crashing. Something like
if j1 == ValueError:
    continue
else:
    do_next_procedures()
You need to use try/except instead:
try:
    j1 = json.loads(r_positive.text)
except ValueError:
    # decoding failed
    continue
else:
    do_next_procedures()
See Handling Exceptions in the Python tutorial.
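The same try/except pattern can be wrapped in a small helper so the calling loop stays flat. This is only a sketch: `parse_json_or_none` and the sample bodies are hypothetical names and data standing in for `r_positive.text`, not anything from the Imgur API.

```python
import json


def parse_json_or_none(text):
    """Return the decoded JSON value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:
        # json.JSONDecodeError (Python 3) is a subclass of ValueError,
        # so this clause works on both Python 2 and Python 3.
        return None


# Hypothetical response bodies: one JSON document, one HTML page.
bodies = ['{"data": {"image": {"hash": "abc"}}}', "<!DOCTYPE html><html></html>"]
decoded = [parse_json_or_none(body) for body in bodies]
valid = [item for item in decoded if item is not None]
```

One caveat of this design: a body that is literally `null` is valid JSON and also decodes to None, so a sentinel object would be safer if that case matters.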
What really happens is that you were redirected for that URL and you got the image page instead. If you are using requests to fetch the JSON, look at the response history instead:
if r_positive.history:
    # more than one request, we were redirected:
    continue
else:
    j1 = r_positive.json()
or you could even disallow redirections:
r = requests.get(url, allow_redirects=False)
if r.status_code == 200:
    j1 = r.json()
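The redirect check can be tried without touching the network by using stand-in objects. In this sketch, `json_if_not_redirected` is a hypothetical helper, and the `SimpleNamespace` objects only imitate the two attributes of a requests response that the check relies on (`history` and `text`).

```python
import json
from types import SimpleNamespace


def json_if_not_redirected(response):
    """Decode response.text as JSON only when no redirect was followed.

    `response` is any object with a `history` list and a `text` string.
    Returns None when the history is non-empty, i.e. we were redirected.
    """
    if response.history:
        return None
    return json.loads(response.text)


# Stand-ins for real responses (assumed shapes, for illustration only).
direct = SimpleNamespace(history=[], text='{"ok": true}')
redirected = SimpleNamespace(history=["<Response [302]>"], text="<html></html>")
```

Here `json_if_not_redirected(direct)` decodes the body, while the redirected stand-in is skipped before any decoding is attempted.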
The URL you listed redirects you to an HTML page. (Use curl to check things like this; it's your friend.)
The HTML page obviously cannot be parsed as JSON.
What you probably need is this:
response = fetch_the_url(url)
if response.status_code == 200:
    try:
        j1 = json.loads(response.text)
    except ValueError:
        # json can't be parsed
        continue
Related
As a part of a small project of mine, I'm using the requests module to make an API call. Here's the snippet:
date = str(day) + '-' + str(month) + '-' + str(year)
req = "https://cdn-api.co-vin.in/api/v2/appointment/sessions/public/findByDistrict?district_id=" + str(distid) + "&date=" + date
response = requests.get(req,headers={'Content-Type': 'application/json'})
st = str(jprint(response.json()))
file = open("data.json",'w')
file.write(st)
file.close()
The jprint function is as follows:
def jprint(obj):
    text = json.dumps(obj,sort_keys=True,indent=4)
    return text
This is a part of a nested loop. On the first few runs, it worked successfully but after that it gave the following error:
Traceback (most recent call last):
File "vax_alert2.py", line 99, in <module>
st = str(jprint(response.json()))
File "/usr/lib/python3/dist-packages/requests/models.py", line 897, in json
return complexjson.loads(self.text, **kwargs)
File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 518, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 370, in decode
obj, end = self.raw_decode(s)
File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 400, in raw_decode
return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I tried adding a sleep of 1 second but got the same error. How should I resolve it?
Also, I checked it without using the jprint function yet got the exact same error.
I would suggest recording the response whenever parsing it raises an exception, as the response body is likely empty and accompanied by an error status. You are probably getting a 403 or some other error status (potentially from a DDoS-aware firewall). Once you know which status accompanies the empty response, you can detect that status and throttle your requests accordingly.
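try:
    st = str(jprint(response.json()))
    with open("data.json", 'w') as file:
        file.write(st)
except ValueError:
    # Decoding failed: record the status and body to see what the server sent
    print(response.status_code, response.text)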
See the following (from https://docs.python-requests.org/en/master/user/quickstart/):
In case the JSON decoding fails, r.json() raises an exception. For
example, if the response gets a 204 (No Content), or if the response
contains invalid JSON, attempting r.json() raises
simplejson.JSONDecodeError if simplejson is installed or raises
ValueError: No JSON object could be decoded on Python 2 or
json.JSONDecodeError on Python 3.
It should be noted that the success of the call to r.json() does not
indicate the success of the response. Some servers may return a JSON
object in a failed response (e.g. error details with HTTP 500). Such
JSON will be decoded and returned. To check that a request is
successful, use r.raise_for_status() or check r.status_code is what
you expect.
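The throttling idea above might look roughly like the following sketch. Everything named here is an assumption for illustration: `fetch` is a hypothetical callable returning `(status_code, body_text)`, and `RETRY_STATUSES` is a guess that should be replaced by whatever status the server actually sends when it rejects a request.

```python
import json
import time

# Assumed set of "slow down" statuses; adjust after inspecting real responses.
RETRY_STATUSES = {403, 429, 502, 503}


def fetch_json_with_backoff(fetch, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it yields decodable JSON or the attempts run out.

    fetch() must return a (status_code, body_text) tuple.
    Returns the decoded JSON value, or None if every attempt failed.
    """
    for attempt in range(max_attempts):
        status, body = fetch()
        if status == 200:
            try:
                return json.loads(body)
            except ValueError:
                # Record what came back instead of crashing.
                print("status 200 but body is not JSON:", body[:80])
        elif status not in RETRY_STATUSES:
            raise RuntimeError("unexpected status %d" % status)
        else:
            print("possibly throttled, status", status)
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return None
```

With requests, `fetch` could be a small function that calls `requests.get(req)` and returns `(response.status_code, response.text)`. Passing `sleep` as a parameter keeps the function easy to exercise without real waiting.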
When I call the json() method on request response I get an error.
Any suggestions to what could be wrong here?
My code:
import requests
import bs4
url = 'https://www.reddit.com/r/AskReddit/comments/l4styp/serious_what_is_the_the_scariest_thing_that_you/'
rsp = requests.get(url)
sc = rsp.json()
print(sc)
Output:
File "c:\VS_Code1\scrape.py", line 6, in <module>
sc = rsp.json()
File "C:\Users\User\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\requests\models.py", line 900, in json
return complexjson.loads(self.text, **kwargs)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.496.0_x64__qbz5n2kfra8p0\lib\json\__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.496.0_x64__qbz5n2kfra8p0\lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.496.0_x64__qbz5n2kfra8p0\lib\json\decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 5 (char 5)
Your rsp is a Response object (it prints as <Response [200]>), and its body is not JSON. If you want to read the content of the response, you can simply do:
rsp.text
What you get from the URL you posted here is HTML, and not JSON.
This does not work because the page you are fetching returns HTML source code, not JSON.
To fetch the content of the webpage you need to replace sc = rsp.json() with sc = rsp.text
If you need this data as JSON, you can look into Reddit's API: https://www.reddit.com/dev/api
I figured out that to get JSON from the URL I entered, I have to add .json to the end of the URL. As far as I know, this is specific to Reddit and a few other sites that allow it.
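Appending the suffix by hand is easy to get wrong when the URL ends in a slash or carries a query string. A minimal sketch using only the standard library follows; `to_json_url` is a hypothetical helper name, and whether a given site honours the `.json` suffix is up to that site.

```python
from urllib.parse import urlsplit, urlunsplit


def to_json_url(url):
    """Return url with '.json' appended to its path, keeping any query string."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    return urlunsplit(parts._replace(path=path))


json_url = to_json_url("https://www.reddit.com/r/AskReddit/comments/l4styp/title/")
```

The resulting `json_url` can then be passed to `requests.get` in place of the original page URL.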
I am using a website's API to get data, and I run multiple programs at the same time. I wrote exception handling in the program, but I still get a 502 response and the program is interrupted; several programs are interrupted at the same time. What is the reason?
def search(name):
    global n
    path = 'https://dev.***.com/api/company/queryByName?name=' + str(name)
    s = requests.session()
    s.keep_alive = False  # close redundant connections
    try:
        r = s.get(path,timeout=3)
        print(n,r)
    except (ReadTimeout,HTTPError,ConnectionError) as e:
        print(e)
        return search(name)
    else:
        n=n+1
        result = json.loads(r.text)
Traceback (most recent call last):
File "D:/PyCharm Community Edition/project/company/30.py", line 72, in <module>
data1['social_credit_code'], data1['industry'], data1['reg_place'] = zip(*data1['companyName'].apply(search))
File "C:\Users\13750\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py", line 3848, in apply
739 <Response [502]>
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas\_libs\lib.pyx", line 2329, in pandas._libs.lib.map_infer
File "D:/PyCharm Community Edition/project/company/30.py", line 49, in search
result = json.loads(r.text)
File "C:\Users\13750\.conda\envs\py36\lib\json\__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "C:\Users\13750\.conda\envs\py36\lib\json\decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Users\13750\.conda\envs\py36\lib\json\decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
The requests API will only raise an exception if you are not able to communicate with a server. In this case you did reach a server, but the server then responded by telling you 502 Bad Gateway. This error usually means you communicated with some proxy server which was unable to forward your message to the final destination.
Regardless, that response will be captured by the requests API and returned as a Response object. After you receive a response you always need to make sure that the return code is what you expect (commonly 200). requests has a convenient way to do so:
r = s.get(path,timeout=3)
if r.ok:
    # do your work
In this case you didn't check whether the response code was okay, and because the response code indicated an error, you didn't receive any JSON data like you thought you did. That is why the code followed through to the else statement and gave you a JSONDecodeError.
As the traceback clearly shows, a JSONDecodeError is being raised and your code is not catching it.
You should probably not attempt to decode the content of a 502 response. If you want such responses to raise an exception, use raise_for_status:
try:
    r = s.get(path,timeout=3)
    r.raise_for_status()
    print(n,r)
except (ReadTimeout,HTTPError,ConnectionError) as e:
    ...
I am trying to integrate with ServiceNow records from a Python script, following the example at this link to update records with the HTTP PATCH method: Table API Python
Here is my code:
#Need to install requests package for python
#sudo easy_install requests
import requests
# Set the request parameters
url = 'https://instance.owndomain.com/api/now/table/sc_req_item/2a2851fe88709010b9120e9b506dd9a9'
user = 'username'
pwd = 'password'
# Set proper headers
headers = {"Content-Type":"application/json","Accept":"application/json"}
# Do the HTTP request
response = requests.patch(url, auth=(user, pwd), headers=headers ,data='{"u_switch_status_automation":"In Processing using Python", "u_switch_description":"The Automation currently processing this SR using Python"}')
# Check for HTTP codes other than 200
if response.status_code != 200:
print('Status:', response.status_code, 'Headers:', response.headers, 'Error Response:',response.json())
exit()
# Decode the JSON response into a dictionary and use the data
print('Status:',response.status_code,'Headers:',response.headers,'Response:',response.json())
And the result I got is as follows:
/usr/lib/python2.7/site-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made to host 'instance.owndomain.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning,
Traceback (most recent call last):
File "/home/automation/switchConf/updateRecord.py", line 18, in <module>
print('Status:', response.status_code, 'Headers:', response.headers, 'Error Response:',response.json())
File "/usr/lib/python2.7/site-packages/requests/models.py", line 898, in json
return complexjson.loads(self.text, **kwargs)
File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
I am pretty new to Python and am learning how to make HTTP requests and store the response in a variable.
Below is a code snippet similar to the one I am using to make the POST request.
import requests
import simplejson as json
api_url = 'https://jsonplaceholder.typicode.com/tickets'
raw_body = {"searchBy":"city","searchValue":"1","processed":9,"size":47,"filter":{"cityCode":["BA","KE","BE"],"tickets":["BLUE"]}}
raw_header = {"X-Ticket-id": "1234567", "X-Ticket-TimeStamp": "11:01:1212", "X-Ticket-MessageId": "123", 'Content-Type': 'application/json'}
result = requests.post(api_url, headers=raw_header, data=json.dumps(raw_body))
#Response Header
response_header_contentType = result.headers['Content-Type'] #---> I am getting response_header_contentType as "text/html; charset=utf-8"
#Trying to get the result in json format
response = result.json() # --> I am getting error at this line. May be because the server is sending the content type as "text/html" and I am trying to capture the json response.
Error in console :
Traceback (most recent call last):
File "C:\sam\python-project\v4\makeApiRequest.py", line 45, in make_API_request
response = result.json()
File "C:\Users\userName\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\models.py", line 898, in json
return complexjson.loads(self.text, **kwargs)
File "C:\Users\userName\AppData\Local\Programs\Python\Python37\lib\site-packages\simplejson\__init__.py", line 525, in loads
return _default_decoder.decode(s)
File "C:\Users\userName\AppData\Local\Programs\Python\Python37\lib\site-packages\simplejson\decoder.py", line 370, in decode
obj, end = self.raw_decode(s)
File "C:\Users\userName\AppData\Local\Programs\Python\Python37\lib\site-packages\simplejson\decoder.py", line 400, in raw_decode
return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
So, how can I store the response in a variable based on the Content-Type sent by the server when using requests?
Can somebody please help me here? I tried googling too but did not find any helpful documentation on how to capture the response based on the content type.
As you already said, your Content-Type is 'text/html', not 'application/json', which normally means the body cannot be decoded as JSON.
If you look at the documentation at
https://2.python-requests.org/en/master/user/quickstart/#response-content you will find that there are different ways to decode the body. If you already know you have 'text/html', it makes sense to decode it with response.text.
Hence it makes sense to choose how to decode your data based on the content type:
# The header may carry parameters, e.g. "text/html; charset=utf-8"
content_type = result.headers.get('Content-Type', '')
if 'application/json' in content_type:
    data = result.json()
elif 'text/html' in content_type:
    data = result.text
else:
    data = result.content
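Since the server in the question sends "text/html; charset=utf-8", the decision is easier to get right if the media type is separated from its parameters first. The following is a sketch only: `decode_body` is a hypothetical helper that takes the header value and body text directly, so it can be tried without a live request.

```python
import json


def decode_body(content_type, text):
    """Decode text as JSON when the media type says JSON, else return it as is."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json" or media_type.endswith("+json"):
        return json.loads(text)
    return text


html = decode_body("text/html; charset=utf-8", "<p>Not found</p>")
data = decode_body("application/json; charset=utf-8", '{"tickets": []}')
```

With requests, the two arguments would be `result.headers.get('Content-Type', '')` and `result.text`. A server can still mislabel its body, so wrapping the call in try/except ValueError remains a sensible precaution.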