Handling errors in Python requests [duplicate] - python

This question already has answers here:
Python requests.exceptions.SSLError: EOF occurred in violation of protocol
(9 answers)
Closed 1 year ago.
I am learning to use requests in Python and I need a way to get a meaningful output if the site does not exist at all.
I looked at this question, but it is unclear whether the OP actually wants to check if the site exists or only whether it returns an error. The problem with all of the answers to that question is that if the site does not exist at all, we cannot use HTTP response headers, because a server that does not exist returns no response.
Here is an example.
If I use this code I will not get any errors because the site exists.
import requests
r = requests.get('https://duckduckgo.com')
However, if I enter a web page I know does not exist I will get an error
import requests
r = requests.get('https://thissitedoesnotexist.com')
if r.status_code == requests.codes.ok:
    print('Site good')
else:
    print('Site bad')
This error is super long and I would prefer to have a more meaningful and short error if the site does not exist.
Traceback (most recent call last):
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 382, in _make_request
self._validate_conn(conn)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 1010, in _validate_conn
conn.connect()
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connection.py", line 416, in connect
self.sock = ssl_wrap_socket(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\util\ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\util\ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\ssl.py", line 512, in wrap_socket
return self.sslsocket_class._create(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\ssl.py", line 1070, in _create
self.do_handshake()
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\ssl.py", line 1341, in do_handshake
self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:997)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\adapters.py", line 439, in send
resp = conn.urlopen(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\util\retry.py", line 574, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='234876.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:997)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\ADMIN\Desktop\tetst.py", line 2, in <module>
r = requests.get('https://234876.com')
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='234876.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:997)')))
Is it possible to make a function that returns, for example print('The site probably does not exist') or at least does not give an EOF error?

Normally the desirable thing to do is to trap the exceptions that requests raises.
You can also call .raise_for_status() on the Response to get a meaningful exception for non-OK responses.
However, think about where you want to handle an exception:
Immediately? Can your program handle it meaningfully, or should it exit?
Should the caller handle a specific exception (such as requests.exceptions.Timeout) or a more general one?
Do you have many functions which call each other? Should any of them handle some subset of the possible exceptions, and which?
See the Python exception hierarchy for how the built-in exceptions inherit from one another.
import sys
import requests
def some_function_which_makes_requests():
    # timeout is (connect timeout, read timeout) in seconds
    r = requests.get("https://example.com", timeout=(2, 10))
    r.raise_for_status()  # raise for non-OK
    return r.json()  # interpret response via some method (for example as JSON)
def main():
    ...
    try:
        # note the parentheses: the function must actually be called
        result_json = some_function_which_makes_requests()
    except requests.exceptions.Timeout:
        print("WARNING: request timed out")
        result_json = None  # still effectively handled for later program?
    except requests.exceptions.RequestException as ex:
        sys.exit(f"something wrong with Request: {repr(ex)}")
    except Exception as ex:
        sys.exit(f"something wrong around Request: {repr(ex)}")
    # now you can use result_json

I did some more research and learned that I need to use a Python try/except, as mentioned by @Anand Sowmithiran. Here is a video explaining it for beginners: https://www.youtube.com/watch?v=NIWwJbo-9_8
import requests
try:
    r = requests.get("http://www.duckduckgo.com")
except requests.exceptions.ConnectionError:
    print('\n\tSorry. There was a network problem getting the URL. Perhaps it does not exist?\n\tCheck the URL, DNS issues or if you are being rejected by the server.')
else:
    print(r)
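Putting both answers together, one possible shape for the function the question asks for is sketched below. The name describe_site, the wording of the messages, and the 5-second default timeout are illustrative assumptions, not part of requests. SSLError is listed before ConnectionError because in requests it is a subclass of ConnectionError, and Timeout is listed before ConnectionError because ConnectTimeout inherits from both.

```python
import requests


def describe_site(url, timeout=5):
    """Return one short status line for url (hypothetical helper)."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx answers into HTTPError
    except requests.exceptions.SSLError:
        # Subclass of ConnectionError, so it has to be caught first.
        return "TLS handshake failed; the site may not exist or is misconfigured"
    except requests.exceptions.Timeout:
        return "The request timed out"
    except requests.exceptions.ConnectionError:
        return "The site probably does not exist (or the network is down)"
    except requests.exceptions.HTTPError as err:
        return f"The server answered with an error status: {err.response.status_code}"
    except requests.exceptions.RequestException as err:
        # Anything else requests can raise, e.g. a malformed URL.
        return f"Request failed: {type(err).__name__}"
    return "Site good"
```

Calling print(describe_site('https://thissitedoesnotexist.com')) would then print one of the short messages rather than the long traceback.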

Related

What is the correct way to attach messages to an error?

I have a project structured something like this, with multiple pieces of functionality, each often with several possible sources of error. One piece of functionality may also call something else that raises an error.
def functionality_one(arguments) -> str:
    try:
        status_feedback = attempt_functionality_one(arguments)
        # this would usually be multiple lines
    except ValueError as e:
        return "known-failure-code"
    except ConnectionError as e:
        raise ConnectionError("Some user-friendly message for unexpected error") from e
    else:
        return status_feedback
def main():
    ## when the relevant CLI argument is passed:
    try:
        status = functionality_one(arguments)
    except Exception as e:
        send_notification_to_user(e.args[0])
    else:
        send_notification_to_user(USER_FRIENDLY_SUCCESS_MESSAGES.get(status, "Success!"))
if __name__ == "__main__":
    main()
Focus on this bit about re-raising errors:
except ConnectionError as e:
    raise ConnectionError("Some user-friendly message for unexpected error") from e
I do this to attach a user-friendly message in the error that I can later display to the user. Is there a better way to accomplish this?
In particular, error tracebacks normally just show the errors that propagate. With this method, the traceback includes a message like "... was the direct cause of the following exception ..." and I don't know whether this is the norm in Python. Here's an example from the log file:
Traceback (most recent call last):
File "D:\username\Documents\tech-projects\project-name\src\auth.py", line 157, in login
login_request = post(
File "D:\username\Documents\tech-projects\project-name\.venv\lib\site-packages\requests\api.py", line 115, in post
return request("post", url, data=data, json=json, **kwargs)
File "D:\username\Documents\tech-projects\project-name\.venv\lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "D:\username\Documents\tech-projects\project-name\.venv\lib\site-packages\requests\sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "D:\username\Documents\tech-projects\project-name\.venv\lib\site-packages\requests\sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "D:\username\Documents\tech-projects\project-name\.venv\lib\site-packages\requests\adapters.py", line 565, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='wifi-login.university-website.domain', port=80): Max retries exceeded with url: /cgi-bin/authlogin (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001FF681B27D0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "D:\username\Documents\tech-projects\project-name\login_cli.py", line 269, in main
status_message: str = parsed_namespace.func(parsed_namespace)
File "D:\username\Documents\tech-projects\project-name\login_cli.py", line 197, in connect
return src.auth.login(credentials)
File "D:\username\Documents\tech-projects\project-name\src\auth.py", line 164, in login
raise ConnectionError(f"Server-side error. Contact IT support or wait until morning.") from e
requests.exceptions.ConnectionError: Server-side error. Contact IT support or wait until morning.
So what's the right way to do this? Feel free to suggest a change that completely changes the structure of the program too, if you feel that's necessary.
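For what it is worth, the "direct cause" line is what Python prints for any raise ... from e; the original exception is kept on the new one as its __cause__ attribute. One common alternative to reusing ConnectionError is a dedicated exception type for user-facing messages, so that main() only displays text that was written for users. The sketch below assumes a hypothetical UserFacingError class and a stand-in attempt_functionality_one that always fails; neither comes from the project in the question.

```python
class UserFacingError(Exception):
    """Hypothetical exception whose message is safe to show to users."""

    def __init__(self, user_message):
        super().__init__(user_message)
        self.user_message = user_message


def attempt_functionality_one(arguments):
    # Stand-in for the real work; it always fails here to show the flow.
    raise ConnectionError("getaddrinfo failed")


def functionality_one(arguments):
    try:
        return attempt_functionality_one(arguments)
    except ConnectionError as e:
        # "from e" stores the low-level error on the new one as __cause__.
        raise UserFacingError("Server-side error. Contact IT support.") from e


def run(arguments):
    try:
        return functionality_one(arguments)
    except UserFacingError as e:
        # Show the friendly text; the technical cause stays available for logs.
        return f"{e.user_message} (cause: {type(e.__cause__).__name__})"


print(run(None))
```

Catching UserFacingError instead of Exception in main() also means that an unexpected bug is not silently turned into a user notification.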

How to check the connectivity for a URL in python

My requirement is almost same as Requests — how to tell if you're getting a success message?
But I need to print an error whenever I cannot reach the URL. Here is my attempt:
# setting up the URL and checking the connection by printing the status
url = 'https://www.google.lk'
try:
    page = requests.get(url)
    print(page.status_code)
except requests.exceptions.HTTPError as err:
    print("Error")
The issue is that rather than printing just "Error", it prints a whole error message, as below.
Traceback (most recent call last):
File "testrun.py", line 22, in <module>
page = requests.get(url)
File "/root/anaconda3/envs/py36/lib/python3.6/site-packages/requests/api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "/root/anaconda3/envs/py36/lib/python3.6/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/root/anaconda3/envs/py36/lib/python3.6/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/root/anaconda3/envs/py36/lib/python3.6/site-packages/requests/sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "/root/anaconda3/envs/py36/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='learn.microsoft.com', port=443): Max retries exceeded with url: /en-us/microsoft-365/enterprise/urls-and-ip-address-ranges?view=o365-worldwide (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7ff91a543198>: Failed to establish a new connection: [Errno 110] Connection timed out',))
Can someone show me how I should modify my code to print only "Error" if there is any issue? Then I can extend it to some other requirement.
You're not catching the correct exception.
import requests
url = 'https://www.googlggggggge.lk'
try:
    page = requests.get(url)
    print(page.status_code)
except (requests.exceptions.HTTPError, requests.exceptions.ConnectionError):
    print("Error")
You can also use except Exception; however, note that Exception is too broad and is not recommended in most cases, since it traps all errors.
You need to either use a general except clause or catch all the exceptions that the requests module might throw, e.g. except (requests.exceptions.HTTPError, requests.exceptions.ConnectionError).
For full list see: Correct way to try/except using Python requests module?
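The reason the original except clause never fires is that HTTPError and ConnectionError are siblings: both derive from requests.exceptions.RequestException, and neither derives from the other. A minimal sketch of catching the shared base class follows; status_or_error and its 5-second timeout are a name and a value assumed for illustration.

```python
import requests

exc = requests.exceptions
# Siblings under RequestException: catching one does not catch the other.
assert not issubclass(exc.ConnectionError, exc.HTTPError)
assert issubclass(exc.ConnectionError, exc.RequestException)
assert issubclass(exc.HTTPError, exc.RequestException)


def status_or_error(url, timeout=5):
    """Return the HTTP status code, or the string "Error" if the request fails."""
    try:
        page = requests.get(url, timeout=timeout)
    except requests.exceptions.RequestException:
        # Covers ConnectionError, Timeout, SSLError and the other requests errors.
        return "Error"
    return page.status_code
```

This is narrower than except Exception, so unrelated bugs such as a NameError still surface as tracebacks.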

SSL problems using Twilio

I'm trying to use Twilio to send SMS. I'm using their templates to send my first test message:
import os
from twilio.rest import Client
client = Client(my_SID, my_TOKEN)
message = client.messages \
    .create(
        body="Join Earth's mightiest heroes. Like Kevin Bacon.",
        from_=number1,
        to=number2
    )
print(message.sid)
I've manually replaced the SID and the TOKEN with their respective values from Twilio's console (the os.environ[] lookup doesn't work). The thing is, this error appears when I try to run the code:
PS C:\Users\USER> & C:/Users/USER/anaconda3/python.exe "d:/Escritorio/amigo secreto/send_sms.py"
Traceback (most recent call last):
File "C:\Users\USER\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 688, in urlopen
conn = self._get_conn(timeout=pool_timeout)
File "C:\Users\USER\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 280, in _get_conn
return conn or self._new_conn()
File "C:\Users\USER\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 979, in _new_conn
raise SSLError(
urllib3.exceptions.SSLError: Can't connect to HTTPS URL because the SSL module is not available.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\USER\anaconda3\lib\site-packages\requests\adapters.py", line 439, in send
resp = conn.urlopen(
File "C:\Users\USER\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "C:\Users\USER\anaconda3\lib\site-packages\urllib3\util\retry.py", line 574, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.twilio.com', port=443): Max retries exceeded with url: /2010-04-01/Accounts/ACfd9e165c0a6ba1760d5671ccbfc5dbc6/Messages.json (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available."))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "d:/Escritorio/amigo secreto/send_sms.py", line 10, in <module>
message = client.messages \
File "C:\Users\USER\anaconda3\lib\site-packages\twilio\rest\api\v2010\account\message\__init__.py", line 88, in create
payload = self._version.create(method='POST', uri=self._uri, data=data, )
File "C:\Users\USER\anaconda3\lib\site-packages\twilio\base\version.py", line 193, in create
response = self.request(
File "C:\Users\USER\anaconda3\lib\site-packages\twilio\base\version.py", line 39, in request
return self.domain.request(
File "C:\Users\USER\anaconda3\lib\site-packages\twilio\base\domain.py", line 38, in request
return self.twilio.request(
File "C:\Users\USER\anaconda3\lib\site-packages\twilio\rest\__init__.py", line 131, in request
return self.http_client.request(
File "C:\Users\USER\anaconda3\lib\site-packages\twilio\http\http_client.py", line 91, in request
response = session.send(
File "C:\Users\USER\anaconda3\lib\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "C:\Users\USER\anaconda3\lib\site-packages\requests\adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='api.twilio.com', port=443): Max retries exceeded with url: /2010-04-01/Accounts/ACfd9e165c0a6ba1760d5671ccbfc5dbc6/Messages.json (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available."))
I've never used an API before, I could really use somebody's guidance. Thanks in advance
Twilio developer evangelist here.
That error looks as though you have issues between your installation of Anaconda and the SSL module. There's a potential fix from this GitHub issue, run:
execstack -c anaconda3/lib/libcrypto.so.1.0.0
Alternatively, other comments suggest installing OpenSSL for your environment.
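Before changing the Anaconda installation, it can help to confirm that the interpreter itself cannot load the standard-library ssl module, since that is what the urllib3 message refers to. The following is a small diagnostic sketch; the function name ssl_status is made up for this example.

```python
import importlib


def ssl_status():
    """Report whether this interpreter can import the ssl module."""
    try:
        ssl = importlib.import_module("ssl")
    except ImportError as err:
        return f"ssl module unavailable: {err}"
    # OPENSSL_VERSION names the OpenSSL build that Python is linked against.
    return f"ssl module available: {ssl.OPENSSL_VERSION}"


print(ssl_status())
```

Run it with the same python.exe that runs send_sms.py; if it reports the module as unavailable, the problem lies in the Python environment rather than in the Twilio code.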

Python 3 Requests errors

I'm working through Python Crash Course 2nd Ed. and in the text is some code for accessing APIs. My code is copied from the text and is as follows:
import requests
import json
from operator import itemgetter
#Fetch top stories and store in variable r
url = 'https://hacker-news.firebaseio.com/v0/topstories.json'
r = requests.get(url)
print(f"Status code: {r.status_code}")
# #Explore data structure
# response_dict = r.json()
# readable_file = 'hn_readable.json'
# with open(readable_file, 'w') as f:
# json.dump(response_dict, f, indent=4)
submission_ids = r.json()
submission_dicts = []
for submission_id in submission_ids[:30]:
    #Make API call for each article
    url = f"https://hacker-news.firebasio.com/v0/item/{submission_id}.json"
    r = requests.get(url)
    print(f"id: {submission_id}\tstatus code: {r.status_code}")
    response_dict = r.json()
    #Store dictionary of each article
    submission_dict = {
        'title': response_dict['title'],
        'score': response_dict['score'],
        'comments': response_dict['descendants'],
        'link': response_dict['url'],
    }
    submission_dicts.append(submission_dict)
#Sort article by score
submission_dicts = sorted(submission_dicts, key=itemgetter('score'), reverse = True)
#Display information about each article, ranked by score
for submission_dict in submission_dicts:
    print(f"Article title: {submission_dict['title']}")
    print(f"Article link: {submission_dict['url']}")
    print(f"Score: {submission_dict['score']}")
However, this is now returning the following error messages:
Status code: 200
Traceback (most recent call last):
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\urllib3\connectionpool.py", line 677, in urlopen
chunked=chunked,
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\urllib3\connectionpool.py", line 381, in _make_request
self._validate_conn(conn)
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\urllib3\connectionpool.py", line 976, in _validate_conn
conn.connect()
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\urllib3\connection.py", line 370, in connect
ssl_context=context,
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\urllib3\util\ssl_.py", line 377, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\Users\snack\Python\lib\ssl.py", line 423, in wrap_socket
session=session
File "C:\Users\snack\Python\lib\ssl.py", line 870, in _create
self.do_handshake()
File "C:\Users\snack\Python\lib\ssl.py", line 1139, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1076)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\requests\adapters.py", line 449, in send
timeout=timeout
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\urllib3\connectionpool.py", line 725, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\urllib3\util\retry.py", line 439, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='hacker-news.firebasio.com', port=443): Max retries exceeded with url: /v0/item/23273247.json (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1076)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\snack\Python\proj_2\hn_submissions.py", line 24, in <module>
r = requests.get(url)
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\requests\api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\requests\sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\requests\sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "C:\Users\snack\AppData\Roaming\Python\Python37\site-packages\requests\adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='hacker-news.firebasio.com', port=443): Max retries exceeded with url: /v0/item/23273247.json (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1076)')))
[Finished in 3.6s]
I have almost no experience with this, but from what I can tell, some authentication is failing and not letting my program access the API, but I have no idea why. I've tried limiting the number of API calls by removing the loop, but it doesn't seem to help. I also tried adding the verify=False parameter into the requests.get lines, but that just kicked up different errors.
There is nothing wrong with the API call itself.
If you visit https://hacker-news.firebaseio.com/v0/topstories.json in a browser, you can see the expected list. (This is your first, working API call.)
As the first number in this list is 23277594, the script starts with the request https://hacker-news.firebasio.com/v0/item/23277594.json, and visiting this URL in a browser also results in warnings. (This is your second, failing API call.)
Alright, it was a typo (of course). The URL in my code was https...firebasio....json instead of https...firebaseio....json. One of the results is still not working, but I'm assuming that's because the article has no comments (i.e. no descendants), so a try/except should fix that.
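For the remaining failure, a sketch of tolerating missing keys when building each entry could look like this. It assumes that some items simply lack a descendants or url field; the helper name build_submission, the fallback values, and the sample dictionary are illustrative, not real API output.

```python
def build_submission(response_dict):
    """Build one summary entry, tolerating fields that an item may lack."""
    return {
        'title': response_dict['title'],
        'score': response_dict['score'],
        # dict.get returns the fallback instead of raising KeyError.
        'comments': response_dict.get('descendants', 0),
        'link': response_dict.get('url', ''),
    }


# Made-up item without 'descendants' or 'url', for illustration only.
sample = {'title': 'Example story', 'score': 10}
print(build_submission(sample))
```

Using dict.get keeps the loop body flat; a try/except KeyError around the dictionary construction would work too, at the cost of dropping the whole entry when one field is absent.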

How to move on if an error occurs in the response in Python with Beautiful Soup

I have made a web crawler that takes thousands of URLs from a text file and then crawls the data on each webpage.
Since there are so many URLs, some of them are broken.
So it gives me this error:
Traceback (most recent call last):
File "C:/Users/khize_000/PycharmProjects/untitled3/new.py", line 57, in <module>
crawl_data("http://www.foasdasdasdasdodily.com/r/126e7649cc-sweetssssie-pies-mac-and-cheese-recipe-by-the-dr-oz-show")
File "C:/Users/khize_000/PycharmProjects/untitled3/new.py", line 18, in crawl_data
data = requests.get(url)
File "C:\Python27\lib\site-packages\requests\api.py", line 67, in get
return request('get', url, params=params, **kwargs)
File "C:\Python27\lib\site-packages\requests\api.py", line 53, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "C:\Python27\lib\site-packages\requests\adapters.py", line 437, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='www.foasdasdasdasdodily.com', port=80): Max retries exceeded with url: /r/126e7649cc-sweetssssie-pies-mac-and-cheese-recipe-by-the-dr-oz-show (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x0310FCB0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed',))
Here's my code:
def crawl_data(url):
    global connectString
    data = requests.get(url)
    response = str( data )
    if response != "<Response [200]>":
        return
    soup = BeautifulSoup(data.text,"lxml")
    titledb = soup.h1.string
But it still gives me the same exception.
I simply want it to ignore the URLs from which there is no response
and move on to the next URL.
You need to learn about exception handling. The easiest way to ignore these errors is to surround the code that processes a single URL with a try-except construct, making your code read something like:
try:
    <process a single URL>
except requests.exceptions.ConnectionError:
    pass
This means that if the specified exception occurs, your program will just execute the pass (do nothing) statement and move on to the next URL.
Use try-except:
def crawl_data(url):
    global connectString
    try:
        data = requests.get(url)
    except requests.exceptions.ConnectionError:
        return
    response = str( data )
    soup = BeautifulSoup(data.text,"lxml")
    titledb = soup.h1.string
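If the caller should keep control of the loop, the same idea can be written with continue. This is a sketch under assumptions: fetch_pages is a hypothetical name, the 10-second timeout is arbitrary, and RequestException is caught instead of only ConnectionError so that timeouts and malformed URLs are skipped as well.

```python
import requests


def fetch_pages(urls, timeout=10):
    """Return {url: html} for URLs that answered with 200; skip the rest."""
    pages = {}
    for url in urls:
        try:
            response = requests.get(url, timeout=timeout)
        except requests.exceptions.RequestException as err:
            # Broken URL: report it briefly and move on to the next one.
            print(f"skipping {url}: {type(err).__name__}")
            continue
        if response.status_code != requests.codes.ok:
            continue
        pages[url] = response.text
    return pages
```

Each returned HTML string can then be handed to BeautifulSoup as in crawl_data; checking that soup.h1 is not None before reading .string avoids a separate AttributeError on pages without an h1 element.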
