This question already has an answer here:
Requests failing to connect to a TLS server
(1 answer)
Closed 5 years ago.
I'm using Python and I'm trying to scrape this website:
https://online.ratb.ro/info/browsers.aspx
But I'm getting this error:
Traceback (most recent call last):
File "C:\Users\pinguluk\Desktop\Proiecte GIT\RATB Scraper\test2.py", line 3, in <module>
test = requests.get('https://online.ratb.ro/info/browsers.aspx')
File "C:\Python27\lib\site-packages\requests\api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "C:\Python27\lib\site-packages\requests\api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 518, in request
resp = self.send(prep, **send_kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 639, in send
r = adapter.send(request, **kwargs)
File "C:\Python27\lib\site-packages\requests\adapters.py", line 512, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
Installed modules:
['appdirs==1.4.3', 'asn1crypto==0.22.0', 'attrs==16.3.0',
'automat==0.5.0', 'beautifulsoup4==4.5.3', 'cairocffi==0.8.0',
'certifi==2017.4.17', 'cffi==1.10.0', 'colorama==0.3.9',
'constantly==15.1.0', 'cryptography==1.8.1', 'cssselect==1.0.1',
'cycler==0.10.0', 'distributedlock==1.2', 'django-annoying==0.10.3',
'django-oauth-tokens==0.6.3', 'django-taggit==0.22.1',
'django==1.11.1', 'enum34==1.1.6', 'facepy==1.0.8',
'functools32==3.2.3.post2', 'futures==3.1.1', 'gevent==1.2.1',
'greenlet==0.4.12', 'grequests==0.3.0', 'html5lib==0.999999999',
'htmlparser==0.0.2', 'httplib2==0.10.3', 'idna==2.5',
'incremental==16.10.1', 'ipaddress==1.0.18', 'lazyme==0.0.10',
'lxml==3.7.3', 'matplotlib==2.0.2', 'mechanize==0.3.3',
'ndg-httpsclient==0.4.2', 'numpy==1.12.1', 'oauthlib==2.0.2',
'olefile==0.44', 'opencv-python==3.2.0.7', 'packaging==16.8',
'parsel==1.1.0', 'pillow==4.0.0', 'pip==9.0.1', 'py2exe==0.6.9',
'pyandoc==0.0.1', 'pyasn1-modules==0.0.8', 'pyasn1==0.2.3',
'pycairo-gtk==1.10.0', 'pycparser==2.17', 'pygtk==2.22.0',
'pyhook==1.5.1', 'pynput==1.3.2', 'pyopenssl==17.0.0',
'pyparsing==2.2.0', 'pypiwin32==219', 'pyquery==1.2.17',
'python-dateutil==2.6.0', 'python-memcached==1.58', 'pytz==2017.2',
'pywin32==221', 'queuelib==1.4.2', 'requests-futures==0.9.7',
'requests-oauthlib==0.8.0', 'requests-toolbelt==0.8.0',
'requests==2.14.2', 'restclient==0.11.0', 'robobrowser==0.5.3',
'selenium==3.4.1', 'service-identity==16.0.0', 'setuptools==35.0.2',
'simplejson==3.10.0', 'six==1.10.0', 'twitter==1.17.0',
'twitterfollowbot==2.0.2', 'urllib3==1.21.1', 'w3lib==1.17.0',
'webencodings==0.5.1', 'werkzeug==0.12.1', 'wheel==0.29.0',
'zope.interface==4.3.3']
Thanks.
I think you will have a hard time solving this problem, since the server you are trying to "scrape" is awfully configured (ssllabs.com gave it a grade F), and it might be that Requests doesn't support any of its cipher suites because they are all insecure. One option might be to create a custom HTTPAdapter, so you could try that.
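A custom HTTPAdapter along those lines could look roughly like the sketch below. It is only an illustration: the class name CipherAdapter and the helper make_session are made up for this example, the cipher string is whatever you decide to allow, and whether an ssl_context passed this way is honoured may depend on your requests/urllib3 versions.

```python
import ssl

import requests
from requests.adapters import HTTPAdapter


class CipherAdapter(HTTPAdapter):
    """Transport adapter that builds its TLS context from a given cipher string.

    Hypothetical helper for illustration; not part of requests itself.
    """

    def __init__(self, ciphers, **kwargs):
        # Store the ciphers first: HTTPAdapter.__init__ calls init_poolmanager().
        self._ciphers = ciphers
        super().__init__(**kwargs)

    def init_poolmanager(self, *args, **kwargs):
        context = ssl.create_default_context()
        context.set_ciphers(self._ciphers)  # OpenSSL cipher-list syntax
        kwargs["ssl_context"] = context     # handed through to urllib3's PoolManager
        return super().init_poolmanager(*args, **kwargs)


def make_session(prefix, ciphers):
    """Return a Session that uses CipherAdapter only for URLs starting with prefix."""
    session = requests.Session()
    session.mount(prefix, CipherAdapter(ciphers))
    return session
```

You would then call something like make_session("https://online.ratb.ro/", "DEFAULT:@SECLEVEL=1").get(url, timeout=10). That cipher string is an assumption, not something known to work with this server; the cipher list the server actually offers (for example from the ssllabs.com report) should drive the choice. Mounting the adapter on one prefix keeps the weaker settings away from every other host.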
You can try using:
requests.get(url, verify=False)
if you don't need to check the authenticity of the SSL certificate.
This question already has answers here:
Python requests.exceptions.SSLError: EOF occurred in violation of protocol
(9 answers)
Closed 1 year ago.
I am learning to use requests in Python and I need a way to get a meaningful output if the site does not exist at all.
I looked at this question, but it is unclear whether the OP actually wants to check if the site exists, or just whether it returns an error. The problem with all of the answers to that question is that if the site does not exist at all, we cannot use HTTP response headers, because no response is returned from a server that does not exist.
Here is an example.
If I use this code I will not get any errors because the site exists.
import requests
r = requests.get('https://duckduckgo.com')
However, if I enter a web page I know does not exist I will get an error
import requests
r = requests.get('https://thissitedoesnotexist.com')
if r.status_code == requests.codes.ok:
    print('Site good')
else:
    print('Site bad')
This error is super long and I would prefer to have a more meaningful and short error if the site does not exist.
Traceback (most recent call last):
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 382, in _make_request
self._validate_conn(conn)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 1010, in _validate_conn
conn.connect()
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connection.py", line 416, in connect
self.sock = ssl_wrap_socket(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\util\ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\util\ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\ssl.py", line 512, in wrap_socket
return self.sslsocket_class._create(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\ssl.py", line 1070, in _create
self.do_handshake()
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\ssl.py", line 1341, in do_handshake
self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:997)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\adapters.py", line 439, in send
resp = conn.urlopen(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\util\retry.py", line 574, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='234876.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:997)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\ADMIN\Desktop\tetst.py", line 2, in <module>
r = requests.get('https://234876.com')
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='234876.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:997)')))
Is it possible to make a function that returns, for example print('The site probably does not exist') or at least does not give an EOF error?
Normally the desirable thing to do is to trap the exceptions raised by requests.
You can also call .raise_for_status() on the Response to get a meaningful exception for non-OK responses.
However, think about where you want to handle each exception:
immediately? Can your program handle it meaningfully, or should it exit?
should the caller handle a specific exception (such as requests.exceptions.Timeout) or a more general one?
do you have many functions that call each other? Should any of them handle some subset of the possible exceptions, and which?
See the Python Exception Hierarchy for how the built-in exceptions inherit from one another.
import sys
import requests

def some_function_which_makes_requests():
    r = requests.get("https://example.com", timeout=(2, 10))
    r.raise_for_status()  # raise for non-OK
    return r.json()  # interpret response via some method (for example as JSON)

def main():
    ...
    try:
        result_json = some_function_which_makes_requests()
    except requests.exceptions.Timeout:
        print("WARNING: request timed out")
        result_json = None  # still effectively handled for later program?
    except requests.exceptions.RequestException as ex:
        sys.exit(f"something wrong with Request: {repr(ex)}")
    except Exception as ex:
        sys.exit(f"something wrong around Request: {repr(ex)}")
    # now you can use result_json
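To see why the order of the except clauses matters, you can print the inheritance chain of an exception class. This is a small stdlib-only sketch; the helper name ancestry is made up for the example.

```python
def ancestry(exc_type):
    """Return the class names from exc_type up to object, most specific first."""
    return [cls.__name__ for cls in exc_type.__mro__]


# A handler for any class in this list will catch a ConnectionResetError,
# so the most specific "except" clause has to come first.
print(ancestry(ConnectionResetError))
# ['ConnectionResetError', 'ConnectionError', 'OSError', 'Exception', 'BaseException', 'object']
```

The same helper can be pointed at a class from requests.exceptions to check where it sits relative to requests.exceptions.RequestException.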
I did some more research and learned that I need to use a Python try/except, as mentioned by @Anand Sowmithiran. Here is a video explaining it for beginners: https://www.youtube.com/watch?v=NIWwJbo-9_8
import requests
try:
    r = requests.get("http://www.duckduckgo.com")
except requests.exceptions.ConnectionError:
    print('\n\tSorry. There was a network problem getting the URL. Perhaps it does not exist?\n\tCheck the URL, DNS issues or if you are being rejected by the server.')
else:
    print(r)
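Wrapped into a function, as the question asks for, this might look like the sketch below. The function names and message texts are made up for illustration. Note that a failed connection cannot prove the site "does not exist"; the messages only summarise which kind of exception requests raised.

```python
import requests


def describe_failure(exc):
    """Map a requests exception to a short, one-line message."""
    # SSLError and Timeout are tested before ConnectionError because
    # SSLError and ConnectTimeout are subclasses of ConnectionError.
    if isinstance(exc, requests.exceptions.SSLError):
        return "TLS handshake failed; the server may not speak HTTPS properly"
    if isinstance(exc, requests.exceptions.Timeout):
        return "The request timed out"
    if isinstance(exc, requests.exceptions.ConnectionError):
        return "Could not connect; the site probably does not exist or is down"
    return f"Request failed: {type(exc).__name__}"


def check_site(url, timeout=(3, 10)):
    """Return a short status string instead of letting a long traceback escape."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()  # turn 4xx/5xx answers into HTTPError
    except requests.exceptions.RequestException as exc:
        return describe_failure(exc)
    return f"Site good ({r.status_code})"
```

Calling print(check_site('https://234876.com')) would then print a single line rather than the three chained tracebacks shown in the question.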
I am experimenting in Python to retrieve the saved searches in my space in Kibana.
I am trying to follow the example at https://www.elastic.co/guide/en/kibana/current/saved-objects-api-find.html
Here is my code.
r = requests.get('http://myhost.com:9200/s/guy-levin/api/saved_objects/_find?type=search',
auth=(username, password))
print(r.status_code)
print(r.text)
print(r.json())
I get the output:
400
{"error":"no handler found for uri [/s/guy-levin/api/saved_objects/_find?type=search] and method [GET]"}
{'error': 'no handler found for uri [/s/guy-levin/api/saved_objects/_find?type=search] and method [GET]'}
I also tried es.search(), but if I try es.search(doc_type='search') [not even sure if this is right; Internet searches not helpful thus far], I get a stack trace ending with:
elasticsearch.exceptions.AuthorizationException: AuthorizationException(403, 'security_exception', 'action [indices:data/read/search] is unauthorized for user [some_user_name]')
Changing port to 5601, I got this stack trace:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/git/KibanaReader/main.py", line 207, in <module>
get_saved_search('Blah Blah REST API')
File "C:/git/KibanaReader/main.py", line 90, in get_saved_search
r = requests.get('http://kbqa2.nayax.com:5601/s/guy-levin/api/saved_objects/_find?type=search',
File "C:\git\KibanaReader\venv\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "C:\git\KibanaReader\venv\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\git\KibanaReader\venv\lib\site-packages\requests\sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "C:\git\KibanaReader\venv\lib\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "C:\git\KibanaReader\venv\lib\site-packages\requests\adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
The Saved Object API is a Kibana API, so you need to target the Kibana endpoint (port 5601 by default), not the Elasticsearch endpoint (port 9200 by default).
The right URL should be
http://myhost.com:5601/s/guy-levin/api/saved_objects/_find?type=search
                  ^
                  |
                  change this
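As a rough sketch, the request could be assembled like this. The helper name build_find_request and the credentials are placeholders; the host, space and object type are the ones from the question. Preparing the request first lets you inspect the final URL before anything is sent.

```python
import requests


def build_find_request(kibana_url, space, object_type, username, password):
    """Prepare (but do not send) a GET against Kibana's saved objects _find API."""
    url = f"{kibana_url.rstrip('/')}/s/{space}/api/saved_objects/_find"
    request = requests.Request(
        "GET",
        url,
        params={"type": object_type},  # becomes ?type=search
        auth=(username, password),     # HTTP basic auth, as in the question
    )
    return request.prepare()


# "user" and "secret" are placeholder credentials.
prepared = build_find_request("http://myhost.com:5601", "guy-levin", "search", "user", "secret")
print(prepared.url)
# http://myhost.com:5601/s/guy-levin/api/saved_objects/_find?type=search
```

To actually send it, pass the prepared request to requests.Session().send(prepared, timeout=10). If the Kibana instance is served over TLS, the scheme would need to be https instead, which might also be worth checking given the "Remote end closed connection" error on port 5601.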
I would like to authenticate to a server from my client using a certificate generated by the server. I have a server-ca.crt, and the curl command below works. How can I send a similar request using the Python requests module?
$ curl -X GET -u sat_username:sat_password \
-H "Accept:application/json" --cacert katello-server-ca.crt \
https://satellite6.example.com/katello/api/organizations
I have tried the following, and it raises an exception. Can someone help me resolve this issue?
python requestsCert.py
Traceback (most recent call last):
File "requestsCert.py", line 2, in <module>
res=requests.get('https://satellite6.example.com/katello/api/organizations', cert='/certificateTests/katello-server-ca.crt', verify=True)
File "/usr/lib/python2.7/site-packages/requests/api.py", line 68, in get
return request('get', url, **kwargs)
File "/usr/lib/python2.7/site-packages/requests/api.py", line 50, in request
response = session.request(method=method, url=url, **kwargs)
File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 464, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 431, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: [SSL] PEM lib (_ssl.c:2554)
res=requests.get('https://...', cert='/certificateTests/katello-server-ca.crt', verify=True)
The cert argument in requests.get is used to specify the client certificate and key which should be used for mutual authentication. It is not used to specify the trusted CA as the --cacert argument in curl does. Instead you should use the verify argument:
res=requests.get('https://...', verify='/certificateTests/katello-server-ca.crt')
For more information see SSL Cert Verification and Client Side Certificates in the documentation for requests.
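Put together, a rough equivalent of the full curl command could be set up on a Session as sketched here. The function name make_satellite_session is made up for this example; the comments show which curl option each setting corresponds to.

```python
import requests


def make_satellite_session(username, password, ca_bundle):
    """Build a Session mirroring the curl options -u, -H and --cacert."""
    session = requests.Session()
    session.auth = (username, password)             # curl -u user:password
    session.headers["Accept"] = "application/json"  # curl -H "Accept:application/json"
    session.verify = ca_bundle                      # curl --cacert <file>
    return session
```

With that in place, make_satellite_session('sat_username', 'sat_password', '/certificateTests/katello-server-ca.crt').get('https://satellite6.example.com/katello/api/organizations') should behave like the curl call. The cert attribute is left untouched because no client certificate is involved here.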
I want to build my own library for server -> browser communication via websockets. I have tried to build something; it's here: https://github.com/duverse/wsc. But sometimes, when I try to make an API call to my own server, I get something like this:
Python requests
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/project/.venv/local/lib/python2.7/site-packages/wsc/client/__init__.py", line 109, in stat
'Access-Key': self._access_key
File "/project/.venv/local/lib/python2.7/site-packages/requests/api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "/project/.venv/local/lib/python2.7/site-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/project/.venv/local/lib/python2.7/site-packages/requests/sessions.py", line 502, in request
resp = self.send(prep, **send_kwargs)
File "/project/.venv/local/lib/python2.7/site-packages/requests/sessions.py", line 612, in send
r = adapter.send(request, **kwargs)
File "/project/.venv/local/lib/python2.7/site-packages/requests/adapters.py", line 490, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))
I think the problem is in this method: https://github.com/duverse/wsc/blob/master/wsc/server/request.py#L113, because all websocket connections work fine. It is probably trying to receive a packet that is empty, but I am not sure.
P.S. I have used SSLSocket, and packets from the websocket request are always split: the first read always gives me only one byte, and the rest arrives afterwards.
P.P.S. The WSC server log contains only this:
WSC.log
----------------------------------------
Exception happened during processing of request from ('123.123.123.123', 45757)
----------------------------------------
There is no traceback. (The IP address was replaced.)
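If the split reads are indeed the cause, one thing to try is reading in a loop until the end of the HTTP headers, since a single recv() on a TCP or TLS socket may return only part of what the peer sent. The sketch below is a guess at a fix, not taken from the wsc code; the function name recv_until and its limits are made up for illustration.

```python
import socket


def recv_until(sock, delimiter=b"\r\n\r\n", bufsize=4096, max_bytes=65536):
    """Keep reading from sock until delimiter is seen or the peer closes."""
    data = bytearray()
    while delimiter not in data:
        chunk = sock.recv(bufsize)
        if not chunk:  # empty read: the peer closed the connection
            break
        data.extend(chunk)
        if len(data) > max_bytes:
            raise ValueError("request headers too large")
    return bytes(data)


# Local demonstration: the request arrives as 1 byte, then the rest.
client, server = socket.socketpair()
client.sendall(b"G")
client.sendall(b"ET /stat HTTP/1.1\r\nHost: example\r\n\r\n")
request = recv_until(server)
client.close()
server.close()
print(request.decode())
```

The loop also treats an empty read as "connection closed" instead of trying to parse it, which covers the empty-packet case mentioned above.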
I am using Requests 1.2.3 on Windows 7 x64 and am trying to connect to (any) site via HTTPS through an HTTPS proxy by passing the proxies argument to the request.
I don't experience this error when using urllib2's ProxyHandler, so I don't think it's on my proxy's side.
>>> opener = urllib2.build_opener(urllib2.ProxyHandler({'https': 'IP:PORT'}))
>>> resp = opener.open('https://www.google.com')
>>> resp.url
'https://www.google.co.uk/'
>>> resp = requests.get('https://www.google.com', proxies={'https': 'IP:PORT'})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\requests\api.py", line 55, in get
return request('get', url, **kwargs)
File "C:\Python27\lib\site-packages\requests\api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 335, in request
resp = self.send(prep, **send_kwargs)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 438, in send
r = adapter.send(request, **kwargs)
File "C:\Python27\lib\site-packages\requests\adapters.py", line 331, in send
raise SSLError(e)
requests.exceptions.SSLError: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
I should probably note that the same error still happens if I pass verify=False to the request.
Any suggestions? I've looked at related questions but there was nothing that worked for me.
I suspect your proxy is an HTTP proxy over which you can tunnel HTTPS (the common case).
The problem is that requests uses HTTPS to talk to the proxy if the request itself is HTTPS.
Using an explicit protocol (http) for your proxy should fix things: proxies={'https': 'http://IP:PORT'}
Also have a look at https://github.com/kennethreitz/requests/issues/1182
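A small helper can make the scheme explicit whenever the proxy is given as a bare address. This is only a sketch; the names with_scheme and build_proxies and the address 10.0.0.1:3128 are made up for the example.

```python
def with_scheme(proxy, default_scheme="http"):
    """Prefix a bare 'host:port' proxy address with an explicit scheme."""
    if "://" in proxy:
        return proxy  # already explicit, leave untouched
    return f"{default_scheme}://{proxy}"


def build_proxies(proxy):
    """Build the proxies mapping for HTTPS requests sent through an HTTP proxy."""
    return {"https": with_scheme(proxy)}


print(build_proxies("10.0.0.1:3128"))
# {'https': 'http://10.0.0.1:3128'}
```

The resulting dict is what you would pass as requests.get('https://www.google.com', proxies=build_proxies('IP:PORT')).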