Opening a page with urllib2 handshake failure - python

I am simply trying to open a webpage: https://close5.com/home/
And I keep getting different errors concerning my SSL setup. Here are a couple of my attempts and their errors; I am open to a fix that works in either framework. My end goal is to turn this page into a BeautifulSoup 4 soup.
The error:
Traceback (most recent call last):
File "test.py", line 54, in <module>
print soup_maker_two(url)
File "test.py", line 45, in soup_maker_two
response = br.open(url)
File "/usr/local/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 203, in open
return self._mech_open(url, data, timeout=timeout)
File "/usr/local/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 230, in _mech_open
response = UserAgentBase.open(self, request, data)
File "/usr/local/lib/python2.7/dist-packages/mechanize/_opener.py", line 193, in open
response = urlopen(self, req, data)
File "/usr/local/lib/python2.7/dist-packages/mechanize/_urllib2_fork.py", line 344, in _open
'_open', req)
File "/usr/local/lib/python2.7/dist-packages/mechanize/_urllib2_fork.py", line 332, in _call_chain
result = func(*args)
File "/usr/local/lib/python2.7/dist-packages/mechanize/_urllib2_fork.py", line 1170, in https_open
return self.do_open(conn_factory, req)
File "/usr/local/lib/python2.7/dist-packages/mechanize/_urllib2_fork.py", line 1118, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:510: error:14094410:SSL routines:SSL3_READ_BYTES:sslv3 alert handshake failure>
The code:
import mechanize
import ssl
from functools import wraps

def sslwrap(func):
    @wraps(func)
    def bar(*args, **kw):
        kw['ssl_version'] = ssl.PROTOCOL_TLSv1
        return func(*args, **kw)
    return bar

def soup_maker_two(url):
    br = mechanize.Browser()
    br.set_handle_robots(False)
    br.set_handle_equiv(False)
    br.set_handle_refresh(False)
    br.addheaders = [('User-agent', 'Firefox')]
    ssl.wrap_socket = sslwrap(ssl.wrap_socket)
    response = br.open(url)
    for f in br.forms():
        print f
    return 'hi'

if __name__ == "__main__":
    url = 'https://close5.com/'
    print soup_maker_two(url)
I have also tried a second approach and got this error and code combo.
2nd Attempt
The error:
Traceback (most recent call last):
File "test.py", line 29, in <module>
print str(soup_maker(url))[0:1000]
File "test.py", line 22, in soup_maker
webpage = opener.open(req)
File "/usr/lib/python2.7/urllib2.py", line 404, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 422, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1222, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1184, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:510: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure>
The code:
from bs4 import BeautifulSoup
import urllib2

def soup_maker(url):
    class RedirectHandler(urllib2.HTTPRedirectHandler):
        def http_error_302(self, req, fp, code, msg, headers):
            result = urllib2.HTTPError(req.get_full_url(), code, msg, headers, fp)
            result.status = code
            return result
    hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
           'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
           'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
           'Accept-Encoding': 'none',
           'Accept-Language': 'en-US,en;q=0.8',
           'Connection': 'keep-alive'}
    req = urllib2.Request(url, headers=hdr)
    opener = urllib2.build_opener(RedirectHandler())
    webpage = opener.open(req)
    soup = BeautifulSoup(webpage, "html5lib")
    return soup

if __name__ == "__main__":
    url = 'https://close5.com/home/'
    print str(soup_maker(url))[0:1000]
EDIT 1
It was suggested that I use:

from bs4 import BeautifulSoup

def soup_maker(url):
    soup = BeautifulSoup(requests.get(url).content, "html5lib")
    return soup

if __name__ == "__main__":
    import requests
    url = 'https://close5.com/home/'
    print str(soup_maker(url))[:1000]
This code worked for Padraic, but does not work for me. I get the error:
Traceback (most recent call last):
File "test_3.py", line 10, in <module>
print str(soup_maker(url))[:1000]
File "test_3.py", line 4, in soup_maker
soup = BeautifulSoup(requests.get(url).content, "html5lib")
File "/usr/lib/python2.7/dist-packages/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 455, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 558, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 385, in send
raise SSLError(e)
requests.exceptions.SSLError: [Errno 1] _ssl.c:510: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure
This is the same error as before. My guess is that it has something to do with the fact that I am using Python 2.7.6, but I am uncertain, and I am unsure how to use that information to solve my issue.
EDIT 2
The issue may lie in an outdated version of requests. I currently have requests==2.2.1 in my pip freeze.
sudo pip install -U requests
returns
Downloading/unpacking requests from https://pypi.python.org/packages/2.7/r/requests/requests-2.9.1-py2.py3-none-any.whl#md5=58a444aaa02780ad01983f5f540e67b2
Downloading requests-2.9.1-py2.py3-none-any.whl (501kB): 501kB downloaded
Installing collected packages: requests
Found existing installation: requests 2.2.1
Not uninstalling requests at /usr/lib/python2.7/dist-packages, owned by OS
Successfully installed requests
Cleaning up..
sudo pip2 install -U requests returns the same thing
sudo pip uninstall requests returns
Not uninstalling requests at /usr/lib/python2.7/dist-packages, owned by OS
I am running Ubuntu 14.04 with Python 2.7.6 and requests 2.2.1.
Edit 3
sudo pip install --ignore-installed requests
gives
Downloading/unpacking requests
Downloading requests-2.9.1-py2.py3-none-any.whl (501kB): 501kB downloaded
Installing collected packages: requests
Successfully installed requests
Cleaning up...
but sudo pip freeze still gives requests==2.2.1
Edit 4
After going through many suggestions, I now have:
$python
Python 2.7.6 (default, Jun 22 2015, 18:00:18)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests;requests.__version__
'2.9.1'
>>> url = 'https://close5.com/home/'
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(requests.get(url).content, "html5lib")
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:315: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
SNIMissingWarning
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:120: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 67, in get
return request('get', url, params=params, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 53, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 447, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: [Errno 1] _ssl.c:510: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure
>>>

I would recommend using requests:
def soup_maker(url):
    soup = BeautifulSoup(requests.get(url).content)
    return soup

if __name__ == "__main__":
    import requests
    from bs4 import BeautifulSoup  # missing from the original snippet
    url = 'https://close5.com/home/'
    print str(soup_maker(url))[:1000]
Which will give you what you require:
<html><head><title>Buy & Sell Locally with Close5</title><meta content="Close5 provides a safe and easy environment to list your items and sell them fast. Shop cars, home goods and Children's items locally with Close5" name="description"/><meta content="index, follow" name="robots"/><!--link(rel="canonical" href="https://www.close5.com")-->
<link href="https://www.close5.com/images/favicons/favicon-160x160.png" rel="image_src"/><meta content="index, follow" name="robots"/><!-- Facebook Item Tags--><meta content="Buy & Sell Locally with Close5" property="og:title"/><meta content="Close5" property="og:site_name"/><!-- meta(property="og:url" content='https://www.close5.com/images/app-icon.png')--><meta content="Close5 provides a safe and easy environment to list your items and sell them fast. Shop cars, home goods and Children's items locally with Close5" property="og:description"/><meta content="1470902013158927" property="fb:app_id"/><meta content="100000228184034" property="fb:
Edit1:
Your version of requests is ancient; upgrade it with pip install -U requests
Edit2:
You installed requests with apt-get so you need to:
apt-get remove python-requests
pip install --ignore-installed requests # pip install -U requests should also work
I would remove pip altogether, download get-pip.py, run python get-pip.py, and stick to using pip to install packages. Most likely pip did successfully install requests; the newer version is probably further down in your path.
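A quick way to confirm which copy of requests Python actually picks up (a diagnostic sketch; the version and path will differ on your machine):

import requests
# the version tells you which install won; the path tells you where it lives
print requests.__version__
print requests.__file__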
Edit3:
You installed requests with apt-get so you cannot remove it with pip, use apt-get remove python-requests as suggested in Edit2.
Edit4:
The link in the output explains what is happening and suggests:
pip install pyopenssl ndg-httpsclient pyasn1
You can also:
pip install requests[security]
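With the security extras installed, urllib3 switches to PyOpenSSL, which provides SNI on Python 2.7.6, and the two warnings above should disappear. A minimal check (a sketch, assuming requests 2.9.x with its bundled urllib3; requests performs this injection automatically when the extras are importable):

import requests
from requests.packages.urllib3.contrib import pyopenssl

pyopenssl.inject_into_urllib3()  # explicit injection; harmless if already done

print requests.get('https://close5.com/home/').status_code  # expect 200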

Related

The system cannot find the file specified in `edx-dl` module

I am trying to use this Python module: https://github.com/coursera-dl/edx-dl
Please excuse my basic knowledge.
Installed Anaconda 3 on Windows 10, then:
pip install edx-dl
pip install --upgrade youtube-dl
Then to get courses did:
edx-dl -u user@user.com --list-courses
edx-dl -u user@user.com COURSE_URL
This all worked; however, once downloads actually started I was getting:
Got SSL/Connection error: HTTP Error 403: Forbidden
Fiddler showed that it was being blocked by Cloudflare, I suspect due to the User-Agent.
I then installed fake_useragent (https://pypi.python.org/pypi/fake-useragent) and added:
from fake_useragent import UserAgent  # added this

def edx_get_headers():
    """
    Build the Open edX headers to create future requests.
    """
    logging.info('Building initial headers for future requests.')

    headers = {
        'User-Agent': 'edX-downloader/0.01',
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'Content-Type': 'application/x-www-form-urlencoded;charset=utf-8',
        'Referer': EDX_HOMEPAGE,
        'X-Requested-With': 'XMLHttpRequest',
        'X-CSRFToken': _get_initial_token(EDX_HOMEPAGE),
    }

    ua = UserAgent()  # added this
    headers['User-Agent'] = ua.ie  # added this
It then downloaded a PDF and an XLS but got another error due to request.py adding a header, so I added fake_useragent to requests.py and commented out the default header, as below.
from fake_useragent import UserAgent

ub = UserAgent()
self.addheaders = [('User-Agent', ub.ie)]
# self.addheaders = [('User-Agent', self.version), ('Accept', '*/*')]
The new error is below. I can't work out how to troubleshoot further. I suspect it can't find a file or path, possibly due to Windows.
[download] https://youtube.com/watch?v=bKkrDLwDnDE => Downloaded\Implementing_ETL_with_SQL_Server_Integration_Services\02-Module_1__ETL_Processing\01-%(title)s-%(id)s.%(ext)s
Downloading video with URL https://youtube.com/watch?v=bKkrDLwDnDE from YouTube.
Traceback (most recent call last):
File "edx-dl.py", line 6, in <module>
edx_dl.main()
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 1080, in main
download(args, selections, filtered_units, headers)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 857, in download
headers)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 819, in download_unit
headers)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 801, in download_video
skip_or_download(youtube_downloads, headers, args)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 788, in skip_or_download
f(url, filename, headers, args)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 721, in download_url
download_youtube_url(url, filename, headers, args)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 761, in download_youtube_url
execute_command(cmd, args)
File "c:\edx-dl-master\edx-dl-master\edx_dl\utils.py", line 37, in execute_command
subprocess.check_call(cmd)
File "C:\Users\anton\Anaconda3\lib\subprocess.py", line 286, in check_call
retcode = call(*popenargs, **kwargs)
File "C:\Users\anton\Anaconda3\lib\subprocess.py", line 267, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\Users\anton\Anaconda3\lib\subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "C:\Users\anton\Anaconda3\lib\subprocess.py", line 997, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
Same issue as here; however, no resolution or assistance had been provided, so I thought I would try here instead.
https://github.com/coursera-dl/edx-dl/issues/368
Advice on how to learn to troubleshoot this would be appreciated.
Debugged the code and found that it couldn't find youtube-dl.
Checked echo %PATH% and realised I had a path to:
C:...\Anaconda3\ but not to C:...\Anaconda3\Scripts\ (this is the location of youtube_dl.exe).
I had added this path but not rebooted.
Rebooted and now resolved.
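For future troubleshooting, a quick way to check from Python whether subprocess will find youtube-dl (a small diagnostic sketch; shutil.which requires Python 3.3+, which Anaconda 3 provides):

import shutil

# prints the full path if the executable is on PATH, None otherwise;
# None is exactly the condition that raises WinError 2 in subprocess
print(shutil.which('youtube-dl'))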
There is another easy solution, with no need to use fake_useragent: just use another downloader, like wget.
Install a fresh edx-dl.
If you are on Windows, download wget and save it, for example, on the H drive.
Change the download_url function like this:
def download_url(url, filename, headers, args):
    """
    Downloads the given url in filename.
    """
    if is_youtube_url(url):
        download_youtube_url(url, filename, headers, args)
    else:
        # jcline
        # raw string so the backslash in the path is not treated as an escape
        cmd = [r"h:\wget.exe", url, '-c', '-O', filename,
               '--keep-session-cookies', '--no-check-certificate']
        execute_command(cmd, args)
(Source)

Python traceback error using urllib2

I am really confused. I am new to Python and working on a script that scrapes a website for products, on Python 2.7. I am trying to use urllib2 to do this, and when I run the script it prints multiple traceback errors. Suggestions?
Script:
import urllib2, zlib, json

url = 'https://launches.endclothing.com/api/products'
req = urllib2.Request(url)
req.add_header(':host', 'launches.endclothing.com')
req.add_header(':method', 'GET')
req.add_header(':path', '/api/products')
req.add_header(':scheme', 'https')
req.add_header(':version', 'HTTP/1.1')
req.add_header('accept', 'application/json, text/plain, */*')
req.add_header('accept-encoding', 'gzip,deflate')
req.add_header('accept-language', 'en-US,en;q=0.8')
req.add_header('cache-control', 'max-age=0')
req.add_header('cookie', '__/')
req.add_header('user-agent', 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/37.0.2062.120 Chrome/37.0.2062.120 Safari/537.36')
resp = urllib2.urlopen(req).read()
resp = zlib.decompress(bytes(bytearray(resp)), 15 + 32)
data = json.loads(resp)
for product in data:
    for attrib in product.keys():
        print str(attrib) + ' :: ' + str(product[attrib])
    print '\n'
Error(s):
C:\Users\Luke>py C:\Users\Luke\Documents\EndBot2.py
Traceback (most recent call last):
File "C:\Users\Luke\Documents\EndBot2.py", line 5, in <module>
resp = urllib2.urlopen(req).read()
File "C:\Python27\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 391, in open
response = self._open(req, data)
File "C:\Python27\lib\urllib2.py", line 409, in _open
'_open', req)
File "C:\Python27\lib\urllib2.py", line 369, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 1181, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "C:\Python27\lib\urllib2.py", line 1148, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:499: error:14077438:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert internal error>
You're running into issues with the SSL configuration of your request. I'm sorry, but I won't correct your code, because we're in 2016 and there's a wonderful library that you should use instead: requests.
Its usage is pretty simple:
>>> user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1'
>>> result = requests.get('https://launches.endclothing.com/api/products', headers={'user-agent': user_agent})
>>> result
<Response [200]>
>>> result.json()
[{u'name': u'Adidas Consortium x HighSnobiety Ultraboost', u'colour': u'Grey', u'id': 30, u'releaseDate': u'2016-04-09T00:01:00+0100', …
You'll notice that I changed the user-agent in the previous query to get it working because, weirdly enough, the website refuses API access to requests' default user-agent:
>>> result = requests.get('https://launches.endclothing.com/api/products')
>>> result
<Response [403]>
>>> result.text
This website is using a security service to protect itself from online attacks. The action you just performed triggered the security solution. There are several actions that could trigger this block including submitting a certain word or phrase, a SQL command or malformed data.</p></div><div class="error-right"><h3>What can I do to resolve this?</h3><p>If you are on a personal connection, like at home, you can run an anti-virus scan on your device to make sure it is not infected with malware.</p><p>If you are at an office or shared network, you can ask the network administrator to run a scan across the network looking for misconfigured or infected devices.
Otherwise, now that you've tried requests and your life has changed, you might still run into this issue again. As you might read in many places on the internet, this is related to SNI and outdated libraries, and you might get headaches trying to figure it out. My best advice would be to upgrade to Python 3, as the problem is likely to be solved by installing a new vanilla version of Python and the libraries involved.
HTH

Problems running Dashku python script on Amazon Linux

I am trying to run a Dashku python script on my Amazon EC2 instance running Amazon Linux.
This is the error I get:
[ec2-user@ip-10-231-47-166 ~]$ python dashku_53dc123bc0f9ac740b009af9.py
Traceback (most recent call last):
File "dashku_53dc123bc0f9ac740b009af9.py", line 16, in <module>
requests.post('https://dashku.com/api/transmission', data=json.dumps(payload),headers=headers)
File "/usr/lib/python2.6/site-packages/requests/api.py", line 88, in post
return request('post', url, data=data, **kwargs)
File "/usr/lib/python2.6/site-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python2.6/site-packages/requests/sessions.py", line 335, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python2.6/site-packages/requests/sessions.py", line 438, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python2.6/site-packages/requests/adapters.py", line 331, in send
raise SSLError(e)
requests.exceptions.SSLError: [Errno 1] _ssl.c:493: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
Note that calling sudo yum install requests did not do much, but calling sudo yum upgrade python-requests did install it.
The script works fine on my local machine and has not been modified at all:
# Instructions
#
# easy_install requests
# python dashku_53dc123bc0f9ac740b009af9.py
#
import requests
import json
payload = {
    "bigNumber": 500,
    "_id": "XXXXXX",
    "apiKey": "XXXXXX"
}
headers = {'content-type': 'application/json'}
requests.post('https://dashku.com/api/transmission', data=json.dumps(payload),headers=headers)
Running it on my local machine works fine and Dashku is updated accordingly.
Any ideas? Thanks.
Not sure if this is the appropriate solution, but adding verify=False to the requests.post call "fixes" the problem.
requests.post('https://dashku.com/api/transmission', data=json.dumps(payload),headers=headers, verify=False)
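If the root cause is just an outdated CA bundle on the Amazon Linux image, a less drastic alternative (a sketch, assuming the certifi package is installed via pip install certifi; payload and headers as defined in the script above) keeps verification on but points requests at a current Mozilla bundle:

import json
import requests
import certifi  # ships an up-to-date CA bundle

# payload and headers as in the question's script
requests.post('https://dashku.com/api/transmission',
              data=json.dumps(payload),
              headers=headers,
              verify=certifi.where())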

python requests and cx_freeze

I am trying to freeze a python app that depends on requests, but I am getting the following error:
Traceback (most recent call last):
File "c:\Python33\lib\site-packages\requests\packages\urllib3\util.py", line 630, in ssl_wrap_socket
context.load_verify_locations(ca_certs)
FileNotFoundError: [Errno 2] No such file or directory
Looks like it is having trouble finding the SSL certificate from the executable. I found this, which seems to be the same problem, but I am not able to figure out how they got it to work. The main problem seems to be that the certificate bundled by requests is not copied over to the compressed library. So it seems that I will have to force cx_freeze to bundle the certificates and then point to them from my script.
Starting with this simple script everything works fine:
import requests
r = requests.get("https://yourapihere.com")
print(r.json())
Then if I add the certificate file I start getting errors:
import requests
r = requests.get("https://yourapihere.com", cert=requests.certs.where())
print(r.json())
Traceback (most recent call last):
File "c:\Python33\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 480, in urlopen
body=body, headers=headers)
File "c:\Python33\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 285, in _make_request
conn.request(method, url, **httplib_request_kw)
File "c:\Python33\lib\http\client.py", line 1065, in request
self._send_request(method, url, body, headers)
File "c:\Python33\lib\http\client.py", line 1103, in _send_request
self.endheaders(body)
File "c:\Python33\lib\http\client.py", line 1061, in endheaders
self._send_output(message_body)
File "c:\Python33\lib\http\client.py", line 906, in _send_output
self.send(msg)
File "c:\Python33\lib\http\client.py", line 844, in send
self.connect()
File "c:\Python33\lib\site-packages\requests\packages\urllib3\connection.py", line 164, in connect
ssl_version=resolved_ssl_version)
File "c:\Python33\lib\site-packages\requests\packages\urllib3\util.py", line 637, in ssl_wrap_socket
context.load_cert_chain(certfile, keyfile)
ssl.SSLError: [SSL] PEM lib (_ssl.c:2155)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\Python33\lib\site-packages\requests\adapters.py", line 330, in send
timeout=timeout
File "c:\Python33\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 504, in urlopen
raise SSLError(e)
requests.packages.urllib3.exceptions.SSLError: [SSL] PEM lib (_ssl.c:2155)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "example.py", line 10, in <module>
r = requests.get("https://yourapihere.com", cert=requests.certs.where())
File "c:\Python33\lib\site-packages\requests\api.py", line 55, in get
return request('get', url, **kwargs)
File "c:\Python33\lib\site-packages\requests\api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "c:\Python33\lib\site-packages\requests\sessions.py", line 383, in request
resp = self.send(prep, **send_kwargs)
File "c:\Python33\lib\site-packages\requests\sessions.py", line 486, in send
r = adapter.send(request, **kwargs)
File "c:\Python33\lib\site-packages\requests\adapters.py", line 385, in send
raise SSLError(e)
requests.exceptions.SSLError: [SSL] PEM lib (_ssl.c:2155)
I guess I am using it correctly, but can't really figure out why it is not working. I guess that after fixing this I can continue and add the certificate to the cx_freeze bundle, something like:
example.py:
import os
import requests
cert = os.path.join(os.path.dirname(requests.__file__),'cacert.pem')
r = requests.get("https://yourapihere.com", cert=cert)
print(r.json())
setup.py:
from cx_Freeze import setup, Executable
import requests.certs

build_exe_options = {"zip_includes": [(requests.certs.where(), 'requests/cacert.pem')]}

executables = [
    Executable('example.py')
]

setup(
    options={"build_exe": build_exe_options},  # pass the options along
    executables=executables
)
If someone could give me a tip, it would be much appreciated.
I found this comment from another thread that worked for me:
https://stackoverflow.com/a/25239701/3935084
Summary below:
You can also use the environment variable "REQUESTS_CA_BUNDLE" (as described at http://docs.python-requests.org/en/latest/user/advanced/#ssl-cert-verification).
It's much simpler than correcting all your requests:
os.environ["REQUESTS_CA_BUNDLE"] = os.path.join(os.getcwd(), "cacert.pem")
Steps to get this to work:
Explicitly tell requests where the certificate file is
Tell cx_freeze to also grab the certificate file when building
Have the code use the proper certificate file when frozen
test.py:
import os
import sys
import requests

# sets the path to the certificate file
if getattr(sys, 'frozen', False):
    # if frozen, use the embedded file
    cacert = os.path.join(os.path.dirname(sys.executable), 'cacert.pem')
else:
    # else just get the default file
    cacert = requests.certs.where()

# remember to use verify to set the certificate to be used
# I guess it could also work with REQUESTS_CA_BUNDLE, but I have not tried
r = requests.get('https://www.google.com', verify=cacert)
print(r)
setup.py:
from cx_Freeze import setup, Executable
import requests

executable = Executable(script="test.py")

# Add certificate to the build
options = {
    "build_exe": {
        'include_files': [(requests.certs.where(), 'cacert.pem')]
    }
}

setup(
    version="0",
    requires=["requests"],
    options=options,
    executables=[executable]
)
To build it, just:
$ python setup.py build
If successful, you should see:
$ test
<Response [200]>
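The REQUESTS_CA_BUNDLE variant mentioned earlier can replace the verify= plumbing entirely; a sketch of what the frozen branch of test.py could look like with it (untested here, as the comment in the code above also admits):

import os
import sys

if getattr(sys, 'frozen', False):
    # every requests call now verifies against the bundled file
    os.environ['REQUESTS_CA_BUNDLE'] = os.path.join(
        os.path.dirname(sys.executable), 'cacert.pem')

import requests
print(requests.get('https://www.google.com'))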

SSL error using Python Requests to access Shibboleth authenticated server

I'm trying to access a journal article hosted by an academic service provider (SP), using a Python script.
The server authenticates using a Shibboleth login. I read Logging into SAML/Shibboleth authenticated server using python and tried to implement a login with Python Requests.
The script starts by querying the SP for the link leading to my IDP institution, and is supposed then to authenticate automatically with the IDP. The first part works, but when following the link to the IDP, it chokes on an SSL error.
Here is what I used:
import requests
import lxml.html

LOGINLINK = 'https://www.jsave.org/action/showLogin?redirectUri=%2F'
BASEURL = 'https://www.jsave.org'  # definition missing from the original snippet; inferred from LOGINLINK
USERAGENT = 'Mozilla/5.0 (X11; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0'

s = requests.session()
s.headers.update({'User-Agent': USERAGENT})

# getting the page where you can search for your IDP
# need to get the cookies so we can continue
response = s.get(LOGINLINK)
rtext = response.text
print('Don\'t see your school?' in rtext)  # prints True

# POSTing the name of my institution
data = {
    'institutionName': 'tubingen',
    'submitForm': 'Search',
    'currUrl': '%2Faction%2FshowBasicSearch',
    'redirectUri': '%2F',
    'activity': 'isearch'
}
response = s.post(BASEURL + '/action/showLogin', data=data)
rtext = response.text
print('university of tubingen' in rtext)  # prints True

# get the link that leads to the IDP
tree = lxml.html.fromstring(rtext)
loginlinks = tree.cssselect('a.extLogin')
if loginlinks:
    loginlink = loginlinks[0].get('href')
else:
    exit(1)

print('continuing to IDP')
response = s.get(loginlink)
rtext = response.text
print('zentrale Anmeldeseite' in rtext)
This yields:
continuing to IDP...
2014-04-04 10:04:06,010 - INFO - Starting new HTTPS connection (1): idp.uni-tuebingen.de
Traceback (most recent call last):
File "/usr/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 480, in urlopen
body=body, headers=headers)
File "/usr/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 285, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.4/http/client.py", line 1066, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python3.4/http/client.py", line 1104, in _send_request
self.endheaders(body)
File "/usr/lib/python3.4/http/client.py", line 1062, in endheaders
self._send_output(message_body)
File "/usr/lib/python3.4/http/client.py", line 907, in _send_output
self.send(msg)
File "/usr/lib/python3.4/http/client.py", line 842, in send
self.connect()
File "/usr/lib/python3.4/site-packages/requests/packages/urllib3/connection.py", line 164, in connect
ssl_version=resolved_ssl_version)
File "/usr/lib/python3.4/site-packages/requests/packages/urllib3/util.py", line 639, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "/usr/lib/python3.4/ssl.py", line 344, in wrap_socket
_context=self)
File "/usr/lib/python3.4/ssl.py", line 540, in __init__
self.do_handshake()
File "/usr/lib/python3.4/ssl.py", line 767, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: TLSV1_ALERT_INTERNAL_ERROR] tlsv1 alert internal error (_ssl.c:598)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.4/site-packages/requests/adapters.py", line 330, in send
timeout=timeout
File "/usr/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py", line 504, in urlopen
raise SSLError(e)
requests.packages.urllib3.exceptions.SSLError: [SSL: TLSV1_ALERT_INTERNAL_ERROR] tlsv1 alert internal error (_ssl.c:598)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./try.py", line 154, in <module>
response = s.get(loginlink)
File "/usr/lib/python3.4/site-packages/requests/sessions.py", line 395, in get
return self.request('GET', url, **kwargs)
File "/usr/lib/python3.4/site-packages/requests/sessions.py", line 383, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3.4/site-packages/requests/sessions.py", line 486, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3.4/site-packages/requests/adapters.py", line 385, in send
raise SSLError(e)
requests.exceptions.SSLError: [SSL: TLSV1_ALERT_INTERNAL_ERROR] tlsv1 alert internal error (_ssl.c:598)
Using s.get(loginlink, verify=False) yields exactly the same error. Simply using urllib.request.urlopen(loginlink) does so, too.
Printing and pasting the link into Firefox, on the other hand, works fine.
After trying with openssl s_client, it looks like the destination idp.uni-tuebingen.de:443 only supports SSLv3 and misbehaves on anything newer. Forcing SSLv3, one gets:
$ openssl s_client -connect idp.uni-tuebingen.de:443 -ssl3
CONNECTED(00000003)
depth=3 C = DE, O = Deutsche Telekom AG, OU = T-TeleSec Trust Center, CN = Deutsche Telekom Root CA 2
...
But with the default setup, or forcing TLSv1 (-tls1), it only returns an alert:
openssl s_client -connect idp.uni-tuebingen.de:443
CONNECTED(00000003)
140493591938752:error:14077438:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert internal error:s23_clnt.c:741:
So you need to find a way to force SSLv3 for this connection. I'm not familiar enough with the Python side at this point, but maybe the chapter "Example: Specific SSL Version" at http://docs.python-requests.org/en/latest/user/advanced/ helps.
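That chapter boils down to mounting a transport adapter that pins the SSL version for a host. A minimal sketch of it, adapted here to force SSLv3 for just this IDP (assumes requests with its bundled urllib3, and a local OpenSSL that still has SSLv3 compiled in):

import ssl
import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.poolmanager import PoolManager

class SSLv3Adapter(HTTPAdapter):
    """Transport adapter forcing SSLv3 on every connection it handles."""
    def init_poolmanager(self, connections, maxsize, block=False):
        self.poolmanager = PoolManager(num_pools=connections,
                                       maxsize=maxsize,
                                       block=block,
                                       ssl_version=ssl.PROTOCOL_SSLv3)

s = requests.Session()
s.mount('https://idp.uni-tuebingen.de', SSLv3Adapter())  # only this host is downgraded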
And why does it work with Firefox? Browsers usually retry with a downgraded SSL version if the connections with the safer versions fail, i.e. everybody works around broken stuff, so the owner of the broken stuff has no incentive to fix it :(
