I use Python 2.6 and make requests to the Facebook API over HTTPS. I suspect my service could be the target of man-in-the-middle attacks.
This morning, while re-reading the urllib module documentation, I discovered the following:
Citation:
Warning: When opening HTTPS URLs, it is not attempted to validate the server certificate. Use at your own risk!
Do you have hints / URLs / examples for performing full certificate validation?
Thanks for your help.
You could create a urllib2 opener that does the validation for you using a custom handler. The following code is an example that works with Python 2.7.3. It assumes you have downloaded http://curl.haxx.se/ca/cacert.pem to the same folder where the script is saved.
#!/usr/bin/env python
import urllib2
import httplib
import ssl
import socket
import os

CERT_FILE = os.path.join(os.path.dirname(__file__), 'cacert.pem')


class ValidHTTPSConnection(httplib.HTTPConnection):
    "This class allows communication via SSL."

    default_port = httplib.HTTPS_PORT

    def __init__(self, *args, **kwargs):
        httplib.HTTPConnection.__init__(self, *args, **kwargs)

    def connect(self):
        "Connect to a host on a given (SSL) port."
        sock = socket.create_connection((self.host, self.port),
                                        self.timeout, self.source_address)
        if self._tunnel_host:
            self.sock = sock
            self._tunnel()
        self.sock = ssl.wrap_socket(sock,
                                    ca_certs=CERT_FILE,
                                    cert_reqs=ssl.CERT_REQUIRED)


class ValidHTTPSHandler(urllib2.HTTPSHandler):

    def https_open(self, req):
        return self.do_open(ValidHTTPSConnection, req)

opener = urllib2.build_opener(ValidHTTPSHandler)


def test_access(url):
    print "Accessing", url
    page = opener.open(url)
    print page.info()
    data = page.read()
    print "First 100 bytes:", data[0:100]
    print "Done accessing", url
    print ""

# This should work
test_access("https://www.google.com")

# Accessing a page with a self-signed certificate should not work
# At the time of writing, the following page uses a self-signed certificate
test_access("https://tidia.ita.br/")
Running this script, you should see output something like this:
Accessing https://www.google.com
Date: Mon, 14 Jan 2013 14:19:03 GMT
Expires: -1
...
First 100 bytes: <!doctype html><html itemscope="itemscope" itemtype="http://schema.org/WebPage"><head><meta itemprop
Done accessing https://www.google.com
Accessing https://tidia.ita.br/
Traceback (most recent call last):
File "https_validation.py", line 54, in <module>
test_access("https://tidia.ita.br/")
File "https_validation.py", line 42, in test_access
page = opener.open(url)
...
File "/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1177, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:504: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed>
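If you are on Python 2.7.9 or later, the stock urllib2.HTTPSHandler also accepts an SSLContext, so a custom connection class is not strictly required. A minimal sketch, assuming the same cacert.pem bundle sits next to the script:
import ssl
import urllib2

# Verify server certificates against the cacert.pem bundle; create_default_context
# also enables hostname checking.
ctx = ssl.create_default_context(cafile='cacert.pem')
opener = urllib2.build_opener(urllib2.HTTPSHandler(context=ctx))
print(opener.open('https://www.google.com').info())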
If you have a trusted Certificate Authority (CA) file, you can use Python 2.6 and later's ssl library to validate the certificate. Here's some code:
import os.path
import ssl
import sys
import urlparse
import urllib


def get_ca_path():
    '''Download the Mozilla CA file cached by the cURL project.

    If you have a trusted CA file from your OS, return the path
    to that instead.
    '''
    cafile_local = 'cacert.pem'
    cafile_remote = 'http://curl.haxx.se/ca/cacert.pem'
    if not os.path.isfile(cafile_local):
        print >> sys.stderr, "Downloading %s from %s" % (
            cafile_local, cafile_remote)
        urllib.urlretrieve(cafile_remote, cafile_local)
    return cafile_local


def check_ssl(hostname, port=443):
    '''Check that an SSL certificate is valid.'''
    print >> sys.stderr, "Validating SSL cert at %s:%d" % (
        hostname, port)

    cafile_local = get_ca_path()

    try:
        server_cert = ssl.get_server_certificate((hostname, port),
                                                 ca_certs=cafile_local)
    except ssl.SSLError:
        print >> sys.stderr, "SSL cert at %s:%d is invalid!" % (
            hostname, port)
        raise


class CheckedSSLUrlOpener(urllib.FancyURLopener):
    '''A URL opener that checks that SSL certificates are valid.

    On SSL error, it will raise ssl.SSLError.
    '''

    def open(self, fullurl, data=None):
        urlbits = urlparse.urlparse(fullurl)
        if urlbits.scheme == 'https':
            if ':' in urlbits.netloc:
                hostname, port = urlbits.netloc.split(':')
                port = int(port)
            else:
                hostname = urlbits.netloc
                if urlbits.port is None:
                    port = 443
                else:
                    port = urlbits.port
            check_ssl(hostname, port)
        return urllib.FancyURLopener.open(self, fullurl, data)

# Plain usage - can probably do once per day
check_ssl('www.facebook.com')

# URL opener
opener = CheckedSSLUrlOpener()
opener.open('https://www.facebook.com/find-friends/browser/')

# Make it the default
urllib._urlopener = opener
urllib.urlopen('https://www.facebook.com/find-friends/browser/')
Some dangers with this code:
You have to trust the CA file from the cURL project (http://curl.haxx.se/ca/cacert.pem), which is a cached version of Mozilla's CA file. It's also over HTTP, so there is a potential MITM attack. It's better to replace get_ca_path with one that returns your local CA file, which will vary from host to host.
There is no attempt to see if the CA file has been updated. Eventually, root certs will expire or be deactivated, and new ones will be added. A good idea would be to use a cron job to delete the cached CA file, so that a new one is downloaded daily (a rough in-process alternative is sketched below).
It's probably overkill to check certificates every time. You could manually check once per run, or keep a list of 'known good' hosts over the course of the run. Or, be paranoid!
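On the point about refreshing the CA file: instead of a cron job, you could also check the cached file's age from Python itself. A rough sketch reusing get_ca_path from above (get_fresh_ca_path is a hypothetical helper name, and the one-day threshold is arbitrary):
import os
import time

ONE_DAY = 24 * 60 * 60  # seconds

def get_fresh_ca_path():
    # Hypothetical wrapper: delete the cached cacert.pem when it is more than
    # a day old, then fall back to the download logic in get_ca_path() above.
    cafile_local = 'cacert.pem'
    if os.path.isfile(cafile_local):
        age = time.time() - os.path.getmtime(cafile_local)
        if age > ONE_DAY:
            os.remove(cafile_local)
    return get_ca_path()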
Related
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 34, in <module>
page = http.robotCheck()
File "C:\Users\user001\Desktop\automation\Loader.py", line 21, in robotCheck
request = requests.get('http://*redacted*.onion/login')
File "C:\Users\user001\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\user001\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\user001\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 512, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\user001\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 622, in send
r = adapter.send(request, **kwargs)
File "C:\Users\user001\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\adapters.py", line 513, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='*redacted*.onion', port=80): Max retries exceeded with url: /login (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0348E230>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
Here's the code of my Loader class. Loader is responsible for uploading and making requests to a resource over Tor. I configured the proxy, but it still throws the error above.
I disabled my VPN, the Windows firewall, and everything else I could think of.
import socket
import requests


class Loader:
    url = "http://*redacted*.onion/login"
    user = "username"
    password = 'password'
    manageUrl = ''

    def __init__(self, config):
        self.config = config
        self.restartSession()

    def restartSession(self):
        self.userSession = requests.session()
        self.userSession.proxies['http'] = 'http://127.0.0.1:9050'
        self.userSession.proxies['https'] = 'https://127.0.0.1:9051'

    def robotCheck(self):
        request = requests.get('http://*redacted*.onion/login')
        print(request)
        #self.session.post(self.robotCheckUrl, data=checkResult)

    def authorization(self):
        self.session.get(self.url)
        authPage = self.session.post(self.url, data=self.getAuthData())

    def getAuthData(self):
        return {'login': self.user, 'password': self.password}
Code that calls the Loader class:
http = Loader(Config())
page = http.robotCheck()
Tor is a SOCKS proxy so the proxy configuration needs to be slightly different.
Change the following lines:
self.userSession.proxies['http'] = 'http://127.0.0.1:9050'
self.userSession.proxies['https'] = 'https://127.0.0.1:9051'
To:
self.userSession.proxies['http'] = 'socks5h://127.0.0.1:9050'
self.userSession.proxies['https'] = 'socks5h://127.0.0.1:9050'
Port 9051 is the Tor Controller port. For both HTTP and HTTPS SOCKS connections over Tor, use port 9050 (the default SOCKS port).
The socks5h scheme is necessary to have DNS names resolved over Tor instead of by the client. This privatizes DNS lookups and is necessary to be able to resolve .onion addresses.
EDIT: I was able to make a SOCKS request for a .onion address using the following example:
import socket
import requests
s = requests.session()
s.proxies['http'] = 'socks5h://127.0.0.1:9050'
s.proxies['https'] = 'socks5h://127.0.0.1:9050'
print(s.proxies)
r = s.get('http://***site***.onion/')
Make sure you have the most up-to-date requests library with SOCKS support: pip3 install -U 'requests[socks]'
Wrap the call like this to catch just the exception:
def robotCheck(self):
    try:
        request = requests.get('http://hydraruzxpnew4af.onion/login')
        print(request)
    except requests.exceptions.RequestException as e:
        print('exception caught', e)
    #self.session.post(self.robotCheckUrl, data=checkResult)
You may be getting those errors for the following reasons:
The server may refuse your connection (for example, you are sending too many requests from the same IP address in a short period of time).
Check your proxy settings.
For your reference: https://github.com/requests/requests/issues/1198
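If the server really is rate limiting you, retrying with a backoff is gentler than re-running the script. A rough sketch combining the SOCKS proxy settings from the answer above with urllib3's Retry helper (the .onion address is a placeholder, and the Retry import path depends on your requests/urllib3 version):
import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry

session = requests.session()
session.proxies['http'] = 'socks5h://127.0.0.1:9050'
session.proxies['https'] = 'socks5h://127.0.0.1:9050'

# Retry up to 3 times, waiting longer between attempts, instead of hammering the host.
retries = Retry(total=3, backoff_factor=2, status_forcelist=[429, 500, 502, 503])
session.mount('http://', HTTPAdapter(max_retries=retries))
session.mount('https://', HTTPAdapter(max_retries=retries))

try:
    response = session.get('http://example.onion/login', timeout=60)
    print(response.status_code)
except requests.exceptions.RequestException as exc:
    print('request failed:', exc)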
I am trying to download files from an FTP server. It works fine at home but it doesn't work when I run it through my company's network. I know it has something to do with the proxy. I have looked at a few posts regarding proxy issues in Python and have tried to set up a connection through the proxy. It works for plain URLs but fails when connecting to FTP. Does anyone know a way to do this? Thanks in advance.
Below is my code:
import os
import urllib
import ftplib
from ftplib import FTP
from getpass import getpass
from urllib.request import urlopen, ProxyHandler, HTTPHandler, HTTPBasicAuthHandler, \
build_opener, install_opener
user_proxy = "XXX"
pass_proxy = "YYY"
url_proxy = "ZZZ"
port_proxy = "89"
url_proxy = "ftp://%s:%s#%s:%s" % (user_proxy, pass_proxy, url_proxy, port_proxy)
authinfo = urllib.request.HTTPBasicAuthHandler()
proxy_support = urllib.request.ProxyHandler({"ftp" : url_proxy})
# build a new opener that adds authentication and caching FTP handlers
opener = urllib.request.build_opener(proxy_support, authinfo,
urllib.request.CacheFTPHandler)
# install it
urllib.request.install_opener(opener)
#url works ok
f = urllib.request.urlopen('http://www.google.com/')
print(f.read(500))
urllib.request.install_opener(opener)
#ftp is not working
ftp = ftplib.FTP('ftp:/ba1.geog.umd.edu', 'user', 'burnt_data')
The error message I got:
730 # and socket type values to enum constants.
731 addrlist = []
--> 732 for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
733 af, socktype, proto, canonname, sa = res
734 addrlist.append((_intenum_converter(af, AddressFamily),
gaierror: [Errno 11004] getaddrinfo failed
I can connect via the proxy using FileZilla by selecting a custom FTP proxy with the following specification:
USER %u#%h %s
PASS %p
ACCT %w
(Screenshot: FTP proxy settings in FileZilla)
You are connecting using an FTP proxy.
An FTP proxy cannot work with HTTP, so your test against the http:// URL of www.google.com is irrelevant and does not prove anything.
An FTP proxy works like an FTP server: you connect to the proxy instead of to the actual server, and then use a special syntax in the username (or other credentials) to specify the actual target FTP server and its credentials. In your case, following your FileZilla configuration (USER %u#%h %s), the username takes the form user#host proxy_user, and the proxy expects the proxy password in the FTP ACCT command.
This should work for your specific case:
host_proxy = '192.168.149.50'
user_proxy = 'XXX'
pass_proxy = 'YYY'
user = 'user'
user_pass = 'burnt_data'
host = 'ba1.geog.umd.edu'
u = "%s#%s %s" % (user, host, user_proxy)
ftp = ftplib.FTP(host_proxy, u, user_pass, pass_proxy)
No other code should be needed (urllib or any other).
If the proxy uses a custom port (not 21), use this:
ftp = ftplib.FTP()
ftp.connect(host_proxy, port_proxy)
ftp.login(u, user_pass, pass_proxy)
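Once the login through the proxy succeeds, downloading a file is plain ftplib. A minimal sketch continuing from the ftp object above (the remote directory and file name are placeholders):
# Placeholder paths; adjust to the directory and file you actually need.
ftp.cwd('pub/some/directory')
with open('somefile.dat', 'wb') as fh:
    ftp.retrbinary('RETR somefile.dat', fh.write)
ftp.quit()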
Notes:
Versions: Python 2.7.11, requests 2.10.0, OpenSSL 1.0.2d (9 Jul 2015).
Please read the below comment by Martijn Pieters before reproducing.
Initially I tried to fetch the file from https://www.neco.navy.mil/necoattach/N6945016R0626_2016-06-20__INFO_NAS_Pensacola_Base_Access.docx using the code below.
Code 1:
>>> import requests
>>> requests.get("https://www.neco.navy.mil/necoattach/N6945016R0626_2016-06-20__INFO_NAS_Pensacola_Base_Access.docx",verify=False)
Error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\mob140003207\AppData\Local\Enthought\Canopy\User\lib\site-packages\requests\api.py", line 67, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\mob140003207\AppData\Local\Enthought\Canopy\User\lib\site-packages\requests\api.py", line 53, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\mob140003207\AppData\Local\Enthought\Canopy\User\lib\site-packages\requests\sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\mob140003207\AppData\Local\Enthought\Canopy\User\lib\site-packages\requests\sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\mob140003207\AppData\Local\Enthought\Canopy\User\lib\site-packages\requests\adapters.py", line 447, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: ("bad handshake: SysCallError(10054, 'WSAECONNRESET')",)
After googling and searching, I found that using SSL verification and a session with adapters can solve the problem. But I still got errors; please find the code and errors below.
Code 2:
import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.poolmanager import PoolManager
import ssl
import traceback


class MyAdapter(HTTPAdapter):
    def init_poolmanager(self, connections, maxsize, block=False):
        self.poolmanager = PoolManager(num_pools=connections,
                                       maxsize=maxsize,
                                       block=block,
                                       ssl_version=ssl.PROTOCOL_TLSv1)

s = requests.Session()
s.mount('https://', MyAdapter())
print "Mounted "
r = s.get("https://www.neco.navy.mil/necoattach/N6945016R0626_2016-06-20__INFO_NAS_Pensacola_Base_Access.docx", stream=True, timeout=120)
Error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\mob140003207\AppData\Local\Enthought\Canopy\User\lib\site-packages\requests\sessions.py", line 480, in get
    return self.request('GET', url, **kwargs)
  File "C:\Users\mob140003207\AppData\Local\Enthought\Canopy\User\lib\site-packages\requests\sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\mob140003207\AppData\Local\Enthought\Canopy\User\lib\site-packages\requests\sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\mob140003207\AppData\Local\Enthought\Canopy\User\lib\site-packages\requests\adapters.py", line 447, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: ("bad handshake: SysCallError(10054, 'WSAECONNRESET')",)
First of all, I can confirm that the host, www.neco.navy.mil, is not accessible from everywhere. From some networks (geographies) it works; from others, the connection just hangs:
$ curl www.neco.navy.mil
curl: (7) couldn't connect to host
$ curl https://www.neco.navy.mil
curl: (7) couldn't connect to host
Second, when a connection can be established, there is a certificate problem:
$ curl -v https://www.neco.navy.mil
* Rebuilt URL to: https://www.neco.navy.mil/
* Hostname was NOT found in DNS cache
* Trying 205.85.2.133...
* Connected to www.neco.navy.mil (205.85.2.133) port 443 (#0)
* successfully set certificate verify locations:
* CAfile: none
CApath: /etc/ssl/certs
* SSLv3, TLS handshake, Client hello (1):
* SSLv3, TLS handshake, Server hello (2):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS alert, Server hello (2):
* SSL certificate problem: unable to get local issuer certificate
* Closing connection 0
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: http://curl.haxx.se/docs/sslcerts.html
curl performs SSL certificate verification by default, using a "bundle"
of Certificate Authority (CA) public keys (CA certs). If the default
bundle file isn't adequate, you can specify an alternate file
using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
the bundle, the certificate verification probably failed due to a
problem with the certificate (it might be expired, or the name might
not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
the -k (or --insecure) option.
To make sure, you can just feed it to the Qualys SSL tester:
The CA (DoD Root CA 2) is not trusted. Moreover, it's not in the chain. Note that the OpenSSL validation process needs the whole chain:
Firstly a certificate chain is built up starting from the supplied certificate and ending in the root CA. It is an error if the whole chain cannot be built up.
But here there's only www.neco.navy.mil -> DODCA-28. It may be related to the TLD and extra security measures, but the C grade alone isn't much anyway ;-)
On the Python side it won't be much different. If you don't have access to the CA certificates, you can only disable certificate validation entirely (after you have solved the connectivity problem, of course). If you do have them, you can use cafile.
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import urllib2
import ssl

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

r = urllib2.urlopen('https://www.neco.navy.mil/'
    'necoattach/N6945016R0626_2016-06-20__INFO_NAS_Pensacola_Base_Access.docx',
    timeout=5, context=ctx)
print(len(r.read()))

r = urllib2.urlopen('https://www.neco.navy.mil/'
    'necoattach/N6945016R0626_2016-06-20__INFO_NAS_Pensacola_Base_Access.docx',
    timeout=5, cafile='/path/to/DODCA-28_and_DoD_Root_CA_2.pem')
print(len(r.read()))
To reproduce this with a specific version of Python, use a simple Dockerfile like the following:
FROM python:2.7.11
WORKDIR /opt
ADD . ./
CMD dpkg -s openssl | grep Version && ./app.py
Then run:
docker build -t ssl-test .
docker run --rm ssl-test
This snippet works for me (Python 2.7.11 64-bit + requests==2.10.0) on Windows 7:
import requests
import ssl
import traceback
import shutil
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.poolmanager import PoolManager


class MyAdapter(HTTPAdapter):
    def init_poolmanager(self, connections, maxsize, block=False):
        self.poolmanager = PoolManager(num_pools=connections,
                                       maxsize=maxsize,
                                       block=block,
                                       ssl_version=ssl.PROTOCOL_TLSv1)

if __name__ == "__main__":
    s = requests.Session()
    s.mount('https://', MyAdapter())
    print "Mounted "

    filename = "N6945016R0626_2016-06-20__INFO_NAS_Pensacola_Base_Access.docx"
    r = s.get(
        "https://www.neco.navy.mil/necoattach/{0}".format(filename), verify=False, stream=True, timeout=120)

    if r.status_code == 200:
        with open(filename, 'wb') as f:
            r.raw.decode_content = True
            shutil.copyfileobj(r.raw, f)
I use Python 2.7.6 and this simple example is still working on my Ubuntu 14.04:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
with open('out.docx', 'wb') as h:
    r = requests.get("https://www.neco.navy.mil/necoattach/N6945016R0626_2016-06-20__INFO_NAS_Pensacola_Base_Access.docx", verify=False, stream=True)
    for block in r.iter_content(1024):
        h.write(block)
I have configured my server to serve only HTTPS, using a self-signed certificate. I have a client that has to validate the server's certificate and then download a file from the server.
How do I implement the validation in the client? Is there any code example?
My question is similar to this one: How can the SSL client validate the server's certificate?
But despite the fine explanation there, I didn't find it helpful.
So far, in my code I create a directory and then I download the file with urllib2:
[...] #imports

def dir_creation(path):
    try:
        os.makedirs(path)
    except OSError as exception:
        if exception.errno != errno.EEXIST:
            raise

def file_download(url):
    ver_file = urllib2.urlopen(url)
    data = ver_file.read()
    with open(local_filename, "wb") as code:
        code.write(data)

dir_creation(path)
file_download(url)
Rather than configuring your server to present a self-signed certificate, you should use a self-signed certificate as a certificate authority to sign the server certificate. (How to do this is beyond the scope of your question, but I'm sure you can find help on Stack Overflow or elsewhere.)
Now you must configure your client to trust your certificate authority. In Python (2.7.9 or later), you can do this using the ssl module:
import ssl
... # create socket
ctx = ssl.create_default_context(cafile=path_to_ca_certificate)
sslsock = ctx.wrap_socket(sock)
You can then transmit and read data on the secure socket. See the ssl module documentation for more explanation.
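For example, here is a minimal, self-contained sketch; the hostname and CA file path are placeholders:
import socket
import ssl

# Verify the server against your own CA certificate and check the hostname.
ctx = ssl.create_default_context(cafile='ca.pem')
sock = socket.create_connection(('myserver.example', 443))
sslsock = ctx.wrap_socket(sock, server_hostname='myserver.example')
print(sslsock.getpeercert()['subject'])
sslsock.close()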
The urllib2 API is simpler:
import urllib2
resp = urllib2.urlopen(url, cafile=path_to_ca_certificate)
resp_body = resp.read()
If you wish to use Requests, according to the documentation you can supply a path to the CA certificate as the argument to the verify parameter:
resp = requests.get(url, verify=path_to_ca_certificate)
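Putting it together for the download step, a minimal sketch with Requests (the URL, CA file path, and output file name are placeholders):
import requests

resp = requests.get('https://your-server.example/path/to/file',
                    verify='ca.pem', stream=True)
resp.raise_for_status()
with open('downloaded_file', 'wb') as fh:
    for chunk in resp.iter_content(8192):
        fh.write(chunk)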
I'm issuing an HTTPS GET request to a REST service I own using httplib2, but I'm getting the error:
[Errno 8] _ssl.c:504: EOF occurred in violation of protocol
All other clients work well (browser, Java client, etc.), with the minor exception that PHP curl needed to be set to use SSLv3.
I've searched around and it seems that it is indeed an error regarding the SSL version, but I can't seem to find a way to change it in httplib2.
Is there any way around it besides changing the following lines in the source code:
# We should be specifying SSL version 3 or TLS v1, but the ssl module
# doesn't expose the necessary knobs. So we need to go with the default
# of SSLv23.
return ssl.wrap_socket(sock, keyfile=key_file, certfile=cert_file,
                       cert_reqs=cert_reqs, ca_certs=ca_certs)
I developed this workaround for httplib2:
import socket
import httplib  # needed by the no-ssl fallback branch below
import httplib2

# Start of the workaround for SSL3
# This is a monkey patch / module function override
# to allow pages that only work with SSL3

# Build the appropriate socket wrapper for ssl
try:
    import ssl  # python 2.6
    httplib2.ssl_SSLError = ssl.SSLError

    def _ssl_wrap_socket(sock, key_file, cert_file,
                         disable_validation, ca_certs):
        if disable_validation:
            cert_reqs = ssl.CERT_NONE
        else:
            cert_reqs = ssl.CERT_REQUIRED
        # Our fix for sites that only accept SSL3
        try:
            # Try SSLv3 first
            tempsock = ssl.wrap_socket(sock, keyfile=key_file, certfile=cert_file,
                                       cert_reqs=cert_reqs, ca_certs=ca_certs,
                                       ssl_version=ssl.PROTOCOL_SSLv3)
        except ssl.SSLError, e:
            tempsock = ssl.wrap_socket(sock, keyfile=key_file, certfile=cert_file,
                                       cert_reqs=cert_reqs, ca_certs=ca_certs,
                                       ssl_version=ssl.PROTOCOL_SSLv23)
        return tempsock

    httplib2._ssl_wrap_socket = _ssl_wrap_socket
except (AttributeError, ImportError):
    httplib2.ssl_SSLError = None

    def _ssl_wrap_socket(sock, key_file, cert_file,
                         disable_validation, ca_certs):
        if not disable_validation:
            raise httplib2.CertificateValidationUnsupported(
                "SSL certificate validation is not supported without "
                "the ssl module installed. To avoid this error, install "
                "the ssl module, or explicitly disable validation.")
        ssl_sock = socket.ssl(sock, key_file, cert_file)
        return httplib.FakeSocket(sock, ssl_sock)

    httplib2._ssl_wrap_socket = _ssl_wrap_socket
# End of the workaround for SSL3

if __name__ == "__main__":
    h1 = httplib2.Http()
    resp, content = h1.request("YOUR_SSL3_ONLY_LINK_HERE", "GET")
    print(content)
This workaround is based on the workarounds for urllib2 presented in this bug report: http://bugs.python.org/issue11220.
Update: presenting a solution for httplib2. I didn't notice you were using httplib2; I thought it was urllib2.
Please refer to another Stack Overflow thread that provides a solution: force the SSL version to TLSv1, as mentioned in the response by user favoretti in the linked thread.
Hopefully, this works.
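For reference, forcing TLSv1 with httplib2 follows the same monkey-patch pattern shown in the workaround above; a rough sketch, assuming the same httplib2 version that patch targets (the URL is a placeholder):
import ssl
import httplib2

# Same signature as httplib2._ssl_wrap_socket in the workaround above,
# but forcing TLSv1 instead of the long-deprecated SSLv3.
def _tlsv1_wrap_socket(sock, key_file, cert_file, disable_validation, ca_certs):
    cert_reqs = ssl.CERT_NONE if disable_validation else ssl.CERT_REQUIRED
    return ssl.wrap_socket(sock, keyfile=key_file, certfile=cert_file,
                           cert_reqs=cert_reqs, ca_certs=ca_certs,
                           ssl_version=ssl.PROTOCOL_TLSv1)

httplib2._ssl_wrap_socket = _tlsv1_wrap_socket

h = httplib2.Http()
resp, content = h.request("https://example.com/", "GET")
print(resp.status)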