I'm using a proxy (behind a corporate firewall) to log in to an HTTPS domain. The SSL handshake doesn't seem to be going well:
CertificateError: hostname 'ats.finra.org:443' doesn't match 'ats.finra.org'
I'm using Python 2.7.9 with Mechanize, and I've gotten past all of the login, password, and security question screens, but it is getting hung up on the certificate verification.
Any help would be amazing. I've tried the monkey patch found here: Forcing Mechanize to use SSLv3
It doesn't work for my code, though.
If you want the code file I'd be happy to send.
You can avoid this error by monkey patching ssl:
import ssl
ssl.match_hostname = lambda cert, hostname: True
In my case the certificate's DNS name was ::1 (for local testing purposes) and hostname verification failed with
ssl.CertificateError: hostname '::1' doesn't match '::1'
To fix it somewhat properly I monkey patched ssl.match_hostname with
import ssl
ssl.match_hostname = lambda cert, hostname: hostname == cert['subjectAltName'][0][1]
This actually checks whether the hostnames match.
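If the certificate carries more than one subjectAltName entry, a slightly more general variant of the same monkey patch (still only a sketch, and still much weaker than real verification) could compare against all of them:
import ssl
# Accept the hostname if it matches any subjectAltName value (DNS or IP);
# note this is still far weaker than the real ssl.match_hostname check.
ssl.match_hostname = lambda cert, hostname: hostname in [
    value for key, value in cert.get('subjectAltName', ())
]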
This bug in ssl.match_hostname appears in v2.7.9 (it's not in 2.7.5) and has to do with not stripping the port from the hostname:port syntax. The following rewrite of ssl.match_hostname fixes the bug. Put this before your mechanize code:
import functools, re, urlparse
import ssl
old_match_hostname = ssl.match_hostname
@functools.wraps(old_match_hostname)
def match_hostname_bugfix_ssl_py_2_7_9(cert, hostname):
    m = re.search(r':\d+$', hostname)  # hostname:port
    if m is not None:
        o = urlparse.urlparse('https://' + hostname)
        hostname = o.hostname
    old_match_hostname(cert, hostname)

ssl.match_hostname = match_hostname_bugfix_ssl_py_2_7_9
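A quick way to sanity-check the patch is to call it directly with a hypothetical certificate dict in the shape ssl.match_hostname expects; the hostname:port form below is the one that used to raise the error:
# Hypothetical certificate dict for illustration only; with the patch applied,
# this call no longer raises ssl.CertificateError for the hostname:port form.
cert = {'subjectAltName': (('DNS', 'ats.finra.org'),)}
ssl.match_hostname(cert, 'ats.finra.org:443')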
The following mechanize code should now work:
import mechanize
import cookielib
br = mechanize.Browser()
# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
# Browser options
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)
# Follow refresh 0 but do not hang on refresh > 0
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
br.addheaders = [('User-Agent', 'Nutscrape 1.0')]
# Use this proxy
br.set_proxies({"http": "localhost:3128", "https": "localhost:3128"})
r = br.open('https://www.duckduckgo.com:443/')
html = br.response().read()
# Examine the html response from a browser
f = open('foo.html','w')
f.write(html)
f.close()
Instead of using an IPv4 proxy to connect to a website, I would like to connect using an IPv6 HTTPS proxy. I have scoured Google for answers and have not found any (that I understand). The closest I have found is the following, which does not use the IPv6 proxy and instead uses my own IPv6 address:
import requests
import socket
from unittest.mock import patch
orig_getaddrinfo = socket.getaddrinfo
def getaddrinfoIPv6(host, port, family=0, type=0, proto=0, flags=0):
    return orig_getaddrinfo(host=host, port=port, family=socket.AF_INET6, type=type, proto=proto, flags=flags)
with patch('socket.getaddrinfo', side_effect=getaddrinfoIPv6):
    r = requests.get('http://icanhazip.com')
    print(r.text)
I am open to using something besides requests to do this; however, requests is preferred. I will be attempting to thread later on.
import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
proxy = {"http":"http://username:password#[2604:0180:2:3b5:9ebc:64e9:166c:d9f9]", "https":"https://username:password#[2604:0180:2:3b5:9ebc:64e9:166c:d9f9]"}
url = "https://icanhazip.com"
r = requests.get(url, proxies=proxy, verify=False)
print(r.content)
If the code above does not work, try the following:
import requests
proxy = {"http": "http://userame:password#168.235.109.30:18117", "https":"https://userame:password#168.235.109.30:18117"}
url = "https://icanhazip.com"
r = requests.get(url, proxies=proxy)
print(r.content)
This is my current provider for my IPv6 HTTPS proxy; however, they run IPv6 over IPv4 to their clients, which is why this code works and the code above does not (if you are using the same provider). If you are using a provider that supports IPv6 natively, then the code at the top should work for you.
You can use https://proxyturk.net/
Example curl command:
curl -m 90 -x http://proxyUsername:proxyPassword@93.104.200.99:20000 http://api6.ipify.org
You will see a result like:
2a13:c206:2021:1522:9c5a:3ed5:156b:c1d0
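For completeness, a rough requests equivalent of that curl command might look like this (the credentials and proxy address are just the example values from the curl line above):
import requests

# Same example proxy and credentials as in the curl command above
proxy = "http://proxyUsername:proxyPassword@93.104.200.99:20000"
proxies = {"http": proxy, "https": proxy}
r = requests.get("http://api6.ipify.org", proxies=proxies, timeout=90)
print(r.text)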
Via Python's urllib2 I am trying to get data over HTTPS while behind a corporate NTLM proxy.
I run
proxy_url = 'http://user:pw@ntlmproxy:port/'
proxy_handler = urllib2.ProxyHandler({'http': proxy_url})
opener = urllib2.build_opener(proxy_handler, urllib2.HTTPHandler)
urllib2.install_opener(opener)
f = urllib2.urlopen('https://httpbin.org/ip')
myfile = f.read()
print myfile
but I get as error
urllib2.URLError: <urlopen error [Errno 8] _ssl.c:507:
EOF occurred in violation of protocol>
How can I fix this error?
Note 0: With the same code I can retrieve the unsecured HTTP equivalent http://httpbin.org/ip.
Note 1: From a normal browser I can access https://httpbin.org/ip (and other HTTPS sites) via the same corporate proxy.
Note 2: I was reading about many similar issues on the net and some suggested that it might be related to certificate verification, but urllib2 does not verify certificates anyway.
Note 3: Some people suggested monkeypatching in similar situations, but I guess there is no way to monkeypatch _ssl.c.
The problem is that Python's standard HTTP libraries do not speak Microsoft's proprietary NTLM authentication protocol fully.
I solved this problem by setting up a local NTLM-capable proxy - ntlmaps did the trick for me (*) - which handles the authentication against the corporate proxy, and pointing my Python code to this local proxy without authentication credentials.
Additionally, I had to add a proxy handler for HTTPS to the Python code listed above. So I replaced the two lines
proxy_url = 'http://user:pw@ntlmproxy:port/'
proxy_handler = urllib2.ProxyHandler({'http': proxy_url})
with the three lines
proxy_url = 'http://localproxy:localport/'
proxy_url_https = 'https://localproxy:localport/'
proxy_handler = urllib2.ProxyHandler({'http': proxy_url, 'https': proxy_url_https})
Then the request works perfectly.
(*) ntlmaps is a Python program. For reasons specific to my personal environment it was necessary that the local proxy be a Python program.
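Putting it together, a minimal sketch of the whole flow might look like this, assuming the local ntlmaps instance listens on localhost:5865 (adjust host and port to your own configuration):
import urllib2

# Point urllib2 at the local ntlmaps proxy, which in turn performs the
# NTLM authentication against the corporate proxy.
proxy_url = 'http://localhost:5865/'
proxy_handler = urllib2.ProxyHandler({'http': proxy_url, 'https': proxy_url})
opener = urllib2.build_opener(proxy_handler, urllib2.HTTPHandler)
urllib2.install_opener(opener)

f = urllib2.urlopen('https://httpbin.org/ip')
print f.read()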
Is there a way to tunnel HTTPS requests made with Python's httplib through Fiddler's proxy, without turning Fiddler into a reverse proxy?
I tried using this example (from here):
import httplib
c = httplib.HTTPSConnection('localhost',8118)
c.set_tunnel('twitter.com',443)
c.request('GET','/')
r1 = c.getresponse()
print r1.status,r1.reason
data1 = r1.read()
print len(data1)
but it returns:
<HTML><HEAD><META http-equiv="Content-Type" content="text/html; charset=UTF-8"><TITLE>Fiddler Echo Service</TITLE></HEAD><BODY STYLE="font-family: arial,sans-serif;"><H1>Fiddler Echo Service</H1><BR /><PRE>GET / HTTP/1.1
Host: localhost:8080
Accept-Encoding: identity
</PRE><BR><HR><UL><LI>To configure Fiddler as a reverse proxy instead of seeing this page, see FiddlerRoot certificate</UL></BODY></HTML>
EDIT:
This code works. The problem was related to the proxy settings in Internet Explorer. DO NOT set a proxy in these settings, otherwise the code above won't work.
I've tried with urllib2 (Proxy with urllib2) through a Burp proxy and it worked:
import urllib2
proxy = urllib2.ProxyHandler({'https': '127.0.0.1:8081'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
print urllib2.urlopen('https://www.google.com').read()
Your code seems to work just fine with the Burp Suite proxy.
I am trying to access a website from behind a corporate firewall using the code below:
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, url, username, password)
auth_handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)
conn = urllib2.urlopen('http://python.org')
Getting error
URLError: <urlopen error [Errno 11004] getaddrinfo failed>
I have tried different handlers (including ProxyHandler in a slightly different way), but it doesn't seem to work.
Any clues as to what could be the reason for the error, and any other ways to supply the credentials and make it work?
If you are using a proxy and that proxy has a username and password (as many corporate proxies do), you need to set the proxy handler with urllib2.
proxy_url = 'http://' + proxy_user + ':' + proxy_password + '@' + proxy_ip
proxy_support = urllib2.ProxyHandler({"http":proxy_url})
opener = urllib2.build_opener(proxy_support,urllib2.HTTPHandler)
urllib2.install_opener(opener)
HTTPBasicAuthHandler is used to provide credentials for the site you are going to access, not for going through the proxy. The above snippet might help you.
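If you need to authenticate both to the proxy and to the target site, the two handlers can be combined in one opener. A minimal sketch, with hypothetical proxy address and credentials:
import urllib2

# Hypothetical corporate proxy with its own credentials
proxy_url = 'http://proxy_user:proxy_password@proxy.example.com:8080'
proxy_support = urllib2.ProxyHandler({'http': proxy_url, 'https': proxy_url})

# Basic auth credentials for the target site itself
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, 'http://python.org', 'site_user', 'site_password')
auth_handler = urllib2.HTTPBasicAuthHandler(password_mgr)

opener = urllib2.build_opener(proxy_support, auth_handler, urllib2.HTTPHandler)
urllib2.install_opener(opener)
conn = urllib2.urlopen('http://python.org')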
On Windows, I observed that Python uses the IE Internet Options -> LAN Settings.
So even if we use urllib2 to install an opener and specify the proxy_url, it will continue to use the IE settings.
It finally worked fine when I exported a system variable:
http_proxy=http://userid:pswd@proxyurl.com:port
I open urls with:
site = urllib2.urlopen('http://google.com')
And what I want to do is connect the same way using a proxy.
Somewhere I found a suggestion to use:
site = urllib2.urlopen('http://google.com', proxies={'http':'127.0.0.1'})
but that didn't work either.
I know urllib2 has something like a proxy handler, but I can't recall that function.
proxy = urllib2.ProxyHandler({'http': '127.0.0.1'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
urllib2.urlopen('http://www.google.com')
You have to install a ProxyHandler:
urllib2.install_opener(
    urllib2.build_opener(
        urllib2.ProxyHandler({'http': '127.0.0.1'})
    )
)
urllib2.urlopen('http://www.google.com')
You can set proxies using environment variables.
import os
os.environ['http_proxy'] = '127.0.0.1'
os.environ['https_proxy'] = '127.0.0.1'
urllib2 will add proxy handlers automatically this way. You need to set proxies for the different protocols separately; otherwise requests for the other protocol will fail (in the sense of not going through the proxy). See below.
For example:
proxy = urllib2.ProxyHandler({'http': '127.0.0.1'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
urllib2.urlopen('http://www.google.com')
# next line will fail (will not go through the proxy) (https)
urllib2.urlopen('https://www.google.com')
Instead
proxy = urllib2.ProxyHandler({
    'http': '127.0.0.1',
    'https': '127.0.0.1'
})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
# this way both http and https requests go through the proxy
urllib2.urlopen('http://www.google.com')
urllib2.urlopen('https://www.google.com')
To use the default system proxies (e.g. from the http_proxy environment variable), the following works for the current request (without installing it into urllib2 globally):
url = 'http://www.example.com/'
proxy = urllib2.ProxyHandler()
opener = urllib2.build_opener(proxy)
in_ = opener.open(url)
in_.read()
In addition to the accepted answer:
My script gave me the error
  File "c:\Python23\lib\urllib2.py", line 580, in proxy_open
    if '@' in host:
TypeError: iterable argument required
The solution was to add http:// in front of the proxy string:
proxy = urllib2.ProxyHandler({'http': 'http://proxy.xy.z:8080'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
urllib2.urlopen('http://www.google.com')
One can also use requests to access a web page through a proxy. Python 3 code:
>>> import requests
>>> url = 'http://www.google.com'
>>> proxy = '169.50.87.252:80'
>>> requests.get(url, proxies={"http":proxy})
<Response [200]>
Proxies for more than one protocol can also be set.
>>> proxy1 = '169.50.87.252:80'
>>> proxy2 = '89.34.97.132:8080'
>>> requests.get(url, proxies={"http": proxy1, "https": proxy2})
<Response [200]>
In addition, set the proxy for the command-line session. Open a command line where you want to run your script and run:
netsh winhttp set proxy YourProxySERVER:yourProxyPORT
Then run your script in that terminal.