requests via a SOCKS proxy - Python

How can I make an HTTP request via a SOCKS proxy (simply using ssh -D as the proxy)? I've tried using requests with SOCKS proxies, but it doesn't appear to work (I saw this pull request). For example:
proxies = { "http": "socks5://localhost:9999/" }
r = requests.post( endpoint, data=request, proxies=proxies )
It'd be convenient to keep using the requests library, but I can also switch to urllib2 if that is known to work.

SOCKS support was added in requests 2.10.0, so this is remarkably simple and very close to what you already have.
Install requests[socks]:
$ pip install requests[socks]
Set up your proxies variable, and make use of it:
>>> import requests
>>> proxies = {
...     "http": "socks5://localhost:9999",
...     "https": "socks5://localhost:9999"
... }
>>> requests.get(
...     "https://api.ipify.org?format=json",
...     proxies=proxies
... ).json()
{u'ip': u'123.xxx.xxx.xxx'}
A few things to note: don't put a trailing / on the proxy URLs, and you can also use socks4:// as the scheme if the SOCKS server doesn't support SOCKS5.
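If you also want DNS resolution to happen on the proxy side (useful when the target hostname only resolves behind the SSH tunnel), recent versions of requests/urllib3 accept the socks5h:// scheme. A minimal sketch, assuming the same local ssh -D tunnel on port 9999:
import requests

# socks5h:// asks the SOCKS proxy to resolve hostnames instead of resolving them locally
proxies = {
    "http": "socks5h://localhost:9999",
    "https": "socks5h://localhost:9999"
}
print(requests.get("https://api.ipify.org?format=json", proxies=proxies).json())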

SOCKS support for requests is still pending. If you want, you can view my GitHub repository here to see my branch of the SocksiPy library. This is the branch that is currently being integrated into requests; it will be some time before requests fully supports it, though.
https://github.com/Anorov/PySocks/
It should work okay with urllib2. Import sockshandler in your file and follow the example inside it. You'll want to create an opener like this:
opener = urllib2.build_opener(SocksiPyHandler(socks.PROXY_TYPE_SOCKS5, "localhost", 9050))
Then you can use opener.open(url) and it should tunnel through the proxy.
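A minimal end-to-end sketch of that approach (assuming PySocks is installed and a SOCKS5 proxy is listening on localhost:9050):
import socks
import urllib2
from sockshandler import SocksiPyHandler

# Every request made through this opener is tunnelled via the SOCKS5 proxy
opener = urllib2.build_opener(SocksiPyHandler(socks.PROXY_TYPE_SOCKS5, "localhost", 9050))
response = opener.open("http://example.org")
print response.read()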

Related

what is the proper way to use proxies with requests in python

Requests is not honoring the proxies flag.
There is something I am missing about making a request over a proxy with the Python requests library.
If I enable the OS system proxy, it works; but if I make the request with only the requests module's proxies setting, the remote machine does not see the proxy set in requests, it sees my real IP, as if no proxy were set.
The example below shows this effect. At the time of this post the proxy below is alive, but any working proxy should replicate the effect.
import requests

proxy = {
    'http:': 'https://143.208.200.26:7878',
    'https:': 'http://143.208.200.26:7878'
}
data = requests.get(url='http://ip-api.com/json', proxies=proxy).json()
print('Ip: %s\nCity: %s\nCountry: %s' % (data['query'], data['city'], data['country']))
I also tried changing the proxy_dict format:
proxy = {
    'http:': '143.208.200.26:7878',
    'https:': '143.208.200.26:7878'
}
But it still has no effect.
I am using:
- Windows 10
- Python 3.9.6
- urllib 1.25.8
Many thanks in advance for any response to help sort this out.
OK, it is working now!
Credit for solving this goes to Olvin Rogh. Thanks, Olvin, for your help and for pointing out my problem: I was adding a colon ":" inside the keys.
This code is working now.
import json
import requests

PROXY = {'https': 'https://143.208.200.26:7878',
         'http': 'http://143.208.200.26:7878'}

with requests.Session() as session:
    session.proxies = PROXY
    r = session.get('http://ip-api.com/json')
    print(json.dumps(r.json(), indent=2))

Python - How to handle an HTTPS request with (urllib2 + SSL) through an HTTP proxy

I am trying to test a proxy connection by using urllib2.ProxyHandler. However, there will probably be situations where I need to request an HTTPS website (e.g. https://www.whatismyip.com/).
urllib2.urlopen() will throw an error when requesting an HTTPS site, so I tried to use a helper function that wraps urlopen.
Here is the helper function:
import ssl
import urllib2

def urlopen(url, timeout):
    if hasattr(ssl, 'SSLContext'):
        SslContext = ssl.create_default_context()
        SslContext.check_hostname = False
        SslContext.verify_mode = ssl.CERT_NONE
        return urllib2.urlopen(url, timeout=timeout, context=SslContext)
    else:
        return urllib2.urlopen(url, timeout=timeout)
This helper function is based on another answer.
Then I use:
urllib2.install_opener(
    urllib2.build_opener(
        urllib2.ProxyHandler({'http': '127.0.0.1:8080'})
    )
)
to set up an HTTP proxy for the urllib2 opener.
Ideally, it should work when I request a website using urlopen('http://whatismyip.com', 30), and all traffic should pass through the HTTP proxy.
However, urlopen() falls into the if hasattr(ssl, 'SSLContext') branch all the time, even for an HTTP site. In addition, HTTPS sites do not use the HTTP proxy either. This makes the HTTP proxy ineffective, and all traffic goes through the unproxied network.
I also tried this answer, changing HTTP into HTTPS with urllib2.ProxyHandler({'https': '127.0.0.1:8080'}), but it still does not work.
My proxy is working: if I use urllib2.urlopen() instead of the rewritten urlopen(), it works for HTTP sites.
But I do need to consider the situation where urlopen will be used on an HTTPS-only site.
How to do that?
Thanks
UPDATE 1: I cannot get this to work with Python 2.7.11, while some servers work properly with Python 2.7.5. I assume it is a Python version issue.
urllib2 will not go through the HTTPS proxy, so all HTTPS web addresses fail to use the proxy.
The problem is that when you pass the context argument to urllib2.urlopen(), urllib2 creates an opener itself instead of using the global one, which is the one that gets set when you call urllib2.install_opener(). As a result, the ProxyHandler instance you meant to use is not being used.
The solution is not to install the opener but to use it directly. When building your opener, you have to pass both an instance of your ProxyHandler class (to set proxies for the http and https protocols) and an instance of the HTTPSHandler class (to set the https context). A sketch is shown below.
I created https://bugs.python.org/issue29379 for this issue.
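A minimal sketch of that approach, reusing the local proxy address and the relaxed SSL context from the question:
import ssl
import urllib2

proxies = {'http': '127.0.0.1:8080', 'https': '127.0.0.1:8080'}

# Relaxed context, matching the question's helper (certificates are not verified)
context = ssl.create_default_context()
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE

# One opener carrying both the proxy settings and the HTTPS context;
# use it directly rather than installing it globally.
opener = urllib2.build_opener(
    urllib2.ProxyHandler(proxies),
    urllib2.HTTPSHandler(context=context),
)
response = opener.open('https://www.whatismyip.com/', timeout=30)
print response.read()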
I personally would suggest using something like python-requests, as it will alleviate a lot of the issues with setting up the proxy using urllib2 directly. When using requests with a proxy you will have to do the following (from their documentation):
import requests
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
requests.get('http://example.org', proxies=proxies)
And disabling SSL certificate verification is as simple as passing verify=False to the requests.get call above. However, this should be used sparingly, and the actual issue with the SSL certificate verification should be resolved.
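For example, reusing the proxies dictionary above (an HTTPS URL, since verification only applies to HTTPS):
requests.get('https://example.org', proxies=proxies, verify=False)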
One more solution is to pass context into HTTPSHandler and pass this handler into build_opener together with ProxyHandler:
import ssl
import urllib2

proxies = {'https': 'http://localhost:8080'}
proxy = urllib2.ProxyHandler(proxies)
context = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
handler = urllib2.HTTPSHandler(context=context)
opener = urllib2.build_opener(proxy, handler)
urllib2.install_opener(opener)
Now you can view all your HTTPS requests/responses in your proxy.

requests: how to disable / bypass proxy

I am getting a URL with:
r = requests.get("http://myserver.com")
As I can see in the 'access.log' of "myserver.com", the client's system proxy is used. But I want requests not to use a proxy at all.
The only way I'm currently aware of for disabling proxies entirely is the following:
Create a session
Set session.trust_env to False
Create your request using that session
import requests
session = requests.Session()
session.trust_env = False
response = session.get('http://www.stackoverflow.com')
This is based on this comment by Lukasa and the (limited) documentation for requests.Session.trust_env.
Note: Setting trust_env to False also ignores the following:
Authentication information from .netrc (code)
CA bundles defined in REQUESTS_CA_BUNDLE or CURL_CA_BUNDLE (code)
If however you only want to disable proxies for a particular domain (like localhost), you can use the NO_PROXY environment variable:
import os
import requests
os.environ['NO_PROXY'] = 'stackoverflow.com'
response = requests.get('http://www.stackoverflow.com')
You can choose proxies for each request. From the docs:
import requests
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}
requests.get("http://example.org", proxies=proxies)
So to disable the proxy, just set each one to the empty string:
import requests
proxies = {
    "http": "",
    "https": "",
}
requests.get("http://example.org", proxies=proxies)
Update: Switched from None to "", see comments.
The way to stop requests/urllib from proxying any requests is to set the no_proxy (or NO_PROXY) environment variable to *, e.g. in bash:
export no_proxy='*'
Or from Python:
import os
os.environ['no_proxy'] = '*'
This works because the urllib.request.getproxies function first checks for any proxies set in environment variables (e.g. http_proxy, HTTP_PROXY, https_proxy, HTTPS_PROXY, etc.); if none are set, it checks for system-configured proxies using platform-specific calls (on macOS it uses the system scutil/configd interfaces, and on Windows it checks the Registry). As mentioned in the comments, if any proxy variables are set you can reset them as #udani suggested, or unset them like this from Python:
del os.environ['HTTP_PROXY']
Then, when urllib attempts to use any proxies, the proxy handler checks for the presence and value of the no_proxy environment variable, which can be set either to specific hostnames as mentioned above or to the special * value, whereby all hosts bypass the proxy.
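A quick way to inspect what urllib has picked up (a small sketch using the Python 3 standard library):
import os
import urllib.request

os.environ['no_proxy'] = '*'

# Proxies discovered from the environment or the operating system
print(urllib.request.getproxies())

# True means this host would bypass any configured proxy
print(urllib.request.proxy_bypass('example.org'))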
With Python 3, jtpereyda's solution didn't work, but the following did:
proxies = {
    "http": "",
    "https": "",
}
The requests library respects environment variables.
http://docs.python-requests.org/en/latest/user/advanced/#proxies
So try deleting environment variables HTTP_PROXY and HTTPS_PROXY.
import os

for k in list(os.environ.keys()):
    if k.lower().endswith('_proxy'):
        del os.environ[k]
I implemented #jtpereyda's solution in our production codebase, where it worked fine on normal successful HTTP requests (200 OK), but it ended up not working when receiving an HTTP redirect (301 Moved Permanently). Instead, use:
requests.get("https://pypi.org/pypi/pillow/9.0.0/json", proxies={"http": "", "https": ""})
For comparison, this line causes a requests.exceptions.SSLError when behind a proxy (pypi.org tries to redirect us to Pillow with an uppercase P):
requests.get("https://pypi.org/pypi/pillow/9.0.0/json", proxies={"http": None, "https": None})
I faced the same issue when connecting to localhost to access my .NET backend from a Python script with the requests module. I set verify to False, which disables the default SSL verification:
r = requests.post('https://localhost:44336/api/', data='', verify=False)
P.S. The code above will throw a warning that can be suppressed with the following:
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
r = requests.post('https://localhost:44336/api/', data='', verify=False)
For those for whom no_proxy="*" doesn't work, try 0.0.0.0/32; that worked for me.
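For instance (a sketch; whether this value takes effect depends on how your platform parses no_proxy):
import os
os.environ['no_proxy'] = '0.0.0.0/32'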

Using urllib2 via proxy

I am trying to use urllib2 through a proxy; however, after trying just about every variation of passing my verification details using urllib2, I either get a request that hangs forever and returns nothing, or I get 407 errors. I can connect to the web fine using my browser, which connects to a proxy PAC and redirects accordingly; however, I can't seem to do anything via the command line with curl, wget, urllib2, etc., even if I use the proxies that the proxy PAC redirects to. I tried setting my proxy to all of the proxies from the PAC file using urllib2, none of which work.
My current script looks like this:
import urllib2 as url
proxy = url.ProxyHandler({'http': 'username:password@my.proxy:8080'})
auth = url.HTTPBasicAuthHandler()
opener = url.build_opener(proxy, auth, url.HTTPHandler)
url.install_opener(opener)
url.urlopen("http://www.google.com/")
which throws HTTP Error 407: Proxy Authentication Required. I also tried:
import urllib2 as url
handlePass = url.HTTPPasswordMgrWithDefaultRealm()
handlePass.add_password(None, "http://my.proxy:8080", "username", "password")
auth_handler = url.HTTPBasicAuthHandler(handlePass)
opener = url.build_opener(auth_handler)
url.install_opener(opener)
url.urlopen("http://www.google.com")
which hangs like curl or wget timing out.
What do I need to do to diagnose the problem? How is it possible that I can connect via my browser but not from the command line on the same computer using what would appear to be the same proxy and credentials?
Might it be something to do with the router? If so, how can it distinguish between browser HTTP requests and command-line HTTP requests?
Frustrations like this are what drove me to use Requests. If you're doing significant amounts of work with urllib2, you really ought to check it out. For example, to do what you wish to do using Requests, you could write:
import requests
from requests.auth import HTTPProxyAuth
proxy = {'http': 'http://my.proxy:8080'}
auth = HTTPProxyAuth('username', 'password')
r = requests.get('http://wwww.google.com/', proxies=proxy, auth=auth)
print r.text
Or you could wrap it in a Session object and every request will automatically use the proxy information (plus it will store & handle cookies automatically!):
s = requests.Session()
s.proxies = proxy
s.auth = auth
r = s.get('http://www.google.com/')
print r.text

urllib2 failing with https websites

Using urllib2 and trying to get an https page, it keeps failing with
Invalid url, unable to resolve
The url is
https://www.domainsbyproxy.com/default.aspx
but I have this happening on multiple https sites.
I am using Python 2.7, and below is the code I am using to set up the connection:
opener = urllib2.OpenerDirector()
opener.add_handler(urllib2.HTTPHandler())
opener.add_handler(urllib2.HTTPDefaultErrorHandler())
opener.addheaders = [('Accept-encoding', 'gzip')]
fetch_timeout = 12
response = opener.open(url, None, fetch_timeout)
The reason I am setting handlers manually is that I don't want redirects handled (which works fine). The above works fine for HTTP requests; however, HTTPS fails.
Any clues?
You should be using HTTPSHandler instead of HTTPHandler.
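A sketch of the question's opener with the HTTPS handler added (keeping the manual handler setup so redirects are still not followed):
import urllib2

opener = urllib2.OpenerDirector()
opener.add_handler(urllib2.HTTPSHandler())           # handles https:// URLs
opener.add_handler(urllib2.HTTPHandler())            # keep plain http:// working too
opener.add_handler(urllib2.HTTPDefaultErrorHandler())
opener.addheaders = [('Accept-encoding', 'gzip')]

fetch_timeout = 12
response = opener.open('https://www.domainsbyproxy.com/default.aspx', None, fetch_timeout)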
If you don't mind external libraries, consider the excellent requests module. It takes care of these quirks with urllib.
Your code, using requests is:
import requests
r = requests.get(url, headers={'Accept-encoding': 'gzip'}, timeout=12)
