Python Requests, Work Proxy and Django REST

I'd appreciate your help here, thanks in advance.
My Problem:
I am using Python's Requests module for GET/POST requests to a Django REST API that sits behind a work proxy. I am unable to get past the proxy and I encounter an error. I have summarised this below.
Using the following code (what I've tried):
s = requests.Session()
s.headers = {
    "User-Agent": [someGenericUserAgent]
}
s.trust_env = False
proxies = {
    'http': 'http://[domain]\[userName]:[password]@[proxy]:8080',
    'https': 'https://[domain]\[userName]:[password]@[proxy]:8080'
}
os.environ['NO_PROXY'] = [APIaddress]
os.environ['no_proxy'] = [APIaddress]
r = s.post(url=[APIaddress], proxies=proxies)
With this I get an error:
... OSError('Tunnel connection failed: 407 Proxy Authentication Required')))
Additional Context:
This is on a Windows 10 machine.
Work uses an "automatic proxy setup" script (a .pac file). Looking at the script, there are a number of proxies that are automatically assigned depending on the IP address of the machine. I have tried all of these proxies as [proxy] above, with the same error.
The above works when I am not running through the work network and don't use the additional proxy settings (removing proxies=proxies), i.e. on my home network.
I have no issues with a GET request to the Django REST API view from my browser via the proxy.
Things I am uncertain about:
I don't know if I am using the right [proxy]. Is there a way to verify this? I have tried [findMyProxy]-type sites; using the IP addresses they report, it still doesn't work.
I don't know if I am using [domain]\[userName] correctly. Is a \ correct? My work does use a domain.
I'm certain it is not a Requests issue, as trying pip install --proxy http://[domain]\[userName]:[password]@[proxy]:8080 someModule produces the same 407 error.
Any help appreciated.

How I came to the solution:
I used curl to establish a 200 response. After a lot of trial and error, success was:
$ curl -U DOMAIN\USER:PW -v -x http://LOCATION_OF_PAC_FILE --proxy-ntlm www.google.com
Where -U is the domain, username and password (the proxy credentials).
-v is verbose, which made it easier to debug.
-x is the proxy; in my case, the location of the .pac file. curl automatically determines the proxy IP from the PAC; Requests does not do this by default (as far as I know).
I used curl to determine that my proxy was using NTLM.
www.google.com is just an external site to test the proxy auth against.
NOTE: only one \ between domain and username.
I found it was impossible to make Requests use NTLM by default, so I used requests-ntlm2 instead.
Passing the PAC file through requests-ntlm2 did not work, so I used pypac to autodiscover the PAC file and then determine the proxy based on the URL (see the sketch below).
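If you want to confirm which proxy the PAC file actually picks for a given URL, pypac can evaluate it directly. A minimal sketch, assuming pypac exposes get_pac() and that the returned PAC object has find_proxy_for_url(); the PAC URL and API URL below are placeholders:
from pypac import get_pac

# Hypothetical PAC location; omit url= to let pypac try WPAD autodiscovery instead.
pac = get_pac(url='http://internal.example.com/proxy.pac')
if pac is not None:
    # Returns a PAC directive string such as "PROXY 10.1.2.3:8080; DIRECT"
    print(pac.find_proxy_for_url('http://api.example.com/endpoint', 'api.example.com'))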
The working code is as follows:
from pypac import PACSession
from requests_ntlm2 import (
    HttpNtlmAuth,
    HttpNtlmAdapter,
    NtlmCompatibility
)

username = 'DOMAIN\\USERNAME'
password = 'PW'

# Don't need the following thanks to pypac
# proxy_ip = 'PROXY_IP'
# proxy_port = "PORT"
# proxies = {
#     'http': 'http://{}:{}'.format(proxy_ip, proxy_port),
#     'https': 'http://{}:{}'.format(proxy_ip, proxy_port)
# }

ntlm_compatibility = NtlmCompatibility.NTLMv2_DEFAULT

# session = requests.Session()  <- replaced with PACSession()
session = PACSession()
session.mount(
    'https://',
    HttpNtlmAdapter(
        username,
        password,
        ntlm_compatibility=ntlm_compatibility
    )
)
session.mount(
    'http://',
    HttpNtlmAdapter(
        username,
        password,
        ntlm_compatibility=ntlm_compatibility
    )
)
session.auth = HttpNtlmAuth(
    username,
    password,
    ntlm_compatibility=ntlm_compatibility
)

# Don't need the following thanks to pypac
# session.proxies = proxies

response = session.get('http://www.google.com')
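If WPAD autodiscovery cannot locate the PAC file on your machine, pypac can also be pointed at it explicitly. A minimal sketch, assuming get_pac() and PACSession's pac= argument (the PAC URL is a placeholder):
from pypac import PACSession, get_pac

# Hypothetical PAC location served on the corporate network.
pac = get_pac(url='http://internal.example.com/proxy.pac')
session = PACSession(pac=pac)
# ...then mount the HttpNtlmAdapter instances and set session.auth exactly as above.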

Related

What is the proper way to use proxies with requests in Python

Requests is not honoring the proxies flag.
There is something I am missing about making a request over a proxy with the Python requests library.
If I enable the OS system proxy it works, but if I make the request with just the requests module's proxies setting, the remote machine does not see the proxy set in requests but sees my real IP; it is as if no proxy were set.
The below example shows this effect. At the time of this post the below proxy is alive, but any working proxy should replicate the effect.
import requests

proxy = {
    'http:': 'https://143.208.200.26:7878',
    'https:': 'http://143.208.200.26:7878'
}
data = requests.get(url='http://ip-api.com/json', proxies=proxy).json()
print('Ip: %s\nCity: %s\nCountry: %s' % (data['query'], data['city'], data['country']))
I also tried changing the proxy_dict format:
proxy = {
    'http:': '143.208.200.26:7878',
    'https:': '143.208.200.26:7878'
}
But it still has no effect.
I am using:
- Windows 10
- Python 3.9.6
- urllib3 1.25.8
Many thanks in advance for any response to help sort this out.
OK, it's working now!
Credit for solving this goes to Olvin Rogh. Thanks, Olvin, for your help and for pointing out my problem: I was adding a colon ":" inside the keys.
This code is working now.
import json
import requests

PROXY = {
    'https': 'https://143.208.200.26:7878',
    'http': 'http://143.208.200.26:7878'
}
with requests.Session() as session:
    session.proxies = PROXY
    r = session.get('http://ip-api.com/json')
    print(json.dumps(r.json(), indent=2))

Python requests with an authenticated proxy without exposing username and password

I am using the requests library to connect to an API and get data. I have the proxy details to connect with, and it works as an authenticated proxy.
This is how my code looks.
import requests

s = requests.Session()
url = "https://testapi"
urlkey = "testkey"
proxies = {'https': 'http://<UserName>:<Password>@proxy_url:8080'}
resp = s.get(url, params={'key': urlkey}, proxies=proxies)
content = resp.content
print content
The username and password are exposed here and I want to avoid that. How can I achieve that? Can I ask requests to use default credentials, or the credentials of the account that is running the Python script?
In .NET, the following config works:
<system.net>
  <defaultProxy enabled="true" useDefaultCredentials="true">
    <proxy proxyaddress="https://testapi:8080" bypassonlocal="True"/>
  </defaultProxy>
</system.net>
Thanks all.
Similar question here.
From the requests docs:
You can also configure proxies by setting the environment variables
HTTP_PROXY and HTTPS_PROXY.
So you could avoid revealing the credentials in the script by setting the environment variable instead:
$ export HTTPS_PROXY="http://<UserName>:<Password>@proxy_url:8080"
And then in the script you would use the proxy by just calling:
resp = s.get(url, params={'key':urlkey })
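If you cannot rely on a pre-set environment variable, another option in the same spirit is to build the proxy URL at runtime so the credentials never appear in the script itself. A minimal sketch, assuming the credentials are supplied via hypothetical PROXY_USER/PROXY_PASS environment variables or prompted for with getpass:
import os
from getpass import getpass
import requests

# Read credentials from the environment, or prompt without echoing the password.
user = os.environ.get('PROXY_USER') or input('Proxy username: ')
password = os.environ.get('PROXY_PASS') or getpass('Proxy password: ')
proxies = {'https': 'http://{}:{}@proxy_url:8080'.format(user, password)}

s = requests.Session()
resp = s.get("https://testapi", params={'key': 'testkey'}, proxies=proxies)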

How to authenticate internal corporate proxy with credentials in order to reach external API

We have a requirement to consume an external API; in order to reach their endpoint, we need to authenticate against our corporate proxy first.
How can we achieve this using Python? It seems there is an equivalent in C#: CredentialCache.DefaultCredentials.
So far I have tried:
import requests

proxies = {"https": "https://url:port/file"}
client_cert = ("key/path", "cert/path")
data = """xml request"""
requests.post(url, proxies=proxies, data=data, cert=client_cert)
I have read in the docs that there is HTTP digest authentication, and that I can use https://username:password@url:port/file.
Any suggestions?
ERROR:
HTTPSConnectionPool, failed to establish connection
Actually, my question has an answer:
proxy = {"http": "http://username:password@proxy:port", "https": "http://username:password@proxy:port"}
requests.post(url, headers=headers, auth=auth, cert=cert, data=payload, proxies=proxy)  # ===> works
Or else we can set the environment variables:
export https_proxy="http://username:password@proxy:port"
export http_proxy="http://username:password@proxy:port"
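On Windows (as in some of the setups above), the cmd.exe equivalent for the current session would be roughly:
set https_proxy=http://username:password@proxy:port
set http_proxy=http://username:password@proxy:port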
In my case, there were multiple proxies for our company and I was using the incorrect proxy details. When I tried with the correct ones, it worked.
Thanks to Stack

requests library https get via proxy leads to error

Trying to send a simple GET request via a proxy. I have the 'Proxy-Authorization' and 'Authorization' headers; I don't think I needed the 'Authorization' header, but I added it anyway.
import requests, base64

URL = 'https://www.google.com'
sess = requests.Session()
user = 'someuser'
password = 'somepass'
token = base64.encodestring('%s:%s' % (user, password)).strip()
sess.headers.update({'Proxy-Authorization': 'Basic %s' % token})
sess.headers['Authorization'] = 'Basic %s' % token
resp = sess.get(URL)
I get the following error:
requests.packages.urllib3.exceptions.ProxyError: Cannot connect to proxy. Socket error: Tunnel connection failed: 407 Proxy Authentication Required.
However when I change the URL to simple http://www.google.com, it works fine.
Do proxies use Basic, Digest, or some other sort of authentication for https? Is it proxy server specific? How do I discover that info? I need to achieve this using the requests library.
UPDATE
It seems that with HTTP requests we have to pass in a Proxy-Authorization header, but with HTTPS requests we need to format the proxy URL with the username and password:
#HTTP
import requests, base64
URL = 'http://www.google.com'
user = <username>
password = <password>
proxies = {'http': 'http://<IP>:<PORT>'}
token = base64.encodestring('%s:%s' % (user, password)).strip()
myheader = {'Proxy-Authorization': 'Basic %s' % token}
r = requests.get(URL, proxies=proxies, headers=myheader)
print r.status_code  # 200
#HTTPS
import requests
URL = 'https://www.google.com'
user = <username>
password = <password>
proxy = {'https': 'http://<user>:<password>@<IP>:<PORT>'}
r = requests.get(URL, proxies=proxy)
print r.status_code  # 200
When sending an HTTP request, if I leave out the header and pass in a proxy formatted with user/pass, I get a 407 response.
When sending an HTTPS request, if I pass in the header and leave the proxy unformatted, I get the ProxyError mentioned earlier.
I am using requests 2.0.0 and a Squid proxy-caching web server. Why doesn't the header option work for HTTPS? Why doesn't the formatted proxy work for HTTP?
The answer is that the HTTP case is bugged. The expected behaviour in that case is the same as the HTTPS case: that is, you provide your authentication credentials in the proxy URL.
The reason the header option doesn't work for HTTPS is that HTTPS via proxies is totally different to HTTP via proxies. When you route an HTTP request via a proxy, you essentially just send a standard HTTP request to the proxy with a path that indicates a totally different host, like this:
GET http://www.google.com/ HTTP/1.1
Host: www.google.com
The proxy then basically forwards this on.
For HTTPS that can't possibly work, because you need to negotiate an SSL connection with the remote server. Rather than doing anything like the HTTP case, you use the CONNECT verb. The proxy server connects to the remote end on behalf of the client, and from then on just proxies the TCP data. (More information here.)
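To illustrate, the tunnel is established with a request roughly like the one below, and this CONNECT message is the only request the proxy itself ever sees, so this is where it expects the credentials:
CONNECT www.google.com:443 HTTP/1.1
Host: www.google.com:443
Proxy-Authorization: Basic <base64 of user:password>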
When you attach a Proxy-Authorization header to the HTTPS request, we don't put it on the CONNECT message, we put it on the tunnelled HTTPS message. This means the proxy never sees it, so refuses your connection. We special-case the authentication information in the proxy URL to make sure it attaches the header correctly to the CONNECT message.
Requests and urllib3 are currently in discussion about the right place for this bug fix to go. The GitHub issue is currently here. I expect that the fix will be in the next Requests release.

Using urllib2 via proxy

I am trying to use urllib2 through a proxy; however, after trying just about every variation of passing my verification details using urllib2, I either get a request that hangs forever and returns nothing, or I get 407 errors. I can connect to the web fine using my browser, which connects to a proxy PAC and redirects accordingly; however, I can't seem to do anything via the command line (curl, wget, urllib2, etc.), even if I use the proxies that the PAC redirects to. I tried setting my proxy to all of the proxies from the PAC file using urllib2, none of which work.
My current script looks like this:
import urllib2 as url
proxy = url.ProxyHandler({'http': 'username:password@my.proxy:8080'})
auth = url.HTTPBasicAuthHandler()
opener = url.build_opener(proxy, auth, url.HTTPHandler)
url.install_opener(opener)
url.urlopen("http://www.google.com/")
which throws HTTP Error 407: Proxy Authentication Required. I also tried:
import urllib2 as url
handlePass = url.HTTPPasswordMgrWithDefaultRealm()
handlePass.add_password(None, "http://my.proxy:8080", "username", "password")
auth_handler = url.HTTPBasicAuthHandler(handlePass)
opener = url.build_opener(auth_handler)
url.install_opener(opener)
url.urlopen("http://www.google.com")
which hangs like curl or wget timing out.
What do I need to do to diagnose the problem? How is it possible that I can connect via my browser but not from the command line on the same computer using what would appear to be the same proxy and credentials?
Might it be something to do with the router? if so, how can it distinguish between browser HTTP requests and command line HTTP requests?
Frustrations like this are what drove me to use Requests. If you're doing significant amounts of work with urllib2, you really ought to check it out. For example, to do what you wish to do using Requests, you could write:
import requests
from requests.auth import HTTPProxyAuth
proxy = {'http': 'http://my.proxy:8080'}
auth = HTTPProxyAuth('username', 'password')
r = requests.get('http://wwww.google.com/', proxies=proxy, auth=auth)
print r.text
Or you could wrap it in a Session object, and every request will automatically use the proxy information (plus it will store and handle cookies automatically!):
s = requests.Session()
s.proxies = proxy
s.auth = auth
r = s.get('http://www.google.com/')
print r.text
