I'm attempting to use Bing's Entity search API, however I need to go through a proxy in order to establish a connection.
My company has given me a proxy in the following format to use: 'http://username:password@webproxy.subdomain.website.com:8080'
I've tried using HTTPConnection.set_tunnel(host, port=None, headers=None) to no avail. Does anyone have any idea how to run the Python query below using the proxy I was provided?
import http.client, urllib.parse
import json

proxy = 'http://username:pwd@webproxy.subdomain.website.com:8080'
subscriptionKey = 'MY_KEY_HERE'
host = 'api.cognitive.microsoft.com'
path = '/bing/v7.0/entities'
mkt = 'en-US'
query = 'italian restaurants near me'
params = '?mkt=' + mkt + '&q=' + urllib.parse.quote(query)

def get_suggestions():
    headers = {'Ocp-Apim-Subscription-Key': subscriptionKey}
    conn = http.client.HTTPSConnection(host)
    conn.request("GET", path + params, None, headers)
    response = conn.getresponse()
    return response.read()

result = get_suggestions()
print(json.dumps(json.loads(result), indent=4))
As a side note, I was able to run the following successfully with the proxy.
nltk.set_proxy('http://username:pwd@webproxy.subdomain.website.com:8080')
nltk.download('wordnet')
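For reference, the tunnelling approach mentioned above would look roughly like the sketch below with http.client, assuming the proxy accepts Basic credentials on the CONNECT request; the proxy address, credentials, and subscription key are just the placeholders from the question.
import base64
import http.client
import json
import urllib.parse

# Placeholder proxy address and credentials from the question.
proxy_host, proxy_port = 'webproxy.subdomain.website.com', 8080
credentials = base64.b64encode(b'username:pwd').decode()

conn = http.client.HTTPSConnection(proxy_host, proxy_port)
# Open a CONNECT tunnel to the real host through the proxy,
# sending the credentials along with the CONNECT request.
conn.set_tunnel('api.cognitive.microsoft.com',
                headers={'Proxy-Authorization': 'Basic ' + credentials})
params = '?mkt=en-US&q=' + urllib.parse.quote('italian restaurants near me')
conn.request('GET', '/bing/v7.0/entities' + params,
             headers={'Ocp-Apim-Subscription-Key': 'MY_KEY_HERE'})
print(json.loads(conn.getresponse().read()))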
I ended up using the requests package; it makes it very simple to channel your request through a proxy. Sample code below for reference:
import requests
import urllib.parse

subscriptionKey = 'MY_KEY_HERE'
query = 'italian restaurants near me'

proxies = {
    'http': 'http://username:pwd@webproxy.subdomain.website.com:8080',
    'https': 'https://username:pwd@webproxy.subdomain.website.com:8080'
}
url = 'https://api.cognitive.microsoft.com/bing/v7.0/entities/'
# query string parameters
params = 'mkt=' + 'en-US' + '&q=' + urllib.parse.quote(query)
# custom headers
headers = {'Ocp-Apim-Subscription-Key': subscriptionKey}
# start a session
session = requests.Session()
# persist the proxy settings across requests
session.proxies = proxies
# make the GET request
r = session.get(url, params=params, headers=headers)
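Continuing the snippet above, a quick sanity check on the response (r.json() assumes the API returned a JSON body):
# Inspect the response that came back through the proxy.
print(r.status_code)
print(r.json())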
If you use urllib2 or requests, the easiest way to configure a proxy is to set the HTTP_PROXY or HTTPS_PROXY environment variable:
export HTTPS_PROXY=https://user:pass@host:port
This way the library handles the proxy configuration for you, and if your code runs inside a container you can pass that information at runtime.
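For example, requests picks the variable up automatically, so no proxy-specific code is needed; a minimal sketch, with the proxy URL and subscription key as placeholders:
import os
import requests

# Placeholder proxy URL; in a container this would typically be injected at runtime.
os.environ['HTTPS_PROXY'] = 'http://user:pass@webproxy.subdomain.website.com:8080'

# requests honours HTTP_PROXY/HTTPS_PROXY because Session.trust_env defaults to True.
r = requests.get('https://api.cognitive.microsoft.com/bing/v7.0/entities',
                 params={'mkt': 'en-US', 'q': 'italian restaurants near me'},
                 headers={'Ocp-Apim-Subscription-Key': 'MY_KEY_HERE'})
print(r.status_code)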
I am trying to send a SOAP request using the Python requests library. I am behind a corporate proxy to which I have to authenticate using NTLM authentication. However, when I try to send the request, I always get status code 407 from the proxy server and the following error:
The NTLM challenge header was not found - HTTP 407: failed to establish NTLM proxy tunnel
Code snippet below:
import wincertstore
import os
from requests_ntlm2 import HttpNtlmAuth, NtlmCompatibility, HttpNtlmAdapter
import requests_pkcs12
import requests
import traceback

def create_pem_of_system_certs():
    certfile = wincertstore.CertFile()
    certfile.addstore("CA")
    certfile.addstore("ROOT")
    return certfile.name

def get_headers():
    return {
        "content-type": "text/xml",
        "SOAPAction": "",
        "Proxy-Connection": "Keep-Alive"
    }

os.environ["REQUESTS_CA_BUNDLE"] = create_pem_of_system_certs()

local_user = os.environ["userdomain"] + "\\" + os.getlogin()  # domain\\username
local_user_password = "local_user_password"
pkcs12_pass = "p12_pass"
pkcs12_file = "file.p12"

content = '<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"> ' \
          '<soapenv:Header/>' \
          '<soapenv:Body>' \
          '...' \
          '</soapenv:Body>' \
          '</soapenv:Envelope>'

ntlm_proxy = "proxy_ip:proxy_port"
http_dict = "http://{}".format(ntlm_proxy)
https_dict = http_dict
proxies = {
    "http": http_dict,
    "https": https_dict
}

ntlm_compatibility = NtlmCompatibility.NTLMv2_DEFAULT

url = "URL of WSDL"

session = requests.Session()
session.mount(
    url,
    requests_pkcs12.Pkcs12Adapter(
        pkcs12_filename=pkcs12_file,
        pkcs12_password=pkcs12_pass
    )
)
session.mount(
    "https://",
    HttpNtlmAdapter(
        local_user,
        local_user_password,
        ntlm_compatibility=ntlm_compatibility
    )
)
session.mount(
    "http://",
    HttpNtlmAdapter(
        local_user,
        local_user_password,
        ntlm_compatibility=ntlm_compatibility
    )
)
session.auth = HttpNtlmAuth(
    local_user,
    local_user_password,
    ntlm_compatibility=ntlm_compatibility
)
session.proxies = proxies

try:
    response = session.post(url=url,
                            data=content,
                            headers=get_headers()
                            )
    print("Status code: {}".format(response.status_code))
except:
    print("{}".format(traceback.format_exc()))
I tried with different levels of NTLM, according to:
https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-2000-server/cc960646%28v=technet.10%29
I was searching for different examples of how to use NTLM authentication with the Python requests library, but I couldn't find any that would include the headers passed as a parameter to the POST method, as in my snippet above. When I send the request without the headers parameter, I get a "no SOAPAction header" error from the server. I suppose I should add some additional authorization headers inside the headers dict, or should the libraries do that?
Can anyone please help me with this?
Thank you very much!
Best regards.
I am trying to send a GET request through a proxy with authentication.
I have the following existing code:
import httplib
username = 'myname'
password = '1234'
proxyserver = "136.137.138.139"
url = "http://google.com"
c = httplib.HTTPConnection(proxyserver, 83, timeout = 30)
c.connect()
c.request("GET", url)
resp = c.getresponse()
data = resp.read()
print data
When running this code, I get an answer from the proxy saying that I must provide authentication, which is correct.
In my code, I don't use the login and password. My problem is that I don't know how to use them!
Any idea?
You can refer to this code if you specifically want to use httplib.
https://gist.github.com/beugley/13dd4cba88a19169bcb0
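A minimal sketch of that approach, assuming the proxy accepts Basic credentials (on Python 2 the module is named httplib; the addresses below are the ones from the question):
import base64
import http.client  # 'httplib' on Python 2

username, password = 'myname', '1234'
proxyserver, proxyport = '136.137.138.139', 83

# Encode the credentials for a Proxy-Authorization header (Basic scheme assumed).
credentials = base64.b64encode('{}:{}'.format(username, password).encode()).decode()

# For a plain-HTTP target you connect to the proxy itself, request the absolute URL,
# and pass the credentials along with the request.
conn = http.client.HTTPConnection(proxyserver, proxyport, timeout=30)
conn.request("GET", "http://google.com/",
             headers={"Proxy-Authorization": "Basic " + credentials})
resp = conn.getresponse()
print(resp.status)
print(resp.read()[:200])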
But you could also use the easier requests module.
import requests
proxies = {
    "http": "http://username:password@proxyserver:port/",
    # "https": "https://username:password@proxyserver:port/",
}
url = 'http://google.com'
data = requests.get(url, proxies=proxies)
I'm using the Python requests package to send HTTP requests. I want to add a single proxy to the requests session object, e.g.
session = requests.Session()
session.proxies = {...} # Here I want to add a single proxy
Currently I am looping through a bunch of proxies, and at each iteration a new session is made. I only want to set a single proxy for each iteration.
The only example I see in the documentation is:
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}
requests.get("http://example.org", proxies=proxies)
I've tried to follow this, but to no avail. Here is my code from the script:
# eg. line = 59.43.102.33:80
r = s.get('http://icanhazip.com', proxies={'http': 'http://' + line})
But I get an error:
requests.packages.urllib3.exceptions.LocationParseError: Failed to parse 59.43.102.33:80
How is it possible to set a single proxy on a session object?
In addition to @neowu's answer, if you would like to set a proxy for the lifetime of a session object, you can also do the following:
import requests
proxies = {'http': 'http://10.11.4.254:3128'}
s = requests.session()
s.proxies.update(proxies)
s.get("http://www.example.com") # Here the proxies will also be automatically used because we have attached those to the session object, so no need to pass separately in each call
In fact, you are right, but you must check your definition of 'line'. I have tried this, and it's OK:
>>> import requests
>>> s = requests.Session()
>>> s.get("http://www.baidu.com", proxies={'http': 'http://10.11.4.254:3128'})
<Response [200]>
Did you define the line like line = ' 59.43.102.33:80'? There is a space at the front of the address.
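If that is the case, stripping the whitespace before building the proxy URL avoids the parse error; a sketch using the address from the question:
import requests

s = requests.Session()
line = ' 59.43.102.33:80'               # e.g. a line read from a proxy list file
proxy_url = 'http://' + line.strip()    # remove stray whitespace and keep the scheme
r = s.get('http://icanhazip.com', proxies={'http': proxy_url})
print(r.text)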
There are other ways you can set proxies, apart from the solutions you have got so far:
import requests
with requests.Session() as s:
    # either like this
    s.proxies = {'https': 'http://105.234.154.195:8888', 'http': 'http://199.188.92.69:8000'}
    # or like this
    s.proxies['https'] = 'http://105.234.154.195:8888'
    r = s.get(link)
Hopefully this may lead to an answer:
urllib3.util.url.parse_url(url)
Given a url, return a parsed Url namedtuple. Best-effort is performed to parse incomplete urls. Fields not provided will be None.
Retrieved from https://urllib3.readthedocs.org/en/latest/helpers.html
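For instance, a quick check of how parse_url splits the address from the question (a small sketch):
from urllib3.util import parse_url

parsed = parse_url('59.43.102.33:80')
# The scheme comes back as None when it is missing from the string, which is why
# requests needs the explicit 'http://' prefix on the proxy address.
print(parsed.scheme, parsed.host, parsed.port)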
I was using the Mechanize module a while ago, and now I am trying to use the Requests module.
(Python mechanize doesn't work when HTTPS and Proxy Authentication required)
I have to go through proxy-server when I access the Internet.
The proxy-server requires authentication. I wrote the following code:
import requests
from requests.auth import HTTPProxyAuth
proxies = {"http":"192.168.20.130:8080"}
auth = HTTPProxyAuth("username", "password")
r = requests.get("http://www.google.co.jp/", proxies=proxies, auth=auth)
The above code works well when the proxy-server requires basic authentication.
Now I want to know what I have to do when the proxy-server requires digest authentication.
HTTPProxyAuth seems not to be effective for digest authentication (r.status_code returns 407).
No need to implement your own, in most cases!
Requests has built-in support for proxies; for basic authentication:
proxies = { 'https' : 'https://user:password@proxyip:port' }
r = requests.get('https://url', proxies=proxies)
See more in the docs.
Or in case you need digest authentication, HTTPDigestAuth may help.
Or you might need to try to extend it like yutaka2487 did below.
Note: you must use the IP of the proxy server, not its name!
I wrote a class that can be used for proxy authentication (based on digest auth).
I borrowed almost all of the code from requests.auth.HTTPDigestAuth.
import requests
import requests.auth
class HTTPProxyDigestAuth(requests.auth.HTTPDigestAuth):
    def handle_407(self, r):
        """Takes the given response and tries digest-auth, if needed."""
        num_407_calls = r.request.hooks['response'].count(self.handle_407)
        s_auth = r.headers.get('Proxy-authenticate', '')
        if 'digest' in s_auth.lower() and num_407_calls < 2:
            self.chal = requests.auth.parse_dict_header(s_auth.replace('Digest ', ''))

            # Consume content and release the original connection
            # to allow our new request to reuse the same one.
            r.content
            r.raw.release_conn()

            r.request.headers['Authorization'] = self.build_digest_header(r.request.method, r.request.url)
            r.request.send(anyway=True)
            _r = r.request.response
            _r.history.append(r)

            return _r

        return r

    def __call__(self, r):
        if self.last_nonce:
            r.headers['Proxy-Authorization'] = self.build_digest_header(r.method, r.url)
        r.register_hook('response', self.handle_407)
        return r
Usage:
proxies = {
    "http":  "192.168.20.130:8080",
    "https": "192.168.20.130:8080",
}
auth = HTTPProxyDigestAuth("username", "password")
# HTTP
r = requests.get("http://www.google.co.jp/", proxies=proxies, auth=auth)
r.status_code # 200 OK
# HTTPS
r = requests.get("https://www.google.co.jp/", proxies=proxies, auth=auth)
r.status_code # 200 OK
I've written a Python module (available here) which makes it possible to authenticate with an HTTP proxy using the digest scheme. It works when connecting to HTTPS websites (through monkey patching) and allows you to authenticate with the website as well. This should work with the latest requests library for both Python 2 and 3.
The following example fetches the webpage https://httpbin.org/ip through HTTP proxy 1.2.3.4:8080 which requires HTTP digest authentication using user name user1 and password password1:
import requests
from requests_digest_proxy import HTTPProxyDigestAuth
s = requests.Session()
s.proxies = {
    'http': 'http://1.2.3.4:8080/',
    'https': 'http://1.2.3.4:8080/'
}
s.auth = HTTPProxyDigestAuth('user1', 'password1')
print(s.get('https://httpbin.org/ip').text)
Should the website require some kind of HTTP authentication, this can be specified to the HTTPProxyDigestAuth constructor this way:
# HTTP Basic authentication for website
s.auth = HTTPProxyDigestAuth(('user1', 'password1'),
                             auth=requests.auth.HTTPBasicAuth('user1', 'password0'))
print(s.get('https://httpbin.org/basic-auth/user1/password0').text)
# HTTP Digest authentication for website
s.auth = HTTPProxyDigestAuth(('user1', 'password1'),
                             auth=requests.auth.HTTPDigestAuth('user1', 'password0'))
print(s.get('https://httpbin.org/digest-auth/auth/user1/password0').text)
This snippet works for both types of requests (http and https). Tested on the current version of requests (2.23.0).
import re
import requests
from requests.utils import get_auth_from_url
from requests.auth import HTTPDigestAuth
from requests.utils import parse_dict_header
from urllib3.util import parse_url


def get_proxy_autorization_header(proxy, method):
    username, password = get_auth_from_url(proxy)
    auth = HTTPProxyDigestAuth(username, password)
    proxy_url = parse_url(proxy)
    proxy_response = requests.request(method, proxy_url, auth=auth)
    return proxy_response.request.headers['Proxy-Authorization']


class HTTPSAdapterWithProxyDigestAuth(requests.adapters.HTTPAdapter):
    def proxy_headers(self, proxy):
        headers = {}
        proxy_auth_header = get_proxy_autorization_header(proxy, 'CONNECT')
        headers['Proxy-Authorization'] = proxy_auth_header
        return headers


class HTTPAdapterWithProxyDigestAuth(requests.adapters.HTTPAdapter):
    def proxy_headers(self, proxy):
        return {}

    def add_headers(self, request, **kwargs):
        proxy = kwargs['proxies'].get('http', '')
        if proxy:
            proxy_auth_header = get_proxy_autorization_header(proxy, request.method)
            request.headers['Proxy-Authorization'] = proxy_auth_header


class HTTPProxyDigestAuth(requests.auth.HTTPDigestAuth):
    def init_per_thread_state(self):
        # Ensure state is initialized just once per-thread
        if not hasattr(self._thread_local, 'init'):
            self._thread_local.init = True
            self._thread_local.last_nonce = ''
            self._thread_local.nonce_count = 0
            self._thread_local.chal = {}
            self._thread_local.pos = None
            self._thread_local.num_407_calls = None

    def handle_407(self, r, **kwargs):
        """
        Takes the given response and tries digest-auth, if needed.
        :rtype: requests.Response
        """
        # If response is not 407, do not auth
        if r.status_code != 407:
            self._thread_local.num_407_calls = 1
            return r

        s_auth = r.headers.get('proxy-authenticate', '')

        if 'digest' in s_auth.lower() and self._thread_local.num_407_calls < 2:
            self._thread_local.num_407_calls += 1
            pat = re.compile(r'digest ', flags=re.IGNORECASE)
            self._thread_local.chal = requests.utils.parse_dict_header(
                pat.sub('', s_auth, count=1))

            # Consume content and release the original connection
            # to allow our new request to reuse the same one.
            r.content
            r.close()
            prep = r.request.copy()
            requests.cookies.extract_cookies_to_jar(prep._cookies, r.request, r.raw)
            prep.prepare_cookies(prep._cookies)
            prep.headers['Proxy-Authorization'] = self.build_digest_header(prep.method, prep.url)
            _r = r.connection.send(prep, **kwargs)
            _r.history.append(r)
            _r.request = prep
            return _r

        self._thread_local.num_407_calls = 1
        return r

    def __call__(self, r):
        # Initialize per-thread state, if needed
        self.init_per_thread_state()
        # If we have a saved nonce, skip the 407
        if self._thread_local.last_nonce:
            r.headers['Proxy-Authorization'] = self.build_digest_header(r.method, r.url)
        r.register_hook('response', self.handle_407)
        self._thread_local.num_407_calls = 1
        return r


session = requests.Session()
session.proxies = {
    'http':  'http://username:password@proxyhost:proxyport',
    'https': 'http://username:password@proxyhost:proxyport'
}
session.trust_env = False
session.mount('http://', HTTPAdapterWithProxyDigestAuth())
session.mount('https://', HTTPSAdapterWithProxyDigestAuth())

response_http = session.get("http://ww3.safestyle-windows.co.uk/the-secret-door/")
print(response_http.status_code)

response_https = session.get("https://stackoverflow.com/questions/13506455/how-to-pass-proxy-authentication-requires-digest-auth-by-using-python-requests")
print(response_https.status_code)
Generally, the problem of proxy authorization is also relevant for other types of authentication (NTLM, Kerberos) when connecting over HTTPS. And despite the large number of issues (since 2013, and maybe there are earlier ones that I did not find):
in requests: Digest Proxy Auth, NTLM Proxy Auth, Kerberos Proxy Auth
in urllib3: NTLM Proxy Auth, NTLM Proxy Auth
and many, many others, the problem is still not resolved.
The root of the problem is in the function _tunnel of the module httplib (Python 2) / http.client (Python 3). In case of an unsuccessful connection attempt, it raises an OSError without returning the response code (407 in our case) and the additional data needed to build the authorization header. Lukasa gave an explanation here.
As long as there is no solution from the maintainers of urllib3 (or requests), we can only use various workarounds (for example, use the approach of @Tey', or do something like this). In my version of the workaround, we pre-prepare the necessary authorization data by sending a request to the proxy server and processing the received response.
You can use digest authentication by using requests.auth.HTTPDigestAuth instead of requests.auth.HTTPProxyAuth
For those of you that still end up here, there appears to be a project called requests-toolbelt that has this plus other common but not built-in functionality of requests.
https://toolbelt.readthedocs.org/en/latest/authentication.html#httpproxydigestauth
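A sketch of how that toolbelt class is typically used, based on the documentation page linked above; the proxy address and credentials are placeholders:
import requests
from requests_toolbelt.auth.http_proxy_digest import HTTPProxyDigestAuth

proxies = {"http": "http://proxy.example.com:8080"}
auth = HTTPProxyDigestAuth("username", "password")

# The auth object answers the proxy's 407 digest challenge.
r = requests.get("http://example.org", proxies=proxies, auth=auth)
print(r.status_code)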
This works for me. Actually, I don't know about the security of user:password in this solution:
import requests
import os
http_proxyf = 'http://user:password@proxyip:port'
os.environ["http_proxy"] = http_proxyf
os.environ["https_proxy"] = http_proxyf
sess = requests.Session()
# maybe need sess.trust_env = True
print(sess.get('https://some.org').text)
import requests
import os
# in my case I had to add my local domain
proxies = {
    'http': 'proxy.myagency.com:8080',
    'https': 'user@localdomain:password@proxy.myagency.com:8080',
}
r=requests.get('https://api.github.com/events', proxies=proxies)
print(r.text)
Here is an answer that is not for HTTP Basic Authentication - for example, a transparent proxy within an organization.
import requests
url = 'https://someaddress-behindproxy.com'
params = {'apikey': '123456789'} #if you need params
proxies = {'https': 'https://proxyaddress.com:3128'} #or some other port
response = requests.get(url, proxies=proxies, params=params)
I hope this helps someone.
Just a short, simple one about the excellent Requests module for Python.
I can't seem to find in the documentation what the variable 'proxies' should contain. When I send it a dict with a standard "IP:PORT" value, it rejected it, asking for 2 values.
So, I guess (because this doesn't seem to be covered in the docs) that the first value is the IP and the second the port?
The docs mention this only:
proxies – (optional) Dictionary mapping protocol to the URL of the proxy.
So I tried this... what should I be doing?
proxy = { ip: port}
and should I convert these to some type before putting them in the dict?
r = requests.get(url,headers=headers,proxies=proxy)
The proxies dict syntax is {"protocol": "scheme://ip:port", ...}. With it you can specify different (or the same) proxies for requests using the http, https, and ftp protocols:
http_proxy = "http://10.10.1.10:3128"
https_proxy = "https://10.10.1.11:1080"
ftp_proxy = "ftp://10.10.1.10:3128"
proxies = {
    "http"  : http_proxy,
    "https" : https_proxy,
    "ftp"   : ftp_proxy
}
r = requests.get(url, headers=headers, proxies=proxies)
Deduced from the requests documentation:
Parameters:
method – method for the new Request object.
url – URL for the new Request object.
...
proxies – (optional) Dictionary mapping protocol to the URL of the proxy.
...
On Linux you can also do this via the HTTP_PROXY, HTTPS_PROXY, and FTP_PROXY environment variables:
export HTTP_PROXY=10.10.1.10:3128
export HTTPS_PROXY=10.10.1.11:1080
export FTP_PROXY=10.10.1.10:3128
On Windows:
set http_proxy=10.10.1.10:3128
set https_proxy=10.10.1.11:1080
set ftp_proxy=10.10.1.10:3128
You can refer to the proxy documentation here.
If you need to use a proxy, you can configure individual requests with the proxies argument to any request method:
import requests
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "https://10.10.1.10:1080",
}
requests.get("http://example.org", proxies=proxies)
To use HTTP Basic Auth with your proxy, use the http://user:password@host.com/ syntax:
proxies = {
    "http": "http://user:pass@10.10.1.10:3128/"
}
I have found that urllib has some really good code to pick up the system's proxy settings, and they happen to be in the correct form to use directly. You can use it like this:
import urllib.request
...
r = requests.get('http://example.org', proxies=urllib.request.getproxies())
It works really well and urllib knows about getting Mac OS X and Windows settings as well.
The accepted answer was a good start for me, but I kept getting the following error:
AssertionError: Not supported proxy scheme None
The fix for this was to specify the http:// in the proxy URL, thus:
http_proxy = "http://194.62.145.248:8080"
https_proxy = "https://194.62.145.248:8080"
ftp_proxy = "10.10.1.10:3128"
proxyDict = {
    "http"  : http_proxy,
    "https" : https_proxy,
    "ftp"   : ftp_proxy
}
I'd be interested as to why the original works for some people but not me.
Edit: I see the main answer is now updated to reflect this :)
If you'd like to persist cookies and session data, you'd best do it like this:
import requests
proxies = {
    'http': 'http://user:pass@10.10.1.0:3128',
    'https': 'https://user:pass@10.10.1.0:3128',
}
# Create the session and set the proxies.
s = requests.Session()
s.proxies = proxies
# Make the HTTP request through the session.
r = s.get('http://www.showmemyip.com/')
8 years late. But I like:
import os
import requests
os.environ['HTTP_PROXY'] = os.environ['http_proxy'] = 'http://http-connect-proxy:3128/'
os.environ['HTTPS_PROXY'] = os.environ['https_proxy'] = 'http://http-connect-proxy:3128/'
os.environ['NO_PROXY'] = os.environ['no_proxy'] = '127.0.0.1,localhost,.local'
r = requests.get('https://example.com') # , verify=False
The documentation gives a very clear example of the proxies usage:
import requests
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
requests.get('http://example.org', proxies=proxies)
What isn't documented, however, is the fact that you can even configure proxies for individual URLs even if the scheme is the same!
This comes in handy when you want to use different proxies for different websites you wish to scrape.
proxies = {
    'http://example.org': 'http://10.10.1.10:3128',
    'http://something.test': 'http://10.10.1.10:1080',
}
requests.get('http://something.test/some/url', proxies=proxies)
Additionally, requests.get essentially uses a requests.Session under the hood, so if you need more control, use it directly:
import requests
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
session = requests.Session()
session.proxies.update(proxies)
session.get('http://example.org')
I use it to set a fallback (a default proxy) that handles all traffic that doesn't match the schemes/URLs specified in the dictionary:
import requests
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
session = requests.Session()
session.proxies.setdefault('http', 'http://127.0.0.1:9009')
session.proxies.update(proxies)
session.get('http://example.org')
I just made a proxy grabber that can also connect with the same grabbed proxy without any input.
Here it is:
#Import Modules
from termcolor import colored
from selenium import webdriver
import requests
import os
import sys
import time
#Proxy Grab
options = webdriver.ChromeOptions()
options.add_argument('headless')
driver = webdriver.Chrome(chrome_options=options)
driver.get("https://www.sslproxies.org/")
tbody = driver.find_element_by_tag_name("tbody")
cell = tbody.find_elements_by_tag_name("tr")
for column in cell:
    column = column.text.split(" ")
    print(colored(column[0]+":"+column[1],'yellow'))
driver.quit()
print("")
os.system('clear')
os.system('cls')
#Proxy Connection
print(colored('Getting Proxies from graber...','green'))
time.sleep(2)
os.system('clear')
os.system('cls')
proxy = {"http": "http://"+ column[0]+":"+column[1]}
url = 'https://mobile.facebook.com/login'
r = requests.get(url, proxies=proxy)
print("")
print(colored('Connecting using proxy' ,'green'))
print("")
sts = r.status_code
Here is my basic class in Python for the requests module, with some proxy configs and a stopwatch!
import requests
import time
class BaseCheck():
    def __init__(self, url):
        self.http_proxy  = "http://user:pw@proxy:8080"
        self.https_proxy = "http://user:pw@proxy:8080"
        self.ftp_proxy   = "http://user:pw@proxy:8080"
        self.proxyDict = {
            "http"  : self.http_proxy,
            "https" : self.https_proxy,
            "ftp"   : self.ftp_proxy
        }
        self.url = url

        def makearr(tsteps):
            global stemps
            global steps
            stemps = {}
            for step in tsteps:
                stemps[step] = { 'start': 0, 'end': 0 }
            steps = tsteps
        makearr(['init','check'])

        def starttime(typ = ""):
            for stemp in stemps:
                if typ == "":
                    stemps[stemp]['start'] = time.time()
                else:
                    stemps[stemp][typ] = time.time()
        starttime()

    def __str__(self):
        return str(self.url)

    def getrequests(self):
        g = requests.get(self.url, proxies=self.proxyDict)
        print g.status_code
        print g.content
        print self.url
        stemps['init']['end'] = time.time()
        #print stemps['init']['end'] - stemps['init']['start']
        x = stemps['init']['end'] - stemps['init']['start']
        print x

test = BaseCheck(url='http://google.com')
test.getrequests()
It’s a bit late but here is a wrapper class that simplifies scraping proxies and then making an http POST or GET:
ProxyRequests
https://github.com/rootVIII/proxy_requests
Already tested, the following code works. You need to use HTTPProxyAuth.
import requests
from requests.auth import HTTPProxyAuth
USE_PROXY = True
proxy_user = "aaa"
proxy_password = "bbb"
http_proxy = "http://your_proxy_server:8080"
https_proxy = "http://your_proxy_server:8080"
proxies = {
    "http": http_proxy,
    "https": https_proxy
}

def test(name):
    print(f'Hi, {name}')  # Press Ctrl+F8 to toggle the breakpoint.
    # Create the session and set the proxies.
    session = requests.Session()
    if USE_PROXY:
        session.trust_env = False
        session.proxies = proxies
        session.auth = HTTPProxyAuth(proxy_user, proxy_password)
    r = session.get('https://www.stackoverflow.com')
    print(r.status_code)

if __name__ == '__main__':
    test('aaa')
I share some code on how to fetch proxies from the site "https://free-proxy-list.net" and store the data in a file compatible with tools like "Elite Proxy Switcher" (format IP:PORT):
## PROXY_UPDATER - get free proxies from https://free-proxy-list.net/

from lxml.html import fromstring
import requests
from itertools import cycle
import traceback
import re

###################### FIND PROXIES #########################################
def get_proxies():
    url = 'https://free-proxy-list.net/'
    response = requests.get(url)
    parser = fromstring(response.text)
    proxies = set()
    for i in parser.xpath('//tbody/tr')[:299]:  # 299 proxies max
        proxy = ":".join([i.xpath('.//td[1]/text()')[0], i.xpath('.//td[2]/text()')[0]])
        proxies.add(proxy)
    return proxies

###################### write to file in format IP:PORT ######################
try:
    proxies = get_proxies()
    f = open('proxy_list.txt', 'w')
    for proxy in proxies:
        f.write(proxy + '\n')
    f.close()
    print("DONE")
except:
    print("MAJOR ERROR")