I'm working with RoboBrowser (http://robobrowser.readthedocs.org/en/latest/readme.html), a new Python library built on the BeautifulSoup and Requests libraries, within Django. My Django app contains:
def index(request):
    p = str(request.POST.get('p', False))  # p='https://www.yahoo.com/'
    pr = "http://10.10.1.10:3128/"
    setProxy(pr)
    browser = RoboBrowser(history=True)
    postedmessage = browser.open(p)
    return HttpResponse(postedmessage)
I would like to add a proxy to my code, but I can't find anything in the docs on how to do this. Is it possible?
EDIT:
Following your recommendation, I've changed the code to:
pr="http://10.10.1.10:3128/"
setProxy(pr)
browser = RoboBrowser(history=True)
with:
def setProxy(pr):
    import os
    os.environ['HTTP_PROXY'] = pr
    return
I'm now getting:
Django Version: 1.6.4
Exception Type: LocationParseError
Exception Value:
Failed to parse: Failed to parse: 10.10.1.10:3128
Any ideas on what to do next? I can't find a reference to this error.
After some recent API cleanup in RoboBrowser, there are now two relatively straightforward ways to control proxies. First, you can configure proxies in your requests session, and then pass that session to your browser. This will apply your proxies to all requests made through the browser.
from requests import Session
from robobrowser import RoboBrowser
session = Session()
session.proxies = {'http': 'http://my.proxy.com/'}
browser = RoboBrowser(session=session)
Second, you can set proxies on a per-request basis. The open, follow_link, and submit_form methods of RoboBrowser now accept keyword arguments for requests.Session.send. For example:
browser.open('http://stackoverflow.com/', proxies={'http': 'http://your.proxy.com'})
Since RoboBrowser uses the Requests library, you can also try setting the proxies as described in the Requests docs, via the environment variables HTTP_PROXY and HTTPS_PROXY.
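A minimal sketch of the environment-variable approach, reusing the proxy address from the question; note that the value must be a full URL including the scheme (a bare '10.10.1.10:3128' is the kind of value that makes urllib3 raise the LocationParseError shown above):

import os
from robobrowser import RoboBrowser

# Requests (and therefore RoboBrowser) reads these on each request;
# the scheme prefix is required.
os.environ['HTTP_PROXY'] = 'http://10.10.1.10:3128/'
os.environ['HTTPS_PROXY'] = 'http://10.10.1.10:3128/'

browser = RoboBrowser(history=True)  # picks the proxy up automatically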
Requests is not honoring the proxies flag.
There is something I am missing about making a request over a proxy with the Python Requests library.
If I enable the OS system proxy it works, but if I make the request with just the Requests module's proxies setting, the remote machine does not see the proxy set in Requests and sees my real IP instead; it is as if no proxy were set.
The example below shows this effect. At the time of this post the proxy below is alive, but any working proxy should replicate the effect.
import requests

proxy = {
    'http:': 'https://143.208.200.26:7878',
    'https:': 'http://143.208.200.26:7878'
}
data = requests.get(url='http://ip-api.com/json', proxies=proxy).json()
print('Ip: %s\nCity: %s\nCountry: %s' % (data['query'], data['city'], data['country']))
I also tried changing the proxy_dict format:
proxy = {
    'http:': '143.208.200.26:7878',
    'https:': '143.208.200.26:7878'
}
But it still has no effect.
I am using:
- Windows 10
- Python 3.9.6
- urllib3 1.25.8
Many thanks in advance for any help sorting this out.
OK, it's working now!
Credit for solving this goes to Olvin Rogh. Thanks, Olvin, for your help and for pointing out my problem: I was adding a colon ":" inside the keys.
This code is working now.
import json
import requests

PROXY = {'https': 'https://143.208.200.26:7878',
         'http': 'http://143.208.200.26:7878'}

with requests.Session() as session:
    session.proxies = PROXY
    r = session.get('http://ip-api.com/json')
    print(json.dumps(r.json(), indent=2))
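For what it's worth, the per-request form works just as well once the keys are fixed; here is a sketch using the same proxy and URL, passing proxies directly to requests.get:

import json
import requests

PROXY = {'https': 'https://143.208.200.26:7878',
         'http': 'http://143.208.200.26:7878'}

# Same request without a Session; the proxies apply to this call only.
r = requests.get('http://ip-api.com/json', proxies=PROXY)
print(json.dumps(r.json(), indent=2))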
I am currently using the connection pool provided by urllib3 in Python, like the following:
import urllib3

pool = urllib3.PoolManager(maxsize=10)
resp = pool.request('GET', 'http://example.com')
content = resp.read()
resp.release_conn()
However, I don't know how to set a proxy while using this connection pool. I tried to set the proxy in the request, like pool.request('GET', 'http://example.com', proxies={'http': '123.123.123.123:8888'}), but it didn't work.
Can someone tell me how to set the proxy while using a connection pool?
Thanks~
There is an example of how to use a proxy with urllib3 in the Advanced Usage section of the documentation. I adapted it to fit your example:
import urllib3
proxy = urllib3.ProxyManager('http://123.123.123.123:8888/', maxsize=10)
resp = proxy.request('GET', 'http://example.com/')
content = resp.read()
# You don't actually need to release_conn() if you're reading the full response.
# This will be a harmless no-op:
resp.release_conn()
The ProxyManager behaves the same way as a PoolManager would.
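If the proxy itself requires authentication, ProxyManager also accepts a proxy_headers argument; here is a sketch with placeholder credentials, using urllib3's make_headers helper:

import urllib3

# Build a Proxy-Authorization header; 'username:password' is a placeholder.
headers = urllib3.make_headers(proxy_basic_auth='username:password')
proxy = urllib3.ProxyManager('http://123.123.123.123:8888/', maxsize=10,
                             proxy_headers=headers)
resp = proxy.request('GET', 'http://example.com/')
print(resp.status)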
How can I use automatic NTLM authentication from Python on Windows?
I want to be able to access the TFS REST API from Windows without hardcoding my password, the same as I do from the web browser (Firefox's network.automatic-ntlm-auth.trusted-uris, for example).
I found this answer which works great for me because:
I'm only going to run it from Windows, so portability isn't a problem
The response is a simple json document, so no need to store an open session
It's using the WinHTTP.WinHTTPRequest.5.1 COM object to handle authentication natively:
import win32com.client
URL = 'http://bigcorp/tfs/page.aspx'
COM_OBJ = win32com.client.Dispatch('WinHTTP.WinHTTPRequest.5.1')
COM_OBJ.SetAutoLogonPolicy(0)
COM_OBJ.Open('GET', URL, False)
COM_OBJ.Send()
print(COM_OBJ.ResponseText)
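A note on the SetAutoLogonPolicy(0) call: 0 corresponds to WinHTTP's AutoLogonPolicy_Always, which sends the logged-on user's credentials automatically on every request; the default policy (AutoLogonPolicy_OnlyIfBypassProxy) only does so for hosts that bypass the proxy.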
You can do that with https://github.com/requests/requests-kerberos. Under the hood it's using https://github.com/mongodb-labs/winkerberos. The latter is marked as beta; I'm not sure how stable it is, but I have had requests-kerberos in use for a while without any issues.
Maybe a more stable solution would be https://github.com/brandond/requests-negotiate-sspi, which is using pywin32's SSPI implementation.
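A minimal sketch with requests-kerberos, reusing the TFS URL from the question; mutual authentication is relaxed to OPTIONAL, which is often necessary against plain-HTTP endpoints:

import requests
from requests_kerberos import HTTPKerberosAuth, OPTIONAL

# Uses the current Windows logon via SSPI/Kerberos; no password needed.
auth = HTTPKerberosAuth(mutual_authentication=OPTIONAL)
r = requests.get('http://bigcorp/tfs/page.aspx', auth=auth)
print(r.status_code)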
I found a solution here: https://github.com/mullender/python-ntlm/issues/21
pip install requests
pip install requests_negotiate_sspi
import requests
from requests_negotiate_sspi import HttpNegotiateAuth
GetUrl = "http://servername/api/controller/Methodname" # Here you need to set your get Web api url
response = requests.get(GetUrl, auth=HttpNegotiateAuth())
print("Get Request Outpot:")
print("--------------------")
print(response.content)
For a request over HTTPS:
import requests
from requests_negotiate_sspi import HttpNegotiateAuth
import urllib3
urllib3.disable_warnings()
GetUrl = "https://servername/api/controller/Methodname" # Here you need to set your get Web api url
response = requests.get(GetUrl, auth=HttpNegotiateAuth(), verify=False)
print("Get Request Outpot:")
print("--------------------")
print(response.content)
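Note that verify=False disables TLS certificate verification entirely (which is why the urllib3 warnings are silenced above); only use it against servers you trust, such as an internal TFS instance with a self-signed certificate.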
NTLM credentials are based on data obtained during the interactive logon process and include a one-way hash of the password, so you have to provide the credentials explicitly.
Python has the requests_ntlm library, which allows for HTTP NTLM authentication; a minimal sketch follows.
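This sketch reuses the TFS URL from the question; the domain, user name, and password are placeholders you must fill in:

import requests
from requests_ntlm import HttpNtlmAuth

# Explicit credentials are required: requests_ntlm cannot reuse the
# interactive logon the way the WinHTTP COM object can.
r = requests.get('http://bigcorp/tfs/page.aspx',
                 auth=HttpNtlmAuth('DOMAIN\\username', 'password'))
print(r.status_code)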
You can reference this article to access the TFS REST API:
Python Script to Access Team Foundation Server (TFS) Rest API
If you are using TFS 2017 or VSTS, you can try using a Personal Access Token in a Basic Auth HTTP header along with your REST request; a sketch follows.
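In this sketch the URL reuses the question's placeholder server and the token value is hypothetical:

import requests

# VSTS/TFS convention: empty user name, the PAT as the Basic-auth password.
r = requests.get('https://servername/tfs/DefaultCollection/_apis/projects?api-version=2.0',
                 auth=('', 'personal_access_token'))
print(r.status_code)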
I am trying to use urllib2 through a proxy; however, after trying just about every variation of passing my verification details with urllib2, I either get a request that hangs forever and returns nothing, or I get 407 errors. I can connect to the web fine using my browser, which connects to a proxy PAC file and redirects accordingly; however, I can't seem to do anything via the command line (curl, wget, urllib2, etc.), even if I use the proxies that the PAC file redirects to. I tried setting my proxy to each of the proxies from the PAC file using urllib2, none of which work.
My current script looks like this:
import urllib2 as url
proxy = url.ProxyHandler({'http': 'username:password@my.proxy:8080'})
auth = url.HTTPBasicAuthHandler()
opener = url.build_opener(proxy, auth, url.HTTPHandler)
url.install_opener(opener)
url.urlopen("http://www.google.com/")
which throws HTTP Error 407: Proxy Authentication Required. I also tried:
import urllib2 as url
handlePass = url.HTTPPasswordMgrWithDefaultRealm()
handlePass.add_password(None, "http://my.proxy:8080", "username", "password")
auth_handler = url.HTTPBasicAuthHandler(handlePass)
opener = url.build_opener(auth_handler)
url.install_opener(opener)
url.urlopen("http://www.google.com")
which hangs, just like curl or wget, until it times out.
What do I need to do to diagnose the problem? How is it possible that I can connect via my browser but not from the command line on the same computer, using what would appear to be the same proxy and credentials?
Might it be something to do with the router? If so, how can it distinguish between browser HTTP requests and command-line HTTP requests?
Frustrations like this are what drove me to use Requests. If you're doing significant amounts of work with urllib2, you really ought to check it out. For example, to do what you wish to do using Requests, you could write:
import requests
from requests.auth import HTTPProxyAuth
proxy = {'http': 'http://my.proxy:8080'}
auth = HTTPProxyAuth('username', 'password')
r = requests.get('http://www.google.com/', proxies=proxy, auth=auth)
print r.text
Or you could wrap it in a Session object and every request will automatically use the proxy information (plus it will store & handle cookies automatically!):
s = requests.Session()
s.proxies = proxy
s.auth = auth
r = s.get('http://www.google.com/')
print r.text
Using urllib2 and trying to get an HTTPS page, it keeps failing with
Invalid url, unable to resolve
The URL is
https://www.domainsbyproxy.com/default.aspx
but I have this happening on multiple HTTPS sites.
I am using Python 2.7, and below is the code I am using to set up the connection:
import urllib2

opener = urllib2.OpenerDirector()
opener.add_handler(urllib2.HTTPHandler())
opener.add_handler(urllib2.HTTPDefaultErrorHandler())
opener.addheaders = [('Accept-encoding', 'gzip')]
fetch_timeout = 12
response = opener.open(url, None, fetch_timeout)
The reason I am setting handlers manually is that I don't want redirects handled (which works fine). The above works fine for HTTP requests; however, HTTPS fails.
Any clues?
You should be using HTTPSHandler instead of HTTPHandler; a sketch of the fix follows.
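This sketch adapts the question's code (Python 2 urllib2), keeping the manual handler setup and registering HTTPSHandler as well so both schemes work:

import urllib2

opener = urllib2.OpenerDirector()
opener.add_handler(urllib2.HTTPHandler())
opener.add_handler(urllib2.HTTPSHandler())  # required for https:// URLs
opener.add_handler(urllib2.HTTPDefaultErrorHandler())
opener.addheaders = [('Accept-encoding', 'gzip')]
response = opener.open('https://www.domainsbyproxy.com/default.aspx', None, 12)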
If you don't mind external libraries, consider the excellent Requests module; it takes care of these urllib2 quirks for you.
Your code, using Requests, becomes:
import requests
r = requests.get(url, headers={'Accept-encoding': 'gzip'}, timeout=12)