Selenium Chrome & Firefox WebDriver: Set HTTPS proxy in Python - python

I've done plenty of searching, but there are a lot of confusing, very similar snippets out there.
I've tried DesiredCapabilities, ChromeOptions, Options and a series of arguments, but nothing works: the proxy is never set.
For example (ChromeOptions)
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy=https://' + proxy_ip_and_port)
chrome_options.add_argument('--proxy-auth=' + proxy_user_and_pass)
chrome_options.add_argument('--proxy-type=https')
browser = webdriver.Chrome("C:\drivers\chromedriver.exe")
Another example (Options)
options = Options()
options.add_argument('--proxy=https://' + proxy_ip_and_port)
options.add_argument('--proxy-auth=' + proxy_user_and_pass)
options.add_argument('--proxy-type=https')
browser = webdriver.Chrome("C:\drivers\chromedriver.exe", chrome_options=options)
I've also used --proxy-server instead of --proxy-auth, --proxy-type, etc., even in the format '--proxy-server=http://' + proxy_user_and_pass + '@' + proxy_ip_and_port
Another example (DesiredCapabilities)
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
capabilities = dict(DesiredCapabilities.CHROME)
capabilities['proxy'] = {'proxyType': 'MANUAL',
'httpProxy': proxy_ip_and_port,
'ftpProxy': proxy_ip_and_port,
'sslProxy': proxy_ip_and_port,
'noProxy': '',
'class': "org.openqa.selenium.Proxy",
'autodetect': False}
capabilities['proxy']['socksUsername'] = proxy_user
capabilities['proxy']['socksPassword'] = proxy_pass
browser = webdriver.Chrome(executable_path="C:\drivers\chromedriver.exe", desired_capabilities=capabilities)
I've tried Firefox too, but the same thing happens: the browser opens with my normal IP.

According to the latest documentation (Jul 2020) you set the DesiredCapabilities for either FIREFOX or CHROME.
I've tested it for Firefox. You can check your browser's connection settings afterwards to validate the proxy was set correctly.
from selenium import webdriver
PROXY = "<HOST>:<PORT>" # HOST can be IP or name
webdriver.DesiredCapabilities.FIREFOX['proxy'] = {
"httpProxy": PROXY,
"ftpProxy": PROXY,
"sslProxy": PROXY, # this is the https proxy
"proxyType": "MANUAL",
}
with webdriver.Firefox() as driver:
    # Open URL
    driver.get("https://selenium.dev")
The proxy-dict itself is documented in the Selenium wiki. You'll see here that the attribute sslProxy sets the proxy for https.
I haven't tested it for Chrome though. If it doesn't work, you may find clues in Google's ChromeDriver documentation. According to this, you also need to instantiate the webdriver with the desired_capabilities parameter (which is then actually very similar to your example, so this is more of a guess than a proven solution):
caps = webdriver.DesiredCapabilities.CHROME.copy()
caps['proxy'] = ... # like described above
driver = webdriver.Chrome(desired_capabilities=caps)
driver.get("https://selenium.dev")
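The proxy dictionary used in this answer can be built by a small helper, which keeps the HTTP and HTTPS entries in sync. This is only a sketch: the names `manual_proxy_capability` and `apply_proxy` are hypothetical, only the HTTP and HTTPS entries are covered, and the `proxy_type` default copies the "MANUAL" spelling from the answer (an assumption: drivers running in W3C mode may expect the lowercase "manual"). `apply_proxy` relies on the `set_capability` method that Selenium options objects provide.

```python
def manual_proxy_capability(host_port, proxy_type="MANUAL"):
    """Return a 'proxy' capability dict routing HTTP and HTTPS via host_port.

    host_port is "HOST:PORT" without a scheme, as in the answer above.
    """
    if ":" not in host_port:
        raise ValueError("expected HOST:PORT, got %r" % host_port)
    return {
        "proxyType": proxy_type,  # assumption: some drivers may want "manual"
        "httpProxy": host_port,   # proxy for http:// URLs
        "sslProxy": host_port,    # proxy for https:// URLs
    }


def apply_proxy(options, host_port):
    """Attach the proxy capability to a Selenium options object.

    Works with any object exposing set_capability(name, value),
    for example webdriver.ChromeOptions() or webdriver.FirefoxOptions().
    """
    options.set_capability("proxy", manual_proxy_capability(host_port))
    return options
```

With Selenium installed, `apply_proxy(webdriver.ChromeOptions(), "<HOST>:<PORT>")` would return an options object that can be passed to `webdriver.Chrome(options=...)`.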

Related

I cannot get my proxies to work on Selenium

I have searched Stack Overflow and googled how to use proxies with Selenium. I found two different approaches, but neither works for me. Can you please help me figure out what I am doing wrong?
from selenium.webdriver.chrome.options import Options
from selenium import webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType
proxy = "YYY.YYY.YYY.YY:XXXX"
prox = Proxy()
prox.proxy_type = ProxyType.MANUAL
prox.http_proxy = proxy
prox.https_proxy = proxy
capabilities = webdriver.DesiredCapabilities.CHROME
prox.add_to_capabilities(capabilities)
options = webdriver.ChromeOptions()
options.add_experimental_option('detach', True)
driver = webdriver.Chrome(desired_capabilities=capabilities, options=options)
The code above did not work. The page would open, but when I went to "Whatsmyip.com" I could see my home IP.
I then tried another method I found on this link:
https://www.browserstack.com/guide/set-proxy-in-selenium
proxy = "YYY.YYY.YYY.YY:XXXX"
options = webdriver.ChromeOptions()
options.add_experimental_option('detach', True)
options.add_argument("--proxy--server=%s" % proxy)
driver = webdriver.Chrome(options = options)
Same result as with the previous method: the browser opens, but with my home IP.
Worth mentioning that I tried proxies with USER:PASS authentication as well as IP-authorized proxies. Neither worked!
In addition to helping me figure out how to use proxies, I would also like to understand why these methods are different. On the one hand, the Selenium documentation talks about a proxy class which you access via the "common.proxy" class, yet the second method is directly using Chrome's options and not Selenium's proxy class. I am confused as to why have two methods, and of course which one works more reliably.
Thanks
Authenticated proxies aren't supported by Chrome by default. If you still need them, refer to Selenium-Profiles.
Your second code snippet should work as follows:
from selenium import webdriver

proxy = "https://host_or_ip:port"  # or "socks5://" or "http://"
options = webdriver.ChromeOptions()
# The switch is --proxy-server, with a single hyphen between "proxy" and "server".
options.add_argument("--proxy-server=%s" % proxy)
driver = webdriver.Chrome(options=options)
driver.get("http://lumtest.com/myip.json")  # test proxy
input("Press ENTER to exit")
If it still doesn't work, check your proxy with curl or python requests.
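As a rough way to check the proxy outside the browser, the standard library's `urllib` can send a request through it. This is a sketch under assumptions: the helper names are hypothetical, the proxy takes no authentication, and the test URL is the one used in the answer above. `requests` or curl, as the answer suggests, would serve equally well.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends http and https requests through proxy_url.

    proxy_url includes a scheme, e.g. "http://host_or_ip:port".
    """
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(proxy_url, test_url="http://lumtest.com/myip.json", timeout=10):
    """Fetch test_url through the proxy and return the response body as text.

    Raises urllib.error.URLError (or a timeout) if the proxy is unreachable.
    """
    opener = build_proxy_opener(proxy_url)
    with opener.open(test_url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

If `fetch_via_proxy("http://host_or_ip:port")` raises or reports your home IP, the problem lies with the proxy itself rather than with Selenium.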

Selenium Chrome WebDriver doesn't use proxy

I'm using Selenium webdriver to open a webpage and I set up a proxy for the driver to use. The code is listed below:
PATH = "C:\Program Files (x86)\chromedriver.exe"
PROXY = "212.237.16.60:3128" # IP:PORT or HOST:PORT
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument(f'--proxy-server={PROXY}')
proxy = Proxy()
proxy.auto_detect = False
proxy.http_proxy = PROXY
proxy.sslProxy = PROXY
proxy.socks_proxy = PROXY
capabilities = webdriver.DesiredCapabilities.CHROME
proxy.add_to_capabilities(capabilities)
driver = webdriver.Chrome(PATH, chrome_options=chrome_options,desired_capabilities=capabilities)
driver.get("https://whatismyipaddress.com")
The problem is that the web driver does not use the given proxy and accesses the page with my normal IP. I have already tried every variation of the code I could find on the internet, and none worked. I also tried setting the proxy directly in my PC settings, and a normal Chrome window then works fine (so the proxy server itself is not the problem), but a page opened with the driver still uses my normal IP and somehow bypasses the proxy. I also tried changing the proxy settings of the IDE (PyCharm), and it still doesn't work. I'm out of ideas; could someone help me?
This should work.
Code snippet:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
PROXY = "212.237.16.60:3128"
# add the proxy to chrome_options
chrome_options.add_argument(f'--proxy-server={PROXY}')
# PATH is the chromedriver path defined in the question
driver = webdriver.Chrome(PATH, options=chrome_options)
# check the new IP
driver.get("https://api.ipify.org/?format=json")
Note: the chrome_options keyword argument is deprecated; use options instead.
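A small helper can validate the proxy string before it reaches Chrome, which catches typos such as a missing port early. This is a sketch: the function names are hypothetical, the "http" default scheme is an assumption, and the proxy is assumed to need no authentication. Selenium is imported lazily, so the string helper works without it.

```python
def proxy_server_argument(proxy, scheme="http"):
    """Build Chrome's --proxy-server switch from "HOST:PORT" or "SCHEME://HOST:PORT"."""
    if "://" in proxy:
        # Keep an explicit scheme such as socks5:// if the caller supplied one.
        scheme, proxy = proxy.split("://", 1)
    host, sep, port = proxy.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected HOST:PORT, got %r" % proxy)
    return "--proxy-server=%s://%s:%s" % (scheme, host, port)


def make_chrome_options(proxy):
    """Create ChromeOptions carrying the proxy switch (requires selenium)."""
    from selenium import webdriver  # lazy import: only needed for this function

    options = webdriver.ChromeOptions()
    options.add_argument(proxy_server_argument(proxy))
    return options
```

The result would then be passed with the `options` keyword, for example `webdriver.Chrome(options=make_chrome_options(PROXY))`.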

Not able to open the webpage through selenium python

I am new to selenium python and I am trying to scrape the data from a website. Below is the code, where I have taken all the necessary precautions to not get blocked.
from random import randrange
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
# Function to generate random useragent.
def generate_user_agent():
    user_agents_file = open("user_agents.txt", "r")
    user_agents = user_agents_file.read().split("\n")
    i = randrange(len(user_agents))
    userAgent = user_agents[i]
    user_agents_file.close()
    return userAgent

# Function to generate random IP address.
def generate_ip_address():
    proxies_file = open("proxyscrape_premium_http_proxies.txt", "r")
    proxies = proxies_file.read().split("\n")
    i = randrange(len(proxies))
    proxy = proxies[i]
    proxies_file.close()
    return proxy

# Function to create and set chrome options.
def set_chrome_options():
    proxy = generate_ip_address()
    options = webdriver.ChromeOptions()
    options.add_argument("start-maximized")
    options.add_argument("--incognito")
    options.add_argument(f'--proxy-server={proxy}')
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    return options, proxy

# Function to create a webdriver object and set its properties.
def create_webdriver():
    options, proxy = set_chrome_options()
    userAgent = generate_user_agent()
    webdriver.DesiredCapabilities.CHROME['proxy'] = {
        "httpProxy": proxy,
        "ftpProxy": proxy,
        "sslProxy": proxy,
        "proxyType": "MANUAL",}
    webdriver.DesiredCapabilities.CHROME['acceptSslCerts']=True
    driver = webdriver.Chrome(options=options, executable_path=r'chromedriver.exe')
    driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
    driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": userAgent})
    return driver

url = 'http://www.doctolib.de/impfung-covid-19-corona/berlin'
driver = create_webdriver()
driver.get(url)
The webpage does not open via the Selenium web driver (but it opens normally in a regular browser). Below is a screenshot of how the browser looks when I run the code.
Please let me know if I am missing something. Any help would be highly appreciated.
PS: I am using the premium proxies for IP rotation.
[Screenshot: Browser_output]
I've had a similar experience in the past, where the website detects that Selenium is being used even after applying several methods such as IP rotation, User-Agent rotation or proxies.
I would suggest using the undetected_chromedriver library.
pip install undetected-chromedriver
It's able to load the website without any problem.
The code snippet is given below:-
import undetected_chromedriver.v2 as uc
driver = uc.Chrome()
with driver:
    driver.get('http://www.doctolib.de/impfung-covid-19-corona/berlin')
I had a similar issue with Firefox on Linux. I deleted the log file created by geckodriver, which was quite big for a text file (4.8 MB), and everything started to work fine again.
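Separately from detection, the question's file-reading helpers split the file on "\n", so a trailing newline produces an empty entry that could be picked as the proxy or user agent. A minimal sketch of a more defensive reader follows; the function names are hypothetical and the file is assumed to hold one entry per line.

```python
import random


def load_entries(path):
    """Read a text file and return its non-empty lines with whitespace stripped."""
    with open(path, "r", encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def pick_random_entry(path, rng=random):
    """Return one random entry from the file, or raise if the file has none."""
    entries = load_entries(path)
    if not entries:
        raise ValueError("no usable entries in %s" % path)
    return rng.choice(entries)
```

`pick_random_entry("proxyscrape_premium_http_proxies.txt")` could then stand in for the body of `generate_ip_address()` from the question.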

How to open Tor using Python Selenium on Windows [duplicate]

I am trying to connect to Tor Browser but get an error stating "proxyConnectFailure". I have made multiple attempts to get even a basic connection to Tor Browser working, all in vain. Any help would save me a lot of time:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
binary = FirefoxBinary(r"C:\Users\Admin\Desktop\Tor Browser\Browser\firefox.exe")
profile = FirefoxProfile(r"C:\Users\Admin\Desktop\Tor Browser\Browser\TorBrowser\Data\Browser\profile.default")
# Configured profile settings.
proxyIP = "127.0.0.1"
proxyPort = 9150
proxy_settings = {"network.proxy.type":1,
"network.proxy.socks": proxyIP,
"network.proxy.socks_port": proxyPort,
"network.proxy.socks_remote_dns": True,
}
driver = webdriver.Firefox(firefox_binary=binary,proxy=proxy_settings)
def interactWithSite(driver):
    driver.get("https://www.google.com")
    driver.save_screenshot("screenshot.png")

interactWithSite(driver)
To connect to a Tor Browser through a FirefoxProfile you can use the following solution:
Code Block:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
import os
torexe = os.popen(r'C:\Users\AtechM_03\Desktop\Tor Browser\Browser\TorBrowser\Tor\tor.exe')
profile = FirefoxProfile(r'C:\Users\AtechM_03\Desktop\Tor Browser\Browser\TorBrowser\Data\Browser\profile.default')
profile.set_preference('network.proxy.type', 1)
profile.set_preference('network.proxy.socks', '127.0.0.1')
profile.set_preference('network.proxy.socks_port', 9050)
profile.set_preference("network.proxy.socks_remote_dns", False)
profile.update_preferences()
driver = webdriver.Firefox(firefox_profile= profile, executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("http://check.torproject.org")
You can find a relevant discussion in How to use Tor with Chrome browser through Selenium
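The four SOCKS preferences set in this answer can be collected in one place, so the host and port are easy to change. This is a sketch with hypothetical helper names. The defaults are assumptions: port 9050 follows the answer (Tor Browser's bundled Tor typically listens on 9150 instead), and `remote_dns=True` sends DNS lookups through the proxy, whereas the answer passes False.

```python
def socks_proxy_preferences(host="127.0.0.1", port=9050, remote_dns=True):
    """Return the Firefox preferences for a manual SOCKS proxy."""
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.socks": host,
        "network.proxy.socks_port": int(port),
        "network.proxy.socks_remote_dns": bool(remote_dns),
    }


def apply_preferences(target, preferences):
    """Call target.set_preference(name, value) for each preference.

    target can be any object with set_preference, such as the
    FirefoxProfile used in the answer above.
    """
    for name, value in preferences.items():
        target.set_preference(name, value)
    return target
```

For example, `apply_preferences(profile, socks_proxy_preferences(port=9150))` would point the profile at Tor Browser's own SOCKS port.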
I would like to expand on @DebanjanB's answer by adding the Linux counterpart:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
import os
torexe = os.popen('some/path/tor-browser_en-US/Browser/start-tor-browser')
# in my case, I installed it under a folder tor-browser_en-US after
# downloading and extracting it from
# https://www.torproject.org/download/ for linux
profile = FirefoxProfile(
'some/path/tor-browser_en-US/Browser/TorBrowser/Data/Browser/profile.default')
profile.set_preference('network.proxy.type', 1)
profile.set_preference('network.proxy.socks', '127.0.0.1')
profile.set_preference('network.proxy.socks_port', 9050)
profile.set_preference("network.proxy.socks_remote_dns", False)
profile.update_preferences()
firefox_options = webdriver.FirefoxOptions()
firefox_options.binary_location = '/usr/bin/firefox'
# /usr/bin/firefox is default location of firefox - for me anyway
driver = webdriver.Firefox(
firefox_profile=profile, options=firefox_options,
executable_path='wherever/you/installed/geckodriver')
# I keep my geckodriver(s) in a special folder sorted by versions.
# Geckodriver downloadable here:
# https://github.com/mozilla/geckodriver/releases/
driver.get("http://check.torproject.org")
The accepted answer does not work for opening .onion sites (I believe this has something to do with the Tor network not allowing access from a normal Firefox).
With the latest Tor Browser (from the Tor Browser Bundle), starting it through Selenium causes an error that prevents the browser from starting its own Tor proxy, which leads to proxy and timeout errors (regardless of whether a Tor proxy was started by Python, started manually, or not started at all). This could also be caused by port 9050 or 9150 already being used by a Tor proxy and so being unavailable to the browser's own Tor instance, but that does not explain the error that occurs when no Tor proxy is running.
The solution I found is to start the Tor proxy as normal, either manually or with os.popen("tor.exe"), and configure Tor Browser not to start its own Tor proxy.
Here's the code:
import os
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
os.popen(r'e:\\bla\\bla\\bla\\tor\\Tor\\tor.exe')
binary=FirefoxBinary(r'e:\\bla\\bla\\bla\\Tor Browser\\Browser\\firefox.exe')
fp=FirefoxProfile(r'e:\\foo\\bar\\bla\\Tor Browser\\Browser\\TorBrowser\\Data\\Browser\\profile.default')
fp.set_preference('extensions.torlauncher.start_tor',False)#note this
fp.set_preference('network.proxy.type',1)
fp.set_preference('network.proxy.socks', '127.0.0.1')
fp.set_preference('network.proxy.socks_port', 9050)
fp.set_preference("network.proxy.socks_remote_dns", True)
fp.update_preferences()
driver = webdriver.Firefox(firefox_profile=fp,firefox_binary=binary)
driver.get("http://check.torproject.org")
driver.get('https://www.bbcnewsv2vjtpsuy.onion/')
Note: fp.set_preference('extensions.torlauncher.start_tor', False) configures Tor Browser not to start its own Tor instance, so that it uses the proxy settings and the Tor instance started above.
Lo and behold, the TBB then works like a normal Firefox browser driven by a bot.
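Because proxy and timeout errors here often come down to nothing listening on the expected port, it can help to check the SOCKS port before launching the browser. This is a sketch using only the standard library. The function names are hypothetical, and the candidate ports 9050 and 9150 are the two mentioned in this answer. A successful TCP connection shows only that something is listening, not that it is Tor.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_tor_socks_port(host="127.0.0.1", candidates=(9050, 9150)):
    """Return the first candidate port that accepts connections, or None."""
    for port in candidates:
        if is_port_open(host, port):
            return port
    return None
```

The returned port could then be passed to `fp.set_preference('network.proxy.socks_port', port)` instead of a hard-coded value.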

How to set luminati proxy on Selenium Webdriver for chrome in python?

I want to set a Luminati proxy in webdriver.Chrome with Selenium in Python.
I have tried the following code:
from selenium import webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.proxy import *
PROXY = '127.0.0.1:24000'
proxy = Proxy()
proxy.http_proxy = PROXY
proxy.ftp_proxy = PROXY
proxy.sslProxy = PROXY
proxy.no_proxy = "localhost" #etc... ;)
proxy.proxy_type = ProxyType.MANUAL
# Luminati customer info
proxy.socksUsername = 'lum-customer-XXXX-zone-XXXX'
proxy.socksPassword = "XXXX"
capabilities = webdriver.DesiredCapabilities.CHROME
proxy.add_to_capabilities(capabilities)
driver = webdriver.Chrome(desired_capabilities=capabilities)
I used my Luminati username, zone and password to set this up, but it's not working.
Try this code snippet:
from selenium import webdriver
# http://username:password@localhost:8080
PROXY = "http://lum-customer-XXXX-zone-XXXX:XXXX@localhost:8080"
# Create a copy of desired capabilities object.
desired_capabilities = webdriver.DesiredCapabilities.CHROME.copy()
# Change the proxy properties of that copy.
desired_capabilities['proxy'] = {
"httpProxy":PROXY,
"ftpProxy":PROXY,
"sslProxy":PROXY,
"noProxy":None,
"proxyType":"MANUAL",
"class":"org.openqa.selenium.Proxy",
"autodetect":False
}
# You have to use Remote; otherwise you would have to write Python code yourself
# to change the system proxy preferences dynamically.
driver = webdriver.Remote("http://localhost:4444/wd/hub", desired_capabilities)
Taken from the official resource.
Try this code snippet. It works for me with Chrome, but you can replace Chrome with Firefox:
from selenium.webdriver.chrome.options import Options
from seleniumwire import webdriver
import time
PATH = "path to driver"
options = {
'proxy':
{
'http': 'http://lum-customer-CUST_ID-zone-ZONE_NAME-country-COUNTRY_TARGET:PASSWORD@zproxy.lum-superproxy.io:22225',
'https': 'https://lum-customer-CUST_ID-zone-ZONE_NAME-country-COUNTRY_TARGET:PASSWORD@zproxy.lum-superproxy.io:22225'
},
}
driver = webdriver.Chrome(PATH, seleniumwire_options=options)
# If need to add headers to request
# driver.header_overrides = {
# }
driver.get("https://lumtest.com/myip.json")
print(driver.execute_script('return document.body.innerHTML;')) #print page html
time.sleep(3)
driver.quit()
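The options dictionary in this answer embeds the credentials in the proxy URL, so characters such as "@" or ":" inside the password would break it unless they are percent-encoded. The sketch below builds the same dictionary shape with encoding applied. The function name is hypothetical, and it assumes selenium-wire accepts the "http" and "https" keys exactly as used above.

```python
from urllib.parse import quote


def seleniumwire_proxy_options(username, password, host, port):
    """Build a seleniumwire_options dict for an authenticated proxy.

    Username and password are percent-encoded so that reserved characters
    cannot be mistaken for URL delimiters.
    """
    credentials = "%s:%s" % (quote(username, safe=""), quote(password, safe=""))
    endpoint = "%s@%s:%d" % (credentials, host, int(port))
    return {
        "proxy": {
            "http": "http://" + endpoint,
            "https": "https://" + endpoint,
        }
    }
```

The result would be passed as `seleniumwire_options=...` when creating the driver, in the same way as `options` in the snippet above.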
It is likely not working because you are still setting proxy.no_proxy = 'localhost'; that setting should be removed.
If you haven't seen it before, the post "Python Selenium Luminati Proxy Desired Capabilities" may also be useful.
String proxi = "--your proxy url and port";// like 12345:1212
Proxy proxy = new Proxy();
proxy.setHttpProxy(proxi);
// Luminati customer info
proxy.setSocksUsername("Your Username");
proxy.setSocksPassword("Your Password");
ChromeOptions options = new ChromeOptions();
options.setCapability("proxy", proxy);
System.setProperty("webdriver.chrome.driver", "Drivers/chromedriver.exe");
driver = new ChromeDriver(options);
driver.get("https://www.google.com");
