Following this answer, I changed the proxy of my driver like so:
from selenium import webdriver
def set_proxy(driver, http_addr='', http_port=0, ssl_addr='', ssl_port=0, socks_addr='', socks_port=0):
    driver.execute("SET_CONTEXT", {"context": "chrome"})
    try:
        driver.execute_script("""
            Services.prefs.setIntPref('network.proxy.type', 1);
            Services.prefs.setCharPref('network.proxy.http', arguments[0]);
            Services.prefs.setIntPref('network.proxy.http_port', arguments[1]);
            Services.prefs.setCharPref('network.proxy.ssl', arguments[2]);
            Services.prefs.setIntPref('network.proxy.ssl_port', arguments[3]);
            Services.prefs.setCharPref('network.proxy.socks', arguments[4]);
            Services.prefs.setIntPref('network.proxy.socks_port', arguments[5]);
        """, http_addr, http_port, ssl_addr, ssl_port, socks_addr, socks_port)
    finally:
        driver.execute("SET_CONTEXT", {"context": "content"})

driver = webdriver.Firefox()
set_proxy(driver, http_addr="212.35.56.21", http_port=8080)
driver.get("https://google.fr/")
It ran, but I'm not sure whether the Firefox browser is actually using the proxy I gave it. I thus have two questions:
Is this the correct way to change the proxy?
How can I get the value of the current proxy used by the driver?
I tried running a Python test script for a login page with Sauce Labs and got this error:
selenium.common.exceptions.WebDriverException: Message: failed serving request POST /wd/hub/session: Unauthorized
I looked online for a solution and found something at this link, but it didn't help.
selenium - 4.0.0
python - 3.8
Here's the code:
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options as ChromeOptions
options = ChromeOptions()
options.browser_version = '96'
options.platform_name = 'Windows 10'
options.headless = True
sauce_options = {'username': 'sauce_username',
                 'accessKey': 'sauce_access_key',
                 }
options.set_capability('sauce:options', sauce_options)
sauce_url = "https://{}:{}@ondemand.us-west-1.saucelabs.com/wd/hub".format(sauce_options['username'], sauce_options['accessKey'])
driver = webdriver.Remote(command_executor=sauce_url, options=options)
driver.get('https://cq-portal.qa.webomates.com/#/login')
time.sleep(10)
user = driver.find_element('css selector', '#userName')
password = driver.find_element('id', 'password')
login = driver.find_element('xpath', '//*[@id="submitButton"]')
user.send_keys('userId')
password.send_keys('password')
login.click()
time.sleep(10)
print('login successful')
driver.quit()
Your code is authenticating two different ways, which I suspect is the problem.
You're passing in sauce_options in a W3C-compatible way (which is good), but the {}:{} section of sauce_url also embeds HTTP-style credentials directly in the URL, so you end up authenticating twice.
If you're going to pass in credentials via sauce:options, you should remove everything between the protocol and the @ symbol in the URL, e.g.:
sauce_url = "https://ondemand.us-west-1.saucelabs.com/wd/hub"
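For contrast, a small sketch of the two authentication styles (the credentials are placeholders): either embed them in the URL, or keep the URL bare and rely on sauce:options, but not both at once.

```python
# Placeholder credentials, for illustration only
username = "sauce_username"
access_key = "sauce_access_key"

# Style 1: HTTP-style auth embedded in the URL (no sauce:options needed)
url_with_auth = "https://{}:{}@ondemand.us-west-1.saucelabs.com/wd/hub".format(
    username, access_key)

# Style 2: bare URL; the credentials travel in the sauce:options capability instead
url_plain = "https://ondemand.us-west-1.saucelabs.com/wd/hub"

print(url_with_auth)
print(url_plain)
```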
I would like to start a Selenium browser with a particular setup (privoxy, Tor, random user agent...) in a function and then call this function in my code. I have created a Python script mybrowser.py with this inside:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from fake_useragent import UserAgent
from stem import Signal
from stem.control import Controller
class MyBrowserClass:
    @staticmethod
    def start_browser():
        service_args = [
            '--proxy=127.0.0.1:8118',
            '--proxy-type=http',
        ]
        dcap = dict(DesiredCapabilities.PHANTOMJS)
        dcap["phantomjs.page.settings.userAgent"] = UserAgent().random
        browser = webdriver.PhantomJS(service_args=service_args, desired_capabilities=dcap)
        return browser

    @staticmethod
    def set_new_ip():
        with Controller.from_port(port=9051) as controller:
            controller.authenticate(password=password)
            controller.signal(Signal.NEWNYM)
Then I import it into another script myscraping.py with this inside:
import mybrowser
import time
browser = mybrowser.MyBrowserClass.start_browser()
browser.get("https://canihazip.com/s")
print(browser.page_source)
mybrowser.MyBrowserClass.set_new_ip()
time.sleep(12)
browser.get("https://canihazip.com/s")
print(browser.page_source)
The browser is working - I can access the page and retrieve it with .page_source.
But the IP doesn't change between the first and the second print. If I move the body of the function into myscraping.py (and remove the import and the function call), then the IP does change.
Why? Is it a problem with returning the browser? How can I fix this?
Actually, the situation is a bit more complex. When I connect to https://check.torproject.org before and after the call to mybrowser.set_new_ip() and the 12-second wait (see the lines below), the IP reported by that page changes between the first and the second visit. So my IP is changed according to Tor, but neither https://httpbin.org/ip nor icanhazip.com detects the change.
...
browser.get("https://canihazip.com/s")
print(browser.page_source)
browser.get("https://check.torproject.org/")
print(browser.find_element_by_xpath('//div[@class="content"]').text)
mybrowser.set_new_ip()
time.sleep(12)
browser.get("https://check.torproject.org/")
print(browser.find_element_by_xpath('//div[@class="content"]').text)
browser.get("https://canihazip.com/s")
print(browser.page_source)
So the printed IPs look like this:
42.38.215.198 (canihazip before mybrowser.set_new_ip() )
42.38.215.198 (check.torproject before mybrowser.set_new_ip() )
106.184.130.30 (check.torproject after mybrowser.set_new_ip() )
42.38.215.198 (canihazip after mybrowser.set_new_ip())
Privoxy configuration: in C:\Program Files (x86)\Privoxy\config.txt, I have uncommented this line (9050 is the port Tor uses):
forward-socks5t / 127.0.0.1:9050
Tor configuration: in torrc, I have this:
ControlPort 9051
HashedControlPassword xxxx
This is probably because PhantomJS keeps a memory cache of requested content. Your first visit with a PhantomJS browser can return a dynamic result, but that result is then cached, and each subsequent visit uses the cached page.
This memory cache has caused issues like CSRF tokens not changing on refresh, and now I believe it is the root cause of your problem. The issue was reported and resolved in 2013, but the solution is a method, clearMemoryCache, found in PhantomJS's WebPage class. Sadly, we are dealing with a Selenium webdriver.PhantomJS instance.
So, unless I am overlooking something, it would be tough to access this method through Selenium's abstraction.
The only solution I see fit is to use another webdriver that doesn't have a memory cache like PhantomJS's. I have tested it using Chrome and it works perfectly:
103.***.**.***
72.***.***.***
Also, as a side note, Selenium is phasing out PhantomJS:
UserWarning: Selenium support for PhantomJS has been deprecated, please use headless versions of Chrome or Firefox instead
You might need to check if a new nym is available or not.
is_newnym_available - true if tor would currently accept a NEWNYM signal
if controller.is_newnym_available():
    controller.signal(Signal.NEWNYM)
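A generic polling loop around that check might look like the sketch below. The controller calls are passed in as callables so the retry logic stays separate from stem; in real use, `is_available` and `send_signal` would wrap `controller.is_newnym_available()` and `controller.signal(Signal.NEWNYM)`. The fake controller here is purely for illustration.

```python
import time

def signal_newnym_when_ready(is_available, send_signal, timeout=30.0, poll=1.0):
    """Poll until Tor will accept a NEWNYM signal, then send it.

    is_available / send_signal are callables wrapping the stem controller,
    so this sketch can be exercised without a running Tor instance.
    Returns True if the signal was sent, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_available():
            send_signal()
            return True
        time.sleep(poll)
    return False

# Fake controller: NEWNYM becomes available on the third poll
state = {"calls": 0}
def fake_available():
    state["calls"] += 1
    return state["calls"] >= 3

sent = []
ok = signal_newnym_when_ready(fake_available, lambda: sent.append("NEWNYM"),
                              timeout=5.0, poll=0.01)
print(ok, sent)
```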
I want to get a new window's URL using Selenium, and using PhantomJS is more efficient than Firefox.
The Python code is here:
from selenium import webdriver
renren = webdriver.Firefox()
#renren = webdriver.PhantomJS()
renren.get("file:///home/xjz/Desktop/html/easy.html")
renren.execute_script("windows()")
now_handle1 = renren.current_window_handle
all_handles1 = renren.window_handles
for handle1 in all_handles1:
    if handle1 != now_handle1:
        renren.switch_to.window(handle1)
        print(renren.current_url)
        print(renren.page_source)
In script "windows()", it will open a new window for http://www.renren.com/.
When I use Firefox, I get the current URL and content of http://www.renren.com/. But with PhantomJS I get "about:blank" for the URL and an empty string for the content, meaning it fails.
So how can I get the current URL when I use Selenium with PhantomJS?
Thanks a lot.
You can add sleep time in your code before getting the current URL.
from selenium import webdriver
renren = webdriver.Firefox()
...
...
...
import time
time.sleep(10)  # in seconds
print(renren.current_url)
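A fixed sleep either wastes time or can still be too short. A small generic poll loop is a more robust sketch; there is no Selenium dependency here, the getter is any callable returning the current URL (in real use it would be `lambda: renren.current_url`), and the iterator stub below stands in for a driver whose URL changes after a few polls.

```python
import time

def wait_for_url_change(get_url, old_url, timeout=10.0, poll=0.2):
    """Return the new URL once get_url() differs from old_url, or None on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = get_url()
        if current != old_url:
            return current
        time.sleep(poll)
    return None

# Stub standing in for a driver: the URL "changes" on the third poll
urls = iter(["about:blank", "about:blank", "http://www.renren.com/"])
result = wait_for_url_change(lambda: next(urls), "about:blank", timeout=2.0, poll=0.01)
print(result)
```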
Is it possible to catch the event when the url is changed inside my browser using selenium?
Here is my scenario:
I load my website test.com
After all the static files are loaded, when executing one of the js file, I am redirected (not sure how) to another page redirect-one.test.com/blah
My browser gets the url redirect-one.test.com/blah and gets a 307 response to go to redirect-two.test.com/blahblah
Here my browser receives a final 302 to go to final.test.com/
The page of final.test.com/ is loaded and at the end of this, selenium enables me to search for elements and so on...
I'd like to be able to intercept (and time the moment it happens) each time I am redirected.
After that, I still need to do some other steps for which selenium is more suitable:
Enter my username and password
Test some functionalities
Log out
Here a sample of how I tried to intercept the first redirect:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.support.ui import WebDriverWait
def url_contains(url):
    def check_contains_url(driver):
        return url in driver.current_url
    return check_contains_url

driver = webdriver.Remote(
    command_executor='http://127.0.0.1:4444/wd/hub',
    desired_capabilities=DesiredCapabilities.FIREFOX)

driver.get("http://test.com/")

try:
    url = "redirect-one.test.com"
    first_redirect = WebDriverWait(driver, 20).until(url_contains(url))
    print("found first redirect")
finally:
    print("move on to the next redirect....")
Is this even possible using selenium?
I cannot change the behavior of the website and the reason it is built like this is because of an SSO mechanism I cannot bypass.
I realize I specified python but I am open to tools in other languages.
Selenium is not the tool for this. All the redirects that the browser encounters are handled by the browser in a way that Selenium does not allow you to check.
You can perform the checks using urllib2 (urllib.request in Python 3), or if you prefer a saner interface, using requests.
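If Selenium can't see the hops, a plain HTTP client can. Below is a self-contained sketch using only the standard library; since test.com and the redirect hosts are placeholders, a throwaway local server stands in for the 307/302 chain. Against the real SSO flow you would point the opener at test.com instead, and each redirect would be recorded (and could be timestamped) the same way.

```python
import http.server
import threading
import urllib.request

# Throwaway local server mimicking the 307 -> 302 -> 200 chain from the question
class ChainHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/start":
            self.send_response(307)
            self.send_header("Location", "/middle")
            self.end_headers()
        elif self.path == "/middle":
            self.send_response(302)
            self.send_header("Location", "/final")
            self.end_headers()
        else:
            payload = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    def log_message(self, *args):  # keep the demo quiet
        pass

hops = []  # (status_code, absolute_target_url) for every redirect followed

class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        hops.append((code, newurl))  # a timestamp could be recorded here too
        return super().redirect_request(req, fp, code, msg, headers, newurl)

server = http.server.HTTPServer(("127.0.0.1", 0), ChainHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

opener = urllib.request.build_opener(RecordingRedirectHandler())
resp = opener.open("http://127.0.0.1:{}/start".format(server.server_port))
body = resp.read()
server.shutdown()

for code, target in hops:
    print(code, target)
print(resp.status, body)
```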
I would like to capture network traffic by using Selenium Webdriver on Python. Therefore, I must use a proxy (like BrowserMobProxy)
When I use webdriver.Chrome:
from browsermobproxy import Server
server = Server("~/browsermob-proxy")
server.start()
proxy = server.create_proxy()
from selenium import webdriver
co = webdriver.ChromeOptions()
co.add_argument('--proxy-server={host}:{port}'.format(host='localhost', port=proxy.port))
driver = webdriver.Chrome(executable_path = "~/chromedriver", chrome_options=co)
proxy.new_har()
driver.get(url)
proxy.har  # returns a HAR
for ent in proxy.har['log']['entries']:
    print(ent['request']['url'])
the webpage is loaded properly and all requests are available and accessible in the HAR file.
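For reference, the HAR that proxy.har returns is a plain nested dict following the HAR format, so the loop above is ordinary dict traversal. A minimal sketch with made-up entries shows the shape being walked:

```python
# Minimal HAR-shaped dict (structure per the HAR 1.2 format; values are made up)
har = {
    "log": {
        "entries": [
            {"request": {"url": "http://example.com/"}, "response": {"status": 200}},
            {"request": {"url": "http://example.com/app.js"}, "response": {"status": 304}},
        ]
    }
}

# Same traversal as in the question's loop
urls = [ent["request"]["url"] for ent in har["log"]["entries"]]
print(urls)
```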
But when I use webdriver.Firefox:
# The same as above
# ...
from selenium import webdriver
profile = webdriver.FirefoxProfile()
driver = webdriver.Firefox(firefox_profile=profile, proxy = proxy.selenium_proxy())
proxy.new_har()
driver.get(url)
proxy.har  # returns a HAR
for ent in proxy.har['log']['entries']:
    print(ent['request']['url'])
The webpage cannot be loaded properly, and the number of requests in the HAR file is smaller than it should be.
Do you have any idea what the problem with the proxy settings in the second snippet is? How should I fix it so that webdriver.Firefox works properly for my purpose?
Just stumbled across this project https://github.com/derekargueta/selenium-profiler. Spits out all network data for a URL. Shouldn't be hard to hack and integrate into whatever tests you're running.
Original source: https://www.openhub.net/p/selenium-profiler
For me, the following code works just fine.
profile = webdriver.FirefoxProfile()
profile.set_proxy(proxy.selenium_proxy())
driver = webdriver.Firefox(firefox_profile=profile)
I am sharing my solution. It does not write any logs to a file, but it lets you collect all sorts of messages: Errors, Warnings, Logs, Info, Debug, CSS, XHR, as well as Requests (traffic).
1. We are going to create a Firefox profile so that we can enable the "Persist Logs" option in Firefox (you can try enabling it on your default browser and see if it launches with "Persist Logs" without creating a profile).
2. We need to modify the Firefox initialization code, where this line does the magic: options.AddArgument("--jsconsole");
So the complete Selenium Firefox code is below; it will open the Browser Console every time you run your automation:
else if (browser.Equals(Constant.Firefox))
{
    var profileManager = new FirefoxProfileManager();
    FirefoxProfile profile = profileManager.GetProfile("ConsoleLogs");
    FirefoxDriverService service = FirefoxDriverService.CreateDefaultService(DrivePath);
    service.FirefoxBinaryPath = DrivePath;

    profile.SetPreference("security.sandbox.content.level", 5);
    profile.SetPreference("dom.webnotifications.enabled", false);
    profile.AcceptUntrustedCertificates = true;

    FirefoxOptions options = new FirefoxOptions();
    options.AddArgument("--jsconsole");
    options.AcceptInsecureCertificates = true;
    options.Profile = profile;
    options.SetPreference("browser.popups.showPopupBlocker", false);

    driver = new FirefoxDriver(service.FirefoxBinaryPath, options, TimeSpan.FromSeconds(100));
    driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromSeconds(10);
}
3. Now you can write your logic. Since the traffic/logging window is open, don't move on to the next execution if a test fails; that way the Browser Console keeps your error messages and helps you troubleshoot further.
Browser: Firefox v61
How you can launch the Browser Console in Firefox:
1. Open Firefox (with any URL)
2. Press Ctrl+Shift+J (or Cmd+Shift+J on a Mac)
Link : https://developer.mozilla.org/en-US/docs/Tools/Browser_Console