Python Selenium (waiting for a frame) [duplicate]

I have these includes:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
Browser set up via
browser = webdriver.Firefox()
browser.get(loginURL)
However sometimes I do
browser.switch_to_frame("nameofframe")
And it won't work (sometimes it does, sometimes it doesn't).
I am not sure if this is because Selenium isn't actually waiting for pages to load before it executes the rest of the code or what. Is there a way to force a webpage load?
Because sometimes I'll do something like
browser.find_element_by_name("txtPassword").send_keys(password + Keys.RETURN)
#sends login information, goes to next page and clicks on Relevant Link Text
browser.find_element_by_partial_link_text("Relevant Link Text").click()
And it'll work great most of the time, but sometimes I'll get an error where it can't find "Relevant Link Text" because it can't "see" it or some other such thing.
Also, is there a better way to check if an element exists or not? That is, what is the best way to handle:
browser.find_element_by_id("something")
When that element may or may not exist?

You could use WebDriverWait:
from contextlib import closing
from selenium.webdriver import Chrome as Browser
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import NoSuchFrameException
def frame_available_cb(frame_reference):
    """Return a callback that checks whether the frame is available."""
    def callback(browser):
        try:
            browser.switch_to_frame(frame_reference)
        except NoSuchFrameException:
            return False
        else:
            return True
    return callback

with closing(Browser()) as browser:
    browser.get(url)
    # wait for the frame to become available, then switch to it
    WebDriverWait(browser, timeout=10).until(frame_available_cb("frame name"))
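As for the second part of the question (an element that may or may not exist): `find_elements_by_id` (plural) returns an empty list instead of raising `NoSuchElementException`, so a presence check needs no try/except. A minimal sketch, using the same Selenium 2/3 API as the rest of this page:

```python
def element_exists(browser, element_id):
    """Return True if an element with the given id is present.

    find_elements_by_id (plural) returns an empty list instead of raising
    NoSuchElementException, so no exception handling is needed.
    """
    return len(browser.find_elements_by_id(element_id)) > 0

# Usage sketch with a live driver ("something" is the question's placeholder):
#   if element_exists(browser, "something"):
#       browser.find_element_by_id("something").click()
```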

Related

With Selenium python, do I need to refresh the page when waiting for hidden btn to appear and be clickable?

I'm trying to make a small program that watches a web page for a hidden button (its class contains "hide") and waits for it to become clickable before clicking it. The code is below. I'm wondering whether the WebDriverWait and element_to_be_clickable functions will already be refreshing things, or whether I would have to refresh the page manually.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
from selenium.common.exceptions import WebDriverException
driver = webdriver.Firefox()
driver.get(<URL>)
print("beginning 120s wait")
time.sleep(120)
print("finished 120s wait")
try:
    element = WebDriverWait(driver, 1000).until(
        EC.element_to_be_clickable((By.CLASS_NAME, "btn add"))
    )
    print("It went through")
    element.click()
    driver.execute_script("alert('It went through!');")
finally:
    driver.execute_script("alert('Did it work?');")
First of all, I am not really sure whether searching by the class name minus the "hide" part will actually find the correct element, but the larger issue is that I do not know whether the button will only become visible after refreshing the page. If I do need to refresh, it gets annoying, because most sites throw up additional captchas in both Firefox and Chrome once they figure out a bot is accessing the site. (That's why I have the initial sleep: so that I can finish any captcha manually first.)
So, do I need to have a refresh in my code, or will it be fine without it? If I do need it, how do I implement it? Do I just add it like:
try:
    element = WebDriverWait(driver, 1000).until(
        driver.refresh()
        EC.element_to_be_clickable((By.CLASS_NAME, "btn add"))
    )
And sorry if this has been answered elsewhere, I searched a bunch, but I have not quite found the answer on this site.
First, you shouldn't use sleep; WebDriverWait with the correct EC will do the trick.
As for the EC.element_to_be_clickable this is the code behind the function:
def element_to_be_clickable(locator):
    """ An Expectation for checking an element is visible and enabled such that
    you can click it."""
    def _predicate(driver):
        element = visibility_of_element_located(locator)(driver)
        if element and element.is_enabled():
            return element
        else:
            return False
    return _predicate
As you can see, the EC.element_to_be_clickable function does not refresh the browser.
If you insist that you need the refresh, the correct way to implement it would be:
try:
    element = WebDriverWait(driver, 30).until(EC.element_to_be_clickable((By.CLASS_NAME, "btn add")))
except (TimeoutException, StaleElementReferenceException):
    # WebDriverWait raises TimeoutException when the wait expires
    driver.refresh()
    element = WebDriverWait(driver, 30).until(EC.element_to_be_clickable((By.CLASS_NAME, "btn add")))
I don't think the refresh will help with the hidden element...
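One caveat with the locator used above: By.CLASS_NAME accepts a single class name, and "btn add" is two classes, so that lookup is likely to fail regardless of refreshing. A small helper to turn a multi-class attribute into a CSS selector (the class names are the asker's):

```python
def compound_class_css(class_attr):
    """Turn a multi-class attribute value like "btn add" into the CSS
    selector ".btn.add" - By.CLASS_NAME only accepts a single class name,
    but By.CSS_SELECTOR can match on several at once."""
    return "." + ".".join(class_attr.split())

# Sketch with a live driver:
#   from selenium.webdriver.common.by import By
#   from selenium.webdriver.support.ui import WebDriverWait
#   from selenium.webdriver.support import expected_conditions as EC
#   locator = (By.CSS_SELECTOR, compound_class_css("btn add"))
#   WebDriverWait(driver, 30).until(EC.element_to_be_clickable(locator)).click()
```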

How to open a page with Selenium WebDriver (Chrome), without waiting for it to load in Python?

The driver.get(URL) call waits for the page to finish loading before returning control to the script.
In Java there is a method driver.navigate().to(URL) which does not wait for the page to load; however, I cannot find such a method in Python.
For Firefox, I could run profile.set_preference("webdriver.load.strategy", "unstable"); however, Chrome has no such preference.
There is also driver.set_page_load_timeout(0.2) with TimeoutException handling, but when the exception is handled, the Chrome driver stops loading the page.
EDIT: Rephrased the question a bit and added one solution that does not work.
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
caps = DesiredCapabilities().CHROME
caps["pageLoadStrategy"] = "none" # Do not wait for full page load
driver = webdriver.Chrome(desired_capabilities=caps, executable_path="path/to/chromedriver.exe")
driver.get('some url')
#some code to run straight away
This code will not wait for the page to fully load before moving to the next line. When using the 'none' page load strategy, you will probably have to write your own custom wait to check whether the element or elements you need have finished loading.
You could use:
import time
time.sleep(5)  # crude: always waits the full duration
Or, if you want to wait only until a given element has loaded:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
WebDriverWait(driver, timeout=10).until(
    ec.visibility_of_element_located((By.ID, "your_element_id"))
)
This should help you with whatever you are trying to do. Since you gave no code, I can only assume this will work, as I could not give a specific answer for a specific web page.

Python Selenium - Webdriver interacts with data from previous page

I am trying to create a piece of code that takes an input from the user, searches for that input in his Gmail account, and then checks all the checkboxes for messages whose sender matches the input.
Here is my code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
from selenium.webdriver import Remote
from selenium.webdriver.support.ui import WebDriverWait
import urllib
driver = webdriver.Chrome('D:/chromedriver_win32/chromedriver.exe')
driver.get("http://www.gmail.com")
elem = driver.find_element_by_id('Email')
elem.send_keys("****************")
elem2 = driver.find_element_by_id('next')
elem2.send_keys(Keys.ENTER)
driver.maximize_window()
driver.implicitly_wait(20)
elem3 = driver.find_element_by_name('Passwd')
elem3.send_keys("*************")
driver.find_element_by_id('signIn').send_keys(Keys.ENTER)
driver.implicitly_wait(20)
inp = 'randominput'
driver.find_element_by_name('q').send_keys(inp)
driver.find_element_by_css_selector('button.gbqfb').send_keys(Keys.ENTER)
x = driver.current_url
for i in driver.find_elements_by_css_selector('.zA.zE'):
    print(i.find_element_by_class_name('zF').get_attribute('name'))
    if i.find_element_by_class_name('zF').get_attribute('name') == inp:
        i.find_element_by_css_selector('.oZ-jc.T-Jo.J-J5-Ji').click()
The main problem is that although the webdriver shows the new page with the search results, when the code interacts with the page it interacts with the previous one.
I have tried an implicit wait, and when I check the current URL it shows the new URL.
The problem is that this is a single-page app, so Selenium isn't going to wait for the new data to load. You need to figure out a way to wait for the search results to come back, whether that is based on the 'Loading...' text that appears at the top, or just waiting for the first result to change.
Grab an element off of the first page and wait for it to go stale. That will tell you that the page is loading. Then wait for the element you want. This is about the only way to ensure that the page has reloaded and you aren't referencing the original page with a dynamic page.
To wait for stale, see staleness_of on http://selenium-python.readthedocs.io/waits.html.
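The stale-then-wait pattern can be sketched by hand; this mirrors what Selenium's own EC.staleness_of does, returning True once any call on the old element raises (StaleElementReferenceException in real Selenium). The locators in the usage comment are from the question:

```python
def staleness_of(element):
    """Expected-condition factory: the returned predicate becomes True once
    `element` has been detached from the DOM, i.e. the page it came from
    has been replaced."""
    def _predicate(driver):
        try:
            element.is_enabled()  # any WebElement call will do
            return False          # still attached: new page not loaded yet
        except Exception:         # StaleElementReferenceException in Selenium
            return True
    return _predicate

# Sketch with a live driver:
#   old = driver.find_element_by_css_selector('.zA.zE')
#   driver.find_element_by_css_selector('button.gbqfb').send_keys(Keys.ENTER)
#   WebDriverWait(driver, 10).until(staleness_of(old))
#   ... now it is safe to query the new results page ...
```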

How to detect if page has completely loaded in Selenium?

What I am trying to do is iterate over all select values and grab some data. The data is loaded via AJAX, so the page isn't reloading - the data in the table just changes. As you can see, I have tried to use an explicit wait, but that doesn't work - the select value can be "4" while the information in the table still belongs to value "3". I would like to know how I can be sure that the page has really updated.
There are some similar questions, but I didn't find a proper solution.
for i in range(N):
    try:
        driver.find_element_by_xpath("//select[@id='search-type']/option[@value='%s']" % i).click()
        driver.find_element_by_name("submit").click()
    except:
        print i
        continue
    elem = driver.find_element_by_id("my_id")
    wait.until(lambda driver: driver.find_element_by_id('search-type').get_attribute("value") == str(i))
    if no_results(get_bs_source(driver.page_source)):
        print "No results %s" % i
    if is_limited(get_bs_source(driver.page_source)):
        print "LIMITED %s" % i
    data += get_links_from_current_page(driver)
For sure I can use time.sleep(N) but that's not the way I would like to do this.
Selenium uses the document ready state to determine whether a page has loaded, and WebDriverWait implements this kind of logic, so you should not need to query document.readyState directly.
Most of the blocking calls already wait on it. For example:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
browser = webdriver.Firefox()
browser.get("http://something.com")
delay = 10 # seconds
WebDriverWait(browser, delay).until(EC.presence_of_element_located((By.ID, 'IdOfMyElement')))
In this case the ready state is effectively checked twice: once by browser.get, and again during the WebDriverWait.
Selenium implementation: https://github.com/SeleniumHQ/selenium/blob/01399fffecd5a20af6f31aed6cb0043c9c5cde65/java/client/src/com/thoughtworks/selenium/webdriven/commands/WaitForPageToLoad.java#L59-L73
var javaScriptExecutor = Instance as IJavaScriptExecutor;
bool isPageLoaded = javaScriptExecutor.ExecuteScript("return document.readyState;").Equals("complete");
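If you do want an explicit ready-state wait anyway, a minimal sketch is a predicate you hand to WebDriverWait. Note this only covers the initial page load, not the AJAX table update, which is the asker's actual problem:

```python
def document_ready(driver):
    """Predicate for WebDriverWait: True once the browser reports the
    page as fully loaded. Does NOT detect later AJAX updates."""
    return driver.execute_script("return document.readyState") == "complete"

# Usage sketch with a live driver:
#   WebDriverWait(browser, 10).until(document_ready)
```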

Selenium - Python bindings - Detect new AJAX data

As a beginner programmer, I have found a lot of useful information on this site, but could not find an answer to my specific question. I want to scrape data from a webpage, but some of the data I am interested in scraping can only be obtained after clicking a "more" button. The below code executes without producing an error, but it does not appear to click the "more" button and display the additional data on the page. I am only interested in viewing the information on the "Transcripts" tab, which seems to complicate things a bit for me because there are "more" buttons on the other tabs. The relevant portion of my code is as follows:
from mechanize import Browser
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver import ActionChains
import urllib2
import mechanize
import logging
import time
import httplib
import os
import selenium
url="http://seekingalpha.com/symbol/IBM/transcripts"
ua='Mozilla/5.0 (X11; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0 (compatible;)'
br=Browser()
br.addheaders=[('User-Agent', ua), ('Accept', '*/*')]
br.set_debug_http(True)
br.set_debug_responses(True)
logging.getLogger('mechanize').setLevel(logging.DEBUG)
br.set_handle_robots(False)
chromedriver="~/chromedriver"
os.environ["webdriver.chrome.driver"]=chromedriver
driver=webdriver.Chrome(chromedriver)
time.sleep(1)
httplib.HTTPConnection._http_vsn=10
httplib.HTTPConnection._http_vsn_str='HTTP/1.0'
page=br.open(url)
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
time.sleep(5)
actions=ActionChains(driver)
elem=driver.find_element_by_css_selector("div #transcripts_show_more div#more.older_archives")
actions.move_to_element(elem).click()
A couple of things:
Given you're using selenium, you don't need either mechanize or urllib2 as selenium is doing the actual page loading. As for the other imports (httplib, logging, os and time), they're either unused or redundant.
For my own convenience, I changed the code to use Firefox; you can change it back to Chrome (or other any browser).
In regards to the ActionChains, you don't need them here, as you're only doing a single click (nothing to chain, really).
Given the browser is receiving data (via AJAX) instead of loading a new page, we don't know when the new data has appeared; so we need to detect the change.
We know that 'clicking' the button loads more <li> tags, so we can check if the number of <li> tags has changed. That's what this line does:
WebDriverWait(selenium_browser, 10).until(lambda driver: len(driver.find_elements_by_xpath("//div[@id='headlines_transcripts']//li")) != old_count)
It will wait up to 10 seconds, periodically comparing the current number of <li> tags against the count taken before the click.
import selenium
from selenium import webdriver
from selenium.common.exceptions import StaleElementReferenceException
from selenium.common.exceptions import WebDriverException
from selenium.common.exceptions import TimeoutException as SeleniumTimeoutException
from selenium.webdriver.support.ui import WebDriverWait
url = "http://seekingalpha.com/symbol/IBM/transcripts"
selenium_browser = webdriver.Firefox()
selenium_browser.set_page_load_timeout(30)
selenium_browser.get(url)
selenium_browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
elem = selenium_browser.find_element_by_css_selector("div #transcripts_show_more div#more.older_archives")
old_count = len(selenium_browser.find_elements_by_xpath("//div[@id='headlines_transcripts']//li"))
elem.click()
try:
    WebDriverWait(selenium_browser, 10).until(lambda driver: len(driver.find_elements_by_xpath("//div[@id='headlines_transcripts']//li")) != old_count)
except StaleElementReferenceException:
    pass
except SeleniumTimeoutException:
    pass
print(selenium_browser.page_source.encode("ascii", "ignore"))
I'm on python2.7; if you're on python3.X, you probably won't need .encode("ascii", "ignore").

Categories

Resources