Using Selenium to get past dropdown (md-select) - python

I'm stuck on a dropdown that I can't get past in Selenium.
I'm trying to collect some price data using Selenium from this link:
https://xxx. On this page, you need to click a button (Next), select any option in the dropdown that follows, then press (Next) again to reach the page with the information I want to collect. I'm stuck at the dropdown: I am unable to select any option.
This is my code so far:
browser.get("https://xxx/#/pricePlans/step1")
wait = WebDriverWait(browser, 10)
while True:
    try:
        button = browser.find_element_by_css_selector('body > div.md-dialog-container.ng-scope > md-dialog > md-dialog-actions > div > button')
    except TimeoutException:
        break
    button.click()
options_box = browser.find_element_by_class_name('bullet-content-title')
wait = WebDriverWait(browser, 5)
options_box.click()
The issue lies with the dropdown options (it has options like HDB 1-room, HDB 2-room, etc.). I tried to reference the option box by XPath, CSS selector, and class name (as seen above), but with the snippet above, Spyder reports a timeout. Other snippets I tried included:
ui.WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.CLASS_NAME, "bullet-content-title")))
using XPath and class name, but no luck.
I'm a newbie at web scraping who got this far by searching SO, but I am unable to find many solutions regarding (md-select) dropdowns.
I also attempted to use
ActionChains(driver).move_to_element(options_box).click(options_box)
but I did not see any clicking or mouse movement, so I'm stumped.
I'd appreciate any advice at this point. Thank you so much!
Edit:
Code Snippets and Responses:
from selenium import webdriver
from selenium.webdriver.support import ui
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.action_chains import ActionChains
option = webdriver.ChromeOptions()
option.add_argument('--incognito')
browser = webdriver.Chrome(executable_path='C:\\ChromeDriver\\chromedriver.exe', options=option)
browser.get("https://xxx")
wait = WebDriverWait(browser, 10)
while True:
    try:
        button = browser.find_element_by_css_selector('body > div.md-dialog-container.ng-scope > md-dialog > md-dialog-actions > div > button')
    except TimeoutException:
        break
    button.click()
options_box = browser.find_element_by_class_name('bullet-content-title')
wait = WebDriverWait(browser, 5)
options_box.click()
This returns "StaleElementReferenceException: stale element reference: element is not attached to the page document"
I assume this is due to the presence of the second "Next" button, which is inert at the moment.
options_box = ui.WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.CLASS_NAME, "bullet-content-title")))
options_box.click()
This does nothing. Spyder eventually returns a timeout error.

@AndrewRay's answer is good for getting the values but not for selecting an option. You can do this to select an option:
# browser.get("https://......")
wait = WebDriverWait(browser, 10)
try:
    browser.find_element_by_css_selector('button.green-btn').click()
    # wait until the dialog disappears
    wait.until(EC.invisibility_of_element_located((By.CSS_SELECTOR, 'md-dialog[aria-describedby="dialogContent_0"]')))
    # click the dropdown
    browser.find_element_by_css_selector('md-input-container').click()
    # select the option element
    setOptionElement = browser.find_element_by_css_selector('md-option[value="HDB Executive"]')
    # scrollIntoView is needed if the option is near the bottom,
    # otherwise you get an "element not clickable" error
    browser.execute_script('arguments[0].scrollIntoView();arguments[0].click()', setOptionElement)
except Exception as ex:
    print(ex)

import time
driver.get('https://compare.openelectricitymarket.sg/#/pricePlans/step1')
time.sleep(5)
next_btn = driver.find_element_by_css_selector('button.green-btn')
next_btn.click()
dropdown = driver.find_element_by_id('select_4')
options = dropdown.find_elements_by_tag_name('md-option')
for option in options:
    print(option.get_attribute('value'))
Hope this helps. Use the .get_attribute method to read the value of each option, and click the option whose value matches the one you want. :)
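The "click the option whose value matches" step could look roughly like the sketch below. pick_option is a hypothetical helper name, and FakeOption is only a stand-in so the snippet runs without a browser; with Selenium you would pass the list returned by find_elements_by_tag_name('md-option') instead.

```python
def pick_option(options, wanted_value):
    """Return the first element whose 'value' attribute equals wanted_value, else None."""
    for option in options:
        if option.get_attribute('value') == wanted_value:
            return option
    return None


class FakeOption:
    """Stand-in for a Selenium WebElement (assumption: only these two methods are needed)."""

    def __init__(self, value):
        self._value = value
        self.clicked = False

    def get_attribute(self, name):
        return self._value if name == 'value' else None

    def click(self):
        self.clicked = True


# Option values taken from the question; the list itself is made up for the demo.
options = [FakeOption('HDB 1-room'), FakeOption('HDB 2-room'), FakeOption('HDB Executive')]
match = pick_option(options, 'HDB Executive')
if match is not None:
    match.click()
```

Returning None instead of raising keeps the caller in control of what to do when the value is missing.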

Related

With Selenium python, do I need to refresh the page when waiting for hidden btn to appear and be clickable?

I'm trying to make a small program that watches a web page for a hidden button (it has hide in its class) and waits for it to become clickable before clicking it. The code is below. I'm wondering whether WebDriverWait and element_to_be_clickable already refresh things, or whether I have to refresh the page manually.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
from selenium.common.exceptions import WebDriverException
driver = webdriver.Firefox()
driver.get(<URL>)
print("beginning 120s wait")
time.sleep(120)
print("finished 120s wait")
try:
    element = WebDriverWait(driver, 1000).until(
        EC.element_to_be_clickable((By.CLASS_NAME, "btn add"))
    )
    print("It went through")
    element.click()
    driver.execute_script("alert('It went through!');")
finally:
    driver.execute_script("alert('Did it work?');")
First of all, I am not really sure if just searching by the class name minus the "hide" part will actually find the correct element, but the larger issue is that I do not know if the button will only be visible after refreshing the page. If I need to refresh, then it gets annoying because most sites throw up additional captchas for both Firefox and Chrome when they figure out a bot is accessing the site. (That's why I have the initial sleep: so that I can finish any captcha manually first.)
So, do I need to have a refresh in my code, or will it be fine without it? If I do need it, how do I implement it? Do I just add it like:
try:
    element = WebDriverWait(driver, 1000).until(
        drive.refresh()
        EC.element_to_be_clickable((By.CLASS_NAME, "btn add"))
    )
And sorry if this has been answered elsewhere, I searched a bunch, but I have not quite found the answer on this site.
First, you shouldn't use sleep; WebDriverWait with the correct EC will do the trick.
As for EC.element_to_be_clickable, this is the code behind the function:
def element_to_be_clickable(locator):
    """ An Expectation for checking an element is visible and enabled such that
    you can click it."""
    def _predicate(driver):
        element = visibility_of_element_located(locator)(driver)
        if element and element.is_enabled():
            return element
        else:
            return False
    return _predicate
As you can see, the EC.element_to_be_clickable function does not refresh the browser.
If you insist that you need the refresh, the correct way to implement it would be:
try:
    element = WebDriverWait(driver, 30).until(EC.element_to_be_clickable((By.CLASS_NAME, "btn add")))
except (TimeoutException, NoSuchElementException, StaleElementReferenceException):
    # WebDriverWait.until raises TimeoutException when the condition is never met
    driver.refresh()
    element = WebDriverWait(driver, 30).until(EC.element_to_be_clickable((By.CLASS_NAME, "btn add")))
I don't think the refresh will help with the hidden element...
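To make the "no refresh" point concrete, here is a simplified sketch of what an explicit wait does conceptually. This is not Selenium's source: wait_until is a hypothetical name, and the condition is a plain function rather than an expected condition. The loop only re-evaluates the condition; nothing in it reloads the page.

```python
import time


def wait_until(condition, timeout, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        # Only sleep and poll again; the page is never refreshed here.
        time.sleep(poll_interval)


calls = []


def becomes_ready():
    """Pretend the button becomes clickable on the third poll."""
    calls.append(1)
    return "button" if len(calls) >= 3 else False


found = wait_until(becomes_ready, timeout=2, poll_interval=0.01)
```

So if the button only appears after a reload, polling alone will not surface it; if the page updates itself via JavaScript, polling is enough.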

Having trouble referencing to a certain element on page with Selenium

I am having a terribly hard time referencing a certain "next page" button on a website that I am trying to scrape links from [https://www.sreality.cz/adresar?strana=2]. If you scroll down, you can see a red right-arrow button that you can click to go to the next page, which makes the website load new dynamic content. Every approach seems to report the exact same error, and I don't know how I am supposed to point to the element without running into it.
This is the code that I currently have:
from selenium import webdriver
chromedriver_path = "/home/user/Dokumenty/iCloud/RealityScraper/chromedriver"
driver = webdriver.Chrome(chromedriver_path)
print("WebDriver Successfully Initialized")
driver.get("https://www.sreality.cz/adresar?strana=2")
links = driver.find_elements_by_css_selector("h2.title a")
nextPage = driver.find_element_by_css_selector("li.paging-item a.btn-paging-pn.icof.icon-arr-right.paging-next")
for link in links:
    print(link.get_attribute("href"))
nextPage.click()
The "nextPage" variable holds the element that should be clicked once the loop over "links" has finished printing all the links from the company titles. However, when I run this code I get an error:
selenium.common.exceptions.StaleElementReferenceException: Message:
stale element reference: element is not attached to the page document
I have been searching for various fixes online, but none of them resolved the issue. I think that the issue at this point is not caused by the element loading too slowly, but rather by Selenium having trouble finding the element because of a wrong reference.
Because of this I tried using XPath to point precisely to the element, so I changed the "nextPage" variable to:
nextPage = driver.find_element_by_xpath("""/html/body/div[2]/div[1]/div[2]/div[2]/div[4]/div/div/div/div[2]/div/div[2]/ul[1]/li[12]/a""")
This returns exactly the same error as stated above. I have been trying to find a solution to this for hours now, and I can't understand where the issue lies. I would be grateful if anyone could explain to me what I am doing wrong. Thanks to anyone.
If you want to get all the ng-href attributes from every page, you can loop as shown here. Alternatively, you could look into their API.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from time import sleep
driver.get("https://www.sreality.cz/adresar?strana=2")
wait = WebDriverWait(driver, 10)
while True:
    try:
        links = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "h2.title > a")))
        #print(len(links))
        for link in links:
            print(link.get_attribute("ng-href"))
        nextPage = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.btn-paging-pn.icof.icon-arr-right.paging-next")))
        nextPage.click()
        sleep(10)
    except Exception as e:
        print(e)
        break
First of all, never use an absolute XPath; it breaks easily. Use a relative XPath instead.
Secondly, I think the error you are getting occurs because clicking the "Next" button for the first time loads a new page. The new page has a different DOM, which is why the element you located earlier can no longer be used.
You can try locating the element again after every new page load (that is, after each click on the "Next" button).
# imports
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
# initialize
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 20)
action = ActionChains(driver)
# Try the code below and see if it works.
Next_btn = wait.until(EC.presence_of_element_located((By.XPATH, '(//li[@class="paging-item"])[2]')))
action.move_to_element(Next_btn).click().perform()
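The "locate again after every page load" advice can be sketched as a small retry helper. Everything here is illustrative: click_fresh is a hypothetical name, StaleError stands in for Selenium's StaleElementReferenceException, and FakeButton exists only so the snippet runs without a browser. With Selenium you would pass a function that calls find_element and set retry_on to the real exception class.

```python
class StaleError(Exception):
    """Stand-in for selenium's StaleElementReferenceException (assumption for the demo)."""


def click_fresh(locate, attempts=3, retry_on=(StaleError,)):
    """Re-run locate() before every click attempt so a stale reference is never reused."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        element = locate()  # always fetch a fresh reference
        try:
            element.click()
            return element
        except retry_on as error:
            last_error = error
    raise last_error


class FakeButton:
    """Minimal fake element: raises StaleError on click when marked stale."""

    def __init__(self, stale):
        self.stale = stale
        self.clicked = False

    def click(self):
        if self.stale:
            raise StaleError("element is not attached to the page document")
        self.clicked = True


# First lookup returns a stale button, second lookup returns a live one.
buttons = [FakeButton(stale=True), FakeButton(stale=False)]
located = []


def locate_next():
    button = buttons[len(located)]
    located.append(button)
    return button


clicked = click_fresh(locate_next)
```

The key design choice is that the helper receives a locator function rather than an element, so each attempt works on a newly found reference.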

Selenium crashes when I'm trying to parse the next page (and seven after it) on a website. Any way to tackle this?

I want to parse an IMDb film rating list located here, which spans around 8 pages. To do that I'm using Selenium, and I'm having trouble clicking through to the next page. In the end I need 1000 titles, which I'll then process with BeautifulSoup. The code below isn't working; I need to click the 'NEXT' button, which has this HTML:
<a class="flat-button lister-page-next next-page" href="/list/ls000004717/?page=2">
Next
</a>
This is the code:
from selenium import webdriver as wb
browser = wb.Chrome()
browser.get('https://www.imdb.com/list/ls000004717/')
field = browser.find_element_by_name("flat-button lister-page-next next-page").click()
Error is the following:
NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".flat-button lister-page-next next-page"}
(Session info: chrome=78.0.3904.108)
I suppose I lack knowledge of syntax needed, or maybe I mixed it up a little. I tried searching on SO, though every example is pretty unique and I don't possess the knowledge to extrapolate these cases fully. Any way Selenium can handle that?
You could try using an XPath to query on the Next text inside the button. You should also probably invoke WebDriverWait since you are navigating across multiple pages, then scroll into view since this is at the bottom of the page:
from selenium import webdriver as wb
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from time import sleep
browser = wb.Chrome()
browser.get('https://www.imdb.com/list/ls000004717/')
# keep clicking next until we reach the end
for i in range(0,9):
    # wait up to 10s before locating next button
    try:
        next_button = WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.XPATH, "//a[contains(@class, 'page') and contains(text(), 'Next')]")))
        # scroll down to button using Javascript
        browser.execute_script("arguments[0].scrollIntoView(true);", next_button)
        # click the button
        # next_button.click() this throws exception -- replace with JS click
        browser.execute_script("arguments[0].click();", next_button)
        # I never recommend using sleep like this, but WebDriverWait is not waiting on next button to fully load, so it goes stale.
        sleep(5)
    # case: next button no longer exists, we have reached the end
    except TimeoutException:
        break
I also wrapped everything in a try / except TimeoutException block to handle the case where we have reached the end of pages, and Next button no longer exists, thus breaking out of the loop. This worked on multiple pages for me.
I also had to add an explicit sleep(5) because even after invoking WebDriverWait on element_to_be_clickable, next_button was still throwing StaleElementReferenceException. It seems like WebDriverWait was finishing before page was fully loaded, causing the status of next_button to change after it had been located. Normally adding sleep(5) is bad practice, but there did not seem to be another workaround here. If anyone else has a suggestion on this, feel free to comment / edit the answer.
There are a couple of ways that could work:
1. Use a selector for the next button and loop until the end:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
browser = webdriver.Chrome()
browser.get('https://www.imdb.com/list/ls000004717/')
selector = 'a[class*="next-page"]'
num_pages = 10
for page in range(num_pages):
    # Wait for the element to load
    WebDriverWait(browser, 10).until(ec.presence_of_element_located((By.CSS_SELECTOR, selector)))
    # ... Do rating parsing here
    browser.find_element_by_css_selector(selector).click()
2. Instead of clicking on the element, the other option is to navigate to the next page using browser.get('...'):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
# Set up browser as before and navigate to the page
browser = webdriver.Chrome()
browser.get('https://www.imdb.com/list/ls000004717/')
selector = 'a[class*="next-page"]'
base_url = 'https://www.imdb.com/list/ls000004717/'
page_extension = '?page='
num_pages = 10
# Already at page = 1, so only needs to loop 9 times
for page in range(2, num_pages + 1):
    # Wait for the page to load
    WebDriverWait(browser, 10).until(ec.presence_of_element_located((By.CSS_SELECTOR, selector)))
    # ... Do rating parsing here
    next_page = base_url + page_extension + str(page)
    browser.get(next_page)
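If you go the URL route, the list of page addresses can be built up front with the standard library. This is only a sketch: page_urls is a hypothetical helper, and it assumes the site keeps using the ?page=N query parameter seen in the Next link's href.

```python
from urllib.parse import urlencode


def page_urls(base_url, first_page, last_page):
    """Build one URL per page number, inclusive on both ends."""
    return [
        base_url + '?' + urlencode({'page': page})
        for page in range(first_page, last_page + 1)
    ]


# Page 1 is the base URL itself, so start from page 2.
urls = page_urls('https://www.imdb.com/list/ls000004717/', 2, 10)
for url in urls:
    # browser.get(url) would go here, followed by the rating parsing
    pass
```

Building the URLs separately makes it easy to check the range before opening a browser at all.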
As a note: field = browser.find_element_by_name("...").click() will not assign field to a webelement, as the click() method has no return value.
You could try a partial css selector.
browser.find_element_by_css_selector("a[class*='next-page']").click()
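For background on why the original lookup failed: the class attribute "flat-button lister-page-next next-page" is three separate classes, and a class-name locator accepts only a single class. A CSS selector can require all three by chaining them with dots. The helper below is a hypothetical sketch that just builds that selector string.

```python
def class_attr_to_css(tag, class_attr):
    """Turn a space-separated class attribute into a CSS selector requiring every class."""
    classes = class_attr.split()
    return tag + ''.join('.' + name for name in classes)


selector = class_attr_to_css('a', 'flat-button lister-page-next next-page')
# The result can be passed to find_element_by_css_selector.
```

The partial match a[class*='next-page'] is looser; the chained form is stricter and fails if any one of the classes is renamed.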
To keep clicking the element with the text NEXT until you reach the page showing 901 - 1,000 of 1,000, you have to:
scrollIntoView() the element once visibility_of_element_located() is satisfied.
Induce WebDriverWait for element_to_be_clickable().
You can use the following solution:
Code Block:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://www.imdb.com/list/ls000004717/')
driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span.pagination-range"))))
while True:
    try:
        WebDriverWait(driver, 20).until(EC.invisibility_of_element((By.CSS_SELECTOR, "div.row.text-center.lister-working.hidden")))
        driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span.pagination-range"))))
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.flat-button.lister-page-next.next-page"))).click()
        print("Clicked on NEXT button")
    except TimeoutException as e:
        print("No more NEXT button")
        break
driver.quit()
Console Output:
Clicked on NEXT button
Clicked on NEXT button
Clicked on NEXT button
Clicked on NEXT button
Clicked on NEXT button
Clicked on NEXT button
Clicked on NEXT button
Clicked on NEXT button
Clicked on NEXT button
No more NEXT button

Click on link using selenium webdriver

I am trying to click on a link and can't seem to get it to work. I click all the way up to the page I need, but then it won't click the last link. The code is as follows:
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.keys import Keys
import time
from bs4 import BeautifulSoup
import requests
import pandas as pd
import openpyxl
from password import DKpassword
#import SendKeys
beginningTime = time.time()
browser = webdriver.Chrome()
browser.get('https://www.draftkings.com/lobby')
browser.maximize_window()
time.sleep(5)
signinLink = browser.find_element_by_xpath("""//*[@id="react-mobile-home"]/section/section[2]/div[2]/div[3]/div/input""")
signinLink.click()
signinLink.send_keys("abcdefg")
signinLink.send_keys(Keys.TAB)
passwordLink = browser.find_element_by_xpath("""//*[@id="react-mobile-home"]/section/section[2]/div[2]/div[4]/div/input""")
passwordLink.send_keys(DKpassword)
passwordLink.send_keys(Keys.ENTER)
time.sleep(5)
if browser.current_url == "https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby1":
    signin = browser.find_element_by_partial_link_text("SIGN IN")
    signin.click()
elif browser.current_url == "https://www.draftkings.com/lobby#/featured":
    mlbLink = browser.find_element_by_partial_link_text("MLB")
    mlbLink.click()
else:
    print("error")
time.sleep(5)
featuredGame = browser.find_element_by_class_name("GameSetTile_tag")
featuredGame.click()
time.sleep(5)
firstContest = browser.find_element_by_partial_link_text("Enter")
firstContest.click()
I receive an error message saying that the element is not clickable at point... and that another element would receive the click. Any help would be greatly appreciated. I don't care which contest is clicked on, as long as it is on the featured page that the previous code directs it to.
There can be multiple reasons for that.
1. You might have to scroll down, or perform some other action, so that the element becomes visible to the script.
To scroll down you can use this code:
browser.execute_script("window.scrollTo(0, Y)")
where Y is the vertical offset in pixels to scroll to (on a Full HD monitor, one screen height is 1080).
2. There may be multiple matching web elements. In that case you have to use a locator that is unique in the DOM. You can check this by right-clicking > Inspect > Elements tab > CTRL+F, then typing your locator (in your case, the XPath) to see how many matches there are.
You can also try this XPath:
//a[contains(text(),'Enter') and contains(@href,'/contest/draftteam') and contains(@class,'dk-btn-dark')]
Code:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
WebDriverWait(browser, 20).until(
    EC.element_to_be_clickable((By.XPATH, "//a[contains(text(),'Enter') and contains(@href,'/contest/draftteam') and contains(@class,'dk-btn-dark')]"))
)
firstContest = browser.find_element_by_xpath("//a[contains(text(),'Enter') and contains(@href,'/contest/draftteam') and contains(@class,'dk-btn-dark')]")
firstContest.click()
Replace the click() call with the ActionChains class, which should solve this exception:
from selenium.webdriver.common.action_chains import ActionChains
actions = ActionChains(browser)
actions.move_to_element(firstContest).click().perform()
I was able to overcome this exception by clicking through the JavaScript executor instead of the regular element.click(), as below (Java):
WebElement element = webDriver.findElement(By.xpath(webElemXpath));
try {
    JavascriptExecutor ex = (JavascriptExecutor) webDriver;
    ex.executeScript("arguments[0].click();", element);
    logger.info(elementName + " was clicked");
}
catch (NoSuchElementException | TimeoutException e) {
    logger.error(e.getMessage());
}

Python Selenium Select: "Element <option> could not be scrolled into view"

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import time
url = "https://www.bungol.ca/"
driver = webdriver.Firefox(executable_path ='/usr/local/bin/geckodriver')
driver.get(url)
#myElem = WebDriverWait(browser, delay).until(EC.presence_of_element_located((By.ID, 'IdOfMyElement')))
searchbutt = """/html/body/section/div[2]/div/div[1]/form/div/button""" #click search to get to map
active_listing = """//*[@id="activeListings"]"""
search_wait = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, searchbutt)))
search_wait.click()
active_wait = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, active_listing)))
active_wait.click()
driver.find_element_by_xpath("""//*[@id="useDateRange"]""").click() #use data range
time.sleep(1)
driver.find_element_by_xpath("""//*[@id="dateRangeStart"]""").click() #start range
time.sleep(1)
month_path = driver.find_element_by_xpath("""/html/body/div[17]/div/div/div[1]/select""") #click the month to bring combo box list option
driver.execute_script("arguments[0].click();", month_path)
I am trying to select January 1, 2015 on the calendar at https://www.bungol.ca/map/?, which requires you to:
1. click "use date range" (no problem doing this)
2. click the start range (no problem)
3. click the month, which brings up a combo box of options (can't click)
4. click the year (can't click)
5. click the date (can't click)
I tried:
1. locating the element by XPath and CSS path, but neither method works
2. the move_to_element method, which still doesn't work
3. the switch-to-frame method, which doesn't work because the element is not inside an iframe
4. using JavaScript to click it, as found here: How do you click on an element which is hidden using Selenium WebDriver?
5. scrolling to the element, which doesn't do anything because the element is already on screen
Here is a code block to solve your issue; I reorganized some of your code as well. Please make sure to add the import "from selenium.webdriver.support.ui import Select" so that you can use Select in the code block:
driver.get('https://www.bungol.ca/')
wait = WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.XPATH, './/button[@type="submit" and text()="Search"]'))).click()
wait.until(EC.element_to_be_clickable((By.ID, 'activeListings'))).click()
wait.until(EC.element_to_be_clickable((By.ID, 'useDateRange'))).click()
# I found that I had to click the start date every time I wanted to interact with
# anything related to the date selection div/table
wait.until(EC.element_to_be_clickable((By.XPATH, './/input[@id="dateRangeStart" and @name="soldDateStart"]')))
driver.find_element_by_xpath('.//input[@id="dateRangeStart" and @name="soldDateStart"]').click()
yearSelect = Select(driver.find_element_by_xpath('.//select[@class="pika-select pika-select-year"]'))
yearSelect.select_by_visible_text('2015')
wait.until(EC.element_to_be_clickable((By.XPATH, './/input[@id="dateRangeStart" and @name="soldDateStart"]')))
driver.find_element_by_xpath('.//input[@id="dateRangeStart" and @name="soldDateStart"]').click()
monthSelect = Select(driver.find_element_by_xpath('.//select[@class="pika-select pika-select-month"]'))
monthSelect.select_by_visible_text('January')
wait.until(EC.element_to_be_clickable((By.XPATH, './/input[@id="dateRangeStart" and @name="soldDateStart"]')))
driver.find_element_by_xpath('.//input[@id="dateRangeStart" and @name="soldDateStart"]').click()
driver.find_element_by_xpath('.//td[@data-day="1"]').click()
After running that, you should have the date selected January 1, 2015 for the first part of the range. You can use the same techniques to select the second part of the range if needed.
For more information on how to use the Select please visit THIS.
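As a side note on the locators used in that answer: in XPath, attribute tests are written with an at sign, as in [@class="..."], and the value must match the whole attribute string. The sketch below tries such a predicate offline with Python's xml.etree.ElementTree, which supports only a small XPath subset (for example, no "and" inside predicates). The markup is invented for the demo and merely reuses the class names from the answer; it is not the real page.

```python
import xml.etree.ElementTree as ET

# Invented markup that only borrows the class names from the answer.
markup = """
<div>
  <select class="pika-select pika-select-month">
    <option value="0">January</option>
    <option value="1">February</option>
  </select>
  <select class="pika-select pika-select-year">
    <option value="2015">2015</option>
    <option value="2016">2016</option>
  </select>
</div>
"""

root = ET.fromstring(markup)

# [@class='...'] matches only when the whole attribute value is identical.
year_select = root.find(".//select[@class='pika-select pika-select-year']")
years = [option.text for option in year_select.findall('option')]

# A partial class value does not match with this exact-equality predicate.
partial = root.find(".//select[@class='pika-select-year']")
```

Testing a locator against a saved snippet like this can be quicker than rerunning the whole browser session.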
