I'm trying to get the text of an HTML element from a webpage using Python and Selenium, but Selenium appears to be unable to locate the element. I'm not sure whether I am doing something wrong, and I am looking for some answers here.
Please see a part of my code below:
wait(driver, 10).until(EC.visibility_of_element_located((By.ID, 'current_address_name')))
full_address = driver.find_element_by_id('current_adress_name')
street_address = full_address.text.split('(S)')[0]
print('Street Address: ' + str(street_address))
all_street_address.append(street_address)
This is the error message (a timeout), presumably because it cannot find the element:
Exception in thread Thread-1:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/Volumes/GoogleDrive/My Drive/PycharmProjects/googlestuff/gsheet_test_mod.py", line 121, in web_extraction
wait(driver, 10).until(EC.visibility_of_element_located((By.ID, 'current_address_name')))
File "/Users/cadellteng/googstuff/lib/python3.8/site-packages/selenium/webdriver/support/wait.py", line 80, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
And this is a part of the HTML code:
<div id="current_address_name" style="display:none">4 Third Avenue (S)266576</div>
A little background: this element doesn't seem to be visible on the page, because when I put my mouse over the element, no part of the page is highlighted. Is it because Selenium cannot extract elements that are not displayed? Or am I doing something wrong here? Please advise.
Just in case you need the full code, this is the page: Singapore Streetdirectory
The visibility_of_element_located method is for "checking that an element is present on the DOM of a page and visible".
Clearly this element is not visible, as the element has style="display:none" set.
You could use presence_of_element_located instead, which will just check that the element is present in the DOM tree:
wait(driver, 10).until(EC.presence_of_element_located((By.ID, 'current_address_name')))
Fixing a few other issues with your code, the full minimal, reproducible example would be:
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium import webdriver
all_street_address = []
driver = webdriver.Chrome()
wait = WebDriverWait
driver.get("https://www.streetdirectory.com/sg/4-third-avenue-266576/1_27869.html")
wait(driver, 10).until(EC.presence_of_element_located((By.ID, 'current_address_name')))
full_address = driver.find_element_by_id('current_address_name')
street_address = full_address.get_attribute('innerHTML').split('(S)')[0]
print('Street Address: ' + str(street_address))
all_street_address.append(street_address)
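One detail of the example above: split('(S)')[0] keeps the space that precedes "(S)", so the street string ends with a trailing space. Below is a minimal sketch of a helper that strips it and also returns the postal code. The name split_sg_address is hypothetical, and the sketch assumes the text always has the "street (S)postal code" shape seen in the HTML snippet from the question.

```python
def split_sg_address(full_text):
    """Split 'street (S)postal' into (street, postal_code).

    Hypothetical helper; assumes the '(S)' marker format shown in the question.
    """
    street, marker, postal = full_text.partition("(S)")
    if not marker:
        # No '(S)' marker found: treat the whole text as the street.
        return full_text.strip(), ""
    return street.strip(), postal.strip()


street, postal = split_sg_address("4 Third Avenue (S)266576")
print("Street Address:", street)
print("Postal Code:", postal)
```

The helper works on whatever string was read from the element, for example the value returned by get_attribute('innerHTML').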
Related
I wrote the following code in order to scrape the text of the element <h3 class="h4 mb-10">Total nodes: 1,587</h3>
from https://blockchair.com/dogecoin/nodes.
#!/usr/bin/python3
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
path = "/usr/local/bin/chromedriver"
driver = webdriver.Chrome(path)
driver.get("https://blockchair.com/dogecoin/nodes")
def scraping_fnd():
    try:
        #nodes = driver.find_element_by_class_name("h4 mb-10")
        #NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".h4 mb-10"}. There are 3 elements of this class
        #nodes = WebDriverWait(driver, 10).until(expected_conditions.presence_of_element_located((By.CLASS_NAME, "h4 mb-10")))#selenium.common.exceptions.TimeoutException: Message:
        nodes = WebDriverWait(driver, 10).until(expected_conditions.presence_of_element_located((By.CSS_SELECTOR, ".h4 mb-10")))#selenium.common.exceptions.TimeoutException: Message:
        nodes = nodes.text
        print(nodes)
    finally:
        driver.quit()#Closes the tab even when return is executed
scraping_fnd()
I'm aware that there are perhaps less bloated options than selenium to scrape the target in question, yet the said code is just a snippet,
a part of a more extensive script that relies on selenium for its other tasks. Thus let us limit the scope of the answers to selenium
only.
Although there are three elements of the class "h4 mb-10" on the page, I am unable to locate the element. When I call driver.find_element_by_class_name("h4 mb-10"), I get:
Traceback (most recent call last):
File "./protocols.py", line 34, in <module>
scraping_fnd()
File "./protocols.py", line 20, in scraping_fnd
nodes = driver.find_element_by_class_name("h4 mb-10")#(f"//span[@title = \"{name}\"]")
File "/home/jerzy/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 564, in find_element_by_class_name
return self.find_element(by=By.CLASS_NAME, value=name)
File "/home/jerzy/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 976, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "/home/jerzy/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/home/jerzy/.local/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".h4 mb-10"}
(Session info: chrome=90.0.4430.212)
Applying waits, currently commented out in the snippet, was to no avail. I came across this question and so I tried calling WebDriverWait(driver, 10).until(expected_conditions.presence_of_element_located((By.CSS_SELECTOR, ".h4 mb-10"))).
I got :
Traceback (most recent call last):
File "./protocols.py", line 33, in <module>
scraping_fnd()
File "./protocols.py", line 23, in scraping_fnd
nodes = WebDriverWait(driver, 10).until(expected_conditions.presence_of_element_located((By.CSS_SELECTOR, ".h4 mb-10")))#selenium.common.exceptions.TimeoutException: Message:
File "/home/jerzy/.local/lib/python3.8/site-packages/selenium/webdriver/support/wait.py", line 80, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
I have no clue what I am doing wrong. Is it possible to scrape the target with Selenium without using XPaths?
Try one of these XPaths. Find all the matching elements and use indexing to extract the required text.
//div[@class='nodes_chain']/div[1]/h3 # for the "Total nodes" option
//div[@class='nodes_chain']//h3 # for all the options
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Chrome(executable_path="path to chromedriver.exe")
driver.maximize_window()
driver.implicitly_wait(10)
driver.get("https://blockchair.com/dogecoin/nodes")
wait = WebDriverWait(driver,30)
option = wait.until(EC.presence_of_element_located((By.XPATH,"//div[@class='nodes_chain']/div[1]/h3")))
print(option.text)
options = wait.until(EC.presence_of_all_elements_located((By.XPATH,"//div[@class='nodes_chain']//h3")))
# First option
print(options[0].text)
# All the options
for opt in options:
    print(opt.text)
Output:
Total nodes: 1,562
Total nodes: 1,562
Total nodes: 1,562
Node versions
Block heights
If you want to use a CSS selector, it would be
.h4.mb-10
You can also use a single class name (h4 or mb-10) if either one uniquely identifies the element, or an XPath:
//h3[@class="h4 mb-10"]
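The selector ".h4 mb-10" fails because a space is the descendant combinator in CSS: it asks for an <mb-10> element nested inside an element with class h4. A compound class selector needs a dot before every class name and no spaces between them. Here is a small sketch of that conversion; the helper name classes_to_css is hypothetical, and it assumes plain class names that need no CSS escaping.

```python
def classes_to_css(class_attr):
    """Turn a class attribute value such as 'h4 mb-10' into '.h4.mb-10'.

    Hypothetical helper; assumes class names without characters that
    would need escaping in a CSS selector.
    """
    names = class_attr.split()
    if not names:
        raise ValueError("class attribute is empty")
    return "".join("." + name for name in names)


selector = classes_to_css("h4 mb-10")
print(selector)
```

The resulting string is meant for By.CSS_SELECTOR. By.CLASS_NAME expects a single class name, which is why passing "h4 mb-10" to it does not match.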
Try the code below (note that Selenium is not involved in this solution):
import requests
from bs4 import BeautifulSoup
r = requests.get('https://blockchair.com/dogecoin/nodes')
soup = BeautifulSoup(r.content, 'html.parser')
lst = soup.find_all('h3',{"class": "h4 mb-10"})
lst = [h for h in lst if 'Total' in h.text]
print(lst[0].text)
Output:
Total nodes: 1,593
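If the count is needed as a number rather than as display text, the heading can be parsed after it has been scraped. This is a minimal sketch, assuming the text keeps the "Total nodes: 1,593" shape shown above; the helper name parse_total_nodes is hypothetical.

```python
import re


def parse_total_nodes(heading_text):
    """Extract the integer from text like 'Total nodes: 1,593'."""
    match = re.search(r"Total nodes:\s*(\d[\d,]*)", heading_text)
    if match is None:
        raise ValueError(f"unexpected heading text: {heading_text!r}")
    # Drop the thousands separators before converting.
    return int(match.group(1).replace(",", ""))


total = parse_total_nodes("Total nodes: 1,593")
print(total)
```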
Utilizing Python and Selenium, I am attempting to click on the 'Matrix' and 'Transaction Desk' Angular buttons on a specific webpage (see screenshot). I have looked into this for hours now and have not found anything that has worked. It actually seems like there are a lot of unsolved questions still out there around this topic. The source code (see screenshot) does not appear to be within any iframe. The traceback just tells me that it could not find the specific code and times out.
My code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome("C:\\Users\\Matt\\Documents\\WebDriver\\chromedriver_win32\\chromedriver.exe")
driver.get("https://www.stellarmls.com/")
#Password authentication HERE, so you will be unable to access this site yourself.
driver.maximize_window()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="collapse911"]/app-links-panel-body/div/div/div[1]'))).click()
print(driver.page_source)
Traceback:
Traceback (most recent call last):
File "C:\Users\Matt\Documents\Splitt\ROI Property Finder.py", line 53, in <module>
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="collapse911"]/app-links-panel-body/div/div/div[1]'))).click()
File "C:\Users\Matt\Python3.9\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Supplementary screenshots:
iframe:
I'm trying to open google.com and type something in the search bar, but the cookie-consent pop-up always gets in my way, so I have to click the 'I agree' button. I know I have to wait a little before searching for the element, but even when I wait with WebDriverWait() or .implicitly_wait(), the element is not found (I search by XPath and use .click() to press the button).
Spent hours trying to find a solution...
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
def main():
    driver = webdriver.Chrome('chromedriver.exe')
    url = 'https://www.google.com/'
    driver.get(url)
    # The following line is supposed to wait for the button to appear and then click:
    agreeButton = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//*[@id='introAgreeButton']")))
    agreeButton.click()
main()
I should also include the error here:
Traceback (most recent call last):
File "{SCRIPT PATH}", line 49, in <module>
main()
File "{SCRIPT PATH}", line 46, in main
agreeButton = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//*[@id='introAgreeButton']")))
File "C:\ProgramData\Anaconda3\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
I am trying to accept cookies and search on Google, but I am facing a problem I have never faced before.
WebDriver can't find the accept-cookies button element. I have looked at everything. I checked whether something changes the XPath (for example, whether some other element is triggered that alters it), but I found nothing. I tried using accept.send_keys(Keys.RETURN) and accept.click(). Nothing seems to work.
What I have for now:
def GoogleIt():
    path = r"C:\Program Files (x86)\chromedriver.exe"
    driver = webdriver.Chrome(path)
    driver.get("https://google.com")
    wait(5)
    accept = driver.find_element_by_xpath("/html/body/div/c-wiz/div[2]/div/div/div/div/div[2]/form/div/div[2]")
    accept.send_keys(Keys.RETURN)
    accept.click()
    search = driver.find_element_by_xpath("/html/body/div/div[2]/form/div[2]/div[1]/div[1]/div/div[2]/input")
    search.click()
    search.send_keys(talk)
Note: wait() is an alias for time.sleep().
Error:
Traceback (most recent call last):
File "C:\Users\Teknoloji\AppData\Local\Programs\Python\Python38-32\lib\tkinter\__init__.py", line 1883, in __call__
return self.func(*args)
File "C:/Users/Teknoloji/Desktop/Phyton/Assistant V1/Assistant.py", line 20, in GoogleIt
accept = driver.find_element_by_xpath("/html/body/div/c-wiz/div[2]/div/div/div/div/div[2]/form/div/div[2]")
File "C:\Users\Teknoloji\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 394, in find_element_by_xpath
return self.find_element(by=By.XPATH, value=xpath)
File "C:\Users\Teknoloji\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 976, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "C:\Users\Teknoloji\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Users\Teknoloji\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/div/c-wiz/div[2]/div/div/div/div/div[2]/form/div/div[2]"}
(Session info: chrome=85.0.4183.121)
Process finished with exit code -1
If you need more information I am here.
Thanks...
To click the accept-cookies button, which is inside an iframe, you need to switch to the iframe first.
Use WebDriverWait() with frame_to_be_available_and_switch_to_it() and the following CSS selector.
Then use WebDriverWait() with element_to_be_clickable() and the following XPath.
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[src^='https://consent.google.com']")))
WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//div[@id='introAgreeButton']"))).click()
You need the following imports.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
In case KunduK's solution doesn't work for some of those reading this, use By.ID instead of By.CSS_SELECTOR in the frame_to_be_available_and_switch_to_it() method:
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "name_of_the_iframe")))
After the latest update of Selenium, the above solutions did not work for me.
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium import webdriver
from selenium.webdriver.common.by import By
firefox_binary = FirefoxBinary()
browser = webdriver.Firefox(firefox_binary=firefox_binary)
Let's say you want to get this page from Google:
browser.get('https://www.google.com/search?q=cats&source=lnms&tbm=isch')
To continue, you have to click on "Accept all"
Then, you use Selenium to click the button:
browser.find_element(By.XPATH,"//*[@aria-label='Accept all']").click()
Then, you can continue to Google page!
I run the script and it gives me one of two errors:
selenium.common.exceptions.ElementNotInteractableException: Message: Element <a class="grid_size in-stock" href="javascript:void(0);"> could not be scrolled into view
or
Element not found error
Right now this is the error it's giving. Sometimes it works, sometimes it doesn't. I have been trying to change the timing around to get it working right but to no avail. It will not work.
Code:
import requests
from selenium.webdriver.support import ui
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
def get_page(model, sku):
    url = "https://www.footlocker.com/product/model:"+str(model)+"/sku:"+ str(sku)+"/"
    return url
browser = webdriver.Firefox()
page=browser.get(get_page(277097,"8448001"))
browser.find_element_by_xpath("//*[@id='pdp_size_select_mask']").click()
link = ui.WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#size_selection_list")))
browser.find_element_by_css_selector('#size_selection_list').click()
browser.find_element_by_css_selector('#size_selection_list > a:nth-child(8)').click()
browser.find_element_by_xpath("//*[@id='pdp_addtocart_button']").click()
checkout = browser.get('https://www.footlocker.com/shoppingcart/default.cfm?sku=')
checkoutbutton = browser.find_element_by_css_selector('#cart_checkout_button').click()
The website automatically opens the size_selection_list div, so you don't need to click on it. But you do need to wait for the particular list element that you want to select. This code worked for me on this site a couple of times consistently.
from selenium.webdriver.support import ui
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
def get_page(model, sku):
    url = "https://www.footlocker.com/product/model:"+str(model)+"/sku:"+ str(sku)+"/"
    return url
browser = webdriver.Firefox()
page=browser.get(get_page(277097,"8448001"))
browser.find_element_by_xpath("//*[@id='pdp_size_select_mask']").click()
shoesize = ui.WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'a.grid_size:nth-child(8)')))
shoesize.click()
browser.find_element_by_xpath("//*[@id='pdp_addtocart_button']").click()
checkout = browser.get('https://www.footlocker.com/shoppingcart/default.cfm?sku=')
checkoutbutton = browser.find_element_by_css_selector('#cart_checkout_button').click()
I don't have enough reputation to comment, so: can you provide more of the stack trace for the "element not found" error? I stepped through the live Foot Locker site and didn't find the cart_checkout_button, though I may have made a mistake.
I'm also not sure (in your specific example) that you need to be clicking the size_selection_list the first time, before clicking the child, but I'm not as concerned about that. The syntax overall looks OK to me.
edit with the provided stack trace:
Traceback (most recent call last):
File "./footlocker_price.py", line 29, in <module>
browser.find_element_by_css_selector('#size_selection_list > a:nth-child(8)').click()
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webelement.py", line 80, in click
self._execute(Command.CLICK_ELEMENT)
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webelement.py", line 501, in _execute
return self._parent.execute(command, params)
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 308, in execute
self.error_handler.check_response(response)
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotInteractableException: Message: Element could not be scrolled into view
What this means is that the click on #size_selection_list > a:nth-child(8) is failing because the element can't be interacted with directly. Some other element is in the way.
Due to the way the particular page you're interacting with works (which is here for others reading this), I believe the size selection list is simply hidden when the page loads, and is displayed after you click on the Size button.
It seems like the ui.WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#size_selection_list"))) line should do what you want, though, so I'm at a little bit of a loss. If you add in a time.sleep(5) before you click on the size element, does it work? It's not ideal but maybe it will get you moving on to other parts of this.
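A fixed time.sleep(5) always costs the full five seconds and still fails if the page takes longer. An explicit wait instead polls a condition until it holds or a timeout expires. The sketch below illustrates that idea in plain Python. It is a simplified illustration, not Selenium's actual WebDriverWait implementation, and the names wait_until, poll_interval and size_link_ready are made up for the example.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for "the size link has become clickable": true on the third check.
attempts = {"count": 0}


def size_link_ready():
    attempts["count"] += 1
    return attempts["count"] >= 3


wait_until(size_link_ready, timeout=2.0, poll_interval=0.01)
print("ready after", attempts["count"], "checks")
```

The loop returns as soon as the condition holds, so a fast page is not slowed down, while a slow page still gets the full timeout.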