How can I make the browser wait before clicking on a link? I'm scraping a dynamic webpage; the content is loaded dynamically, but I managed to get the form id. The only problem is that the submit button is only displayed after 2-3 seconds, while my Firefox driver clicks on the link immediately after the page is loaded (without the dynamic part).
Is there any way to make the browser wait 2-3 seconds until the submit button appears? I tried time.sleep(), but it pauses everything: the submit button doesn't appear during time.sleep() and only shows up 2-3 seconds after it ends.
You can set a wait like the following (imports shown once for both):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Explicit wait:
element = WebDriverWait(driver, 20).until(
    EC.presence_of_element_located((By.ID, "myElement"))
)
Implicit wait:
driver.implicitly_wait(20)  # seconds
driver.get("Your-URL")
myElement = driver.find_element_by_id("myElement")
You can use either of the above; both are valid.
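Applied to your case, a minimal sketch of the explicit-wait approach (the "submit" id is a placeholder; use whatever locator your form's button actually has):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("Your-URL")
# Poll for up to 5 seconds until the dynamically rendered submit
# button shows up, then click it; no fixed sleep needed.
submit = WebDriverWait(driver, 5).until(
    EC.presence_of_element_located((By.ID, "submit"))  # placeholder id
)
submit.click()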
You need to use Selenium Waits.
In particular, the element_to_be_clickable expected condition fits this case better than the others:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
element = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.ID, "myDynamicElement"))
)
element.click()
where driver is your webdriver instance and 10 is the maximum number of seconds to wait for the element. With this setup, Selenium tries to locate the element every 500 milliseconds, for 10 seconds in total, and throws a TimeoutException if the element is still not found after 10 seconds.
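The poll interval and the exceptions ignored while polling can also be tuned through WebDriverWait's constructor; a minimal sketch, reusing the imports above:
from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.support.ui import WebDriverWait

# Poll every 0.2 s instead of the default 0.5 s, and keep polling
# through StaleElementReferenceException in addition to the default
# NoSuchElementException.
wait = WebDriverWait(driver, 10, poll_frequency=0.2,
                     ignored_exceptions=[StaleElementReferenceException])
element = wait.until(EC.element_to_be_clickable((By.ID, "myDynamicElement")))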
Related
I am currently scraping the website https://glasschain.org (to be precise, let's focus on an example: https://glasschain.org/btc/wallet/100478015). There is a tab named 'Addresses', and I want to go to the X-th page (in this example, page 5,000) out of the over 12,000 available (to parallelize some operations rather than doing them iteratively). Done manually, this requires clicking the ... button, typing a number, and pressing ENTER. I use Python and the selenium library to do it automatically, and I have managed the clicking, but I can't send a value to the field that appears after my click (after clicking the button, the page source shows no new input field to send the value to, so I tried sending it to the only existing element, the previously clicked button). Can anyone help me with that? :)
My code:
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--start-maximized")
driver = webdriver.Chrome(path, options=chrome_options)
driver.get('https://glasschain.org/btc/wallet/100478015')
time.sleep(5)
adr_button = driver.find_elements_by_id('tab-Addresses')[1]
driver.execute_script("arguments[0].click();", adr_button)
time.sleep(5)
dots_button = driver.find_elements_by_class_name('ellipse')[0]
driver.execute_script("arguments[0].click();", dots_button)
dots_button.send_keys(5000)  # raises the exception below
My error:
selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable (Session info: chrome=110.0.5481.77)
Version of selenium I am working with: 3.141.0
I don't have your specific version of selenium, but here's a solution that works with a more recent version, selenium 4.7.2:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
from time import sleep
s = Service('bin/chromedriver.exe')
chrome_options = Options()
chrome_options.add_argument("--start-maximized")
driver = webdriver.Chrome(service=s, options=chrome_options)
driver.get('https://glasschain.org/btc/wallet/100478015')
delay = 3 # seconds
try:
    adr_button = WebDriverWait(driver, delay).until(ec.presence_of_element_located((By.ID, 'tab-Addresses')))
except TimeoutException:
    print("Loading timed out, exiting...")
    exit()
# click the GDPR cookies accept button, if that hasn't happened yet
# (find_elements returns an empty list instead of raising when absent)
gdpr_button = driver.find_elements(by='id', value='gdpr')
if gdpr_button:
    driver.execute_script("gdpr('accept');")
driver.execute_script("arguments[0].click();", adr_button)
try:
    dots_button = WebDriverWait(driver, delay).until(ec.presence_of_element_located((By.CLASS_NAME, 'ellipse')))
except TimeoutException:
    print("Loading timed out, exiting...")
    exit()
# click the dots button, to make the input appear
driver.execute_script("arguments[0].click();", dots_button)
# select the input, modify the value and then make it lose focus
dots_input = dots_button.find_element('xpath', "//span[@class='ellipse clickable']/input")
print(dots_input)
driver.execute_script("arguments[0].setAttribute('value','5000');", dots_input)
driver.execute_script("arguments[0].blur();", dots_input)
# just so you can see before the driver closes the browser again
sleep(10)
Note that, instead of waiting a number of seconds, the script simply waits until a specific element is loaded and then continues. There's still a delay defined, but this is now the maximum amount of time the script will wait for the element to load, instead of just waiting a fixed amount of time.
Once the button is clicked, a nested input element appears, which is selected and then JavaScript is executed to change its value and cause it to lose focus (blur), so that the new value gets applied.
(note that I have the Chrome driver in a subfolder next to the script, called bin)
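If you'd rather mimic the manual flow (type the number, press ENTER), the nested input can also be driven with send_keys once the dots button has been clicked; a hedged alternative sketch (the relative XPath to the nested input is an assumption about the markup):
from selenium.webdriver.common.keys import Keys

# Locate the input nested inside the clicked button (assumed markup).
dots_input = dots_button.find_element('xpath', ".//input")
dots_input.send_keys("5000")
dots_input.send_keys(Keys.RETURN)  # confirm, like pressing ENTER manually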
I am trying to web-scrape LinkedIn Sales Navigator, but the page only loads further elements if I scroll down and wait a bit. I tried to handle this the following way, but it only added 3 more elements, so I get 6 out of 25.
profile_url = "https://www.linkedin.com/sales/search/people?query=(recentSearchParam%3A(doLogHistory%3Atrue)%2Cfilters%3AList((type%3AREGION%2Cvalues%3AList((id%3A102393603%2Ctext%3AAsia%2CselectionType%3AINCLUDED)))))&sessionId=fvwzsZ8CTAKdGri5T5mYZw%3D%3D"
driver.get(profile_url)
time.sleep(20)
action = webdriver.ActionChains(driver)
to_scroll = driver.find_element(By.XPATH, "//button[@id='ember105']")
to_scroll_up = driver.find_element(By.XPATH, "//button[@id='ember141']")
action.move_to_element(to_scroll)
action.perform()
time.sleep(3)
action.move_to_element(to_scroll_up)
action.perform()
time.sleep(3)
action.move_to_element(to_scroll)
action.perform()
time.sleep(3)
How can I solve this?
You can do this by using an explicit wait, as it will make sure the element is visible in the UI. Identify an element that only gets loaded once the page is complete and replace XPATH_of_element with its XPath. Add this code before extracting the data so it works properly.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

WebDriverWait(driver, 15).until(EC.visibility_of_element_located((By.XPATH, 'XPATH_of_element')))
You can visit this blog by Shreya Bose for more information.
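Since Sales Navigator lazy-loads results as you scroll, you will likely need to combine the wait with repeated scrolling until no new rows appear. A rough sketch of that idea (the result-row XPath is a placeholder, not the real Sales Navigator markup):
import time
from selenium.webdriver.common.by import By

# Placeholder selector for one result row; replace with the real one.
ROW_XPATH = "//li[contains(@class, 'search-results')]"

last_count = 0
while True:
    rows = driver.find_elements(By.XPATH, ROW_XPATH)
    if len(rows) == last_count:
        break  # no new rows appeared after the last scroll
    last_count = len(rows)
    # Scroll the last row into view to trigger loading the next batch.
    driver.execute_script("arguments[0].scrollIntoView();", rows[-1])
    time.sleep(2)  # give the page a moment to fetch more results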
I'm trying to find an element on Yandex.ru through selenium and click on it.
The code runs, but the click does not happen; I'm assuming selenium doesn't see the element.
def start_bot():
    for i in req:
        browser.get('https://yandex.ru/search/?lr=65&text=' + i)
        time.sleep(2)
        print(browser.find_element(By.XPATH, '//*[@id="search-result"]/li[1]/div/div[2]/div[2]').click())

start_bot()
Referring to the parent class is not an option; how can I solve this problem?
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
# Assuming browser is already defined
try:
    max_delay = 10  # seconds
    browser.get('https://yandex.ru/search/?lr=65&text=1')
    button = WebDriverWait(browser, max_delay).until(
        EC.element_to_be_clickable((By.XPATH, '//*[@id="search-result"]/li[1]/div/div[2]/div[2]'))
    )
    button.click()
except TimeoutException:
    print('Timeout: Element not found or clickable')
Instead of waiting for a fixed delay (2 seconds) as you have done, this code will wait for the element specified by the XPath to be clickable. If the element is not clickable after max_delay (10 seconds), an error of type TimeoutException will be thrown.
You can easily adapt this code to open multiple urls in a loop (as you have done in your code).
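For instance, a minimal sketch of that loop, building on the code above and assuming req is your iterable of search terms as in your original code:
for i in req:
    try:
        browser.get('https://yandex.ru/search/?lr=65&text=' + i)
        button = WebDriverWait(browser, max_delay).until(
            EC.element_to_be_clickable((By.XPATH, '//*[@id="search-result"]/li[1]/div/div[2]/div[2]'))
        )
        button.click()
    except TimeoutException:
        # keep going with the next query instead of aborting the run
        print('Timeout: Element not found or clickable for', i)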
PS: Please fix your code's formatting (the function call start_bot() is rendered as text, not code).
I'm using Selenium with the Chrome driver to scrape pages that contain SVGs.
I need a way to make Selenium wait until the SVG is completely loaded, otherwise I get incomplete charts when I scrape.
For the moment the script waits for 10 seconds before it starts scraping, but that is far too long when scraping 20,000 pages.
def page_loaded(driver):
    path = "//*[local-name() = 'svg']"
    time.sleep(10)
    return driver.find_element_by_xpath(path)

wait = WebDriverWait(self.driver, 10)
wait.until(page_loaded)
Is there any efficient way to check whether the SVG is loaded before starting to scrape?
An example from Selenium documentation:
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 10)
element = wait.until(EC.element_to_be_clickable((By.ID, 'someid')))
So in your case it should be:
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(self.driver, 10)
element = wait.until(EC.presence_of_element_located((By.XPATH, path)))
Here, the 10 in WebDriverWait(driver, 10) is the maximum number of seconds to wait, i.e. it waits until either 10 seconds have passed or the condition is met, whichever comes first.
Some common conditions that are frequently of use when automating web browsers:
title_is
title_contains
presence_of_element_located
visibility_of_element_located
visibility_of
presence_of_all_elements_located
text_to_be_present_in_element
text_to_be_present_in_element_value
etc.
More available here.
Also here's the documentation for expected conditions support.
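For example, a short sketch combining two of the conditions listed above, applied to this question's SVG case (the URL and the expected title text are placeholders):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://example.com")  # placeholder URL
wait = WebDriverWait(driver, 10)

wait.until(EC.title_contains("Example"))  # placeholder title text
# presence_of_all_elements_located returns the matching elements
charts = wait.until(EC.presence_of_all_elements_located(
    (By.XPATH, "//*[local-name() = 'svg']")))
print(len(charts), "SVG elements found")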
Another way you can tackle this is to write your own method, like:
def find_svg(driver):
    path = "//*[local-name() = 'svg']"
    # find_elements returns an empty (falsy) list when nothing matches,
    # so WebDriverWait keeps polling instead of raising an exception
    elements = driver.find_elements_by_xpath(path)
    if elements:
        return elements[0]
    else:
        return False
And then call WebDriverWait like:
element = WebDriverWait(driver, max_secs).until(find_svg)
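WebDriverWait.until() accepts any callable that takes the driver and returns a truthy value, so the same idea can also be written inline (max_secs being your chosen timeout):
elements = WebDriverWait(driver, max_secs).until(
    # an empty list is falsy, so the wait keeps polling until
    # at least one SVG element exists
    lambda d: d.find_elements_by_xpath("//*[local-name() = 'svg']")
)
svg = elements[0]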
I'm running Selenium in a Python IDE with geckodriver.
The site I'm trying to open has a 30-second timer; after those 30 seconds a button appears, and I send a click to it.
What I'm asking is the following:
Can I somehow ignore/skip/speed up the waiting time?
Right now what I'm doing is the following:
driver = webdriver.Firefox()
driver.get("SITE_URL")
sleep(30)
driver.find_element_by_id("proceed").click()
This is very inefficient, because every time I run the code to do some tests I need to wait.
Thanks in advance, Avi.
UPDATE:
I haven't found a way around the obstacle, but until I do, I'm focusing on the next achievable step:
<video class="jw-video jw-reset" disableremoteplayback="" webkit-playsinline="" playsinline="" preload="metadata" src="//SITE.SITE.SITE/SITE/480/213925.mp4?token=jbavPPLqNqkQT1SEUt4crg&time=1525458550" style="object-fit: fill;"></video>
(censored site's name)
In each page there is a video, all the videos are under the class "jw-video jw-reset"
I had trouble finding the element by class name, so I used:
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "video[class='jw-video jw-reset']")))
It works, but I can't figure out how to select the element's src...
As per your code trial, you can remove the sleep(30) and instead induce a WebDriverWait for the element to be clickable, as follows:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
# lines of code
WebDriverWait(driver, 30).until(EC.element_to_be_clickable((By.ID, "proceed"))).click()
Note: configure the WebDriverWait instance with the maximum time limit for your use case. The expected_conditions method element_to_be_clickable() returns the WebElement as soon as the element is visible and enabled, so that you can click it.
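As for your update: once the wait returns the video element, its src can be read with get_attribute(); a minimal sketch, reusing the CSS selector from your update:
video = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, "video[class='jw-video jw-reset']"))
)
print(video.get_attribute("src"))  # the video URL from the src attribute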