Clicking through multiple links using selenium - python

I'm trying to break a bigger problem into smaller chunks (main question).
I am currently typing a boxer's name into an autocomplete box, selecting the first option that comes up (the boxer's name), then clicking view more until I have a list of all the boxer's fights and the view more button stops appearing.
I then want to build a list of the onclick hrefs, click each one in turn, and get the HTML from each page/fight. Ideally I want to extract the text in particular.
This is the code I have written:
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    page_link = 'http://beta.compuboxdata.com/fighter'
    chromedriver = 'C:\\Users\\User\\Downloads\\chromedriver'
    cdriver = webdriver.Chrome(chromedriver)
    cdriver.maximize_window()
    cdriver.get(page_link)
    wait = WebDriverWait(cdriver,20)
    wait.until(EC.visibility_of_element_located((By.ID,'s2id_autogen1'))).send_keys('Deontay Wilder')
    wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'select2-result-label'))).click()
    while True:
        try:
            element = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'view_more'))).click()
        except TimeoutException:
            break
    # fighters = cdriver.find_elements_by_xpath("//div[@class='row row-bottom-margin-5']/div[2]")
    links = [x.get_attribute('onclick') for x in wait.until(EC.visibility_of_element_located((By.XPATH, "//*[contains(@onclick, 'get_fight_report')]/a")))]
    htmls = []
    for link in links:
        cdriver.get(link)
        htmls.append(cdriver.page_source)
Running this however gives me the error message:
ElementClickInterceptedException Traceback (most recent call last)
<ipython-input-229-1ee2547c0362> in <module>
10 while True:
11 try:
---> 12 element = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'view_more'))).click()
13 except TimeoutException:
14 break
ElementClickInterceptedException: Message: element click intercepted: Element <a class="view_more" href="javascript:void(0);" onclick="_search('0')"></a> is not clickable at point (274, 774). Other element would receive the click: <div class="col-xs-12 col-sm-12 col-md-12 col-lg-12">...</div>
(Session info: chrome=78.0.3904.108)
UPDATE
I have tried looking at a few answers with similar error messages and tried this
    while True:
        try:
            element = cdriver.find_element_by_class_name('view_more')
            webdriver.ActionChains(cdriver).move_to_element(element).click(element).perform()
            # element = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'view_more'))).click()
        except TimeoutException:
            break
    links = [x.get_attribute('onclick') for x in wait.until(EC.visibility_of_element_located((By.XPATH, "//*[contains(@onclick, 'get_fight_report')]/a")))]
    htmls = []
    for link in links:
        cdriver.get(link)
        htmls.append(cdriver.page_source)
but this seems to create some sort of infinite loop at the ActionChains point. Seems to be constantly waiting for the view more href to appear

The click function should already scroll the element into view, so you shouldn't need the action chain (I think). The original error, however, shows that some other element is sitting OVER the view more button.
You may need to remove (or hide) that element from the DOM or, if it's an HTML window, "dismiss" it. Pinpointing the covering element is the key step; after that you can decide on a strategy to uncover the view more button.
Your site http://beta.compuboxdata.com/fighter doesn't seem to be working at the moment, so I can't dig in further.
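For illustration, here is a minimal sketch of two ways to act on that advice. It assumes `driver` is a Selenium WebDriver and that both the button and the covering element have already been located. `js_click` and `hide_element` are hypothetical helper names, not part of Selenium; only `execute_script` is a real WebDriver call. Note that a JavaScript click sidesteps the overlay check rather than fixing the layout, so it does not behave exactly like a real user click.

```python
def js_click(driver, element):
    """Scroll `element` to the middle of the viewport, then click it from JavaScript."""
    driver.execute_script("arguments[0].scrollIntoView({block: 'center'});", element)
    driver.execute_script("arguments[0].click();", element)


def hide_element(driver, element):
    """Hide a covering element so that native clicks can reach what is underneath."""
    driver.execute_script("arguments[0].style.display = 'none';", element)
```

In the question's loop, `js_click(cdriver, element)` could stand in for `element.click()` once the view more link has been found.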

Related

ElementClickInterceptedException: element click intercepted. Other element would receive the click: Selenium Python

I have seen other questions about this error, but in my case the other element should receive the click. In detail: the webdriver scrolls through Google search results and must click every website it finds, but the program is preventing that. How can I make it NOT search the previous site it clicked?
This is the function. The program loops over it; after the first loop it scrolls down and the error occurs:
    def get_info():
        browser.switch_to.window(browser.window_handles[2])
        description = WebDriverWait(browser, 10).until(
            EC.presence_of_element_located((By.TAG_NAME, "h3"))
        ).text
        site = WebDriverWait(browser, 10).until(
            EC.presence_of_element_located((By.TAG_NAME, "cite"))
        )
        site.click()
        url=browser.current_url
        #removes the https:// and the / of the url
        #to get just the domain of the website
        try:
            link=url.split("https://")
            link1=link[1].split("/")
            link2=link1[0]
            link3=link2.split("www.")
            real_link=link3[1]
        except IndexError:
            link=url.split("https://")
            link1=link[1].split("/")
            real_link=link1[0]
        time.sleep(3)
        screenshot=browser.save_screenshot("photos/"+"(" + real_link + ")" + ".png")
        global content
        content=[]
        content.append(real_link)
        content.append(description)
        print(content)
        browser.back()
        time.sleep(5)
        browser.execute_script("window.scrollBy(0,400)","")
        time.sleep(5)
You can keep a list of the websites you have already clicked and check each new link against it before clicking. Here's the demo code:
    clicked_website=[]
    url=browser.current_url
    clicked_website.append(url)
    # Now while clicking
    if <new_url> not in clicked_website:
        <>.click()
This is just an idea of how to implement it. I couldn't follow your code clearly, so you will need to adapt it to your code yourself.
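The same idea can be sketched a little more concretely. This version assumes that "already clicked" should be judged by domain rather than by full URL, which is a guess based on the question's own domain-splitting code. `domain_of` and `should_visit` are hypothetical names, and a set is used instead of a list so that lookups stay cheap.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the host part of `url` without a leading 'www.'."""
    host = urlparse(url).netloc
    return host[4:] if host.startswith("www.") else host


def should_visit(url, visited):
    """Record the domain of `url` in `visited`; return True only the first time it is seen."""
    domain = domain_of(url)
    if domain in visited:
        return False
    visited.add(domain)
    return True
```

With `visited = set()` created once outside the loop, the click would be guarded by `if should_visit(new_url, visited):`, where `new_url` is the address of the result about to be clicked.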

Locating and storing multiple elements in selenium

I need to find and store the locations of some elements so the bot can click on them even if the page changes. I have read online that, for a single element, storing its location in a variable can help; however, I could not find a way to store the locations of multiple elements in Python. Here is my code:
    comment_button = driver.find_elements_by_css_selector("svg[aria-label='Comment']")
    for element in comment_button:
        comment_location = element.location
    sleep(2)
    for element in comment_location:
        element.click()
this code gives out this error:
line 44, in <module>
element.click()
AttributeError: 'str' object has no attribute 'click'
Is there a way to do this so when the page refreshes the script can store the locations and move on to the next location to execute element.click without any errors?
I have tried implementing ActionChains into my code
    comment_button = driver.find_elements_by_css_selector("svg[aria-label='Comment']")
    for element in comment_button:
        ac = ActionChains(driver)
        element.click()
        ac.move_to_element(element).move_by_offset(0, 0).click().perform()
        sleep(2)
        comment_button = driver.find_element_by_css_selector("svg[aria-label='Comment']")
        comment_button.click()
        sleep(2)
        comment_box = driver.find_element_by_css_selector("textarea[aria-label='Add a comment…']")
        comment_box.click()
        comment_box = driver.find_element_by_css_selector("textarea[aria-label='Add a comment…']")
        comment_box.send_keys("xxxx")
        post_button = driver.find_element_by_xpath("//button[@type='submit']")
        post_button.click()
        sleep(2)
        driver.back()
        scroll()
However this method gives out the same error saying that the page was refreshed and the object can not be found.
selenium.common.exceptions.StaleElementReferenceException: Message: The element reference of <svg class="_8-yf5 "> is stale; either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed
Edited:
Assuming the number of such elements does not change after the page refreshes, you can use the code below. It looks each button up again by its position just before clicking, so no stale reference is reused:
    commentbtns = driver.find_elements_by_css_selector("svg[aria-label='Comment']")
    for n in range(1, len(commentbtns)+1):
        # Match only the Comment icons, so the index lines up with the count above
        Path = "(//*[name()='svg'][@aria-label='Comment'])["+str(n)+"]"
        time.sleep(2)
        driver.find_element_by_xpath(Path).click()
You can use more sophisticated ways to wait for the element to load properly. However, for simplicity I have used time.sleep.
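As a rough sketch, the look-up-by-position idea can be wrapped in two small helpers. `nth_match_xpath` and `click_each` are hypothetical names, and `after_click` is a placeholder for whatever has to happen between clicks (posting the comment, going back, scrolling). The string `"xpath"` is the value of Selenium's `By.XPATH`, written out here so the snippet needs no imports.

```python
def nth_match_xpath(base_xpath, n):
    """Build an XPath selecting the n-th (1-based) node matched by `base_xpath`."""
    if n < 1:
        raise ValueError("XPath positions start at 1")
    return "({})[{}]".format(base_xpath, n)


def click_each(driver, base_xpath, after_click=lambda: None):
    """Click every match of `base_xpath`, re-locating each one by position just before the click."""
    count = len(driver.find_elements("xpath", base_xpath))
    for n in range(1, count + 1):
        # A fresh lookup each time avoids reusing a reference that went stale
        driver.find_element("xpath", nth_match_xpath(base_xpath, n)).click()
        after_click()
    return count
```

For the Comment icons, `base_xpath` could be `"//*[name()='svg'][@aria-label='Comment']"`; like the answer, this only works if the number and order of matches stay the same after each refresh.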

Element click intercepted selenium

I am trying to click view more until the end of the page (until the view more option no longer appears), but I get this error message:
ElementClickInterceptedException
     14 wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'select2-result-label'))).click()
     15 while True:
---> 16     wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'view_more'))).click()
     17     try:
     18         element = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'view_more')))
ElementClickInterceptedException: Message: element click intercepted: Element is not clickable at point (231, 783)
(Session info: chrome=78.0.3904.108)
This is the code I have
    while True:
        wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'view_more'))).click()
        try:
            element = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'view_more')))
            element.click()
        except TimeoutException:
            break
Here is the html from the site
<a class="view_more" href="javascript:void(0);" onclick="_search('0')">VIEW MORE ...</a>
This is the website
page_link = 'http://beta.compuboxdata.com/fighter'
Firstly, why do you have the same line of code duplicated in two different ways?
    wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'view_more'))).click()
is functionally equivalent to:
    try:
        element = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'view_more')))
        element.click()
    except TimeoutException:
        break
save for handling the timeout exception.
I would just remove the first line, as I don't see the point in it.
On to your actual issue, see the answer to this Stack Overflow question: https://stackoverflow.com/a/44916498/3715974
My best guess is that a JavaScript/AJAX call is still loading content onto the page, so the view more button isn't clickable immediately and the click fails. Read through that answer, as it may give you more insight, but you could also try simply catching that exception, waiting for a short period of time and trying again.
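A minimal sketch of the catch, wait and retry suggestion follows. `click_with_retry` is a hypothetical helper, and the attempt count and delay are arbitrary defaults. The exception types are passed in rather than imported so the snippet stands alone; in real use `retry_on` would be something like `(ElementClickInterceptedException,)` from `selenium.common.exceptions`.

```python
import time


def click_with_retry(get_element, retry_on, attempts=5, delay=0.5):
    """Call get_element().click(), retrying after `delay` seconds on the given exceptions.

    Returns True on the first successful click and re-raises the last
    exception once all `attempts` tries have failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            # Look the element up again on every try in case the page re-rendered it
            get_element().click()
            return True
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay)
```

Here `get_element` could be `lambda: wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'view_more')))`. A `TimeoutException` is not in `retry_on`, so it would still propagate and can keep ending the outer loop as before.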

How can I check if an element exists on a page using Selenium XPath?

I'm writing a script to do some web scraping on my Firebase for a few select users. After accessing the events page for a user, I first want to check for the condition that no events have been logged by that user.
For this, I am using Selenium and Python. Using XPath seems to work fine for locating links and navigation in all other parts of the script, except for accessing elements in a table. At first, I thought I might have been using the wrong XPath expression, so I copied the path directly from Chrome's inspection window, but still no luck.
As an alternative, I have tried to copy the page source and pass it into Beautiful Soup, and then parse it there to check for the element. No luck there either.
Here's some of the code, and some of the HTML I'm trying to parse. Where am I going wrong?
    # Using WebDriver - always triggers an exception
    def check_if_user_has_any_data():
        try:
            time.sleep(10)
            element = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, '//*[@id="event-table"]/div/div/div[2]/mobile-table/md-whiteframe/div[1]/ga-no-data-table/div')))
            print(type(element))
            if element == True:
                print("Found empty state by copying XPath expression directly. It is a bit risky, but it seems to have worked")
            else:
                print("didn’t find empty state")
        except:
            print("could not find the empty state element", EC)

    # Using Beautiful Soup
    def check_if_user_has_any_data_2():
        time.sleep(10)
        html = driver.execute_script("return document.documentElement.outerHTML")
        soup = BeautifulSoup(html, 'html.parser')
        print(soup.text[:500])
        print(len(soup.findAll('div', {"class": "table-row-no-data ng-scope"})))
HTML
<div class="table-row-no-data ng-scope" ng-if="::config" ng-class="{overlay: config.isBuilderOpen()}">
<div class="no-data-content layout-align-center-center layout-row" layout="row" layout-align="center center">
<!-- ... -->
</div>
The first version triggers the exception, whereas I expected 'element' to evaluate as True. In practice, the element is not found.
The second version prints the first 500 characters (correctly, as far as I can tell), but it returns '0'. After inspecting the page source, I expected it to return '1'.
Use the following code:
    elements = driver.find_elements_by_xpath("//*[@id='event-table']/div/div/div[2]/mobile-table/md-whiteframe/div[1]/ga-no-data-table/div")
    if len(elements) > 0:
        # Element is present. Do your action
        pass
    else:
        # Element is not present. Do alternative action
        pass
Note: find_elements does not throw an exception when nothing matches; it returns an empty list.
Here is the method I generally use.
Imports
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By
Method
    def is_element_present(self, how, what):
        try:
            self.driver.find_element(by=how, value=what)
        except NoSuchElementException as e:
            return False
        return True
Some things load dynamically. It is better to wait for the element with a timeout and handle the timeout exception than to check for it only once.
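To make the timeout idea concrete, here is a plain-Python polling loop. It is a stand-in that mirrors the idea behind Selenium's `WebDriverWait`, not Selenium's own implementation; `wait_until` is a hypothetical name and the default timeout and polling interval are arbitrary.

```python
import time


def wait_until(predicate, timeout=10.0, poll=0.5):
    """Poll `predicate` until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value, or None if the deadline is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

For example, `wait_until(lambda: driver.find_elements("xpath", some_xpath))` would return the list of matches as soon as it is non-empty, or None once the timeout expires; `some_xpath` is a placeholder for your own expression.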
If you're using Python and Selenium, you can use this:
    try:
        driver.find_element_by_xpath("<Full XPath expression>") # Test whether the element exists
        # <Other code>
    except:
        # <Run these if element doesn't exist>
        pass
I've solved it. The page had a bunch of different iframe elements, and I didn't know that one had to switch between frames in Selenium to access those elements.
There was nothing wrong with the initial code, or the suggested solutions which also worked fine when I tested them.
Here's the code I used to test it:
    # Time for the page to load
    time.sleep(20)
    # Find all iframes
    iframes = driver.find_elements_by_tag_name("iframe")
    # From inspecting page source, it looks like the index for the relevant iframe is [0]
    x = len(iframes)
    print("Found ", x, " iFrames") # Should return 5
    driver.switch_to.frame(iframes[0])
    print("switched to frame [0]")
    if WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, '//*[@class="no-data-title ng-binding"]'))):
        print("Found it in this frame!")
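If you don't know in advance which frame holds the element, a sketch like the following can try the top-level document and then each iframe in turn. `find_in_frames` is a hypothetical helper; it only looks one level deep (no nested iframes) and does not wait for content to load. The strings `"tag name"` and `"xpath"` are the values of Selenium's `By.TAG_NAME` and `By.XPATH`.

```python
def find_in_frames(driver, by, value):
    """Search the top-level document, then each iframe, for elements matching (by, value).

    Returns (frame_index, elements). frame_index is None for the top-level
    document, and (None, []) means nothing matched anywhere. On a match inside
    an iframe, the driver is left switched into that frame so the elements stay usable.
    """
    driver.switch_to.default_content()
    found = driver.find_elements(by, value)
    if found:
        return None, found
    frame_count = len(driver.find_elements("tag name", "iframe"))
    for index in range(frame_count):
        # Switching by index avoids holding on to iframe element references
        driver.switch_to.frame(index)
        found = driver.find_elements(by, value)
        if found:
            return index, found
        driver.switch_to.default_content()
    return None, []
```

Remember to call `driver.switch_to.default_content()` afterwards if the rest of the script works on the main page.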
Check the length of the element you are retrieving with an if statement.
Example:
    element = ('https://www.example.com')
    if len(element) > 1:
        # Do something.
        pass

web scraping contact list selenium python

How can I loop through the contacts in a group in Discord using Selenium in Python?
I tried this code, and I get this error:
selenium.common.exceptions.StaleElementReferenceException: Message: The element reference of is stale; either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed
The problem is that the scroller and the contacts are constantly updating...
I tried this code:
    while True:
        num=0
        try:
            users_list = driver.find_elements_by_css_selector("div.memberOnline-1CIh-0.member-3W1lQa")
            for user in users_list:
                num+=1
                user.click()
                driver.execute_script("arguments[0].scrollIntoView();",user)
                print('User number {}'.format(num))
        except StaleElementReferenceException and ElementClickInterceptedException:
            print('bad')
            driver.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight",users_list)
From the code you have given, you only scroll the element, so the stale exception most likely occurs because you do not wait for the page, or at least the contacts, to finish loading.
For debugging purposes, you can simply add a long sleep before the loop, like sleep(15), and replace it with an explicit wait in production code, like
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "myDynamicElement"))
    )
See the documentation on explicit waits for details.
If you call click() in a loop, you need to find the elements again inside the loop:
    while True:
        num=0
        try:
            time.sleep(15)
            users_list = driver.find_elements_by_css_selector("div.memberOnline-1CIh-0.member-3W1lQa")
            length = len(users_list)
            for num in range(0, length):
                user = users_list[num]
                user.click()
                time.sleep(15)
                driver.execute_script("arguments[0].scrollIntoView();",user)
                print('User number {}'.format(num+1))
                # The click above changes the page, so Selenium treats it as a
                # new page: element references found on the old page no longer
                # work on the new one. Find users_list again on the new page.
                users_list = driver.find_elements_by_css_selector("div.memberOnline-1CIh-0.member-3W1lQa")
        # `except A and B` would only catch B, so use a tuple to catch both
        except (StaleElementReferenceException, ElementClickInterceptedException):
            print('bad')
            driver.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight",
                                  users_list)
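Because the member list keeps changing, positions in the list may shift between clicks. One possible variation, sketched below, re-queries the list after every action and remembers which contacts were already handled by a key. `visit_all`, `find_all`, `key_of` and `handle` are all hypothetical names, and the sketch assumes each contact has some stable key, such as its visible text, which may not hold on the real page.

```python
def visit_all(find_all, key_of, handle, max_items=500):
    """Handle each distinct item once, re-querying the list after every action.

    find_all() returns the currently visible items, key_of(item) gives a
    stable identifier, and handle(item) performs the click or scroll.
    Returns the set of keys that were handled.
    """
    seen = set()
    for _ in range(max_items):
        # Fresh lookup every round, so no reference from an old render is reused
        fresh = next((item for item in find_all() if key_of(item) not in seen), None)
        if fresh is None:
            break
        seen.add(key_of(fresh))
        handle(fresh)
    return seen
```

With Selenium, `find_all` could wrap the `find_elements_by_css_selector` call from the question and `key_of` could be `lambda el: el.text`. An element can still go stale between the lookup and the click, so the exception handling from the answer remains useful around `handle`.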
