Locating an element using Python and Selenium via innerHTML


I'm new to Selenium and I'm trying to write my first real script using the package for Python.
I'm using:
Windows 10
Python 3.10.5
Selenium 4.3.0
So far I've been able to do everything I need with different selectors, like ID, name, XPATH etc.
However I've stumbled upon an issue where I need to find a specific element by using the innerHTML of it.
The issue I'm facing is that I need to find an element with the innerHTML-value of "Changed" as seen in the HTML below.
The first challenge I'm facing is that the element doesn't have a unique ID, name or other attribute to identify it, and there are many "dlx-treeview-node" elements.
The second challenge is that an absolute XPATH won't work because the element changes position depending on where you are on the website (the number of "dlx-treeview-node" elements changes), so if I use an absolute XPATH I'll get the wrong element depending on where I am.
I can successfully get the name by using the below XPATH, "get_attribute" and printing to console, which is why I know it's innerHTML and not innerText, but as mentioned this will change depending on where I am on the website.
I would really appreciate any help I can get to solve this challenge and to learn more about the use of Selenium with Python.
Code trials:
select_filter_name = wait.until(EC.element_to_be_clickable((By.XPATH, "/html/body/div/app-root/dlx-select-filter-attribute-dialog/dlx-dialog-window/div/div[2]/div/div/div[5]/div/div/dlx-view-column-selector-component/div[1]/dlx-treeview/div/dlx-treeview-nodes/div/dlx-treeview-nodes/div/dlx-treeview-node[16]/div/div/div/div[2]/div/dlx-text-truncater/div")))
filter_name = select_filter_name.get_attribute("innerHTML")
print(filter_name)
HTML:
<dlx-treeview-node _nghost-nrk-c188="" class="ng-star-inserted">
<div _ngcontent-nrk-c188="" dlx-droppable="" dlx-draggable="" dlx-file-drop="" class="d-flex flex-column position-relative dlx-hover on-hover-show-expandable-menu bg-control-active bg-control-hover">
<div _ngcontent-nrk-c188="" class="d-flex flex-row ml-2">
<div _ngcontent-nrk-c188="" class="d-flex flex-row text-nowrap expand-horizontal" style="padding-left: 15px;">
<!---->
<div _ngcontent-nrk-c188="" class="d-flex align-self-center ng-star-inserted" style="min-width: 16px; margin-left: 3px;">
<!---->
</div>
<!---->
<div _ngcontent-nrk-c188="" class="d-flex flex-1 flex-no-overflow-x" style="padding: 3.5px 0px;">
<div class="d-flex flex-row justify-content-start flex-no-overflow-x align-items-center expand-horizontal ng-star-inserted">
<!---->
<dlx-text-truncater class="overflow-hidden d-flex flex-no-overflow-x ng-star-inserted">
<div class="text-truncate expand-horizontal ng-star-inserted">Changed</div>
<!---->
<!---->
</dlx-text-truncater>
<!---->
</div>
<!---->
<!---->
<!---->
</div>
</div>
<!---->
<!---->
</div>
</div>
<!---->
<dlx-attachment-content _ngcontent-nrk-c188="">
<div style="position: fixed; z-index: 10001; left: -10000px; top: -10000px; pointer-events: auto;">
<!---->
<!---->
</div>
</dlx-attachment-content>
</dlx-treeview-node>
Edit-1:
NOTE: I'm not sure I'm using the correct terms for HTML, so please correct me if I'm wrong.
I've learned that I have a follow up question:
How do I search for the text as described, but only searching in the "dlx-treeview-node" (there's about 100 of these)? So basically searching in the "children" of these.
The question is because I've learned that there are more elements with the specific text I'm searching for in other places.
Edit-2/solution:
I ended up finding my own solution before I received answers - I'm writing it here in case it can help anyone else.
The reply that is marked as "answer" is because this came the closest to what I needed.
The final code ended up like this (first searching the nodes - then searching the children for the specific innerHTML):
select_filter_name = wait.until(EC.element_to_be_clickable((By.XPATH, "//dlx-treeview-node[.//div[text()='Changed']]")))
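If the text to search for varies, the locator from the solution above can be built by a small helper. This is only a sketch: xpath_literal and node_containing_text are hypothetical names, not part of Selenium, and normalize-space() is used in place of text() so that surrounding whitespace in the markup does not break the match.

```python
def xpath_literal(text: str) -> str:
    """Quote text as an XPath 1.0 string literal."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types are present. XPath 1.0 has no escape syntax,
    # so the pieces are joined with concat().
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


def node_containing_text(node_tag: str, child_tag: str, text: str) -> str:
    """XPath for a <node_tag> with a descendant <child_tag> whose trimmed text equals text."""
    return f"//{node_tag}[.//{child_tag}[normalize-space()={xpath_literal(text)}]]"


locator = node_containing_text("dlx-treeview-node", "div", "Changed")
print(locator)
# //dlx-treeview-node[.//div[normalize-space()='Changed']]
```

The resulting string can be passed to the same call as in the question, for example wait.until(EC.element_to_be_clickable((By.XPATH, locator))).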

Presuming the text Changed of the <div> element is unique within the HTML DOM, you can locate the element with either of the following XPath-based locator strategies:
Using xpath and text():
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[text()='Changed']")))
Using xpath and contains():
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[contains(., 'Changed')]")))

Run this code on your page and you will get a list of XPaths, one for every <div> whose text is Changed.
# Define the helper functions and collect the XPaths in a single execute_script() call
# (functions declared in one call are not visible in the next one).
xpaths = driver.execute_script("""
function getElementIdx(elt) {
  var idx = 1;
  for (var sib = elt.previousElementSibling; sib; sib = sib.previousElementSibling) {
    if (sib.tagName == elt.tagName) idx++;
  }
  return idx;
}
function getXPathOfElement(elt) {
  var path = "";
  for (; elt && elt.nodeType == 1; elt = elt.parentNode) {
    var idx = getElementIdx(elt);
    var xname = elt.tagName.toLowerCase();
    if (idx > 1) xname += "[" + idx + "]";
    path = "/" + xname + path;
  }
  return path;
}
return Array.from(document.querySelectorAll("div"))
  .filter(el => el.textContent.trim() === "Changed")
  .map(node => getXPathOfElement(node));
""")
Write-up
The script first defines getElementIdx, which returns the 1-based position of a node among its siblings with the same tag name, and getXPathOfElement, which accepts an HTML node and builds the XPath string for it.
It then collects every <div> whose text is Changed and calls getXPathOfElement on each one, so you get back a list of strings where each string is an XPath.
The JS is quite simple and harmless.
Tips
Check that len(xpaths) is at least 1.
Index the list, such as xpaths[0], or loop over it to make your changes.
You will now have an XPath which can be used like a normal selector.
Good luck
Edit 1
execute_script() synchronously executes JavaScript in the current window/frame.

Related

XPATH target div and image in loop?

Here's the document structure:
<div class="search-results-container">
<div>
<div class="feed-shared-update-v2">
<div class="update-components-actor">
<div class="update-components-actor__image">
<img class="presence-entity__image" src="https://www.testimage.com/test.jpg"/>
<span></span>
<span>test</span>
</div>
</div>
</div>
</div>
<div>
<div class="feed-shared-update-v2">
<div class="update-components-actor">
<div class="update-components-actor__image">
<img class="presence-entity__image" src="https://www.testimage.com/test.jpg"/>
<span></span>
<span>test</span>
</div>
</div>
</div>
</div>
</div>
not sure the best way to do this but hoping someone can help. I have a for loop that grabs all the divs that precede a div with class "feed-shared-update-v2". This works:
elements = driver.find_elements(By.XPATH, "//*[contains(@class, 'feed-shared-update-v2')]//preceding::div[1]")
I then run a for loop over it:
for card in elements:
however i'm having trouble trying to target the img and the second span in these for loops. I tried:
for card in elements:
    profilePic = card.find_element(By.XPATH, ".//following::div[@class='update-components-actor']//following::img[1]").get_attribute('src')
    text = card.find_element(By.XPATH, ".//following::div[@class='update-components-text']//following::span[2]").text
but this produces an error saying:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":".//following::div[@class='update-components-actor']//following::img[1]"}
so I'm hoping someone can point me in the right direction as to what i'm doing wrong. I know its my xpath syntax and i'm not allowed to chain "followings" (although even just trying .//following doesn't work, so is ".//" not the right syntax?) but i'm not sure what the right syntax should be, especially since the span does not have a class. :(
Thanks!

I guess you are overusing the following:: axis. Simply try the following (no pun intended):
For your first expression use
//*[contains(@class, 'feed-shared-update-v2')]/..
This will select the parent <div> of the <div class="feed-shared-update-v2">. So you will select the whole surrounding element.
To retrieve the children you want, use these XPaths: .//img/@src and .//span[2]. Full code is
for card in elements:
    profilePic = card.find_element(By.XPATH, ".//img").get_attribute('src')
    text = card.find_element(By.XPATH, ".//span[2]").text
That's all. Hope it helps.
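To check the relative paths from this answer without a browser, here is a minimal sketch that runs them against the sample markup from the question using Python's xml.etree.ElementTree. Treat it as a stand-in only: ElementTree supports a subset of XPath (there is no contains()), so the class is matched exactly here, and .get("src") and .text take the place of Selenium's get_attribute('src') and .text.

```python
import xml.etree.ElementTree as ET

# One card as in the question's markup; the sample page repeats it twice.
CARD = """
  <div>
    <div class="feed-shared-update-v2">
      <div class="update-components-actor">
        <div class="update-components-actor__image">
          <img class="presence-entity__image" src="https://www.testimage.com/test.jpg"/>
          <span></span>
          <span>test</span>
        </div>
      </div>
    </div>
  </div>
"""
root = ET.fromstring(f'<div class="search-results-container">{CARD}{CARD}</div>')

# "/.." steps up to the parent <div> that wraps each feed-shared-update-v2 block.
cards = root.findall(".//div[@class='feed-shared-update-v2']/..")

results = []
for card in cards:
    # The leading "." keeps each search inside the current card.
    profile_pic = card.find(".//img").get("src")
    text = card.find(".//span[2]").text  # second <span> under its parent
    results.append((profile_pic, text))

print(results)
```

Each card yields its own image URL and the text of its second span, which is the behaviour the question's loop was after.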

It seems there is no div with the class update-components-text in your HTML.
Did you mean update-components-actor?
I'm not much of a fan of XPath, but when I copied your HTML and img selector, it did find 2 img elements. Maybe you are not waiting for the element to load, and then it fails?
Try using implicit/explicit waits in your code.
I know you are using XPath, but consider using CSS.
This might do the trick:
.feed-shared-update-v2 span:nth-of-type(2)
And if you want a css of the img:
.feed-shared-update-v2 img

Selenium + Python: Print the text attribute of an element

I would like to navigate through a website, find an element and print it.
Python version: 3.10; Selenium Webdriver: Firefox; IDE: PyCharm 2021.3.2 (CE);
OS: Fedora 35 VM
I am able to navigate to the appropriate page where the text is generated in a drop down menu.
When I locate the element by CSS Selector and attempt to print it, the output does print the text "None".
I would like it to print the Plan Name which in this case is "Dual Complete Plan 1".
The element is not always present so I also need to catch any exceptions.
The relevant HTML code of the element I am trying to print:
<span class="OSFillParent" data-expression="" style="font-size: 12px; margin-top: 5px;">Dual Complete Plan 1</span>
More of the HTML code of the element I am trying to print (element I am trying to capture is below the fourth div):
<td data-header="Plan Name">
<div id="b8-b40-l1_0-132_0-$b2" class="OSBlockWidget" data-block="Content.AccordionItem">
<div id="b8-b40-l1_0-132_0-b2-SectionItem" class="section-expandable open is--open small-accordion" data-container="" data-expanded="true" aria-expanded="true" aria-disabled="false" role="tab">
<div id="b8-b40-l1_0-132_0-b2-TitleWrapper" class="section-expandable-title" data-container="" style="cursor: pointer;" role="button" aria-hidden="false" aria-expanded="true" tabindex="0" aria-controls="b8-b40-l1_0-132_0-b2-Content">
<div id="b8-b40-l1_0-132_0-b2-Title" class="dividers full-width">
<span class="OSFillParent" data-expression="" style="font-size: 12px; margin-top: 5px;">Dual Complete Plan 1</span>
</div>
<div class="section-expandable-icon" data-container="" aria-hidden="true">
::after
</div>
</div>
<div id="b8-b40-l1_0-132_0-b2-ContentWrapper" class="section-expandable-content no-padding is--expanded" data-container="" tabindex="0" aria-hidden="false" aria-labelledby="b8-b40-l1_0-132_0-b2-TitleWrapper">
<div id="b8-b40-l1_0-132_0-b2-Content" role="tabpanel">
<a data-link="" href="https://www.communityplan.com" target="_blank" title="Click for more information">
<span class="OSFillParent" data-expression="" style="font-size: 12px;">www.CommunityPlan.com</span>
</a>
<span class="OSFillParent" data-expression="" style="font-size: 12px:">Phone Number: 8005224700</span>
</div>
</div>
</div>
</div>
</td>
My relevant Selenium code:
# Find the Plan Name & if present set it to the variable "Advantage"
try:
    Advantage = (WebDriverWait(driver, 5).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "#b8-b40-l1_0-132_0-b2-Title > span:nth-child(1)"))).get_attribute("value"))
except:
    pass
print('\033[91;46m', Advantage, '\033[0m')
I expect the output to be "Dual Complete Plan 1", which is what I see on the screen and in the HTML. Instead I get the following:
None
Apparently the "Advantage" variable is being set to "None".
Why?
I can see the text "Dual Complete Plan 1" that I want to print in the HTML code above.
What am I doing wrong?
I feel like I need a primer on "get attribute"?

The <span> has no value attribute, so get_attribute("value") returns None. To get the text Dual Complete Plan 1 you need to use
element.text
or
element.get_attribute("innerHTML")
or
element.get_attribute("textContent")
Instead of presence_of_element_located(), use visibility_of_element_located()
together with one of the following CSS selectors to identify the element:
div[id*='Title'] > span.OSFillParent
Or
div.dividers.full-width > span.OSFillParent
Code:
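Advantage = None  # default, so print() still works when the element is absent
try:
    Advantage = WebDriverWait(driver, 5).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, "div[id*='Title'] > span.OSFillParent"))).text
except TimeoutException:  # from selenium.common.exceptions
    pass
print(Advantage)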
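The difference between an attribute and the element's text can be seen without a browser. The sketch below parses the <span> from the question with Python's xml.etree.ElementTree, which is only a stand-in for the DOM here: .get() plays the role of get_attribute() and .text the role of Selenium's .text. Asking for an attribute the element does not carry gives None, which is the result the question ran into.

```python
import xml.etree.ElementTree as ET

# Markup copied from the question.
SNIPPET = (
    '<div id="b8-b40-l1_0-132_0-b2-Title" class="dividers full-width">'
    '<span class="OSFillParent" data-expression="" '
    'style="font-size: 12px; margin-top: 5px;">Dual Complete Plan 1</span>'
    "</div>"
)

span = ET.fromstring(SNIPPET).find("span")

value_attribute = span.get("value")  # the <span> carries no value attribute
style_attribute = span.get("style")  # an attribute that does exist
text_content = span.text             # the text between the tags

print(value_attribute, style_attribute, text_content, sep=" | ")
```

Only the last value is the plan name; the first one is None because "value" is typically something form controls such as <input> carry, not a <span>.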

Python, Selenium: How to get text next to element

I'm fairly new to selenium and I'm trying to get the text of a cell next to a known element.
This is an excerpt of a webtable:
<div class="row">
<div class="cell">
text-to-copy
</div>
<div class="cell">
<input type="text" size="10" id="known_id" onchange="update(this.id);" onclick="setElementId(this.id);"/>
X
</div>
<div class="cell right">
<div id="some_id">?</div>
</div>
</div>
From this table I would like to get the text-to-copy with selenium. As the composition of the table can vary, there is no way to know that cells xpath. Therefore I can not use selenium_driver.find_element_by_xpath(). The only known thing is the id of the cell next to it (id=known_id).
The following pseudo code is to illustrate what I'm looking for:
element = selenium_driver.find_element_by_id("known_id")
result = element.get_visible_text_from_cell_before_element()
Is there a way to get the visible text (text-to-copy) with selenium?

I believe XPath is the right tool here. The other locators that Selenium supports would not work, because we have to traverse upward in the DOM.
The XPath below depends on known_id:
//input[contains(@id,'known_id')]/../preceding-sibling::div
You have to use either .text or .get_attribute() to get the text.
Sample code:
time.sleep(5)
element = selenium_driver.find_element_by_xpath("//input[contains(@id,'known_id')]/../preceding-sibling::div").get_attribute('innerText')
print(element)
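The walk that /../preceding-sibling::div performs (up to the parent cell, then back to the cell before it) can be spelled out step by step. This sketch uses Python's xml.etree.ElementTree on the row from the question purely as an illustration; ElementTree has no preceding-sibling axis, so the sibling step is done with a plain list lookup instead.

```python
import xml.etree.ElementTree as ET

# The row from the question.
ROW = """<div class="row">
  <div class="cell">
    text-to-copy
  </div>
  <div class="cell">
    <input type="text" size="10" id="known_id" onchange="update(this.id);" onclick="setElementId(this.id);"/>
    X
  </div>
  <div class="cell right">
    <div id="some_id">?</div>
  </div>
</div>"""

row = ET.fromstring(ROW)

# Step 1: from the known input go up to its parent cell ("/..").
known_cell = row.find(".//input[@id='known_id']/..")

# Step 2: take the cell directly before it (what preceding-sibling::div selects here).
cells = list(row)
index = cells.index(known_cell)
previous_cell = cells[index - 1] if index > 0 else None

result = previous_cell.text.strip() if previous_cell is not None else None
print(result)
```

With the row above this prints text-to-copy; if the known cell were the first one in the row, result would be None.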

Alternative to time.sleep() in selenium using python while web scraping?

I need to scrape price of certain listed food items basis different locations in the country. There's an input text box that allows me to enter the name of the city & pressing "Enter" shows me the list of items available in that city.
Here's how I am trying to automate this:
driver.get("https://grofers.com/")
ele = driver.find_element_by_xpath("//input[@data-test-id='area-input-box']")
ele.send_keys(area)
ele.send_keys(Keys.RETURN)
Here's the HTML I'm working with:
<div style="margin-left: 51px; height: 36px;">
<div style="display: flex; height: 100%;">
<button class="btn location-box mask-button">Detect my location</button>
<div class="oval-container">
<div class="oval">
<span class="separator-text">
<div class="or">OR</div>
</span>
</div>
</div>
<div style="width: 220px;">
<div class="modal-right__input-wrapper">
<div class="display--table full-width">
<div class="display--table-cell full-width">
<div id="map-canvas"></div>
<div class="Select location-search-input-v1 is-searchable Select--single">
<div class="Select-control">
<div class="Select-multi-value-wrapper" id="react-select-2--value">
<div class="Select-placeholder">Type your city Society/Colony/Area</div>
<div class="Select-input" style="display: inline-block;"><input data-test-id="area-input-box" aria-activedescendant="react-select-2--value" aria-expanded="false" aria-haspopup="false" aria-owns="" role="combobox" value=""></div>
</div>
<span class="Select-arrow-zone"><span class="Select-arrow"></span></span>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
The problem is - after send_keys, the website takes time to autofill the input box AFTER WHICH I need to press enter.
I tried using time.sleep(2) after send_keys but this leads to pop-up disappearing & a StaleElementException when I do Keys.RETURN.
Have been stuck on this for quite some time now. Any help/pointers would be appreciated.

Selenium actually has an article on this with Explicit and Implicit waits, I think this is the one you're looking for:
# Wait until an element with id='myNewInput' has class 'myCSSClass'
wait = WebDriverWait(driver, 10)
element = wait.until(element_has_css_class((By.ID, 'myNewInput'), "myCSSClass"))
https://selenium-python.readthedocs.io/waits.html That's the article

You can also create custom wait conditions when none of the previous convenience methods fit your requirements. A custom wait condition can be created using a class with a __call__ method which returns False when the condition doesn't match.
class element_has_css_class(object):
    """An expectation for checking that an element has a particular css class.

    locator - used to find the element
    returns the WebElement once it has the particular css class
    """
    def __init__(self, locator, css_class):
        self.locator = locator
        self.css_class = css_class

    def __call__(self, driver):
        element = driver.find_element(*self.locator)  # Finding the referenced element
        if self.css_class in element.get_attribute("class"):
            return element
        else:
            return False
# Wait until an element with id='myNewInput' has class 'myCSSClass'
wait = WebDriverWait(driver, 10)
element = wait.until(element_has_css_class((By.ID, 'myNewInput'), "myCSSClass"))
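To make it clearer why this replaces time.sleep(), here is a rough sketch of the polling idea behind an explicit wait: call the condition repeatedly and return as soon as it yields something truthy. It is a simplified illustration, not Selenium's implementation; wait_until, FakeDriver and has_css_class are hypothetical names made up for this example so that it runs without a browser.

```python
import time


def wait_until(condition, driver, timeout=2.0, poll=0.05):
    """Call condition(driver) repeatedly until it returns something truthy."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition(driver)
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


class FakeDriver:
    """Hypothetical stand-in: reports the CSS class only from the third lookup on."""

    def __init__(self):
        self.lookups = 0

    def current_class(self):
        self.lookups += 1
        return "myCSSClass" if self.lookups >= 3 else ""


def has_css_class(css_class):
    """Build a condition shaped like element_has_css_class: a truthy value or False."""
    def condition(driver):
        return driver if css_class in driver.current_class() else False
    return condition


driver = FakeDriver()
found = wait_until(has_css_class("myCSSClass"), driver)
print(found is driver, driver.lookups)
```

The wait ends on the third poll instead of after a fixed delay, which is the advantage over a hard-coded sleep: the script continues as soon as the page is ready and fails with a timeout if it never is.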

Unable to locate element Selenium webdriver || Python

<div class="container-fluid ">
<div class="navbar-header">
<span id="problem_hide_search" class="nav navbar-left">
<span id="ca660735dba5d3003d7e5478dc9619b2_title" class="list_search_title navbar-text " style="float: left; display:inherit">Go to</span>
<div style="float: left; display:inherit">
<div class="input-group" style="width: 300px;">
<span class="input-group-addon input-group-select">
<label class="sr-only" for="ca660735dba5d3003d7e5478dc9619b2_text">Search</label>
<input id="ca660735dba5d3003d7e5478dc9619b2_text" class="form-control" name="ca660735dba5d3003d7e5478dc9619b2_text" style="width: 150px;" placeholder="Search"/>
</div>
</div>
<script data-comment="widget search load event">addLoadEvent(function () { new GlideWidgetSearch('ca660735dba5d3003d7e5478dc9619b2', 'problem', 'true'); });</script>
I am trying to locate the Search box by switching into the iframe and selecting it with
search_box = driver.find_element_by_xpath('//*[@id="ca660735dba5d3003d7e5478dc9619b2_text"]')
But I get the error Message: no such element: Unable to locate element:
Even though I find one matching node.

As you mentioned in your question, you are trying to locate the Search box by switching into an iframe. As per best practices you should:
Induce WebDriverWait for the frame to be available and switch to it, as follows:
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "id_of_iframe")))
You will find a detailed discussion in How can I select a html element no matter what frame it is in in selenium?
When you look for an element within an <iframe> tag, induce WebDriverWait with a proper expected_conditions method. Since you intend to send text to the element, you can use the following line of code:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='navbar-header']//input[@class='form-control' and contains(@id,'_text')]"))).send_keys("hello")
