Xpath: matching one element based on attribute of neighboring element

I've got markup of the following format I'm attempting to work with using Selenium/Python:
<tr>
<td>google</td>
<td>useless text</td>
<td>useless text2</td>
<td>useless text3</td>
<td>emailaddress</td>
</tr>
The idea being that given a known email address (part of the href in the emailaddress td), I can get to (and click) the a in the first td. It looks like xpath is the best choice to accomplish this with Selenium. I'm trying the following xpath:
//*[@id="page_content"]/table/tbody/tr[2]/td[2]/div/table[1]/tbody/tr/td[4]/a[contains(@href, "mailto:needle@email.com")]/../../td/a[0]
But I'm getting this error:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"xpathhere"}
I do know the xpath to get to the "needle@email.com" a is correct, as it's just copied from chrome dev tools, so the error must be with the part of the xpath after reaching the first a element. Can anyone shed some light on the problem with my xpath?

Try to use below code:
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait
xpath = '//td[a[@href="needle@email.com"]]/preceding-sibling::td/a'
wait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, xpath))).click()
This should match the first link based on the href attribute of the last link in the table row (tr) and click it once it becomes clickable.

First, note that (this may be just a typo) you are looking for "mailto:needle@email.com" while the value of your href attribute is "needle@email.com".
Second, you actually know how to get back [...]. But XPath indexing starts at 1, so why 'a[0]'? Is this also just a typo?
Anyway, this XPath would get your sibling:
'//a[contains(@href, "needle@email.com")]/../../td[1]/a[1]'
Or, more precisely than with contains (since other email addresses could also match, e.g. "otherneedle@email.com"):
'//a[@href="needle@email.com"]/../../td[1]/a[1]'
Or even better, with no index and no parent/child navigation:
'//td[a[@href="needle@email.com"]]/preceding-sibling::td/a'
All tested.
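To see the 1-based indexing and the parent steps at work without a browser, here is a minimal sketch that runs the /../../td[1]/a[1] form against a small sample table with Python's standard-library ElementTree, which supports this subset of XPath. The sample markup, its URLs and the helper name first_link_for_email are assumptions made for illustration only; the real hrefs are not shown in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the hrefs are invented for this example.
SAMPLE = """
<table>
  <tr>
    <td><a href="https://example.com/other">other</a></td>
    <td>useless text</td>
    <td><a href="other@email.com">emailaddress</a></td>
  </tr>
  <tr>
    <td><a href="https://example.com/google">google</a></td>
    <td>useless text</td>
    <td><a href="needle@email.com">emailaddress</a></td>
  </tr>
</table>
"""


def first_link_for_email(table, email):
    """Return the first link of the row whose email link has exactly this href."""
    # a -> parent td -> parent tr -> first td -> first a (XPath positions start at 1)
    path = f".//a[@href='{email}']/../../td[1]/a[1]"
    return table.find(path)


table = ET.fromstring(SAMPLE)
link = first_link_for_email(table, "needle@email.com")
print(link.text, link.get("href"))
```

ElementTree does not implement axes such as preceding-sibling or functions such as contains(), so the sibling-based expression needs a full XPath engine such as the browser's.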

Try to find the tr that contains that email, and click on the first link from it.
//tr[.//a[contains(@href, 'your_email')]]//a
or
//tr[.//a[contains(@href, 'your_email')]]//a[@href]
or
//tr[.//a[contains(@href, 'your_email')]]//a[contains(@href, 'common_url_part')]
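The row-first idea can also be written out step by step in plain Python, which may make the logic of the nested predicate easier to follow. This is only a sketch on assumed sample markup (the hrefs and the helper name first_link_in_row_with_email are hypothetical); with Selenium the XPath expressions above do the same work in one call.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the hrefs are invented for this example.
SAMPLE_ROWS = """
<table>
  <tr>
    <td><a href="https://example.com/other">other</a></td>
    <td><a href="mailto:other@email.com">emailaddress</a></td>
  </tr>
  <tr>
    <td><a href="https://example.com/google">google</a></td>
    <td><a href="mailto:needle@email.com">emailaddress</a></td>
  </tr>
</table>
"""


def first_link_in_row_with_email(table, email):
    """Mirror //tr[.//a[contains(@href, email)]]//a: pick the row, then its first link."""
    for row in table.iter("tr"):
        links = list(row.iter("a"))  # all links in this row, in document order
        if any(email in a.get("href", "") for a in links):
            return links[0]
    return None


rows = ET.fromstring(SAMPLE_ROWS)
match = first_link_in_row_with_email(rows, "needle@email.com")
print(match.text)
```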

Your HTML should be this.
<tr>
<td>google</td>
<td>useless text</td>
<td>useless text2</td>
<td>useless text3</td>
<td>emailaddress</td>
</tr>
Otherwise, your user can click on the link until he or she works himself into a frenzy. :)
Then you can do this in selenium.
>>> from selenium import webdriver
>>> driver = webdriver.Chrome()
>>> driver.get("file://c:/scratch/temp2.htm")
>>> link = driver.find_element_by_xpath('.//a[contains(@href,"needle@email.com")]')
>>> link.click()
I've used contains because an email address in a link can be something like mailto:Jose Greco <needle.email.com>.
PS: And incidentally I've just executed this stuff on my machine.

Related

How do i find element in website that constanly change its label ID deep inside nested DIV?

I have a site that tracks all our shipments, and I would like to use Selenium with Python to fetch all the values into my Excel file. But I'm currently stuck getting Selenium to work: it keeps reporting that the element cannot be found. The elements are also buried deep inside nested DIVs.
At first I tried using
driver.find_element(By.XPATH,"//*[@id='trackItNowForm:j_idt529:0:j_idt535']").text
I quickly found out that part of the ID constantly changes (the "idt529:0:j_idt535" part of the ID changes to 519, 509 and so on, and the zero part also changes based on the length of the data), without a clear pattern. With that, By.ID is obviously out of the window too.
So now I'm out of ideas on what to do next. Maybe I didn't use the correct syntax? I'm new to Selenium, so I'm still confused about a lot of things (like how XPath expressions are written), so an explanation would be appreciated.

You can locate that element by its TrackingNumber class name, possibly combined with other known attribute values.
Please try this:
driver.find_element(By.XPATH,"//label[contains(@id,'trackItNowForm') and contains(@class,'TrackingNumber')]").text
You will probably need to wait for the element to become visible. In that case WebDriverWait with expected conditions is the right way to do it, as follows:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 20)
wait.until(EC.visibility_of_element_located((By.XPATH, "//label[contains(@id,'trackItNowForm') and contains(@class,'TrackingNumber')]"))).text

You can try the following CSS selectors to identify the element.
option 1:
driver.find_element(By.CSS_SELECTOR,"[id^='trackItNowForm'][id*='id']").text
option 2:
driver.find_element(By.CSS_SELECTOR,"[id^='trackItNowForm'][id*='id'][class*='TrackingNumber']").text
where ^= means starts-with and *= means contains.
Option 1 checks that the id attribute starts with trackItNowForm and also contains id.
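As a rough illustration of what the two attribute operators test, the same checks can be expressed with plain string operations. This is only a sketch of the matching rules, not how a browser evaluates selectors; the function names, the second and third sample IDs and the class value are assumptions made for the example.

```python
def matches_option_1(element_id):
    """[id^='trackItNowForm'][id*='id']: starts with the prefix and contains 'id'."""
    return element_id.startswith("trackItNowForm") and "id" in element_id


def matches_option_2(element_id, class_attr):
    """Option 1 plus [class*='TrackingNumber']: class attribute contains the text."""
    return matches_option_1(element_id) and "TrackingNumber" in class_attr


sample_ids = [
    "trackItNowForm:j_idt529:0:j_idt535",  # ID from the question
    "trackItNowForm:j_idt519:3:j_idt525",  # assumed variant after a page change
    "otherForm:j_idt529:0:j_idt535",       # assumed unrelated element
]
for sample_id in sample_ids:
    print(sample_id, matches_option_1(sample_id))
```

Both generated variants of the ID pass the check, which is why the selector keeps working when the numeric parts change.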

You should understand how xpath works.
For example, for the code you provided:
driver.find_element(By.XPATH,"//div[@class='col-sm-10']/label").text
looks for the first occurrence of a label under a div with the attribute class='col-sm-10' and then takes its text.
Useful links:
XPath syntax
XPath cheatsheet

Selenium not extracting info using xpath

I am trying to extract some information from the amazon website using selenium. But I am not able to scrape that information using xpath in selenium.
In the image below I want to extract the info highlighted.
This is the code I am using
try:
    path = "//div[@id='desktop_buybox']//div[@class='a-box-inner']//span[@class='a-size-small')]"
    seller_element = WebDriverWait(driver, 5).until(
        EC.visibility_of_element_located((By.XPATH, path)))
except Exception as e:
    print(e)
When I run this code, it shows that there is an error with seller_element = WebDriverWait(driver, 5).until( EC.visibility_of_element_located((By.XPATH, path))) but does not say what exception it is.
I tried looking online and found that this happens when selenium is not able to find the element in the webpage.
But I think the path I have specified is right. Please help me.
Thanks in advance
[EDIT-1]
This is the exception I am getting
Message:

The XPath could be something like this (you can shorten it):
//div[@class='a-section a-spacing-none a-spacing-top-base']//span[@class='a-size-small a-color-secondary']
The CSS selectors could be the following, and so forth:
.a-section.a-spacing-none.a-spacing-top-base
.a-size-small.a-color-secondary

I think the reason is that the XPath expression is not correct.
Take the following element as an example; the span has two classes:
<span class="a-size-small a-color-secondary">
So span[@class='a-size-small') will not work.
Instead, you can use an XPath such as
//span[contains(@class, 'a-size-small') and contains(@class, 'a-color-secondary')]
or a CSS selector such as
span.a-size-small.a-color-secondary
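The difference between comparing the whole class attribute and matching individual class names can be checked offline. The sketch below uses the standard-library ElementTree on a one-element sample; the wrapping div, the span text and the helper has_classes are assumptions for illustration. The helper mirrors the token-based matching of the CSS selector, while XPath contains() is a plain substring test.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment built around the span from the answer.
doc = ET.fromstring(
    '<div><span class="a-size-small a-color-secondary">'
    "Ships from and sold by Amazon.com.</span></div>"
)

# @class='...' compares against the entire attribute value.
exact_single = doc.findall(".//span[@class='a-size-small']")
exact_full = doc.findall(".//span[@class='a-size-small a-color-secondary']")


def has_classes(element, *names):
    """True if every given name is one of the element's space-separated classes."""
    return set(names) <= set(element.get("class", "").split())


token_match = [
    span for span in doc.iter("span")
    if has_classes(span, "a-size-small", "a-color-secondary")
]

print(len(exact_single), len(exact_full), len(token_match))
```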

Amazon adapts its content to the country you are in. When I clicked on the link you provided, I did not find the element you are looking for, simply because the item is not sold here in India.
In short, if you are in India and try to find the element, it is not there, but if you change the location to "United States", it appears.
Solution: change the location.

To print the Ships from and sold by Amazon.com of an element you have to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and get_attribute():
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.a-section.a-spacing-none.a-spacing-top-base > span.a-size-small.a-color-secondary"))).get_attribute("innerHTML"))
Using XPATH and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='a-section a-spacing-none a-spacing-top-base']/span[@class='a-size-small a-color-secondary']"))).text)
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
Outro
Link to useful documentation:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium

Element has ID in inspection mode, but not in raw HTML

I am currently working on a small web scraping script with Python and Selenium.
I am trying to get some information from a table, which has in inspection mode a certain ID.
However, when I open the page as raw HTML (which I did after not being able to locate that table using either xpath or css_selector), the table does not have the mentioned ID.
How is that possible?
For better explanation:
This is what it looks like in inspection mode in my browser
<table id='ext-gen1076' class='bats-table bats-table--center'>
[...]
</table>
And this is what it looks like when I open the page as raw HTML file
<table class='bats-table bats-table--center'>
[...]
</table>
How is it possible, that the ID is simply disappearing?
(JFI, this is my first question so apologies for the bad formatting!).
Thanks in advance!

The reason is, the ID was added during the runtime.

The value of the id attribute i.e. ext-gen1076 contains a number and is clearly dynamically generated. The prefix of the value of the id attribute i.e. ext-gen indicates that the id was generated runtime using Ext JS.
Ext JS
Ext JS is a JavaScript framework for building data-intensive, cross-platform web and mobile applications for any modern device.
This usecase
Possibly you identified the <table> element before the JavaScript had rendered the complete DOM tree. Hence the id attribute was missing.
Identifying Ext JS elements
As the value of the id attribute is dynamic, you won't be able to use the complete value of the id attribute and can use only the part of it that stays static. As per the HTML you have provided:
<table id='ext-gen1076' class='bats-table bats-table--center'>
[...]
</table>
To identify the <table> node you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table[id^='ext-gen']")))
Using XPATH:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[starts-with(@id,'ext-gen')]")))
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
However, there will be many other elements whose id attribute starts with ext-gen. So to uniquely identify the <table> element you need to combine it with the class attribute as follows:
Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table.bats-table.bats-table--center[id^='ext-gen']")))
Using XPATH:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[@class='bats-table bats-table--center' and starts-with(@id,'ext-gen')]")))
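If several elements with generated IDs need to be located, the two locator strings above can be built from their stable parts. The helper below is a hypothetical convenience (prefix_id_locators is not part of Selenium) and only assembles strings; it assumes the class attribute is written exactly as the given names joined by single spaces.

```python
def prefix_id_locators(tag, id_prefix, classes):
    """Build a CSS selector and an XPath that match a stable id prefix plus classes."""
    css = tag + "".join("." + name for name in classes) + f"[id^='{id_prefix}']"
    class_value = " ".join(classes)  # XPath compares the whole attribute value
    xpath = f"//{tag}[@class='{class_value}' and starts-with(@id,'{id_prefix}')]"
    return css, xpath


css, xpath = prefix_id_locators("table", "ext-gen", ["bats-table", "bats-table--center"])
print(css)
print(xpath)
```

The resulting strings can be passed to By.CSS_SELECTOR and By.XPATH respectively in the waits shown above.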
Reference
You can find a relevant detailed discussion in:
How to find element by part of its id name in selenium with python
How to get selectors with dynamic part inside using Selenium with Python?

Element is visible but cannot be found by webdriver

I'm using Selenium and Python for automated tests and have an issue when trying to click on an element on a web page.
I'm using find_element_by_xpath and provide the correct path (I tried it in the browser and it returns 1 of 1).
driver = webdriver.Chrome()
driver.get('page_url')
driver.find_element_by_xpath(//button[@class="aoc-end md-button md-ink-ripple"])
Here is html:
<div class="_md md-open-menu-container md-whiteframe-z2 md-active md-clickable" id="menu_container_77" style="top: 499px; left: 866px; transform-origin: right top;" aria-hidden="false">
<md-menu-content class="agent-dropdown-menu-content new-menu-content" width="3" role="menu"><!----><md-menu-item>
<button class="aoc-end-work md-button md-ink-ripple" type="button" ng-transclude="" ng-disabled="!agent.capabilities.canSupervisorLogout" ng-click="logOutAgent(agent.userHandle)" role="menuitem">
The element should be found, but the actual result is selenium.common.exceptions.ElementNotVisibleException: Message: element not visible

From the exception, it seems the element is present on the page but currently not visible.
Possible reasons for the element being invisible: it is masked by another element, it is inside a drop-down list, etc. If the element is in a drop-down list, first open the list and then find the element.
Hope it will help.

Finding locators in a web application developed in Angular is quite tricky, especially when the developers don't follow guidelines that help automation, such as adding unique attributes to every web element.
In your case, it seems that the same thing is happening.
The following locator should work for you (by the way, you missed the quotes around the locator in the driver.find_element_by_xpath() call):
driver.find_element_by_xpath("//button[@ng-click='logOutAgent(agent.userHandle)']")

Since the element you are trying to click is a dynamic element, you will need to wait for it to become clickable before clicking on it. Use Expected conditions and WebDriver wait:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
element = WebDriverWait(driver, 20).until(
    EC.element_to_be_clickable((By.XPATH, "//button[@class='aoc-end md-button md-ink-ripple']")))
element.click()
Also, notice the double quotes around the XPath selector and the single quotes inside it.

Select the focused element in Selenium

What I'm dealing with
In one of my tests I need to interact with a pop-up input field that is very hard to select using a regular css selector or xpath. I know, however, that this pop-up box will have the focus.
How can I use the fact that it's focused to properly assert that the input field contains some text?
Pseudocode of what I'm looking for:
element = driver.find_element_by_FIND_THE_ELEMENT_WITH_FOCUS
assert element.text == "foobar"
Partial solutions I've come across:
Possibly a working solution in Ruby:
element = @driver.find_element :css, 'input:focus'
sleep(3)
element.send_keys "Hello WebDriver!"
assert_equal(element.attribute('value'),"Hello WebDriver!")
Getting the focused element with jQuery:
var focus = $(document.activeElement);
Could you provide a solution in python please?
EDIT:
Here's the HTML of the element:
<div class="css-1492t68">Select a speaker...</div>
I realized my question could have led to some confusion, as the element itself does not have the semantic <input> tag but is rather a <div>. I suppose this is handled by the quill-editor library, which our dev team used when designing this app.
Here's a screenshot of the pop-up box:

Since the element does not have a consistent CSS class, using an XPath expression is probably going to be your best bet.
You need to wait for the element to be clickable (which means interactable) by matching a <div> that contains the expected text and has a CSS class that begins with css-. Doing this should allow enough time for JavaScript to do its thing in the browser to initialize the quill-editor and display it:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ExpectedConditions
xpath = "//div[starts-with(@class, 'css-')][contains(., 'The text you expect to find')]"
expected_condition = ExpectedConditions.element_to_be_clickable((By.XPATH, xpath))
element = WebDriverWait(driver, 20).until(expected_condition)
If this pop up field is directly inside a particular parent element, you can further limit your xpath expression accordingly.
If this succeeds, no error will be thrown. If it does not succeed, it should raise a TimeoutException (selenium.common.exceptions.TimeoutException in Python).
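The waiting behaviour described here boils down to polling a condition until it returns something truthy or a deadline passes. The following is a rough standard-library sketch of that idea, not Selenium's actual implementation; wait_until, WaitTimeout and the fake condition are hypothetical names used only for illustration.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (stand-in for TimeoutException)."""


def wait_until(condition, timeout=2.0, poll_interval=0.05):
    """Call condition() repeatedly; return its first truthy result or raise WaitTimeout."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Fake condition standing in for "the element is clickable": succeeds on the third poll.
attempts = {"count": 0}


def element_appears():
    attempts["count"] += 1
    return "element" if attempts["count"] >= 3 else None


found = wait_until(element_appears)
print(found, attempts["count"])
```

With a real driver, WebDriverWait(driver, 20).until(...) plays the role of wait_until, and the expected condition plays the role of element_appears.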

use:
elem = driver.switch_to.active_element
