How to use function ends-with in selenium? - python

I read that lxml does not support XPath 2.0, so ends-with cannot be used there, and it seems Selenium cannot use ends-with either. How can I use it, or what can I use instead? Thank you very much!
HTML sample
<span id="xxxxx_close">wwwww</span>
The 'xxxxx' part of the id is random.

You can apply an ends-with CSS selector:
driver.find_element(By.CSS_SELECTOR, "[id$='_close']")
There is no need to include the span tag in the CSS selector.
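To make the $= operator concrete, here is a minimal sketch of what the CSS attribute-selector substring operators mean, written with plain Python string methods. The name attribute_matches is made up for illustration and is not part of Selenium; the function only mirrors the matching rule and does not query a page.

```python
def attribute_matches(value: str, operator: str, needle: str) -> bool:
    """Mimic the CSS attribute operators =, ^=, $= and *= on a plain string.

    Hypothetical helper for illustration only.
    """
    if operator == "=":
        return value == needle
    # In CSS, the substring operators never match an empty needle.
    if needle == "":
        return False
    if operator == "^=":  # starts with
        return value.startswith(needle)
    if operator == "$=":  # ends with
        return value.endswith(needle)
    if operator == "*=":  # contains
        return needle in value
    raise ValueError(f"unsupported operator: {operator!r}")


# The id from the question ends with '_close' but does not start with it.
print(attribute_matches("xxxxx_close", "$=", "_close"))  # True
print(attribute_matches("xxxxx_close", "^=", "_close"))  # False
```

So [id$='_close'] selects the element no matter what the random prefix is.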

The ends-with() function is part of XPath 2.0, but Selenium currently supports only XPath 1.0.
Given the HTML you shared, you can identify the element with either of the following locator strategies:
XPath using contains():
XPath using contains() on the id attribute:
driver.find_element(By.XPATH, "//span[contains(@id,'_close')]").click()
XPath using contains() on the id attribute and the element text:
driver.find_element(By.XPATH, "//span[contains(@id,'_close') and contains(.,'wwwww')]").click()
Alternatively, you can use a CSS selector:
css_selector using the ends-with ($=) operator on the id attribute:
driver.find_element(By.CSS_SELECTOR, "span[id$='_close']").click()
css_selector using the contains (*=) operator on the id attribute:
driver.find_element(By.CSS_SELECTOR, "span[id*='_close']").click()
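If you need a strict ends-with test in XPath 1.0 (contains() also matches '_close' in the middle of the id), a common workaround combines substring() and string-length(). The sketch below only builds the expression string; xpath_ends_with is a hypothetical helper name, and it assumes the suffix contains no single quote.

```python
def xpath_ends_with(attribute: str, suffix: str, tag: str = "*") -> str:
    """Build an XPath 1.0 expression for 'attribute ends with suffix'.

    Hypothetical helper: emulates XPath 2.0 ends-with() by comparing the
    tail of the attribute value with the suffix.
    """
    if "'" in suffix:
        raise ValueError("suffix must not contain a single quote")
    return (
        f"//{tag}[substring(@{attribute}, "
        f"string-length(@{attribute}) - string-length('{suffix}') + 1)"
        f" = '{suffix}']"
    )


expression = xpath_ends_with("id", "_close", tag="span")
print(expression)
# //span[substring(@id, string-length(@id) - string-length('_close') + 1) = '_close']
```

The resulting string can be passed to driver.find_element(By.XPATH, expression) in the same way as the contains() locators.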

Related

How to use multiple attributes (including a partial string match) with find_elements in Selenium for Python

Currently I have this line of code which correctly selects this type of object on the webpage I'm trying to manipulate with Selenium:
pointsObj = driver.find_elements(By.CLASS_NAME,'treeImg')
What I need to do is add in a partial string match condition as well which looks in the section "CLGV (AHU-01_ahu_ChilledWtrVlvOutVolts)" in the line below.
<span class="treeImg v65point" style="cursor:pointer;">CLGV (AHU-01_ahu_ChilledWtrVlvOutVolts)</span>
I found online there's the ChainedBy option but I can't think of how to reference that text in the span. Do I need to use XPath? I tried that for a second but I couldn't think of how to parse it.
To match on both the class name and the innerText, you can use either of the following locator strategies:
xpath using the classname treeImg and partial innerText:
pointsObj = driver.find_elements(By.XPATH,"//span[contains(@class, 'treeImg') and contains(., 'AHU-01_ahu_ChilledWtrVlvOutVolts')]")
xpath using all the classnames and entire innerText:
pointsObj = driver.find_elements(By.XPATH,"//span[@class='treeImg v65point' and text()='CLGV (AHU-01_ahu_ChilledWtrVlvOutVolts)']")
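One caveat: contains(@class, 'treeImg') is a substring test, so it would also match a class such as treeImgLarge. A common XPath 1.0 idiom pads the class attribute with spaces and looks for the whole token. The sketch below builds such a predicate; has_class is a made-up helper name, not a Selenium function.

```python
def has_class(class_name: str) -> str:
    """Return an XPath 1.0 predicate that matches one whole class token.

    Hypothetical helper: pads the normalized class attribute with spaces
    so that 'treeImg' does not match 'treeImgLarge'.
    """
    if not class_name or any(ch.isspace() or ch == "'" for ch in class_name):
        raise ValueError("class_name must be a single class token without quotes")
    return f"contains(concat(' ', normalize-space(@class), ' '), ' {class_name} ')"


xpath = (
    f"//span[{has_class('treeImg')} "
    "and contains(., 'AHU-01_ahu_ChilledWtrVlvOutVolts')]"
)
print(xpath)
```

The printed expression can then be used as pointsObj = driver.find_elements(By.XPATH, xpath).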

Python - Selenium get Element by attribute and the full attribute value

Hi how can I get an element by attribute and the attribute value in Python Selenium?
For example I have class="class1 class2 class3".
Now I want to get the element whose class attribute carries the classes "class1 class2 class3".
Is this possible?
If I use xpath, I always need to add the element type, input, option,...
I try to avoid the element type since it varies sometimes.
When constructing CSS selector or XPath locators, combine attributes and their values so that the locator identifies the WebElement uniquely within the DOM tree.
The generic way is:
Using css_selector:
button.classname[attributeA='attributeA_value'][attributeB='attributeB_value']
Using xpath and attributes:
//button[@attributeA='attributeA_value'][@attributeB='attributeB_value']
As an example, for an element like:
<button type="button" aria-hidden="true" class="close alert alert-close" data-notify="dismiss">Close</button>
You can identify the Close element by its attributes and their values, using any of the following locator strategies:
Using css_selector:
button.close.alert.alert-close[data-notify='dismiss']
(close, alert, and alert-close are the classes; data-notify is the attribute)
Using xpath and attributes:
//button[@class='close alert alert-close' and @data-notify='dismiss']
(the class attribute holds all three class names; data-notify is a second attribute)
Using xpath and innerText:
//button[text()='Close']
(Close is the innerText)
The CSS selectors would be formatted like this:
'[attribute]'
'[attribute="value"]'
For example, the selector for the input field on google.com would be:
'input[name="q"]'
To answer this part:
If I use xpath, I always need to add the element type, input,
option,... I try to avoid the element type since it varies sometimes.
You can use //* followed by the attribute name and the attribute value:
//*[@class='class1 class2 class3']
//* matches any element node.
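Note that @class='class1 class2 class3' compares the whole attribute string, so it depends on the order and spacing of the class names. A CSS selector made of chained class names matches regardless of order and also needs no element type. Here is a small sketch that converts a class attribute value into such a selector; css_for_classes is a hypothetical helper, and it assumes simple class names that need no CSS escaping.

```python
def css_for_classes(class_attribute: str, tag: str = "") -> str:
    """Turn 'class1 class2 class3' into '.class1.class2.class3'.

    Hypothetical helper; the optional tag is prepended when given.
    """
    classes = class_attribute.split()
    if not classes:
        raise ValueError("class_attribute must contain at least one class name")
    return tag + "".join(f".{name}" for name in classes)


print(css_for_classes("class1 class2 class3"))         # .class1.class2.class3
print(css_for_classes("close alert", tag="button"))    # button.close.alert
```

The result can be passed to driver.find_element(By.CSS_SELECTOR, selector).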

How to work with multiple classes using BeautifulSoup & BS4

I am trying to write youtube scraper and as part of my task I need to work with multiple classes at bs4.
HTML looks like
<span id="video-title" class="style-scope ytd-playlist-panel-video-renderer">
</span>
My aim is to use the class attribute to get all 50 songs and work with them.
I tried the following, and it returns nothing.
soup_obj.find_all("span", {"class":"style-scope ytd-playlist-panel-video-renderer"})
I also tried the Selenium style (a dot instead of a space between the class names):
soup_obj.find_all("span", {"class":"style-scope.ytd-playlist-panel-video-renderer"})
Does anyone have an idea about this?
This should work
soup_obj.find_all("span", {"class":["style-scope", "ytd-playlist-panel-video-renderer"]})
Using Selenium, you can't pass multiple class names to:
driver.find_elements(By.CLASS_NAME, "classname")
If your aim is to use only the class attribute, you need to pass a single class name, and you can use either of the following locator strategies:
Using classname as style-scope:
elements = driver.find_elements(By.CLASS_NAME, "style-scope")
Using classname as ytd-playlist-panel-video-renderer:
elements = driver.find_elements(By.CLASS_NAME, "ytd-playlist-panel-video-renderer")
To pass both class names, you can use either of the following locator strategies:
Using CSS_SELECTOR:
elements = driver.find_elements(By.CSS_SELECTOR, "span.style-scope.ytd-playlist-panel-video-renderer")
Using XPATH:
elements = driver.find_elements(By.XPATH, "//span[@class='style-scope ytd-playlist-panel-video-renderer']")

Get data-ref using selenium in python

I have a website with an element like this:
<html>
<body><div class="class1" data-ref="data"></div></body>
</html>
I want to use selenium in python 3 to get the data-ref (i.e. "data").
I know you can get the text using .text. Is there something similar for data-ref?
Just do this:
get_attribute("attribute name")
In your case
get_attribute("data-ref")
You can first find the element using its xpath and then read the data-ref value with the get_attribute method.
You can do it like:
element = driver.find_element(By.XPATH, "//div[@class='class1']")
# Find the value of data-ref
value = element.get_attribute("data-ref")
To retrieve the value of the data-ref attribute, i.e. data, you can use either of the following locator strategies:
Using XPATH:
print(driver.find_element(By.XPATH, "//div[@class='class1']").get_attribute("data-ref"))
Using CSS_SELECTOR:
print(driver.find_element(By.CSS_SELECTOR, "div.class1").get_attribute("data-ref"))

Element has ID in inspection mode, but not in raw HTML

I am currently working on a small web scraping script with Python and Selenium.
I am trying to get some information from a table, which has in inspection mode a certain ID.
However, when I open the page as raw HTML (which I did after not being able to locate that table using either xpath or css_selector), the table does not have the mentioned ID.
How is that possible?
For better explanation:
This is what it looks like in inspection mode in my browser
<table id='ext-gen1076' class='bats-table bats-table--center'>
[...]
</table>
And this is what it looks like when I open the page as raw HTML file
<table class='bats-table bats-table--center'>
[...]
</table>
How is it possible, that the ID is simply disappearing?
(JFI, this is my first question so apologies for the bad formatting!).
Thanks in advance!
The reason is that the ID was added at runtime.
The value of the id attribute, ext-gen1076, contains a number and is clearly dynamically generated. The ext-gen prefix indicates that the id was generated at runtime by Ext JS.
Ext JS
Ext JS is a JavaScript framework for building data-intensive, cross-platform web and mobile applications for any modern device.
This use case
Possibly you tried to identify the <table> element before the JavaScript had rendered the complete DOM tree, so the id attribute was missing.
Identifying Ext JS elements
Because the value of the id attribute is dynamic, you cannot rely on the complete value and can use only the part of it that stays static. As per the HTML you have provided:
<table id='ext-gen1076' class='bats-table bats-table--center'>
[...]
</table>
To identify the <table> node, use WebDriverWait with visibility_of_element_located() and either of the following locator strategies:
Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table[id^='ext-gen']")))
Using XPATH:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[starts-with(@id,'ext-gen')]")))
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
However, there are likely to be many other elements whose id attribute starts with ext-gen. To identify the <table> element uniquely, combine the id prefix with the class attribute as follows:
Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table.bats-table.bats-table--center[id^='ext-gen']")))
Using XPATH:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[@class='bats-table bats-table--center' and starts-with(@id,'ext-gen')]")))
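The idea behind the explicit wait is simple polling: the condition is evaluated again and again until it returns something truthy or the timeout expires. The following is a rough standard-library sketch of that idea only. It is not Selenium's implementation; wait_until and table_present are hypothetical names, and it raises the built-in TimeoutError rather than Selenium's own exception.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Hypothetical helper illustrating the polling idea behind an explicit wait.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page whose table only appears on the third check.
attempts = []

def table_present():
    attempts.append(1)
    return "table" if len(attempts) >= 3 else None

print(wait_until(table_present, timeout=2.0, poll_interval=0.01))  # table
```

This is why the wait helps with an id that JavaScript adds after the initial HTML has loaded: the lookup is retried until the element is there.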
Reference
You can find a relevant detailed discussion in:
How to find element by part of its id name in selenium with python
How to get selectors with dynamic part inside using Selenium with Python?
