Get data-ref using Selenium in Python

I have a web page with an element like this:
<html>
<body><div class="class1" data-ref="data"></div></body>
</html>
I want to use Selenium in Python 3 to get the value of data-ref (i.e. "data").
I know you can get the text using .text. Is there something similar for data-ref?

Just do this:
get_attribute("attribute name")
In your case
get_attribute("data-ref")

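You can first find the element by its XPath and then read the data-ref value with the get_attribute method.
You can do it like this:
from selenium.webdriver.common.by import By
# Find the element (the find_element_by_* helpers were removed in recent Selenium 4 releases)
element = driver.find_element(By.XPATH, "//div[@class='class1']")
# Find the value of data-ref
value = element.get_attribute("data-ref")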

To retrieve the value of the data-ref attribute, i.e. data, you can use either of the following locator strategies:
Using XPATH:
print(driver.find_element(By.XPATH, "//div[@class='class1']").get_attribute("data-ref"))
Using CSS_SELECTOR:
print(driver.find_element(By.CSS_SELECTOR, "div.class1").get_attribute("data-ref"))
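The lookup described in these answers can be wrapped in a small helper. This is only a sketch: the name `read_data_ref` and its default selector are illustrative, and `driver` is assumed to be an already started WebDriver with the page loaded. The locator strategy is passed as the plain string "css selector", which is the value of Selenium's `By.CSS_SELECTOR` constant; in real code you would normally import `By` and pass the constant instead.

```python
from typing import Optional

# Value of selenium.webdriver.common.by.By.CSS_SELECTOR, spelled out so the
# sketch has no import dependency.
CSS_SELECTOR = "css selector"


def read_data_ref(driver, selector: str = "div.class1") -> Optional[str]:
    """Return the data-ref attribute of the first element matching selector."""
    element = driver.find_element(CSS_SELECTOR, selector)
    # get_attribute returns None when the element has no such attribute.
    return element.get_attribute("data-ref")
```

With a live session this would be called as `read_data_ref(driver)` after `driver.get(...)`.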

Related

How can I print an HTML text element [object Text] with Selenium?

I'm attempting to extract "479" from this sample HTML:
<div data-testid="testid">
"479"
" Miles Away"
</div>
I'm using the following Selenium code in Python:
xpath = 'html/body/div/text()[1]'
WebDriverWait(driver, 30).until(EC.visibility_of_element_located((By.XPATH, xpath)))
distance = driver.find_element(By.XPATH, xpath)
print(distance)
Which returns the following error:
'The result of the xpath expression "html/body/div/text()[1]" is: [object Text]. It should be an element.'
I've attempted to remove 'text()[1]' from the end of my xpath, which should in theory print all the data contained in the HTML div, but it prints a blank line instead.
Note: I'm an amateur and self-taught (mostly via Google, YouTube, and this site), so some of my wording may not be correct. I apologize in advance.
Given the html:
<div data-testid="testid">
"479"
" Miles Away"
</div>
The texts 479 and Miles Away are in two different text nodes.
Selenium doesn't support text() because it returns a text node, whereas Selenium expects a WebElement back. Hence you see the error:
The result of the xpath expression "html/body/div/text()[1]" is: [object Text]. It should be an element.
Solution
To extract the text 479 you can use either of the following locator strategies:
Using xpath through execute_script() and textContent:
print(driver.execute_script('return arguments[0].firstChild.textContent;', WebDriverWait(driver, 30).until(EC.visibility_of_element_located((By.XPATH, "//div[@data-testid='testid']")))).strip())
Using xpath through splitlines() and get_attribute():
print(WebDriverWait(driver, 30).until(EC.visibility_of_element_located((By.XPATH, "//div[@data-testid='testid']"))).get_attribute("innerHTML").splitlines()[1])
The problem is that you can't locate text like that: text() selects a text node, and Selenium can only return elements. I advise you to locate the parent element, read its text into a Python variable, and split that string on whitespace.
xpath = 'html/body/div'
WebDriverWait(driver, 30).until(EC.visibility_of_element_located((By.XPATH, xpath)))
distance = driver.find_element(By.XPATH, xpath).text
print(distance.split()[0])
You should take the entire element (without text()) using only
html/body/div
and then get the text from the returned element, which will be: "479" " Miles Away".
Then, using Python's split method, you can extract the number (split by \n, a space, or ").
Selenium doesn't support the following XPath:
xpath = 'html/body/div/text()[1]'
To identify the element uniquely, your XPath should look like this:
xpath = '//div[@data-testid="testid"]'
WebDriverWait(driver, 30).until(EC.visibility_of_element_located((By.XPATH, xpath)))
distance = driver.find_element(By.XPATH, xpath).text
print(distance)
To get the text of the element, you have to use element.text.
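Once the element's text is in a Python string, pulling out the number is ordinary string processing. The sketch below uses a regular expression instead of split so that it does not depend on whether the two text nodes end up separated by a space or a line break; the function name `leading_number` is illustrative.

```python
import re
from typing import Optional


def leading_number(text: str) -> Optional[int]:
    """Return the first run of digits in text as an int, or None if there is none."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None


# Works whether the text nodes are joined by a space or a line break,
# and whether or not the quotes are literally part of the text.
print(leading_number("479 Miles Away"))
print(leading_number('"479"\n" Miles Away"'))
```

In the answers above, the argument would be the string returned by `driver.find_element(By.XPATH, xpath).text`.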

How to use multiple attributes (including a partial string match) with find_elements in Selenium for Python

Currently I have this line of code which correctly selects this type of object on the webpage I'm trying to manipulate with Selenium:
pointsObj = driver.find_elements(By.CLASS_NAME,'treeImg')
What I need to do is add in a partial string match condition as well which looks in the section "CLGV (AHU-01_ahu_ChilledWtrVlvOutVolts)" in the line below.
<span class="treeImg v65point" style="cursor:pointer;">CLGV (AHU-01_ahu_ChilledWtrVlvOutVolts)</span>
I found online there's the ChainedBy option but I can't think of how to reference that text in the span. Do I need to use XPath? I tried that for a second but I couldn't think of how to parse it.
To match on both the CLASS_NAME and the innerText, you can use either of the following locator strategies:
xpath using the classname treeImg and partial innerText:
pointsObj = driver.find_elements(By.XPATH,"//span[contains(@class, 'treeImg') and contains(., 'AHU-01_ahu_ChilledWtrVlvOutVolts')]")
xpath using all the classnames and entire innerText:
pointsObj = driver.find_elements(By.XPATH,"//span[@class='treeImg v65point' and text()='CLGV (AHU-01_ahu_ChilledWtrVlvOutVolts)']")
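One caveat with contains(@class, 'treeImg') is that it is a substring test, so it would also match a class such as a hypothetical treeImgLarge. A common XPath 1.0 idiom pads the normalized class attribute with spaces and searches for the padded token. The helper below is a sketch that only builds the expression string; its name and parameters are illustrative.

```python
def xpath_for_class_and_text(tag: str, class_name: str, text_fragment: str) -> str:
    """Build an XPath 1.0 expression matching a whole class token and a text fragment.

    Assumes neither value contains a single quote, since both are embedded
    in single-quoted XPath string literals.
    """
    class_test = (
        "contains(concat(' ', normalize-space(@class), ' '), "
        f"' {class_name} ')"
    )
    text_test = f"contains(., '{text_fragment}')"
    return f"//{tag}[{class_test} and {text_test}]"


xpath = xpath_for_class_and_text("span", "treeImg", "AHU-01_ahu_ChilledWtrVlvOutVolts")
print(xpath)
# With a live session: pointsObj = driver.find_elements(By.XPATH, xpath)
```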

Python - Selenium get Element by attribute and the full attribute value

Hi, how can I get an element by an attribute and its attribute value in Python Selenium?
For example, I have class="class1 class2 class3".
Now I want to get the element whose class attribute carries the classes "class1 class2 class3".
Is this possible?
If I use xpath, I always need to add the element type, input, option,...
I try to avoid the element type since it varies sometimes.
While constructing locators using css-selectors or xpath, you have to use the different attributes and attribute values to identify the WebElement uniquely within the DOM tree.
The generic way is:
Using css_selector:
button.classname[attributeA='attributeA_value'][attributeB='attributeB_value']
Using xpath and attributes:
//button[@attributeA='attributeA_value'][@attributeB='attributeB_value']
As an example, for an element like:
<button type="button" aria-hidden="true" class="close alert alert-close" data-notify="dismiss">Close</button>
You can identify the Close element using any of its attributes and the corresponding attribute values with any of these locator strategies:
Using css_selector:
button.close.alert.alert-close[data-notify='dismiss']
(classes: close, alert, alert-close; attribute: data-notify)
Using xpath and attributes:
//button[@class='close alert alert-close' and @data-notify='dismiss']
(class attribute: 'close alert alert-close'; attribute: data-notify)
Using xpath and innerText:
//button[text()='Close']
(innerText: Close)
The CSS selectors would be formatted like this:
'[attribute]'
'[attribute="value"]'
For example, the selector for the input field on google.com would be:
'input[name="q"]'
To answer this part:
If I use xpath, I always need to add the element type, input,
option,... I try to avoid the element type since it varies sometimes.
you can use //* followed by the attribute name and attribute value:
//*[@class='class1 class2 class3']
//* matches any element, whatever its tag name.
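A tag-independent CSS selector is another option. Unlike the exact comparison @class='class1 class2 class3', chaining the classes as .class1.class2.class3 matches regardless of class order or extra classes. The helper below is a sketch with an illustrative name; it assumes the class names are simple identifiers that need no CSS escaping.

```python
from typing import Iterable


def css_for_classes(class_names: Iterable[str]) -> str:
    """Build a CSS selector matching any element that has every given class."""
    names = [name for name in class_names if name]
    if not names:
        raise ValueError("at least one class name is required")
    return "".join(f".{name}" for name in names)


selector = css_for_classes("class1 class2 class3".split())
print(selector)  # .class1.class2.class3
# With a live session: driver.find_element(By.CSS_SELECTOR, selector)
```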

How to work with multiple classes using BeautifulSoup & BS4

I am trying to write a YouTube scraper, and as part of my task I need to work with multiple classes in bs4.
The HTML looks like this:
<span id="video-title" class="style-scope ytd-playlist-panel-video-renderer">
</span>
My aim is to use the class attribute to get all 50 music entries and work with them.
I tried the following, and it returns nothing:
soup_obj.find_all("span", {"class":"style-scope ytd-playlist-panel-video-renderer"})
I also tried it Selenium-style, putting a dot (.) between the classes instead of a space:
soup_obj.find_all("span", {"class":"style-scope.ytd-playlist-panel-video-renderer"})
Does anyone have an idea about this?
This should work
soup_obj.find_all("span", {"class":["style-scope", "ytd-playlist-panel-video-renderer"]})
Using Selenium you can't send multiple classnames within:
driver.find_elements(By.CLASS_NAME, "classname")
If your aim is to use only the class attribute, then you need to pass a single classname, and you can use either of the following locator strategies:
Using classname as style-scope:
elements = driver.find_elements(By.CLASS_NAME, "style-scope")
Using classname as ytd-playlist-panel-video-renderer:
elements = driver.find_elements(By.CLASS_NAME, "ytd-playlist-panel-video-renderer")
To pass both classnames, you can use either of the following locator strategies:
Using CSS_SELECTOR:
elements = driver.find_elements(By.CSS_SELECTOR, "span.style-scope.ytd-playlist-panel-video-renderer")
Using XPATH:
elements = driver.find_elements(By.XPATH, "//span[@class='style-scope ytd-playlist-panel-video-renderer']")

How to use function ends-with in selenium?

I read that lxml does not support XPath 2.0, so ends-with can't be used with it, and Selenium can't use ends-with either. How can I use it, or what can I replace it with? Thank you very much!
HTML sample
<span id="xxxxx_close">wwwww</span>
The 'xxxxx' part of @id is random
You can apply an ends-with CSS selector:
By.cssSelector("[id$=_close]")
There's no need to include the span tag in the CSS selector either.
The ends-with XPath constraint function is part of XPath v2.0, but as per the current implementation Selenium supports XPath v1.0.
Given the HTML you have shared, you can identify the element with either of the following locator strategies:
XPath using contains():
xpath using contains for the id attribute:
driver.findElement(By.xpath("//span[contains(@id,'_close')]")).click();
xpath using contains for the id attribute and the innerHTML:
driver.findElement(By.xpath("//span[contains(@id,'_close') and contains(.,'wwwww')]")).click();
Alternatively, you can also use a CSS selector as follows (Python):
css_selector using the ends-with (i.e. $ wildcard) clause for the id attribute:
driver.find_element(By.CSS_SELECTOR, "span[id$='_close']").click()
css_selector using the contains (i.e. * wildcard) clause for the id attribute:
driver.find_element(By.CSS_SELECTOR, "span[id*='_close']").click()
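If an XPath is required rather than a CSS selector, ends-with can be emulated in XPath 1.0 by comparing the trailing substring of the attribute with the suffix, which is stricter than contains(). The helper below is a sketch that only builds the expression string; the name `xpath_ends_with` is illustrative.

```python
def xpath_ends_with(tag: str, attribute: str, suffix: str) -> str:
    """Build an XPath 1.0 test that an attribute value ends with suffix.

    Emulates XPath 2.0 ends-with() by comparing the trailing substring.
    Assumes suffix contains no single quote.
    """
    # 1-based start position of the last len(suffix) characters of the attribute.
    start = f"string-length(@{attribute}) - string-length('{suffix}') + 1"
    return f"//{tag}[substring(@{attribute}, {start}) = '{suffix}']"


xpath = xpath_ends_with("span", "id", "_close")
print(xpath)
# With a live session: driver.find_element(By.XPATH, xpath).click()
```

For the sample id xxxxx_close (11 characters) and the suffix _close (6 characters), the start position is 11 - 6 + 1 = 6, so the compared substring is exactly _close.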
