Extract first span child with Selenium - python

I want to extract the text of the first span, Extract this text. I already tried:
element.find_element_by_css_selector(".moreContent span:nth-child(1)").text.strip('"')
This does not work and I am not sure why. The output is just empty.
<p class="mainText">
Lorem Ipsum is simply dummy text of the printing and typesetting industry.
<span class="moreEllipses">… </span>
<span class="moreContent">
<span> Extract this text </span>
<span class="link moreLink">Show More</span>
</span>
</p>
However, printing the element gives the following, so Selenium does find the element. Why is its text empty?
<selenium.webdriver.remote.webelement.WebElement (session="e7012b303842651848aa0b0e40f5d5c1", element="df5644e9-fc98-4300-ad86-9ff433154d82")>
EDIT:
I managed to solve this by clicking the Show More button. For some reason I can't extract the content while it is not visible, even though it is present in the page.

As per your CSS selector, it seems you are targeting the element below:
<span> Extract this text </span>
You can use the XPath below:
(//p[@class='mainText']//span[@class='moreContent']/span)[1]
OR
(//span[@class='moreContent']/span)[1]
Example Code:
element = driver.find_element_by_xpath("(//p[@class='mainText']//span[@class='moreContent']/span)[1]").text

To extract the text from the first <span>, i.e. Extract this text, you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and text property:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p.mainText span.moreContent>span"))).text)
Using XPATH and get_attribute() method:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//p[@class='mainText']//span[@class='moreContent']/span"))).get_attribute("innerHTML"))
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
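The EDIT in the question points at the likely cause: WebElement.text only returns text that is currently rendered, so a span hidden behind "Show More" yields an empty string even though the element itself is found. A minimal sketch of a fallback, assuming the hidden text is still present in the DOM: read the textContent property through get_attribute() whenever .text comes back empty. The helper name element_text is made up for illustration.

```python
def element_text(element):
    """Return an element's text, even when the element is hidden.

    `element` is expected to be a Selenium WebElement (or any object with
    the same `.text` attribute and `.get_attribute()` method).
    """
    # .text only reports text that is rendered on screen.
    visible = (element.text or "").strip()
    if visible:
        return visible
    # textContent is read from the DOM, regardless of visibility.
    raw = element.get_attribute("textContent") or ""
    return raw.strip()
```

With the markup from the question, passing the first span inside span.moreContent to element_text() should return the text without clicking Show More first.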

Related

How to extract text between ::before and ::after

I would like to extract the text between ::before and ::after into a string. How can I use a for loop to extract all the text in selenium Python?
The text i is in between the ::before and ::after pseudoelements. So to extract the text you can use either of the following Locator Strategies:
Using css_selector:
print(driver.find_element(By.CSS_SELECTOR, "div.kbkey.button.red").text)
Using xpath:
print(driver.find_element(By.XPATH, "//div[@class='kbkey button red']").text)
Ideally you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.kbkey.button.red"))).text)
Using XPATH:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='kbkey button red']"))).text)
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
References
Link to useful documentation:
get_attribute() method gets the given attribute or property of the element.
text attribute returns the text of the element.
Difference between text and innerHTML using Selenium
::before and ::after are just pseudo-elements.
Here you can extract the text from the div element itself.
In case there are several divs with class kbkey button red you can do something like this:
buttons = driver.find_elements_by_css_selector("div.kbkey.button.red")
for button in buttons:
    print(button.text)
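The find_elements_by_* helpers used above were removed in newer Selenium 4 releases, where the same loop goes through find_elements(by, value). A rough sketch written against that call only: the function name collect_texts is hypothetical, and the strategy string "css selector" is the value of By.CSS_SELECTOR, spelled out here so that the snippet does not need Selenium imported.

```python
# Same value as selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"


def collect_texts(driver, selector="div.kbkey.button.red"):
    """Return the visible text of every element matching `selector`."""
    elements = driver.find_elements(CSS_SELECTOR, selector)
    return [element.text for element in elements]
```

Called as collect_texts(driver), it returns a plain list of strings that can be looped over or joined as needed.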

Finding web element of dynamic websites using selenium python

I want to scrape text of few fields on the basis of their web elements (xpath, classes etc).
<div class = myOnlyElement>
<div> ......
<div class = afafasf> ......</div>
<div class = klklkl> ......
<div class = qwqwqwq> ......
<div class = reaction> text i need</div>
</div>
</div>
</div>
</div>
<div class = myElement>
<div> ......
<div class = dfdfdf> ......</div>
<div class = ghgghghg> ......
<div class = erererere> ......
<div class = reaction> text i don't need</div>
</div>
</div>
</div>
</div>
Suppose the page's markup looks like this. I find the element like so:
myelem = driver.find_element_by_class_name('myOnlyElement')
Now I only want to pick the element with class "reaction" whose text I need.
I am doing it like this:
myelem.find_element_by_class_name('reaction')
If this class is present it captures it, but in some cases it picks the class = "reaction" element whose text is "text i don't need".
I hope I have stated my question clearly. Can you please help me?
My friend, the best solution for this kind of thing is to right-click the text you want on the web page and inspect it. Then right-click the node in the DOM inspector and click Copy -> Copy Full XPath. You might then need .text or .source to get the values, so try it and play around.
To print the text "text i need", you can use either of the following Locator Strategies:
Using css_selector and get_attribute():
print(driver.find_element_by_css_selector("div.myOnlyElement div.reaction").get_attribute("innerHTML"))
Using xpath and text attribute:
print(driver.find_element_by_xpath("//div[@class='myOnlyElement']//div[@class='reaction']").text)
Ideally, to print the text "text i need", you have to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and get_attribute():
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.myOnlyElement div.reaction"))).get_attribute("innerHTML"))
Using XPATH and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='myOnlyElement']//div[@class='reaction']"))).text)
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
Outro
Link to useful documentation:
get_attribute() method gets the given attribute or property of the element.
text attribute returns the text of the element.
Difference between text and innerHTML using Selenium

Xpath Selenium- How to locate a element by the sub text

I hope you're doing well and can help me with this short question.
I'm trying to locate the following object, id=C39_W133_V136_thtmlb_button_27, using the text inside its span (text = "Edit"). I have tried several approaches but none has worked so far. Any ideas?
<a href="javascript:void(0)" class="th-bt th-bt-icontext-dis icon-font" tabindex="-1" oncontextmenu="return false;" ondragstart="return false;" id="C39_W133_V136_thtmlb_button_27">
::before
<img class="th-bt-img" src="/SAP/BC/BSP/SAP/thtmlb_styles/sap_skins/belize/images/1x1.png">
<span class="th-bt-span"><b class="th-bt-b">Edit</b></span>
<b class="th-bt-b">Edit</b>
</a>
In order to locate an element using text contained in an element, the only option is to use XPath.
//a[./b[.='Edit']]
^ Start at the top of the document and find an A tag
    ^ ...that has a child B tag
        ^ ...whose text is exactly 'Edit'
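The predicate can be tried without a browser. The sketch below uses Python's xml.etree.ElementTree, which supports only a small XPath subset, so the expression is written as .//a[b='Edit'] (an <a> with a child <b> whose text is Edit). The markup is a trimmed, well-formed version of the snippet in the question, and the second <a> with id other_button is invented for contrast.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's markup; "other_button" is a made-up
# second link that should NOT match.
SNIPPET = """
<div>
  <a id="C39_W133_V136_thtmlb_button_27">
    <img class="th-bt-img"/>
    <span class="th-bt-span"><b class="th-bt-b">Edit</b></span>
    <b class="th-bt-b">Edit</b>
  </a>
  <a id="other_button"><b>Save</b></a>
</div>
"""

root = ET.fromstring(SNIPPET)

# Find every <a> that has a child <b> whose text is 'Edit'.
matches = root.findall(".//a[b='Edit']")
ids = [link.get("id") for link in matches]
print(ids)
```

In Selenium itself the full XPath from the answer can be passed unchanged, since browsers implement complete XPath 1.0 rather than this subset.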
To locate the <a> element which has a descendant <b> with the text Edit, you need to induce WebDriverWait for visibility_of_element_located(), and you can use the following Locator Strategy:
Using XPATH:
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[contains(@id, 'thtmlb_button')][.//b[text()='Edit']]")))
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

How to iterate through webelements to extract text from HTML tags in Selenium Web Automation (Python)?

I am making a reddit bot that will look for certain attributes in comments, use selenium to visit the information website, and use driver.find_elements_by... to get the value inside those tags.
Now, driver.find_elements_by... is not iterable, and there are multiple <span class="name">Lorem Ipsum</span> tags with text inside them that I want obtained. I am storing this as a variable and replying to the comment via PRAW.
Suppose that the HTML is this:
<span class="name">Lorem</span>
<span class="name">Ipsum</span>
<span class="name">Dolor</span>
<span class="name">Sit</span>
<span class="name">Amet</span>
So, how could I obtain the text from all of the <span class="name"> tags, and when I store it as a variable and reply, will it just put all the text together without spaces or will it format it with a space between each text, supposing that I write:
tags = driver.find_element_by...
comment.reply("Tags: {}".format(tags))
And if it just puts all the text together, how can I format it so that there are spaces?
To extract the texts (Lorem, Ipsum, Dolor, Sit, Amet, etc.) from all of the <span> elements using Selenium and Python, you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and get_attribute("innerHTML"):
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.name")))])
Using XPATH and text attribute:
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//span[@class='name']")))])
Console Output:
['Lorem', 'Ipsum', 'Dolor', 'Sit', 'Amet']
Note: This is a list of strings, which you can manipulate according to your requirements.
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Outro
Link to useful documentation:
get_attribute() method gets the given attribute or property of the element.
text attribute returns the text of the element.
Difference between text and innerHTML using Selenium
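On the formatting part of the question: passing the list itself to format() produces its Python representation, brackets and quotes included, rather than running the words together. Joining the strings first gives a space-separated reply. A small sketch, using the console output above as sample data:

```python
names = ['Lorem', 'Ipsum', 'Dolor', 'Sit', 'Amet']

# Formatting the list directly keeps the brackets and quotes.
raw_reply = "Tags: {}".format(names)

# Joining first yields one space-separated string.
spaced_reply = "Tags: {}".format(" ".join(names))

print(raw_reply)
print(spaced_reply)
```

Any other separator, such as ", ", can be swapped in for the single space passed to join().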

How can I get this html code's by xpath for using selenium on python?

I have an Update button, and I want my Python bot to click it. I have tried many options but none of them works; I can't locate the element by XPath. Can you please help me? Thanks for your help.
Note: I need to locate it using the text "Update" because the web page has a lot of buttons.
<span class="a-spacing-top-small">
<span class="a-button a-button-primary a-button-small sc-update-link" id="a-autoid-2">
<span class="a-button-inner">
<a href="javascript:void(0);" data-action="update" class="a-button-text" role="button" id="a-autoid-2-announce" style="">Update
<span class="aok-offscreen">Nike Academy 18 Rain Jacket Men's (Obsidian, M)</span>
</a>
</span>
</span>
</span>
This should work:
button = driver.find_element_by_xpath("//a[@data-action='update']")
Where driver is your instance of driver.
The desired element is a JavaScript enabled element so to locate/click() on the element you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "span.a-button.a-button-primary.a-button-small.sc-update-link[id^='a-autoid'] a.a-button-text[id$='announce']>span.aok-offscreen"))).click()
Using XPATH:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//span[@class='a-button a-button-primary a-button-small sc-update-link' and starts-with(@id,'a-autoid')]//a[@class='a-button-text' and contains(@id,'announce')]/span[@class='aok-offscreen' and starts-with(., 'Nike Academy 18 Rain Jacket Men')]"))).click()
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
