How can I get the text in Selenium?


I want to get the text of an element in selenium. First I did this:
team1_names = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, ".home span"))
)
for kir in team1_names:
    print(kir.text)
It didn’t work out. So I tried this:
team1_name = driver.find_elements_by_css_selector('.home span')
print(team1_name.getText())
Using team1_name.text doesn't work either.
So what's wrong with it?

You need to take care of a couple of things here:
presence_of_element_located() is the expectation for checking that an element is present on the DOM of a page. This does not necessarily mean that the element is visible.
Additionally, presence_of_element_located() returns a single WebElement, not a list. Hence, iterating over the result with a for loop won't work.
Solution
As a solution you need to induce WebDriverWait for the visibility_of_element_located(), and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".home span"))).text)
Using XPATH and get_attribute():
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//*[@class='home']//span"))).get_attribute("innerHTML"))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
Outro
Link to useful documentation:
The get_attribute() method gets the given attribute or property of the element.
The text attribute returns the text of the element.
Difference between text and innerHTML using Selenium
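If the goal is the text of every matching span rather than only the first one, a wait that returns a list is needed. The following is a minimal sketch, assuming the .home span selector from the question; the helper names collect_texts and wait_for_texts are made up for illustration and are not part of Selenium:

```python
def collect_texts(elements):
    """Return the stripped, non-empty .text of each element-like object."""
    texts = []
    for element in elements:
        text = (element.text or "").strip()
        if text:
            texts.append(text)
    return texts


def wait_for_texts(driver, css_selector, timeout=10):
    """Wait until all matching elements are visible, then return their texts."""
    # Imported here so collect_texts stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    elements = WebDriverWait(driver, timeout).until(
        EC.visibility_of_all_elements_located((By.CSS_SELECTOR, css_selector))
    )
    return collect_texts(elements)
```

With a live driver this would be called as wait_for_texts(driver, ".home span").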

Related

Python Selenium How to take grandchild text

I am working with this website: https://www.offerte.smartpaws.de/
In the final page once you have put all your variables in, we get a page with this info:
and the HTML for the price looks like this:
I am looking to store the prices for each product, but it seems like I have to select the parent element based on the text "Optimal Care", "Classic Care", etc., and go down to the child element to extract the price.
I am not sure how to do this. My current code for this specific scenario:
price_optimal = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//form[@class="summary-table-title" and @text="Optimal Care"]/*/*/'))).get_attribute("text()")
How would I take a text based on a parent element text?
Thanks

If you only want to retrieve the price, you can use the XPath below:
//div[@class='summary-table-price price-monthly']//div[2]
with find_elements or with explicit waits.
With find_elements
Code:
for price in driver.find_elements(By.XPATH, "//div[@class='summary-table-price price-monthly']//div[2]"):
    print(price.get_attribute('innerText'))
With ExplicitWaits:
for price in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='summary-table-price price-monthly']//div[2]"))):
    print(price.get_attribute('innerText'))
Imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
If you want to extract the price based on the plan name, such as Classic Care,
you should use the XPath below:
//div[text()='Classic Care']//following-sibling::div[@class='summary-table-price price-monthly']/descendant::div[2]
and use it like this:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[text()='Classic Care']//following-sibling::div[@class='summary-table-price price-monthly']/descendant::div[2]"))).get_attribute('innerText'))
For the other plans, all you have to do is replace Classic Care with Optimal Care or Basic Care in the XPath.
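Swapping the plan name by hand can be wrapped in a small helper. This is only a sketch: price_xpath is a hypothetical name, and it assumes the plan name contains no single quote, since the name is placed inside a quoted XPath string literal.

```python
def price_xpath(plan_name):
    """Build the per-plan price XPath by substituting the plan name."""
    # Assumption: plan_name has no single quote, which would break the literal.
    return (
        f"//div[text()='{plan_name}']"
        "//following-sibling::div[@class='summary-table-price price-monthly']"
        "/descendant::div[2]"
    )


# One XPath per plan name mentioned in this thread.
xpaths = {name: price_xpath(name) for name in ("Optimal Care", "Classic Care")}
```

Each resulting string can be passed to visibility_of_element_located((By.XPATH, ...)) exactly as in the answer.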

How to find the class path for the number of recovered people from covid using Selenium and Python

So, I need to get the text (the number of people recovered from covid) from this webpage into the console, but I can't find the class for the numbers. Can someone help me locate the class so I can print the numbers to the console? I need to use PhantomJS because I don't want the log to open when I run the code.
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get('https://www.tvnet.lv/covid19Live')
text = driver.find_element_by_class_name("covid-summary__count covid-c-recovered")
print(text)

find_element_by_class_name() expects a single class as an argument but you are providing two class names (class is a "multi-valued attribute", multiple values are separated by a space).
Either check for a single class:
driver.find_element_by_class_name("covid-c-recovered")
Or, switch to a CSS selector:
driver.find_element_by_css_selector(".covid-summary__count.covid-c-recovered")
Digging Deeper
Let's look at the source code. When elements are searched by class name, Python selenium actually constructs a CSS selector under the hood:
elif by == By.CLASS_NAME:
    by = By.CSS_SELECTOR
    value = ".%s" % value
This means that when you've used covid-summary__count covid-c-recovered as a class name value, the actual CSS selector that was used to find an element happened to be:
.covid-summary__count covid-c-recovered
which understandably did not match any elements (covid-c-recovered would be considered as a tag name here).
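The conversion described above can be reproduced in plain Python to see why a space-separated value fails and a dot-joined one works. This is an illustrative sketch, not Selenium's own code; both function names are hypothetical.

```python
def class_name_to_css(value):
    """Mimic the conversion quoted above: prefix the value with a dot."""
    return ".%s" % value


def compound_class_selector(class_attr):
    """Turn a space-separated class attribute into a compound CSS selector."""
    return "".join("." + name for name in class_attr.split())


# Only the first class gets a dot, so the second is read as a tag name.
broken = class_name_to_css("covid-summary__count covid-c-recovered")
# Every class gets a dot, so the selector requires both classes on one element.
working = compound_class_selector("covid-summary__count covid-c-recovered")
```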

If you want the number make sure you have dots between class names.
driver.get('https://www.tvnet.lv/covid19Live')
element = driver.find_element_by_class_name("covid-summary__count.covid-c-recovered")
print(element.text)
Outputs
19 072

As per the documentation of the selenium.webdriver.common.by implementation:
class selenium.webdriver.common.by.By
Set of supported locator strategies.
CLASS_NAME = 'class name'
So using find_element_by_class_name() you won't be able to pass multiple class names as it accepts a single class.
Solution
To print the number of people HEALED you can use either of the following Locator Strategies:
LATVIJĀ:
print(driver.find_element_by_xpath("//h1[contains(., 'COVID-19 LATVIJĀ')]//following::ul[1]//p[@class='covid-summary__count covid-c-recovered']").text)
PASAULĒ:
print(driver.find_element_by_xpath("//h1[contains(., 'COVID-19 PASAULĒ')]//following::ul[1]//p[@class='covid-summary__count covid-c-recovered']").text)
Ideally you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
LATVIJĀ:
driver.get("https://www.tvnet.lv/covid19Live")
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h1[contains(., 'COVID-19 LATVIJĀ')]//following::ul[1]//p[@class='covid-summary__count covid-c-recovered']"))).text)
PASAULĒ:
driver.get("https://www.tvnet.lv/covid19Live")
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h1[contains(., 'COVID-19 PASAULĒ')]//following::ul[1]//p[@class='covid-summary__count covid-c-recovered']"))).text)
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Console Output:
19 072
52 546 925
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
References
You can find a couple of relevant detailed discussions in:
Invalid selector: Compound class names not permitted error using Selenium
How to locate an element with multiple class names?

Trying to click amazon Best Sellers Rank (Python)

Hello, I'm trying to click those links. When I click with
driver.find_element_by_xpath('//*[@id="productDetails_detailBullets_sections1"]/tbody/tr[6]/td/span/span[2]/a').click()
it works, but the problem is that every item has a different path that keeps changing, so it does not work for some items.
URL: https://www.amazon.com/MICHELANGELO-Piece-Rainbow-Kitchen-Knife/dp/B074T6C4YS/ref=zg_bs_289857_1?_encoding=UTF8&psc=1&refRID=K5GAX1GF2SDZMN3NS403

The Amazon webpage has 3 entries for Best Sellers Rank. An effective approach would be to collect the href of all three Best Sellers links, store them in a list and open each in a separate tab to scrape. To construct the list you have to induce WebDriverWait for the visibility_of_all_elements_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
driver.get('https://www.amazon.com/dp/B074T6C4YS')
print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table#productDetails_detailBullets_sections1 td>span>span a")))])
Using XPATH:
driver.get('https://www.amazon.com/dp/B074T6C4YS')
print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@id='productDetails_detailBullets_sections1']//td/span/span//a")))])
Console Output:
['https://www.amazon.com/gp/bestsellers/kitchen/ref=pd_zg_ts_kitchen', 'https://www.amazon.com/gp/bestsellers/kitchen/289857/ref=pd_zg_hrsr_kitchen', 'https://www.amazon.com/gp/bestsellers/kitchen/289862/ref=pd_zg_hrsr_kitchen']
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
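Once the link elements are located, the hrefs can be gathered and visited one by one. The sketch below is a simplification under stated assumptions: it loads each URL in the same tab rather than a separate one, and unique_hrefs and visit_each are hypothetical helper names.

```python
def unique_hrefs(elements):
    """Collect href attributes in encounter order, skipping blanks and repeats."""
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        if href and href not in hrefs:
            hrefs.append(href)
    return hrefs


def visit_each(driver, hrefs):
    """Load each URL in turn and yield the page title; scraping would go here."""
    for href in hrefs:
        driver.get(href)
        yield driver.title
```

Collecting the hrefs first matters because navigating away invalidates the original elements, so they cannot be clicked one after another.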

This one is easy, although you did not specify which link you want; I assume you want all the different links in that table row.
You need to use a customized XPath, e.g.
'//*[@id="productDetails_detailBullets_sections1"]/tbody/tr[6]/td/span/span[' + str(i) + ']/a'
where i is your iterator in a for loop. To get the upper bound for i, use something like
len(driver.find_elements_by_xpath('//*[@id="productDetails_detailBullets_sections1"]/tbody/tr[6]/td/span/span'))
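The indexed XPaths can be generated up front. A small sketch, keeping in mind that XPath positions start at 1, not 0; indexed_link_xpaths is a hypothetical helper and BASE simply repeats the path used in this thread:

```python
BASE = '//*[@id="productDetails_detailBullets_sections1"]/tbody/tr[6]/td/span/span'


def indexed_link_xpaths(count, base=BASE):
    """Return one XPath per span, using XPath's 1-based positions."""
    return [f"{base}[{i}]/a" for i in range(1, count + 1)]
```

Here count would be the length of the list returned by the find_elements call on BASE.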

Selenium not extracting info using xpath

I am trying to extract some information from the amazon website using selenium. But I am not able to scrape that information using xpath in selenium.
In the image below I want to extract the info highlighted.
This is the code I am using
try:
    path = "//div[@id='desktop_buybox']//div[@class='a-box-inner']//span[@class='a-size-small')]"
    seller_element = WebDriverWait(driver, 5).until(
        EC.visibility_of_element_located((By.XPATH, path)))
except Exception as e:
    print(e)
When I run this code, it shows that there is an error with seller_element = WebDriverWait(driver, 5).until( EC.visibility_of_element_located((By.XPATH, path))) but does not say what exception it is.
I tried looking online and found that this happens when selenium is not able to find the element in the webpage.
But I think the path I have specified is right. Please help me.
Thanks in advance
[EDIT-1]
This is the exception I am getting
Message:

//div[@class='a-section a-spacing-none a-spacing-top-base']//span[@class='a-size-small a-color-secondary']
The XPath could be something like the above; you can shorten it.
The CSS selectors could be the following, and so forth:
.a-section.a-spacing-none.a-spacing-top-base
.a-size-small.a-color-secondary

I think the reason is that the XPath expression is not correct.
Take the following element as an example; it shows that the span has two classes:
<span class="a-size-small a-color-secondary">
So span[@class='a-size-small'] will not work, because it compares against the whole attribute value.
Instead, you can use an XPath such as
//span[contains(@class, 'a-size-small') and contains(@class, 'a-color-secondary')]
or a CSS selector such as
span.a-size-small.a-color-secondary
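Both selector forms can be built from a list of class names. The sketch below swaps in a slightly stricter XPath variant than plain contains(@class, ...): it pads the attribute with spaces so that a-size-small does not also match a longer name such as a-size-smaller. The function names are hypothetical.

```python
def xpath_with_classes(tag, class_names):
    """Build an XPath matching `tag` elements that carry every given class."""
    conditions = [
        f"contains(concat(' ', normalize-space(@class), ' '), ' {name} ')"
        for name in class_names
    ]
    return f"//{tag}[{' and '.join(conditions)}]"


def css_with_classes(tag, class_names):
    """Build the equivalent compound CSS selector."""
    return tag + "".join("." + name for name in class_names)
```

For example, css_with_classes("span", ["a-size-small", "a-color-secondary"]) gives the CSS selector shown above.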

Amazon updates its content based on the country you are in. When I clicked the link you provided, I did not find the element you are looking for, simply because the item is not sold here in India.
In short, if you are in India and try to find the element, it is not there, but once you change the location to "United States" it appears.
Solution - Change the location

To print the text Ships from and sold by Amazon.com you have to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and get_attribute():
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.a-section.a-spacing-none.a-spacing-top-base > span.a-size-small.a-color-secondary"))).get_attribute("innerHTML"))
Using XPATH and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='a-section a-spacing-none a-spacing-top-base']/span[@class='a-size-small a-color-secondary']"))).text)
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
Outro
Link to useful documentation:
The get_attribute() method gets the given attribute or property of the element.
The text attribute returns the text of the element.
Difference between text and innerHTML using Selenium
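The question in this thread printed only "Message:" because the exception raised by a timed-out wait may carry no message text. Printing the exception type as well makes such failures easier to read. A minimal sketch, assuming the exception exposes its text as a msg attribute (as Selenium's exceptions do) and falling back to str() otherwise; describe_exception is a hypothetical helper:

```python
def describe_exception(exc):
    """Return 'ExceptionType: message', noting when the message is blank."""
    # Assumption: Selenium-style exceptions keep their text in .msg.
    message = getattr(exc, "msg", None)
    if message is None:
        message = str(exc)
    message = (message or "").strip()
    return f"{type(exc).__name__}: {message or '(no message)'}"
```

Inside an except block, print(describe_exception(e)) would then show the type name, for example TimeoutException, even when the message itself is empty.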

Order of found elements in Selenium

I'm using selenium with python to interact with a webpage.
There is a table in the webpage. I'm trying to access to its rows with this code:
rows = driver.find_elements_by_class_name("data-row")
It works as expected. It returns all elements of the table.
The question is, is the order of the returned elements guaranteed to be the same as they appear on the page?
For example, Will the first row that I see in the table in browser ALWAYS be the 0th index in the array?

You shouldn't depend on whether Selenium returns the elements in the same order as they appear on the webpage or in the DOM tree.
Each WebElement within the HTML DOM can be identified uniquely using either of the Locator Strategies.
Though you were able to pull out all the desired elements using find_elements_by_class_name() as follows:
rows = driver.find_elements_by_class_name("data-row")
Ideally, you need to induce WebDriverWait for the visibility_of_all_elements_located() and you can use either of the following Locator Strategies:
Using CLASS_NAME:
element = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "data-row")))
Using CSS_SELECTOR:
element = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".data-row")))
Using XPATH:
element = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//*[@class='data-row']")))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a detailed discussion in WebDriverWait not working as expected
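If the visual order matters and you would rather not rely on the order in which the elements are returned, one option is to sort them explicitly by their on-page coordinates. A minimal sketch, assuming each row exposes Selenium's location property (a dict with x and y keys); sort_by_page_position is a hypothetical helper:

```python
def sort_by_page_position(elements):
    """Sort element-like objects top-to-bottom, then left-to-right."""
    return sorted(elements, key=lambda el: (el.location["y"], el.location["x"]))
```

With the rows from this thread it would be used as rows = sort_by_page_position(rows) before indexing into the list.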
