Selenium - find article by h1 and p text - python

How can I find an article on a website by the text in its h1 and p, as in the image below?
I tried the following, which finds all articles, but I don't know how to find the one with specific text in its h1 and in its p. Then I would like to click on it.
text = driver.find_elements_by_xpath("//article/div[contains(@class,'inner-article')]/h1")

To extract and print the text Beanie Custom First and Red you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and text attribute:
Printing Beanie Custom First:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "article.inner-article h1 > a.name-link[href='/shop/asd']"))).text)
Printing Red:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "article.inner-article p > a.name-link[href='/shop/asd']"))).text)
Using XPATH and get_attribute():
Printing Beanie Custom First:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//article[@class='inner-article']//h1/a[@class='name-link' and @href='/shop/asd']"))).get_attribute("innerHTML"))
Printing Red:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//article[@class='inner-article']//p/a[@class='name-link' and @href='/shop/asd']"))).get_attribute("innerHTML"))
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
Outro
Link to useful documentation:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium

text = driver.find_elements_by_xpath("//article/div[contains(@class,'inner-article')][h1/a[contains(text(),'Beanie')]][p/a[contains(text(),'Red')]]")
You can use the XPath above, which checks whether the parent element article/div has child elements h1/a and p/a containing the texts Beanie and Red, respectively.
On W3Schools the HTML editor is inside an iframe, so switch to the iframe in your Selenium tests before trying to find the element.
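As a rough sketch of how that XPath could be built from variables instead of being hard-coded: the helper names xpath_literal, article_xpath and click_article are made up for illustration, and the page structure (article/div.inner-article with h1/a and p/a children) is taken from the question, not verified against the site. Selenium is imported inside click_article so the string helpers can be tried without a browser.

```python
def xpath_literal(text):
    """Quote a Python string so it can be embedded in an XPath 1.0 expression."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote kinds present: split on ' and glue the pieces with concat().
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


def article_xpath(h1_text, p_text):
    """XPath for the article div whose h1/a and p/a contain the given texts."""
    return (
        "//article/div[contains(@class,'inner-article')]"
        f"[h1/a[contains(text(),{xpath_literal(h1_text)})]]"
        f"[p/a[contains(text(),{xpath_literal(p_text)})]]"
    )


def click_article(driver, h1_text, p_text, timeout=20):
    """Wait for the matching article's h1 link to be clickable, then click it."""
    # Imported here so the helpers above work without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, article_xpath(h1_text, p_text) + "/h1/a")
    WebDriverWait(driver, timeout).until(EC.element_to_be_clickable(locator)).click()


print(article_xpath("Beanie", "Red"))
```

With a live driver this would be called as click_article(driver, "Beanie", "Red"); whether the h1 link is the right element to click is an assumption based on the question.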

Related

Extract list of text from g element within svg tag using Selenium and Python

I would like to extract the following text fields that are located within a g tag which is located in a svg tag (the url: https://www.msci.com/our-solutions/esg-investing/esg-ratings-climate-search-tool). I put in a company name and search for it, expand the last drop down menu and want to extract information from the <svg>.
HTML Part I:
HTML Part II:
I tried what has been suggested here: Extracting text from svg using python and selenium
but I did not manage to get it to work.
My code:
test = driver.find_element(By.XPATH, "//*[local-name()='svg' and @class='highcharts-root']//*[local-name()='g' and @class='highcharts-axis-labels highcharts-xaxis-labels']//*[name()='text']").text
print(test)
To extract the texts e.g. Sep-18, Nov-22 you have to induce WebDriverWait for visibility_of_all_elements_located() and you can use either of the following locator strategies:
Using CSS_SELECTOR and text attribute:
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "svg.highcharts-root g.highcharts-axis-labels.highcharts-xaxis-labels text")))])
Using XPATH and get_attribute("innerHTML"):
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//*[local-name()='svg' and @class='highcharts-root']//*[local-name()='g' and @class='highcharts-axis-labels highcharts-xaxis-labels']//*[local-name()='text']")))])
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
References
You can find a couple of relevant detailed discussions in:
Reading title tag in svg?
Creating XPATH for svg tag
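The reason the XPath above uses local-name() is that svg, g and text elements live in the SVG namespace, so a plain tag-name step does not match them. The following offline sketch illustrates that namespace effect with Python's standard xml.etree.ElementTree rather than Selenium; the sample markup is a made-up fragment modelled on the class names in the question, not the real page.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical fragment shaped like the Highcharts axis labels in the question.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg" class="highcharts-root">
  <g class="highcharts-axis-labels highcharts-xaxis-labels">
    <text>Sep-18</text>
    <text>Nov-22</text>
  </g>
</svg>"""

root = ET.fromstring(SAMPLE)

# A bare tag name ignores the namespace and therefore matches nothing.
plain = root.findall(".//text")

# Qualifying the tag with the SVG namespace finds the label elements.
namespaced = root.findall(f".//{{{SVG_NS}}}text")
labels = [el.text for el in namespaced]

print(len(plain), labels)
```

In Selenium the same idea is expressed with //*[local-name()='text'] in XPath, or with a CSS selector such as the one in the answer, since CSS type selectors match on local name by default.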

Get Youtube video title using classname and text attribute using Selenium and Python

Hi I'm using Python Selenium Webdriver to get Youtube title but keep getting more info than I'd like.
The line is:
driver.find_element_by_class_name("style-scope ytd-video-primary-info-renderer").text
Is there any way to fix it and make it more efficient so that it displays only the title?
Here is the test script I'm using:
from selenium import webdriver as wd
from time import sleep as zz
driver = wd.Firefox(executable_path=r'./geckodriver.exe')
driver.get('https://www.youtube.com/watch?v=wma0szfIafk')
zz(4)
test_atr = driver.find_element_by_class_name("style-scope ytd-video-primary-info-renderer").text
print(test_atr)
To print the title text OBI-WAN KENOBI Official Trailer (2022) Teaser you can use either of the following Locator Strategies:
Using css_selector and get_attribute("innerHTML"):
print(driver.find_element(By.CSS_SELECTOR, "h1.title.style-scope.ytd-video-primary-info-renderer > yt-formatted-string.style-scope.ytd-video-primary-info-renderer").get_attribute("innerHTML"))
Using xpath and text attribute:
print(driver.find_element(By.XPATH, "//h1[@class='title style-scope ytd-video-primary-info-renderer']/yt-formatted-string[@class='style-scope ytd-video-primary-info-renderer']").text)
Ideally you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h1.title.style-scope.ytd-video-primary-info-renderer > yt-formatted-string.style-scope.ytd-video-primary-info-renderer"))).text)
Using XPATH and get_attribute("innerHTML"):
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h1[@class='title style-scope ytd-video-primary-info-renderer']/yt-formatted-string[@class='style-scope ytd-video-primary-info-renderer']"))).get_attribute("innerHTML"))
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Console Output:
OBI-WAN KENOBI Official Trailer (2022) Teaser
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
References
Link to useful documentation:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium
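One detail worth spelling out: the question passes two class names separated by a space to find_element_by_class_name, which expects a single class name. The CSS selectors in the answer instead chain the classes with dots. A small sketch of that conversion follows; classes_to_css is a hypothetical helper, and it assumes ordinary class names that need no CSS escaping.

```python
def classes_to_css(class_attr, tag=""):
    """Turn a space-separated class attribute value into a chained CSS selector."""
    names = class_attr.split()  # split() also collapses repeated whitespace
    return tag + "".join("." + name for name in names)


# Class attribute value copied from the question.
selector = classes_to_css("style-scope ytd-video-primary-info-renderer", tag="yt-formatted-string")
print(selector)
```

The resulting string can then be passed to driver.find_element(By.CSS_SELECTOR, selector), ideally scoped under the h1 as the answer does.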

Unable to print the text using Python Selenium?

So I want to print out "ou" from this HTML via XPath.
<div class="syllable">ou</div>
This is my code:
elem = driver.find_elements_by_xpath('*my xpath*')
for my_elem in elem:
    print(my_elem.text)
I used .text in this code but it just prints out nothing when it runs. It should be printing ou but it just prints nothing at all. If someone could please tell me what I am doing wrong that would be great.
To print the text ou you can use either of the following Locator Strategies:
Using css_selector and get_attribute("innerHTML"):
print(driver.find_element_by_css_selector("div.syllable").get_attribute("innerHTML"))
Using xpath and text attribute:
print(driver.find_element_by_xpath("//div[@class='syllable']").text)
Ideally you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.syllable"))).text)
Using XPATH and get_attribute():
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='syllable']"))).get_attribute("innerHTML"))
Console Output:
ou
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
References
Link to useful documentation:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium
If you are unable to print the text using .text, then use .get_attribute("innerHTML") instead, which should print the text.
For example, your code should look like this. Note that find_elements_by_xpath returns a list, so get_attribute has to be called on each element rather than on the list:
elem = driver.find_elements_by_xpath('*my xpath*')
for my_elem in elem:
    print(my_elem.get_attribute("innerHTML"))
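The two approaches can be combined: try .text first, and fall back to innerHTML when it comes back empty, which is what tends to happen when the element is not rendered as visible. The sketch below is only an illustration; read_text and the FakeElement stand-in are hypothetical names, and the stand-in exists purely so the snippet runs without a browser.

```python
def read_text(element):
    """Return element.text, falling back to innerHTML when .text is empty."""
    text = element.text
    if text and text.strip():
        return text.strip()
    inner = element.get_attribute("innerHTML")
    return (inner or "").strip()


class FakeElement:
    """Stand-in with the same two members the helper uses on a WebElement."""

    def __init__(self, text, inner_html):
        self.text = text
        self._inner_html = inner_html

    def get_attribute(self, name):
        return self._inner_html if name == "innerHTML" else None


# Mimics the situation in the question: .text is empty, the markup holds "ou".
hidden = FakeElement(text="", inner_html="ou")
print(read_text(hidden))
```

With real elements the call would be read_text(my_elem) inside the loop over find_elements results. Keep in mind that innerHTML returns markup, so nested tags would appear in the output.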

Xpath and CssSelector not extracting data using Selenium and Python

I am trying to pull the total listings at the top of the main section (in this case it currently says 72 Listings). I have tried both By.XPATH and By.CSS_SELECTOR with no luck.... Any idea why this isn't working?
driver.get('https://swappa.com/mobile/buy/apple-iphone-8/att')
n = WebDriverWait(driver, 0.01).until(EC.presence_of_element_located((By.CSS_SELECTOR, ".col-xs-9 pull-left"))).text
To print the text 72 Listings you can use either of the following Locator Strategies:
Using css_selector and get_attribute("innerHTML"):
print(driver.find_element_by_css_selector("section#section_billboard h2 small").get_attribute("innerHTML"))
Using xpath and text attribute:
print(driver.find_element_by_xpath("//section[@id='section_billboard']//h2//small").text)
Ideally you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and text attribute:
driver.get('https://swappa.com/mobile/buy/apple-iphone-8/att')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "section#section_billboard h2 small"))).text)
Using XPATH and get_attribute():
driver.get('https://swappa.com/mobile/buy/apple-iphone-8/att')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//section[@id='section_billboard']//h2//small"))).get_attribute("innerHTML"))
Console Output:
value
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
References
Link to useful documentation:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium
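Besides the selector, the question's WebDriverWait(driver, 0.01) gives the page only a hundredth of a second, whereas the answer waits up to 20 seconds. The following simplified polling loop is not Selenium's implementation, just a sketch of the general idea behind an explicit wait; wait_until and make_slow_lookup are hypothetical names, and the lookup simulates an element that appears only after a few attempts.

```python
import time


def wait_until(condition, timeout, poll=0.05):
    """Call condition() until it returns something truthy or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


def make_slow_lookup(ready_after_calls, value):
    """Hypothetical lookup that only 'finds' its element on the Nth call."""
    calls = {"n": 0}

    def lookup():
        calls["n"] += 1
        return value if calls["n"] >= ready_after_calls else None

    return lookup


# A generous timeout lets the third attempt succeed.
print(wait_until(make_slow_lookup(3, "72 Listings"), timeout=5, poll=0.01))
```

With a timeout close to zero the same lookup gives up after the first failed attempt, which is roughly the situation in the question.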

Selenium won't return text

So I have this code which is referencing this html class:
<div class="text-right _2jRRJJvarKXJGP9oRP-Bv0 pbBzNArud8-t5yrNZi8Wg rankings-list-volume">2775.60</div>
Vol = driver.find_element_by_xpath('/html/body/div[1]/div[1]/div[2]/div[1]/div/div[2]/div/main/div/div/div[2]/a[1]/div[8]').text
print(Vol)
If I run it without the .text it returns just the element name and such so I know it is finding the element but it won't return the text. The 2775.60 value within the HTML is what it should be returning but instead, it just prints nothing.
To print the text 2769.94 you can use either of the following Locator Strategies:
Using css_selector and get_attribute():
print(driver.find_element_by_css_selector("div.rankings-list-volume").get_attribute("innerHTML"))
Using xpath and text attribute:
print(driver.find_element_by_xpath("//div[contains(@class, 'rankings-list-volume')]").text)
Ideally, to print the text 2769.94 you have to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and get_attribute():
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.rankings-list-volume"))).get_attribute("innerHTML"))
Using XPATH and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[contains(@class, 'rankings-list-volume')]"))).text)
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
Outro
Link to useful documentation:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium
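A caveat on contains(@class, ...): it is a substring test, so it would also match a class such as rankings-list-volume-header. A common XPath 1.0 idiom for matching a whole class token pads the attribute with spaces first. The sketch below builds such an expression; has_class_xpath is a hypothetical helper name, and the class name comes from the HTML in the question.

```python
def has_class_xpath(tag, class_name):
    """XPath matching <tag> elements whose class list contains class_name as a whole token."""
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {class_name} ')]"
    )


volume_xpath = has_class_xpath("div", "rankings-list-volume")
print(volume_xpath)
```

The string can be passed to By.XPATH in the same way as the locators above. The CSS selector div.rankings-list-volume already matches whole class tokens, so this only matters when XPath is required.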
