How to extract the text from the HTML using Selenium and Python - python

I have this HTML:
And I want get this text "rataoriginal". (This text changes, I need this part of the code as text)
I've tried
xpath = "//span[#class='_5h6Y_ _3Whw5 selectable-text invisible-space copyable-text']"
auxa = driver.find_element_by_xpath(xpath).text
print(auxa)
But it prints the same as print("\n"). I don't want to use beaultifulsoup for a while.
This HTML is from 'https://web.whatsapp.com'

The WebElement is a dynamic element, so to print the values you have to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span.selectable-text.invisible-space.copyable-text[dir='auto']"))).text)
Using XPATH:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[contains(#class, '') and contains(#class, 'invisible-space')][contains(#class, '') and #dir='auto']"))).text)
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
References
You can find a relevant discussion in:
How to retrieve the text of a WebElement using Selenium - Python

//*[contains(text(),"rataoriginal")] please use this xpath

Related

Find Text By Nth Class Name is Selenium

I would like to get the the Date time text on this webpage but had issues locating it. A help will be appreciated.
If I use this codde,
var = wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'col-md-3'))).text
I get the text in the first class (col-md-39). How do a I get the text in third class in this instance.
The Date Time text is within a text node. So to print the text you have to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following locator strategies:
Using XPATH and childNodes[n]:
print(driver.execute_script('return arguments[0].lastChild.textContent;', WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[#class='row']//div[#class='col-md-3'][.//strong[text()='Last communication']]")))).strip())
Using XPATH and splitlines():
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[#class='row']//div[#class='col-md-3'][.//strong[text()='Last communication']]"))).get_attribute("innerHTML").splitlines()[-1])
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You need to change the locator to get the desired result:
By.XPath, '//*[text()="Last communication"]//parent::div'

Xpath and CssSelector not extracting data using Selenium and Python

I am trying to pull the total listings at the top of the main section (in this case it currently says 72 Listings). I have tried both By.XPATH and By.CSS_SELECTOR with no luck.... Any idea why this isn't working?
driver.get('https://swappa.com/mobile/buy/apple-iphone-8/att')
n = WebDriverWait(driver, 0.01).until(EC.presence_of_element_located((By.CSS_SELECTOR, ".col-xs-9 pull-left"))).text
To print the text 72 Listings you can use either of the following Locator Strategies:
Using css_selector and get_attribute("innerHTML"):
print(driver.find_element_by_css_selector("section#section_billboard h2 small").get_attribute("innerHTML"))
Using xpath and text attribute:
print(driver.find_element_by_xpath("//section[#id='section_billboard']//h2//small").text)
Ideally you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and text attribute:
driver.get('https://swappa.com/mobile/buy/apple-iphone-8/att')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "section#section_billboard h2 small"))).text)
Using XPATH and get_attribute():
driver.get('https://swappa.com/mobile/buy/apple-iphone-8/att')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//section[#id='section_billboard']//h2//small"))).get_attribute("innerHTML"))
Console Output:
value
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
References
Link to useful documentation:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium

How to click an element using Selenium Python

So I have this element in HTML...
<span style="font-size:11pt">Security</span>
... and I want to click it using Selenium in python. How can I do it? (Maybe I can identify it using onclick, but how?)
Also, here is what I've tried:
security = driver.find_element_by_xpath("//button[#onclick=\"on_catolog(2);\"]")
security.click()
You used wrong tag name in your XPath "//button[#onclick=\"on_catolog(2);\"]". It's not a button, but anchor tag. Try
"//a[#onclick='on_catolog(2);']"
Or click by link text:
security = driver.find_element_by_link_text("Security")
security.click()
To click on the element with text as Security you can use either of the following Locator Strategies:
Using css_selector:
driver.find_element_by_css_selector("a[onclick^='on_catolog'] > span[style='font-size:11pt']").click()
Using xpath:
driver.find_element_by_xpath("//a[starts-with(#onclick, 'on_catolog')]/span[text()='Security']").click()
Ideally, to click on the element you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[onclick^='on_catolog'] > span[style='font-size:11pt']"))).click()
Using XPATH:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[starts-with(#onclick, 'on_catolog')]/span[text()='Security']"))).click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Selenium wont return text

So I have this code which is referencing this html class:
<div class="text-right _2jRRJJvarKXJGP9oRP-Bv0 pbBzNArud8-t5yrNZi8Wg rankings-list-volume">2775.60</div>
Vol = driver.find_element_by_xpath('/html/body/div[1]/div[1]/div[2]/div[1]/div/div[2]/div/main/div/div/div[2]/a[1]/div[8]').text
print(Vol)
If I run it without the .text it returns just the element name and such so I know it is finding the element but it won't return the text. The 2775.60 value within the HTML is what it should be returning but instead, it just prints nothing.
To print the text 2769.94 you can use either of the following Locator Strategies:
Using css_selector and get_attribute():
print(driver.find_element_by_css_selector("div.rankings-list-volume").get_attribute("innerHTML"))
Using xpath and text attribute:
print(driver.find_element_by_xpath("//div[contains(#class, 'rankings-list-volume')]").text)
Ideally, to print the text 2769.94 you have to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and get_attribute():
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.rankings-list-volume"))).get_attribute("innerHTML"))
Using XPATH and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[contains(#class, 'rankings-list-volume')]"))).text)
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
Outro
Link to useful documentation:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium

How can I select the image link from src html part using python selenium

I'm trying to get all the links of the images from the below mentioned html code type:
<img src="https://img.xyz.com/rcmshopping/PROD_IMAGES/fffff_2018_05_10_10_49_23.jpg">
I am not able to fetch any link from this part of html out f 3 any 1 is good for me. Im new to python please guide.
To print the value of the src attribute you have to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:
Using XPATH:
print(WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[#class='elevatezoom-gallery']/img"))).get_attribute("src"))
Using CSS_SELECTOR:
print(WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a.elevatezoom-gallery>img"))).get_attribute("src"))
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can achieve that by using the find_elements_by_css_selector() function.
links = self.webdriver.find_elements_by_css_selector('img[src="https://img.xyz.com/rcmshopping/PROD_IMAGES/fffff_2018_05_10_10_49_23.jpg"])

Categories

Resources