How to get div text by id with selenium

I am trying to get a book's description text from its Amazon web page. I can get the book's image, title, and price with driver.find_element_by_id, but the description, which is in a div with id="iframeContent", is never found. Why? I have also tried WebDriverWait, with no luck.
I use the following code:
def get_product_description(self, url):
    """Returns the product description of the Amazon URL."""
    self.driver.get(url)
    try:
        product_desc = self.driver.find_element_by_id("iframeContent")
    except:
        pass
    if product_desc is None:
        product_desc = "Not available"
    return product_desc

That element is inside an iframe, so you have to switch to the iframe before you can access anything inside it.
First, locate the iframe:
iframe = driver.find_element_by_id("bookDesc_iframe")
Then switch to it:
driver.switch_to.frame(iframe)
Now you can access the element with the CSS selector div#iframeContent and read its .text.
Finally, switch back from the iframe to the default content:
driver.switch_to.default_content()
Credits to the author

You need to read the element's .text property (it is a property, not a method):
try:
    product_desc = self.driver.find_element_by_id("iframeContent").text
    print(product_desc)
except:
    pass
Update 1:
There is an iframe involved, so you need to switch the WebDriver's focus to the iframe first:
iframe = self.driver.find_element_by_id("bookDesc_iframe")
self.driver.switch_to.frame(iframe)
product_desc = self.driver.find_element_by_id("iframeContent").text
print(product_desc)
Now you can access every element inside the bookDesc_iframe iframe. To reach elements outside this frame, switch the WebDriver's focus back to the default content first:
self.driver.switch_to.default_content()
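Putting the steps from the answers together, here is a minimal sketch of a helper that switches into the iframe, reads the text, and always switches back. The function name get_text_in_frame and the "Not available" default are my own choices, and the locator strings are written out (they are the values of Selenium's By.ID and By.CSS_SELECTOR) so the snippet does not depend on a particular Selenium version. It catches Exception broadly to stay short; in real code you would probably catch NoSuchElementException instead.

```python
# Locator strategy strings; these are the values of Selenium's By.ID and
# By.CSS_SELECTOR, written out so the sketch has no import-time dependency.
BY_ID = "id"
BY_CSS = "css selector"


def get_text_in_frame(driver, frame_id, css_selector, default="Not available"):
    """Return the text of an element inside an iframe, or `default` on failure.

    Hypothetical helper; `driver` is assumed to be a Selenium WebDriver.
    """
    try:
        frame = driver.find_element(BY_ID, frame_id)
    except Exception:
        # The iframe itself was not found, so there is nothing to switch into.
        return default
    driver.switch_to.frame(frame)
    try:
        return driver.find_element(BY_CSS, css_selector).text
    except Exception:
        # The element is missing inside the frame.
        return default
    finally:
        # Always return to the top-level document, even on failure.
        driver.switch_to.default_content()
```

With the ids from the question this would be called as get_text_in_frame(driver, "bookDesc_iframe", "div#iframeContent").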

Related

Switching to a frame and back invalidates WebElement

This function receives a WebElement and uses it to locate other elements. But when I switch to an iframe and back, the WebElement seems to be invalidated. The error I get is "stale element reference: element is not attached to the page document".
def map_vid(vid):
    vid_name = vid.find_element(By.XPATH, "./*[1]").text
    frame = driver.find_element(By.XPATH, FRAME_XPATH)
    driver.switch_to.frame(frame)
    quality_button = driver.find_element(By.CLASS_NAME, 'icon-cog')
    quality_button.click()
    driver.switch_to.default_content()
    close_button = vid.find_element(By.XPATH, CLOSE_BUTTON_XPATH)
    close_button.click()
Relocating the WebElement is not an option, since it's passed as an argument. Any help would be greatly appreciated :)

You need to pass a By locator as the parameter instead of a WebElement.
You cannot make this code work while passing a WebElement, because a Selenium WebElement is a reference to a physical element on the page.
As you can see, after switching to an iframe and back, the previously found WebElement has become stale.
Passing a locator lets you find the parent element again inside the function each time you need it.
Something like the following, where vid_by is a (By, value) tuple:
def map_vid(vid_by):
    # vid_by is a locator tuple such as (By.XPATH, some_xpath); unpack it for find_element
    vid = driver.find_element(*vid_by)
    vid_name = vid.find_element(By.XPATH, "./*[1]").text
    frame = driver.find_element(By.XPATH, FRAME_XPATH)
    driver.switch_to.frame(frame)
    quality_button = driver.find_element(By.CLASS_NAME, 'icon-cog')
    quality_button.click()
    driver.switch_to.default_content()
    # Find the parent element again; the reference from before the switch is stale
    vid = driver.find_element(*vid_by)
    close_button = vid.find_element(By.XPATH, CLOSE_BUTTON_XPATH)
    close_button.click()
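If you would rather keep passing a single object around, the same idea (remember the locator, not the element) can be wrapped in a small class. This is only a sketch: ElementRef is a name I made up, it is not part of Selenium, and it assumes driver is a regular WebDriver.

```python
class ElementRef:
    """Remembers how to find an element instead of holding the element itself.

    Hypothetical helper, not part of Selenium; `driver` is assumed to be a
    Selenium WebDriver.
    """

    def __init__(self, driver, by, value):
        self.driver = driver
        self.by = by
        self.value = value

    def find(self):
        # A fresh lookup on every call, so the returned reference is current
        # at the time of the call rather than left over from before a switch.
        return self.driver.find_element(self.by, self.value)
```

The caller would build the reference once and call find() before and after the frame switch, instead of reusing the element found earlier.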

Instagram Following Selenium Python

I am trying to get Selenium to click the "following" link on an Instagram account page. Here is the HTML for it:
<a class=" _81NM2" href="/peacocktv/following/" tabindex="0"><span class="g47SY lOXF2">219</span> following</a>
On Instagram, opening the /username/following URL directly does not bring up the following list; you have to click the link on the profile page for it to appear.
log_in = WebDriverWait(driver, 8).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[type = 'submit']"))).click()
not_now = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//button[contains(text(), "Not Now")]')))
not_now.click()
'''not_now2 = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//button[contains(text(), "Not Now")]')))
not_now2.click()'''
search_select = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//input[@placeholder = 'Search']")))
search_select.clear()
'''search_select.send_keys("russwest44", Keys.ENTER)
search_select.send_keys(Keys.ENTER)'''
# Clicks who the person is following:
following = driver.find_elements_by_xpath("//button[contains(text(),'following')]")
following.click()
What is the best way to do this?

There seem to be a couple of issues with your current code:
# Clicks who the person is following:
following = driver.find_elements_by_xpath("//button[contains(text(),'following')]")
following.click()
First, the HTML you shared uses an a tag, so your XPath should be
//a[contains(text(),'following')]
not
//button[contains(text(),'following')]
Second, find_elements_by_xpath returns a list of web elements, so you cannot call .click() on it directly.
Either switch to find_element_by_xpath, or click the first item in the list:
following[0].click()
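Both fixes can be combined in a small helper. This is just a sketch: click_first and FOLLOWING_XPATH are names I introduced, and "xpath" is the value of Selenium's By.XPATH, written out so the snippet stands on its own. Because find_elements returns an empty list rather than raising when nothing matches, the helper can report a miss without a try/except.

```python
BY_XPATH = "xpath"  # value of Selenium's By.XPATH

# XPath for the anchor tag from the question's HTML (an <a>, not a <button>).
FOLLOWING_XPATH = "//a[contains(text(),'following')]"


def click_first(driver, xpath):
    """Click the first element matching `xpath`; return False if none match."""
    matches = driver.find_elements(BY_XPATH, xpath)  # always a list, possibly empty
    if not matches:
        return False
    matches[0].click()
    return True
```

You would call it as click_first(driver, FOLLOWING_XPATH) once the profile page has loaded.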

Selenium can't locate element inside ::before ::after

I am trying to locate a span element on a web page. I tried XPath, but it raises a timeout error. I want to locate the title span element of a Facebook Marketplace product.
Here is my code:
def title_detector():
    title = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, 'path'))).text
    list_data = title.split("ISBN", 1)

Try this xpath //span[contains(text(),'isbn')]

You can't locate pseudo-elements with XPath. You can only reach them through a CSS selector for the element that owns them, plus JavaScript.
I see it's Facebook, with its ugly class names...
I'm not sure this will work for you, since these class names may be dynamic, but it worked for me this time.
Anyway, the css_locator for that span element is .dati1w0a.qt6c0cv9.hv4rvrfc.discj3wi .d2edcug0.hpfvmrgz.qv66sw1b.c1et5uql.lr9zc1uh.a8c37x1j.keod5gw0.nxhoafnm.aigsh9s9.qg6bub1s.fe6kdd0r.mau55g9w.c8b282yb.iv3no6db.o0t2es00.f530mmz5.hnhda86s.oo9gr5id
Since we are trying to get its ::before content, we can do it with the following JavaScript:
span_locator = ".dati1w0a.qt6c0cv9.hv4rvrfc.discj3wi .d2edcug0.hpfvmrgz.qv66sw1b.c1et5uql.lr9zc1uh.a8c37x1j.keod5gw0.nxhoafnm.aigsh9s9.qg6bub1s.fe6kdd0r.mau55g9w.c8b282yb.iv3no6db.o0t2es00.f530mmz5.hnhda86s.oo9gr5id"
script = "return window.getComputedStyle(document.querySelector('{}'),':before').getPropertyValue('content')".format(span_locator)
print(driver.execute_script(script).strip())
If the CSS selector above does not work because the class names are dynamic, try to locate that span with a more stable CSS selector. It is possible; you just have to load the page several times to see which class names stay the same and which do not.
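A slightly more defensive version of the script above, as a sketch: it passes the selector to the browser as an argument instead of formatting it into the JavaScript string, and it returns None when the element is not found. The names get_pseudo_content and PSEUDO_CONTENT_SCRIPT are mine; execute_script with extra arguments and window.getComputedStyle with a pseudo-element argument are standard.

```python
# JavaScript run in the page; arguments[0] is the CSS selector and
# arguments[1] is the pseudo-element name, both passed from Python.
PSEUDO_CONTENT_SCRIPT = (
    "var el = document.querySelector(arguments[0]);"
    "if (!el) { return null; }"
    "return window.getComputedStyle(el, arguments[1]).getPropertyValue('content');"
)


def get_pseudo_content(driver, css_selector, pseudo="::before"):
    """Return the CSS `content` of a pseudo-element, or None if the element is missing.

    Hypothetical helper; `driver` is assumed to be a Selenium WebDriver.
    """
    raw = driver.execute_script(PSEUDO_CONTENT_SCRIPT, css_selector, pseudo)
    if raw is None:
        return None
    # Browsers typically report string content wrapped in double quotes.
    return raw.strip().strip('"')
```

Passing the selector as an argument avoids quoting problems if the selector ever contains a quote character.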
UPD:
You don't need to locate the pseudo-elements there; it is enough to catch the span itself. Something like this will do:
span_locator = ".dati1w0a.qt6c0cv9.hv4rvrfc.discj3wi .d2edcug0.hpfvmrgz.qv66sw1b.c1et5uql.lr9zc1uh.a8c37x1j.keod5gw0.nxhoafnm.aigsh9s9.qg6bub1s.fe6kdd0r.mau55g9w.c8b282yb.iv3no6db.o0t2es00.f530mmz5.hnhda86s.oo9gr5id"
def title_detector():
    # Pass the span_locator variable itself, not the string 'span_locator'
    title = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, span_locator))).text
    title = title.strip()
    list_data = title.split("ISBN", 1)

How to get iframe source from page_source when the id isn't on the iframe

Hello, today I want to ask how to get a link from the page source when the iframe has no id. I asked before how to get the link using an id and I understand that now, but I tried the same method on another page without success. Here is my code:
from selenium import webdriver
# Create a new instance of the Chrome driver
driver_path = r"C:\Users\666\Desktop\New folder (8)\chromedriver.exe"
driver = webdriver.Chrome(driver_path)
# Open the video page
driver.get("https://www.gledalica.com/sa-prevodom/iron-fist-s02e01-video_02cb355f8.html")
# Find the player container, then the iframe
element = driver.find_element_by_id("Playerholder")
frame = driver.find_element_by_tag_name("iframe")
driver.switch_to.frame("iframe")
link = frame.get_attribute("src")
driver.quit()

There are multiple ways to get it. In this case, one of the easiest is a CSS selector:
frame = driver.find_element_by_css_selector('#Playerholder iframe')
This looks for the element with id="Playerholder" in the HTML and then for an iframe that is a descendant of it.
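To go from the located iframe to its link, read the src attribute without switching into the frame. A minimal sketch, assuming the container id from the question; get_iframe_src is a name I made up, and "css selector" is the value of Selenium's By.CSS_SELECTOR, written out so the snippet stands on its own.

```python
BY_CSS = "css selector"  # value of Selenium's By.CSS_SELECTOR


def get_iframe_src(driver, container_id):
    """Return the src of the first iframe inside the element with `container_id`.

    Hypothetical helper; `driver` is assumed to be a Selenium WebDriver.
    """
    selector = "#{} iframe".format(container_id)
    frame = driver.find_element(BY_CSS, selector)
    # Read the attribute from the top-level document; no frame switch is needed.
    return frame.get_attribute("src")
```

For the page in the question this would be get_iframe_src(driver, "Playerholder").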

BeautifulSoup in Python - DIV Contents are not displaying

I would like to start by saying I reviewed several solutions on this site, but none seem to be working for me.
I am simply trying to access the contents of a div tag from this website: https://play.spotify.com/chart/3S3GshZPn5WzysgDvfTywr, but the contents are not showing.
Here is the code I have so far:
SpotifyGlobViralurl='https://play.spotify.com/chart/3S3GshZPn5WzysgDvfTywr'
browser.get(SpotifyGlobViralurl)
page = browser.page_source
soup = BeautifulSoup(page)
#the div contents exist in an iframe, so now we call the iframe contents of the 3rd iframe on page:
iFrames=[]
iframexx = soup.find_all('iframe')
response = urllib2.urlopen(iframexx[3].attrs['src'])
iframe_soup = BeautifulSoup(response)
divcontents = iframe_soup.find('div', id='main-container')
I am trying to pull the contents of the 'main-container' div, but as you will see, it appears empty when stored in the divcontents variable. However, if you visit the actual URL and inspect the elements, you will find this 'main-container' div filled with all of its contents.
I appreciate the help.

That's because the container is loaded dynamically. I've noticed you are already using Selenium; keep using it, switch to the iframe, and wait for main-container to load:
wait = WebDriverWait(browser, 10)
# wait for iframe to become visible
iframe = wait.until(EC.visibility_of_element_located((By.XPATH, "//iframe[starts-with(@id, 'browse-app-spotify:app:chart:')]")))
browser.switch_to.frame(iframe)
# wait for header in the container to appear
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#main-container #header")))
container = browser.find_element_by_id("main-container")
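In case it helps to see why waiting fixes this: an explicit wait just polls a condition until it returns something truthy or a timeout runs out. The sketch below shows that idea in plain Python. It is not Selenium's implementation; wait_until is a name I made up, and LookupError stands in for the "element not found yet" error that Selenium's own wait ignores by default.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call `condition()` repeatedly until it returns a truthy value.

    Raises TimeoutError if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
        except LookupError:
            # Stand-in for "element not found yet": treat it as not ready.
            result = None
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within {} s".format(timeout))
        time.sleep(poll)
```

Reading page_source right after get() is like calling the condition once and giving up, which is why the div looked empty.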
