Hey geeks, I am new to Selenium and automated testing. I am trying to extract the span values (i.e. l e k C N t in the example below) from a webpage, but it either gives an empty value or a NoSuchElementException. Can anyone help?
Language: python
Webpage:
<div _ngcontent-serverapp-c6="" class="captcha-letters">
<span _ngcontent-serverapp-c6="">l</span>
<span _ngcontent-serverapp-c6="">e</span>
<span _ngcontent-serverapp-c6="">k</span>
<span _ngcontent-serverapp-c6="">C</span>
<span _ngcontent-serverapp-c6="">N</span>
<span _ngcontent-serverapp-c6="">t</span>
</div>
To print the text from each span element on the page, this should work:
elements = driver.find_elements_by_css_selector('span')
for element in elements:
    print(element.text)
If you need to be more specific and grab only certain span elements, we will need more HTML to go off of.
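Since the question does include the container markup, scoping the search to the captcha-letters div avoids picking up unrelated spans; the empty-value symptom is often the Angular page not having rendered yet, so a WebDriverWait on the container may also be needed. A minimal offline sketch of the scoping idea, using the stdlib XML parser on the question's own HTML:

```python
import xml.etree.ElementTree as ET

# Offline sketch using the question's own markup; against the live page
# you would use selenium plus an explicit wait instead.
html = '''<div _ngcontent-serverapp-c6="" class="captcha-letters">
<span _ngcontent-serverapp-c6="">l</span>
<span _ngcontent-serverapp-c6="">e</span>
<span _ngcontent-serverapp-c6="">k</span>
<span _ngcontent-serverapp-c6="">C</span>
<span _ngcontent-serverapp-c6="">N</span>
<span _ngcontent-serverapp-c6="">t</span>
</div>'''

container = ET.fromstring(html)
# Only spans inside the captcha container, mirroring
# driver.find_elements_by_css_selector('.captcha-letters span')
letters = [span.text for span in container.findall('.//span')]
print(''.join(letters))  # lekCNt
```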
Related
I have a web page something like:
<div class='product-pod-padding'>
<a class='header product-pod--ie-fix' href='link1'/>
<div> SKU#1</div>
</div>
<div class='product-pod-padding'>
<a class='header product-pod--ie-fix' href='link2'/>
<div> SKU#2</div>
</div>
<div class='product-pod-padding'>
<a class='header product-pod--ie-fix' href='link3'/>
<div> SKU#3</div>
</div>
When I loop through the products with the following code, it gives the expected outcome:
products = driver.find_elements_by_xpath("//div[@class='product-pod-padding']")
for index, product in enumerate(products):
    print(product.text)
SKU#1
SKU#2
SKU#3
However, if I try to locate the href of each product, it will only return the first item's link:
products = driver.find_elements_by_xpath("//div[@class='product-pod-padding']")
for index, product in enumerate(products):
    print(index)
    print(product.text)
    url = product.find_element_by_xpath("//a[@class='header product-pod--ie-fix']").get_attribute('href')
    print(url)
SKU#1
link1
SKU#2
link1
SKU#3
link1
What should I do to get the corrected links?
This should make your code functional:
[...]
products = driver.find_elements_by_xpath("//div[@class='product-pod-padding']")
for index, product in enumerate(products):
    print(index)
    print(product.text)
    url = product.find_element_by_xpath(".//a[@class='header product-pod--ie-fix']").get_attribute('href')
    print(url)
[..]
The crux here is the dot in front of the XPath, which means searching within the current element only.
You need to use a relative XPath in order to locate a node inside another node.
//a[@class='header product-pod--ie-fix'] will always return the first match from the beginning of the DOM.
You need to put a dot . at the front of the XPath locator:
".//a[@class='header product-pod--ie-fix']"
This will retrieve the desired element inside the parent element:
url = product.find_element_by_xpath(".//a[@class='header product-pod--ie-fix']").get_attribute('href')
So, your entire code could be as follows:
products = driver.find_elements_by_xpath("//div[@class='product-pod-padding']")
for index, product in enumerate(products):
    print(index)
    url = product.find_element_by_xpath(".//a[@class='header product-pod--ie-fix']").get_attribute('href')
    print(url)
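The difference between // and .// can also be checked offline with the stdlib XML parser (class names shortened for brevity):

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the question's markup.
html = '''<body>
<div class="product-pod-padding"><a class="header" href="link1" /><div> SKU#1</div></div>
<div class="product-pod-padding"><a class="header" href="link2" /><div> SKU#2</div></div>
<div class="product-pod-padding"><a class="header" href="link3" /><div> SKU#3</div></div>
</body>'''

root = ET.fromstring(html)
products = root.findall('.//div[@class="product-pod-padding"]')

# An absolute //a[...] restarts at the document root on every iteration,
# so each lookup returns the first anchor in the whole document.
absolute = [root.find('.//a[@class="header"]').get('href') for _ in products]

# A relative .//a[...] searches only inside the current product.
relative = [p.find('.//a[@class="header"]').get('href') for p in products]

print(absolute)  # ['link1', 'link1', 'link1']
print(relative)  # ['link1', 'link2', 'link3']
```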
I am including portions of the HTML below. I believe I have found the larger element using the command:
driver.find_element_by_xpath('//div[@id="day-number-4"]')
That div ID is unique on the web page, and the command above did not raise an exception or error.
The HTML code that defines that element is:
<div id="day-number-4" class="c-schedule-calendar__class-schedule-content tabcontent js-class-schedule-content u-is-none u-is-block" data-index="3">
Now, the hard part. Inside that div element are a number of <li> elements that take the form of:
<li tabindex="0" class="row c-schedule-calendar__class-schedule-listitem-wrapper c-schedule-calendar__workout-schedule-list-item" data-index="0" data-workout-id="205687" data-club-id="229">
and then followed by a clickable button in this format:
<button class="c-btn-outlined class-action" data-class-action="book-class" data-class-action-step="class-action-confirmation" data-workout-id="205687" data-club-id="229" data-waitlistable="true"><span class="c-btn__label">Join Waitlist</span></button>
I always want to click on the 3rd button inside that <div> element. The only thing unique would be the data-index, which starts at 0 for the 1st <li>, so the 3rd one has data-index="2". So I want to find the clickable button that follows this HTML code:
<li tabindex="0" class="row c-schedule-calendar__class-schedule-listitem-wrapper c-schedule-calendar__workout-schedule-list-item" data-index="2" data-workout-id="206706" data-club-id="229">
I cannot search on data-index alone, as data-index="2" appears many times on the web page.
How do I do this?
I answered my own question. I used multiple searches within the specific element using this code:
Day_of_Timeslot = driver.find_element_by_xpath('//div[@id="day-number-4"]')
Precise_Timeslot = Day_of_Timeslot.find_element_by_xpath(".//li[@data-index='1']")
Actual_Button = Precise_Timeslot.find_element_by_xpath(".//button[@data-class-action='book-class']")
Actual_Button.click()
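The chained finds can presumably also be collapsed into a single locator such as //div[@id="day-number-4"]//li[@data-index="1"]//button[@data-class-action="book-class"] (untested against the live page). A quick offline check of the combined-path idea with the stdlib parser, on simplified markup; the workout id on the data-index="1" row is invented, since the question only shows ids for other rows:

```python
import xml.etree.ElementTree as ET

# Simplified markup; 999999 is a made-up workout id for the middle row.
html = '''<div id="day-number-4">
<li data-index="0"><button data-class-action="book-class" data-workout-id="205687" /></li>
<li data-index="1"><button data-class-action="book-class" data-workout-id="999999" /></li>
<li data-index="2"><button data-class-action="book-class" data-workout-id="206706" /></li>
</div>'''

day = ET.fromstring(html)
# One path does the li-then-button narrowing in a single step.
button = day.find('.//li[@data-index="1"]/button[@data-class-action="book-class"]')
print(button.get('data-workout-id'))  # 999999
```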
I'm new in Python and Selenium. I have this code:
<div class="Product_ProductInfo__23DMi">
<p style="font-weight: bold;">4.50</p>
<p>Bread</p>
<p>390 g</p>
</div>
I want to access the second <p> tag and get its value (I mean Bread).
For the first <p> tag, I used:
self.driver.find_element_by_xpath('//div[#class="Product_ProductInfo__23DMi"]/p')
But I don't know how to get to the other one.
Thanks.
You can do that using the find_elements_by_css_selector() method and then selecting the second element of the result.
a = self.webdriver.find_element_by_css_selector('div[class="Product_ProductInfo__23DMi"]')
second_p = a.find_elements_by_css_selector('p')[1]
You can also use the :nth-of-type(<index>) CSS pseudo-class (index starts at 1):
a = self.webdriver.find_element_by_css_selector('div[class="Product_ProductInfo__23DMi"] > p:nth-of-type(2)')
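An XPath positional predicate should do the same job, e.g. '//div[@class="Product_ProductInfo__23DMi"]/p[2]'. A quick offline check of the 1-based indexing with the stdlib parser:

```python
import xml.etree.ElementTree as ET

html = '''<div class="Product_ProductInfo__23DMi">
<p style="font-weight: bold;">4.50</p>
<p>Bread</p>
<p>390 g</p>
</div>'''

div = ET.fromstring(html)
# XPath positions are 1-based, so p[2] is the second paragraph.
second_p = div.find('./p[2]')
print(second_p.text)  # Bread
```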
I have this html:
<div class="sys_key-facts">
<p>
<strong>Full-time </strong>1 year<br>
<strong>Part-time</strong> 2 years
</p>
</div>
I want to get the text after Part-time, which is:
2 years
My Xpath code is :
//div[@class="sys_key-facts"]/p[strong[contains(text(), "Part-time")]][preceding-sibling::br]/text()
but this returns empty. Please let me know where I am going wrong.
Thanks
The problem is that p[preceding-sibling::br] means the paragraph has a br sibling, while the br is actually a child of the p, not a sibling.
So you can update your XPath as
//div[@class="sys_key-facts"]/p[strong[contains(text(), "Part-time")] and br]/text()[last()]
or try below XPath to get required output:
//div[@class="sys_key-facts"]//strong[.="Part-time"]/following-sibling::text()
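One caveat: Selenium's find_element cannot return a bare text node, so XPaths ending in /text() are for tools such as lxml or the browser console. An offline sketch of the same extraction with the stdlib parser, where the text node following an element is exposed as its .tail (the <br> is self-closed so the snippet parses as XML):

```python
import xml.etree.ElementTree as ET

html = '''<div class="sys_key-facts">
<p>
<strong>Full-time </strong>1 year<br />
<strong>Part-time</strong> 2 years
</p>
</div>'''

root = ET.fromstring(html)
# .tail is exactly the "following-sibling text" the XPath targets.
part_time = next(s.tail.strip() for s in root.iter('strong')
                 if s.text.strip() == 'Part-time')
print(part_time)  # 2 years
```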
for link in data_links:
    driver.get(link)
    review_dict = {}
    # get the size of the company
    size = driver.find_element_by_xpath('//*[@id="EmpBasicInfo"]//span')
    # location = ??? need to get this part as well.
My concern:
I am trying to scrape a website with Selenium/Python, specifically the "501 to 1000 employees" and "Biotech & Pharmaceuticals" values from the spans, but I am not able to extract the text with that XPath. I have tried getText, get attribute, everything. Please help!
On each iteration I am not getting the text value in the output.
Thank you in advance!
It seems you want only the text rather than to interact with an element. One option is to let BeautifulSoup parse the HTML for you, with Selenium supplying the JavaScript-rendered source: first get the page content with html = driver.page_source, and then you can do something like:
html ='''
<div id="CompanyContainer">
<div id="EmpBasicInfo">
<div class="">
<div class="infoEntity"></div>
<div class="infoEntity">
<label>Industry</label>
<span class="value">Woodcliff</span>
</div>
<div class="infoEntity">
<label>Size</label>
<span class="value">501 to 1000 employees</span>
</div>
</div>
</div>
</div>
''' # Just a sample, since I don't have the actual page to interact with.
soup = BeautifulSoup(html, 'html.parser')
>>> soup.find("div", {"id":"EmpBasicInfo"}).findAll("div", {"class":"infoEntity"})[2].find("span").text
'501 to 1000 employees'
Or, of course, you can avoid specific indexing and look for the <label>Size</label>, which should be more readable:
>>> [a.span.text for a in soup.findAll("div", {"class":"infoEntity"}) if (a.label and a.label.text == 'Size')]
['501 to 1000 employees']
Using selenium you can do:
>>> driver.find_element_by_xpath("//*[@id='EmpBasicInfo']/div[1]/div/div[3]/span").text
'501 to 1000 employees'
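A less brittle Selenium locator would anchor on the label text rather than on positional indexes, e.g. driver.find_element_by_xpath("//label[.='Size']/following-sibling::span") (untested against the live page). The same label-anchored idea, checked offline with the stdlib parser on the sample HTML from above:

```python
import xml.etree.ElementTree as ET

# The answer's sample markup, reduced to well-formed XML.
html = '''<div id="EmpBasicInfo">
<div class="infoEntity" />
<div class="infoEntity"><label>Industry</label><span class="value">Woodcliff</span></div>
<div class="infoEntity"><label>Size</label><span class="value">501 to 1000 employees</span></div>
</div>'''

root = ET.fromstring(html)
# Pick the span whose sibling label says "Size", skipping label-less entries.
size = next(entity.find('span').text
            for entity in root.findall('div[@class="infoEntity"]')
            if entity.find('label') is not None
            and entity.find('label').text == 'Size')
print(size)  # 501 to 1000 employees
```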